Means for latent variables

Mplus Discussion > Confirmatory Factor Analysis >

Christian S posted on Wednesday, August 25, 2010 - 11:51 am

Dear Drs. Muthen,

I have a CFA with latent factors (each measured with multiple Likert-scale [-3;3] indicators). For the descriptive statistics, I would like to list mean and std. deviation for each latent factor. The TECH 4 output gives me 0 as mean for every factor. What should I do to get the "real" means and the standard deviations? Does the fact that I get 0 as mean for each factor mean that something with my model is wrong?

I really appreciate your reply.

Best regards,

Christian

Linda K. Muthen posted on Wednesday, August 25, 2010 - 2:44 pm

In cross-sectional studies, the means of latent variables are zero. In multiple group analysis or with repeated measures, the means of latent variables are zero in one group or at one time point and are estimated in the other groups or time points.

Christian S posted on Wednesday, August 25, 2010 - 3:21 pm

Dear Dr. Muthen,

thank you very much for your reply. As far as I understand, you wrote how Mplus handles factors in cross-sectional studies.

However, in many publications, I find means and std. dev.s of latent factors in the descriptive statistics part.

When indicators are e.g. all skewed to the right (avg>0), then the mean of the latent variable should (e.g. in my example with Likert scales from -3 to 3) not be zero. Is there a way with Mplus to get this mean? Should I use an average of the indicators of each factor weighted by the indicators' yx-standardized factor loading and then calculate the mean and the std. dev. for each factor?

Best Regards,

Christian

Linda K. Muthen posted on Wednesday, August 25, 2010 - 3:38 pm

This is not how Mplus handles factors in cross-sectional studies, it is the conventional way to do this. A factor mean in a cross-sectional study has no meaning. You can't compare it to other factor means because there is no basis for comparison. It makes sense to compare factor means only across groups or across time after measurement invariance has been established.

I would imagine in the studies you mention, factor score means are being reported. Factor scores are generally not good approximations of true factor values.

The mean of a factor indicator is equal to the intercept of the factor indicator plus the factor loading times the mean of the factor. When the factor mean is zero, the mean of the factor indicator is equal to its intercept. This is why the factor mean can be zero even when the observed variable indicator mean is not zero.
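The relation in the last paragraph can be written compactly. The symbols are notation introduced here for illustration and do not appear in the thread: \nu_j is the intercept of indicator j, \lambda_j its factor loading, and \alpha the factor mean.

```latex
E(y_j) = \nu_j + \lambda_j \alpha ,
\qquad
\alpha = 0 \;\Rightarrow\; E(y_j) = \nu_j
```

So with Likert indicators whose averages are above zero, the non-zero averages are absorbed by the intercepts while the factor mean stays at zero.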

Brewery Lin posted on Friday, April 06, 2012 - 12:12 am

In Byrne's book (2012), p. 211:
Appearing below these specifications, however, you will see the following:
[F1@0 F2@0 F3@0]......in structuring the input file for a configural model, it is necessary to void this default by fixing all factor means to zero.

Is it more appropriate to impose that constraint? Thank you.

Linda K. Muthen posted on Friday, April 06, 2012 - 8:27 am

To test if factor means are different across groups use a model where factor means are zero in all groups versus a model where factor means are zero in one group and free in the other groups.
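A minimal sketch of the two nested runs described above, assuming a hypothetical single factor f1 with continuous indicators y1-y4 and a grouping variable already declared with GROUPING in the VARIABLE command (none of these names come from the thread). It relies on the Mplus multiple-group defaults, under which loadings and intercepts are held equal across groups and factor means are fixed at zero in the first group and free in the others.

```mplus
! Run 1: factor means fixed at zero in all groups
MODEL:
  f1 BY y1-y4;
  [f1@0];        ! a statement in the overall MODEL applies to every group

! Run 2 (a separate input file): Mplus default,
! factor mean zero in the first group and free in the other groups
MODEL:
  f1 BY y1-y4;
```

With ML, the chi-square difference between the two runs tests whether the factor means differ across groups; with MLR or WLSMV a scaled difference test would be needed instead.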

Brewery Lin posted on Sunday, April 08, 2012 - 7:42 pm

Dear Linda,
If I just want to test the configural model, is it still recommended to do that?

Thank you.

Linda K. Muthen posted on Monday, April 09, 2012 - 8:21 am

The configural model has factor loadings free across groups, intercepts free across groups, and factor means zero in all groups. You may find the multiple group section of the Topic 1 course handout on the website useful. It shows all of the inputs needed to test for measurement invariance.
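As a rough illustration of that configural specification, the sketch below assumes a hypothetical factor f1 with continuous indicators y1-y4 and two groups labelled g1 and g2 in the GROUPING option. These names are placeholders, and the Topic 1 handout mentioned above remains the place to check the full inputs.

```mplus
MODEL:
  f1 BY y1-y4;
  [f1@0];          ! factor mean fixed at zero in all groups

MODEL g2:
  f1 BY y2-y4;     ! loadings free in g2; y1 stays fixed at one to set the metric
  [y1-y4];         ! intercepts free in g2
```

Mentioning the loadings and intercepts in the group-specific MODEL command relaxes the default equality constraints for that group.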

Jorge Walter posted on Saturday, June 07, 2014 - 4:03 am

Dear Linda,

From your first reply in this post, I understand that estimated means for latent variables in a cross-sectional model are always zero.

Given this, how can I then calculate the informative indices as part of models with sampling weights discussed in Asparouhov (2004: 12-13)? The numerator would always be zero...

Thanks in advance for your help!

Bengt O. Muthen posted on Saturday, June 07, 2014 - 5:53 pm

What are "informative indices"?

Jorge Walter posted on Sunday, June 08, 2014 - 4:27 am

I'm sorry that my question was unclear. According to Asparouhov (2004: 12), an informative index is essentially a t-statistic comparing weighted with unweighted estimates:

I = (weighted estimated mean of Y - unweighted estimated mean of Y) / sqrt (estimated variance of the weighted mean - estimated variance of the unweighted mean).

Source:

Asparouhov, T. 2004. Weighting for unequal probability of selection in multilevel modeling. Mplus Web Notes: No. 8.

Linda K. Muthen posted on Sunday, June 08, 2014 - 11:11 am

I believe this formula is for observed not latent variables.

JW posted on Tuesday, July 08, 2014 - 7:09 am

Hi Linda,

in the post from 6th April 2012 you say:

"To test if factor means are different across groups use a model where factor means are zero in all groups versus a model where factor means are zero in one group and free in the other groups."

I am unsure on how to obtain the test. I have 3 latent variables measured at 2 time points so I would like to compare the means at time2 vs. mean at time1 (which would be 0).

Is this provided by the p-value associated with the mean at time2 in the output as in the example below -

for example, the mean score associated with variable 1 at follow-up (Follow1) is .118 which is associated with a p-value of .019 - would I interpret this an increase over time:

Means        Estimate    S.E.   Est./S.E.   P-Value
Base_var1       0.000   0.000     999.000   999.000
Base_var2       0.000   0.000     999.000   999.000
Base_var3       0.000   0.000     999.000   999.000

Follow1         0.118   0.050       2.349     0.019
Follow2         0.071   0.065       1.082     0.279
Follow3         0.049   0.062       0.794     0.427

Or how else can I assess this?

Grateful for your help!

J

Linda K. Muthen posted on Tuesday, July 08, 2014 - 8:19 am

With two time points, the z-test in column three for the Follow variables is a test of the difference in means across the two time points.
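As a worked check, assuming the four numeric columns in the output above are, in order, the estimate, its standard error, their ratio, and the two-tailed p-value (the usual Mplus layout), the third column for Follow1 can be reproduced from the first two:

```latex
z = \frac{\text{estimate}}{\text{S.E.}} = \frac{0.118}{0.050} \approx 2.36
```

This agrees with the printed 2.349 up to the rounding of the displayed estimate and standard error.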

JW posted on Tuesday, July 08, 2014 - 8:44 am

Hi Linda -

Thank you for your reply!

So in the example above, looking at the line corresponding with Follow1, I could report the results for measure 1 as an increase over time, t(1) = 2.3, p = .02 -

is that correct?

Thanks again,
J

JW posted on Tuesday, July 08, 2014 - 9:00 am

oh my other question is - can I quantify the difference at follow-up (vs. baseline time0).

Does the estimate (i.e., 0.118) indicate that at follow-up the scores are .12 points higher? should I use standardised estimates?

Thanks!

Linda K. Muthen posted on Wednesday, July 09, 2014 - 9:22 am

Yes, but it is not a t test. It is a z-test in large samples.

Yes, it is the difference between the two time points. It should be compared to its standard deviation to understand how large it is.

JW posted on Thursday, July 10, 2014 - 2:05 am

Thanks again Linda,

should I specify

OUTPUT: STDX;

to be able to estimate how large it is?

Linda K. Muthen posted on Thursday, July 10, 2014 - 6:23 am

You should take the square root of its variance to get the standard deviation.

JW posted on Thursday, July 10, 2014 - 6:34 am

Thank you very much!!!

JW posted on Monday, July 14, 2014 - 8:46 am

As I am writing my results up, it suddenly occurred to me: is dividing the estimate by its standard deviation equivalent to Cohen's d effect size?

Thanks again!!!

Linda K. Muthen posted on Monday, July 14, 2014 - 4:12 pm

This is only true if the estimate is the difference between two means.

JW posted on Tuesday, July 15, 2014 - 1:41 am

I am unsure whether this is true in my case, where I am interested in comparing the means of 2 latent variables: one assessed at time 0 (and set equal to 0) and the other assessed at time 1 (which is estimated). Does dividing the z-test value corresponding to the mean at time 1 by the variable's SD correspond to a Cohen's d?

I am unsure as we are forcing the first mean to be 0...

Grateful for your help!

Linda K. Muthen posted on Tuesday, July 15, 2014 - 11:40 am

This is a mean difference even if one mean is zero. The difference in means is the mean parameter that is not zero. You divide this value by the standard deviation of that latent variable which you can find in the results or TECH4.
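In formula form, using notation introduced here for illustration (\alpha_1 and \alpha_2 for the factor means at the two time points, \psi for the variance of the latent variable used for standardization):

```latex
d = \frac{\alpha_2 - \alpha_1}{\sqrt{\psi}}
  = \frac{\alpha_2}{\sqrt{\psi}}
\quad \text{when } \alpha_1 = 0
```

For the Follow1 example above the numerator would be 0.118; the denominator is the latent variable's standard deviation, which is not shown in the thread.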

JW posted on Tuesday, July 29, 2014 - 4:23 am

Hi Linda,

I have been looking at this a bit more and was discussing it with someone at work.

This conversation has brought to my attention the fact that I am not sure how Mplus constructs the factors and how the free mean at time 2 is calculated, especially since the latent factor consists of questionnaires with different scales.

Grateful for your response and please feel free to point me to any relevant reference.

Thanks

Linda K. Muthen posted on Tuesday, July 29, 2014 - 9:25 am

See the Topic 4 course handout on the website under multiple indicator growth.

JW posted on Wednesday, July 30, 2014 - 2:10 am

Thanks Linda!

Francesca posted on Wednesday, August 13, 2014 - 4:37 am

Hi Dr Muthen,

I have a similar question. I have data from 2 time points and I would like to test if the mean of the latent variable is higher at follow-up compared to baseline.

I tested for scalar invariance and found that it is met so I can compare the means.

Here come my questions:

1. to compare the means shall I keep the loadings and thresholds fixed or should I do it in the model with all free parameters?

2. To compare means I need to fix the baseline mean at 0 and let Mplus estimate the mean at follow-up - how does Mplus estimate/calculate the free mean?

Thanks,
Fran

Linda K. Muthen posted on Wednesday, August 13, 2014 - 8:41 am

To compare means, you should keep the loadings and thresholds constrained to be equal across time. The two models to use for testing mean differences across time are the model with means zero at both time points versus the model with the mean fixed at zero at one time point and free at the other.

With constrained loadings and thresholds, the difference in probabilities of the items at the two time points is expressed by the free factor mean.

Francesca posted on Thursday, August 14, 2014 - 2:20 am

Thanks Linda!

So to double check, with fixed loadings and thresholds:

1. do I need to assess the difftest for the model with both means set to 0 and the one with one mean at zero and the other free? is this what tells me if the means are different?

2. I am not 100% sure I understand what the free factor mean tells me - I thought this tested if the free mean is significantly different from 0 and therefore if data differ at followup (vs. baseline)

3. how is the free mean estimated in Mplus?

Thanks again
Francesca

Bengt O. Muthen posted on Thursday, August 14, 2014 - 3:15 pm

1. You have only 2 time points so the z-test for the estimated mean at time point 2 tells you if the means are significantly different at the 2 time points (because it tests against zero).

2. Correct.

3. With equal loadings and thresholds the factor mean lets the probabilities at time 2 be different from those of time 1. ML finds the optimal factor mean at time 2 using iterative methods.
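A rough sketch of the two-time-point setup described above, assuming three hypothetical binary items per occasion (u11-u31 at time 1, u12-u32 at time 2) that are declared as CATEGORICAL and estimated with ML. The item names and equality numbers are illustrative only. With WLSMV, additional scale factor or residual variance statements are typically needed and are not shown.

```mplus
MODEL:
  f1 BY u11
        u21 (1)
        u31 (2);
  f2 BY u12
        u22 (1)      ! loadings equal across time
        u32 (2);
  [u11$1 u12$1] (3); ! thresholds equal across time
  [u21$1 u22$1] (4);
  [u31$1 u32$1] (5);
  [f1@0];            ! baseline factor mean fixed at zero
  [f2*];             ! follow-up factor mean estimated
```

Replacing [f2*] with [f2@0] gives the comparison model with both factor means at zero.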

Angela posted on Wednesday, September 03, 2014 - 10:34 pm

Hello Muthens!

First, I must say THANK YOU for all your support. Without it, many of us would be full of questions with no answers.

I am trying to conduct a multiple group analysis with two groups. I also have two timepoints for each group. Right now, I see that the group one means of the latent variables for time one and intercepts for time two are fixed at zero. My questions are these:

1) Just confirming what I've read here: am I correct in saying that if the group two means are significant (and thus different from zero), they are then different from the group one means?

2) How do I find out the extent of the mean shift for each of the groups?

Thank you in advance!

Bengt O. Muthen posted on Thursday, September 04, 2014 - 11:04 am

1) Yes

2) I assume you have specified measurement invariance across time and across groups. TECH1 will show you what's been done. I also assume that the time 2 factor is regressed on the time 1 factor and that this regression coefficient varies across groups (again, check Tech1). Am I right so far?

Angela posted on Thursday, September 04, 2014 - 11:17 am

I have specified measurement invariance. (Does it matter, by the way, whether it is weak, strong, or strict invariance?)

Time 2 is regressed on Time 1; however, after testing equality of parameters through imposing constraints and using delta chi-square tests, the coefficients are the same for the two groups. How does that affect results?

Thanks!

Bengt O. Muthen posted on Thursday, September 04, 2014 - 2:34 pm

And you want the mean difference across groups for a certain time point, or the mean difference across time for a certain group?

Angela posted on Thursday, September 04, 2014 - 2:46 pm

The mean difference across time for each group.

Bengt O. Muthen posted on Thursday, September 04, 2014 - 3:48 pm

To compare factor means you should have strong (scalar) invariance.

You express that with Model Constraint using parameter labels from Model, drawing on simple regression formulas where you build on the following input for a certain group:

Model:
f2 on f1 (slope);
[f1] (f1mean);
[f2] (f2int);

Model Constraint:
New(f2mean diff);
f2mean = f2int + slope*f1mean;
diff = f2mean - f1mean;

"diff" gets you the estimate and SE of what you want. When you have a fixed zero for f1mean you just add

New(f1mean);
f1mean = 0;
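Putting the pieces above together for a group in which the f1 mean is fixed at zero, one possible arrangement is sketched below. It assumes that the label on [f1] is dropped from MODEL once f1mean is declared in NEW, since using the same name both ways appears to trigger the "parameter label has been redeclared" error reported later in this thread. That reading is an assumption and is not stated in the post. The measurement part of the model is omitted.

```mplus
MODEL:
  f2 ON f1 (slope);
  [f1@0];                  ! f1 mean fixed at zero, no label here
  [f2] (f2int);

MODEL CONSTRAINT:
  NEW(f1mean f2mean diff);
  f1mean = 0;
  f2mean = f2int + slope*f1mean;
  diff = f2mean - f1mean;  ! reduces to f2int when f1mean = 0
```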

Angela posted on Thursday, September 04, 2014 - 3:55 pm

Great! That all makes sense to me except for the last little bit on the fixed zero. It's already at zero; I actually want it to be free so that I can actually get a difference between f1 and f2 (which is at zero).

Bengt O. Muthen posted on Thursday, September 04, 2014 - 4:41 pm

If you have scalar invariance across time you can estimate the f2 intercept as is done above when saying

[f2]

You want the f1 mean to be zero in one group for identification. My formulas still give you a mean difference across time.

Angela posted on Thursday, September 04, 2014 - 4:48 pm

Okay- I'll double-check that all. Thank you!

Sarah Herpertz posted on Monday, November 24, 2014 - 7:42 pm

Dear Drs. Muthen,

I have a similar question. I have data from 2 time points and 2 groups (treatment and control).

I would like to test how high is the mean difference across time for each group.

I used your syntax above (September 04, 2014 - 3:48 pm) for each group separately. Unfortunately I have received the following error message:

A parameter label has been redeclared in MODEL CONSTRAINT.
Problem with: f1mean

Do you know what I can do now?

Thank you very much.

Bengt O. Muthen posted on Monday, November 24, 2014 - 8:08 pm

Send output and license number to support.

Sarah Herpertz posted on Tuesday, December 02, 2014 - 11:36 am

Dear Ms. Muthen

I have data from 2 time points and 2 groups (treatment and control).
I have four related questions:

1)
I would like to test how large the mean difference across time is for each group. I used your syntax above to test the mean difference across time for each group (it works now). Now I'm wondering if I should fix the f1 mean to zero or not? I have tried both and the results are really different.

Diff1 (F1 mean is not fixed to zero)=7.19**
Diff2 (F1 mean is fixed to zero, as in your syntax)=31.77**

2)
Is it possible to standardize this difference?

3)
I also tested the mean difference across groups. I compared the unstandardized estimate of the latent mean (which is not fixed to zero) to its standard deviation. Is it correct that this value is equal to the standardized intercept?

4)
Can I compare this difference to the difference across time for each group? Or is it a different metric?

Thank you very much for all your support.

Bengt O. Muthen posted on Tuesday, December 02, 2014 - 11:50 am

1) The f1 mean needs to be fixed at zero at time 1 for one of the two groups, otherwise the model is not identified (you should see an error message).

2) Yes, you can divide by its standard deviation as estimated in the model (SD = sqrt of variance). This SD can also be expressed in Model constraint. In its simplest case, the SD is the same at time 1 and 2 (check Tech4 for time 2), in which case you simply divide the factor mean difference by

sqrt(factor variance).

3) See my answer to 2)

4) You can compare the across-time factor mean difference or standardized across-time factor mean difference across the groups.

Sarah Herpertz posted on Tuesday, December 02, 2014 - 12:35 pm

Thank you very much.

To double-check:

To your answer 1:
I examined the mean change from pre to posttest separately in each group. I tried now the model with the two groups. I fixed the mean to zero in the reference group. I only got the difference in the group which has both means free.

1) How can I get the difference for the reference group?

I used your syntax for the difference:

Model:
f2 on f1 (slope);
[f1] (f1mean);
[f2] (f2int);

Model Constraint:
New(f1mean2);
f1mean2 = 0;
New(f2mean diff);
f2mean = f2int + slope*f1mean2;
diff = f2mean - f1mean2;

2) Should I fix the f1mean2 also to Zero or not?

To your answer 2:
I checked Tech4. The SD is not the same at time 1 and 2.
SD Time 1 = sqrt(188)=13.7
SD Time 2 = sqrt(100)=10

3) Can I use the average of both SDs to standardize the difference?

Thank you again!

Bengt O. Muthen posted on Tuesday, December 02, 2014 - 5:57 pm

You have 2 groups, reference and focal, say. Then you have the factor mean status

reference group time1: fixed at zero
reference group time 2: free (through its intercept being free)

focal group time 1: free
focal group time 2: free (through its intercept being free)

With unequal factor variances, I would vote for standardizing with respect to time 1 variance (starting variance) and explain that this is what you do.

To feel comfortable with this type of modeling, you will want to take a course in multiple-group factor analysis (see our Topic 1 handout and video on our website).
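A sketch of how the standardization with respect to the time 1 variance could be expressed in Model Constraint for the reference group, where the f1 mean is fixed at zero and the across-time difference therefore equals the f2 intercept. The labels f1var and d are hypothetical names introduced here, and the measurement part of the model is omitted.

```mplus
MODEL:
  f2 ON f1 (slope);
  [f1@0];
  [f2] (f2int);
  f1 (f1var);              ! label the time 1 factor variance

MODEL CONSTRAINT:
  NEW(diff d);
  diff = f2int;            ! across-time mean difference when the f1 mean is zero
  d = diff/SQRT(f1var);    ! standardized with respect to the time 1 SD
```

For a group with a free f1 mean, diff would instead follow the earlier formula, f2int + slope*f1mean - f1mean, using that group's own labels.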

Sarah Herpertz posted on Friday, December 12, 2014 - 10:04 am

Thank you very much. It works now.

Sarah Herpertz posted on Wednesday, February 04, 2015 - 7:46 am

Dear Dr. Muthen,

I am writing up the results for the latent mean differences across groups.
Could you please check my reporting style? I don't want to report wrong results.

I have a multiple-group SEM (2 groups, 2 time-points). Time 2 is regressed on Time 1. I fixed the mean to zero in the reference group at both time-points.

Question 1:
I describe in the section overview of analyses how I calculate the standardized difference "d" (by comparing the unstandardized estimate of the latent mean (which is not fixed to zero) to the standard deviation of the latent variable).

Is it correct, when I report d = XX, z = 4.90, p < .001 ?

Question 2:
Is it correct to say there is no significant group difference at Time 1 when the z-test for the estimated mean at time point 1 tells me no significant difference at Time 1?

Thank you so much again!

Bengt O. Muthen posted on Wednesday, February 04, 2015 - 3:09 pm

Q1. Yes.

Q2. Yes.

Till Martin posted on Monday, March 23, 2015 - 6:28 am

Dear Drs. Muthen,

above it has been asked "Should I use an average of the indicators of each factor weighted by the indicators' yx-standardized factor loading and then calculate the mean and the std. dev. for each factor?"

and Dr. L. Muthen replied

"This is not how Mplus handles factors in cross-sectional studies, it is the conventional way to do this."

I was searching for days for a reference/citation for this procedure of forming aggregated scores from items based on the observed indicators scores weighted by the standardized factor loadings, and I could not find any reference indicating that this is usual. I could only find classical EFA context weighting. It was recommended to me to fit a congeneric single-factor model yielding standardized factor loadings and using them as weights to account for the relative importance an item had for a factor.

My goal is to perform multiple regression based on composite scores.

Is it true that weighted means/sum are only appropriate in formative measurement?

Or is there a reference for the above procedure, that was termed "conventional"?

Your help is greatly appreciated
Thank you very much
Best,
Till

Bengt O. Muthen posted on Monday, March 23, 2015 - 2:11 pm

If you can't do it in a single step using SEM, I would use factor scores or multiply-imputed factor scores, also referred to as plausible values. See also our FAQ "Factor scores".

I can't think of a reference for summing items weighted by their loadings - in my experience it is a common procedure, particularly in the past - and there have been debates about whether or not that is better than "unit-weighting".

Till Martin posted on Thursday, March 26, 2015 - 2:14 pm

Dear Dr. Muthen,

thank you very much for your reply. I think I will use the factor scores then, though I have found a procedure by Rowe (2006) using factor score coefficients as weights to account for the relative importance of an item for a factor.

I used a syntax by Raykov (2009) but I would like to use WLSMV instead of MLR:

But I get this error message:

WARNING in ANALYSIS command
Estimator WLSMV is not available for analysis with all continuous variables.
Default estimator will be used.
1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

The syntax is (6 Likert items, non-normal, 99 cases, no missings)

VARIABLE:
NAMES ARE DEE1 DEE2 DEE3 DEE4 DEE5 DEE6;
Usevariables are DEE1 DEE2 DEE3 DEE4 DEE5 DEE6;

MODEL: KSI BY DEE1*(B1)
DEE2-DEE6(B2-B6);
KSI@1;
DEE1-DEE6(THETA1-THETA6);
MODEL CONSTRAINT:
NEW (RHO);
RHO = (B1 + B2 + B3 + B4 + B5 + B6)**2/((B1 + B2 + B3 + B4 + B5 +B6)**2 + THETA1
+ THETA2 + THETA3 + THETA4 + THETA5 + THETA6);
ANALYSIS: ESTIMATOR = WLSMV;
OUTPUT: CINTERVAL SAMPSTAT TECH1 STDYX FSCOEFFICIENT;

What could be the problem?

Thank you very much
Best,
Till

Bengt O. Muthen posted on Thursday, March 26, 2015 - 3:34 pm

We don't make WLSMV available in this case. There is no reason not to choose say MLR.

Seth Frndak posted on Tuesday, March 29, 2016 - 10:04 am

I have come across the same issue that Till Martin has. I am trying to estimate a continuous growth model with WLSMV, but I receive the error: "Estimator WLSMV is not available for analysis with all continuous variables."

The reason I am trying to estimate the continuous model in WLSMV is because I am creating a two-part model. My dichotomous model will not converge in MLR so I am using WLSMV. When I combine the continuous and dichotomous models, the continuous model is estimated with WLSMV and not the estimator that was used in the model building process (MLR).

Is there any way to override this?

Thank you!

Bengt O. Muthen posted on Tuesday, March 29, 2016 - 4:04 pm

Two-part modeling needs to be done in ML to properly take the missing data into account.

Stefan posted on Wednesday, July 18, 2018 - 1:40 am

Dear Drs Muthen,

I face a problem with a multi group latent mean comparison. I have a four-group model and thus, group one's mean is fixed to zero whereas the other groups are reported in relation to group one. Is there any way to set group 2 as the reference group?

I tried:

model group_1:

[VAR*];

model group_2:

[VAR@0];

However, the estimates of the factor intercepts are then different, and the mean difference relative to group 1 differs from the output I get when group one is the reference group.

I thought about constraining all factor intercepts to the values of the first output. However, that doesn't work either.

Thank you in advance. Best, Stefan

Bengt O. Muthen posted on Wednesday, July 18, 2018 - 8:17 am

Check that the 2 runs have the same number of parameters (it's printed in the output).

Stefan posted on Thursday, July 19, 2018 - 1:24 am

Thank you. It worked now. However, only for unstandardised estimates. Is there a way, to get equal values in the standardised estimation as well?

Bengt O. Muthen posted on Thursday, July 19, 2018 - 6:13 am

If unstandardized parameter values are equal, the standardized values are equal only if the variances are equal - that is a much stronger statement and I don't think you want to go there.

Sara Namazi posted on Tuesday, January 22, 2019 - 8:50 am

Hi Dr. Muthen,

I am trying to plot my latent interaction terms in Excel. I understand that the means of latent variables are zero. However, where in the output can I find the standard deviations of my latent variables?

Bengt O. Muthen posted on Tuesday, January 22, 2019 - 2:33 pm

For an exogenous latent variable you find its SD as the square root of its estimated variance. For endogenous latent variables you find them in Tech4.
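To illustrate why the endogenous case needs TECH4, consider the simplest situation, assumed here for illustration only, in which one latent variable f2 is regressed on a single exogenous latent variable f1 with slope \beta, and \psi_1 and \psi_2 denote the estimated variance of f1 and the residual variance of f2:

```latex
SD(f_1) = \sqrt{\psi_1},
\qquad
SD(f_2) = \sqrt{\beta^{2}\,\psi_1 + \psi_2}
```

The variance printed for an endogenous latent variable in the model results is the residual variance \psi_2, not the total variance, which is why its square root alone would understate the SD.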

Sara Namazi posted on Tuesday, January 22, 2019 - 2:48 pm

Thank you, Dr. Muthen. I really appreciate your quick reply. Would the intercept/constant be 0 for latent variables?

Bengt O. Muthen posted on Tuesday, January 22, 2019 - 4:52 pm

Yes, unless you have multiple groups, multiple time points, or mixtures.

Sara Namazi posted on Tuesday, January 22, 2019 - 5:05 pm

Thank you, Dr. Muthen! This is all so helpful!

Sara