
Christian S posted on Wednesday, August 25, 2010  5:51 pm



Dear Drs. Muthen, I have a CFA with latent factors (each measured with multiple Likert-scale [-3;3] indicators). For the descriptive statistics, I would like to list the mean and standard deviation of each latent factor. The TECH4 output gives me 0 as the mean for every factor. What should I do to get the "real" means and standard deviations? Does the fact that I get 0 as the mean for each factor mean that something is wrong with my model? I really appreciate your reply. Best regards, Christian


In cross-sectional studies, the means of latent variables are zero. In multiple group analysis or with repeated measures, the means of latent variables are zero in one group or at one time point and are estimated in the other groups or time points.

Christian S posted on Wednesday, August 25, 2010  9:21 pm



Dear Dr. Muthen, thank you very much for your reply. As far as I understand, you described how Mplus handles factors in cross-sectional studies. However, in many publications, I find means and standard deviations of latent factors in the descriptive statistics section. When the indicators are, e.g., all skewed to the right (avg>0), then the mean of the latent variable should not be zero (e.g., in my example with Likert scales from -3 to 3). Is there a way to get this mean with Mplus? Should I use an average of the indicators of each factor, weighted by the indicators' yx-standardized factor loadings, and then calculate the mean and the standard deviation for each factor? Best regards, Christian


This is not how Mplus handles factors in cross-sectional studies; it is the conventional way to do this. A factor mean in a cross-sectional study has no meaning. You can't compare it to other factor means because there is no basis for comparison. It makes sense to compare factor means only across groups or across time after measurement invariance has been established. I would imagine that in the studies you mention, factor score means are being reported. Factor scores are generally not good approximations of true factor values. The mean of a factor indicator is equal to the intercept of the factor indicator plus the factor loading times the mean of the factor. When the factor mean is zero, the mean of the factor indicator is equal to its intercept. This is why the factor mean can be zero even when the observed indicator mean is not zero.
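The relation between indicator means and the factor mean described in this reply can be written compactly. The symbols are notation assumed here and do not appear in the thread: \nu_j is the intercept of indicator y_j, \lambda_j its loading, and \alpha the factor mean.

```latex
E(y_j) = \nu_j + \lambda_j \alpha,
\qquad
\alpha = 0 \;\Rightarrow\; E(y_j) = \nu_j
```

So a nonzero observed indicator mean is absorbed entirely by the intercept when the factor mean is fixed at zero.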


In Byrne's book (2012), p. 211: "Appearing below these specifications, however, you will see the following: [F1@0 F2@0 F3@0] ... in structuring the input file for a configural model, it is necessary to void this default by fixing all factor means to zero." Is it more appropriate to impose that constraint? Thank you.


To test if factor means are different across groups use a model where factor means are zero in all groups versus a model where factor means are zero in one group and free in the other groups. 


Dear Linda, if I just want to test the configural model, is it still recommended to do that? Thank you.


The configural model has factor loadings free across groups, intercepts free across groups, and factor means fixed at zero in all groups. You may find the multiple group section of the Topic 1 course handout on the website useful. It shows all of the inputs needed to test for measurement invariance.


Dear Linda, from your first reply in this post, I understand that estimated means for latent variables in a cross-sectional model are always zero. Given this, how can I calculate the informative indices for models with sampling weights discussed in Asparouhov (2004: 12-13)? The numerator would always be zero... Thanks in advance for your help!


What are "informative indices"? 


I'm sorry that my question was unclear. According to Asparouhov (2004: 12), an informative index is essentially a t-statistic comparing weighted with unweighted estimates: I = (weighted estimated mean of Y - unweighted estimated mean of Y) / sqrt(estimated variance of the weighted mean - estimated variance of the unweighted mean). Source: Asparouhov, T. 2004. Weighting for unequal probability of selection in multilevel modeling. Mplus Web Notes: No. 8.
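As a minimal sketch, the index as the poster states it can be computed directly from four estimates. The function name and all numbers below are hypothetical, and the formula is taken from the post rather than checked against the web note.

```python
import math

def informative_index(mean_weighted, mean_unweighted, var_weighted, var_unweighted):
    """Informative index as written in the post: a mean difference divided by
    the square root of the difference between the two variance estimates."""
    var_diff = var_weighted - var_unweighted
    if var_diff <= 0:
        # The square root is undefined unless the weighted estimate is less precise.
        raise ValueError("variance of the weighted mean must exceed that of the unweighted mean")
    return (mean_weighted - mean_unweighted) / math.sqrt(var_diff)

# Hypothetical estimates for an observed variable Y (not from the thread).
index = informative_index(2.50, 2.30, 0.020, 0.010)
print(round(index, 2))
```

The inputs are means of an observed variable, which is why a latent mean fixed at zero does not enter the calculation.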


I believe this formula is for observed not latent variables. 

JW posted on Tuesday, July 08, 2014  1:09 pm



Hi Linda, in the post from 6th April 2012 you say: "To test if factor means are different across groups use a model where factor means are zero in all groups versus a model where factor means are zero in one group and free in the other groups." I am unsure how to obtain the test. I have 3 latent variables measured at 2 time points, so I would like to compare the means at time 2 vs. the means at time 1 (which would be 0). Is this provided by the p-value associated with the mean at time 2 in the output, as in the example below? For example, the mean associated with variable 1 at follow-up (Follow1) is .118, which is associated with a p-value of .019. Would I interpret this as an increase over time?

Means
    Base_var1    0.000    0.000    999.000    999.000
    Base_var2    0.000    0.000    999.000    999.000
    Base_var3    0.000    0.000    999.000    999.000
    Follow1      0.118    0.050      2.349      0.019
    Follow2      0.071    0.065      1.082      0.279
    Follow3      0.049    0.062      0.794      0.427

Or how else can I assess this? Grateful for your help! J


With two time points, the z-test in column three for the follow-up variables is a test of the difference in means across the two time points.
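To make the column-three test concrete, here is a small sketch that recomputes the z-value and a two-tailed p-value from the estimate and standard error in the Follow1 row quoted above. The function name is hypothetical, and the normal reference distribution assumes a large sample.

```python
import math

def z_test(estimate, standard_error):
    """Return (z, two-tailed p) for testing an estimate against zero."""
    z = estimate / standard_error
    # Two-tailed p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Follow1 row from the post: estimate 0.118, S.E. 0.050.
z, p = z_test(0.118, 0.050)
print(f"z = {z:.3f}, two-tailed p = {p:.3f}")
```

The recomputed z is 2.36 rather than the printed 2.349 because the output shows the estimate and standard error rounded to three decimals.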

JW posted on Tuesday, July 08, 2014  2:44 pm



Hi Linda, thank you for your reply! So in the example above, looking at the line corresponding to Follow1, I could report the results for measure 1 as an increase over time, t(1) = 2.3, p = .02. Is that correct? Thanks again, J

JW posted on Tuesday, July 08, 2014  3:00 pm



Oh, my other question is: can I quantify the difference at follow-up (vs. baseline, time 0)? Does the estimate (i.e., 0.118) indicate that at follow-up the scores are .12 points higher? Should I use standardised estimates? Thanks!


Yes, but it is not a t-test. It is a z-test in large samples. Yes, it is the difference between the two time points. It should be compared to its standard deviation to understand how large it is.

JW posted on Thursday, July 10, 2014  8:05 am



Thanks again Linda, should I specify OUTPUT: STDX; to be able to estimate how large it is? 


You should take the square root of its variance to get the standard deviation.

JW posted on Thursday, July 10, 2014  12:34 pm



Thank you very much!!! 

JW posted on Monday, July 14, 2014  2:46 pm



As I am writing my results up, it suddenly occurred to me: is dividing the estimate by the standard deviation equivalent to Cohen's d effect size? Thanks again!!!


This is only true if the estimate is the difference between two means. 

JW posted on Tuesday, July 15, 2014  7:41 am



I am unsure whether this is true in my case, where I am interested in comparing the means of 2 latent variables: one assessed at time 0 (and set equal to 0) and the other assessed at time 1 (which is estimated). Does dividing the z-test value corresponding to the mean at time 1 by the variable's SD correspond to a Cohen's d? I am unsure, as we are forcing the first mean to be 0... Grateful for your help!


This is a mean difference even if one mean is zero. The difference in means is the mean parameter that is not zero. You divide this value by the standard deviation of that latent variable which you can find in the results or TECH4. 
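A minimal sketch of the calculation described in this reply: the free factor mean is the mean difference, and it is divided by the standard deviation of the latent variable. The mean 0.118 is the Follow1 estimate from the thread; the factor variance of 0.25 is a hypothetical value standing in for the one read from the results or TECH4.

```python
import math

def standardized_mean_difference(free_mean, factor_variance):
    """Divide the free factor mean (the difference from the mean fixed at zero)
    by the latent variable's standard deviation."""
    if factor_variance <= 0:
        raise ValueError("factor variance must be positive")
    return free_mean / math.sqrt(factor_variance)

# 0.118 is from the thread; 0.25 is an assumed factor variance for illustration.
d = standardized_mean_difference(0.118, 0.25)
print(round(d, 3))
```

Note that the input is the mean estimate itself, not the z-value from column three.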

JW posted on Tuesday, July 29, 2014  10:23 am



Hi Linda, I have been looking at this a bit more and was discussing it with someone at work. This conversation has brought to my attention the fact that I am not sure how MPLUS constructs the factors and how the free mean at time 2 is calculated, especially since the latent factor consists of questionnaires with different scales. Grateful for your response and please feel free to point me to any relevant reference. Thanks 


See the Topic 4 course handout on the website under multiple indicator growth. 

JW posted on Wednesday, July 30, 2014  8:10 am



Thanks Linda! 

Francesca posted on Wednesday, August 13, 2014  10:37 am



Hi Dr Muthen, I have a similar question. I have data from 2 time points and I would like to test whether the mean of the latent variable is higher at follow-up than at baseline. I tested for scalar invariance and found that it is met, so I can compare the means. Here are my questions: 1. To compare the means, shall I keep the loadings and thresholds fixed, or should I do it in the model with all parameters free? 2. To compare means I need to fix the baseline mean at 0 and let Mplus estimate the mean at follow-up. How does Mplus estimate/calculate the free mean? Thanks, Fran


To compare means, you should keep the loadings and thresholds constrained to be equal across time. The two models to use for testing mean differences across time are the model with means zero at both time points versus the model with the mean fixed at zero at one time point and free at the other. With constrained loadings and thresholds, the difference in probabilities of the items at the two time points is expressed by the free factor mean. 

Francesca posted on Thursday, August 14, 2014  8:20 am



Thanks Linda! So to double-check, with fixed loadings and thresholds: 1. Do I need to assess the difftest for the model with both means set to 0 versus the one with one mean at zero and the other free? Is this what tells me if the means are different? 2. I am not 100% sure I understand what the free factor mean tells me. I thought this tested whether the free mean is significantly different from 0 and therefore whether the data differ at follow-up (vs. baseline). 3. How is the free mean estimated in Mplus? Thanks again, Francesca


1. You have only 2 time points, so the z-test for the estimated mean at time point 2 tells you if the means are significantly different at the 2 time points (because it tests against zero). 2. Correct. 3. With equal loadings and thresholds, the factor mean lets the probabilities at time 2 be different from those at time 1. ML finds the optimal factor mean at time 2 using iterative methods.

Angela posted on Thursday, September 04, 2014  4:34 am



Hello Muthens! First, I must say THANK YOU for all your support. Without it, many of us would be full of questions with no answers. I am trying to conduct a multiple group analysis with two groups. I also have two time points for each group. Right now, I see that the group one means of the latent variables for time one and intercepts for time two are fixed at zero. My questions are these: 1) Just confirming what I've read here: am I correct in saying that if the group two means are significant (and thus different from zero), they are different from the group one means? 2) How do I find out the extent of the mean shift for each of the groups? Thank you in advance!


1) Yes 2) I assume you have specified measurement invariance across time and across groups. TECH1 will show you what's been done. I also assume that the time 2 factor is regressed on the time 1 factor and that this regression coefficient varies across groups (again, check Tech1). Am I right so far? 

Angela posted on Thursday, September 04, 2014  5:17 pm



I have specified measurement invariance. (Does it matter, by the way, whether it is weak, strong, or strict invariance?) Time 2 is regressed on Time 1; however, after testing equality of parameters by imposing constraints and using delta chi-square tests, the coefficients are the same for the two groups. How does that affect results? Thanks!


And you want the mean difference across groups for a certain time point, or the mean difference across time for a certain group? 

Angela posted on Thursday, September 04, 2014  8:46 pm



The mean difference across time for each group. 


To compare factor means you should have strong (scalar) invariance. You express that with Model Constraint using parameter labels from Model, drawing on simple regression formulas, where you build on the following input for a certain group:

Model:
    f2 on f1 (slope);
    [f1] (f1mean);
    [f2] (f2int);
Model Constraint:
    New(f2mean diff);
    f2mean = f2int + slope*f1mean;
    diff = f2mean - f1mean;

"diff" gets you the estimate and SE of what you want. When you have a fixed zero for f1mean you just add

    New(f1mean);
    f1mean = 0;
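The regression formulas in this reply can be checked by hand. The sketch below reproduces only the point estimates of f2mean and diff; the standard error that Mplus reports for "diff" is not computed here. The function name and all numeric values are hypothetical.

```python
def mean_change(f2_intercept, slope, f1_mean=0.0):
    """Implied time 2 factor mean and the change from time 1, following
    f2mean = f2int + slope*f1mean and diff = f2mean - f1mean."""
    f2_mean = f2_intercept + slope * f1_mean
    diff = f2_mean - f1_mean
    return f2_mean, diff

# Group with the time 1 mean fixed at zero: the change equals the f2 intercept.
ref_f2_mean, ref_diff = mean_change(f2_intercept=0.30, slope=0.60)

# Group with a free time 1 mean (hypothetical value 0.50).
focal_f2_mean, focal_diff = mean_change(f2_intercept=0.30, slope=0.60, f1_mean=0.50)

print(ref_f2_mean, ref_diff)
print(round(focal_f2_mean, 3), round(focal_diff, 3))
```

When the time 1 mean is zero, the slope drops out of the formula, which is why the intercept alone carries the change in that group.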

Angela posted on Thursday, September 04, 2014  9:55 pm



Great! That all makes sense to me except for the last little bit on the fixed zero. It's already at zero; I actually want it to be free so that I can actually get a difference between f1 and f2 (which is at zero). 


If you have scalar invariance across time, you can estimate the f2 intercept, as is done above when saying [f2]. You want the f1 mean to be zero in one group for identification. My formulas still give you a mean difference across time.

Angela posted on Thursday, September 04, 2014  10:48 pm



Okay, I'll double-check all of that. Thank you!


Dear Drs. Muthen, I have a similar question. I have data from 2 time points and 2 groups (treatment and control). I would like to test how large the mean difference across time is for each group. I used your syntax above (September 04, 2014 - 3:48 pm) for each group separately. Unfortunately, I received the following error message: "A parameter label has been redeclared in MODEL CONSTRAINT. Problem with: f1mean". Do you know what I can do now? Thank you very much!


Send output and license number to support. 


Dear Ms. Muthen, I have data from 2 time points and 2 groups (treatment and control). I have four related questions: 1) I would like to test how large the mean difference across time is for each group. I used your syntax above to test the mean difference across time for each group (it works now). Now I'm wondering whether I should fix the f1 mean to zero or not. I have tried both and the results are really different. Diff1 (F1 mean is not fixed to zero) = 7.19** Diff2 (F1 mean is fixed to zero, as in your syntax) = 31.77** 2) Is it possible to standardize this difference? 3) I also tested the mean difference across groups. I compared the unstandardized estimate of the latent mean (which is not fixed to zero) to its standard deviation. Is it correct that this value is equal to the standardized intercept? 4) Can I compare this difference to the difference across time for each group? Or is it a different metric? Thank you very much for all your support.


1) The f1 mean needs to be fixed at zero at time 1 for one of the two groups; otherwise the model is not identified (you should see an error message). 2) Yes, you can divide by its standard deviation as estimated in the model (SD = sqrt of variance). This SD can also be expressed in Model Constraint. In the simplest case, the SD is the same at time 1 and 2 (check TECH4 for time 2), in which case you simply divide the factor mean difference by sqrt(factor variance). 3) See my answer to 2). 4) You can compare the across-time factor mean difference or standardized across-time factor mean difference across the groups.


Thank you very much. To double-check:

Regarding your answer 1: I examined the mean change from pre- to post-test separately in each group. I have now tried the model with the two groups. I fixed the mean to zero in the reference group. I only got the difference in the group which has both means free. 1) How can I get the difference for the reference group? I used your syntax for the difference:

Model:
    f2 on f1 (slope);
    [f1] (f1mean);
    [f2] (f2int);
Model Constraint:
    New(f1mean2);
    f1mean2 = 0;
    New(f2mean diff);
    f2mean = f2int + slope*f1mean2;
    diff = f2mean - f1mean2;

2) Should I also fix f1mean2 to zero or not?

Regarding your answer 2: I checked TECH4. The SD is not the same at time 1 and 2. SD Time 1 = sqrt(188) = 13.7, SD Time 2 = sqrt(100) = 10. 3) Can I use the average of both SDs to standardize the difference? Thank you again!


You have 2 groups, reference and focal, say. Then you have the following factor mean status:

reference group time 1: fixed at zero
reference group time 2: free (through its intercept being free)
focal group time 1: free
focal group time 2: free (through its intercept being free)

With unequal factor variances, I would vote for standardizing with respect to the time 1 variance (starting variance) and explaining that this is what you do. To feel comfortable with this type of modeling, you will want to take a course in multiple-group factor analysis (see our Topic 1 handout and video on our website).
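To illustrate why the choice of variance matters, the sketch below standardizes one mean difference by each of the two factor variances reported earlier in the thread (188 at time 1, 100 at time 2). The mean difference of 5.0 is a hypothetical value, not a result from the thread.

```python
import math

VAR_TIME1 = 188.0  # factor variance at time 1, from the TECH4 values quoted in the thread
VAR_TIME2 = 100.0  # factor variance at time 2, from the same post

def standardize(diff, variance):
    """Express a factor mean difference in standard deviation units."""
    return diff / math.sqrt(variance)

diff = 5.0  # hypothetical across-time factor mean difference

d_time1 = standardize(diff, VAR_TIME1)  # recommended: starting (time 1) variance
d_time2 = standardize(diff, VAR_TIME2)  # shown only for comparison

print(round(d_time1, 3), round(d_time2, 3))
```

Because the time 1 standard deviation is larger here, standardizing by it gives the smaller effect size, so the choice should be stated when reporting.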


Thank you very much. It works now. 


Dear Dr. Muthen, I am writing up the results for the latent mean differences across groups. Could you please check my reporting style? I don't want to report wrong results. I have a multiple-group SEM (2 groups, 2 time points). Time 2 is regressed on Time 1. I fixed the mean to zero in the reference group at both time points. Question 1: In the overview of analyses section, I describe how I calculate the standardized difference "d" (by comparing the unstandardized estimate of the latent mean, which is not fixed to zero, to the standard deviation of the latent variable). Is it correct to report d = XX, z = 4.90, p < .001? Question 2: Is it correct to say there is no significant group difference at Time 1 when the z-test for the estimated mean at time point 1 shows no significant difference at Time 1? Thank you so much again!


Q1. Yes. Q2. Yes. 

Till Martin posted on Monday, March 23, 2015  11:28 am



Dear Drs. Muthen, above it has been asked "Should I use an average of the indicators of each factor weighted by the indicators' yx-standardized factor loading and then calculate the mean and the std. dev. for each factor?" and Dr. L. Muthen replied "This is not how Mplus handles factors in cross-sectional studies; it is the conventional way to do this." I searched for days for a reference for this procedure of forming aggregated scores from items based on the observed indicator scores weighted by the standardized factor loadings, and I could not find any reference indicating that this is usual. I could only find weighting in the classical EFA context. It was recommended to me to fit a congeneric single-factor model yielding standardized factor loadings and to use them as weights to account for the relative importance an item has for a factor. My goal is to perform multiple regression based on composite scores. Is it true that weighted means/sums are only appropriate in formative measurement? Or is there a reference for the above procedure, which was termed "conventional"? Your help is greatly appreciated. Thank you very much. Best, Till


If you can't do it in a single step using SEM, I would use factor scores or multiply-imputed factor scores, also referred to as plausible values. See also our FAQ "Factor scores". I can't think of a reference for summing items weighted by their loadings. In my experience it is a common procedure, particularly in the past, and there have been debates about whether or not that is better than "unit-weighting".

Till Martin posted on Thursday, March 26, 2015  7:14 pm



Dear Dr. Muthen, thank you very much for your reply. I think I will use the factor scores then, though I have found a procedure by Rowe (2006) using factor score coefficients as weights to account for the relative importance of an item for a factor. I used a syntax by Raykov (2009), but I would like to use WLSMV instead of MLR. However, I get this error message:

WARNING in ANALYSIS command
Estimator WLSMV is not available for analysis with all continuous variables. Default estimator will be used.
1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

The syntax is (6 Likert items, non-normal, 99 cases, no missing values):

VARIABLE:
    NAMES ARE DEE1 DEE2 DEE3 DEE4 DEE5 DEE6;
    Usevariables are DEE1 DEE2 DEE3 DEE4 DEE5 DEE6;
MODEL:
    KSI BY DEE1*(B1) DEE2-DEE6 (B2-B6);
    KSI@1;
    DEE1-DEE6 (THETA1-THETA6);
MODEL CONSTRAINT:
    NEW (RHO);
    RHO = (B1 + B2 + B3 + B4 + B5 + B6)**2/((B1 + B2 + B3 + B4 + B5 + B6)**2 + THETA1 + THETA2 + THETA3 + THETA4 + THETA5 + THETA6);
ANALYSIS:
    ESTIMATOR = WLSMV;
OUTPUT:
    CINTERVAL SAMPSTAT TECH1 STDYX FSCOEFFICIENT;

What could be the problem? Thank you very much. Best, Till


We don't make WLSMV available in this case. There is no reason not to choose, say, MLR.
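For reference, the RHO expression in the MODEL CONSTRAINT of the post above is a ratio of the squared sum of loadings to that quantity plus the summed residual variances, with the factor variance fixed at 1. A minimal sketch of the same arithmetic follows; the function name, loadings and residual variances are hypothetical values, not estimates from the poster's data.

```python
def composite_reliability(loadings, residual_variances):
    """RHO = (sum of loadings)**2 / ((sum of loadings)**2 + sum of residual variances),
    assuming the factor variance is fixed at 1 as in KSI@1."""
    if len(loadings) != len(residual_variances):
        raise ValueError("one residual variance is needed per loading")
    true_variance = sum(loadings) ** 2
    return true_variance / (true_variance + sum(residual_variances))

# Hypothetical B1-B6 and THETA1-THETA6 for six indicators.
loadings = [0.8, 0.7, 0.6, 0.7, 0.8, 0.6]
residuals = [0.36, 0.51, 0.64, 0.51, 0.36, 0.64]

rho = composite_reliability(loadings, residuals)
print(round(rho, 3))
```

Defining RHO inside MODEL CONSTRAINT, as in the post, has the advantage that Mplus also reports a standard error for it, which this sketch does not provide.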
