I have a CFA with latent factors (each measured with multiple Likert-scale [-3;3] indicators). For the descriptive statistics, I would like to list mean and std. deviation for each latent factor. The TECH 4 output gives me 0 as mean for every factor. What should I do to get the "real" means and the standard deviations? Does the fact that I get 0 as mean for each factor mean that something with my model is wrong?
In cross-sectional studies, the means of latent variables are zero. In multiple group analysis or with repeated measures, the means of latent variables are zero in one group or at one time point and are estimated in the other groups or time points.
Thank you very much for your reply. As far as I understand, you described how Mplus handles factors in cross-sectional studies.
However, in many publications, I find means and standard deviations of latent factors in the descriptive statistics section.
When indicators are e.g. all skewed to the right (avg>0), then the mean of the latent variable should (e.g. in my example with Likert scales from -3 to 3) not be zero. Is there a way with Mplus to get this mean? Should I use an average of the indicators of each factor weighted by the indicators' yx-standardized factor loading and then calculate the mean and the std. dev. for each factor?
This is not how Mplus handles factors in cross-sectional studies, it is the conventional way to do this. A factor mean in a cross-sectional study has no meaning. You can't compare it to other factor means because there is no basis for comparison. It makes sense to compare factor means only across groups or across time after measurement invariance has been established.
I would imagine in the studies you mention, factor score means are being reported. Factor scores are generally not good approximations of true factor values.
The mean of a factor indicator is equal to the intercept of the factor indicator plus the factor loading times the mean of the factor. When the factor mean is zero, the mean of the factor indicator is equal to its intercept. This is why the factor mean can be zero even when the observed variable indicator mean is not zero.
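The relation in this reply can be checked with a small numeric sketch. The intercept, loading, and factor mean values below are made up for illustration, and this is plain Python arithmetic rather than Mplus input.

```python
def indicator_mean(intercept, loading, factor_mean):
    """Model-implied indicator mean: intercept + loading * factor mean."""
    return intercept + loading * factor_mean

# Hypothetical values for one Likert item scored from -3 to 3.
nu, lam = 1.2, 0.8

# With the factor mean fixed at zero, the indicator mean equals its intercept.
mean_when_alpha_zero = indicator_mean(nu, lam, 0.0)

# With a nonzero factor mean, the indicator mean shifts by loading * factor mean.
mean_when_alpha_half = indicator_mean(nu, lam, 0.5)

print(mean_when_alpha_zero, mean_when_alpha_half)
```

So a clearly nonzero observed item mean is absorbed by the intercept and does not contradict a factor mean of zero.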
Brewery Lin posted on Friday, April 06, 2012 - 12:12 am
In Byrne's book (2012), p. 211: "Appearing below these specifications, however, you will see the following: [F1@0 F2@0 F3@0] ... in structuring the input file for a configural model, it is necessary to void this default by fixing all factor means to zero."
Is it more appropriate to impose that constraint? Thank you.
The configural model has factor loadings free across groups, intercepts free across groups, and factor means fixed at zero in all groups. You may find the multiple group section of the Topic 1 course handout on the website useful. It shows all of the inputs needed to test for measurement invariance.
I am unsure whether this is true in my case, where I am interested in comparing the means of 2 latent variables, one assessed at time 0 (and fixed at 0) and the other assessed at time 1 (which is estimated). Does dividing the z-test value corresponding to the mean at time 1 by the variable's SD correspond to Cohen's d?
I am unsure as we are forcing the first mean to be 0...
This is a mean difference even if one mean is zero. The difference in means is the mean parameter that is not zero. You divide this value by the standard deviation of that latent variable which you can find in the results or TECH4.
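As a rough numeric sketch of this reply: the quantity divided by the SD is the estimated factor mean itself (not the z statistic). The estimates below are hypothetical, not taken from any output in this thread.

```python
import math

def standardized_mean_difference(free_mean, factor_variance):
    """Free factor mean (the other mean is fixed at 0) divided by the factor SD."""
    return free_mean / math.sqrt(factor_variance)

# Hypothetical estimates: time-1 factor mean 0.30, factor variance 0.64 (SD 0.8).
d = standardized_mean_difference(0.30, 0.64)
print(round(d, 3))
```

The factor variance would be read from the model results or TECH4, as described above.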
I have been looking at this a bit more and was discussing it with someone at work.
This conversation has brought to my attention the fact that I am not sure how Mplus constructs the factors and how the free mean at time 2 is calculated, especially since the latent factor consists of questionnaires with different scales.
Grateful for your response and please feel free to point me to any relevant reference.
To compare means, you should keep the loadings and thresholds constrained to be equal across time. The two models to use for testing mean differences across time are the model with means zero at both time points versus the model with the mean fixed at zero at one time point and free at the other.
With constrained loadings and thresholds, the difference in probabilities of the items at the two time points is expressed by the free factor mean.
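A minimal sketch of how a free factor mean shifts item probabilities when loadings and thresholds are held equal. It assumes a binary item, a probit link, and a residual variance of 1 for the underlying response variable; these assumptions and all numbers are illustrative and not taken from the thread.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def endorse_probability(threshold, loading, factor_mean, factor_var, resid_var=1.0):
    """Marginal P(y = 1) when y* = loading * factor + residual and y = 1 if y* > threshold."""
    sd = math.sqrt(loading ** 2 * factor_var + resid_var)
    return normal_cdf((loading * factor_mean - threshold) / sd)

# Same threshold and loading at both time points; only the factor mean differs.
p_time1 = endorse_probability(0.5, 1.0, 0.0, 1.0)  # factor mean fixed at 0
p_time2 = endorse_probability(0.5, 1.0, 0.4, 1.0)  # factor mean free (hypothetical 0.4)
print(round(p_time1, 3), round(p_time2, 3))
```

With everything else constrained, the higher factor mean at the second time point is the only thing that raises the endorsement probability.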
Francesca posted on Thursday, August 14, 2014 - 2:20 am
So to double check, with fixed loadings and thresholds:
1. do I need to assess the difftest for the model with both means set to 0 and the one with one mean at zero and the other free? is this what tells me if the means are different?
2. I am not 100% sure I understand what the free factor mean tells me - I thought this tested if the free mean is significantly different from 0 and therefore if data differ at followup (vs. baseline)
1. You have only 2 time points so the z-test for the estimated mean at time point 2 tells you if the means are significantly different at the 2 time points (because it tests against zero).
3. With equal loadings and thresholds the factor mean lets the probabilities at time 2 be different from those of time 1. ML finds the optimal factor mean at time 2 using iterative methods.
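The z-test against zero mentioned in point 1 can be sketched as follows. The estimate and standard error are hypothetical; the function simply divides one by the other and converts the result to a two-sided p-value under a standard normal reference distribution.

```python
import math

def z_test(estimate, standard_error):
    """Return the z statistic and its two-sided p-value for H0: parameter = 0."""
    z = estimate / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area of the standard normal
    return z, p

# Hypothetical time-2 factor mean of 0.30 with a standard error of 0.10.
z, p = z_test(0.30, 0.10)
print(round(z, 2), round(p, 4))
```

Because the time-1 mean is fixed at zero, this test of the time-2 mean is a test of the mean difference.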
Angela posted on Wednesday, September 03, 2014 - 10:34 pm
First, I must say THANK YOU for all your support. Without it, many of us would be full of questions with no answers.
I am trying to conduct a multiple group analysis with two groups. I also have two timepoints for each group. Right now, I see that the group one means of the latent variables for time one and intercepts for time two are fixed at zero. My questions are these:
1) Just confirming what I've read here- Am I correct in saying that if the group two means are significant (and then different from zero), they are then different from group one means?
2) How do I find out the extent of the mean shift for each of the groups?
2) I assume you have specified measurement invariance across time and across groups. TECH1 will show you what's been done. I also assume that the time 2 factor is regressed on the time 1 factor and that this regression coefficient varies across groups (again, check Tech1). Am I right so far?
Angela posted on Thursday, September 04, 2014 - 11:17 am
I have specified measurement invariance. (Does it matter, by the way, whether it is weak, strong, or strict invariance?)
Time 2 is regressed on Time 1; however, after testing equality of parameters through imposing constraints and using delta chi-square tests, the coefficients are the same for the two groups. How does that affect results?
"diff" gets you the estimate and SE of what you want. When you have a fixed zero for f1mean you just add
New(f1mean); f1mean = 0;
Angela posted on Thursday, September 04, 2014 - 3:55 pm
Great! That all makes sense to me except for the last little bit on the fixed zero. It's already at zero; I actually want it to be free so that I can actually get a difference between f1 and f2 (which is at zero).
I have data from 2 time points and 2 groups (treatment and control). I have four related questions:
1) I would like to test how large the mean difference across time is for each group. I used your syntax above to test the mean difference across time for each group (it works now). Now I'm wondering whether I should fix the f1 mean to zero or not. I have tried both, and the results are very different.
Diff1 (F1 mean is not fixed to zero) = 7.19**
Diff2 (F1 mean is fixed to zero, like in your syntax) = 31.77**
2) Is it possible to standardize this difference?
3) I also tested the mean difference across groups. I compared the unstandardized estimate of the latent mean (which is not fixed to zero) to its standard deviation. Is it correct that this value is equal to the standardized intercept?
4) Can I compare this difference to the difference across time for each group? Or is it on a different metric?
1) The f1 mean needs to be fixed at zero at time 1 for one of the two groups, otherwise the model is not identified (you should see an error message).
2) Yes, you can divide by its standard deviation as estimated in the model (SD = sqrt of variance). This SD can also be expressed in Model constraint. In its simplest case, the SD is the same at time 1 and 2 (check Tech4 for time 2), in which case you simply divide the factor mean difference by
3) See my answer to 2)
4) You can compare the across-time factor mean difference or standardized across-time factor mean difference across the groups.
To your answer 1: I examined the mean change from pre to posttest separately in each group. I tried now the model with the two groups. I fixed the mean to zero in the reference group. I only got the difference in the group which has both means free.
1) How can I get the difference for the reference group?
I used your syntax for the difference:
Model: f2 on f1 (slope); [f1] (f1mean); [f2] (f2int);
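The Model Constraint lines that define the difference are not shown in this thread, so the following is only a sketch of how the labeled parameters might combine. It assumes that, with f2 regressed on f1, the model-implied mean of f2 is f2int + slope * f1mean, and that the across-time difference is that mean minus f1mean. All values are hypothetical.

```python
def across_time_difference(f1mean, f2int, slope):
    """Assumed difference: model-implied f2 mean minus f1 mean."""
    f2mean = f2int + slope * f1mean  # implied mean of f2 when f2 is regressed on f1
    return f2mean - f1mean

# Reference group: f1mean fixed at zero, so the difference reduces to the f2 intercept.
diff_reference = across_time_difference(0.0, 0.25, 0.6)

# Other group: f1mean freely estimated (hypothetical 0.5).
diff_other = across_time_difference(0.5, 0.25, 0.6)

print(diff_reference, round(diff_other, 2))
```

Under these assumptions, the reference group's difference is available even though its f1 mean is fixed: it equals the estimated f2 intercept in that group.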
I am writing up the results for the latent mean differences across groups. Could you please check my reporting style? I don't want to report wrong results.
I have a multiple-group SEM (2 groups, 2 time-points). Time 2 is regressed on Time 1. I fixed the mean to zero in the reference group at both time-points.
Question 1: In the overview of analyses, I describe how I calculate the standardized difference "d" (by dividing the unstandardized estimate of the latent mean that is not fixed to zero by the standard deviation of the latent variable).
Is it correct to report d = XX, z = 4.90, p < .001?
Question 2: Is it correct to say there is no significant group difference at Time 1 when the z-test for the estimated mean at time point 1 tells me no significant difference at Time 1?
Above, it was asked: "Should I use an average of the indicators of each factor weighted by the indicators' yx-standardized factor loading and then calculate the mean and the std. dev. for each factor?"
and Dr. L. Muthen replied
"This is not how Mplus handles factors in cross-sectional studies, it is the conventional way to do this."
I searched for days for a reference for this procedure of forming aggregated scores from items based on the observed indicator scores weighted by the standardized factor loadings, and I could not find any reference indicating that this is usual. I could only find weighting in the classical EFA context. It was recommended to me to fit a congeneric single-factor model yielding standardized factor loadings and to use them as weights to account for the relative importance of an item for a factor.
My goal is to perform multiple regression based on composite scores.
Is it true that weighted means/sum are only appropriate in formative measurement?
Or is there a reference for the above procedure, that was termed "conventional"?
Your help is greatly appreciated. Thank you very much.
Best, Till
If you can't do it in a single step using SEM, I would use factor scores or multiply-imputed factor scores, also referred to as plausible values. See also our FAQ "Factor scores".
I can't think of a reference for summing items weighted by their loadings - in my experience it is a common procedure, particularly in the past - and there have been debates about whether or not that is better than "unit-weighting".
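To make the contrast between loading-weighted and unit-weighted composites concrete, here is a small sketch with made-up item scores and loadings. Dividing by the sum of the weights is a choice made here to keep the composite on the item scale; it is not something prescribed in this thread.

```python
def weighted_composite(item_scores, weights):
    """Weighted average of item scores; equal weights give the unit-weighted mean."""
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, item_scores)) / total_weight

# Hypothetical responses of one person to three items scored from -3 to 3,
# and hypothetical standardized loadings for those items.
items = [2, 1, -1]
loadings = [0.9, 0.6, 0.3]

unit = weighted_composite(items, [1, 1, 1])     # every item counts equally
weighted = weighted_composite(items, loadings)  # high-loading items count more
print(round(unit, 3), round(weighted, 3))
```

The two composites differ most when items with very different loadings also have very different scores, as in this example.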
Till Martin posted on Thursday, March 26, 2015 - 2:14 pm
Dear Dr. Muthen,
thank you very much for your reply. I think I will use the factor scores then, though I have found a procedure by Rowe (2006) using factor score coefficients as weights to account for the relative importance of an item for a factor.
I used a syntax by Raykov (2009) but I would like to use WLSMV instead of MLR:
But I get this error message:
WARNING in ANALYSIS command Estimator WLSMV is not available for analysis with all continuous variables. Default estimator will be used. 1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
The syntax is (6 Likert items, non-normal, 99 cases, no missings)
VARIABLE: NAMES ARE DEE1 DEE2 DEE3 DEE4 DEE5 DEE6; USEVARIABLES ARE DEE1 DEE2 DEE3 DEE4 DEE5 DEE6;
We don't make WLSMV available in this case. There is no reason not to choose, say, MLR.
Seth Frndak posted on Tuesday, March 29, 2016 - 10:04 am
I have come across the same issue that Till Martin has. I am trying to estimate a continuous growth model with WLSMV, but I receive the error: "Estimator WLSMV is not available for analysis with all continuous variables."
The reason I am trying to estimate the continuous model in WLSMV is because I am creating a two-part model. My dichotomous model will not converge in MLR so I am using WLSMV. When I combine the continuous and dichotomous models, the continuous model is estimated with WLSMV and not the estimator that was used in the model building process (MLR).