Hi Bengt and/or Linda, I have a couple of questions about invariance of the variance of a factor. First, is my understanding correct that, with say two groups, the variance of a factor has to be constrained to some value in one group (either by setting it equal to 1 or by setting one of its loadings equal to 1), and at least one of the loadings on the factor needs to be constrained in the second group? Second, is my understanding correct that we still won't know what the variance of the factor equals in the second group in an absolute sense, and that the estimate of the variance in the second group will only be relative to the value set by the constraint in the first group? Thanks very much! Rick Zinbarg
The metric of a factor can be set by either fixing a factor loading to one or fixing the factor variance to one. With multiple group analysis, if a factor loading is fixed to one in each group, a factor variance can be estimated for each group. I think this way of setting the metric would be best for multiple group analysis.
Wow, that was fast, thanks Linda! I am a bit confused though, as I thought fixing a factor loading to one fixes the factor variance to the variance of that indicator. I must be misunderstanding something, as I don't see how the variance can both be set equal to that of a particular indicator and be estimated. Any help clearing up my confusion would be appreciated.
Fixing a factor loading to one sets the metric of the factor to the scale of the y variable whose factor loading is fixed. This implies that as the factor changes one unit, y is expected to change the same amount. It does not mean that the variance of the factor is equal to the variance of y, nor that the variance of the factor is fixed at one.
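The point can be checked with a little arithmetic. The sketch below assumes the usual one-factor model for a continuous indicator, y = intercept + loading * factor + residual, so that Var(y) = loading^2 * Var(factor) + Var(residual). The numbers are made up for illustration and do not come from any data in this thread.

```python
def implied_indicator_variance(loading, factor_variance, residual_variance):
    """Model-implied variance of y under y = nu + loading * f + e."""
    return loading ** 2 * factor_variance + residual_variance

# Hypothetical values: marker loading fixed at 1, factor variance freely estimated.
factor_variance = 4.0
residual_variance = 5.0
var_y = implied_indicator_variance(1.0, factor_variance, residual_variance)

print(var_y)            # model-implied variance of the marker indicator
print(factor_variance)  # factor variance, a separate (estimated) quantity
```

With the loading fixed at one, the factor variance equals the variance of y only if the residual variance is zero; otherwise it is smaller and is estimated from the data.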
That helps, thanks very much! Just to make sure I understand, I am going to try to restate what you said in somewhat different terms.
A standardized loading equals an unstandardized loading times the ratio of the standard deviation of the factor to the standard deviation of y.
We can get a standardized loading directly from the item correlations, and the standard deviation of y (or at least our sample estimate of it) is observed. Thus, in the above equation, we are still left with two unknowns, and the equation can't be solved without setting a constraint on one of them. We can either constrain the sd of the factor (typically to 1) and estimate the unstandardized loading, OR we can constrain the unstandardized loading and estimate the sd of the factor. In the latter case (constraining an unstandardized loading), once we constrain one loading, we can plug the estimated sd of the factor into the equation for all the other items and thereby estimate the remaining unstandardized loadings. Does this sound close to accurate? Thanks again!
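A small numerical sketch of this relation may help. It assumes the standard formula for a continuous indicator, standardized loading = unstandardized loading * SD(factor) / SD(y), with Var(y) = loading^2 * Var(factor) + residual variance. All values are hypothetical. The two ways of setting the metric describe the same model, so they should give the same standardized loadings.

```python
import math

def standardized_loading(loading, factor_variance, residual_variance):
    """Standardized loading = loading * SD(factor) / SD(y)."""
    var_y = loading ** 2 * factor_variance + residual_variance
    return loading * math.sqrt(factor_variance) / math.sqrt(var_y)

# Hypothetical two-indicator example with residual variances 5.0 and 3.0.
# Metric set by fixing the first loading to 1 (factor variance estimated as 4.0).
marker = [standardized_loading(1.0, 4.0, 5.0),
          standardized_loading(0.5, 4.0, 3.0)]

# Same model with the factor variance fixed to 1: every loading is multiplied
# by the old factor SD (2.0) and the residual variances are unchanged.
unit_variance = [standardized_loading(2.0, 1.0, 5.0),
                 standardized_loading(1.0, 1.0, 3.0)]

print(marker)
print(unit_variance)
```

The two lists agree, which is the sense in which the choice of constraint only sets the metric and does not change the model.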
Hi Linda and Bengt, I am an Mplus beginner testing CFA measurement invariance between two groups. The variables are categorical and the parameterization is theta. As recommended in the handbook, I compared the configural invariance model (loadings and intercepts free across groups, factor means fixed at 0 for both groups, and residual variances of the observed variables fixed at 1 for both groups) with the more restrictive default model. (Part of the) input for the configural model:
MODEL:
  abt by cieqr_2 cieqr_1 cieqr_3 cieqr_7;
  auff by cieqr_12 cieqr_5 cieqr_10 cieqr_13;
  empa by cieqr_6 cieqr_9 cieqr_8 cieqr_10 cieqr_14;
  schue by cieqr_16 cieqr_6 cieqr_11 cieqr_15 cieqr_17 cieqr_18 cieqr_19 cieqr_20 cieqr_21 cieqr_22;
  CIEQR_15 WITH CIEQR_16;
  cieqr_2 with cieqr_3;
  [abt@0]; [auff@0]; [empa@0]; [schue@0];
  CIEQR_1@1; CIEQR_2@1; and so on... (all residual variances)
MODEL Offline: (one of the two groups) --> the same model with loadings of reference variables fixed at 1 + [CIEQR_1$1 CIEQR_1$2 CIEQR_2$1 and so on... (all thresholds)];
Configural model: CFI=.948; RMSEA=.069. Default model: CFI=.951; RMSEA=.066. So the more restrictive model has a better fit? What am I doing wrong? Does this have something to do with the restrictions on factor means and residual variances? How should I test for invariance of loadings and thresholds separately? I am quite desperate. Thanks a lot in advance and greetings, Marc
John Capman posted on Saturday, January 10, 2009 - 7:48 am
I realize this posting was quite a long time ago, but I was wondering if there is any feedback as to why the results presented by Marc Moller (above) occurred. I am curious, as I am having a similar problem with my data. That is, the more restrictive model is fitting better than the less restrictive one. Thanks in advance for your help.
If you are using the WLSMV estimator, you should be using the DIFFTEST option to test nested models. With WLSMV, only the p-value should be interpreted. The chi-square and degrees of freedom are adjusted to obtain a correct p-value and should not be interpreted in the regular way. If this is not the case, please send all relevant files and your license number to email@example.com.
John Capman posted on Saturday, January 10, 2009 - 8:23 am
I did not run the DIFFTEST option. I will try it and let you know. Otherwise, I will send the data with the license #.
Thanks for your prompt reply. Greatly appreciated.
I am conducting a CFA with 3 factors and 18 ordinal items at two time points using WLSMV estimation. I am interested in evaluating measurement invariance. I have demonstrated equal form and now want to look at constraining the factor loadings.
1) I constrained all of the factor loadings and conducted a diff test. Chi-square was significant:
Value 31.570
Degrees of Freedom 10**
P-value 0.0005
However, CFI is slightly higher with the loadings constrained and TLI and RMSEA do not change.
Without factor loadings constrained: CFI=0.959, TLI=0.986, RMSEA=0.056, SRMR=0.065
With factor loadings constrained: CFI=0.967, TLI=0.986, RMSEA=0.056, SRMR=0.071
How should I interpret this? My sample size is 686. Is this an example where chi-square may be overly sensitive?
2) To determine the source of the significant chi-square, would an appropriate approach be to consider each factor loading separately and thus conduct 18 diff tests (perhaps adjusted for multiple comparisons)?
3) Should I have constrained the thresholds at the same time as the factor loadings, or is this a separate test? It is my understanding from previous posts that it is difficult to disentangle the two. Does this invalidate the meaning of tests looking at factor loadings only when using WLSMV? Is it ever appropriate to consider factor loadings without looking at thresholds?
Chi-square is used for comparing nested models. With WLSMV, you need to use the DIFFTEST option to do this. CFI, TLI, etc. are not.
For categorical outcomes, we recommend looking at the thresholds and factor loadings together. The models we recommend are shown in Chapter 13 for multiple group analysis. The same principles apply across time.
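For readers who want to see where a reported P-value comes from, the sketch below recomputes the upper-tail chi-square probability for the DIFFTEST output quoted above (value 31.570 with 10 degrees of freedom). It uses the closed-form survival function that holds for even degrees of freedom, so no statistics library is needed. This only reproduces the P-value from the printed value and degrees of freedom; it is not a reimplementation of DIFFTEST, whose adjusted statistic has to come from Mplus itself.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Accumulate sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

p_value = chi2_sf_even_df(31.570, 10)
print(round(p_value, 4))
```

Rounded to four decimals, the result agrees with the 0.0005 shown in the output.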
JPower posted on Monday, March 09, 2009 - 11:28 am
Thanks for your reply. Just a couple of points to clarify:
1) I did use the DIFFTEST option to compare the nested models. Do you think the apparent disagreement between the DIFFTEST results and the change in the models' CFI and TLI could be due to chi-square being overly sensitive to sample size?
2) To determine the source of the significant DIFFTEST, would an appropriate approach be to consider each factor loading separately and thus conduct 18 diff tests (perhaps adjusted for multiple comparisons)?
JPower posted on Monday, March 09, 2009 - 11:40 am
I apologize for the multiple postings. Is it still necessary to consider the equivalence of thresholds even if I am not working with MEANSTRUCTURE? Thanks.
1. I would not make much of these minor differences in CFI and TLI. They are not useful for comparing models. I would go with the DIFFTEST results. Your sample size is not large for categorical data analysis.
2. You should look at factor loadings and thresholds together. The two of them determine the basic building block of categorical data analysis, the Item Characteristic Curve.
3. Means and thresholds are the default in Mplus with weighted least squares estimation. They cannot be excluded.
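To see why loadings and thresholds are looked at together, here is a small sketch of an item characteristic curve for a binary item. It assumes a probit link with P(u = 1 | f) = Phi((loading * f - threshold) / sqrt(residual variance)) and the residual variance fixed at 1. The parameter values and the two "groups" are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def icc(factor_value, loading, threshold, residual_variance=1.0):
    """Probability of endorsing a binary item at a given factor value."""
    z = (loading * factor_value - threshold) / math.sqrt(residual_variance)
    return normal_cdf(z)

# Hypothetical item in two groups: same loading, different threshold.
for f in (-1.0, 0.0, 1.0):
    group_a = icc(f, loading=1.2, threshold=0.0)
    group_b = icc(f, loading=1.2, threshold=0.6)
    print(f, round(group_a, 3), round(group_b, 3))
```

Equal loadings alone do not make the curves equal: the threshold shifts the curve along the factor axis, so the two groups have different endorsement probabilities at the same factor value.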
Dear Linda and Bengt, We are trying to conduct a measurement invariance analysis using the logic of testing increasingly restrictive models (e.g. configural, weak, strong, strict). We have a one factor model with two groups and five categorical indicators. Many of the worked examples (e.g. Gregorich’s worked example http://www.ats.ucla.edu/stat/Mplus/paperexamples/gregorich/default.htm) start with a model which freely estimates loadings, thresholds and residuals in both groups.
However, it appears that, if loadings and thresholds are free, then the residual variances need to be set at 1. Whilst our models run when the residual variances are set at one, setting them at one also means that we cannot follow the logic of testing models by systematically constraining lambdas, taus, then thetas between the groups, as the thetas are already fixed.
Is there any way of creating a baseline model using categorical indicators which can be used to assess the significance of changes in later models? Thanks
The steps to test for measurement invariance with categorical outcomes differ from those for continuous outcomes. The models we suggest are shown on pages 399-400 of the Mplus User's Guide and in our Topic 2 course handout.
Dear Dr. Muthen, I want to save the thresholds of my model, but Mplus is only giving the factor loadings. I'm using WLSM, so the thresholds should be the default. When I try the MLR estimator, it does give the thresholds, but I prefer using WLSM, so how can I get the thresholds of my model? Thanks and regards, Nanda Mooij
Comparison Model:
factor loadings estimated but constrained to be equal across groups
thresholds estimated but constrained to be equal across groups
item residual variances fixed @ 1
factor mean @ 0, factor variance @ 1 for identification
DIFFTEST for Model Comparison is significant, however in the "wrong" direction: The Comparison model has MUCH better fit than the Baseline Model.
Your Comparison model is not set up right and will be ill-fitting. This is because you have factor means fixed at zero and factor variances fixed at 1 in all groups (I assume that's what you mean). You should only do it in one (reference) group. Then you'd get the same model as the invariance model we propose.
Thank you very much. I have done that now, and still the comparison model has an improved fit compared to the baseline model. Not by as much as before, but significantly (DIFFTEST). To me, this seems very odd. How can this be?
SIMON MOON posted on Friday, July 13, 2012 - 8:44 am
I am conducting a measurement invariance analysis in Mplus. I ran a series of nested models to test measurement invariance with 1 factor (6 continuous items) and 5 groups (the sample size is unusually large, N > 600,000). Because the items were skewed, I used the MLM estimator. When I ran the scalar invariance model, a strange thing happened: its chi-square is smaller than the configural model's chi-square! I know that should be impossible, so I tried several things. When I ran the same analyses using the ML estimator, the chi-square problem disappeared. I have two questions. 1) Is the smaller chi-square a problem (or a characteristic?) of the MLM estimator, or did I do something wrong?
!Configural Model (Chi-square = 24776.964, df =45)
!Scalar Model (Chi-square = 17025.705, df=113)
2) Even if the items are skewed, I feel that using the ML estimator is OK because I will compare only nested models. Am I on the right track?
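One possible reading of this puzzle, offered as an assumption and not as a diagnosis of this particular run: MLM reports a scaled (Satorra-Bentler) chi-square, and scaled chi-squares from nested models cannot simply be subtracted, nor do they have to be ordered the way ML chi-squares are. The scaled difference is computed from each model's chi-square, degrees of freedom, and scaling correction factor. The sketch below implements that standard formula. The input numbers are entirely hypothetical, since the scaling correction factors for the models above were not posted.

```python
def scaled_chi2_difference(t0, df0, c0, t1, df1, c1):
    """Satorra-Bentler scaled chi-square difference.

    t0, df0, c0: scaled chi-square, df, and scaling correction factor of the
                 nested (more restrictive) model.
    t1, df1, c1: the same quantities for the comparison (less restrictive) model.
    Returns (scaled difference statistic, difference in df).
    """
    df_diff = df0 - df1
    if df_diff <= 0:
        raise ValueError("the nested model must have more degrees of freedom")
    cd = (df0 * c0 - df1 * c1) / df_diff   # scaling correction for the difference
    trd = (t0 * c0 - t1 * c1) / cd         # scaled difference statistic
    return trd, df_diff

# Hypothetical values: the nested model has the smaller scaled chi-square
# (80 vs. 90), yet the scaled difference is still positive.
trd, df_diff = scaled_chi2_difference(t0=80.0, df0=40, c0=1.5,
                                      t1=90.0, df1=30, c1=1.1)
print(round(trd, 3), df_diff)
```

In this made-up case the uncorrected chi-squares (scaled value times correction factor) are 120 and 99, so the more restrictive model does fit worse even though its scaled chi-square is smaller.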
I assume you are working with ordered polytomous items. If you want to separately test thresholds and loadings with polytomous items you should consult the Millsap-Tien (2004) article. We recommend changing both at the same time for simplicity.
Maria posted on Tuesday, January 22, 2013 - 12:57 am
Thanks Bengt, I will check out the paper. My items are binary, so could I still apply the polytomous approach?
If you are working with binary items, you cannot apply the polytomous approach. For binary items, the models to be compared are shown in the Topic 2 course handout under multiple group analysis and on pages 485-486 of the Version 7 Mplus User's Guide.
Maria posted on Tuesday, January 22, 2013 - 10:40 am
I have used the example in the manual and I find that the chi-square test is highly significant, χ²(120) = 570.2, p < .001, while the model fit indices do not worsen (compared to the model with all parameters free).
I have compared the loadings for boys and girls when they were allowed to vary (free model) and identified 3 loadings that may need to be freed. I freed them in separate steps, but p is still < .001 and the model fit is unchanged.
I tried freeing thresholds based on the modindices output, but this also has very little effect on the chi-square p-value.
I feel this could be due to the large sample size (2,700), as it appears that the chi-square for the DIFFTEST is sensitive to large sample sizes (Chen, 2007).
If this were the case, I'd be inclined to use the model fit indices to evaluate measurement invariance, although few authors seem to adopt this approach.
I was wondering if you think this is a reasonable approach and hoping you could recommend a paper on this.
Many many thanks
Maria posted on Tuesday, January 22, 2013 - 10:48 am
I should also probably mention that when I run my main analyses on a fully fixed model vs. the fully free model the results are very similar which is also why I am surprised by the chi-square test results...
I know this is an old thread but I am an Mplus novice with a question about constraining factor means when testing measurement equivalence. I want to test whether or not factor means are equal across cultures. Looking through the thread, it looks like you have to set the mean@0 and variance@1 for a reference group and freely estimate the mean and variance for the second group in order for the model to run, which mine does. But, how do I test whether the means are different across groups? When I constrain them to be equal to each other I get this message:
THE MODEL ESTIMATION TERMINATED NORMALLY
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 13.
THE CONDITION NUMBER IS -0.305D-10.
Is it reasonable to export factor scores for each culture and just run a t-test? Or is there another way to do this within Mplus?
The way to compare latent variable means is to compare the model where factor means are zero in all groups to the model where factor means are zero in one group and free in the other groups. Note that if you fix the factor variance to one, you should free the first factor loading, which is fixed to one by default.