Message/Author 


If the measurement model does not fit the data for a specific group (e.g. females) but fits for males, can you still proceed with constraining the model across groups? 


No, you would have to find a well-fitting model for each group. 


Hi Bengt and/or Linda, I have a couple of questions about invariance concerning the variance of a factor. Is my understanding correct that, if one has, say, two groups, the variance of a factor has to be constrained to some value in one group (either by setting it equal to 1 or by setting one of its loadings equal to 1), and at least one of the loadings on the factor needs to be constrained in the second group? Second, is my understanding correct that we still won't know what the variance of the factor equals in the second group in an absolute sense, but rather that the estimate of the variance in the second group will only be relative to the value set by constraint in the first group? Thanks very much! Rick Zinbarg 


The metric of a factor can be set by either fixing a factor loading to one or fixing the factor variance to one. With multiple group analysis, if a factor loading is fixed to one in each group, a factor variance can be estimated for each group. I think this way of setting the metric would be best for multiple group analysis. 
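For what it's worth, a minimal multiple-group sketch of this setup (variable, factor, and group names are hypothetical):

```
VARIABLE:  NAMES = y1-y3 gender;
           GROUPING = gender (1 = male  2 = female);
MODEL:     f BY y1@1 y2 y3;   ! loading of y1 fixed at 1 sets the metric;
                              ! the variance of f is then freely
                              ! estimated in each group
```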


Wow, that was fast. Thanks, Linda! I am a bit confused, though, as I thought fixing a factor loading to one fixes the factor variance to the variance of that indicator. Thus, I must be misunderstanding something, as I don't see how the variance can both be set equal to that of a particular indicator and be estimated. Any help clearing up my confusion would be appreciated. 


Fixing a factor indicator to set the metric of the factor uses the scale of the y variable that has the factor loading fixed to one. This implies that as the factor changes one unit, y is expected to change the same amount. This does not mean that the variance of the factor is equal to the variance of y nor that the variance of the factor is fixed at one. 


That helps, thanks very much! Just to make sure I understand, I am going to try to restate what you said in somewhat different terms. A standardized loading equals an unstandardized loading times the ratio of the standard deviation of the factor divided by the standard deviation of y. We can get a standardized loading directly from the item correlations, and the standard deviation of y (or at least our sample estimate of it) is observed. Thus, in the above equation, we are still left with two unknowns, and the equation can't be solved in the absence of setting a constraint on one of the two unknowns. We can either constrain the SD of the factor (typically to 1) and estimate the unstandardized loading, OR we can constrain an unstandardized loading and estimate the SD of the factor. In the latter case (constraining an unstandardized loading), once we constrain one loading, we can use the estimated SD of the factor to plug into the equation for all the other items and thereby estimate the remaining unstandardized loadings. Does this sound close to accurate? Thanks again! 
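In symbols, with the measurement equation y = λη + ε, the relation at issue can be written as follows (standard notation; note that the factor SD belongs in the numerator):

```latex
% standardized loading in terms of the unstandardized one
\lambda_{\text{std}} = \lambda \cdot \frac{\sigma_{\eta}}{\sigma_{y}}
% \lambda_{\text{std}} (from the correlations) and \sigma_{y} (observed)
% are known, leaving two unknowns, \lambda and \sigma_{\eta}; fixing either
% one (\sigma_{\eta} = 1, or \lambda = 1 for one indicator) identifies
% the other.
```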


This is not how I think about it but it may be plausible. 


Hi Linda and Bengt, I am an Mplus beginner and testing for CFA measurement invariance between two groups. Variables are categorical, parameterization is Theta. As recommended in the handbook, I compared the configural invariance model (loadings and intercepts free across groups, factor means fixed at 0 for both groups, and residual variances of observed variables fixed at 1 for both groups) with the more restrictive default model. (Part of) the input for the configural model:

MODEL:
abt BY cieqr_2 cieqr_1 cieqr_3 cieqr_7;
auff BY cieqr_12 cieqr_5 cieqr_10 cieqr_13;
empa BY cieqr_6 cieqr_9 cieqr_8 cieqr_10 cieqr_14;
schue BY cieqr_16 cieqr_6 cieqr_11 cieqr_15 cieqr_17 cieqr_18 cieqr_19 cieqr_20 cieqr_21 cieqr_22;
CIEQR_15 WITH CIEQR_16;
cieqr_2 WITH cieqr_3;
[abt@0]; [auff@0]; [empa@0]; [schue@0];
CIEQR_1@1; CIEQR_2@1; and so on... (all residual variances)

MODEL Offline: (one of the two groups) the same model with loadings of reference variables fixed at 1, plus [CIEQR_1$1 CIEQR_1$2 CIEQR_2$1 and so on... (all thresholds)];

Configural model: CFI = .948, RMSEA = .069. Default model: CFI = .951, RMSEA = .066. So the more restrictive model has a better fit? What am I doing wrong? Does this have something to do with the restrictions for factor means and residual variances? How should I test for invariance of loadings and thresholds separately? I am quite desperate. Thanks a lot in advance and greetings, Marc 


Please send all relevant files and your license number to support@statmodel.com. 

John Capman posted on Saturday, January 10, 2009 - 7:48 am



I realize this posting was quite a long time ago, but I was wondering if there may be any feedback as to why the result reported by Marc Moller (above) occurred. I am curious as I am having a similar problem with my data. That is, the more restrictive model is fitting better than the less restrictive one. Thanks in advance for your help. Sincerely, John 


If you are using the WLSMV estimator, you should be using the DIFFTEST option to test nested models. With WLSMV, only the p-value should be interpreted. The chi-square and degrees of freedom are adjusted to obtain a correct p-value and should not be interpreted in the regular way. If this is not the case, please send all relevant files and your license number to support@statmodel.com. 
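For reference, DIFFTEST takes two runs; a sketch (the derivatives file name is arbitrary):

```
! Run 1: the less restrictive (H1) model, saving derivatives
SAVEDATA:  DIFFTEST = deriv.dat;

! Run 2: the more restrictive (H0) model, reading them back
ANALYSIS:  ESTIMATOR = WLSMV;
           DIFFTEST = deriv.dat;
```

The chi-square difference test then appears in the Run 2 output.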

John Capman posted on Saturday, January 10, 2009 - 8:23 am



I did not run the DIFFTEST option. I will try it and let you know. Otherwise, I will send the data with the license #. Thanks for your prompt reply. Greatly appreciated. 

JPower posted on Sunday, March 08, 2009 - 2:05 pm



I am conducting a CFA with 3 factors and 18 ordinal items at two time points using WLSMV estimation. I am interested in evaluating measurement invariance. I have demonstrated equal form and now want to look at constraining the factor loadings.

1) I constrained all of the factor loadings and conducted a DIFFTEST. Chi-square was significant: Value 31.570, Degrees of Freedom 10, p-value 0.0005. However, CFI is slightly higher with the loadings constrained, and TLI and RMSEA do not change. Without factor loadings constrained: CFI = 0.959, TLI = 0.986, RMSEA = 0.056, SRMR = 0.065. With factor loadings constrained: CFI = 0.967, TLI = 0.986, RMSEA = 0.056, SRMR = 0.071. How should I interpret this? My sample size is 686. Is this an example where chi-square may be overly sensitive?

2) To determine the source of the significant chi-square, would an appropriate approach be to consider each factor loading separately and thus conduct 18 DIFFTESTs (perhaps adjusted for multiple comparisons)?

3) Should I have constrained the thresholds at the same time as the factor loadings, or is this a separate test? It is my understanding from previous posts that it is difficult to disentangle the two. Does this invalidate the meaning of tests looking at factor loadings only when using WLSMV? Is it ever appropriate to consider factor loadings without looking at thresholds? Thanks. 


Chi-square is used for comparing nested models; with WLSMV, you need to use the DIFFTEST option to do this. CFI, TLI, etc. are not used for comparing nested models. For categorical outcomes, we recommend looking at the thresholds and factor loadings together. The models we recommend are shown in Chapter 13 for multiple group analysis. The same principles apply across time. 

JPower posted on Monday, March 09, 2009 - 11:28 am



Thanks for your reply. Just a couple of points to clarify: 1) I did use the DIFFTEST option to compare the nested models. Do you think the apparent disagreement between the DIFFTEST results and the change in the models' CFI/TLI could be due to chi-square being overly sensitive to sample size? 2) To determine the source of the significant DIFFTEST, would an appropriate approach be to consider each factor loading separately and thus conduct 18 DIFFTESTs (perhaps adjusted for multiple comparisons)? Thanks again. 

JPower posted on Monday, March 09, 2009 - 11:40 am



I apologize for the multiple postings. Is it still necessary to consider the equivalence of thresholds even if I am not working with MEANSTRUCTURE? Thanks. 


1. I would not make much of these minor differences in CFI and TLI. They are not useful for comparing models. I would go with the DIFFTEST results. Your sample size is not large for categorical data analysis. 2. You should look at factor loadings and thresholds together. The two of them determine the basic building block of categorical data analysis, the item characteristic curve. 3. Means and thresholds are the default in Mplus with weighted least squares estimation. They cannot be excluded. 
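As a sketch of point 2: under the probit framework used with weighted least squares and the Theta parameterization, the item characteristic curve for a binary item combines the loading and the threshold (standard notation, not from this thread):

```latex
% probability of endorsing binary item u_j given the factor \eta
P(u_j = 1 \mid \eta) = \Phi\!\left( \frac{\lambda_j \eta - \tau_j}{\sqrt{\theta_j}} \right)
% \Phi = standard normal CDF, \lambda_j = loading, \tau_j = threshold,
% \theta_j = residual variance
```

Since the curve depends on the loading and threshold jointly, invariance of the two is naturally tested together.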


Dear Linda and Bengt, We are trying to conduct a measurement invariance analysis using the logic of testing increasingly restrictive models (e.g. configural, weak, strong, strict). We have a one-factor model with two groups and five categorical indicators. Many of the worked examples (e.g. Gregorich's worked example http://www.ats.ucla.edu/stat/Mplus/paperexamples/gregorich/default.htm) start with a model which freely estimates loadings, thresholds, and residuals in both groups. However, it appears that, if loadings and thresholds are free, then the residuals need to be set at 1. While our model runs when residuals are set at one, setting them at one also means that we cannot follow the logic of testing models by systematically constraining lambdas, then taus, then thetas between the groups, as the thetas are already fixed. Is there any way of creating a baseline model using categorical indicators which can be used to assess the significance of changes in later models? Thanks 


The steps to test for measurement invariance with categorical outcomes differ from those for continuous outcomes. The models we suggest are shown on pages 399-400 of the Mplus User's Guide and in our Topic 2 course handout. 


Dear Dr. Muthen, I am trying to test for invariance by gender. I keep getting this message:

THE MODEL ESTIMATION TERMINATED NORMALLY. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 35. THE CONDITION NUMBER IS 0.398D-16.

These are the models, based on handout Topic 2, page 169, for the non-invariance model. Could you please give me a hand? Thank you very much, Fernando

MODEL:
su BY y1alc y1cig y1mar;
pa BY eng_r hist_r math_r sci_r;
[su-pa@0]; !Setting means to zero
{y1cig@1 y1alc@1 y1mar@1 eng_r@1 math_r@1 hist_r@1 sci_r@1};

MODEL female:
su BY y1alc y1cig y1mar;
pa BY eng_r hist_r math_r sci_r;
[y1alc$1 y1cig$1 y1mar$1 eng_r$1 math_r$1 hist_r$1 sci_r$1]; 


When you mention the first factor indicator in MODEL female, you free its loading, causing the model not to be identified. You should not mention the first factor indicator in group-specific MODEL commands. 
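Applied to the input above, the group-specific MODEL command would mention only the non-first indicators, e.g.:

```
MODEL female:
  su BY y1cig y1mar;           ! y1alc omitted so its loading stays fixed at 1
  pa BY hist_r math_r sci_r;   ! eng_r omitted for the same reason
  [y1alc$1 y1cig$1 y1mar$1 eng_r$1 math_r$1 hist_r$1 sci_r$1];
```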


thank you! works pretty well now. Fernando 


Dear Dr. Muthen, I want to save the thresholds of my model, but Mplus is only giving the factor loadings. I'm using WLSM, so the thresholds should be the default. When I try the MLR estimator, it does give the thresholds, but I prefer using WLSM. How can I get the thresholds of my model? Thanks and regards, Nanda Mooij 


If you are using an older version of Mplus, add TYPE=MEANSTRUCTURE to the ANALYSIS command. If not, send the full output and your license number to support@statmodel.com. 


Dear Dr. Muthen, I am testing measurement invariance for one scale through multiple-group analysis (male, female) with ordered categorical data, using WLSMV and the Theta parameterization.

Baseline model: factor loadings estimated; thresholds free; item residual variances fixed @1; factor mean @0 and factor variance @1 for identification.

Comparison model: factor loadings estimated but constrained to be equal across groups; thresholds estimated but constrained to be equal across groups; item residual variances fixed @1; factor mean @0 and factor variance @1 for identification.

DIFFTEST for the model comparison is significant, but in the "wrong" direction: the comparison model has MUCH better fit than the baseline model. How can this be? Thank you, Sabine 


Your comparison model is not set up right and will be ill-fitting. This is because you have factor means fixed at zero and factor variances fixed at 1 in all groups (I assume that's what you mean). You should only do that in one (reference) group. Then you'd get the same model as the invariance model we propose. 
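A sketch of that identification scheme (factor and group names hypothetical; the first group serves as reference):

```
MODEL:        f BY u1-u5;     ! loadings/thresholds held equal across groups
MODEL male:   [f@0];  f@1;    ! reference group: factor mean 0, variance 1
MODEL female: [f];    f;      ! factor mean and variance freely estimated
```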


Thank you very much. I have done that now, and still the comparison model has improved fit compared to the baseline model. Not by as much as before, but significantly (by DIFFTEST). To me, this seems very odd. How can this be? 


Please send the two outputs and your license number to support@statmodel.com. 

SIMON MOON posted on Friday, July 13, 2012 - 8:44 am



I am conducting a measurement invariance analysis in Mplus. I ran a series of nested models to test measurement invariance with 1 factor (6 continuous items) and 5 groups (the N is unusually high: >600,000). Because the items were skewed, I used the MLM estimator. When I ran the scalar invariance model, a strange thing happened: its chi-square is smaller than the configural model's chi-square! I know this should be impossible, so I tried several things. When I ran the same analyses using the ML estimator, the chi-square problem disappeared. I have two questions. 1) Is the smaller chi-square a problem (or a characteristic?) of the MLM estimator, or did I do something wrong? Configural model: chi-square = 24776.964, df = 45. Scalar model: chi-square = 17025.705, df = 113. 2) Even if the items are skewed, I feel that using the ML estimator is OK because I will compare only nested models. Am I on the right track? Thanks in advance! 


This can be answered well only by looking at your two outputs. You can send them to Support. 

Maria posted on Monday, January 21, 2013 - 10:27 am



Dear Bengt, I am looking at measurement invariance on my latent variables across males and females. My items are categorical and I am using the WLSMV estimator. I run:

1. a model with free loadings and thresholds;
2. a model with free thresholds and fixed loadings. DIFFTEST shows me that step 2 is a worse fit, and by looking at the modindices I can improve the fit by freeing one loading;
2b. a model with free thresholds and all loadings fixed BUT the one identified by modindices;
3. a model with fixed thresholds and loadings (apart from the one identified in step 2). I compare steps 3 and 2b, and the DIFFTEST shows that step 3 offers a worse fit;
3b. looking at the modindices, I free one threshold and re-run the analysis. DIFFTEST still shows that the model in 3b is a worse fit than 2b; however, the model fit indices are equivalent.

I went on to free a number of thresholds (based on the modindices), but this did not make my DIFFTEST non-significant and did not improve the model fit (CFI/TLI). Could I argue that, as the model fit is good enough by step 3 (same as step 2b), I am confident that it is a good model? Or is there a better way to identify the thresholds that need exploring other than modindices? Thanks for your help 


I assume you are working with ordered polytomous items. If you want to separately test thresholds and loadings with polytomous items, you should consult the Millsap-Tien (2004) article. We recommend changing both at the same time for simplicity. 

Maria posted on Tuesday, January 22, 2013 - 12:57 am



Thanks Bengt, I will check out the paper. My items are binary; could I still apply the polytomous approach? 


If you are working with binary items, you cannot apply the polytomous approach. For binary items, the models to be compared are shown in the Topic 2 course handout under multiple group analysis and on pages 485-486 of the Version 7 Mplus User's Guide. 

Maria posted on Tuesday, January 22, 2013 - 10:40 am



Thanks Linda, I have used the example in the manual, and I find that the chi-square test is highly significant, chi-square(120) = 570.2, p < .001, while the model fit indices do not worsen compared to the model with all free parameters. I have compared the loadings for boys and girls when they were allowed to vary (free model) and identified 3 loadings that may need to be freed. I do this in separate steps, but still p < .001 and the model fit is unchanged. I tried freeing thresholds based on the modindices output, but this also has very little effect on the chi-square p-value. I feel this could be due to the large sample size (2,700), as it appears that the chi-square for the DIFFTEST is sensitive to large sample size (Chen, 2007). If this were the case, I'd be inclined to use the model fit indices to evaluate measurement invariance; a few authors seem to adopt this approach. I was wondering if you think this is a reasonable approach, and hoping you could recommend a paper on this. Many many thanks 

Maria posted on Tuesday, January 22, 2013 - 10:48 am



I should also probably mention that when I run my main analyses on a fully fixed model vs. the fully free model, the results are very similar, which is also why I am surprised by the chi-square test results. 


With binary items, the model with only free factor loadings is not identified. We will have a FAQ with the details about this on the website shortly. 

Maria posted on Tuesday, January 22, 2013 - 1:09 pm



Thanks Linda. Is there anything else I could do in the meantime to check model invariance? Is it sufficient to report that running the model completely separately for males and females (free model) produces equivalent results to using a multigroup approach (fixed model)? I am writing up the results for a paper and I am concerned about reviewers picking up on this. Thanks 


Please send the outputs that show these equivalent results and your license number to support@statmodel.com. 


Hi Linda and/or Bengt, I know this is an old thread, but I am an Mplus novice with a question about constraining factor means when testing measurement equivalence. I want to test whether or not factor means are equal across cultures. Looking through the thread, it looks like you have to set the mean @0 and variance @1 for a reference group and freely estimate the mean and variance for the second group in order for the model to run, which mine does. But how do I test whether the means are different across groups? When I constrain them to be equal to each other, I get this message: THE MODEL ESTIMATION TERMINATED NORMALLY. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 13. THE CONDITION NUMBER IS 0.305D-10. Is it reasonable to export factor scores for each culture and just run a t-test? Or is there another way to do this within Mplus? Thank you so much! 


The way to compare latent variable means is to compare the model where factor means are zero in all groups to the model where factor means are zero in one group and free in the other groups. Note that if you fix the factor variance to one, you should free the first factor loading, which is fixed to one by default. 
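A sketch of the two models being compared (factor, indicator, and group names hypothetical; the factor variance is fixed at 1, so the first loading is freed with `*`):

```
! Model A: factor mean zero in all groups
MODEL:     f BY y1* y2-y5;   ! first loading freed...
           f@1;  [f@0];      ! ...variance fixed at 1, mean at 0
MODEL g2:  [f@0];            ! mean also fixed at zero in the second group

! Model B: factor mean free in the non-reference group
MODEL:     f BY y1* y2-y5;
           f@1;  [f@0];
MODEL g2:  [f];              ! mean freely estimated in group g2
```

A chi-square difference test (or DIFFTEST with WLSMV) between A and B then tests equality of the factor means, without the need to export factor scores.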
