

I set up a structural equation model to test measurement invariance over two conditions in four groups, using the GROUPING option. This leads to poor model fit, but if I leave the option out, the model fits better. Can you help me explain these results? Another question: I built the nested models starting with configural invariance. To make Mplus test the configural model, I have to fix the first factor loading to 1, so I have to fix the first factor loadings in the weak, strong and strict invariance models too, right? With these restrictions in place, I get poor model fit. Is there any way to avoid restricting the first factor loadings? Here are my commands:
usevar = P_NEO_1 P_NEO_6 P_NEO_11 P_NEO_16 P_NEO_21 P_NEO_26 P_NEO_31 P_NEO_36 P_NEO_41 P_NEO_46 P_NEO_51 P_NEO_56 C_NEO_1 C_NEO_6 C_NEO_11 C_NEO_16 C_NEO_21 C_NEO_26 C_NEO_31 C_NEO_36 C_NEO_41 C_NEO_46 C_NEO_51 C_NEO_56 Reihe;
missing = all(99);
GROUPING IS Reihe (0=g1 1=g2 2=g3 3=g4);
MODEL:
N_P BY P_NEO_1 P_NEO_6 P_NEO_11 P_NEO_16 P_NEO_21 P_NEO_26 P_NEO_31 P_NEO_36 P_NEO_41 P_NEO_46 P_NEO_51 P_NEO_56;
N_C BY C_NEO_1 C_NEO_6 C_NEO_11 C_NEO_16 C_NEO_21 C_NEO_26 C_NEO_31 C_NEO_36 C_NEO_41 C_NEO_46 C_NEO_51 C_NEO_56;
[N_P-N_C@0]; N_P-N_C@1;
[P_NEO_1-P_NEO_56]; [C_NEO_1-C_NEO_56];
N_P WITH N_C;


When you use the GROUPING option, intercepts and factor loadings are held equal as the default. When you don't, the full sample is used and there are no equalities imposed. You can set the metric by fixing the factor variance to one instead of the first factor loading to one: f BY y1* y2 y3; f@1; See the Topic 1 course handout on the website under the topic Multiple Group Analysis. The inputs for measurement invariance are given there. 


Thank you very much for your answer. That helped me a lot. So I have to test my groups against each other. Can you tell me how to use only part of the data within one variable, so that I can test, within one variable, the group of persons 1 to 73 against the group of persons 143 to 202? I'd be happy for any advice.


Use the USEVARIABLES option to use part of the data. 


I did use the USEVARIABLES option, but all of my groups are in one variable and I need to test the model fit, for example, within only one group. If I use four variables (one for each group) instead of one, Mplus says "FATAL ERROR" because the data matrix is too big (more than 350 variables).


Please send the outputs and your license number to support@statmodel.com. 


The FATAL ERROR was due to a programming mistake on my part, which I have found and corrected. Thank you very much for your offer. Now, to test the sequence effects, I need to override the default that holds the factor loadings and intercepts equal across the groups. How can I test a configural or weak model of measurement invariance? The * option only works for different conditions with different variables loading on different factors, doesn't it? Here are my commands:
GROUPING IS Reihe (1=t1 2=t2);
DEFINE: IF (Reihe==0 OR Reihe==1) THEN Reihe=1; IF (Reihe==2 OR Reihe==3) THEN Reihe=2;
MODEL: N_P BY P_NEO_1* (a) P_NEO_6 (b) P_NEO_11 (c) P_NEO_16 (d) P_NEO_21 (e) P_NEO_26 (f) P_NEO_31 (g) P_NEO_36 (h) P_NEO_41 (i) P_NEO_46 (j) P_NEO_51 (k) P_NEO_56;(l)
[P_NEO_1-P_NEO_56]; [N_P@0]; N_P@1;


See the Topic 1 course handout under multiple group analysis. 


Dear Mplus team, I am trying to test gender invariance in a path analysis model with continuous variables. I have looked at your Topic 1 handout, but I am confused as to what exactly I should specify in my input file. The only thing I added to test whether the models differ by gender is the GROUPING option. Here is the input I have so far:
...
VARIABLE: MISSING ARE ALL (999);
NAMES ARE.....
USEVAR ARE Sexe azagg azpop bzpop czpop dzpop aengcpt7 bengcpt7 cengcpt7 dengcpt7 eengcpt7 azaggami bzaggami czaggami dzaggami;
GROUPING IS Sexe (0 = filles 1 = garçons);
ANALYSIS: ESTIMATOR = MLR;
MODEL:
dzaggami ON cengcpt7 czaggami czpop;
czaggami ON bengcpt7 bzaggami bzpop;
bzaggami ON aengcpt7 azaggami azpop;
dzpop ON czpop czaggami cengcpt7;
czpop ON bzpop bzaggami bengcpt7;
bzpop ON azpop azaggami aengcpt7;
eengcpt7 ON dengcpt7 dzaggami dzpop;
dengcpt7 ON cengcpt7 czaggami czpop;
cengcpt7 ON bengcpt7 bzaggami bzpop;
bengcpt7 ON aengcpt7 azaggami azpop;
azpop ON azagg;
azaggami ON azagg;
aengcpt7 ON azagg;
Is there anything else I should add to test this correctly? Many thanks in advance for your help! Genevieve Taylor


The GROUPING option should be used in all but the first step of testing for measurement invariance. The first step is to run the model separately for each group. The correct inputs are shown in the Topic 1 course handout under Multiple Group Analysis. Please refer to these inputs. 


Hi Dr Muthen, Thanks for your response. I understand the handout now. I will follow these steps for my analysis. Many thanks, Geneviève 


Dear Mplus team, I am trying to understand the new approach to measurement invariance (approximate measurement invariance) implemented in Mplus Version 7.11.
Q1. I ran a two-group CFA model for testing measurement invariance based on Example 5.33. Under DIFFERENCE OUTPUT, I got the average of the estimates, the standard deviation, and the deviations from the mean for each parameter and each group. I specified the prior for the difference as, for example, N(0,0.01). For example, I got Average: 1.422, SD: 0.031, Deviation from the mean: 0.03 (Lambda1), 0.03 (Lambda2). How do I know whether the deviations from the mean for Lambda1 and Lambda2 are significant or not?
Q2. The Muthen (2013) paper says that "With only two groups/timepoints, the difference relative to the average can be augmented by the difference across the two groups/timepoints which can be expressed in MODEL CONSTRAINT." If I want to test approximate measurement invariance between two groups, what kind of model constraint do I need? Thanks!!


Q1. There is an asterisk if the value is significant. Q2. Use parameter labels a and b in the MODEL command, where those are the two parameters in question. Then use MODEL CONSTRAINT to define NEW(diff); diff = a-b;


Dear Dr. Muthen, Thanks a lot for your answer. I have one more question. Can I test approximate measurement invariance in a multilevel context? For example, can I conduct an approximate measurement invariance test for between-level factor loadings? I have tried to do it by extending the ex5.33 code, but I couldn't. Please let me know. Thanks a lot in advance.


It is in principle possible but is quite complex, given that the DO-DIFF options haven't been tailored to multilevel applications. I would not recommend trying.

Elina Dale posted on Friday, February 07, 2014 - 12:15 pm



Dear Dr. Muthen, I would like to test measurement invariance where my loadings are constant across groups, but thresholds are allowed to vary. As per Ex. 5.16, since I am allowing thresholds to vary across groups, I fixed the scale factors to 1. I don't understand what is wrong with my input:
CATEGORICAL = i1-i9;
GROUPING IS g (1 = male 2 = female);
CLUSTER = clust;
MISSING = ALL (9999);
Analysis: TYPE = COMPLEX;
Model: f1 BY i1 i2 i3; f2 BY i4 i5 i6; f3 BY i7 i8 i9;
Model female: [i1$1 i2$1 i3$1 i4$1 i5$1 i6$1 i7$1 i8$1 i9$1 i1$2 i2$2 i3$2 i4$2 i5$2 i6$2 i7$2 i8$2 i9$2 i1$3 i2$3 i3$3 i4$3 i5$3 i6$3 i7$3 i8$3 i9$3]; {i1@1 i2@1 i3@1 i4@1 i5@1 i6@1 i7@1 i8@1 i9@1};
I keep getting an error message:
THE MODEL ESTIMATION TERMINATED NORMALLY. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 112. THE CONDITION NUMBER IS 0.175D-16.
I have checked this parameter and it is Alpha for F1, which is an intercept, I guess. Thank you!!!


If you free the thresholds, you must fix the factor means to zero.

deana desa posted on Tuesday, March 18, 2014 - 6:56 am



I would like to know whether factor scores computed with the alignment method and those from the convenience features (i.e., configural, metric or scalar) are directly comparable or related. How strongly are these scores expected to be correlated? Is there any literature I can refer to on the scores computed from these different techniques?


No, the factor scores from alignment are not the same as those from configural, metric, or scalar. They start from a configural model and maximize measurement invariance. The correlation between the different factor scores would depend on the amount of measurement invariance. I doubt there is any literature on this yet. 

Bilge Sanli posted on Friday, August 08, 2014 - 11:25 am



Drs. Muthen and Muthen, Using the National Identity Module of the ISSP, I am adopting a two-level EFA approach in my exploratory research on different dimensions of nationhood and their contextual and individual predictors. My cluster variable is country, and my variables are all at the ordinal level of measurement. In a subsequent two-level SEM analysis, (upon your suggestion in an earlier inquiry) I will use the factor scores I obtained from the initial two-level EFA as dependent variables and regress them on independent variables at both the individual and contextual levels. My question is the following: since I am engaging in a cross-national analysis, should I be establishing measurement invariance first? If I am to do this, is multiple-group CFA the only option? In this scenario, how should one take into account the multilevel nature of the data? Once I establish measurement invariance, shall I proceed with the two-level SEM? Apologies for the deluge of questions. I'd greatly appreciate your help. Thank you very much in advance.


These are good questions. I think you will be interested in reading the paper on our website (see Recent papers): Muthén, B. & Asparouhov, T. (2013). New methods for the study of measurement invariance with many groups. Mplus scripts are available here. This paper compares the fixed-effect multiple-group approach with the random-effect multilevel approach. It turns out that 2-level FA can be seen as a random intercept model, that is, measurement non-invariance that still makes factor comparisons possible.


I am struggling to conduct an analysis of measurement invariance in a 2-group CFA with categorical indicators, each with three categories. I've included the code for the model in which factor loadings and thresholds are freed between the two groups:
GROUPING is SEX (1=male 0=female);
MODEL:
FACTOR1 BY PFMS10 PFMS12 PFMS13 PFMS14 PFMS16 PFMS19 PFMS21 PFMS26;
FACTOR2 BY PFMS7 PFMS9 PFMS18 PFMS22 PFMS23 PFMS24 PFMS25 PFMS29 PFMS30 PFMS31 PFMS32 PFMS33;
[FACTOR1@0 FACTOR2@0];
MODEL female:
FACTOR1 BY PFMS10 PFMS12 PFMS13 PFMS14 PFMS16 PFMS19 PFMS21 PFMS26;
FACTOR2 BY PFMS7 PFMS9 PFMS18 PFMS22 PFMS23 PFMS24 PFMS25 PFMS29 PFMS30 PFMS31 PFMS32 PFMS33;
[PFMS10$1 PFMS10$2 PFMS10$3 PFMS12$1 PFMS12$2 PFMS12$3 PFMS13$1 PFMS13$2 PFMS13$3 PFMS14$1...];
OUTPUT: STDYX MODINDICES;
When I run this model, I get the following message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 82, Group FEMALE: FACTOR2 WITH FACTOR1
I'd appreciate any guidance on how to correctly identify the model! Thank you!


In MODEL female do not mention the first factor indicator. When you do, the factor loading is not fixed at one and the model is not identified. 


Thanks, Dr. Muthen. I conducted the same model without mentioning the first factor indicators in the female model. See below: GROUPING is SEX (1=male 0=female); MODEL: FACTOR1 BY PFMS10 PFMS12 PFMS13 PFMS14 PFMS16 PFMS19 PFMS21 PFMS26; FACTOR2 BY PFMS7 PFMS9 PFMS18 PFMS22 PFMS23 PFMS24 PFMS25 PFMS29 PFMS30 PFMS31 PFMS32 PFMS33; [FACTOR1@0 FACTOR2@0]; MODEL female: FACTOR1 BY PFMS12 PFMS13 PFMS14 PFMS16 PFMS19 PFMS21 PFMS26; FACTOR2 BY PFMS9 PFMS18 PFMS22 PFMS23 PFMS24 PFMS25 PFMS29 PFMS30 PFMS31 PFMS32 PFMS33; [FACTOR1@0 FACTOR2@0]; [PFMS10$1 PFMS10$2 PFMS10$3 PFMS12$1 PFMS12$2 PFMS12$3 PFMS13$1 PFMS13$2 PFMS13$3 PFMS14$1 PFMS14$2 PFMS14$3 PFMS16$1 PFMS16$2 PFMS16$3 PFMS19$1 PFMS19$2 PFMS19$3 PFMS21$1 PFMS21$2 PFMS21$3]; OUTPUT: STDYX MODINDICES; When I run this model, I get a different error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 162, Group MALE: { PFMS7 } I'm struggling to figure out what is wrong with my output. Thank you! 


Scale factors must be fixed to one in all groups when the factor loadings are free across groups. See the Version 7.1 Language Addendum on the website with the user's guide under Multiple Group Analysis: Convenience Features where models for testing for measurement invariance are described. 

TA posted on Tuesday, May 12, 2015 - 8:52 am



Does Mplus have simple code to conduct measurement invariance testing, like R's lavaan? I used Millsap's measurement invariance Mplus code for categorical data here: http://www.myweb.ttu.edu/spornpra/catInvariance.html What I noticed is that the degrees of freedom differ between the R code and Millsap's Mplus code. This led me to wonder whether there is a simple line in Mplus to run configural, weak, strong, and strict models, to avoid human coding errors, as in R's lavaan package. Thanks!


See the Version 7.1 Language Addendum on the website with the user's guide. The options for automatically testing for measurement invariance are shown there. 


I am testing the longitudinal measurement invariance of a 15 item measure from the child behavior checklist. For the structural invariance model I have asked for the same factor items but factor loadings not constrained, the variances of scales fixed to 1, latent means fixed to 0 and no constraints on intercepts. For the weak model I have the same factor items and factor loadings, variances of scales fixed to 1, latent factor means fixed to 0 and no constraints on intercepts. For the strong model I have the same factor items and factor loadings, variances of scales fixed to 1, only the first latent factor mean fixed to 0 and the other means free to vary and intercepts set to be equal. Have I put too many constraints on the structural and weak models? (ie with the variances and means?) Would it be possible to have some guidance on how the variances and means should be dealt with for the structural, weak and strong models? Is it also possible to find out why you should allow residual correlations of corresponding items across time? Many thanks 


We give detailed information about the models to test measurement invariance for various types of variables and estimators in the Version 7.1 Language Addendum on the website with the user's guide. They refer to multiple group models but the same constraints can be used across time for longitudinal measurement invariance. 

Jamie Vaske posted on Thursday, September 17, 2015 - 2:42 pm



Hi Linda & Bengt, I conducted a measurement invariance test in my MPLUS 7.1 version and found configural & metric invariance when my items were coded as 1 = strongly agree, 2 = agree, 3 = disagree, and 4 = strongly disagree. I reverse coded the items so that 1=SD and 4=SA. Once I did this, I was not able to establish metric invariance. From Technical Appendix 11, I am guessing one reason why this might occur is because the thresholds move around and change in sign. From your experience, why might the results of measurement invariance change when items are reverse coded? 


Sounds strange - like something isn't set up right. If you don't find it, please send the 2 outputs to support.

Jamie Vaske posted on Friday, September 18, 2015 - 3:36 am



You are correct. Using the TECH1 outputs, I noticed that some of the thresholds were automatically constrained in the metric invariance model for one set of output but not the other (despite similarities in syntax). I'm following up with support. Thanks! 

Daniel Lee posted on Saturday, February 27, 2016 - 9:17 pm



Hi Dr. Muthen, Is it possible for the model fit to improve from configural invariance to strong factorial invariance (tested for weak invariance as well, in between)? I generally find decrements in model fit as I impose more restrictions to the model, but, interestingly, the model fit has been incrementally improving from configural to strong factorial invariance for this scale. I'm wondering if you could tell me (1) if I'm doing something wrong (would be happy to send along data&input&output), or (2) in brief, what this improvement in model fit means conceptually. Thank you, as always! 


Which estimator are you using? 

Daniel Lee posted on Saturday, March 05, 2016 - 1:43 pm



Hi Dr. Muthen, I am using ML. Is that normal for ML? 

Daniel Lee posted on Saturday, March 05, 2016 - 1:51 pm



I'm sorry, I meant WLSMV! So again, in the multiple-group CFA (2 groups), I found it intriguing that the model fit improved from configural to strong invariance and was wondering what might be going on... Normally, I would observe a decrement in model fit as I include more restrictions in the model.


The chi-square values for WLSMV cannot be compared. Only the p-values can be compared. This is why the DIFFTEST option must be used for difference testing of nested models.

Daniel Lee posted on Sunday, March 06, 2016 - 1:28 pm



I understand! That makes perfect sense. I had one more question about testing invariance. I have been trying to establish configural invariance for a 2-factor model (grouping = gender) using WLSMV as the estimator, and the error message I get is: The following MODEL statements are ignored: * Statements in Group MALE: [ D1 ] [ D7 ] [ D8 ] [ D9 ] [ D10 ] [ D11 ] [ D12 ] [ D13 ] [ D16 ] So when I release the equality restrictions on item intercepts for males and estimate the model using WLSMV, I get the aforementioned error message. Some insight about this error message would be greatly appreciated! Thank you, again!


Please send the output and your license number to support@statmodel.com.


Hello, I've been trying to test measurement invariance across gender; however, I get this error: THE MODEL ESTIMATION TERMINATED NORMALLY. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 128. THE CONDITION NUMBER IS 0.857D-07.


Syntax for measurement invariance: PhysFIW BY PG5@1 (L1) PG6* (L2) PG7* (L3) PG8* (L4); EmotFIW BY EG5@1 (L5) EG6* (L6) EG7* (L7) EG8_b* (L8); PhysWIF BY PG1@1 (L8) PG2* (L9) PG3* (L10) PG4* (L16); EmotWIF BY EG1@1 (L11) (EG2* (L12) EG3* (L13) EG4* (L14); [PG1*] (I1); [PG2*] (I2); [PG3*] (I3); [PG4*] (I4); [PG5*] (I5); [PG6*] (I6); [PG7*] (I7); [PG8*] (I8); [EG1*] (I9); [EG2*] (I10); [EG3*] (I11); [EG5*] (I12); [EG5*] (I13); [EG6*] (I14); [EG7*] (I15); [EG8_b*] (I16); Model Female: PhysFIW BY PG5@1 PG5-PG8*; EmotFIW BY EG5@1 EG6-EG8_b*; PhysWIF BY PG1@1 PG2-PG4*; EmotWIF BY EG1@1 EG2-EG4*; Output: STANDARDIZED MODINDICES (ALL);


You cannot have several labels on a line without separating them by semicolons. 


Good evening. I am conducting measurement invariance testing for some IRT models (Samejima's Graded Response Model). I am using WLSMV estimation and, as such, am using the DIFFTEST option to compute the chi-square difference tests. The problem is that I have a very large sample (N = ~65,000), which yields a very liberal test (i.e., the null is almost always rejected). There are other approaches out there, but I can't seem to implement them in Mplus. Todd Little (and others), for example, recommend using a difference in CFI computed using the proper null model. I don't think the chi-square value that results from using WLSMV can be used for these calculations. As evidence of this, the first test I conducted yielded a smaller chi-square value for my metric invariance model (constrained loadings) than it did for my configural (free; less restricted) model. This, of course, doesn't happen when using ML estimation. I tried to use ML estimation, even though it probably isn't appropriate for my ordinal (4-category) data, but evidently you can't use numerical integration with multigroup models. Is there a correction I can apply to the WLSMV chi-square that will allow for direct comparison between (nested) models? Is there another approach for conducting the invariance testing that might be more appropriate given my large sample? Your feedback is appreciated.


WLSMV produces CFI values, so I don't see why you couldn't use CFI differences if that is what you like. You can use ML with ordinal outcomes. Note that ML does not mean you have to have continuous-normal outcomes (that's a common misperception). You can use multiple-group analysis and numerical integration; you just have to specify the multiple groups using KNOWNCLASS in a TYPE = MIXTURE run. Then you can use the logL values to get a chi-square using the log-likelihood ratio approach.
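For readers who want to carry out the log-likelihood ratio computation by hand, here is a minimal Python sketch. It assumes plain ML estimation (with MLR the log-likelihoods would also need their scaling correction factors), and the function names `chi2_sf` and `lr_test` as well as the numbers in the example are made up for illustration; they are not Mplus output.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i>=1} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        if i > 1:
            term *= half / (i - 0.5)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def lr_test(loglik_restricted, loglik_free, n_par_restricted, n_par_free):
    """Likelihood-ratio test of a nested (restricted) model against a freer one."""
    df = n_par_free - n_par_restricted
    if df <= 0:
        raise ValueError("the freer model must have more free parameters")
    stat = 2.0 * (loglik_free - loglik_restricted)
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical logL values and parameter counts for a metric vs. configural model.
    stat, df, p = lr_test(-10250.0, -10240.0, 40, 49)
    print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.4f}")
```

The log-likelihoods and the numbers of free parameters are read from the two outputs; the test statistic is twice the log-likelihood difference, with degrees of freedom equal to the difference in free parameters.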


Thank you for the response. I guess the part that is confusing to me is that, in my first set of models, I am adding constraints and getting improved Chisq/CFI values when I should be seeing worse model fit. For example, I estimated a configural model (simple one factor with 10 items; 2 groups; N =~ 32,000 for each group) that yielded a Chisq ~19780 (70 df) and a CFI of .97. I then constrained the loadings to be equal across groups and got a Chisq ~12700 (79 df) and a CFI of .98. That suggested to me that the raw Chisq/CFI values resulting from WLSMV may not be comparable across models without some correction.


Upon further review, this scenario where the Chisq, etc. improve when adding constraints to go from configural invariance to full metric invariance seems to be the case in every example I could find (in the realm of IRT with WLSMV, anyways). For the rest of the tests (e.g., going from full metric invariance to threshold invariance and so on) the fit indices behave in the usual way. Any idea why that would be? 


With WLSMV, chi-square and related fit statistics like CFI cannot be compared. Difference testing can be carried out only using the DIFFTEST option.


Thank you. So, then. If I want to use the change in CFI approach as in Cheung & Rensvold (2002), Meade, et al. (2008), etc. I need to use ML estimation by utilizing the KNOWNCLASS command with TYPE=MIXTURE? Is it possible to work backwards from the Chisq statistic reported in the output when using WLSMV to get an uncorrected Chisq that I could then use to compute a CFI value that would be comparable across models? I apologize for all the questions but the LR test implemented using DIFFTEST seems rather unusable with large sample sizes such as those I am working with. The presence of any deviation in the parameters across groups yields a statistically significant difference test. Thank you again. 


There is no way to work with the WLSMV chi-square other than the DIFFTEST option. You would need to use ML.


Thanks. Can you point me to any literature (a Tech Note, perhaps) that describes the mechanics underlying the DIFFTEST procedure? 


Sorry for the second question, but am I correct in saying that I cannot get CFI, TLI, or RMSEA when using the KNOWNCLASS/TYPE=MIXTURE approach with ML estimation? It appears that I can only get chi-square. Can I compute CFI manually using the chi-square/df values for the estimated and baseline models?


See Web Note 10 for DIFFTEST documentation. ML for mixtures requires raw data analysis, which implies that a mean vector and a covariance matrix are not sufficient to summarize the modeling. That in turn means that CFI etc. are not relevant. Still, you can get a chi-square to compare nested models by using a likelihood-ratio test.


Thanks for all of your help. One last question: can you clarify a bit for me the new "convenience" commands for MG models in Mplus v7.3 and later? I'm currently using v7.11. With that version I cannot use ML estimation with MG specification (i.e., GROUPING=...) and categorical factor indicators, because Mplus cannot implement numerical integration for these models. I've been told that I can use ML estimation with MG specifications in v7.3 and on by using the MODEL=... option with TYPE=COMPLEX under the ANALYSIS section of the syntax. Is that, in fact, the case? If so, can I use any ML estimator (ML, MLM, MLR, MLMV)? Which fit indices are reported? Are these fit indices (CFI, TLI, and RMSEA in particular, if available) comparable across models? I'm assuming they would be with ML, but perhaps not with an estimator that includes a chi-square scaling correction. I'm very much interested in finding a robust method for conducting reasonably rigorous MI testing with categorical indicators and large samples without having to resort to AIC/BIC, an "eyeball" test, or stratified sampling from my larger sample. Thanks in advance.


With maximum likelihood and categorical items, the KNOWNCLASS option must be used instead of the GROUPING option. Chi-square and other fit statistics should only be compared using the same estimator.


So, let's say I had continuous indicators. In that case, could a chi-square-based GFI (e.g., CFI, TLI, RMSEA) from, say, a configural MG model estimated using MLR be compared to the same index resulting from a metric model (also using MLR), even if the SB scaling factors are not equal?


You can use those two chi-square values along with the scaling correction factors to calculate a chi-square difference test.
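As an illustration of that calculation, the sketch below follows the commonly used Satorra-Bentler scaled difference formula: cd = (d0*c0 - d1*c1)/(d0 - d1) and TRd = (T0*c0 - T1*c1)/cd, where T, d and c are the scaled chi-square, degrees of freedom and scaling correction factor printed for the nested model (0) and the comparison model (1). The function name and the example numbers are assumptions for illustration, not values from any output in this thread.

```python
def scaled_chi2_diff(t0, df0, c0, t1, df1, c1):
    """Satorra-Bentler scaled chi-square difference test.

    t0, df0, c0: scaled chi-square, df and scaling correction factor of the
                 nested (more restricted) model.
    t1, df1, c1: the same quantities for the comparison (less restricted) model.
    Returns (scaled difference statistic, df difference, difference scaling factor).
    """
    ddf = df0 - df1
    if ddf <= 0:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (df0 * c0 - df1 * c1) / ddf
    if cd <= 0:
        # A non-positive correction makes the statistic uninterpretable.
        raise ValueError("difference scaling correction is not positive")
    # Multiplying by c recovers the unscaled chi-squares before differencing.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, ddf, cd


if __name__ == "__main__":
    # Hypothetical metric (nested) vs. configural (comparison) MLR results.
    trd, ddf, cd = scaled_chi2_diff(120.0, 60, 1.2, 100.0, 50, 1.1)
    print(f"scaled difference = {trd:.3f} on {ddf} df (cd = {cd:.3f})")
```

The resulting statistic is referred to a chi-square distribution with the df difference. The guard on a non-positive cd is a design choice of this sketch; such cases can occur in practice and need a different approach.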


Thank you. Can I make any comparison between GFIs that are computed using the chi-square statistic? For example, if I am comparing two nested factor models using MLR and one yields a CFI of .97 (less constrained model) while the other yields a value of .95 (more constrained), is it appropriate to say that the former fits better than the latter at the global level (not necessarily in the statistically significant sense, of course)? If not, can I use the scaling factor to compute a fit index (CFI, for example) that is comparable across models? With ML the less constrained model will always fit better (per the chi-square) than the more constrained model, save for a few rare cases. I'm not sure if that's the case when using MLR, etc., due to the scaling correction.


Only a chi-square difference test can be used to say one model fits better than another, in my opinion. You may want to ask this question on a general discussion forum like SEMNET to obtain the opinions of others.


Thank you. Are the CFI, TLI, and/or RMSEA values on the same scale when using MLR estimation for hierarchically related models? I realize that I can't conduct any null hypothesis test of differences between these values from one model to the next. I know that they are on the same scale when using ML, but am not sure about whether this is true when using an estimator that requires a scale correction factor for model comparison. This question probably comes down to whether Mplus applies a scaling correction before calculating those indices. If not, then I can't imagine that they're comparable as reported in the output. Thank you. 


I do not believe you should compare CFI etc. when using MLR. These are based on the chi-square, which uses the scaling correction factor.

Hewa G posted on Friday, July 22, 2016 - 4:09 pm



Dear Dr. Muthen, I have complex survey data with two-stage cluster sampling. I tried testing measurement invariance in a 2-group (TL=1 and CS=2) CFA with continuous latent factors. The sample size is unequal for the two groups. My Mplus syntax is as follows:
MISSING ARE y1-y20 (999);
grouping = OC (2 = TL 3 = CS);
Cluster Is MG;
ANALYSIS: Type=Complex; Model = Configural metric scalar;
MODEL: AD by y1-y4 AA by y6-y9 ………
OUTPUT: STANDARDIZED (STDYX);
How do I control for the two clusters? There is an error message in the output file: “THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER”. I tried the Type = Twolevel Complex option with both clusters, but it did not work. Any advice you can offer will be much appreciated.


It sounds like you are saying you have 2 clusters. Type=Complex requires at least 20 clusters for good performance. 

Hewa G posted on Friday, July 22, 2016 - 5:08 pm



My sample consists of two types of clusters (firms = 30 and managers = 104). The data are at the employee level, with 624 employees.


I would run that as a 2group analysis. 

Tyler Moore posted on Tuesday, September 13, 2016 - 8:31 am



Hi Bengt and Linda, I'm trying to run a two-group CFA with a mix of binary and ordinal items using the convenient "MODEL=CONFIGURAL METRIC SCALAR;" method, but I get this error: *** ERROR in ANALYSIS command When performing measurement invariance with categorical outcomes and the MODEL option of the ANALYSIS command, all categorical outcomes must be binary or all categorical outcomes must be ordered polytomous. MODEL=CONFIGURAL cannot be used for this analysis. Does that mean it doesn't like the mix of binary and ordinal, or is there another problem? If the former, do you have a suggestion for an alternate method? Thanks!


You would need to test for measurement invariance without using MODEL = CONFIGURAL etc. These options cannot be used with a mix of binary and ordinal variables. The models to use for each type of invariance are shown in Chapter 14 of the user's guide. You would need to combine them.


Dear Drs. Muthen, In your 2014 paper (IRT studies of many groups: the alignment method) published in Front. Psychol., you provided an illustration comparing countries in two cross-sectional surveys. I wonder whether the alignment approach can be applied to two-time-point longitudinal data (assuming measurement invariance is not well justified). More specifically, the attribute (a latent variable) of several groups was measured twice. I am interested in group comparison (a) at time one and (b) at time two. Additionally, I want to investigate "group change" from time 1 to time 2. Theoretically, an indicator measured at time 1 will be correlated with the same indicator measured at time 2. However, using the alignment approach, I fail to consider such correlations. Q1: Can I still apply the alignment approach to two-time-point longitudinal data? Will there be any problem if the dependency is not considered? Q2: I tried to use the "save factor score" command to get individual factor scores. Can I assume the individual factor scores are on the same scale so that I can compare them to each other? Thank you so much.


There is a paper coming up in Psychological Methods by Herb Marsh et al., "What to do When Scalar Invariance Fails: The Extended Alignment Method for Multi-Group Factor Analysis Comparison of Latent Means Across Many Groups." The method is a two-step estimation: first you run the alignment method on the different groups and time points, then, once the invariance pattern emerges, you can run the full model. I would recommend that approach.


Thank you, Tihomir. I will read that paper. Do you think the correlation between residuals across the two time points will cause any problems? Thank you.


It will not cause any problems. 

Ti Zhang posted on Tuesday, December 13, 2016 - 12:02 pm



Dear Dr. Muthen, I am trying to generate some data for my measurement invariance project, using Mplus. I tried to generate categorical data with 6 indicators, 2 groups and 2 factors (3 indicators per factor). Each indicator has 4 response options (3 thresholds). I got my generated data as "0,1,2,3" as expected. In my "model population" and "model population-g2" commands, I set the starting values for each indicator to 0.5 for one group and 0.7 for the other group. I also set the starting values for each threshold, in both groups, as below:
[y1$1*-1.25 y2$1*-1.25 y3$1*-1.25 y4$1*-1.25 y5$1*-1.25 y6$1*-1.25];
[y1$2*0 y2$2*0 y3$2*0 y4$2*0 y5$2*0 y6$2*0];
[y1$3*1.25 y2$3*1.25 y3$3*1.25 y4$3*1.25 y5$3*1.25 y6$3*1.25];
When I looked at my output, the population thresholds were all 0. Why are they all 0? I thought I had set the population values as above. Thank you.


Use these starting values also in Model, not only Model Population. 

Ti Zhang posted on Wednesday, December 21, 2016 - 1:06 pm



Thank you, Dr. Muthen. I have another question about the real differences between "@" and "*" in the "model population" command. Based on my understanding, "@" means you want to fix a parameter at a specific value, so Mplus will not estimate it. "*" means you want the parameter to be freely estimated, so Mplus will estimate it from the sample. In the "model population" command, however, "*" gives the true value and the starting value for a particular parameter if you are generating data. In my Mplus syntax, in the "model population" and "model population-g2" commands, I used "*" for each factor loading value and I also wrote "y1-y6*0.75". I think this means the population values of the residual variances are 0.75? In the "model" command, I did not specify the residual variances. I ran the Monte Carlo analysis and looked at the output. In the output, the population values for the residual variances are all 1, not 0.75. Why? All other population parameters look right. I suspect I do not fully understand the meaning of "*" and "@" in the context of the model population and model commands. Could you explain? Thank you for your help. 


In the Model Population command the symbols * and @ are equivalent. The values given here determine how the data are generated. In the Model command the values given are those presented in the column labeled Population and are used when Coverage and %Sig are computed. When you don't give a value for a parameter in the Model command it is assigned a default value which is 0.5 for variances. 
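
A minimal sketch of this advice, assuming a single factor with three four-category indicators (the names f1 and y1-y3 and all numeric values are illustrative, not taken from the poster's actual input). The values under MODEL POPULATION generate the data; repeating them under MODEL makes them appear in the Population column and enter the coverage computation. Group-specific parts (MODEL POPULATION-g2 and MODEL g2) would presumably follow the same pattern.

```
MODEL POPULATION:
  ! values here determine how the data are generated;
  ! * and @ are equivalent in this command
  f1 BY y1*0.5 y2*0.5 y3*0.5;
  f1@1;
  [y1$1*-1.25 y2$1*-1.25 y3$1*-1.25];
  [y1$2*0 y2$2*0 y3$2*0];
  [y1$3*1.25 y2$3*1.25 y3$3*1.25];

MODEL:
  ! same values repeated: they serve as starting values and as
  ! the "Population" values used for Coverage and %Sig
  f1 BY y1*0.5 y2*0.5 y3*0.5;
  f1@1;
  [y1$1*-1.25 y2$1*-1.25 y3$1*-1.25];
  [y1$2*0 y2$2*0 y3$2*0];
  [y1$3*1.25 y2$3*1.25 y3$3*1.25];
```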

Manni posted on Friday, March 03, 2017 - 1:35 pm



Dear MPlus Team, I am testing longitudinal multigroup measurement invariance with ordered categorical variables using WLSMV. I tested longitudinal invariance first in each group. Now I have combined all models in a multigroup structure. In this longitudinal multigroup model, some indicators seem invariant across groups (strong MI). For example: If one indicator shows invariant thresholds (not all, just two of three), should I free all thresholds in addition to the loading and fix the scale factor to one, or is it appropriate only to free the invariant thresholds and keep the rest (factor loading) fixed? Many thanks in advance 


Either way is fine. 
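
As an illustration of the first option (freeing the loading and all thresholds of one indicator and fixing its scale factor to one), here is a hedged sketch. It assumes WLSMV with the default Delta parameterization, a second group labeled g2, and a hypothetical four-category indicator y3 on a factor f; none of these names come from the poster's input.

```
MODEL:
  f BY y1-y6;            ! loadings and thresholds held equal across groups by default

MODEL g2:
  f BY y3;               ! relax the equality of the y3 loading
  [y3$1 y3$2 y3$3];      ! relax the equality of all three y3 thresholds
  {y3@1};                ! fix the y3 scale factor to one in this group
```

The second option would instead mention only the selected thresholds in the group-specific part and leave the loading and scale factor at their defaults.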


I am testing a number of multiple-group invariance models with orthogonal factors, and am having trouble setting the factor covariances equal to 0 in group 2 for the metric, scalar, and residual invariance models. Instead, the factors are uncorrelated in group 1 but are correlated in group 2. Does Mplus allow uncorrelated factors in these models? Is it possible to set the covariances equal to 0 across both groups? Thank you for your time. 


Yes, you can do this in several ways: (1) fix the factor covariances to zero in the "overall part" (not the group-specific parts); (2) fix them in each group; or (3) use the option that automatically fixes all factor covariances: NOCOV. 


Thank you very much for your reply. I added the "NOCOV" option to the model statement, but am still getting correlated factors in group 2. Am I putting this option in the wrong place in my code? I've pasted the model statement here: MODEL: ! Model 1: Metric Invariance F1-F7 BY Y1-Y80 (*1) NOCOV; [F1-F7@0]; Y1-Y80@1; MODEL g2: [Y1$1-Y80$4]; Second question, using one of the other methods you mentioned above: if I fix the factor covariances to zero in the overall part, do I also need to fix the factor variances to 1? Otherwise (if they are freely estimated in group 2), I get a model misspecification error. 


Q1: Look up nocov in the UG index. Q2: Fixing factor covariances does not have to do with fixing factor variances; they are not related. 
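
For orientation, a sketch of where these settings usually go, assuming an ordinary two-factor CFA with hypothetical factors f1 and f2 (this is not the poster's model). NOCOVARIANCES is a setting of the MODEL option in the ANALYSIS command, not a statement inside the MODEL command; the alternative is to fix the covariance explicitly in the overall part of the model.

```
! Option A: turn off default covariances for the whole analysis
ANALYSIS:
  MODEL = NOCOVARIANCES;

! Option B: fix the factor covariance in the overall part,
! which then applies to every group
MODEL:
  f1 BY y1-y3;
  f2 BY y4-y6;
  f1 WITH f2@0;
```

Either option is used on its own; the two are shown together only for comparison.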

Lois Downey posted on Wednesday, March 29, 2017 - 8:48 am



I want to test the effect of an intervention on a 7-indicator factor measured at follow-up, adjusting for the same 7-indicator factor measured at baseline. I have defined the 14 indicators (each with range 0-11) as censored from below and am using WLSMV for analyses. I first ran a 2-group (control/intervention) CFA with two factors (baseline/follow-up), constraining the loading and intercept for each indicator to equality over the 2 time points. The chi-square test of model fit suggested excellent fit (p=0.9349). I have two questions: 1) For the final regression, do I simply define the 2 time-specific factors so that the indicator loadings and intercepts are constrained to the constants obtained from the measurement-invariant CFA, or is there some way to have these values recomputed within the model that includes the regression equation? 2) In the initial CFA the means for both time-specific factors were, by default, constrained to 0 in the control group. Does this have any negative implications for my ability to test the impact of the intervention on the factor at follow-up, adjusting for the baseline factor? (If so, is there a way to constrain only the baseline factor mean for the control group to 0 and estimate the factor means for the intervention group at both time points and the control group at follow-up?) Thanks! 


You do this in a single analysis as a 2-group CFA with the factor mean fixed at zero for, say, the control group at time 1, and free in the other 3 instances. For this, you need to impose scalar invariance across group and time. 

Lois Downey posted on Thursday, March 30, 2017 - 7:26 pm



Thanks for your response. I'm confused by your statement, "For this, you need to impose scalar invariance across group and time." This seems to imply that my 2-group CFA does NOT have scalar invariance imposed. However, it has the indicator loadings and intercepts equal across groups and time points. That is all that is required for scalar measurement invariance, isn't it? 


Yes. 


OK. Thank you. I'm also confused by two other parts of your response of March 30: 1) You indicate that in my 2-group CFA, I should fix the factor mean at zero for one group and free it for the other three instances. So far, I've been unable to work out the syntax for this. Everything I've tried has resulted in means of zero for both time points in the control group and estimates for the two time points in the intervention group. For example: MODEL: ... [BaseFact@0]; [FUfact]; MODEL intervention: [BaseFact]; [FUfact]; Could you please tell me the syntax for limiting the zero constraint to only the baseline time point in the control group and estimating the other three? 2) You also indicate that I should incorporate the regression into the 2-group CFA. However, the predictor of interest is the grouping variable (randomization group: intervention/control), and I want to include covariate adjustment for confounders. Again, I don't know the syntax for doing this in the 2-group CFA. Would you please explain? Thanks! 


1) Send your output to Support along with your license number. 2) Look at UG ex5.14. Don't include the Nomeanstructure and Expected settings. 

Lois Downey posted on Monday, April 03, 2017 - 12:19 pm



Thank you. I've sent the output to Support and hope to learn what I'm doing wrong. Example 5.14 is very helpful. However, because it is a different way of assessing an intervention effect than I'm used to, I want to be certain that I understand how to proceed. 1) The regression command should simply regress the factor as measured at follow-up on the baseline factor and other covariates. Is that correct? 2) The results will NOT show a regression coefficient for the intervention group. Instead, the effect of the intervention is to be assessed by the p-value for the intervention group's MEAN on the factor at follow-up, which implies that the mean that needs to be constrained to zero is the control group's mean on the factor at FOLLOW-UP. Correct? 3) And the other factor means (the baseline means for both groups, and the follow-up mean for the intervention group) need to be estimated. Do I have that right? 4) I typically test a variable for confounding by assessing whether its addition as a predictor in the regression model changes the regression coefficient for the predictor of interest by 10% or more. So with your recommended 2-group method for assessing the intervention effect, would the parallel procedure be to consider any variable a confounder if it changes the MEAN for the intervention group's follow-up factor by 10% or more? 

Lois Downey posted on Wednesday, April 05, 2017 - 9:03 am



Now that Support has helped me with the syntax for constraining a selected mean to zero in the CFA, I can see that my assumptions about how to proceed with the regression model are incorrect. But I'm at a complete loss as to how to test whether there is a significant effect of the intervention on the factor score at follow-up, after adjusting for the factor score at baseline, within a 2-group (intervention/control) model. Would you please explain? Thanks! 


Say that you give the following parameter labels in the Model command (see the UG for how to do such labels): m02: factor mean for control group at time 2; m11: factor mean for treatment group at time 1; m12: factor mean for treatment group at time 2. Then you use the Model Constraint command to say: New(effect); effect = (m12 - m11) - m02; where the first part on the RHS is the change in factor mean for the treatment group and the second part (m02) is the change in factor mean for the control group (its time 1 mean is fixed at zero). The difference in their change is the treatment effect. 
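
A hedged sketch of how these labels might be placed in a 2-group input. The group labels (control, treat), factor names (f1, f2), and three indicators per time point (y11-y13, y21-y23) are hypothetical, and the censored-variable and estimator settings from the original question are omitted. Repeating a label across time holds the corresponding loadings and intercepts equal, and labels in the overall part apply to both groups.

```
MODEL:
  f1 BY y11
        y12-y13 (L2-L3);     ! time-1 factor
  f2 BY y21
        y22-y23 (L2-L3);     ! time-2 factor, same loadings as time 1
  [y11-y13] (i1-i3);
  [y21-y23] (i1-i3);         ! intercepts equal across time

MODEL control:
  [f1@0];                    ! reference: control group at time 1
  [f2*] (m02);

MODEL treat:
  [f1*] (m11);
  [f2*] (m12);

MODEL CONSTRAINT:
  NEW(effect);
  effect = (m12 - m11) - m02;   ! difference in change = treatment effect
```

The estimate, standard error, and p-value for effect then appear with the new/additional parameters in the output.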

Lois Downey posted on Wednesday, April 05, 2017 - 6:15 pm



Thanks very much. The p-value for the intervention effect using this method is SIMILAR TO the p-value obtained in a single-group model regressing the time-2 factor on treatment group and the time-1 factor, when the loadings and intercepts for both time periods are constrained to constants obtained in a CFA with scalar invariance imposed over groups and times. With a 4-indicator factor, the p-values for the treatment effect were identical for the two methods (0.005). For a 7-indicator factor, they were similar (p=0.013 for the method assessing the treatment effect within the CFA model; p=0.023 for the method with indicator loadings and intercepts constrained to constants). Is there any reason to think that the result for the 2-group CFA method is a more accurate representation of the treatment effect than the result obtained from the single-group regression model with scalar invariance imposed via constants? If there is not, I might opt for the latter, given the ease with which that method allows checking for confounding of the treatment effect by other variables. I don't know how to do that within the CFA model. Again, thank you! 


I worry about the phrase "constrained to constants", which sounds like you fix some parameter values (rather than holding parameters equal); that would underestimate SEs. But generally speaking, using treatment as a covariate rather than a grouping variable should give the same results for the same model; it's just that the multiple-group approach allows more generality, such as different factor variances across time and group. 


You are correct that I did fix the parameter values in the regression model, rather than simply holding them equal. For example, my original 2-group CFA with the parameters constrained to equality between groups and over time produced for indicator 2 a loading of 0.873 and intercept of 2.775. So in the single-group regression model, the model statement included Factor by ... ind2@0.873 ...; as well as [ind2@2.775]; This results, as you indicate, in a single-group regression model with the SEs for the indicators = zero, equal variance across groups for the time-1 factor, and equal residual variance across groups of the time-2 factor. Is there a way, using the 2-group CFA model approach, to evaluate confounding of the treatment effect by other variables, such as gender, age, etc.? If so, how is that done? Thank you! 


Since this discussion is now going toward general analysis strategies, I think you should post on SEMNET. 

Ti Zhang posted on Thursday, April 20, 2017 - 4:15 pm



Hi, Dr. Muthen, I have a one-factor, six-indicator, two-group CFA model for ordinal variables. For the metric invariance model, I simulated 6 ordinal variables and one continuous latent variable. I specified each factor loading value for each group in the model population, model population-g2, and model commands. If the values of two factor loadings differ between the two groups, I specify different values in the model population and model population-g2 commands. For correct modeling, I think I should add a "model g2" command and free these two factor loadings as: model g2: f1 by y1* y2*; In the results section, for group 2, the population values for these two loadings become 1, and the "estimates average" values are produced and differ from the "estimates average" values for group 1. For incorrect modeling, I am not quite sure how to write the code. I think the code should be: model g2: f1 by y1*0.6 y2*0.7; In the results section, these two values are shown as population values for group 2, and the "estimates averages" are also produced and differ from group 1's, as expected. Did I specify both the correct and the incorrect model correctly? Thank you. 


For a correct model, the Model statements should have the same parameter values given as in the Model Population statements. I don't know what you mean by "incorrect modeling". Perhaps you mean generating data with noninvariant loadings and estimating with invariant loadings. 

Ti Zhang posted on Friday, April 21, 2017 - 6:19 pm



Hi, Dr. Muthen, Yes. By "incorrect modeling", I mean that I want to simulate data with noninvariant loadings and estimate with invariant loadings. 1. In my case, in the "model population-g2" command, I specified two factor loading values that differ from the ones in the "model population" command. Say, two items have lower factor loading values in one group but higher values in the other group. Does this mean that I successfully generate data with noninvariant loadings? 2. Then, in the "model g2" command, I free these two factor loadings. I think this means "correct modeling" because these two loadings are not invariant. For "incorrect modeling", I am not quite sure how to write the code. I am wondering if my previous code is correct? Thank you. 


Q1: Yes. For incorrect modeling you would not mention anything for the second group; this would imply invariance. You can see what you get in TECH1. 
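
The two setups discussed above can be sketched as follows. All values are hypothetical, thresholds and the MONTECARLO command are omitted, and the fragment only illustrates where the group-specific statements go. The data are generated with two noninvariant loadings in group 2; the correctly specified analysis repeats those values in MODEL g2, while the misspecified analysis drops MODEL g2 so that the defaults hold the loadings equal across groups.

```
MODEL POPULATION:
  f1 BY y1-y6*0.7;
  f1@1;
MODEL POPULATION-g2:
  f1 BY y1*0.6 y2*0.5;       ! noninvariant loadings in group 2

! Correctly specified analysis model
MODEL:
  f1 BY y1-y6*0.7;
  f1@1;
MODEL g2:
  f1 BY y1*0.6 y2*0.5;       ! freed in group 2, same values as the population

! Misspecified analysis model: keep MODEL as above and omit
! MODEL g2 entirely, which implies invariant loadings
```

Requesting TECH1 in the OUTPUT command shows which parameters end up held equal in each version.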


Greetings, I have run into an issue where my multigroup WLSMV measurement invariance nested models fit increasingly better the stricter the models become. This is the case whether I use syntax or the MODEL = CONFIGURAL METRIC SCALAR command. An editor considering my paper has requested that I look into this further before my paper can be published. Others on this message board have also described the problem of WLSMV multigroup measurement invariance models fitting better as they become more stringent, rather than fitting worse as would be expected (for example, Daniel Lee’s posts on February 27, 2016 - 9:17 pm to March 05, 2016 - 1:51 pm, and Ray Reichenberg's posts on Monday, June 06, 2016 - 10:01 pm to June 7, 2016 - 2:57 pm). The responses were that this is due to WLSMV estimation: one can only use DIFFTEST (or, I assume, the built-in difference testing function for the MODEL = CONFIGURAL METRIC SCALAR command) to make comparisons between WLSMV multigroup models (Linda K. Muthén on Saturday, March 05, 2016 - 3:06 pm to the former example and Linda K. Muthén on Tuesday, June 07, 2016 - 2:48 pm and 4:12 pm to the latter), not the fit statistics themselves. Is that accurate? 


I have two additional questions: 1) If the above is accurate, is it because WLSMV nested models have “mean and variance adjusted chi-square statistics” (p. 1, Muthén, Web Note 10)? 2) I also have within-group longitudinal measurement invariance WLSMV-estimated models that show a worsening fit as models become more strict, as would be expected. I assume I would still only report results from DIFFTEST when comparing nested models, and ignore the worsening goodness of fit statistics in this comparison. Is that correct? Much thanks! 


With WLSMV, you cannot compare the chi-square values directly. They do not behave as do the chi-square values for ML. Only the p-values should be interpreted with WLSMV. To do difference testing you must use the DIFFTEST option. 
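
A short sketch of the two-run DIFFTEST procedure, assuming a hypothetical file name deriv.dat. The less restrictive model is run first and saves the information needed for the test; the more restrictive nested model is run second and reads that file. Each fragment belongs to a separate input file whose other commands (DATA, VARIABLE, MODEL) are omitted here.

```
! Run 1: less restrictive model (e.g., configural)
ANALYSIS:
  ESTIMATOR = WLSMV;
SAVEDATA:
  DIFFTEST IS deriv.dat;     ! save derivatives for the difference test

! Run 2: more restrictive nested model (e.g., metric)
ANALYSIS:
  ESTIMATOR = WLSMV;
  DIFFTEST IS deriv.dat;     ! chi-square difference test against run 1
```

The output of the second run reports the chi-square difference test, and its p-value is the quantity to interpret.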


Thank you very much, Dr. Muthen. That answer should suffice. 


I am testing measurement invariance for multigroup CFA with categorical indicators. I use the function MODEL = CONFIGURAL METRIC SCALAR. Some output:

    Models Compared              Chi-square   Degrees of Freedom   P-value
    Metric against Configural         2.357                    3    0.5017
    Scalar against Configural         7.805                   11    0.7306
    Scalar against Metric             5.251                    8    0.7304

Because of the WLSMV estimator, I can't rely on regular chi-square difference testing. In order to use the DIFFTEST option, I need to save the results from each of the models (configural, metric, scalar). How can I do that? My second question is about the group size. One of my groups has about 100 cases; the others have 300 or more. Meade and Kroustalis (2006) claim that the power of measurement invariance tests is very low for sample sizes of 100 and often poor for samples of 200 per group. They recommended at least 200 per group for measurement invariance tests. Based on your experience, how solid is this recommendation? 


The printed chi-square difference tests of e.g. Metric against Configural are done in the correct way (we wouldn't print them otherwise) using Difftest behind the scenes for your convenience. That recommendation sounds reasonable but we haven't studied it. 


Dear Bengt, Thank you very much. This convenience function is absolutely awesome! Best, Valeria 


Hello, I am conducting measurement invariance testing using the WLSMV estimator and running into the same results as others who have previously posted here, in which model fit statistics improve with increasing constraints across groups. However, it is still unclear to me whether I can use fit statistics such as CFI at all when examining each model when it uses the WLSMV estimator. For example, the chi-square difference test (using the DIFFTEST command) between a model testing configural invariance vs. a model testing metric invariance is significant, p=.01. However, the CFI, TLI, and RMSEA values for the metric invariance model are still strong in their own right (CFI=.982, TLI=.981, RMSEA=.051). Would I reject this metric invariance model in favor of configural invariance because the chi-square difference test is significant? Or can I integrate any information on model fit in general, suggesting that the metric invariance model still fits the data well? Thank you very much! 


Chi-square is usually much stricter than e.g. CFI. You can explore the reason for rejection by chi-square difference testing by using modindices, relaxing the parameter equalities as needed, and seeing if the corresponding group difference seems substantively important. 


Got it. Thank you! 
