

Dear Dr. Muthen and Dr. Muthen, I would like to test increasingly restrictive hypotheses about intercepts, factor loadings, and residual variances and am running into two problems.
1. I cannot figure out how to restrict the residual variances to be equal across three groups. From my experience and the manual, the Mplus default is to equate intercepts and thresholds across groups but not residual variances. However, the equality labels in parentheses only seem to work within groups, not across groups. How is the equating of residual variances done?
2. When I free the intercepts across the groups, I am not able to free all of them but, instead, have to hold at least one equal across groups. I believe this is an identification requirement. Is my understanding correct?
Thank you for answering these terribly simple questions and thank you for your excellent research, this great product, and your valuable time! Andre A. Rupp, University of Ottawa


The following specification in the overall MODEL command will hold the residual variances equal across all groups:
MODEL:
y1 (1)
y2 (2)
y3 (3);
You may have been placing more than one number in parentheses on a line. Only one is allowed. When you relax the equality of the intercepts, you must fix the factor means to zero in all groups for identification.


Hello, I am attempting to test measurement invariance in two groups with binary data (ten items on a single underlying factor). No data are missing (it's a simulated dataset). I am trying to follow Vandenberg & Lance's (2000) suggestions on how to sequentially test MI, but am having trouble implementing them in Mplus; I have a workaround, but am unsure if it's kosher. I want to allow the means and variances of the two groups to differ, but then test for MI by constraining all factor loadings (FLs) and thresholds to be the same, and then allowing them to differ between groups. If MI is violated, I want to be able to follow up and see which specific FLs or thresholds are the violators. I have been able to do this, but I have been forced to set the residual variances across groups equal to one another (all = 1; I'm using the THETA parameterization). This seems fine to me, but is there some problem I'm overlooking? Here's the guts of the syntax:
________________________________________
FIRST SCRIPT; ALL THRESHOLDS & FLS CONSTRAINED TO BE THE SAME BETWEEN GROUPS:
________________________________________
MODEL:
latent1 BY i1* (1)
i2* (2)
i3* (3)
i4* (4)
i5* (5)
i6* (6)
i7* (7)
i8* (8)
i9* (9)
i10* (10);
[i1$1] (11) ! all items are binary
[i2$1] (12)
[i3$1] (13)
[i4$1] (14)
[i5$1] (15)
[i6$1] (16)
[i7$1] (17)
[i8$1] (18)
[i9$1] (19)
[i10$1] (20);
latent1@1;
[latent1@0];
MODEL focal:
latent1 BY i1* (1)
i2* (2)
i3* (3)
i4* (4)
i5* (5)
i6* (6)
i7* (7)
i8* (8)
i9* (9)
i10* (10);
[i1$1] (11)
[i2$1] (12)
[i3$1] (13)
[i4$1] (14)
[i5$1] (15)
[i6$1] (16)
[i7$1] (17)
[i8$1] (18)
[i9$1] (19)
[i10$1] (20);
i1@1; i2@1; i3@1; i4@1; i5@1; i6@1; i7@1; i8@1; i9@1; i10@1;
latent1;
[latent1*];
________________________________________
SECOND SCRIPT; ALL THRESHOLDS (EXCEPT 1) AND FLS (EXCEPT 1) ALLOWED TO DIFFER BETWEEN GROUPS:
________________________________________
MODEL:
latent1 BY i1* (1)
i2* (2)
i3* (3)
i4* (4)
i5* (5)
i6* (6)
i7* (7)
i8* (8)
i9* (9)
i10* (10);
[i1$1] (11)
[i2$1] (12)
[i3$1] (13)
[i4$1] (14)
[i5$1] (15)
[i6$1] (16)
[i7$1] (17)
[i8$1] (18)
[i9$1] (19)
[i10$1] (20);
latent1@1;
[latent1@0];
MODEL focal:
latent1 BY i1* (1)
i2* (22)
i3* (23)
i4* (24)
i5* (25)
i6* (26)
i7* (27)
i8* (28)
i9* (29)
i10* (30);
[i1$1] (11)
[i2$1] (32)
[i3$1] (33)
[i4$1] (34)
[i5$1] (35)
[i6$1] (36)
[i7$1] (37)
[i8$1] (38)
[i9$1] (39)
[i10$1] (40);
i1@1; i2@1; i3@1; i4@1; i5@1; i6@1; i7@1; i8@1; i9@1; i10@1;
latent1;
[latent1*];
________________________________________
After comparing the fit statistics (CFI, TLI, etc., as recommended by Vandenberg; I'm not interested in testing using chi-square) between these two models, I want to follow up and begin constraining the invariant FLs and thresholds to be the same while allowing the noninvariant ones to differ, thereby tracking down where the problems are. Any help you could provide would be appreciated. Is constraining the residual variances across groups OK? Any other problems with my approach? I'd be happy to send the full scripts along. Best, Matt


I suspect that the suggestions you are following are for continuous outcomes. The models we recommend for testing measurement invariance of categorical outcomes with the default Delta parameterization are:
1. The Mplus default for multiple groups, where thresholds and factor loadings are constrained to be equal across groups, factor means are zero in the first group and free in the others, and scale factors are fixed to one in the first group and free in the others.
2. A model in which thresholds and factor loadings are free across groups, factor means are zero in all groups, and scale factors are fixed to one in all groups. With categorical outcomes, thresholds and factor loadings need to be freed and constrained in tandem.
3. See Example 5.16 for an example of partial measurement invariance.
For Theta, just substitute residual variances for scale factors and see Example 5.17.


I was wondering if someone could explain further the need to simultaneously constrain loadings and thresholds in categorical invariance models. Working sequentially (discrimination first, then difficulty) is commonplace in the IRT universe. Does the distinction involve the different parameterization in Mplus? That is, in IRT we have a(theta - b), in Mplus a*theta + b. Bengt has worked through the consequences of this distinction in a couple of papers, but it still isn't clear to me why it wouldn't work to fix the loadings first, then proceed to the thresholds conditional on equal loadings. Or, if you convince me that the sequential approach only works in an IRT parameterization, couldn't that be accomplished in Mplus 4.0 with a constraint block? That would be interesting to try. Eric
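The two forms contrasted in this question can be related by a simple reparameterization. The sketch below is only an illustration, not Mplus code: it assumes a single probit item written as loading*theta - threshold (the slope-intercept form with the intercept equal to minus the threshold), with scale factors or residual variances fixed at one. The function name `irt_from_mplus` is hypothetical.

```python
def irt_from_mplus(loading, threshold):
    """Convert a probit item from slope-threshold form to IRT form.

    Slope-threshold form:  loading * theta - threshold
    IRT form:              a * (theta - b)
    Assumes loading != 0 and a unit scale factor / residual variance.
    """
    a = loading              # discrimination
    b = threshold / loading  # difficulty (item location)
    return a, b


# Two groups share the same threshold but have different loadings.
a1, b1 = irt_from_mplus(loading=1.0, threshold=0.5)
a2, b2 = irt_from_mplus(loading=2.0, threshold=0.5)

# Both forms give the same linear predictor at any theta.
theta = 1.3
slope_threshold_form = 2.0 * theta - 0.5
irt_form = a2 * (theta - b2)
```

Under these assumptions an equal threshold does not imply an equal difficulty b once the loadings differ, which is one way to see why the two parameters interact when they are constrained one at a time.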


I think the multiple-group IRT model used in the Mplus WLSMV context is more general than the conventional model. This is due to being able to handle group-varying variances for the u* latent response variables (e.g. due to varying residual variances). That general model brings special identification issues. I think in the conventional case it is possible to do it stepwise. On the other hand, it seems natural to ask how much the whole item curve differs across groups, and the curve is determined by both parameters. Conventional IRT used to discuss DIF in terms of areas between the curves, which is also in line with looking at both parameters jointly.
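The point about the whole item curve can be made concrete with a small numerical sketch. It assumes a probit item curve P(u = 1 | theta) = Phi(loading*theta - threshold) with unit residual variance in both groups; the function names and the parameter values are illustrative only, and the area is computed with a plain trapezoid rule over a finite theta range.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def item_curve(theta, loading, threshold):
    """Probit item curve: P(u = 1 | theta) = Phi(loading * theta - threshold)."""
    return normal_cdf(loading * theta - threshold)


def area_between_curves(params_g1, params_g2, lo=-4.0, hi=4.0, steps=800):
    """Trapezoid-rule area between two groups' item curves over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        theta = lo + k * width
        gap = abs(item_curve(theta, *params_g1) - item_curve(theta, *params_g2))
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * gap * width
    return total


# (loading, threshold) pairs for group 1 and group 2
same = area_between_curves((1.0, 0.5), (1.0, 0.5))
loading_only = area_between_curves((1.0, 0.5), (1.5, 0.5))
threshold_only = area_between_curves((1.0, 0.5), (1.0, 0.9))
```

The area is zero only when both parameters agree; a difference in either the loading or the threshold alone already separates the curves.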


Is there an example somewhere of working with the u* latent variables? Would you do this with scale factors? Thanks 


The User's Guide example 3.12 modifies the model diagram in ex 3.11 by changing y1, y2, y3 to categorical variables. When using WLSMV, the probit link is used and u*_1, u*_2 are used as predictors for u3 (with ML, the logit link is used and the actual u_1, u_2 values are used as predictors). Ex3.12 uses the Delta parameterization, whereas ex3.13 uses Theta. Using u* does not really relate to scale factors. Scale factors are the inverted SD's of the u*'s given x's. Ex3.12 does not have any free scale factor parameters. Hope that was what you asked about. 
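As a rough illustration of "inverted SD of u*", the sketch below assumes a single factor and no covariates, so that the variance of u* is loading^2 * factor variance + residual variance. The function names are hypothetical and the numbers are made up; this is not output from any of the User's Guide examples.

```python
import math


def scale_factor(loading, factor_variance, residual_variance):
    """Scale factor = 1 / SD(u*), with Var(u*) = loading^2 * psi + theta."""
    u_star_variance = loading ** 2 * factor_variance + residual_variance
    return 1.0 / math.sqrt(u_star_variance)


def residual_variance_from_scale(loading, factor_variance, scale):
    """Invert the relation: theta = 1 / scale^2 - loading^2 * psi."""
    return 1.0 / scale ** 2 - loading ** 2 * factor_variance


# A unit-variance u* (0.36 + 0.64 = 1) has scale factor 1.
delta_unit = scale_factor(loading=0.6, factor_variance=1.0, residual_variance=0.64)

# A larger residual variance gives a scale factor below 1.
delta_noisy = scale_factor(loading=0.6, factor_variance=1.0, residual_variance=1.5)
theta_back = residual_variance_from_scale(0.6, 1.0, delta_noisy)
```

Seen this way, Delta and Theta carry the same information about the spread of u*, which is why one set of parameters can be substituted for the other when moving between the two parameterizations.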

frank rijmen posted on Wednesday, September 13, 2006 - 4:17 am



hi eric, you can do it the irt way as follows. in irt, we have a*(theta - b); in mplus there is actually scale*(loading*theta - b). in a single group, scale or loading has to be fixed for each item. in a multiple group with no across-group restrictions, scale or loading has to be fixed in each group. mplus restricts the scales to 1 by default, but in IRT terms it makes more sense to restrict the loadings to 1 (as 'scale' corresponds to the discrimination parameter). equal discrimination parameters/item locations are imposed by having equal scales/thresholds across groups. testing for both simultaneously, you test a model in which thresholds and scales are equal across groups, and loadings set to one in both groups. for two groups, there are two parameters less for each restricted item:
unrestricted model:
group 1: scale_1*(theta - b_1)
group 2: scale_2*(theta - b_2)
restricted model:
group 1: scale_1*(theta - b_1)
group 2: scale_1*(theta - b_1)

frank rijmen posted on Wednesday, September 13, 2006 - 4:17 am



the procedure proposed in mplus is, in irt terms, testing for item location invariance WHILE allowing for differences in discrimination. first, for two groups, note that, by allowing for a noninvariant loading and threshold for an item, the model contains only one parameter more: threshold and loading in the second group are free, but its scale then has to be fixed to one. so, this is for sure not testing, again in IRT terms, a model with the same item parameters across groups versus group-specific item parameters. what the proposed mplus procedure tests is, subscripts referring to groups:
unrestricted model:
group 1: loading_1*theta - b_1
group 2: loading_2*theta - b_2
restricted model:
group 1: loading_1*theta - b_1
group 2: scale_2*(loading_1*theta - b_1)
what this actually tests is more transparent if we consider the fact that the unrestricted model is equivalent to
unrestricted model bis:
group 1: loading_1*theta - b_1
group 2: scale_2*(loading_1*theta - b*_2)
(in group 2 we do not fix the scale to 1 but the loading to the loading of the item in the first group) hence, the actual test performed in mplus, in IRT terms, is to test for item location invariance only (b_1 == b*_2)

Alex posted on Friday, June 08, 2007 - 7:49 am



Greetings, I would like to test increasingly restrictive invariance hypotheses in a CFA with continuous indicators. In the manual (page 345) you suggest 4 steps, which generally follow what is found in the literature. I was wondering if you had Mplus syntax examples for these four steps (especially 1 and 2, which involve changing the defaults of the program). Thank you in advance.


The Day 1 course handout contains the full inputs for these steps. See Example 5.15 for relaxing defaults of factor loading and intercept invariance. 


Dear discussion board, I'm a bit confused: I want to know whether looking for measurement invariance is futile if my 2 groups differ on average item score. I have a set of categorical items loading onto one factor, for 2 groups (male and female). I already know the females score higher overall if I just sum the items, so I'd expect the thresholds to vary across groups. So am I finding out anything new if I find significant measurement invariance between males and females (if, as I understand, I must equate the thresholds and factor loadings simultaneously)? Would it not be possible to estimate thresholds in a prior run, and then fix them to the values they are estimated at? Then I could test the significance of equating just the factor loadings, which is what I am more interested in. Thanks very much


Group differences in item means can be represented by group differences in factor means even when thresholds are the same across groups. Measurement invariance says that the thresholds are the same across groups for a given factor score. 
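A small numerical sketch of this point, under assumed values: one probit item with the same loading and threshold in both groups, a normally distributed factor, and residual variance fixed at one. The marginal proportion endorsing the item is then Phi((loading*factor mean - threshold) / sqrt(loading^2 * factor variance + residual variance)). The group labels and all numbers are illustrative, not taken from the poster's data.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def item_proportion(loading, threshold, factor_mean, factor_variance,
                    residual_variance=1.0):
    """Marginal P(u = 1) for a probit item with a normally distributed factor."""
    u_star_sd = math.sqrt(loading ** 2 * factor_variance + residual_variance)
    return normal_cdf((loading * factor_mean - threshold) / u_star_sd)


# Identical measurement parameters in both groups; only the factor mean differs.
males = item_proportion(loading=0.8, threshold=0.3, factor_mean=0.0,
                        factor_variance=1.0)
females = item_proportion(loading=0.8, threshold=0.3, factor_mean=0.5,
                          factor_variance=1.0)
```

Here the group with the higher factor mean endorses the item more often even though the item is fully invariant, so a difference in observed item means does not by itself signal a difference in thresholds.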


Thanks Linda. Does that mean I could test for measurement invariance whilst allowing for group differences in item means by fitting the following models for WLS with the Delta parameterization:
1. Thresholds and factor loadings free across groups; factor means fixed to zero in group one and free in the second group; scale factors fixed to 1 in all groups.
2. Thresholds and factor loadings constrained to be equal across groups; factor means fixed to zero in group one and free in the second group; scale factors fixed to 1 in group one and free in the second group.
(i.e. as suggested in chapter 13, but allowing factor mean differences in both models). I'm asking partly because I'm not totally clear what the scale factors do and so am not sure when they should be fixed or free. Thanks again for your time & effort


The models we recommend in the user's guide for weighted least squares and the Delta parameterization are:
1. Thresholds and factor loadings free across groups; scale factors fixed to one in all groups; factor means fixed to zero in all groups.
2. Thresholds and factor loadings constrained to be equal across groups; scale factors fixed to one in one group and free in the others; factor means fixed to zero in one group and free in the others (the Mplus default).
You have number 1 slightly wrong.


Sure, but I don't think model 1 as you recommend allows for group differences in factor means (and so does not allow for group differences in item means). I would like to test for measurement invariance by comparing two models that both allow for group differences in factor means, but the more restrictive model forces thresholds and factor loadings to be equal. Is this possible? Sorry if I'm being thick 


One cannot have all thresholds free across groups and the factor means also free. The model would not be identified. 


Thanks. Please could you tell me a way to test for measurement invariance while allowing factor means to be different across groups. Would it be OK to find one or two items whose thresholds and factor loadings can be equated across groups, and equate these to allow factor means to be free (for identification purposes)? 


A model with thresholds, factor loadings, and factor means free across groups is not identified. Measurement invariance can be tested by using the models discussed previously. When the thresholds and factor loadings are free across groups and the factor means are fixed to zero in all groups, this is the same as analyzing each group separately. 
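The lack of identification can be illustrated numerically. The sketch below uses made-up loadings and thresholds and assumes probit items with a normal factor and unit residual variances: shifting the factor mean by any amount d and shifting each threshold by loading*d gives exactly the same item proportions, so the data cannot distinguish the two parameter sets. The names are hypothetical and only marginal proportions are shown; the same cancellation applies to the whole u* distribution.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def item_proportions(loadings, thresholds, factor_mean,
                     factor_variance=1.0, residual_variance=1.0):
    """Marginal P(u = 1) for each probit item in one group."""
    result = []
    for loading, threshold in zip(loadings, thresholds):
        u_star_sd = math.sqrt(loading ** 2 * factor_variance + residual_variance)
        result.append(normal_cdf((loading * factor_mean - threshold) / u_star_sd))
    return result


loadings = [0.7, 1.0, 1.3]
thresholds = [-0.4, 0.2, 0.9]

# Parameter set A: factor mean fixed at zero, thresholds as given.
set_a = item_proportions(loadings, thresholds, factor_mean=0.0)

# Parameter set B: factor mean moved by d, every threshold moved by loading * d.
d = 0.7
shifted = [t + l * d for l, t in zip(loadings, thresholds)]
set_b = item_proportions(loadings, shifted, factor_mean=d)
```

Fixing the factor mean (for example, to zero in all groups when every threshold is free) or holding thresholds equal across groups removes this indeterminacy.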


Greetings, This question follows a current SEMNET discussion. Part 1. When testing invariance hypotheses in CFA with categorical indicators (WLSMV), the chi-square value cannot be used for chi-square difference tests (DIFFTEST has to be used). Does this also mean that the obtained CFI and RMSEA values cannot be used for nested-model difference testing? Part 2. If they can be used, how should we interpret improvement in fit (CFI, RMSEA) with the addition of constraints?


I don't believe that CFI and RMSEA are used for difference testing of nested models. See the Yu dissertation on the website where these measures have been studied for WLSMV.


My mistake, I was referring to Cheung and Rensvold's (2002) suggestions, made for continuous outcomes in the context of measurement invariance testing. The authors suggest that when the CFI changes by more than .01 with the addition of invariance constraints (e.g. equal loadings versus configural invariance), the invariance hypothesis should be rejected. Chen (2007) obtained similar results for the CFI and RMSEA. I should have said invariance instead of nested models. With categorical outcomes and WLSMV, the "value and df" of the chi-square cannot be used for invariance testing and DIFFTEST should be used. Since the CFI and RMSEA are computed on the basis of chi-square, do you believe that they can still be used in this context (invariance testing)? The fact that Yu's (2002) dissertation found similar cutoff points for the WLSMV and ML fit indices (for absolute fit) seems to argue in favor of this idea.


Did this one get lost in the overall number of questions ? 


I think this might be a topic for a research investigation. 


Yes, I know. Was wondering whether you thought so too... French & ?(SEM, 2006) did a preliminary simulation study on this (just read it). Found out that changes in CFI lacked power (for dichotomous items). Guess I'll have to wait. 

MAH posted on Wednesday, December 03, 2008 - 10:22 am



I have a question about a 2-group (multigroup) analysis of measurement invariance of categorical items. In the first model, where loadings and thresholds are free across groups, the loading of one indicator in each group is set to one. The indicator with loading set to 1 is the same across groups. Thus, this implies loading invariance of this item across groups. My question is: in fitting a model allowing parameters to be different across groups, you have to hold an item loading equal across groups, but what should be done with the threshold and scale factor for this indicator? Here are my two thoughts:
1) The threshold for the indicator with loading set to 1 in both groups should be free across groups and the scale factor set to 1 in both groups. Then, in the subsequent model, fix not only the loading but also the threshold to be equal across groups. Then iterate, changing which item has its loading set to 1.
2) The threshold for the item with loading set to 1 in both groups should be constrained to be equal in both groups and the scale factor should be freely estimated in group 2. Thus, assume invariance of this item while actually testing invariance of all other items. Then you could select another item to hold invariant (loading set to 1, thresholds constrained), while you test invariance of the first item.


The thresholds for all items should be treated the same. 

Simon Denny posted on Sunday, August 09, 2009 - 4:03 pm



I have a question about testing for measurement invariance in a two-level model using CFA with covariates. Do the direct and indirect associations between covariates (in my case age, gender, etc.) and factor indicators need to have opposite signs for there to be measurement invariance? What happens if they both go in the same direction, i.e. there are direct relationships that are not mediated by the factors but are attenuated by them?


I don't believe that the signs of the direct effects are related to measurement invariance. 

Regan posted on Thursday, September 30, 2010 - 12:00 pm



I have one question. In the data I have, a construct was measured at two points in time by four items and I am interested in showing the association between the two time points. However, the survey instrument changed and only two of the four items are measured exactly the same at both time points. Can I constrain the factor loadings of the two similar items and allow the remaining two to be free, recognizing this is not exactly the same factor? Or can I not do this? Thank you 


That would be OK to do; it is called partial invariance. You should fix at 1 the loading of one of the two items that didn't change.

Fatma Ayyad posted on Friday, November 05, 2010 - 8:41 am



Dear Dr. Muthen, I am conducting MGCFA across different cultural groups. I have 8 categorical items, the WLSMV estimator, and I am using Mplus version 6. The groups showed weak factorial invariance. My question: I want to run a partial invariance test. Do I have to compare the MI of the items with the chi-square difference value? Or with the chi-square critical value found in the chi-square probability distribution table, by looking up the degrees of freedom at the chosen alpha level? Thank you


The chi-square diff test refers to a set of parameters, while MI's refer to only one. So just go with the p-value for the diff test using the df and alpha level.

Fatma Ayyad posted on Sunday, November 07, 2010 - 10:32 pm



Thank you so much 


Hello, I am trying to conduct measurement invariance testing with categorical indicators. I have seen in the manual how to specify when an indicator/threshold is to be free in one group but not the other. My problem is that I am not sure if I should conduct the testing with the following steps:
1) Test a model where factor loadings and thresholds are free in both groups.
2) Constrain just the factor loadings (and leave the thresholds free across groups), to test for invariance of just the factor loadings. If MIs suggest freeing constraints on certain factor loadings, then let them be free between groups.
3) Constrain the invariant factor loadings from above, and then test for invariance of the thresholds.
I believe you would take the equivalent steps with continuous indicators, except that you would be modeling intercepts instead of thresholds. Do you take these analogous steps when using categorical indicators? I have tried this, but the DIFFTEST option said that my models were not nested (when I had partial invariance of the factor loadings and tried to constrain only some of the thresholds). Thank you!


We recommend that thresholds and factor loadings be constrained or unconstrained in tandem because the item probability curve is influenced by both parameters. See pages 433-435 of the current user's guide to see the models we suggest. The details differ depending on the estimator used.


Linda, Great, that is what I suspected. So you would only have 3 models, correct?
1) A model where factor loadings/thresholds are free for both groups
2) A model where factor loadings/thresholds are fully constrained in both groups, and if you find a poor fit using the DIFFTEST option, then
3) A model where there is partial factor loading/threshold invariance.
Thank you so much for your incredibly fast response.


Yes but there are other parameters that need to be considered. See the user's guide and the Topic 2 course handout for the details. 


Thank you! I actually have one more question about sample size discrepancies between groups when conducting multigroup analyses. When trying to compare my two groups, Group 1 n = 15,000, while Group 2 n = 1,400. I am sure that this large difference will impact my parameter estimates. If I was to take a random sample of participants from Group 1 to compare to Group 2, what is an appropriate sample size for Group 1? Should I just take approximately 1400 from Group 1, or a larger sample? 


Generally speaking, 1400 would seem reasonable unless you have categories with very low proportions. 


Hello, I would like to assess measurement invariance. I am new to Mplus. Each manifest variable is measured separately by wave, e.g., mar1, mar2, etc.; the last digit of the variable name represents the corresponding wave. I have held the factor loadings and thresholds equal across waves. How could I test residual invariance? How could I explore the latent factor means or estimate them freely?
! Drug factor
D1 by tab1
alc1 (1)
coca1 (2)
mar1 (3);
D2 by tab2
alc2 (1)
coca2 (2)
mar2 (3);
D3 by tab3
alc3 (1)
coca3 (2)
mar3 (3);
! FIXING THRESHOLDS OF DRUG USE VARIABLES
[tab1$1 tab2$1 tab3$1] (4);
[tab1$2 tab2$2 tab3$2] (5);
[tab1$3 tab2$3 tab3$3] (6);
[tab1$4 tab2$4 tab3$4] (7);
[alc1$1 alc2$1 alc3$1] (8);
[alc1$2 alc2$2 alc3$2] (9);
[alc1$3 alc2$3 alc3$3] (10);
[alc1$4 alc2$4 alc3$4] (11);
[coca1$1 coca2$1 coca3$1] (12);
[coca1$2 coca2$2 coca3$2] (13);
[mar1$1 mar2$1 mar3$1] (14);
[mar1$2 mar2$2 mar3$2] (15);
[mar1$3 mar2$3 mar3$3] (16);
[mar1$4 mar2$4 mar3$4] (17);
Thanks!


You can hold the residual variances equal across time as follows:
alc1 (21)
coca1 (22)
mar1 (23);
alc2 (21)
coca2 (22)
mar2 (23);
alc3 (21)
coca3 (22)
mar3 (23);
See the Topic 1 course handout toward the end of the multiple group section where inputs are given for testing factor means, variances, and covariances. You can adapt that to the multiple time point setting.


Thank you! I did it as you specified, but I obtained: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 37. Do I have to specify anything else? I am using PARAMETERIZATION = THETA. 


With Theta, residual variances must be one at one timepoint and free in the others. The test of equality is a test of that model versus a model where all residual variances are fixed at one. 


Sorry, I don't understand. Using Theta is the only way to do it, right? If I don't use Theta, I get: Variances for categorical outcomes can only be specified using PARAMETERIZATION=THETA with estimators WLS, WLSM, or WLSMV. When could I use your suggestion? alc1 (21) coca1 (22) mar1 (23); In addition, how could I set all residual variances fixed at one. Thanks again! 


One model would have:
alc1@1 coca1 mar1;
The other model would have:
alc1@1 coca1@1 mar1@1;


Dear Linda Muthén, I want to perform a Multiple group analysis with 1 latent variable with the restriction that the loadings across the groups have the same sign (e.g. to be positive) but apart from that they can take any positive value. Is it possible to impose such a restriction on a model? thank you. 


You can do this using MODEL CONSTRAINT by specifying that each factor loading must be greater than zero. See the user's guide for further information.


Hello, I am attempting to test the invariance across grade-level groupings of residual and factor covariances in a CFA involving 49 observed variables across 10 factors. The sample size is 493 (216 grade 9s and 277 grade 8s). The problem is that I can only constrain about ten covariances (of either kind) to be equal across groups, in addition to constraints on factor variances and loadings, before the model ceases to converge. (NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.) I would need to constrain well more than 1000 covariances in order to get them all, so I believe there must be a more efficient way to do this than I have been able to find, so far, in the User's Guide or in Barbara Byrne's latest book (2012), both of which seem to suggest that each covariance must be constrained with its own line of code, for example:
item1 WITH item2 (1);
item1 WITH item3 (2);
item1 WITH item4 (3);
Etc. Your advice would be hugely appreciated!


I don't know why you are constraining covariances among residuals for the items. That is not a typical analysis and they can't all be identified. You don't need a separate line for each parameter label if you put a semicolon between the statements.


Thank you for your response. Does it hold as well for factor covariances? Much appreciated. 


All factors can be covaried. 

Emma Thomas posted on Thursday, February 04, 2016 - 9:38 pm



Dear Drs Muthen, I am testing the measurement invariance of a single latent factor model with four indicators, across six different national samples. The indicators are continuous. When I run the CFA for each of the groups separately, the model fit is acceptable to excellent. When you add the chi-squares of the 6 individual models you get 20.79. Based on these separate models, I selected a referent marker item to set to unity in the simultaneous test. However, when I try to estimate the same model simultaneously for the different samples (the baseline test) I get a chi-square of 218 and the fit is terrible. I haven't been able to work out where there is an error in my model specification. My syntax is:
GROUPING IS group (1 = g1 2 = g2 3 = g3 4 = g4 5 = g5 6 = g6);
USEVARIABLE y1 y2 y3 y4;
MISSING ARE ALL (999);
MODEL:
f1 BY y1* y2 y3 y4@1;
MODEL g2:
f1 BY y1* y2 y3;
MODEL g3:
f1 BY y1* y2 y3;
MODEL g4:
f1 BY y1* y2 y3;
MODEL g5:
f1 BY y1* y2 y3;
MODEL g6:
f1 BY y1* y2 y3;
[y1 y2 y3 y4];
[f1@0];
I'm sure I'm missing something obvious but any advice you have would be gratefully received! Cheers, Emma


Move [y1 y2 y3 y4]; [f1@0]; from MODEL g6 to MODEL. 
