I would like to test increasingly restrictive hypotheses about intercepts, factor loadings, and residual variances and am running into two problems.
1. I cannot figure out how to restrict the residual variances across three groups. Mplus normally works, from my experience and the manual, so that the default is to equate intercepts and thresholds but not residual variances. However, the parenthesis command only seems to work within groups but not across groups. How is the equating of residual variances done?
2. When I am freeing the intercepts across the groups, I am not able to free all of them but, instead, have to fix at least one to equality across groups. I believe this is an identification requirement. Is my understanding correct?
Thank you for answering these terribly simple questions and thank you for your excellent research, this great product, and your valuable time!
I am attempting to test measurement invariance in two groups with binary data (ten items on a single underlying factor). No data are missing (it's a simulated dataset). I am trying to follow Vandenberg & Lance's (2000) suggestions on how to sequentially test MI, but am having trouble implementing them in Mplus; I have a workaround, but am unsure if it's kosher.
I want to allow the means and variances of the two groups to differ, but then test for MI by constraining all factor loadings (FLs) and thresholds to be the same, and then allowing them to differ between groups. If MI is violated, I want to be able to follow up and see which specific FLs or thresholds are the violators.
I have been able to do this, but I have been forced to set the residual variances across groups equal to one another (all = 1; I'm using THETA parameterization). This to me seems fine, but is there some problem I'm overlooking? Here's the guts of the syntax:
________________________________________ FIRST SCRIPT; ALL THRESHOLDS & FLS CONSTRAINED TO BE SAME BW GROUPS: ________________________________________
After comparing the fit stats (CFI, TLI, etc as recommended by Vandenberg; I'm not interested in testing using chi-square) between these two models, I want to follow-up and begin constraining the invariant FLs and thresholds to be the same while allowing the noninvariant ones to differ, thereby tracking down where the problems are.
Any help you could provide would be appreciated. Is constraining the residual variances across groups OK? Any other problems with my approach? I'd be happy to send the full scripts along. Best,
I suspect that the suggestions you are following are for continuous outcomes.
The models we recommend for testing measurement invariance of categorical outcomes for the default Delta parameterization are:
1. The Mplus default for multiple groups where thresholds and factor loadings are constrained to be equal across groups, factor means are zero in the first group and free in the other and scale factors are fixed to one in the first group and free in the others.
2. A model in which thresholds and factor loadings are free across groups, factor means are zero in all groups and scale factors are fixed to one in all groups. With categorical outcomes, thresholds and factor loadings need to be freed and constrained in tandem.
3. See Example 5.16 for an example of partial measurement invariance.
For Theta, just substitute residual variances for scale factors and see Example 5.17.
I was wondering if someone could explain further about the need to simultaneously constrain loadings and thresholds in categorical invariance models. Working sequentially (discrimination first, then difficulty) is commonplace in the IRT universe. Does the distinction involve the different parameterization in Mplus? That is, in IRT we have a(theta-b), in Mplus a*theta+b. Bengt has worked through the consequences of this distinction in a couple of papers, but it still isn't clear to me why it wouldn't work to fix the loadings first, then proceed to the thresholds conditional on equal loadings.
Or, if you convince me that the sequential approach only works in an IRT parameterization, in Mplus 4.0 couldn't that be accomplished with a constraint block? That would be interesting to try.
I think the multiple-group IRT model used in the Mplus WLSMV context is more general than the conventional model. This is due to being able to handle group-varying variances for the u* latent response variables (e.g. due to varying residual variances). That general model brings special identification issues. I think in the conventional case it is possible to do it stepwise. On the other hand, it seems like it is natural to ask how much the whole item curve differs across groups - and the curve is determined by both parameters. Conventional IRT used to discuss DIF in terms of areas between the curves which also is in line with looking at both parameters jointly.
The User's Guide example 3.12 modifies the model diagram in ex 3.11 by changing y1, y2, y3 to categorical variables. When using WLSMV, the probit link is used and u*_1, u*_2 are used as predictors for u3 (with ML, the logit link is used and the actual u_1, u_2 values are used as predictors). Ex3.12 uses the Delta parameterization, whereas ex3.13 uses Theta. Using u* does not really relate to scale factors. Scale factors are the inverted SD's of the u*'s given x's. Ex3.12 does not have any free scale factor parameters. Hope that was what you asked about.
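The statement that scale factors are the inverted SDs of the u* variables can be made concrete with a small numeric sketch. It assumes the single-factor setting discussed in this thread, where the variance of u* is loading^2 * factor variance + residual variance; the function names and all numbers are hypothetical and are not Mplus output.

```python
import math


def scale_factor(loading, factor_variance, residual_variance):
    """Scale factor = 1 / SD(u*), assuming Var(u*) = loading^2 * psi + theta."""
    return 1.0 / math.sqrt(loading ** 2 * factor_variance + residual_variance)


def implied_residual_variance(loading, factor_variance, scale=1.0):
    """Residual variance implied by a given scale factor (Delta-style view)."""
    return scale ** -2 - loading ** 2 * factor_variance


# Hypothetical item: loading 0.8, factor variance 1.0.
theta = implied_residual_variance(0.8, 1.0, scale=1.0)   # scale factor fixed at 1
delta = scale_factor(0.8, 1.0, theta)                    # recovers the scale factor
```

Under these assumptions, fixing the scale factor in Delta and fixing the residual variance in Theta are two ways of pinning down the same quantity, the variance of u*.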
frank rijmen posted on Wednesday, September 13, 2006 - 4:17 am
you can do it the irt-way as follows. in irt, we have a*(theta-b), in mplus there is actually scale*(loading*theta-b). in a single group, scale or loading has to be fixed for each item. in a multiple group with no across group restrictions, scale or loading has to be fixed in each group. mplus restricts the scales to 1 by default, but in IRT terms it makes more sense to restrict the loadings to 1 (as 'scale' corresponds to the discrimination parameter). equal discrimination parameters/item locations are imposed by having equal scales/thresholds across groups. testing for both simultaneously, you test a model in which thresholds and scales are equal across groups, and loadings set to one in both groups. for two groups, there are two parameters less for each restricted item:
unrestricted model: group 1:scale_1*(theta-b_1) group 2: scale_2*(theta-b_2)
restricted model: group 1:scale_1*(theta-b_1) group 2: scale_1*(theta-b_1)
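The correspondence between scale*(loading*theta-b) and the IRT form a*(theta-b) can be checked numerically. This is a minimal sketch of the algebra only; the function names and parameter values are hypothetical.

```python
def mplus_to_irt(scale, loading, threshold):
    """Rewrite scale*(loading*theta - threshold) as a*(theta - b)."""
    a = scale * loading          # IRT discrimination
    b = threshold / loading      # IRT location
    return a, b


def mplus_predictor(scale, loading, threshold, theta):
    """Linear predictor in the scale*(loading*theta - threshold) form."""
    return scale * (loading * theta - threshold)


# Hypothetical item with the loading fixed to 1, as suggested in the post.
scale, loading, threshold = 1.3, 1.0, 0.4
a, b = mplus_to_irt(scale, loading, threshold)

theta_values = [-2.0, 0.0, 1.5]
mplus_form = [mplus_predictor(scale, loading, threshold, t) for t in theta_values]
irt_form = [a * (t - b) for t in theta_values]
```

With the loading fixed to 1, the scale plays the role of the discrimination a and the threshold plays the role of the location b, which is why equal scales and thresholds across groups correspond to equal IRT item parameters.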
frank rijmen posted on Wednesday, September 13, 2006 - 4:17 am
the procedure proposed in mplus is, in irt terms, testing for item location invariance WHILE allowing for differences in discrimination. first, for two groups, note that, by allowing for noninvariant loading and threshold for an item, the model contains only one parameter more: threshold and loading in the second group are free, but its scale then has to be fixed to one. so, this is for sure not testing, again in IRT terms, a model with the same item parameters across groups versus group-specific item parameters.
what the proposed mplus procedure tests is, subscripts referring to groups:
unrestricted model: group 1:loading_1*theta-b_1 group 2: loading_2*theta-b_2
restricted model: group 1:loading_1*theta-b_1 group 2: scale_2*(loading_1*theta-b_1)
what this actually tests is more transparent if we consider the fact that the unrestricted model is equivalent to unrestricted model bis: group 1:loading_1*theta-b_1 group 2: scale_2*(loading_1*theta-b*_2) (in group 2 we do not fix the scale to 1 but the loading to the loading of the item in the first group)
hence, the actual test performed in mplus, in IRT terms, is to test for item location invariance only (b_1==b*_2)
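The equivalence between the unrestricted model and "unrestricted model bis" for group 2 is a reparameterization, which the following sketch verifies numerically. It assumes nonzero loadings; the helper name and the parameter values are hypothetical.

```python
def to_bis(loading_1, loading_2, b_2):
    """Rewrite loading_2*theta - b_2 as scale_2*(loading_1*theta - b_star_2)."""
    scale_2 = loading_2 / loading_1
    b_star_2 = b_2 / scale_2
    return scale_2, b_star_2


# Hypothetical group-specific parameters for one item.
loading_1, b_1 = 0.9, 0.3
loading_2, b_2 = 1.2, 0.4

scale_2, b_star_2 = to_bis(loading_1, loading_2, b_2)

theta_values = [-1.0, 0.0, 2.0]
unrestricted = [loading_2 * t - b_2 for t in theta_values]
bis = [scale_2 * (loading_1 * t - b_star_2) for t in theta_values]

# The restricted model corresponds to b_star_2 == b_1.
restriction_holds = abs(b_star_2 - b_1) < 1e-9
```

With these particular numbers b*_2 happens to equal b_1 even though both the loading and the threshold differ across groups, so the restricted model would fit exactly; this is the sense in which the test concerns item location only.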
I would like to test increasingly restrictive invariance hypotheses in a CFA with continuous indicators. In the manual (page 345) you suggest four steps which generally follow what is found in the literature. I was wondering if you had Mplus syntax examples for these four steps (especially 1 and 2, which involve changing the defaults of the program).
Dear discussion board, I'm a bit confused: I want to know whether looking for measurement invariance is futile if my 2 groups differ on average item score. I have a set of categorical items loading onto one factor, for 2 groups (male and female). I already know the females score higher overall if I just sum the items, so I'd expect the thresholds to vary across groups. So am I finding out anything new if I find significant measurement invariance between males and females (if, as I understand, I must equate the thresholds and factor indicators simultaneously)? Would it not be possible to estimate thresholds in a prior run, and then fix them to the values they are estimated at? Then I could test the significance of equating just the factor loadings, which is what I am more interested in. Thanks very much
Group differences in item means can be represented by group differences in factor means even when thresholds are the same across groups. Measurement invariance says that the thresholds are the same across groups for a given factor score.
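A small numeric sketch of this point: with identical thresholds and loadings, a difference in factor means alone shifts the proportion endorsing an item. It assumes a single-factor probit model with normally distributed factor and residual; all parameter values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def endorsement_probability(threshold, loading, factor_mean,
                            factor_variance, residual_variance):
    """P(u = 1) when u = 1 if u* = loading*eta + residual exceeds the threshold."""
    sd_ustar = math.sqrt(loading ** 2 * factor_variance + residual_variance)
    return normal_cdf((loading * factor_mean - threshold) / sd_ustar)


# Same threshold and loading in both groups; only the factor mean differs.
threshold, loading = 0.2, 0.8
p_group1 = endorsement_probability(threshold, loading, 0.0, 1.0, 0.36)
p_group2 = endorsement_probability(threshold, loading, 0.5, 1.0, 0.36)
```

Here the second group has the higher endorsement probability although its measurement parameters are identical to the first group's, so a difference in sum scores does not by itself imply non-invariant thresholds.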
Thanks Linda. Does that mean I could test for measurement invariance whilst allowing for group differences in item means by doing the following models for WLS with delta parameterization:
1. Thresholds and factor loadings free across groups; factor means fixed to zero in group one and free in the second group; scale factors fixed to 1 in all groups.
2. Thresholds and factor loadings constrained to be equal across groups; factor means fixed to zero in group one and free in the second group; scale factors fixed to 1 in group one and free in second group.
(i.e. as suggested in chapter 13, but allowing factor mean differences in both models). I'm asking partly because I'm not totally clear what the scale factors do and so am not sure when they should be fixed or free.
The models we recommend in the user's guide for weighted least squares and the Delta parameterization are:
1. Thresholds and factor loadings free across groups; scale factors fixed to one in all groups; factor means fixed to zero in all groups.
2. Thresholds and factor loadings constrained to be equal across groups; scale factors fixed to one in one group and free in the others; factor means fixed to zero in one group and free in the others (the Mplus default).
Sure, but I don't think model 1 as you recommend allows for group differences in factor means (and so does not allow for group differences in item means). I would like to test for measurement invariance by comparing two models that both allow for group differences in factor means, but the more restrictive model forces thresholds and factor loadings to be equal. Is this possible? Sorry if I'm being thick
Thanks. Please could you tell me a way to test for measurement invariance while allowing factor means to be different across groups. Would it be OK to find one or two items whose thresholds and factor loadings can be equated across groups, and equate these to allow factor means to be free (for identification purposes)?
A model with thresholds, factor loadings, and factor means free across groups is not identified. Measurement invariance can be tested by using the models discussed previously. When the thresholds and factor loadings are free across groups and the factor means are fixed to zero in all groups, this is the same as analyzing each group separately.
This question follows a current SEMNET discussion. Part 1. When testing invariance hypotheses in CFA with categorical indicators (WLSMV), the chi-square value cannot be used for chi-square difference tests (DIFFTEST has to be used). Does this also mean that the obtained CFI and RMSEA values cannot be used for nested models difference testing? Part 2. If they can be used, how should we interpret improvement in fit (CFI, RMSEA) with the addition of constraints?
I was referring to Cheung and Rensvold's (2002) suggestions, made for continuous outcomes in the context of measurement invariance testing. The authors suggest that when the CFI changes more than .01 with the addition of invariance constraints (i.e. equal loadings versus configural invariance), the invariance hypothesis should be rejected. Chen (2007) obtained similar results for the CFI and RMSEA. I should have said invariance instead of nested models.
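The CFI-change rule described above amounts to a one-line comparison. The sketch below uses hypothetical CFI values and hypothetical function names; the .01 cutoff is the one quoted in the post, proposed for continuous outcomes.

```python
def cfi_drop(cfi_less_constrained, cfi_more_constrained):
    """Decrease in CFI when invariance constraints are added."""
    return cfi_less_constrained - cfi_more_constrained


def reject_invariance(cfi_less_constrained, cfi_more_constrained, cutoff=0.01):
    """Flag the invariance hypothesis when CFI drops by more than the cutoff."""
    return cfi_drop(cfi_less_constrained, cfi_more_constrained) > cutoff


# Hypothetical fit results: configural model vs. equal-loadings model.
large_drop = reject_invariance(0.975, 0.960)   # drop of about .015
small_drop = reject_invariance(0.975, 0.970)   # drop of about .005
```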
With categorical outcomes and WLSMV, the "value and df" of the chi-square cannot be used for invariance testing and DIFFTEST should be used. Since the CFI and RMSEA are computed on the basis of chi-square, do you believe that they can still be used in this context (invariance testing)? The fact that Yu's (2002) dissertation found similar cutoff points for the WLSMV and ML fit indices (for absolute fit) seems to argue in favor of this idea.
Yes, I know. Was wondering whether you thought so too... French & ?(SEM, 2006) did a preliminary simulation study on this (just read it). Found out that changes in CFI lacked power (for dichotomous items). Guess I'll have to wait.
MAH posted on Wednesday, December 03, 2008 - 10:22 am
I have a question about a 2-group (multigroup) analysis of measurement invariance of categorical items. In the first model where loadings and thresholds are free across groups, the loading of one indicator in each group is set to one. The indicator with loading set to 1 is the same across groups. Thus, this implies loading invariance of this item across groups. my question is, in fitting a model allowing parameters to be different across groups, you have to hold an item loading = across groups, but what should be done with the threshold and scale factor for this indicator? Here are my two thoughts:
1) The threshold for the indicator w/ loading set to 1 in both groups should be free across groups and the scale factor set to 1 in both groups. then, in the subsequent model, fix not only the loading but the threshold to be = across groups. then iterate, substituting which item has loading set to 1.
2) The threshold for the item w/ loading set to 1 in both groups should be constrained to be = in both groups and the scale factor should be freely estimated in group 2. thus, assume inv of this item while actually testing invariance of all other items. then you could select another item to hold invariant (loading set to 1, thresholds constrained), while you test invariance of the first item.
The thresholds for all items should be treated the same.
Simon Denny posted on Sunday, August 09, 2009 - 4:03 pm
I have a question about testing for measurement invariance in a two-level model using CFA with covariates.
Do the direct and indirect associations between covariates (in my case age, gender etc) and factor indicators need to be opposite signs for there to be measurement invariance? What happens if they both go in the same direction ie there are direct relationships that are not mediated by the factors but are attenuated by them?
That would be ok to do - it is called partial invariance. You should fix at 1 the loading of one of the two items that didn't change.
Fatma Ayyad posted on Friday, November 05, 2010 - 8:41 am
Dear Dr. Muthen, I am conducting MGCFA across different cultural groups. I have 8 categorical items, WLSMV estimator, and I am using Mplus version 6. The groups showed weak factorial invariance. My question: I want to run a partial invariance test. Do I have to compare the MI of the items with the chi-square difference value? Or with the chi-square critical value found in the chi-square distribution table for the given degrees of freedom and alpha level?
I am trying to conduct measurement invariance testing with categorical indicators. I have seen in the manual how to specify when an indicator/threshold is to be free in one group but not the other. My problem is that I am not sure if I should conduct the testing with the following steps:
1) Test a model where factor loadings and thresholds are free in both groups.
2) Constrain just the factor loadings (and leave the thresholds free across groups), to test for invariance with just the factor loadings. If MIs suggest freeing constraints on certain factor loadings, then let them be free between groups.
3) Constrain invariant factor loadings from above, and then test for invariance for the thresholds.
I believe you would take these equivalent steps with continuous indicators, except that you are modeling intercepts instead of thresholds. Do you take these analogous steps when using categorical indicators? I have tried this, but the DIFFTEST option said that my models were not nested (when I had partial invariance of the factor loadings and tried to constrain only some of the thresholds).
We recommend that thresholds and factor loadings be constrained or unconstrained in tandem because the item probability curve is influenced by both parameters. See pages 433-435 of the current user's guide to see the models we suggest. The details differ depending on the estimator used.
Thank you! I actually have one more question about sample size discrepancies between groups when conducting multigroup analyses. When trying to compare my two groups, Group 1 n = 15,000, while Group 2 n = 1,400. I am sure that this large difference will impact my parameter estimates. If I was to take a random sample of participants from Group 1 to compare to Group 2, what is an appropriate sample size for Group 1? Should I just take approximately 1400 from Group 1, or a larger sample?
Hello, I would like to assess measurement invariance. I am new in Mplus. Each manifest variable is separately measured by wave, eg, mar1, mar2, etc. the last number of the variables represents the corresponding wave.
I have fixed the factor loadings and thresholds. How could I test residual invariance? How could I explore the latent factor means or estimate them freely?
! Drug factor
D1 by tab1
alc1 (1)
coca1 (2)
mar1 (3);
You can hold the residual variances equal across time as follows:
alc1 (21)
coca1 (22)
mar1 (23);
alc2 (21)
coca2 (22)
mar2 (23);
alc3 (21)
coca3 (22)
mar3 (23);
See the Topic 1 course handout toward the end of the multiple group section where inputs are given for testing factor means, variances, and covariances. You can adapt that to the multiple time point setting.
I want to perform a Multiple group analysis with 1 latent variable with the restriction that the loadings across the groups have the same sign (e.g. to be positive) but apart from that they can take any positive value. Is it possible to impose such a restriction on a model?
I am attempting to test the invariance across grade-level groupings of residual and factor covariances in a CFA involving 49 observed variables across 10 factors. The sample size is 493 (216 grade 9s and 277 grade 8s). The problem is that I can only constrain about ten covariances (of either kind) to be equal across groups, in addition to constraints on factor variances, loadings, before the model ceases to converge. (NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.) I would need to constrain well more than 1000 covariances in order to get them all, so I believe there must be a more efficient way to do this than I have been able to find, so far, in the User's Guide or in Barbara Byrne's latest book (2012), both of which seem to suggest that each covariance must be constrained with its own line of code, for example:
item1 WITH item2 (1);
item1 WITH item3 (2);
item1 WITH item4 (3);
Etc.
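If each covariance does need its own labeled statement, the lines can at least be generated rather than typed. The following is a rough sketch that writes out statements in the pattern shown above; the variable names item1 to item49 and the helper name are hypothetical, and it only produces text to paste into an input file. It does not address the convergence problem, and it assumes every pairwise covariance is wanted, which may be more than the model actually needs.

```python
from itertools import combinations


def with_constraints(variable_names, first_label=1):
    """Build one labeled 'a WITH b (k);' statement per pair of variables."""
    lines = []
    pairs = combinations(variable_names, 2)
    for label, (a, b) in enumerate(pairs, start=first_label):
        lines.append(f"{a} WITH {b} ({label});")
    return lines


# Hypothetical names for the 49 observed variables.
items = [f"item{i}" for i in range(1, 50)]
statements = with_constraints(items)

# Show the first few generated statements.
for line in statements[:3]:
    print(line)
```

For 49 variables this yields 49*48/2 = 1176 statements, which matches the "well more than 1000" figure in the question.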