Anonymous posted on Thursday, March 07, 2013 - 7:47 am
I've been reading your online notes on multiple group analysis with categorical outcomes. After the configural invariance step you suggest going straight to a model where intercepts and slopes etc are constrained across groups and the factor means are freed. This is different from the continuous case where one might examine the factor loadings first and then add in the intercepts in two separate steps, and I was wondering why this would be.
This difference between the continuous item case and the categorical items case is due to having less information with categorical items. With binary items there is an identification issue that prevents testing of loading invariance only (metric invariance), at least when allowing group-varying residual variances. We therefore recommend going straight from configural to scalar invariance. With polytomous items it is possible to identify and analyze the metric model. But even so, it is not as straightforward as with continuous items. Roger Millsap has written on the polytomous case; see for instance his book Statistical Approaches to Measurement Invariance.
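As a rough illustration of the configural step for binary items, the following Mplus input is a minimal sketch under the Theta parameterization. The file name, variable names (u1-u6, g) and group labels (g1, g2) are hypothetical, and the settings (loadings and thresholds free across groups, residual variances fixed at one and factor means fixed at zero in all groups) reflect one common reading of the User's Guide description; they should be checked against the guide before use.

```
TITLE:    Configural model, binary items (sketch; all names hypothetical)
DATA:     FILE = mydata.dat;
VARIABLE: NAMES = u1-u6 g;
          CATEGORICAL = u1-u6;
          GROUPING = g (1 = g1 2 = g2);
ANALYSIS: ESTIMATOR = WLSMV;
          PARAMETERIZATION = THETA;
MODEL:    f BY u1-u6;        ! first loading fixed at 1 by default
          [f@0];             ! factor mean fixed at zero
          u1-u6@1;           ! residual variances fixed at one
MODEL g2: f BY u2-u6;        ! loadings free in group g2 (u1 stays at 1)
          [u1$1-u6$1];       ! thresholds free in group g2
          [f@0];             ! factor mean fixed at zero in g2 as well
          u1-u6@1;           ! residual variances fixed at one in g2 as well
```

Removing the `MODEL g2:` block together with the `[f@0]` and `u1-u6@1` statements should leave the program defaults for this setting, which, as far as the documentation describes them, correspond to the scalar model: loadings and thresholds equal across groups, with residual variances and factor means fixed in the first group and free in the others.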
Anonymous posted on Friday, March 08, 2013 - 12:33 am
Thank you, that is extremely helpful. There are a variety of estimators that I can use in Mplus when fitting these models; my outcomes are binary. I recall being told once that ML is associated with the Differential Item Functioning/Item Response Theory approach, and WLSMV is associated with the CFA approach. Is that correct?
No, that is not correct. IRT and CFA with categorical indicators are the same model. You can use either ML or WLSMV as estimators when you have categorical variables and factors. With ML, each factor requires one dimension of integration and each residual correlation also requires one dimension of integration so ML can become computationally heavy with several factors. In this case, WLSMV is preferred. Both ML and WLSMV are good for IRT as is Bayes.
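To see why ML can become computationally heavy, a back-of-the-envelope count helps. The sketch below assumes a full product grid with Q quadrature points per dimension; Q = 15 is used purely for illustration, and the actual number of points depends on the integration settings chosen.

```latex
\[
\text{grid points} = Q^{d},
\qquad
d = (\text{number of factors}) + (\text{number of residual correlations})
\]
\[
Q = 15:\quad
d = 1 \Rightarrow 15,\qquad
d = 2 \Rightarrow 225,\qquad
d = 3 \Rightarrow 3375,\qquad
d = 4 \Rightarrow 50625
\]
```

Each added factor or residual correlation multiplies the work by Q, which is why WLSMV is the more practical choice once there are several factors.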
Anonymous posted on Monday, March 11, 2013 - 9:01 am
I'm trying to use the steps you outline for testing measurement invariance across multiple groups with categorical outcomes, within a MACS framework. Having obtained my baseline models for each group, I have fit the configural model for each group as e.g.
According to your outline, would the next equivalent step be a model in which the loadings and intercepts are constrained to be the same across groups, with the means still fixed at zero in each group, factor variances at 0 in each group, and scale factors fixed at 1 in the first group and freely estimated in the other groups? I can estimate this model (it isn't a great fit), but end up having to examine MI for factor means and slopes/thresholds simultaneously, something I would prefer not to do. I also wondered about using DIFFTEST to compare models in the invariance sequence, since when the scale factors are freely estimated I end up with more parameters in a simpler (i.e. loadings constrained) model than in the more complex model (loadings unconstrained but scale factors fixed).
Any pointers you can offer will be much appreciated as always!
Please see page 485 of the user's guide and the Topic 2 course handout on the website where the inputs are given under multiple group analysis. Factor variance should not be fixed to zero. If you continue to have problems, send the output and your license number to email@example.com.
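For the DIFFTEST question raised above, the usual two-run pattern can be sketched as follows. The file name deriv.dat is hypothetical, and only the relevant commands are shown; each run still needs its own DATA, VARIABLE and MODEL commands.

```
! Run 1: the less restrictive (H1) model, e.g. the configural model.
! Save the derivatives needed for the difference test.
SAVEDATA: DIFFTEST = deriv.dat;

! Run 2: the more restrictive (H0) model, e.g. the scalar model,
! in a separate input file. Read the derivatives saved by run 1.
ANALYSIS: ESTIMATOR = WLSMV;
          DIFFTEST = deriv.dat;
```

The second run reports the chi-square difference test. The comparison is only meaningful when the H0 model is nested within the H1 model.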
I was wondering whether it makes sense to test for residual variance invariance (as can be done in CFAs with continuous observed variables), once scalar invariance has been established in a multigroup CFA with categorical data using the Theta parameterization. Thus, after scalar invariance has been found (following the procedure as depicted on page 486 of the Mplus manual), would it make sense to estimate a third model in which the residual variances are again fixed to be one in all groups, and compare this model to model #2 described at the top of page 486?
Hi Dr Muthen, can I just clarify: is the configural model one where factor loadings and intercepts are free to vary and no equality constraints are imposed across groups? I realize Mplus by default holds factor loadings and intercepts equal across groups to test measurement invariance. Does this mean that in testing invariance of factor loadings alone, one has to override the default equality constraint on the intercepts as the first step?
Thanks very much Dr Muthen, that's fantastic! I notice you don't get std estimates in the output using this option. Is there a way around this? Also, in testing latent mean differences across groups, I fixed the factor mean of the reference group to 0 and its variance to 1 for comparison, while letting the factor mean and variance of the other group be free. This was done with equal intercepts across groups (scalar model), and the model showed a good fit. Does this mean that the latent mean structure differs across groups?
Thanks very much Dr Muthen. I've read your notes on multigroup CFA. I've tested configural, metric, and scalar invariance, and further, invariance of factor variance and residual variance, on each of the five subscales from a measure I developed. I also looked at latent mean differences across groups. As predicted, two of the subscales tested did not pass the scalar test; that is, the intercepts varied across groups. What do you do in this case?
This means you have partial measurement invariance. Please see the Topic 1 course handout and video where this is discussed under multiple group analysis.
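For concreteness, the third model proposed in this question might be written as in the sketch below. Variable and group names (u1-u6, g2) are hypothetical, and the sketch only shows how the restriction could be specified; whether the comparison is advisable is the question being asked.

```
! Scalar model with residual variances additionally fixed at one in every group
! (Theta parameterization assumed; DATA, VARIABLE and ANALYSIS commands omitted).
MODEL:    f BY u1-u6;        ! loadings and thresholds equal across groups by default
MODEL g2: u1-u6@1;           ! residual variances fixed at one in the second group too
```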
Alvin posted on Tuesday, August 19, 2014 - 11:31 pm
Hi Dr Muthen, I was wondering how you test partial invariance (by freeing parameters) using the latest feature of MI testing in Mplus? Say you were to release constraints for some of the items: do you use class-specific syntax? Thanks
Alvin posted on Tuesday, August 19, 2014 - 11:59 pm
A follow-up question is that in my output, while the chi-square test is significant for each model (1 configural, 2 metric, 3 scalar), LRT tests for nested model comparisons (between 1-3, 2-3, 1-2) are not significant. Does this mean invariance across all levels?
You should get a well-fitting configural model as a first step. Once you do that you should test it against the other models.
Xu, Man posted on Friday, December 25, 2015 - 11:53 am
I would like to check something on number of parameters regarding residual variances under THETA and WLSMV setting (in Mplus), as I am a bit confused about the degree of freedom. Take a two-group example of six binary items with one factor. The number of model parameters would be as follows:
It's been suggested that the strong invariance model be compared against the configural model directly, so it should be a test with 5 degrees of freedom (26 vs. 21). I am a bit confused because there are actually 10 loadings and 12 thresholds involved here, but the difference in degrees of freedom is only 5. Is this correct or have I missed something?
Xu, Man posted on Friday, December 25, 2015 - 11:59 am
Sorry, I miscalculated; there should be only 6 residual variances in the strong invariance model, but it still leaves 5+6+6+2+1=20 parameters. That is only 4 fewer than the configural invariance model.
4 parameters difference sounds right. The binary case is different in that the metric (weak) invariance model is not identified with 6 residual variances.
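The count can be laid out explicitly. The breakdown below is an assumption about what each term stands for (first loading fixed at one, factor variances free, Theta parameterization); only the sum 5+6+6+2+1=20 and the difference of 4 are stated in the posts above.

```latex
\begin{align*}
\text{Configural:}\quad
  & 2 \times (\underbrace{5}_{\text{loadings}} + \underbrace{6}_{\text{thresholds}}
    + \underbrace{1}_{\text{factor variance}}) = 24 \\
\text{Scalar:}\quad
  & \underbrace{5}_{\text{loadings}} + \underbrace{6}_{\text{thresholds}}
    + \underbrace{6}_{\text{residual variances, group 2}}
    + \underbrace{2}_{\text{factor variances}}
    + \underbrace{1}_{\text{factor mean, group 2}} = 20 \\
\text{Difference:}\quad
  & 24 - 20 = 4
\end{align*}
```

Constraining 5 loadings and 6 thresholds removes 11 parameters, but freeing 6 residual variances and 1 factor mean in the second group adds 7 back, leaving a net difference of 4.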
Xu, Man posted on Monday, February 29, 2016 - 8:59 am
Dear Dr. Muthen,
I am carrying out an analysis with repeated measures of 6 binary items (two waves). I also wanted to check the mean change of the latent factors. Holding loadings and thresholds equal over time, I noticed that the difference in latent means depended on the constraints placed on the residual variances of the binary indicators. If I specify the residual variances as equal, then there is a latent mean difference, but this difference disappears if I allow the residual variances to differ between the two waves.
I am a bit puzzled, mostly because I thought latent means are not supposed to depend on residual variances. Could it be that things are different in the case of binary variables?
Dennis Li posted on Saturday, July 30, 2016 - 4:32 am
I am trying to run invariance testing on a 6-class LPA with a known class, but I am having trouble with the syntax. I get the error message "Measurement invariance testing...is only available for TYPE=MIXTURE with one categorical latent variable and the KNOWNCLASS option." Is it possible to test my LPA against a known class using this method? My (abridged) syntax is:
VARIABLE: Classes = c(6) g(2); ! Here's the source of the error
Knownclass = g (g=1 g=2);
ANALYSIS: Type = mixture;
Model = configural metric scalar;
Hi, we have run a model with two correlated factors for two groups. For scalar invariance, when factor means are fixed at 0 in group 1 and free in group 2, the values are .12 and -.14 for group 2, both with p-values < .01. This implies that the two groups differ in their means, right? The fit of the model with fixed loadings and intercepts/thresholds is good, though.
What puzzles us is that when comparing factor scores, which we had estimated in a previous step (for both groups combined), the mean factor scores for group 1 and group 2 are more or less the same.
Have we misspecified the multi-group model in any way?
Could it be due to the fact that our categorical indicators have varying numbers of categories and differing values?
Thanks for your reply; we have solved the above question.
We have an additional question:
When we ask Mplus to save factor scores in the model in which we test scalar invariance (fixing loadings, thresholds, intercepts), the factor scores we receive for groups G1 and G2 differ (this would be in line with the means of both factors F1 and F2 being different between the groups).
When we simply specify a common model (no specification for groups), do not fix any parameters (apart from the factor variance @1 in order to identify the model), and request factor scores, the two groups don't differ.
Is the difference due to fixing the parameters in the scalar model?
Kathy Xiao posted on Monday, January 23, 2017 - 5:37 pm
Dear Dr. Muthen, I am doing a measurement invariance testing with 3 racial groups (black, latino, white) on a 13-item construct.
MODEL: F by F1-F13
After fitting the baseline model for each racial group separately (USEOBSERVATION to select group), I got the following model fit: RMSEA (CI): black: .065 (.690, .782); latino: .083 (.710, .778); white: .071 (.647, .689). The CFI/TLI are all above 0.95.
Given the relatively bad fit of RMSEA, does that mean I cannot continue the invariance testing? Or is there modification of the model I can do?
That's reasonable. But you may want to discuss these general analysis strategies on SEMNET.
Kathy Xiao posted on Tuesday, January 24, 2017 - 6:20 pm
Thanks for your reply. I will move the discussion to SEMNET later. One question on MI of multiple group:
I further tested the measurement invariance and found non-invariance. So I proceeded to test the invariance of loadings by comparing the metric model to a model with one loading freed at a time. However, I found that freeing any one of the loadings led to a significant change in model fit (p<0.05), and this is the case for all the pairwise group comparisons.
Do you think this suggests non-invariance of this construct across racial groups? Is there anything else that needs to be considered in the analysis?
Kathy Xiao posted on Tuesday, January 31, 2017 - 11:26 pm
Dear Dr. Muthen,
I am testing measurement invariance with 3 groups; the outcome is a 12-item categorical measure with 4 response options per item. I found metric invariance but non-invariance for the scalar model.
I want to proceed to free the thresholds to find the source of non-invariance. But there are 3*12 thresholds. Is there any systematic and recommended strategy for which threshold I should start with? Shall I free them one by one? Or shall I free them two by two? Or else?
You can do one threshold at a time but that would be very cumbersome. I would think that it is the variable itself that causes non-invariance, not necessarily specific thresholds.
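Treating the variable rather than the single threshold as the unit could look like the sketch below, which frees all three thresholds of one item at once. The item name u5, the grouping variable race and the group codes are hypothetical; the Theta parameterization is assumed, and the residual variance of the freed item is fixed at one in the groups where it would otherwise be free, following the identification rule described in the User's Guide for freed thresholds.

```
VARIABLE: NAMES = u1-u12 race;
          CATEGORICAL = u1-u12;
          GROUPING = race (1 = black 2 = latino 3 = white);
ANALYSIS: ESTIMATOR = WLSMV;
          PARAMETERIZATION = THETA;
MODEL:    f BY u1-u12;            ! loadings and thresholds equal across groups by default
MODEL latino:
          [u5$1 u5$2 u5$3];       ! free all three thresholds of item u5 in this group
          u5@1;                   ! fix its residual variance at one for identification
MODEL white:
          [u5$1 u5$2 u5$3];
          u5@1;
```

This partially invariant model can then be compared with the fully constrained scalar model, one candidate item at a time, which keeps the number of comparisons at 12 rather than one per threshold.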
Kathy Xiao posted on Sunday, February 05, 2017 - 6:43 am
Thanks for your reply!
Following up on the previous question, I tested one threshold at a time, and I found that 6 out of 33 thresholds led to significantly worse fit. I thought freeing these 6 thresholds would make the comparison non-significant.
However, when I freed them together, the Chi-square comparison was significant.
I also tried to free a combination of thresholds that cause the biggest Chi-square change, the result was also significant.
Do you think there is anything wrong with this strategy?