Anonymous posted on Tuesday, July 11, 2000 - 10:51 am
I'm doing a structural equation model with 3 latent variables, a number of exogenous x variables and 4 groups. The model converges and has an rmsea of .044. I am now trying to test invariance across groups. I see that intercepts are constrained to be equal across groups by default, as are factor loadings. I see how to constrain residuals to be equal across groups, and I've also successfully constrained the betas, but I can't figure out how to constrain the gammas. Is it possible?
(previously Anonymous) My problem was that I have several x variables in the model, and I wanted to constrain the coefficients to be equal across groups only, and not also across all x variables, which is what happens with this: y on x1 x2 x3 (1);
I did figure out a solution to my problem. Since a regression equation can be on more than one line and the (#) has to be on the same line as the variable(s) it acts upon, I just used multiple lines for my equation:
y on x1 (1)
x2 (2)
x3 (3);
and this worked!
Yes, this is the case. Only one number in parentheses can be on a line, and it applies to all parameters on that line. The overall MODEL statement sets equalities within and between groups. An equality statement in a group-specific MODEL statement sets equalities within that group.
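A minimal sketch of the layout discussed above, assuming a hypothetical outcome y and predictors x1-x3 (these names are illustrative and not from the original posts). Because the labels appear in the overall MODEL command, each slope is held equal across groups:

```mplus
! Overall MODEL command: one label per line, so each slope is held
! equal across groups but is not equated with the other slopes.
MODEL:
  y ON x1 (1)
       x2 (2)
       x3 (3);

! By contrast, a single label at the end of one line applies to every
! parameter on that line, so all three slopes would be equal to each other:
!   y ON x1 x2 x3 (1);
```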
Anonymous posted on Wednesday, January 31, 2001 - 5:37 pm
I want to use Mplus to construct a multigroup SEM that includes two CFAs for categorical data (two factors, 3+ dichotomous indicators each). Is it the case that Mplus will allow me to run these models without any invariance assumptions whatsoever? I get the impression that I have to constrain at least one of these sets of parameters, either for identification or convergence: loadings, thresholds, means, scale factors. Maybe this is because when I try to relax any of the Mplus default invariance assumptions I get an error message stating that the standard errors for the model cannot be calculated. Is the problem with my data (lack of variance?) or with the identification of the model?
Multiple-group CFA with categorical outcomes uses the default of holding thresholds and loadings invariant across groups, fixing the factor means to zero in the first group while letting them be free in the other groups, and fixing the delta scale factors to one in the first group while letting them be free in the other groups.
If you instead want to have no invariance restrictions across groups you should repeat the thresholds and loadings in each group so that they are group-specific. Note, however, that in this case you need to fix to zero the factor means in all groups (you cannot identify both group-specific thresholds and group-specific factor means) and fix the scale factors to one in all groups (they can only be identified when thresholds and loadings are invariant). You can also accomplish no invariance by doing separate-group analyses.
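As a rough sketch of the no-invariance setup just described, assuming the default Delta parameterization, hypothetical binary items u1-u6, factors f1 and f2, and a grouping variable g with groups g1 and g2 (all names are assumptions for illustration):

```mplus
VARIABLE:
  NAMES ARE u1-u6 g;
  CATEGORICAL ARE u1-u6;
  GROUPING IS g (1 = g1 2 = g2);

MODEL:
  f1 BY u1-u3;
  f2 BY u4-u6;

MODEL g2:
  f1 BY u2 u3;                        ! loadings specific to g2
  f2 BY u5 u6;                        ! (first loadings are not mentioned)
  [u1$1 u2$1 u3$1 u4$1 u5$1 u6$1];    ! thresholds specific to g2
  [f1@0 f2@0];                        ! factor means fixed to zero
  {u1-u6@1};                          ! scale factors fixed to one
```

The first group already has factor means of zero and scale factors of one by default, so only the second group needs the explicit restrictions in this sketch.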
Anonymous posted on Friday, February 02, 2001 - 12:31 pm
Following up on your recommendation in the 2nd paragraph above: is there a particular interpretation to setting the scale factors equal to 1 (as opposed to 2 or 3, etc.)? Also, regarding the scale factors themselves, do they refer to the variance of the underlying (continuous) y variable, to the error in measuring that variable via the categorical measure, or both? Given this, how "strong" is the assumption of equal scale factors in the multigroup model where loadings and thresholds are allowed to vary and factor means are set to zero, etc.?
The scale factors are the inverses of the standard deviations of the latent response variables y*. This means that they are functions of loadings, factor variances, and residual variances. If one or more of those three components vary, the scale factor would vary. So, equal scale factors when loadings vary does not make sense.
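Written out for the simplest case, an item j that loads on a single factor, the relationship is as follows. The symbols are assumptions introduced here for illustration: loading \lambda_j, factor variance \psi, residual variance \theta_j, scale factor \Delta_j.

```latex
\operatorname{Var}(y^*_j) = \lambda_j^2 \, \psi + \theta_j ,
\qquad
\Delta_j = \frac{1}{\sqrt{\operatorname{Var}(y^*_j)}}
         = \frac{1}{\sqrt{\lambda_j^2 \, \psi + \theta_j}}
```

If \lambda_j differs across groups while \psi and \theta_j do not, \Delta_j necessarily differs too, which is why equal scale factors are hard to justify when loadings vary.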
I am trying to compare two groups (ed and noned) on a confirmatory factor analysis solution. I have used the following command structure in Mplus, which I thought would work, but which isn't giving me the anticipated output. Again, what I want to be able to do in the end is determine whether the model is the same for the two groups. Thanks for your help.
model: intern by withd somat anx;
model: extern by del aggress;
model ed: withd somat anx (1);
model ed: del aggress (1);
model noned: intern by withd somat anx (2);
model noned: del aggress (2);
If you specify
MODEL: intern BY withd somat anx; extern BY del aggress;
the factor loadings will be held equal across groups. It is not clear what you are trying to do with the statements you have sent. If you tell me in words which parameters you are trying to hold equal and whether they are to be held equal within and/or across groups, I can then help you.
The two model ed commands that you have above will hold all residual variances equal across variables in the ed group. In the noned group, the residual variances for del and aggress are held equal to each other and also equal to the factor loadings for intern.
By the way, one overall MODEL command and one group-specific model command is sufficient for any input.
Holmes Finch posted on Tuesday, February 27, 2001 - 11:15 am
Thanks for your response. What I want to do is compare the two groups over all the parameters, and then maybe look at individual ones. The bottom line is, I want to be able to say that the same model does, or does not fit both groups. Does that make sense? Thanks.
If you send me your fax number, I will fax you several pages we use when we teach. These show setups for a variety of multiple group models that test a variety of hypotheses.
Anonymous posted on Tuesday, June 26, 2001 - 2:43 am
I am trying to do a multiple group analysis. All measurement parameters are held equal across groups by default. Is it possible to hold specific variances of latent factors equal across the groups? Which syntax do I have to use?
Any parameter that is not held equal by default can be held equal. Any parameter that is held equal by default can have that equality relaxed.
To hold a parameter equal, specify it in the overall MODEL command with a number in parentheses following it. One number in parentheses is allowed per record (line) of the input file. In a three factor model, the variances of the factors will be held equal across groups by adding the following to the overall MODEL command:
f1 (1); f2 (2); f3 (3);
Lee-Fay posted on Tuesday, July 17, 2001 - 4:34 pm
I am trying to run a twolevel analysis. But I get an error message telling me that 'the sample covariance matrix for the variables cannot be inverted'. I have checked my covariance matrices and no two variables are perfectly correlated and no variable has no variation. I have 11 homes with 349 subjects in total. The dependent variable is continuous, and I have 4 within-level predictors and 5 between-level predictors. What am I doing wrong?
Even though you cannot see any correlations of 1 in your sample between covariance matrix, there may be dependencies that result in singularity of the matrix. You mention that you have 10 variables and 11 homes. Having 11 homes is like having 11 observations at the between level. You would not be able to have more than 10 variables. So if there are variables you are not mentioning, this could also be the problem. We recommend at least 30-50 clusters for this type of modeling. You can try analyzing the sample between matrix to see if it can be inverted. Or you can send the input and data to firstname.lastname@example.org and I will take a look at it.
Can you please elaborate on the steps in multiple group analysis. I want to test group differences in two beta and two gamma coefficients. Am I correct that the model fitting steps leading up to testing the beta/gamma coefficients are to test assumptions of measurement invariance?
I understand that the first step is to fit the SEM model separately in each group. Then the next three steps are to fit the model in all groups (1) allowing all parameters to be free, (2) holding factor loadings equal, and (3) holding factor loadings and intercepts equal. Given the defaults for multiple group analysis with categorical indicators, am I correct that these three steps require that parameters that are constrained or fixed by default need to be relaxed? If so, could you please elaborate on which defaults to relax? I assume that if factor loadings and intercepts are invariant, then the default settings would be appropriate for testing differences in the beta/gamma coefficients.
The steps in looking at measurement invariance are slightly different with categorical indicators. For one thing, you are dealing with thresholds instead of intercepts. You want to compare two models rather than three to test measurement invariance.
Model 1 - This is the default model in Mplus. The thresholds are held equal across groups and the factor loadings are held equal across groups. The scale factor is fixed to one in the first group and free in the others. The factor means are zero in the first group and free in the others.
Model 2 - The thresholds and factor loadings are free across groups. Scale factors are one in all groups and factor means are zero in all groups.
I have run the group analysis below and now want to examine the model by 3 household types. Jaccard and Wan (1996) suggest examining three way interactions using multiple group analysis [in this case six groups].
Are there alternative approaches? For instance, would a two-level or MIMIC model be appropriate?
Grouping is t1totsup (0=below 1=above);
Usevariables are q21a q21b q21c q21d p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 lifevent nparpro aparpro;
Categorical are q21a q21b q21c q21d p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13;
Define: Cut t1totsup(36);
Model:
F1 by q21a q21b q21c q21d;
F3 by p1 p5 p6 p9;
F4 by p2 p3 p10 p11 p12 p13;
F5 by p4 p7 p8;
F6 by F3 F4 F5;
F6 on F1;
F6 on lifevent;
nparpro on F6;
aparpro on F6;
*** ERROR Group 2 does not contain all values of categorical variable: P2
*** ERROR Group 4 does not contain all values of categorical variable: P2
*** ERROR Group 5 does not contain all values of categorical variable: P2
*** ERROR Group 6 does not contain all values of categorical variable: P3
(Output from Mplus VERSION 2.02; run title: hhstructure moderation model 1 all paths free)
Multiple group analysis gives you the most flexibility if you have enough subjects per group. MIMIC cannot look at as many parameters but does not require as many subjects.
Jaccard and Wan (1996) recommend a minimum of 75 subjects per group (100 preferred), but this must depend on several factors such as the number of variables in the model. Can you suggest how to determine the minimum number of subjects needed for group analysis? My smallest group size is 50.
Also, I have convergence problems with a single six group model, but not when the same model is run in 3 separate analyses with two groups each. What are the implications of this?
As you said, sample size depends on many things. As a minimum for each group, you would want to have more observations than the number of variables. You would want to have 5 to 10 observations for each parameter. For categorical outcomes, you usually need more observations than for continuous outcomes. Sample size 50 seems small particularly for categorical outcomes. Regarding convergence, the measurement invariance restrictions that you are probably imposing may not hold across all groups.
Sandra Lyons posted on Saturday, December 01, 2001 - 12:33 pm
I've looked at the Mplus MIMIC examples and observed that none of them have independent latent variables. Hence, I'm wondering whether MIMIC is a good alternative to group analysis for the SEM I'm testing which is:
F1 by q21a q21b q21c q21d; F3 by p1 p5 p6 p9; F4 by p2 p3 p10 p11 p12 p13; F5 by p4 p7 p8; F6 by F3 F4 F5; F6 on F1 lifevent; nparpro aparpro on F6;
I'm primarily interested in group differences in the path coefficients.
If MIMIC is indeed appropriate for this analysis, is it analogous to OLS regression with dummy variables?
In multigroup analysis with categorical dependent variables, if measurement invariance is not of substantive interest, would it be appropriate to fix measurement parameters across groups to those obtained in the single group analysis in order to circumvent nonconvergence possibly due to a lack of measurement invariance?
bmuthen posted on Saturday, December 01, 2001 - 5:25 pm
The term MIMIC analysis is typically reserved for models with observed covariates influencing factors that have a set of indicators. But you can certainly put grouping variables as covariates into any SEM including yours above. Using grouping variables as covariates makes it possible to have different means (intercepts) of the variables that they are specified to influence (observed and latent). If you are interested in group differences in path coefficients (slopes), however, having grouping variables as covariates will not help.
I would not recommend fixing measurement parameters to single-group analysis values because you want to see that the measurement part of the model is not changed in important ways when doing the joint analysis of several groups - a convergence problem can be an indication of model misspecification.
I'm trying to constrain the path coefficients to be the same for mothers and fathers. The code above yields different coefficients for mothers vs. fathers and the results are identical to code that omits the #'s in parentheses...what am I doing wrong?? thanks!!
Is this the same as recoding my outcome variable for Group 1 but not Group 2? Mplus doesn't seem to allow group-specific recodes using CUT on the DEFINE command, and doesn't give me an error message when I use the above specification.
Multiple-group analysis can be, but doesn't have to be, used when a categorical variable is involved. This gives more modeling flexibility than using products since, for example, variances can be different across the groups.
Methods I have seen described for interacting latent variables with a continuous observed variable seem quite complex (Jaccard & Wan, Kenny & Judd) relative to group analysis. What method do you generally recommend? For example, I have the following model:
f2 on f1 x1; x2 x3 on f2;
where f1 and f2 are latent variables with dichotomous indicators, and x1 - x3 are observed continuous variables
I want to test the moderating effects of a continuous variable on each path in the model. Would you recommend group analysis or product terms? If product terms, what method do you suggest?
bmuthen posted on Tuesday, July 23, 2002 - 8:30 am
Since the moderating variable is observed and not latent, the simplest approach would be to categorize the continuous moderating variable and do a multiple-group analysis. There are many methods for analysis of latent variable interactions (which includes your case), but I hesitate to recommend any. A new method for ML analysis by Andreas Klein seems superior but is not yet easily available in software form.
Anonymous posted on Tuesday, August 13, 2002 - 5:52 pm
We have been conducting a multigroup analysis with two groups and continuous indicators. We want to test whether some of the structural path coefficients are significantly different for group 1 vs. group 2.
e.g., for structural path x: Group 1 standardized coefficient = .609; Group 2 standardized coefficient = .216
How can we determine if these coefficients for the same path but different groups are significantly different from one another?
To test whether some paths are different between two groups, you can run two models -- one with the paths held equal and the second with the paths not constrained to be equal. Then do a chi-square difference test. This is a test of the unstandardized coefficients, not the standardized coefficients.
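For reference, the chi-square difference test mentioned here takes the following form. This is a sketch that assumes ML estimation with continuous indicators, as in the question; the subscripts "equal" and "free" are labels introduced here for the two runs.

```latex
\Delta\chi^2 = \chi^2_{\text{equal}} - \chi^2_{\text{free}},
\qquad
\Delta df = df_{\text{equal}} - df_{\text{free}}
```

The difference \Delta\chi^2 is referred to a chi-square distribution with \Delta df degrees of freedom. With two groups, \Delta df equals the number of paths held equal in the restricted run.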
Anonymous posted on Friday, August 16, 2002 - 1:33 pm
I have a question about comparing multigroup SEM coefficients across groups.
Is it the case that the MG approach "controls" for differences in levels of my exogenous variables across groups?
For example, I'm running a model on two groups, the first of which has much higher income and intelligence scores than the second group. Income and intelligence are two of about 10 different x variables used to predict an outcome variable y. Is it valid to compare differences in the direction and sizes of the effects of x1, x2, x3,...,x10 on y across groups?
Anonymous posted on Friday, August 16, 2002 - 1:42 pm
I should have appended this second question to the one I originally submitted above:
Is there a convenient way to determine if structural coefficients are equal across groups in a MG SEM without having to resort to Chi-Square (WLS) tests ?
I ask because I have a large number of variables in my models and using individual Chi-Square tests would be tedious, and I think the significance of coefficients would be biased by the order in which I imposed the restrictions.
bmuthen posted on Saturday, August 17, 2002 - 9:39 am
Regarding your first question about controlling for differences, you confuse me by first talking about groups defined by income and intelligence and then talking about these variables as x variables. Let me answer the question as an MG situation where one x variable is used as a grouping variable, and therefore not used as one of the x variables. You should think of this as regular regression in two groups, where we know that the regression slope can be compared even if the x mean is different in the two groups.
Yes, you can print out (TECH3) the estimated covariance matrix for the parameter estimates and do a "correlated t test".
Anonymous posted on Wednesday, August 27, 2003 - 8:38 am
On August 17, Bengt recommends doing a correlated t-test to examine whether or not the coefficients for two groups in a multigroup model are different. I'm wondering if this is the appropriate test to use in all situations.
If one is working with data where individuals are not assigned to groups randomly, when the number of persons in the two groups differs considerably, and where the SEs for the coefficients of interest also vary considerably, shouldn't one use an unequal variance t-test or a t-test for independent samples?
Also, in Bengt's original recommendation, wouldn't the df for a pooled t-test always be df=(number of groups - 2) = 0?
bmuthen posted on Wednesday, August 27, 2003 - 9:07 am
I was using "correlated t test" merely as an analogy. The TECH3-based test I have in mind is asymptotically normal, so the z test analogy is better.
Anonymous posted on Wednesday, August 27, 2003 - 9:20 am
I'm following up to your response to make sure I understand how comparing coefficients across groups in a multigroup model corresponds to common t-tests for comparing means across groups.
TECH3 would be needed to determine the covariance between a given pair of model parameters.
However if the two groups are independent (which I believe is an appropriate assumption if cases are assigned to groups based on non-random factors -- i.e., students allocated to schools, workers allocated to firms or sectors of the labor market), TECH3 wouldn't be needed and n1 and n2 would be the sizes of the two groups from which the coefficients (treated as averages) were obtained.
bmuthen posted on Wednesday, August 27, 2003 - 11:48 am
Here is my understanding of this. I think this question was regarding a SEM, testing equality of structural coefficients. Even if the 2 groups correspond to independent samples, the invariance restrictions across groups typically imposed on measurement parameters could make the structural coefficient estimates from the two groups correlated - so that is where I was thinking TECH3 comes in. As far as I see it, the differences in group sample sizes are already taken into account in the 3 TECH3 components - this is unlike t tests where sample size enters because a variance for a sample mean is figured via the variance for each variable in the mean. So the resulting (approximate) z score ratio is correct.
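One way to write the z ratio described above, taking the three TECH3 components to be the two estimated variances and the covariance of the two coefficient estimates. The notation \hat{b}_1 and \hat{b}_2 for the group 1 and group 2 estimates is an assumption introduced here.

```latex
z = \frac{\hat{b}_1 - \hat{b}_2}
         {\sqrt{\widehat{\operatorname{Var}}(\hat{b}_1)
              + \widehat{\operatorname{Var}}(\hat{b}_2)
              - 2\,\widehat{\operatorname{Cov}}(\hat{b}_1, \hat{b}_2)}}
```

If the two estimates were uncorrelated, the covariance term would be zero and the denominator would reduce to the square root of the sum of the two squared standard errors.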
Anonymous posted on Wednesday, September 24, 2003 - 9:46 am
Just to clarify on the testing equality of structural coefficients. Say, I have latent variables x1, x2 and x3 predicting latent variable y. I look at the difference in chi-squares if I fix everything to be equal between two groups and if I fix everything except the path from x1 to y --- does LM test tell me if this structural coefficient (y on x1) is significantly different between groups? Should I repeat the procedure two more times for x2 and x3? Thank you in advance.
bmuthen posted on Wednesday, September 24, 2003 - 7:11 pm
Not quite the way you said it, I think. Instead:
To test if y on x1 is different across groups, you would run with the slope held equal across groups and then run allowing it to differ. Then do the same for y on x2, then for y on x3.
But if your hypothesis is that all 3 (y on x1, on x2, on x3) are equal across groups, then you would do one run with equal for all 3 across groups and one run letting them be different.
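A sketch of the joint test of all three slopes, using the names y and x1-x3 from the question. It assumes the slopes are not held equal across groups by default, so the labels in the overall MODEL command are what impose the equality:

```mplus
! Run 1 (restricted): all three slopes held equal across groups.
MODEL:
  y ON x1 (1);
  y ON x2 (2);
  y ON x3 (3);

! Run 2 (unrestricted): the same input with the labels removed,
! so each group has its own slopes.
! MODEL:
!   y ON x1 x2 x3;
```

To test a single slope instead, keep only one label (for example on x1) in the restricted run and compare it with the unlabeled run.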
Daniel posted on Tuesday, March 30, 2004 - 10:32 am
In presenting the results of a multi-group LGM, is it appropriate to present standardized or raw path coefficients in a figure? I read in the Loehlin "LATENT VARIABLE MODELING" text that population differences in range on specific variables can influence comparability of standardized scores across populations? Is this a problem in multi-group analysis? Or are the standardized path coefficients based on values appropriate to the entire population?
I would report the raw coefficients and their standard errors in addition to the standardized coefficients. Don't forget that the significance test is for the raw coefficient. The standardizations are computed using the variances for each group. There are different opinions about this.
Daniel posted on Wednesday, March 31, 2004 - 11:32 am
Thanks very much once again for your help. One of the difficult parts of being a researcher rather than statistician by training is that I must learn much technique on my own. So, while I have been reading a tremendous amount of text on a variety of subjects in SEM, it is sometimes difficult to see the forest for the trees! That's when the help of experts like yourself and Bengt is much appreciated.
Jen Bailey posted on Wednesday, April 28, 2004 - 5:13 pm
Is it possible to run a multigroup model in which a latent factor that exists in one group does not exist in the other?
Here's the scenario: I'm looking at within-individual continuity in latent substance use across adolescence and adulthood. Some of the members of the sample have children, and some do not. I'm interested in how parental substance use affects child problem behavior. My sample of parents is small (n = 200), and my substance use model is fairly large, since I have multiple indicators and multiple time points. Therefore, I would like to take advantage of my whole sample (n = 800) in estimating the substance use part of the model.
A colleague suggested that I do a multigroup model, leaving out the "child problem behavior" factor in the group that doesn't have children. The child problem behavior variables are, obviously, missing for all non-parents. The thought was that a multigroup model would be superior to mixing the parent and non-parent populations and using FIML because it would explicitly acknowledge that there are two populations in the sample. I've tried specifying a new latent factor in the model statement for my second group, but the program (Version 3) doesn't seem to like that.
What are your thoughts on using a multigroup approach in this case? How would I program such a model?
Yes, this is possible. But you need to define the factor in the overall MODEL command not in a group-specific MODEL command. Then you need to set all of the factor loadings to zero in the group-specific MODEL command. The overall MODEL command is the model assigned to each group and then modified by the group-specific MODEL commands. Chapter 13 has a discussion of this.
Following is an example of how this can be done:
MODEL: f1 BY y1-y4;
f2 BY y5 y6 y7;
MODEL males: f2 BY y5@0 y6@0 y7@0;
Jen Bailey posted on Thursday, April 29, 2004 - 11:03 am
Thanks for your reply - I appreciate your syntax suggestion. I still have a problem, however. I wrote the syntax as you suggested, and got an error message saying that all cases in one group were missing data on some variables. This is true - in my non-parents group, there ARE no data for the indicators of child problem behavior, because there are no children.
Any suggestions for getting around the fact that the child problem behavior factor doesn't exist and its indicators are all missing data in the non-parent group?
I think the only thing you can do is run the model with the factors and variables shared by all groups and test invariance over groups for those factors. Then you would have to separately run the group that has more factors and variables. Establishing measurement invariance would not be an issue for the factors that exist in only one group.
Jen Bailey posted on Monday, May 03, 2004 - 10:27 am
Thank you for your time and advice. I very much appreciate having this discussion board as a resource.
Anonymous posted on Monday, June 14, 2004 - 6:06 am
I am running a multi-group analysis with three racial groups - black, white, and hispanic. You will see in the input file below that I allow 2 variable (ED and CMR) paths (slopes, gammas) to be freely estimated among the three groups.
How can I allow one of the variables (MV1) to be constrained to be equal for the first two groups (black and white) and freely estimated/different for the third group (hispanic)?
VARIABLE: GROUPING IS RAC (1=black 2=white 3=hispanic); MISSING IS .;
This will relax the equality constraint for the hispanic group.
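The input that this reply refers to did not survive in the thread. As a hedged sketch of the general pattern only, assuming a hypothetical outcome named y regressed on MV1: a label in the overall MODEL command holds the slope equal across all groups, and restating the path without the label in a group-specific MODEL command frees it in that group.

```mplus
MODEL:
  y ON mv1 (1);     ! label in the overall model: equal across groups

MODEL hispanic:
  y ON mv1;         ! restated without the label: free in this group
```

With this layout the black and white groups share one slope, and the hispanic group has its own.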
Anonymous posted on Monday, August 16, 2004 - 10:42 am
I was wondering if my code is correct to test measurement invariance (has SOME categorical factor indicators and covariates). It is my understanding that I should use the theta parameterization. Is this correct? I believe I should run a model where everything is free (model 1), where factor loadings are held constant across groups (model 2), where factor loadings and variances of latent variables are held constant (model 3), where covariances of latent variables, variances of latent variables and factor loadings are equal (model 4), and finally where regression parameters, covariances of latent variables, variances of latent variables, and factor loadings are held constant (model 5). I am not specifying thresholds. All of my categorical variables are coded 0-absent, 1-present. I read on page 67 of the User's Guide that if the thresholds are free across groups (I believe this is the default) and a factor loading for a categorical factor indicator is free across groups, the residual variance for the variable must be fixed to one in these groups for identification purposes. Do I need to fix the variance of pardep and fhdadc to one...or some other variable? I am having some identification issues. I am particularly interested in whether the regression weights are equal across groups.
Model 1: grouping is sex (0=male 1=female); IDVARIABLE = subno; missing=.; categorical are fhdadc parsuic pardep late;
ANALYSIS: TYPE = mgroup; parameterization=theta; iterations= 50000; MODEL: suicide BY late@1 (1); suicide by middle (2); suicide by early (3);
attemp by mlife@1 (4); attemp by lalife (5); attemp by elife (6);
parprob by fhdadc@1 (7); parprob by parsuic (8); parprob by pardep (9);
extrov BY ext3@1 (10); extrov by ext2 (11); extrov by ext1 (12);
psychot BY psychot2@1 (13); psychot by psychot1 (14); psychot by psychot3 (15);
neurot BY neurot3@1 (16); neurot by neurot2 (17); neurot by neurot1 (18);
Model 4: add this to Model 3.... parprob with extrov (44); parprob with psychot (45); parprob with neurot (46); extrov with psychot (47); extrov with neurot (48); psychot with neurot (49); model female:
parprob with extrov (44); parprob with psychot (45); parprob with neurot (46); extrov with psychot (47); extrov with neurot (48); psychot with neurot (49);
MODEL 5: add this to model 4.... attemp on suicide (149); attemp on pareduc (150); attemp on parprob (151); attemp on extrov (152); attemp on psychot (153); attemp on neurot (154); attemp on careloss (155); attemp on divorce (156); attemp on nphycnt (157); attemp on nvbscnt (158); attemp on nncnt (159); attemp on cle31 (160); Model female : attemp on suicide (149); attemp on pareduc (150); attemp on parprob (151); attemp on extrov (152); attemp on psychot (153); attemp on neurot (154); attemp on careloss (155); attemp on divorce (156); attemp on nphycnt (157); attemp on nvbscnt (158); attemp on nncnt (159); attemp on cle31 (160);
Are these the models you suggest? Is my syntax correct? Do I need to set the residual variance to one for parsuic and pardep (or other variables)? Thank you so much in advance.
You can use either the delta or theta parameterization to test measurement invariance. Many of the equalities that you want to test are not measurement invariance in my opinion. Differences between factor means, variances, and covariances and regression coefficients describe population heterogeneity rather than measurement invariance. Factor loadings and thresholds are related to measurement invariance. Some see residual variances of factor indicators as measurement parameters. I would not require them to be equal for measurement invariance to hold.
Example 5.16 in the Mplus User's Guide shows a multiple group CFA with categorical factor indicators. To test measurement invariance, you would first run the default overall model where factor loadings and thresholds are held equal as the default. The second model is one where factor loadings and thresholds are unequal across groups. How to relax the default equality is shown in Example 5.16. With the THETA parameterization, residual variances instead of scale factors are fixed to one.
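For comparing the two models when the estimator is WLSMV, a two-step DIFFTEST setup along the following lines can be used in Mplus versions that include the DIFFTEST option. This is a sketch: the file name deriv.dat is an assumption, and only the commands relevant to the difference test are shown.

```mplus
! Step 1: less restrictive model (loadings and thresholds free across
! groups). Save the derivatives needed for the difference test.
ANALYSIS:
  ESTIMATOR = WLSMV;
SAVEDATA:
  DIFFTEST IS deriv.dat;

! Step 2: more restrictive model (default equalities in place).
! Point the analysis at the file saved in step 1.
ANALYSIS:
  ESTIMATOR = WLSMV;
  DIFFTEST IS deriv.dat;
```

The chi-square difference test is then reported in the output of the second run.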
Anonymous posted on Tuesday, September 21, 2004 - 5:59 pm
Does Mplus 3 generate modification indices that rank the equality constraints in terms of their effects on overall model chi-square? If not, what is your recommended strategy for localizing areas of relatively worse "misfit" in complex multigroup SEMs? Thanks!
No. No general strategy comes to mind. Just look for the largest ones and also see what difference it makes for parameter estimates when they are relaxed.
Anonymous posted on Friday, October 08, 2004 - 1:30 pm
I am testing measurement invariance of factor loadings where indicators are categorical. I consistently get an error message that the standard errors cannot be estimated because my model may not be identified. Hoping to fix this problem, I would like to constrain my factor means to zero. Someone else had this same problem...and the posted response was:
"If you instead want to have no invariance restrictions across groups you should repeat the thresholds and loadings in each group so that they are group-specific. Note, however, that in this case you need to fix to zero the factor means in all groups (you cannot identify both group-specific thresholds and group-specific factor means) and fix the scale factors to one in all groups (they can only be identified when thresholds and loadings are invariant)."
How do I fix to zero the factor means in all groups? What does the code look like?
Madeline posted on Thursday, October 28, 2004 - 4:20 pm
Hi - I am testing measurement invariance of factor loadings across gender. My less restrictive model is giving me the following message: "THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL."
Here is my code. Can you tell me what I am doing wrong?
!Measurement Invariance of Factor Loadings across sex
TITLE: Invariance: Male vs Female
DATA: FILE IS Y:\Madeline\name1.dat;
ANALYSIS: TYPE = missing h1; parameterization=theta; iterations= 50000;
delinq by lvdamage@1; delinq by lvcut (2); delinq by lvweapon (3); delinq by lvpunish (4); delinq by lvbeaten (5); delinq by lvthrt (6); delinq by lvver (7); delinq by lvstolen (8);
expec by partyfun@1; expec by friendly (10); expec by join (11); expec by betrfren (12); expec by holiday (13); expec by silly (14); expec by caring (15); expec by betrmood (16); expec by drive (17); expec by homework (18);
everalc on delinq expec; expec on delinq;
model female: delinq by lvdamage@1; delinq by lvcut (102); delinq by lvweapon (103); delinq by lvpunish (104); delinq by lvbeaten (105); delinq by lvthrt (106); delinq by lvver (107); delinq by lvstolen (108);
expec by partyfun@1; expec by friendly (110); expec by join (111); expec by betrfren (112); expec by holiday (113); expec by silly (114); expec by caring (115); expec by betrmood (116); expec by drive (117); expec by homework (118);
OUTPUT: tech1 tech2 tech4 STANDARDIZED ; SAVEDATA: DIFFTEST IS sexload.dat;
*** WARNING Data set contains unknown or missing values for GROUPING, PATTERN, COHORT and/or CLUSTER variables. Number of cases with unknown or missing values: 454 1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
With categorical outcomes, you must have thresholds and factor loadings both held equal or both free. You can't relax the constraint on a factor loading without relaxing the constraint on the threshold for the same item. I don't see that you have thresholds free in your MODEL command.
Examples 5.16 and 5.17 in the Mplus User's Guide show a multiple group CFA with categorical factor indicators. To test measurement invariance, you would first run the default overall model where factor loadings and thresholds are held equal as the default. The second model is one where factor loadings and thresholds are unequal across groups. In this model, with the Delta parameterization, scale factors must be fixed to one in all groups and factor means fixed to zero in all groups. With the Theta parameterization, residual variances must be fixed to one in all groups and factor means fixed to zero in all groups. How to relax the default equality is shown in Example 5.16.
I am running a series of multigroup (male and female) CFAs with continuous factors in an attempt to test measurement invariance (a la Bollen 1989). Moving to increasingly more restrictive constraints (factor loadings, intercepts, means, and variance-covariances), I am now ready to constrain error variance-covariances. However, I am unclear on 1) what the default treatment of error variances is in Mplus and 2) how to constrain them to be equal between groups. Can you tell me what syntax I need to constrain error variances?
bmuthen posted on Sunday, November 14, 2004 - 11:24 am
The default is that the error (co-)variances are allowed to differ across groups. Your input specifies that the error variances are the same across groups since you have in the overall part of your model the statements
This is probably too basic a question, but when asked by my PhD supervisor I was unable to answer. He has no experience with Mplus, and we are both on a steep learning curve. I am running SEM with three groups of about 70 participants aged 6, 8, and 10 years. If I use a multiple group format for the SEM, what exactly am I doing? Am I correcting for or accounting for group differences? Similarly, when would I use CLASS and when would I use CLUSTER?
Mary posted on Tuesday, November 16, 2004 - 6:15 am
Dear Mr and Mrs Muthén,
I have a very simple question regarding the grouping option. Besides the constraint that forces the loadings to be equal across the groups, are there any other differences between running a regression with the grouping option or running each group as a different regression?
Re: Larry Cashion. Multiple group analysis is used to study parameter estimates across groups of different observations. In your case, you would be studying differences in parameter estimates across age. The CLASSES option is used to define categorical latent variables in mixture models. The CLUSTER option is used to name the cluster variable in an analysis of complex survey data, that is, data that are not collected as a simple random sample.
I assume that you are asking whether a CFA with covariates will result in different parameter estimates when all parameters are free or if you run the analysis on each group separately. If all parameters are free across groups, the results should be the same.
Anonymous posted on Tuesday, November 30, 2004 - 12:51 pm
I have a question about reporting factor means. I conducted multiple, multi-group analyses, and I tested invariance across gender, age, and race. Now, for the manuscript, I would like to report factor means. However, the factor means for one group in each of the multigroup analyses are set to zero. Is my only option to report:
Mean Conduct Problems
Men 0
Women -1.2
Caucasian 0
African American 2.12
Etc.
bmuthen posted on Tuesday, November 30, 2004 - 5:43 pm
Yes, factor means need to be fixed to zero in one group for identification purposes. You should view this group as the reference group to which the factor means of the other groups are compared. So that's how you want to portray it in your reporting. Another way of saying this is that it is really only the factor mean difference between the groups that is identifiable.
I have what I think is a simple question. I have two covariance matrices for which I would like to run a multigroup analysis. All Mplus examples I have seen on the website and in the manual assume that one has raw data with a grouping variable present on the dataset. 1) Can one model with two (or more) covariance matrices instead? 2) If so, could you provide some example syntax?
See the discussion of multiple group analysis in Chapter 13 of the Mplus User's Guide. The only difference is how you refer to the groups. Note that some estimators require raw data.
Anonymous posted on Tuesday, December 14, 2004 - 2:46 pm
Hi - I am trying to use the Difftest option to test measurement invariance of factor loadings and thresholds across sex. The first model allows factor loadings and thresholds to vary across groups. The second model constrains factor loadings and thresholds to be equal across groups. I keep getting an error message saying my models are not nested. Could you help me determine why they are not nested?
The first model - MODEL: delinq by lvdamage@1 lvcut lvweapon lvpunish lvbeaten lvthrt lvver lvstolen;
delinq by lvdamage; [lvdamage$1]; delinq by lvcut; [lvcut$1]; delinq by lvweapon; [lvweapon$1]; delinq by lvpunish; [lvpunish$1]; delinq by lvthrt; [lvthrt$1]; delinq by lvver; [lvver$1]; delinq by lvstolen; [lvstolen$1];
expec by betrmood; [betrmood$1]; expec by caring; [caring$1]; expec by friendly; [friendly$1]; expec by join; [join$1]; expec by betrfren; [betrfren$1]; expec by holiday; [holiday$1]; expec by silly; [silly$1]; expec by homework; [homework$1]; expec by drive; [drive$1];
I just did this nested model testing for examples in the user's guide using both the Delta and Theta parameterization and it worked fine. I would be happy to send you the setups if you give me your email address.
Anonymous posted on Friday, December 31, 2004 - 9:22 am
Could you direct me to the documentation for the new DIFFTEST option that's available in Mplus when using the WLSMV estimator?
I would like to use WLSMV for testing invariance for a CFA model with dichotomous data. I see that Mplus has the DIFFTEST command, and I can use it to save the derivatives, but I'm unsure what to do with them after that. Could you help me understand how to use this command? Thanks. Below is an example of the code I want to use.
MODEL: F1 BY Y1@1 Y2-Y3*; F2 BY Y4@1 Y5-Y6*; F1-F2*; F1 WITH F2*; [Y1$1*-1]; [Y2$1*-.5]; [Y3$1*-.25]; [Y4$1*0]; [Y5$1*.25]; [Y6$1*.5];
Holmes Finch posted on Wednesday, January 19, 2005 - 5:22 am
I appreciate your directing me to the discussion in the manual regarding using DIFFTEST for WLSMV. I'm using this command in a simulation study, and was wondering if it's possible to save the results of the chi-square difference test for WLSMV that one gets using DIFFTEST. I couldn't find it in the file produced by the RESULTS command, and have looked through the manual, but haven't found anything. Thanks in advance.
The output is giving me this message:
SERIOUS COMPUTATIONAL PROBLEMS OCCURRED IN THE BIVARIATE ESTIMATION OF THE CORRELATION FOR VARIABLES PERSISTE AND IIE72. CHECK YOUR DATA. IF THE PROGRAM RECOVERS FOR THIS PAIR OF VARIABLES (SEE TECHNICAL 6 OUTPUT), THE ESTIMATES ARE VALID. THE PROBLEM OCCURRED FOR THE FOLLOWING OBSERVATION(S): OBSERVATION 3 OBSERVATION 3 COMPUTATIONAL PROBLEMS ESTIMATING THE CORRELATION FOR PERSISTE AND IIE72
I have checked my data and I don't find a problem with it. The TECH6 report is not provided with the type of analysis I am doing. What are my alternatives to fix this problem?
If you are not getting any fit statistics, it is most likely the case that they are not available for the model you are estimating. If you send your full output to email@example.com, I can determine the reason.
Anonymous posted on Friday, February 18, 2005 - 2:08 pm
I just ran a multi-group analysis to test differences in mediation across race. I can test whether the paths of the mediation model are significantly different across groups. Is there a way to test whether the mediated effect (or proportion mediated) is statistically different across groups?
bmuthen posted on Friday, February 18, 2005 - 5:16 pm
If you have the estimate of the mediated effect and its SE for each of the 2 groups, you can simply use those numbers to create the approximately normal test variable:
(e1 - e2)/(se(e1-e2)),
where the denominator is sqrt(var(e1-e2)). Because the two groups are independent samples, var(e1-e2) = var(e1) + var(e2), and each var(e) is the square of SE(e).
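For readers who want to check the arithmetic outside Mplus, the test above can be sketched in a few lines of Python. This is only an illustration: the function names and the example estimates are hypothetical, and it assumes the two groups are independent so that the estimates have zero covariance.

```python
import math

def mediated_effect_z(e1, se1, e2, se2):
    """z statistic for the difference between two group-specific estimates.

    Assumes independent groups, so var(e1 - e2) = se1**2 + se2**2.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    return (e1 - e2) / se_diff

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical mediated effects and standard errors, for illustration only.
z = mediated_effect_z(0.30, 0.03, 0.20, 0.04)
p = two_sided_p(z)
print(round(z, 3), round(p, 4))
```

The same function applies to any pair of independently estimated parameters, not only mediated effects.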
Anonymous posted on Wednesday, March 23, 2005 - 7:47 am
I did a multigroup analysis and a DIFFTEST. The DIFFTEST yielded a Chi-Square difference value of 13.237 with 1 degree of freedom (the difference between the more restrictive H0 model and the H1 model is only one parameter), which is statistically significant at the .05 probability level. Does this mean that the less restrictive model H1 (in which the parameter was allowed to be estimated freely) fits better than the more restrictive H0 model and therefore should be used in my further analysis? I ask this question because I tested the same model with AMOS (only difference: the 6-scale indicators of the latent variable were treated as continuous variables) and I got nearly identical results (Estimator: ML), except for the mentioned parameter. Setting this parameter equal across both groups results in AMOS in a significantly better cmin/df.
bmuthen posted on Wednesday, March 23, 2005 - 7:49 am
The answer is yes.
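As a rough check on the numbers in the question, the p-value for a chi-square of 13.237 with 1 degree of freedom can be computed by hand. The sketch below is not Mplus output; the function name is made up, and it relies only on the fact that a 1-df chi-square variable is the square of a standard normal variable.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    With 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

p = chi2_sf_1df(13.237)  # difference value reported in the question
print(p < 0.05)
```

The resulting p-value is well below .05, which agrees with rejecting the equality constraint in favor of the less restrictive H1 model.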
Anonymous posted on Friday, April 22, 2005 - 11:49 am
When is multigroup analysis more appropriate than running a regression with interactions? The variances of my variables are quite different across groups, and I am wondering if this is why multigroup analysis is telling me the groups are different but regression with interactions is telling me the groups are the same. I was thinking this disparity arose because multigroup analysis takes variances by group, whereas regression with interactions takes pooled variances.
bmuthen posted on Friday, April 22, 2005 - 3:15 pm
It sounds like you are correct.
Anonymous posted on Monday, April 25, 2005 - 10:19 am
Dr. Muthen, I am trying to set up the baseline model, that is, the model without any constraints, for a multiple group analysis (three groups).
Can I use the sum of the df as a check to see if it ran without any constraints? I ran an individual model where I had an estimated df = 19 and then I ran a multiple group model where I had an estimated df = 63.
If not, how can I check whether my syntax gives the correct model without any constraints?
I used analysis: type = gen missing h1; estimator=mlr;
I do not understand what IN and PS are, but structural parameters such as factor means, variances, covariances, and regression coefficients do not need to be held equal for measurement invariance. I would, however, have the same structural parameters in the groups while testing measurement invariance.
I think the rule of thumb of 5 probably has little meaning at this time.
Anonymous posted on Monday, May 09, 2005 - 2:20 am
I am doing a multigroup analysis using the theta parameterization and having a dichotomous outcome. If I understood it correctly, the factor loadings are held equal across the groups as well as the means and intercepts. If I want to free the factor loadings and the thresholds, I have to do it simultaneously and I HAVE to fix the residual variances in all groups to one and the factor means in all groups to zero. Is that correct? I ask this because if I do a chi-square diff test between a model with factor means fixed to zero in the first group and free in the other group and a model with factor means fixed to zero in both groups, the result speaks clearly against the second model.
It is the factor loadings and thresholds of the factor indicators that are held equal as the default. In the default model, factor means are fixed to zero in the first group and are free to be estimated in the other groups. With the theta parameterization, residual variances of the factor indicators are fixed to one in the first group and are free to be estimated in the other groups. You are correct that when you free factor loadings and thresholds, all factor means should be fixed to zero and all residual variances should be fixed to one.
Anonymous posted on Thursday, June 23, 2005 - 8:34 am
After running a separate analysis for males (M) and females (F), I ran a multiple group with no constraints. However, my chi-square and df values for M and F do not add up to the chi square and df for the multiple group no constraints model. I have provided my syntax for the M model (the F model is the same - I do get the same number of df for the M and F when I run them separately). I have also included my syntax for the multiple group (MG) no constraints model. Each separate model has 154 df and the MG model has 320 df.
MG model syntax:
VARIABLE: ... MISSING = BLANK ;
GROUPING IS gender (0=female 1=male) ;
ANALYSIS: TYPE = MISSING H1;
MODEL: extprob BY T1delinq T1agg ; risk BY MomBSI Finstrai Neighpro ; intprob BY T1somati T1Withdr T1anxiou ; pospar BY Monitor MCTrust SchInvol ; devpeer BY SchFr NeighFr PeerDelq ; extprob2 BY T2delinq T2agg ; intprob2 BY T2somati T2withdr T2anxiou ; pospar ON risk ; devpeer ON pospar T1parstr; T1parstr ON risk ; extprob2 ON devpeer extprob; intprob2 ON devpeer intprob; T2delinq WITH T1delinq ; T2agg WITH T1agg ; T2somati WITH T1somati ; T2withdr WITH T1withdr ; T2anxiou WITH T1anxiou ;
MODEL male: extprob BY T1agg ; risk BY Finstrai Neighpro ; intprob BY T1Withdr T1anxiou ; pospar BY MCTrust SchInvol ; devpeer BY NeighFr PeerDelq ; extprob2 BY T2agg ; intprob2 BY T2withdr T2anxiou ;
MODEL: extprob BY T1delinq T1agg ; risk BY MomBSI Finstrai Neighpro ; intprob BY T1somati T1Withdr T1anxiou ; pospar BY Monitor MCTrust SchInvol ; devpeer BY SchFr NeighFr PeerDelq ; extprob2 BY T2delinq T2agg ; intprob2 BY T2somati T2withdr T2anxiou ; pospar ON risk ; devpeer ON pospar T1parstr; T1parstr ON risk ; extprob2 ON devpeer extprob; intprob2 ON devpeer intprob; T2delinq WITH T1delinq ; T2agg WITH T1agg ; T2somati WITH T1somati ; T2withdr WITH T1withdr ; T2anxiou WITH T1anxiou ;
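A quick degrees-of-freedom tally may help with the question above. The Python sketch below is only bookkeeping, not an Mplus run. It assumes the usual Mplus multiple group defaults for continuous indicators (intercepts of the factor indicators held equal across groups, factor means or intercepts free in all groups but the first) and counts 19 factor indicators and 7 factors in the BY statements shown. The function name is hypothetical.

```python
def expected_mg_df(single_group_dfs, equalities=0, freed=0):
    """df of a multiple group model.

    Sum of the separate-group df, plus one df per cross-group equality
    constraint, minus one df per parameter that the multiple group model
    frees relative to the separate runs.
    """
    return sum(single_group_dfs) + equalities - freed

n_indicators = 19  # indicators listed in the seven BY statements
n_factors = 7      # extprob, risk, intprob, pospar, devpeer, extprob2, intprob2

# Truly unconstrained model: the separate-group df simply add up.
unconstrained = expected_mg_df([154, 154])

# Assumed defaults still in place: 19 intercept equalities across the two
# groups, and 7 factor means/intercepts freed in the second group.
with_defaults = expected_mg_df([154, 154], equalities=n_indicators, freed=n_factors)
print(unconstrained, with_defaults)
```

Under these assumptions the tally gives 308 for a fully unconstrained model and 320 when the intercept equalities remain, which matches the 320 df reported. That suggests, without proving it, that the MODEL male statement freed the loadings but left the default intercept equalities in place.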
Anonymous posted on Wednesday, July 27, 2005 - 4:05 pm
If you find a model is different across two or more groups, is it best to test them simultaneously and get one set of model statistics? Or is it better to split the sample and test the models for each sample separately and get separate sets of model statistics?
bmuthen posted on Wednesday, July 27, 2005 - 6:39 pm
If all parameters are different across groups, it is simpler to work with each group separately. But as long as some parameters are equal across groups you benefit from a simultaneous analysis.
I am testing measurement invariance for a single construct that was measured at different time points. I use multiple CFA in Mplus where the different groups represent the different measurement occasions. I would like to model covariances between the like items' error variances across occasions. I do not know how to model this in a multiple CFA framework in Mplus. Any suggestions will be highly appreciated.
You should not use different groups to represent different measurement occasions because in multiple group analysis each group should contain independent observations. Following is the input for a multiple indicator factor model with four measurement occasions:
MODEL: f1 BY y11 y21 (1); f2 BY y12 y22 (1); f3 BY y13 y23 (1); f4 BY y14 y24 (1); [y11 y12 y13 y14] (2); [y21 y22 y23 y24] (3); [f1@0 f2 f3 f4];
If you want a residual covariance, you would state, for example:
I have a SEM model with two latent endogenous variables that I am treating as continuous and using an MLR estimator. I am testing invariance of the model using the grouping option in Mplus and I have been able to do most of what I want. I am confused, however, about how to constrain the means of my latent factors to be equal across my groups. Can this be done for latent endogenous variables? Related to this, above Dr. Muthen notes that factor means must be set to zero in one group to identify the model, but then why is my tech4 output giving me an estimated mean for my latent variables in both groups? I do see that the intercept for my latent is set to zero in the first group, but I'm somehow missing the connection here. Thanks for any help you can give me.
In a model where intercepts are estimated for the latent variables, there is not a straightforward test of whether means are equal. In a model where you are estimating means not intercepts, you can test that means are equal by fixing the means to zero in all groups.
The model estimated means in TECH4 are based on the model. When a latent variable is endogenous, its mean is equal to the intercept plus the regression coefficients times the means of the exogenous variables it is regressed on.
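The relation just described can be illustrated with a small Python sketch. The function name and all numbers are hypothetical; the point is only that an intercept fixed at zero does not imply a zero mean when the predictors have nonzero means.

```python
def implied_mean(intercept, slopes, predictor_means):
    """Model-implied mean of an endogenous variable.

    mean = intercept + sum of (regression coefficient * predictor mean).
    """
    if len(slopes) != len(predictor_means):
        raise ValueError("need one predictor mean per regression coefficient")
    return intercept + sum(b * m for b, m in zip(slopes, predictor_means))

# Hypothetical first group: intercept fixed at 0, two predictors.
mean_group1 = implied_mean(0.0, [0.5, -0.2], [2.0, 1.0])
print(mean_group1)
```

This is why TECH4 can report a nonzero mean for an endogenous latent variable in the group where its intercept is fixed at zero.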
I am wondering if Mplus allows me to answer an empirical question. I have employees’ data from 31 organizations. My model includes three latent variables at the individual-level, Job satisfaction, job performance, and worker’s belief. My DV is job performance, my IV is job satisfaction. I conducted a multi-sample analyses and I found that the relationship between my DV and IV varies across organizations (i.e. is moderated by organization). Now, I want to test if this moderating effect of organization on the relationship between my DV and IV is partially mediated by worker’s belief. Is this even possible in Mplus? If it is, could you please refer me to some material that deals with this type of problem?
Thanks in advance for your help,
Boliang Guo posted on Tuesday, November 15, 2005 - 1:49 am
In your case there are 31 organizations, so I think you could consider a two-level path analysis, which examines the mediating effect after partialling out the level-2 effects. If you do not have a level-2 variable in your model, just leave the intercept and slope random. With 31 level-2 units, multilevel analysis is the better choice. In any case, try checking the level-2 variances of the intercept and slope first.
I'm wondering if I can conduct the following analysis in Mplus. I modified Examples 9.9 and 9.10 from the Mplus Version 3 User's Guide, pages 205-207. I have 31 clusters; would that be a large enough number of clusters?
TITLE: this is an example of two-level CFA with continuous factor indicators, covariates, and random slopes
DATA: FILE IS ex9.9.dat;
VARIABLE: NAMES ARE y1-y4 x1-x4 w clus;
CLUSTER = clus;
BETWEEN = w;
ANALYSIS: TYPE = TWOLEVEL RANDOM;
ALGORITHM = INTEGRATION;
INTEGRATION = 10;
MODEL:
%WITHIN%
fw1 BY y1-y4;
fw2 BY x1-x4;
s | fw1 ON fw2;
%BETWEEN%
fb BY y1-y4;
y1-y4@0;
fb s ON w;
Thanks a lot,
bmuthen posted on Thursday, November 17, 2005 - 5:14 am
Yes, this model can be estimated in Mplus. It may however require a long computing time. 31 clusters is on the border of being too low. Note that 31 is the sample size for between parameters. You have only 7 between parameters so you are probably ok.
Is it possible to do a multi-group analysis using a covariance matrix as the input if the group variable was included in the matrix? Or if it is not in the covariance matrix, but you know how many groups and the number of respondents by group...but the covariance matrix is not separated out by group?
It is possible to do a multiple group analysis using covariance matrices for some estimators. How to do this is described in Chapter 13 under Multiple Group Analysis, Data In Multiple Group Analysis, Summary Data One Dataset. The grouping variable is not part of the matrices.
Carol posted on Friday, February 10, 2006 - 9:09 am
Hello Dr. Muthen,
I am running a twin model in Mplus using Carol Prescott's examples as a template. In my latest model I ran into the following warning message:
WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IN GROUP MZ18 IS NOT POSITIVE DEFINITE. PROBLEM INVOLVING VARIABLE A2.
Why might this happen and what are the implications in terms of parameter estimates and fit statistics?
Thank you, Carol
bmuthen posted on Friday, February 10, 2006 - 7:40 pm
This message is ok for twin modeling where the A factors are fixed to correlate 1.0 for MZs. The warning message is good in general where you don't want factor correlations of 1.0. In your case, you can ignore it. If you are doing twin modeling, you will enjoy new features in Mplus Version 4 which will be out in a few weeks.
Hello, I am using PLS to verify gender differences in the factors that influence small firms' performance. I ran the full model and then ran it separately for males and females. I am wondering how I could do the multigroup analysis and what to compare. Is it the path coefficients, the t statistics, or the means? I used PLS Graph 3.0. Or is there a way to run the whole model using multigroup analysis?
If I have 3 groups 1 = low 2 = medium and 3 = high and I want to test the invariance of a structural path between the low and high group only (so that my degrees of freedom difference is 1), would I use:
MODEL: F1 on F2 (1);
MODEL Medium: F1 on F2;
So that only group 1 and 3 are held equal...does this sound reasonable?
I try to conduct a multiple group analysis by testing the invariance of first-order factor loadings on second-order factors.
When I ran a fully constrained model, the result indicated that the factor loadings of the first-order factors on the second-order were not equivalent. It showed that factor loadings, intercepts and thresholds of observed variables were constrained.
How can I constrain the factor loadings of the first-order factors on the second-order factor?
Hello, Dr. Muthen: The questionnaire has 33 items, each one having a 5-point Likert scale. By CFA, a measurement model with 5 factors was constructed. Then I tested the measurement invariance for two groups. I first allowed the factor loadings and the item thresholds to be freely estimated, but held the scale factors of the items at 1 and the factor means at 0 in both groups. By doing this, I got a chi-square value of 1583.775. Then I constrained the factor loadings and item thresholds to be equal across groups. The chi-square value for the more restrictive model was 921.745*. However, the chi-square difference is positive 26.589. I used DIFFTEST to do the chi-square difference test because I used the WLSMV estimator. Is it possible for the chi-square of the more restrictive model to be smaller than the chi-square of the more flexible model? Am I doing this right?
I would like to use the factor scores from a multiple group analysis with continuous variables to graph the relationship between two latent variables. However, the factor scores from the multiple group analysis do not seem accurate; that is, some of the children with high scores on the observed variables have very low factor scores (e.g., -3.8), while others with near identical scores on the observed variables have high factor scores (e.g., 2.0). When I examine the factor scores computed from the two single group analyses, the factor scores appear as expected, with high scores on the observed variables translating into high factor scores. Why are the factor scores from the multiple group analysis markedly different from those from the single group analyses? Why do they not reflect the trends seen on the observed variables?
This is a question that would require you to send your input, data, outputs, and license number to firstname.lastname@example.org. If you are not using the most recent version of Mplus, I would suggest that as a first step.
I am working with 5 groups, and would like to test for structural invariance doing pairwise comparisons. I know this code:
model: x on y1 (1) y2 (2) y3 (3);
will result in a test of equivalence for y1 (and y2,y3) across all groups - how can i code it so that only group 2 and group3 (for example) are being compared? I am evaluating the significance of between group differences using the chi-square difference test, incorporating the scaling correction factor (i am using wlsm estimation).
You need to use group-specific MODEL commands to achieve this.
MODEL: x on y1 y2 y3; MODEL g2: x on y1 (1) y2 (2) y3 (3); MODEL g3: x on y1 (1) y2 (2) y3 (3);
Ronald Cox posted on Friday, June 16, 2006 - 5:59 pm
Hi I am testing to see if measurement invariance in a CFA model holds for a repeated measures study. I am fitting the same model simultaneously in both samples (time 1 and time 2), without any parameter constraints in order to create a baseline model. However I am getting an error message of "insufficient data" I am using the demo version. Do you have any suggestions what I might be doing wrong? My input file follows. Thanks,
TITLE: Baseline model 10th and 11th graders STEP 3) DATA: FILE = assig6data3.1.INP; TYPE = COVARIANCE; NGROUPS= 2; NOBS = 220 220; VARI: NAMES = CA11 CA12 CA13 CA21 CA22 CA23; MODEL: CASPIRE1 BY CA11 CA12 CA13; CASPIRE2 BY CA21 CA22 CA23; MODEL G2: CASPIRE1 BY CA11 CA12 CA13; CASPIRE2 BY CA21 CA22 CA23;
*** ERROR Insufficient data in "assig6data3.1.INP"
This means that Mplus is not finding enough information in the data file. You need to place the covariance matrix for group 1 first followed by the covariance matrix for group 2. See Chapter 13 where this is described. If you can't solve this, you need to send your input, data, and output to email@example.com.
A few questions regarding multiple group comparisons:
1) I have read that Kenny recommends testing for structural invariance before testing for invariance of error covariance - what would be the harm in testing for invariance of error covariance prior to testing for structural invariance?
2) If forcing two structural parameters to be equal results in a non-positive definite latent variable covariance matrix OR model non-convergence, what should be done about this? What would be the next step?
3) I have read previously on the MPlus discussion board that if a scaled chi-square difference test doesn't run due to negative chi-square difference values, that this is a function of the method and it is not possible to conclude whether the parameters are equal or not in each group. Can you provide a reference for this?
1. There is no harm but invariance of error covariances is less likely than structural invariance. 2. This may indicate that the structural parameters should not be held equal. 3. There is a Satorra and Bentler article about this from a few years ago. I don't know the exact reference.
when testing between group differences, should it always be a change of one degree of freedom between models? If I hold a parameter equal across all groups, I get a change of four degrees of freedom. should i be putting equality constraints between two groups at a time?
Hi, I'm working on a two group model with uneven group sizes. The grouping variable is high school team sport participation among females. The first group has 128 participants. The second (no teams) group has only 43 individuals. I ran the two group model and all the fit indices are good, including a non-sig chi-square and an RMSEA=0(0 .03). My question is whether this analysis is troublesome because of the vast difference in group sizes?
Yes. The following would hold the regression coefficients equal across groups:
MODEL: y1 on x1 (1) x2 (2) x3 (3);
Nina Zuna posted on Wednesday, August 23, 2006 - 1:56 pm
Dear Drs. Muthén,
I am still in Mplus learning mode and came across something I don't understand. I ran 2 single CFAs for my 2 grps and then ran my initial Multiple grp (MG) CFA (configural invariance). Each used MLR estimator. My single group Chi sqs. using MLR do not add up to multiple grp total chi sq using MLR. If this same procedure is done using ML they add up. Que 1. What is diff about MLR that makes the two separate chi sqs not add up to MG chi sq? Second puzzling occurrence regardless of estimator used: I had always assumed in MG invariance testing that the group with the lower contribution to chi sq had better fit (ideally you want these #'s roughly equal in MG), but when I did my single CFAs as described above I found out the opposite occurred. The group with the larger Chi sq in MG when run in single CFA had better fit than grp with lower chi sq run in single CFA. The grp with better fit statistics in single CFA (higher chi sq) appears to be driving the fit statistics in MG invariance tests. This group also has the larger n so perhaps this power differential is the cause. However, I thought since the CFI and TLI are comparative fit indices that they wouldn't be as influenced by sample size?? I am quite confused by this. Que 2. Any thoughts would be very much appreciated.
Q1. This is the same issue as MLR chi-square differences between nested models not being chi-square distributed. This topic is discussed on our web site - see left margin How-To "chi-square difference test".
Q2. Groups with larger n influence the parameter estimates more. And parameter estimates in turn influence CFI/TLI. You say "better fit statistics in single CFA (higher chi sq)" - that must be a typo since high chi-square is a worse fit statistic than a low chi-square.
Nina Zuna posted on Wednesday, August 23, 2006 - 8:58 pm
Thanks for your response to que 1 - makes sense. As for 2nd ques I am still stumped b/c it is not a typo. Below is my output from the 2 single CFAs and MG.
Chi-Sq Test of Model Fit - Disability group (n=112)
Value 351.127*, df 183, P-Value 0.0000
CFI 0.810, TLI 0.782, RMSEA 0.091
As you will note, all 3 fit statistics are worse in this model with the lower Chi Sq value.
Chi-Sq Test of Model Fit - Non-Disability group (n=566)
Value 514.760*, df 183, P-Value 0.000
CFI 0.906, TLI 0.892, RMSEA 0.057
Fit statistics are better in this model with the higher Chi Sq.
Chi-Sq Test of Model Fit - Multiple group (Disability and Non-Disability)
Value 882.407*, df 366, P-Value 0.0000
CFI 0.889, TLI 0.872, RMSEA 0.065
In the multiple group, the fit statistics are in between the other two models, with the Non-Disability grp with higher Chi sq. seeming to dominate.
Convention suggests that CFI should be above 0.95 for a well-fitting model. I don't think one should compare fit indices when they all point to this degree of misfit. Degrees of poor fit can't really be judged well, I think. In any case, fit indices quite often disagree with each other - this is why it is useful to work with many - at least it is helpful in cases where they are all good.
Nina Zuna posted on Thursday, August 24, 2006 - 6:39 pm
Thank you, Bengt; your continued follow-up is very much appreciated. Indeed, I agree the fit is bad for both groups. I continue to grapple with the fact that the higher chi square had the better model fit. So based on your response am I correct to assume that when model fit is this poor, one might see such anomalies as better model fit for a model with a higher chi square than for a model with a lower chi square? I don't think I have seen this before. Everything I have read indicates that Chi square value is a measure of badness of fit. Is there any explanation I could offer to my committee members on this discrepancy?
Your Non-Disability group has worse chi-square but better CFI than your disability group - this may be due to the Non-Disability group having a much larger sample size where sample size probably affects chi-square more than CFI.
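The sample-size point can be checked against the posted output. The following is a minimal sketch, assuming the usual RMSEA point estimate sqrt(max(chi2 - df, 0) / (df * n)) with an extra sqrt(G) factor for a G-group run; the function name `rmsea` is hypothetical, and the chi-square, df and n values are the ones reported above. Under that assumption the computed values round to the RMSEA figures in the posted output.

```python
import math

def rmsea(chi2, df, n, groups=1):
    """RMSEA point estimate; the sqrt(groups) factor is an assumed multi-group adjustment."""
    return math.sqrt(groups) * math.sqrt(max(chi2 - df, 0.0) / (df * n))

# Chi-square, df and n copied from the output posted above.
disability = rmsea(351.127, 183, 112)
non_disability = rmsea(514.760, 183, 566)
multigroup = rmsea(882.407, 366, 112 + 566, groups=2)

for label, value in [("Disability", disability),
                     ("Non-Disability", non_disability),
                     ("Multiple group", multigroup)]:
    print(f"{label}: RMSEA = {value:.3f}")
```

Because the excess chi-square per degree of freedom is divided by n, the larger group can show a larger chi-square and still a smaller RMSEA.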
Nina Zuna posted on Friday, August 25, 2006 - 9:00 am
Again, thank you so much for your time. Your website and discussion board are such wonderful resources. I look forward to meeting you and learning from you in MD.
In testing measurement invariance (categorical indicators: loadings and thresholds), does anyone have an idea about how many invariant loadings/thresholds is needed to meet criteria for partial measurement invariance?
I don't know about LISREL. This forum is for Mplus. You would need to contact LISREL support.
TAO, Sha posted on Thursday, March 01, 2007 - 12:58 pm
I am trying to do a 3-group SEM with summary data (correlation matrices and STDs). There are two predictors (one was measured by 3 indicators, the other one is measured by 2 indicators), and one outcome measured by 3 indicators. This analysis is to examine the equality of the path parameters from the two predictors to the outcome. The Script is specified as follows: TITLE: Grade 1-3 SEM: only paths from Independent LVs to the DV are constrained to equal ; DATA: FILE IS "D:\GRADE1-3.txt" ; TYPE = CORR MEANS STD ; NOBSERVATIONS = 100 100 100 ; NGROUPS = 3 ; VARIABLE: NAMES ARE OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ; ANALYSIS: TYPE = General ; ESTIMATOR is ML ; MODELS: LV1 by OV1@1 OV2 OV3 ; LV2 by OV4* OV5@1 ; LV3 by OV6@1 OV7 OV8 ;
LV3 on LV1 LV2; LV1 with LV2 ;
LV1 LV2 LV3;
OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ;
TAO, Sha posted on Thursday, March 01, 2007 - 12:59 pm
MODEL g2: LV1 by OV1@1 OV2 OV3 ; LV2 by OV4* OV5@1 ; LV3 by OV6@1 OV7 OV8 ;
LV1 LV2 LV3;
OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ;
[LV1 LV2 LV3] ;
[OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ] ;
MODEL g3: LV1 by OV1@1 OV2 OV3 ; LV2 by OV4* OV5@1 ; LV3 by OV6@1 OV7 OV8 ;
LV1 LV2 LV3;
OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ;
[LV1 LV2 LV3] ;
[OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ] ;
OUTPUT: STANDARDIZED SAMP ;
When I run the analysis, MPLUS stopped with an error message: *** ERROR Insufficient data in "D:\GRADE1-3.txt"
So I checked the summary data, and did not find anything wrong with the three matrices and STDs. Would you pls let me know what caused this error and how to fix it? Thanks a lot.
Please do not continue your post into more than one window. Posts that cannot be fit into one window are not appropriate for Mplus Discussion. This is a support question. Please send your input, data, output, and license number to firstname.lastname@example.org.
I am trying to use multiple group analysis for a SEM model with two continuous latent independent variables and a variety of observed independent variables regressed on a count dependent variable.
When I try to include the GROUPING option, I get the following error: ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE.
However, I have not specified "ALGORITHM=INTEGRATION" and MIXTURE does not make sense for my model (I am using GENERAL). I tried using KNOWNCLASS to see if it would work, and it says: KNOWNCLASS option is only available with TYPE=MIXTURE.
Any idea what the problem might be? Thanks so much in advance.
Linda posted on Thursday, April 26, 2007 - 8:50 am
I have an experimental data with multiple groups (3 intervention groups and 1 control group). I was told that I could create contrast between the groups and use it as an exogenous variable or use multiple group sample analysis. What is the advantage of doing one vs the other? Also, if I use the multiple group analysis, would I be including the control group as well? If this question is too basic, could you refer me to an article? I have done a search before and can't find an article that addresses my question.
In multiple group analysis, more parameters can vary than in a model where the grouping variable is a covariate where only intercepts and means can vary. I would include the control group. You may find the following paper of interest:
Muthén, B. & Curran, P. (1997). General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods, 2, 371-402
Using WLSMV, we're estimating a multiple groups path analysis with an ordered, categorical outcome, which we'll refer to as z. Additionally, we have several exogenous predictors, call them x1, x2, and x3, and a mediator y. Initially, we obtained an excellent fitting model by allowing y to partially mediate the influence of each exogenous predictor (x1, x2, and x3) on z. Now, across several ethnic groups, we're attempting to impose an equality constraint on the path coefficient from y to z. The model statement reads as follows:
MODEL: y ON x1 x2 x3;
z ON y (1);
z ON x1;
z ON x2;
z ON x3;
Given the use of WLSMV, we've employed the DIFFTEST option to obtain the chi-square difference test. Appropriately, we've tested the less restrictive model first, saved the results using SAVEDATA, and then estimated the constrained model. Surprisingly, however, the output does not include the difference test, instead reporting that the constrained model is not nested within the original. As far as we can tell, this is inaccurate. Also, as specified in the input, Mplus correctly imposes the equality constraint across ethnic groups for the relationship of y to z, with all remaining effects estimated as requested.
Are we incorrect in assuming the restricted model is nested within the original model?
I am running a path model with multiple groups (with both dichotomous and continuous endogenous variables using WLSMV). I am interested in testing for gender differences in individual regression coefficients. I know that I can constrain all regression paths to be equal between groups and then compare this model to the model without these constraints. This will tell me if there is a significant difference in the fit of the path model by gender. In order to test for structural invariance of individual paths however, do I have to run separate models for each? I would be running 23 models and doing difference testing for each.
I'm running a multigroup analysis with covariates. Apparently, Mplus returns an error (and does not estimate the model) whenever the variance of one of the covariates in one of the groups is zero. However, this zero-variance is not necessarily a problem as long as I pool the coefficient of that covariate across groups. So, is there a way to "force" Mplus to estimate the model. Thanks in advance.
Thanks again for your answer: by including the VARIANCES=NOCHECK; option, the model starts running. However, apparently, the procedure still encounters singularities because of group-specific operations (again, in one of the groups one of my covariates has zero-variance). Can I somehow exclude the covariate from the group where it has zero-variance and still keep the covariates' coefficients equal across groups? I think that would solve the problem.
If this does not work, please send your input, data, output, and license number to email@example.com.
Linda posted on Thursday, October 11, 2007 - 9:02 am
I had a question about interpreting findings from multiple sample SEM investigating structural paths. I am running multiple sample SEM using intervention types as groups (Control, TPC, TMI, and TPC+TMI). And as an obvious approach, I am using the control group as my reference group when building the multiple sample SEM. Here is my question. So when a structural path shows that it's not different across groups, is that in reference to the control group only? Is multiple sample SEM allowing me to see the differences in the paths between TPC vs. TMI, TPC vs. TPC+TMI, and TMI vs. TPC+TMI? If so, how does that work given that I am specifying a reference group?
I'm not sure what you mean by using the control group as a reference group. You may want to use MODEL TEST. See the user's guide.
Linda posted on Thursday, October 11, 2007 - 10:48 am
Yes, you are right. I got confused after reading an article. It's all clear now. I also have another question. To test for group invariance, how would I code my groups? Does this make sense control=0, TPC=1 TMI=2 and TPC+TMI=3? Also, doesn't the numeric coding imply a linearity of the groups?
The numbers tell the program how to divide the data into groups.
Linda posted on Friday, October 12, 2007 - 8:02 am
Great! It's all clear now. Thank you very much!
Linda posted on Wednesday, November 28, 2007 - 8:53 am
I am conducting Multi-sample SEM on 4 groups. I am investigating structural paths between 5 variables (1 LV and 4 OV). I would like to build my model constraining first the LV before I constrain the structural paths.
Here is my question. Do I need to constrain the loadings, residual variances, and means? or could i just constrain the loadings?
The first step is to establish measurement invariance of the latent variable. How to do this is described in Chapter 13 of the user's guide. Only if the latent variable is the same construct in all groups does it make sense to make comparisons of the structural parameters.
Linda posted on Thursday, November 29, 2007 - 8:25 am
Thank you for your reply.
I did establish measurement invariance first. And I wanted to take a step further to establish the structural parameters. To do that, do I keep the groups constrained on the loadings only or residual variances and means as well?
Linda posted on Thursday, November 29, 2007 - 9:22 am
I am running the model below where x1, x2, and x3 are exogenous predictor variables, m1, m2, m3 and f1 are mediators, and y is the outcome variable.
MODEL: f1 by y1 y2 y3; m1 on x1 x2 x3; f1 on m1; m2 on f1; m3 on m2; y on m3;
I get the following error message. how could I fix this? Thanks in advance.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.338D-10. PROBLEM INVOLVING PARAMETER 15.
Once you have established measurement invariance, you should leave the equalities in place. We don't use equalities of residual variances.
Regarding the error message, please send your input, data, output, and license number to firstname.lastname@example.org.
Jungeun Lee posted on Friday, November 30, 2007 - 4:49 pm
I am working on a multiple group (males and females) SEM. I'd like to test whether or not each individual coefficient in the structural part differs by males and females. I used MODEL TEST to test this. Here is my mplus input.
MODEL: depr by dep1 dep2 dep3 dep4; hope by pos1 pos2 ; anxiety by anx1 anx2; hope on anxiety (p1); depr on hope (p2);
MODEL female: depr by dep2 dep3 dep4; hope by pos2 ; anxiety by anx2; hope on anxiety(p3); depr on hope (p4);
MODEL TEST: 0=p1-p3; 0=p2-p4;
The program gave me one wald test result (value=.343 pvalue=.8425).
Q1. What does this mean? Does it mean that p1=p3 & p2=p4?
Q2. I expected more than 1 test results from the above analysis-- like the first test result corresponds to p1=p3 and the second test corresponds to p2=p4... Is there a way to do this in mplus? Or, do I have to run separate models for each?
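One way to read the single Wald result is to check which degrees of freedom match the reported p-value. The sketch below is only an arithmetic check, assuming the Wald statistic is referred to a chi-square distribution with one degree of freedom per MODEL TEST statement; the helper `chi2_sf_even_df` is hypothetical and uses a closed form that holds only for even df. With 2 df the reported value of .343 gives a p-value of about .84, which is consistent with one joint test of p1=p3 and p2=p4 rather than two separate tests.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1, building each term from the previous one.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

wald_value = 0.343  # Wald test value reported in the post
p_joint = chi2_sf_even_df(wald_value, df=2)
print(f"p-value with 2 df: {p_joint:.4f}")
```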
I've got a question to the way of reporting a multigroup-SEM in a paper:
some of the effects in my model are set equal for both men and women. Some are not. If the model is presented in a paper, standardized estimates are reported in general. But for the equal effects, the standardized estimates are different, while the unstandardized are not.
If I want to report the standardized estimates, which estimate do I choose? the one of the male or of the female group?
I would report both raw and standardized coefficients and standardized for both males and females.
Erika Wolf posted on Monday, January 28, 2008 - 10:57 am
I'm running a series of CFAs to examine measurement invariance across 2 groups. I'm using categorical indicators and using the WLSMV estimator and the DIFFTEST function to test the nested models.
I'm first testing for equal form across the groups, allowing the factor loadings and thresholds to be freely estimated in both groups and setting the scale factor to 1 in the 2nd group.
In the second model, I'm testing for equal factor loadings, so I've left in all the Mplus defaults and have not specified anything for the 2nd group.
I'm confused, though, because my equal factor loadings model has fewer DF than my equal form model when I would expect this to be the other way around. Is this simply a function of the DF being estimated with the WLSMV estimator? Or are there additional defaults that I should override in my equal factor loadings model?
With WLSMV, the only value that you should look at is the p-value. If you want to look at degrees of freedom in the traditional way, use WLS or WLSM to see if they are behaving as you would expect. See also the section in Chapter 13 where the set of models to test measurement invariance for categorical outcomes is described.
Dale Glaser posted on Tuesday, February 05, 2008 - 12:35 pm
Hi Linda and Bengt...I have a result that seems easy enough to rectify but is proving to be intractable! I am testing a multigroup (g = 2) CFA with three constructs and three items per construct. When I test the model for the full sample, I get an unimpressive fit (CFI = .82, RMSEA = .118, etc.); however, when I run the multigroup model, whether I constrain the loadings to be equal or not I get an error message that "the standard errors of the model parameter estimates could not be computed.....".....when I check the offending parameter, it is the parameter in the PSI matrix and has a negative SE. After checking for collinearity, multivariate normality, etc. there didn't seem to be any major problems. Interestingly, when I run an EFA for each group there is a very clean factorial solution for each group (though I am well aware of EFA vs. CFA differences in results). After trying various fixes (e.g., constraining the elements in the PSI matrix to 0) I was only able to attain convergence when I used the parameter estimates from the full sample as fixed estimates, and as expected fit was horrible (CFI = .65, RMSEA = .122). Unfortunately, due to privacy issues I can't share the data as Linda generously offers. So, before abandoning this model, any recommendations for negative SE in PSI matrix even though the usual culprits (e.g., singularity) are not an apparent issue? Thank you....Dale
Have you run the CFA model for the two groups separately?
Dale Glaser posted on Tuesday, February 05, 2008 - 3:25 pm
yes I did Linda, and I was able to obtain convergence for one group but fit was abysmal (CFI approx .8, RMSEA approx, .12, etc.)......and I believe that for the other group I had to fix the PSI estimate to obtain convergence (and again fit was problematic).......what I find intriguing is the factorial solution for EFA (whether orthogonal or oblique rotation) was very unambiguous (i.e., as postulated) for both groups.......
I know you can't send your data but I would like to see the EFA outputs for each group and the CFA outputs for each group. If you had clear EFA results, the CFA's should not fit so poorly. It does not sound like the CFA fits in either group.
I recently conducted several analyses where I compared the pattern of results across correlation matrices of mostly personality data. Specifically, I was interested in whether the pattern of results in group A (e.g., men) was similar to that seen in group B (e.g., women). The procedure yielded the following fit indices:
-Chi-square -Standardized RMR -RMSEA -Population Gamma -Adjusted Population Gamma -McDonald Noncentrality index -Population noncentrality index
For all but the first two I have 90% confidence intervals as well.
My sample sizes are, by SEM standards, small (<=250). I have read the Hu and Bentler paper but am still a little unclear as to what the "appropriate" cutoffs are for assessing model fit. Any suggestions? How might the confidence intervals sort out this issue?
Any and all suggestions would be greatly appreciated -- thanks in advance!
From the indices you give, it looks like you are not using Mplus. These indices are tests of overall fit of the model not the comparison of groups. Chi-square difference testing can be used to test across group differences.
I have a problem with a multiplegroup SEM (two groups). I have a set of mixed observed variables (continuous and categorical), so my input is a matrix of Polychoric/polyserial/pearson correlations. However I can't use WLS estimator (and calculate Asymptotic Covariances Matrix), maybe because of little N (500) in one group. My questions are:
a) my model converges with a ML estimation, with quite good fit; anyway, is that correct? b) If I want to compare two structural parameters (or test factorial invariance), what can I do if I used a correlation matrix?
There are a couple of issues here. If you are doing a multiple group analysis using a correlation matrix as input, then you must be telling Mplus it is a covariance matrix or this would not be allowed. If you have a combination of continuous and categorical dependent variables, you need to use raw data in Mplus.
Hi Linda and Bengt - I am working on measurement invariance for my model and have a question about constraining means to 0. According to UG: MODELS FOR CONTINUOUS OUTCOMES... 1. Intercepts, factor loadings, and residual variances free across groups; factor means fixed at zero in all groups 2. Factor loadings constrained to be equal across groups; intercepts and residual variances free; factor means fixed at zero in all groups...
This would be the means for the latents only and be indicated by [var@0] for the second group? Should I also set means for observed exogenous variables to 0 as well? Thanks, Sue
Hi Linda and Bengt: Still working on measurement invariance.
MODELS FOR CONTINUOUS OUTCOMES
2. Factor loadings constrained to be equal across groups; intercepts and residual variances free; factor means fixed at zero in all groups
I originally ran this model with the factor means default and it ran fine and I was able to achieve partial metric invariance. However, when I run the model with the factor means at zero for all groups, I get the following error message: THE MODEL ESTIMATION TERMINATED NORMALLY
THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL MAY NOT BE NESTED IN THE H1 MODEL. DECREASING THE CONVERGENCE OPTION MAY RESOLVE THIS PROBLEM.
I'm running a two-group path analysis, but I've a major problem... I don't obtain an output !
As mentioned in the user's guide, I first specified the H1-model (unconstrained model) with DIFFTEST-option in the SAVEDATA-command. That turned out well (output file was OK).
Second, I specified the H0-model (fully constrained) in which all regression coefficients are defined equal between both groups. This is done by specifying the DIFFTEST-option in the ANALYSIS-command. The syntax for this H0-model is provided below. The output file only mentions that reading the input terminated normally. However, no other information is provided in the output (no model results, no Chi² difference test, ...). What can cause this problem ?
DATA: FILE IS C:\AAG\data AAG test.dat;
VARIABLE: NAMES ARE ... USEV ARE ... MISSING ARE ... CATEGORICAL = cu; GROUPING IS tour (0=worktour 1=complextour);
Thanks for your help, but I've managed to solve the problem. The models ran successfully and I obtained the outputs.
anja schüle posted on Thursday, June 05, 2008 - 6:08 am
I confirmed a big SEM model with continuous variables. Now, I am trying to do a two-group SEM analysis within this model in addition: Hypothesizing that 7 of the 14 betas are affected and respectively vary across the two groups, (but the other 7 do not, and the gamma doesn’t either). Is it the right way to test my seven hypotheses by running the model at first for both groups by the “grouping” command, and only specifying the general Model after “Model:” without any restrictions, and afterwards, in the second run, doing the same again but constraining one beta to be equal across groups by formulating: f4 on f1 (1) (So I would have to calculate this second model 7 times, each time constraining only 1 beta. And then, comparing the X² of this constrained run with the X² of the unconstrained run to see if the difference is significant?)
Or is it better to constrain all betas and gammas across groups in the first run, and afterwards in the second run, to set only one beta free in each run? (And compare the X² of the completely constrained model with the X² of the model where only one beta is set free?)
I have tested a complex model (6 latent variables; 22 observed variables) and obtained acceptable model fit for the data. I ran subsequent multiple groups analyses for each of the 3 race/ethnicities within the dataset (n = 140 for each race/ethnicity). The model fit was excellent for two of the groups, but unacceptable for the third group. How do you suggest I proceed? Should I scrap the omnibus model and develop individual models for each race/ethnicity? Should I accept the omnibus model for the two races that have good fit and develop a different model for the third race?
I have been unable to find guidance on this issue? Any help (advice, references) you could provide would be very appreciated.
It does not make sense to put groups together if the same model does not fit the data well for each group. Determining this is the first step in testing for measurement invariance. Once this has been established, then measurement invariance across the groups can be tested. Only then does it makes sense to combine the groups. You can search the literature for measurement invariance for more information and also see the following papers:
Muthén, B. (1989). Factor structure in groups selected on observed scores. British Journal of Mathematical and Statistical Psychology, 42, 81-90.
Muthén, B. (1989). Multiple-group structural modeling with non-normal continuous variables. British Journal of Mathematical and Statistical Psychology, 42, 55-62.
I have a question about power analysis for a multiple group SEM where we plan to evaluate the mediated effects of an intervention on drinking outcomes in Caucasian versus Hispanic adolescents. I have conducted the power analysis in Mplus on the overall model and have found that with 200 subjects I have power .75 and with 250 subjects I have power .84. Given this information, how do I determine what sample size is needed for each group? Do I simply double the sample size (and thus, I would need n=250 Caucasians and n=250 Hispanics)? I've read a few papers including Muthen & Muthen 2002 from the website and can't seem to find the answer to this question. Most I can find about power in multiple group SEM is that power is higher if group sizes are equal. Any help would be much appreciated!
Thanks so much for your response. Yes, we hypothesize structural paths that will be different in the two groups. So I should do two different power analyses? If the answer is that I were to need 100 in one group and 200 in another, doesn't this compromise power for the 1 df chi-square difference tests for whether a particular path is, indeed, different between the groups?
I don't think this would compromise the multiple group power test. However, this is a test of the equality of two parameters so the last column is not power because the parameter is not being compared to zero. You would need to use MODEL CONSTRAINT to create a new parameter that is the difference between the two parameters and see the last column for that.
Hi there - I have a SEM model with one categorical predictor (two levels). This predictor represents 2 different experimental conditions that my participants were in (between subjects design). My question is, am I able to simply dummy code this categorical variable and run the usual SEM, or do I need to run a different kind of SEM in order to analyze this? Thanks very much for your help, Andrea
Thanks so much for getting back to me Dr. Muthen. I think I need a bit more clarification though (forgive me - I am new at this whole SEM thing). I was wondering if a dummy coded, 2-level categorical predictor (not a covariate) can be included in a regular SEM analysis. And I checked example 5.8 and it doesn't seem to refer to a categorical predictor...perhaps I am misinterpreting it? Thanks so much, sorry for the repeat postings, Andrea
I am estimating a structural model to look at whether father involvement mediates the relationship between being an immigrant child and that child's cognitive outcomes.
The mediator of father involvement is a latent variable. I am first running a CFA before proceeding to the path analysis portion. I have determined that I do not have measurement invariance between resident and non-resident fathers on the latent variable of Father Involvement, although the CFA model shows an acceptable fit for each group. Theoretically, this makes a lot of sense, since what fathers do when they live with or away from their children should vary, that is, I expected to find noninvariance.
I am struggling because now that I have established noninvariance, is it ok to go ahead and estimate the larger path models separately by group? Or should I estimate one large model across both groups and allow all the parameters of the latent variable Father Involvement to vary across groups? Is this even possible?
If you do not have invariance of the factor across groups, then you should look at the two groups separately. You cannot compare the factor parameters across the two groups in a meaningful way without measurement invariance.
Heejung Chun posted on Thursday, November 27, 2008 - 8:09 pm
I am conducting a multiple group analysis (MGA) with a second-order confirmatory factor model. The MGA is established in five steps. The five steps are the following:
1. Configural invariance (released the intercepts of the indicators along with releasing all other parameters) 2. Factor Loading invariance of Indicators 3. Factor Loading invariance of First-order factors 4. Intercept invariance of Indicators 5. Intercept invariance of First-order Factors
In my understanding the CFIs should decrease as I constrain factor loadings and/or intercepts between groups. However, my results showed greater CFIs as I constrained some parameters between groups. Is this right?
I would appreciate your answer.
dena posted on Friday, November 28, 2008 - 8:27 am
I would like to run autoregressive models separately for boys and girls because the correlation matrix clearly suggests that our variables of interest are significantly correlated among girls but not among boys. My question is can I (and if yes, how) justify my decision to run models separately for boys and girls? I read somewhere that we can test whether the variance-covariance matrix is the same for boys and girls and if not, this could justify the split of the analyses. I’m not sure if this is right and how to do that. I constrained all the correlations to be equal for boys and girls. The chi-square with 28 df = 53.17. Can I compare it to the base model (chi-square = 0) and say that the constrained model is «significantly worse» (critical chi-square for n = 28df = 41.34)?
I also did multi-group analyses. Even though some coefficients are significant and quite different for girls and boys, the difference in the chi-square when I look at the constrained vs. unconstrained models is not significant. Is it normal?
The most pointed analysis would be the multi-group analysis where the auto-regressive model is used for both genders and runs with full equality and full inequality are used to form the chi-square difference test.
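The arithmetic in the question (chi-square of 53.17 on 28 df against a critical value of 41.34) can be checked directly. This is a hedged sketch of the p-value computation only, using a hypothetical helper `chi2_sf_even_df` whose closed form is valid for even df; it assumes an ordinary (unscaled) chi-square and says nothing about whether the equal-correlations test is the right comparison.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1, building each term from the previous one.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p_constrained = chi2_sf_even_df(53.17, df=28)  # chi-square reported in the question
p_at_critical = chi2_sf_even_df(41.34, df=28)  # critical value quoted in the question
print(f"p for 53.17 on 28 df: {p_constrained:.4f}")
print(f"tail area at 41.34:   {p_at_critical:.4f}")
```

The quoted critical value corresponds to a tail area of about .05, and the observed chi-square falls well beyond it.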
If the missingness is due to different reasons in the two groups, any group comparisons would be biased. You should investigate why the missingness occurs. The only solution would be to include variables in the model that relate to missingness. But with so much missing data in the one group, some may not find your results meaningful.
I have two groups and I want to fix all factor loadings and path coefficients across all groups. Therefore, I have the following input:
CP by cp1 cp3 (1); AC by ac3 ac4 ac5 ac6 (2); NC by nc4 nc5 nc7 (3);
CP on AC (4); CP on NC (5);
But, when I look into the output, loadings and path coefficients are only equal in the model result section. If I compare the values in the stdyx standardization section, the factor loadings and path coefficients differ. I don't understand this, as they should be also equal - I mean that is what I wanted to fix in my input commands. When I run the same stuff in LISREL, the standardized values are equal...
1. Estimate the nested and comparison models using MLR. The printout gives loglikelihood values L0 and L1 for the H0 and H1 models, respectively, as well as scaling correction factors c0 and c1 for the H0 and H1 models, respectively. For example,
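A sketch of the computation that follows from those quantities, using made-up numbers rather than the values of the original example. It assumes the standard loglikelihood-based scaled difference formulas cd = (p0*c0 - p1*c1)/(p0 - p1) and TRd = -2*(L0 - L1)/cd, where p0 and p1 are the numbers of free parameters in the H0 and H1 models; the function name and all input values are hypothetical.

```python
def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Return (cd, TRd, df) for a nested H0 model against a comparison H1 model.

    l0, l1: loglikelihoods; c0, c1: scaling correction factors;
    p0, p1: numbers of free parameters (p0 < p1 for a nested H0).
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)   # difference test scaling correction
    trd = -2.0 * (l0 - l1) / cd            # scaled chi-square difference
    return cd, trd, p1 - p0

# Hypothetical printout values, chosen only to illustrate the arithmetic.
cd, trd, df = scaled_loglik_difference(l0=-1000.0, c0=1.2, p0=10,
                                       l1=-990.0, c1=1.3, p1=14)
print(f"cd = {cd:.3f}, TRd = {trd:.3f}, df = {df}")
```

TRd is then referred to a chi-square distribution with df equal to the difference in the number of parameters.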
I’m doing multi-group analyses on autoregressive cross-lagged paths with two variables and four time-points.
I first did my global model (both genders), in which I had to add two correlations between residuals to improve model fit.
Then, I looked at whether the fit was good among girls and boys. Fits are ok, but I noticed that one of the correlated residuals is not significant among girls, whereas the other is not significant among boys (i.e., one is significant in each group, but it’s not the same).
I also noticed that some paths are significant among girls but none are significant among boys.
My questions are:
- Is it necessary to go further in the analyses since the coefficients are only significant among girls and not among boys? - If yes, if I constrain only the significant paths to be equal among boys and girls, should the difference in the chi-square detect these differences? - If not, what could explain it?
Yes, it is. Then, what can I conclude from these findings?
Can I still report the coefficients for boys and girls and mention that even though the coefficients were significant for girls but not for boys, we could not detect a significant difference between the two?
Answer to your third (last) question - if you cannot reject equality of the coefficients across gender you would want to consider if this common coefficient is significant; perhaps it is. That would make perfect sense I think.
dena posted on Sunday, February 01, 2009 - 1:26 pm
The common coefficients (I assume these are the coefficients for the total sample) are not always significant...
What conclusions can I then draw?
Coefficient for total = .09, z = 1.88
Coefficient for girls = .13, z = 2.01
Coefficient for boys = -.02, z = -0.281
If I constrain only this coefficient to be equal among boys and girls, the delta chi-square is not significant.
I would report what you see - the coefficient for girls is significantly different from zero. The coeff for boys is not. The two coefficients are not significantly different from each other. This is not contradictory - the last statement might be due to the coefficient for boys having a large standard error; the SE for boys plays into the gender difference testing. Perhaps there is too little power to reject gender differences.
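The point about the boys' standard error can be checked roughly by hand. The sketch below is an approximation under stated assumptions, not the chi-square difference test itself: it recovers each standard error as estimate / z from the values reported above and treats the two group estimates as independent. The function names are illustrative.

```python
import math


def se_from_z(estimate, z):
    """Recover a standard error from a reported estimate and its z-value."""
    return abs(estimate / z)


def z_test_difference(b1, se1, b2, se2):
    """z-test for b1 - b2, assuming the two estimates are independent."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (b1 - b2) / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Values reported in the post above.
b_girls, z_girls = 0.13, 2.01
b_boys, z_boys = -0.02, -0.281

se_girls = se_from_z(b_girls, z_girls)
se_boys = se_from_z(b_boys, z_boys)
z_diff, p_diff = z_test_difference(b_girls, se_girls, b_boys, se_boys)
print(f"SE girls = {se_girls:.3f}, SE boys = {se_boys:.3f}")
print(f"z for the difference = {z_diff:.2f}, two-sided p = {p_diff:.3f}")
```

The boys' standard error is the larger of the two, and the z-value for the difference stays below 1.96, which is consistent with the non-significant delta chi-square.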
Kihan Kim posted on Tuesday, February 17, 2009 - 4:00 pm
I am trying to test a multi-group SEM with no constraints on the measurement and structural parts (I do not want any parameter to be constrained).
I have five factors (F1-F5), and the following is the MODEL command. I keep receiving the following error message, and I am not sure what is wrong with the model identification. Could you please advise me?
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 73.
Model: F1 by Y1 Y2 Y3; F2 by Y4 Y5 Y6; F3 by Y7 Y8; F4 by Y9 Y10; F5 by Y11 Y12 Y13 Y14;
F5 on F1 F2 F3 F4;
F1 by Y1 Y2 Y3; F2 by Y4 Y5 Y6; F3 by Y7 Y8; F4 by Y9 Y10; F5 by Y11 Y12 Y13 Y14;
I was wondering whether it is possible to test for an interaction between an observed and a latent variable (both continuous) within a multiple group analysis, i.e. to test whether the interaction (between two variables) differs between groups (third variable).
I just read that interactions with continuous variables require numerical integration. But if I add "ALGORITHM = INTEGRATION", Mplus tells me that "ALGORITHM = INTEGRATION is not available for multiple group analysis".
You need to use the KNOWNCLASS option and TYPE=MIXTURE instead of the GROUPING option when numerical integration is involved. If you have further questions on this topic, send them along with your license number to email@example.com.
Ela Polek posted on Friday, February 27, 2009 - 7:30 am
I run Multiple Group SEM in 6 groups with Mplus. I have to compare some specific coefficients across groups, (to test if the influence of some variables on the outcome variable differs across groups). I have already used model-test command in Mplus, which gives Wald Test, but this does not test invariance of specific coefficients across groups. I know that when comparing coefficients in 2 groups t-test can be used. What test should be used when comparing 6 groups? I will be more than thankful for any advice.
I suspect I have two groups but I'd like to prove it. I have run ESEM analyses separately on each of the two groups and they do not respond in the same way to my construct: some variables do not load on the same latent factors in the two groups. But I'd like to go back a step and show on my whole sample (n1 + n2) that the grouping variable has a significant effect on the construct. Could you please tell me how I can assess in Mplus whether the grouping variable has an additive effect on the outcome loadings and may also lead to some variables loading on some latent factors in one group but not in the other? Would the following syntax take into account the different possible effects of the grouping variable on the construct that I just mentioned?
MODEL: F1 BY u1-u7(*1); F2 BY u1-u7(*1); F1 with F2; u1-u7 ON Gp;
With F1 and F2, the two common continuous latent factors; u1 to u7, the ordinal outcomes variables; Gp, the binary grouping variable (coded in 0 and 1).
The model you have specified can determine differences in the intercepts only, not in the factor loadings. If you do a multiple group analysis where the factor loadings are held equal across groups as the default, you can look at modification indices to assess measurement invariance.
MODEL: F1 BY u1-u7x1(*1); F2 BY u1-u7x1(*1); F1-F2 ON x1; F1 with F2; u4 with u5;
2. The original model has 2 latent factors on which different outcomes (e.g., u1) load depending on the group (when I run the analysis separately on the 2 groups). To test the effect of the interaction variables (e.g., u1x1) on the model structure in the whole sample, should I still specify 2 latent factors, or more or fewer? The results show the original ordinal outcomes loading significantly on one factor and the interaction variables on the other… I'm not sure what this is showing.
The interaction between the covariate and the items does not get at factor loading invariance. The interaction between the covariate and the factor would do that. The best way to look at factor loading invariance is multiple group analysis. See Example 4 in the Version 5.1 Examples Addendum on the website with the user's guide. See also the Topic 1 course handout and video. Topics 1 and 2 will be taught at Johns Hopkins University in August.
I investigate mean level differences between three age groups in a measurement model with three correlated factors. In the next step I would like to test the between group ability differences in the same factors after controlling for general cognition for example. To do this I modelled a structural model in which those factors are regressed onto general cognition:
f1 on GenCOG; f2 on GenCOG; f3 on GenCOG;
I need to compare the means of the residuals of f1, f2, f3 between the groups if I want to test performance differences on those factors after controlling for GenCOG, right? Where in the Mplus output do I find those values? In the residual output I find only parameters for the indicators. TECH4 shows means of the latent variables (also of the endogenous variables), but when I look at those values, I don't think they are the means of the residuals of the endogenous factors. Exactly the same latent means are displayed that I found in the measurement model. But the exogenous variable explains about half of the variance, so the residual means of f1, f2 and f3 should change compared to the measurement model, I think. Could you please advise me?
You can do this using chi-square or loglikelihood difference testing of two nested models where one model has the parameter of interest free across groups and the other has the parameter constrained to be equal across groups. You can also use MODEL TEST.
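As a rough illustration of what a Wald-type test of equality across several groups computes, here is a minimal sketch. It assumes the group-specific estimates are independent, which need not hold once measurement parameters are constrained equal across groups, so it is only an approximation to what MODEL TEST would report from the full covariance matrix of the estimates. The function name, coefficients, and standard errors are hypothetical.

```python
def wald_equality(estimates, std_errors):
    """Wald statistic for H0: all coefficients equal, assuming independent estimates.

    Uses the precision-weighted mean as the common value; the statistic is
    referred to a chi-square distribution with (number of groups - 1) df.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    w_stat = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return w_stat, len(estimates) - 1


# Upper 5% critical values of the chi-square distribution for df = 1..5.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

# Hypothetical coefficients and standard errors for six groups.
b = [0.30, 0.25, 0.41, 0.10, 0.33, 0.28]
se = [0.08, 0.10, 0.09, 0.12, 0.07, 0.11]

w_stat, df = wald_equality(b, se)
print(f"Wald = {w_stat:.3f} on {df} df; reject at .05: {w_stat > CHI2_CRIT_05[df]}")
```

With two groups this reduces to the square of the familiar z-test for a difference between two independent coefficients.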
I am trying to see if the path analysis model below differs between males and females and am not sure what syntax to use to constrain the paths. I have been looking at syntax that people use but am not sure how to apply it to my model and/or whether I need to prepare my data differently to test for measurement invariance.
VARIABLE: NAMES ARE ID IDYRFAM sex SES ZSES alc2 cn0 gp1 bp1 cn2 gp2 bp2; USEVARIABLES ARE alc2 cn0 bp1 cn2 bp2; CLUSTER = IDYRFAM; ANALYSIS: TYPE = COMPLEX; MODEL: bp1 bp2 cn2 alc2 ON cn0; bp2 cn2 alc2 ON bp1; alc2 ON bp2; alc2 ON cn2; bp2 WITH cn2; OUTPUT: SAMPSTAT STANDARDIZED; standardized mod(3.84);
Chapter 13 has a section on Equalities in Multiple Group Analysis that should help you. There is a full discussion of the Mplus language for multiple group analysis in that chapter.
Linda posted on Monday, August 24, 2009 - 11:16 am
How does multi-sample SEM account for multiple comparisons when comparing models across multiple groups?
naT posted on Wednesday, August 26, 2009 - 1:41 pm
I am modelling path analysis model with all observed variables. However, my TECH1 tells me that there are no parameter specified in NU nor THETA matrices, but instead all are specified in ALPHA and PSI matrices. I am wondering whether I misspecified the model, or is this the default of the mplus? If I have misspecified the model, how can I fix this?
This is correct. There is no matrix in Mplus for observed regressed on observed so the observed variables are turned into latent variables that are identical to the observed variables. This does not in any way affect the results. It simply moves the parameters from one matrix to another.
naT posted on Wednesday, August 26, 2009 - 4:01 pm
I am using similar syntax as in the Mplus manual to estimate a SEM with constrained factor loadings across multiple groups, but separate model ON statements. But I get a message that the model didn't converge and factor scores were not computed. When I look at the parameter estimates, they don't look too huge, and there are no negative residual variances. I have 2 groups, and 3 continuous latent variables, and 15 factors. Can you send some sample syntax that would work?
I'm running a model and want to compare effects across developmental periods. I've run two models, one unconstrained, and one where I've constrained the paths of interest to be equivalent across developmental periods. Is there also a way to test if the factor loadings in the unconstrained model are significantly different across periods instead of running a separate model where the paths are constrained to be equal?
Hi, I fully understand why using nominal variables as mediators in a path/SEM is unacceptable. But suppose one were to include it as a set of dummies. *And* if separate analysis, using seemingly unrelated estimation, showed that the effects of "upstream" exogenous variables on these dummies, was statistically no different from the effects of these same exogenous variables on the corresponding nominal variable categories -- would the strategy then become defensible? Thanks. - Bobby
Following is an answer that was given to a similar question last week. It was found by searching on nominal.
"I don't think mediation via a nominal mediator m has been studied methodologically - but correct me if I am wrong. One possible direction to go would be to create a latent class variable c where the nominal categories of c are the same as the observed nominal variable categories of m (this is done via logit thresholds). c on x is then a multinomial logistic regression and the influence of c on y is captured by the means of y changing over the c categories (you don't say "y on c", but it has the same effect). This avoids the y on m regression which would treat m as continuous which would not make sense when m is nominal.
One can then explore if there is a need for direct effects y on x. But there isn't any guidance for how one should/could simply quantify how much of the x influence goes via m versus directly. Perhaps that isn't needed. This topic is a method research paper in itself - anyone?"
I am interested in running a multiple group analysis and constraining paths across three samples. However, one sample is missing two variables. I would therefore like to constrain all of the paths across all of the samples, except for the paths related to those two variables in that sample; for the two samples that have those two variables, I would like those paths constrained as well.
Please let me know if this is possible in Mplus and, if so, how I can do it.
I'm doing a multigroup comparison including children learning to read across two different orthographies.
I'm a new user of Mplus, but so far I understand that the procedure is to go step by step. First comparing (across groups) factor loadings, then intercepts, then factor variance/covariance etc.
I've also learned that if some of the steps show a significant difference across groups, then further comparison is meaningless.
My question concerns partial measurement invariance. In my study I have five latent variables made up of ten indicators (two indicators for each latent variable). A chi-square difference test tells me that my factor loadings differ significantly across groups. But does that mean that all loadings are significantly different? Or is there a place in the output showing which loadings differ?
Is it possible to continue comparing invariance across groups allowing some parameters to be free?
You can have partial measurement invariance if you model the non-invariance by allowing the non-invariant parameters to differ across groups. How much non-invariance you can have is debatable. You can see where the large differences are by looking at modification indices. See the Topic 1 video and course handout for a full description of measurement invariance.
QianLi Xue posted on Sunday, November 29, 2009 - 4:07 pm
Is it true that theoretically, multiple group CFA with categorical factor indicators can have same loadings but different thresholds across groups? The model will be identified as long as the scale factors are set to 1 across all groups.
Yes, I think that's true. Note that the scale factors depend on 3 things: loadings, factor variance, and item residual variance. If factor variances are different across groups, fixing scale factors at 1 in all groups would be inconsistent with that.
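To make the dependence on those three quantities concrete, here is a small sketch. It assumes the Delta parameterization, in which the scale factor is taken to be the inverse standard deviation of the latent response variable, so that 1/delta^2 = loading^2 * factor variance + residual variance. The function name and the numbers are illustrative only.

```python
import math


def scale_factor(loading, factor_var, resid_var):
    """Scale factor as the inverse SD of the latent response variable (assumed form)."""
    return 1.0 / math.sqrt(loading ** 2 * factor_var + resid_var)


# Same loading and residual variance in both groups, different factor variance.
delta_g1 = scale_factor(loading=0.8, factor_var=1.0, resid_var=0.36)
delta_g2 = scale_factor(loading=0.8, factor_var=2.0, resid_var=0.36)
print(f"group 1: {delta_g1:.3f}, group 2: {delta_g2:.3f}")
```

Under these assumptions the two scale factors cannot both equal 1 when the factor variances differ, which is the inconsistency described above.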
I'm doing a hierarchical regression analysis in two groups using Cholesky decomposition (because of indications of collinearity among the independent variables). I have established measurement as well as factor variance/covariance across groups. How do I compare regression coefficients across groups? I think this procedure is quite straightforward in ordinary SEM, but I'm getting confused by the decomposition framework.
Sally Czaja posted on Monday, December 14, 2009 - 11:10 am
I am predicting a person level latent variable outcome (achievement) using a cluster level factor (neighborhood poverty) by group (grp). I'm running the following analysis and am getting an error message (ERROR in MODEL command Parameters involving between-level variables are not allowed to vary across classes. Parameter: FB ON NEIGHPOV). Is there another way to estimate a model which allows between level variables to vary across classes? What would you suggest? Thank you.
Classes= c(2);
KNOWNCLASS = c(grp=0 grp=1);
WITHIN = female raceWb ageint1 poverty;
CLUSTER = census;
BETWEEN = neighpov;
ANALYSIS: TYPE= TWOLEVEL mixture;
MODEL:
%WITHIN%
%OVERALL%
fw BY ZgrdyrSp5 Zwratscr Zqutest;
fw ON female raceWb ageint1 poverty;
%c#1%
fw BY ZgrdyrSp5 Zwratscr Zqutest;
fw ON female raceWb ageint1 poverty;
%c#2%
fw BY ZgrdyrSp5 Zwratscr Zqutest;
fw ON female raceWb ageint1 poverty;
%BETWEEN%
%Overall%
fb by ZgrdyrSp5 Zwratscr Zqutest;
fb on neighpov;
ZgrdyrSp5 Zwratscr Zqutest @0;
%c#1%
fb by ZgrdyrSp5 Zwratscr Zqutest;
fb on neighpov;
ZgrdyrSp5 Zwratscr Zqutest @0;
%c#2%
fb by ZgrdyrSp5 Zwratscr Zqutest;
fb on neighpov;
ZgrdyrSp5 Zwratscr Zqutest @0;
I see that my first explanation was somewhat unclear. I meant "I have established measurement invariance as well as factor variance/covariance invariance across groups". And yes, it is a Cholesky decomposition of the independent factors.
It sounds like for your group comparisons of slopes on the factors you don't want to use the decomposed factors you got by Cholesky but the original ones. If so, you back-translate the slopes to the original factors using MODEL CONSTRAINT and do tests of invariance using MODEL TEST.
Hi, thank you so far! I think I include some more information
Below is the input for my separate hierarchical regressions (Cholesky decomposition). As I said in an earlier post I understand (hopefully) the procedure in how to compare SEM models when the predictor variables are included simultaneously. However, I’m not sure what to do when comparing hierarchical models. What should I include in the second model (Model Scan) so that I can set the baseline model and then proceed with the comparison from factor loadings to structural paths?
The "model scan" is the group-specific model (Group 1 is Eng, Group 2 is Scan). The output above is my setup for hierarchical regressions. Each of PH1-PH4 is one factor residualized on the preceding ones. That is, PH1 is equal to WR, PH2 is the residual of VOC after WR has been partialled out, PH3 is the residual of RAN after WR and VOC have been partialled out, and PH4 is the residual of PA after WR, VOC, and RAN have been partialled out.
I have done separate analysis for each group (Eng and Scan). My problem arises when I try to compare the structural paths for WR1 on PH1-PH4 across groups.
As I said, when doing standard regression this is quite straightforward. When doing hierarchical regression I'm not sure if it's even possible...
I don't see why a problem would arise here. First test that the measurement parameters are equal across groups, including the parameters of the PH* BY statements, and if that is not rejected, test if the structural parameters of WR1 ON PH1-PH4 are equal. Group-equality testing is covered in our Topic 1 course on the web.
I have data for three different grades who were assessed on the same instrument in the fall and spring of the academic year. In order to estimate appropriately scaled ability scores across time points (fall/spring) and grade (1,2,3), is it best to run two separate multiple group analyses (one for each time point) or to run a multiple group MIMIC model with time as the covariate? Thank you for any input!
I would first test measurement invariance across time for each grade. Once that is established, I would use multiple group analysis to test measurement invariance across grade.
leah lipsky posted on Friday, January 08, 2010 - 11:05 am
Hello, Can you tell me why I'm getting the same estimates & fit statistics regardless of which paths I constrain (trying to do multiple group path analysis)? For example, the 1st model below constrains all paths, and the 2nd I believe frees them all. Thank you!!
MODEL 1--ALL PATHS CONSTRAINED
VARIABLE: NAMES ARE id edu exfreq ageyrs wtchg2y gainer pcap pwtatt pseff yr1rtrn yr2rtrn return bmichg ploc fvint retain1y partot;
MISSING = ALL (999);
GROUPING IS retain1y (0=no 1=yes);
USEV ARE pseff ploc exfreq fvint wtchg2y;
CATEGORICAL = exfreq fvint;
MODEL:
exfreq on pseff ploc;
fvint on pseff ploc;
wtchg2y on pseff ploc;
pseff with ploc;
fvint with exfreq@0;
OUTPUT: standardized modindices(3.84);
MODEL 2- NO CONSTRAINTS
DATA: ...same as above...
MODEL:
exfreq on pseff ploc;
fvint on pseff ploc;
wtchg2y on pseff ploc;
pseff with ploc;
fvint with exfreq@0;
MODEL retainer:
exfreq on pseff ploc;
fvint on pseff ploc;
wtchg2y on pseff ploc;
pseff with ploc;
fvint with exfreq@0;
OUTPUT: standardized modindices(3.84);
The default in Mplus is for regression coefficients to be free across groups. So the models are the same. You can see this by looking at TECH1 or your model results. You need to constrain the parameters to be equal in the second model.
In my multiple group analysis (testing for metric invariance) I get the following message:
"THE MODEL ESTIMATION TERMINATED NORMALLY
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 324."
However, I don't have a parameter 324 - I checked TECH1. My model looks like this: y1 BY x1 TO x7; y2 BY x8 TO x13; y2 ON y1 x7;
I have four groups.
My file contains missing data which I specify by "missing = blank" and I use the "auxiliary" option. My groups are of different size. However, this was not a problem when establishing configural invariance.
Since I don't have a parameter 324 - what does the error message mean?
Go to the beginning of Technical 1 and search for 324. I have never heard of us reporting a parameter number that does not exist. If this does not help, please send the full output and your license number to firstname.lastname@example.org.
Additionally, I've estimated the above model testing for metric invariance without the "AUXILIARY" command, getting the following warning:
"THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.693D-17. PROBLEM INVOLVING PARAMETER 82. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE SAMPLE SIZE IN ONE OF THE GROUPS."
My groups are of sample size 42, 100, 72 and 158, respectively. If I'm not mistaken, I estimate fewer parameters than the sample size in each group. Moreover, the parameter that Mplus points to is the psi parameter for the endogenous latent variable in my second largest group. And I did not get this error message in the model testing for configural invariance, where more parameters had to be estimated!
I've checked my input file for the fifth time and have eventually discovered the reason for both problems mentioned above. I've specified the model correctly now and it works well. Sorry for bothering you ...
I have four groups in my SEM. Since subjects in all groups have missing values I have used the "AUXILIARY option" as follows:
AUXILIARY = (m) z1 z2 z3 z4 z5;
My models either did not converge or the SEs could not be estimated. Now I've run the model again without auxiliary variables and everything works fine (i.e., I can establish scalar invariance). How am I to interpret this result? I would have expected a different result: no convergence without auxiliary variables and convergence with auxiliary variables.
I would like to test structural invariance across three groups. To produce an unconstrained model (regression weights free across groups), should I fix all parameters (factor loadings, intercepts, and means@0) in my measurement model? Would it be better to fix only factor loadings and intercepts? What should I do with my covariances: fix them or leave them free? Thank you for your help!
The Mplus default is to hold the measurement parameters of factor loadings and intercepts equal across groups. The residual variances are not held equal. To compare structural parameters, measurement invariance is required.
See the end of the discussion of measurement invariance and population heterogeneity to see the models for testing the equality of the structural parameters.
I am running a fairly complicated SEM analysis. Just to give you an idea of the scope of it, here is the syntax:
MODEL:
abusemid on abuseearly;
momdrink by maxdrkBMt12 dkpropbmt1BM bingedkt1BM;
maxdrkBMt12 with bingedkt1BM;
daddrink by bingedkt1BF dkpropbmt1BF maxdrkBFt12;
bingedkt1BF with maxdrkBFt12;
daddrink with momdrink;
momdrink on FHg0;
daddrink on FHg0;
abuseearly on momdrink;
abuseearly on daddrink;
adolund by DELINt4 DELINt5 aggret4 aggret5;
earlyund by aggBFt1 aggBFt2 aggBMt1 aggBMt2 delBFt2 delBMt2;
school by TRFviii1t5 TRFviii2t5 TRFviii3t5 TRFviii4t5;
abuseearly on FHg0;
crisemft4 on FHg0 momdrink daddrink;
adolund on abusemid crisemft4;
school on abusemid crisemft4;
school on adolund;
adolund on earlyund tsex;
I have 329 in my sample - 89 are girls and 240 are boys. I would like to address sex differences in this model, but I feel like I do not have enough girls to do this with. Do you think there are enough girls to try to run the 2 group analysis?
At a minimum you need to have several more observations in a group than you have parameters in the group. If you meet this condition, you would need to do a Monte Carlo study to see if the sample size is large enough.
I have a question concerning the calculation of CFI in multi-group models. Little et al. (2007) refer to a paper by Widaman and Thompson (2003) who argue that "many applications of SEM require one to specify and estimate an appropriate null model" when one wishes to model variances or means.
Is such an altered null model used as the default in the calculation of CFI in Mplus 5.21 when specifying a grouping variable?
Thanks for your help, Maren
Little, T. D., Card, N. A., Slegers, D. W. & Ledford, E. C. (2007). Representing contextual effects in multiple-group MACS models. In T. D. Little, J. A. Bovaird & N. A. Card (Eds.), Modeling contextual effects in longitudinal studies (pp. 121-147). Mahwah, NJ: Lawrence Erlbaum Associates.
Widaman, K. F. & Thompson, J. S. (2003). On specifying the null model for incremental fit indices in structural equation modeling. Psychological Methods, 8, 16-37.
Widaman and Thompson (2003) describe the modified null model as follows: "First, an acceptable null model must represent covariances among manifest variables as null, or zero. Second, and the key distinction here, if any within-group and / or between-group constraints on estimates of manifest variable means or residual variances are invoked in any substantive models under consideration, these constraints must be included in an acceptable null model. These constraints on means and residual variances will typically be operationalized as constraints on the tau and theta matrices that are the only matrices with parameter estimates in the standard null model."
Is there a way to specify such an alternative baseline model in Mplus?
You can't change the Baseline model that Mplus uses. However, you can run two models, the baseline you want and your H0 model and do a difference test.
We do not fix the observed exogenous variable covariances to zero because the model is estimated conditioned on the observed exogenous variables. Their covariances are not fixed at zero during model estimation. By fixing them at zero in the baseline model, overall model fit depends on how highly the observed exogenous variables correlate in spite of the fact that these correlations are not H0 model parameters.
I've thought about running a difference test, too. However, if I did as you've suggested, wouldn't all goodness-of-fit indices (CFI, TLI, RMSEA, ...) for my H0 model still be calculated on the basis of the baseline model that Mplus uses as the default? If so, and if I want to estimate these fit indices using my own baseline model, could I calculate them by hand by plugging the chi-square values into the formulas? Thanks for your help!
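For reference, the hand calculation asked about here can be sketched as follows, using the standard textbook definitions of CFI and TLI with the chi-square and degrees of freedom of a user-specified baseline model. RMSEA is left out because it does not involve a baseline model. The function names and all chi-square values below are hypothetical, and no claim is made that this reproduces every detail of how Mplus computes these indices.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square values."""
    num = max(chi2_model - df_model, 0.0)
    den = max(chi2_model - df_model, chi2_base - df_base, 0.0)
    return 1.0 if den == 0.0 else 1.0 - num / den


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index from model and baseline chi-square values."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Hypothetical H0 model and alternative baseline model results.
chi2_h0, df_h0 = 85.3, 40
chi2_alt_base, df_alt_base = 900.0, 55

cfi_alt = cfi(chi2_h0, df_h0, chi2_alt_base, df_alt_base)
tli_alt = tli(chi2_h0, df_h0, chi2_alt_base, df_alt_base)
print(f"CFI = {cfi_alt:.3f}, TLI = {tli_alt:.3f}")
```

The baseline chi-square and degrees of freedom would come from a separate run in which the alternative null model is estimated as an ordinary H0 model.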
I am conducting a multi-group path analysis and am attempting to compare a model where all parameters are freely estimated to one in which the means are constrained to be equal across groups. I am obtaining the same model fit statistics and parameter estimates in both the freely estimated and constrained models, which does not seem possible. I set up my variables as latent constructs using a single indicator. Below is a portion of my input.
Freely estimated model: !LATENT VARIABLES MEANS (A = alpha) [p0_pos]; [p1_pos]; [p2_pos]; [p3_pos]; [t0_agg]; [t1_agg]; [t2_agg]; [t3_agg];
Model with means constrained to be equal:
!LATENT VARIABLES MEANS (A = alpha) [p0_pos](1); [p1_pos](2); [p2_pos](3); [p3_pos](4); [t0_agg](5); [t1_agg](6); [t2_agg](7); [t3_agg](8);
I would appreciate any suggestions on how to correct this issue. Thank you very much for your help.
It is not possible to answer your question without more information. Please send the two outputs and your license number to email@example.com.
Wu wenfeng posted on Monday, April 05, 2010 - 5:20 pm
Hello! I have read some articles about measurement invariance, and found that the procedures for using MACS to test multiple groups differed. I wonder, when testing latent mean equivalence, should item variance equivalence also be tested? And if so, should that test come before or after the latent mean equivalence test?
Latent variable means are not measurement parameters. They are structural parameters. See our Topic 1 course handout and video for a discussion of using multiple group analysis to test for measurement invariance and population heterogeneity.
Wu wenfeng posted on Tuesday, April 06, 2010 - 9:32 am
I have read the content you mentioned, but I am still confused. Anyway, thank you!
I am conducting a multiple group ESEM analysis of dichotomous data (2 groups) with a high number of cases in each group (170,000 and 50,000, respectively) and 41 variables. The fit indices of our analysis (CFI, RMSEA and TLI) indicate, when testing for measurement invariance (thresholds and loadings equal, scale factors 1 in one group and free in the other), that both groups have a similar structure. We conduct factor analyses in the first place in order to obtain factor values on which our further analyses are based.
Based on the factor values, which are comparable for the two groups in the invariant model, we would like to calculate Euclidean distances (of these factor values for each case across the two groups).
Is it legitimate to constrain the factor means in the invariant model to zero in both groups (contrary to the recommendation that, when holding thresholds and loadings constant, means in the second group should be estimated freely)? With factor means of 0 in both groups, factor values seem to be much more comparable in our case, and Euclidean distances would be calculated for standardized factors instead of subtracting standardized from unstandardized factor values, right?
I would highly appreciate your comments. Thanks, Pablo
The idea behind holding factor means invariant was to be able to actually compare factor values by calculating distances between the factor scores of one group and the scores of the other group per item.
Our goal is in fact to have such a proximity measure on which our further analyses are based.
Could you maybe specify what exactly you would not recommend: Holding factor means in this case invariant (although for non-invariant factor means distances between the factor values do not really make sense) or actually calculating distances of the factor scores at all (and if so, why)?
Your help is very much appreciated. Thanks in advance.
You need to first test if the factor means are equal across the groups. Only if that is not rejected would I work with factor scores from the model where you hold the means at zero in all groups.
Note that factor scores are comparable across groups even when factor means are different. The measurement invariance ensures that. So you could go ahead and calculate your factor score distances under our default model.
Thank you for your response. Which way to test for the equality of the factor means across the groups is recommended/sufficient?
When holding means in the reference group constant, while freeing them in the other group, the resulting estimated means (which as far as I understand are the mean differences in comparison to the reference group) are significant. This is not surprising since the dataset is quite comprehensive.
However, when comparing the model with means = 0 in the reference group and freely estimated means in the second group to a model with factor means fixed at zero in both groups, the change in the fit indices is quite small and the indices themselves are good.
Could I therefore assume that I can work with a model with factor means held at zero in both groups (due to the still good fit values)?
What I meant in paragraph 3 was: I am comparing model A (Thresholds and Factor Loadings constrained to be equal across groups; residual variances fixed at one in one group and free in the other; factor means fixed at zero in one group and free in the other group) to model B (Factor Loadings and Thresholds held equal in both groups AND Factor Means fixed at zero in BOTH groups).
The fit indices in model B are good and the difference to the fit indices in model A are rather small.
Can I therefore assume (referring to Dr. Muthen’s post from April 16, 2010 - 10:27 am), that factor means are equal across the groups (since the invariance model B with equal means shows good fit indices)?
I would like to specify a multiple group model in Mplus with the following constraints: factor loadings fixed to zero, intercepts invariant over groups, and unique factor covariances are freely estimated.
I have a g-factor model with seven indicators which I specified as follows for all four groups:
I get the following error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 14. (which is the theta parameter for variable g)
What I'm trying to do with this model is specifying the "acceptable null model" according to Widaman and Thompson (2003). For my case, this model has to have the following specifications: factor loadings fixed to zero, intercepts invariant over groups, unique factor covariances freely estimated.
I'm not sure whether the additional specifications for mean and variance (which I added in order to identify the model) actually alter the model's meaning.
Why are the estimates a little bit different if I use multigroup analysis (let's say for boys and girls) than if I use separate data for boys and girls and run exactly the same model (no constraints included)?
They should not be different. You may not have relaxed all of the default equality constraints in Mplus. If you can't see the problem, please send the relevant outputs and your license number to firstname.lastname@example.org.
Nadia posted on Wednesday, June 23, 2010 - 9:53 am
Hi, I am trying to run a simple logistic regression with a categorical variable as predictor. How do I run this? I have put in GROUPING IS ... but it still comes up with an error message: ERROR in ANALYSIS command ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE.
And if I try to do mixture, it tells me I don't have the mixture option.
Hi Linda, this is really driving me bonkers! I am trying this out on our institution's computer, which has the mixture add-on. I have tried TYPE=MIXTURE with KNOWNCLASS and I keep getting an error message: *** ERROR in Variable command CLASSES option not specified. Mixture analysis requires one categorical latent variable. But I don't want a categorical latent variable; all I want is a straightforward logistic regression with a categorical predictor. How can it be so complicated to do this!!
A logistic regression is shown in Example 3.5. If you want multiple group analysis also, you need to use the KNOWNCLASS option along with the CLASSES option and TYPE=MIXTURE. Example 8.8 shows the way to specify this.
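The combination named in this reply (CLASSES, KNOWNCLASS, TYPE=MIXTURE) can be sketched as follows. This is a minimal illustration under stated assumptions, not the User's Guide example itself: the file name mydata.dat and the variable names u (binary outcome), x (predictor), and g (observed group coded 0/1) are hypothetical.

```
TITLE:    logistic regression with two known groups (sketch)
DATA:     FILE = mydata.dat;              ! hypothetical file name
VARIABLE: NAMES = u x g;                  ! hypothetical variable names
          USEVARIABLES = u x;             ! g is used only to define the classes
          CATEGORICAL = u;                ! binary outcome, logistic regression
          CLASSES = cg (2);               ! one latent class per observed group
          KNOWNCLASS = cg (g = 0 g = 1);  ! class membership is known from g
ANALYSIS: TYPE = MIXTURE;
MODEL:    %OVERALL%
          u ON x;
          %cg#2%
          u ON x;                         ! lets the slope differ in the second group
```

The CLASSES statement is what the "CLASSES option not specified" error asks for: the categorical latent variable cg is only a device, because KNOWNCLASS ties each of its classes to an observed group.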
Prathiba posted on Tuesday, June 29, 2010 - 9:22 am
Dear Drs. Muthen: If I conduct a multigroup CFA with sample sizes N=550, 3261, and 2103, do you think the sample size disparity would cause inflation/deflation of any estimates? No parameters are constrained across groups.
I'm conducting a SEM including an interaction effect. I want to conduct MSEM with an interaction effect. As far as I know, Mplus does not provide a chi-square statistic when an interaction term is included in the model. How can I examine the group differences?
I have math outcome data at two time points (pretest and post test) for students in two conditions: Treatment and Control. Pretest and post test score measure 4 different aspects of math. Therefore I created a latent variable.
My question is are there significant differences between treatment and control group in math. To address that question my plan was to conduct multiple group analysis. However because of the small sample size (N = 78) I couldn't conduct the analysis. Is there another way to address my question? The only idea that comes to my mind is to save the factor scores and conduct an ANCOVA. Can you recommend some other ways to analyze my data?
If you can estimate a model and obtain factor scores, I'm not sure why you were unable to conduct the analysis.
Anna Nagy posted on Tuesday, August 10, 2010 - 1:15 pm
I was only able to conduct the CFA and create two latent variables measuring math at time 1 and time 2. Following that step I was planning to conduct the MGA, but the model blew up right at the configural invariance level. I blamed it on the small sample size.
Sonja Nonte posted on Monday, August 23, 2010 - 8:45 am
We are trying to test factorial invariance in a multigroup CFA (categorical data). We would like to specify the baseline model with free thresholds, factor loadings, and means. We've already found out that we have to fix the factor mean at 0 and the residual variances at 1 for identification. But our question is one step before that: how can we free the thresholds and factor loadings? Until now, we have the following statements:
VARIABLE: ... grouping is S1sex (1=girls 2=boys);
MODEL: SpoSeko by S1Sp1r S1Sp2r S1Sp3r S1Sp4r; SpoSeko@0; S1Sp1r@1; S1Sp2r@1; S1Sp3r@1; S1Sp4r@1;
And in the next step (equal thresholds and factor loadings), do we keep the restrictions concerning the mean and the residual variances? If we do not keep those, how can we still perform a diff test, even though we changed the baseline model?
See the Topic 2 course handout under multiple group analysis. Here the measurement invariance models are shown for the Delta parametrization. The only difference between this and the Theta parametrization is that scale factors are parameters in Delta and residual variances are parameters in Theta.
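A baseline model of the kind asked about might look like the sketch below in the Theta parametrization. It is an illustration, not the handout's input: it assumes the four items are binary (one threshold each, written $1) and reuses the variable and group names from the question. Note that means and thresholds are referred to in square brackets, so fixing the factor mean at zero is written [SpoSeko@0]; without brackets, SpoSeko@0 refers to the factor variance.

```
VARIABLE: ...
          CATEGORICAL = S1Sp1r S1Sp2r S1Sp3r S1Sp4r;
          GROUPING IS S1sex (1 = girls 2 = boys);
ANALYSIS: PARAMETERIZATION = THETA;
MODEL:    SpoSeko BY S1Sp1r S1Sp2r S1Sp3r S1Sp4r;
          [SpoSeko@0];                            ! factor mean fixed at zero
          S1Sp1r@1 S1Sp2r@1 S1Sp3r@1 S1Sp4r@1;    ! residual variances fixed at one
MODEL boys:
          SpoSeko BY S1Sp2r S1Sp3r S1Sp4r;        ! loadings free (first stays at 1)
          [S1Sp1r$1 S1Sp2r$1 S1Sp3r$1 S1Sp4r$1];  ! thresholds free; assumes binary items
          [SpoSeko@0];
          S1Sp1r@1 S1Sp2r@1 S1Sp3r@1 S1Sp4r@1;
```

Mentioning the loadings and thresholds in the group-specific MODEL boys statement is what relaxes the default across-group equalities. The first indicator is left out of the BY statement there so that its loading stays fixed at one to set the metric.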
The intercepts of the 1st order factors in their regression on the 2nd order factors need to be fixed at zero in both groups for identification.
Regan posted on Tuesday, September 14, 2010 - 12:48 am
Hello! A while ago, someone had this question:
"...I ran subsequent multiple groups analyses for each of the 3 race/ethnicities...The model fit was excellent for two of the groups, but unacceptable for the third group....Should I accept the omnibus model for the two races that have good fit and develop a different model for the third race?"
Dr. Linda Muthen's advice to him:
"...It does not make sense to put groups together if the same model does not fit the data well for each group.... Only then does it makes sense to combine the groups..."
1) I wanted to confirm that if in attempting to do a multiple-group path model, we first test the model in each group separately, and if you have good model fit in two groups and poor fit in one group, one should stop and just present a separate model for each group and not attempt the multiple-group approach? (If there may be a plausible and interesting reason as to the finding that the third model did not fit the data, can we also present this model?)
2) Am I correct that with a non-significant chi-2 diff test, your interpretation is that the H1 and Ho models are not significantly different from each other and it is okay to combine the data into one group--perhaps allowing for invariance in certain paths? (and that separate models are necessary if the chi-2 diff test IS significantly different)?
1. A first step in a multiple group analysis is to analyze each group separately. Only groups for which the same model fits well should be compared. That a different model fits well for one group can be of interest.
2. I don't know what you mean by combine into one group because if you do this, you cannot allow for invariance.
Regan posted on Tuesday, September 14, 2010 - 12:02 pm
Hello again, I was referring to your response to the gentleman above. When you say that:
'it does not make sense to put groups together if the same model does not fit the data well...'
By this, do you mean that we should not try to compare these groups?
If my model fits well for non-hispanic caucasians and non-hispanic african-americans for instance, but not for hispanics/latinos, I believe what I should do after having run separate models is just do a multiple group analysis with the caucasian and african-american groups and either explain the lack of fit in the hispanic group---or develop a separate model altogether for them. Is this correct understanding?
I have a good fitting omnibus structural equation model that includes three racial/ethnic groups (N=424). When I run the model that includes all participants, the fit indices are all good and I have no error messages. When I run the model separately for each group, the fit indices are still good, but I get the following error message: "THE MODEL ESTIMATION TERMINATED NORMALLY WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE EDU3."
My question is, is the Mplus output for each racial group interpretable, or does the error message negate interpretability?
This message means the model is not admissible. You probably have a negative residual variance or variance for edu3.
haxha posted on Saturday, October 30, 2010 - 11:26 pm
Dear Dr. Mullen. I am using Mplus to conduct a multi-group analysis. I have a question with regard to validating the model. I have a model that I created for a large sample of 800 women. I was told that to validate it I need to test this model on one half of the population and then test it again on the other half. I do this using the MLM estimator because my data are not normal regardless of the transformations I have undertaken. Most estimates are similar (the direction, significance, chi-square significance), but one parameter loses significance when I test the model in one half of the data. Should I respecify the model? Is it essential that all parameters are significant in all models tested? Also, since I can't use bootstrapping with MLM, is there any other simulation method I am able to use? Haxha
I think typically one randomly divides the sample as a first step and fits the model first in one sample and then in the other. If key parameters are not significant in both, the model may not be robust.
haxha posted on Sunday, October 31, 2010 - 10:18 am
Thanks so much Dr. Muthen. I apologize for a typo earlier. Just one more follow-up question if you don't mind. I have transformed the data but they are still not normal; I am using MLM, but I am doing so with the already transformed data. Is that OK? Or must I go back to using the data in their original form? Data in the original form are severely skewed. Also, is there any simulation method I can use with MLM instead of bootstrapping? Many many thanks! Haxha.
In general I would not transform variables. I would use the MLR estimator.
haxha posted on Sunday, October 31, 2010 - 11:15 am
Thank you so much!
Kai Savi posted on Thursday, November 04, 2010 - 12:51 pm
I am working on a multiple group analysis and am looking for MLR output so I can use chi-square difference testing. I know I can't do MLR with grouping, but my data is in two data sets. I can do a multigroup analysis, but not with MLR.
Because I am using two data sets, I do not have a single variable to use to differentiate classes. Is it possible to use KNOWNCLASS and get MLR output with two data sets?
You should be able to do this with MLR if you are using TYPE=GENERAL. Have you received an error message or are you just assuming this? If you have received an error message, please send your output and license number to email@example.com.
Kai Savi posted on Thursday, November 04, 2010 - 2:09 pm
I am assuming it, because I am not clear on how to describe KNOWNCLASS with two datasets (as opposed to a class variable). It seems like it should be simple enough, but I was not able to find anything in the manual on how to write that into the syntax.
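For the situation described here (TYPE=GENERAL, MLR, one data file per group), a minimal sketch could look like this. The file names, group labels, variable names, and the one-factor model are hypothetical placeholders, and the sketch assumes both files contain the same variables in the same order. With separate files the group label is given in the FILE option, so neither a grouping variable nor KNOWNCLASS is needed.

```
DATA:     FILE (grp1) IS data1.dat;    ! hypothetical file for the first group
          FILE (grp2) IS data2.dat;    ! hypothetical file for the second group
VARIABLE: NAMES = y1-y4;               ! hypothetical variable names
ANALYSIS: TYPE = GENERAL;
          ESTIMATOR = MLR;
MODEL:    f BY y1-y4;                  ! hypothetical measurement model
MODEL grp2:
          f BY y2-y4;                  ! example of relaxing the default equal loadings
```

The labels in parentheses (grp1, grp2) are the names used afterwards in group-specific MODEL statements.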
Hello, I ran a multi-group analysis with strong invariance that fits fine. When I try to test for measurement invariance (configural or weak) I receive the message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.929D-17. PROBLEM INVOLVING PARAMETER 34.
I checked parameter 34 (psi between two latent variables) but didn't see what the problem is. The model also runs fine when I try every group in a single model (only girls or only boys). I'm wondering because it doesn't make sense to me that both single models work well and a model with strong invariance shows a good fit too, while more liberal models don't work. Is there any explanation for this phenomenon? Thanks a lot.
Amy Tobler posted on Friday, November 19, 2010 - 1:04 pm
I am running a multi-group clustered path analysis. When the model is run I get the following warning:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.291D-15. PROBLEM INVOLVING PARAMETER 29.
THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.
My question is, does this mean that the chi-square values for model fit are not reliable as well or just the individual parameter standard errors?
Both chi-square and SEs are somewhat questionable in this case where you have more parameters than clusters. They could be fine, or they may be poor. It depends on several factors, including how many parameters refer to the between level - if few, you may be ok. Only a Monte Carlo simulation study could tell more.
I am running a multiple group analysis for a path analysis with multiple mediators using ML, and I am getting a negative chi square value. I know that this can happen with MLR, and that you can't interpret the test. What are your recommendations when this happens with a model using ML?
Sylia Wilson posted on Saturday, February 05, 2011 - 12:09 am
I am using multigroup models to compare across mothers and fathers in our sample. In order to fit the measurement model for our first latent construct (not yet testing invariance across groups), I need to constrain 1 of 4 indicators to be equal to 1, and the other 3 indicators to be equal to one another: Model: latent by obs1@1 obs2 (1) obs3 (1) obs4 (1);
I would now like to test invariance across mothers and fathers, but I'm not sure of the code for setting the loadings equal across groups, taking into consideration the constraints on the latent variable. What I would like to do is something like the following, where the * means equal across groups, but I know you cannot put 2 () in the same line. Model for mothers: latent by obs1@1 obs2 (1) (1*) obs3 (1) (2*) obs4 (1) (3*); Model for fathers: latent by obs1@1 obs2 (1) (1*) obs3 (1) (2*) obs4 (1) (3*);
Do you have any suggestions? Thank you very much for your time.
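One way to express both sets of constraints, following the rule stated earlier in this thread (an equality in the overall MODEL statement holds within and between groups, while an equality in a group-specific MODEL statement holds only within that group), is sketched below. The grouping variable name parent and its coding are hypothetical, and the two variants are alternatives to be run separately.

```
VARIABLE: ...
          GROUPING IS parent (1 = mothers 2 = fathers);  ! hypothetical coding

! Variant 1: the three loadings are equal to each other AND across groups
MODEL:    latent BY obs1@1
                    obs2 obs3 obs4 (1);

! Variant 2: add this to Variant 1 so that the common loading is still
! equal within each group but may differ between mothers and fathers
MODEL fathers:
          latent BY obs2 obs3 obs4 (2);
```

Each equality label sits alone on its line, in keeping with the one-parenthesis-per-line rule quoted above. Comparing the fit of the two variants gives a test of whether the common loading is invariant across mothers and fathers.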
I'd like to do a multigroup model with two groups. The first group should be composed of all observations with a value from 2 to 4 for the variable "SNB" and the other group should be composed of all observations with the value 1 for this variable "SNB". I tested several options for the grouping, such as:
Dear Dr. Muthen, I am running a multiple group analysis with two groups with very different sample sizes: n of group 1 = 213, n of group 2 = 70. For two of the paths, the unconstrained path coefficients are very different for the two groups. As an example: Group 1: beta = -.04 (p = .674); Group 2: beta = -.39 (p = .006). However, when I constrain all path coefficients for the two groups to be equal in order to test whether the paths differ between the groups, the constrained model is not significantly worse than the unconstrained model (p = .355). Is it possible that this nonsignificant difference is due to the unequal sample sizes? And if so, is there a way to circumvent the problem of the unequal sample sizes? I thank you very much for any advice you could give me! Veronique
Dear Linda, I'm trying to run a multiple group analysis with three groups and imputed data. My model contains a latent variable which is regressed on four other latent variables. Furthermore, I added three covariates (following the Mplus User's Guide example 5.14). The model shows good fit; however, the standard errors of the latent means for group 2 and group 3 seem to be extremely large. Running the model without the covariates leads to acceptable standard errors, so I assume there might be some problem with the covariates. Do you have any idea why the standard errors of the latent means increase when I add covariates? Thanks in advance!
Hi Linda, Is there a limit to the number of groups that Mplus can accommodate in a multiple group analysis? In the Mplus manual, I can only find examples involving 2 groups but saw mention of 6 groups in an earlier post in this topic. I am analyzing data from a study involving 8 groups and am wondering if I can include all 8 groups in the same analysis? Thanks! P.S. I am very excited to see that there will be a Mac version of Mplus at some point soon (I can stop spending money on Parallels and Windows at that point). Will those of us who switch have to buy a new license or will we be able to get the Mac version as part of our annual renewal?
I am conducting a multiple group analysis (by gender) on a model wherein we have separate hypothesized models for men and women. Essentially we have a theoretical model for men and theoretical model for women. Both models contain the same latent variables but paths are specified to be different between the genders.
What I would like to know is if it is possible in Mplus to empirically show that the male model fits the data better for males than it does for females and that the female model fits the data better for females than it does for males.
I'm trying to test a mediation model with latent variables in three religious groups in a large dataset (N = 10,000). I also want to control for the influence of another grouping variable (university), so I am using multigroup analysis as well as TYPE=COMPLEX. My mediation model runs perfectly in the whole group with the TYPE=COMPLEX statement. However, when I try to run the multigroup model I receive this warning: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 89. I do not know how to interpret this warning or how to solve this problem. Thanks for any advice! Kind regards
Thanks for your quick reply. A colleague noticed in the output that one item has residual variances above 1. So, I will rerun the analyses without this item and see whether the model is identified. Thanks anyway! Kind regards, Jessie
I did a single-group analysis as the first step of a multiple group analysis (structural equation model with 4 latent factors). Results indicated that in the second group one observed indicator of a latent factor had to be removed to get a good model fit. This probably indicates that the measurement models of this factor differ, right? My problem is now: 1) How should I proceed if I want to test moderation in both groups? 2) May I just continue with the MGA and leave this observed indicator out? 3) ai8 is the observed indicator which has to be removed in the second group (male), and if I try something like:
WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IN GROUP M.M. IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE AI8.
Sorry, I saw only now that ai8@0 fixes the variance at 0. That's not the solution for my problem. My question is how to analyse moderation effects if the factor structure differs in the two groups because of one observed indicator.
Deleting an item is done by not including it on the USEV list. But if you have a model where in one group an item makes it not fit the data, then multiple-group invariance for the remaining items seems unlikely. If you want to model moderation of structural relations by group membership, you need measurement invariance.
Professors Muthen, I am trying to conduct a multiple group (4 groups) SEM with seven latent variables and one single indicator. This model also initially had two higher order factors.
First, I conducted four single-group CFAs. The model fit was good for three of the four groups. There were two errors for the last group, one for an observed variable and the other for a second-order latent factor; both, I believe, were related to linear dependence.
When I removed the observed variable, I received only one non-positive definite error. I removed the second-order factor and repeated the analysis in all four groups, and the model fit was acceptable for all. Can you please help me understand why this helped, since it's basically the same model?
Also, in testing configural invariance, I have two questions about single indicators and second-order factors. Are single indicators as well as the first first-order factor omitted in the group-specific model statements?
Hi, I need your help. I want to model the survival of birds as a function of weight and sex using logistic regression. The probability of survival for females is given by pf(x) = exp(b0+b1x)/(1+exp(b0+b1x)) and the probability for males by pm(x) = pf(x+b2/b1). Can someone help me verify this? How do I verify it?
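The relation between the two curves in this question can be checked algebraically. The short derivation below uses the symbols from the post (weight x, coefficients b0, b1, b2) and assumes b1 is nonzero.

```latex
\begin{align*}
p_f(x) &= \frac{\exp(b_0 + b_1 x)}{1 + \exp(b_0 + b_1 x)}
  \quad\Longleftrightarrow\quad
  \operatorname{logit} p_f(x) = b_0 + b_1 x, \\[4pt]
p_m(x) &= p_f\!\left(x + \tfrac{b_2}{b_1}\right)
  \quad\Longrightarrow\quad
  \operatorname{logit} p_m(x)
  = b_0 + b_1\!\left(x + \tfrac{b_2}{b_1}\right)
  = b_0 + b_1 x + b_2 .
\end{align*}
```

So the stated form is equivalent to an ordinary logistic regression of survival on weight and a male dummy with coefficient b2 and no weight-by-sex interaction: the male curve is the female curve shifted horizontally by b2/b1. One way to check it against data is therefore to add the interaction term and see whether it is needed.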
I am conducting Multi-Group SEM to do cross-cultural research for my dissertation. I tested my final model using MPLUS and I could obtain fine fit indexes.
Measurement fixed, Structure fixed (Strict invariance)(using ONLY Correlation matrix and STD as the dataset) CFI = 0.93, TLI = 0.90, SRMR = 0.09, SRMR = 0.09
However, I could not figure out how to compare factor means using this correlation matrix as the data set.
Therefore, I tried to add mean (means of observed variables (Ys)) in this data set. Syntax: type is correlation and std and means
And I got pretty different results. All syntax is same except including means in the data set. CFI = 0.85, TLI = 0.81, RMSEA = 0.13, SRMR = 0.10
(1) Why is it different? and Is it OKAY to use only correlation matrix to conduct multi-group SEM?
(2) When I used raw data (or correlations, means, and std), I found that one group has consistently higher observed means (Y's) than the other cultural group, so I think it is probably weak invariance. However, everything else is very similar (the structure is similar); only the observed means differ by group. Can I use multi-group SEM?
Hello, and thank you as always for your support. I have a question about multiple group comparison. I would like to test for the moderating effect of a drug crime (drug vs. non-drug offenders). I am especially interested in how a drug crime conditions the effect of being a racial minority on sentence length in courts. I wonder if it is OK for me to impose constraints on one or two related variables of interest and do the chi-square difference tests against the unconstrained model. Some of my old material told me that I have to do a structural invariance test (?) first, which is the chi-square difference test between the unconstrained model and the fully constrained model, and then, only in situations where there is no statistical difference in the chi-square test, I can proceed to the path-by-path test using the chi-square difference test, just like the former approach. Which one is correct? If the latter approach is correct, then I wonder why we do the path-by-path test even though we do not find any difference when we constrain all the paths. I am confused. So, my question is: do we need to do a structural invariance test even though I am doing just a path analysis with only observed variables, not SEM?
Dear Dr. Muthen, I really appreciate your comments. So, does it mean that it is okay to use the correlation matrix and std for multigroup SEM? What is the limitation when using the correlation matrix and std?
I have one more question. When I use raw data for multigroup SEM, I get decent fit indexes for each cultural group when I conduct the CFA for each cultural group (measurement model). When I conducted the multigroup SEM, the fit indexes were low, so I released some of the intercepts (observed means) for one group and then I got a good result.
MODEL : F1 BY Y2* Y5 Y11 Y12; F2 BY Y4 y7 Y10; F3 BY Y3 Y6 Y9; F1@1;
F2 on F1; F3 on F1; F2 with F3; Y17 on F1(4); Y17 on F2(5); Y17 on F3(6);
Y7 with Y10; Y7 with Y3; Y6 with Y4; Y6 with Y3; Y10 with Y3;
Model G: [Y7 Y9 Y12];
-->CFI 0.933 TLI 0.911 RMSEA 0.088 SRMR 0.096
Is this partial measurement invariance? I can't compare factor means, right?
I would not use only the correlations and standard deviations. I would use also the means which is the default with raw data.
Please see multiple group analysis in the Topic 1 course handout on the website. It discusses testing for measurement invariance in addition to testing of factor means across groups. There is also a video that you can watch.
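For readers working from summary data rather than raw data, a DATA command that also reads the means might look like the sketch below. The file name and sample sizes are hypothetical, and the layout described in the comments (means, then standard deviations, then correlations, with the groups stacked one after the other in a single file) is stated here as an assumption to check against the User's Guide.

```
DATA:     FILE = summary.dat;                     ! hypothetical file name
          TYPE = MEANS STDEVIATIONS CORRELATION;  ! means, SDs, then correlations
          NGROUPS = 2;                            ! two groups stacked in one file
          NOBSERVATIONS = 250 300;                ! hypothetical group sample sizes
```

With summary data there is no grouping variable, so the groups are referred to by the labels g1 and g2 in group-specific MODEL statements, for example MODEL g2:.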
peter pitt posted on Friday, July 01, 2011 - 1:26 pm
I have some questions with respect to the factor variances in multigroup EFA (ESEM). (a) Suppose that the variables are standardized per group and that I didn’t constrain the factor variances to be equal, for example to one (but instead I constrained some loadings to one to solve the identification problems), are these factor variances then subject to any constraint (e.g., the sum of factor variance of a factor in group A + factor variance of the same factor in group B = 1)? What would be the influence of standardizing the concatenated data instead of standardizing for each group separately? (b) Is it possible to find a solution with the same factor loadings, but with factors that have different factor variances in each group (and if so, what does this mean then)?
(a) You don't want to standardize variables in a multi-group analysis because then you cannot study group diffs in means and variances.
(b) Multi-group ESEM has the default of group-invariant loadings and intercepts and group-varying factor variances and means. The goal of multi-group analysis is to be able to study population (people) diffs in factors when measurement (variable) par's are the same.
I would like to compare standardized paths in a multi-group analysis. My model is the following (three latent dependent variables and three exogenous manifest variables):
SR BY y1-y5; SW BY y6-y10; SC BY y11-y15; SR ON A B C; SW ON A B C; SC ON A B C; A with B C; C with B; SR with SW SC; SC with SW;
I have two groups and want to compare the standardized path from C to SC across these two groups. I don't know how to create the standardized coefficients in MODEL CONSTRAINT. A similar question was posted on Wednesday, August 04, 2010 - 11:34 am by Simon O. F. at http://www.statmodel.com/discussion/messages/11/16.html?1309783320
I think I can compare the two standardized paths with this equation: beta_CSC1 = beta_CSC2*(sqrt(sdC2)/sqrt(sdSC2))/(sqrt(sdC1)/sqrt(sdSC1)). But how can I define beta_CSC2 and beta_CSC1 as well as the variance of SC as NEW parameters in MODEL CONSTRAINT and test the difference using MODEL TEST?
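One possible way to set this up is sketched below. It is an illustration under assumptions, not a reply from the thread: the group labels g1 and g2 and all parameter labels are hypothetical, and the group labels must match those in the GROUPING option. The sketch labels the parameters that enter the total variance of SC in each group, builds that variance in MODEL CONSTRAINT, and defines the standardized C to SC path as the raw coefficient times SD(C) divided by SD(SC).

```
MODEL:      SR BY y1-y5;
            SW BY y6-y10;
            SC BY y11-y15;
            SR ON A B C;
            SW ON A B C;
            SC ON A B C;
            A WITH B C;
            C WITH B;
            SR WITH SW SC;
            SC WITH SW;
MODEL g1:   SC ON A (ba1);       ! raw slopes of SC
            SC ON B (bb1);
            SC ON C (bc1);
            A (va1);             ! variances of the covariates
            B (vb1);
            C (vc1);
            A WITH B (cab1);     ! covariances of the covariates
            A WITH C (cac1);
            B WITH C (cbc1);
            SC (rsc1);           ! residual variance of SC
MODEL g2:   SC ON A (ba2);
            SC ON B (bb2);
            SC ON C (bc2);
            A (va2);
            B (vb2);
            C (vc2);
            A WITH B (cab2);
            A WITH C (cac2);
            B WITH C (cbc2);
            SC (rsc2);
MODEL CONSTRAINT:
            NEW(vsc1 vsc2 std1 std2 diff);
            ! total variance of SC = explained part + residual variance
            vsc1 = ba1**2*va1 + bb1**2*vb1 + bc1**2*vc1
                 + 2*ba1*bb1*cab1 + 2*ba1*bc1*cac1
                 + 2*bb1*bc1*cbc1 + rsc1;
            vsc2 = ba2**2*va2 + bb2**2*vb2 + bc2**2*vc2
                 + 2*ba2*bb2*cab2 + 2*ba2*bc2*cac2
                 + 2*bb2*bc2*cbc2 + rsc2;
            std1 = bc1*SQRT(vc1)/SQRT(vsc1);   ! standardized C -> SC, group 1
            std2 = bc2*SQRT(vc2)/SQRT(vsc2);   ! standardized C -> SC, group 2
            diff = std1 - std2;
```

The output then reports an estimate and standard error for diff, which gives a test of the difference between the two standardized paths.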
Thank you Drs. Muthen & Muthen for taking the time to answer our questions.
I was wondering if someone could please explain to me the difference between fitting a full SEM where we do not specify that there are two different groups, as opposed to a model where we specify GROUP IS but constrain parameters to be equal.
As an example, for a study I am working on, I am performing two multiple group analyses. I first fit a full SEM. I then performed a Multiple Group Analysis comparing the parameters between two groups of teachers based on their teaching experience. The parameters of the constrained model differ from the parameters obtained in the full SEM. I then performed another Multiple Group Analysis comparing the parameters between two groups of teachers based on school level. I found that parameters of the constrained model also differed from the full SEM as well as from the constrained model of the first Multiple Group Analysis.
Is this supposed to happen? If so, I also wonder why some journal articles I have read do not report the results of their constrained models.
Looking at the sample of males and females together is not the same as a multiple group analysis where the coefficients are held equal between males and females. The first analysis is a mixture. See the following paper which is available on the website for further information:
Muthén, B. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54:4, 557-585.
Siran Zhan posted on Sunday, October 09, 2011 - 11:26 pm
Dear Dr. Muthen,
I'm trying to establish measurement equivalence between 2 groups. The problem I'm facing is that one of the groups has no data at all on one of the manifest variables. Mplus did not run my script because of that and returned the error message "One or more variables in the data set have no non-missing values". Is there a way you can suggest for me to get around this?
I'm interested in comparing whether the path coefficients are different across groups. My model consisted of 2 latent variables and 3 observed variables. I came across some articles stating that the comparison would be more meaningful if measurement invariance can be established before comparing the path coefficients. Is the measurement invariance necessary? If yes, how could I establish the measurement invariance where some of my variables are observed variables?
Measurement invariance applies to latent variables not observed variables. It is necessary to establish that the latent variables have the same meaning in different groups if comparisons are going to be made across groups. See the Topic 1 course handout on the website under Multiple Group Analysis.
I don't know why the error terms would be constrained with what you show. Please see the Topic 1 course handout under multiple group analysis for the inputs for testing measurement invariance. If these don't help, please send your output and license number to firstname.lastname@example.org.
ellen posted on Monday, November 07, 2011 - 11:37 pm
Dear Drs. Muthen, I have a question about how to test invariance of path coefficients for structural paths using Mplus. I am not testing measurement invariance. I am comparing SEM models for 3 groups, and see some paths seem to be comparable across groups. I read something about examining the “completely standardized common metric solution” in LISREL. However, I use Mplus. What section of an Mplus output would suggest significant group differences are present for some specific relationships (e.g., between A & B). Here is what I read: “We used LISREL to examine the invariance of path coefficients for structural paths in the SEM model by conducting multiple-group comparison for boys and girls. To compare the two models, we conducted a model in which the relations among A, B, C, D variables were freely estimated and a model in which the relations were set to be equal for boys and girls. We then used the chi-square difference test to examine whether these models were equivalent. Results showed there was a significant chi-square difference... Examination of the completely standardized common metric solution suggested that significant group differences were present for the relationship between A and B ... To confirm this, we compared a model in which the relationships among A, B, C, D, E were set to be equal for boys and girls with a model in which A and B were freely estimated. There was a significant chi-square difference between the models..."
They did a chi-square difference test where they estimated two models. One where regression coefficients were free across groups, for example,
MODEL: y1 ON x1; y2 ON x2;
and one where they were constrained to be equal across groups, for example,
MODEL: y1 ON x1 (1); y2 ON x2 (2);
Then they did a chi-square difference test as described on pages 434-435 of the Mplus User's Guide.
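The arithmetic behind such a difference test is short enough to write out. In the notation below (introduced here for illustration), subscript 0 denotes the constrained (nested) model and subscript 1 the model with the coefficients free across groups; T is the chi-square value, d the degrees of freedom, and c the scaling correction factor printed in the output for estimators such as MLM and MLR.

```latex
\begin{align*}
\text{ML:}\qquad
  \Delta T &= T_0 - T_1, \qquad \Delta d = d_0 - d_1, \\[6pt]
\text{MLM / MLR:}\qquad
  c_d &= \frac{d_0\,c_0 - d_1\,c_1}{d_0 - d_1}, \qquad
  \Delta T_{\text{scaled}} = \frac{T_0\,c_0 - T_1\,c_1}{c_d}.
\end{align*}
```

In both cases the statistic is referred to a chi-square distribution with d_0 - d_1 degrees of freedom. In the two-coefficient example above with two groups, that difference is 2; with three groups it would be 4.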
ellen posted on Tuesday, November 08, 2011 - 10:26 am
Thanks for your prompt response! I know how to conduct a difference test for two models, but my question is more about how to find a "justification" in the Mplus output for suspecting that some (but not all) path coefficients may be equivalent across groups. The article I described above (Nov. 7) uses the “completely standardized common metric solution” in LISREL to justify testing a model where only some paths were set to be equal while other paths were freely estimated across groups. I am wondering whether the Mplus output offers a comparable solution that would justify setting certain paths equal, rather than relying on my subjective view.
Please help! Thanks so much!
ellen posted on Wednesday, November 09, 2011 - 1:42 pm
Dear Drs. Muthen,
Could you respond to my question (posted above; 11/8) and restated below?
How can I find a "justification" in the Mplus output for suspecting that some (but not all) path coefficients may be equivalent across groups? Some researchers use the “completely standardized common metric solution” in LISREL to justify testing a model where only some paths were set to be equal while other paths were freely estimated across groups. I am wondering whether the Mplus output offers a comparable solution that would justify setting certain paths equal?
I am unclear what "completely standardized common metric solution” means. If it means you are comparing standardized coefficients across groups, I would not recommend this. I would compare raw coefficients. You should have a theory about which coefficients you expect to be different across groups. If you are in an exploratory setting, I would hold all raw coefficients equal across groups and look at their modification indices.
ellen posted on Wednesday, November 09, 2011 - 9:04 pm
Thanks! Could I ask 2 more questions? (sorry new to Mplus!) How do I "hold all raw coefficients equal across groups"? Is below the right way to write the commands?
Also, how do I interpret "M.I." and "E.P.C."? I read the User's Guide (pp. 646-647) but still don't understand it... ......
GROUPING = race ( 1 = Black 2 = Asian 3 = White);
ANALYSIS: ESTIMATOR = MLR ; MODEL: A by A1 A2 A3 ; T by T1 T2 T3 ; O by O1 O2 O3 ; S by S1 S2 S3 ;
I would concentrate on modification indices. The value given is the decrease in chi-square if the equality is removed. The value 3.84 is the chi-square critical value at the .05 level for one degree of freedom. If an MI exceeds this value, removing the equality would improve fit significantly, meaning that the coefficients are not equal across groups.
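The screening rule described here can be sketched in a few lines of Python. The labels and MI values below are invented for illustration, and the helper name is hypothetical; in practice the values would be read off the modification index section of the output.

```python
CRITICAL_1DF = 3.84  # chi-square critical value at alpha = .05 with 1 df

def flag_equalities(mod_indices, critical=CRITICAL_1DF):
    """Return labels whose MI exceeds the critical value, largest MI first."""
    flagged = [(label, mi) for label, mi in mod_indices.items() if mi > critical]
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in flagged]

# Hypothetical modification indices for paths held equal across groups.
mi_values = {
    "T ON A (Asian)": 2.10,
    "O ON T (Black)": 6.75,
    "S ON O (White)": 4.02,
}

for label in flag_equalities(mi_values):
    print(f"Consider freeing: {label} (MI = {mi_values[label]:.2f})")
```

Sorting by size is a design choice: freeing the constraint with the largest MI first and re-estimating is safer than freeing everything flagged at once, because MIs change after each modification.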
ellen posted on Thursday, November 10, 2011 - 10:58 pm
Dr. Muthen, Thank you! This is helpful! May I ask a couple follow-up questions:
There was a path (e.g., A->T) that was significant (p< .01) only in the Asian group, but not in the Black or White groups when estimated freely. However, when it was fixed to be equal across groups, the M.I. was not greater than 3.84? Does that mean we can consider this specific coefficient equal across groups? ... this does not seem to make sense-- because when estimated freely, it was significant at p< .01 in one group, while in the other two groups it was not significant. How to explain this?
Also, could you tell me how to interpret an "E.P.C."? I read the user guide but still am confused...
(Thanks SO MUCH! ...& sorry about the basic questions.)
You are looking at two different types of tests. One coefficient can be significantly different from zero and the other not even though the two coefficients may not be significantly different from each other.
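A small numeric illustration may help. The sketch below uses made-up estimates and standard errors, and assumes the two group estimates are independent (so the variance of their difference is the sum of the squared standard errors). It shows one coefficient that differs significantly from zero, one that does not, and a difference between them that is not significant.

```python
import math

def z_vs_zero(b, se):
    """z statistic for testing a single coefficient against zero."""
    return b / se

def z_difference(b1, se1, b2, se2):
    """z statistic for the difference between two independent estimates."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical estimates for the same path in two groups.
b_a, se_a = 0.30, 0.10   # group A
b_b, se_b = 0.12, 0.10   # group B

print("A vs 0:", round(two_sided_p(z_vs_zero(b_a, se_a)), 4))
print("B vs 0:", round(two_sided_p(z_vs_zero(b_b, se_b)), 4))
print("A vs B:", round(two_sided_p(z_difference(b_a, se_a, b_b, se_b)), 4))
```

The standard error of the difference is larger than either individual standard error, which is why a pattern of "significant in one group, not in the other" does not by itself imply a significant group difference.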
EPC is the value the parameter would take if it is free.
Jiyeon So posted on Tuesday, November 22, 2011 - 11:40 pm
I have a hypothesis that predicts: the model will receive stronger support from the sexually active group (group A) than from the sexually inactive group (group B).
To test this hypothesis, I think I should compare model fit across the two groups. However, since the model is the same (and only the sample is different), the chi-square difference test does not apply here since it is only for nested models.
Is there some sort of significance test for comparing model fit across two groups? I understand this may not be a specific Mplus question but I'm using Mplus and find this board very helpful. Please advise me what to do! I would really appreciate it!!
I am testing a model for measurement invariance by gender, and receive the following error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 43.
When I run the analyses for the entire sample everything goes smoothly. The error only comes up when I attempt a multi-group analyses. My syntax is below, and parameter 43 corresponds to the Alpha value for zmomhrs in the female sample. Can you advise on what might be causing this error and how I might be able to work around it? Thanks so much!
Variable: (variables removed) MISSING = .; GROUPING = childgender (1=male 2=female) Analysis: Model NOCOVARIANCES; Model: f1 by; f1 on zfamilyinc@1 zedusum; f1@0; f2 by zroleambig_r zdeclat_r; f3 by zchildh_r zmchildh_r zbmi_r; zmompa on f2; zmompa on zmomhrs; zmompa on f1; zmchildpa on zmompa (1); zmchildpa on f1; f3 on zmchildpa (2); f3 on f1; f2 on f1; zmomhrs on f1; f2 with zmomhrs; Output: stdyx tech1;
Steven John posted on Wednesday, January 18, 2012 - 5:09 am
I'm currently doing an MGA comparing the correlation between A and B for two primary school grades. The correlation between A and B is different, but significant, in the separate analyses I ran earlier for both grades. I now want to compare the correlations to see if the difference between grades is significant. To me it seems appropriate to run a totally relaxed MGA with both grades and then another where the correlation under investigation is constrained to be equal. Thereafter I use the chi-square difference test for nested models to calculate whether the models differ. Am I correct?
You can do this. Or you can do it in one step using MODEL TEST. See the user's guide for further information.
Steven John posted on Thursday, January 19, 2012 - 1:03 am
Thanks! However, I got a message in the output and therefore calculated the TRd according to the formula on the Mplus website. (Probably because I run the MLR estimator?)
If I run the model totally constrained and thereafter relax the correlation under investigation, would this also be correct? This seems to be the common way of doing it? However, it seems a bit strange to impose equal factor loadings across grades instead of assuming that they theoretically measure the same construct (factor loadings vary across grades but follow the same pattern). The totally constrained model also fits the data poorly.
Holding factor loadings and intercepts equal represents measurement invariance. You must first establish measurement invariance before coefficients related to latent variables can be compared across groups. See the Topic 1 course handout and video for a discussion of this topic.
I want to check that I am using the correct information from the output to do the computation.
In the following formula... cd = (d0 * c0 - d1*c1)/(d0 - d1)
...I took the degrees of freedom (d0 and d1) in the "Chi-Square Test of Model Fit" section from each output. Is that correct? (There is another section in the output called "Chi-Square Test of Model Fit for the Baseline Model" and I want to be sure which values to use.)
Then, for the correction factor (c0 and c1), I used the following : in the "Loglikelihood" section, "H0 Scaling Correction Factor for MLR". Is that correct?
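For reference, here is a minimal Python sketch of the chi-square version of the scaled difference test, built around the cd formula quoted above. It assumes T0, c0, d0 are the scaled chi-square, scaling correction factor, and degrees of freedom of the nested (more restrictive) model, that T1, c1, d1 belong to the comparison model, and that each correction factor is the one reported alongside the chi-square value it is paired with. The function name and numbers are hypothetical.

```python
def scaled_chi2_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference test.

    t0, c0, d0: scaled chi-square, scaling correction factor, df (nested model)
    t1, c1, d1: the same quantities for the comparison (less restrictive) model
    Returns (cd, TRd, df of the difference).
    """
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # difference test scaling correction
    trd = (t0 * c0 - t1 * c1) / cd         # scaled chi-square difference
    return cd, trd, d0 - d1

# Hypothetical values from two runs.
cd, trd, ddf = scaled_chi2_difference(120.5, 1.20, 50, 105.3, 1.15, 46)
print(f"cd = {cd:.3f}, TRd = {trd:.3f}, df = {ddf}")
```

TRd is then referred to a chi-square distribution with d0 - d1 degrees of freedom.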
I am a novice user of multiple group analysis. I did all the analyses, but I do not know how to report and interpret the results for the manuscript, or which model (unconstrained, structural weights, structural covariances, or structural residuals) I should use for reporting.
Please see multiple group analysis in the Topic 1 course handout and video on the website. See also how results are reported in the journal you plan on submitting to. If this does not help, I suggest posting this on a general discussion forum like SEMNET.
Hello, I have a question about a negative residual variance. It is very small and not significant, so I want to fix it at zero. That is not a problem, but now I don't have the correct degrees of freedom. I cannot find any reference to this anywhere. Is there a way that I can make Mplus only estimate positive residual variances so that my degrees of freedom are correct?
Hi, We are trying to run a multiple group version of a Cole & Maxwell Trait-State-Occasion model. When we run it in our entire sample, the model converges just fine. When we try to run a multiple group, metric invariant version of the model we get a message saying that standard errors could not be computed and the model may not be identified and the problem appears to be with a parameter related to the mean structure. We believe this may be due to the fact that the occasion factors in the Cole & Maxwell model are actually the residual variances in the State factors (after accounting for the Trait factor) and therefore are not independent of the state and trait factors but that Mplus is trying to estimate group differences on all three sets of factors (trait, state and occasion) as if they were independent parameters. We are not certain though. Does this make sense? If so, any thoughts on how we fix the identification problem in the multiple group model?
There is no limit except your computer's. I have done 34.
ellen posted on Monday, September 10, 2012 - 6:58 pm
If I want to test whether two parameters are equal across three groups, is it accurate to write the Mplus syntax in the following way? (I already know that the overall multigroup chi-square difference test is significant.)
MODEL African: Sg ON De (p1); Rm WITH Ot (p2) ;
MODEL Asian: Sg ON De (p3); Rm WITH Ot (p4);
MODEL Hispanic: Sg ON De (p5); Rm WITH Ot (p6);
MODEL TEST: p1 = p3; p1= p5; p2=p4; p2 = p6;
Is this the correct way to test whether the parameters of "Sg ON De" and "Rm WITH Ot" are equal across the 3 groups?
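To make concrete what a joint Wald test of equality across three groups computes, here is a rough Python sketch for a single parameter (for example, the three "Sg ON De" estimates). It is an illustration under stated assumptions, not a reproduction of Mplus internals: the three estimates are treated as independent, the function name is hypothetical, and the numbers are invented. The MODEL TEST shown above tests four restrictions at once, so its test would have 4 degrees of freedom rather than 2.

```python
import math

def wald_equal_across_three(b, se):
    """Wald test of b[0] = b[1] and b[0] = b[2] for independent estimates.

    Returns (W, df, p). With df = 2 the chi-square upper tail is exp(-W / 2).
    """
    v1, v2, v3 = (s * s for s in se)
    r1, r2 = b[0] - b[1], b[0] - b[2]          # the two contrasts
    # Covariance matrix of the contrasts: both share the first estimate.
    s11, s12, s22 = v1 + v2, v1, v1 + v3
    det = s11 * s22 - s12 * s12
    w = (s22 * r1 * r1 - 2.0 * s12 * r1 * r2 + s11 * r2 * r2) / det
    return w, 2, math.exp(-w / 2.0)

# Hypothetical estimates and standard errors for three groups.
w, df, p = wald_equal_across_three([0.50, 0.30, 0.10], [0.10, 0.10, 0.10])
print(f"Wald = {w:.3f}, df = {df}, p = {p:.4f}")
```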
Hello, I am running a multiple group (male/female) SEM model with categorical indicators. I freely estimated the latent means in both groups (by fixing one threshold in the indicators to 0) but I am not sure which metric these latent means now have and how to interpret them. Some means in both groups are negative (e.g. -0.25 vs. -0.39). But my categorical indicators only range from 1 to 4. It seems to me that the latent means are in some way centered at 0 and what I get are deviations from zero? But I don't understand what the reference (0) here is? The whole sample or the female and male group? And do you suggest reporting the standardized or the unstandardized means? Thanks, Sofie
There is no gain to freeing the factor mean and fixing one threshold. Using the standard approach, the factor mean is zero in the reference group and other group means are deviations from zero. When you fix one threshold to zero, the factor mean is in the metric of the threshold. Thresholds are in the metric of z-scores, not of the categories of the variable.
Herb Marsh posted on Monday, September 24, 2012 - 3:15 am
I have a large data set with 26 groups (13 countries x 2 age cohorts) with 3000+ cases for each group. I began by showing reasonable invariance of factor loadings across the 26 groups. In the multi-group SEM I have several key path coefficients. I would like to do something like an ANOVA to determine how much of the differences in a path coefficient across the 26 groups can be explained by country, age-cohort, and their interaction. Can I do this with either ‘Model Test’ or ‘Model Constraint’ ? I did something along these lines previously when there were 4 groups (with a 2 x 2 design) with model constraint where the main and interaction effects were df=1 contrasts.
You can do it with Model Constraint. You have 26 coefficients and you can write the two-way ANOVA sum of squares decomposition in Model Constraint.
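As a plain arithmetic reference for what that decomposition looks like with one coefficient per cell, here is a short Python sketch. It is not Model Constraint syntax; the function name is hypothetical and the small 3 x 2 table is invented (the design discussed here would be 13 countries x 2 cohorts). With one value per cell there is no within-cell term, so the total splits into rows, columns, and interaction only.

```python
def two_way_ss(table):
    """Sum-of-squares decomposition for a rows x cols table, one value per cell.

    table[i][j] is the coefficient for row i (e.g. country) and column j
    (e.g. cohort). Returns a dict with SS for rows, columns, interaction, total.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_inter = sum((table[i][j] - row_means[i] - col_means[j] + grand) ** 2
                   for i in range(n_rows) for j in range(n_cols))
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    return {"rows": ss_rows, "cols": ss_cols,
            "interaction": ss_inter, "total": ss_total}

# Hypothetical path coefficients: 3 countries (rows) x 2 cohorts (columns).
coefs = [[0.20, 0.30],
         [0.10, 0.25],
         [0.35, 0.40]]
print(two_way_ss(coefs))
```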
ellen posted on Tuesday, September 25, 2012 - 12:39 am
Hi, I am running a multigroup SEM model (African, Asian, Latino). The structural parameters initially showed very different results across groups. For example, one structural parameter was -.22** for the Latino group, .11 (not significant) for the African group, and -.12 (not significant) for the Asian group. However, when I used MODEL TEST to examine parameter equalities, the results showed the parameter was NOT significantly different across groups. This is puzzling to me because the parameter was initially only significant for the Latino group (-.22**), was not significant and in the opposite direction (positive .11) for the African group, and was not significant for the Asian group (-.12). How could the three parameters not differ significantly? Does it mean that statistically they are considered equivalent? If they are considered equivalent statistically, do I have to constrain the parameters to be equal across groups and claim there is structural invariance?
When I constrained it to be equal across the three groups, I got a result that shows this parameter was significant across ALL groups. How do I interpret the results here?
I am just confused why the three parameters seemed so different initially (e.g., in opposite directions and only one was statistically significant) would somehow turn out to be statistically equivalent?
Herb Marsh posted on Friday, September 28, 2012 - 10:39 pm
Tihomir: Thank you for your assistance. However, I have not been able to work out how to follow your suggestion.
In model constraint I: 1. computed age cohort differences for each of the 13 countries, and then took deviations of these from the mean cohort difference over all countries. I then used 'model test' to test whether 13-1 country deviations were simultaneously equal to zero. I guess that this is a test of the country-by-cohort interaction. 2. I computed country means (averaged across the two cohorts) for each country, and then took deviations of these from the grand mean. However, I could not use 'model test' to test whether these were simultaneously equal to zero without a separate analysis. As I have a LOT of coefficients to test, this would require 1000s of lines of code and many separate analyses.
More importantly, these did not really give me the two-way ANOVA sum of squares decomposition that I wanted. Obviously I have missed something. Can you give me a bit more guidance about how to translate the 26 (13 countries x 2 age cohorts) coefficients into ANOVA-style SS?
Herb Marsh posted on Monday, October 01, 2012 - 5:19 pm
Tihomir: Sorry for being so dense and not thinking through what I want more carefully.
Yes, your suggestion gives me the SS decomposition, like a two-way ANOVA with one case per cell so that there is no within-cell variation. This is what I asked for.
However, what I really want (with hindsight) is to be able to say that the variation explained by each effect is trivial, small, large, etc. To do this (hazardous though it is) I need some measure of SSerror or SStotal.
I cannot do this with a single value for each cell. However, what I do have is a standard error for each of the cells and the number of cases in each cell. Can I use that to construct SSerror. Naively, I am thinking I can use the SEs to create a SD (mult by N) and then compute a wted-avg of these. I doubt if I could use this to construct a legitimate F-test, but it might suffice for my descriptive purposes.
I am running a path model with 5 continuous latent variables. This works well and now I am interested in testing for differences between groups (grouping variable: dichotomous 1,2). I already did multigroup analysis before but in this case I got the following warning:
*** WARNING Data set contains unknown or missing values for GROUPING, PATTERN, COHORT, CLUSTER and/or STRATIFICATION variables. Number of cases with unknown or missing values: 503
I already rechecked the dataset but there are no missing values. I also tried to start with testing for configural and metric invariance and got the same warning. What am I doing wrong? Thank you!
Thomas Eagle posted on Saturday, November 03, 2012 - 4:12 pm
I am having a problem setting up a multigroup analysis where I have two groups. One group answered every question. The other group skipped all the items of one complete factor plus three additional variables. I used the example as in the post dated April 29, 2004. I still get an error message. Below is my code. What am I doing wrong?
GROUPING IS teen (1 = NonTeen 2 = Teen);
MODEL: BP by nq24_1-nq24_15; PV by nq24_16-nq24_20; HB by nq24_21-nq24_35; NAT by nq24_36 nq24_40 q24_41; Q by nq24_42-nq24_44; T by nq24_37-nq24_39 nq24_45-nq24_50; SP_EX by nq24_51-nq24_60; SOC_ENV by nq24_61-nq24_64; MODEL Teen: BP by nq24_1-nq24_5 nq24_6@0 nq24_7-nq24_15; PV by nq24_16@0nq24_17@0nq24_18@0nq24_19@0nq24_20@0; HB by nq24_21-nq24_30 nq24_31@0nq24_32@0 nq24_33 nq24_34 nq24_35@0; NAT by nq24_36 nq24_40 nq24_41; Q by nq24_42-nq24_44; T by nq24_37-nq24_39 nq24_45-nq24_50; SP_EX by nq24_51-nq24_60; SOC_ENV by nq24_61-nq24_64;
I have a multiple-group latent variable model and I would like to verify that I am interpreting the output correctly.
My analysis has one dependent variable with continuous indicators, which is regressed on each of three latent independent variables with ordinal indicators. Here is my question:
Does the regression coefficient in the output for each group represent the relationship between the independent and dependent variables for only that group, or is the regression coefficient for all groups beyond the first representing a degree of difference between the first group and another group?
As an example, if I have the following regression coefficients:
Group 1: 0.895 Group 2: -0.105 Group 3: 0.063 Group 4: 0.102
would I interpret this as the variables have a stronger relationship for Group 1 than the other groups (0.895 compared with absolute values smaller than 0.2), or as the relationship is relatively strong for all groups and the regression coefficient ranges between 0.790 and 0.997 depending on group membership?
Thanks in advance for any insight or advice you have!
Thomas Eagle posted on Thursday, November 08, 2012 - 10:38 am
Hi Linda, I am back. I tried the fixing of missing data defined to a group to zero using what you recommended. It does not converge. Here is the essence of my code:
DEFINE: IF (teen eq 2) THEN NQ24_6 = 0; IF (teen eq 2) THEN NQ24_16 = 0; IF (teen eq 2) THEN NQ24_17 = 0; ... etc...
USEVARIABLES nq24_1-nq24_64; MISSING = .; GROUPING IS teen (1 = NonTeen 2 = Teen);
ANALYSIS: COVERAGE = 0.0; MODEL: BP by nq24_1-nq24_15; PV by nq24_16-nq24_20; HB by nq24_21-nq24_35; NAT by nq24_36 nq24_40 nq24_41; Q by nq24_42-nq24_44; T by nq24_37-nq24_39 nq24_45-nq24_50; SP_EX by nq24_51-nq24_60; SOC_ENV by nq24_61-nq24_64;
I want to test whether my model differs between boys and girls by using a multigroup model where all parameters are equal and then another where all parameters are free. However, Mplus won’t constrain one of my variables to be equal across groups. This variable is a dummy coded variable. Can you please help me with this?
I am new to Mplus and SEM so I apologize in advance if you have answered this question elsewhere. I want to do a multiple group comparison by sex in which the first model is free across all parameters and the other is equal across all parameters.
The model free across parameters seems to be working:
large on PTSD; sumcomp on PTSD; locw2 on PTSD; contwt on large sumcomp locw2; contwt on contbmi;
When I run the equal parameter model (below) the tau, theta, alpha, and psi matrices in the Tech1 output are still being estimated for the female model. How do I equate these parameters? PTSD by rx* an hyp; PTSD@1; large on PTSD (1); sumcomp on PTSD (2); locw2 on PTSD (3); contwt on large (4); contwt on sumcomp (5); contwt on locw2 (6); contwt on contbmi (7);
Model Female: large on PTSD (1); sumcomp on PTSD (2); locw2 on PTSD (3); contwt on large (4); contwt on sumcomp (5); contwt on locw2 (6); contwt on contbmi (7);
This is a matter of choosing one of two chi-square tests: Wald or Likelihood-ratio. They are different but asymptotically the same. Wald is the same as the z test you see when judging significance of a path and LR is the chi-square test of model fit.
I would add the path if there was a theoretical reason to consider it. The fact that it is then insignificant is a finding of subject-matter interest.
Jo Brown posted on Thursday, December 27, 2012 - 12:13 pm
I am running a multiple group analysis to explore mediation. As I am using imputed data, I need to specify the direct and indirect effects using the MODEL CONSTRAINT option.
However, when I do so I only receive one output for the direct and indirect effects if I simply specify:
Y on M (p1); Y on X (c1); M on X (m1);
MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1;
Should I repeat the same lines after this as in:
model male: Y on M (p1); Y on X (c1); M on X (m1);
MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1;
model female: Y on M (p1); Y on X (c1); M on X (m1);
MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1;
to obtain ind and dir for boys and girls separately.
I'd be grateful if you could advise me on the best way to proceed.
Yes, but have only one MODEL CONSTRAINT which is not interspersed in the MODEL command. Put MODEL CONSTRAINT either before or after the MODEL command not in the MODEL command. And don't use the same names for the direct and indirect effects.
Y on M (p1); Y on X (c1); M on X (m1);
model male: Y on M (p2); Y on X (c2); M on X (m2);
model female: Y on M (p3); Y on X (c3); M on X (m3);
MODEL CONSTRAINT: new(ind dir indm dirm indf dirf); ind = p1*m1; dir = c1;
indm = p2*m2; dirm = c2;
indf = p3*m3; dirf = c3;
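The quantities defined in MODEL CONSTRAINT above are simple functions of the labeled paths, which the following Python sketch mirrors for checking by hand. The estimates are invented, the function names are hypothetical, and the standard error uses the first-order delta method under the added assumption that the two path estimates are uncorrelated, so it may differ from what Mplus reports.

```python
import math

def indirect_and_direct(m_on_x, y_on_m, y_on_x):
    """Indirect effect (product of the two paths) and direct effect."""
    return {"ind": y_on_m * m_on_x, "dir": y_on_x}

def indirect_se(m_on_x, se_m_on_x, y_on_m, se_y_on_m):
    """First-order delta-method SE of the product, assuming zero covariance."""
    return math.sqrt(y_on_m ** 2 * se_m_on_x ** 2 + m_on_x ** 2 * se_y_on_m ** 2)

# Hypothetical group-specific estimates (paths labeled m2/p2/c2 and m3/p3/c3 above).
groups = {
    "male":   {"m_on_x": 0.50, "y_on_m": 0.40, "y_on_x": 0.10},
    "female": {"m_on_x": 0.30, "y_on_m": 0.60, "y_on_x": 0.05},
}

for name, est in groups.items():
    effects = indirect_and_direct(**est)
    print(name, {k: round(v, 3) for k, v in effects.items()})
```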
Jo Brown posted on Thursday, December 27, 2012 - 11:42 pm
I have, perhaps, a simple question. I am running a two-group MLR model with two latent variable predictors and one latent variable dependent variable. I would like to graph, what is effectively an interaction, of the difference in the coefficients between groups in an Aiken and West style model (e.g. -1SD, 0,+1SD). The output gives me the slopes for each predictor for each group, however to properly graph the difference in the slopes I need the intercept. Can I use the group specific intercept for the DV that is printed in the output as my anchor for graphing the slopes? I noticed that one intercept seems to be fixed at zero while the other is freely estimated.
You can also try to do your full plot for the range [-1 SD, +1 SD] using the Version 7 "LOOP" plot. See Part 1 of the handouts and videos from the Utrecht course in August on our web site - or see the version 7 UG ex 3.18 and modify to two-group analysis.
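If the points are computed by hand instead, the arithmetic is just a straight line per group evaluated at three predictor values. The Python sketch below shows that calculation with placeholder numbers; the function name is hypothetical, and which intercept to use as the anchor is exactly the question raised above, so the inputs are assumptions and not a recommendation.

```python
def predicted_points(intercept, slope, x_mean, x_sd):
    """Predicted outcome at -1 SD, the mean, and +1 SD of the predictor."""
    xs = [x_mean - x_sd, x_mean, x_mean + x_sd]
    return [(x, intercept + slope * x) for x in xs]

# Placeholder group-specific values (not taken from any real output).
groups = {
    "group 1": {"intercept": 0.00, "slope": 0.50, "x_mean": 0.0, "x_sd": 1.0},
    "group 2": {"intercept": 0.20, "slope": 0.25, "x_mean": 0.1, "x_sd": 0.9},
}

for name, g in groups.items():
    for x, y in predicted_points(**g):
        print(f"{name}: x = {x:+.2f}, predicted y = {y:+.3f}")
```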
I have checked my model for three socioeconomic statuses and built a separate file for each.
For lower SES, there are four paths which are non-significant. When I placed constraints on them, the chi-square value increased; the model fit indices also increased, but no great effect was observed. But when I deleted all those paths, that gave me good model fit. Kindly advise: should I delete all the non-significant paths (improved model fit) or place constraints on them (which gives me just marginal model fit)?
I am conducting a multiple-group analysis. I have 6 categorical indicators loading onto 3 factors (2 indicators on each). I'm using CLASSES instead of GROUPING (and have specified 8 classes), TYPE = MIXTURE, MLR estimator, and ALGORITHM = INTEGRATION.
I am testing configural invariance first (i.e., covariance invariance), and then measurement invariance (i.e., factor loadings, thresholds/factor means). Since I'm using MLR estimator, I can't test the invariance of the residual variances, but that is fine with me. And I believe that when thresholds are free, factor means have to be fixed at 0, and vice versa.
However, I'm having trouble getting some of my models to converge. I've set up 16 models to compare (by combining free or invariant model specifications for each of the 4 parameters: factor correlations, factor variances, factor loadings, thresholds/factor means).
In my model, I'm freeing the loading of the first indicator on each factor. In models where factor variances are meant to vary across groups, I've fixed variances in group 1 to 1 and allowed variances in other groups to vary freely.
My question is: Do you have any suggestions about why some models are not converging? Are there model combinations (out of my 16 combinations above) that are just not going to be identified?
Models where you don't set the metric in the loadings and instead fix a factor variance at 1 in one group and have them free in other groups need to rely on holding the loadings equal across groups for identification.
The way to figure out the source of the non-identification is to check the parameter number in the error message against Tech1 to see which parameter that is.
I do have a follow-up question. In one of my models, I made factor correlations invariant, factor loadings free, factor variances free (with variances in first group fixed to 1), factor means free, and thresholds invariant across groups. This model converged successfully. If I need to rely on holding loadings equal across groups for identification, why might this model have successfully converged?
Moreover, I have models in which that requirement (if variances are free, loadings must be equal) is satisfied that did not converge. For example, I have a model in which factor correlations are invariant, factor loadings are invariant, factor variances are free (with variances in first group fixed to 1), factor means are free, and thresholds are invariant. Allowing for 2000 iterations, the model still did not terminate normally. The message is "THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD. THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 7 IS -0.73857449D-02."
Again, thank you very much for your help! It's invaluable.
Factor loadings not held equal across groups makes it meaningless to compare factor variances across groups. Generally speaking, this model is not identified. Non-identified models can converge. Why your model converged I can only tell by looking at your output which you can send to support.
For the question in your second paragraph I also would have to see your output to be able to tell.
Note also that you talk about invariant factor correlations. I would think you mean factor covariances since the factor variances are not in all your models it sounds like.
Jenny L. posted on Tuesday, April 23, 2013 - 9:43 am
Dear Drs. Muthen and Muthen,
I'm doing a multi-group analysis on two groups. There are 5 exogenous variables and I would like to correlate them both within and across groups. I know that within-group correlation is a default, but I'm not sure about the code for cross-group correlation (i.e. f1 of Group 1 is correlated with f1 of Group 2) and I can't seem to find it in the user's guide. Could you please give me some clue? Thank you in advance for your help.
Jenny L. posted on Tuesday, April 23, 2013 - 10:27 am
Thank you for your reply, Dr. Muthen.
What I have is actually a longitudinal data set with 2 time points. I was trying to test whether the associations among 8 variables (5 exogenous, 2 mediators, 1 outcome) would vary across time. I thought I could treat them as two different groups but it violated the independence assumption of multi-group analyses. What analysis would you suggest? Thank you again for your advice.
When I conduct multi-group analysis, say, for males and females, I understand an equality constraint should be used to directly test whether a coefficient of interest is statistically different between the two groups, using the chi-square difference. Sometimes, however, I have a situation where a significant chi-square difference is found (i.e., a statistically significant difference between the male and female coefficients) when both coefficients are statistically NOT significant. In such a case, should I report that the coefficient is different between males and females (based on the chi-square difference test) or not (because both coefficients are not significant, that is, not different from zero, and thus comparing two non-significant coefficients is pointless)?
Thanks. I'd like to ask a follow-up question. What if I found a structural coefficient to be significant in one group but not significant in the other when its chi-square test showed a non-significant difference? Should I report a significant difference between the two groups (based on the significant vs. non-significant coefficient) or not (because the test showed a non-significant chi-square difference with the equality constraint)? This is not a hypothetical question; I often encounter such cases. Thanks in advance.
This is really not related to Mplus. You can probably get a more thorough response by posting this on a general discussion forum like SEMNET.
Claire posted on Wednesday, May 29, 2013 - 2:11 am
I have a question (probably a stupid one!) about doing path analysis, which I was hoping you might be able to answer.
I’m thinking about running a path analysis (perhaps going onto a SEM after) but want to compare path coefficients using the same model between groups (in my case countries).
Could this be done by just running separate path models for each country and comparing the coefficients? Or do you have to use multigroup path analysis? What is the difference between the two methods? Do you have any detailed examples (including syntax) of multigroup path analysis if this is what I should be doing or can you point me in the direction of materials that explain the difference between the two?
You should use multiple group analysis so that the testing can be done by the program using either chi-square difference testing or the Wald test using MODEL TEST. If you analyze the groups separately, you would need to do the same type of testing by hand which could be difficult.
Hello Dr. Muthen. I've been struggling with how best to approach multiple group factor analysis when there are many groups (my "group" is typically "country" and I usually have about 15-20 groups). I have been working through the paper: "General random effect latent variable modeling: Random subjects, items, contexts, and parameters" but I've been worried that the random effect approach is going to "force" (for lack of a better word) invariant loadings to be non-invariant since a random effect is estimated for each group whether or not it is "needed", and that this could bias the structural (latent mean) parameter estimates. I'm just wondering if I am completely off the mark.
See the new ALIGNMENT option in the Version 7.1 Language Addendum on the website with the user's guide. See also Web Note 18 and Bengt's UCONN Keynote address which discusses random versus fixed factor loadings.
Tait Medina posted on Wednesday, June 19, 2013 - 8:38 am
Thank you for these resources! The ALIGNMENT option is VERY interesting. I am looking forward to giving it a go.
A colleague of mine is using Mplus to examine how associations among constructs differ in two contexts.
Rather than constrain each path at a time, and then compare the Chi square difference between the constrained and unconstrained models to see if it’s significant, he has used pairwise t-tests with pooled standard errors. He wrote that he did this because he used the mean-adjusted maximum likelihood method in Mplus and that the chi-square values from this test in Mplus cannot be used for chi-square tests.
I have not used Mplus before and would like to confirm whether this is a good handling of the issue. Could you please let me know? I looked for other posts on this issue before posting this, but couldn't find anything.
Sorry for taking up your time, but your advice would be really helpful.
One can do difference testing using MLM. It requires using a scaling correction factor. I think that would give the same results as your colleague's approach, as long as the values from TECH3 are used in the computations.
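As a rough illustration of the kind of pairwise test described above, here is a minimal Python sketch of a Wald-type z test for the difference between two path estimates. The colleague's exact formula is not given in the post, so this is an assumption about what was done. The function name and the example numbers are hypothetical, and `cov12` stands for the covariance between the two estimates as it would be read from TECH3 (zero if the estimates are independent).

```python
import math

def wald_z_difference(b1, se1, b2, se2, cov12=0.0):
    """Return (z, two-sided p) for H0: b1 == b2.

    b1, b2   : the two path estimates (e.g. the same path in two groups)
    se1, se2 : their standard errors
    cov12    : covariance between the two estimates (assumed to be taken
               from TECH3; 0.0 if the estimates are independent)
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    z = (b1 - b2) / math.sqrt(var_diff)
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical estimates, for illustration only.
z, p = wald_z_difference(b1=0.5, se1=0.3, b2=0.1, se2=0.4)
print(f"z = {z:.3f}, p = {p:.3f}")
```

Squaring z gives a one-degree-of-freedom Wald chi-square, which is the same kind of test that MODEL TEST reports for a single equality.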
Greetings, As a follow-up on my question posted here on January 25, 2012, I would like to know what is the purpose of the "scaling correction factor for MLR" value under the section named "Chi-square test of model fit". Because your answer to my previous post says that I need to use the "H0 Scaling Correction Factor for MLR" found under the "Loglikelihood" section, I'm wondering why I also have this other correction factor available. For your information, I am using the correction factors in the following formula:
cd = (d0*c0 - d1*c1)/(d0 - d1)
Thanks for your assistance.
For difference testing you need the scaling correction factor which is related to the degree of non-normality. You can do difference testing using either chi-square values or loglikelihood values. You would use the scaling correction factor that is for the test statistic you decide to use.
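The hand computation can be sketched in Python as follows. This is a minimal sketch, assuming the usual scaled difference test in which model 0 is the nested (more restricted) model and model 1 is the comparison model. The function names and the numbers are hypothetical; the `cd` formula is the one quoted in the question, applied with degrees of freedom for the chi-square version and with numbers of free parameters for the loglikelihood version.

```python
def scaled_chi2_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference test from chi-square values.

    t0, t1 : chi-square values printed for the nested and comparison models
    c0, c1 : scaling correction factors from the chi-square section
    d0, d1 : degrees of freedom (the nested model has more, so d0 > d1)
    Returns (TRd, df of the difference, cd).
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1, cd


def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled difference test from loglikelihood values.

    l0, l1 : H0 loglikelihoods of the nested and comparison models
    c0, c1 : scaling correction factors from the loglikelihood section
    p0, p1 : numbers of free parameters (the nested model has fewer, p0 < p1)
    Returns (TRd, df of the difference, cd).
    """
    if p0 >= p1:
        raise ValueError("the nested model must have fewer free parameters")
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)
    trd = -2.0 * (l0 - l1) / cd
    return trd, p1 - p0, cd


# Hypothetical output values, for illustration only.
print(scaled_chi2_difference(t0=120.0, c0=1.2, d0=50, t1=100.0, c1=1.1, d1=45))
print(scaled_loglik_difference(l0=-2606.0, c0=1.450, p0=39,
                               l1=-2583.0, c1=1.546, p1=47))
```

In both versions TRd is referred to a chi-square distribution with the returned degrees of freedom. Each function takes only the correction factors that belong to its own test statistic, so the chi-square and loglikelihood factors are never mixed.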
Hannah Lee posted on Friday, September 06, 2013 - 9:05 am
Hi, I am trying to conduct a multigroup analysis (4 groups). It seems I can only compare two paths at a time with MODEL TEST. So here was my input:
MODEL:
comp BY COMP1-COMP5;
comp ON REOadd36 perc leng DREadd36 DPUadd36;
MODEL HSE:
comp BY COMP1-COMP5;
comp ON REOadd36 (HSEb1)
perc leng DREadd36 DPUadd36;
MODEL HOBC:
comp BY COMP1-COMP5;
comp ON REOadd36 (HOBCb1)
perc leng DREadd36 DPUadd36;
MODEL TEST:
HSEb1=HOBCb1;
OUTPUT: TECH1 STDYX;
Although I get the model estimates, I get the following message:
THE MODEL ESTIMATION TERMINATED NORMALLY
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 49.
THE CONDITION NUMBER IS -0.317D-19.
I am not sure where to look for "parameter 49." Could this have to do with individual group sample sizes?
You find the parameters and their numbers in TECH1.
You should not mention the first factor indicator, comp1, in the group-specific MODEL commands. When you do this, its factor loading is freed in those groups, which makes the model not identified.
marie posted on Wednesday, October 02, 2013 - 3:05 pm
I am running a multiple group analysis with gender as the grouping variable. I understand that I cannot just compare the paths without establishing measurement invariance. I first checked whether the final structural regression model was a good fit across groups ("no constraint across groups"). I have 10 latent constructs. There were twice as many females as males. Three questions:
- How do I test whether the drop in fit is significant or not (I am using MLR)? Numerically speaking, there was a drop in the CFI and an increase in the RMSEA and SRMR.
- All the indirect effects became nonsignificant in the male group. Five of my direct paths became nonsignificant in both groups. Is this due to the small sample size after I divided the sample into two groups?
- Is it fair to say that the final model could not hold in both groups, so I have to stop my multiple group analysis there?
You should test for measurement invariance using a model with only the ten latent variables. You should not include paths among the ten latent variables until you have established measurement invariance. See the Version 7.1 Mplus Language Addendum on the website with the user's guide. There is a new feature that automatically tests for measurement invariance.
I have a SEM model with the WLSMV estimator for categorical items. In Mplus version 6 I was able to stratify my final model using the GROUPING option by levels of a variable I made in the DEFINE command. However, I have imputed data and now when I try to re-run the same input file in Mplus version 7 it says that I cannot do this because the GROUPING option requires the same number of participants per group in each imputation (and this is not the case since the imputation created some variation across the defined variable). Is this a bug in version 7 because in version 6 it just took the average number of people per group across all the imputations in order to do it? Is there another way I could stratify my model?
You would have to use Model Constraint to do this yourself. Label the slope parameters involved in the indirect effects for each group and use those labels in Model Constraint to express the indirect effects, like for 2 groups: