
Anonymous posted on Tuesday, July 11, 2000  10:51 am



I'm fitting a structural equation model with 3 latent variables, a number of exogenous x variables, and 4 groups. The model converges and has an RMSEA of .044. I am now trying to test invariance across groups. I see that intercepts are constrained to be equal across groups by default, as are factor loadings. I see how to constrain residuals to be equal across groups, and I've also successfully constrained the betas, but I can't figure out how to constrain the gammas. Is it possible?


If you have constrained the betas (f1 ON f2), the gammas (f1 ON x) are done in the same way. For example, in the overall model statement:

MODEL: f1 ON x1 (1);
f1 ON x2 (2);

will constrain each of the two regression coefficients to be equal across groups. Let me know if this is not what you mean.


(previously Anonymous) My problem was that I have several x variables in the model, and I wanted to constrain the coefficients to be equal across groups only, and not also across all x variables, which is what happens with this:

y on x1 x2 x3 (1);

I did figure out a solution to my problem. Since a regression equation can span more than one line, and the (#) has to be on the same line as the variable(s) it acts upon, I just used multiple lines for my equation:

y on x1 (1)
x2 (2)
x3 (3);

and this worked!


Yes, this is the case. Only one number in parentheses can appear on a line, and it applies to all parameters on that line. The overall model statement sets equalities within and between groups. An equality statement in a group-specific model statement sets equalities within a group.

Anonymous posted on Wednesday, January 31, 2001  5:37 pm



I want to use Mplus to construct a multigroup SEM that includes two CFAs for categorical data (two factors, 3+ dichotomous indicators each). Will Mplus allow me to run these models without any invariance assumptions whatsoever? I get the impression that I have to constrain at least one of the sets of parameters, either for identification or convergence: loadings, thresholds, means, scale factors. Maybe this is because when I try to relax any of the Mplus default invariance assumptions I get an error message stating that the standard errors for the model cannot be calculated. Is the problem with my data (lack of variance?) or with the identification of the model?


Multiple-group CFA with categorical outcomes uses the default of holding thresholds and loadings invariant across groups, fixing the factor means to zero in the first group while letting them be free in the other groups, and fixing the delta scale factors to one in the first group while letting them be free in the other groups. If you instead want to have no invariance restrictions across groups, you should repeat the thresholds and loadings in each group so that they are group-specific. Note, however, that in this case you need to fix to zero the factor means in all groups (you cannot identify both group-specific thresholds and group-specific factor means) and fix the scale factors to one in all groups (they can only be identified when thresholds and loadings are invariant). You can also accomplish no invariance by doing separate-group analyses.

Anonymous posted on Friday, February 02, 2001  12:31 pm



Following up on your recommendation in the 2nd paragraph above: is there a particular interpretation to setting the scale factors equal to 1 (as opposed to 2 or 3, etc.)? Also, regarding the scale factors themselves, do they refer to the variance of the underlying (continuous) y variable, to the error in measuring that variable via the categorical measure, or both? Given this, how "strong" is the assumption of equal scale factors in the multigroup model where loadings and thresholds are allowed to vary and factor means are set to zero, etc.?


The scale factors refer to the inverted standard deviations of the latent response variables y*. This means that they are functions of loadings, factor variances, and residual variances. If one or more of those three components vary, the scale factor would vary. So, equal scale factors when loadings vary does not make sense. 
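
A minimal sketch of this relationship in Python, assuming the simplest case of an indicator that loads on a single factor with no covariates, so that Var(y*) = loading^2 * factor variance + residual variance. The function name and all numeric values are hypothetical illustrations, not Mplus output.

```python
import math

def scale_factor(loading, factor_variance, residual_variance):
    """Inverse standard deviation of the latent response variable y*.

    Assumes y* = loading * factor + residual, with factor and residual
    uncorrelated, so Var(y*) = loading**2 * factor_variance + residual_variance.
    """
    y_star_variance = loading ** 2 * factor_variance + residual_variance
    return 1.0 / math.sqrt(y_star_variance)

# Hypothetical values: identical factor and residual variances in both groups,
# but a different loading in group 2.
delta_group1 = scale_factor(loading=1.0, factor_variance=0.5, residual_variance=0.5)
delta_group2 = scale_factor(loading=2.0, factor_variance=0.5, residual_variance=0.5)
print(delta_group1, delta_group2)
```

Changing only the loading changes the scale factor, which is why holding scale factors equal while letting loadings differ is not a coherent restriction.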


I am trying to compare two groups (ed and noned) on a confirmatory factor analysis solution. I have used the following command structure in Mplus, which I thought would work, but which isn't giving me the anticipated output. Again, what I want to be able to do in the end is determine whether the model is the same for the two groups. Thanks for your help.

model: intern by withd somat anx;
model: extern by del aggress;
model ed: withd somat anx (1);
model ed: del aggress (1);
model noned: intern by withd somat anx (2);
model noned: del aggress (2);


If you use the following syntax:

MODEL: intern BY withd somat anx;
extern BY del aggress;

the factor loadings will be held equal across groups. It is not clear what you are trying to do with the statements you have sent. If you tell me in words which parameters you are trying to hold equal and whether they are to be held equal within and/or across groups, I can then help you. The two model ed commands that you have above will hold all residual variances equal across variables for the ed group. The two model noned commands will hold the residual variances for del and aggress equal to each other and also equal to the factor loadings for intern in the noned group. By the way, one overall MODEL command and one group-specific model command per group is sufficient for any input.

Holmes Finch posted on Tuesday, February 27, 2001  11:15 am



Linda, Thanks for your response. What I want to do is compare the two groups over all the parameters, and then maybe look at individual ones. The bottom line is, I want to be able to say that the same model does, or does not fit both groups. Does that make sense? Thanks. Holmes 


If you send me your fax number, I will fax you several pages we use when we teach. These show setups for a variety of multiple group models that test a variety of hypotheses. 

Anonymous posted on Tuesday, June 26, 2001  2:43 am



I am trying to do a multiple group analysis. All measurement parameters are held equal across groups by default. Is it possible to hold specific variances of latent factors equal across the groups? Which syntax do I have to use?


Any parameter that is not held equal by default can be held equal. Any parameter that is held equal by default can have that equality relaxed. To hold a parameter equal, specify it in the overall MODEL command with a number in parentheses following it. One number in parentheses is allowed per record (line) of the input file. In a three factor model, the variances of the factors will be held equal across groups by adding the following to the overall MODEL command:

f1 (1);
f2 (2);
f3 (3);

LeeFay posted on Tuesday, July 17, 2001  4:34 pm



I am trying to run a two-level analysis. But I get an error message telling me that 'the sample covariance matrix for the variables cannot be inverted'. I have checked my covariance matrices and no two variables are perfectly correlated and no variable has no variation. I have 11 homes with 349 subjects in total. The dependent variable is continuous, and I have 4 within-level predictors and 5 between-level predictors. What am I doing wrong?


Even though you cannot see any correlations of 1 in your sample between-level covariance matrix, there may be dependencies that result in singularity of the matrix. You mention that you have 10 variables and 11 homes. Having 11 homes is like having 11 observations at the between level. You would not be able to have more than 10 variables. So if there are variables you are not mentioning, this could also be the problem. We recommend at least 30-50 clusters for this type of modeling. You can try analyzing the sample between matrix to see if it can be inverted. Or you can send the input and data to support@statmodel.com and I will take a look at it.
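
The limit of 10 variables with 11 clusters can be checked numerically. The following Python sketch uses artificial, deterministic cluster means (an assumption made purely for illustration, not the poster's data) and exact fraction arithmetic to compute the rank of a between-level covariance matrix. The helper names are hypothetical.

```python
from fractions import Fraction

def covariance_matrix(rows):
    """Exact sample covariance matrix; each row holds one cluster's means."""
    n, p = len(rows), len(rows[0])
    means = [Fraction(sum(r[j] for r in rows), n) for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def matrix_rank(matrix):
    """Rank via Gaussian elimination on exact fractions (no rounding error)."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def between_rank(n_clusters, n_variables):
    # Artificial cluster means: variable j in cluster i is (i + 1) ** (j + 1).
    cluster_means = [[(i + 1) ** (j + 1) for j in range(n_variables)]
                     for i in range(n_clusters)]
    return matrix_rank(covariance_matrix(cluster_means))

rank_10_variables = between_rank(11, 10)  # 10 x 10 matrix
rank_11_variables = between_rank(11, 11)  # 11 x 11 matrix
print(rank_10_variables, rank_11_variables)
```

With 11 clusters the rank cannot exceed 10, so the 10 x 10 matrix can be inverted while the 11 x 11 matrix cannot, whatever the data look like.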


Can you please elaborate on the steps in multiple group analysis? I want to test group differences in two beta and two gamma coefficients. Am I correct that the model fitting steps leading up to testing the beta/gamma coefficients are to test assumptions of measurement invariance? I understand that the first step is to fit the SEM model separately in each group. Then the next three steps are to fit the model in all groups (1) allowing all parameters to be free, (2) holding factor loadings equal, and (3) holding factor loadings and intercepts equal. Given the defaults for multiple group analysis with categorical indicators, am I correct that these three steps require that parameters that are constrained or fixed by default be relaxed? If so, could you please elaborate on which defaults to relax? I assume that if factor loadings and intercepts are invariant, then the default settings would be appropriate for testing differences in the beta/gamma coefficients.


The steps in looking at measurement invariance are slightly different with categorical indicators. For one thing, you are dealing with thresholds instead of intercepts. You want to compare two models rather than three to test measurement invariance.

Model 1: This is the default model in Mplus. The thresholds are held equal across groups and the factor loadings are held equal across groups. The scale factor is fixed to one in the first group and free in the others. The factor means are zero in the first group and free in the others.

Model 2: The thresholds and factor loadings are free across groups. Scale factors are one in all groups and factor means are zero in all groups.


Thank you for the clarification. Could you point me to a good reference on examining measurement invariance with categorical indicators? 


I think you may find something relevant at www.statmodel.com under REFERENCES/CATEGORICAL/MIMIC. 


I have run the group analysis below and now want to examine the model by 3 household types. Jaccard and Wan (1996) suggest examining three-way interactions using multiple group analysis [in this case six groups]. Are there alternative approaches? For instance, would a two-level or MIMIC model be appropriate?

Grouping is t1totsup (0=below 1=above);
Usevariables are q21a q21b q21c q21d p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 lifevent nparpro aparpro;
Categorical are q21a q21b q21c q21d p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13;
Define: Cut t1totsup(36);
Model:
F1 by q21a q21b q21c q21d;
F3 by p1 p5 p6 p9;
F4 by p2 p3 p10 p11 p12 p13;
F5 by p4 p7 p8;
F6 by F3 F4 F5;
F6 on F1;
F6 on lifevent;
nparpro on F6;
aparpro on F6;


Can you please tell me how to obtain the sample correlation matrix in order to report it with the analysis? Thanks! 


You could do a multiple group analysis with six groups or a MIMIC with five dummy variables. Unless you have clustered data, TWOLEVEL would not be appropriate. Multiple group analysis gives you the most flexibility if you have enough subjects per group. MIMIC cannot look at as many parameters but does not require as many subjects. You can obtain a sample correlation matrix using SAVEDATA: TYPE (SAMPLE) IS CORRELATION; 

Sandra Lyons posted on Wednesday, November 21, 2001  11:04 am



Thank you for your prompt and helpful support! The group analysis below produced the error message that follows it. Is the solution to this problem to remove the offending indicators from the relevant groups?

Grouping is t1hhsup (1=losingle 2=hisingle 3=lopartner 4=hipartner 5=loexfam 6=hiexfam);
Usevariables are q21a q21b q21c q21d p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 lifevent nparpro aparpro;
Categorical are q21a q21b q21c q21d p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13;
Missing = Blank;
Model:
F1 by q21a q21b q21c q21d;
F3 by p1 p5 p6 p9;
F4 by p2 p3 p10 p11 p12 p13;
F5 by p4 p7 p8;
F6 by F3 F4 F5;
F6 on F1;
F6 on lifevent;
nparpro on F6;
aparpro on F6;
Output: Standardized;

*** ERROR
Group 2 does not contain all values of categorical variable: P2
*** ERROR
Group 4 does not contain all values of categorical variable: P2
*** ERROR
Group 5 does not contain all values of categorical variable: P2
*** ERROR
Mplus VERSION 2.02 PAGE 3
hhstructure moderation model 1 all paths free
Group 6 does not contain all values of categorical variable: P3


In multiple group analysis with categorical outcomes, each variable must have the same values in each group. You would need to collapse categories of p2 and p3 to obtain this condition. 


Your reply of 11/19 said: Multiple group analysis gives you the most flexibility if you have enough subjects per group. MIMIC cannot look at as many parameters but does not require as many subjects. Jaccard and Wan (1996) recommend a minimum of 75 subjects per group (100 preferred), but this must depend on several factors such as the number of variables in the model. Can you suggest how to determine the minimum number of subjects needed for group analysis? My smallest group size is 50. Also, I have convergence problems with a single six group model, but not when the same model is run in 3 separate analyses with two groups each. What are the implications of this? 


As you said, sample size depends on many things. As a minimum for each group, you would want to have more observations than the number of variables. You would want to have 5 to 10 observations for each parameter. For categorical outcomes, you usually need more observations than for continuous outcomes. Sample size 50 seems small particularly for categorical outcomes. Regarding convergence, the measurement invariance restrictions that you are probably imposing may not hold across all groups. 
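
The two rules of thumb above can be combined into a quick check. This Python sketch is only an illustration: the function name is hypothetical, and the parameter count of 40 is an assumed value, not one taken from the model in this thread.

```python
def minimum_group_size(n_variables, n_parameters, per_parameter=5):
    """Smallest group size that satisfies both rules of thumb.

    Rule 1: more observations than variables.
    Rule 2: per_parameter observations (5 to 10) for each free parameter.
    """
    more_than_variables = n_variables + 1
    observations_for_parameters = per_parameter * n_parameters
    return max(more_than_variables, observations_for_parameters)

# Hypothetical model: 20 observed variables and 40 free parameters per group.
lower_bound = minimum_group_size(20, 40, per_parameter=5)
upper_bound = minimum_group_size(20, 40, per_parameter=10)
print(lower_bound, upper_bound)
```

Under these assumed counts, a group of 50 falls well short of either bound, and categorical outcomes would push the requirement higher still.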

Sandra Lyons posted on Saturday, December 01, 2001  12:33 pm



I've looked at the Mplus MIMIC examples and observed that none of them have independent latent variables. Hence, I'm wondering whether MIMIC is a good alternative to group analysis for the SEM I'm testing, which is:

F1 by q21a q21b q21c q21d;
F3 by p1 p5 p6 p9;
F4 by p2 p3 p10 p11 p12 p13;
F5 by p4 p7 p8;
F6 by F3 F4 F5;
F6 on F1 lifevent;
nparpro aparpro on F6;

I'm primarily interested in group differences in the path coefficients. If MIMIC is indeed appropriate for this analysis, is it analogous to OLS regression with dummy variables? In multigroup analysis with categorical dependent variables, if measurement invariance is not of substantive interest, would it be appropriate to fix measurement parameters across groups to those obtained in the single group analysis in order to circumvent nonconvergence possibly due to measurement invariance?

bmuthen posted on Saturday, December 01, 2001  5:25 pm



The term MIMIC analysis is typically reserved for models with observed covariates influencing factors that have a set of indicators. But you can certainly put grouping variables as covariates into any SEM including yours above. Using grouping variables as covariates makes it possible to have different means (intercepts) of the variables that they are specified to influence (observed and latent). If you are interested in group differences in path coefficients (slopes), however, having grouping variables as covariates will not help. I would not recommend fixing measurement parameters to single-group analysis values because you want to see that the measurement part of the model is not changed in important ways when doing the joint analysis of several groups; a convergence problem can be an indication of model misspecification.


I am trying to run a multiple group [mothers vs. fathers] two-level [children within families] model:

CLUSTER IS sid;
GROUPING IS ptsex (1=fathers 2=mothers);
ANALYSIS: TYPE = MEANSTRUCTURE TWOLEVEL;
MODEL:
%WITHIN%
dbnew on monitor (1) agecb (2);
aggress on monitor (3) agecb (4);
monitor on agecb (5);
%BETWEEN%
dbnew on monitor@0 agecb@0;
aggress on monitor@0 agecb@0;
monitor on agecb@0;
aggress with dbnew@0;

I'm trying to constrain the path coefficients to be the same for mothers and fathers. The code above yields different coefficients for mothers vs. fathers, and the results are identical to code that omits the #'s in parentheses. What am I doing wrong? Thanks!


If you send your complete output to support@statmodel.com, I will look at it. 

Anonymous posted on Sunday, June 23, 2002  3:33 pm



Sanity check needed: I'm running a multigroup SEM in Mplus with several ordered categorical variables as outcomes. In Group 1 I specify my thresholds as follows:

[outcome$1*2];
[outcome$2*.8];
[outcome$3@0];
[outcome$4*1.5];

and in Group 2 I specify:

[outcome$1*1];
[outcome$2@0];
[outcome$3@0];
[outcome$4*.5];

Is this the same as recoding my outcome variable for Group 1 but not Group 2? Mplus doesn't seem to allow group-specific recodes using CUT on the DEFINE command, and doesn't give me an error message when I use the above specification. Thanks!

LMuthen posted on Monday, June 24, 2002  6:52 am



I don't believe that you can use thresholds to recode your data. You should be able to use DEFINE to recode data for one group, for example, DEFINE: if (group eq 1 and y1=2) then y1=1; 


Someone asked me why SEM uses multiple group analysis rather than products of factors to assess moderating effects. Do you have a brief explanation for this, or could you point me to the literature?

bmuthen posted on Monday, July 22, 2002  5:26 pm



Multiple-group analysis can be, but doesn't have to be, used when a categorical variable is involved. This gives more modeling flexibility than using products since, for example, variances can be different across the groups.


Methods I have seen described for interacting latent variables with a continuous observed variable seem quite complex (Jaccard & Wan, Kenny & Judd) relative to group analysis. What method do you generally recommend? For example, I have the following model:

f2 on f1 x1;
x2 x3 on f2;

where f1 and f2 are latent variables with dichotomous indicators, and x1 - x3 are observed continuous variables. I want to test the moderating effects of a continuous variable on each path in the model. Would you recommend group analysis or product terms? If product terms, what method do you suggest?

bmuthen posted on Tuesday, July 23, 2002  8:30 am



Since the moderating variable is observed and not latent, the simplest approach would be to categorize the continuous moderating variable and do a multiple-group analysis. There are many methods for analysis of latent variable interactions (which includes your case), but I hesitate to recommend any. A new method for ML analysis by Andreas Klein seems superior but is not yet easily available in software form.

Anonymous posted on Tuesday, August 13, 2002  5:52 pm



We have been conducting a multigroup analysis with two groups and continuous indicators. We want to test whether some of the structural path coefficients are significantly different for group 1 vs. group 2. E.g., for structural path x:

Group 1 standardized coefficient = .609
Group 2 standardized coefficient = .216

How can we determine if these coefficients for the same path but different groups are significantly different from one another?


To test whether some paths are different between two groups, you can run two models: one with the paths held equal and the second with the paths not constrained to be equal. Then do a chi-square difference test. This is not a test of the standardized coefficients but rather of the unstandardized coefficients.
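
The arithmetic of the chi-square difference test can be sketched in Python as follows. This assumes an estimator whose chi-square values can be differenced directly (for example, maximum likelihood with continuous indicators); the fit statistics are made-up numbers, and both function names are hypothetical.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    half_x = x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi_square_difference_test(chi2_restricted, df_restricted, chi2_free, df_free):
    """Compare a model with paths held equal against one with paths free."""
    diff = chi2_restricted - chi2_free
    df_diff = df_restricted - df_free
    return diff, df_diff, chi_square_sf(diff, df_diff)

# Hypothetical fit statistics: two paths held equal across groups vs. free.
diff, df_diff, p_value = chi_square_difference_test(112.4, 51, 104.1, 49)
print(round(diff, 1), df_diff, round(p_value, 4))
```

A small p-value indicates that holding the paths equal worsens fit significantly, that is, the unstandardized coefficients differ across groups.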

Anonymous posted on Friday, August 16, 2002  1:33 pm



I have a question about comparing multigroup SEM coefficients across groups. Is it the case that the MG approach "controls" on differences in levels of my exogenous variables across groups? For example, I'm running a model on two groups, the first of which has much higher income and intelligence scores than the second group. Income and intelligence are two of about 10 different x variables used to predict an outcome variable y. Is it valid to compare differences in the direction and sizes of the effects of x1, x2, x3,...,x10 on y across groups?

Anonymous posted on Friday, August 16, 2002  1:42 pm



I should have appended this second question to the one I originally submitted above: Is there a convenient way to determine if structural coefficients are equal across groups in a MG SEM without having to resort to chi-square (WLS) tests? I ask because I have a large number of variables in my models and using individual chi-square tests would be tedious, and I think the significance of coefficients would be biased by the order in which I imposed the restrictions.

bmuthen posted on Saturday, August 17, 2002  9:39 am



Regarding your first question about controlling for differences, you confuse me by first talking about groups defined by income and intelligence and then talking about these variables as x variables. Let me answer the question as an MG situation where one x variable is used as a grouping variable, and therefore not used as one of the x variables. You should think of this as regular regression in two groups, where we know that the regression slope can be compared even if the x mean is different in the two groups. Regarding your second question: yes, you can print out (TECH3) the estimated covariance matrix for the parameter estimates and do a "correlated t test".

Anonymous posted on Wednesday, August 27, 2003  8:38 am



On August 17, Bengt recommends doing a correlated t-test to examine whether or not the coefficients for two groups in a multigroup model are different. I'm wondering if this is the appropriate test to use in all situations. If one is working with data where individuals are not assigned to groups randomly, where the number of persons in the two groups differs considerably, and where the SEs for the coefficients of interest also vary considerably, shouldn't one use an unequal variance t-test or a t-test for independent samples? Also, in Bengt's original recommendation, wouldn't the df for a pooled t-test always be df = (number of groups - 2) = 0? Thanks.

bmuthen posted on Wednesday, August 27, 2003  9:07 am



I was using "correlated t test" merely as an analogy. The TECH3-based test I have in mind is asymptotically normal, so the z test analogy is better.

Anonymous posted on Wednesday, August 27, 2003  9:20 am



I'm following up to your response to make sure I understand how comparing coefficients across groups in a multigroup model corresponds to common t-tests for comparing means across groups. TECH3 would be needed to determine the covariance between a given pair of model parameters. However, if the two groups are independent (which I believe is an appropriate assumption if cases are assigned to groups based on nonrandom factors, i.e., students allocated to schools, workers allocated to firms or sectors of the labor market), TECH3 wouldn't be needed and n1 and n2 would be the sizes of the two groups from which the coefficients (treated as averages) were obtained. Thanks again.

bmuthen posted on Wednesday, August 27, 2003  11:48 am



Here is my understanding of this. I think this question was regarding a SEM, testing equality of structural coefficients. Even if the 2 groups correspond to independent samples, the invariance restrictions across groups typically imposed on measurement parameters could make the structural coefficient estimates from the two groups correlated, so that is where I was thinking TECH3 comes in. As far as I see it, the differences in group sample sizes are already taken into account in the 3 TECH3 components (the two variances and the covariance of the estimates). This is unlike t tests, where sample size enters because a variance for a sample mean is figured via the variance for each variable in the mean. So the resulting (approximate) z score ratio is correct.
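
The z score described here can be sketched in Python as the difference between the two estimates divided by the standard error of that difference, built from the two variances and the covariance read off the TECH3 matrix. The function name and all numbers are hypothetical placeholders rather than actual Mplus output.

```python
import math

def coefficient_difference_z(b1, b2, var1, var2, cov12):
    """Approximate z test for the difference between two coefficient estimates.

    var1, var2: sampling variances of the two estimates.
    cov12: sampling covariance between them (zero if the estimates are
    uncorrelated, e.g. with no cross-group restrictions).
    """
    se_diff = math.sqrt(var1 + var2 - 2.0 * cov12)
    z = (b1 - b2) / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Hypothetical unstandardized estimates and TECH3 entries for the same path in two groups.
z, p = coefficient_difference_z(b1=0.60, b2=0.20, var1=0.01, var2=0.02, cov12=0.005)
print(round(z, 3), round(p, 4))
```

Group sample sizes do not appear explicitly because they are already reflected in the variances and covariance of the estimates.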

Anonymous posted on Wednesday, September 24, 2003  9:46 am



Just to clarify on the testing equality of structural coefficients. Say I have latent variables x1, x2 and x3 predicting latent variable y. I look at the difference in chi-squares between fixing everything to be equal across the two groups and fixing everything except the path from x1 to y. Does the LM test tell me if this structural coefficient (y on x1) is significantly different between groups? Should I repeat the procedure two more times for x2 and x3? Thank you in advance.

bmuthen posted on Wednesday, September 24, 2003  7:11 pm



Not quite the way you said it, I think. Instead: To test if y on x1 is different across groups, you would run with the slope held equal across groups and then run allowing it to differ. Then do the same for y on x2, then for y on x3. But if your hypothesis is that all 3 (y on x1, on x2, on x3) are equal across groups, then you would do one run with equal for all 3 across groups and one run letting them be different. 

Daniel posted on Tuesday, March 30, 2004  10:32 am



In presenting the results of a multigroup LGM, is it appropriate to present standardized or raw path coefficients in a figure? I read in the Loehlin "LATENT VARIABLE MODELING" text that population differences in range on specific variables can influence comparability of standardized scores across populations. Is this a problem in multigroup analysis? Or are the standardized path coefficients based on values appropriate to the entire population?


I would report the raw coefficients and their standard errors in addition to the standardized coefficients. Don't forget that the significance test is for the raw coefficient. The standardizations are computed using the variances for each group. There are different opinions about this.
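
To see why group-specific variances matter, here is a small Python sketch using the usual standardization of a slope, raw coefficient * SD(x) / SD(y). The variances are hypothetical and the function name is an assumption for illustration.

```python
import math

def standardized_coefficient(raw_coefficient, x_variance, y_variance):
    """Standardize a slope using the variances of x and y within one group."""
    return raw_coefficient * math.sqrt(x_variance) / math.sqrt(y_variance)

# The same raw slope in both groups, but the predictor varies less in group 2.
std_group1 = standardized_coefficient(0.5, x_variance=4.0, y_variance=4.0)
std_group2 = standardized_coefficient(0.5, x_variance=1.0, y_variance=4.0)
print(std_group1, std_group2)
```

Identical raw coefficients can therefore yield different standardized values across groups, which is one reason to report the raw coefficients alongside them.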

Daniel posted on Wednesday, March 31, 2004  11:32 am



Thanks very much once again for your help. One of the difficult parts of being a researcher rather than statistician by training is that I must learn much technique on my own. So, while I have been reading a tremendous amount of text on a variety of subjects in SEM, it is sometimes difficult to see the forest for the trees! That's when the help of experts like you and Bengt is much appreciated.

Jen Bailey posted on Wednesday, April 28, 2004  5:13 pm



Is it possible to run a multigroup model in which a latent factor that exists in one group does not exist in the other? Here's the scenario: I'm looking at within-individual continuity in latent substance use across adolescence and adulthood. Some of the members of the sample have children, and some do not. I'm interested in how parental substance use affects child problem behavior. My sample of parents is small (n = 200), and my substance use model is fairly large, since I have multiple indicators and multiple time points. Therefore, I would like to take advantage of my whole sample (n = 800) in estimating the substance use part of the model. A colleague suggested that I do a multigroup model, leaving out the "child problem behavior" factor in the group that doesn't have children. The child problem behavior variables are, obviously, missing for all nonparents. The thought was that a multigroup model would be superior to mixing the parent and nonparent populations and using FIML because it would explicitly acknowledge that there are two populations in the sample. I've tried specifying a new latent factor in the model statement for my second group, but the program (Version 3) doesn't seem to like that. What are your thoughts on using a multigroup approach in this case? How would I program such a model? Thank you!


Yes, this is possible. But you need to define the factor in the overall MODEL command, not in a group-specific MODEL command. Then you need to set all of the factor loadings to zero in the group-specific MODEL command. The overall MODEL command is the model assigned to each group and then modified by the group-specific MODEL commands. Chapter 13 has a discussion of this. Following is an example of how this can be done:

MODEL: f1 BY y1-y4;
f2 BY y5 y6 y7;
MODEL males: f2 BY y5@0 y6@0 y7@0;

Jen Bailey posted on Thursday, April 29, 2004  11:03 am



Hi Linda, Thanks for your reply. I appreciate your syntax suggestion. I still have a problem, however. I wrote the syntax as you suggested, and got an error message saying that all cases in one group were missing data on some variables. This is true: in my nonparents group, there ARE no data for the indicators of child problem behavior, because there are no children. Any suggestions for getting around the fact that the child problem behavior factor doesn't exist and its indicators are all missing data in the nonparent group? Thanks again!


I think the only thing you can do is run the model with the factors and variables shared by all groups and test invariance over groups for those factors. Then you would have to separately run the group that has more factors and variables. Establishing measurement invariance would not be an issue for those additional factors.

Jen Bailey posted on Monday, May 03, 2004  10:27 am



Thank you for your time and advice. I very much appreciate having this discussion board as a resource. 

Anonymous posted on Monday, June 14, 2004  6:06 am



Hello, I am running a multigroup analysis with three racial groups: black, white, and hispanic. You will see in the input file below that I allow 2 variable (ED and CMR) paths (slopes, gammas) to be freely estimated among the three groups. How can I allow one of the variables (MV1) to be constrained to be equal for the first two groups (black and white) and freely estimated/different for the third group (hispanic)?

VARIABLE: GROUPING IS RAC (1=black 2=white 3=hispanic);
MISSING IS .;
MODEL: F1 BY RE SF MH SFV SO QOL;
RE WITH SF MH SFV;
SF WITH MH SFV;
MH WITH SFV;
F1 ON MV1 (1) AGE (2) ED MAR (3) CMR TR (4);
OUTPUT: STANDARDIZED;

Thank you in advance for your reply.


Add: MODEL hispanic: f1 ON mv1; This will relax the equality constraint for the hispanic group. 

Anonymous posted on Monday, August 16, 2004  10:42 am



Hi, I was wondering if my code is correct to test measurement invariance (the model has SOME categorical factor indicators and covariates). It is my understanding that I should use the theta parameterization. Is this correct? I believe I should run a model where everything is free (model 1), where factor loadings are held constant across groups (model 2), where variances of latent variables and factor loadings are held constant (model 3), where covariances of latent variables, variances of latent variables and factor loadings are equal (model 4), and finally where regression parameters, covariances of latent variables, variances of latent variables, and factor loadings are held constant (model 5). I am not specifying thresholds. All of my categorical variables are coded 0 = absent, 1 = present. I read on page 67 of the User's Guide that if the thresholds are free across groups (I believe this is the default) and a factor loading for a categorical factor indicator is free across groups, the residual variance for the variable must be fixed to one in these groups for identification purposes. Do I need to fix the variance of pardep and fhdadc to one, or some other variable? I am having some identification issues. I am particularly interested in whether the regression weights are equal across groups.

Model 1:

grouping is sex (0=male 1=female);
IDVARIABLE = subno;
missing=.;
categorical are fhdadc parsuic pardep late;
ANALYSIS: TYPE = mgroup;
parameterization=theta;
iterations= 50000;
MODEL:
suicide BY late@1 (1);
suicide by middle (2);
suicide by early (3);
attemp by mlife@1 (4);
attemp by lalife (5);
attemp by elife (6);
parprob by fhdadc@1 (7);
parprob by parsuic (8);
parprob by pardep (9);
extrov BY ext3@1 (10);
extrov by ext2 (11);
extrov by ext1 (12);
psychot BY psychot2@1 (13);
psychot by psychot1 (14);
psychot by psychot3 (15);
neurot BY neurot3@1 (16);
neurot by neurot2 (17);
neurot by neurot1 (18);
!parsuic@1;
!pardep@1;
attemp on suicide pareduc parprob extrov psychot neurot careloss divorce nphycnt nvbscnt nncnt cle31;
model female:
suicide BY late@1 (101);
suicide by middle (102);
suicide by early (103);
attemp by mlife@1 (104);
attemp by lalife (105);
attemp by elife (106);
parprob by fhdadc@1 (107);
parprob by parsuic (108);
parprob by pardep (109);
extrov BY ext3@1 (110);
extrov by ext2 (111);
extrov by ext1 (112);
psychot BY psychot2@1 (113);
psychot by psychot1 (114);
psychot by psychot3 (115);
neurot BY neurot3@1 (116);
neurot by neurot2 (117);
neurot by neurot1 (118);

Model 2:

missing=.;
categorical are fhdadc parsuic pardep late;
ANALYSIS: TYPE = mgroup;
parameterization=theta;
iterations= 50000;
MODEL:
suicide by late@1 middle early;
attemp by mlife@1 lalife elife;
parprob by fhdadc@1 parsuic pardep;
extrov by ext3@1 ext2 ext1;
psychot by psychot2@1 psychot1 psychot3;
neurot by neurot3@1 neurot2 neurot1;
attemp on suicide pareduc parprob extrov psychot neurot careloss divorce nphycnt nvbscnt nncnt cle31;

Model 3: Add this to model 2:

suicide (30);
parprob (31);
extrov (32);
psychot (33);
neurot (34);
attemp (35);
model female:
suicide (30);
parprob (31);
extrov (32);
psychot (33);
neurot (34);
attemp (35);

Model 4: Add this to model 3:

parprob with extrov (44);
parprob with psychot (45);
parprob with neurot (46);
extrov with psychot (47);
extrov with neurot (48);
psychot with neurot (49);
model female:
parprob with extrov (44);
parprob with psychot (45);
parprob with neurot (46);
extrov with psychot (47);
extrov with neurot (48);
psychot with neurot (49);

Model 5: Add this to model 4:

attemp on suicide (149);
attemp on pareduc (150);
attemp on parprob (151);
attemp on extrov (152);
attemp on psychot (153);
attemp on neurot (154);
attemp on careloss (155);
attemp on divorce (156);
attemp on nphycnt (157);
attemp on nvbscnt (158);
attemp on nncnt (159);
attemp on cle31 (160);
Model female :
attemp on suicide (149);
attemp on pareduc (150);
attemp on parprob (151);
attemp on extrov (152);
attemp on psychot (153);
attemp on neurot (154);
attemp on careloss (155);
attemp on divorce (156);
attemp on nphycnt (157);
attemp on nvbscnt (158);
attemp on nncnt (159);
attemp on cle31 (160);

Are these the models you suggest? Is my syntax correct? Do I need to set the residual variance to one for parsuic and pardep (or other variables)? Thank you so much in advance.


You can use either the delta or theta parameterization to test measurement invariance. Many of the equalities that you want to test are not measurement invariance in my opinion. Differences between factor means, variances, and covariances and regression coefficients describe population heterogeneity rather than measurement invariance. Factor loadings and thresholds are related to measurement invariance. Some see residual variances of factor indicators as measurement parameters. I would not require them to be equal for measurement invariance to hold. Example 5.16 in the Mplus User's Guide shows a multiple group CFA with categorical factor indicators. To test measurement invariance, you would first run the default overall model where factor loadings and thresholds are held equal as the default. The second model is one where factor loadings and thresholds are unequal across groups. How to relax the default equality is shown in Example 5.16. With the THETA parameterization, residual variances instead of scale factors are fixed to one. 

Anonymous posted on Tuesday, September 21, 2004  5:59 pm



Does Mplus 3 generate modification indices that rank the equality constraints in terms of their effects on overall model chi-square? If not, what is your recommended strategy for localizing areas of relatively worse "misfit" in complex multigroup SEMs? Thanks! 


No. No general strategy comes to mind. Just look for the largest ones and also see what difference it makes for parameter estimates when they are relaxed. 

Anonymous posted on Friday, October 08, 2004  1:30 pm



I am testing measurement invariance of factor loadings where indicators are categorical. I consistently get an error message that the standard errors cannot be estimated because my model may not be identified. Hoping to fix this problem, I would like to constrain my factor means to zero. Someone else had this same problem...and the posted response was: "If you instead want to have no invariance restrictions across groups you should repeat the thresholds and loadings in each group so that they are group-specific. Note, however, that in this case you need to fix to zero the factor means in all groups (you cannot identify both group-specific thresholds and group-specific factor means) and fix the scale factors to one in all groups (they can only be identified when thresholds and loadings are invariant)." How do I fix to zero the factor means in all groups? What does the code look like? Thank you! 


To fix a factor mean to zero, use the square bracket option in the overall MODEL command: MODEL: [f@0]; 

Madeline posted on Thursday, October 28, 2004  4:20 pm



Hi  I am testing measurement invariance of factor loadings across gender. My less restrictive model is giving me the following message: "THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. Here is my code. Can you tell me what I am doing wrong? Thanks!!! INPUT INSTRUCTIONS !Measurement Invariance of Factor Loadings across sex TITLE: Invariance: Male vs Female DATA: FILE IS Y:\Madeline\name1.dat; VARIABLE: NAMES ARE id caring friendly join betrfren holiday silly partyfun betrmood drive homework lvcut lvweapon lvpunish lvdamage lvbeaten lvthrt lvver lvstolen everalc binge30 lvdaybin lvbinint alc30 daysalc drinkday grade a5 a6; USEVARIABLES ARE everalc lvcut lvweapon lvpunish lvdamage lvbeaten lvthrt lvver lvstolen caring friendly join betrfren holiday silly partyfun betrmood drive homework; grouping is a5 (1=male 2=female); MISSING = . ; IDVARIABLE = id; categorical are everalc caring friendly join betrfren holiday silly partyfun betrmood drive homework lvcut lvweapon lvpunish lvdamage lvbeaten lvthrt lvver lvstolen; ANALYSIS: TYPE = missing h1; parameterization=theta; iterations= 50000; MODEL: delinq by lvdamage@1; delinq by lvcut (2); delinq by lvweapon (3); delinq by lvpunish (4); delinq by lvbeaten (5); delinq by lvthrt (6); delinq by lvver (7); delinq by lvstolen (8); expec by partyfun@1; expec by friendly (10); expec by join (11); expec by betrfren (12); expec by holiday (13); expec by silly (14); expec by caring (15); expec by betrmood (16); expec by drive (17); expec by homework (18); everalc on delinq expec; expec on delinq; model female: delinq by lvdamage@1; delinq by lvcut (102); delinq by lvweapon (103); delinq by lvpunish (104); delinq by lvbeaten (105); delinq by lvthrt (106); delinq by lvver (107); delinq by lvstolen (108); expec by partyfun@1; expec by friendly (110); expec by join (111); expec by betrfren (112); expec by holiday (113); 
expec by silly (114); expec by caring (115); expec by betrmood (116); expec by drive (117); expec by homework (118); OUTPUT: tech1 tech2 tech4 STANDARDIZED ; SAVEDATA: DIFFTEST IS sexload.dat; *** WARNING Data set contains unknown or missing values for GROUPING, PATTERN, COHORT and/or CLUSTER variables. Number of cases with unknown or missing values: 454 1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS 


With categorical outcomes, you must have thresholds and factor loadings both held equal or both free. You can't relax the constraint on a factor loading without relaxing the constraint on the threshold for the same item. I don't see that you have thresholds free in your MODEL command. Examples 5.16 and 5.17 in the Mplus User's Guide show a multiple group CFA with categorical factor indicators. To test measurement invariance, you would first run the default overall model where factor loadings and thresholds are held equal as the default. The second model is one where factor loadings and thresholds are unequal across groups. In this model, with the Delta parameterization, scale factors must be fixed to one in all groups and factor means fixed to zero in all groups. With the Theta parameterization, residual variances must be fixed to one in all groups and factor means fixed to zero in all groups. How to relax the default equality is shown in Example 5.16. 
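The less restrictive model described above might be set up roughly as follows. This is only a sketch under the Theta parameterization: the factor f, the indicators u1-u3, the grouping variable g, and the file name are hypothetical and not taken from the poster's input.

```mplus
DATA:
  FILE IS example.dat;            ! hypothetical file name
VARIABLE:
  NAMES ARE u1-u3 g;              ! hypothetical variable names
  CATEGORICAL ARE u1-u3;
  GROUPING IS g (1 = male 2 = female);
ANALYSIS:
  PARAMETERIZATION = THETA;
MODEL:
  f BY u1-u3;                     ! first loading is fixed at one by default
  [f@0];                          ! factor mean fixed at zero in all groups
MODEL female:
  f BY u2-u3;                     ! loadings no longer held equal to males
  [u1$1 u2$1 u3$1];               ! thresholds freed together with loadings
  u1-u3@1;                        ! residual variances fixed at one here too
  ! With the Delta parameterization the last statement would instead
  ! fix the scale factors:  {u1-u3@1};
```

Residual variances (Theta) or scale factors (Delta) are already fixed at one in the first group by default, so the sketch only fixes them in the second group.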


Hello, I am running a series of multigroup (male and female) CFAs with continuous factors in an attempt to test measurement invariance (a la Bollen 1989). Moving to increasingly more restrictive constraints (factor loadings, intercepts, means, and variance-covariances), I am now ready to constrain error variance-covariances. However, I am unclear on 1) what the default treatment of error variances is in Mplus and 2) how to constrain them to be equal between groups. Can you tell me what programming language I need to constrain error variances? An example program is as follows: GROUPING is female (0=male 1=female); USEVARIABLES ARE da1 da2 da3 da4 da5 da6 da7 da9 da10 da11 da12 da13 da14 da15 da16 da17 da18 da19; missing = .; ANALYSIS: type=meanstructure; MODEL: depress by da1-da5* da6@1 da7* da9-da19*; da14 with da17; da15 with da11; da9 with da19; da7 with da18; da4 with da11; da4 with da15; da18 with da5; da5 with da7; !Variances; da1 (1) da2 (2) da3 (3) da4 (4) da5 (5) da6 (6) da7 (7) da9 (8) da10 (9) da11 (10) da12 (11) da13 (12) da14 (13) da15 (14) da16 (15) da17 (16) da18 (17) da19 (18); depress (19); MODEL female: [depress@0]; Thanks very much for your help! 

bmuthen posted on Sunday, November 14, 2004  11:24 am



The default is that the error (co)variances are allowed to differ across groups. Your input specifies that the error variances are the same across groups since you have in the overall part of your model the statements da1 (1); etc 
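As a small illustration of this point, the following sketch holds three residual variances equal across groups by labeling them in the overall MODEL command. The indicator list is shortened to da1-da3 purely for illustration; each label sits on its own line, following the one-parenthesis-per-line rule mentioned earlier in this thread.

```mplus
MODEL:
  depress BY da1-da3;   ! shortened indicator list, for illustration only
  da1 (1);              ! residual variance of da1 equal across groups
  da2 (2);              ! residual variance of da2 equal across groups
  da3 (3);              ! residual variance of da3 equal across groups
```

Removing the three labeled statements returns to the default, in which the residual variances are free to differ across groups.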


This is probably too basic a question, but when asked by my PhD supervisor I was unable to answer. He has no experience with Mplus, and we are both on a steep learning curve. I am running SEM with three groups of about 70 participants of 6, 8, and 10 years. If I use a multiple group format for the SEM, what exactly am I doing? Am I correcting for or accounting for group differences? Similarly, when would I use CLASS and when would I use CLUSTER? Thank you. 

Mary posted on Tuesday, November 16, 2004  6:15 am



Dear Mr and Mrs Muthén, I have a very simple question regarding the grouping option. Besides the constraint that forces the loadings to be equal across the groups, are there any other differences between running a regression with the grouping option or running each group as a different regression? Thank you very much! 


Re: Larry Cashion. Multiple group analysis is used to study parameter estimates across groups of different observations. In your case, you would be studying differences in parameter estimates across age. The CLASSES option is used to define categorical latent variables in mixture models. The CLUSTER option is used to name the cluster variable in an analysis of complex survey data, that is, data that are not collected as a simple random sample. 
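To make the distinction concrete, here are three separate input fragments, one per option. They are sketches only: the variable names age, c, and school, the group labels, and the number of classes are hypothetical.

```mplus
! (a) Multiple group analysis: groups are observed (here, three ages)
VARIABLE:
  GROUPING IS age (6 = six 8 = eight 10 = ten);

! (b) Mixture modeling: groups are unobserved latent classes
VARIABLE:
  CLASSES = c (2);
ANALYSIS:
  TYPE = MIXTURE;

! (c) Complex survey data: observations are nested in clusters
VARIABLE:
  CLUSTER = school;
ANALYSIS:
  TYPE = COMPLEX;
```

Each fragment belongs to a different input file; they are not meant to be combined as written.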


I assume that you are asking whether a CFA with covariates will result in different parameter estimates when all parameters are free or if you run the analysis on each group separately. If all parameters are free across groups, the results should be the same. 

Anonymous posted on Tuesday, November 30, 2004  12:51 pm



I have a question about reporting factor means. I conducted multiple, multigroup analyses, and I tested invariance across gender, age, and race. Now, for the manuscript, I would like to report factor means. However, the factor means for one group in each of the multigroup analyses are set to zero. Is my only option to report:

Mean Conduct Problems
Men 0
Women 1.2
Caucasian 0
African American 2.12
Etc....

Thank you 

bmuthen posted on Tuesday, November 30, 2004  5:43 pm



Yes, factor means need to be fixed to zero in one group for identification purposes. You should view this group as the reference group to which the factor means of the other groups are compared. So that's how you want to portray it in your reporting. Another way of saying this is that it is really only the factor mean difference between the groups that is identifiable. 


I have what I think is a simple question. I have two covariance matrices for which I would like to run a multigroup analysis. All Mplus examples I have seen on the website and in the manual assume that one has raw data with a grouping variable present on the dataset. 1) Can one model with two (or more) covariance matrices instead? 2) If so, could you provide some example syntax? Thank you muchly! 


See the discussion of multiple group analysis in Chapter 13 of the Mplus User's Guide. The only difference is how you refer to the groups. Note that some estimators require raw data. 
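A rough sketch of what such an input might look like with summary data is given below. The file name, sample sizes, and variable names are hypothetical, and the assumption that the groups are then referred to as g1, g2, and so on should be checked against the User's Guide chapter cited above. Depending on the Mplus version, an analysis without a mean structure may also need to be requested explicitly when only covariance matrices are supplied.

```mplus
DATA:
  FILE IS twogroups.dat;      ! hypothetical file: group 1 matrix, then group 2
  TYPE IS COVARIANCE;
  NGROUPS = 2;
  NOBSERVATIONS = 250 300;    ! hypothetical sample sizes, one per group
VARIABLE:
  NAMES ARE y1-y4;            ! hypothetical variable names
MODEL:
  f BY y1-y4;
MODEL g2:                     ! assumed label for the second group
  f BY y2-y4;                 ! example: free the loadings in group 2
```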

Anonymous posted on Tuesday, December 14, 2004  2:46 pm



Hi  I am trying to use the Difftest option to test measurement invariance of factor loadings and thresholds across sex. The first model allows factor loadings and thresholds to vary across groups. The second model constrains factor loadings and thresholds to be equal across groups. I keep getting an error message saying my models are not nested. Could you help me determine why they are not nested? The first model  MODEL: delinq by lvdamage@1 lvcut lvweapon lvpunish lvbeaten lvthrt lvver lvstolen; lvdamage@1; lvcut@1; lvweapon@1; lvpunish@1; lvbeaten@1; lvthrt@1; lvver@1; lvstolen@1; expec by partyfun betrmood caring friendly join betrfren holiday silly homework drive; partyfun@1; betrmood@1; caring@1; friendly@1; join@1; betrfren@1; holiday@1; silly@1; homework@1; drive@1; [expec@0]; [delinq@0]; daysalc on delinq expec; expec on delinq; model indirect: daysalc IND expec delinq; model female: delinq by lvdamage; [lvdamage$1]; delinq by lvcut; [lvcut$1]; delinq by lvweapon; [lvweapon$1]; delinq by lvpunish; [lvpunish$1]; delinq by lvthrt; [lvthrt$1]; delinq by lvver; [lvver$1]; delinq by lvstolen; [lvstolen$1]; expec by betrmood; [betrmood$1]; expec by caring; [caring$1]; expec by friendly; [friendly$1]; expec by join; [join$1]; expec by betrfren; [betrfren$1]; expec by holiday; [holiday$1]; expec by silly; [silly$1]; expec by homework; [homework$1]; expec by drive; [drive$1]; SAVEDATA: Difftest = h1.dat OUTPUT: sampstat STANDARDIZED; The second Model: MODEL: delinq by lvdamage@1 lvcut lvweapon lvpunish lvbeaten lvthrt lvver lvstolen; expec by partyfun@1 friendly join betrfren holiday silly caring betrmood drive homework; daysalc on delinq; expec on delinq; daysalc on expec; model indirect: daysalc IND expec delinq; Thank you 


Can you send both outputs and your data to support@statmodel.com. 


I just did this nested model testing for examples in the user's guide using both the Delta and Theta parameterization and it worked fine. I would be happy to send you the setups if you give me your email address. 

Anonymous posted on Friday, December 31, 2004  9:22 am



Could you direct me to the documentation for the new difftest that's available in MPlus when using the WLSMV estimator? 


There is no paper that explicitly describes this. DIFFTEST is based on principles described in: Muthén, B., du Toit, S.H.C. & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. which can be requested from bmuthen@ucla.edu. 


Hi, I would like to use WLSMV for testing invariance for a CFA model with dichotomous data. I see that Mplus has the DIFFTEST command, and I can use it to save the derivatives, but I'm unsure what to do with them after that. Could you help me understand how to use this command? Thanks. Below is an example of the code I want to use. MODEL: F1 BY Y1@1 Y2-Y3*; F2 BY Y4@1 Y5-Y6*; F1-F2*; F1 WITH F2*; [Y1$1*1]; [Y2$1*.5]; [Y3$1*.25]; [Y4$1*0]; [Y5$1*.25]; [Y6$1*.5]; MODEL G2: F1 BY Y2*; [Y2$1*.5]; (Y2@1); MODEL: F1 BY Y1@1 Y2-Y3*; F2 BY Y4@1 Y5-Y6*; F1-F2*; F1 WITH F2*; [Y1$1*1]; [Y2$1*.5]; [Y3$1*.25]; [Y4$1*0]; [Y5$1*.25]; [Y6$1*.5]; SAVEDATA: DIFFTEST IS OUT.DAT; RESULTS ARE CATDIFFRESULTS.DAT; 


See Chapter 12 of the Version 3 Mplus User's Guide. There is an example of how to use the DIFFTEST option. 
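In outline, the two-step use of DIFFTEST looks something like the sketch below: the less restrictive model is estimated first and saves the derivatives, and the more restrictive model is then estimated with the saved file named in the ANALYSIS command. The model statements, the group label female, and the file name deriv.dat are hypothetical and only mark where the DIFFTEST statements go.

```mplus
! ---- Input 1: less restrictive (H1) model ----
ANALYSIS:
  ESTIMATOR = WLSMV;
MODEL:
  f BY u1-u3;
  [f@0];
MODEL female:
  f BY u2-u3;
  [u1$1 u2$1 u3$1];
  {u1-u3@1};
SAVEDATA:
  DIFFTEST IS deriv.dat;      ! save derivatives from the H1 model

! ---- Input 2: more restrictive (H0) model ----
ANALYSIS:
  ESTIMATOR = WLSMV;
  DIFFTEST IS deriv.dat;      ! read the derivatives saved by input 1
MODEL:
  f BY u1-u3;                 ! default: loadings and thresholds held equal
```

The chi-square difference test is then reported in the output of the second run.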


Thanks very much. 

Holmes Finch posted on Wednesday, January 19, 2005  5:22 am



I appreciate your directing me to the discussion in the manual regarding using DIFFTEST for WLSMV. I'm using this command in a simulation study, and was wondering if it's possible to save the results of the chi-square difference test for WLSMV that one gets using DIFFTEST. I couldn't find it in the file produced by the RESULTS command, and have looked through the manual, but haven't found anything. Thanks in advance. 


I don't think this is possible but send the question to support@statmodel.com. Thuy would know for sure. 

Anonymous posted on Monday, February 07, 2005  9:28 am



I am doing a multiple group analysis with 4 latent variables and one categorical outcome variable. I am using: analysis: type=mgroup MISSING h1; iterations=100; PARAMETERIZATION=THETA; estimator= WLSMV; The output is giving me this message: SERIOUS COMPUTATIONAL PROBLEMS OCCURRED IN THE BIVARIATE ESTIMATION OF THE CORRELATION FOR VARIABLES PERSISTE AND IIE72. CHECK YOUR DATA. IF THE PROGRAM RECOVERS FOR THIS PAIR OF VARIABLES (SEE TECHNICAL 6 OUTPUT), THE ESTIMATES ARE VALID. THE PROBLEM OCCURRED FOR THE FOLLOWING OBSERVATION(S): OBSERVATION 3 OBSERVATION 3 COMPUTATIONAL PROBLEMS ESTIMATING THE CORRELATION FOR PERSISTE AND IIE72 I have checked my data and I don't find a problem with it. The TECH6 output is not provided with the type of analysis I am doing. What are my alternatives to fix this problem? 


I am doing a multiple group analysis with 3 groups. I am not getting any fit statistics with the model results, except AIC and BIC. Can you tell me how to get chi-square, TLI, IFI, and RMSEA? Thanks! 


If you are not getting any fit statistics, it is most likely the case that they are not available for the model you are estimating. If you send your full output to support@statmodel.com, I can determine the reason. 

Anonymous posted on Friday, February 18, 2005  2:08 pm



I just ran a multigroup analysis to test differences in mediation across race. I can test whether the paths of the mediation model are significantly different across groups. Is there a way to test whether the mediated effect (or proportion mediated) is statistically different across groups? 

bmuthen posted on Friday, February 18, 2005  5:16 pm



If you have the estimate of the mediated effect and its SE for each of the 2 groups, you can simply use those numbers to create the approximately normal test variable (e1 - e2)/se(e1 - e2), where the denominator is sqrt(var(e1 - e2)), var(e1 - e2) is var(e1) + var(e2), and var(e) is the square of SE(e). 
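Written out, the test variable described above is the following, where e_1 and e_2 are the two group-specific estimates of the mediated effect. Adding the two variances assumes that the groups are independent samples, so that the estimates are uncorrelated.

```latex
z \;=\; \frac{e_1 - e_2}{\sqrt{\operatorname{var}(e_1) + \operatorname{var}(e_2)}}
  \;=\; \frac{e_1 - e_2}{\sqrt{SE(e_1)^2 + SE(e_2)^2}}
```

The resulting z is compared with the standard normal distribution, for example against 1.96 for a two-sided test at the .05 level.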

Anonymous posted on Wednesday, March 23, 2005  7:47 am



I did a multigroup analysis and a DIFFTEST. The DIFFTEST yielded a chi-square difference value of 13.237 with 1 degree of freedom (the difference between the more restrictive H0 model and the H1 model is only one parameter), which is statistically significant at the .05 probability level. Does this mean that the less restrictive model H1 (in which the parameter was allowed to be estimated freely) fits better than the more restrictive H0 model and therefore should be used in my further analysis? I ask this question because I tested the same model with AMOS (only difference: the 6-scale indicators of the latent variable were treated as continuous variables) and I got nearly identical results (Estimator: ML), except for the mentioned parameter. Setting this parameter equal across both groups results in AMOS in a significantly better cmin/df. 

bmuthen posted on Wednesday, March 23, 2005  7:49 am



The answer is yes. 

Anonymous posted on Friday, April 22, 2005  11:49 am



When is multigroup analysis more appropriate than running a regression with interactions? The variances of my variables are quite different across groups, and I am wondering if this is why the multigroup analysis is telling me the groups are different but the regression with interactions is telling me the groups are the same. I was thinking this disparity arises because the multigroup analysis takes variances by group, whereas regression with interactions takes pooled variances. 

bmuthen posted on Friday, April 22, 2005  3:15 pm



It sounds like you are correct. 

Anonymous posted on Monday, April 25, 2005  10:19 am



Dr. Muthen, I am trying to specify the baseline model, that is, the model without any constraints, for a multiple group analysis (three groups). Can I use the sum of the df as a check to see whether it ran without any constraints? I ran an individual model where I had an estimated df = 19, and then I ran a multiple group model where I had an estimated df = 63. If not, how can I check whether my syntax gives the correct model without any constraints? I used analysis: type = gen missing h1; estimator=mlr; Thanks, 


You can look at TECH1 of the OUTPUT command to see if you have the model that you want. 

Anonymous posted on Tuesday, April 26, 2005  2:15 am



Multigroup comparison & Sample size_lisrel 1. I wonder, when testing the measurement model for invariance across groups, what should the PSIs and the BETAs be (IN or PS)? 2. When sample size is >3000, is it then appropriate to use MIs > 5 (as in Byrne, Shavelson and Muthen, 1989) as a criterion for releasing parameters? 


I do not understand what IN and PS are, but structural parameters such as factor means, variances, covariances, and regression coefficients do not need to be held equal for measurement invariance. I would, however, have the same structural parameters in the groups while testing measurement invariance. I think the rule of thumb of 5 probably has little meaning at this time. 

Anonymous posted on Monday, May 09, 2005  2:20 am



I am doing a multigroup analysis using the theta parameterization and having a dichotomous outcome. If I understood it correctly, the factor loadings are held equal across the groups as well as the means and intercepts. If I want to free the factor loadings and the thresholds, I have to do it simultaneously and I HAVE to fix the residual variances in all groups to one and the factor means in all groups to zero. Is that correct? I ask this because if I do a chi-square difference test between a model with factor means fixed to zero in the first group and free in the other group and a model with factor means fixed to zero in both groups, the result speaks clearly against the second model. 


It is the factor loadings and thresholds of the factor indicators that are held equal as the default. In the default model, factor means are fixed to zero in the first group and are free to be estimated in the other groups. With the theta parameterization, residual variances of the factor indicators are fixed to one in the first group and are free to be estimated in the other groups. You are correct that when you free factor loadings and thresholds, all factor means should be fixed to zero and all residual variances should be fixed to one. 

Anonymous posted on Thursday, June 23, 2005  8:34 am



After running a separate analysis for males (M) and females (F), I ran a multiple group with no constraints. However, my chisquare and df values for M and F do not add up to the chi square and df for the multiple group no constraints model. I have provided my syntax for the M model (the F model is the same  I do get the same number of df for the M and F when I run them separately). I have also included my syntax for the multiple group (MG) no constraints model. Each separate model has 154 df and the MG model has 320 df. MG model syntax: VARIABLE: ... MISSING = BLANK ; GROUPING IS gender (0=female 1=male) ; ANALYSIS: TYPE = MISSING H1; MODEL: extprob BY T1delinq T1agg ; risk BY MomBSI Finstrai Neighpro ; intprob BY T1somati T1Withdr T1anxiou ; pospar BY Monitor MCTrust SchInvol ; devpeer BY SchFr NeighFr PeerDelq ; extprob2 BY T2delinq T2agg ; intprob2 BY T2somati T2withdr T2anxiou ; pospar ON risk ; devpeer ON pospar T1parstr; T1parstr ON risk ; extprob2 ON devpeer extprob; intprob2 ON devpeer intprob; T2delinq WITH T1delinq ; T2agg WITH T1agg ; T2somati WITH T1somati ; T2withdr WITH T1withdr ; T2anxiou WITH T1anxiou ; MODEL male: extprob BY T1agg ; risk BY Finstrai Neighpro ; intprob BY T1Withdr T1anxiou ; pospar BY MCTrust SchInvol ; devpeer BY NeighFr PeerDelq ; extprob2 BY T2agg ; intprob2 BY T2withdr T2anxiou ; OUTPUT: STANDARDIZED MODINDICES(3.84) SAMPSTAT TECH1 ; Separate model: VARIABLE: ... 
MISSING=BLANK ; ANALYSIS: TYPE = MISSING H1; MODEL: extprob BY T1delinq T1agg ; risk BY MomBSI Finstrai Neighpro ; intprob BY T1somati T1Withdr T1anxiou ; pospar BY Monitor MCTrust SchInvol ; devpeer BY SchFr NeighFr PeerDelq ; extprob2 BY T2delinq T2agg ; intprob2 BY T2somati T2withdr T2anxiou ; pospar ON risk ; devpeer ON pospar T1parstr; T1parstr ON risk ; extprob2 ON devpeer extprob; intprob2 ON devpeer intprob; T2delinq WITH T1delinq ; T2agg WITH T1agg ; T2somati WITH T1somati ; T2withdr WITH T1withdr ; T2anxiou WITH T1anxiou ; OUTPUT: STANDARDIZED MODINDICES(3.84) SAMPSTAT ; Your help is greatly appreciated. 

BMuthen posted on Friday, June 24, 2005  1:51 am



This is a support question. Please send your outputs, data, and license number to support@statmodel.com. 

Anonymous posted on Wednesday, July 27, 2005  4:05 pm



If you find a model is different across two or more groups, is it best to test them simultaneously and get one set of model statistics? Or is it better to split the sample and test the models for each sample separately and get separate sets of model statistics? 

bmuthen posted on Wednesday, July 27, 2005  6:39 pm



If all parameters are different across groups, it is simpler to work with each group separately. But as long as some parameters are equal across groups you benefit from a simultaneous analysis. 


Hello, I am testing measurement invariance for a single construct that was measured at different time points. I use multiple CFA in Mplus where the different groups represent the different measurement occasions. I would like to model covariances between the like items' error variances across occasions. I do not know how to model this in a multiple CFA framework in Mplus. Any suggestions will be highly appreciated. Thank you 


You should not use different groups to represent different measurement occasions because in multiple group analysis each group should contain independent observations. The following is the input for a multiple indicator factor model with four measurement occasions: MODEL: f1 BY y11 y21 (1); f2 BY y12 y22 (1); f3 BY y13 y23 (1); f4 BY y14 y24 (1); [y11 y12 y13 y14] (2); [y21 y22 y23 y24] (3); [f1@0 f2 f3 f4]; If you want a residual covariance, you would state, for example: y13 WITH y14; 
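Extending the single WITH statement above to all like-item pairs might look like the sketch below. It uses the naming from the input above, where the first digit is the item and the second is the occasion. Whether all of these covariances can be identified and estimated depends on the rest of the model, so treat this as a starting point rather than a recommended specification.

```mplus
MODEL:
  ! residual covariances for item 1 across the four occasions
  y11 WITH y12 y13 y14;
  y12 WITH y13 y14;
  y13 WITH y14;
  ! residual covariances for item 2 across the four occasions
  y21 WITH y22 y23 y24;
  y22 WITH y23 y24;
  y23 WITH y24;
```

These statements would be added to the MODEL command alongside the BY and bracket statements.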


I have a SEM model with two latent endogenous variables that I am treating as continuous and using an MLR estimator. I am testing invariance of the model using the grouping option in Mplus and I have been able to do most of what I want. I am confused, however, about how to constrain the means of my latent factors to be equal across my groups. Can this be done for latent endogenous variables? Related to this, above Dr. Muthen notes that factor means must be set to zero in one group to identify the model, but then why is my tech4 output giving me an estimated mean for my latent variables in both groups? I do see that the intercept for my latent is set to zero in the first group, but I'm somehow missing the connection here. Thanks for any help you can give me. 


In a model where intercepts are estimated for the latent variables, there is not a straightforward test of whether means are equal. In a model where you are estimating means, not intercepts, you can test that means are equal by fixing the means to zero in all groups. The model estimated means in TECH4 are based on the model. When a latent variable is endogenous, its mean is equal to the intercept plus the regression coefficients times the means of the exogenous variables it is regressed on. 
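In symbols, the last sentence can be written as follows. The notation is an assumption made for illustration: \eta is the endogenous latent variable, \alpha its intercept, and \beta_k the coefficient for the k-th exogenous variable \xi_k.

```latex
E(\eta) \;=\; \alpha \;+\; \sum_{k} \beta_k \, E(\xi_k)
```

This is why TECH4 can show a nonzero mean in the first group even though the intercept \alpha is fixed at zero there: the second term is generally not zero.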


Hello Linda and Bengt, I am wondering if Mplus allows me to answer an empirical question. I have employees’ data from 31 organizations. My model includes three latent variables at the individual level: job satisfaction, job performance, and worker’s belief. My DV is job performance, my IV is job satisfaction. I conducted multi-sample analyses and I found that the relationship between my DV and IV varies across organizations (i.e. is moderated by organization). Now, I want to test if this moderating effect of organization on the relationship between my DV and IV is partially mediated by worker’s belief. Is this even possible in Mplus? If it is, could you please refer me to some material that deals with this type of problem? Thanks in advance for your help, Pancho 

Boliang Guo posted on Tuesday, November 15, 2005  1:49 am



In your case there are 31 organizations, so I think you can consider modeling a two-level path analysis, which considers the mediating effect after partialling out the level-2 effects. If you do not have level-2 variables in your model, just leave the intercept and slope random in the model. 31 level-2 units is better for multilevel analysis; anyway, try checking the level-2 variance of the intercept and slope first. 


Hello Linda and Bengt, I'm wondering if I can conduct the following analysis in Mplus. I modified Examples 9.9 and 9.10 from the Mplus Version 3 User's Guide, pages 205-207. I have 31 clusters; would that be a large enough number of clusters? TITLE: this is an example of two-level CFA with continuous factor indicators, covariates, and random slopes DATA: FILE IS ex9.9.dat; VARIABLE: NAMES ARE y1-y4 x1-x4 w clus; CLUSTER = clus; BETWEEN = w; ANALYSIS: TYPE = TWOLEVEL RANDOM; ALGORITHM = INTEGRATION; INTEGRATION = 10; MODEL: %WITHIN% fw1 BY y1-y4; fw2 BY x1-x4; s | fw1 ON fw2; %BETWEEN% fb BY y1-y4; y1-y4@0; fb s ON w; Thanks a lot, Pancho 

bmuthen posted on Thursday, November 17, 2005  5:14 am



Yes, this model can be estimated in Mplus. It may however require a long computing time. 31 clusters is on the border of being too low. Note that 31 is the sample size for between parameters. You have only 7 between parameters so you are probably ok. 


Is it possible to do a multigroup analysis using a covariance matrix as the input if the group variable was included in the matrix? Or if it is not in the covariance matrix, but you know how many groups and the number of respondents by group...but the covariance matrix is not separated out by group? Thanks, 


It is possible to do a multiple group analysis using covariance matrices for some estimators. How to do this is described in Chapter 13 under Multiple Group Analysis, Data In Multiple Group Analysis, Summary Data One Dataset. The grouping variable is not part of the matrices. 

Carol posted on Friday, February 10, 2006  9:09 am



Hello Dr. Muthen, I am running a twin model in MPlus using Carol Prescott's examples as a template. In my latest model I ran into the following error message: WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IN GROUP MZ18 IS NOT POSITIVE DEFINITE. PROBLEM INVOLVING VARIABLE A2. Why might this happen and what are the implications in terms of parameter estimates and fit statistics? Thank you, Carol 

bmuthen posted on Friday, February 10, 2006  7:40 pm



This message is ok for twin modeling where the A factors are fixed to correlate 1.0 for MZs. The warning message is good in general where you don't want factor correlations of 1.0. In your case, you can ignore it. If you are doing twin modeling, you will enjoy new features in Mplus Version 4 which will be out in a few weeks. 


Hello, I am using PLS to verify gender differences in the factors that influence small firms' performance. I ran the full model and then one separately for males and females. I am wondering how I could do the multigroup analysis and what to compare. Is it the path coefficients, the T statistics, or the means? I used PLS Graph 3.0. Or is there a way to run the whole model using multigroup analysis? 


There will be a description of testing for measurement invariance in the Version 4 Mplus User's Guide which will be available online next week. I don't know anything about PLS. 


If I have 3 groups 1 = low 2 = medium and 3 = high and I want to test the invariance of a structural path between the low and high group only (so that my degrees of freedom difference is 1), would I use: MODEL: F1 on F2 (1); MODEL Medium: F1 on F2; So that only group 1 and 3 are held equal...does this sound reasonable? 


It sounds reasonable. 
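Laid out as an input fragment, the setup in the question reads as follows. The grouping variable name g is hypothetical; the group labels follow the question.

```mplus
VARIABLE:
  GROUPING IS g (1 = low 2 = medium 3 = high);  ! g is a hypothetical name
MODEL:
  f1 ON f2 (1);     ! labeled in the overall MODEL: held equal across groups
MODEL medium:
  f1 ON f2;         ! mentioned without the label: free in the medium group
```

Comparing this against a model in which the path is free in all three groups gives the 1 degree of freedom test of low versus high that the question describes.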


Hello, I am trying to conduct a multiple group analysis by testing the invariance of first-order factor loadings on second-order factors. When I ran a fully constrained model, the result indicated that the factor loadings of the first-order factors on the second-order factor were not equivalent. It showed that factor loadings, intercepts and thresholds of observed variables were constrained. How can I constrain the factor loadings of the first-order factors on the second-order factor? Thank you. 


I am not clear what you mean. See Example 5.6 in the Mplus User's Guide. This is a second-order factor analysis model. Tell me which paths in that model you want to constrain to be equal. 


Hello, Dr. Muthen: The questionnaire has 33 items, each on a 5-point Likert scale. Using CFA, I constructed a measurement model with 5 factors. Then I tested measurement invariance across two groups. I first let the factor loadings and item thresholds be freely estimated, but held the scale factors of the items at 1 and the factor means at 0 in both groups. This gave a chi-square value of 1583.775. Then I constrained the factor loadings and item thresholds to be equal across groups. The chi-square value for the more restrictive model was 921.745*. However, the chi-square difference is a positive 26.589. I used DIFFTEST for the chi-square difference test because I used the WLSMV estimator. Is it possible for the chi-square of the more restrictive model to be smaller than the chi-square of the less restrictive model? Am I doing this right? Thank you so much! Best regards, Zhongmiao Wang 


With WLSMV, it is only the p-values of the chi-square that you should be interpreting for each analysis. This is why we have the DIFFTEST option for comparing two models. 
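
The two-step DIFFTEST procedure can be sketched as follows. The file name deriv.dat is a placeholder, and the rest of each input (DATA, VARIABLE, MODEL) is omitted.

```mplus
! Run 1: less restrictive (H1) model; save the derivatives needed for the test
ANALYSIS:
  ESTIMATOR = WLSMV;
SAVEDATA:
  DIFFTEST IS deriv.dat;     ! placeholder file name

! Run 2: more restrictive (H0) model; read the derivatives saved in run 1
ANALYSIS:
  ESTIMATOR = WLSMV;
  DIFFTEST IS deriv.dat;
```

The chi-square difference test and its p-value appear in the output of the second run.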


I would like to use the factor scores from a multiple group analysis with continuous variables to graph the relationship between two latent variables. However, the factor scores from the multiple group analysis do not seem accurate; that is, some of the children with high scores on the observed variables have very low factor scores (e.g., -3.8), while others with nearly identical scores on the observed variables have high factor scores (e.g., 2.0). When I examine the factor scores computed from the two single group analyses, they appear as expected, with high scores on the observed variables translating into high factor scores. Why are the factor scores from the multiple group analysis markedly different from those from the single group analyses? Why do they not reflect the trends seen in the observed variables? Thank you for your time, Dave Barker 


This is a question that would require you to send your input, data, outputs, and license number to support@statmodel.com. If you are not using the most recent version of Mplus, I would suggest that as a first step. 

HW posted on Friday, June 16, 2006  12:21 pm



I am working with 5 groups, and would like to test for structural invariance doing pairwise comparisons. I know this code: model: x on y1 (1) y2 (2) y3 (3); will result in a test of equivalence for y1 (and y2, y3) across all groups. How can I code it so that only group 2 and group 3 (for example) are being compared? I am evaluating the significance of between-group differences using the chi-square difference test, incorporating the scaling correction factor (I am using WLSM estimation). Thanks 


You need to use group-specific MODEL commands to achieve this. MODEL: x on y1 y2 y3; MODEL g2: x on y1 (1) y2 (2) y3 (3); MODEL g3: x on y1 (1) y2 (2) y3 (3); 
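
Laid out with line breaks (the forum display runs the lines together), the input in this reply might look like the sketch below. Each label sits on its own line because a label applies to every parameter on its line. The group labels g2 and g3 are taken from the reply and are assumed to match the labels given in the GROUPING option.

```mplus
MODEL:
  x ON y1 y2 y3;     ! no labels: coefficients free in every group
MODEL g2:
  x ON y1 (1)
       y2 (2)
       y3 (3);
MODEL g3:
  x ON y1 (1)        ! same labels as in g2: equal across g2 and g3 only
       y2 (2)
       y3 (3);
```

Relative to the model with no labels, this imposes three restrictions, one per coefficient.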

Ronald Cox posted on Friday, June 16, 2006  5:59 pm



Hi I am testing to see if measurement invariance in a CFA model holds for a repeated measures study. I am fitting the same model simultaneously in both samples (time 1 and time 2), without any parameter constraints in order to create a baseline model. However I am getting an error message of "insufficient data" I am using the demo version. Do you have any suggestions what I might be doing wrong? My input file follows. Thanks, INPUT INSTRUCTIONS TITLE: Baseline model 10th and 11th graders STEP 3) DATA: FILE = assig6data3.1.INP; TYPE = COVARIANCE; NGROUPS= 2; NOBS = 220 220; VARI: NAMES = CA11 CA12 CA13 CA21 CA22 CA23; MODEL: CASPIRE1 BY CA11 CA12 CA13; CASPIRE2 BY CA21 CA22 CA23; MODEL G2: CASPIRE1 BY CA11 CA12 CA13; CASPIRE2 BY CA21 CA22 CA23; *** ERROR Insufficient data in "assig6data3.1.INP" 


This means that Mplus is not finding enough information in the data file. You need to place the covariance matrix for group 1 first followed by the covariance matrix for group 2. See Chapter 13 where this is described. If you can't solve this, you need to send your input, data, and output to support@statmodel.com. 

HW posted on Friday, July 07, 2006  7:41 am



A few questions regarding multiple group comparisons: 1) I have read that Kenny recommends testing for structural invariance before testing for invariance of error covariances. What would be the harm in testing for invariance of error covariances prior to testing for structural invariance? 2) If forcing two structural parameters to be equal results in a non-positive definite latent variable covariance matrix OR model non-convergence, what should be done about this? What would be the next step? 3) I have read previously on the Mplus discussion board that if a scaled chi-square difference test doesn't run due to negative chi-square difference values, this is a function of the method, and it is not possible to conclude whether the parameters are equal or not in each group. Can you provide a reference for this? Thanks. 


1. There is no harm but invariance of error covariances is less likely than structural invariance. 2. This may indicate that the structural parameters should not be held equal. 3. There is a Satorra and Bentler article about this from a few years ago. I don't know the exact reference. 

HW posted on Monday, July 17, 2006  11:14 am



When testing between-group differences, should it always be a change of one degree of freedom between models? If I hold a parameter equal across all groups, I get a change of four degrees of freedom. Should I be putting equality constraints between two groups at a time? 


This would depend on your hypothesis. 
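
As a worked count, assuming a single parameter and G groups: holding it equal across all groups imposes G - 1 restrictions, while equating it in only two of the groups imposes one.

```latex
\Delta df_{\text{all groups}} = G - 1 \quad (G = 5 \;\Rightarrow\; \Delta df = 4),
\qquad
\Delta df_{\text{one pair}} = 1
```

So the four-degree-of-freedom test asks whether the parameter differs anywhere among the five groups, and a one-degree-of-freedom test asks about one specific pair.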


Hi, I'm working on a two-group model with uneven group sizes. The grouping variable is high school team sport participation among females. The first group has 128 participants. The second (no teams) group has only 43 individuals. I ran the two-group model and all the fit indices are good, including a non-significant chi-square and an RMSEA = 0 (0-.03). My question is whether this analysis is troublesome because of the large difference in group sizes. 


I don't think there should be a problem due to different group sizes other than the larger one has more power than the smaller one. 


Thanks. 


Is it possible to do a multiple group path analysis? If so, what syntax would I use to constrain the paths (and test for significant differences)? I have three groups. Thanks in advance! MODEL: y1 on x1-x3; 


Yes. The following would hold the regression coefficients equal across groups: MODEL: y1 on x1 (1) x2 (2) x3 (3); 

Nina Zuna posted on Wednesday, August 23, 2006  1:56 pm



Dear Drs. Muthén, I am still in Mplus learning mode and came across something I don't understand. I ran 2 single-group CFAs for my 2 groups and then ran my initial multiple group (MG) CFA (configural invariance). Each used the MLR estimator. My single-group chi-squares using MLR do not add up to the MG total chi-square using MLR. If the same procedure is done using ML, they add up. Q1. What is different about MLR that makes the two separate chi-squares not add up to the MG chi-square? Second puzzling occurrence, regardless of the estimator used: I had always assumed in MG invariance testing that the group with the lower contribution to chi-square had better fit (ideally you want these numbers roughly equal in MG), but when I ran my single-group CFAs as described above I found that the opposite occurred. The group with the larger chi-square in MG had better fit when run as a single CFA than the group with the lower chi-square. The group with better fit statistics in the single CFA (higher chi-square) appears to be driving the fit statistics in the MG invariance tests. This group also has the larger n, so perhaps this power differential is the cause. However, I thought that since the CFI and TLI are comparative fit indices, they wouldn't be as influenced by sample size. I am quite confused by this. Q2. Any thoughts would be very much appreciated. Thank you kindly 


Q1. This is the same issue as MLR chi-square differences between nested models not being chi-square distributed. This topic is discussed on our web site; see the left margin How-To "chi-square difference test". Q2. Groups with larger n influence the parameter estimates more. And parameter estimates in turn influence CFI/TLI. You say "better fit statistics in single CFA (higher chi sq)"; that must be a typo, since a high chi-square is a worse fit statistic than a low chi-square. 
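
For reference, the scaled chi-square difference computation described on the Mplus web site can be written as below. The notation is an assumption made here for illustration: T_0, c_0, and d_0 are the scaled chi-square, scaling correction factor, and degrees of freedom of the nested (more restrictive) model, and T_1, c_1, and d_1 are those of the comparison model.

```latex
c_d = \frac{d_0\, c_0 - d_1\, c_1}{d_0 - d_1},
\qquad
T_{R_d} = \frac{T_0\, c_0 - T_1\, c_1}{c_d}
```

Because each model carries its own scaling correction factor, simply adding or subtracting scaled chi-square values does not give a chi-square distributed quantity. If c_d comes out negative, so does the test statistic.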

Nina Zuna posted on Wednesday, August 23, 2006  8:58 pm



Thanks for your response to Q1; it makes sense. As for the second question, I am still stumped because it is not a typo. Below is my output from the 2 single CFAs and the MG model. Chi-Sq Test of Model Fit, Disability group (n=112): Value 351.127*, df 183, P-Value 0.0000, CFI 0.810, TLI 0.782, RMSEA 0.091. As you will note, all 3 fit statistics are worse in this model with the lower chi-sq value. Chi-Sq Test of Model Fit, Non-Disability group (n=566): Value 514.760*, df 183, P-Value 0.000, CFI 0.906, TLI 0.892, RMSEA 0.057. Fit statistics are better in this model with the higher chi-sq. Chi-Sq Test of Model Fit, multiple group (Disability and Non-Disability): Value 882.407*, df 366, P-Value 0.0000, CFI 0.889, TLI 0.872, RMSEA 0.065. In the multiple group model, the fit statistics are in between those of the other two models, with the Non-Disability group, which has the higher chi-sq, seeming to dominate. Still perplexed... any thoughts? 


Convention suggests that CFI should be above 0.95 for a well-fitting model. I don't think one should compare fit indices when they all point to this degree of misfit. Degrees of poor fit can't really be judged well, I think. In any case, fit indices quite often disagree with each other. This is why it is useful to work with many; at least it is helpful in cases where they are all good. 

Nina Zuna posted on Thursday, August 24, 2006  6:39 pm



Thank you, Bengt; your continued follow-up is very much appreciated. Indeed, I agree the fit is bad for both groups. I continue to grapple with the fact that the model with the higher chi-square had the better fit. So based on your response, am I correct to assume that when model fit is this poor, one might see such anomalies as better fit for a model with a higher chi-square than for a model with a lower chi-square? I don't think I have seen this before. Everything I have read indicates that the chi-square value is a measure of badness of fit. Is there any explanation I could offer to my committee members on this discrepancy? With much gratitude (and final posting!), nz 


Your Non-Disability group has a worse chi-square but a better CFI than your Disability group. This may be due to the Non-Disability group having a much larger sample size, where sample size probably affects chi-square more than CFI. 
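
A rough check with the values posted above illustrates the point. It assumes the single-group RMSEA is computed as the square root of (chi-square - df) / (df x N); some programs use N - 1 in place of N, which gives the same values at three decimals here.

```latex
\text{RMSEA} = \sqrt{\frac{\chi^2 - df}{df \cdot N}}:
\qquad
\sqrt{\frac{351.127 - 183}{183 \cdot 112}} \approx 0.091,
\qquad
\sqrt{\frac{514.760 - 183}{183 \cdot 566}} \approx 0.057
```

The excess of chi-square over its degrees of freedom is larger in the Non-Disability group (about 332 versus 168), but it is spread over roughly five times as many observations, so the estimated misfit per observation is smaller.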

Nina Zuna posted on Friday, August 25, 2006  9:00 am



Again, thank you so much for your time. Your website and discussion board are such wonderful resources. I look forward to meeting you and learning from you in MD. nz 


In testing measurement invariance (categorical indicators: loadings and thresholds), does anyone have an idea how many invariant loadings/thresholds are needed to meet the criteria for partial measurement invariance? 


I don't think there are any definite guidelines for this. A few might be okay statistically, but the more important issue is if the construct can be argued to be the same across time or groups. 


would you please help me about description of multigroup analysis in LISREL. 


I don't know about LISREL. This forum is for Mplus. You would need to contact LISREL support. 

TAO, Sha posted on Thursday, March 01, 2007  12:58 pm



I am trying to do a 3 group SEM with summary data (correlation matrices and STDs). There are two predictors (one measured by 3 indicators, the other by 2 indicators), and one outcome measured by 3 indicators. This analysis is to examine the equality of the path parameters from the two predictors to the outcome. The script is specified as follows: TITLE: Grade 13 SEM: only paths from Independent LVs to the DV are constrained to equal ; DATA: FILE IS "D:\GRADE13.txt" ; TYPE = CORR MEANS STD ; NOBSERVATIONS = 100 100 100 ; NGROUPS = 3 ; VARIABLE: NAMES ARE OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ; ANALYSIS: TYPE = General ; ESTIMATOR is ML ; MODELS: LV1 by OV1@1 OV2 OV3 ; LV2 by OV4* OV5@1 ; LV3 by OV6@1 OV7 OV8 ; LV3 on LV1 LV2; LV1 with LV2 ; LV1 LV2 LV3; OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ; 

TAO, Sha posted on Thursday, March 01, 2007  12:59 pm



MODEL g2: LV1 by OV1@1 OV2 OV3 ; LV2 by OV4* OV5@1 ; LV3 by OV6@1 OV7 OV8 ; LV1 LV2 LV3; OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ; [LV1 LV2 LV3] ; [OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ] ; MODEL g3: LV1 by OV1@1 OV2 OV3 ; LV2 by OV4* OV5@1 ; LV3 by OV6@1 OV7 OV8 ; LV1 LV2 LV3; OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ; [LV1 LV2 LV3] ; [OV1 OV2 OV3 OV4 OV5 OV6 OV7 OV8 ] ; OUTPUT: STANDARDIZED SAMP ; When I run the analysis, MPLUS stopped with an error message: *** ERROR Insufficient data in "D:\GRADE13.txt" So I checked the summary data, and did not find anything wrong with the three matrices and STDs. Would you pls let me know what caused this error and how to fix it? Thanks a lot. 


Please do not continue your post into more than one window. Posts that cannot be fit into one window are not appropriate for Mplus Discussion. This is a support question. Please send your input, data, output, and license number to support@statmodel.com. 


I am trying to use multiple group analysis for a SEM model with two continuous latent independent variables and a variety of observed independent variables regressed on a count dependent variable. When I try to include the GROUPING option, I get the following error: ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE. However, I have not specified "ALGORITHM=INTEGRATION" and MIXTURE does not make sense for my model (I am using GENERAL). I tried using KNOWNCLASS to see if it would work, and it says: KNOWNCLASS option is only available with TYPE=MIXTURE. Any idea what the problem might be? Thanks so much in advance. 


Please send your input, data, output, and license number to support@statmodel.com. 

Linda posted on Thursday, April 26, 2007  8:50 am



I have an experimental data with multiple groups (3 intervention groups and 1 control group). I was told that I could create contrast between the groups and use it as an exogenous variable or use multiple group sample analysis. What is the advantage of doing one vs the other? Also, if I use the multiple group analysis, would I be including the control group as well? If this question is too basic, could you refer me to an article? I have done a search before and can't find an article that addresses my question. Thanks in advance. 


In multiple group analysis, more parameters can vary than in a model where the grouping variable is a covariate, where only intercepts and means can vary. I would include the control group. You may find the following paper of interest: Muthén, B. & Curran, P. (1997). General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods, 2, 371-402. 


Using WLSMV, we're estimating a multiple group path analysis with an ordered categorical outcome, which we'll refer to as z. Additionally, we have several exogenous predictors, call them x1, x2, and x3, and a mediator y. Initially, we obtained an excellent fitting model by allowing y to partially mediate the influence of each exogenous predictor (x1, x2, and x3) on z. Now, across several ethnic groups, we're attempting to impose an equality constraint on the path coefficient from y to z. The model statement reads as follows: MODEL: y ON x1 x2 x3; z ON y (1); z ON x1; z ON x2; z ON x3; Given the use of WLSMV, we've employed the DIFFTEST option to obtain the chi-square difference test. Appropriately, we've tested the less restrictive model first, saved the results using SAVEDATA, and then estimated the constrained model. Surprisingly, however, the output does not include the difference test, instead reporting that the constrained model is not nested within the original. As far as we can tell, this is inaccurate. Also, as specified in the input, Mplus correctly imposes the equality constraint across ethnic groups for the relationship of y to z, with all remaining effects estimated as requested. Are we incorrect in assuming the restricted model is nested within the original model? 


You need to send the two complete outputs and your license number to support@statmodel.com. 


I am running a path model with multiple groups (with both dichotomous and continuous endogenous variables using WLSMV). I am interested in testing for gender differences in individual regression coefficients. I know that I can constrain all regression paths to be equal between groups and then compare this model to the model without these constraints. This will tell me if there is a significant difference in the fit of the path model by gender. In order to test for structural invariance of individual paths however, do I have to run separate models for each? I would be running 23 models and doing difference testing for each. Thank you! 


You could do that or you could use MODEL TEST. See the user's guide for more information about MODEL TEST. 


I'll look into that. Thank you. 


I'm running a multigroup analysis with covariates. Apparently, Mplus returns an error (and does not estimate the model) whenever the variance of one of the covariates in one of the groups is zero. However, this zero variance is not necessarily a problem as long as I pool the coefficient of that covariate across groups. So, is there a way to "force" Mplus to estimate the model? Thanks in advance. 


You can use the VARIANCES=NOCHECK; option of the DATA command to avoid stopping for zero variance. 


Thanks a lot for your prompt reply!! 


Thanks again for your answer: with the VARIANCES=NOCHECK; option included, the model starts running. However, the procedure apparently still encounters singularities because of group-specific operations (again, in one of the groups one of my covariates has zero variance). Can I somehow exclude the covariate from the group where it has zero variance and still keep the covariates' coefficients equal across groups? I think that would solve the problem. 


You can try fixing the coefficient to zero in the groups where it has no variance, for example, y ON x@0; If this does not work, please send your input, data, output, and license number to support@statmodel.com. 
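
A sketch of that suggestion, assuming the covariate x has zero variance in a group labeled g2 and that its coefficient is pooled across the remaining groups with a label. The file name and the group label are placeholders.

```mplus
DATA:
  FILE IS mydata.dat;      ! placeholder file name
  VARIANCES = NOCHECK;     ! do not stop for zero variances
MODEL:
  y ON x (1);              ! pooled coefficient in the groups where x varies
MODEL g2:
  y ON x@0;                ! x has no variance in g2: coefficient fixed at zero
```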

Linda posted on Thursday, October 11, 2007  9:02 am



I have a question about interpreting findings from a multiple sample SEM investigating structural paths. I am running a multiple sample SEM using intervention types as groups (Control, TPC, TMI, and TPC+TMI). As an obvious approach, I am using the control group as my reference group when building the multiple sample SEM. Here is my question. When a structural path shows that it's not different across groups, is that in reference to the control group only? Does multiple sample SEM allow me to see the differences in the paths between TPC vs. TMI, TPC vs. TPC+TMI, and TMI vs. TPC+TMI? If so, how does that work given that I am specifying a reference group? Thanks in advance! Linda 


I'm not sure what you mean by using the control group as a reference group. You may want to use MODEL TEST. See the user's guide. 

Linda posted on Thursday, October 11, 2007  10:48 am



Yes, you are right. I got confused after reading an article. It's all clear now. I also have another question. To test for group invariance, how would I code my groups? Does this make sense control=0, TPC=1 TMI=2 and TPC+TMI=3? Also, doesn't the numeric coding imply a linearity of the groups? 


I'm not sure what you mean by code your groups. Are you referring to the values in the GROUPING option? 

Linda posted on Thursday, October 11, 2007  5:09 pm



Yes. The values in the grouping option. 


These are the values of your grouping variable. For example for the variable gender, if 0 represents males and 1 represents female, you would give the label males to 0 and females to 1. 

Linda posted on Friday, October 12, 2007  1:28 am



Yes...the grouping option. By assigning numbers to the groups, it gives me the impression that the values imply linearity. 

Linda posted on Friday, October 12, 2007  1:36 am



Please ignore the posting above. For some reason, I couldn't read your response until I posted the same question again. So when I have four groups and I am assigning values for each one of those groups, the numbers 0, 1, 2, 3, do not mean that the value 3 is 3x as big as 1, but that 3 is group 3...is this correct? Thanks in advance! 


The numbers tell the program how to divide the data into groups. 
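
For the four-group design in this exchange, the GROUPING option might be written as follows. The variable name tx and the labels are assumptions; the numbers only identify which records belong to which group and carry no ordering or distance.

```mplus
VARIABLE:
  GROUPING IS tx (0 = control 1 = tpc 2 = tmi 3 = tpctmi);  ! hypothetical names
```

The labels are then used in group-specific commands, for example MODEL tmi:.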

Linda posted on Friday, October 12, 2007  8:02 am



Great! It's all clear now. Thank you very much! 

Linda posted on Wednesday, November 28, 2007  8:53 am



I am conducting Multisample SEM on 4 groups. I am investigating structural paths between 5 variables (1 LV and 4 OV). I would like to build my model constraining first the LV before I constrain the structural paths. Here is my question. Do I need to constrain the loadings, residual variances, and means? or could i just constrain the loadings? 


The first step is to establish measurement invariance of the latent variable. How to do this is described in Chapter 13 of the user's guide. Only if the latent variable is the same construct in all groups does it make sense to make comparisons of the structural parameters. 

Linda posted on Thursday, November 29, 2007  8:25 am



Thank you for your reply. I did establish measurement invariance first. And I wanted to take a step further to establish the structural parameters. To do that, do I keep the groups constrained on the loadings only or residual variances and means as well? 

Linda posted on Thursday, November 29, 2007  9:22 am



I am running the model below, where x1, x2, and x3 are exogenous predictor variables, m1, m2, m3, and f1 are mediators, and y is the outcome variable. MODEL: f1 by y1 y2 y3; m1 on x1 x2 x3; f1 on m1; m2 on f1; m3 on m2; y on m3; I get the following error message. How can I fix this? Thanks in advance. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.338D-10. PROBLEM INVOLVING PARAMETER 15. 


Once you have established measurement invariance, you should leave the equalities in place. We don't use equalities of residual variances. Regarding the error message, please send your input, data, output, and license number to support@statmodel.com. 

Jungeun Lee posted on Friday, November 30, 2007  4:49 pm



I am working on a multiple group (males and females) SEM. I'd like to test whether each individual coefficient in the structural part differs between males and females. I used MODEL TEST to test this. Here is my Mplus input. MODEL: depr by dep1 dep2 dep3 dep4; hope by pos1 pos2 ; anxiety by anx1 anx2; hope on anxiety (p1); depr on hope (p2); MODEL female: depr by dep2 dep3 dep4; hope by pos2 ; anxiety by anx2; hope on anxiety (p3); depr on hope (p4); MODEL TEST: 0 = p1 - p3; 0 = p2 - p4; The program gave me one Wald test result (value=.343, p-value=.8425). Q1. What does this mean? Does it mean that p1=p3 & p2=p4? Q2. I expected more than one test result from the above analysis, with the first test corresponding to p1=p3 and the second to p2=p4. Is there a way to do this in Mplus? Or do I have to run separate models for each? 


You need to do a separate run for each test. A large p-value means you cannot reject equality of the parameters. 
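
One way to set up the separate runs, shown with the difference written out explicitly. Only the structural part of the posted model is repeated; the labels p1 to p4 are the poster's.

```mplus
MODEL:
  hope ON anxiety (p1);
  depr ON hope (p2);
MODEL female:
  hope ON anxiety (p3);
  depr ON hope (p4);
MODEL TEST:
  0 = p1 - p3;     ! run 1: one-degree-of-freedom Wald test for this path
! For run 2, replace the line above with:  0 = p2 - p4;
```

With both lines in MODEL TEST at once, the two restrictions are tested jointly, which is why the original run reported a single Wald test.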


I've got a question about how to report a multigroup SEM in a paper: some of the effects in my model are set equal for men and women, and some are not. When a model is presented in a paper, standardized estimates are generally reported. But for the effects held equal, the standardized estimates differ while the unstandardized ones do not. If I want to report the standardized estimates, which estimate do I choose: the one for the male group or the one for the female group? 


I would report both raw and standardized coefficients and standardized for both males and females. 

Erika Wolf posted on Monday, January 28, 2008  10:57 am



I'm running a series of CFAs to examine measurement invariance across 2 groups. I'm using categorical indicators, the WLSMV estimator, and the DIFFTEST function to test the nested models. I'm first testing for equal form across the groups, allowing the factor loadings and thresholds to be freely estimated in both groups and setting the scale factor to 1 in the 2nd group. In the second model, I'm testing for equal factor loadings, so I've left in all the Mplus defaults and have not specified anything for the 2nd group. I'm confused, though, because my equal factor loadings model has fewer DF than my equal form model, when I would expect it to be the other way around. Is this simply a function of the DF being estimated with the WLSMV estimator? Or are there additional defaults that I should override in my equal factor loadings model? Thanks, Erika 


With WLSMV, the only value that you should look at is the p-value. If you want to look at degrees of freedom in the traditional way, use WLS or WLSM to see if they are behaving as you would expect. See also the section in Chapter 13 where the set of models to test measurement invariance for categorical outcomes is described. 

Dale Glaser posted on Tuesday, February 05, 2008  12:35 pm



Hi Linda and Bengt... I have a result that seems easy enough to rectify but is proving to be intractable! I am testing a multigroup (g = 2) CFA with three constructs and three items per construct. When I test the model for the full sample, I get an unimpressive fit (CFI = .82, RMSEA = .118, etc.); however, when I run the multigroup model, whether I constrain the loadings to be equal or not, I get an error message that "the standard errors of the model parameter estimates could not be computed...". When I check the offending parameter, it is a parameter in the PSI matrix and has a negative SE. After checking for collinearity, multivariate normality, etc., there didn't seem to be any major problems. Interestingly, when I run an EFA for each group, there is a very clean factorial solution for each group (though I am well aware of EFA vs. CFA differences in results). After trying various fixes (e.g., constraining the elements in the PSI matrix to 0), I was only able to attain convergence when I used the parameter estimates from the full sample as fixed estimates, and as expected fit was horrible (CFI = .65, RMSEA = .122). Unfortunately, due to privacy issues I can't share the data as Linda generously offers. So, before abandoning this model, any recommendations for a negative SE in the PSI matrix even though the usual culprits (e.g., singularity) are not an apparent issue? Thank you... Dale 


Have you run the CFA model for the two groups separately? 

Dale Glaser posted on Tuesday, February 05, 2008  3:25 pm



yes I did Linda, and I was able to obtain convergence for one group but fit was abysmal (CFI approx .8, RMSEA approx, .12, etc.)......and I believe that for the other group I had to fix the PSI estimate to obtain convergence (and again fit was problematic).......what I find intriguing is the factorial solution for EFA (whether orthogonal or oblique rotation) was very unambiguous (i.e., as postulated) for both groups....... 


I know you can't send your data but I would like to see the EFA outputs for each group and the CFA outputs for each group. If you had clear EFA results, the CFA's should not fit so poorly. It does not sound like the CFA fits in either group. 


Hello all, I recently conducted several analyses where I compared the pattern of results across correlation matrices of mostly personality data. Specifically, I was interested in whether the pattern of results in group A (e.g., men) was similar to that seen in group B (e.g., women). The procedure yielded the following fit indices: chi-square, standardized RMR, RMSEA, Population Gamma, Adjusted Population Gamma, McDonald noncentrality index, and population noncentrality index. For all but the first two I have 90% confidence intervals as well. My sample sizes are, by SEM standards, small (<=250). I have read the Hu and Bentler paper but am still a little unclear as to what the "appropriate" cutoffs are for assessing model fit. Any suggestions? How might the confidence intervals sort out this issue? Any and all suggestions would be greatly appreciated. Thanks in advance! 


From the indices you give, it looks like you are not using Mplus. These indices are tests of the overall fit of the model, not of the comparison of groups. Chi-square difference testing can be used to test across-group differences. 


Hello all, I have a problem with a multiplegroup SEM (two groups). I have a set of mixed observed variables (continuous and categorical), so my input is a matrix of Polychoric/polyserial/pearson correlations. However I can't use WLS estimator (and calculate Asymptotic Covariances Matrix), maybe because of little N (500) in one group. My questions are: a) my model converges with a ML estimation, with quite good fit; anyway, is that correct? b) If I want to compare two structural parameters (or test factorial invariance), what can I do if I used a correlation matrix? Thanks! 


There are a couple of issues here. If you are doing a multiple group analysis using a correlation matrix as input, then you must be telling Mplus it is a covariance matrix or this would not be allowed. If you have a combination of continuous and categorical dependent variables, you need to use raw data in Mplus. 


Hi Linda and Bengt, I am working on measurement invariance for my model and have a question about constraining means to 0. According to the UG: MODELS FOR CONTINUOUS OUTCOMES... 1. Intercepts, factor loadings, and residual variances free across groups; factor means fixed at zero in all groups 2. Factor loadings constrained to be equal across groups; intercepts and residual variances free; factor means fixed at zero in all groups... Would this be the means for the latent variables only, indicated by [var@0] for the second group? Should I also set the means for observed exogenous variables to 0? Thanks, Sue 


It is only the latent variable, factor, means that are fixed to zero. 
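
A sketch of the second model in that list for a single factor, assuming a factor f1 with indicators y1 to y3 and a second group labeled g2 (all names are placeholders). Loadings are held equal across groups by default, so only the intercepts are freed and the factor mean is fixed.

```mplus
MODEL:
  f1 BY y1-y3;     ! loadings equal across groups by default
MODEL g2:
  [y1-y3];         ! intercepts free in this group
  [f1@0];          ! factor mean fixed at zero
```

The factor mean is already fixed at zero in the first group by default, so only the other groups need the [f1@0] statement.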


Thank you Linda. Sue 


Hi Linda and Bengt: Still working on measurement invariance. MODELS FOR CONTINUOUS OUTCOMES 2. Factor loadings constrained to be equal across groups; intercepts and residual variances free; factor means fixed at zero in all groups. I originally ran this model with the default factor means; it ran fine and I was able to achieve partial metric invariance. However, when I run the model with the factor means at zero for all groups, I get the following error message: THE MODEL ESTIMATION TERMINATED NORMALLY THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL MAY NOT BE NESTED IN THE H1 MODEL. DECREASING THE CONVERGENCE OPTION MAY RESOLVE THIS PROBLEM. My analysis syntax was: ANALYSIS: DIFFTEST = 'C:\derivh1_white16_1.dat'; TYPE= MEANSTRUCTURE COMPLEX MISSING H1; CONVERGENCE = .0001; ITERATIONS = 20000; Can you tell me what I am doing wrong? Thanks, Sue 


Please send the necessary files and your license number to support@statmodel.com. 


Dear Linda I'm delighted to say that I figured out my syntax error on my own. Thanks for your quick response. I appreciate your availability. Sue 


Hello all, I'm running a two-group path analysis, but I have a major problem: I don't obtain any output! As described in the user's guide, I first specified the H1 model (unconstrained model) with the DIFFTEST option in the SAVEDATA command. That turned out well (the output file was OK). Second, I specified the H0 model (fully constrained), in which all regression coefficients are defined as equal between both groups, with the DIFFTEST option in the ANALYSIS command. The syntax for this H0 model is provided below. The output file only mentions that reading the input terminated normally. No other information is provided in the output (no model results, no chi-square difference test, ...). What can cause this problem? DATA: FILE IS C:\AAG\data AAG test.dat; VARIABLE: NAMES ARE ... USEV ARE ... MISSING ARE ... CATEGORICAL = cu; GROUPING IS tour (0=worktour 1=complextour); ANALYSIS: PARAMETERIZATION = THETA; DIFFTEST = C:\AAG\deriv test.dat; MODEL: co2 ON sx dl i1 i2 i3 i4 hbi; cu ON dl d2 d3 d4 d5 co2; co2 on sx (1); co2 on dl (2); co2 on i1 (3); co2 on i2 (4); ... and so on ... 


Please send your input, data, output, and license number to support@statmodel.com so we can see what the problem is. 


Dear Linda, Thanks for your help, but I've managed to solve the problem. The models ran successfully and I obtained the outputs. 

anja schüle posted on Thursday, June 05, 2008  6:08 am



I confirmed a big SEM model with continuous variables. Now I am trying to do an additional two-group SEM analysis within this model. I hypothesize that 7 of the 14 betas vary across the two groups (but the other 7 do not, and the gamma doesn't either). Is it the right way to test my seven hypotheses to first run the model for both groups with the "grouping" command, specifying only the general model after "Model:" without any restrictions, and afterwards, in a second run, to do the same again but constrain one beta to be equal across groups by writing: f4 on f1 (1) (So I would have to estimate this second model 7 times, each time constraining only 1 beta, and then compare the X² of this constrained run with the X² of the unconstrained run to see if the difference is significant?) Or is it better to constrain all betas and gammas across groups in the first run, and afterwards to set only one beta free in each subsequent run? (And compare the X² of the completely constrained model with the X² of the model where only one beta is set free?) Thanks a lot in advance! 


I don't think it matters. 


Hello. I have tested a complex model (6 latent variables; 22 observed variables) and obtained acceptable model fit for the data. I ran subsequent multiple group analyses for each of the 3 races/ethnicities within the dataset (n = 140 for each race/ethnicity). The model fit was excellent for two of the groups, but unacceptable for the third group. How do you suggest I proceed? Should I scrap the omnibus model and develop individual models for each race/ethnicity? Should I accept the omnibus model for the two races that have good fit and develop a different model for the third race? I have been unable to find guidance on this issue. Any help (advice, references) you could provide would be much appreciated. 


It does not make sense to put groups together if the same model does not fit the data well for each group. Determining this is the first step in testing for measurement invariance. Once this has been established, measurement invariance across the groups can be tested. Only then does it make sense to combine the groups. You can search the literature on measurement invariance for more information and also see the following papers: Muthén, B. (1989). Factor structure in groups selected on observed scores. British Journal of Mathematical and Statistical Psychology, 42, 81-90. Muthén, B. (1989). Multiple-group structural modeling with non-normal continuous variables. British Journal of Mathematical and Statistical Psychology, 42, 55-62. 


I have a question about power analysis for a multiple group SEM where we plan to evaluate the mediated effects of an intervention on drinking outcomes in Caucasian versus Hispanic adolescents. I have conducted the power analysis in Mplus on the overall model and have found that with 200 subjects I have power .75 and with 250 subjects I have power .84. Given this information, how do I determine what sample size is needed for each group? Do I simply double the sample size (and thus, I would need n=250 Caucasians and n=250 Hispanics)? I've read a few papers, including Muthen & Muthen 2002 from the website, and can't seem to find the answer to this question. The most I can find about power in multiple group SEM is that power is higher if group sizes are equal. Any help would be much appreciated! 


I assume that you are going to compare the two groups because you think they are different. I would do a separate power study for each group. 


Thanks so much for your response. Yes, we hypothesize structural paths that will be different in the two groups. So I should do two different power analyses? If it turns out that I need 100 in one group and 200 in the other, doesn't this compromise power for the 1 df chi-square difference tests of whether a particular path is, indeed, different between the groups? 


I don't think this would compromise the multiple group power test. However, this is a test of the equality of two parameters so the last column is not power because the parameter is not being compared to zero. You would need to use MODEL CONSTRAINT to create a new parameter that is the difference between the two parameters and see the last column for that. 
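
To make the point about the difference parameter concrete, here is a minimal sketch of the power of a two-sided z-test that a difference between two group-specific paths is zero. It is a normal-approximation calculation, not the Mplus Monte Carlo procedure, and the numbers (`diff`, `se_diff`) are hypothetical. In a Monte Carlo run, the proportion of replications in which the new difference parameter is significant estimates the same quantity.

```python
from statistics import NormalDist


def wald_power(diff, se_diff, alpha=0.05):
    """Approximate power of a two-sided z-test of H0: diff = 0."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = .05
    shift = abs(diff) / se_diff            # expected z under the alternative
    # Probability that the z statistic lands in either rejection region.
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)


# Hypothetical values: true difference between the two group-specific
# paths, and the standard error of the estimated difference.
power = wald_power(diff=0.28, se_diff=0.10)
print(round(power, 3))
```

Because the standard error of the difference combines the uncertainty from both groups, power for the equality test is driven by the difference and its standard error, not by whether each path is separately distinguishable from zero.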


Hi there, I have a SEM model with one categorical predictor (two levels). This predictor represents 2 different experimental conditions that my participants were in (between-subjects design). My question is, am I able to simply dummy code this categorical variable and run the usual SEM, or do I need to run a different kind of SEM in order to analyze this? Thanks very much for your help, Andrea 


A dummy covariate can be included in an SEM model. See Example 5.8 in the Mplus User's Guide. 


Thanks so much for getting back to me, Dr. Muthen. I think I need a bit more clarification though (forgive me, I am new at this whole SEM thing). I was wondering if a dummy-coded, 2-level categorical predictor (not a covariate) can be included in a regular SEM analysis. I checked Example 5.8 and it doesn't seem to refer to a categorical predictor... perhaps I am misinterpreting it? Thanks so much, and sorry for the repeat postings, Andrea 


Andrea: The predictor can be continuous or categorical as in regular regression. 


Wonderful! Thanks so much for your speedy reply, Andrea 


Hello, I am estimating a structural model to look at whether father involvement mediates the relationship between being an immigrant child and that child's cognitive outcomes. The mediator of father involvement is a latent variable. I am first running a CFA before proceeding to the path analysis portion. I have determined that I do not have measurement invariance between resident and nonresident fathers on the latent variable of Father Involvement, although the CFA model shows an acceptable fit for each group. Theoretically, this makes a lot of sense, since what fathers do when they live with or away from their children should vary, that is, I expected to find noninvariance. I am struggling because now that I have established noninvariance, is it ok to go ahead and estimate the larger path models separately by group? Or should I estimate one large model across both groups and allow all the parameters of the latent variable Father Involvement to vary across groups? Is this even possible? Thanks. 


If you do not have invariance of the factor across groups, then you should look at the two groups separately. You cannot compare the factor parameters across the two groups in a meaningful way without measurement invariance. 

Heejung Chun posted on Thursday, November 27, 2008  8:09 pm



Hello, I am conducting a multiple group analysis (MGA) with a second-order confirmatory factor model. The MGA is established in the following five steps:
1. Configural invariance (the intercepts of the indicators are released along with all other parameters)
2. Factor loading invariance of indicators
3. Factor loading invariance of first-order factors
4. Intercept invariance of indicators
5. Intercept invariance of first-order factors
In my understanding, the CFI should decrease as I constrain factor loadings and/or intercepts across groups. However, my results showed a greater CFI as I constrained some parameters across groups. Is this right? I would appreciate your answer. Thank you. 

dena posted on Friday, November 28, 2008  8:27 am



Hi, I would like to run autoregressive models separately for boys and girls because the correlation matrix clearly suggests that our variables of interest are significantly correlated among girls but not among boys. My question is: can I (and if yes, how) justify my decision to run models separately for boys and girls? I read somewhere that we can test whether the variance-covariance matrix is the same for boys and girls and, if not, this could justify splitting the analyses. I’m not sure if this is right or how to do that. I constrained all the correlations to be equal for boys and girls. The chi-square with 28 df = 53.17. Can I compare it to the base model (chi-square = 0) and say that the constrained model is «significantly worse» (critical chi-square for 28 df = 41.34)? I also did multigroup analyses. Even though some coefficients are significant and quite different for girls and boys, the difference in chi-square between the constrained and unconstrained models is not significant. Is this normal? Thank you very much for your time. 


The most pointed analysis would be the multigroup analysis where the autoregressive model is used for both genders and runs with full equality and full inequality are used to form the chi-square difference test. 
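
As a small worked check of the arithmetic in the question above: when the comparison model is saturated (chi-square = 0 with 0 df), the difference test reduces to the constrained model's own chi-square, here 53.17 on 28 df. The sketch below computes the upper-tail probability with a closed form that is valid only for even degrees of freedom. The function name is illustrative, and the calculation assumes an ordinary (unscaled) ML chi-square.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Accumulate sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Values reported in the post: chi-square = 53.17 with 28 df.
p_value = chi2_sf_even_df(53.17, 28)
print(p_value < 0.05)
```

Since 53.17 exceeds the .05 critical value of 41.34 quoted in the post, the p-value falls below .05, which agrees with reading the constrained-correlation model as fitting significantly worse than the saturated one.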


Regarding the Chun post, the loglikelihood should get lower with more restrictive models (chi-square higher), but I don't think CFI needs to follow this pattern. 
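
A short numerical sketch of why CFI need not fall when constraints are added. CFI depends on chi-square minus degrees of freedom, so if the constraints raise the chi-square by less than the degrees of freedom they add, CFI goes up. All chi-square and df values below are hypothetical.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


# Hypothetical baseline (independence) model shared by both comparisons.
chi2_base, df_base = 2000.0, 90

# Hypothetical configural model, and a model with 10 added equality
# constraints whose chi-square rises by only 5.
cfi_configural = cfi(150.0, 60, chi2_base, df_base)
cfi_constrained = cfi(155.0, 70, chi2_base, df_base)

print(round(cfi_configural, 4), round(cfi_constrained, 4))
```

In this illustration the chi-square is higher for the constrained model, as it should be, yet the CFI is also higher because the 10 extra degrees of freedom outweigh the 5-point increase in chi-square.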

Bander Ahmed posted on Wednesday, December 03, 2008  10:48 am



Hi, I have two groups; one of them has 232 respondents (response rate 50%) and the other has 386 respondents (response rate 80%). Does this affect the comparison results? 


I would be concerned about the 50% response rate. This is very low. It could affect the comparison. 

Bander Ahmed posted on Thursday, December 04, 2008  6:17 am



Would you kindly tell me what effect it might have, and what the possible solutions are? Many thanks 


If the missingness is due to different reasons in the two groups, any group comparisons would be biased. You should investigate why the missingness occurs. The only solution would be to include variables in the model that relate to missingness. But with so much missing data in the one group, some may not find your results meaningful. 


Hello, I've got the following question... I have two groups and I want to constrain all factor loadings and path coefficients to be equal across the groups. Therefore, I have the following input: MODEL: CP by cp1 cp3 (1); AC by ac3 ac4 ac5 ac6 (2); NC by nc4 nc5 nc7 (3); CP on AC (4); CP on NC (5); But when I look at the output, loadings and path coefficients are only equal in the model results section. If I compare the values in the stdyx standardization section, the factor loadings and path coefficients differ. I don't understand this, as they should also be equal; that is what I wanted to fix in my input commands. When I run the same thing in LISREL, the standardized values are equal... What is wrong? Thanks!!!! 


The standardized values are different because they are standardized using each group's standard deviations, not the overall standard deviations. 
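
A minimal sketch of this point, using hypothetical numbers: the same unstandardized coefficient, held equal across groups, yields different standardized values when the groups have different standard deviations. The helper `stdyx` is an illustrative function that rescales a slope by the ratio of the predictor's and the outcome's standard deviations; it is not an Mplus call.

```python
def stdyx(b, sd_x, sd_y):
    """Standardize a slope using the SDs of predictor and outcome."""
    return b * sd_x / sd_y


# One unstandardized coefficient, constrained equal across groups.
b = 0.50

# Hypothetical group-specific standard deviations.
std_group1 = stdyx(b, sd_x=1.0, sd_y=2.0)
std_group2 = stdyx(b, sd_x=1.5, sd_y=2.0)

print(std_group1, std_group2)
```

The equality constraint applies to `b`, which is what the model results section reports; the standardized values differ only because the group-specific standard deviations differ.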

nina chien posted on Tuesday, December 09, 2008  10:42 am



I am doing multiple-group analysis with chi-square difference tests, and my estimator is MLR. I referred to your page: http://www.statmodel.com/chidiff.shtml I am a little confused about the following: 1. Estimate the nested and comparison models using MLR. The printout gives loglikelihood values L0 and L1 for the H0 and H1 models, respectively, as well as scaling correction factors c0 and c1 for the H0 and H1 models, respectively. For example, L0 = -2,606, c0 = 1.450 with 39 parameters (p0 = 39) L1 = -2,583, c1 = 1.546 with 47 parameters (p1 = 47)
Is the L0 in the instructions the H0 Value of the *nested* model? And the c0 is the scaling correction factor for the nested model? And is the L1 in the instructions the H0 Value of the *comparison* model? And the c1 is the scaling correction factor for the comparison model? Thank you for your help. 


Is the L0 in the instructions the H0 Value of the *nested* model? And the c0 is the scaling correction factor for the nested model? Yes. And is the L1 in the instructions the H0 Value of the *comparison* model? And the c1 is the scaling correction factor for the comparison model? Yes. 
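
For reference, the remaining steps on the chi-square difference page can be sketched as a short calculation. It uses the example values quoted in the question (the loglikelihoods are negative: L0 = -2,606 and L1 = -2,583), computes the difference-test scaling correction cd = (p0*c0 - p1*c1)/(p0 - p1), and then the scaled statistic TRd = -2*(L0 - L1)/cd. The function name is illustrative.

```python
def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled chi-square difference test from MLR loglikelihoods.

    l0, c0, p0: loglikelihood, scaling correction factor, and number of
                parameters for the nested (more restrictive) model.
    l1, c1, p1: the same quantities for the comparison model.
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)   # difference-test scaling correction
    trd = -2.0 * (l0 - l1) / cd            # scaled chi-square difference
    df = p1 - p0                           # degrees of freedom for the test
    return cd, trd, df


cd, trd, df = scaled_loglik_difference(
    l0=-2606.0, c0=1.450, p0=39,
    l1=-2583.0, c1=1.546, p1=47,
)
print(round(cd, 3), round(trd, 3), df)
```

The resulting TRd is referred to a chi-square distribution with p1 - p0 degrees of freedom, which is 8 in this example.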

dena posted on Friday, January 23, 2009  7:36 am



Hi, I’m doing multigroup analyses on autoregressive cross-lagged paths with two variables and four time points. I first estimated my global model (both genders), in which I had to add two correlations between residuals to improve model fit. Then I looked at whether the fit was good among girls and among boys. The fits are OK, but I noticed that one of the correlated residuals is not significant among girls, whereas the other is not significant among boys (i.e., one is significant in each group, but it’s not the same one). I also noticed that some paths are significant among girls but none are significant among boys. My questions are:
- Is it necessary to go further in the analyses, since the coefficients are only significant among girls and not among boys?
- If yes, if I constrain only the significant paths to be equal among boys and girls, should the difference in chi-square detect these differences?
- If not, what could explain it?
Thank you very much for your time. 


You can have coefficients significant in one group and not the other and still not be able to reject equality across groups. For instance, point estimates of, say, 0.2 for girls and 0.15 for boys may be significant for girls and insignificant for boys, while the two are not significantly different. I hope that's what you were asking. 

dena posted on Monday, January 26, 2009  8:54 am



From prior dena comment: Yes, it is. Then what can I conclude from these findings? Can I still report the coefficients for boys and girls and mention that even though the coefficients were significant for girls but not for boys, we could not detect a significant difference between the two? What does it mean? Thank you again for your precious help. 


Answer to your second question: yes. Answer to your third (last) question: if you cannot reject equality of the coefficients across gender, you would want to consider whether this common coefficient is significant; perhaps it is. That would make perfect sense, I think. 

dena posted on Sunday, February 01, 2009  1:26 pm



The common coefficients (I assume these are the coefficients for the total sample) are not always significant... What conclusions can I then draw? Example: Coefficient for total = .09, z = 1.88; coefficient for girls = .13, z = 2.01; coefficient for boys = .02, z = 0.281. If I constrain only this coefficient to be equal among boys and girls, the delta chi-square is not significant. Thanks again! 


I would report what you see: the coefficient for girls is significantly different from zero. The coefficient for boys is not. The two coefficients are not significantly different from each other. This is not contradictory: the last statement might be due to the coefficient for boys having a large standard error; the SE for boys plays into the gender difference testing. Perhaps there is too little power to reject gender differences. 
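
The numbers in the question can be used to illustrate this. The sketch below recovers each standard error as estimate divided by z, and then forms an approximate z statistic for the difference, treating the two group estimates as independent (an assumption that is reasonable for independent samples of boys and girls). This is a Wald-type approximation, not the chi-square difference test itself.

```python
import math


def z_for_difference(b1, se1, b2, se2):
    """z statistic for b1 - b2, assuming independent estimates."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)


# Estimates and z values reported in the post.
b_girls, z_girls = 0.13, 2.01
b_boys, z_boys = 0.02, 0.281

# Recover the standard errors implied by the reported z values.
se_girls = b_girls / z_girls
se_boys = b_boys / z_boys

z_diff = z_for_difference(b_girls, se_girls, b_boys, se_boys)
print(round(se_girls, 3), round(se_boys, 3), round(z_diff, 2))
```

The standard error of the difference is larger than either group's standard error, so the difference of .11 yields a z value well below 1.96 even though the girls' coefficient on its own exceeds that threshold.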

Kihan Kim posted on Tuesday, February 17, 2009  4:00 pm



I am trying to test a multigroup SEM with no constraints on the measurement and structural parts (I do not want any parameter to be constrained). I have five factors (F1-F5), and the following is the MODEL command. I keep receiving the following error message, and I am not sure what is wrong with the model identification. Could you please advise me? Error Message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 73. Model: F1 by Y1 Y2 Y3; F2 by Y4 Y5 Y6; F3 by Y7 Y8; F4 by Y9 Y10; F5 by Y11 Y12 Y13 Y14; F5 on F1 F2 F3 F4; Model G2: F1 by Y1 Y2 Y3; F2 by Y4 Y5 Y6; F3 by Y7 Y8; F4 by Y9 Y10; F5 by Y11 Y12 Y13 Y14; 


In Model G2 you should not free the first factor indicator loading since that sets the metric of the factor. See the Mplus UG multigroup examples. 

Kihan Kim posted on Wednesday, February 18, 2009  12:25 pm



I have four exogenous factors predicting one endogenous factor, as follows: F1 by Y1 Y2 Y3; F2 by Y4 Y5 Y6; F3 by Y7 Y8 Y9; F4 by Y10 Y11 Y12; F5 by Y13 Y14 Y15; F5 on F1-F4; It appeared that F4 had a negative impact on F5. But the factor indicators for F4 and F5 are all positively correlated. Is it possible to find a negative path coefficient when the correlations among indicators are all positive? Thank you. 


The fact that the factor loadings for both factors are positive does not preclude the factors from having a negative relationship with each other. 

Kihan Kim posted on Wednesday, February 18, 2009  5:34 pm



Sorry... I think I confused you in my previous question. It was not the factor loadings for both factors that were positive, but the correlations among the indicators that were positive. So, for the following model, the correlations among Y10-Y15 were all positive, and I found a negative impact of F4 on F5. Is this possible? Thank you again. F1 by Y1 Y2 Y3; F2 by Y4 Y5 Y6; F3 by Y7 Y8 Y9; F4 by Y10 Y11 Y12; F5 by Y13 Y14 Y15; F5 on F1-F4; 


Please send the full output and your license number to support@statmodel.com so I can see exactly what you are doing. Please include SAMPSTAT in the OUTPUT command. 


Hi, I was wondering whether it is possible to test for an interaction between an observed and a latent variable (both continuous) within a multiple group analysis, i.e. to test whether the interaction (between two variables) differs between groups (third variable). Thanks for your help, Ina 


Yes, you can use the XWITH option to do this. 


I just read that interactions with continuous latent variables require numerical integration. But if I add "ALGORITHM = INTEGRATION", Mplus tells me that "ALGORITHM = INTEGRATION is not available for multiple group analysis". 


You need to use the KNOWNCLASS option and TYPE=MIXTURE instead of the GROUPING option when numerical integration is involved. If you have further questions on this topic, send them along with your license number to support@statmodel.com. 

Ela Polek posted on Friday, February 27, 2009  7:30 am



I am running a multiple group SEM with 6 groups in Mplus. I have to compare some specific coefficients across groups (to test whether the influence of some variables on the outcome variable differs across groups). I have already used the MODEL TEST command in Mplus, which gives a Wald test, but this does not test invariance of specific coefficients across groups. I know that when comparing coefficients in 2 groups a t-test can be used. What test should be used when comparing 6 groups? I will be more than thankful for any advice. Thank you in advance, Ela 


You can test specific coefficients using MODEL TEST, for example, MODEL: y ON x1 (p1) x2 (p2); MODEL TEST: 0 = p1 - p2; or you can use difference testing of nested models using either chi-square or the loglikelihood. 


Hello, I suspect I have two groups but I'd like to prove it. I have run ESEM analyses separately on each of the two groups and they do not respond in the same way to my construct: some variables do not load on the same latent factors in the two groups. But I'd like to go back a step and show on my whole sample (n1 + n2) that the grouping variable has a significant effect on the construct. Could you please tell me how I can assess in Mplus whether the grouping variable has an additive effect on the outcomes' loadings, and also whether it may lead to some variables loading on some latent factors in one group but not in the other? Would the following syntax take into account the different possible effects of the grouping variable on the construct that I just mentioned? MODEL: F1 BY u1-u7 (*1); F2 BY u1-u7 (*1); F1 with F2; u1-u7 ON Gp; Here F1 and F2 are the two common continuous latent factors; u1 to u7 are the ordinal outcome variables; Gp is the binary grouping variable (coded 0 and 1). Thank you very much in advance for any input. 


The model you have specified can detect differences in the intercepts only, not in the factor loadings. If you do a multiple group analysis where the factor loadings are held equal across groups, as is the default, you can look at modification indices to assess measurement invariance. 


Hi Linda, I’ve got 2 questions: 1. Would this ESEM model assess the effect the grouping variable (x1) can have on the loading structure as well as on the strength of the loadings? USEVARIABLES ARE x1 u1 u2 u3 u4 u5 u6 u7 u1x1 u2x1 u3x1 u4x1 u5x1 u6x1 u7x1; CATEGORICAL ARE u1 u2 u3 u4 u5 u6 u7; DEFINE: u1x1 = u1*x1; u2x1 = u2*x1; u3x1 = u3*x1; u4x1 = u4*x1; u5x1 = u5*x1; u6x1 = u6*x1; u7x1 = u7*x1; MODEL: F1 BY u1-u7x1 (*1); F2 BY u1-u7x1 (*1); F1-F2 ON x1; F1 with F2; u4 with u5; 2. The original model has 2 latent factors on which different outcomes (e.g. u1) load depending on the group (when I run the analysis separately on the 2 groups). To test the effect of the interaction variables (e.g. u1x1) on the model structure in the whole sample, should I still specify 2 latent factors, or more or fewer? The results show the original ordinal outcomes loading significantly on one factor and the interaction variables on the other... I'm not sure what this is showing. 


The interaction between the covariate and the items does not get at factor loading invariance. The interaction between the covariate and the factor would do that. The best way to look at factor loading invariance is multiple group analysis. See Example 4 in the Version 5.1 Examples Addendum on the website with the user's guide. See also the Topic 1 course handout and video. Topics 1 and 2 will be taught at Johns Hopkins University in August. 


Hi, I am comparing a CFA across two groups and keep getting the same fit index statistics for increasingly constrained models. Can you tell me what I'm doing wrong? Here is my code: (1) Model: DOMCON BY AIDEDUC@1 HEALTHCA PROGVIOL SOCSEC; INTCON BY DEFENSE @1 GATHINTE HOMESEC ECONOTHR MILOTHRN; DOMCON with INTCON (1); (2) Model: DOMCON BY AIDEDUC@1 HEALTHCA PROGVIOL SOCSEC; INTCON BY DEFENSE @1 GATHINTE HOMESEC ECONOTHR MILOTHRN; DOMCON with INTCON(1); AIDEDUC (2); HEALTHCA (3); PROGVIOL (4); SOCSEC (5); DEFENSE (6); GATHINTE (7); HOMESEC (8); ECONOTHR (9); MILOTHRN (10); (3) Model: DOMCON BY AIDEDUC@1 HEALTHCA PROGVIOL SOCSEC; INTCON BY DEFENSE @1 GATHINTE HOMESEC ECONOTHR MILOTHRN; DOMCON with INTCON(1); AIDEDUC (2); HEALTHCA (3); PROGVIOL (4); SOCSEC (5); DEFENSE (6); GATHINTE (7); HOMESEC (8); ECONOTHR (9); MILOTHRN (10); [DOMCON] (11); [INTCON] (12); 


Please send the three full outputs and your license number to support@statmodel.com. 


I am investigating mean-level differences between three age groups in a measurement model with three correlated factors. In the next step I would like to test the between-group ability differences in the same factors after controlling for, for example, general cognition. To do this I specified a structural model in which those factors are regressed on general cognition: f1 on GenCOG; f2 on GenCOG; f3 on GenCOG; I need to compare the means of the residuals of f1, f2, and f3 between the groups if I want to test performance differences on those factors after controlling for GenCOG, right? Where in the Mplus output do I find those values? In the residual output I find only parameters for the indicators. TECH4 shows means of the latent variables (including the endogenous variables), but when I look at those values, I don't think they are the means of the residuals of the endogenous factors. Exactly the same latent means are displayed that I found in the measurement model. But the exogenous variable explains about half of the variance, so I think the residual means of f1, f2, and f3 should change compared to the measurement model. Could you please advise me? Thank you for your help! 


The parameters you want are the intercepts of f1, f2, and f3 which you will find in the Results section of the output. 


How do you tell if the difference between groups is significant when you run a multigroup model? 


You can do this using chi-square or loglikelihood difference testing of two nested models where one model has the parameter of interest free across groups and the other has the parameter constrained to be equal across groups. You can also use MODEL TEST. 


I am trying to see if the path analysis model below differs between males and females and am not sure what syntax to use to constrain the paths. I have been looking at syntax that people use but am not sure how to apply it to my model and/or whether I need to prepare my data differently to test for measurement invariance. Thanks! VARIABLE: NAMES ARE ID IDYRFAM sex SES ZSES alc2 cn0 gp1 bp1 cn2 gp2 bp2; USEVARIABLES ARE alc2 cn0 bp1 cn2 bp2; CLUSTER = IDYRFAM; ANALYSIS: TYPE = COMPLEX; MODEL: bp1 bp2 cn2 alc2 ON cn0; bp2 cn2 alc2 ON bp1; alc2 ON bp2; alc2 ON cn2; bp2 WITH cn2; OUTPUT: SAMPSTAT STANDARDIZED; standardized mod(3.84); 


Chapter 13 has a section on Equalities in Multiple Group Analysis that should help you. There is a full discussion of the Mplus language for multiple group analysis in that chapter. 

Linda posted on Monday, August 24, 2009  11:16 am



How does multisample SEM account for multiple comparisons when comparing models across multiple groups? 

naT posted on Wednesday, August 26, 2009  1:41 pm



Hello, I am fitting a path analysis model in which all variables are observed. However, TECH1 shows that no parameters are specified in the NU or THETA matrices; instead, all are specified in the ALPHA and PSI matrices. I am wondering whether I misspecified the model, or whether this is the Mplus default. If I have misspecified the model, how can I fix this? Thank you very much for your help! 


This is correct. There is no matrix in Mplus for observed regressed on observed so the observed variables are turned into latent variables that are identical to the observed variables. This does not in any way affect the results. It simply moves the parameters from one matrix to another. 

naT posted on Wednesday, August 26, 2009  4:01 pm



Thank you very much for clarifying! Best 


I am using syntax similar to that in the Mplus manual to estimate a SEM with factor loadings constrained across multiple groups, but with separate model ON statements. But I get a message that the model didn't converge and factor scores were not computed. When I look at the parameter estimates, they don't look too large, and there are no negative residual variances. I have 2 groups, and 3 continuous latent variables, and 15 factors. Can you send some sample syntax that would work? 


Please send your full output and license number to support@statmodel.com so I can see exactly what you are doing. 


I'm running a model and want to compare effects across developmental periods. I've run two models, one unconstrained, and one where I've constrained the paths of interest to be equivalent across developmental periods. Is there also a way to test if the factor loadings in the unconstrained model are significantly different across periods instead of running a separate model where the paths are constrained to be equal? 


You can use MODEL TEST. See the user's guide. 


Hi, I fully understand why using nominal variables as mediators in a path/SEM is unacceptable. But suppose one were to include it as a set of dummies. *And* if separate analysis, using seemingly unrelated estimation, showed that the effects of "upstream" exogenous variables on these dummies were statistically no different from the effects of these same exogenous variables on the corresponding nominal variable categories, would the strategy then become defensible? Thanks. Bobby 


Elaboration on last post: my point is that if IIA holds, then why not replace the multinomial part of the model (exogenous -> nominal) with corresponding logits (exogenous -> dummies)? Thanks, Bobby 


Following is an answer that was given to a similar question last week. It was found by searching on nominal. "I don't think mediation via a nominal mediator m has been studied methodologically, but correct me if I am wrong. One possible direction to go would be to create a latent class variable c where the nominal categories of c are the same as the observed nominal variable categories of m (this is done via logit thresholds). c on x is then a multinomial logistic regression and the influence of c on y is captured by the means of y changing over the c categories (you don't say "y on c", but it has the same effect). This avoids the y on m regression which would treat m as continuous, which would not make sense when m is nominal. One can then explore if there is a need for direct effects y on x. But there isn't any guidance for how one should/could simply quantify how much of the x influence goes via m versus directly. Perhaps that isn't needed. This topic is a method research paper in itself. Anyone?" 


Hi Linda, Thanks, this helps. I apologize in advance if the following is a completely braindead question: From a previous discussion: "Making the observed nominal u variable the same as the categorical latent c variable is done by saying %c#1% [u$1@15]; %c#2% [u$1@15];" I'm guessing this is for 2 categories. I.e. for fixing thresholds to logit values of 15 & 15, corresponding to 0/1 probabilities. Right? So how to fix thresholds for 3(+) categories? Once more, I suspect I'm being really dumb here... But clarification would be appreciated. Thanks, Bobby 


For 3 groups (categories) you use: %c#1% ! g=0 group !(note: high threshold, low prob.) [g$1@15 g$2@16 g$3@17]; %c#2% ! g=2 group [g$1@15 g$2@16 g$3@17]; %c#3% ! g=3 group [g$1@16 g$2@15 g$3@17]; where g is declared categorical. 
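
As a quick numerical illustration of why thresholds fixed at logit values of magnitude 15 make the observed variable behave like the latent class variable: with a logit link and no covariates, the probability of falling below a threshold tau is 1/(1 + exp(-tau)), so a threshold of +15 or -15 pins that probability at essentially 1 or 0. The function name below is illustrative, not Mplus syntax.

```python
import math


def prob_below_threshold(tau):
    """P(u falls below the threshold) for a logit threshold tau."""
    return 1.0 / (1.0 + math.exp(-tau))


# A high threshold: the higher category has essentially zero probability.
p_high_threshold = prob_below_threshold(15.0)

# A low threshold: the higher category is essentially certain.
p_low_threshold = prob_below_threshold(-15.0)

print(p_high_threshold, p_low_threshold)
```

This matches the note in the reply that a high threshold corresponds to a low probability of the higher category.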


Thanks! One last question: I'm assuming this latent variable couldn't be dumped into a more elaborate SEM, where "downstream" dependent variables influence each other, and all are influenced by exogenous x's. I don't particularly need to know how much of the influence of x goes via the nominal variable versus directly. Just wanted to check. Thanks again, Bobby 


This approach can be combined with a full SEM. 


I am interested in running a multiple group analysis and constraining paths across three samples. However, one sample is missing two variables. I would therefore like to constrain all of the paths across all of the samples, except for the paths related to those two variables in that sample; for the two samples that do have those two variables, I would like those paths constrained as well. Please let me know if this is possible in Mplus and, if so, how I can do it. Thanks! 


You need the same set of observed variables in each group. 


Hi, I'm doing a multigroup comparison of children learning to read in two different orthographies. I'm a new user of Mplus, but so far I understand that the procedure is to go step by step: first comparing factor loadings across groups, then intercepts, then factor variances/covariances, etc. I've also learned that if one of the steps shows a significant difference across groups, then further comparison is meaningless. My question concerns partial measurement invariance. In my study I have five latent variables made up of ten indicators (two indicators for each latent variable). A chi-square difference test tells me that my factor loadings differ significantly across groups. But does that mean that all loadings are significantly different? Or is there a place in the output showing which loadings differ? Is it possible to continue comparing invariance across groups while allowing some parameters to be free? Thanks 


You can have partial measurement invariance if you model the noninvariance by allowing the noninvariant parameters to differ across groups. How much noninvariance you can have is debatable. You can see where the large differences are by looking at modification indices. See the Topic 1 video and course handout for a full description of measurement invariance. 
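
As a rough illustration of the reply above, partial invariance in a multiple group model might be set up as follows. This is a sketch only: the factor, indicator, and group names (f1, y1-y3, g2) are hypothetical and are not taken from the poster's input, and which parameters to free should be guided by theory and the modification indices.

```mplus
MODEL:
  f1 BY y1 y2 y3;   ! loadings and intercepts are held equal across groups by default
MODEL g2:
  f1 BY y3;         ! mentioning the loading here frees it in group g2
  [y3];             ! optionally free the intercept of the same indicator
OUTPUT:
  MODINDICES;       ! modification indices show where the large differences are
```

The remaining loadings and intercepts stay equal across groups, so the factor is still anchored by the invariant indicators.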

QianLi Xue posted on Sunday, November 29, 2009  4:07 pm



Is it true that theoretically, multiple group CFA with categorical factor indicators can have same loadings but different thresholds across groups? The model will be identified as long as the scale factors are set to 1 across all groups. 


Yes, I think that's true. Note that the scale factors depend on 3 things: loadings, factor variance, and item residual variance. If factor variances are different across groups, fixing scale factors at 1 in all groups would be inconsistent with that. 
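
The dependence mentioned in the reply above can be written out. For an item loading on a single factor, the scale factor in the Delta parameterization is the inverse standard deviation of the latent response variable (the symbols below are introduced here for illustration and do not appear in the thread):

```latex
\Delta_{jg} = \frac{1}{\sqrt{\lambda_{jg}^{2}\,\psi_{g} + \theta_{jg}}}
```

Here \lambda_{jg} is the loading of item j in group g, \psi_{g} is the factor variance, and \theta_{jg} is the item residual variance. With loadings held equal across groups, fixing \Delta_{jg} = 1 in every group forces \lambda_{j}^{2}\psi_{g} + \theta_{jg} = 1 in each group, so a factor variance that differs across groups has to be offset by the residual variance.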


Hi, I'm doing a hierarchical regression analysis in two groups using Cholesky decomposition (because of indications of collinearity among the independent variables). I have established measurement as well as factor variance/covariance across groups. How do I compare regression coefficients across groups? I think this procedure is quite straightforward in ordinary SEM, but I'm getting confused by the decomposition framework. Thanks. 

Sally Czaja posted on Monday, December 14, 2009  11:10 am



I am predicting a person-level latent variable outcome (achievement) using a cluster-level factor (neighborhood poverty) by group (grp). I'm running the following analysis and am getting an error message (ERROR in MODEL command Parameters involving between-level variables are not allowed to vary across classes. Parameter: FB ON NEIGHPOV). Is there another way to estimate a model which allows between-level variables to vary across classes? What would you suggest? Thank you. Classes= c(2); KNOWNCLASS = c(grp=0 grp=1); WITHIN = female raceWb ageint1 poverty; CLUSTER = census; BETWEEN = neighpov; ANALYSIS: TYPE= TWOLEVEL mixture; MODEL: %WITHIN% %OVERALL% fw BY ZgrdyrSp5 Zwratscr Zqutest; fw ON female raceWb ageint1 poverty; %c#1% fw BY ZgrdyrSp5 Zwratscr Zqutest; fw ON female raceWb ageint1 poverty; %c#2% fw BY ZgrdyrSp5 Zwratscr Zqutest; fw ON female raceWb ageint1 poverty; %BETWEEN% %Overall% fb by ZgrdyrSp5 Zwratscr Zqutest; fb on neighpov; ZgrdyrSp5 Zwratscr Zqutest @0; %c#1% fb by ZgrdyrSp5 Zwratscr Zqutest; fb on neighpov; ZgrdyrSp5 Zwratscr Zqutest @0; %c#2% fb by ZgrdyrSp5 Zwratscr Zqutest; fb on neighpov; ZgrdyrSp5 Zwratscr Zqutest @0; 


Bjarte, is that a Cholesky decomposition of the independent factors, so that one factor is residualized given the other? 


Yes. 


I see that my first explanation was somewhat unclear. I meant "I have established measurement invariance as well as factor variance/covariance invariance across groups". And yes, it is a Cholesky decomposition of the independent factors. 


It sounds like for your group comparisons of slopes on the factors you don't want to use the decomposed factors you got by Cholesky but the original ones. If so, you back-translate the slopes to the original factors using Model Constraint and do tests of invariance using Model Test. 


Sally: Please send your input, data, output, and license number to support@statmodel.com. 


Hi, thank you so far! I think I should include some more information. Below is the input for my separate hierarchical regressions (Cholesky decomposition). As I said in an earlier post, I understand (hopefully) the procedure for comparing SEM models when the predictor variables are included simultaneously. However, I'm not sure what to do when comparing hierarchical models. What should I include in the second model (Model Scan) so that I can set the baseline model and then proceed with the comparison from factor loadings to structural paths? Thank you. See next post for input. 


USEVARIABLES ARE v1-v10; USEOBSERVATIONS = V1 EQ 2; !GROUPING IS V1 (1=Eng 2=Scan); WR by v1 v2; VOC by v3 v4; RAN by v5 v6; PA by v7 v8; WR1 by v9 v10; PH1 by WR* VOC RAN PA; PH2 BY VOC* RAN PA ; PH3 BY RAN* PA; PH4 by PA*; WR@0; VOC@0; PA@0; RAN@0; PH1@1; PH2@1; PH3@1; PH4@1; PH1 WITH PH2@0 PH3@0 PH4@0; PH2 WITH PH3@0 PH4@0; PH3 with PH4@0; WR1 on PH1 PH2 PH3 PH4; Model Scan: ? 


Is your question how to translate the slopes of WR1 ON PH1-PH4; to slopes for WR1 ON WR VOC RAN PA; ? I don't understand what "Model Scan:" is. I would expect that you would have Model Constraint here. 


The "model scan" is the group-specific model (Group 1 is Eng, Group 2 is Scan). The input above is my setup for hierarchical regressions. PH1-PH4 each equal one factor residualized given the others. That is, PH1 is equal to WR, PH2 is the residual of VOC after WR has been partialled out, PH3 is the residual of RAN after WR and VOC have been partialled out, and PH4 is the residual of PA after WR, VOC, and RAN have been partialled out. I have done separate analyses for each group (Eng and Scan). My problem arises when I try to compare the structural paths for WR1 on PH1-PH4 across groups. As I said, when doing standard regression this is quite straightforward. When doing hierarchical regression I'm not sure if it's even possible... 


I don't see why a problem would arise here. First test that the measurement parameters are equal across groups, including the parameters of the PH* BY statements, and if that is not rejected, test if the structural parameters of WR1 ON PH1-PH4 are equal. Group-equality testing is covered in our Topic 1 course on the web. 
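
One way the second step might look, using the MODEL TEST command mentioned earlier in the thread, is sketched here. It assumes the GROUPING statement with labels Eng and Scan is active and that the measurement part is already specified in the overall MODEL command. The parameter labels (e1-e4, s1-s4) are made up for this illustration.

```mplus
MODEL Eng:
  WR1 ON PH1 (e1);
  WR1 ON PH2 (e2);
  WR1 ON PH3 (e3);
  WR1 ON PH4 (e4);
MODEL Scan:
  WR1 ON PH1 (s1);
  WR1 ON PH2 (s2);
  WR1 ON PH3 (s3);
  WR1 ON PH4 (s4);
MODEL TEST:
  0 = e1 - s1;      ! joint Wald test that all four slopes
  0 = e2 - s2;      ! are equal across the two groups
  0 = e3 - s3;
  0 = e4 - s4;
```

An alternative is to give each slope the same label in the overall MODEL command and compare that run with the unconstrained run by a chi-square difference test.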


Greetings, I have data for three different grades who were assessed on the same instrument in the fall and spring of the academic year. In order to estimate appropriately scaled ability scores across time points (fall/spring) and grade (1,2,3), is it best to run two separate multiple group analyses (one for each time point) or to run a multiple group MIMIC model with time as the covariate? Thank you for any input! 


I would first test measurement invariance across time for each grade. Once that is established, I would use multiple group analysis to test measurement invariance across grade. 

leah lipsky posted on Friday, January 08, 2010  11:05 am



Hello, Can you tell me why I'm getting the same estimates & fit statistics regardless of which paths I constrain (trying to do multiple group path analysis)? For example, the 1st model below constrains all paths, and the 2nd I believe frees them all. Thank you!! MODEL 1: ALL PATHS CONSTRAINED VARIABLE: NAMES ARE id edu exfreq ageyrs wtchg2y gainer pcap pwtatt pseff yr1rtrn yr2rtrn return bmichg ploc fvint retain1y partot; MISSING = ALL (999); GROUPING IS retain1y (0=no 1=yes); USEV ARE pseff ploc exfreq fvint wtchg2y; CATEGORICAL = exfreq fvint; MODEL: exfreq on pseff ploc; fvint on pseff ploc; wtchg2y on pseff ploc; pseff with ploc; fvint with exfreq@0; OUTPUT: standardized modindices(3.84); MODEL 2: NO CONSTRAINTS DATA: ...same as above... MODEL: exfreq on pseff ploc; fvint on pseff ploc; wtchg2y on pseff ploc; pseff with ploc; fvint with exfreq@0; MODEL retainer: exfreq on pseff ploc; fvint on pseff ploc; wtchg2y on pseff ploc; pseff with ploc; fvint with exfreq@0; OUTPUT: standardized modindices(3.84); 


The default in Mplus is for regression coefficients to be free across groups. So the models are the same. You can see this by looking at TECH1 or your model results. You need to constrain the parameters to be equal in the second model. 
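
A sketch of what the constrained version could look like, following the labeling rule given at the top of this thread (one label in parentheses per line, placed in the overall MODEL command so that the equality holds across groups). The label numbers are arbitrary.

```mplus
MODEL:
  exfreq ON pseff (1);     ! each slope gets its own label, so it is held
  exfreq ON ploc (2);      ! equal across groups but not equal to the
  fvint ON pseff (3);      ! other slopes
  fvint ON ploc (4);
  wtchg2y ON pseff (5);
  wtchg2y ON ploc (6);
  pseff WITH ploc;
  fvint WITH exfreq@0;
```

Running this next to the model without labels should give different fit statistics, and TECH1 can be used to confirm that the labeled slopes share parameter numbers across groups.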


Dear Linda, in my multiple group analysis (testing for metric invariance) I get the following message: "THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 324." However, I don't have a parameter 324; I checked TECH1. My model looks like this: y1 BY x1 TO x7; y2 BY x8 TO x13; y2 ON y1 x7; I have four groups. My file contains missing data, which I specify by "missing = blank", and I use the "auxiliary" option. My groups are of different sizes. However, this was not a problem when establishing configural invariance. Since I don't have a parameter 324, what does the error message mean? Thanks a lot! 


Go to the beginning of Technical 1 and search for 324. I have never heard of us reporting a parameter number that does not exist. If this does not help, please send the full output and your license number to support@statmodel.com. 


Dear Linda, I've just emailed my output. Additionally, I've estimated the above model testing for metric invariance without the "AUXILIARY" command, getting the following warning: "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.693D-17. PROBLEM INVOLVING PARAMETER 82. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE SAMPLE SIZE IN ONE OF THE GROUPS." My groups are of sample size 42, 100, 72 and 158, respectively. If I'm not mistaken, I estimate fewer parameters than the sample size in any one group. Moreover, the parameter that Mplus points to is the psi parameter for the endogenous latent variable in my second largest group. And I did not get this error message in the model testing for configural invariance, where more parameters had to be estimated! Thanks for your help! 


Dear Linda, I've checked my input file for the fifth time and have eventually discovered the reason for both problems mentioned above. I've specified the model correctly now and it works well. Sorry for bothering you ... 


Dear Drs. Muthén, I have four groups in my SEM. Since subjects in all groups have missing values I have used the "AUXILIARY option" as follows: AUXILIARY = (m) z1 z2 z3 z4 z5; My models either did not converge or the SEs could not be estimated. Now I've run the model again without auxiliary variables and everything works fine (i.e., I can establish scalar invariance). How am I to interpret this result? I would have expected a different result: no convergence without auxiliary variables and convergence with auxiliary variables. Thanks for your help! 


It would be impossible to answer this question without seeing your input, data, and license number at support@statmodel.com. 


I would like to test the structural invariance between three groups. To produce an unconstrained model (regression weights free among groups), should I fix all parameters (factor loadings, intercepts, and means@0) in my measurement model? Would it be better to fix only factor loadings and intercepts? What should I do with my covariances: fix them or leave them free? Thank you for your help! 


The Mplus default is to hold the measurement parameters of factor loadings and intercepts equal across groups. The residual variances are not held equal. To compare structural parameters, measurement invariance is required. See the end of the discussion of measurement invariance and population heterogeneity to see the models for testing the equality of the structural parameters. 


I am running a fairly complicated SEM analysis. Just to give you an idea of the scope of it, here is the syntax MODEL: abusemid on abuseearly; momdrink by maxdrkBMt12 dkpropbmt1BM bingedkt1BM; maxdrkBMt12 with bingedkt1BM; daddrink by bingedkt1BF dkpropbmt1BF maxdrkBFt12; bingedkt1BF with maxdrkBFt12; daddrink with momdrink; momdrink on FHg0; daddrink on FHg0; abuseearly on momdrink; abuseearly on daddrink; adolund by DELINt4 DELINt5 aggret4 aggret5; earlyund by aggBFt1 aggBFt2 aggBMt1 aggBMt2 delBFt2 delBMt2 ; school by TRFviii1t5 TRFviii2t5 TRFviii3t5 TRFviii4t5; abuseearly on FHg0; crisemft4 on FHg0 momdrink daddrink; adolund on abusemid crisemft4; school on abusemid crisemft4; school on adolund ; adolund on earlyund tsex; I have 329 in my sample  89 are girls and 240 are boys. I would like to address sex differences in this model, but I feel like I do not have enough girls to do this with. Do you think there are enough girls to try to run the 2 group analysis? Thanks! Jennie 


At a minimum you need to have several more observations in a group than you have parameters in the group. If you meet this condition, you would need to do a Monte Carlo study to see if the sample size is large enough. 


Dear Linda and Bengt, I have a question concerning the calculation of CFI in multigroup models. Little et al. (2007) refer to a paper by Widaman and Thompson (2003) who argue that "many applications of SEM require one to specify and estimate an appropriate null model" when one wishes to model variances or means. Is such an altered null model used as the default in the calculation of CFI in Mplus 5.21 when specifying a grouping variable? Thanks for your help, Maren ....... Little, T. D., Card, N. A., Slegers, D. W. & Ledford, E. C. (2007). Representing contextual effects in multiple-group MACS models. In T. D. Little, J. A. Bovaird & N. A. Card (Eds.), Modeling contextual effects in longitudinal studies (pp. 121-147). Mahwah, NJ, US: Lawrence Erlbaum Associates Publishers. Widaman, K. F. & Thompson, J. S. (2003). On specifying the null model for incremental fit indices in structural equation modeling. Psychological Methods, 8, 16-37. 


I am not familiar with the articles. The Baseline model used in Mplus is the means and variances of all observed variables and the covariances of the observed exogenous variables. 


Widaman and Thompson (2003) describe the modified null model as follows: "First, an acceptable null model must represent covariances among manifest variables as null, or zero. Second, and the key distinction here, if any within-group and/or between-group constraints on estimates of manifest variable means or residual variances are invoked in any substantive models under consideration, these constraints must be included in an acceptable null model. These constraints on means and residual variances will typically be operationalized as constraints on the tau and theta matrices that are the only matrices with parameter estimates in the standard null model." Is there a way to specify such an alternative baseline model in Mplus? 


You can't change the Baseline model that Mplus uses. However, you can run two models, the baseline you want and your H0 model and do a difference test. We do not fix the observed exogenous variable covariances to zero because the model is estimated conditioned on the observed exogenous variables. Their covariances are not fixed at zero during model estimation. By fixing them at zero in the baseline model, overall model fit depends on how highly the observed exogenous variables correlate in spite of the fact that these correlations are not H0 model parameters. 


I've thought about running a difference test, too. However, if I did as you've suggested, wouldn't all goodness-of-fit indices (CFI, TLI, RMSEA, ...) for my H0 model still be calculated on the basis of the baseline model that Mplus uses as the default? If so, and if I want to estimate these fit indices by using my baseline model, could I estimate these indices by hand by using the chi-square difference value in the formulas? Thanks for your help! 


The Baseline model used by Mplus will not change. You would need to calculate all fit statistics by hand using the chi-square difference value if you want to change the Baseline model. 
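
For readers who want to do the hand calculation, the usual textbook definitions of CFI and TLI are given here as a sketch. The subscript B denotes the user's own baseline model (run as a separate analysis) and M denotes the H0 model; this notation is introduced here and is not from the thread. RMSEA is left out because its multiple-group form in Mplus is not stated in the replies.

```latex
\mathrm{CFI} = 1 - \frac{\max\left(\chi^{2}_{M} - df_{M},\, 0\right)}
                        {\max\left(\chi^{2}_{M} - df_{M},\ \chi^{2}_{B} - df_{B},\ 0\right)}
\qquad
\mathrm{TLI} = \frac{\chi^{2}_{B}/df_{B} - \chi^{2}_{M}/df_{M}}
                    {\chi^{2}_{B}/df_{B} - 1}
```

Both indices only need the chi-square values and degrees of freedom of the two runs, so swapping in a different baseline model changes \chi^{2}_{B} and df_{B} and nothing else.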


Drs. Muthen, I am conducting a multigroup path analysis and am attempting to compare a model where all parameters are freely estimated to one in which the means are constrained to be equal across groups. I am obtaining the same model fit statistics and parameter estimates in both the freely estimated and constrained models, which does not seem possible. I set up my variables as latent constructs using a single indicator. Below is a portion of my input. Freely estimated model: !LATENT VARIABLES MEANS (A = alpha) [p0_pos]; [p1_pos]; [p2_pos]; [p3_pos]; [t0_agg]; [t1_agg]; [t2_agg]; [t3_agg]; Model with means constrained to be equal: !LATENT VARIABLES MEANS (A = alpha) [p0_pos](1); [p1_pos](2); [p2_pos](3); [p3_pos](4); [t0_agg](5); [t1_agg](6); [t2_agg](7); [t3_agg](8); I would appreciate any suggestions on how to correct this issue. Thank you very much for your help. 


It is not possible to answer your question without more information. Please send the two outputs and your license number to support@statmodel.com. 

Wu wenfeng posted on Monday, April 05, 2010  5:20 pm



Hello! I have read some articles about measurement invariance, and found that the procedures they use to test multiple groups with MASC differ. I wonder: when testing latent mean equivalence, should item variance equivalence also be tested? And if so, should that test come before or after the latent mean equivalence test? 


I don't know what MASC is. 

Wu wenfeng posted on Tuesday, April 06, 2010  8:16 am



Sorry! I misspelled it. It should be MACS (means and covariance structures). 


Latent variable means are not measurement parameters. They are structural parameters. See our Topic 1 course handout and video for a discussion of using multiple group analysis to test for measurement invariance and population heterogeneity. 

Wu wenfeng posted on Tuesday, April 06, 2010  9:32 am



I have read the content you mentioned, but I am still confused. Anyway, thank you! 


Dear Dr. Muthen, I am conducting a multiple group ESEM analysis of dichotomous data (2 groups) with a high number of cases in each group (170.000 and 50.000 respectively) and 41 variables. The fit indices of our analysis (CFI, RMSEA and TLI) indicate, when testing for measurement invariance (thresholds and loadings equal, scale factors 1 in one group and free in the other), that both groups have a similar structure. We conduct factor analyses in the first place in order to obtain factor values on which our further analyses are based. Based on the factor values, which are comparable for the two groups in the invariant model, we would like to calculate Euclidean distances (of these factor values for each case across the two groups). Is it legitimate to constrain the factor means in the invariant model to zero in both groups (contrary to the recommendation that, when holding thresholds and loadings constant, means in the second group should be estimated freely)? With factor means of 0 in both groups, factor values seem to be much more comparable in our case, and Euclidean distances would be calculated for standardized factors instead of subtracting standardized from unstandardized factor values, right? I would highly appreciate your comments. Thanks, Pablo 


I would not recommend this. 

PB posted on Tuesday, April 13, 2010  12:40 pm



Thank you for your reply. The idea behind holding factor means invariant was to be able to actually compare factor values by calculating distances between the factor scores of one group and the scores of the other group per item. Our goal is in fact to have such a proximity measure on which our further analyses are based. Could you maybe specify what exactly you would not recommend: holding factor means invariant in this case (although for noninvariant factor means distances between the factor values do not really make sense), or calculating distances of the factor scores at all (and if so, why)? Your help is very much appreciated. Thanks in advance. 


You need to first test if the factor means are equal across the groups. Only if that is not rejected would I work with factor scores from the model where you hold the means at zero in all groups. Note that factor scores are comparable across groups even when factor means are different. The measurement invariance ensures that. So you could go ahead and calculate your factor score distances under our default model. 

PB posted on Friday, April 30, 2010  1:57 am



Thank you for your response. Which way of testing for the equality of the factor means across the groups is recommended/sufficient? When holding means in the reference group constant, while freeing them in the other group, the resulting estimated means (which as far as I understand are the mean differences in comparison to the reference group) are significant. This is not surprising since the dataset is quite comprehensive. However, when comparing the model with means = 0 in the reference group and freely estimated means in the second group to a model with factor means fixed at zero in both groups, the change in the fit indices is quite small and the indices themselves are good. Could I therefore assume that I can work with a model with factor means held at zero in both groups (due to the still good fit values)? Thanks again for your much appreciated help. 


The differences between what you do in paragraphs 2 and 3 are unclear. Please send the two outputs and your license number to support@statmodel.com. 

PB posted on Friday, April 30, 2010  9:42 am



Sorry for not being clear. Let me try to clarify. What I meant in paragraph 3 was: I am comparing model A (Thresholds and Factor Loadings constrained to be equal across groups; residual variances fixed at one in one group and free in the other; factor means fixed at zero in one group and free in the other group) to model B (Factor Loadings and Thresholds held equal in both groups AND Factor Means fixed at zero in BOTH groups). The fit indices in model B are good and the differences from the fit indices in model A are rather small. Can I therefore assume (referring to Dr. Muthen’s post from April 16, 2010  10:27 am) that factor means are equal across the groups (since the invariance model B with equal means shows good fit indices)? 


Yes. 


Dear Drs. Muthén, I would like to specify a multiple group model in Mplus with the following constraints: factor loadings fixed to zero, intercepts invariant over groups, and unique factor covariances freely estimated. I have a g-factor model with seven indicators which I specified as follows for all four groups: F BY a@0 b@0 c@0 d@0 e@0 f@0 g@0; [a] (1); [b] (2); [c] (3); [d] (4); [e] (5); [f] (6); [g] (7); I get the following error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 14. (which is the theta parameter for variable g) What am I doing wrong in specifying the model? Thank you so much for your help! 


When you fix the factor loadings to zero, the factor variance is not identified. 


Dear Linda, I've added the following in each group: F@1; [F@0]; a*; b*; c*; d*; e*; f*; g*; And now I do get estimates. Do these additional commands alter the meaning of my model? Thanks for your help! 


That specification is the same as not having the factor in the model  no free parameters are associated with it. 


What I'm trying to do with this model is to specify the "acceptable null model" according to Widaman and Thompson (2003). For my case, this model has to have the following specifications: factor loadings fixed to zero, intercepts invariant over groups, unique factor covariances freely estimated. I'm not sure whether the additional specifications for mean and variance (which I added in order to identify the model) actually alter the model's meaning. Thanks for your help! 


You should specify a model of means held equal across groups and variances and covariances not held equal across groups. 
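
A minimal sketch of the specification described in the reply above, using the seven indicator names from the earlier post. It assumes the factor F is dropped from the model. Each labeled mean sits on its own line so that the label applies to that parameter only; variances and covariances carry no labels, so they are not held equal across groups.

```mplus
MODEL:
  [a] (1);              ! the same label in the overall MODEL command
  [b] (2);              ! holds each mean equal across groups
  [c] (3);
  [d] (4);
  [e] (5);
  [f] (6);
  [g] (7);
  a b c d e f g;        ! variances, free in every group
  a WITH b c d e f g;   ! covariances, free in every group
  b WITH c d e f g;
  c WITH d e f g;
  d WITH e f g;
  e WITH f g;
  f WITH g;
```

Whether the covariances should be free, as in the reply, or fixed at zero, as in the quoted description of the null model, depends on which null model is intended.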


Hello! Why are the estimates a little bit different if I use multigroup analysis (let's say for boys and girls) than if I use separate data sets for boys and girls and run exactly the same model (no constraints included)? Thank you! 


They should not be different. You may not have relaxed all of the default equality constraints in Mplus. If you can't see the problem, please send the relevant outputs and your license number to support@statmodel.com. 

Nadia posted on Wednesday, June 23, 2010  9:53 am



hi, i am trying to run a simple logistic regression with a categorical variable as predictor, how do i run this? I have put in grouping is... but it still comes up with an error message: ERROR in ANALYSIS command ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE. and if i try to do mixture it tells me i don't have the mixture option 


The GROUPING option is not available with maximum likelihood and categorical outcomes. You need to use KNOWNCLASS and MIXTURE for that. You can use probit regression and the GROUPING option. 

Nadia posted on Friday, June 25, 2010  9:12 am



Sorry Linda, I am using a binary outcomes but a categorical predictor, not a categorical outcome. 


A binary outcome is a categorical outcome. 

Nadia posted on Sunday, June 27, 2010  6:40 am



thanks, is there any way i can buy just the mixture add on? although i have the basic option? 


Yes. Please contact Michelle at Mplusadmin@statmodel.com for more information. 

Nadia posted on Tuesday, June 29, 2010  1:30 am



hi linda, this is really driving me bonkers! I am trying this out on our institution computer, which has the mixture add on. I have tried the type=mixture with knownclass and i keep getting an error message *** ERROR in Variable command CLASSES option not specified. Mixture analysis requires one categorical latent variable. but i don't want a categorical latent variable, all i want is a straightforward logistic regression with a categorical predictor. How can it be so complicated to do this!! 


A logistic regression is shown in Example 3.5. If you want multiple group analysis also, you need to use the KNOWNCLASS option along with the CLASSES option and TYPE=MIXTURE. Example 8.8 shows the way to specify this. 
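
A sketch of how these options might fit together for a binary outcome. The variable names (y, x, g) and the class variable name cg are hypothetical; the KNOWNCLASS variable g holds the observed group codes, which are assumed here to be 0 and 1.

```mplus
VARIABLE:
  NAMES = y x g;
  USEVARIABLES = y x;
  CATEGORICAL = y;                 ! binary outcome
  CLASSES = cg (2);                ! class variable required by TYPE = MIXTURE
  KNOWNCLASS = cg (g = 0 g = 1);   ! classes are the observed groups
ANALYSIS:
  TYPE = MIXTURE;
MODEL:
  %OVERALL%
  y ON x;                          ! logistic regression of y on x
  %cg#2%
  y ON x;                          ! mentioning the slope here lets it differ in the second group
```

The classes are fully determined by g, so no latent classes are actually estimated; the CLASSES option is there only because mixture analysis requires it.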

Prathiba posted on Tuesday, June 29, 2010  9:22 am



Dear Drs. Muthen: If I conduct a multigroup CFA with sample sizes N=550, 3261, and 2103, do you think the sample size disparity would cause inflation/deflation of any estimates? No parameters are constrained across groups. 


If no parameters are constrained equal across groups the results are the same as if analyzing each group separately. 


Hello, Dr. Muthen. I'm conducting an SEM that includes an interaction effect, and I want to run it as a multiple group SEM. As far as I know, Mplus does not provide a chi-square statistic when an interaction term is included in the model. How can I examine the group differences? 


Use loglikelihood difference testing, where 2 times the loglikelihood difference is distributed as chi-square. Or use MODEL TEST. 
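
Written out, the loglikelihood difference test compares a nested model (subscript 0, for example slopes held equal across groups) with a less restrictive model (subscript 1). The symbols are introduced here for illustration: L is the loglikelihood value, p the number of free parameters, and c the scaling correction factor that is printed when the MLR estimator is used.

```latex
% ML estimator
TRd = -2\,(L_{0} - L_{1}), \qquad df = p_{1} - p_{0}

% MLR estimator (scaled difference test)
cd = \frac{p_{0}\,c_{0} - p_{1}\,c_{1}}{p_{0} - p_{1}}, \qquad
TRd = \frac{-2\,(L_{0} - L_{1})}{cd}
```

TRd is referred to a chi-square distribution with df degrees of freedom. Which of the two forms applies depends on the estimator that was actually used, so the output should be checked for a scaling correction factor.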

Anna Nagy posted on Tuesday, August 10, 2010  12:41 pm



Dr. Muthén, I have math outcome data at two time points (pretest and post test) for students in two conditions: Treatment and Control. Pretest and post test scores measure 4 different aspects of math. Therefore I created a latent variable. My question is whether there are significant differences between the treatment and control groups in math. To address that question my plan was to conduct multiple group analysis. However, because of the small sample size (N = 78) I couldn't conduct the analysis. Is there another way to address my question? The only idea that comes to my mind is to save the factor scores and conduct an ANCOVA. Can you recommend some other ways to analyze my data? Thank you, Anna 


If you can estimate a model and obtain factor scores, I'm not sure why you were unable to conduct the analysis. 

Anna Nagy posted on Tuesday, August 10, 2010  1:15 pm



I was only able to conduct the CFA and create two latent variables measuring math at time 1 and time 2. Following that step I was planning to conduct the MGA, but the model blew up right at the configural invariance level. I blamed it on the small sample size. 


Please send the files and your license number to support@statmodel.com. 

Sonja Nonte posted on Monday, August 23, 2010  8:45 am



We are trying to test factorial invariance in a multigroup CFA (categorical data). We would like to specify the baseline model with free thresholds, factor loadings, and means. We've already found out that we have to fix the factor mean at 0 and the residual variances at 1 for identification. But our question is one step before that: how can we free the thresholds and factor loadings? Until now, we have the following statements: VARIABLE: ... grouping is S1sex (1=girls 2=boys); ANALYSIS: PARAMETERIZATION=THETA; MODEL: SpoSeko by S1Sp1r S1Sp2r S1Sp3r S1Sp4r; SpoSeko@0; S1Sp1r@1; S1Sp2r@1; S1Sp3r@1; S1Sp4r@1; And in the next step (equal thresholds and factor loadings) do we keep the restrictions concerning the mean and the residual variances? If we do not keep those, how can we still perform a diff test, even though we changed the baseline model? 


See the Topic 2 course handout under multiple group analysis. Here the measurement invariance models are shown for the Delta parametrization. The only difference between this and the Theta parametrization is that scale factors are parameters in Delta and residual variances are parameters in Theta. 
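
For orientation, a baseline model of the kind asked about might be written as follows under the Theta parameterization. This is a sketch that assumes binary items (one threshold each, written $1) and the group labels girls and boys from the post above; it is not taken from the course handout.

```mplus
ANALYSIS:
  PARAMETERIZATION = THETA;
MODEL:
  SpoSeko BY S1Sp1r S1Sp2r S1Sp3r S1Sp4r;
MODEL boys:
  SpoSeko BY S1Sp2r S1Sp3r S1Sp4r;          ! frees the loadings (the first stays fixed at 1)
  [S1Sp1r$1 S1Sp2r$1 S1Sp3r$1 S1Sp4r$1];    ! frees the thresholds
  S1Sp1r@1 S1Sp2r@1 S1Sp3r@1 S1Sp4r@1;      ! residual variances fixed at 1
  [SpoSeko@0];                              ! factor mean fixed at 0
```

Note that a statement such as SpoSeko@0 without brackets refers to the factor variance; the factor mean is addressed with brackets, as in [SpoSeko@0].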


Dr. Muthen, I'm trying to run a cross-lagged model with second-order factors. The simplified model is shown below: model: f1 by x1-x3; f2 by x4-x6; f3 by x7-x9; f4 by x10-x12; f5 by f1-f2; f6 by f3-f4; f6 ON f5; This model runs fine, but when I run the model for multiple groups: MODEL male: f6 ON f5; this model is not identified. Any advice would be helpful. Thanks 


The intercepts of the 1st order factors in their regression on the 2nd order factors need to be fixed at zero in both groups for identification. 
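
Applied to the simplified model in the question, the fix might look like this. It is a sketch and assumes that the group label male is one of the labels in the GROUPING statement.

```mplus
MODEL:
  f1 BY x1-x3;
  f2 BY x4-x6;
  f3 BY x7-x9;
  f4 BY x10-x12;
  f5 BY f1-f2;
  f6 BY f3-f4;
  f6 ON f5;
  [f1-f4@0];     ! intercepts of the first-order factors fixed at zero in all groups
MODEL male:
  f6 ON f5;
```

With the first-order intercepts fixed, group differences in level are carried by the second-order factors only.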

Regan posted on Tuesday, September 14, 2010  12:48 am



Hello! A while ago, someone had this question: "...I ran subsequent multiple groups analyses for each of the 3 race/ethnicities...The model fit was excellent for two of the groups, but unacceptable for the third group....Should I accept the omnibus model for the two races that have good fit and develop a different model for the third race?" Dr. Linda Muthen's advice to him: "...It does not make sense to put groups together if the same model does not fit the data well for each group.... Only then does it makes sense to combine the groups..." My questions: 1) I wanted to confirm that if, in attempting to do a multiple-group path model, we first test the model in each group separately and find good model fit in two groups and poor fit in one group, one should stop and just present a separate model for each group and not attempt the multiple-group approach? (If there may be a plausible and interesting reason for the finding that the third model did not fit the data, can we also present this model?) 2) Am I correct that with a nonsignificant chi2 diff test, your interpretation is that the H1 and H0 models are not significantly different from each other and it is okay to combine the data into one group, perhaps allowing for invariance in certain paths? (and that separate models are necessary if the chi2 diff test IS significant)? Thank you! 


1. A first step in a multiple group analysis is to analyze each group separately. Only groups for which the same model fits well should be compared. That a different model fits well for one group can be of interest. 2. I don't know what you mean by combine into one group because if you do this, you cannot allow for invariance. 

Regan posted on Tuesday, September 14, 2010  12:02 pm



Hello again, I was referring to your response to the gentleman above. When you say that: 'it does not make sense to put groups together if the same model does not fit the data well...' By this, do you mean that we should not try to compare these groups? If my model fits well for non-Hispanic Caucasians and non-Hispanic African-Americans for instance, but not for Hispanics/Latinos, I believe what I should do after having run separate models is just do a multiple group analysis with the Caucasian and African-American groups and either explain the lack of fit in the Hispanic group or develop a separate model altogether for them. Is this a correct understanding? Thanks again in advance! 


You should not compare groups for which the same model does not fit well. There is no basis for comparison. These groups should not be included in the multiple group analysis. 


I have a good-fitting omnibus structural equation model that includes three racial/ethnic groups (N=424). When I run the model that includes all participants, the fit indices are all good and I have no error messages. When I run the model separately for each group, the fit indices are still good, but I get the following error message: "THE MODEL ESTIMATION TERMINATED NORMALLY WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE EDU3." My question is, is the Mplus output for each racial group interpretable, or does the error message negate interpretability? 


This message means the model is not admissible. You probably have a negative residual variance or variance for edu3. 

haxha posted on Saturday, October 30, 2010  11:26 pm



Dear Dr. Mullen. I am using Mplus to conduct a multigroup analysis. I have a question about validating the model. I have a model that I created for a large sample of 800 women. I was told that to validate it I need to test this model on one half of the sample and then test it again on the other half. I do this using the MLM estimator because my data are not normal regardless of the transformations I have undertaken. Most estimates are similar (the direction, significance, chi-square significance), but one parameter loses significance when I test the model in one half of the data. Should I respecify the model? Is it essential that all parameters are significant in all models tested? Also, since I can't use bootstrapping with MLM, is there any other simulation method I am able to use? Haxha 


I think typically one randomly divides the sample as a first step and fits the model first in one sample and then in the other. If key parameters are not significant in both, the model may not be robust. 

haxha posted on Sunday, October 31, 2010  10:18 am



Thanks so much, Dr. Muthen. I apologize for a typo earlier. Just one more follow-up question, if you don't mind. I have transformed the data but they are still not normal; I am using MLM, but I am doing so with the already transformed data. Is that OK, or must I go back to using the data in their original form? Data in the original form are severely skewed. Also, is there any simulation method I can use with MLM instead of bootstrapping? Many, many thanks! Haxha. 


In general I would not transform variables. I would use the MLR estimator. 

haxha posted on Sunday, October 31, 2010  11:15 am



Thank you so much! 

Kai Savi posted on Thursday, November 04, 2010  12:51 pm



Hello, I am working on a multiple group analysis and am looking for MLR output so I can use chi-square difference testing. I know I can't do MLR with grouping, but my data are in two data sets. I can do a multigroup analysis, but not with MLR. Because I am using two data sets, I do not have a single variable to use to differentiate classes. Is it possible to use KNOWNCLASS and get MLR output with two data sets? Thanks. 


You should be able to do this with MLR if you are using TYPE=GENERAL. Have you received an error message or are you just assuming this? If you have received an error message, please send your output and license number to support@statmodel.com. 
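
For readers wondering why MLR output matters for difference testing: with MLR (or MLM), the plain difference between two chi-square values is not chi-square distributed, so a scaled difference is computed from the chi-square values and the scaling correction factors printed in the output. The sketch below follows the Satorra-Bentler scaled difference formula as documented on the Mplus website; the function name and all numbers are hypothetical and only illustrate the arithmetic.

```python
def scaled_chisq_diff(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference for nested models.

    t0, c0, d0: chi-square, scaling correction factor, and df of the
                nested (more constrained) model.
    t1, c1, d1: the same quantities for the comparison (less constrained) model.
    Returns (scaled difference, difference in df, difference-test scaling cd).
    """
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # scaling correction for the difference
    trd = (t0 * c0 - t1 * c1) / cd         # scaled difference statistic
    return trd, d0 - d1, cd


# Hypothetical values read off two MLR outputs (not from this thread).
trd, ddf, cd = scaled_chisq_diff(t0=100.0, c0=1.2, d0=50, t1=80.0, c1=1.1, d1=45)
print(trd, ddf, cd)
```

Note that cd can come out negative for some pairs of models, in which case the scaled difference is negative and cannot be interpreted; this is the situation with MLR that is mentioned elsewhere in this thread.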

Kai Savi posted on Thursday, November 04, 2010  2:09 pm



Thanks Linda, I am assuming it, because I am not clear on how to describe KNOWNCLASS with two datasets (as opposed to a class variable). It seems like it should be simple enough, but I was not able to find anything in the manual on how to write that into the syntax. Thanks, 


It is not clear to me why you think you need the KNOWNCLASS option or if you do. If you do, the two data sets must be in the same file with a grouping variable. 


Hello, I ran a multigroup analysis with strong invariance that fits fine. When I try to test for measurement invariance (configural or weak), I receive the message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.929D-17. PROBLEM INVOLVING PARAMETER 34. I checked parameter 34 (psi between two latent variables) but didn't see what the problem is. The model also runs fine when I fit each group in a single-group model (only girls or only boys). I'm puzzled because it doesn't make sense to me that both single-group models work well and a model with strong invariance shows good fit too, while more liberal models don't work. Is there any explanation for this phenomenon? Thanks a lot. 


Please send the output and your license number to support@statmodel.com. 

Amy Tobler posted on Friday, November 19, 2010  1:04 pm



I am running a multigroup clustered path analysis. When the model is run I get the following warning: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.291D-15. PROBLEM INVOLVING PARAMETER 29. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER. My question is, does this mean that the chi-square values for model fit are not reliable as well, or just the individual parameter standard errors? Thanks 


Both chi-square and SEs are somewhat questionable in this case where you have more parameters than clusters. They could be fine, or they may be poor. It depends on several factors, including how many parameters refer to the between level; if few, you may be OK. Only a Monte Carlo simulation study could tell more. 


I am running a multiple group analysis for a path analysis with multiple mediators using ML, and I am getting a negative chi-square value. I know that this can happen with MLR, and that you can't interpret the test. What are your recommendations when this happens with a model using ML? 


Please send the files that show this and your license number to support@statmodel.com. 

Sylia Wilson posted on Saturday, February 05, 2011  12:09 am



Hello, I am using multigroup models to compare across mothers and fathers in our sample. In order to fit the measurement model for our first latent construct (not yet testing invariance across groups), I need to constrain 1 of 4 indicators to be equal to 1, and the other 3 indicators to be equal to one another: Model: latent by obs1@1 obs2 (1) obs3 (1) obs4 (1); I would now like to test invariance across mothers and fathers, but I'm not sure of the code for setting the loadings equal across groups, taking into consideration the constraints on the latent variable. What I would like to do is something like the following, where the * means equal across groups, but I know you cannot put 2 () in the same line. Model for mothers: latent by obs1@1 obs2 (1) (1*) obs3 (1) (2*) obs4 (1) (3*); Model for fathers: latent by obs1@1 obs2 (1) (1*) obs3 (1) (2*) obs4 (1) (3*); Do you have any suggestions? Thank you very much for your time. 


The following specification holds parameters equal across variables and across groups. See TECH1 or your results to see that these equalities hold. 


Hello, I'd like to fit a multigroup model with two groups. The first group should be composed of all observations with a value from 2 to 4 on the variable "SNB", and the other group should be composed of all observations with the value 1 on "SNB". I tested several options for the grouping, such as: GROUPING IS SNB (1 = NO_SNB 2 3 4 = SNOWB); or GROUPING IS SNB (1 = NO_SNB 2, 3, 4 = SNOWB); but neither option runs. It would be great if you could help me! Thanks 


You need to use DEFINE to create a variable that combines the values of 2, 3, and 4. 


Dear Dr. Muthen, I am running a multiple group analysis with two groups with very different sample sizes: n of group 1 = 213, n of group 2 = 70. For 2 of the paths, the unconstrained path coefficients are very different for the two groups. As an example: Group 1: beta = .04 (p = .674); Group 2: beta = .39 (p = .006). However, when I constrain all path coefficients to be equal for the two groups in order to test whether the paths differ between groups, the constrained model is not significantly worse than the unconstrained model (p = .355). Is it possible that this nonsignificant difference is due to the unequal sample sizes? And if so, is there a way to circumvent the problem of the unequal sample sizes? Thank you very much for any advice you could give me! Veronique 


I think the problem is lack of power. 


Thank you for your response Dr Muthen, I was afraid that could be the problem. Veronique 


Dear Linda, I'm trying to run a multiple group analysis with three groups and imputed data. My model contains a latent variable which is regressed on four other latent variables. Furthermore, I added three covariates (following Example 5.14 in the Mplus User's Guide). The model shows good fit; however, the standard errors of the latent means for group 2 and group 3 seem to be extremely large. Running the model without the covariates leads to acceptable standard errors, so I assume there might be some problem with the covariates. Do you have any idea why the standard errors of the latent means increase when I add covariates? Thanks in advance! 


Please send the full outputs and your license number to support@statmodel.com. 


Hi Linda, Is there a limit to the number of groups that Mplus can accommodate in a multiple group analysis? In the Mplus manual, I can only find examples involving 2 groups, but I saw mention of 6 groups in an earlier post in this topic. I am analyzing data from a study involving 8 groups and am wondering if I can include all 8 groups in the same analysis. Thanks! P.S. I am very excited to see that there will be a Mac version of Mplus at some point soon (I can stop spending money on Parallels and Windows at that point). Will those of us who switch have to buy a new license, or will we be able to get the Mac version as part of our annual renewal? 


There is no explicit limit to the number of groups. Those who want to change from Windows to Mac will be able to do so if their upgrade and support contract is current. I am working out the details on how that will happen. 


Dear Dr. Muthen, I am conducting a multiple group analysis (by gender) on a model wherein we have separate hypothesized models for men and women. Essentially we have a theoretical model for men and theoretical model for women. Both models contain the same latent variables but paths are specified to be different between the genders. What I would like to know is if it is possible in Mplus to empirically show that the male model fits the data better for males than it does for females and that the female model fits the data better for females than it does for males. 


If you use the same observed variables for the two genders, yes. 


Dear Drs. Muthen, I'm trying to test a mediation model with latent variables in three religious groups in a large dataset (N = 10,000). I also want to control for the influence of another grouping variable (university), so I am using multigroup analysis together with TYPE=COMPLEX. My mediation model runs perfectly in the whole sample with the TYPE=COMPLEX statement. However, when I try to run the multigroup model I receive this warning: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 89. I do not know how to interpret this warning or how to solve this problem. Thanks for any advice! Kind regards 


Please send your output and license number to support@statmodel.com. 


Dear Dr. Muthen, Thanks for your quick reply. A colleague noticed in the output that one item has residual variances above 1. So, I will rerun the analyses without this item and see whether the model is identified. Thanks anyway! Kind regards, Jessie 


Dear Drs. Muthen, I did a single-group analysis as the first step of a multiple group analysis (structural equation model with 4 latent factors). Results indicated that in the second group one observed indicator of a latent factor had to be removed to get good model fit. This probably indicates that the measurement models of this factor differ, right? My problem is now: 1) How should I proceed if I want to test moderation in both groups? 2) May I just continue with the MGA and leave this observed indicator out? 3) ai8 is the observed indicator which has to be removed in the second group (male), and if I try something like: model: f1 by ai8 ai9 ai10 ai11 model male: ai8@0; I get the following message: WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IN GROUP M.M. IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE AI8. Thank you!! 


Sorry, I saw only now that ai8@0 fixes the variance at 0... that's not the solution to my problem... My question is how to analyse moderation effects if the factor structure differs between the two groups because of one observed indicator. 


Deleting an item is done by not including it on the USEV list. But if you have a model where in one group an item makes it not fit the data, then multiple-group invariance for the remaining items seems unlikely. If you want to model moderation of structural relations by group membership, you need measurement invariance. 


Professors Muthen, I am trying to conduct a multiple group (4 groups) SEM with seven latent variables and one single indicator. This model also initially had two higher-order factors. First, I conducted 4 single-group CFAs. The model fit was good for three of the four groups. There were two errors for the last group, one for an observed variable and the other for a second-order latent factor; both, I believe, were related to linear dependence. When I removed the observed variable, I received only one non-positive definite error. I removed the second-order factor and repeated the analysis in all four groups, and the model fit was acceptable for all. Can you please help me understand why this helped, since it's basically the same model? Also, in testing configural invariance, I have two questions about single indicators and second-order factors. Are single indicators as well as the first first-order factor omitted in the group-specific model statements? 


A 2nd-order factor model usually puts constraints on the covariance matrix of the 1st-order factors, so it isn't the same model as one without a 2nd-order structure. Regarding configural invariance, I don't think they should be. 


Hi, I need your help. I want to use logistic regression to model the survival of birds as a function of weight and sex. The probability of survival for females is given by pf(x) = exp(b0+b1x)/(1+exp(b0+b1x)) and the probability for males is given by pm(x) = pf(x+b2/b1). Can someone help me verify this? How do I verify it? 


Do males and females have the same slope for weight? Did you mean to write pm(x) = pf(x+b2/b1)? What does the right-hand side mean? 
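
One possible reading of the right-hand side, offered only as an assumption about what the poster intended: if both sexes share the slope b1 and sex adds a constant b2 on the logit scale, then shifting the female curve by b2/b1 weight units reproduces the male curve, because b0 + b1*(x + b2/b1) = b0 + b1*x + b2. The sketch below checks this numerically; the coefficient values are hypothetical.

```python
import math


def logistic(z):
    """Inverse logit: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: intercept, weight slope, and sex effect.
b0, b1, b2 = -2.0, 0.5, 0.8


def p_female(x):
    return logistic(b0 + b1 * x)


def p_male_shifted(x):
    # The poster's form: the female curve evaluated at a shifted weight.
    return p_female(x + b2 / b1)


def p_male_additive(x):
    # Equivalent form: same slope, sex entering as an additive logit term.
    return logistic(b0 + b1 * x + b2)


weights = [1.0, 4.0, 10.0]
max_gap = max(abs(p_male_shifted(x) - p_male_additive(x)) for x in weights)
print(max_gap)
```

The two forms agree only when the slope for weight is the same in both sexes, which is why the first question in the reply matters.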


Dear Dr. Muthen, I am conducting multigroup SEM for cross-cultural research for my dissertation. I tested my final model using Mplus and obtained fine fit indexes. Measurement fixed, structure fixed (strict invariance), using ONLY the correlation matrix and STD as the data set: CFI = 0.93, TLI = 0.90, SRMR = 0.09, SRMR = 0.09. However, I could not figure out how to compare factor means using this correlation matrix as the data set. Therefore, I tried adding means (means of the observed variables, Ys) to this data set. Syntax: type is correlation and std and means. And I got quite different results. All syntax is the same except for including means in the data set: CFI = 0.85, TLI = 0.81, RMSEA = 0.13, SRMR = 0.10. (1) Why is it different? And is it OK to use only the correlation matrix to conduct multigroup SEM? (2) When I used raw data (or correlations, means, and std), I found that one group has consistently higher observed means (Ys) than the other cultural group, so I think it is probably weak invariance. However, everything else is very similar (the structure is similar); only the observed means differ by group. Can I use multigroup SEM? Thank you so much in advance. 


Hello, thank you as always for your support. I have a question about multiple group comparison. I would like to test for the moderating effect of a drug crime (drug vs. non-drug offenders). I am especially interested in how a drug crime conditions the effect of being a racial minority on sentence length in courts. I wonder if it is OK for me to impose constraints on one or two related variables of interest and do the chi-square difference tests against the unconstrained model. Some of my old material told me that I have to do a structural invariance test (?) first, which is the chi-square difference test between the unconstrained model and the fully constrained model, and that only when there is no statistical difference in the chi-square test can I proceed to the path-by-path test using the chi-square difference test, just like the former approach. Which one is correct? If the latter approach is correct, then I wonder why we do the path-by-path test even though we do not find any difference when we constrain all the paths. I am confused. So, my question is: "Do we need to do a structural invariance test even though I am doing just a path analysis with only observed variables, not SEM?" Thank you very much in advance! 


HwaYoung: 1. The differences are due to adding means to the data. The model then constrains the intercepts to be equal across groups. You cannot use only the correlations. You must also use the standard deviations as you mentioned above. Mplus turns these into a covariance matrix. 2. The high observed variable means may result in high factor means. It is the intercepts for which you are testing measurement invariance. 
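
As a small illustration of the conversion mentioned in point 1, each covariance is the correlation multiplied by the two standard deviations, cov(i,j) = r(i,j) * sd(i) * sd(j). The sketch below uses made-up numbers and a hypothetical helper name; it only shows the arithmetic, not how Mplus implements it internally.

```python
def corr_to_cov(corr, sds):
    """Rescale a correlation matrix to a covariance matrix using the SDs."""
    n = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(n)] for i in range(n)]


# Hypothetical two-variable example.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
sds = [2.0, 3.0]

cov = corr_to_cov(corr, sds)
print(cov)  # [[4.0, 3.0], [3.0, 9.0]]
```

The diagonal holds the variances (the squared standard deviations), which is the information a correlation matrix alone discards.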


Byungbae: I think it would be fine to test only the coefficients for which you have a substantive hypothesis. You can do this using MODEL TEST. 


Dear Dr. Muthen, I really appreciate your comments. So, it means that it is OK to use the correlation matrix and std for multigroup SEM? What is the limitation when using the correlation matrix and std? I have one more question. When I used raw data, I got decent fit indexes for each cultural group when I conducted a CFA for each cultural group (measurement model). When I conducted multigroup SEM, the fit indexes were low, so I released some of the intercepts (observed means) for one group and then I got a good result. MODEL : F1 BY Y2* Y5 Y11 Y12; F2 BY Y4 y7 Y10; F3 BY Y3 Y6 Y9; F1@1; F2 on F1; F3 on F1; F2 with F3; Y17 on F1(4); Y17 on F2(5); Y17 on F3(6); Y7 with Y10; Y7 with Y3; Y6 with Y4; Y6 with Y3; Y10 with Y3; Model G: [Y7 Y9 Y12]; >CFI 0.933 TLI 0.911 RMSEA 0.088 SRMR 0.096 Is this partial measurement invariance? I can't compare factor means, right? Thank you so much for your help. 


I would not use only the correlations and standard deviations. I would use also the means which is the default with raw data. Please see multiple group analysis in the Topic 1 course handout on the website. It discusses testing for measurement invariance in addition to testing of factor means across groups. There is also a video that you can watch. 

peter pitt posted on Friday, July 01, 2011  1:26 pm



Dear professors, I have some questions with respect to the factor variances in multigroup EFA (ESEM). (a) Suppose that the variables are standardized per group and that I didn’t constrain the factor variances to be equal, for example to one (but instead I constrained some loadings to one to solve the identification problems), are these factor variances then subject to any constraint (e.g., the sum of factor variance of a factor in group A + factor variance of the same factor in group B = 1)? What would be the influence of standardizing the concatenated data instead of standardizing for each group separately? (b) Is it possible to find a solution with the same factor loadings, but with factors that have different factor variances in each group (and if so, what does this mean then)? Thank you very much! 


(a) You don't want to standardize variables in a multigroup analysis because then you cannot study group differences in means and variances. (b) Multigroup ESEM has the default of group-invariant loadings and intercepts and group-varying factor variances and means. The goal of multigroup analysis is to be able to study population (people) differences in factors when measurement (variable) parameters are the same. 


Hi, I would like to compare standardized paths in a multigroup analysis. My model is the following (three latent dependent variables and three exogenous manifest variables): SR BY y1-y5; SW BY y6-y10; SC BY y11-y15; SR ON A B C; SW ON A B C; SC ON A B C; A with B C; C with B; SR with SW SC; SC with SW; I have two groups and want to compare the standardized path from C to SC across these two groups. I don't know how to create the standardized coefficients in MODEL CONSTRAINT. A similar question was posted on Wednesday, August 04, 2010  11:34 am by Simon O. F. at http://www.statmodel.com/discussion/messages/11/16.html?1309783320 I think I can compare the two standardized paths with this equation: beta_CSC1 = beta_CSC2*(sqrt(sdC2)/sqrt(sdSC2))/(sqrt(sdC1)/sqrt(sdSC1)). But how can I define beta_CSC2 and beta_CSC1 as well as the variance of SC as NEW parameters in MODEL CONSTRAINT and test the difference using MODEL TEST? Thanks a lot in advance, Jan-Henning Ehm 


The standardized beta is beta*SD(x)/SD(y). Your SD(x) is the standard deviation of C and your SD(y) is the standard deviation of the SC dependent variable factor. By regular regression expectation algebra you compute SD(SC) as the sqrt of V(SC) = beta1^2*V(A) + beta2^2*V(B) + beta3^2*V(C) + 2*beta1*beta2*Cov(A,B) + 2*beta1*beta3*Cov(A,C) + 2*beta2*beta3*Cov(B,C) + resvar(SC), where resvar(SC) is the residual variance of SC that you get in the output. 
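
The computation described above can be sketched numerically. The example assumes the regression SC ON A B C, with squared coefficients on the variance terms as regression algebra requires; all parameter values and function names are hypothetical, and in practice the inputs would be one group's estimates from the output.

```python
import math


def model_implied_variance(betas, cov_x, resvar):
    """V(y) = sum_i sum_j beta_i * beta_j * Cov(x_i, x_j) + residual variance.

    The double sum expands to beta_i^2 * V(x_i) on the diagonal plus
    2 * beta_i * beta_j * Cov(x_i, x_j) for each pair of predictors.
    """
    k = len(betas)
    explained = sum(betas[i] * betas[j] * cov_x[i][j]
                    for i in range(k) for j in range(k))
    return explained + resvar


def standardized_beta(beta, var_x, var_y):
    """Standardized coefficient: beta * SD(x) / SD(y)."""
    return beta * math.sqrt(var_x) / math.sqrt(var_y)


# Hypothetical estimates for one group; predictor order is A, B, C.
betas = [0.2, 0.3, 0.5]
cov_x = [[1.0, 0.2, 0.1],
         [0.2, 1.0, 0.3],
         [0.1, 0.3, 4.0]]
resvar_sc = 0.5

var_sc = model_implied_variance(betas, cov_x, resvar_sc)
std_beta_c = standardized_beta(betas[2], cov_x[2][2], var_sc)
print(var_sc, std_beta_c)
```

Repeating the calculation with the second group's estimates gives the two standardized coefficients to be compared.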


Thank you Drs. Muthen & Muthen for taking the time to answer our questions. I was wondering if someone could please explain to me the difference between fitting a full SEM where we do not specify that there are two different groups, as opposed to a model where we specify GROUP IS but constrain parameters to be equal. As an example, for a study I am working on, I am performing two multiple group analyses. I first fit a full SEM. I then performed a Multiple Group Analysis comparing the parameters between two groups of teachers based on their teaching experience. The parameters of the constrained model differ from the parameters obtained in the full SEM. I then performed another Multiple Group Analysis comparing the parameters between two groups of teachers based on school level. I found that parameters of the constrained model also differed from the full SEM as well as from the constrained model of the first Multiple Group Analysis. Is this supposed to happen? If so, I also wonder why some journal articles I have read do not report the results of their constrained models. Thank you! 


Looking at the sample of males and females together is not the same as a multiple group analysis where the coefficients are held equal between males and females. The first analysis is a mixture. See the following paper which is available on the website for further information: Muthén, B. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54(4), 557-585. 
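
A toy least-squares analogue may make the distinction concrete. It assumes a single predictor, a slope held equal across groups, and intercepts that are allowed to differ by group; the data are invented and the helper names are hypothetical. In this example the within-group slope is 1 in both groups, yet ignoring group membership gives a much smaller pooled slope because the groups differ in their means.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_slope(xs, ys):
    """OLS slope when all observations are treated as one sample."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def common_within_slope(groups):
    """OLS slope held equal across groups, with group-specific intercepts."""
    sxy = sxx = 0.0
    for xs, ys in groups:
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Invented data: same slope in each group, different intercepts.
group1 = ([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
group2 = ([3.0, 4.0, 5.0], [0.0, 1.0, 2.0])

slope_pooled = pooled_slope(group1[0] + group2[0], group1[1] + group2[1])
slope_within = common_within_slope([group1, group2])
print(slope_pooled, slope_within)
```

The same logic explains why the constrained estimates change when the grouping variable changes: each grouping removes a different set of between-group differences.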


Fantastic. Thank you very much for this. 

Siran Zhan posted on Sunday, October 09, 2011  11:26 pm



Dear Dr. Muthen, I'm trying to establish measurement equivalence between 2 groups. The problem I'm facing is that one of the groups has no data at all on one of the manifest variables. Mplus did not run my script because of that and returned the error message "One or more variables in the data set have no nonmissing values". Is there a way you can suggest for me to get around this? Thank you very much in advance for your help! 


See the FAQ on the website about having a different number of variables in different groups. 

sam posted on Monday, October 10, 2011  2:44 pm



Hi, I'm interested in comparing whether the path coefficients are different across groups. My model consisted of 2 latent variables and 3 observed variables. I came across some articles stating that the comparison would be more meaningful if measurement invariance can be established before comparing the path coefficients. Is the measurement invariance necessary? If yes, how could I establish the measurement invariance where some of my variables are observed variables? Thank you very much. 


Measurement invariance applies to latent variables not observed variables. It is necessary to establish that the latent variables have the same meaning in different groups if comparisons are going to be made across groups. See the Topic 1 course handout on the website under Multiple Group Analysis. 


Dear Dr. Muthen, I am testing for invariance across two groups and I want to know if it is possible to constrain only the factor loadings across groups without constraining the error terms across the groups. For example, when I use Model yes: F1 by y1 (1); Model No: F1 by y1 (1); this tends to constrain both the error terms and the factor loadings across groups. Thanks 


I don't know why the error terms would be constrained with what you show. Please see the Topic 1 course handout under multiple group analysis for the inputs for testing measurement invariance. If these don't help, please send your output and license number to support@statmodel.com. 

ellen posted on Monday, November 07, 2011  11:37 pm



Dear Drs. Muthen, I have a question about how to test invariance of path coefficients for structural paths using Mplus. I am not testing measurement invariance. I am comparing SEM models for 3 groups, and some paths seem to be comparable across groups. I read something about examining the "completely standardized common metric solution" in LISREL. However, I use Mplus. What section of an Mplus output would suggest that significant group differences are present for some specific relationships (e.g., between A & B)? Here is what I read: "We used LISREL to examine the invariance of path coefficients for structural paths in the SEM model by conducting multiple-group comparison for boys and girls. To compare the two models, we conducted a model in which the relations among A, B, C, D variables were freely estimated and a model in which the relations were set to be equal for boys and girls. We then used the chi-square difference test to examine whether these models were equivalent. Results showed there was a significant chi-square difference... Examination of the completely standardized common metric solution suggested that significant group differences were present for the relationship between A and B ... To confirm this, we compared a model in which the relationships among A, B, C, D, E were set to be equal for boys and girls with a model in which A and B were freely estimated. There was a significant chi-square difference between the models..." 


They did a chi-square difference test where they estimated two models. One where regression coefficients were free across groups, for example, MODEL: y1 ON x1; y2 ON x2; and one where they were constrained to be equal across groups, for example, MODEL: y1 ON x1 (1); y2 ON x2 (2); Then they did a chi-square difference test as described on pages 434-435 of the Mplus User's Guide. 
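
The arithmetic of the difference test for ML estimation can be sketched as follows: subtract the chi-square and degrees of freedom of the free model from those of the constrained model and refer the difference to a chi-square distribution. The fit values below are hypothetical, the function names are illustrative, and the tail probability is computed with the standard library only. With MLR, MLM, or WLSMV this plain subtraction is not appropriate.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    t = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(t))   # closed form for df = 1
    else:
        a, q = 1.0, math.exp(-t)              # closed form for df = 2
    # Each step raises df by 2 via Q(a + 1, t) = Q(a, t) + t^a e^(-t) / Gamma(a + 1).
    for _ in range((df - 1) // 2):
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chisq_diff_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Return (chi-square difference, df difference, p-value)."""
    diff = chi2_constrained - chi2_free
    ddf = df_constrained - df_free
    return diff, ddf, chi2_sf(diff, ddf)


# Hypothetical fit statistics for the constrained and the free model.
diff, ddf, p = chisq_diff_test(58.3, 26, 52.1, 24)
print(round(diff, 2), ddf, round(p, 4))
```

A small p-value indicates that holding the coefficients equal across groups significantly worsens fit.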

ellen posted on Tuesday, November 08, 2011  10:26 am



Dr. Muthen, Thanks for your prompt response! I know how to conduct a difference test for two models, but my question is more about how to find a "justification" in an Mplus output for suspecting that some (but not all) path coefficients may be equivalent across groups. The article I described above (Nov. 7) uses the "completely standardized common metric solution" in LISREL to justify testing a model where only some paths were set to be equal while other paths were freely estimated across groups. I am wondering whether some metric solution in an Mplus output can provide justification for setting certain paths equal, rather than relying only on my subjective view. Please help! Thanks so much! 

ellen posted on Wednesday, November 09, 2011  1:42 pm



Dear Drs. Muthen, Could you respond to my question (posted above; 11/8), restated below? How can I find a "justification" in an Mplus output for suspecting that some (but not all) path coefficients may be equivalent across groups? Some researchers use the "completely standardized common metric solution" in LISREL to justify testing a model where only some paths were set to be equal while other paths were freely estimated across groups. I am wondering whether some metric solution in an Mplus output can provide justification for setting certain paths equal. 


I am unclear what "completely standardized common metric solution” means. If it means you are comparing standardized coefficients across groups, I would not recommend this. I would compare raw coefficients. You should have a theory about which coefficients you expect to be different across groups. If you are in an exploratory setting, I would hold all raw coefficients equal across groups and look at their modification indices. 

ellen posted on Wednesday, November 09, 2011  9:04 pm



Thanks! Could I ask 2 more questions? (Sorry, new to Mplus!) How do I "hold all raw coefficients equal across groups"? Is the following the right way to write the commands? Also, how do I interpret "M.I." and "E.P.C."? I read the User's Guide (pp. 646-647) but still don't understand it... ...... GROUPING = race ( 1 = Black 2 = Asian 3 = White); ANALYSIS: ESTIMATOR = MLR ; MODEL: A by A1 A2 A3 ; T by T1 T2 T3 ; O by O1 O2 O3 ; S by S1 S2 S3 ; T on A (1); O on A T (2); S on A T O (3); OUTPUT: sampstat; standardized sampstat; Modindices (0) ; 


T on A (1); O on A (4) T (2); S on A (5) T (6) O (3); Chapter 14 has a discussion of multiple group analysis. 


I would concentrate on modification indices. The value given is the decrease in chi-square if the equality is removed. The value 3.84 is the chi-square critical value at the .05 level for one degree of freedom. Freeing any equality with an MI over this value would improve fit significantly, meaning that the two coefficients are not equal across groups. 
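
As a quick check on where 3.84 comes from: a chi-square variate with one degree of freedom is a squared standard normal, so the .05 critical value is 1.96 squared. The sketch below converts modification indices into one-degree-of-freedom p-values; the parameter labels and MI values are hypothetical.

```python
import math

Z_CRIT = 1.959964            # two-sided .05 critical value of the standard normal
CHI2_CRIT_1DF = Z_CRIT ** 2  # about 3.84


def p_value_1df(chi2_value):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2_value / 2.0))


# Hypothetical modification indices for three equality constraints.
mod_indices = {"T ON A": 1.20, "O ON T": 6.75, "S ON O": 3.90}

# Keep only the constraints whose release would significantly improve fit.
flagged = {name: round(p_value_1df(mi), 4)
           for name, mi in mod_indices.items() if mi > CHI2_CRIT_1DF}
print(round(CHI2_CRIT_1DF, 2), flagged)
```

Because each MI is a separate one-degree-of-freedom test, scanning many of them at once inflates the chance of a false positive, which is one reason to start from a substantive hypothesis.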

ellen posted on Thursday, November 10, 2011  10:58 pm



Dr. Muthen, Thank you! This is helpful! May I ask a couple of follow-up questions? There was a path (e.g., A -> T) that was significant (p < .01) only in the Asian group, but not in the Black or White groups, when estimated freely. However, when it was fixed to be equal across groups, the M.I. was not greater than 3.84. Does that mean we can consider this specific coefficient equal across groups? This does not seem to make sense, because when estimated freely it was significant at p < .01 in one group, while in the other two groups it was not significant. How can this be explained? Also, could you tell me how to interpret an "E.P.C."? I read the User's Guide but am still confused... (Thanks SO MUCH! ...& sorry about the basic questions.) 


You are looking at two different types of tests. One coefficient can be significantly different from zero and the other not even though the two coefficients may not be significantly different from each other. EPC is the value the parameter would take if it is free. 

Jiyeon So posted on Tuesday, November 22, 2011  11:40 pm



I have a hypothesis that predicts: The model will receive stronger support from the sexually active group (group A) than from the sexually inactive group (group B). To test this hypothesis, I think I should compare model fit across the two groups. However, since the model is the same (and only the sample is different), the chi-square difference test does not apply here, since it is only for nested models. Is there some sort of significance test for comparing model fit across two groups? I understand this may not be a specific Mplus question but I'm using Mplus and find this board very helpful. Please advise me what to do! I would really appreciate it!! 


I can't think of any way to test that. 


Dr. Muthen, I am testing a model for measurement invariance by gender, and receive the following error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 43. When I run the analyses for the entire sample everything goes smoothly. The error only comes up when I attempt a multigroup analysis. My syntax is below, and parameter 43 corresponds to the Alpha value for zmomhrs in the female sample. Can you advise on what might be causing this error and how I might be able to work around it? Thanks so much! Variable: (variables removed) MISSING = .; GROUPING = childgender (1=male 2=female) Analysis: Model NOCOVARIANCES; Model: f1 by; f1 on zfamilyinc@1 zedusum; f1@0; f2 by zroleambig_r zdeclat_r; f3 by zchildh_r zmchildh_r zbmi_r; zmompa on f2; zmompa on zmomhrs; zmompa on f1; zmchildpa on zmompa (1); zmchildpa on f1; f3 on zmchildpa (2); f3 on f1; f2 on f1; zmomhrs on f1; f2 with zmomhrs; Output: stdyx tech1; 


Please send the full output and your license number to support@statmodel.com. 

Steven John posted on Wednesday, January 18, 2012  5:09 am



Hi, I'm currently doing a multigroup analysis (MGA) comparing the correlation between A and B for two primary school grades. The correlation between A and B is different, but significant, in the separate analyses I ran earlier for both grades. I now want to compare the correlations to see if the difference between grades is significant. To me it seems appropriate to run a totally relaxed MGA with both grades and then another where the correlation under investigation is relaxed. Thereafter I use the chi-square difference test for nested models to calculate whether the models differ? Am I correct? BR 


You can do this. Or you can do it in one step using MODEL TEST. See the user's guide for further information. 

Steven John posted on Thursday, January 19, 2012  1:03 am



Thanks! However, I got a message in the output and therefore calculated the TRd according to the formula on the Mplus website. (Probably because I ran the MLR estimator?) If I run the model totally constrained and thereafter relax the correlation under investigation, would this also be correct? This seems to be the common way of doing it? However, it seems a bit strange to impose equal factor loadings across grades instead of assuming that they theoretically measure the same construct (factor loadings vary across grades but follow the same pattern). The totally constrained model also fit the data poorly. Many thanks. 


Holding factor loadings and intercepts equal represents measurement invariance. You must first establish measurement invariance before coefficients related to latent variables can be compared across groups. See the Topic 1 course handout and video for a discussion of this topic. 


Hi! I am comparing nested models that were estimated using MLR. I applied the chi-square difference test formula presented on your website (http://www.statmodel.com/chidiff.shtml). I want to check that I am using the correct information from the output to do the computation. In the following formula... cd = (d0*c0 - d1*c1)/(d0 - d1) ...I took the degrees of freedom (d0 and d1) from the "Chi-Square Test of Model Fit" section of each output. Is that correct? (There is another section in the output called "Chi-Square Test of Model Fit for the Baseline Model" and I want to be sure which values to use.) Then, for the correction factors (c0 and c1), I used the following: in the "Loglikelihood" section, "H0 Scaling Correction Factor for MLR". Is that correct? Thank you very much. 


Yes, that is correct. You don't want the Baseline Model. Yes, you use the H0 scaling correction factor. 
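For readers who want to check the hand calculation, here is a minimal Python sketch of the scaled difference computation. The cd line is the formula quoted in the question; the TRd line, (T0*c0 - T1*c1)/cd, is written from memory of the web page the poster cites, so treat it as an assumption and verify it against that page. The function name and all input values are hypothetical.

```python
def scaled_chi2_difference(t0, d0, c0, t1, d1, c1):
    """Scaled chi-square difference for two nested models.

    t0, d0, c0: chi-square, degrees of freedom and scaling correction
                factor of the nested (more restrictive) model.
    t1, d1, c1: the same quantities for the less restrictive model.
    Returns (cd, trd, df_diff).
    """
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # difference-test scaling correction
    trd = (t0 * c0 - t1 * c1) / cd          # scaled difference statistic
    return cd, trd, d0 - d1

# Hypothetical output values from a constrained and an unconstrained run.
cd, trd, df_diff = scaled_chi2_difference(
    t0=120.5, d0=60, c0=1.20,
    t1=100.2, d1=56, c1=1.18,
)
print(f"cd = {cd:.3f}, TRd = {trd:.3f}, df = {df_diff}")
```

TRd is then referred to a chi-square distribution with d0 - d1 degrees of freedom.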


I am a novice user of multiple group analysis. I did all the analyses, but I do not know how to report and interpret the results for the manuscript. Which model (unconstrained, structural weights, structural covariances, or structural residuals) should I use for reporting? Thanks. 


Please see multiple group analysis in the Topic 1 course handout and video on the website. See also how results are reported in the journal you plan on submitting to. If this does not help, I suggest posting this on a general discussion forum like SEMNET. 


thank you for your attention 


Hello, I have a question about a negative residual variance. It is very small and not significant, so I want to fix it to zero. That is not a problem, but now I don't have the correct degrees of freedom. I cannot find any reference to this anywhere. Is there a way that I can make Mplus only estimate positive residual variances so that my degrees of freedom are correct? Thank you! Bellinda 


Use MODEL CONSTRAINT to constrain the parameter to be greater than zero. 


Hi, We are trying to run a multiple group version of a Cole & Maxwell Trait-State-Occasion model. When we run it in our entire sample, the model converges just fine. When we try to run a multiple group, metric invariant version of the model we get a message saying that standard errors could not be computed and the model may not be identified, and the problem appears to be with a parameter related to the mean structure. We believe this may be due to the fact that the occasion factors in the Cole & Maxwell model are actually the residual variances in the State factors (after accounting for the Trait factor) and therefore are not independent of the state and trait factors, but that Mplus is trying to estimate group differences on all three sets of factors (trait, state and occasion) as if they were independent parameters. We are not certain though. Does this make sense? If so, any thoughts on how we fix the identification problem in the multiple group model? 


Maybe the occasion factors cannot have free means. With free means, the intercepts need to be constrained across groups. You may need to ask the authors for help. 


How many groups can Mplus test per analysis in ESEM, e.g., 2, 3, or 4? What is the maximum? 


There is no limit except your computer's. I have done 34. 

ellen posted on Monday, September 10, 2012  6:58 pm



If I want to test whether two parameters are equal across three groups, is it accurate to write the Mplus syntax in the following way? (I know the overall multigroup chi-square difference test is significant across groups.) MODEL African: Sg ON De (p1); Rm WITH Ot (p2) ; MODEL Asian: Sg ON De (p3); Rm WITH Ot (p4); MODEL Hispanic: Sg ON De (p5); Rm WITH Ot (p6); MODEL TEST: p1 = p3; p1= p5; p2=p4; p2 = p6; Is this the correct way to test whether the parameters "Sg ON De" and "Rm WITH Ot" are equal across the 3 groups? 


MODEL TEST: 0 = p1 - p3; 0 = p1 - p5; 0 = p2 - p4; 0 = p2 - p6; I am removing your other post. It is not necessary to post the same question more than once. 
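To make concrete what a joint Wald test of such constraints computes, here is a Python sketch for one coefficient compared across three groups (two constraints, 2 degrees of freedom). It is not how Mplus is implemented; it assumes independent groups so that the estimates are uncorrelated, and the estimates, standard errors and function name are hypothetical.

```python
import math

def wald_equal_across_three_groups(b, se):
    """Joint Wald test of b[0] = b[1] and b[0] = b[2] (2 degrees of freedom).

    b:  the three group-specific estimates.
    se: their standard errors (groups assumed independent).
    Returns (wald_statistic, p_value).
    """
    v1, v2, v3 = (s * s for s in se)
    r1, r2 = b[0] - b[1], b[0] - b[2]        # the two tested differences
    # Covariance matrix of (r1, r2): both differences share b[0].
    a, off, d = v1 + v2, v1, v1 + v3
    det = a * d - off * off
    # Quadratic form r' * inverse(cov) * r for a 2 x 2 matrix.
    w = (d * r1 * r1 - 2.0 * off * r1 * r2 + a * r2 * r2) / det
    # Upper-tail chi-square probability with 2 df is exp(-w / 2).
    return w, math.exp(-w / 2.0)

w, p = wald_equal_across_three_groups(b=(0.5, 0.1, 0.1), se=(0.1, 0.1, 0.1))
print(f"Wald = {w:.3f}, df = 2, p = {p:.4f}")
```

Testing both labeled parameters at once, as in the MODEL TEST statement above, would add two more constraints and give 4 degrees of freedom.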


Hello, I am running a multiple group (male/female) SEM with categorical indicators. I freely estimated the latent means in both groups (by fixing one threshold in the indicators to 0), but I am not sure which metric these latent means now have and how to interpret them. Some means in both groups are negative (e.g., -0.25 vs. -0.39). But my categorical indicators only range from 1 to 4. It seems to me that the latent means are in some way centered at 0 and what I get are deviations from zero? But I don't understand what the reference (0) is here. The whole sample, or the female and male groups? And do you suggest reporting the standardized or the unstandardized means? Thanks, Sofie 


There is no gain to freeing the factor mean and fixing one threshold. Using the standard approach, the factor mean is zero in the reference group and other group means are deviations from zero. When you fix one threshold to zero, the factor mean is in the metric of the threshold. Thresholds are in the metric of z-scores, not of the categories of the variable. 
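One way to read "thresholds are in the metric of z-scores" is to convert them into category proportions. The Python sketch below does this under stated assumptions: a probit link, a latent response variable with unit variance, and a simple mean shift of that latent response. The threshold values, the `response_mean` argument and the function names are all hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probabilities(thresholds, response_mean=0.0):
    """Category probabilities for an ordered item given z-score thresholds.

    Assumes the latent response variable is normal with variance 1 and
    mean `response_mean`, so P(y* <= tau) = Phi(tau - response_mean).
    """
    cuts = [normal_cdf(t - response_mean) for t in sorted(thresholds)]
    bounds = [0.0] + cuts + [1.0]
    return [upper - lower for lower, upper in zip(bounds[:-1], bounds[1:])]

taus = [-1.0, 0.0, 1.2]          # hypothetical thresholds, 4-category item
for mean in (0.0, -0.25):        # reference group vs. a lower-mean group
    probs = category_probabilities(taus, mean)
    print(mean, [round(p, 3) for p in probs])
```

A negative mean simply shifts probability toward the lower categories; it is a deviation on the z-score scale, not a value on the 1 to 4 response scale.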

Herb Marsh posted on Monday, September 24, 2012  3:15 am



I have a large data set with 26 groups (13 countries x 2 age cohorts) with 3000+ cases for each group. I began by showing reasonable invariance of factor loadings across the 26 groups. In the multigroup SEM I have several key path coefficients. I would like to do something like an ANOVA to determine how much of the differences in a path coefficient across the 26 groups can be explained by country, age cohort, and their interaction. Can I do this with either MODEL TEST or MODEL CONSTRAINT? I did something along these lines previously when there were 4 groups (with a 2 x 2 design) with MODEL CONSTRAINT, where the main and interaction effects were df=1 contrasts. 


You can do it with MODEL CONSTRAINT. You have 26 coefficients, and you can write the two-way ANOVA sum of squares decomposition in MODEL CONSTRAINT. 

ellen posted on Tuesday, September 25, 2012  12:39 am



Hi, I am running a multigroup SEM (African, Asian, Latino). The structural parameters initially showed very different results across groups. For example, one structural parameter was -.22** for the Latino group, .11 (not significant) for the African group, and -.12 (not significant) for the Asian group. However, when I used MODEL TEST to examine parameter equalities, the results showed the parameter was NOT significantly different across groups. This is puzzling to me because the parameter was initially significant only for Latinos (-.22**), was not significant and in the opposite direction (positive .11) for the African group, and was not significant for Asians (-.12). How could the three parameters not differ significantly? Does it mean they are considered statistically equivalent? If they are, do I have to constrain the parameters to be equal across groups and claim there is structural invariance? When I constrained the parameter to be equal across the three groups, I got a result showing that it was significant in ALL groups. How do I interpret the results here? I am just confused about why three parameters that seemed so different initially (e.g., in opposite directions and only one statistically significant) would turn out to be statistically equivalent. 


What happens when you test -.22 versus .11? 

ellen posted on Tuesday, September 25, 2012  9:11 am



When I tested -.22 versus .11, it showed no statistical difference as well. It is very puzzling to me... 


Please send the relevant outputs and your license number to support@statmodel.com. 

Herb Marsh posted on Friday, September 28, 2012  10:39 pm



Tihomir: Thank you for your assistance. However, I have not been able to work out how to follow your suggestion. In MODEL CONSTRAINT I: 1. computed age cohort differences for each of the 13 countries, and then took deviations of these from the mean cohort difference over all countries. I then used MODEL TEST to test whether the 13-1 country deviations were simultaneously equal to zero. I guess that this is a test of the country-by-cohort interaction. 2. I computed country means (averaged across the two cohorts) for each country, and then took deviations of these from the grand mean. However, I could not use MODEL TEST to test whether these were simultaneously equal to zero without a separate analysis. As I have a LOT of coefficients to test, this would require 1000s of lines of code and many separate analyses. More importantly, these did not really give me the two-way ANOVA sum of squares decomposition that I wanted. Obviously I have missed something. Can you give me a bit more guidance about how to translate the 26 (13 countries x 2 age cohorts) coefficients into ANOVA-style SS? HERB 


Here is sample code using 5 countries x 2 age cohorts. It gives the decomposition of the sum of squares; see page 83 in http://www.stat.ufl.edu/~dksparks/sta3024/chapter6.pdf
model constraint:
new(b_g1-b_g5);
do(1,5) b_g#=(b1g#+b2g#)/2;
new(b1g_ b2g_);
do(1,2) b#g_=(b#g1+b#g2+b#g3+b#g4+b#g5)/5;
new(b_g_);
b_g_=(b1g_+b2g_)/2;
new(ss1 ss2 ss3);
ss1=5*((b1g_-b_g_)**2+(b2g_-b_g_)**2);
ss2=2*((b_g1-b_g_)**2+(b_g2-b_g_)**2+(b_g3-b_g_)**2+(b_g4-b_g_)**2+(b_g5-b_g_)**2);
ss3=(b1g1+b_g_-b1g_-b_g1)**2+
(b1g2+b_g_-b1g_-b_g2)**2+
(b1g3+b_g_-b1g_-b_g3)**2+
(b1g4+b_g_-b1g_-b_g4)**2+
(b1g5+b_g_-b1g_-b_g5)**2+
(b2g1+b_g_-b2g_-b_g1)**2+
(b2g2+b_g_-b2g_-b_g2)**2+
(b2g3+b_g_-b2g_-b_g3)**2+
(b2g4+b_g_-b2g_-b_g4)**2+
(b2g5+b_g_-b2g_-b_g5)**2; 


You will also need to label the parameters b1g1, b1g2, etc. 
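The same sum-of-squares decomposition can be checked outside Mplus. The Python sketch below mirrors the 2 cohorts x 5 countries layout with one coefficient per cell; the coefficient values and the function name are hypothetical, and the code is only an arithmetic illustration, not a replacement for MODEL CONSTRAINT (it gives no standard errors).

```python
def two_way_ss(b):
    """Two-way sum-of-squares decomposition with one value per cell.

    b[i][j] is the coefficient for cohort i and country j.
    Returns (ss_rows, ss_cols, ss_interaction, ss_total).
    """
    n_rows, n_cols = len(b), len(b[0])
    row_means = [sum(row) / n_cols for row in b]
    col_means = [sum(b[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_inter = sum((b[i][j] - row_means[i] - col_means[j] + grand) ** 2
                   for i in range(n_rows) for j in range(n_cols))
    ss_total = sum((b[i][j] - grand) ** 2
                   for i in range(n_rows) for j in range(n_cols))
    return ss_rows, ss_cols, ss_inter, ss_total

# Hypothetical path coefficients: 2 age cohorts (rows) x 5 countries (columns).
coefs = [
    [0.30, 0.25, 0.40, 0.35, 0.20],
    [0.22, 0.28, 0.31, 0.30, 0.15],
]
ss1, ss2, ss3, ss_tot = two_way_ss(coefs)
print(f"cohort SS = {ss1:.5f}, country SS = {ss2:.5f}, "
      f"interaction SS = {ss3:.5f}, total SS = {ss_tot:.5f}")
```

With one value per cell the three components add up to the total, so there is no within-cell term left over.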

Herb Marsh posted on Monday, October 01, 2012  5:19 pm



Tihomir: Sorry for being so dense and not thinking through what I want more carefully. Yes, your suggestion gives me the SS decomposition, like a two-way ANOVA with one case per cell so that there is no within-cell variation. This is what I asked for. However, what I really want (with hindsight) is to be able to say that the variation explained by each effect is trivial, small, large, etc. To do this (hazardous though it is) I need some measure of SSerror or SStotal. I cannot do this with a single value for each cell. However, what I do have is a standard error for each of the cells and the number of cases in each cell. Can I use that to construct SSerror? Naively, I am thinking I can use the SEs to create an SD (mult by N) and then compute a weighted average of these. I doubt I could use this to construct a legitimate F-test, but it might suffice for my descriptive purposes. I would value your thoughts. HERB 


Herb, two thoughts from me. 1. You can get an SE for any parameter you can construct in MODEL CONSTRAINT. 2. Take a look at this BSEM design, page 130: https://www.statmodel.com/download/handouts/MuthenV7Part1.pdf I think it is pretty impressive and should catch on in other places such as ML estimation... but of course you are already on the fringes of BSEM. Tihomir 


I am running a path model with 5 continuous latent variables. This works well and now I am interested in testing for differences between groups (grouping variable: dichotomous 1,2). I already did multigroup analysis before but in this case I got the following warning: *** WARNING Data set contains unknown or missing values for GROUPING, PATTERN, COHORT, CLUSTER and/or STRATIFICATION variables. Number of cases with unknown or missing values: 503 I already rechecked the dataset but there are no missing values. I also tried to start with testing for configural and metric invariance and got the same warning. What am I doing wrong? Thank you! Input: VARIABLE: Names are [...] Usevariables are PrAttA1 PrAttA2 PrAttA3 PrAttB1 PrAttB2 PrAttB3 PoAttA1 PoAttA2 PoAttA3 PoAttB1 PoAttB2 PoAttB3 AttCo_1 AttCo_2 AttCo_3 FiltPF; Grouping is FiltPF (1= low 2= high); MODEL: PrAttA by PrAttA1 PrAttA2 PrAttA3; PrAttB by PrAttB1 PrAttB2 PrAttB3; AttCo by AttCo_1 AttCo_2 AttCo_3; PoAttA by PoAttA1 PoAttA2 PoAttA3; PoAttB by PoAttB1 PoAttB2 PoAttB3; AttCo on PrAttA PrAttB; PoAttA on PrAttA AttCo PoAttB; PoAttB on PrAttB AttCo PoAttA; PoAttA with PoAttB; PrAttA with PrAttB; Output: stdyx; 


Please send the output, data set, and your license number to support@statmodel.com. 

Thomas Eagle posted on Saturday, November 03, 2012  4:12 pm



I am having a problem setting up a multigroup analysis where I have two groups. One group answered every question. The other group skipped all the items of one complete factor plus three additional variables. I used the example in the post dated April 29, 2004. I still get an error message. Below is my code. What am I doing wrong? GROUPING IS teen (1 = NonTeen 2 = Teen); MODEL: BP by nq24_1-nq24_15; PV by nq24_16-nq24_20; HB by nq24_21-nq24_35; NAT by nq24_36 nq24_40 q24_41; Q by nq24_42-nq24_44; T by nq24_37-nq24_39 nq24_45-nq24_50; SP_EX by nq24_51-nq24_60; SOC_ENV by nq24_61-nq24_64; MODEL Teen: BP by nq24_1-nq24_5 nq24_6@0 nq24_7-nq24_15; PV by nq24_16@0 nq24_17@0 nq24_18@0 nq24_19@0 nq24_20@0; HB by nq24_21-nq24_30 nq24_31@0 nq24_32@0 nq24_33 nq24_34 nq24_35@0; NAT by nq24_36 nq24_40 nq24_41; Q by nq24_42-nq24_44; T by nq24_37-nq24_39 nq24_45-nq24_50; SP_EX by nq24_51-nq24_60; SOC_ENV by nq24_61-nq24_64; 

Sunny Duerr posted on Sunday, November 04, 2012  6:22 am



Hello, I have a multiple-group latent variable model and I would like to verify that I am interpreting the output correctly. My analysis has one dependent variable with continuous indicators, which is regressed on each of three latent independent variables with ordinal indicators. Here is my question: Does the regression coefficient in the output for each group represent the relationship between the independent and dependent variables for only that group, or is the regression coefficient for all groups beyond the first representing a degree of difference between the first group and another group? As an example, if I have the following regression coefficients: Group 1: 0.895 Group 2: -0.105 Group 3: 0.063 Group 4: 0.102 would I interpret this as the variables having a stronger relationship for Group 1 than for the other groups (0.895 compared with absolute values smaller than 0.2), or as the relationship being relatively strong for all groups, with the regression coefficient ranging between 0.790 and 0.997 depending on group membership? Thanks in advance for any insight or advice you have! 


Thomas: Please send your output and license number to support@statmodel.com. 


Sunny: The results are for each group. 

Thomas Eagle posted on Thursday, November 08, 2012  10:38 am



Hi Linda, I am back. I tried fixing the data that are missing by design in one group to zero, as you recommended. It does not converge. Here is the essence of my code: DEFINE: IF (teen eq 2) THEN NQ24_6 = 0; IF (teen eq 2) THEN NQ24_16 = 0; IF (teen eq 2) THEN NQ24_17 = 0; ... etc... USEVARIABLES nq24_1-nq24_64; MISSING = .; GROUPING IS teen (1 = NonTeen 2 = Teen); ANALYSIS: COVERAGE = 0.0; MODEL: BP by nq24_1-nq24_15; PV by nq24_16-nq24_20; HB by nq24_21-nq24_35; NAT by nq24_36 nq24_40 nq24_41; Q by nq24_42-nq24_44; T by nq24_37-nq24_39 nq24_45-nq24_50; SP_EX by nq24_51-nq24_60; SOC_ENV by nq24_61-nq24_64; Is there a fix I can try? Tom 


Please send your output and license number to support@statmodel.com. 


Hello, I want to test whether my model differs by boys and girls by using a multigroup model where all parameters are equal and then another where all parameters are free. However, mplus won’t constrain one of my variables to be equal across groups. This variable is a dummy coded variable. Can you please help me with this? Thank you! Danyel 


Please send the output and your license number to support@statmodel.com. 


Hi, I am new to Mplus and SEM so I apologize in advance if you have answered this question elsewhere. I want to do a multiple group comparison by sex in which the first model is free across all parameters and the other is equal across all parameters. The model free across parameters seems to be working: model: PTSD by rx* an hyp; PTSD@1; large on PTSD; sumcomp on PTSD; locw2 on PTSD; contwt on large sumcomp locw2; contwt on contbmi; When I run the equal parameter model (below) the tau, theta, alpha, and psi matrices in the Tech1 output are still being estimated for the female model. How do I equate these parameters? PTSD by rx* an hyp; PTSD@1; large on PTSD (1); sumcomp on PTSD (2); locw2 on PTSD (3); contwt on large (4); contwt on sumcomp (5); contwt on locw2 (6); contwt on contbmi (7); Model Female: large on PTSD (1); sumcomp on PTSD (2); locw2 on PTSD (3); contwt on large (4); contwt on sumcomp (5); contwt on locw2 (6); contwt on contbmi (7); Thank you! 


WITH statements are used to specify parameters in Theta and Psi. Bracket statements are used to specify parameters in Tau and Alpha. y1 WITH y2; [y1 y2]; 


Hi, I am testing a multilevel mediation model. Fit of the model increases by adding an insignificant path. Which model should be selected: lesser fit model with all significant paths or better fit model with insignificant path included? How can including insignificant path increase fit of the model?? 


This is a matter of choosing one of two chi-square tests: Wald or likelihood-ratio. They are different but asymptotically the same. Wald is the same as the z test you see when judging the significance of a path, and LR is the chi-square test of model fit. I would add the path if there was a theoretical reason to consider it. The fact that it is then insignificant is a finding of subject-matter interest. 

Jo Brown posted on Thursday, December 27, 2012  12:13 pm



Dear Drs, I am running a multiple group analysis to explore mediation. As I am using imputed data, I need to specify the direct and indirect effects using the MODEL CONSTRAINT option. However, when I do so I receive only one set of direct and indirect effects if I simply specify: model: Y on M (p1); Y on X (c1); M on X (m1); MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1; Should I repeat the same lines after this, as in: model male: Y on M (p1); Y on X (c1); M on X (m1); MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1; model female: Y on M (p1); Y on X (c1); M on X (m1); MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1; to obtain ind and dir for boys and girls separately? I'd be grateful if you could advise me on the best way to proceed. Many thanks 


You need to use different labels for male and female and specify an indirect effect for each using these labels. 

Jo Brown posted on Thursday, December 27, 2012  5:47 pm



Thanks Linda, I am sorry but do you mean something like this? model: Y on M (p1); Y on X (c1); M on X (m1); MODEL CONSTRAINT: new(ind dir); indF = p1*m1; dirF = c1; model male: Y on M (p2); Y on X (c2); M on X (m2); MODEL CONSTRAINT: new(indM dirM); indF = p2*m2; dirF = c2; model female: Y on M (p3); Y on X (c3); M on X (m3); MODEL CONSTRAINT: new(indF dirF); indF = p3*m3; dirF = c3; I have never done this before so I am really unsure on the best way... Thanks again 


Yes, but have only one MODEL CONSTRAINT which is not interspersed in the MODEL command. Put MODEL CONSTRAINT either before or after the MODEL command not in the MODEL command. And don't use the same names for the direct and indirect effects. model: Y on M (p1); Y on X (c1); M on X (m1); model male: Y on M (p2); Y on X (c2); M on X (m2); model female: Y on M (p3); Y on X (c3); M on X (m3); MODEL CONSTRAINT: new(ind dir indm dirm indf dirf); ind = p1*m1; dir = c1; indm = p2*m2; dirm = c2; indF = p3*m3; dirF = c3; 

Jo Brown posted on Thursday, December 27, 2012  11:42 pm



Thank you Linda! 


I have, perhaps, a simple question. I am running a two-group MLR model with two latent variable predictors and one latent dependent variable. I would like to graph what is effectively an interaction, the difference in the coefficients between groups, in an Aiken and West style plot (e.g., -1 SD, 0, +1 SD). The output gives me the slopes for each predictor for each group; however, to properly graph the difference in the slopes I need the intercept. Can I use the group-specific intercept for the DV that is printed in the output as my anchor for graphing the slopes? I noticed that one intercept seems to be fixed at zero while the other is freely estimated. 


The answer to your question is yes. You can also try to do your full plot for the range [-1 SD, +1 SD] using the Version 7 "LOOP" plot. See Part 1 of the handouts and videos from the Utrecht course in August on our web site, or see the Version 7 UG ex 3.18 and modify it for a two-group analysis. 
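For a by-hand version of such a plot, the points to graph are just intercept plus slope times the predictor value. The Python sketch below computes them at -1 SD, the mean and +1 SD for each group; the intercepts, slopes, predictor means and SDs are hypothetical stand-ins for values read from the group-specific output, and any other predictor is assumed to be held at zero.

```python
def simple_slope_points(intercept, slope, x_mean, x_sd):
    """Predicted DV values at -1 SD, the mean, and +1 SD of the predictor."""
    return [(k, intercept + slope * (x_mean + k * x_sd)) for k in (-1, 0, 1)]

# Hypothetical group-specific values (not taken from any real output).
groups = {
    "group 1": dict(intercept=0.00, slope=0.40, x_mean=0.0, x_sd=1.0),
    "group 2": dict(intercept=0.35, slope=0.10, x_mean=0.2, x_sd=0.9),
}

for name, params in groups.items():
    for k, y_hat in simple_slope_points(**params):
        print(f"{name}: predictor at {k:+d} SD -> predicted DV = {y_hat:.3f}")
```

Plotting the three points per group and joining them gives one line per group, with the gap in slopes showing the group difference.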


Hello, I am using type=imputation, estimator = MLR, and running a multiple group analysis comparing two racial groups. Can I conduct a chi-square difference test between the unconstrained and constrained models? Will this give valid results in terms of potential moderation? 


Difference testing has not been developed for multiple imputation. You can make comparisons using a Wald test using MODEL TEST. 


thank you for your response. can you point me to an example of MODEL TEST used in multiple group analysis? 


I don't have such an example. Just label the parameters using the group-specific MODEL commands and use the labels in MODEL TEST. 


Hello, I have tested my model for three socioeconomic status groups and built a separate file for each. For lower SES, four paths are non-significant. When I place constraints on them, the chi-square value increases; the model fit indices also increase, but no great effect is observed. When I delete all those paths, I get good model fit. Kindly advise: should I delete all the non-significant paths (improved model fit) or place constraints on them (which gives only marginal model fit)? 


This question is more general and basic and is therefore more suitable for SEMNET. 


Dear Drs. Muthen and Muthen, I am conducting a multiplegroup analysis. I have 6 categorical indicators loading onto 3 factors (2 indicators on each). I'm using CLASSES instead of GROUPING (and have specified 8 classes), TYPE = MIXTURE, MLR estimator, and ALGORITHM = INTEGRATION. I am testing configural invariance first (i.e., covariance invariance), and then measurement invariance (i.e., factor loadings, thresholds/factor means). Since I'm using MLR estimator, I can't test the invariance of the residual variances, but that is fine with me. And I believe that when thresholds are free, factor means have to be fixed at 0, and vice versa. However, I'm having trouble getting some of my models to converge. I've set up 16 models to compare (by combining free or invariant model specifications for each of the 4 parameters: factor correlations, factor variances, factor loadings, thresholds/factor means). In my model, I'm freeing the loading of the first indicator on each factor. In models where factor variances are meant to vary across groups, I've fixed variances in group 1 to 1 and allowed variances in other groups to vary freely. My question is: Do you have any suggestions about why some models are not converging? Are there model combinations (out of my 16 combinations above) that are just not going to be identified? Thank you very much for your help! 


I hear nothing wrong in what you are saying. Models where you don't set the metric in the loadings and instead fix a factor variance at 1 in one group and have them free in other groups need to rely on holding the loadings equal across groups for identification. The way to figure out the source of the nonidentification is to check the parameter number in the error message against Tech1 to see which parameter that is. 


Thank you for your help, Dr. Bengt Muthen! I do have a follow-up question. In one of my models, I made factor correlations invariant, factor loadings free, factor variances free (with variances in the first group fixed to 1), factor means free, and thresholds invariant across groups. This model converged successfully. If I need to rely on holding loadings equal across groups for identification, why might this model have successfully converged? Moreover, I have models in which that requirement (if variances are free, loadings must be equal) is satisfied that did not converge. For example, I have a model in which factor correlations are invariant, factor loadings are invariant, factor variances are free (with variances in the first group fixed to 1), factor means are free, and thresholds are invariant. Allowing for 2000 iterations, the model still did not terminate normally. The message is "THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD. THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 7 IS 0.73857449D-02." Again, thank you very much for your help! It's invaluable. 


Factor loadings not held equal across groups makes it meaningless to compare factor variances across groups. Generally speaking, this model is not identified. Nonidentified models can converge. Why your model converged I can only tell by looking at your output which you can send to support. For the question in your second paragraph I also would have to see your output to be able to tell. Note also that you talk about invariant factor correlations. I would think you mean factor covariances since the factor variances are not in all your models it sounds like. 

Jenny L. posted on Tuesday, April 23, 2013  9:43 am



Dear Drs. Muthen and Muthen, I'm doing a multigroup analysis on two groups. There are 5 exogenous variables and I would like to correlate them both within and across groups. I know that within-group correlation is a default, but I'm not sure about the code for cross-group correlation (i.e., f1 of Group 1 is correlated with f1 of Group 2) and I can't seem to find it in the user's guide. Could you please give me some clue? Thank you in advance for your help. 


You can't correlate variables across groups. 

Jenny L. posted on Tuesday, April 23, 2013  10:27 am



Thank you for your reply, Dr. Muthen. What I have is actually a longitudinal data set with 2 time points. I was trying to test whether the associations among 8 variables (5 exogenous, 2 mediators, 1 outcome) would vary across time. I thought I could treat them as two different groups but it violated the independence assumption of multigroup analyses. What analysis would you suggest? Thank you again for your advice. 


You can do a single-group analysis where you relate the variables across the two time points. 


When I conduct multigroup analysis, say, for males and females, I understand an equality constraint should be used to directly test whether a coefficient of interest is statistically different between the two groups, using the chi-square difference. Sometimes, however, I have a situation where a significant chi-square difference is found (i.e., a statistically significant difference between the male and female coefficients) when both coefficients are statistically NOT significant. In such a case, should I report that the coefficient is different between males and females (based on the chi-square difference test) or not (because both coefficients are not significant, that is, not different from zero, and thus comparing two non-significant coefficients is pointless)? 


I can't see any value of reporting this. 


Thanks. I'd like to ask a follow-up question. What if I found a structural coefficient to be significant in one group but not significant in the other when its chi-square test showed a non-significant difference? Should I report a significant difference between the two groups (based on the significant vs. non-significant coefficients) or not (because the test showed a non-significant chi-square difference with the equality constraint)? This is not a hypothetical question; I often have such cases. Thanks in advance. 


This is really not related to Mplus. You can probably get a more thorough response by posting this on a general discussion forum like SEMNET. 

Claire posted on Wednesday, May 29, 2013  2:11 am



I have a question (probably a stupid one!) about doing path analysis, which I was hoping you might be able to answer. I’m thinking about running a path analysis (perhaps going onto a SEM after) but want to compare path coefficients using the same model between groups (in my case countries). Could this be done by just running separate path models for each country and comparing the coefficients? Or do you have to use multigroup path analysis? What is the difference between the two methods? Do you have any detailed examples (including syntax) of multigroup path analysis if this is what I should be doing or can you point me in the direction of materials that explain the difference between the two? Many thanks. 


You should use multiple group analysis so that the testing can be done by the program, using either chi-square difference testing or the Wald test via MODEL TEST. If you analyze the groups separately, you would need to do the same type of testing by hand, which could be difficult.
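A minimal sketch of the kind of input this reply points to, for a two-group path model with a Wald test of one path. The file name (mydata.dat), the variable names (y, m, x, country) and the group labels (g1, g2) are hypothetical and do not come from the thread.

```mplus
DATA:       FILE = mydata.dat;            ! hypothetical file name
VARIABLE:   NAMES = y m x country;
            USEVARIABLES = y m x;
            GROUPING = country (1 = g1 2 = g2);
MODEL:      m ON x;                       ! overall model, applies to both groups
            y ON m x;
MODEL g1:   y ON m (b1);                  ! label the path of interest in group 1
MODEL g2:   y ON m (b2);                  ! and in group 2
MODEL TEST: 0 = b1 - b2;                  ! Wald test of equal paths across groups
```

Running each country separately would give the same two estimates, but no single test of b1 against b2; that is the part the multiple group setup adds.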


Hello Dr. Muthen. I've been struggling with how best to approach multiple group factor analysis when there are many groups (my "group" is typically "country" and I usually have about 15-20 groups). I have been working through the paper "General random effect latent variable modeling: Random subjects, items, contexts, and parameters", but I've been worried that the random effect approach is going to "force" (for lack of a better word) invariant loadings to be noninvariant, since a random effect is estimated for each group whether or not it is "needed", and that this could bias the structural (latent mean) parameter estimates. I'm just wondering if I am completely off the mark. Thank you for your time.


See the new ALIGNMENT option in the Version 7.1 Language Addendum on the website with the user's guide. See also Web Note 18 and Bengt's UCONN Keynote address which discusses random versus fixed factor loadings. 

Tait Medina posted on Wednesday, June 19, 2013  8:38 am



Thank you for these resources! The ALIGNMENT option is VERY interesting. I am looking forward to giving it a go. 


Dear Professors, a colleague of mine is using Mplus to examine how associations among constructs differ in two contexts. Rather than constraining one path at a time and then comparing the chi-square difference between the constrained and unconstrained models to see if it's significant, he has used pairwise t-tests with pooled standard errors. He wrote that he did this because he used the mean-adjusted maximum likelihood method in Mplus and that the chi-square values from this method cannot be used for chi-square difference tests. I have not used Mplus before and would like to confirm whether this is a good handling of the issue. Could you please let me know? I looked for other posts on this issue before posting this, but couldn't find anything. Sorry for taking up your time, but your advice would be really helpful. Andrew.


One can do difference testing using MLM. It requires using a scaling correction factor. I think that would find the same results as what your colleague did as long as the values from TECH3 are used in the computations. 


Many thanks Linda! 


Greetings, As a follow-up on my question posted here on January 25, 2012, I would like to know the purpose of the "Scaling Correction Factor for MLR" value under the section named "Chi-Square Test of Model Fit". Because your answer to my previous post says that I need to use the "H0 Scaling Correction Factor for MLR" found under the "Loglikelihood" section, I'm wondering why I also have this other correction factor available. For your information, I am using the correction factors in the following formula: cd = (d0*c0 - d1*c1)/(d0 - d1). Thanks for your assistance.


For difference testing you need the scaling correction factor, which is related to the degree of nonnormality. You can do difference testing using either chi-square values or loglikelihood values. You would use the scaling correction factor that belongs to the test statistic you decide to use.
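For reference, the two versions of the scaled difference test can be written out as follows. The cd formula is the one quoted in the question; the remaining notation is my own labeling and should be checked against the difference testing page on the Mplus website. Subscript 0 denotes the nested (more restricted) model and subscript 1 the comparison model, T is the chi-square value, d the degrees of freedom, L the loglikelihood, p the number of free parameters, and c the scaling correction factor printed next to the statistic being used.

```latex
% Chi-square version (scaling correction factors from the chi-square section)
\[
c_d = \frac{d_0 c_0 - d_1 c_1}{d_0 - d_1},
\qquad
TR_d = \frac{T_0 c_0 - T_1 c_1}{c_d}
\]

% Loglikelihood version (H0 scaling correction factors from the loglikelihood section)
\[
c_d = \frac{p_0 c_0 - p_1 c_1}{p_0 - p_1},
\qquad
TR_d = \frac{-2\,(L_0 - L_1)}{c_d}
\]
```

In both cases the resulting value is referred to a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters between the two models.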

Hannah Lee posted on Friday, September 06, 2013  9:05 am



Hi, I am trying to conduct a multigroup analysis (4 groups). It seems I can only compare two paths at a time with MODEL TEST. Here is my input:
usevariables = REOadd36 COMP1-COMP5 LENG PERC DREadd36 DPUadd36;
Grouping = RCR4split (0=LOBC 1=LSE 2=HSE 3=HOBC);
ANALYSIS: ESTIMATOR=MLMV;
MODEL: comp BY COMP1-COMP5;
comp ON REOadd36 perc leng DREadd36 DPUadd36;
MODEL HSE: comp BY COMP1-COMP5;
comp ON REOadd36 (HSEb1)
perc leng DREadd36 DPUadd36;
MODEL HOBC: comp BY COMP1-COMP5;
comp ON REOadd36 (HOBCb1)
perc leng DREadd36 DPUadd36;
MODEL TEST: HSEb1=HOBCb1;
OUTPUT: TECH1 STDYX;
Although I get the model estimates, I get the following message: THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 49. THE CONDITION NUMBER IS 0.317D-19. I am not sure where to look for "parameter 49." Could this have to do with the individual group sample sizes?


You find the parameters and their numbers in TECH1. You should not mention the first factor indicator, comp1, in the group-specific MODEL commands. When you do this, its loading is free instead of fixed at one, making the model not identified.
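A sketch of how the MODEL part of that input might look once the BY statements are dropped from the group-specific commands. It reuses the variable and group names from the question; it is an illustration of the advice, not a tested input.

```mplus
MODEL:      comp BY COMP1-COMP5;          ! measurement model stated once, in the overall model
            comp ON REOadd36 perc leng DREadd36 DPUadd36;
MODEL HSE:  comp ON REOadd36 (HSEb1);     ! only the labeled path is mentioned per group
MODEL HOBC: comp ON REOadd36 (HOBCb1);
MODEL TEST: HSEb1 = HOBCb1;               ! Wald test that the two paths are equal
OUTPUT:     TECH1 STDYX;
```

MODEL TEST accepts more than one statement, in which case the statements are tested jointly, so labeling the same path in the remaining groups would allow one overall test across all four.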

marie posted on Wednesday, October 02, 2013  3:05 pm



Hi, I am running a multiple group analysis with gender as the grouping variable. I understand that I cannot just compare the paths without establishing measurement invariance. I first checked whether the final structural regression model was a good fit across groups ("no constraint across groups"). I have 10 latent constructs. There were twice as many females as males. Three questions:
- How do I test whether the drop in fit is significant or not (I am using MLR)? Numerically speaking there was a drop in the CFI and an increase in the RMSEA and SRMR.
- All the indirect effects became nonsignificant in the male group. Five of my direct paths in both groups became nonsignificant. Is this due to the small sample size after I divided them into two groups?
- Is it fair to say that the final model could not hold in both groups so I have to stop my multiple group analysis there?
Thank you


You should test for measurement invariance using a model with only the ten latent variables. You should not include paths among the ten latent variables until you have established measurement invariance. See the Version 7.1 Mplus Language Addendum on the website with the user's guide. There is a new feature that automatically tests for measurement invariance. 


Hi, I have a SEM model with the WLSMV estimator for categorical items. In Mplus version 6 I was able to stratify my final model using the GROUPING option by levels of a variable I made in the DEFINE command. However, I have imputed data and now when I try to rerun the same input file in Mplus version 7 it says that I cannot do this because the GROUPING option requires the same number of participants per group in each imputation (and this is not the case since the imputation created some variation across the defined variable). Is this a bug in version 7 because in version 6 it just took the average number of people per group across all the imputations in order to do it? Is there another way I could stratify my model? 


Please send the Version 6 and 7 outputs and your license number to support@statmodel.com. 


Hi, how can I analyze a single data file with a grouping variable defining two or more groups, but only include one of the groups in the analysis? Thanks, Ebi


Use the USEOBSERVATIONS option without the GROUPING option. 
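For illustration, a VARIABLE command along these lines would keep only the cases in one group. The variable names and the value 1 are hypothetical.

```mplus
VARIABLE:   NAMES = y x1 x2 group;
            USEVARIABLES = y x1 x2;
            USEOBSERVATIONS = group EQ 1;  ! analyze only cases with group equal to 1
MODEL:      y ON x1 x2;
```

Because GROUPING is not used, this is an ordinary single-group analysis of the selected cases.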

C posted on Tuesday, December 10, 2013  1:19 pm



Hello, I am trying to conduct a multiple group path analysis (with 4 groups) using the following syntax: GROUPING = welfare (1=Med 2=Social 3=Post_Com 4=Cons); Model: educ ON books rooms occ feat age ; wealth ON main income age books rooms occ feat ; income ON main age ; main ON educ occ age ; Y ON wealth income main educ age books rooms occ feat ; rooms WITH books occ feat ; books WITH occ feat ; w3_occ WITH feat ; Model indirect: Y IND books ; I want to then test whether the path from books to Y is different between groups. I have tried different ways to do this but keep getting errors probably due to my syntax, so I was wondering if there is simple syntax to do this. Thanks. 


You would have to use Model Constraint to do this yourself. Label the slope parameters involved in the indirect effects for each group and use those labels in Model Constraint to express the indirect effects, like for 2 groups: Model Constraint: new(ind1 ind2 diff); ind1 = b11*b12 + b13*b14; ind2 = b21*b22 + b23*b24; diff = ind1 - ind2;
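To make the labeling step concrete, here is a sketch for the simplest case: two groups and a single mediator, so each indirect effect is one product. The variable names (x, m, y), group labels (g1, g2) and parameter labels are hypothetical and not taken from the poster's model, which has several paths from books to Y.

```mplus
MODEL:      m ON x;
            y ON m x;
MODEL g1:   m ON x (a1);                  ! group-specific labels for the two slopes
            y ON m (b1);
MODEL g2:   m ON x (a2);
            y ON m (b2);
MODEL CONSTRAINT:
            NEW(ind1 ind2 diff);
            ind1 = a1*b1;                 ! indirect effect in group 1
            ind2 = a2*b2;                 ! indirect effect in group 2
            diff = ind1 - ind2;           ! its estimate and standard error appear in the output
```

The estimate, standard error and z-value reported for diff then serve as the test of whether the indirect effect differs between the groups.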

C posted on Wednesday, December 11, 2013  2:31 am



Thanks. Just to check: is the above to test whether the indirect effect from books to Y is different between groups? If I just wanted to test whether the direct effect from books to Y is different between groups, is there a simpler way?


Yes, this is the test. I don't know of a simpler way. 


Hi, how can I fit multiple group nonlinear structural equation models in this program? Regards, Thanoon


I am trying to compare different approaches to testing for measurement invariance across 29 countries. I test a factor with only three indicators, applying both multigroup CFA (with maximum likelihood) and the new alignment method in Mplus (with Bayesian estimations, aiming at a model with approximate measurement invariance). Such data can also be analyzed with multilevel CFA, which would seem advantageous if the aim is to later use the factor in a multilevel regression analysis. Leaving aside configural invariance (which is not tested with only one factor and three indicators) and scalar invariance (which is unlikely with many countries involved): when metric invariance (invariant factor loadings) is supported for nearly all countries, a multilevel model with random intercepts for these countries should be justified. Is this correct? Or, put differently: if metric invariance is not supported, the multilevel factor analysis is strictly speaking not justified, even though frequently used. Correct? One might define factor loadings as random, but that seems to lead to a rather complex model and I haven't seen this in applied research. I would be thankful for guidance. Christopher Bratt 


I think your statements are correct. The paper http://www.statmodel.com/download/PolAn.pdf describes 3 different cases of invariance in the multilevel factor analysis case. 


Thanoon: What kind of nonlinear model do you refer to? 


Thank you for your help. I am using quadratic and interaction effects on an endogenous latent variable, like x1^2, x2^2, and x1*x2. Regards


That means that you use XWITH, so multiple-group analysis has to be done using TYPE = MIXTURE RANDOM and the KNOWNCLASS option. See the User's Guide.
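A rough sketch of how these pieces fit together, assuming a hypothetical observed grouping variable g coded 0/1 and hypothetical factors f1 to f3. The class name cg and all labels are placeholders; the User's Guide examples remain the authoritative reference.

```mplus
VARIABLE:   NAMES = y1-y11 g;
            USEVARIABLES = y1-y11;
            CLASSES = cg (2);                 ! one latent class per known group
            KNOWNCLASS = cg (g = 0 g = 1);    ! class membership is given by g
ANALYSIS:   TYPE = MIXTURE RANDOM;
            ALGORITHM = INTEGRATION;
MODEL:      %OVERALL%
            f1 BY y1-y5;
            f2 BY y6-y8;
            f3 BY y9-y11;
            f12 | f1 XWITH f2;                ! latent variable interaction
            f3 ON f1 f2 f12;
            %cg#2%
            f3 ON f1 f2 f12;                  ! mentioning the slopes frees them in group 2
```

Parameters not mentioned in a class-specific part follow the mixture defaults, so anything that should differ between the groups has to be stated there explicitly.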


Dear Dr. Muthen, I need your help to correct this SEM input. I cannot run it and I don't see any errors. I entered X*Z as a nonlinear effect on W; is that correct or not? Regards. TITLE: multiple group SEM group 1 DATA: FILE = C:\Users\hp\Desktop\path (2).dat; VARIABLE: NAMES = Y1-Y11; USEVARIABLES = Y1-Y11; ANALYSIS: ESTIMATOR = ML; MODEL: X BY Y1 Y2 Y3 Y4 Y5; Y BY Y6 Y7; Z BY Y8 Y9; W BY Y10 Y11; W on X Y Z X*Z; X with Y; Y with Z; OUTPUT: TECH1 TECH4 STDYX;


You need to define the latent variable interaction using the XWITH option. See Example 5.13 in the user's guide. 


Thank you so much for your help. I also need the effects of X^2 and Y^2 on W (nonlinear effects). How can I write this command? Regards


Dear Dr. Linda, I want to ask a question about the data in each group in multiple group SEM: should the data be independent or correlated within each group? Regards


For a quadratic effect, use: int | x XWITH x; Subjects in each group should be independent.
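Applied to the factors named in the earlier post, the squared terms might be set up as below. This is only a sketch: the names xsq and ysq are made up, the indicator lists are copied from the poster's input, and the ANALYSIS settings follow the User's Guide example for latent variable interactions.

```mplus
ANALYSIS:   TYPE = RANDOM;
            ALGORITHM = INTEGRATION;
MODEL:      X BY Y1-Y5;
            Y BY Y6 Y7;
            Z BY Y8 Y9;
            W BY Y10 Y11;
            xsq | X XWITH X;              ! quadratic term for X
            ysq | Y XWITH Y;              ! quadratic term for Y
            W ON X Y Z xsq ysq;
```

Each XWITH term adds to the numerical integration burden, so a model with several of them can be slow to estimate.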


Dear Dr. Linda, I saw in some references that not only the observations (subjects) but also the observed variables should be independent in multiple group SEM. Is this statement correct? I ask because I want to simulate data for a multiple group SEM. Regards


Observed variables should not be independent. It is the relationship among the observed variables that the analysis tries to explain. 

marie_l posted on Tuesday, December 31, 2013  8:46 am



Hello, I tested a theoretical model in which I have an indirect effect. Now I'd like to see if gender has a moderating effect, so I am running a multiple group analysis. First, I tried to establish measurement invariance, using the language below. 1) Does the code look OK? 2) May I use this language with Mplus 6? VARIABLE: NAMES ARE mediaexp1 mediaexp2 mediaexp3 mediaexp4 alcohol1 alcohol2 alcohol3 alcohol4 alcohol5 hseek1 hseek2 hseek3 hseek4 interact1 interact2 interact3 interact4 norm1 norm2 norm3 norm4 norm5 r_male; MISSING ARE ALL (9); GROUPING IS r_male (0= male 1=female); Model: mediaexp by mediaexp1 mediaexp2 mediaexp3 mediaexp4; alcohol by alcohol1 alcohol2 alcohol3 alcohol4 alcohol5; hseek by hseek1 hseek2 hseek3 hseek4; interact by interact1 interact2 interact3 interact4; norm by norm1 norm2 norm3 norm4 norm5; ANALYSIS: MODEL = CONFIGURAL METRIC SCALAR; ESTIMATOR IS MLR; ITERATIONS = 1000; CONVERGENCE = 0.00005; Thanks and happy holidays


MODEL = CONFIGURAL METRIC SCALAR; is not available in Version 6. See the Topic 1 course handout on the website for the inputs to test measurement invariance. 


Dear Dr. Muthen, I need your help with this multiple group SEM example: TITLE: Configural CFA model DATA: FILE = C:\Users\hp\Desktop\BSI_18.dat; VARIABLE: NAMES = X1-X18 GENDER WHITE AGE EDU CRACK SITE ID; MISSING = ALL (9); USEVARIABLES ARE X1-X18; GROUPING = SITE (1=OH 2=KY); !ANALYSIS: ESTIMATOR = ML;!default; MODEL: SOM BY X1 X4 X7 X10 X13 X16; !Somatization; DEP BY X5 X2 X8 X11 X14 X17; !Depression; ANX BY X3 X6 X9 X12 X15 X18; !Anxiety; [SOM@0 DEP@0 ANX@0]; X8 WITH X5; MODEL OH: X9 WITH X12; MODEL KY: SOM BY X1@1 X4 X7 X10 X13 X16; !Somatization; DEP BY X5@1 X2 X8 X11 X14 X17; !Depression; ANX BY X3@1 X6 X9 X12 X15 X18; !Anxiety; [X1-X18*]; X11 WITH X14; X9 WITH X18; OUTPUT: TECH1 TECH4;


If you are trying to specify a configural model, it looks correct. 


Dear Dr. Muthen, I cannot get any results in Mplus. What is the problem? Whatever example I take, I get no results. For example, this code: TITLE: Test invariance of marker item factor loadings DATA: FILE ='C:\Users\hp\Desktop\BSI_18.dat'; VARIABLE: NAMES = X1-X18 GENDER WHITE AGE EDU CRACK SITE ID; MISSING = ALL (9); USEVARIABLES ARE X1-X18; GROUPING = SITE (1=OH 2=KY); !ANALYSIS: ESTIMATOR = ML;!default; MODEL: SOM BY X1* X4@1 X7 X10 X13 X16; !Somatization; DEP BY X5* X2@1 X8 X11 X14 X17; !Depression; ANX BY X3* X6@1 X9 X12 X15 X18; !Anxiety; [SOM@0 DEP@0 ANX@0]; X5 with X8; MODEL OH: X9 WITH X12; MODEL KY: SOM BY X1 X4@1 X7 X10 X13 X16; !Somatization; DEP BY X5 X2@1 X8 X11 X14 X17; !Depression; ANX BY X3 X6@1 X9 X12 X15 X18; !Anxiety; [X1-X18]; X5 with X8; X11 WITH X14; X9 WITH X18; OUTPUT: TECH1; This is an example, but I don't know where the errors are. Please help me. Regards


What kind of message do you get? 


" the input setup produced syntax warnings/ errors caused to mplus to abort. please refer to the output file for these warninngs/ errors and fix the input setup accordingly" 


Please send your output and your license number to support@statmodel.com.


I am sorry, I don't have any output because of the errors. I don't have a license number because I am using the demo version, so how can I solve this problem?


Send the input and data to support@statmodel.com. 


Hi Dr. Linda, if I have dichotomous data, how do I choose the variable type? I don't see a dichotomous type among the variable types in Mplus. Does categorical cover both ordered categorical and dichotomous data? Thanks in advance


We check the variables on the CATEGORICAL list to see how many categories they have and treat them accordingly. 


Hi dr. Linda If I have two categories like male, ,female how can I treat with this variable in mplus. Regards 


hi dr. linda if i have variables with two categories such (male, female) how can i treat with it please explain to me. 


Usually gender is either a grouping variable or a covariate. If it is a covariate, it is treated as a continuous variable in regression and the model is estimated conditioned on it. The scale does not matter as no distributional assumptions are made about it. Only the scale of dependent variables is an issue. 


Dear Dr. Linda, I am working on multiple group structural equation models and all my observed variables are dichotomous (by "dependent variables" you mean the observed variables?). I received errors in the VARIABLE command when I wrote "GROUPING ARE X1-X10;". How can I solve this problem? Regards


The GROUPING option is to name the grouping variable for multiple group analysis. If x1-x10 are binary dependent variables, you would say CATEGORICAL ARE x1-x10;
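A small sketch of how the two options divide the work, assuming a hypothetical grouping variable named group and a hypothetical two-factor structure for the ten binary items:

```mplus
VARIABLE:   NAMES = x1-x10 group;
            USEVARIABLES = x1-x10;
            CATEGORICAL = x1-x10;              ! binary or ordered outcomes; categories are counted by the program
            GROUPING = group (1 = g1 2 = g2);  ! the variable that defines the groups
MODEL:      f1 BY x1-x5;
            f2 BY x6-x10;
```

GROUPING takes exactly one variable plus its value labels, whereas CATEGORICAL takes the list of dependent variables to be treated as categorical.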


Thanks a lot, Dr. Linda. That was my question: so the CATEGORICAL option covers both dichotomous and ordered categorical data, right? Thanks again


Yes, the program counts the number of categories and treats the variables accordingly. 


Dear Dr. Linda, can you explain how I should prepare my data file for multiple group structural equation models? I hope you can give me an example. Regards


You have all data in one data set that contains the grouping variable. See Example 5.15. 


Dear Dr. Linda, I just want to see the data file for Example 5.15 so I can see how to arrange this type of data for multiple groups. Can you tell me where I can find this file? Regards


It is installed with Mplus and it is also available on the website with the user's guide. 

S Gomez posted on Sunday, January 19, 2014  12:55 pm



Drs. Muthen, I am running a multiple group SEM from a single data file. My plan is to: 1) Run a model with measurement invariance and all structural paths held constant across the two groups. 2) Relax some equality constraints based on modification indices. The grouping is working fine, and I have been able to run models with structural paths varying freely or held constant. The problem is the factor loadings: I understand from the user's guide that these are held constant by default. However, when I run the model without specifying constraints for the BY statements, the output shows group differences in factor loadings. Even when I specify the loadings to be held constant, the same group differences show up. Any idea what I might be missing? Thanks!


Please send the relevant outputs and your license number to support@statmodel.com. 


Hi Dr. Linda, I want to run multiple group nonlinear structural equation models using the commands below, and I get an error. TITLE: MULTI SEM WITH CONTINUOUS DATA DATA: FILE IS "C:\Users\hp\Desktop\normal.dat"; TYPE IS COVARIANCE MEANS; NGROUPS = 2; NOBSERVATIONS = 500 500; VARIABLE: NAMES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; USEVARIABLES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; GROUPING IS (0= GROUP1 1=GROUP2); ANALYSIS: TYPE IS GENERAL BASIC; ESTIMATOR IS BAYES; ITERATIONS = 1000; CONVERGENCE = 0.00005; MODEL TYPE=RANDOM X BY X1 X2 X3 X4; Y BY X5 X6; Z BY X7 X8; W BY X9 X10; W ON X Y Z; int | X XWITH Z; OUTPUT: SAMPSTAT MODINDICES RESIDUAL STANDARDIZED CINTERVAL FSCOEFFICIENT FSDETERMINACY TECH3 TECH4 TECH5; SAVEDATA: RESULTS IS TT; TECH3 IS YY; TECH4 IS UU; and the error is *** ERROR in ANALYSIS command Unknown option: Y


Try putting a semicolon after TYPE=RANDOM. If that does not work, please send the full output and your license number to support@statmodel.com. 


After adding the semicolon I still get an error: *** ERROR in ANALYSIS command Unknown option: X (I am so sorry, I don't have a license number.) Regards


You have not given a variable name in the GROUPING option. See the user's guide to see how the GROUPING option is specified.


Thank you so much for your help. I saw in the user's guide that the grouping is specified as GROUPING IS group (1 = g1 2 = g2); I put it in my program and still get the same problem: *** ERROR in ANALYSIS command Unknown option: X DATA: FILE IS "C:\Users\hp\Desktop\normal.dat"; TYPE IS COVARIANCE MEANS; NGROUPS = 2; NOBSERVATIONS = 500 500; VARIABLE: NAMES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; USEVARIABLES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; GROUPING IS group (1 = g1 2 = g2); ANALYSIS: TYPE IS GENERAL BASIC; ESTIMATOR IS BAYES; ITERATIONS = 1000; CONVERGENCE = 0.00005; MODEL TYPE=RANDOM; X BY X1 X2 X3 X4; Y BY X5 X6; Z BY X7 X8; W BY X9 X10; W ON X Y Z; int | X XWITH Z; int | X XWITH Y; int | Y XWITH Z; OUTPUT: SAMPSTAT MODINDICES RESIDUAL STANDARDIZED CINTERVAL FSCOEFFICIENT FSDETERMINACY TECH3 TECH4 TECH5; SAVEDATA: RESULTS IS TT; TECH3 IS YY; TECH4 IS UU;


I can't help without seeing your output and license number at support@statmodel.com. 


This is my output: Mplus VERSION 6.12 MUTHEN & MUTHEN 01/22/2014 5:17 PM INPUT INSTRUCTIONS DATA: FILE IS "C:\Users\hp\Desktop\normal.dat"; TYPE IS COVARIANCE MEANS; NGROUPS = 2; NOBSERVATIONS = 500 500; VARIABLE: NAMES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; USEVARIABLES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; GROUPING IS group (1 = g1 2 = g2); ANALYSIS: TYPE IS GENERAL BASIC; ESTIMATOR IS BAYES; ITERATIONS = 1000; CONVERGENCE = 0.00005; MODEL TYPE=RANDOM; X BY X1 X2 X3 X4; Y BY X5 X6; Z BY X7 X8; W BY X9 X10; W ON X Y Z; int | X XWITH Z; int | X XWITH Y; int | Y XWITH Z; OUTPUT: SAMPSTAT MODINDICES RESIDUAL STANDARDIZED CINTERVAL FSCOEFFICIENT FSDETERMINACY TECH3 TECH4 TECH5; SAVEDATA: RESULTS IS TT; TECH3 IS YY; TECH4 IS UU; *** ERROR in ANALYSIS command Unknown option: X MUTHEN & MUTHEN 3463 Stoner Ave. Los Angeles, CA 90066 Tel: (310) 391-9971 Fax: (310) 391-8971 Web: www.StatModel.com Support: Support@StatModel.com Copyright (c) 1998-2011 Muthen & Muthen


TYPE=RANDOM should be in the ANALYSIS command, not the MODEL command.


After this change I get the same error. The output is: INPUT INSTRUCTIONS DATA: FILE IS "C:\Users\hp\Desktop\normal.dat"; TYPE IS COVARIANCE MEANS; NGROUPS = 2; NOBSERVATIONS = 500 500; VARIABLE: NAMES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; USEVARIABLES ARE X1 X2 X3 X4 X5 X6 X7 X8 X9 X10; GROUPING IS group (1 = g1 2 = g2); ANALYSIS: TYPE IS GENERAL BASIC; ESTIMATOR IS BAYES; ITERATIONS = 1000; CONVERGENCE = 0.00005; TYPE=RANDOM; MODEL X BY X1 X2 X3 X4; Y BY X5 X6; Z BY X7 X8; W BY X9 X10; W ON X Y Z; int | X XWITH Z; int | X XWITH Y; int | Y XWITH Z; OUTPUT: SAMPSTAT MODINDICES RESIDUAL STANDARDIZED CINTERVAL FSCOEFFICIENT FSDETERMINACY TECH3 TECH4 TECH5; SAVEDATA: RESULTS IS TT; TECH3 IS YY; TECH4 IS UU;SOMSOM *** ERROR in ANALYSIS command Unknown option: Y MUTHEN & MUTHEN 3463 Stoner Ave. Los Angeles, CA 90066 Tel: (310) 391-9971 Fax: (310) 391-8971 Web: www.StatModel.com Support: Support@StatModel.com Copyright (c) 1998-2011 Muthen & Muthen


Put a colon after MODEL. 


Hi Dr. Linda, when I ran an SEM in Mplus I received these messages in the output: *** WARNING in OUTPUT command TECH4 option is not available for TYPE=RANDOM. Request for TECH4 is ignored. *** WARNING Data set contains unknown or missing values for GROUPING, PATTERN, COHORT, CLUSTER and/or STRATIFICATION variables. Number of cases with unknown or missing values: 437 2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS Why can't I get these results, and why is TECH4 "not available"?


When random slopes are estimated, the dependent variable variances vary as a function of the covariate so there is not a single TECH4 value to be printed. 


Hi Dr. Linda, I want to ask some questions: 1. Which method is suitable for multiple group structural equation models with continuous dependent variables (just ML)? 2. Which method is suitable for multiple group structural equation models with ordered categorical dependent variables? 3. Which method is suitable for multiple group structural equation models with dichotomous dependent variables? Regards


See page 601 of the user's guide where there is a summary table of estimators available for different types of variables. 


Dear Dr. Linda, I want to use the Bayes estimator for multiple group SEM with categorical data, and I want to use a uniform distribution. Is this procedure correct? Thanks in advance


You can do multiple group SEM with categorical data using Bayes and the Knownclass approach. A uniform prior can be specified where appropriate. 


Dear Dr. Muthen, as you know, a basic assumption of SEM is that continuous observed variables are normally distributed. But most researchers (e.g., Lee, 2007) also rely on the normal distribution for categorical data. My question is: why not use a uniform distribution with categorical data? Thanks a lot


I think you are still talking about Bayes. With categorical outcomes, Bayes MCMC procedures have been developed using probit, generating underlying latent continuous normal variables that are then categorized. See work by Albert and Chib for instance as referred to in our technical reports on the Mplus Bayes implementation. 


Hello, I am currently testing a structural equation model from 4 emotions to distinct subtypes of well-being via 2 different mediators. The social emotions were induced in a lab via 4 different conditions. I have run the structural equation model within a multigroup design, using the condition as a grouping variable. I would like to run a manipulation check to see whether participants were in fact reporting higher levels of the emotion if they were in that condition. I understand that you can do a manipulation check by assessing whether the means of the emotions are higher in each group/condition. What syntax would I use to get the model to compute these, and to see if there are any significant differences? I’ve had a look online and in the user manual, but I can’t seem to find anything. Many thanks for your assistance, Elizabeth


I am assuming the emotions are observed variables, not factors. You can use MODEL TEST or chi-square difference testing to determine if the means are different from each other.


Dear Dr. Muthen, in multiple group analysis with the Bayes estimator, when you have categorical data with 4 categories, which distribution is suitable for this type of data (normal or uniform)? Thanks a lot


If the variable is ordinal and the distribution reasonably symmetric, it might be ok to approximate it as normal. 


Dear Dr. Muthen, I'm testing a 2-group (by gender) mediation model (N=499), where I have 7 exogenous variables (and 2 baseline variables of my outcomes), one mediator variable, and two outcomes. All variables are indicators and continuous. Although my output statements say "terminated normally" they also say, THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.175D-15. PROBLEM INVOLVING PARAMETER 162. When I check the parameter, it's simply one variance for one of the baseline variables. When I bootstrap the model, however, my output statements produce no errors. Thus, is it ok to report all results from the bootstrapped model? Otherwise, what can I do with my non-bootstrapped model? thanks!


I am interested in whether there are differences in means between groups (MANOVA or ANOVA). How do I write the correct syntax for this? After specifying the groups, I wrote Model: [varname] (1); But the output for the restricted and the unrestricted model was the same, with the same number of df.


Milena: Please send the output with the error message and your license number to support@statmodel.com. 


Vera: Please send the outputs and your license number to support@statmodel.com. 


Hello, Thanks for your response. I am hoping to test the differences in the means using chi-square testing in the model constraint command. I'm unsure about what syntax to use. As an example, how would I constrain the self-report measure of pride to be equal across the neutral and pride group? Many thanks, Elizabeth


See pages 478-480 of the user's guide.


Hi Dr. Muthen, I want to define a new weight matrix for WLS or WLSMV in multiple group SEM. How can I specify this matrix in Mplus? Thanks


You cannot read a weight matrix with WLS or WLSMV. 

ehrbc1 posted on Tuesday, March 04, 2014  6:07 pm



Hello Linda, I have read through pages 478-480 of the users’ guide and I am still experiencing some difficulties. As mentioned above I want to see whether there are any differences between a self-report measure of pride across the neutral and pride group. In other words, I am expecting that the pride measure will be significantly higher in the pride group compared to the neutral group. Extracts of my syntax are as follows. VARIABLE: NAMES ARE g POSREPORT SEX AGENTIC COMMUNAL SELFWB PRIDEREPORT COMPREPORT GRATREPORT OTHERWB; USEVARIABLES ARE g POSREPORT AGENTIC COMMUNAL SELFWB PRIDEREPORT COMPREPORT GRATREPORT OTHERWB; Missing are all (999); GROUPING is g (1 = neutral 2 = positivity 3 = gratitude 4 = compassion 5 = pride); MODEL: COMMUNAL on POSREPORT; COMMUNAL on GRATREPORT; COMMUNAL on COMPREPORT; COMMUNAL on PRIDEREPORT; MODEL neutral: PRIDEREPORT (1); MODEL pride: PRIDEREPORT (1); After running the analysis, I have examined the "chi-square test of model fit" and it is not significant. This is not what I expected given that the ANOVA I ran in SPSS showed that there was a difference between the self-report measure of pride in the pride and neutral group. Could you please assist? Many thanks, E


The following holds the variance of pridereport, an independent variable, equal across groups. Is this what you intend? MODEL neutral: PRIDEREPORT (1); MODEL pride: PRIDEREPORT (1); 

ehrbc1 posted on Wednesday, March 05, 2014  8:07 pm



Hi Linda, Thanks for your response. No, that is not what I intend. My intention is to see whether the mean of the variable PRIDEREPORT is significantly different in the neutral and pride group. So a t-test. Many thanks, E


The mean of an observed exogenous variable is not an estimated parameter in a regression model. You refer to a mean by placing the variable name in brackets: MODEL neutral: [PRIDEREPORT] (1); MODEL pride: [PRIDEREPORT] (1); In a regression, it is most common to test the equality of a regression coefficient across groups. 
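As a sketch of the bracket notation combined with MODEL TEST, using the group and variable names from the question (the labels m1 and m5 are made up):

```mplus
MODEL:          COMMUNAL ON POSREPORT GRATREPORT COMPREPORT PRIDEREPORT;
MODEL neutral:  [PRIDEREPORT] (m1);       ! mean of PRIDEREPORT in the neutral group
MODEL pride:    [PRIDEREPORT] (m5);       ! mean of PRIDEREPORT in the pride group
MODEL TEST:     0 = m1 - m5;              ! Wald test that the two means are equal
```

Giving both means the same label, as in the reply above, instead holds them equal, and the comparison is then made by a chi-square difference test against the model without that constraint. Note that mentioning the mean brings PRIDEREPORT into the model as an estimated part rather than a conditioned-on covariate, which is worth keeping in mind when reading the fit statistics.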


Hi, which method is better for estimating parameters in multigroup SEM with categorical outcomes, WLS or Bayes, and which one gives smaller standard errors? Thanks in advance


I don't think you will find much difference in the standard errors between the two methods. Try it out and see. 


Hello, I am doing a multigroup comparison between groups from two different cultures, and I need to control for a binary variable in one culture, but not the other. When I try to run the model, I get an error message because one group has no variance on that variable. The only way I can figure out to get the model to run is to falsely change the value on one of the observations so that both groups have some variance on that variable, and then constrain all parameters with that variable to be zero in the group where it doesn't apply. Is that the best way to do that? Does constraining the parameters to zero keep the variable from having any effects in the group where the value has been falsely changed? Thank you, Lindsay


See the following FAQ on the website: Different number of variables in different groups 

Tait Medina posted on Thursday, March 20, 2014  2:48 pm



I am estimating a multiple group CFA model with 2 groups, 6 observed continuous variables, and one factor. I have constrained 2 loadings to be invariant across groups, in addition to the loading that has been fixed to 1 in each group. I am freely estimating the intercepts in both groups. The residual variances are freely estimated as well. The factor means are fixed at 0. Now I would like to compare the substantive implications of extracting factor scores and using them as outcomes in a regression analysis in each group (a two-step approach), compared to estimating the effects of covariates on the factor in each group in a single step. However, I am having difficulty setting up my syntax for the single-step approach and am hoping that I can receive some guidance. This example syntax leads to a nonidentified model. I thought that I had met the minimum number of constraints, but obviously I am missing something. Thank you. MODEL: f1 BY y1-y6; f1 ON x1; MODEL 2: f1 BY y5 y6; [y1-y6];


This must not be the full model since the factor means are not fixed at zero. Please send your output and license number to Support. 


I'm so embarrassed. That is exactly what was missing from the single-step model (factor means fixed at 0 in both groups). This syntax runs just fine:

MODEL:
  f1 BY y1-y6;
  [f1@0];
  f1 ON x1;
MODEL 2:
  f1 BY y5 y6;
  [y1-y6];
  [f1@0];
  f1 ON x1;

Thank you, and sorry for the bother.


I am trying to run a multiple group analysis in a censored regression. All of my variables are observed. The Mplus user's guide states that for censored outcomes with maximum likelihood estimation, multiple group analysis is specified using the KNOWNCLASS option of the VARIABLE command in conjunction with the TYPE=MIXTURE option of the ANALYSIS command. However, I receive the following error:

*** ERROR in VARIABLE command
CLASSES option not specified. Mixture analysis requires one categorical latent variable.

My input is below:

VARIABLE:
  USEVAR = x1 x2 x3 x4 x5 y1;
  CENSORED ARE y1 (b);
  Missing ARE all .;
  KNOWNCLASS IS x1(0 = male 1 = female);
ANALYSIS:
  TYPE = MIXTURE;
  Algorithm = INTEGRATION;
  ESTIMATOR = MLR;
MODEL:
  y1 ON x2 x3 x4;
  y1 ON x5(1);

Are the input commands incorrect? How do I test for moderation (categorical variable) for a censored outcome, when all variables are observed? Your help is much appreciated - thanks!


You also need the CLASSES option. See Example 7.21. It shows how all of these options work together. 


Thanks for your quick response. I looked at Example 7.21 but am still unclear on how to apply this to my model, as I do not have a categorical latent variable with known class membership. Instead I have a categorical observed variable - I am examining sex (0 = male 1 = female) as a moderator.


The CLASSES option names a categorical latent variable which the KNOWNCLASS option makes equivalent to your observed variable. 
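

A minimal sketch of how these options could fit together for the censored regression above, loosely following the pattern of Example 7.21. The variable names are the poster's; the file name, the class variable name cg, the labels bm and bf, and the MODEL TEST step are illustrative assumptions rather than anything taken from the poster's run.

```mplus
DATA:
  FILE = censored.dat;             ! hypothetical file name
VARIABLE:
  NAMES = x1 x2 x3 x4 x5 y1;       ! assumed column order
  USEVARIABLES = x2 x3 x4 x5 y1;   ! x1 enters only through KNOWNCLASS
  CENSORED = y1 (b);
  MISSING = .;
  CLASSES = cg (2);                ! cg is a made-up name for the latent class variable
  KNOWNCLASS = cg (x1 = 0 x1 = 1); ! class 1 = male, class 2 = female
ANALYSIS:
  TYPE = MIXTURE;
  ALGORITHM = INTEGRATION;         ! kept from the original input
  ESTIMATOR = MLR;
MODEL:
  %OVERALL%
  y1 ON x2 x3 x4 x5;
  %cg#1%
  y1 ON x5 (bm);                   ! x5 slope for males
  %cg#2%
  y1 ON x5 (bf);                   ! x5 slope for females
MODEL TEST:
  0 = bm - bf;                     ! Wald test of equal x5 slopes across sex
```

Because the classes are fully determined by x1, cg acts as the grouping variable, and mentioning the x5 slope in each class-specific part lets it differ by sex.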

ehrbc1 posted on Tuesday, April 01, 2014  7:34 pm



Hi Linda, I have been able to run some pre-planned ANOVA contrasts in SPSS, but I would now like to run them in Mplus for my multigroup structural equation model. I understand that to compare a DV (the mean self-reported feeling of pride) between a neutral group and a pride group, I would do the following: MODEL neutral: [PRIDEREPORT] (1); MODEL pride: [PRIDEREPORT] (1); How would I compare the pride group to the combined average of the other groups in the model (not including neutral) on [PRIDEREPORT]? The other groups are compassion, positivity and gratitude. Many thanks for your assistance, E


Let's say you have three means that you have labelled:

MODEL 1: [y] (p1);
MODEL 2: [y] (p2);
MODEL 3: [y] (p3);

You can use MODEL CONSTRAINT as follows:

MODEL CONSTRAINT:
  NEW (mean diff);
  mean = (p2 + p3)/2;
  diff = mean - p1;


Hi Dr. Muthen, I have a multiple-group SEM with ordered categorical variables and I want to use the inverse normal to solve the identification problem. Is that correct? Thanks in advance.


I don't know which identification problem you refer to or what inverse normal you refer to. 


The identification problem for the distribution of thresholds in ordered categorical data. Can I use the inverse normal as a distribution for thresholds? Thanks a lot, doctor.


Are you talking about a Bayes prior? If so, we don't have an inverse normal prior. I still don't understand what you are asking.


Yes, I want to use Bayesian analysis in SEM with ordered categorical and dichotomous data.


See MODEL PRIORS in the user's guide to see the priors available in Mplus. 
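

As an illustration of what a MODEL PRIORS specification looks like, here is a small single-group sketch with categorical indicators. The file name, variable names, parameter labels, and prior values are all hypothetical, and the normal prior is used only as an example of one available choice; the user's guide lists which priors are allowed for which parameter types.

```mplus
DATA:
  FILE = example.dat;            ! hypothetical file name
VARIABLE:
  NAMES = u1-u4;                 ! hypothetical ordered categorical items
  CATEGORICAL = u1-u4;
ANALYSIS:
  ESTIMATOR = BAYES;
MODEL:
  f BY u1-u4* (lam1-lam4);       ! free and label all four loadings
  f@1;                           ! set the factor metric through the variance
MODEL PRIORS:
  lam1-lam4 ~ N(1, 0.5);         ! normal prior with mean 1 and variance 0.5
```

Priors are attached to parameter labels assigned in the MODEL command, so any parameter that should get a non-default prior needs a label first.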


Hi, I would like a real data example for multiple-group SEM with ordered categorical and dichotomous data. Can you help me get one? Many thanks in advance, Thanoon


We don't have one that we can share. Perhaps you should ask on SEMNET. 


Hi, Does version 7.11 still provide scaling correction factors for chi-square difference testing using MLR? I'm trying to do a chi-square difference test and I don't see the correction factors in my output. Thanks. Eric


They should be there. Check to be sure the estimator being used is MLR. 

Eric Deemer posted on Tuesday, July 29, 2014  10:28 am



Ah, there are the scaling correction factors! I thought MLR was the default estimator? eric 


Not in all cases. 

Eric Deemer posted on Wednesday, July 30, 2014  3:34 am



I see. Okay, I specified MLR estimation and I got the correction factors. Thanks so much! Eric 
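

For reference, the scaled chi-square difference computed by hand from MLR output is usually written as below. This follows the formula given on the Mplus website for difference testing with scaling correction factors; the symbols are notation introduced here, not part of the posts: $T_0$, $d_0$, $c_0$ are the reported chi-square, degrees of freedom, and scaling correction factor of the nested (more restricted) model, and $T_1$, $d_1$, $c_1$ are those of the comparison model.

```latex
c_d = \frac{d_0\,c_0 - d_1\,c_1}{d_0 - d_1},
\qquad
T_{R_d} = \frac{T_0\,c_0 - T_1\,c_1}{c_d}
```

$T_{R_d}$ is then referred to a chi-square distribution with $d_0 - d_1$ degrees of freedom.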


Hello, I'm unfamiliar with the Wald's test for model comparison. I've been researching it and believe I understand how to do it, but I have a question. When conducting the comparison, do you test all parameters or only the ones that differ between the nested and comparison models? Your advice is greatly appreciated. Thank you in advance. 


It's your choice - you can test any set of parameters.
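

A small sketch of a Wald test on a chosen set of parameters in a two-group regression. The file name, variable names, group labels, and parameter labels are hypothetical; the point is only that MODEL TEST evaluates jointly whatever restrictions are listed.

```mplus
DATA:
  FILE = example.dat;              ! hypothetical file name
VARIABLE:
  NAMES = y x1 x2 g;               ! hypothetical variable names
  USEVARIABLES = y x1 x2;
  GROUPING = g (1 = g1 2 = g2);
MODEL:
  y ON x1 x2;
MODEL g1:
  y ON x1 (a1);                    ! slopes in group 1
  y ON x2 (a2);
MODEL g2:
  y ON x1 (b1);                    ! slopes in group 2
  y ON x2 (b2);
MODEL TEST:
  0 = a1 - b1;                     ! both restrictions are tested together,
  0 = a2 - b2;                     ! giving a 2-df Wald test
```

Dropping one of the two lines under MODEL TEST turns this into a 1-df test of a single slope difference.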


Hello, I need your help getting some information on this data set, which I found on your website, for multiple group analysis: wmimicd.dat. Many thanks in advance.


Where do you find this data set? 


Thank you for your reply. I found it under Mplus Examples, Categorical Outcome: wmimicd.dat. Many thanks in advance.


There is no information about this data set available. It may be simulated. 


Hello, I have another question regarding Wald's test. In the testing portion I began with a fully constrained model (parameters, means, covariances that had been added for better data fit). I am wondering if some of the effects can appear to be significant if they are not if the overall model shows significance. To account for that do you drop constraints based on the modification indices and make decisions based on how the other fit indices change? I apologize for all the questions, but greatly appreciate your help. I have found a great deal of references, but none that have answered these questions. Thank you again for providing this service. 


I don't understand your first paragraph. 


Dear Drs Muthen, I am trying to run power analyses using Monte Carlo estimation for a moderated mediation model. All variables, including the moderator, are continuous. The code below generates power estimates for the indirect effect of X->X2->Y, and for detecting the effect of the interaction between X and M (XM) on X2. My question is: should I report power separately for the interaction term and for the indirect effect, or is there a way to test for power by combining the two? Right now, just for ease, I've specified all means = 0 and variances = 1, and regression effects are various levels of possible effect sizes. Thank you so much in advance

MONTECARLO:
  names are x x2 y m xm;
  nobs = 70; ! sample size;
  nreps = 1000;
  seed = 2222; ! number generator;
DEFINE:
  xm = x*m;
ANALYSIS:
  TYPE=meanstructure;
MODEL POPULATION:
  [x @ 0]; !mean of x set to 0;
  [y @ 0];
  [x2 @ 0];
  [m @ 0];
  [xm @ 0];
  y @ 1.0;
  x @ 1.0;
  x2 @ 1.00;
  m @ 1.00;
  xm @ 1.00;
  x2 on X @ 1.02;
  x2 on xm @ .283;
  x2 on m @ .283;
  y on x2 @ 1.02;
  y on x @ .283 x @ .283;
MODEL:
  x * 1.00;
  x2 * 1.00;
  y * 1.0;
  xm * 1.0;
  x2 on x * 1.02(gamma1);
  x2 on m * 1.02;
  x2 on xm * 1.02 (gamma2);
  y on x2 * .283 (b);
  y on x * .283 x * .283;
MODEL INDIRECT:
  y IND x;


Perhaps you get power for the full effect - not just using the slope of the main effect and the slope of the interaction effect separately - by using Model Constraint to express the moderated mediation effect in line with the "indirect" expression in the pdf called Loop plot for ex 3.18 on our Mediation web page http://www.statmodel.com/Mediation.shtml


Thank you for your very prompt response. When I use the LOOP command:

MODEL CONSTRAINT:
  LOOP(m,-2,2,0.1);

I receive the following error message:

*** ERROR in MODEL CONSTRAINT command
A parameter label or the constant 0 must appear on the left-hand side of a MODEL CONSTRAINT statement. Problem with the following: LOOP(M,-2,2,0.1) =

Would you happen to know what this means and/or how to adjust the code? Thanks again


The LOOP plot came out in Version 7. It sounds like you may be using an older version of the program where it is not available. 


Oh, I see, yes, I'm using version 6. Do you know if there's a way to test my question (see two posts above)/modify the code I wrote in version 6? Thanks again 


I think Kris Preacher has a website for creating this type of plot. 


Thank you for the suggestion. He doesn't seem to have any utilities for calculating power for a moderated mediation model, but he did post some code to do it, which I modified for my purposes. Just to be sure I'm understanding my output correctly, the following code within a Monte Carlo power simulation yields power analyses for both the regression estimate of XM on X and the newly defined IND effect. If I want to calculate power for detecting moderated mediation, am I primarily interested in the power for the IND effect or for detecting the effect of X on XM?

MODEL:
  y on m (b1)
    x
    xm (b2);
  m on x (a1);
  xm with m;
MODEL CONSTRAINT:
  new (ind xmodval);
  xmodval = 1;
  ind = a1*(b1+b2*xmodval);

Thanks again, this has been quite helpful.


I think you could be interested in the power for both "b2" and "ind". But I am not sure you want to have xm with m; in the model. 

Rachel posted on Thursday, October 23, 2014  9:24 pm



Hello, I am doing an SEM mediation analysis (observed continuous variables) comparing two ethnic groups. I compared my baseline model to my fully constrained model using the Satorra-Bentler chi-square difference test, which showed a degradation in model fit. This means that the two groups are different, correct? How would I find out which individual paths they differ on? Would I remove one constraint at a time from my more restricted model and do the Satorra-Bentler calculations again?


You can look at modification indices. 

Rachel posted on Friday, October 24, 2014  8:54 am



So if there is a degradation in model fit then the two groups differ, yes? And I should request modification indices for my baseline model?


You want the modindices for the fully constrained model. When model fit degrades, that is, when there are large modindices, the groups differ.
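

A minimal sketch of requesting modification indices for a model whose paths are all held equal across two groups. The file name, variable names, and group labels are hypothetical, and the 3.84 cutoff is just the conventional 1-df chi-square critical value at the .05 level.

```mplus
DATA:
  FILE = example.dat;              ! hypothetical file name
VARIABLE:
  NAMES = x m y eth;               ! hypothetical variable names
  USEVARIABLES = x m y;
  GROUPING = eth (1 = g1 2 = g2);
ANALYSIS:
  ESTIMATOR = MLR;
MODEL:
  m ON x (1);                      ! each numbered path is held equal across groups
  y ON m (2);
  y ON x (3);
OUTPUT:
  MODINDICES (3.84);               ! print only indices above the cutoff
```

Each equality label sits on its own line, so it constrains only that one path across groups and not the paths to each other.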


Is there a way to have Mplus read data in from two different files? I'm trying to do multigroup analysis and I have only summary data for one group but the actual data set for the second group. Thanks, Steve 


You can create summary data for the second group and read in summary data for both groups. 


Thank you for your quick response, Bengt. Is there an example of multigroup analysis based on summary data of this sort anywhere? How do I tell Mplus that part of the summary data is for group 1 and part is for group 2? Thanks, Steve


See pages 483-484 of the user's guide under Summary Data, One Data Set.
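

A rough sketch of what a two-group run from summary data can look like, assuming the means and covariance matrix for group 1 are followed by those for group 2 in a single file. The file name, sample sizes, variable names, and model are hypothetical; the user's guide pages cited above are the authority on the exact file layout.

```mplus
DATA:
  FILE = twogroups.dat;            ! hypothetical file: group 1 summary data, then group 2
  TYPE = MEANS COVARIANCE;
  NGROUPS = 2;
  NOBSERVATIONS = 300 280;         ! hypothetical sample sizes, one per group
VARIABLE:
  NAMES = y1-y4;                   ! hypothetical variable names
MODEL:
  f BY y1-y4;
MODEL g2:
  f BY y2;                         ! frees the y2 loading in the second group
```

With summary data there is no grouping variable, so the groups are referred to by the labels g1 and g2 in the order they appear in the file.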


I have a "missing by design" problem. I have two groups, just about 50% of the sample in each. I get the error: THE MISSING DATA EM ALGORITHM FOR THE H1 MODEL HAS NOT CONVERGED WITH RESPECT TO THE LOGLIKELIHOOD FUNCTION. The question may be framed more generally as how to have different model structures for two groups. Essentially for group 1: age3 -> age 4 -> age 5; and for group 2: age3 -> age 4 -> age 5 -> age 6. The discussion boards suggest that the coverage is low for one group and that is why this error emerges. But of course, that is the point of "missing by design". I would appreciate your suggestions. The inp file was as follows:

DATA: FILE IS 'missing by design test mplus data v01.dat';
VARIABLE:
  NAMES ARE anketno a3ecbi a3pun a3homep a4ecbi a4pun a4homep a5ecbi a5pun a5homep a6ecbi a6pun a6homep schpat;
  USEVARIABLES ARE a3ecbi a4ecbi a5ecbi a6ecbi schpat ;
  PATTERN is schpat(1=a3ecbi a4ecbi a5ecbi a6ecbi 2 = a3ecbi a4ecbi a5ecbi );
  missing=all (999);
model:
  !structural
  ! Age 4;
  a4ecbi on a3ecbi ;
  !Age 5;
  a5ecbi on a3ecbi a4ecbi ;
  !Age 6 ;
  a6ecbi on a3ecbi a4ecbi a5ecbi ;


Try H1iterations = 5000; in the Analysis command. You can also say NoCHI in the Output command to suppress H1 calculations but then you won't get the chi-square test of model fit. You can also run this as a 2-group run to make H1 computations easier, but then you have to handle the difference in the number of variables for the two groups (see FAQ on that).
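

The first two suggestions amount to the following additions to the input. This is only a fragment to merge into the existing file; NOCHISQUARE is the full spelling of the NoCHI abbreviation, and the two options would normally be tried as alternatives rather than together.

```mplus
ANALYSIS:
  H1ITERATIONS = 5000;   ! raise the iteration limit for the unrestricted (H1) model
OUTPUT:
  NOCHISQUARE;           ! alternative: skip the H1 model, at the cost of the chi-square fit test
```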


Yes, I got a FAQ sheet on that a few minutes ago but that was not much help because it is referring to a model but that model is not specified. Here is what I got: "Note that this output shows only one missing data pattern for both groups whereas ***females have two patterns***. If you do the analysis in two steps, saving the data and then analyzing it, you see ** two missing data patterns for females ** and get the same results as doing it in one step. " 


All you need is the general statement in that FAQ: For a dependent variable, it is best to create a missing value flag for that variable in the group that does not have that variable using the DEFINE command. You also need to fix the residual variance of the variable to a very small value, or hold it equal to the other group (the estimate will not be affected by the group that has missing data). Fixing it to zero creates a noninvertible estimated covariance matrix. 


But trying an increase in the H1 iterations in the single-group approach should be a first step. I assume you don't have coverage=0 for any pairs of variables.


When I do that, Mplus never proceeds to estimating the model at all. It quits when it sees the variable for one group totally missing, but that is the whole point! The variable is missing by design! Here is the output that I am getting:

*** ERROR
One or more variables in the data set have no non-missing values. Check your data and format statement.

Group AGE7SCH
  Continuous    Number of
  Variable      Observations    Variance
  A4ECBI        394             0.031
  A5ECBI        393             0.029
  A6ECBI        394             0.026
  A3ECBI        394             0.026

Group AGE6SCH
  Continuous    Number of
  Variable      Observations    Variance
  A4ECBI        404             0.029
  A5ECBI        400             0.027
  **A6ECBI      0
  A3ECBI        404             0.025


I was suggesting increasing H1iterations in the single-group approach (your original run), not the two-group approach. If that doesn't work, you can send your output from this single-group run and the output from your two-group run, plus data and license number to support.


I have a question regarding the distribution of parameters in SEMs: can I use my own distributions for SEM parameters? For example, the distribution of the variance of the dependent variable, psi, is gamma, so can I change it to another distribution? Is that correct? Many thanks in advance.


I think you are referring to Bayes estimation and priors. The prior choices are described on page 698 of the UG. 


Thank you very much for your quick response. I want to ask about page 698: can I use the other distributions available, for example gamma, uniform, or log normal, for psi, or just the default prior (inverse gamma)? Many thanks again.

Daniel Lee posted on Saturday, April 11, 2015  1:20 pm



Hi Dr. Muthen, If you have a saturated model: 2 factor structure with 2 items loading on each factor, can you conduct a multiple group CFA analysis? 


Yes. 

Daniel Lee posted on Monday, April 13, 2015  6:50 pm



Thank you!! 


Dear Prof. Muthen, I want to ask about page 698: can I use the other distributions available, for example gamma, uniform, or log normal, for psi, or just the default prior (inverse gamma)? Many thanks again.


You can only use what is mentioned for Psi on that page. 


I should add that "The Mplus default variance prior is IG(-1,0) which implies a uniform prior ranging from minus infinity to plus infinity." This is mentioned on page 22 of the paper on our website: Muthén, B. (2010). Bayesian analysis in Mplus: A brief introduction. Technical Report. Version 3.


Profs. Muthén, I am conducting a longitudinal cross-lag model with 6 waves and three latent factors at each wave. I've also 1) used parcels to simplify the estimation process, 2) have a large sample size, and 3) have already demonstrated measurement invariance prior to estimation of the full cross-lagged models. I'm wondering how you'd suggest that I test gender differences in the individual stability and cross-lag paths, given that the significance of these differences is of primary interest (I'm currently using the GROUPING command to separate the models by gender). Because the Model Constraint and Model Test commands perform constraints/tests simultaneously, rather than one-by-one, and I don't think the DO option would solve this issue either, I'm at a bit of a loss as to how to accomplish these tests of the regression paths. Would you recommend another command option, or perhaps using the path estimates and SEs to explore significant differences using the confidence intervals (e.g., seeing if "x1 - x2 - 1.96*sqrt(se1^2+se2^2) > 0")? If the latter, should I use the unstandardized or the standardized estimates? Thank you for your assistance, and please let me know if you need more information prior to making a suggestion. I appreciate the amazing help and resource you provide to all of us modelers! It's a pleasure to use your software and the supporting information.


You can simply do separate runs using Model Test, one for each coefficient whose difference across groups you want to test. Another approach is to hold all of them equal across groups and then check Modindices for which ones are not equal.
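

For comparison, the confidence-interval check raised in the question can be written out as follows. This is only a sketch in notation introduced here ($b_1$, $b_2$ are the estimates of the same path in the two groups and $SE_1$, $SE_2$ their standard errors), and it assumes the two estimates are uncorrelated, which need not hold exactly when other parameters are held equal across groups.

```latex
z = \frac{b_1 - b_2}{\sqrt{SE_1^2 + SE_2^2}},
\qquad
(b_1 - b_2) \pm 1.96\,\sqrt{SE_1^2 + SE_2^2}
```

Under that assumption, $z^2$ should be close to the 1-df Wald chi-square that a Model Test run reports for the same coefficient, since Model Test uses the full estimated covariance matrix of the parameters.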


Thank you, Bengt. In order to save time initially, I prefer the latter option of holding paths equal across groups and looking at Modindices. Using a theoretical example with 3 waves and 2 latent factors at each wave, would I do this by requesting all Modindices on something like the following:

Grouping is (0 = female 1 = male);
MODEL:
  A3 ON A2 (1);
  A3 ON B2 (2);
  A2 ON A1 (3);
  A2 ON B1 (4);
MODEL male:
  A3 ON A2 (1);
  A3 ON B2 (2);
  A2 ON A1 (3);
  A2 ON B1 (4);

Would the provided ON/BY Modindices be for the release of the equality constraints (i.e., 1, 2, 3, 4), or something else?


P.S. I obviously meant the following: GROUPING IS gender (0 = female 1 = male); And, for what it's worth, I'm using the strong invariant model as a baseline for these tests (with latent factor loadings and latent means held equal across waves). Thank you! Dan 

