I'm currently doing a psychometric study that includes measures at the classroom level nested within schools. Across the 3 years of data, the multilevel CFAs all converged(!!!) and fit well. I'm now working on establishing invariance, following the Vandenberg and Lance (2000) review and a few other papers. My problem concerns the first step: testing whether the covariance matrices in the three years are different. It is not clear to me how to set up a model to test this; I have tried a few ideas that didn't seem to work. It seems like this should be very easy, so I'm hoping you can enlighten me. Thank you very much. Lee 


You use a model with no factors, just correlated "errors". That is, use y1 WITH y2, etc. Technically, this estimates the covariance matrix in the Theta parameter matrix. 
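
A minimal sketch of what such an input might look like, assuming three indicators y1-y3 and a grouping variable called year (both names are hypothetical, and this is a single-level illustration rather than Lee's actual two-level setup). A number in parentheses in the overall MODEL command holds that parameter equal across groups, so the labeled model imposes one common covariance matrix; running the same input without the numbers gives the unrestricted comparison model.

```mplus
VARIABLE:  NAMES = year y1 y2 y3;          ! hypothetical variable names
           USEVARIABLES = y1 y2 y3;
           GROUPING = year (1 = yr1 2 = yr2 3 = yr3);
MODEL:     ! no factors: variances and covariances only
           y1 (1);                         ! variances held equal across years
           y2 (2);
           y3 (3);
           y1 WITH y2 (4);                 ! covariances held equal across years
           y1 WITH y3 (5);
           y2 WITH y3 (6);
```

As far as I know, the means of observed variables that are not factor indicators stay free across groups by default, in which case the unlabeled model is saturated and the chi-square of the labeled model is itself the test of equal covariance matrices. Treat that as an assumption to check against the output.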

Anonymous posted on Sunday, July 27, 2003  9:38 pm



Is it possible to construct a multigroup, multilevel SEM in the current version of Mplus? 


Yes. 

Anonymous posted on Sunday, July 27, 2003  11:39 pm



Is this done via: ...ANALYSIS: TYPE=TWOLEVEL RANDOM MISSING MIXTURE; ESTIMATOR = ML; ... and specifying class membership via training data, or is there another way ? 


No, TYPE=TWOLEVEL RANDOM MISSING; with the GROUPING option. 
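
A rough sketch of such a setup, assuming a hypothetical data set with a cluster variable school, a grouping variable region, an outcome y, a within-level predictor x, and a between-level predictor w (none of these names come from the thread). The TYPE line is taken from the reply above; the random slope s is included only because RANDOM is requested.

```mplus
VARIABLE:  NAMES = school region y x w;    ! hypothetical variable names
           USEVARIABLES = y x w;
           CLUSTER = school;
           GROUPING = region (1 = g1 2 = g2);
           WITHIN = x;
           BETWEEN = w;
ANALYSIS:  TYPE = TWOLEVEL RANDOM MISSING;
MODEL:     %WITHIN%
           s | y ON x;                     ! random slope of y on x
           %BETWEEN%
           y s ON w;                       ! random intercept and slope on w
```

Note that, as later posts in this thread point out, TYPE = TWOLEVEL needs raw data, and each cluster has to fall entirely within one group.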

Anonymous posted on Friday, January 23, 2004  5:42 pm



Is there a substantive meaning to restricting the Level-2 endogenous variable covariances to be equal across groups in a multilevel, multigroup SEM? In my particular application, I find that several of the Level-1 and Level-2 covariances are nearly identical across groups, and I'm hoping to restrict them so as to regain df's. 

bmuthen posted on Saturday, January 24, 2004  3:27 pm



Although all parameters carry substantive meaning, endogenous covariances refer to covariances between residuals in the regression equations, and typically one would not have hypotheses about these unexplained parts of the variation in the dependent variables. So my own feeling is that gaining df's this way is not recommended. 

Anonymous posted on Sunday, January 25, 2004  4:19 pm



As a followup: is it the case that testing restricted Level2 covariances would ***NOT NECESSARILY*** be a test of whether both groups' Level2 s are missing the same Level2 predictor variables ? 

bmuthen posted on Sunday, January 25, 2004  7:44 pm



Correct. 

Anonymous posted on Monday, February 16, 2004  8:27 pm



I have a couple of questions regarding performing multigroup multilevel models in Mplus. 1. Is there a certain estimated reliability value below which you recommend not using multilevel modeling (for example, .5, .4, etc.)? 2. I seem to recall reading in Muthén 1997 that ML estimation may be complicated or untrustworthy in situations where the number of Level-2 units is small. Is estimation problematic when the number of Level-2 units is sizeable, but the number of cases per unit is small (<5, say)? 3. When Level-2 units contribute cases to members of all groups in a multigroup, multilevel model, does Mplus assume the error variances are homogeneous within the original Level-2 units, or does it treat each Level-2 unit / group combination as a separate population? Thank you. 

bmuthen posted on Monday, February 16, 2004  10:29 pm



1. No. Lower reliabilities simply give less power to detect relationships. 2. This does not necessarily lead to problems. The number of level 1 units required is related to the number of within-level parameters. In particular, you need several units per cluster in order to estimate within-level variance parameters. A Monte Carlo study in Mplus can tell you more specifics. 3. Within each group, the level 2 variances are assumed equal across level 2 units. 

hollybaker posted on Thursday, May 13, 2004  12:56 am



I was hoping someone could help me. I have a paper due in my social research class and I have to make up a survey, without actually doing it, using regression models etc. Does anyone know what I'm talking about? Remember the framework for your 3rd part: 1. Quantify your variables (really well) 2. Path analysis and direction of causality 3. Elaboration model and ideas for regression 4. Control issues 


I have a problem concerning doing multigroup multilevel analysis. I want to specify some paths (between level) specific only to boys or girls. But I get an error message. TITLE: PROOVIME; DATA: FILE IS nagudega.dat; VARIABLE: NAMES ARE ID GENDER SELF PEER VICT REJ EXTERN INTERN ADAPT VAEN SUMMA HOSTIL FRE ENE NEU; USEVARIABLES ARE PEER SELF HOSTIL ENE EXTERN INTERN; GROUPING = GENDER (1=MALE 2=FEMALE); CLUSTER IS ID; BETWEEN ARE PEER SELF EXTERN INTERN; WITHIN IS ENE; ANALYSIS: TYPE = TWOLEVEL; MODEL: %BETWEEN% HOSTIL ON EXTERN INTERN; %WITHIN% HOSTIL ON ENE; MODEL MALE: %BETWEEN% HOSTIL ON SELF; MODEL FEMALE: %BETWEEN% HOSTIL ON PEER; OUTPUT: SAMPSTAT STANDARDIZED RES MOD (0.00); *** WARNING in Model command Variable is uncorrelated with all other variables: PEER *** WARNING in Model command Variable is uncorrelated with all other variables: SELF *** WARNING in Model command All least one variable is uncorrelated with all other variables in the model. Check that this is what is intended. *** ERROR The following MODEL statements are ignored: * Statements in the BETWEEN level of Group MALE: HOSTIL ON SELF * Statements in the BETWEEN level of Group FEMALE: HOSTIL ON PEER Could you tell me what the problem is about! Thank you! 


In multiple group analysis, the overall MODEL command must contain the most general model. The group-specific MODEL commands show differences between the overall model and the model for each group. You would need to include all ON statements in the overall MODEL command and fix the ones you don't want in each group to zero. 
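
Applied to the input in the question, that advice might look roughly like the sketch below. It reuses the poster's variable and group names; it is an untested illustration of the idea, not a verified fix for that particular data set.

```mplus
MODEL:        %BETWEEN%
              HOSTIL ON EXTERN INTERN SELF PEER;   ! most general model: all paths
              %WITHIN%
              HOSTIL ON ENE;
MODEL MALE:   %BETWEEN%
              HOSTIL ON PEER@0;                    ! boys: no PEER path
MODEL FEMALE: %BETWEEN%
              HOSTIL ON SELF@0;                    ! girls: no SELF path
```

Because SELF and PEER now appear in the overall MODEL command, the group-specific statements only switch off the path that is not wanted in each group.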

Anonymous posted on Monday, June 07, 2004  6:28 pm



Assuming that I don't have the original data, how can I tell Mplus to read one or two covariance matrices for a TWOLEVEL model? 


You need raw data for TYPE = TWOLEVEL. 


Dr. Muthen, I wanna do a multiple group (male and female) analysis for a multilevel path analysis. What is wrong with the following command? I wanna compare the path coefficients between the male and female groups; I do not care whether or not the two groups' coefficients are equal, I just wanna make a comparison. Thanks, boliang. Following is the syntax: USEVARIABLES ARE bul1 newac ctad class sex; MISSING IS *; BETWEEN = ctad; Grouping = sex (0 = g1 1 = g2); CLUSTER IS class; ANALYSIS: TYPE = TWOLEVEL RANDOM Missing H1; MODEL: %WITHIN% s | newac ON bul1; bul1 ON newac@0; %BETWEEN% Bul1 newac s ON ctad; Model g2: %WITHIN% s | newac ON bul1; bul1 ON newac@0; %BETWEEN% Bul1 newac s ON ctad; *** ERROR in Model command Random effect variables can only be declared in the GENERAL model. 


You cannot define random effects in the group-specific MODEL commands. Use the syntax below, which will give each group the model you want. The overall MODEL command is the starting point for each group's model. MODEL: %WITHIN% s | newac ON bul1; bul1 ON newac@0; %BETWEEN% Bul1 newac s ON ctad; 


I am going to be attempting a multigroup multilevel path analysis for the first time, so I have read this discussion with interest. I have one specific question, however. Is it possible to use a grouping variable that is a level-2 variable instead of a level-1 variable? My data include 400+ children clustered in ~90 census block groups. I'd like to use neighborhood impoverishment (high vs. low/moderate) as the grouping variable. Is it okay to do this? Thanks. 


In multiple group analysis, a group should contain independent observations. So people from a cluster should be in the same group. With continuous outcomes, Mplus adjusts for this. With categorical outcomes, it does not. 

Anonymous posted on Monday, June 13, 2005  3:01 pm



I have multiple cohorts. Does it matter whether I specify using the GROUPING or COHORT option? I guess I should use the COHORT option, but what would be a scenario in which the GROUPING option would be used instead? 


The COHORT option is used in conjunction with the TIMEMEASURES option. Together, the two options create new variables (based on the dates in the COHORT option and the years specified in TIMEMEASURES) that are used in the analysis. This creates a pattern of missingness for observations of each cohort. The GROUPING option is used when you want to analyze models where observations with a certain group membership are to be kept together. 

Anonymous posted on Monday, June 13, 2005  4:52 pm



So, if I want to conduct and compare analysis on two cohorts, I should use the grouping option? 


Not if your data are strung out. Just treat it as regular data then. If your data are not strung out, then you use the GROUPING option. The setups for this are shown in the Day 2 short course handout. 

Anonymous posted on Tuesday, June 14, 2005  3:34 pm



I'm sorry. I see where to order the five day handout on the website but not the two day. Can you direct me? Thank you. 

bmuthen posted on Wednesday, June 15, 2005  1:45 pm



"Day 2" is the second day of the 5-day short course. 

Anonymous posted on Wednesday, June 22, 2005  3:23 pm



Hi, I am a novice, attempting to run a multilevel model with random slopes. I have students clustered in classrooms. I also have two groups (3rd & 4th graders combined because some of them had the same teacher although they were in different grades, and 6th graders). It’s a simple path model, but I have some incomplete data. The way I have started is with a multilevel multigroup analysis. Here are the input instructions: USEVARIABLES ARE catchma holdma tcsex w4mint tid w4tmaint grade; MISSING = all (99); WITHIN = tcsex w4mint; BETWEEN = catchma holdma ; CLUSTER = tid; GROUPING = grade (0=young 1=old); CENTERING = GRANDMEAN (ALL); ANALYSIS: TYPE = TWOLEVEL RANDOM missing; MODEL: %WITHIN% s1 | w4tmaint on tcsex*0; s2 | w4tmaint on w4mint*.5; %BETWEEN% w4tmaint s2 ON catchma holdma*0 ; When I ran this model, the estimated between covariance matrix for the younger group was not positive definite. Then I fixed each of the slope variances to 0, and it ran ok. I have a few questions: 1. Having fixed the slope variances to zero, it is unclear to me why I was still able to use between-level variables to predict variability in slopes (i.e., what variability was there to predict)? Is it that, by setting the slope variances to zero, I have only set the unaccounted-for (residual) variance to zero? If this is true (only residual variance was set to zero), then does setting the residual variance to zero affect the standard errors? 2. Second, my intraclass correlation for the 3rd & 4th graders is low (.04). Could this have caused the problem? Also, I’m wondering how it is best to write up results from this type of analysis, and it appears that there is a manuscript written by Muthén and others that has a similar analysis. Would it be possible to be sent a copy of the paper below? Muthén, B., Khoo, S.T. & Gustafsson, J.E. (1997). Multilevel latent variable modeling in multiple populations. (#74) Thank you so much for your help. 


You would need to send your output, data, and license number to support@statmodel.com for us to understand how to answer questions 1 and 2. Please request the paper from bmuthen@ucla.edu. 

Anonymous posted on Monday, June 27, 2005  1:38 pm



In an attempt to compare path coefficients of two groups I conducted a multigroup analysis. The sample sizes of the two groups are unequal (Na = 3.853 vs Nb = 440). The output says: NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED. Could this be due to the unequal sample sizes? And if so, what would be a wise strategy? 


It should not be due to that. See the user's guide for suggestions about nonconvergence. If that does not help, send the output, data, and your license number to support@statmodel.com. I would also get the model to converge for each group separately as a first step. 

Anonymous posted on Tuesday, June 28, 2005  6:31 pm



Hello, I am conducting a multigroup cfa with ordered binary, complex sample data. Each group is contained in a separate data file. I have set up the default invariance model (which constrains thresholds and loadings equal across groups), and am getting the error: "CLUSTER option cannot be used with multiple data files" Does this mean that I cannot take into account the complex nature of data and test invariance at the same time? Is there a way around this (e.g., merging the two data files and distinguishing groups with a grouping variable)? Thanks. 


No, it means that all of your data must be in one file with a grouping variable included. 

Aryn posted on Wednesday, August 03, 2005  6:46 pm



Is there a citation for why the two-group test is not appropriate for correlated samples (e.g., mothers and fathers from the same family)? 

bmuthen posted on Monday, August 08, 2005  7:22 pm



Introductory statistics textbooks treat uncorrelated and correlated t tests, so those would be good references. The fact that the samples are correlated should not be in dispute or need a reference. 


I am running a multigroup model with TYPE = COMPLEX. I have three groups: 0, 1, and 2. The model has latent variables. I have: X ON Y; M1 M2 ON Y; M1 M2 ON X; When I run the model I get the results I expect for the relationship X ON Y, in that the relationship is not significant in group 0 and it is very significant in group 2. To test for structural invariance I restrict the X ON Y path to be equal for the 0 and 2 model, such that I have: X ON Y (1); M1 M2 ON Y; M1 M2 ON X; MODEL 1: X ON Y; However, the chi-square diff. at 1 d.f. is not even remotely significant. (I used the scaling correction factor as it is MLR.) That seems a bit odd to me, as in one group X ON Y is not significant versus very significant in the other. Which leads me to believe that maybe I am modeling it incorrectly. Any thoughts? 


Looks like you set it up right. Sounds like you have a case where the group 0 estimate, a say, is not necessarily close to 0 but has a large enough SE to make it insignificant, while the group 2 estimate, b say, is a bit further away from zero and significantly so, but b - a is not that large relative to its SE. You can also try doing it via MODEL TEST, which uses the Wald test and therefore automatically takes nonnormality into account (see UG on how to do this). 


Sorry what does UG stand for? 


User's Guide. 


Whoops, sorry, that was kind of obvious, wasn't it. Just to clarify, in the User's Guide on page 488 with the sentence starting "In the MODEL CONSTRAINT command, the factor loading for y3 is constrained to be....". Should that be y4 as p4=2*p2? That being the case, if I want to constrain the structural path Y ON X to be equal across two groups do I use: MODEL: Y on X (P1); MODEL TEST; P1 = 1; (or should this be a different number) And when I have three groups and only want to constrain Y ON X for group 1 and 3 do I use: MODEL: Y on X (P1); MODEL TEST; P1 = 1; MODEL 2: Y ON X; !allowing this parameter to vary for group 2? Thanks, 


Yes, that should be y4. If you have three groups and you want to use the Wald test to test a model where a regression coefficient is free across three groups versus a model where the parameters are constrained to be equal across two groups, one way to specify it is the following: MODEL: y ON x; MODEL g1: y ON x (p1); MODEL g3: y ON x (p3); MODEL TEST: p1 = p3; Note that MODEL TEST has the most restrictive model. 


I am having some difficulty adding a multiple group analysis to a TWOLEVEL latent curve model that I am working on. I am very interested in how a between-level variable (whether the husband in a couple is an alcoholic) may or may not modify the effect of divorce on drinking behavior in both the husband and his wife. I don't think I can use the between-level variable (alcdad) to examine this within-level association (drink on divorce). When I try to use alcdad as a grouping variable instead, I get this error message: "ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE." Here is my syntax: TITLE: rsa model using alcdad as grouping var; DATA: FILE IS e:/marriage/noalign.RAW; TYPE is INDIVIDUAL; VARIABLE: NAMES ARE family rsex target revcontrol servpres famses conflic1 numdrink divdate tbdi1 tbdi2 tbdi3 tbdi4 tbdi5 numchild drink1 drink2 drink3 drink4 drink5 nodrink1 nodrink2 nodrink3 nodrink4 nodrink5 divorce2 divorce3 divorce4 divorce5 alcdad; usevariables are tbdi1 rsex divorce2 divorce3 divorce4 divorce5 drink1 drink2 drink3 drink4 drink5 alcdad conflic1 servpres famses divdate; categorical = divorce2 divorce3 divorce4 divorce5; within = rsex tbdi1 conflic1 servpres famses divdate; cluster = family; grouping is alcdad (0=no 1=yes); missing are all (999); ANALYSIS : type = TWOLEVEL random missing; model: %within% iw sw | drink1@0 drink2@1 drink3@2 drink4@3 drink5@4; iw @0; sw @0; sw on rsex tbdi1 conflic1 servpres famses divdate; iw on rsex tbdi1 conflic1 servpres famses divdate; divorce2 on drink1 (1); divorce3 on drink1 (1); divorce4 on drink1 (1); divorce5 on drink1 (1); drink2 on divorce2 (2); drink3 on divorce3 (2); drink4 on divorce4 (2); drink5 on divorce5 (2); %between% ib sb | drink1@0 drink2@1 drink3@2 drink4@3 drink5@4; ib @0; sb @0; OUTPUT: TECH1 ; Is it not possible to add a grouping variable to this analysis? 
If not, how can I examine modifying effects of between level variables on within level associations? 


You need to use the KNOWNCLASS option instead of the GROUPING option. See Example 7.21 in the Mplus User's Guide to see how this is used. In your case, you would have only the KNOWNCLASS variable in the CLASSES statement. If you have further questions about this, please send them along with your license number to support@statmodel.com. 


Hello, I am doing multilevel modeling and am looking at whether some associations are moderated by gender. I find that some of the variables do not have a between-level variance for girls. So, some of the between-level associations are set to be 0 for girls. Should I still compare the models where I constrain all the paths to be equal across the genders to the model where the paths are freely estimated (even when I know that for boys, there is a significant variance for some of the variables, while for girls, there is not)? Thank you 


Are you testing for measurement invariance? Or are you analyzing only observed variables? 


I am analyzing only observed variables. 


One more question: I am comparing nested models by doing chi-square difference tests. I am comparing the model where I constrain all the paths to be equal across genders and the model where I free one of the paths at a time. If the difference test is significant, am I correct when I conclude that I should retain the freely estimated path? And, when I am using the MLR estimator, can I just multiply the chi-square value by the scaling correction factor and then conduct the chi-square difference test? Thank you! 


To answer your question from Sunday, I would do across-group equality testing of paths even if some variances are not significant for one group. To answer your first question from today: yes. Your second question from today: see our discussion of chi-square diff testing on our web site. 
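
For readers without the web page at hand, the scaled difference computation for MLR chi-squares is, to the best of my knowledge, the following. Here T0, c0, and d0 are the MLR chi-square, scaling correction factor, and degrees of freedom of the nested (more restrictive) model, and T1, c1, and d1 are those of the comparison model; the notation is mine, so please check it against the web site before relying on it.

```latex
cd  = \frac{d_0 c_0 - d_1 c_1}{d_0 - d_1},
\qquad
TRd = \frac{T_0 c_0 - T_1 c_1}{cd}
```

TRd is then referred to a chi-square distribution with d0 - d1 degrees of freedom. So the products T0*c0 and T1*c1 do enter the calculation, but their difference still has to be divided by cd rather than being used directly.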

mehdi rezaei posted on Wednesday, November 22, 2006  9:18 am



Would you please refer me to some resources about multigroup analysis in LISREL? m_rezaei_05@yahoo.com 


You need to contact LISREL support or ask this question on SEMNET. This discussion forum is for Mplus. 


Is it possible to do a Wald test of whether or not a coefficient is lower than or equal to (LE) zero in one group? Or whether one coefficient is LE the same coefficient in another group? (I have unsuccessfully tried different combinations of MODEL CONSTRAINT/MODEL TEST.) Thanks. 


Logical operators cannot be used in MODEL CONSTRAINT or MODEL TEST. Only arithmetic operators can be used. Please send your input, data, output, and license number to support@statmodel.com so I can see what you are doing that is not working. 


Sorry Linda, I understood from the UG that it is possible to use logical operators (p.486). Thanks for your quick answer. 


On page 486 under MODEL CONSTRAINT, it says: "Linear and nonlinear constraints can be defined using the equal sign (=), the greater than sign (>), the less than sign (<), and all arithmetic operators and all functions that are available in the DEFINE command with the exception of the absolute value function." I don't see where it says that logical operators can be used. If you look at DEFINE starting on page 409, you will see that there is a distinction made between logical operators, arithmetic operators, and functions. 


Well, I tried with the < sign, by doing: Model Constraint: New (c); 0<exp(c)+p3-p2; Now, I hoped that by making another run without the constraint and taking 2*difference in loglikelihood, I could get a chi2 with 1 degree of freedom, testing p3<p2. Thanks. 


Hi Linda and Bengt, I'm investigating measurement invariance across two groups in a multilevel model with a complex sample (TYPE = COMPLEX TWOLEVEL). When I use the scaling correction factor with the likelihood difference test, invariance is rejected (the metric invariant model fits significantly worse than a configural invariance model), but when I use the scaling correction factor with the chi-square difference test, invariance holds (the metric invariant model does not fit significantly worse than the configural invariance model). My question is whether one of these difference tests is preferred in this modeling situation. Also, I'm curious as to why they might be different. Thanks! 


A quick correction. It is the intercept (aka strong) invariance model that is rejected when using the likelihood difference test, but it is not rejected when using the chi-square difference test. Thanks again, Katie 


I just looked at a COMPLEX TWOLEVEL output and I don't see that you get a chi-square fit test. Please send your output and license number to support@statmodel.com so I can see what you are referring to. 


I am doing invariance testing for a multiple-group latent growth model with a quadratic term estimated using MLR. I am interested in whether there are significant group differences in covariances, intercepts, and residual variances. In addition to my baseline model... cluster IS momid; grouping IS female (0 = male 1 = female); Analysis: type = complex missing H1; i s q | y1@0 y2@2 y3@4 y4@6 y5@8; i ON x1 x2 x3 x4; s ON x1 x2 x3 x4; q ON x1 x2 x3 x4; I have run 3 separate models individually constraining each of the residual variances using i(1); s(1); q(1); Am I correct in my attempts to constrain the residual variances to be equal across groups? How do I constrain the intercepts and the covariances? Thank you very much for your time 


The way you have the equalities specified, you are holding the residual variances of i, s, and q both equal to each other and equal across groups. Instead say, i(1); s(2); q(3); Intercepts are specified using bracket statements. Covariances are specified using the WITH option. [i] (4); i WITH s (5); See the discussion of equalities in multiple group analysis in Chapter 13 for more information. 
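
Put together with the growth model from the question, the suggested equalities might look roughly like this. The sketch shows all of them in one MODEL command only for compactness; in practice each constraint would presumably be added in its own run, as the poster describes.

```mplus
MODEL:  i s q | y1@0 y2@2 y3@4 y4@6 y5@8;
        i ON x1 x2 x3 x4;
        s ON x1 x2 x3 x4;
        q ON x1 x2 x3 x4;
        i (1);           ! residual variance of i equal across groups
        s (2);           ! residual variance of s equal across groups
        q (3);           ! residual variance of q equal across groups
        [i] (4);         ! intercept of i equal across groups
        i WITH s (5);    ! residual covariance of i and s equal across groups
```

Using a different number for each parameter is the point of the reply: the same number on two statements would also force those two parameters to equal each other.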


Thank you very much for your reply. A follow-up question, if you don't mind. I am running a series of constrained models so that I may determine whether or not males and females differ significantly on their slopes, variances, effects of covariates, etc. Three of the 21 constrained models that I have run do not converge. First one: i s q | y1@0 y2@2 y3@4 y4@6 y5@8; i ON x1 x2 x3 x4; s ON x1 x2 x3 x4; q ON x1 x2 x3 x4; s ON x1 (1); Second one: i s q | y1@0 y2@2 y3@4 y4@6 y5@8; i ON x1 x2 x3 x4; s ON x1 x2 x3 x4; q ON x1 x2 x3 x4; s WITH i (1); Third one: i s q | y1@0 y2@2 y3@4 y4@6 y5@8; i ON x1 x2 x3 x4; s ON x1 x2 x3 x4; q ON x1 x2 x3 x4; [i] (1); All of the other constrained models seem to run with no problems. Any idea what could be wrong? Thank you very much. 


I can't say from the information provided. Please send the inputs, data, outputs, and your license number to support@statmodel.com. 


It has been sent. Thank you very much for your time. 


I have a question concerning conducting multigroup modeling (e.g.,examining associations by gender) vs. conducting analyses separately for each group. Have I understood correctly that in the case of multigroup modeling all observations are used to estimate the effects? And, if I run the analyses separately for each group, the number of observations is smaller. Thank you! 


If you run a model for each group separately or the groups together, you will get identical results as long as there are no parameters constrained to be equal across the groups. 


I just noticed in the multigroup example on the website that it uses TYPE = MGROUP; I can not find this in the user's manual. What is the difference if you are running a multigroup analysis and you don't use the TYPE = MGROUP command but do indicate that there are groups through Grouping is...say for example if you are running a TYPE = GENERAL? Thanks, 


This is an old option. It is not necessary. Can you email me at support@statmodel.com the link where you found this? 


When doing a multiple group analysis, how does Mplus handle the sampling weights calibrated based on the total sample? 


No. It does so within each group. 


Dr. Muthen, I am a novice, and I am trying to conduct a multiple group analysis within a multilevel latent covariate approach. I am trying to test the invariance of a model on students nested within schools (unit) using the grouping variable of gender. However I am getting the following error message: *** ERROR Cluster ID cannot appear in more than one group. Does this mean that I cannot use gender as a grouping variable because student gender is a level 1 variable that is mixed in each school? 


You cannot use gender as a grouping variable for this reason. You can, however, use gender as KNOWNCLASS in a TYPE=MIXTURE run. 
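
A minimal sketch of the KNOWNCLASS route, assuming hypothetical variables school (cluster), gender (coded 0/1), an outcome y, and a within-level predictor x. The class variable name cg and the labels b1 and b2 are likewise made up for illustration, and the input has not been run.

```mplus
VARIABLE:  NAMES = school gender y x;      ! hypothetical variable names
           USEVARIABLES = y x;
           CLUSTER = school;
           WITHIN = x;
           CLASSES = cg (2);
           KNOWNCLASS = cg (gender = 0 gender = 1);
ANALYSIS:  TYPE = TWOLEVEL MIXTURE;
MODEL:     %WITHIN%
           %OVERALL%
           y ON x;
           %cg#1%
           y ON x (b1);                    ! slope for gender = 0
           %cg#2%
           y ON x (b2);                    ! slope for gender = 1
           %BETWEEN%
           %OVERALL%
           y;                              ! between-level variance of y
```

Mentioning y ON x in each class-specific part frees the slope across the known classes, which plays the role that group-specific estimates would have played with GROUPING; students of both genders can then stay in the same school cluster.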

gibbon lab posted on Thursday, January 05, 2012  3:01 pm



Hi Professor Muthen, In one of your old posts(above), you mentioned that Mplus can compare two groups in a three group analysis using the following code: MODEL: y ON x; MODEL g1: y ON x (p1); MODEL g3: y ON x (p3); MODEL TEST: p1 = p3; I tried to use this code for WLSMV, but it did not work. Is there another way to perform this kind of comparison for WLSMV? Thanks a lot. 

Nidhi Kohli posted on Thursday, January 05, 2012  3:10 pm



I am trying to compare the mean of the slope growth factor across the two groups. The model that I have run is a multilevel GMM with the KNOWNCLASS option. The dataset is repeated-measures, clustered data. Here is the Mplus code: MODEL: %WITHIN% %OVERALL% iw sw | BP1@0 BP2@1 BP3@2 BP4@3 BP5@4; iw; sw; iw sw ON ...; %CD#1% iw; sw; iw WITH sw; %CD#2% iw; sw; iw WITH sw; %BETWEEN% %OVERALL% ib sb | BP1@0 BP2@1 BP3@2 BP4@3 BP5@4; [ib sb]; ib; sb@0; ib WITH sb@0; BP1-BP5@0; %CD#1% [ib@0 sb]; %CD#2% [ib sb]; The mean of the slope growth factor in each group is statistically significant. My question is how can I test if these two means significantly differ from each other? Thanks 


Gibbon: Please send your output and license number to support@statmodel.com so I can see what "did not work" means. 


Nidhi: You can use MODEL TEST or loglikelihood difference testing. 

Nidhi Kohli posted on Thursday, January 05, 2012  6:37 pm



How can I use loglikelihood difference test to see which class slope growth factor mean is higher / lower when compared to the other class in the model? The Mplus output contains *only* one loglikelihood. Thanks. 


Difference testing requires running two analyses: one with the parameters constrained to be equal and one where they are not. Perhaps using MODEL TEST is a better idea. 

Nidhi Kohli posted on Thursday, January 05, 2012  7:49 pm



Thanks, Linda. One last question. Since I have already run the Multilevel Growth Mixture Model with Known class option, how can I use MODEL TEST option? Do I have to rerun the model with MODEL TEST option? or, I can write a small, new code and test the slope differences using MODEL TEST option? 


See MODEL TEST in the user's guide. It is in addition to the MODEL command. In the MODEL command, you label the parameters you want to test and use the labels in MODEL TEST. 
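
For the model posted on January 05, the labeling might look roughly like the fragment below. Only the class-specific between-level statements are shown; the rest of the MODEL command would stay as posted. The labels m1 and m2 are hypothetical, and the slope mean gets its own statement so that the label attaches to that parameter only.

```mplus
           ! ... %WITHIN% part and %BETWEEN% %OVERALL% part as in the post ...
           %CD#1%
           [ib@0];
           [sb] (m1);        ! slope factor mean, class 1
           %CD#2%
           [ib];
           [sb] (m2);        ! slope factor mean, class 2
MODEL TEST:
           m1 = m2;          ! Wald test of equal slope means
```

This means the model is rerun once with the labels and the MODEL TEST command added; no separate constrained run is needed for the Wald test.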

Nidhi Kohli posted on Wednesday, January 18, 2012  3:13 pm



Linda, I ran the model, the one I described above in my question posted on January 05, with MODEL TEST option. I, however, cannot find the results from this test in the Mplus output. Where can I find in the output whether the overall mean of the slope factor in class#1 is statistically different from the overall mean of the slope in class#2? Thanks so much. 


It is under the following heading with the fit statistics: Wald Test of Parameter Constraints 

Nidhi Kohli posted on Sunday, January 22, 2012  10:32 pm



Thank you. I have one last question on this thread. The main purpose of my analysis is to compare the overall mean of the slope growth factor of one particular group / class (i.e., the reference group) with the overall mean of the slope growth factor of 5 other groups in the dataset, respectively. In total, I have 5 pairs of multilevel, multigroup comparisons. I have successfully run all the 5 pairs of comparisons, however, I noticed that the coefficients for the reference group slightly change from one pair of comparison to the other. Can you please help me understand why the coefficients for the reference group slightly vary from one comparison to the other? Thank you. 


Please send output that shows this and your license number to support@statmodel.com. 

Nidhi Kohli posted on Wednesday, February 22, 2012  4:32 pm



My question is in the context of multilevel growth mixture model with known class membership. I was wondering if you can tell me why does Mplus, by default, fixes the mean of the intercept growth factor to zero in one class and allows it to be free in the other class? Secondly, since the mean of the intercept growth factor is fixed to zero in one class, how can one then test if the two means of the intercept growth factors in class1 and class2, respectively, are different? Thank you. 


It sounds like you have a categorical outcome. In this case, the growth model parametrization is to hold thresholds equal and fix the intercept mean to zero in one class. The test of mean differences would be means zero in all classes versus mean zero in one class. 

Nidhi Kohli posted on Wednesday, February 22, 2012  7:29 pm



Thank you. Just to make sure that I understand you correctly. To test the mean differences I should run the model in two ways. In the first approach, I should fix the value of means of the intercept growth factors in class1 and class2, respectively, to zero. In the second approach, I should run the model where the intercept mean in class1 is fixed to zero, and in class2 it is free to be estimated. Right? Should I then DIFFTEST to see if the two nested models are different? Thank you. 


This sounds correct. You should use DIFFTEST if you are using WLSMV. 

Nidhi Kohli posted on Wednesday, February 22, 2012  8:16 pm



I plan to use MLR. What would be the ideal Mplus option for testing mean differences? Thank you. 


See the website where difference testing with MLR is described under HowTo. 

Sergio Ruiz posted on Saturday, February 25, 2012  6:16 pm



Hello! I am testing a multigroup regression model where I have latent and observed predictors and interaction terms including latent x observed variables. I centered all the observed predictors around the mean, but I am not sure what to do with the latent means for both groups. Should I center them? If yes, how is it done in MG analysis? Can you please give me some advice? Thanks! 


It is not necessary to center the latent variables. 

Sergio Ruiz posted on Saturday, February 25, 2012  7:14 pm



Thank you for the fast answer. Only to be sure, could you confirm me that is not a problem that the mean of the latent variable in the first group is zero and in the second is free to be estimated? Thanks again! 


Yes, as far as I can see, having a nonzero factor mean in one of the two groups is ok. You just have to make the interpretation accordingly. So if you have the interaction expressed as y = a + b1*f + (b2 + b3*f)*x + e, for a factor f, you want to evaluate the moderator term (b2 + b3*f)*x at 1 SD above/below the mean of f, which isn't zero. 
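
As an illustration of the point above, here is a small Python sketch that evaluates the slope of x, (b2 + b3*f), at the factor mean and at 1 SD below and above it. It assumes you take b2, b3, the factor mean, and the factor variance for a given group from the output; all names and numbers here are hypothetical.

```python
import math

def slope_of_x(b2, b3, f_value):
    """Slope of x in y = a + b1*f + (b2 + b3*f)*x + e at a given value of f."""
    return b2 + b3 * f_value

def simple_slopes(b2, b3, f_mean, f_var):
    """Slope of x at 1 SD below the factor mean, at the mean, and 1 SD above."""
    sd = math.sqrt(f_var)
    return {
        "low": slope_of_x(b2, b3, f_mean - sd),
        "mean": slope_of_x(b2, b3, f_mean),
        "high": slope_of_x(b2, b3, f_mean + sd),
    }

# Hypothetical estimates for the group whose factor mean is free (not zero).
for label, value in simple_slopes(b2=0.40, b3=0.25, f_mean=0.50, f_var=0.64).items():
    print(f"{label:>4}: {value:.3f}")
```

In the group where the factor mean is fixed at zero, the same function applies with f_mean = 0, so the "mean" slope reduces to b2.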

Nidhi Kohli posted on Monday, February 27, 2012  3:08 pm



I have one last question on my post dated February 22, 2012. How can I compute the 95% confidence interval around the mean of the slope growth factor in a multilevel GMM with known class membership? Generally, the equation for computing a confidence interval is: sample statistic +/- (critical value x standard error of the statistic). I know the sample statistic (i.e., the mean estimate of the slope growth factor) and I also know its S.E. How can I find the critical value? In other words, what is the distribution of the sample statistic in this case? Thank you. 


You can use the CINTERVAL option to obtain confidence intervals for your parameter estimates. See the user's guide for further information. For symmetric confidence intervals, the critical value is taken from a z-table. 

Nidhi Kohli posted on Tuesday, February 28, 2012  4:26 pm



Since the slope growth factor is a continuous variable that is assumed to be normally distributed, I can use the critical value taken from a z-table to create a confidence interval around the mean of the slope growth factor, right? Thanks. 


You can do that, but the reason you can is not that the slope growth factor is normally distributed, but that the ML estimate of the slope growth factor mean is asymptotically normal. 
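
A minimal Python sketch of the symmetric interval discussed above, relying on the asymptotic normality of the ML estimate. The estimate and standard error in the example are hypothetical; in practice both would be read from the model results.

```python
from statistics import NormalDist

def symmetric_ci(estimate, se, level=0.95):
    """Symmetric confidence interval: estimate +/- z * SE.

    The critical value z comes from the standard normal distribution
    (about 1.96 for a 95% interval).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se

# Hypothetical slope growth factor mean and its standard error.
lower, upper = symmetric_ci(estimate=1.20, se=0.30)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```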


Dear Drs. Muthen: I am conducting a multilevel multigroup analysis in MPLUS with observed variables and have tried to find the answers to the following questions without success: 1. I have adolescents nested in schools (cluster) and am grouping by ethnicity. Hence, I get the error message “Cluster ID cannot appear in more than one group.” A message board related to MPLUS suggests that, to remedy this problem, I should define new cluster values that are unique for each group. So, for example, in school 101, I would recode the data so that Asian students were in school 1011, Black students in school 1012, Latino students in school 1013, White students in school 1014, etc. Is this my best option? It is reported that there are no unintended consequences of this method. Is this correct? If so, how do I report that I did this in a publication? 2. A number of researchers have reported that it is not possible to obtain a true R2 value for a multilevel model in Mplus. Is this accurate? Would you recommend that I calculate pseudo-R2 values if I would like to report the variance explained by predictors? If so, which method would you recommend? Thank you for your help. 


1. It sounds like you are using a within-level variable as a grouping variable. When you do this, the groups are not independent because members of the same cluster appear in different groups. This violates the assumption that the groups contain independent observations. 2. Mplus computes R-square for multilevel models. For continuous outcomes, it is a regular R-square. For categorical outcomes, it is a pseudo R-square. I wonder where this misinformation comes from. 


Thank you for clarifying. I have one more question. If I instead run a multigroup model and cluster by school, would this procedure adequately reduce the Type I error that I would risk by not conducting a multilevel analysis? The ICC = .02, but there are citations explaining that even very low ICCs risk a Type I error (e.g., Barcikowski 1981). I.e., GROUPING IS ETHNIC (1=asian 2=black 3=latino 4=white); CLUSTER = school; Would this be the procedure you would recommend? 


I am working on a 3-level latent growth model that is also a multiple group analysis across 5 cohorts. Based on this thread and Example 7.21 in the U.G. I have the code below, which does not run because "THERE IS NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT INPUT FILE...." I've also tried including INTEGRATION=MONTECARLO, with the same error. Analysis: Type = TWOLEVEL MIXTURE; ESTIMATOR = MUML; MODEL: %WITHIN% %OVERALL% iw BY wiscraw-wiscraw3@1; sw BY wiscraw@1 wiscraw2@0 wiscraw3@1; iw ON ...; sw ON ...; iw sw; %cohorts#1% iw sw; iw WITH sw; %cohorts#2% ... %cohorts#3% ... %cohorts#4% ... %cohorts#5% ... %BETWEEN% %OVERALL% ib BY wiscraw-wiscraw3@1; sb BY wiscraw@1 wiscraw2@0 wiscraw3@1; ib ON conaff90; sb ON conaff90; [wiscraw-wiscraw3@0 ib sb]; wiscraw-wiscraw3@1; ib@1; sb@1; %cohorts#1% [ib sb]; %cohorts#2% ... %cohorts#3% ... %cohorts#4% ... %cohorts#5% ... 


Please send the output and your license number to support@statmodel.com. 

Weber Seaman posted on Tuesday, September 11, 2012  1:00 pm



Dear Dr. Muthen: I understand that in order to conduct multiple-group multilevel modeling, the grouping variable has to be a level-2 (or higher) variable. However, if I want to do the multigroup analysis based on RACE, a level-1 variable, is there any way to bypass the restriction? Some people suggest reassigning cluster membership. For example, in school 101, Asian students are reassigned to cluster ID 1011, Black students to 1012, Latino students to 1013, White students to 1014, etc. Is this a safe way to do so? Dr. Linda Muthen suggested another alternative, using KNOWNCLASS. How different are these two approaches? 


We will shortly have a web note posted that describes how to do this. Hopefully within a week. 

Weber Seaman posted on Tuesday, September 11, 2012  1:24 pm



Dr. Muthen, Thank you. Please post a link to the web note here when it is online. I think it will benefit many readers with the same question as me. Thank you again. 

Sarah posted on Wednesday, September 04, 2013  10:58 am



Hi, I was wondering if it is possible to investigate group differences in path coefficients in a SEM model (as opposed to the model as a whole). I have tested a model and want to see if certain paths are significantly different between groups. For example: Group #1: F1 to F2 = .45 Group #2: F1 to F2 = .30 Group #3: F1 to F2 = .10 Is there a way to know if .45, .30, and .10 are significantly different? Thank you for your help! 


You can do this using chi-square difference testing or MODEL TEST. 
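
MODEL TEST reports a Wald test of the stated restrictions. To show the idea for a single pair of groups, here is a hand-computed Python sketch of a Wald z-test for the difference between two path coefficients. It assumes the two estimates are independent (separate groups, no cross-group equality constraints) unless a covariance is supplied; the standard errors in the example are hypothetical, since the post gives only the coefficients.

```python
import math
from statistics import NormalDist

def wald_z_difference(b1, se1, b2, se2, cov12=0.0):
    """Wald z-test of H0: b1 == b2.

    cov12 is the sampling covariance between the two estimates; it is
    zero when the estimates come from independent groups.
    Returns (z, two_sided_p).
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    z = (b1 - b2) / math.sqrt(var_diff)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Coefficients from the post (groups 1 and 3) with hypothetical standard errors.
z, p = wald_z_difference(b1=0.45, se1=0.08, b2=0.10, se2=0.09)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Testing all three groups jointly is a 2-df test, which is what MODEL TEST or a chi-square difference test between constrained and unconstrained models provides; the pairwise version here is only a supplement.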

Nate Breznau posted on Wednesday, January 08, 2014  8:17 pm



Long time lurker, first time poster! I am working with a moderation analysis. I have a latent variable PFB measured from 1 individual, 1 regional and 1 country-level variable. PFB predicts Y (an individual attitude). ID is a categorical moderator (3 identities) also measured at the individual level. Question: When I run this as a multilevel linear model (i.e., in Stata) it includes dummies for region and country plus accounts for the clustered standard errors at each level. This gives me effect sizes of the moderation of PFB (interacted with ID) that range from about .03 to .06 standardized at the individual level. When I run it in Mplus I get standardized coefficients in the three groups that range from .27 to .30. As much as I love these massive effect sizes I am instinctively thinking they are wrong because I don't know how to account for the nested data structure. The best I have is the following for syntax. [..abridged...] CLUSTER = country; GROUPING = id (1=A 3=B 5=C); Analysis: Type = COMPLEX ; Model: Y BY w1 w2 w3 w4 w5 w6 w7; pfb BY pfb1 pfb2 pfb3; Y ON pfb; How can I deal with the fact that much of the unobserved variance in Y occurs at the regional or country level? Or that pfb2 is regional and pfb3 is country level? (N=22k, regional N=112, country N=14) I am humbly grateful for your work. 


When you say that you do multilevel analysis with "dummies for region and country plus accounts for the clustered standard errors at each level.", I wonder what the levels are for the multilevel analysis. It seems that you either use region as level 2 (and country as level 3) and do a multilevel analysis, or have them as dummies and do a single-level analysis. Using them as dummies affects the results of the regression of Y ON pfb and could explain the discrepancy you see. One approach would be to use country as a grouping variable, region as the cluster variable, and id and its interaction effects as level-1 dummies (14 countries is too few for taking a random-mode approach with respect to country). The question then is what you do with your pfb measurement model. 


Thank you so much for your reply. I used the term 'dummies' in my post a bit loosely, based on old-fashioned approaches. I use a multilevel approach with individuals, regions and countries (xtmixed Stata operations, in case you know them). If I use country as a grouping variable I think Mplus will then analyze the effect separately for each group/country. Or can I get around this by fixing the effects of independent variables to be identical for all groups? I ask because I still want the variance in Y (the DV) to be partitioned out of the variation I seek to explain at the individual level (like with the multilevel model where the unexplained variation at the country and region level is kept out of the estimation of the individual level variance of Y). I am unsure about the pfb measurement model. This is similar to the question above, but if I use the grouping of country and clustering of region will Mplus account for the nested structure of the latent variable pfb (measured at each of the 3 levels)? Finally, if I use no second- or third-level predictor variables (other than for pfb) and I include dummies for region and/or country in Mplus, instead of using clustering, am I missing something important that will go wrong that a multilevel model would otherwise correct for? I am running Mplus 6 by the way in case it matters. Again, I am grateful. 


With country as a grouping variable you can use equality constraints across countries for any parameter in the model. Am I understanding you correctly that the pfb factor is measured by the 3 indicators pfb1, pfb2, and pfb3, where the 3 are the same thing but on the individual, region, and country levels? I don't think you should use dummies for 112 regions. 

Nate Breznau posted on Wednesday, January 15, 2014  8:03 pm



It is unclear to me how grouping by country and then using equality constraints will help me accurately remove unobserved heterogeneity at the country level that can account for country-level variation in the DV (there is a lot of it, roughly 30%). I don't have enough countries (14) to use more than one parameter at that level, and this is reserved for pfb3. pfb1 is an individual-level subjective evaluation of the number of foreign-born persons in the country, pfb2 is a census measure of regional-level percent foreign-born, and pfb3 is census country-level percent foreign-born. Yes... Mplus does not like 112 dummies. I can't get it to converge. Thanks again. 


You have a latent variable Y measured by several indicators, so the mean of Y and the measurement parameters for the Y indicators can be different in the 14 groups, beyond the regional differences. The multiple-group approach takes care of the heterogeneity in a fixed-mode fashion whereas region does it in a random-mode fashion (assuming you use Cluster=region). For fixed- versus random-mode modeling of measurement models, see also Muthén and Asparouhov (2013), New methods for the study of measurement invariance with many groups. Mplus scripts are available on our website. As for the pfb factor, you can define it on the regional level using the region part of the pfb1 variable and the pfb2 region-level variable. I don't know how to get pfb3 in there. 

Tom Aquin posted on Thursday, January 16, 2014  5:12 am



Dear all, I have just a quick clarification question regarding a multigroup multilevel path analysis. As far as I understand, when doing multigroup path analysis, there is no measurement invariance analysis since there are only observed variables. But what about a multigroup multilevel path analysis, where you have, for example, an observed variable specified on both levels? The observed variable at the within level has a latent counterpart on the between level. Are there any necessary steps of measurement invariance analysis? If so, which? Do you know any references applying a multigroup multilevel path analysis? Thank you in advance! 


There are no measurement invariance issues related to the situations you describe. Regarding papers on multiple group multilevel models, try searching for papers by Preacher and Zyphur. 

Nate Breznau posted on Thursday, February 20, 2014  6:23 pm



I am following up on my two previous posts, but have now abandoned the 3-level factor "pfb", and instead want to estimate each component of this factor independently; one effect at each of 3 levels. So I have pfb_c at the country level; pfb_ctx at the regional level; and pfb_s at the individual level. pfb_c and pfb_ctx are manifest observations of the percentage of foreign-born persons in countries and in regions within countries (14 countries; 113 regions). This is like a standard within and between setup, except that the within effects are at level 2 within each level-3 group, and the between effects are at level 3. I have no idea how to program this. 1st Question: Can Mplus handle this three-level model? I have version 6. 2nd Question: Can Mplus do this three-level model for a 3-group multigroup model? Any references to literature and/or code are sought. Thank you for making the applied work of a (not-so-mathematical) sociologist possible. 


Three-level analysis was introduced in Version 7. Examples can be found in Chapter 9 of the user's guide. 


Dear all, I want to check for mean differences between groups while taking into account the hierarchical structure of my data. I am only interested in whether there are mean differences in the variable N3 between each of the 6 groups on the between level. It should be an intercept-only model without any predictors. With the following input, I receive the means and standard errors for each group, but how can I compare the means? USEVARIABLES ARE N3 P; MISSING ARE ALL (444, 999); CLUSTER = Code; GROUPING = P (0 = P0, 1 = P1, 2 = P2, 3 = P3, 4 = P4, 5 = P5); ANALYSIS: Type = TWOLEVEL; OUTPUT: sampstat stdyx; Thank you very much in advance! Sandra 


Use Model Test with parameter labels given in the Model command. 


Dear Prof. Muthen, thank you very much for your fast reply! If I understood this correctly, I can compare the means between Group P0 and P5 by using the following input: USEVARIABLES ARE N3 P; MISSING ARE ALL (444, 999); CLUSTER = Code; GROUPING = P (0 = P0, 1 = P1, 2 = P2, 3 = P3, 4 = P4, 5 = P5); ANALYSIS: Type = TWOLEVEL; Model P0: N3 (a); Model P5: N3 (b); Model Test: a=b; Am I on the right track? 


Sorry for the repost, I have just noticed one mistake: there should be [N3] instead of N3 in the Model command. 


Right. Or, 0 = a-b; 

sailor cai posted on Tuesday, February 10, 2015  2:09 am



Dear Drs Muthen, I have a question on unequal sample sizes for running multiple group SEM using Mplus. I want to compare a structural model with three exogenous variables and one endogenous variable across two groups (say, n1=14,000; n2=40,000). Do I need to weight the item parameters after modeling? If so, how? It would be appreciated if you could recommend some literature! Many thanks in advance! 


I think you want to use weighting only if you have oversampled in one of the groups and you want to estimate a quantity like the overall mean. In multigroup settings the situation is typically different in that you either estimate group-specific parameters or a parameter that is the same across groups, and in neither case do you want to weight. 

Sonja Nonte posted on Thursday, February 12, 2015  10:12 am



Dear all, I conducted a multigroup model for complex survey data (three independent cohorts) to check for measurement invariance and to identify "true" differences in reading motivation (for the latent mean scores). Strong invariance holds, so I decided to analyze the effect of different variables (level 1 and level 2) on reading motivation in a multilevel random slope model. I am using the grouping variable (to hold factor loadings and intercepts equal across groups), too. Now I have two questions: 1. How can I interpret the unstandardized regression weights in the output for the different cohorts? Are the betas the effects on reading motivation for each cohort or the differences compared to cohort 1 (reference group)? 2. I don't get any latent mean for the cohorts and therefore no indicator of mean differences in reading motivation in the random slope model. Can I interpret the intercept as the difference? The intercept for cohort 1 is zero, so do cohorts 2 and 3 differ from cohort 1 by the amount that is given in the intercept of the latent factor for reading motivation (within)? Am I right to conclude that, if the intercepts for cohorts 2 and 3 are not significant, the cohorts don't differ from cohort 1 after controlling for influencing factors? (In the multigroup model they differ significantly.) Sorry for my questions. 


1. The betas are for the different cohorts, not differences. 2. Send output to support along with license number, repeating your Q2. 


Dear Dr. Muthen, I have a problem running the following syntax: Mplus VERSION 7.1 TITLE: invariance 3 level DATA: FILE IS "C:\Users\Vivo\Desktop\20 11 57\mplus\TEST\test50258.dat"; NGROUPS = 2; VARIABLE: NAMES ARE LEVEL1 RECHAR QUAL LEVEL2 PROADV EXPADV KNOWRE LEVEL3 UNISIZE GAGE SYSMAI; USEVARIABLES ARE RECHAR QUAL LEVEL2 PROADV EXPADV KNOWRE LEVEL3 SYSMAI; GROUPING IS GAGE (1 = GAGE1 2 = GAGE2); CLUSTER IS LEVEL3 LEVEL2; WITHIN = RECHAR; BETWEEN ARE (LEVEL2) PROADV EXPADV KNOWRE (LEVEL3) SYSMAI; ANALYSIS: TYPE IS THREELEVEL MGROUP; ITERATIONS = 1000; CONVERGENCE = 0.00001; Model: %WITHIN% QUAL ON RECHAR; %BETWEEN LEVEL2% QUAL ON PROADV EXPADV KNOWRE; %BETWEEN LEVEL3% QUAL ON SYSMAI; Model GAGE2: %WITHIN% QUAL ON RECHAR; %BETWEEN LEVEL2% QUAL ON PROADV EXPADV KNOWRE; %BETWEEN LEVEL3% QUAL ON SYSMAI; OUTPUT: SAMPSTAT RESIDUAL STANDARDIZED MODINDICES(0); SAVEDATA: RESULT IS DESKTOP; After I run the command, it does not report anything. Do you have any suggestions for solving this problem? Thank you very much. 


Please send the input, data, and your license number to support@statmodel.com. 

Steven John posted on Thursday, April 02, 2015  10:06 am



Dear Muthéns, I ran two identical 2-level SEM models for two different samples. I obtained different coefficients for the relationship of interest. I would like to examine whether the estimates differ statistically. Is it correct to compute two 2-level multiple group models, where 1) all parameters are constrained and 2) constraints are relaxed only for my relationship of interest, and thereafter calculate the chi-square difference test, as proposed by Satorra & Bentler? All the best, Stan 


Yes. 
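
For illustration, here is a minimal Python sketch of the Satorra-Bentler scaled chi-square difference computed from the chi-square values, degrees of freedom, and scaling correction factors of the two models (the version based on reported chi-squares rather than loglikelihoods). The function name and the numbers in the example are hypothetical.

```python
def scaled_chi2_difftest(T0, c0, d0, T1, c1, d1):
    """Satorra-Bentler scaled chi-square difference test.

    T0, c0, d0: scaled chi-square, scaling correction factor, and degrees
                of freedom of the nested model (all parameters constrained).
    T1, c1, d1: the same for the comparison model (constraint relaxed on
                the relationship of interest).
    Returns (TRd, df_difference).
    """
    if d0 == d1:
        raise ValueError("models must differ in degrees of freedom")
    # Difference-test scaling correction
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    # Multiplying each scaled chi-square by its correction factor recovers
    # the unscaled statistic before the difference is rescaled.
    trd = (T0 * c0 - T1 * c1) / cd
    return trd, d0 - d1

# Hypothetical output values for the constrained and relaxed models.
trd, df = scaled_chi2_difftest(T0=120.0, c0=1.10, d0=50,
                               T1=110.0, c1=1.08, d1=49)
print(f"TRd = {trd:.3f} on {df} df")
```

The resulting TRd is referred to a chi-square distribution with the df difference. In small samples the computed value can occasionally come out negative, in which case it should not be interpreted.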


1. Is it possible to fit a multiple group three-level CFA model with ordered categorical variables in Mplus 7? 2. I have seen in the UG that with the Bayes option, multiple group analysis is done with the MIXTURE and KNOWNCLASS options, since the GROUPING option is not available with ESTIMATOR=BAYES. How would one specify such a model? 


This is not currently possible in Mplus. 


Many thanks for your very prompt response, Linda. 


I have encountered the following error when conducting a 2-2-1 multilevel multigroup mediation analysis: *** FATAL ERROR CLASSSPECIFIC BETWEEN VARIABLE PROBLEM. ***deleted var names to save space for post*** idvariable = id; CLUSTER IS tid; missing=all(999); knownclass is c (dblack=0 dblack=1); classes=c(2); ANALYSIS: TYPE IS TWOLEVEL MIXTURE; MODEL: %WITHIN% %OVERALL% math2 ON math1 dmale dmomed age1dfree dlep dspeced dlowbw; %BETWEEN% %OVERALL% EXP_SB by c17 c18; EXP_SB ON dtx (a); math2 ON EXP_SB(d); math2 ON dtx; math2 ON comp smas task; math2 ON t_hw1 t_drill1 t_disc1 t_act1 t_csize1 t_mast1 t_nusys1 t_algeb1 t_geo1 t_exp1 t_exppk1 t_pd1 t_time1 t_boys1; %c#1% math2 ON EXP_SB (d); %c#2% math2 ON EXP_SB (f); MODEL CONSTRAINT: ! section for computing indirect effect NEW(ad); ! name the indirect effect NOT BLACK ad = a*d; ! compute the indirect effect NOT NEW(af); ! name the indirect effect BLACK af = a*f; ! compute the indirect effect BLACK 


Please send the output and your license number to support@statmodel.com. 


I have received the following error when conducting a multilevel mixture model: *** ERROR in MODEL command Unrestricted x-variables in TWOLEVEL MIXTURE analysis must be specified as either a WITHIN or BETWEEN variable. The following variable cannot exist on both levels: F_1 *** ERROR in MODEL command Unrestricted x-variables in TWOLEVEL MIXTURE analysis must be specified as either a WITHIN or BETWEEN variable. The following variable cannot exist on both levels: DTX This is part of a mediation model where DTX is the predictor and F_1 is the mediator. Both are BETWEEN-level variables, but since I am estimating different WITHIN-level effects for individuals within these clusters I have not specified them as BETWEEN or WITHIN. 


Please disregard my previous post. I have solved the issue. 
