
Anonymous posted on Tuesday, February 19, 2002 - 3:06 pm



I would like to know if a multilevel or multisample mixture model is possible with Mplus. If so, can you please point me to further readings? Many thanks! 


Mplus can currently do Mixture Complex with and without missing data. It can also do Mixture multisample, using training data to specify the groups. Mixture multilevel is currently under development.

Scott Grey posted on Wednesday, June 14, 2006 - 11:47 am



Linda, I'm trying to run an LPA using 'TYPE IS MIXTURE COMPLEX', but I do not get standard errors in my output. Here's the code:

DATA: FILE IS "C:\Documents and Settings\insthealthsa4\My Documents\DARE\ External prevention programming\CFA3.dat";
VARIABLE: NAMES ARE n1q44a_7 n1q44a_8 n1q44a_9 n1q44b_7 n1q44b_8 n1q44b_9 n1q44c_7 n1q44c_8 n1q44c_9 n1q44d_7 n1q44d_8 n1q44d_9 v10q52 v10q55 n2q49 utq39 crsswlk ms7dist treatms inclass outclass dare other t7 t8 t9 t10 msdist;
USEVARIABLES ARE inclass outclass dare other;
AUXILIARY = crsswlk;
CLUSTER IS ms7dist;
CLASSES = class(5);
ANALYSIS: TYPE IS MIXTURE COMPLEX;
STARTS = 50 5;
LOGHIGH = +25;
LOGLOW = -25;
UCELLSIZE = 0.01;
ESTIMATOR IS MLR;
LOGCRITERION = 0.0000001;
ITERATIONS = 1000;
CONVERGENCE = 0.000001;
MITERATIONS = 500;
MCONVERGENCE = 0.000001;
MIXC = ITERATIONS;
MCITERATIONS = 2;
MIXU = ITERATIONS;
MUITERATIONS = 2;
MODEL:
%OVERALL%
inclass-other WITH outclass-other;
OUTPUT: TECH8 TECH11;
SAVEDATA: FILE IS LC3;
FORMAT IS FREE;
SAVE = CPROB;

THANKS FOR YOUR HELP!


I'm afraid this information does not tell me why you don't get standard errors. Please send your input, data, output, and license number to support@statmodel.com. 

Hao Duong posted on Saturday, October 18, 2008 - 11:02 am



Dr. Muthen, I am confused about the interpretation of "ib sb ON w" and "c ON w" in Example 10.9 of the 2007 Mplus User's Guide. Would you please explain them for me? Thank you, I appreciate all your help! Hao Duong


All regressions are linear regressions. ib and sb are continuous growth factors. c#1 is a random intercept, also a continuous latent variable. 


Hello, I have two questions. I am attempting to fit a GMM with four time points. However, the time points are not equally spaced. I do have an indicator of time (age at each assessment). 1. Would this preclude me from using a traditional GMM? I was under the impression that I would have to use a two-level model, rather than an LGCM, because of the unequal time between interviews, and regress my outcome on my time variable. 2. Also, if multilevel modeling is required, would it still be possible to examine classes of trajectories?


1. No. A single-level model is sufficient. Just use time scores that reflect the non-equidistance. 2. Yes, you can use between-level latent class variables.
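As a rough illustration of "time scores that reflect the non-equidistance", the sketch below turns assessment ages into time scores by measuring each occasion from the first one. It is Python rather than Mplus input, and the ages and the function name `time_scores` are hypothetical, not taken from the poster's data.

```python
def time_scores(ages, unit=1.0):
    """Return time scores measured from the first assessment.

    ages: assessment ages shared by all individuals (hypothetical values).
    unit: length of one time-score unit, in the same units as the ages.
    """
    first = ages[0]
    return [(age - first) / unit for age in ages]


# Four unequally spaced assessments at ages 10, 11, 13 and 16.
scores = time_scores([10, 11, 13, 16])
print(scores)
```

Under these assumed ages the scores are 0, 1, 3 and 6, which would then be used as the fixed slope loadings of the growth model, for example "i s | y1@0 y2@1 y3@3 y4@6;" with hypothetical outcome names y1-y4.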


Thank you very much Dr. Muthen. The problem that I have run into is that the spacing of each assessment varies individually. Is this still possible? 


I should clarify, as this is unclear. I have been able to run these models using the TYPE = RANDOM command for the individually varying time scores. I am wondering if there is a way, using this option, to examine classes of trajectories? If so, how would I specify this model in Mplus?


Individually-varying times of observation are handled by the TSCORES option (see UG), and it can be combined with Type=Mixture, although it takes a little longer.


Hi, I'm doing a multilevel latent class analysis with 12 items, each with 4 response categories. I have tried a few times but keep getting an error message, and I'm not sure what has gone wrong. I hope someone can advise me on this. Many thanks. My input instructions:

VARIABLE: NAMES ARE wel mor dif enj str qui bor lik hel oth uni job schid ;
USEVARIABLES = wel mor dif enj str qui bor lik hel oth uni job ;
CATEGORICAL ARE wel mor dif enj str qui bor lik hel oth uni job ;
MISSING ARE all (9);
CLASSES=C(3);
CLUSTER=schid;
WITHIN=wel mor dif enj str qui bor lik hel oth uni job ;
ANALYSIS: TYPE = MIXTURE TWOLEVEL ;
STARTS = 20 10;
PROCES = 8 (STARTS);
MODEL:
%WITHIN%
%OVERALL%
%BETWEEN%
%OVERALL%
C#1; C#2;
C#1 WITH C#2;

*** ERROR
Categorical variable WEL contains 158 categories. This exceeds the maximum allowed of 10.


It sounds like you are reading your data incorrectly. Perhaps the number of variable names is not the same as the number of columns in your data set or you have blanks in your data. If you can't figure this out, send your files and license number to support@statmodel.com. 
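One way to check the "number of variable names versus number of columns" explanation before contacting Support is to count the whitespace-separated fields on every row of the data file. The sketch below is a generic Python check, not an Mplus feature; the function name `find_bad_rows` and the sample rows are made up for illustration, and it assumes a free-format file with one observation per line.

```python
def find_bad_rows(lines, n_names):
    """Return (row_number, field_count) for rows whose field count differs
    from the number of names listed under NAMES ARE."""
    bad = []
    for row_number, line in enumerate(lines, start=1):
        fields = line.split()  # free format: fields separated by whitespace
        if len(fields) != n_names:
            bad.append((row_number, len(fields)))
    return bad


# Hypothetical 3-variable file: row 2 has a blank where a value should be,
# row 3 has an extra column.
sample = ["1 2 3", "1   3", "1 2 3 4", "4 1 2"]
print(find_bad_rows(sample, n_names=3))
```

In the input above, NAMES ARE lists 13 variables (12 items plus schid), so every row of the data file would be expected to have 13 fields. A blank in place of a value shifts the remaining columns to the left in free format, which is one way a cluster ID could end up being read as an item.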


I am implementing an IRT multilevel mixture model with item bias effects in Mplus. My structure has items nested in individuals, nested in countries. I would like to code the mixtures as nominal, yielding discrete random effects. For identification I use effect coding for the mixtures. But when using the following syntax, there are no differences in the latent means across mixtures; both are zero. Am I applying the correct syntax for effect coding of mixtures?

VARIABLE: NAMES = id item1 item2 item3 country;
USEVARIABLES = item1 item2 item3;
CATEGORICAL = item1 item2 item3;
CLUSTER= country;
CLASSES= eta3 (2);
BETWEEN= eta3;
ANALYSIS: ALGORITHM=integration;
TYPE=TWOLEVEL MIXTURE;
MODEL:
%WITHIN%
%OVERALL%
ETA2w BY item1* (a)
item2 (b)
item3 (c);
ETA2w@1;
[ETA2w@0];
%BETWEEN%
%OVERALL%
ETA2b BY item1* (a)
item2 (b)
item3 (c);
%eta3#1% !Mixture #1
[ETA2b] (d); !Mean of latent var for mixture #1
item2; !item bias effects
item3;
%eta3#2% !Mixture #2
[ETA2b] (e); !Mean of latent var for mixture #2
item2; !item bias effects
item3;
model constraint:
0= d+e; !effect coding for mixtures


The latent variable mean must be fixed at zero in one class for model identification. So d and e in MODEL CONSTRAINT are not both identified. 


Hello, I'm running a multilevel latent class model similar to the model presented in Henry & Muthén (2010). The core of the model is a multilevel logistic model, resp on negd1 negd2 negd3; and 2 latent classes are specified. Estimation of the model works when I have the regression in each of the two latent classes:

MODEL:
%WITHIN%
%OVERALL%
resp on negd1 negd2 negd3;
cw#1 on time ;
%CW#1%
resp on negd1 negd2 negd3 ;
%CW#2%
resp on negd1 negd2 negd3 ;
%BETWEEN%
%OVERALL%
resp ;
[resp$1] ;
%CW#1%
resp ;
[resp$1] ;
%CW#2%
resp ;
[resp$1] ;

But a problem appears when I want to have an empty logistic model in class 2, i.e., a model without explanatory variables. I tried several specifications. None of them worked. When I declare the ON statement in the %OVERALL% part, I get ON statements in all classes. When the ON statement is not declared in the %OVERALL% part, I'm not allowed to specify it for class 1. Is there a way to specify a logistic model with explanatory variables in class 1 and an empty model in class 2? Thank you!


Please send to Support the data and output for the case where you specified the ON statement in the Overall part and fix the slopes in class 2. 


Hello, I plan to use UG Example 10.12 (two-level LTA with a covariate) for my analyses. I have students nested in schools. I understand the code in the example, except I want to clarify one aspect of it. In the code below, why are the indicators for the latent classes modeled at the between level? If these reflect individual responses (such as from individual students), wouldn't those be on the within level? Or is it that because we are estimating probabilities (or mean responses) for the items for persons, conditional on class, this becomes an average across persons, no longer on the within level? Thank you.

MODEL:
%WITHIN%
%OVERALL%
c2 ON c1 x;
c1 ON x;
%BETWEEN%
%OVERALL%
c1#1 ON w;
c2#1 ON c1#1 w;
c1#1 c2#1;
MODEL c1:
%BETWEEN%
%c1#1%
[u11$1-u14$1] (1-4);
%c1#2%
[u11$1-u14$1] (5-8);
MODEL c2:
%BETWEEN%
%c2#1%
[u21$1-u24$1] (1-4);
%c2#2%
[u21$1-u24$1] (5-8);


In multilevel modeling, all mean/threshold/intercept parameters are on the highest level. 


OK, may I request your input on the following questions regarding this section of the code:

%BETWEEN%
%OVERALL%
c1#1 ON w;
c2#1 ON c1#1 w;
c1#1 c2#1;

Understanding the code above:
1) Why is c2#2 ON c1#2 (and other combinations such as c2#2 ON c1#1) not above, similar to the second line? I believe that the code above regresses the cluster-level (average or intercept) latent status for class 1 of c2 on that for class 1 of c1. This is part of the random intercept setup. But wouldn't regressing the second class of each latent class variable make sense to do as well?
2) By the same token, why does the above code not show "c1#2 c2#2;" as well? Is it that by allowing intercepts for the first class for each latent class variable to vary across clusters, these are already free to differ from those for the second latent classes?

Altering the code for my data:
3) If I have 3 latent class variables rather than 2, and want to have random intercepts, I would also model c3#1 ON c2#1, right?
4) Finally, if I try a model without random intercepts, would I remove both the "c2#1 ON c1#1" and "c1#1 c2#1;" sections of code?

Thank you sincerely.


There are only 2 classes, in which case there is no between counterpart for the second class, just as in multinomial logistic regression. With 3 classes you have between counterparts for the first 2.
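The multinomial logistic analogy can be made concrete with a small sketch: K classes are described by only K-1 logits, because the last class serves as the reference with its logit fixed at zero. The Python below is a generic illustration with made-up logit values and a hypothetical function name; it is not Mplus output.

```python
import math


def class_probabilities(logits):
    """Convert K-1 logits (last class as reference, logit 0) into K probabilities."""
    full = list(logits) + [0.0]      # reference class has logit fixed at 0
    largest = max(full)              # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in full]
    total = sum(exps)
    return [e / total for e in exps]


# 2 classes: a single logit (the c#1 counterpart), nothing for class 2.
print(class_probabilities([0.0]))
# 3 classes: two logits (the c#1 and c#2 counterparts), nothing for class 3.
print(class_probabilities([1.0, -0.5]))
```

This is why, with 2 classes, only c1#1 and c2#1 appear on the between level, and why a 3-class variable would have between counterparts for its first 2 classes.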


Dear Muthen, I am writing this post to ask about MLCA. Is it possible to run a "multilevel LCA" with covariates and a distal outcome simultaneously? There are 580 individual cases nested within 30 organizations. Thank you in advance.


Thirty organizations is the minimum you should have. Yes, this model is possible. 


I'm now considering calculating the MOR proposed by Larsen & Merlo (2005). But due to my limited statistical ability, I cannot understand the mathematical expression shown on page 83, first line. Could anyone give me an example with actual numbers showing how to calculate the MOR in this article?


Try contacting the authors, or post on Multilevelnet. 
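For readers looking for a numeric example in the meantime, the median odds ratio is commonly computed from the cluster-level (random intercept) variance of a two-level logistic model as MOR = exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution (about 0.6745). The Python sketch below applies that commonly cited form to a made-up variance of 1.0; it is not taken from the article's page 83, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """MOR from the between-cluster variance of a random-intercept logistic model."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


# Hypothetical between-cluster variance of 1.0 on the logit scale.
print(round(median_odds_ratio(1.0), 2))
```

With no between-cluster variation the MOR is 1, and it grows as the variance grows. Whether this matches the exact expression in the article should be checked against the source.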


Some segments of Example 10.1, 'Two-level mixture regression for a continuous dependent variable,' are below. I'd like to modify the program for two objectives:
(1) Incorporate the measurement error table output from step 1/step 2 of manual 3-step estimation, i.e., "Logits for the Classification Probabilities for the Most Likely..."
(2) Temporarily omit the regressions from the overall model in the between part of the model.
The example does not mention %C#2% in the within part of the model. Is it necessary to omit %C#2%? If so, then how could the parameter N#1@ be used for %C#2%?

VARIABLE: CLASSES = C(2);
WITHIN = X1 X2 N; ! N is my addition
! BETWEEN = W; ! Not using for now
CLUSTER = CLUS;
NOMINAL = N;
ANALYSIS: TYPE = TWOLEVEL MIXTURE;
STARTS=0;
MODEL:
%WITHIN%
%OVERALL%
Y ON X1 X2;
C ON X1;
%C#1%
[N#1@1.901]; ! As per Webnote 15, appendix E, step 3 of
! manual 3-step estimation
Y ON X2;
Y;
%BETWEEN%
%OVERALL%
! Y ON W; ! No between-level regression
! C#1 ON W; ! No between-level regression
! C#1*1; ! A starting value is not needed
%C#1%
[Y*2];
OUTPUT: TECH1 TECH8;


It is not necessary to omit %C#2%. We did because we didn't need to say anything about c#2. So go ahead and try it. 


Dear Muthen, Is it possible to have a two-level mixture model in Mplus where the LCA classes are predictor variables, the outcome is a binary observed variable, and the covariates for the model include a continuous latent factor? I.e., Y = Classes + continuous latent factor + covariates. The code below is an attempt to do this, but I'm having difficulty keeping the binary Y separate from the observed categorical variables that are used as indicators of the LCA classes. Is that possible? How do I achieve that?

UseVariables are Dailies Radio TV Maln Brstfed Eat_Freq Diet_Div C_Age Urban Wealth M_Age Female sexXage Edu;
Classes=HLTH_BHV(2);
Categorical are Brstfed Eat_Freq Diet_Div;
WITHIN = C_Age Female sexXage;
BETWEEN = Dailies Radio TV;
CLUSTER = Comm_Hse;
Auxiliary = (r)Wealth Urban Edu M_Age;
Define: sexXage=C_Age*Female;
CENTER C_Age(GRANDMEAN);
Analysis: Type=TWOLEVEL MIXTURE;
Starts=4000 40;
Processors = 4;
Algorithm=integration;
Model:
%WITHIN%
%OVERALL%
HLTH_BHV ON Female C_Age sexXage;
%BETWEEN%
%OVERALL%
MED_USE BY Dailies Radio TV;
Maln ON HLTH_BHV MED_USE;
HLTH_BHV ON MED_USE;

How do I ensure the outcome variable is treated as binary when LCA classes are regressed on it?


You don't say which variable the binary outcome variable is. Perhaps it is "Maln" but it is not declared categorical. You say "when LCA classes are regressed on it". If an outcome is a function of the latent class variable, you should say "the outcome is regressed on the latent class variable", not the other way around. But note that in Mplus you don't say "y ON c", but instead let the default change of y means/thresholds be the effect of c on y. See also our 3step papers on the website such as Web Note 21. 
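To see what "let the change of y thresholds be the effect of c on y" amounts to numerically, the sketch below converts class-specific thresholds of a binary outcome into probabilities and an odds ratio. It assumes a logit link with no other predictors of y, in which case P(y=1 | class k) = 1 / (1 + exp(tau_k)); the threshold values and function names are hypothetical.

```python
import math


def prob_from_threshold(tau):
    """P(y = 1) implied by a logit threshold tau, with no other predictors."""
    return 1.0 / (1.0 + math.exp(tau))


def odds_ratio(tau_a, tau_b):
    """Odds of y = 1 in class a relative to class b, from their thresholds."""
    return math.exp(tau_b - tau_a)


# Hypothetical class-specific thresholds for a binary outcome.
tau_class1, tau_class2 = -1.0, 1.0
print(prob_from_threshold(tau_class1), prob_from_threshold(tau_class2))
print(odds_ratio(tau_class1, tau_class2))
```

A lower threshold in a class means a higher probability of y = 1 in that class, so the difference in thresholds across classes plays the role that a "y ON c" slope would play.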


I want to change the reference class from class 3 to class 2. Would I use the output values of class 2 as user-specified starting values for class 3 in the next run? Where do I insert the desired starting values in the code below? Thanks.

USEVARIABLES = y x1 x2 w n;
NOMINAL = n ;
CATEGORICAL = y;
CLASSES = C(3);
CLUSTER = clus;
WEIGHT = wgt;
WITHIN = n x1 x2;
BETWEEN= w;
ANALYSIS: TYPE = MIXTURE TWOLEVEL;
MODEL:
%WITHIN%
%OVERALL%
y ON x1 x2;
C ON x1 x2;
%C#1%
[N#1@4.2]; [N#2@1.4]; [N#3@2.9];
y ON x1 x2;
%C#2%
[N#1@0.8]; [N#2@3.9]; [N#3@1.9];
y ON x1 x2;
%C#3%
[N#1@2.6]; [N#2@2.3]; [N#3@6.4];
y ON x1 x2;
%BETWEEN%
%OVERALL%
y ON w;
C#1 ON w;
C#1*1;
C#2 ON w;
C#2*1;
%C#1%
y ON w;
[y$1*2];
%C#2%
y ON w;
[y$1*2];
%C#3%
y ON w;
[y$1*2];


Try switching values only for the nominal statements [N#1@...] 

Youmi Suk posted on Monday, February 27, 2017 - 11:24 am



In multilevel mixture regression, if we do not use any particular indicators for the latent classes and have a model with categorical and continuous variables (covariates), do we have to specify the variable type of the covariates? In Example 10.1:

NAMES ARE y x1 x2 w1 w2 class clus;
USEVARIABLES = y x1 x2 w1 w2;
CLASSES = c (2);
WITHIN = x1 x2;
BETWEEN = w1 w2;
CLUSTER = clus;

If x1 (a within-level variable) and w1 (a between-level variable) are binary variables, should we specify x1 and w1 as categorical variables (CATEGORICAL = x1 w1)? Like this:

NAMES ARE y x1 x2 w1 w2 class clus;
USEVARIABLES = y x1 x2 w1 w2;
CATEGORICAL = x1 w1;
CLASSES = c (2);
WITHIN = x1 x2;
BETWEEN = w1 w2;
CLUSTER = clus;

If I specify them using "CATEGORICAL," I get threshold information, as for indicators. Thanks in advance for your help.


You should not declare a variable type such as categorical for a covariate. 

Youmi Suk posted on Monday, February 27, 2017 - 7:48 pm



Thanks much for the quick reply. I have one more question, regarding the variable type for a dependent variable in the same situation as above, except for the variable type of the dependent variable (a continuous dependent variable -> a binary dependent variable). Given that we cannot declare a variable type, can we not use a categorical dependent variable for multilevel mixture (logistic) regression? I guess that if we cannot specify a categorical variable as a dependent variable in Mplus, we would use the linear probability model (i.e., standard regression) rather than a logistic regression model? I really appreciate your help in advance.


You can declare a dependent variable (DV) as categorical or anything else; I don't know why you think you cannot. I was referring to a covariate, not a DV.

Youmi Suk posted on Friday, March 10, 2017 - 12:56 pm



(1) We are trying to fit 2-class multilevel models with Level-1 latent classes.

(2) With a binary DV, adding "DV;" provides class-specific variance estimates for the random intercept. It worked.

VARIABLE: NAMES ARE y x w clus;
USEVARIABLES = y x w;
CLASSES = cw (2);
CATEGORICAL = y;
WITHIN = x;
BETWEEN = w;
CLUSTER = clus;
MODEL:
%WITHIN%
%OVERALL%
y on x;
%cw#1%
y on x;
%cw#2%
y on x;
%BETWEEN%
%OVERALL%
y on w;
%cw#1%
y on w;
%cw#2%
y on w;
y;

(3) However, with a continuous DV, adding "DV;" did not work. The program stopped with an error message.

VARIABLE: NAMES ARE y x w clus;
USEVARIABLES = y x w;
CLASSES = cw (2);
WITHIN = x;
BETWEEN = w;
CLUSTER = clus;
MODEL:
%WITHIN%
%OVERALL%
y on x;
%cw#1%
y on x;
%cw#2%
y on x;
y;
%BETWEEN%
%OVERALL%
y on w;
%cw#1%
y on w;
%cw#2%
y on w;
y;

The error message is as follows:

*** FATAL ERROR
CLASS-SPECIFIC BETWEEN VARIABLE PROBLEM.

We would like to allow class-specific random intercept variances with a continuous DV, identifying Level-1 latent classes. Could you help me out with this problem? Thank you in advance for your help.


Please send the 2 outputs to Support along with your license number. 


I am trying to run a two-level LCA and am having trouble because the program keeps freezing my computer right after it starts estimating the model. Our IT guy suggested that the data file might be corrupt, but I can run other, more basic, analyses using the same file, so I assume that is not the issue. I have checked for blanks and random characters in the data file and there are none. I'm out of ideas as to what might be causing the program to freeze and crash my computer.


Send your input and data to Support along with your license number. 


Variable: missing=All(9);
Weight=w1;
IDvariable=stu_id;
usevariables=sch_ID female latinx black asian other firstgen math ses scise sciint sciid sciut mathse mathin mathid mathut STEM GPA private city town rural frl pctlat pctblack scifair summer mentor pdlearn pdint calc compsci chem phys;
categorical=scifair summer mentor pdlearn pdint calc compsci chem phys;
classes=cb(4) cw(4);
within=female latinx black asian other firstgen math ses scise sciint sciid sciut mathse mathin mathid mathut;
between=private city town rural frl pctlat pctblack scifair summer mentor pdlearn pdint calc compsci chem phys;
cluster=sch_id;
Analysis: type=mixture twolevel;
processors=8(starts);
miteration=5000;
starts=20000 2000;
stiterations=100;
model:
%within%
%overall%
cw ON female latinx black asian other firstgen math ses;
%between%
%overall%
cb ON private city town rural frl pctlat pctblack;
cw on cb;
MODEL cw:
%cw#1%
[scise sciint sciid sciut]
[mathse mathin mathid mathut];
%cw#2%
[scise sciint sciid sciut]
[mathse mathin mathid mathut];
%cw#3%
[scise sciint sciid sciut]
[mathse mathin mathid mathut];
%cw#4%
[scise sciint sciid sciut]
[mathse mathin mathid mathut];


What is your question? Note that cb should be put on the Between= list. 


This was my question from above, I ran out of space, sorry. I cannot send my data file, also sorry. I am trying to run a two-level LCA and am having trouble because the program keeps freezing my computer right after it starts estimating the model. Our IT guy suggested that the data file might be corrupt, but I can run other, more basic, analyses using the same file, so I assume that is not the issue. I have checked for blanks and random characters in the data file and there are none. I'm out of ideas as to what might be causing the program to freeze and crash my computer.


Are you running in version 8.3? If so, the only way we can help you about potential freezing is to run your data. Also, you specify: starts=20000 2000; stiterations=100; You should not need that many starts to replicate the best loglikelihood. And you should use the default for Stiterations (which I believe is 20); I have never encountered a situation needing that many Stiterations. 


Yes, I'm running the most recent version of Mplus. I'll make the changes you have suggested and see if that helps. I appreciate the suggestions. 

Youmi Suk posted on Thursday, April 09, 2020 - 3:09 am



Here is my code:

VARIABLE: USEVARIABLES = Z X1 W1;
WITHIN = X1 ;
BETWEEN = W1 CB;
CLUSTER = id;
CLASSES = CB (2);
CATEGORICAL = Z;
ANALYSIS: TYPE = TWOLEVEL MIXTURE;
STARTS = 20 3;
ESTIMATOR = ML;
MODEL:
%WITHIN%
%OVERALL%
Z ON X1 ;
%CB#1%
Z ON X1 ;
%BETWEEN%
%OVERALL%
Z on W1 ;
%CB#1%
Z on W1 ;
Z;

individual = i and cluster (e.g., school) = j.
Z_ij = binary individual-level dependent variable
X1_ij = individual-level independent variable
W1_j = cluster-level independent variable
id = cluster id

With the between-level categorical latent variable, K_j, I get cluster-specific posterior probabilities, P(K_j=k | X1_ij, W1_j). Using Bayes' rule,

P(K_j=k | X1_ij, W1_j) = [P(K_j=k) P(Z_ij=1 | K_j=k, X1_ij, W1_j)] / [\sum P(K_j=k) P(Z_ij=1 | K_j=k, X1_ij, W1_j)]

P(K_j=k) is constant across all observations, since the current model has only an intercept for it. P(Z_ij=1 | K_j=k, X1_ij, W1_j) should be individual-specific due to the individual-level variable, X1_ij. Then, P(K_j=k) P(Z_ij=1 | K_j=k, X1_ij, W1_j) should be individual-specific. But we obtain cluster-specific posterior probabilities. Could you please explain how we can get cluster-specific posterior probabilities in this model?


This is easier to look at if you use the probit link, as it avoids the numerical integration associated with ZB:

[ZB_j | W1_j, K_j=k] ~ N(mb_j, v_j), where mb_j = beta_b*W1_j
[ZW_ij | X1_ij, K_j=k] ~ N(mw_ij, 1), where mw_ij = beta_w*X1_ij
P(Z_ij=1 | K_j=k, X1_ij, W1_j) = P(ZW_ij + ZB_j > tau) = 1 - Phi((tau - mb_j - mw_ij)/sqrt(1+v_j))

where Phi is the standard normal distribution function.
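The probit expression can be evaluated directly. The sketch below is a plain Python transcription of the formula, with the standard normal distribution function built from math.erf; all parameter values (tau, beta_b, beta_w, v, and the covariate values) are hypothetical and the function names are not Mplus names.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_z_equals_1(tau, beta_b, w1, beta_w, x1, v):
    """P(Z_ij = 1 | K_j = k, X1_ij, W1_j) under the probit link.

    tau: threshold; beta_b, beta_w: between and within slopes in class k;
    v: between-level residual variance of Z in class k.
    """
    mb = beta_b * w1   # between-level mean, mb_j
    mw = beta_w * x1   # within-level mean, mw_ij
    return 1.0 - normal_cdf((tau - mb - mw) / math.sqrt(1.0 + v))


# Hypothetical class-k parameters and covariate values.
print(prob_z_equals_1(tau=0.5, beta_b=0.3, w1=1.0, beta_w=0.8, x1=0.25, v=1.0))
```

Because X1_ij enters through mw_ij, this marginal probability differs across individuals within the same cluster, as the question anticipates.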

Youmi Suk posted on Monday, April 13, 2020 - 8:43 am



Thank you for your explanation, but I’m still not getting it. I changed the code, now without W1, to make it more straightforward.

VARIABLE: USEVARIABLES = Z X1;
WITHIN = X1 ;
BETWEEN = CB;
CLUSTER = id;
CLASSES = CB (2);
CATEGORICAL = Z;
ANALYSIS: TYPE = TWOLEVEL MIXTURE;
STARTS = 20 3;
ESTIMATOR = ML;
MODEL:
%WITHIN%
%OVERALL%
Z ON X1 ;
%CB#1%
Z ON X1 ;

There is no %BETWEEN% model (assume an intercept only for each class of the between-level categorical latent variable, CB). In this setting, with only X1_ij, I get cluster-specific posterior probabilities, P(K_j=k | X1_ij), for the between-level categorical latent variable (CB), K_j. Using Bayes' rule,

P(K_j=k | X1_ij) = [P(K_j=k) P(Z_ij=1 | K_j=k, X1_ij)] / [\sum P(K_j=k) P(Z_ij=1 | K_j=k, X1_ij)]

P(K_j=k) is constant, since the current model has only an intercept for it. P(Z_ij=1 | K_j=k, X1_ij) should be individual-specific due to the individual-level variable, X1_ij. Then, P(K_j=k) P(Z_ij=1 | K_j=k, X1_ij) should be individual-specific. But we obtain cluster-specific posterior probabilities, P(K_j=k | X1_ij). Could you please elaborate on how to get cluster-specific posterior probabilities, P(K_j=k | X1_ij), in this model? Thank you for your help in advance.


I think I understand now what you are asking. You want to find out how we compute the posterior class probability that you get in the SAVEDATA file with the CPROB option. For that there is no way to avoid the numerical integration, i.e., there is no way to make it into a simple formula. You need this:

P(Z_1j, Z_2j, ... | K_j=k, X1_1j, X1_2j, ...) = integral over ZB of the product over i in cluster j of P(Z_ij=1 | K_j=k, X1_ij, ZB)

All the relevant formulas are in here: http://www.statmodel.com/bmuthen/articles/Article_128.pdf
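A rough numerical sketch of this computation may help show why the result is one probability per cluster: the likelihood for class k multiplies the contributions of all individuals in cluster j before integrating over ZB, and Bayes' rule is then applied to that cluster-level likelihood. The Python below is only an illustration under stated assumptions (probit link, ZB normal with class-specific mean and variance, a simple trapezoid rule on a grid); it is not how Mplus implements the integration, and every name and parameter value is hypothetical.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def cluster_likelihood(z, x, tau, beta_w, mean_b, var_b, n_grid=401, width=8.0):
    """P(Z_1j, Z_2j, ... | K_j = k, X1_1j, ...) by trapezoid integration over ZB."""
    sd = math.sqrt(var_b)
    lo = mean_b - width * sd
    step = 2.0 * width * sd / (n_grid - 1)
    total = 0.0
    for g in range(n_grid):
        zb = lo + g * step
        weight = step * (0.5 if g in (0, n_grid - 1) else 1.0)
        product = 1.0
        for z_i, x_i in zip(z, x):  # product over individuals i in cluster j
            p1 = 1.0 - normal_cdf(tau - beta_w * x_i - zb)
            product *= p1 if z_i == 1 else 1.0 - p1
        total += weight * normal_pdf(zb, mean_b, var_b) * product
    return total


def cluster_posterior(z, x, priors, class_params):
    """Posterior P(K_j = k | all Z and X1 in cluster j) for each class k."""
    joint = [p * cluster_likelihood(z, x, **params)
             for p, params in zip(priors, class_params)]
    norm = sum(joint)
    return [value / norm for value in joint]


# One hypothetical cluster with four individuals and two between-level classes.
z = [1, 1, 1, 0]
x = [0.2, -0.1, 0.5, 0.0]
class_1 = dict(tau=-0.5, beta_w=0.8, mean_b=0.0, var_b=1.0)
class_2 = dict(tau=0.5, beta_w=0.2, mean_b=0.0, var_b=1.0)
print(cluster_posterior(z, x, priors=[0.4, 0.6], class_params=[class_1, class_2]))
```

The individual-specific terms do vary with X1_ij, but they only enter through the product inside the integral, so the posterior depends on the whole cluster's data and is the same for every individual in that cluster.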
