

I am running an LCA with 19 binary predictors and an N of 223. The best-fitting model has two classes (entropy = .87). I have regressed this model on three covariate predictors. I am fine with, and in fact want, the predictors to influence class membership. Model: %Overall% c#1 on catag2 k6cat aids_eve; My questions: 1) These predictors reduce the N to 189, so I have rerun the model using multiple imputation. Using MI, I can't get the conditional probabilities of the indicators, only thresholds. Is there a way to get the results in probability scale? 2) I have two binary outcomes included using the auxiliary statement. Mplus won't run these using DCAT, which I use because I want odds ratios for the distal variables. What other option works? 3) I want to regress the distal outcomes on the covariates to obtain direct effects (as well as the indirect effects through latent class); how is this specified in the model statement? Once the model is running on nonimputed data, can I run the exact same model using imputed data? I presently get this error message for the imputed data: Auxiliary variables with E, R, DU3STEP, DE3STEP, or BCH are not available for TYPE=IMPUTATION. Thanks for any help. 


By 19 binary predictors I assume you mean 19 latent class indicators. 1) Not automatic in Mplus, but you can easily compute them by the usual translation from logits to probabilities. 2) See the tables at the end of Web Note 21. 3) In the 1-step approach you simply say y ON x in each class. Last question: the message correctly states that this is not provided in Mplus. 
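To make the logit-to-probability translation from point 1 concrete, here is a minimal Python sketch. It assumes the usual Mplus convention that for a binary indicator the item-endorsement probability in a class is 1/(1+exp(threshold)); the threshold values below are purely hypothetical illustrations, not estimates from this analysis.

```python
import math

def threshold_to_prob(tau):
    """Translate an Mplus threshold for a binary indicator into the
    probability of endorsing the item: P(u = 1 | class) = 1 / (1 + exp(tau))."""
    return 1.0 / (1.0 + math.exp(tau))

# Hypothetical pooled thresholds for one class (illustration only):
thresholds = [-1.4, 0.0, 2.2]
probs = [threshold_to_prob(t) for t in thresholds]
```

A negative threshold gives an endorsement probability above .5, a positive threshold one below .5; with pooled MI thresholds the same translation applies.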


Thank you Dr. Muthen, I did mean indicators... sorry. With respect to number 3, perhaps I was unclear... In the current statements I have: usevar = ind1-ind19 age aids; auxiliary (e) hospital er; Then in the model statement, as indicated, I have: %Overall% c#1 on age aids; If I then add to that statement (as I interpreted from your suggestion): hospital on age aids; I get this error message: *** ERROR in MODEL command Unknown variable(s) in an ON statement: HOSPITAL If I try to include hospital on the usevar AND the auxiliary command, I get the same error message. In other words, it seems to be telling me I can't have a variable that is defined as a distal outcome on the auxiliary command and also include that variable in the model as a DV regressed on one (or more) of the covariate predictors of latent class. Is there a way to do this? 


First, note that Auxiliary (e) is outdated - see the tables at the end of Mplus Web Note 21. To add that direct effect you have to take a "manual" approach to 3-step, as discussed in Web Notes 15 and 21. 


Thank you again. Understood. 

Soyoung Kim posted on Monday, December 18, 2017 - 5:02 am



Dear Dr. Muthen, I would like to ask a question about multiple imputation (MI) with a distal outcome in a mixture model. When I use BCH with a latent class model, I get an error message: *** ERROR in VARIABLE command Auxiliary variables with E, R, DU3STEP, DE3STEP, or BCH are not available for TYPE=IMPUTATION. It does not seem possible to use them together. Then, doing a BCH analysis, if I want to get the final pooled result over the multiple imputations, is it possible to calculate the mean value of the estimates manually? In other words, I have made imputed data sets, and I have the results for each data set. Is it possible to calculate the means of the chi-square values and p-values for the BCH results from the multiple data sets? Thank you in advance. Soyoung. 


Yes, you can combine the results using the usual formula; see the bottom two formulas on page 3 of http://statmodel.com/download/MI7.pdf Alternatively, you can use the manual BCH approach with imputation. The manual BCH approach is described in Section 3 of http://statmodel.com/examples/webnotes/webnote21.pdf 

Soyoung Kim posted on Tuesday, December 19, 2017 - 7:57 pm



Thank you so much for your response, Tihomir. It was very helpful. Best wishes, Soyoung. 

Soyoung Kim posted on Wednesday, December 20, 2017 - 1:06 am



Dear Dr. Asparouhov, How can I combine the p-values? I understand that I can calculate the mean of the chi-square values from the multiple results. Is it okay to calculate the means of the p-values for the BCH results from the multiple data sets? Thank you in advance, Soyoung. 


You cannot combine the results by taking the mean of the chi-square values or the mean of the p-values. Instead, see the bottom two formulas on page 3 of http://statmodel.com/download/MI7.pdf Using these you can compute the point estimate P as well as the SE for that joint point estimate (the formula there is for the variance, so you have to square the SEs from the individual imputed data sets and at the end take the square root of the total variance). From there you can use the standard method p-value = 2*Phi(-|P/SE|), where Phi is the standard normal distribution function. For example, if P/SE = 1.96 the p-value = 0.05. 
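The last step above can be sketched in a few lines of Python, assuming the pooled estimate P and pooled SE have already been computed with the MI7 formulas (the standard normal CDF is written via the error function):

```python
import math

def two_sided_p(estimate, se):
    """Two-sided p-value for H0: parameter = 0, computed as
    p = 2 * Phi(-|estimate/SE|) with the standard normal CDF Phi."""
    z = abs(estimate / se)
    phi_neg_z = 0.5 * (1.0 + math.erf(-z / math.sqrt(2.0)))  # Phi(-z)
    return 2.0 * phi_neg_z
```

For a pooled ratio of 1.96 this returns a p-value of about 0.05, matching the example in the post.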


Dear Dr. Asparouhov, I have used BCH for distal outcomes over 10 imputed datasets with 3 latent classes. I have combined the means and SEs from the datasets for each distal outcome for each class, using the formulas that you mentioned on page 3 of http://statmodel.com/download/MI7.pdf. So now I have a combined mean and SE for each distal outcome for each separate class, similar to the output under EQUALITY TESTS OF MEANS ACROSS CLASSES USING THE BCH PROCEDURE. However, I'm still a little unclear about how I use these combined estimates to compare means between latent classes. From my understanding, if I was interested in whether there was a significant difference between Class 1 and 2 on distal outcome X, I would have (Class 1 combined M - Class 2 combined M)/combined SE. Is this correct? And if this is the case, as I have three combined SEs from the classes, would I need to pool these estimates again, and what would be the proper method for doing this? Thank you for your assistance. 


For each imputed data set compute the difference parameter - you can use Model Constraint to form that, like this: %C#1% [x] (m1); %C#2% [x] (m2); model constraint: new(p); p = m1 - m2; Then combine the estimates of p over the imputed data sets. Section 2 of http://statmodel.com/download/MI7.pdf explains how to combine the estimates (they are actually two, not three: class1 - class2 and class1 - class3; the third one is just the difference of these two and should not be included as it is a dependent statement, i.e., the hypothesis class1=class2 and class1=class3 is the same as the hypothesis class1=class2, class1=class3, class2=class3). 


Dear Dr. Asparouhov, Thanks for your response. I have a follow-up question. For 3 classes, I understand theoretically how the difference between the first comparison (Class 1 - Class 2) and the second comparison (Class 1 - Class 3) gives you the estimate for the third comparison (Class 2 - Class 3). But I'm not sure I understand how to test that comparison without putting it into the model. In this analysis, we are interested in finding out if classes differ on depression symptoms. So using this procedure, we would be able to tell if Class 1 had significantly higher depression symptoms than Class 2, and whether Class 1 had significantly higher depression symptoms than Class 3. However, I'm not sure how we would determine the statistical significance of the difference in depression between Classes 2 and 3. To get this result, should we put all three comparisons in the model? For example, my syntax is: MODEL: %C#1% [DEP_MEAN] (m1); %C#2% [DEP_MEAN] (m2); %C#3% [DEP_MEAN] (m3); model constraint: new(DEP_Diff1); DEP_Diff1 = m1 - m2; new(DEP_Diff2); DEP_Diff2 = m1 - m3; new(DEP_Diff3); DEP_Diff3 = m2 - m3; Thanks again for your help. 


Yes - that will work. 


Thank you so much for your help, Dr Asparouhov. One final question - in order to analyse the means of variables by latent class, I'm assuming we would need to use a manual two-step BCH process, where the latent classes are fixed with BCH weights in step 1, and then the model constraints are added in step 2 for each dataset? Or is there a more straightforward way of doing this? Thank you again. 


Because of the imputation, there is no easier approach. If you can avoid the imputation and use simple FIML estimation that accounts for the missing data, then you can simply use the auxiliary command. 


Thank you very much Dr Asparouhov. Your advice has been extremely helpful. 


Dear Dr Asparouhov and Dr Muthen, I am attempting to use the step-1 syntax from Web Note 21, Section 3.2, for each of my imputed datasets. However, I always get the following ERROR: *** ERROR The following MODEL statements are ignored: * Statements in Class 1: [ EEC38$1 ] [ EHT38$1 ] [ EPA38$1 ] [ ESD38$1 ] [ EUI38$1 ] * Statements in Class 2: [ EEC38$1 ] [ EHT38$1 ] [ EPA38$1 ] [ ESD38$1 ] [ EUI38$1 ] *** ERROR One or more MODEL statements were ignored. These statements may be incorrect or are only supported by ALGORITHM=INTEGRATION. My syntax is this, and I'm using Mplus V8.0: DATA: FILE = TMIE1.DAT; VARIABLE: NAMES = GENDER EEC38 EHT38 EPA38 ESD38 EUI38 ... etc; USEVAR = EEC38-EUI38; MISSING = *; CENSORED = EEC38(a) EHT38-EUI38(b); AUXILIARY = M9DTM38ET GENDER; CLASSES = C(2); ANALYSIS: TYPE = MIXTURE; STARTS = 50 5; PROCESSORS = 2; MODEL: %OVERALL% %C#1% [EEC38$1-EUI38$1*1.0]; %C#2% [EEC38$1-EUI38$1*1.0]; OUTPUT: SAVEDATA: FILE = BCHTMIE1.DAT; SAVE = BCHWEIGHTS; I also notice that SAVE = BCHWEIGHTS is not available with ALGORITHM=INTEGRATION. Could you help me out? 


The [y$1] expression refers to a threshold of a categorical variable, not to the mean/intercept of a continuous variable. You don't declare any of your variables as categorical. 


Thank you, Dr. Muthen. Another question: following Dr. Asparouhov's suggestion in this topic, I computed the difference parameter for each imputed data set. However, I just noticed that the classification for each imputed data set is different. In some sets, class 1 refers to the high-concern group, while in other sets, class 2 refers to the high-concern group, at least according to the means and variances of the indicators. Do I need to swap the difference parameter between the two classes (groups) for different datasets when combining the estimates over the imputed datasets with the formulas in http://statmodel.com/download/MI7.pdf? Or does this actually indicate that the classification solution is not good, based on the MI datasets with AUXILIARY in LCA models? 


You have to align the classes. We actually do this internally when Mplus combines the imputations. The way to do this is as follows. Run the first data set with the option output: svalues;. This will give you a model statement with good starting values. Copy that model statement and use it to run the rest of the imputations with analysis: starts = 0;. This should align the classes. If the classification is not good for a particular run (this has nothing to do with the imputations), you will get a separate warning message from Mplus, but I don't think this is the case here. 
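For readers pooling by hand, the label-switching check behind this advice can also be automated outside Mplus. This is only an illustrative sketch (not an Mplus feature): it reorders one imputation's classes to best match a reference solution by comparing per-class mean vectors.

```python
import itertools

def align_classes(reference, solution):
    """Reorder `solution`'s classes to best match `reference`.
    Both arguments are lists of per-class mean vectors; the permutation
    minimizing the summed squared difference of means is chosen."""
    k = len(reference)
    best_perm = min(
        itertools.permutations(range(k)),
        key=lambda perm: sum(
            (r - s) ** 2
            for i in range(k)
            for r, s in zip(reference[i], solution[perm[i]])
        ),
    )
    return [solution[j] for j in best_perm]

# Hypothetical solution whose class labels flipped relative to the reference:
ref = [[0.1, 0.2], [1.0, 1.1]]
sol = [[0.9, 1.2], [0.0, 0.3]]
aligned = align_classes(ref, sol)  # puts the low-mean class first again
```

After alignment, the per-class difference parameters can be combined across imputations without mixing up the groups.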


Also, your auxiliary command should maybe be something like this: AUXILIARY = (DU3STEP) M9DTM38ET GENDER; 


Dear Dr. Tihomir Asparouhov, Thank you for the helpful advice. One more question, to check whether I am using the formula correctly (perhaps this will also help other newbies like me). Correct me if I am wrong in any of the steps. (1) I define and use Model Constraint to obtain the mean difference between class 1 and class 2 (let's say the 2-class solution is chosen) on variable A (x = A1 - A2), so I will use the adjusted Wald test to decide whether x = 0. Now I have x1, x2, x3, ..., x30 for the 30 imputed datasets (i = 1, 2, 3, ..., 30). (2) Thus, I will have mean_x = 1/30 * SUM(x1, x2, ..., x30). (3) For each of x1, x2, x3, ..., x30, I also have SE1, SE2, SE3, ..., SE30 from the output. Thus, I have Vi = SEi^2 (V1 = SE1^2, V2 = SE2^2, ..., V30 = SE30^2). (4) Then I calculate V = 1/30 * SUM(V1, V2, V3, ..., V30) + (30+1)/[30*(30-1)] * SUM[(xi - mean_x)^2]. (5) Finally, W^2, which in the webnote MI7 (pp. 3-4, for the Wald test) is 'W' in the text, is said to follow a chi-square distribution. We calculate W^2 (df = 1) = (mean_x - 0)^2 / V, obtain the p-value, and decide whether it is significant or not. 


All correct. Instead of step 5 you can just use the Z-test: Z = mean_x/sqrt(V). If abs(Z) > 1.96, the mean difference is statistically significant. 
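The numbered steps above, with the Z-test shortcut, can be sketched in Python; the per-imputation estimates and SEs would come from the individual Mplus outputs (the values in the usage line below are hypothetical).

```python
import math

def pool_mi(estimates, ses):
    """Rubin's rules for one parameter over m imputed data sets:
    returns the pooled estimate, its SE, and the Z statistic."""
    m = len(estimates)
    mean_x = sum(estimates) / m                                    # step (2)
    within = sum(se ** 2 for se in ses) / m                        # steps (3)-(4): mean of Vi
    between = sum((x - mean_x) ** 2 for x in estimates) / (m - 1)
    total_v = within + (1 + 1 / m) * between                       # step (4): total variance V
    se = math.sqrt(total_v)
    return mean_x, se, mean_x / se                                 # Z in place of step (5)

est, se, z = pool_mi([1.0, 1.2, 0.8], [0.5, 0.5, 0.5])
```

Note that the coefficient (1 + 1/m)/(m - 1) on the sum of squared deviations reproduces the (30+1)/[30*(30-1)] term in step (4) when m = 30.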


I am using the 3-step method for regressing a distal outcome onto 10 latent classes as per Asparouhov & Muthen (2014), using the logits from the LCA to account for classification uncertainty in my regression. However, I notice that the logit for my 10th class is 0; the whole column contains only zeroes, and Mplus doesn't like this when I put it in the regression. Why is this occurring? Thank you! 


You don't use the last zero column. See the appendices of Mplus Web Note 15 on our website. 
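The all-zero column arises because the saved logits use the last class as the reference category, with its logit fixed at 0. A quick sketch of how the K-1 logit columns map to K class probabilities, using hypothetical logit values:

```python
import math

def class_probs(logits):
    """Map the K-1 saved logits (last class = reference, logit 0)
    to K class probabilities via the multinomial-logit formula."""
    exps = [math.exp(v) for v in logits] + [1.0]  # exp(0) = 1 for the last class
    total = sum(exps)
    return [e / total for e in exps]

# Two logit columns imply three classes:
probs = class_probs([1.2, -0.4])
```

Because the last column is fixed rather than estimated, it carries no information and is dropped from the regression, exactly as the reply advises.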

fred posted on Saturday, March 30, 2019 - 11:22 pm



I have a very basic question about the use of the BCH method. What do the class-specific (threshold) starting values of -1.0 and 1.0 in the example below denote, and how is the order of these values chosen? I have a 2-class model with 7 binary indicators (U1-U7) that I want to include in a 3-step BCH model and struggle to set these. Thanks. %c#1% [U1$1-U8$1*-1.0]; %c#2% [U1$1-U4$1*-1.0 U5$1-U8$1*1.0]; 


Perhaps you are looking at page 11 of Web Note 21. Note that it says: "Starting values are provided so that the class order does not reverse from the generated order. In real data analysis starting values are not needed. Instead, a large number of random starting values should be set using the starts command." 

fred posted on Sunday, March 31, 2019 - 10:08 pm



Thanks, yes, I am looking at that example. So the whole model command: Model: %Overall% %c#1% [U1$1-U8$1*-1.0]; etc. is not necessary in the first step? Also, how does one specify a large number of starting values? Would you kindly point to an example? 


Q1: That's right. Q2: E.g. Starts = 100 20; 

fred posted on Monday, April 01, 2019 - 9:53 pm



Thank you very much! To follow up: the example provides class-specific regressions. Is it possible to also include Y on C in the model overall command at the same time, if the interest is in checking the effect of C on the outcome AND the class-specific regressions of Y on the Xs? 


You don't say Y ON C; instead, the effect of C on Y is captured by the different Y means across the classes. 

fred posted on Wednesday, April 03, 2019 - 11:37 pm



I see, so the means are given in the output (per the outline of the example on p. 11) without having to specify a command such as D3step or DCON? 


Yes. They are part of the model when you take the manual approach. 

fred posted on Tuesday, April 09, 2019 - 12:32 pm



Thanks. Now I have run the analysis as suggested by Web Note 21 and get the error message that TYPE=MIXTURE is not available for multiple group analysis. What does this indicate, since I have used the exact code on page 11 (without starting values)? 


Send your output to Support along with your license number. 


I am trying to estimate a latent class regression (3 classes on a single covariate) using the auxiliary command from the web notes. I am using R3STEP, but it does not retain the class proportions from the unconditional model. I'm using the following syntax: USEVARIABLES ARE [list of variables]; classes = c(3); AUXILIARY = race_1 (R3STEP); MISSING are all (6 9999); ANALYSIS: TYPE = Mixture; starts = 0; Model: %Overall% ... Can you please tell me what I am doing incorrectly? Thanks for your help! Mary 


Send the outputs from your 2 steps to Support along with your license number. 


Dear Prof. Muthen, Is there an equivalent of the FAQ sheet "Odds ratios from thresholds of binary distal outcomes in mixtures" for nominal distal outcomes? Otherwise, please could you advise how I might specify this? Many thanks, Steve 


No, but the end of UG Chapter 14 describes how to produce probabilities for a nominal DV. That can then be expanded to odds and odds ratios. All of this can be expressed in Model Constraint. 

Steven Hope posted on Thursday, July 09, 2020 - 7:40 am



Many thanks! Can I please just check that my code for 3 classes, assuming a nominal distal with 3 categories, would be: %c#1% [distal#1](d1); [distal#2](d2); %c#2% [distal#1](d3); [distal#2](d4); %c#3% [distal#1](d5); [distal#2](d6); Model Constraint: New(prob1 prob2 prob3 prob4 prob5 prob6 odds1 odds2 odds3 odds4 odds5 odds6 or15 or26 or35 or46); prob1 = 1/(1+exp(d1)); prob2 = 1/(1+exp(d2)); prob3 = 1/(1+exp(d3)); prob4 = 1/(1+exp(d4)); prob5 = 1/(1+exp(d5)); prob6 = 1/(1+exp(d6)); odds1 = prob1/(1-prob1); odds2 = prob2/(1-prob2); odds3 = prob3/(1-prob3); odds4 = prob1/(1-prob4); odds5 = prob2/(1-prob5); odds6 = prob3/(1-prob6); or15 = odds1/odds5; or26 = odds2/odds6; or35 = odds3/odds5; or46 = odds4/odds6; 


The denominators of the probs are wrong - there are more terms; look at how it's done in Chapter 14. 


Many thanks for your suggestions! I have made the following changes - would you mind confirming whether this is now correct (again for 3 classes, assuming a nominal distal with 3 categories): %c#1% [distal#1](d1); [distal#2](d2); %c#2% [distal#1](d3); [distal#2](d4); %c#3% [distal#1](d5); [distal#2](d6); Model Constraint: New(prob1 prob2 prob3 prob4 prob5 prob6 odds1 odds2 odds3 odds4 odds5 odds6 rrr15 rrr26 rrr35 rrr46); prob1 = exp(d1)/(1+exp(d1)+exp(d2)); prob2 = exp(d2)/(1+exp(d1)+exp(d2)); prob3 = exp(d3)/(1+exp(d3)+exp(d4)); prob4 = exp(d4)/(1+exp(d3)+exp(d4)); prob5 = exp(d5)/(1+exp(d5)+exp(d6)); prob6 = exp(d6)/(1+exp(d5)+exp(d6)); odds1 = prob1/(1-prob1-prob2); odds2 = prob2/(1-prob1-prob2); odds3 = prob3/(1-prob3-prob4); odds4 = prob4/(1-prob3-prob4); odds5 = prob5/(1-prob5-prob6); odds6 = prob6/(1-prob5-prob6); rrr15 = odds1/odds5; rrr26 = odds2/odds6; rrr35 = odds3/odds5; rrr46 = odds4/odds6; Many thanks, Steve 


It will be clearer if you compute not only prob1 = exp(d1)/(1+exp(d1)+exp(d2)); prob2 = exp(d2)/(1+exp(d1)+exp(d2)); but also the third probability, which is 1 - (prob1+prob2) = 1 - prob1 - prob2. When you compute odds1, you are contrasting prob1 with that 3rd-category probability - is that what you wanted? 


Many thanks for your response. Yes, as in a standard multinomial regression, where the third outcome category is the base category for the odds, and class 3 is the base for the relative risk ratio. Am I right in thinking that your suggestion is to calculate the probabilities for the third category for clarity only, so that odds1 would become: odds1 = prob1/prob3? Otherwise the original syntax is all correct? 


Right. 


Brilliant - many thanks again! 
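The multinomial translation worked out in this exchange can be checked numerically. Below is a Python sketch with purely hypothetical logit values (not estimates from any real model); a handy sanity check is that because the third category is the base, the odds against it reduce to exp(d).

```python
import math

def nominal_probs(d1, d2):
    """Probabilities of a 3-category nominal outcome from its two
    logits, with category 3 as the reference (logit 0)."""
    denom = 1.0 + math.exp(d1) + math.exp(d2)
    p1 = math.exp(d1) / denom
    p2 = math.exp(d2) / denom
    return p1, p2, 1.0 - p1 - p2

# Hypothetical logits for class 1 and class 3:
p1_c1, p2_c1, p3_c1 = nominal_probs(0.8, -0.2)
p1_c3, p2_c3, p3_c3 = nominal_probs(0.1, 0.4)

odds1 = p1_c1 / p3_c1          # category 1 vs base category 3, in class 1
odds5 = p1_c3 / p3_c3          # the same contrast in class 3
rrr15 = odds1 / odds5          # relative risk ratio, class 1 vs class 3
```

Note that odds1 = exp(0.8) and rrr15 = exp(0.8 - 0.1) exactly, which is a quick way to verify a hand-computed Model Constraint setup.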
