

LCA with Covariates, Distal, and Impu... 



I am running an LCA with 19 binary predictors and an N of 223. The best-fitting model has two classes (entropy = .87). I have regressed the latent class variable on three covariates. I am fine with, and in fact want, the predictors to influence class membership. Model: %Overall% c#1 on catag2 k6cat aids_eve; My questions: 1) These predictors reduce the N to 189, so I have rerun the model using multiple imputation. Using MI, I can't get the conditional probabilities of the indicators, only thresholds. Is there a way to get the results on the probability scale? 2) I have two binary outcomes included using the auxiliary statement. Mplus won't run these using DCAT, which I use because I want odds ratios for the distal variables. What other option works? 3) I want to regress the distal outcomes on the covariates to obtain direct effects (as well as the indirect effects through latent class). How is this specified in the model statement? Once the model is running on non-imputed data, can I run the exact same model using imputed data? I presently get this error message for the imputed data: Auxiliary variables with E, R, DU3STEP, DE3STEP, or BCH are not available for TYPE=IMPUTATION. Thanks for any help.


By 19 binary predictors I assume you mean 19 latent class indicators. 1) Not automatically in Mplus, but you can easily compute them by the usual translation from logits to probabilities. 2) See the tables at the end of Web Note 21. 3) In the 1-step approach you simply say y ON x in each class. Last question: the message correctly states that this is not provided in Mplus.
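The logit-to-probability translation mentioned in answer 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual Mplus parameterization for a binary indicator, in which the class-specific threshold tau gives P(u = 0 | class) = 1 / (1 + exp(-tau)). The function name and the example threshold of 1.5 are hypothetical and do not come from the poster's output.

```python
import math

def threshold_to_probs(tau):
    """Convert a logit threshold for a binary indicator into
    (P(lowest category), P(highest category)) within a class.
    Assumes the logit threshold parameterization described above."""
    p_low = 1.0 / (1.0 + math.exp(-tau))   # P(u = 0 | class)
    p_high = 1.0 - p_low                   # P(u = 1 | class)
    return p_low, p_high

# Hypothetical threshold, for illustration only
p_low, p_high = threshold_to_probs(1.5)
print(round(p_low, 3), round(p_high, 3))  # 0.818 0.182
```

A threshold of 0 corresponds to a 50/50 split, and larger positive thresholds mean a lower probability of endorsing the item in that class.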


Thank you Dr. Muthen, I did mean indicators... sorry. With respect to number 3, perhaps I was unclear. In the current statements I have usevar ind1-ind19 age aids; auxiliary (e) hospital er; Then in the model statement, as indicated, I have: %Overall% c#1 on age aids; If I then add to that statement (as I interpreted from your suggestion): hospital on age aids; I get this error message: *** ERROR in MODEL command Unknown variable(s) in an ON statement: HOSPITAL If I try to include hospital on both the usevar AND auxiliary commands, I get the same error message. In other words, it seems to be telling me that I can't define a variable as a distal outcome on the auxiliary command and also include that variable in the model as a DV regressed on one (or more) of the covariate predictors of latent class. Is there a way to do this?


First, note that Auxiliary (e) is outdated; see the tables at the end of Mplus Web Note 21. To add that direct effect you have to take a "manual" approach to 3-step, as discussed in Web Notes 15 and 21.


Thank you again. Understood. 

Soyoung Kim posted on Monday, December 18, 2017 - 5:02 am



Dear Dr. Muthen, I would like to ask a question about multiple imputation (MI) with a distal outcome in a mixture model. When I use BCH with a latent class model, I get an error message: *** ERROR in VARIABLE command Auxiliary variables with E, R, DU3STEP, DE3STEP, or BCH are not available for TYPE=IMPUTATION. It does not seem possible to use them together. If I still want a final BCH result across the multiple imputations, is it possible to calculate the mean value of the estimates manually? In other words, I have created imputed data sets and obtained the results for each data set. Is it possible to calculate the means of the chi-square values and p-values for the BCH results from the multiple data sets? Thank you in advance. Soyoung.


Yes, you can combine the results using the usual formula, see the bottom two formulas on page 3 http://statmodel.com/download/MI7.pdf Alternatively you can use the manual BCH approach with imputation. The manual BCH approach is described in Section 3 http://statmodel.com/examples/webnotes/webnote21.pdf 

Soyoung Kim posted on Tuesday, December 19, 2017 - 7:57 pm



Thank you so much for your response, Tihomir. It was very helpful. Best wishes, Soyoung.

Soyoung Kim posted on Wednesday, December 20, 2017 - 1:06 am



Dear Dr. Asparouhov, How can I combine the p-values? I understand that I can calculate the mean of the chi-square values from multiple results. Is it okay to calculate the means of the p-values for the BCH results from the multiple data sets? Thank you in advance, Soyoung.


You cannot combine the results by taking the mean of the chi-square values or the mean of the p-values. Instead, see the bottom two formulas on page 3 (http://statmodel.com/download/MI7.pdf). Using these you can compute the point estimate P as well as the SE for that joint point estimate (the formula there is for the variance, so you have to square the SE for the individual imputed data sets and at the end take the square root of the total variance). From there you can use the standard method p-value = 2*Phi(-|P/SE|), where Phi is the standard normal distribution function. For example, if P/SE = 1.96 the p-value = 0.05.
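As a rough illustration of this pooling, the sketch below combines per-imputation estimates and standard errors using the usual multiple-imputation combining rules (average within-imputation variance plus (1 + 1/m) times the between-imputation variance) and then computes a two-sided normal p-value as 2*(1 - Phi(|P/SE|)). The function names and the five estimate/SE pairs are made up for illustration; they are not results from this thread.

```python
import math

def pool_estimates(estimates, ses):
    """Pool one parameter over m imputed data sets.
    Returns (pooled estimate, pooled SE)."""
    m = len(estimates)
    mean = sum(estimates) / m
    within = sum(se ** 2 for se in ses) / m                      # average squared SE
    between = sum((e - mean) ** 2 for e in estimates) / (m - 1)  # variance of the estimates
    total_var = within + (1 + 1 / m) * between
    return mean, math.sqrt(total_var)

def two_sided_p(estimate, se):
    """Two-sided p-value from a standard normal reference distribution."""
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Hypothetical estimates and SEs from five imputed data sets
est, se = pool_estimates([0.42, 0.47, 0.39, 0.45, 0.44],
                         [0.15, 0.16, 0.14, 0.15, 0.16])
print(est, se, two_sided_p(est, se))
```

The pooled SE is never smaller than the root of the average squared SE, because the between-imputation term adds the uncertainty due to the missing data.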


Dear Dr. Asparouhov, I have used BCH for distal outcomes over 10 imputed datasets with 3 latent classes. I have combined the means and SEs across datasets for each distal outcome in each class, using the formulas that you mentioned on page 3 of http://statmodel.com/download/MI7.pdf. So now I have a combined mean and SE of each distal outcome for each separate class, similar to the output under EQUALITY TESTS OF MEANS ACROSS CLASSES USING THE BCH PROCEDURE. However, I'm still a little unclear about how to use these combined estimates to compare means between latent classes. From my understanding, if I were interested in whether there was a significant difference between Class 1 and Class 2 on distal outcome X, I would have (Class 1 combined M - Class 2 combined M)/combined SE. Is this correct? And if this is the case, as I have three combined SEs, one from each class, would I need to pool these estimates again, and what would be the proper method for doing this? Thank you for your assistance.


For each imputed data set, compute the difference parameter. You can use the model constraints statement to form it, like this: %C#1% [x] (m1); %C#2% [x] (m2); model constraints: new(p); p=m1-m2; Then combine the estimates of p over the imputed data sets. Section 2, http://statmodel.com/download/MI7.pdf, explains how to combine the three estimates (there are actually two, not three: class1 - class2 and class1 - class3; the third one is just the difference of these two and should not be included, as it is a dependent statement, i.e., the hypothesis class1=class2 and class1=class3 is the same as the hypothesis class1=class2, class1=class3, class2=class3).


Dear Dr. Asparouhov, Thanks for your response. I have a follow-up question. For 3 classes, I understand theoretically how the difference between the first comparison (Class 1 - Class 2) and the second comparison (Class 1 - Class 3) gives you the estimate for the third comparison (Class 2 - Class 3). But I'm not sure I understand how to test that comparison without putting it into the model. In this analysis, we are interested in finding out whether classes differ on depression symptoms. Using this procedure, we would be able to tell whether Class 1 had significantly higher depression symptoms than Class 2, and whether Class 1 had significantly higher depression symptoms than Class 3. However, I'm not sure how we would determine the statistical significance of the difference in depression between Class 2 and Class 3. To get this result, should we put all three comparisons in the model? For example, my syntax is:
MODEL:
%C#1% [DEP_MEAN] (m1);
%C#2% [DEP_MEAN] (m2);
%C#3% [DEP_MEAN] (m3);
model constraints:
new(DEP_Diff1); DEP_Diff1=m1-m2;
new(DEP_Diff2); DEP_Diff2=m1-m3;
new(DEP_Diff3); DEP_Diff3=m2-m3;
Thanks again for your help.


Yes, that will work.


Thank you so much for your help, Dr Asparouhov. One final question: in order to analyse the means of variables by latent class, I'm assuming we would need to use a manual two-step BCH process, where the latent classes are fixed with BCH weights in step 1, and then the model constraints are added in step 2 for each dataset? Or is there a more straightforward way of doing this? Thank you again.


Because of the imputation, there is no easier approach. If you can avoid the imputation and use simple FIML estimation that accounts for the missing data, then you can simply use the auxiliary command.


Thank you very much Dr Asparouhov. Your advice has been extremely helpful. 


Dear Dr Asparouhov and Dr Muthen, I attempted to use the first-step syntax from Web Note 21, Section 3.2, for each of my imputed datasets. However, I always get the following error:
*** ERROR The following MODEL statements are ignored:
* Statements in Class 1: [ EEC38$1 ] [ EHT38$1 ] [ EPA38$1 ] [ ESD38$1 ] [ EUI38$1 ]
* Statements in Class 2: [ EEC38$1 ] [ EHT38$1 ] [ EPA38$1 ] [ ESD38$1 ] [ EUI38$1 ]
*** ERROR One or more MODEL statements were ignored. These statements may be incorrect or are only supported by ALGORITHM=INTEGRATION.
My syntax is below, and I'm using Mplus V8.0.
DATA: FILE = TMIE1.DAT;
VARIABLE: NAMES = GENDER EEC38 EHT38 EPA38 ESD38 EUI38...etc;
USEVAR = EEC38-EUI38;
MISSING = *;
CENSORED = EEC38(a) EHT38-EUI38(b);
AUXILIARY = M9DTM38ET GENDER;
CLASSES = C(2);
ANALYSIS: TYPE = MIXTURE; STARTS = 50 5; PROCESSORS = 2;
MODEL: %OVERALL%
%C#1% [EEC38$1-EUI38$1*1.0];
%C#2% [EEC38$1-EUI38$1*1.0];
OUTPUT:
SAVEDATA: FILE = BCHTMIE1.DAT; SAVE = BCHWEIGHTS;
I also notice that SAVE=BCHWEIGHTS is not available with ALGORITHM=INTEGRATION. Could you help me out?


The [y$1] expression refers to a threshold of a categorical variable, not to a mean/intercept of a continuous variable. You don't declare any of your variables as categorical.


Thank you, Dr. Muthen. Another question: following Dr. Asparouhov's suggestion in this topic, I computed the difference parameter for each imputed data set. However, I just noticed that the class labeling differs across imputed data sets. In some sets, class 1 is the high-concern group, while in others class 2 is the high-concern group, at least according to the means and variances of the indicators. Do I need to swap the difference parameter between the two classes (groups) for those datasets when combining the estimates over the imputed datasets with the formulas in http://statmodel.com/download/MI7.pdf ? Or does this actually indicate that the classification solution based on the MI datasets with AUXILIARY in LCA models is not good?


You have to align the classes. We actually do this internally when Mplus combines the imputations. The way to do this is as follows. Run the first data set with the option "output:svalues". This will give you a model statement with great starting values. Copy that model statement and use it to run the rest of the imputations with the analysis:starts=0. This should align the classes. If the classification is not good for a particular run (this has nothing to do with the imputations) you will get a separate warning message from Mplus but I don't think this is the case. 


Also your auxiliary command maybe should be something like this AUXILIARY = (DU3step) M9DTM38ET GENDER; 


Dear Dr. Tihomir Asparouhov, Thank you for the helpful advice. One more question, to check whether I am using the formula correctly (and perhaps to help other newbies like me). Please correct me if I am wrong in any step.
(1) I define and use a model constraint to obtain the mean difference between class 1 and class 2 (let's say the 2-class solution is chosen) on variable A (x = A1 - A2), so I will use the adjusted Wald test to decide whether x = 0. Now I have x1, x2, x3, ..., x30 for the 30 imputed datasets (i = 1, 2, 3, ..., 30).
(2) Thus, I will have mean_x = 1/30 * SUM(x1, x2, ..., x30).
(3) For each of x1, x2, x3, ..., x30, I also have SE1, SE2, SE3, ..., SE30 from the output. Thus, I have Vi = SEi^2 (V1 = SE1^2, V2 = SE2^2, ..., V30 = SE30^2).
(4) Then I calculate V = 1/30 * SUM(V1, V2, V3, ..., V30) + (30+1)/[30*(30-1)] * SUM[(xi - mean_x)^2].
(5) Finally, W^2, which in the MI7 web note (pp. 3-4, Wald test) is 'W' in the text, is said to follow a chi-square distribution. We need to calculate W^2 (df = 1) = [(mean_x - 0)^2] / V to obtain the p-value and decide whether it is significant.


All correct. Instead of step 5 you can just use the Z-test: Z = mean_x/sqrt(V). If abs(Z) > 1.96, the mean difference is statistically significant at the 5% level.
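The equivalence between the Z-test and the 1-df Wald test in step 5 can be checked numerically. The sketch below assumes a standard normal reference for Z and a chi-square reference with 1 degree of freedom for W^2, with no small-sample adjustment for the number of imputations. The function names and the values of mean_x and V are hypothetical.

```python
import math

def z_test(mean_x, v):
    """Z statistic and two-sided normal p-value for H0: mean difference = 0."""
    z = mean_x / math.sqrt(v)
    return z, math.erfc(abs(z) / math.sqrt(2))

def wald_test_df1(mean_x, v):
    """Wald statistic W^2 = mean_x^2 / V and its chi-square (df = 1) p-value."""
    w2 = mean_x ** 2 / v
    # For df = 1, P(chi-square > w2) = P(|Z| > sqrt(w2)) = erfc(sqrt(w2 / 2))
    return w2, math.erfc(math.sqrt(w2 / 2))

# Hypothetical pooled mean difference and total variance
z, p_z = z_test(0.5, 0.04)
w2, p_w = wald_test_df1(0.5, 0.04)
print(z, w2, p_z, p_w)  # W^2 equals Z^2, and the two p-values agree
```

Because W^2 is just Z squared when a single parameter is tested, both routes lead to the same decision; the Z form additionally shows the sign of the difference.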


