

1) Is it possible to estimate an LCA with a distal outcome in a single step in Mplus (along the same lines as one does with LCGA or GGMM)? (Are there examples of this anywhere?) 2) Would I obtain the same results if I used a two-step procedure whereby I first estimated conditional latent classes with covariates (while also identifying my distal outcomes using the "auxiliary" command) and exported the results to another statistical package such as Stata, and then ran regular regressions of my distal outcomes on the class probabilities in Stata? Many thanks for any information!


1. Yes. It would be like Example 8.6 but with an LCA model instead of a GMM model. 2. It is always preferable to do the entire model estimation in one step. Doing it in two steps would introduce estimation errors and the standard errors will be incorrect. 
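As a rough illustration of the one-step setup described in this reply, a minimal sketch might look like the following. The file name and the variable names (u1-u4 as latent class indicators, x as a covariate, d as a binary distal outcome) are assumptions made for illustration and are not taken from Example 8.6.

```
TITLE:    one-step LCA with a covariate and a binary distal outcome
DATA:     FILE = lca.dat;          ! hypothetical file name
VARIABLE: NAMES = u1-u4 x d;       ! hypothetical variable names
          CATEGORICAL = u1-u4 d;   ! d is the distal outcome
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c ON x;                          ! covariate predicting class membership
  ! thresholds of u1-u4 and d vary across classes by default,
  ! so the class-specific thresholds of d carry the class-distal relation
```

Because d is an analysis variable in this sketch, it contributes to the formation of the classes together with u1-u4.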


Thank you, Linda. Are there any references you would recommend for me to read more about this type of estimation error? Thanks! 


I don't know of any specific reference. Perhaps you could find something in a general statistical text. 


I am running an LCA with several covariates and distal outcomes. I would like to control for the distal outcome scores in the fall, so I have specified "outcome on fallscore" in my overall model. Because of this, I get intercepts for my distal outcomes instead of means. I have two questions: 1) What is the best way to request the class means for the outcomes? TECH7? 2) What is the best way to compare my 3 classes on the distal outcomes? I have used MODEL TEST, but isn't that comparing the class intercepts and not the means in my particular model? Thank you in advance for your time!


1. TECH7 is sample statistics. TECH4 and the RESIDUAL option of the OUTPUT command give model estimated values. 2. You would need to define the means in terms of the model parameters using new parameters in MODEL CONSTRAINT where you can also test the differences. 
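One possible way to set this up is sketched below for a 3-class model with a single distal outcome. It rests on an assumption that is not part of the original posts: fallscore is brought into the model by mentioning its variance and class-specific means, which changes the model (fallscore then has a distributional assumption and can influence class formation). All parameter labels are hypothetical, and the fragment omits the rest of the input.

```
MODEL:
  %OVERALL%
  outcome ON fallscore (b);   ! common slope
  fallscore;                  ! assumption: bring fallscore into the model
  %c#1%
  [outcome] (a1);             ! class-specific intercepts
  [fallscore] (mx1);          ! class-specific covariate means
  %c#2%
  [outcome] (a2);
  [fallscore] (mx2);
  %c#3%
  [outcome] (a3);
  [fallscore] (mx3);
MODEL CONSTRAINT:
  NEW(m1 m2 m3 d12 d13 d23);
  m1 = a1 + b*mx1;            ! model-implied class means of the outcome
  m2 = a2 + b*mx2;
  m3 = a3 + b*mx3;
  d12 = m1 - m2;              ! differences, reported with standard errors
  d13 = m1 - m3;
  d23 = m2 - m3;
```

The new parameters are printed with standard errors and z-tests, so d12, d13, and d23 give the pairwise comparisons of the class means rather than of the intercepts.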

Keri Jowers posted on Tuesday, September 20, 2011 - 12:08 pm



Hi, I'm aiming to use LCA to predict both categorical and continuous distal outcomes. From what I see in Example 8.6, it seems like the current recommendation is to basically do the equivalent of including it as a covariate (c on x). Is that the correct interpretation? If so, when I do that, the sizes of my 3 classes shift more than I'd expect or like (even when I've specified the start values for each of the latent class indicators). My understanding is that this indicates that the model may be unstable or not replicable, and that the number of classes may not be correct. The 2- and 3-class models have very similar fit indices:

2-class (79.14%, 20.86%): AIC = 4431.098, BIC = 4498.885, SSABIC = 4451.258, LMR-adj LRT p = .0007, BLRT p = .0000, entropy = .789
3-class (47.31%, 31.33%, 21.36%): AIC = 4400.205, BIC = 4504.146, SSABIC = 4431.146, LMR-adj LRT p = .0378, BLRT p = .0000, entropy = .720

The 3-class model is a better fit theoretically/substantively. What, then, is the best way to estimate the association between the latent classes and the distal outcome? Can I trust the results I get when the classes are shifting? Thanks so much in advance for your input!


In ex 8.6 the distal outcome is u, not x. 

Keri Jowers posted on Wednesday, September 21, 2011 - 10:43 am



Apologies - I mistyped. So, in ex 8.6, the distal is mentioned only in the "categorical = u" statement. How would one estimate the association between the latent classes and a continuous distal?


The key is that u is on the NAMES list so it is an analysis variable. In the case where all variables on the NAMES list are not analysis variables, u would have been on the USEVARIABLES list. The same holds for a continuous distal outcome. It needs only to be on NAMES or NAMES and USEVARIABLES. 

Keri Jowers posted on Thursday, September 22, 2011 - 11:49 am



Right, I've got the NAMES and USEVARIABLES piece. Perhaps a better way of posing my question is this: When the continuous distal is in the USEVARIABLES list and I then set the start values for my LC indicators (not the u) using my previously obtained thresholds to try to preserve my classes, the output provides me with class-specific means for the distal, and these are based on the re-estimated model I mentioned above. Are these means and their associated p-values intended to be interpreted as the association between the latent class and the distal? This seems counterintuitive to me.


The relationship between the categorical latent variable and the distal is found in the varying of the means of the distal across classes. The question you want to ask is if these means are the same across classes or different. You can use MODEL TEST to answer this question. 
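A minimal sketch of such a test for a 3-class model with a continuous distal y is shown below. The variable name y and the labels m1-m3 are hypothetical, and the fragment omits the rest of the input.

```
MODEL:
  %OVERALL%
  %c#1%
  [y] (m1);        ! label the class-specific means of the distal
  %c#2%
  [y] (m2);
  %c#3%
  [y] (m3);
MODEL TEST:
  0 = m1 - m2;     ! joint Wald test that all three means are equal
  0 = m2 - m3;
```

The two restrictions are tested jointly, so the result is an overall test of equal means across classes; a pairwise comparison would use a single restriction.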

Keri Jowers posted on Friday, September 23, 2011 - 8:07 am



Thanks so much! One final question - how concerned should I be that the class sizes change drastically when compared to when the distal is not included in the USEVAR statement? Not only are the class proportions very different (below), but the sample proportions within each class are very different:

without distal in USEVAR: 47.3%, 31.3%, 21.4%
with distal in USEVAR: 76.8%, 16.9%, 6.3%


When you add a distal outcome, it is the same as adding another latent class indicator. The classes will be affected by this. This means that the distal is being taken into account which it should be. 


We have estimated a six-class solution in LPA and are interested in using the LPA class membership (which was estimated at Time 1) to predict a distal outcome (at Time 2) while controlling for various attributes at Time 1. I recently attended a conference and heard a presentation that discussed "distal-as-consequence", where class membership is treated as missing data in the regression of the distal outcome on class membership, and multiple imputations based on the posterior class probabilities (obtained from the estimated growth mixture model without the distal outcome included) are used to estimate the association between class membership and distal outcomes. I have searched the Mplus archive and reviewed the posted papers as well as the manual for more information/examples on this. Thus far, I have not found any. Any suggestions? Thank you.


Section 4 of this paper on our web site discusses plausible values for latent class variables obtained by multiple imputation and how those plausible values can be used: Asparouhov, T. & Muthén, B. (2010). Plausible values for latent variables using Mplus. Technical Report. 

C. Gantz posted on Monday, February 02, 2015 - 7:48 pm



I have read with great interest the many posts on using LCA to predict distal outcomes. I understand this is a complex topic. In my analysis, I would like to use a 3-class solution at T1 to predict a variety of continuous T2 outcomes. I would additionally like to control for the T1 levels of the T2 outcomes. In the first step of this analysis, a 3-class solution was best based on AIC, BIC, and Lo-Mendell, with entropy of .89. These three classes also make a lot of sense theoretically. My question is as follows: I understand that the one-step approach is often preferable here. However, I read the Clark & Muthen (2009) piece, and it seems that when entropy is high, it is acceptable to use the most likely class membership. When I included the outcome in the class estimation, this significantly changed the formation of the latent classes in a way that no longer made theoretical sense. Am I right to interpret the Clark & Muthen (2009) paper as saying that in this case, given the high entropy from my 3-class results, I could be justified in assigning most likely class membership and using it in follow-up analyses?


Yes. 

C. Gantz posted on Tuesday, February 03, 2015 - 10:32 am



Thank you so much for your quick reply, Bengt! 

C. Gantz posted on Tuesday, February 03, 2015 - 10:40 am



Apologies, one quick follow-up here: The Clark & Muthen paper appears to be unpublished - is this the case? If so, do you happen to have a reference to this line of logic in a published paper? Thank you!


You can open the data file with any Mac text editor. 


I think you can refer to the paper below for this logic: Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21:3, 329-341. The posted version corrects several typos in the published version. An earlier version of this paper was posted as web note 15. Appendices with Mplus scripts are available here.

Yueqi Yan posted on Monday, April 27, 2015 - 9:42 pm



Is there any way to examine effect size when comparing the mean differences of the distal outcomes among classes? Thanks! 


You can divide the mean difference by the standard deviation of the distal. 
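For an observed continuous distal, this calculation could be sketched in MODEL CONSTRAINT as follows. The reply does not say which standard deviation is meant; this sketch assumes the within-class standard deviation, using the variance that Mplus holds equal across classes by default. The variable name y and all labels are hypothetical.

```
MODEL:
  %OVERALL%
  y (v);                       ! variance of the distal, equal across classes
  %c#1%
  [y] (m1);
  %c#2%
  [y] (m2);
MODEL CONSTRAINT:
  NEW(es12);
  es12 = (m1 - m2)/SQRT(v);    ! standardized mean difference, classes 1 vs 2
```

Defining the effect size as a new parameter also yields a standard error for it in the output.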

Yueqi Yan posted on Wednesday, April 29, 2015 - 12:10 pm



Thanks Bengt! So how can I obtain the standard deviation of the distal? My distal is a latent variable. I used the fixed factor loading method and could not directly obtain the variance of the latent distal from the output. The output shows only the standard error of the mean difference and the residual variance of the distal outcome for each latent class. Should I run a separate model without latent classes to see the variance of the distal outcome? Thanks again!


You get the factor variance of a distal in TECH4. 


Hi, I am running an LCA with regression using the 3-step approach, where the latent classes predict distal outcomes. Is it possible to control for other variables in the model that are not used to form the latent classes, such as ethnicity or SES? Many thanks


Yes, manually - see Web Note 21.
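A rough outline of the manual BCH approach from Web Note 21, with a covariate in the final model, is sketched below as two separate inputs. The file names, the variable names (u1-u10 as class indicators, y as the distal, x as the control variable), and the names given to the saved weights (w1-w3) and modal class (mlc) are assumptions for illustration; the order of variables in the saved file should be checked against the SAVEDATA information at the end of the first output.

```
! ---- Input 1: estimate the latent class model and save the BCH weights
DATA:     FILE = step1.dat;            ! hypothetical file name
VARIABLE: NAMES = u1-u10 y x;
          USEVARIABLES = u1-u10;
          CATEGORICAL = u1-u10;
          CLASSES = c(3);
          AUXILIARY = y x;             ! carried along, not used in step 1
ANALYSIS: TYPE = MIXTURE;
SAVEDATA: FILE = bch.dat;
          SAVE = BCHWEIGHTS;

! ---- Input 2: distal outcome model using the saved weights
DATA:     FILE = bch.dat;
VARIABLE: NAMES = u1-u10 y x w1-w3 mlc;
          USEVARIABLES = y x w1-w3;
          CLASSES = c(3);
          TRAINING = w1-w3(BCH);
ANALYSIS: TYPE = MIXTURE;
          STARTS = 0;
          ESTIMATOR = MLR;
MODEL:
  %OVERALL%
  y ON x;          ! slope held equal across classes in this sketch;
                   ! intercepts of y vary across classes by default
```

In the second input the classes are not re-estimated; the weights carry the classification from the first run, so adding x does not alter class formation.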


Thank you 


I am trying to use a 3-class latent profile model to predict a continuous distal variable. I was only able to proceed to an LPA with an auxiliary variable included in the assessment, following the four articles I have read:

1. Clark & Muthén (2009). Relating Latent Class Analysis Results to Variables not Included in the Analysis
2. Asparouhov & Muthén (2014). Auxiliary variables in mixture modeling
3. Lanza et al. (2013). Latent Class Analysis With Distal Outcomes
4. Web note 14

I decided to go for the Lanza approach, following the simulation results from article 2, which state that the Lanza approach seems to perform well, particularly in the situation of my proposed model. Here is the syntax for the LPA using Lanza:

    VARIABLE: names are SRM WRM CI TT CTEU PRODIS;
      USEVARIABLES ARE SRM WRM CI TT CTEU;
      classes = c(3);
      auxiliary = PRODIS(dcon);
    ANALYSIS: type = mixture;
    MODEL:
      %overall%
      [SRM WRM CI TT CTEU];
      %c#1%
      [SRM WRM CI TT CTEU];
      %c#2%
      [SRM WRM CI TT CTEU];
      %c#3%
      [SRM WRM CI TT CTEU];
    savedata: file is LCAinput1.dat;
      save = cprob;

I know that I need to use the data saved from the above to run the next analysis, using the 3 classes to predict the continuous distal variable. I tried to imitate ex8.6, but I do not understand how I can use it in the context of my model. Your help will be very much appreciated.


You can choose between 3 ways to handle the distal: 1-step, automatic 3-step, and manual 3-step. Your input is for automatic 3-step. This does not need a further step, but you get all the information you need in the output - the distal's means and tests of their differences across classes. In the 1-step approach the distal is part of the model, that is, on the USEV list. This gives you the means of the distal in different classes. You should add a 5th paper to your reading list, which shows the methods we recommend in 2 tables at the end: Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Using the BCH method in Mplus to estimate a distal outcome model and an arbitrary second model. Web note 21. For a continuous distal we now recommend "BCH".
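Applied to the input posted above, the automatic BCH recommendation would amount to changing only the setting on the AUXILIARY option, as in this sketch of the VARIABLE command. It assumes an Mplus version that recognizes the BCH setting; the rest of the input is unchanged.

```
VARIABLE: names are SRM WRM CI TT CTEU PRODIS;
  USEVARIABLES ARE SRM WRM CI TT CTEU;
  classes = c(3);
  auxiliary = PRODIS(BCH);   ! was PRODIS(dcon)
```

As with the other automatic settings, no second run is needed: the class-specific means of PRODIS and the tests of their equality appear in the output.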


Hi Bengt, This is massively helpful. I have read Web Note 21 and attempted to run the analysis using the automatic BCH approach. I presume that this is a new feature developed after version 7.2, since I got an error message saying:

    *** ERROR in VARIABLE command
    Unrecognized setting in the AUXILIARY option: BCH for variable(s): PRODIS

Sabrina


BCH came out in Version 7.3. 


Hi, I created profiles of families based on their responses to items concerning their relationship. I am now interested in using these profiles to predict distal outcomes for the siblings in those families (the data are nested). Does the manual BCH method work with multilevel modeling with continuous outcomes? I was told that perhaps I need to use the BCH weights that are generated in the latent class run, but I'm not really sure what this means. If this is correct, can you please clarify? Thanks!


BCH is suitable for distal outcomes, but is not available for multilevel situations. But having siblings in families does not necessitate multilevel modeling - instead you can use a "wide" approach. See our Topic 3 or 4 handouts.


Dear Drs Muthén, I performed an LPA with 2 latent variables (3 profiles each), and I would like to predict a continuous distal outcome using the BCH approach (manual, since I'd like to have other covariates in my final model).
- Is it possible? I tried, and my input is read correctly and the calculations are done, but the output is empty (there is nothing after the "warnings" section). Also, the BCH weights are not computed (although the file has been created, it is empty).
- If BCH is not available in my case, can you suggest an alternative approach (other than categorizing subjects using latent class membership)?
Thanks for your help


Please send your input, output, and data to Support along with your license number. 


Dear Muthen, I tried to compare the estimated within-class means of the distal outcome obtained by DE3STEP, DU3STEP, DCON, and BCH. Below is the part of the input file I used in the practice simulation study (from Appendix P, Asparouhov, T., & Muthen, B. (2014)):

    Montecarlo:
      Names are u1-u5 y;
      Generate = u1-u5(1);
      Categorical = u1-u5;
      Genclasses = c(2);
      Classes = c1(2);
      Nobservations = 500;
      Nrep = 10;
      Auxiliary = y(DE3STEP); *(DU3STEP / DCON / BCH)
      ......

I have found differences in the estimated within-class means of the distal outcome in the results under "EQUALITY TESTS OF MEANS ACROSS CLASSES USING THE (3STEP / BCH / ...) PROCEDURE". 1) I wonder whether the generated data in each simulation condition (3STEP, DCON, BCH) are the same? 2) If the generated data are not the same, is it possible to say that the results can be compared across the approaches? Thanks in advance for your help. Myungho Shin


1) You can save the generated data using REPSAVE = ALL; SAVE = sample*.dat; 2) If the data are not the same, most likely you should not compare the results.
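In context, the two options mentioned in this reply would sit in the MONTECARLO command, roughly as in the fragment below. The SEED value is a hypothetical choice added so that each run starts from the same random seed, and the remaining commands (MODEL POPULATION, ANALYSIS, MODEL) are omitted.

```
MONTECARLO:
  NAMES = u1-u5 y;
  GENERATE = u1-u5(1);
  CATEGORICAL = u1-u5;
  GENCLASSES = c(2);
  CLASSES = c(2);
  NOBSERVATIONS = 500;
  NREPS = 10;
  SEED = 4533;              ! hypothetical seed, kept identical across runs
  REPSAVE = ALL;            ! save every replication
  SAVE = sample*.dat;       ! the asterisk is replaced by the replication number
  AUXILIARY = y(DE3STEP);   ! swap in DU3STEP, DCON, or BCH for the other runs
```

Comparing the saved files from the different runs then shows directly whether the generated data sets are identical.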


Thanks for your help. I have additional questions on my practice simulation. 1) I performed 4 separate Monte Carlo simulations, changing the approach with all other conditions remaining equal, as below:

    study 1 "auxiliary = y(DE3STEP);"
    study 2 "auxiliary = y(DU3STEP);"
    study 3 "auxiliary = y(DCON);"
    study 4 "auxiliary = y(BCH);"

In each study, different data sets were generated. Is it still possible to compare the performance of the approaches? 2) I tried to conduct an external Monte Carlo simulation in order to have the same data sets across the simulated approaches. However, I got an error message:

    *** ERROR in VARIABLE command
    Auxiliary variables with E, R, R3STEP, DU3STEP, DE3STEP, DCATEGORICAL, DCONTINUOUS, or BCH are not available for TYPE=MONTECARLO.

It would be appreciated if you could give any advice on this.


1) I get the same data sets. Send your example to support@statmodel.com. 2) You can use Mplus with R to do an external Monte Carlo study, see http://statmodel.com/usingmplusviar.shtml or this: https://www.statmodel.com/utility/extractor.shtml


Hello, If I restructure my data, which has nesting within person and within family (i.e., longitudinal data collected from 2 siblings in a family), to the wide format in order to use the BCH method for predicting distal outcomes, would I need to estimate the distal outcome for each sibling separately (i.e., class membership predicting an outcome for first-born siblings and then for second-born siblings)? Thanks!


If the distal outcome is a sibling-level variable, yes. But not if the distal outcome is a family-level variable.
