I am trying to use a 6-class solution in LCA to predict continuous outcomes (e.g., depression, or t2dep) at Time 2 while controlling for various things at Time 1 (Time 1 depression, or t1dep, sex, age, income, education, and marital status).
I found a lot of information about the syntax to predict class membership, but not much about how to use class membership to predict outcomes. When I used the following syntax, I got the ERROR message below it:
%Overall% T2dep on C T1dep rc001re sexrsp rb003red t1age;
The following MODEL statements are ignored:
* Statements in the OVERALL class:
  T2DEP ON C#1
  T2DEP ON C#2
  T2DEP ON C#3
  T2DEP ON C#4
  T2DEP ON C#5
Can you tell me what I am doing wrong? I also tried writing out C#1-C#5 in the model statement, but I got the same error message. What is the appropriate syntax for using class to predict an outcome, controlling for various covariates?
I am looking at Example 8.6, however, and it refers to "GMM with a categorical distal outcome using automatic starting values and random starts", but my distal outcome is CONTINUOUS (e.g., depression). It is my predictor (class) that is categorical. Am I looking at the wrong example?
It does not matter whether the distal outcome is continuous or categorical or what the model is. The distal outcome is connected with the categorical latent variable. For a continuous distal outcome, the means vary over classes. For a categorical distal outcome, the thresholds vary across classes.
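A minimal sketch of how this could look for the original question, assuming a 6-class LCA whose indicators are called u1-u8 and a data file called mydata.dat (both hypothetical; the thread does not give them). The outcome and covariate names are taken from the posted syntax. The class variable c does not appear on the right-hand side of ON; its effect on t2dep shows up as class-specific intercepts:

```mplus
TITLE:    LCA with a continuous distal outcome (sketch);
DATA:     FILE = mydata.dat;            ! hypothetical file name
VARIABLE: NAMES = u1-u8 t2dep t1dep rc001re sexrsp rb003red t1age;
          CATEGORICAL = u1-u8;          ! assumed categorical indicators
          CLASSES = c(6);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  t2dep ON t1dep rc001re sexrsp rb003red t1age;   ! covariates only, no c
  %c#1%
  [t2dep];      ! intercept of t2dep in class 1
  %c#2%
  [t2dep];      ! intercept of t2dep in class 2
  ! ... and likewise for classes 3 to 6
```

Differences between the class-specific intercepts of t2dep can then be read as the class effects on Time 2 depression, adjusted for the Time 1 covariates.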
Anjali Gupta posted on Saturday, October 17, 2009 - 10:22 am
I'm attempting a similar model. I've decided on a 3-class model and would now like to use class membership to predict a distal outcome (depression), and also use depression to predict class.
However, I do not want depression to be involved with determining class membership. Is this possible? And is there an example of such a model?
It's critical that class membership is not based on depression.
In your first paragraph you mention depression twice: as a DV and as an IV. I assume you mean depression measured at T1 and T2 - otherwise you need to clarify.
I hear this question often - you have a distal outcome that you want to predict from latent class membership but you don't want the distal to influence your classes. It is interesting that this question never seems to come up in SEM when the latent class variable is replaced by a factor - the factor is also determined in part by the distal. My feeling is that the problem should be re-conceptualized - the distal should influence the latent class membership in a first model step. Then, using this model, in a second step a new sample can be considered for which you classify people into the classes not using the distal information, fixing the parameters at the values from the first step.
True, you can do the analysis without the distal in the first step and then do the second step. But that first step would not draw on the strength that the latent classes are informed by their relationship to the distal - if you think the distal has different means/probabilities for different classes, why not use that information in forming the classes? Also, you would not have an estimate of how the classes influence the distal.
Thank you. Yes, there are 2 depression time points, sorry.
I'm new to SEM/Mplus and haven't used factors/latent variables.
However, my motivation for separating the two (distal and class) is that I want to see whether the latent class (composed of economic variables) is related to depression. I don't wish to confound the classes by including depression as an indicator. I'd like to see whether people who are 'more' successful economically have a different relation to depression than people who are less successful economically.
And when you suggest using different samples - I'm unclear if you mean an entirely different set of records (people). I only have my main sample of 632 persons.
If at all possible, could you further explain the steps required for both of your suggestions?
You can do LCA on the economic variables only and then (1) fix all those parameters when adding the distal, or (2) get the most likely class membership (if the entropy is big) and let that be an observed variable that predicts the distal. That's one line of thinking.
The other line is saying that you can get the economics latent classes more pertinent to predicting depression if you include the depression distal in the modeling. You can for example split your sample and estimate that full model in the first step and then in the second step use the other half of the sample and use the econ items only to form latent classes with parameters fixed at the values from the first run - and see how that latent class membership relates to depression. We teach about similar approaches in "Topic 6" of our short courses - see videos and handouts on our web site.
Reading your very helpful comments above brought a question to my mind. In a standard LCA with ordered categorical outcomes, would one only need to fix the thresholds prior to adding the distal outcomes (and other covariates)?
If at all possible, I'd appreciate some starting steps to accomplish your suggestions above of:
"You can do LCA on the economic variables only and then (1) fix all those parameters when adding the distal, or (2) get the most likely class membership (if the entropy is big) and let that be an observed variable that predicts the distal."
From the follow up emails I understand I need to fix both thresholds and probabilities - and, unfortunately, haven't found an example I understand.
I 'think' the section of Topic 6 that is related starts on page 142 - but I am unclear how to translate the results of my LCA into the 2nd step with the distal.
It is a little hard to teach this topic in a short Mplus Discussion format - better to come to our short courses or watch their movies (the Topic 5 mixture movie will shortly be available from our recent Berlin teaching). But let me say a few things:
For the simplest approach (2) above, you get the most likely class membership by requesting cprob in the Save option of the Savedata command - see p. 649 of our UG.
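As a rough sketch of that first run, assuming six economic indicators named econ1-econ6, an id variable, and an output file called cprobs.dat (all hypothetical names), the LCA uses only the economic items while the depression variables are carried along as auxiliary variables:

```mplus
DATA:     FILE = mydata.dat;                    ! hypothetical file name
VARIABLE: NAMES = id econ1-econ6 t1dep t2dep;   ! hypothetical names
          USEVARIABLES = econ1-econ6;           ! class indicators only
          CATEGORICAL = econ1-econ6;
          IDVARIABLE = id;
          AUXILIARY = t1dep t2dep;              ! saved, but not in the model
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
SAVEDATA: FILE = cprobs.dat;                    ! hypothetical file name
          SAVE = CPROBABILITIES;                ! cprob is the short form
```

The saved file should then hold, for each person, the posterior probability of each class and the most likely class, which can be read into a second run as an observed variable.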
To fix parameters, see UG ex 7.5 where instead of * you use @. The ex shows the item parameters - you also have to add the class prob-related parameters, for example
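The example that followed is not preserved in the thread. A minimal sketch of what fixing could look like, assuming a 3-class LCA with three binary indicators econ1-econ3 and a distal outcome t2dep (hypothetical names), with every numeric value a placeholder to be replaced by the estimates from the first run:

```mplus
MODEL:
  %OVERALL%
  [c#1@0.916 c#2@0.405];   ! class intercepts (logits) fixed at step-1 values
  %c#1%
  [econ1$1@-1.20 econ2$1@-0.85 econ3$1@-1.40];   ! thresholds, class 1
  [t2dep];                                       ! distal mean left free
  %c#2%
  [econ1$1@0.10 econ2$1@0.35 econ3$1@-0.05];     ! thresholds, class 2
  [t2dep];
  %c#3%
  [econ1$1@1.60 econ2$1@1.10 econ3$1@1.75];      ! thresholds, class 3
  [t2dep];
```

With the item thresholds and the class intercepts held at @ values, only the distal outcome parameters are estimated in this run, so the distal does not reshape the classes.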
Jinseok Kim posted on Wednesday, October 21, 2009 - 4:46 am
Hi Bengt, I am conducting an LCA with a number of binary class indicators and a continuous distal outcome variable. I've got thresholds for the class indicators, and means and SEs for the distal outcome variable for each class. My question is now whether there is any way to test whether the estimated means of the distal outcome variable differ across the classes.
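One possible way to set up such a test is a Wald test through the MODEL TEST command, sketched here under the assumption of 3 classes and a distal outcome named y (hypothetical; the labels m1-m3 are also just illustrative):

```mplus
MODEL:
  %OVERALL%
  %c#1%
  [y] (m1);      ! distal mean in class 1, labeled m1
  %c#2%
  [y] (m2);      ! distal mean in class 2
  %c#3%
  [y] (m3);      ! distal mean in class 3
MODEL TEST:
  0 = m1 - m2;   ! the two restrictions together test m1 = m2 = m3
  0 = m2 - m3;
```

The two restrictions are tested jointly, so the result is an overall test of equal means with 2 degrees of freedom; a single pairwise comparison would keep only one of the two lines.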
Your previous responses have been most helpful. I've since used 'hard coded' class membership in regressions with distal predictors per the suggestion: "(2) get the most likely class membership (if the entropy is big) and let that be an observed variable that predicts the distal."
I have 2 follow up questions:
First - I've learned how to interpret results when using classes (2 of the 3 classes) as dependent variables. Could class be an independent variable predicting a distal 'outcome'? And how would that be interpreted? Would I use class as a nominal variable - or test an excluded class by including the 2 remaining classes as independent variables?
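A sketch of the second option (two dummy variables with the third class as the excluded reference), assuming the saved file is called cprobs.dat and that the most likely class is stored in a variable read in as cmod. The file name, variable names, and column order are all hypothetical and need to be checked against the SAVEDATA information in the LCA output:

```mplus
DATA:     FILE = cprobs.dat;               ! hypothetical file from the LCA run
VARIABLE: NAMES = econ1-econ6 t1dep t2dep cprob1-cprob3 cmod;
          USEVARIABLES = t2dep t1dep d1 d2;   ! new variables go last
DEFINE:   d1 = 0;
          d2 = 0;
          IF (cmod EQ 1) THEN d1 = 1;      ! class 1 versus class 3
          IF (cmod EQ 2) THEN d2 = 1;      ! class 2 versus class 3
MODEL:    t2dep ON d1 d2 t1dep;
```

Under these assumptions the coefficient for d1 would be the difference in Time 2 depression between class 1 and the reference class 3, holding t1dep constant. This treats class membership as known, which is why it is only recommended when the entropy is high.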
Second - I can't get my head around the choice between (1) using covariates in class creation - versus (2) creating classes without covariates and including covariates when finding the relations between class membership and distal predictors. I understand this may be an issue specific to the research questions - but any insight would be helpful.
Hi, I have a question similar to those in this thread but a little more elaborate. I want to use latent classes formed from variables X1-X8 as "baseline" covariates in a growth model of Y1-Y4, i.e., growth of Y over 4 time points.
Specifically: 1) The "baseline" latent classes are formed from the baseline variables X1-X8 only. 2) All individuals are followed up post-baseline with repeated measurements of Y: Y1-Y4. I would like to fit a growth model for Y for each individual allowing the intercepts and slopes to depend on the latent classes formed from (1).
I would like uncertainty in class formation to be taken into account when doing the growth model regression, but I definitely do *not* want the Y's to be used in the formation of the latent classes. In my specific example it makes no sense to have class formation using future measurements (e.g., classes are "baseline" only).
Can this be modelled in Mplus? If so, where should I look for such examples?
In the worst case, I guess I could fix each individual's class and do the second-stage regression as if these were fixed in advance, and then try to concoct a post-hoc misclassification correction of the effects of each class based on the posterior class probabilities for each individual. I haven't seen this done but I suspect it may have been. Any pointers would be appreciated.
This is a complex topic. In one sense the Y's should contribute to the classes for X1-X8 because the Y's are correlated with the X classes so why not use all the information pertaining to the X classes?
One approach that might help is to add a latent class variable for the Y's, so having two latent class variables. And then let those two be related.
Another approach is to impute m data sets for the X class variable (most likely class) using only X1-X8 information. And then do the analyses m times relating X class to the Y's.
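The analysis step of this approach could be sketched as follows, assuming the m data sets have already been created outside this run, that each contains two dummy variables d1 and d2 for the imputed X class (3 classes, class 3 as reference), and that their file names are listed in a text file called classdraws.dat. All of these names are hypothetical:

```mplus
DATA:     FILE = classdraws.dat;       ! lists the m data set names
          TYPE = IMPUTATION;           ! analyze each set, then combine
VARIABLE: NAMES = d1 d2 y1-y4;         ! hypothetical variable names
MODEL:    i s | y1@0 y2@1 y3@2 y4@3;   ! linear growth over 4 time points
          i s ON d1 d2;                ! intercept and slope depend on class
```

With TYPE = IMPUTATION the estimates are averaged over the m analyses and the standard errors include the between-data-set variation, which is how the uncertainty in class assignment enters the results.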
I think some object to using both X and Y information in forming the X classes because they want to use the X classes for prediction of Y in later stages where only X information is available. If that's the case, I would conjecture that you get a better prediction model when the X classes have been formed based on both X and Y.
I have read the above discussion and have a follow-up question about fixing class probabilities. I chose the option of estimating latent classes (i.e., social support from friends and parents at one measurement) in one half of the sample, then using the other half of the sample to fix the class solution and use this latent class variable to predict a distal outcome (a latent growth curve of social anxiety).
You state that: "To fix parameters, see UG ex 7.5 where instead of * you use @. The ex shows the item parameters - you also have to add the class prob-related parameters, for example
What do x and y stand for, exactly? I understand from example 7.5 that I would need to fix the estimated means and variances for each of the latent classes; but from your response I need to additionally (!) fix the class probabilities. Is that what x and y stand for? How do I calculate/get these?
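The quoted example is cut off in the thread, so the following rests on an assumption: that x and y are the values at which the intercepts of the latent class variable, [c#1] and [c#2], are fixed in a 3-class model. Those intercepts are logits relative to the last class, so with model-estimated class proportions $\pi_1, \pi_2, \pi_3$ the relation would be:

```latex
x = \ln\!\left(\frac{\pi_1}{\pi_3}\right), \qquad
y = \ln\!\left(\frac{\pi_2}{\pi_3}\right), \qquad
\pi_k = \frac{e^{a_k}}{\sum_{j=1}^{3} e^{a_j}}
\quad \text{with } a_1 = x,\; a_2 = y,\; a_3 = 0 .
```

For illustration only, with hypothetical proportions of 0.5, 0.3 and 0.2, this gives x = ln(0.5/0.2) ≈ 0.916 and y = ln(0.3/0.2) ≈ 0.405. In practice there is usually no need to compute them by hand: the first-step output lists the estimated means of C#1 and C#2 in the part of the model results for the categorical latent variable, and those values can be copied after the @.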