Regressing outcomes on latent classes
 Ben Chapman posted on Saturday, March 24, 2007 - 2:15 pm
Hi--I just purchased Mplus, and after checking the manual and discussion board, it looks like one can't regress some distal outcome on a latent class variable using ON.

I'd like to use class membership in a three-category latent class variable as a predictor of a distal outcome in the same way one would use two dummies in a regression (and include observed covariates also predicting the outcome). Yet I want to keep the observed class indicators separate from the distal outcome for substantive reasons, in the same way one would keep indicators in the measurement model of a continuous LV different from a structural outcome regressed on the LV.

As I understand it, saving class probabilities and fitting a second regression model with dummy predictors representing class membership isn't optimal.

The DV is observed. Would specifying it as a continuous LV permit this sort of analysis? What are the options here?

Thanks in advance.
 Bengt O. Muthen posted on Saturday, March 24, 2007 - 3:08 pm
This brings up two issues which may not always be well understood in mixture modeling.

First, modeling the influence of a latent class variable c on a distal outcome y is not done by saying y ON c, but what is done gives information equivalent to having used ON. As in dummy variable regression the influence of c on y implies that the means of y change over the classes of c. This is what Mplus accomplishes by either using the default where y is not mentioned in the model, or equivalently by mentioning the mean of y in each of the c classes. Same for a categorical y in which case the thresholds (probabilities) change over classes. - In other words, you get what you want out of Mplus without saying y on c.

Second, making an analogy outside mixture modeling, if you do SEM with a set of indicators of a factor f and want to specify that f influences a distal outcome y that can be done by saying

f BY indicators
y ON f

But that is statistically equivalent to saying

f BY indicators y

All the model says is that indicators are uncorrelated given (conditional on) f and that indicators are uncorrelated with y given f (all correlations go through f). And, because y is correlated with f, y data influences factor score estimation for f. You would not first estimate f using indicator information and then regress y on the estimated f scores. The same thing holds for the mixture situation.

So even if the indicators are substantively different from the distal outcome, the statistical modeling should be done using the Mplus default. The class formation obtained by analyzing only the indicators should not change substantially when adding the y information - it should only be better determined. If it does change, the classes are not well determined by the indicators alone.
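
A minimal sketch of the first point, assuming three classes, four continuous class indicators u1-u4, a distal outcome y, and one observed covariate x. The variable names, the data file name, and the labels m1-m3 are placeholders, not taken from the thread. The distal outcome is put on the USEVARIABLES list and its intercept is mentioned in each class, so it varies over the classes of c; the differences between class-specific intercepts then play the role of the two dummy-variable coefficients.

```mplus
TITLE:    LPA with a distal outcome (hypothetical variable names)
DATA:     FILE = mydata.dat;         ! placeholder file name
VARIABLE: NAMES = u1-u4 y x;
          USEVARIABLES = u1-u4 y x;
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  y ON x;            ! observed covariate predicting the distal outcome
  %c#1%
  [y] (m1);          ! intercept of y in class 1
  %c#2%
  [y] (m2);          ! intercept of y in class 2
  %c#3%
  [y] (m3);          ! intercept of y in class 3
MODEL CONSTRAINT:
  NEW(d13 d23);
  d13 = m1 - m3;     ! class 1 vs. class 3, like a dummy coefficient
  d23 = m2 - m3;     ! class 2 vs. class 3
```

Class 3 is used as the reference here only for illustration; any class could serve as the comparison.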
 Ben Chapman posted on Monday, March 26, 2007 - 12:06 pm
Thanks, this is very helpful and I was able to parameterize the model I needed.

The situation of class membership changing when other "outcomes" are added (beyond the intended "indicators", acknowledging the possible artificiality of the distinction) is interesting. I understand that the "outcomes" convey additional information about class membership and are statistically equivalent to indicators. On the other hand, I'm working from a substantive theory which dictates a typology defined by the "indicators" only, and if the membership and nature of the classes (i.e., standing on the original indicators) changes when adding other outcomes, even if classes are better determined, this could get tricky to explain to reviewers who aren't used to seeing this sort of analysis.

I thought of extension analysis, projecting extension variables into an already defined factor space by fixing loadings of the original indicator variables while freely estimating the loadings of the extension variables. What do you think of fixing means of the latent class "indicators" at the values obtained from the "indicator only" LPA--could this stabilize class membership when "distal outcomes" are added?
 Bengt O. Muthen posted on Monday, March 26, 2007 - 7:07 pm
Mplus lets you fix parameters in the model to the values from an analysis of the part that includes indicators only, estimating only the parameters related to the distal outcome in the different classes. In principle, you can even set individuals' class membership by using TRAINING data with the PROBABILITIES option, where those values are obtained from the CPROB option when analyzing the indicator part only.

Before attempting this, however, I would first do 2 analyses, (1) using only the indicator information, and (2) adding the distal information. A solid model for (1) should not change in any important way when doing (2), so this is a good check of the replicability of (1).
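
A sketch of the parameter-fixing route, assuming a three-class LPA with three continuous indicators u1-u3 and a distal outcome y. Every numeric value below is a made-up placeholder standing in for an estimate printed by the indicator-only run, and the variable and file names are hypothetical as well. Only the parameters of y are left free.

```mplus
TITLE:    Step 2, indicator part fixed at indicator-only estimates
DATA:     FILE = mydata.dat;           ! placeholder file name
VARIABLE: NAMES = u1-u3 y;
          USEVARIABLES = u1-u3 y;
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  [c#1@0.41 c#2@-0.22];           ! class logits (placeholder values)
  u1@0.95 u2@1.10 u3@0.88;        ! indicator variances (placeholders)
  %c#1%
  [u1@-0.80 u2@-0.65 u3@-0.72];   ! indicator means, class 1 (placeholders)
  [y];                            ! free: mean of y in class 1
  %c#2%
  [u1@0.05 u2@0.10 u3@-0.02];     ! indicator means, class 2 (placeholders)
  [y];                            ! free: mean of y in class 2
  %c#3%
  [u1@1.20 u2@1.05 u3@1.31];      ! indicator means, class 3 (placeholders)
  [y];                            ! free: mean of y in class 3
```

Note that y still contributes to each person's posterior class probabilities in this setup; it is only the class definitions (the measurement parameters and class sizes) that are held at the indicator-only values.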
 Tom Hildebrandt posted on Friday, June 01, 2007 - 12:00 pm

I believe I have a similar type of question. I am interested in looking at the effects of a latent class variable on treatment outcome. I only have data at baseline (t1) and end of tx (t2), so I can't look at this in a growth model. The latent class variable (c) has 14 indicators. I have two issues that I would appreciate advice on handling: (1) the t2 outcomes are repeated measures of several of the t1 variables, and (2) most of the t2 outcomes are zero-inflated (about 80% were free of certain symptoms).

Because I'm ultimately interested in whether c has any predictive clinical validity on certain outcomes (8 of the original 14 indicators) I'm unsure what my modeling options are here. Following the logic above, these t2 outcomes should be included in the estimation of c, although I am unsure how to handle the zero-inflated nature of several of these variables in such a model.

Any thoughts would be greatly appreciated.
 Linda K. Muthen posted on Monday, June 04, 2007 - 10:57 am
It sounds like you might want to consider a latent transition model where the inflation can be taken care of by a class where individuals do not have symptoms.
 Tom Hildebrandt posted on Tuesday, June 05, 2007 - 4:17 pm

Thanks for the advice, I like this idea. If I use an LTA approach, conceptually, should the t2 (c) variable look similar to the t1 (c) variable in terms of number of classes, particularly if I am only interested in 8 of the original 14 indicators at outcome? This may be hard to find if the response rate to treatment is about 80%, because there are so few subjects (relative to baseline) who would fall outside of the zero class. However, my zero class is conceptually a good fit for a categorical representation of "tx responder". I also wonder whether, if I get, say, a 2-class (c) at t2, that means my (c) (currently a 4-class variable) is not a reliable variable.

A follow-up question. A colleague here recommended that, for simplicity, I just use a zero-inflated Poisson regression model to look at the effect of (c) on these zero-inflated outcomes. However, this seems inconsistent with what Bengt described in his post on March 24, 2007 above. Am I correct here?
 Linda K. Muthen posted on Wednesday, June 06, 2007 - 7:50 am
If you have different (or more) latent class indicators at time one versus time two, you may have a different number of classes.

You could have two categorical latent class variables and regress c2 on c1 or you could have one latent class variable and regress outcomes at time 2 on c1. I don't see the contradiction you refer to.
 Tom Hildebrandt posted on Wednesday, June 06, 2007 - 7:13 pm
Thanks again Linda,

I actually think I confused myself in reading Bengt's post. I realized that the original question wasn't the same situation that I'm interested in modeling.
 Tom Hildebrandt posted on Monday, June 18, 2007 - 8:22 pm
I have a couple of follow-up questions now that I have started analyzing the data I discussed (June 5 above).

1. Is it possible to regress a categorical latent variable onto a zero-inflated latent variable?

2. In a typical zero-inflated model (ex7.2, User's Guide)
a) how do you interpret the threshold parameters in the output when your covariates are categorical?

b) If one of my categorical covariates is Tx condition and has 3 levels, how should I interpret the regression coefficient predicting u or u#1?

c) Am I correct to interpret the u#1 regressed on covariates to mean the prediction of zero?

Thanks in advance!
 Bengt O. Muthen posted on Tuesday, June 19, 2007 - 6:15 pm
1. Yes and no. Let the inflation variable be u#1 in the standard Mplus notation, where u#1 is the latent binary variable. Mplus will not let you say

cd#1 on u#1;

where cd is the latent class variable that you predict.

(Mplus only lets u#1 be a dependent variable, such as in UG ex8.5).

But you can use the explicit two-class approach to ZIP, where u#1 is instead represented as a latent class variable. See UG ex 7.25 where c is the same as u#1. Having formulated your ZIP model this way, Mplus lets you say

cd#1 on c#1;

Note, however, that the latent binary inflation variable is in my experience not very stable, but will change across somewhat different model specifications, so this type of regression is a bit "far out".

2.a) There are no thresholds - that language is reserved for categorical outcomes. Instead you have intercepts. This is explained in the text of UG ex 7.2 and also in ex3.8. See also text books for ZIP, such as Long's book.

2b) With 3 levels for a covariate, you have to create 2 dummy variables as usual in regression analysis.

2c) Yes
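
A minimal sketch of points 2b and 2c, in the style of the ZIP regression of UG ex3.8. It assumes one count outcome u and a treatment variable tx coded 0, 1, 2 with 0 as the reference condition; the coding, the variable names, and the file name are all assumptions made for illustration. The two dummies tx1 and tx2 are created in DEFINE.

```mplus
TITLE:    ZIP regression with a 3-level treatment (hypothetical names)
DATA:     FILE = mydata.dat;           ! placeholder file name
VARIABLE: NAMES = u tx;
          USEVARIABLES = u tx1 tx2;    ! DEFINE-created variables go last
          COUNT = u (i);               ! (i) adds the inflation part u#1
DEFINE:
  IF (tx EQ 1) THEN tx1 = 1;
  IF (tx NE 1) THEN tx1 = 0;
  IF (tx EQ 2) THEN tx2 = 1;
  IF (tx NE 2) THEN tx2 = 0;
MODEL:
  u ON tx1 tx2;       ! count part: differences in log rate vs. tx = 0
  u#1 ON tx1 tx2;     ! inflation part: differences in the logit of
                      ! being in the zero class vs. tx = 0
```

Under these assumptions, each coefficient compares one treatment condition with the reference condition, and the intercepts refer to the reference condition itself.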
 Tom Hildebrandt posted on Wednesday, June 20, 2007 - 12:03 pm
Thanks Bengt,

This is very helpful. It sounds as though you believe the ZIP model I suggested is potentially unreliable. Do you have any thoughts on an alternative approach? I was also considering using an LTA model, although my t2 timepoint will have a zero class that's not present at t1.

I will also definitely seek out Long's book.
 Bengt O. Muthen posted on Wednesday, June 20, 2007 - 6:05 pm
I'm unclear on the model you are working with. Are you considering an LTA for time 1, and wanting to see how the time 1 latent class variable influences the time 2 outcomes?
 Tom Hildebrandt posted on Friday, June 22, 2007 - 11:22 am

I'm sorry for the confusion. I described the model in my June 1 post. Briefly, I'm interested in whether latent class at t1 (14 indicators) is predictive of a range of outcomes at t2 (8 indicators). 3 of the outcomes are zero-inflated with Poisson distributions at t2, but not t1. t1 is pretreatment and t2 is post-treatment. I was potentially thinking of using an LTA to look at transition from t1 class to t2 class, but because of the zero inflation, I figured I would need a zero class at t2.

Essentially, I'm interested in whether a specific subtyping approach has any predictive clinical utility.

 Bengt O. Muthen posted on Saturday, June 23, 2007 - 8:47 am
I think the LTA approach is the way to go. So you specify an LCA measurement model at each of the two time points, defining the latent class variables c1 and c2. You don't have to have the same number of classes at the two time points. For the classes that are the same, you probably want to specify measurement invariance across time.

If you don't choose to use LTA, and don't have a latent class measurement model for time 2, you can still regress time 2 count variables on the time 1 latent class variable. You don't use the construction "u2 on c1", but you let u2 parameters vary as a function of c1 classes.

I hope I understand what you are interested in. Let me know if not.
 Bengt O. Muthen posted on Saturday, June 23, 2007 - 10:31 am

If you don't have a latent class variable c2 at time point 2 (as in LTA), the u2 items will be taken as conditionally independent given c1, which may not be what you want. Having c2, this is relaxed to conditional independence given c2.
 Tom Hildebrandt posted on Saturday, June 23, 2007 - 8:33 pm

I think you've convinced me. I will stick with the LTA model, in part, because of the conditional independence issue that you described.

If I am also interested in tx (3 conditions) as a predictor of the t1-->t2 transition (particularly to the zero class), should I use a multiple group model or treat tx as a covariate? Do you have a preference based on your experience? I already know that there is no difference between tx at outcome (58%, 61%, 59% symptom free at outcome), but I am also interested in whether c1 interacts with treatment, so I will also be testing a moderator model.

Thanks again for your advice!
 Bengt O. Muthen posted on Sunday, June 24, 2007 - 11:54 am
tx as a predictor of the transition is best modeled using tx as Knownclass, so that "c2 on c1" can vary across these tx classes. If you email me I can send you Mplus input for this LTA analysis.
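
The input offered by email is not reproduced in the thread. As a rough sketch only, written in the style of the binary-covariate LTA example found in more recent editions of the User's Guide, the KNOWNCLASS setup might look like the following. It assumes binary indicators (a1-a4 at t1, b1-b3 at t2), a treatment variable tx coded 0, 1, 2, four classes at t1 and two at t2; all names, the coding, and the file name are placeholders.

```mplus
TITLE:    LTA with transitions varying across treatment (sketch)
DATA:     FILE = mydata.dat;              ! placeholder file name
VARIABLE: NAMES = a1-a4 b1-b3 tx;
          CATEGORICAL = a1-a4 b1-b3;      ! assumes binary indicators
          CLASSES = cg(3) c1(4) c2(2);
          KNOWNCLASS = cg (tx = 0 tx = 1 tx = 2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c2 ON cg;        ! c1 ON cg is left out on the assumption that
                   ! treatment is assigned after t1
MODEL cg:
  %cg#1%
  c2 ON c1;        ! transition logits in treatment 1
  %cg#2%
  c2 ON c1;        ! transition logits in treatment 2
  %cg#3%
  c2 ON c1;        ! transition logits in treatment 3
MODEL c1:
  %c1#1%
  [a1$1-a4$1];     ! t1 thresholds vary over c1 classes only
  %c1#2%
  [a1$1-a4$1];
  %c1#3%
  [a1$1-a4$1];
  %c1#4%
  [a1$1-a4$1];
MODEL c2:
  %c2#1%
  [b1$1-b3$1];     ! t2 thresholds vary over c2 classes only
  %c2#2%
  [b1$1-b3$1];
```

Because the thresholds are mentioned only under MODEL c1 and MODEL c2, the measurement part in this sketch is the same in every treatment condition, and only the c1-to-c2 transitions differ across the cg classes.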
 Tom Hildebrandt posted on Thursday, July 19, 2007 - 6:18 am

I've been trying to run the LTA model described above building from the example that you sent me. I have a 4 class solution at t1 and 2 class solution at t2. As I mentioned, I'm interested in whether tx (3 conditions) moderates the transition from 4-classes to the final 2 classes.

I have 2 quick questions.

In your LTA example you used the following for your overall model:
c2#1 ON c1#1@0 cg#1 (p0);
[c2#1] (p1);

If my c1 variable has 4 classes and my cg variable has 3 classes, how would I specify this? Do I have to specify a separate regression for each c2 class fixed at zero (except the reference group) on c2#1? Similarly for cg#1 & cg#2?
c2#1 ON c1#1@0 c1#2@0 c1#3@0 cg#1 (p0);
c2#1 ON c1#1@0 c1#2@0 c1#3@0 cg#2 (p1);
[c2#1] (p1);

My second question is related to the differences in class numbers at t1 & t2. Because of the high tx response, I have a tx response (zero class) and a non-responder group at outcome. Although the indicators at t2 are repeated measures of t1, I'm not sure how to use constraints to model conditional independence. Any suggestions on this?

Thanks again for your help
 Linda K. Muthen posted on Thursday, July 19, 2007 - 9:59 am
You don't need to repeat the fixed parameters in the second line. I think you want a different label for [c2#1].

c2#1 ON c1#1@0 c1#2@0 c1#3@0 cg#1 (p0);
c2#1 ON cg#2 (p1);
[c2#1] (p2);

I'm not clear on what you mean by using constraints to model conditional independence.
 Tom Hildebrandt posted on Thursday, July 19, 2007 - 12:04 pm

Thank you for your answer to question 1 above and sorry about the confusion for question 2. What I was trying to understand is how to account for the repeated measures aspect of the model (for indicators of the c variables). Bengt suggested that I would constrain parameters to be invariant across classes that are the same at t1 & t2. However, I don't know that either of the classes at t2 is the same as any of the 4 classes at t1. In this case, would I not worry about invariance?

Is that more clear? Sorry again for the confusion.

 Bengt O. Muthen posted on Thursday, July 19, 2007 - 12:14 pm
If you don't want to impose measurement invariance across time, you just don't include the (1-4) and (5-8) type of parameter constraints shown in UG ex8.13.
 Tom Hildebrandt posted on Wednesday, July 25, 2007 - 6:02 pm

Thank you for the advice. I think I have my LTA model set. I was wondering if it typically takes a long time to converge (e.g., 6 hrs in a practice attempt). I have had some inconsistency with the t1 LCA model, particularly with replicating the best loglikelihood value, even with STARTS = 150 20.

My second question is about calculating the probabilities. If I have 3 types of treatment, a 4-class t1 variable (conceptualized as diagnostic categories), and a 2-class t2 variable (tx response vs. nonresponse), I believe I would be calculating the probabilities for 24 cells (8 per tx). What would the calculation of tx effects look like in this model with 3 treatments?

For example, suppose I wanted to calculate the effect of tx on class 1 (t1) moving to the non-responder class 2 (t2).
 Bengt O. Muthen posted on Thursday, July 26, 2007 - 8:39 am
LTA with 2 time points usually doesn't take that long. I would make sure as a first step that you get a T1 LCA model with many replicated loglikelihoods - if you can't replicate the LLs you probably have too many classes.

The probabilities for the 24 cells are given in the Mplus output. They are computed as shown in Chapter 13 of the UG.

I have a setup showing how to use the Mplus Model constraint feature to compute such tx effects on the probabilities that I can send you if you send me an email.
 Tom Hildebrandt posted on Saturday, September 15, 2007 - 10:59 am
In an LTA model that assesses the effects of treatment (I have 24 potential cells: 4 classes x 3 treatments x 2 classes), I have run into estimation problems because certain cells have no estimated members (i.e., class 1 in treatment 2 transitioning to symptom remission). My 4-class solution at t1 and 2-class solution at t2 appear to be replicable and good fits theoretically and statistically. Any thoughts on how to handle these low probability transitions?

Thanks in advance.
 Linda K. Muthen posted on Tuesday, September 18, 2007 - 4:36 am
Low probability cells are not a problem in general. However, you cannot estimate treatment effects in these. You should fix the treatment effects to zero for those cells.
 Tom Hildebrandt posted on Tuesday, September 18, 2007 - 12:11 pm

Just to make sure that I am doing this correctly, I would set parameters under the knownclass portion of the model to zero where I knew that there was no transition. For instance, knowing that no participants in c1#1 transition to c2#1 within cg#1 would be specified as follows:

Model cg:
c2#1 ON c1#1@0;

This also means that everyone in c1#1 transitions to c2#2 within cg#1, so would I fix that parameter as well or does fixing the first parameter take care of estimating the second parameter?

Thanks again for your thoughts.
 Bengt O. Muthen posted on Thursday, September 20, 2007 - 10:50 am
Here are a couple of comments. You mentioned in your Sept 15 message that you ran into estimation problems. We took that to mean that intervention effects couldn't be identified in some cells due to zero counts. But you are now mentioning zero transition probabilities which should be handled automatically in Mplus by Mplus fixing large/small logits. So it is unclear what the problem is. Also, Chapter 13 of the User's Guide describes how "a" and "b" logit parameters combine to form logits and therefore transition probabilities.
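
As a worked illustration of how the "a" and "b" parameters combine, take the setup discussed in this thread: a 4-class c1 and a 2-class c2, with the last class of each variable as the reference. The symbols a_1 (the intercept [c2#1]) and b_{1k} (the slope of c2#1 on c1#k) are notation chosen here for the sketch, with b_{14} = 0 for the reference class of c1.

```latex
\begin{align*}
\operatorname{logit} P(c2 = 1 \mid c1 = k) &= a_1 + b_{1k},
  \qquad k = 1,\dots,4, \quad b_{14} = 0, \\
P(c2 = 1 \mid c1 = k) &= \frac{\exp(a_1 + b_{1k})}{1 + \exp(a_1 + b_{1k})}, \\
P(c2 = 2 \mid c1 = k) &= \frac{1}{1 + \exp(a_1 + b_{1k})}.
\end{align*}
```

A large positive logit therefore corresponds to a transition probability that is essentially 1, and a large negative logit to one that is essentially 0; for example, a logit of 15 gives 1/(1 + e^{-15}), which is about 0.9999997.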
 Bengt O. Muthen posted on Thursday, September 20, 2007 - 10:51 am
P.S. Regarding fixing transition probabilities to zero, see also the Kaplan LTA paper on our web site.
 Tom Hildebrandt posted on Monday, September 24, 2007 - 5:39 am

Thank you for the advice; the Kaplan paper is very helpful. I think I have 2 problems, as I've mentioned.

1) I have cells with low counts when I add the treatment variable. When I estimate the standard LTA without covariates I don't have any estimation problems and the low counts don't seem to be a problem.

2) Theoretically, 2 (of 4) of the classes at time 1 would have a 100% probability of transitioning to class 1 at time 2, regardless of which treatment (CG1-CG3). My Sept 15 post was double checking on the correct way to fix these cells.

3) I think I've run into a related problem and that is how to deal with the latent variable indicators for low count cells (particularly low probability indicators). Is there a way to fix these parameters for individual cells?

Thanks again for your help!
 Bengt O. Muthen posted on Monday, September 24, 2007 - 6:21 pm
Given what you say in 1), I would suggest analyzing the time point 2 outcomes by themselves in an LCA that included the treatment variable. If this creates problems that you don't see when the treatment variable is not included, I would think about why.

Regarding 3), the latent variable indicator probabilities for a certain time point should vary only across the latent class categories for the latent class variable of that time point - this is in line with regular LTA. Mplus automatically fixes probabilities to 1 or 0 (actually corresponding logits); are you saying this fixing doesn't happen? If so, you can do it yourself using the large values printed. Again, if this happens only in the LTA and not in the time-specific LCAs, I would think about why.
 Tom Hildebrandt posted on Tuesday, September 25, 2007 - 7:49 am

I did some further investigation as you suggested. Adding the tx variable to the t2 LCA model did not cause any problems and the results made sense based on my previous attempts with the unconditional LTA and original t2 LCA. For the sake of exploration, I also reran the t1 LCA model with tx as a covariate... this led to estimation problems, which suggests to me that the problem in the conditional LTA model is likely to occur in the relationship between tx and latent class probabilities for t1.

But then, I figured that the tx variable isn't specified as a covariate for the estimation of t1 latent variable in the conditional LTA, so this shouldn't be a problem... correct?

I was also wondering if I should be constraining the means/thresholds for t1 LCA classes to be invariant across treatments...

Thanks again for all of your help.
 Bengt O. Muthen posted on Tuesday, September 25, 2007 - 8:03 am
The tx covariate should not be allowed to influence the t1 outcomes - isn't t1 pre intervention?

I believe I sent an example of the input for LTA in a treatment-control setting. This works with 3 model statements:

Model c1:
Model c2:
Model cg:

where the first two are for the latent class variable at the 2 time points and the third is for the treatment-control latent class variable (KNOWNCLASS). This setup produces invariant measurement parameters across the two cg classes - for both t1 and t2 outcomes.
 Tom Hildebrandt posted on Tuesday, September 25, 2007 - 1:06 pm

The t1 is pre-treatment but represents diagnostic group. t2 uses many of the same indicators, but not all of them and is more limited to a tx response variable.

In the example that you sent me, you set the t1 and t2 classes to have invariant threshold parameters across indicators within class. I have a slightly different problem since my t1 and t2 variables are different, so I freed these parameters. Would this cause estimation problems in the conditional LTA that are absent in the unconditional LTA?
 Tom Hildebrandt posted on Tuesday, September 25, 2007 - 6:34 pm
I think I have it figured out, but I would love to have your input on why this is the case.

I constrained the relevant parameters from my first t1 diagnostic group (which had low symptom presentation) to be equal to the corresponding t2 parameters in my non-response group. This cut the estimation time from 6-8 hrs to about 1.5 hrs with STARTS = 120 20. The results make sense theoretically and I didn't get any warnings.

Why would equating these 2 classes make such a difference?
 Bengt O. Muthen posted on Thursday, September 27, 2007 - 9:38 am
Here are answers to your two recent posts. Because your items and latent classes are different at the two time points (which I had forgotten), I would not hold them equal across time. At most, I would hold equal across time measurement parameters that are for the same items coupled with latent classes that are interpreted the same across time - but that too is iffy. Measurement equality or not should not in principle influence estimation problems.

If you think that tx should influence the time 1 part of the model, I would first investigate why adding tx to the time 1 LCA creates a problem.

The setup I sent did not let tx influence the time 1 part of the model.
 Tom Hildebrandt posted on Thursday, September 27, 2007 - 3:33 pm

I think I can rationalize the link between the two classes, i.e. equating diagnostic group 1 and tx non-responder at time 2 (at least for indicators that are the same), but I will continue to investigate the problem with the tx variable and time 1 LCA.

The problem as you mentioned though, is that tx was randomly assigned after the baseline data was collected so the influence of tx-->c1 or c1--> seems hard to rationalize.

Would the coding for the tx variable make a difference in estimation? For instance, should tx's be coded as 0, 1, 2 or should contrast coding be used as when investigating categorical moderators?
 Bengt O. Muthen posted on Thursday, September 27, 2007 - 3:44 pm
I don't think coding of the tx variable makes a difference. The input I sent showed tx translated into latent classes, so here coding is bypassed. I don't know what you mean by

c1--> seems hard to rationalize.

c1 points to c2, doesn't it, and that would seem essential.
 Tom Hildebrandt posted on Thursday, September 27, 2007 - 9:01 pm
Sorry for the confusion.

I left out a word. I think it's hard to rationalize

tx-->c1 since tx was randomly assigned

c1-->tx for the same reason...

Looking at these relationships and potential problems with model estimation, c1#1 seems to have an empty cell (no individuals in tx = 1 who are also in c1#1). There is a similar problem for c1#3 with tx = 2. I investigated this using posterior classification because there are estimation problems when I use the tx variable in the model of t1 LCA (e.g., as a covariate) and I get a different array of classes, with c1#3 having very few members (less than 5%). This is not the case when the model is estimated without the tx variable.

Any thoughts?
 Bengt O. Muthen posted on Friday, September 28, 2007 - 11:36 am
I have no further thoughts on this other than the observation that if random assignment took place after t1 there should be no significant tx-->c1, but your reporting seems to say that this relationship is strong: "no individuals in tx = 1 who are also in c1#1". If I am understanding this correctly, this finding suggests that the randomization broke down.
 christine meng posted on Friday, June 08, 2012 - 11:59 pm
I am using GMM to model negative parenting. I would like to use the parenting class membership to predict children's academic achievement (a continuous variable). I am having a hard time getting the right syntax equivalent to y on c (y = academic achievement; c = class membership). Please point me in the right direction. Thanks.
 Linda K. Muthen posted on Saturday, June 09, 2012 - 6:53 am
This is seen in the means of academic achievement varying over classes. Just include academic achievement on the USEVARIABLES list.
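
A minimal sketch of this advice, assuming four repeated measures of negative parenting (p1-p4), a linear growth model, two trajectory classes, and an achievement variable called ach. The names, the number of classes, the time scores, and the file name are assumptions for illustration only.

```mplus
TITLE:    GMM with a continuous distal outcome (hypothetical names)
DATA:     FILE = mydata.dat;           ! placeholder file name
VARIABLE: NAMES = p1-p4 ach;
          USEVARIABLES = p1-p4 ach;    ! ach = academic achievement
          CLASSES = c(2);              ! number of classes is an assumption
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  i s | p1@0 p2@1 p3@2 p4@3;   ! linear growth in negative parenting
  ! ach is deliberately not mentioned here: being on the USEVARIABLES
  ! list is enough for its mean to be estimated separately in each class
```

The class-specific means of ach in the output are then read the way one would read the effect of c on y.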
 Kei Miyazaki posted on Tuesday, August 21, 2018 - 11:59 am
Dear Drs. Muthen,

Referring to the ex7.2.inp code, I'm doing a mixture analysis for count variables using a zero-inflated Poisson model. I'd like to specify a model in which the estimates (means) of the inflation variables differ across latent classes. Can Mplus do this analysis?

The following is my unfinished code. There are 10 dependent variables that follow ZIP distributions, and 2 explanatory variables predicting latent class membership.

TITLE: latent class ZIP model
VARIABLE: NAMES ARE y1-y10 x1 x2;
CLASSES = c (2);
COUNT = y1-y10(i);
c ON x1 x2;

With the above code, however, the estimates (means) of the inflation variables come out the same across latent classes.

Thanks in advance.
 Bengt O. Muthen posted on Tuesday, August 21, 2018 - 6:01 pm
Yes, this can be done. Mention the inflation parameters in each class.
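
A sketch of what mentioning the inflation parameters in each class could look like, built on the code in the question. The ANALYSIS and MODEL commands and the data file name are filled in here as assumptions, since the posted code is unfinished.

```mplus
TITLE:    latent class ZIP model, inflation varying across classes
DATA:     FILE = mydata.dat;       ! placeholder file name
VARIABLE: NAMES ARE y1-y10 x1 x2;
          CLASSES = c (2);
          COUNT = y1-y10 (i);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c ON x1 x2;
  %c#1%
  ! inflation intercepts (logits of the zero class), class 1
  [y1#1 y2#1 y3#1 y4#1 y5#1 y6#1 y7#1 y8#1 y9#1 y10#1];
  %c#2%
  ! inflation intercepts (logits of the zero class), class 2
  [y1#1 y2#1 y3#1 y4#1 y5#1 y6#1 y7#1 y8#1 y9#1 y10#1];
```

Mentioning the same parameters in both class-specific sections without equality labels is what frees them to differ between the classes.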
 EH posted on Friday, August 16, 2019 - 8:25 am
Dear Dr. Muthen,

I'm having some trouble understanding the different ways of adding covariates when predicting outcomes with LPA in Mplus.

If I want to predict satisfaction from my classes and eliminate the influence that age might have, do I add age as a covariate for the latent classes, or do I add the regression of satisfaction on age?

So I want to make sure it is the profile membership that predicts satisfaction, regardless of their age.
 Bengt O. Muthen posted on Sunday, August 18, 2019 - 4:31 pm
Sounds like you should control for age when predicting satisfaction, so regress satisfaction on age. The intercept of satisfaction will vary across the latent classes and that will be what you should be interested in. I would also check if age predicts your latent class variable.