Ben Chapman posted on Saturday, March 24, 2007 - 2:15 pm
Hi--I just purchased Mplus, and after checking the manual and discussion board, it looks like one can't regress some distal outcome on a latent class variable using ON.
I'd like to use class membership in a three-category latent class variable as a predictor of a distal outcome in the same way one would use two dummies in a regression (and include observed covariates also predicting the outcome). Yet I want to keep the observed class indicators separate from the distal outcome for substantive reasons, in the same way one would keep indicators in the measurement model of a continuous LV different from a structural outcome regressed on the LV.
As I understand it, saving class probabilities and fitting a second regression model with dummy predictors representing class membership isn't optimal.
The DV is observed. Would specifying it as a continuous LV permit this sort of analysis? What are the options here?
This brings up two issues which may not always be well understood in mixture modeling.
First, modeling the influence of a latent class variable c on a distal outcome y is not done by saying y ON c, but what is done gives information equivalent to having used ON. As in dummy variable regression, the influence of c on y implies that the means of y change over the classes of c. This is what Mplus accomplishes by either using the default where y is not mentioned in the model, or equivalently by mentioning the mean of y in each of the c classes. The same holds for a categorical y, in which case the thresholds (probabilities) change over classes. In other words, you get what you want out of Mplus without saying y ON c.
Second, making an analogy outside mixture modeling, if you do SEM with a set of indicators of a factor f and want to specify that f influences a distal outcome y that can be done by saying
f BY indicators;
y ON f;
But that is statistically equivalent to saying
f BY indicators y;
All the model says is that indicators are uncorrelated given (conditional on) f and that indicators are uncorrelated with y given f (all correlations go through f). And, because y is correlated with f, y data influences factor score estimation for f. You would not first estimate f using indicator information and then regress y on the estimated f scores. The same thing holds for the mixture situation.
So even if the indicators are substantively different from the distal outcome, the statistical modeling should be done using the Mplus default. The class formation obtained by analyzing only the indicators should not change substantially when adding the y information - it should only be better determined. If it does change, the classes are not well determined by the indicators alone.
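A minimal sketch of the class-varying means described above, for a three-class model with continuous indicators, a distal outcome and one observed covariate. The file name and the variable names u1-u4, y and x are assumptions made for illustration; they do not come from the original posts.

```
TITLE:    LPA with a distal outcome (illustrative sketch)
DATA:     FILE IS example.dat;     ! hypothetical file name
VARIABLE: NAMES ARE u1-u4 y x;     ! hypothetical variable names
          CLASSES = c (3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  y ON x;      ! observed covariate predicting the distal outcome
  %c#1%
  [y];         ! intercept of y in class 1
  %c#2%
  [y];         ! intercept of y in class 2
  %c#3%
  [y];         ! intercept of y in class 3
```

The differences between the three class-specific intercepts of y play the role that the two dummy coefficients would play in an ordinary regression of y on class membership.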
Ben Chapman posted on Monday, March 26, 2007 - 12:06 pm
Thanks, this is very helpful and I was able to parameterize the model I needed.
The situation of class membership changing when other "outcomes" are added (beyond the intended "indicators", acknowledging the possible artificiality of the distinction) is interesting. I understand that the "outcomes" convey additional information about class membership and are statistically equivalent to indicators. On the other hand, I'm working from a substantive theory which dictates a typology defined by the "indicators" only, and if the membership and nature of the classes (i.e., standing on the original indicators) changes when adding other outcomes, even if classes are better determined, this could get tricky to explain to reviewers who aren't used to seeing this sort of analysis.
I thought of extension analysis, projecting extension variables into an already defined factor space by fixing loadings of the original indicator variables while freely estimating the loadings of the extension variables. What do you think of fixing means of the latent class "indicators" at the values obtained from the "indicator only" LPA--could this stabilize class membership when "distal outcomes" are added?
Mplus lets you fix parameters in the model to the values from an analysis of the part that includes indicators only, estimating only the parameters related to the distal outcome in the different classes. In principle, you can even set individuals' class membership by using TRAINING data with the PROBABILITIES option, where those values are obtained from the CPROB option when analyzing the indicator part only.
Before attempting this, however, I would first do 2 analyses, (1) using only the indicator information, and (2) adding the distal information. A solid model for (1) should not change in any important way when doing (2), so this is a good check of the replicability of (1).
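A sketch of what fixing the indicator means could look like, assuming a three-class LPA with four continuous indicators. The file name, the variable names and every numeric value below are placeholders; the real values would be copied from the output of the indicator-only run.

```
TITLE:    Distal outcome with indicator means fixed (illustrative sketch)
DATA:     FILE IS example.dat;     ! hypothetical file name
VARIABLE: NAMES ARE u1-u4 y;       ! hypothetical variable names
          CLASSES = c (3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  %c#1%
  [u1@1.20 u2@0.45 u3@-0.30 u4@0.80];   ! placeholder values from run (1)
  [y];                                  ! distal mean, freely estimated
  %c#2%
  [u1@0.10 u2@1.60 u3@0.90 u4@-0.20];   ! placeholder values from run (1)
  [y];
  %c#3%
  [u1@-0.70 u2@-0.35 u3@1.40 u4@0.25];  ! placeholder values from run (1)
  [y];
```

In this sketch only the means are fixed with @. The indicator variances and the class proportions are still estimated, so they would also need to be fixed if the aim is to hold the whole indicator part constant.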
I believe I have a similar type of question. I am interested in looking at the effects of a latent class variable on treatment outcome. I only have data at baseline (t1) and end of tx (t2), so I can't look at this in a growth model. The latent class variable (c) has 14 indicators. I have two issues that I would appreciate advice on handling: 1) the t2 outcomes are repeated measures of several of the t1 variables; 2) most of the t2 outcomes are zero-inflated (about 80% were free of certain symptoms).
Because I'm ultimately interested in whether c has any predictive clinical validity on certain outcomes (8 of the original 14 indicators) I'm unsure what my modeling options are here. Following the logic above, these t2 outcomes should be included in the estimation of c, although I am unsure how to handle the zero-inflated nature of several of these variables in such a model.
Thanks for the advice, I like this idea. If I use an LTA approach, conceptually, should the t2 (c) variable look similar to the t1 (c) variable in terms of number of classes, particularly if I am only interested in 8 of the original 14 indicators at outcome? This may be hard to find if the response rate to treatment is about 80%, because there are so few subjects (relative to baseline) who would fall outside of the zero class. However, my zero class is conceptually a good fit for a categorical representation of "tx responder". I also wonder whether, if I get say a 2-class (c) at t2, that means my (c) (currently a 4-class variable) is not a reliable variable.
A follow-up question. A colleague here recommended that for simplicity I just use a zero-inflated Poisson regression model to look at the effect of (c) on these zero-inflated outcomes. However, this seems inconsistent with what Bengt described in his post on March 24, 2007 above. Am I correct here?
1. Yes and no. Let the inflation variable be u#1 in the standard Mplus notation, where u#1 is the latent binary variable. Mplus will not let you say
cd#1 on u#1;
where cd is the latent class variable that you predict.
(Mplus only lets u#1 be a dependent variable, such as in UG ex8.5).
But you can use the explicit two-class approach to ZIP, where u#1 is instead represented as a latent class variable. See UG ex 7.25 where c is the same as u#1. Having formulated your ZIP model this way, Mplus lets you say
cd#1 on c#1;
Note, however, that the latent binary inflation variable is in my experience not very stable, but will change across somewhat different model specifications, so this type of regression is a bit "far out".
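A sketch of the explicit two-class formulation of ZIP, in the spirit of UG ex 7.25. The variable names are hypothetical and the User's Guide should be checked for the exact input. The zero class is created by fixing the log rate of the count variable at a large negative value, here assumed to be -15.

```
TITLE:    ZIP regression as a two-class model (illustrative sketch)
DATA:     FILE IS example.dat;     ! hypothetical file name
VARIABLE: NAMES ARE u x;           ! hypothetical variable names
          COUNT = u;               ! plain Poisson, no (i) inflation part
          CLASSES = c (2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  u ON x;        ! Poisson regression in the non-zero class
  c ON x;        ! covariate predicting membership in the zero class
  %c#1%
  [u@-15];       ! log rate fixed very low, so class 1 is the zero class
  u ON x@0;      ! no covariate effect within the zero class
```

With the inflation variable represented as the latent class variable c, statements that involve c#1 become available, which is what the cd#1 ON c#1 statement above relies on.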
2a) There are no thresholds - that language is reserved for categorical outcomes. Instead you have intercepts. This is explained in the text of UG ex 7.2 and also in ex 3.8. See also textbooks on ZIP, such as Long's book.
2b) With 3 levels for a covariate, you have to create 2 dummy variables as usual in regression analysis.
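A sketch of creating the two dummy variables with the DEFINE command, assuming a covariate named tx coded 0, 1, 2 with 0 as the reference category and a zero-inflated count outcome named u. All names and the coding are assumptions made for illustration.

```
TITLE:    Two dummies for a 3-level covariate (illustrative sketch)
DATA:     FILE IS example.dat;       ! hypothetical file name
VARIABLE: NAMES ARE u tx;            ! hypothetical variable names
          USEVARIABLES ARE u d1 d2;  ! new variables go last
          COUNT = u (i);             ! zero-inflated Poisson outcome
DEFINE:   IF (tx EQ 1) THEN d1 = 1;
          IF (tx NE 1) THEN d1 = 0;
          IF (tx EQ 2) THEN d2 = 1;
          IF (tx NE 2) THEN d2 = 0;
MODEL:    u ON d1 d2;                ! count part
          u#1 ON d1 d2;              ! inflation part
```

Each slope then compares one covariate level with the reference level, as in ordinary dummy variable regression.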
This is very helpful. It sounds as though you believe the ZIP model I suggested is potentially unreliable. Do you have any thoughts on an alternative approach? I was also considering using an LTA model, although my t2 timepoint will have a zero class that's not present at t1.
I'm sorry for the confusion. I described the model in my June 1 post. Briefly, I'm interested in whether latent class at t1 (14 indicators) is predictive of a range of outcomes at t2 (8 indicators). 3 of the outcomes are zero-inflated with Poisson distributions at t2, but not t1. t1 is pretreatment and t2 is post treatment. I was potentially thinking of using an LTA to look at transition from t1 class to t2 class, but because of the zero inflation, I figured I would need a zero class at t2.
Essentially, I'm interested in whether a specific subtyping approach has any predictive clinical utility.
I think the LTA approach is the way to go. So you specify an LCA measurement model at each of the two time points, defining the latent class variables c1 and c2. You don't have to have the same number of classes at the two time points. For the classes that are the same, you probably want to specify measurement invariance across time.
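A sketch of such an LTA with 4 classes at time 1 and 2 classes at time 2, without measurement invariance across time. The binary items u11-u14 and u21-u23 and the file name are hypothetical; the actual application has more indicators, some of them counts.

```
TITLE:    LTA with 4 classes at t1 and 2 at t2 (illustrative sketch)
DATA:     FILE IS example.dat;         ! hypothetical file name
VARIABLE: NAMES ARE u11-u14 u21-u23;   ! hypothetical binary items
          CATEGORICAL = u11-u14 u21-u23;
          CLASSES = c1 (4) c2 (2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c2 ON c1;            ! transition part
MODEL c1:              ! t1 items depend on c1 only
  %c1#1%
  [u11$1-u14$1];
  %c1#2%
  [u11$1-u14$1];
  %c1#3%
  [u11$1-u14$1];
  %c1#4%
  [u11$1-u14$1];
MODEL c2:              ! t2 items depend on c2 only
  %c2#1%
  [u21$1-u23$1];
  %c2#2%
  [u21$1-u23$1];
```

Measurement invariance for a pair of classes interpreted the same way at both time points could be imposed by giving the corresponding thresholds the same parameter labels in MODEL c1 and MODEL c2.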
If you don't choose to use LTA, and don't have a latent class measurement model for time 2, you can still regress time 2 count variables on the time 1 latent class variable. You don't use the construction "u2 on c1", but you let u2 parameters vary as a function of c1 classes.
I hope I understand what you are interested in. Let me know if not.
If you don't have a latent class variable c2 at time point 2 (as in LTA), the u2 items will be taken as conditionally independent given c1, which may not be what you want. Having c2, this is relaxed to conditional independence given c2.
I think you've convinced me. I will stick with the LTA model, in part, because of the conditional independence issue that you described.
If I am also interested in tx (3 conditions) as a predictor of the t1-->t2 transition (particularly to the zero class), should I use a multiple group model or treat tx as a covariate? Do you have a preference based on your experience? I already know that there is no difference between tx at outcome (58%, 61%, 59% symptom free at outcome), but I am also interested in whether c1 interacts with treatment, so I will also be testing a moderator model.
I've been trying to run the LTA model described above building from the example that you sent me. I have a 4 class solution at t1 and 2 class solution at t2. As I mentioned, I'm interested in whether tx (3 conditions) moderates the transition from 4-classes to the final 2 classes.
I have 2 quick questions.
In your LTA example you used the following for your overall model:
%OVERALL%
c2#1 ON c1#1@0 cg#1 (p0);
[c2#1] (p1);
If my c1 variable has 4 classes and cg variable has 3 classes, how would I specify this? Do I have to specify a separate regression for each c2 class fixed at zero (except the reference group) on c2#1? Similarly for cg#1 & cg#2?
%OVERALL%
c2#1 ON c1#1@0 c1#2@0 c1#3@0 cg#1 (p0)
c2#1 ON c1#1@0 c1#2@0 c1#3@0 cg#2 (p1)
[c2#1] (p1);
My second question is related to the differences in class numbers at t1 & t2. Because of the high tx response, I have a tx response (zero class) and a non-responder group at outcome. Although the indicators at t2 are repeated measures of t1, I'm not sure how to use constraints to model conditional independence. Any suggestions on this?
Thank you for your answer to question 1 above and sorry about the confusion for question 2. What I was trying to understand is how to account for the repeated measures aspect of the model (for indicators of the c variables). Bengt suggested that I would constrain parameters to be invariant across classes that are the same at t1 & t2. However, I don't know that either of the classes at t2 are the same as any of the 4 classes at t1. In this case, would I not worry about invariance?
Is that more clear? Sorry again for the confusion.
Thank you for the advice. I think I have my LTA model set. I was wondering if it typically takes a long time to converge (e.g., 6 hrs in a practice attempt). I have had some inconsistency with the t1 LCA model, particularly with replicating the best loglikelihood value, even with increased starts = 150 20.
My second question is about calculating the probabilities. If I have 3 types of treatment and a 4-class t1 variable (conceptualized as diagnostic categories) and a 2-class t2 variable (tx response vs. nonresponse), I believe I would be calculating the probabilities for 24 cells (8 per tx). What would the calculation of tx effects look like in this model with 3 txs?
For example, if I wanted to calculate the effect of tx on class 1 (t1) moving to the non-responder class 2 (t2).
LTA with 2 time points usually doesn't take that long. I would make sure as a first step that you get a T1 LCA model with many replicated loglikelihoods - if you can't replicate the LLs you probably have too many classes.
The probabilities for the 24 cells are given in the Mplus output. They are computed as shown in Chapter 13 of the UG.
I have a setup showing how to use the Mplus Model constraint feature to compute such tx effects on the probabilities that I can send you if you send me an email.
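The following is not the setup offered above, only a generic sketch of how MODEL CONSTRAINT can turn labeled logit parameters into transition probabilities. It leaves out the treatment variable and assumes hypothetical binary items u11-u14 and u21-u23. The labels a, b1-b3 and the new parameters p11-p41 and p12 are invented for illustration.

```
TITLE:    Transition probabilities from logits (illustrative sketch)
DATA:     FILE IS example.dat;         ! hypothetical file name
VARIABLE: NAMES ARE u11-u14 u21-u23;   ! hypothetical binary items
          CATEGORICAL = u11-u14 u21-u23;
          CLASSES = c1 (4) c2 (2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  [c2#1] (a);           ! logit of c2 = 1 for reference class c1 = 4
  c2#1 ON c1#1 (b1);    ! logit shift for c1 = 1
  c2#1 ON c1#2 (b2);    ! logit shift for c1 = 2
  c2#1 ON c1#3 (b3);    ! logit shift for c1 = 3
MODEL c1:
  %c1#1%
  [u11$1-u14$1];
  %c1#2%
  [u11$1-u14$1];
  %c1#3%
  [u11$1-u14$1];
  %c1#4%
  [u11$1-u14$1];
MODEL c2:
  %c2#1%
  [u21$1-u23$1];
  %c2#2%
  [u21$1-u23$1];
MODEL CONSTRAINT:
  NEW (p11 p21 p31 p41 p12);
  p11 = 1/(1 + EXP(-(a + b1)));   ! P(c2 = 1 | c1 = 1)
  p21 = 1/(1 + EXP(-(a + b2)));   ! P(c2 = 1 | c1 = 2)
  p31 = 1/(1 + EXP(-(a + b3)));   ! P(c2 = 1 | c1 = 3)
  p41 = 1/(1 + EXP(-a));          ! P(c2 = 1 | c1 = 4)
  p12 = 1 - p11;                  ! P(c2 = 2 | c1 = 1)
```

Because the last class is the reference class in the multinomial logit parameterization, the c1 = 4 row uses the intercept alone. A treatment effect on a given transition would be the difference between two such probabilities computed separately for each treatment group.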
In an LTA model that assesses the effects of treatment (I have 24 potential cells: 4 classes x 3 treatments x 2 classes), I have run into estimation problems because certain cells have no estimated members (i.e., class 1 in treatment 2 transitioning to symptom remission). My 4-class solution at t1 and 2-class solution at t2 appear to be replicable and good fits theoretically and statistically. Any thoughts on how to handle these low-probability transitions?
Just to make sure that I am doing this correctly, I would set parameters under the knownclass portion of the model to zero where I knew that there was no transition. For instance, knowing that no participants in c1#1 transition to c2#1 within cg#1 would be specified as follows:
Here are a couple of comments. You mentioned in your Sept 15 message that you ran into estimation problems. We took that to mean that intervention effects couldn't be identified in some cells due to zero counts. But you are now mentioning zero transition probabilities which should be handled automatically in Mplus by Mplus fixing large/small logits. So it is unclear what the problem is. Also, Chapter 13 of the User's Guide describes how "a" and "b" logit parameters combine to form logits and therefore transition probabilities.
Thank you for the advice, the Kaplan paper is very helpful. I think I have 2 problems, as I've mentioned.
1) I have cells with low counts when I add the treatment variable. When I estimate the standard LTA without covariates I don't have any estimation problems and the low counts don't seem to be a problem.
2) Theoretically, 2 (of 4) of the classes at time 1 would have a 100% probability of transitioning to class 1 at time 2, regardless of which treatment (CG1-CG3). My Sept 15 post was double checking on the correct way to fix these cells.
3) I think I've run into a related problem and that is how to deal with the latent variable indicators for low count cells (particularly low probability indicators). Is there a way to fix these parameters for individual cells?
Given what you say in 1), I would suggest analyzing the time point 2 outcomes by themselves in an LCA that included the treatment variable. If this creates problems that you don't see when the treatment variable is not included, I would think about why.
Regarding 3), the latent variable indicator probabilities for a certain time point should vary only across the latent class categories for the latent class variable of that time point - this is in line with regular LTA. Mplus automatically fixes probabilities to 1 or 0 (actually corresponding logits); are you saying this fixing doesn't happen? If so, you can do it yourself using the large values printed. Again, if this happens only in the LTA and not in the time-specific LCAs, I would think about why.
I did some further investigation as you suggested. Adding the tx variable to the t2 LCA model did not cause any problems and the results made sense based on my previous attempts with the unconditional LTA and original t2 LCA. For the sake of exploration, I also reran the t1 LCA model with tx as a covariate... this led to estimation problems, which suggests to me that the problem in the conditional LTA model is likely to occur in the relationship between tx and latent class probabilities for t1.
But then, I figured that the tx variable isn't specified as a covariate for the estimation of t1 latent variable in the conditional LTA, so this shouldn't be a problem... correct?
I was also wondering if I should be constraining the means/thresholds for t1 LCA classes to be invariant across treatments...
The tx covariate should not be allowed to influence the t1 outcomes - isn't t1 pre intervention?
I believe I sent an example of the input for LTA in a treatment-control setting. This works with 3 model statements:
Model c1:
Model c2:
Model cg:
where the first two are for the latent class variable at the 2 time points and the third is for the treatment-control latent class variable (KNOWNCLASS). This setup produces invariant measurement parameters across the two cg classes - for both t1 and t2 outcomes.
The t1 is pre-treatment but represents diagnostic group. t2 uses many of the same indicators, but not all of them and is more limited to a tx response variable.
In the example that you sent me, you set the t1 and t2 classes to have invariant threshold parameters across indicators within class. I have a slightly different problem since my t1 and t2 variables are different, so I freed these parameters. Would this cause estimation problems in the conditional LTA that are absent in the unconditional LTA?
I think I have it figured out, but I would love to have your input on why this is the case.
I constrained the relevant parameters from my first t1 diagnostic group (which had low symptom presentation) to be equal to the corresponding t2 parameters in my non-response group. This cut the estimation time from 6-8 hrs to about 1.5 hrs with STARTS 120 20; the results make sense theoretically and I didn't get any warnings.
Why would equating these 2 classes make such a difference?
Here are answers to your two recent posts. Because your items and latent classes are different at the two time points (which I had forgotten), I would not hold them equal across time. At most, I would hold equal across time measurement parameters that are for the same items coupled with latent classes that are interpreted the same across time - but that too is iffy. Measurement equality or not should not in principle influence estimation problems.
If you think that tx should influence the time 1 part of the model, I would first investigate why adding tx to the time 1 LCA creates a problem.
The setup I sent did not let tx influence the time 1 part of the model.
I think I can rationalize the link between the two classes, i.e. equating diagnostic group 1 and tx non-responder at time 2 (at least for indicators that are the same), but I will continue to investigate the problem with the tx variable and time 1 LCA.
The problem as you mentioned though, is that tx was randomly assigned after the baseline data was collected so the influence of tx-->c1 or c1--> seems hard to rationalize.
Would the coding for the tx variable make a difference in estimation? For instance, should tx's be coded as 0, 1, 2 or should contrast coding be used as when investigating categorical moderators?
I left out a word. I think it's hard to rationalize
tx-->c1 since tx was randomly assigned
c1-->tx for the same reason...
Looking at these relationships and potential problems with model estimation, c1#1 seems to have an empty cell (no individuals in tx = 1 who are also in c1#1). There is a similar problem with c1#3 and tx = 2. I investigated this using posterior classification because there are estimation problems when I use the tx variable in the model of t1 LCA (e.g., as a covariate) and I get a different array of classes, with c1#3 having very few members (less than 5%). This is not the case when the model is estimated without the tx variable.
I have no further thoughts on this other than the observation that if random assignment took place after t1 there should be no significant tx-->c1, but your reporting seems to say that this relationship is strong: "no individuals in tx = 1 who are also in c1#1". If I am understanding this correctly, this finding suggests that the randomization broke down.
I am using GMM to model negative parenting. I would like to use the parenting class membership to predict children's academic achievement (a continuous variable). I am having a hard time getting the right syntax equivalent to y on c (y = academic achievement; c = class membership). Please point me to the right direction. Thanks.
Referring to the ex7.2.inp code, I'm doing a mixture analysis for a count variable using a zero-inflated Poisson model. I'd like to specify a model in which the estimates (means) of the inflation variables differ across latent classes. Can Mplus do this analysis?
The following is my unfinished code. There are 10 dependent variables that follow ZIP distributions. There are 2 explanatory variables for the latent class variable.
----------------
TITLE: latent class ZIP model
DATA: FILE IS pr_ZIP.dat;
VARIABLE: NAMES ARE y1-y10 x1 x2;
  CLASSES = c (2);
  COUNT = y1-y10(i);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c ON x1 x2;
OUTPUT: TECH1 TECH8;
----------------
With the above code, however, I get results in which the estimates (means) of the inflation variables are the same across the latent classes.
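One possible way to let the inflation part differ across classes is to mention the inflation intercepts, written with the #1 suffix, in the class-specific parts of the model. The sketch below extends the poster's input on that assumption; it is not a reply from the original thread.

```
TITLE:    latent class ZIP model, class-varying inflation (sketch)
DATA:     FILE IS pr_ZIP.dat;
VARIABLE: NAMES ARE y1-y10 x1 x2;
          CLASSES = c (2);
          COUNT = y1-y10 (i);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c ON x1 x2;
  %c#1%
  [y1-y10];                          ! count intercepts, class 1
  [y1#1 y2#1 y3#1 y4#1 y5#1];        ! inflation intercepts, class 1
  [y6#1 y7#1 y8#1 y9#1 y10#1];
  %c#2%
  [y1-y10];                          ! count intercepts, class 2
  [y1#1 y2#1 y3#1 y4#1 y5#1];        ! inflation intercepts, class 2
  [y6#1 y7#1 y8#1 y9#1 y10#1];
OUTPUT:   TECH1 TECH8;
```

Freeing ten inflation intercepts per class adds many parameters, and the earlier remark in this thread that the inflation part tends to be unstable applies here as well, so more random starts may be needed.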