I'm looking at both GMM and LCGA of depression scores at 4 time points, using cubic growth (neither linear nor quadratic fits well). For the GMM I am allowing only the intercept to vary. As expected, LCGA yields more classes for about the same fit (BIC): 7-8 vs. 3-4 for GMM. For LCGA some of the classes have roughly parallel, flat curves with different starting points. This has an appealing interpretation: e.g., "chronically depressed" (consistently high scores) or "chronically happy" (consistently low scores). In the GMM these are merged into one centrally located class, which I think makes sense since the intercepts are random instead of fixed. The entropy for the GMM is higher than for the LCGA (0.84 vs. 0.70). However, the interpretation seems less useful, because the depressed and the non-depressed folks are lumped into one class that can only be described as "stable" regardless of severity.
It seems that such a GMM will always tend to lump parallel trajectories, and different classes would be primarily distinguished by shape. Is this correct? If so, it seems that an LCGA model is more appropriate for us, since average depression level is as important as trajectory shape in our characterization. However, a 7/8-class LCGA model is a bit unwieldy. Is there an alternative way I could/should be thinking about this that allows integration of average depression status as well as shape? Thanks.
Yes, GMM primarily distinguishes classes by curve shape. The within-class variation, captured by a variance for a growth factor, picks up variations on such themes. But I wouldn't agree that "an LCGA is more appropriate for us, since average depression level is as important as trajectory shape" for two reasons. First, one should ask which model fits the data better; if GMM is significantly better, LCGA shouldn't be used. Second, the importance of the level can also be acknowledged in a GMM framework, for instance by predicting a distal outcome from not only the latent class but also the growth factor itself.
...well, after a bit more thought, "GMM will always tend to lump parallel trajectories" is probably not accurate. At any rate, though, at least with our depression data there does seem to be this continuum of average depression status that results in several parallel-trajectory classes under LCGA, which get lumped into one class in GMM. I am very new to this but have gleaned that LCGA sometimes can't differentiate between a normal-mixture approximation to a non-normal distribution (which the depression scores are) and different classes. How do I reconcile this with the fact that the LCGA classes have greater interpretability in terms of average depression status, which gets lost in the GMM model? Thanks so much.
Thanks--very helpful, although I'm still struggling with this issue of average level. In some sense we are interested in predicting a patient's trajectory given where they start. I'm not sure of the best way to formulate this in the model though. My initial and perhaps naive thought was to run separate GMM models by subgroups defined by the baseline score (e.g. 0-4: no depression, 5-9: mild, 10-14: moderate, etc.). One potential problem is that the baseline score, thus stratified, is not normally distributed and can no longer be treated as censored normal, as I do with the other three time points. Perhaps I should use only the follow-up time points then in identifying the trajectories? Do you have other thoughts about how to approach this?
On another note, is there any problem with estimating cubic curves using only 4 time points? We often see rapid change followed by plateauing, which neither linear nor quadratic fit very well. I have been admitting variances for the intercept +/- slope but have been holding the quadratic and cubic terms fixed.
Apologies for so many questions and tremendous thanks for your suggestions. We have a contingent of folks signed up for the March workshop and are looking forward to it.
Usually, the initial score contributes to the class formation by the fact that the initial status mean [i] varies across classes. But one can also specify "c on i" - but that is advanced (i.e. something you do after having studied the topic a long time).
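To make "c on i" a bit more concrete: it regresses the latent class variable on the intercept growth factor through a multinomial logistic regression, with the last class serving as the reference class. The Python sketch below only illustrates the class probabilities such a regression implies. The function name and all coefficient values are hypothetical, chosen for illustration, and are not estimates from any data set.

```python
import math

def class_probabilities(i_value, intercepts, slopes):
    """Multinomial-logit class probabilities given an intercept factor value.

    intercepts and slopes hold one entry per non-reference class; the last
    class is the reference, with both of its coefficients fixed at zero.
    """
    logits = [a + b * i_value for a, b in zip(intercepts, slopes)]
    logits.append(0.0)  # reference (last) class
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical coefficients for a 3-class model (illustration only).
probs_low = class_probabilities(2.0, intercepts=[-1.0, 0.5], slopes=[0.3, -0.2])
probs_high = class_probabilities(15.0, intercepts=[-1.0, 0.5], slopes=[0.3, -0.2])
```

With these made-up values, a person with a high initial status is much more likely to fall in the first class than a person with a low initial status, which is the sense in which the initial score would then predict class membership.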
Estimated time scores are discussed in our Short Courses, "Day 2", covering growth modeling - you can get handouts from that. Here is an example:
Hello, I am running LCGA in randomized trials of smoking at 6 time points. I am wondering if it is correct to have a direct path from treatment to slope. I know that GMM can have the path, but I am not sure about LCGA. In my model, should the path from treatment be related to class (not slope) because LCGA does not have within-class variation?
I also have covariates. Can they also be mapped only to class (not intercept and slope)?
I am sorry for this beginner's question. I will appreciate your help. Thanks!
It's a good question. LCGA as defined by Nagin does not have within-class variation as you say, and covariates cannot influence growth within classes. But in Mplus you can let treatment and other covariates change the slope mean within classes, using zero residual variance. Note that you don't want the treatment to influence the class membership because the class membership is typically thought of as a pre-treatment variable.
But why restrict the slope residual variance to zero? I think you should use GMM, not LCGA, and, as you say, let the within-class slope be influenced by treatment.
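The difference between the two specifications can be sketched as a small simulation. Within one class, the slope is taken to be a class-specific mean plus a treatment effect plus a residual; fixing the residual variance at zero gives the LCGA-style case, while a nonzero variance gives the GMM-style case. This is only a Python illustration under those assumptions: the function name, the treatment coding, and all numeric values are hypothetical.

```python
import random

def simulate_slopes(treatment, slope_mean, treatment_effect, residual_sd, seed=0):
    """Draw slopes for one class: s_i = mean + effect * tx_i + residual_i."""
    rng = random.Random(seed)
    return [
        slope_mean + treatment_effect * tx + rng.gauss(0.0, residual_sd)
        for tx in treatment
    ]

treatment = [0, 0, 0, 1, 1, 1]  # hypothetical 0/1 treatment indicator

# LCGA-style: zero residual variance, so within a class the slopes
# differ only through treatment.
lcga_slopes = simulate_slopes(
    treatment, slope_mean=-0.5, treatment_effect=-0.3, residual_sd=0.0
)

# GMM-style: nonzero residual variance, so individuals also vary
# around the treatment-specific slope mean.
gmm_slopes = simulate_slopes(
    treatment, slope_mean=-0.5, treatment_effect=-0.3, residual_sd=0.2
)
```

In the zero-variance case everyone in the same class and treatment arm shares exactly one slope, which is the restriction being questioned above.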
mari posted on Wednesday, April 27, 2011 - 7:24 am
Thank you very much for your response. It helps me a lot.
I'm looking for different classes of fatigue trajectories in a sample of osteoarthritis patients (n = 1000) using both GMM and LCGA (with linear and quadratic growth factors).
At this moment I'm in a bit of a conundrum. Not surprisingly, in case of LCGA I've been able to find more distinct classes than with GMM (6-7 vs. 3-4). With LCGA the entropy scores exceed the 0.8 level, although these classes only differ in terms of initial level (i.e. intercept). The GMMs on the other hand have consistently shown much better fit (e.g. BIC, BLRT). The entropy scores of these GMMs, however, do not exceed the 0.6 level. Thus GMM doesn't seem to be able to classify individual trajectories properly, although the average latent class probabilities on the diagonal are mostly ~0.75.
As the LCGA models show worse fit and the trajectory plots show obvious within-class differences (at least in some classes) regarding individual trajectories, I'm inclined to go for GMM. But if I do, I lose the distinction between individual trajectories concerning initial level.
In deciding on the number of classes, I would not focus on entropy. I would look at model fit and the substantive meaning of the classes. See the Topic 6 course handout and video where we give a strategy for deciding on the number of classes using various measures. It is well known that LCGA extracts many similar classes because it does not allow for within-class variability.
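For reference, entropy summarizes how sharply individuals are assigned to classes and says nothing about how well the model fits. A minimal Python sketch of the usual relative entropy, 1 - [sum over i and k of -p_ik ln(p_ik)] / (n ln K), computed from posterior class probabilities, is given below. The function name and both small probability matrices are hypothetical, made up for illustration.

```python
import math

def relative_entropy(posteriors):
    """Relative entropy in [0, 1] from an n x K matrix of posterior probabilities."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * ln(0) as 0
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical posterior probabilities for three people and two classes.
sharp = [[0.98, 0.02], [0.03, 0.97], [0.99, 0.01]]
fuzzy = [[0.60, 0.40], [0.45, 0.55], [0.70, 0.30]]

entropy_sharp = relative_entropy(sharp)
entropy_fuzzy = relative_entropy(fuzzy)
```

The sharper assignments give the higher value. A model with many narrow, well-separated classes can therefore score high on entropy while still fitting worse than a model with fewer classes and within-class variation.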
I have run a 2-class LCGA and interestingly, the BIC increases from the unconditional 1 class model. However, when I run a full GMM model with 2 classes (though the entropy is only .68) I get a lower BIC from the 1 class model and the BLRT: p<.01. How might this happen? I have always read that LCGA is going to extract more classes than the GMM and that LCGA might be considered a first step to determining the number of classes. Is this a sign that there are no classes in the data even though the GMM might indicate there are?
You say that you "get a lower BIC from the 1-class model" when using GMM. If that means that your 2-class GMM has a higher BIC than a 1-class GMM, then that points to 1 class also for the GMM. We want to find the model with the smallest BIC.
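The comparison can be sketched in a few lines of Python using BIC = -2 logL + p ln(n), where p is the number of free parameters and n the sample size. The log-likelihoods, parameter counts, and sample size below are hypothetical numbers chosen for illustration, not results from any analysis.

```python
import math

def bic(log_likelihood, n_parameters, n_subjects):
    """BIC = -2 * logL + p * ln(n); a smaller value is preferred."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_subjects)

# Hypothetical log-likelihoods and parameter counts (illustration only).
candidates = {
    "1-class": bic(-5230.0, 9, 500),
    "2-class": bic(-5190.0, 12, 500),
    "3-class": bic(-5185.0, 15, 500),
}

# The preferred model is the one with the smallest BIC.
best = min(candidates, key=candidates.get)
```

With these made-up numbers the third class improves the log-likelihood too little to offset the penalty for its extra parameters, so the 2-class model has the smallest BIC.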