Daniel posted on Wednesday, May 12, 2004 - 3:23 pm
I am currently preparing to resubmit a manuscript, but one question the reviewers are having difficulty with is the difference in slopes between classes. I conducted a latent class growth analysis with an ordered categorical variable with five categories. The results suggest four classes. Two classes have similar slopes. However, participants in one of these classes are at a higher category at baseline than those in the other. Thus, although their slopes are similar, they started and finished in very different categories. The reviewers question the viability of latent trajectory analysis in general, and believe that our results show nothing but differences at baseline. My question is: is there a graphic that I could use to convince them that my classes clearly represent unique classes? The remainder of our results clearly shows that these two groups differ on covariates, but that doesn't seem to convince them; they really want a graphic of some sort.
Let me clarify. We are evaluating smoking trajectories over 4 years (4 annual measurements). Our outcome is an ordered categorical variable spanning never smoker = 0 to frequent smoker = 4. The LCGA indicated 4 classes. We modeled growth to have the same shape across classes. Although all of the indices supported the existence of the 4 distinct classes, a reviewer wants to know whether the slopes differ for these 4 classes or whether the differences between classes are only at baseline. How can we best respond to the reviewer? Is it appropriate to evaluate slope differences? If so, how should we approach this? If it is not appropriate, given the nature of our outcome, what should we provide the reviewer?
It seems to me that if people start at different places, grow at the same rate, and end at different but therefore parallel places, this does not support different classes even if there are differences in the covariates. Is there any substantive reason to support these different classes? Because LCGA does not allow variation within the classes, you may be finding more classes than you need due to this.
Hi. In reading the reviewers' comments, and Bengt's chapter in "New Developments and Techniques in Structural Equation Modeling," the question being posed by the reviewers would be better stated as, "why should we believe there are multiple trajectories rather than a single trajectory?" It is true that two of the slopes are not different. However, our findings indicate clearly that adolescents in the two trajectories in question have very different distributions over time. Our goal then is to demonstrate that a multi-group solution is better than a single-group solution. Unfortunately, they do not care for our statistical analysis. We gave them BIC differences, entropy, the LRT (tech11), and other supporting evidence. I'm not sure where to go from here.
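For readers assembling the same kind of evidence, here is a minimal Python sketch of two of the indices mentioned above, BIC and relative entropy. The function names and the example numbers are hypothetical, and the entropy formula is the commonly used relative entropy, which is assumed (not confirmed by this thread) to match what the software reports.

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def relative_entropy(posteriors):
    """Relative entropy of classification, between 0 and 1.

    posteriors: one row per person, holding that person's posterior
    class probabilities. Values near 1 mean well-separated classes.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * log(0) as 0
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical log-likelihoods and parameter counts for 3- and 4-class models.
bic_3 = bic(loglik=-2510.0, n_params=9, n_obs=800)
bic_4 = bic(loglik=-2480.0, n_params=11, n_obs=800)
print("BIC difference (3-class minus 4-class):", round(bic_3 - bic_4, 2))
```

A positive difference here would favor the 4-class model on BIC, but neither index speaks to whether the classes differ in slope or only in level.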
Basically, they want to see that the slopes for all of our 4 classes differ. We conducted a LRT for the difference between the 4 class models with equal or different proportional odds betas, which indicated that the slopes were significantly different. Is there any other information that we should or could provide to the reviewers to support the LCGA?
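The nested-model comparison described above can be sketched as an ordinary likelihood ratio (chi-square difference) test. This is a minimal sketch under stated assumptions: both models have the same number of classes, both were fit by ordinary maximum likelihood to the same data, and no scaling correction for robust estimators is applied. The log-likelihood values and the degrees of freedom are hypothetical. It does not apply to comparing models with different numbers of classes, where the usual chi-square reference distribution does not hold.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        terms = sum(z ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-z) * terms
    # Odd df: start from the df = 1 case and add the recurrence terms.
    m = (df - 1) // 2
    terms = sum(z ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, m + 1))
    return math.erfc(math.sqrt(z)) + math.exp(-z) * terms

def lrt(loglik_restricted, loglik_free, df_diff):
    """Likelihood ratio statistic and p-value for two nested models."""
    stat = 2.0 * (loglik_free - loglik_restricted)
    return stat, chi2_sf(stat, df_diff)

# Hypothetical: slopes held equal across 4 classes vs. class-specific slopes
# (3 additional free parameters in the less restricted model).
stat, p_value = lrt(loglik_restricted=-2490.0, loglik_free=-2480.0, df_diff=3)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.5f}")
```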
It appears that a key problem is that the reviewers are not versed in modeling. They also wish to see a graphic where we plot the class means across time for our ordered categorical variable of smoking progression (i.e., treat our outcome as continuous on a scale of 0 to 4). We believe this misrepresents the data and that a plot based on odds of smoking progression is more appropriate. Would you agree?
I am a little confused. Which slopes differ? Not the growth model because that is the same in each class I believe. I would agree with you that if you analyzed the data as categorical, that plot would best represent the results.
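To illustrate what a plot based on the categorical model would show, here is a minimal Python sketch that computes class-specific category probabilities over the four time points under a proportional-odds (cumulative logit) model. Everything numeric is hypothetical, and the parameterization P(u >= j) = logistic(eta - tau_j), with eta = intercept + slope * time and no within-class variation, is an assumption made for illustration rather than output from any fitted model.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(eta, thresholds):
    """Probabilities of categories 0..K under a cumulative logit model.

    Assumes P(u >= j) = logistic(eta - thresholds[j - 1]) with
    increasing thresholds.
    """
    at_least = [1.0] + [logistic(eta - tau) for tau in thresholds] + [0.0]
    return [at_least[j] - at_least[j + 1] for j in range(len(thresholds) + 1)]

def class_trajectory(intercept, slope, thresholds, times):
    """Category probabilities at each time point for one latent class."""
    return [category_probs(intercept + slope * t, thresholds) for t in times]

# Hypothetical thresholds for a 5-category outcome (0 = never, 4 = frequent).
thresholds = [-1.0, 0.5, 1.5, 2.5]
times = [0, 1, 2, 3]

# Two hypothetical classes with the same logit slope but different intercepts.
low = class_trajectory(-1.0, 0.6, thresholds, times)
high = class_trajectory(1.5, 0.6, thresholds, times)

for t, (p_low, p_high) in zip(times, zip(low, high)):
    print(t, [round(p, 2) for p in p_low], [round(p, 2) for p in p_high])
```

Plotting these probabilities by class and time (for example as stacked bars) shows how two classes with the same slope on the logit scale can still occupy very different categories at every wave, without treating the 0 to 4 codes as a continuous score.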
I came across an article I thought was incorrect in its discussion of mixture models and wanted to see what your thoughts were, if you have time. The article is available online in Development and Psychopathology, special issue in honor of Paul E. Meehl. The article is “Taxometrics and developmental psychopathology” by Theodore P. Beauchaine (01 Aug 2003 pp 501-527). In this article, Beauchaine states that mixture modeling suffers the same problems as does cluster analysis: it provides a latent class but it can't determine the "ontological status of a trait or disorder as discrete versus continuous" p. 507. The author does not like cluster analysis because it imposes a structure on the data rather than “seeking the structure”, which is, I would argue, what the model-fitting process in mixture modeling enables via Mplus.
Beauchaine argues that programs written by Meehl (MAXSLOPE and MAXCOV, for example) are able to identify discrete groups; specifically, he notes that MAXSLOPE is successful when the two indicators "are correlated less within groups than between groups", which to me is a latent profile analysis. Moreover, because Mplus enables model testing (finding the best fit), it seems as though the claim that mixture modeling suffers from the limits of cluster analysis is not correct. Anyway, I ask because this is an important issue within development and psychopathology, and I wonder what the impact of this article may be on those of us who use mixture modeling in papers/grants. Are you familiar with this argument or stat program, and how might researchers respond? Many thanks in advance.
bmuthen posted on Friday, August 20, 2004 - 1:17 pm
I would say that LCA cannot statistically determine that a latent variable is discrete rather than continuous. You can get the same fit with either type of model - some psychometric work points out this relationship. In my view, you really need "auxiliary variables", including distal outcomes (predictive validity) to help give credence for one or the other as the empirically more useful view.
I agree that LCA doesn't impose - or assume - a structure, but searches for it. I mean that in the sense that you increase the number of classes until you either have fit or it is clear that the conventional LCA model isn't sufficient - for example you need within-class correlations among some variables.
I would think it is clear that LCA is more useful than conventional cluster analysis.
I am not familiar with Meehl programs. I should read the Beauchaine article.
Could one speak of two distinct trajectory classes (GGMM) when these classes differ only by level (intercept) but not by linear slope (same decrease across both classes), i.e., parallel growth curves? Following Li, Barrera, Hops & Fisher (2001), who had a similar result, I plan to label these groups "Low-average" and "High-average". Both groups seem to be validated by various concurrent trajectories (LGMs) of other behavior, where I found growth-factor mean differences in accordance with theory.
Thanks! I predicted class membership of both trajectories of family climate by auxiliary T1 covariates like child temperament, gender and conduct problems. There were significant and theoretically meaningful associations. However, a reviewer is wondering what I am actually predicting, since both trajectory classes have the same slope. Is it valid to predict group membership of this kind of distinct growth curves (that differ only by level) by T1 covariates, and what does it mean? What else could one do to validate this special kind of growth mixture solution (trajectories that differ only by level)?
I would like to use this 2-group solution as the basis of a multiple group analysis targeting concurrent associated growth curves of problem behavior. I'm especially interested in moderating effects of both family climate classes on covariances between growth factors of two problem behaviors. However, a reviewer also claimed that it would be better to compute a mean level of family climate over the (5) waves and build "high" and "low" groups on a distribution split of that score (since my two classes differ only by level). What would be a (statistical) argument for preferring my two-class growth mixture solution for the purpose of multiple group building instead?
I try to stick with my GMM. However, could you, offhand, imagine another method for my research question that basically is: Are relationships (covariances between growth factors) in a multivariate LGM of 2 problem behaviors moderated by concurrent development of family climate (each variable measured over 5 waves)? Since I have short time effects between problem behaviors, a moderation of cross-lagged effects within the ALT-framework would also be of interest.
I thought this can be best done by multiple group modeling of the multivariate (ALT-)LGM and family climate groups of development thereby created by GMM. But, as I already mentioned, reviewers are not very convinced of both high and low average groups.
So if there is an alternative way of analysis, I wouldn't mind. The only thing that has come to my mind so far is splitting the distribution of growth factor scores of family climate, derived from a single-class LGM. But that also seems very arbitrary, and I wouldn't know how to combine intercept and slope score distribution splits to establish meaningful groups of development for multiple group analysis.
I don't know enough about your substantive situation to advise well. If your key question is about how the covariance between growth factors for two different processes is moderated by a third process, that's a quite complex question to model. And one could argue that adding GMM for the third process to the mix is making it too complex; from that angle the reviewer comments are reasonable, keeping it more transparent.
One interesting way to model a covariance as a function of an observed moderator is the CONSTRAINT = moderator; option of the VARIABLE command. See the QTL ex 5.23 in the UG.
Thanks, once again. I'm using GMM not as part of the multivariate LGM, but using most likely class membership as the grouping variable of the multiple group multivariate LGM (classification quality was high enough). So, model complexity was not a problem. I think I will try to stick with that, but thank you for your opinion.
I have one last question. Another paper used LCGA to find groups of family climate used as moderator, and the groups had different slopes in addition to different levels (easier to interpret compared to my case). However, these models fit my data considerably less well than GMM. Would it be correct to argue that GMM in my case provides a better (more reliable) longitudinal cutting method than LCGA (just for the purpose of finding longitudinal high vs. low groups), because it fits the data better (although interpretability is lower due to variance around both trajectories, which seems to soak up "artificial" class differences with respect to the slope)?