I'm estimating a growth mixture model in which each subject was measured at 7 equidistant time points and subjects are nested within groups. I would like to estimate a model that a) accounts for non-independence within groups and b) allows class probabilities to vary across groups (similar to User's Guide example 8.8).
My question: Is there any reason why I can't combine the CLUSTER command (to achieve the former) with the KNOWNCLASS command (to achieve the latter) in the same model, with both referring to the same grouping variable? Is there a more elegant solution?
My code would look something like the following.
VARIABLE: NAMES ARE grp y1-y7;
          CLASSES = grp (5) c (5);
          KNOWNCLASS = grp (grp = 1 grp = 2 grp = 3 grp = 4 grp = 5);
          CLUSTER = grp;
ANALYSIS: TYPE = MIXTURE COMPLEX;
MODEL:    %overall%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
          c ON grp;
Your growth model takes care of non-independence within group, that is, the across-time correlation for a subject. And with 5 groups I think you want to consider group a fixed mode, so group is not a cluster variable as with multilevel analysis. So you only need Knownclass for the groups.
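A minimal sketch of the Knownclass-only setup, modeled on User's Guide example 8.8 and reusing the variable names from the posted code. The latent class name cg is an assumption introduced here so that the known-class variable has a different name from the observed variable grp; the five classes for c are simply carried over from the post above.

```
VARIABLE: NAMES ARE grp y1-y7;
          USEVARIABLES ARE y1-y7;
          ! cg is a hypothetical name for the known-class variable
          CLASSES = cg (5) c (5);
          KNOWNCLASS = cg (grp = 1 grp = 2 grp = 3 grp = 4 grp = 5);
ANALYSIS: TYPE = MIXTURE;       ! no COMPLEX, no CLUSTER
MODEL:    %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
          c ON cg;              ! class probabilities vary across groups
```

Here group enters only as a fixed mode: the regression of c on cg lets the trajectory class probabilities differ by group, and no cluster-level variance is estimated.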
I understand that the growth model accounts for non-independence across repeated measurements within subjects.
In addition, I need to account for the fact that subjects are nested within groups. Are you saying that the KNOWNCLASS option alone accounts for non-independence across subjects, within groups?
I'm not sure what you mean by "with 5 groups I think you want to consider group a fixed mode."
I suspect you're suggesting that I treat group as a fixed effect, as you would if you had randomized subjects to 5 different treatment groups. Is this what you're suggesting?
For complex reasons, I really do need to treat group as random and account for non-independence across subjects within groups. In case this matters, there will actually be 16 highest-level units in the model (16 groups); the code above is a greatly simplified version of what the final model will look like, just for illustrative purposes.
Yes, I had in mind that you "treat group as a fixed effect, as you would if you had randomized subjects to 5 different treatment groups." Even with 16 groups it is hard to treat group as random using Type=Twolevel because the cluster variance, which is what accounts for non-independence within a cluster, is not well estimated for so few clusters/groups. You need at least 20-30 and preferably more. Same for Type=Complex.
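A rough, idealized illustration of why so few clusters is a problem, not a statement about how Mplus computes anything. Assume the J cluster effects u_j are normally distributed with variance sigma_u^2 and could be observed directly (in practice they cannot, so this is a best case). The usual sample variance of the u_j then has relative standard error:

```latex
\[
\frac{\operatorname{SE}\!\left(\hat{\sigma}_u^{2}\right)}{\sigma_u^{2}}
  = \sqrt{\frac{2}{J-1}},
\qquad
J = 16:\ \sqrt{2/15} \approx 0.37,
\quad
J = 30:\ \sqrt{2/29} \approx 0.26,
\quad
J = 100:\ \sqrt{2/99} \approx 0.14 .
\]
```

Under these assumptions, even with perfectly observed cluster effects, 16 groups leave the cluster variance uncertain by more than a third of its value, regardless of how many subjects each group contains.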
Many thanks for your quick response. I'll look at the recent paper.
Regarding my original question, would the CLUSTER command used in the code above account for the non-independence across groups in the same manner as TYPE=TWOLEVEL? Besides the relatively low number of highest-level units, is there any flaw in the code above?
If it makes a difference, the design is analogous to one in which there are approx. 6,000 subjects per group, each measured at 7 time points.
Let’s assume 16 groups of subjects, 6,000 subjects per group, 7 equidistant repeated measurements per subject. Our goal is to identify latent trajectories within groups. We have no interest in between-groups comparisons. There are strong theoretical reasons to predict that groups vary in numbers of latent trajectories, and in group-specific trajectory patterns. Is there good reason to think that a single, multiple-group growth mixture model will provide more accurate latent trajectory identification than you’d get from 16 separate, group-specific growth mixture models?
I understand that there are strong advantages to estimating a single, multiple-group growth mixture model, including elegance, convenience, and the possibility of between-group comparisons. However, my main concern is the accuracy of within-group latent trajectory identification.
The fact that we expect different groups to have different numbers of latent trajectories, and the fact that the multiple-group approach constrains all groups to have the same number of latent trajectories, makes me think that perhaps estimating separate, group-specific growth mixture models might yield more accurate latent trajectory identification. Do you agree? Or do you think there is good reason to believe that the multiple-group growth mixture models will yield more accurate latent trajectory identification?
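For concreteness, one of the 16 separate, group-specific runs might look like the sketch below. It assumes the same variable names as the code earlier in the thread; the choice of 3 classes is a placeholder only, since the number of classes would be decided separately for each group.

```
VARIABLE: NAMES ARE grp y1-y7;
          USEVARIABLES ARE y1-y7;
          USEOBSERVATIONS = grp EQ 1;   ! keep only subjects in group 1
          CLASSES = c (3);              ! placeholder; chosen per group
ANALYSIS: TYPE = MIXTURE;
MODEL:    %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
```

The same input would be rerun with grp EQ 2, grp EQ 3, and so on, varying the number of classes within each group as needed.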
Any guidance would be greatly appreciated. Many thanks. Jordan
You should not use Knownclass and Cluster for the same variable (group) - that would be contradictory, saying that group is both a fixed and a random mode. The Cluster option goes with Type=Twolevel and Type=Complex.
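For contrast, a sketch of what treating group only as a random mode would look like: the Cluster option with Type=Complex and no Knownclass. It assumes the same variable names as the code earlier in the thread and is illustrative only, given the point made above that 16 groups is fewer than the 20-30 needed.

```
VARIABLE: NAMES ARE grp y1-y7;
          USEVARIABLES ARE y1-y7;
          CLASSES = c (5);
          CLUSTER = grp;                ! group as a random mode only
ANALYSIS: TYPE = MIXTURE COMPLEX;
MODEL:    %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
```

In this version there is no known-class variable, so there is no c ON statement for group, and class probabilities are not modeled as varying across groups.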