

GMM with combined cluster and knowncl... 



Hi Mplus experts,

I'm estimating a growth mixture model in which each subject was measured at 7 equidistant time points and subjects are nested within groups. I would like to estimate a model that a) accounts for nonindependence within groups and b) allows class probabilities to vary across groups (similar to User's Guide example 8.8).

My question: Is there any reason why I can't combine the CLUSTER option to achieve the former with the KNOWNCLASS option to achieve the latter, in the same model, with both options referring to the same grouping variable? Is there a more elegant solution? My code would look something like the following.

VARIABLE: NAMES ARE grp y1-y7;
  CLASSES = grp (5) c (5);
  KNOWNCLASS = grp (grp = 1 grp = 2 grp = 3 grp = 4 grp = 5);
  CLUSTER = grp;
ANALYSIS: TYPE = MIXTURE COMPLEX;
MODEL: %overall%
  i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
  c ON grp;

Thanks, Jordan


Your growth model takes care of nonindependence within group, that is, the across-time correlation for a subject. And with 5 groups I think you want to consider group a fixed mode, so group is not a cluster variable as with multilevel analysis. So you only need Knownclass for the groups.
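
For illustration only, here is a minimal sketch of what a Knownclass-only setup might look like, adapted from the code in the question and patterned on User's Guide example 8.8. It is not input taken from the reply above. The class-variable name cg is an assumption, chosen so that the known-class variable does not share a name with the observed variable grp, and the file name gmm.dat is hypothetical.

```
! Sketch under stated assumptions: cg and gmm.dat are hypothetical names.
DATA:      FILE IS gmm.dat;
VARIABLE:  NAMES ARE grp y1-y7;
           USEVARIABLES ARE y1-y7;
           ! cg is the known-class variable defined by observed grp;
           ! c is the latent trajectory class variable
           CLASSES = cg (5) c (5);
           KNOWNCLASS = cg (grp = 1 grp = 2 grp = 3 grp = 4 grp = 5);
ANALYSIS:  TYPE = MIXTURE;
MODEL:     %OVERALL%
           ! linear growth over 7 equidistant time points
           i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
           ! lets trajectory-class probabilities differ across groups
           c ON cg;
```

Note that there is no CLUSTER option and no COMPLEX in the TYPE setting here, since group is treated as a fixed mode.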


Thanks so much. I understand that the growth model accounts for nonindependence across repeated measurements within subjects. In addition, I need to account for the fact that subjects are nested within groups. Are you saying that the KNOWNCLASS option alone accounts for nonindependence across subjects, within groups? I'm not sure what you mean by "with 5 groups I think you want to consider group a fixed mode." I suspect you're suggesting that I treat group as a fixed effect, as you would if you had randomized subjects to 5 different treatment groups. Is this what you're suggesting? For complex reasons, I really do need to treat group as random, and account for nonindependence nested within groups, across subjects. In case this matters, there will actually be 16 highest-level units in the model (16 groups); the code above is a greatly simplified version of what the final model will look like, just for illustrative purposes. Many thanks, Jordan


Yes, I had in mind that you "treat group as a fixed effect, as you would if you had randomized subjects to 5 different treatment groups." Even with 16 groups it is hard to treat group as random using Type=Twolevel because the cluster variance, which is what accounts for nonindependence within a cluster, is not well estimated for so few clusters/groups. You need at least 20-30 and preferably more. Same for Type=Complex.


Regarding fixed vs random you may also be interested in this related paper on our website (see Recent Papers on home page): Muthén, B. & Asparouhov, T. (2013). New methods for the study of measurement invariance with many groups. Mplus scripts are available here. 


Many thanks for your quick response. I'll look at the recent paper. Regarding my original question, would the CLUSTER option used in the code above account for the nonindependence across groups in the same manner as TYPE=TWOLEVEL? Besides the relatively low number of highest-level units, is there any flaw in the code above? If it makes a difference, the design is analogous to one in which there are approx. 6,000 subjects per group, each measured at 7 time points. Jordan


A related question: Let's assume 16 groups of subjects, 6,000 subjects per group, 7 equidistant repeated measurements per subject. Our goal is to identify latent trajectories within groups. We have no interest in between-group comparisons. There are strong theoretical reasons to predict that groups vary in numbers of latent trajectories, and in group-specific trajectory patterns. Is there good reason to think that a single, multiple-group growth mixture model will provide more accurate latent trajectory identification than you'd get from 16 separate, group-specific growth mixture models? I understand that there are strong advantages to estimating a single, multiple-group growth mixture model, including elegance, convenience, and the possibility of between-group comparisons. However, my main concern is the accuracy of within-group latent trajectory identification. The fact that we expect different groups to have different numbers of latent trajectories, and the fact that the multiple-group approach constrains all groups to have the same number of latent trajectories, makes me think that perhaps estimating separate, group-specific growth mixture models might yield more accurate latent trajectory identification. Do you agree? Or do you think there is good reason to believe that the multiple-group growth mixture model will yield more accurate latent trajectory identification? Any guidance would be greatly appreciated. Many thanks. Jordan


First post: You should not use Knownclass and Cluster for the same variable (group) - that would be contradictory, saying that group is both a fixed and a random mode. The Cluster option goes with Type=Twolevel and Type=Complex.


Post 2: You would only benefit from doing multiple-group analysis if some parameters are held equal across groups. It sounds like you should do 16 separate analyses for your 16 groups.
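
As a rough sketch of how one of the 16 separate analyses might be set up (an illustration, not input supplied in this thread): the USEOBSERVATIONS option restricts the run to a single group, so the same input can be repeated with grp EQ 2, grp EQ 3, and so on. The file name gmm.dat and the choice of 3 classes are assumptions; the number of classes would be decided separately for each group.

```
! Sketch under stated assumptions: gmm.dat and c (3) are placeholders.
DATA:      FILE IS gmm.dat;
VARIABLE:  NAMES ARE grp y1-y7;
           USEVARIABLES ARE y1-y7;
           ! analyze group 1 only; change the value for each of the 16 runs
           USEOBSERVATIONS ARE grp EQ 1;
           ! number of trajectory classes can differ from group to group
           CLASSES = c (3);
ANALYSIS:  TYPE = MIXTURE;
MODEL:     %OVERALL%
           ! same linear growth specification as in the original question
           i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
```

Because each group is run on its own, neither Knownclass nor Cluster is needed, and nothing constrains the groups to share the same number of classes.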


OK, this answers my questions. Many thanks, Dr. Muthen, for the invaluable and quick advice. Jordan 


