GMM with combined cluster and knowncl...
 Jordan Silberman posted on Friday, February 06, 2015 - 4:56 pm
Hi Mplus experts,

I'm estimating a growth mixture model in which each subject was measured at 7 equidistant time points and subjects are nested within groups. I would like to estimate a model that a) accounts for non-independence within groups and b) allows class probabilities to vary across groups (similar to User's Guide example 8.8).

My question: Is there any reason why I can't combine the CLUSTER option to achieve the former with the KNOWNCLASS option to achieve the latter, in the same model, with both options referring to the same grouping variable? Is there a more elegant solution?

My code would look something like the following.

VARIABLE: NAMES ARE grp y1-y7;
USEVARIABLES ARE y1-y7;
CLASSES = cg (5) c (5);
! cg is the known-class variable defined from the observed variable grp
KNOWNCLASS = cg (grp = 1 grp = 2 grp = 3 grp = 4 grp = 5);
CLUSTER = grp;
ANALYSIS: TYPE = MIXTURE COMPLEX;
MODEL:
%OVERALL%
i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
c ON cg;

Thanks,
Jordan
 Bengt O. Muthen posted on Saturday, February 07, 2015 - 2:58 pm
Your growth model takes care of non-independence within group, that is, the across-time correlation for a subject. And with 5 groups I think you want to consider group a fixed mode, so group is not a cluster variable as with multilevel analysis. So you only need Knownclass for the groups.
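
A rough sketch of what a Knownclass-only input might look like, modeled on User's Guide example 8.8 and the variable names in the original post. The latent class name cg, the USEVARIABLES line, and the data file name mydata.dat are assumptions added for illustration.

```
TITLE:    GMM with group as a known class (sketch)
DATA:     FILE IS mydata.dat;    ! hypothetical file name
VARIABLE: NAMES ARE grp y1-y7;
          USEVARIABLES ARE y1-y7;
          CLASSES = cg (5) c (5);
          ! cg is a known-class variable built from the observed grp
          KNOWNCLASS = cg (grp = 1 grp = 2 grp = 3 grp = 4 grp = 5);
ANALYSIS: TYPE = MIXTURE;        ! no COMPLEX and no CLUSTER option
MODEL:
          %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
          c ON cg;               ! class probabilities for c vary across groups
```

Here group enters only through the known classes, so it is treated as a fixed mode.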
 Jordan Silberman posted on Saturday, February 07, 2015 - 3:52 pm
Thanks so much.

I understand that the growth model accounts for non-independence across repeated measurements within subjects.

In addition, I need to account for the fact that subjects are nested within groups. Are you saying that the KNOWNCLASS option alone accounts for non-independence across subjects, within groups?

I'm not sure what you mean by "with 5 groups I think you want to consider group a fixed mode."

I suspect you're suggesting that I treat group as a fixed effect, as you would if you had randomized subjects to 5 different treatment groups. Is this what you're suggesting?

For complex reasons, I really do need to treat group as random and account for non-independence across subjects within groups. In case this matters, there will actually be 16 highest-level units in the model (16 groups); the code above is a greatly simplified version of what the final model will look like, just for illustrative purposes.

Many thanks,
Jordan
 Bengt O. Muthen posted on Saturday, February 07, 2015 - 4:48 pm
Yes, I had in mind that you "treat group as a fixed effect, as you would if you had randomized subjects to 5 different treatment groups." Even with 16 groups it is hard to treat group as random using Type=Twolevel because the cluster variance, which is what accounts for non-independence within a cluster, is not well estimated for so few clusters/groups. You need at least 20-30 and preferably more. Same for Type=Complex.
 Bengt O. Muthen posted on Saturday, February 07, 2015 - 4:52 pm
Regarding fixed vs random you may also be interested in this related paper on our website (see Recent Papers on home page):

Muthén, B. & Asparouhov, T. (2013). New methods for the study of measurement invariance with many groups. Mplus scripts are available here.
 Jordan Silberman posted on Saturday, February 07, 2015 - 5:08 pm
Many thanks for your quick response. I'll look at the recent paper.

Regarding my original question, would the CLUSTER option used in the code above account for non-independence within groups in the same manner as TYPE=TWOLEVEL? Besides the relatively low number of highest-level units, is there any flaw in the code above?

If it makes a difference, the design is analogous to one in which there are approx. 6,000 subjects per group, each measured at 7 time points.

Jordan
 Jordan Silberman posted on Sunday, February 08, 2015 - 4:24 am
A related question:

Let’s assume 16 groups of subjects, 6,000 subjects per group, 7 equidistant repeated measurements per subject. Our goal is to identify latent trajectories within groups. We have no interest in between-groups comparisons. There are strong theoretical reasons to predict that groups vary in numbers of latent trajectories, and in group-specific trajectory patterns. Is there good reason to think that a single, multiple-group growth mixture model will provide more accurate latent trajectory identification than you’d get from 16 separate, group-specific growth mixture models?

I understand that there are strong advantages to estimating a single, multiple-group growth mixture model, including elegance, convenience, and the possibility of between-group comparisons. However, my main concern is the accuracy of within-group latent trajectory identification.

The fact that we expect different groups to have different numbers of latent trajectories, and the fact that the multiple-group approach constrains all groups to have the same number of latent trajectories, make me think that estimating separate, group-specific growth mixture models might yield more accurate latent trajectory identification. Do you agree? Or do you think there is good reason to believe that the multiple-group growth mixture model will yield more accurate latent trajectory identification?

Any guidance would be greatly appreciated. Many thanks.
Jordan
 Bengt O. Muthen posted on Sunday, February 08, 2015 - 12:38 pm
First post:

You should not use Knownclass and Cluster for the same variable (group) - that would be contradictory, saying that group is both a fixed and a random mode. The Cluster option goes with Type=Twolevel and Type=Complex.
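
For contrast, a minimal sketch of the random-mode alternative, in which group appears only in the Cluster option and there is no Knownclass. The USEVARIABLES line and the file name mydata.dat are assumptions, and as noted earlier in the thread this approach is hard to justify with so few groups (at least 20-30 clusters are needed).

```
DATA:     FILE IS mydata.dat;    ! hypothetical file name
VARIABLE: NAMES ARE grp y1-y7;
          USEVARIABLES ARE y1-y7;
          CLASSES = c (5);       ! only the unknown trajectory classes
          CLUSTER = grp;         ! group treated as a random mode
ANALYSIS: TYPE = MIXTURE COMPLEX;
MODEL:
          %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
```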
 Bengt O. Muthen posted on Sunday, February 08, 2015 - 12:42 pm
Post 2:

You would only benefit from doing multiple-group analysis if some parameters are held equal across groups. It sounds like you should do 16 separate analyses for your 16 groups.
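
One way the separate analyses could be set up is sketched below: the same input is run once per group, with USEOBSERVATIONS selecting the group. The file name mydata.dat and the number of classes shown are placeholders, since the number of classes would be chosen separately for each group.

```
TITLE:    GMM for group 1 only (sketch)
DATA:     FILE IS mydata.dat;             ! hypothetical file name
VARIABLE: NAMES ARE grp y1-y7;
          USEVARIABLES ARE y1-y7;
          USEOBSERVATIONS ARE grp EQ 1;   ! change to 2, 3, ..., 16 for the other runs
          CLASSES = c (3);                ! placeholder; decide per group
ANALYSIS: TYPE = MIXTURE;
MODEL:
          %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
```

Because no parameters are held equal across groups, nothing is lost relative to a multiple-group run, and each group can have its own number of classes.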
 Jordan Silberman posted on Sunday, February 08, 2015 - 12:51 pm
OK, this answers my questions. Many thanks, Dr. Muthen, for the invaluable and quick advice. -Jordan