Sample size for latent class growth a...
 Wong Tin Yau Terry posted on Sunday, December 15, 2013 - 6:01 am
Dear Drs. Muthen,

I am currently investigating the growth of children's arithmetic skills. I have an initial sample of 210 participants, of whom only 140 remained in the fourth phase of the study. I would like to use their growth trajectories in arithmetic to classify them into different classes. I tried this in LCGA, and a 5-class solution was suggested, in which one of the groups was particularly interesting. Yet I am wondering whether my sample size is large enough for such an analysis. I know that Monte Carlo simulation is a good way to test this, but the only information I can find about Monte Carlo simulation is your 2002 paper (How to use a Monte Carlo study to decide on sample size and determine power). However, it may be too brief for a beginner to master this analysis. I am wondering whether there is other, more detailed work that I can refer to on using a Monte Carlo study.

On the other hand, if my sample size really isn't large enough for an LCGA, I am wondering whether a simple latent class analysis (with the arithmetic scores at the four time points) is a reasonable substitute.

Thank you in advance for your help.
 Bengt O. Muthen posted on Sunday, December 15, 2013 - 11:07 am
Chapter 12 of the V7 User's Guide shows examples of Monte Carlo simulations.

I don't think n=210 is too low in principle for LCGA or GMM, but it depends strongly on how different the classes are; the more different, the easier it is to get good estimates.
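
As a toy illustration of why class separation matters, the sketch below runs a small Monte Carlo study in Python. It is not an LCGA or GMM and does not replace the Mplus simulations in Chapter 12: it assumes just two equally likely classes that differ by d SDs on a single variable, treats the class means as known, and assigns each simulated person to the nearer mean. The function name, the 200 replications, and the seed are all hypothetical choices; only n=210 comes from this thread.

```python
import random
from statistics import NormalDist, mean

def simulate_accuracy(d, n=210, reps=200, seed=1):
    """Average proportion correctly classified across `reps` simulated samples.

    Assumption: two equally likely classes with means 0 and d (SD = 1) on one
    variable; each person is assigned to the class whose mean is nearer,
    i.e. the cutoff is d / 2.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(reps):
        correct = 0
        for _ in range(n):
            true_class = rng.randrange(2)           # 0 or 1 with equal probability
            x = rng.gauss(true_class * d, 1.0)      # draw from that class
            assigned = 1 if x > d / 2 else 0        # nearest-mean rule
            correct += (assigned == true_class)
        accuracies.append(correct / n)
    return mean(accuracies)

# Compare simulated accuracy with the theoretical value Phi(d / 2).
for d in (0.5, 1.0, 2.0, 3.0):
    print(d, round(simulate_accuracy(d), 3), round(NormalDist().cdf(d / 2), 3))
```

Under these simplifying assumptions, classification accuracy rises steadily with d, which is the intuition behind the statement that better-separated classes are easier to recover. A real power study would also have to estimate the class parameters, which this sketch does not do.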
 Wong Tin Yau Terry posted on Tuesday, December 17, 2013 - 8:54 am
Dear Prof. Muthen,

Thank you for your prompt reply. I have actually read through the examples in Chapter 12 of the User's Guide, but I still have difficulty running the Monte Carlo simulation. I may need some more introductory material for beginners.

Thank you for suggesting that n=210 is not too low for LCGA and GMM. For various reasons, I am considering using only the sample with complete data (i.e., n=140). In that case, there would be no missing data. Would that affect the power a lot? On the other hand, may I ask what the indicators of 'how different the classes are' would be? Is that reflected by entropy? If it is, are there any standards for entropy values? My LCGA results (n=140, 5 classes) have an entropy of .802. In that case, are the groups considered different enough?
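
For reference, the entropy reported for mixture models is usually the relative entropy, 1 - sum_i sum_k(-p_ik ln p_ik) / (n ln K), where p_ik is person i's posterior probability of belonging to class k. The Python sketch below computes it from a table of posterior probabilities. The function name and the two small example tables are hypothetical and are not the posteriors behind the .802 value above.

```python
import math

def relative_entropy(posteriors):
    """Relative entropy of a classification.

    posteriors: list of rows, one per person, each row holding that person's
    posterior class probabilities (assumed to sum to 1).
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:                      # treat 0 * ln(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical posteriors: everyone classified with certainty vs. no information.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]
print(relative_entropy(crisp))
print(relative_entropy(fuzzy))
```

The value approaches 1 when every person is assigned to a class with near certainty and approaches 0 when the posteriors are uniform across classes.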

Thank you very much for your help!
 Bengt O. Muthen posted on Tuesday, December 17, 2013 - 3:51 pm
You should use all 210 subjects.

A good indication of how different the classes are is how different the means are in the different classes relative to the SD. A mean difference of at least 2 SD is a good separation.
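
As a rough illustration of the 2 SD guideline, the Python sketch below computes the standardized mean difference for every pair of classes and flags the pairs that fall short. The five class means and the SD are made-up numbers, the helper name is hypothetical, and using one common within-class SD for all classes is an assumption.

```python
from itertools import combinations

def pairwise_separation(class_means, sd):
    """Return {(i, j): |mean_i - mean_j| / sd} for every unordered pair of classes."""
    return {
        (i, j): abs(class_means[i] - class_means[j]) / sd
        for i, j in combinations(range(len(class_means)), 2)
    }

# Hypothetical estimated means for 5 classes and a common SD (made-up numbers).
means = [10.0, 14.0, 19.0, 25.0, 32.0]
sd = 2.5

sep = pairwise_separation(means, sd)
closest_pair = min(sep, key=sep.get)       # the pair that is hardest to tell apart

for pair, d in sorted(sep.items()):
    flag = "ok" if d >= 2 else "weak"      # 2 SD threshold from the guideline above
    print(pair, round(d, 2), flag)
print("closest pair:", closest_pair, "separation:", round(sep[closest_pair], 2))
```

With 5 classes there are 10 unordered pairs; the closest pair is the natural one to inspect first, since it is the pair most likely to be poorly distinguished.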
 Wong Tin Yau Terry posted on Wednesday, December 18, 2013 - 4:14 am
Just a small follow-up question to clarify the 'difference in means' that you mentioned. Do you mean the mean of the intercept, the slope, the observed values, or the estimated values? Also, given that I have 5 groups in total, do I need to average the mean differences over all 20 (5 x 4) pairs, or do I just need to compare the closest pairs (4 pairings), or, more strictly, only the single pair that is closest in distance? Would the picture be more complicated if the growth curves of different classes intersect each other?

Thank you very much. Your answers helped me a lot!
 Bengt O. Muthen posted on Wednesday, December 18, 2013 - 10:36 am
There are no definite rules for this; my comment was just a general one. I am talking about the estimated means. If you have a growth model the most relevant mean is for the growth factor that is most influential in defining the classes. Sometimes that is the intercept, sometimes the slope.
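
To make the point about growth-factor means concrete, the Python sketch below derives the model-implied means at each occasion from hypothetical intercept and slope means. It assumes linear growth with time scores 0, 1, 2, 3 for the four phases; the time scores, the class labels, and all numbers are made up for illustration.

```python
def implied_means(intercept_mean, slope_mean, time_scores=(0, 1, 2, 3)):
    """Model-implied mean at each occasion under linear growth (an assumption)."""
    return [intercept_mean + slope_mean * t for t in time_scores]

# Hypothetical estimated growth-factor means: (intercept mean, slope mean).
classes = {
    "class 1": (10.0, 4.0),   # low start, fast growth
    "class 2": (14.0, 1.0),   # higher start, slow growth
}

curves = {name: implied_means(i, s) for name, (i, s) in classes.items()}
for name, curve in curves.items():
    print(name, curve)

# Difference between the two classes at each occasion; a sign change means the curves cross.
diffs = [a - b for a, b in zip(curves["class 1"], curves["class 2"])]
print("difference by occasion:", diffs)
```

In this made-up case the two curves cross between the second and third occasions, so the occasion-specific means are close around that point even though the slope means differ clearly. That is one reason to look at the growth factor that drives the class differences rather than at any single time point.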