

Sample size for latent class growth a... 



Dear Drs. Muthen, I am currently investigating the growth of children's arithmetic skills. My initial sample had 210 participants, of whom only 140 remained in the fourth phase of the study. I would like to use their growth trajectories in arithmetic to classify them into different classes. I tried this in LCGA, and a 5-class solution was suggested, in which one of the groups was particularly interesting. However, I am wondering whether my sample size is large enough for such an analysis. I know that Monte Carlo simulation is a good way to test this, but the only information I can find about it is your 2002 paper (How to use a Monte Carlo study to decide on sample size and determine power), which may be too brief for a beginner to master the analysis. Is there other, more detailed work I can refer to on using a Monte Carlo study? Also, if my sample size really isn't large enough for LCGA, would a simple latent class analysis (with the arithmetic scores at the four time points) be a reasonable substitute? Thank you in advance for your help.


Chapter 12 of the V7 User's Guide shows examples of Monte Carlo simulations. I don't think n=210 is too low in principle for LCGA or GMM, but it depends strongly on how different the classes are; the more different they are, the easier it is to get good estimates.
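As a rough illustration of why separation matters, here is a minimal Python sketch. It is not an LCGA and not a substitute for an Mplus Monte Carlo run: it assumes only two equal-sized classes, a single normally distributed score, known class means, and classification by a midpoint cutoff. The function names and the separation values are hypothetical, chosen only for illustration.

```python
import random
from statistics import NormalDist


def expected_accuracy(d):
    """Correct-classification rate for two equal-sized normal classes whose
    means differ by d within-class SDs, using the midpoint as the cutoff."""
    return NormalDist().cdf(d / 2)


def simulated_accuracy(d, n, seed=0):
    """Monte Carlo estimate of the same rate from n simulated subjects."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        in_class_2 = rng.random() < 0.5            # true class membership
        score = rng.gauss(d if in_class_2 else 0.0, 1.0)
        assigned_class_2 = score > d / 2           # midpoint cutoff
        correct += assigned_class_2 == in_class_2
    return correct / n


if __name__ == "__main__":
    # Assumed separations (in SD units); larger d means better-separated classes.
    for d in (0.5, 1.0, 2.0, 3.0):
        print(d, round(expected_accuracy(d), 3),
              round(simulated_accuracy(d, 20000), 3))
```

Under these simplifying assumptions, the share of subjects assigned to the correct class rises steadily with the separation. A real mixture model also has to estimate the class means and proportions, so it should be expected to do worse than this at a given separation.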


Dear Prof. Muthen, Thank you for your prompt reply. I have read through the examples in Chapter 12 of the User's Guide, but I still have difficulty running the Monte Carlo simulation. I may need some more material written for beginners. Thank you for suggesting that n=210 is not too low for LCGA and GMM. For various reasons, I am considering using only the participants with complete data (i.e., n=140). In that case, there would be no missing data. Would that affect the power a lot? Also, may I ask what indicates how different the classes are? Is that reflected by entropy? If so, are there any standards for entropy values? My LCGA results (n=140, 5 classes) have an entropy of .802. In that case, are the groups considered different enough? Thank you very much for your help!


You should use all 210 subjects. A good indication of how different the classes are is how different the means are in the different classes relative to the SD. A mean difference of at least 2 SD is a good separation. 
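The 2 SD guideline above can be checked with a few lines of Python. This is only a sketch: the class means and the SD below are made-up numbers, not estimates from any model, and treating "the SD" as a single SD shared by all classes is an assumption. The function name `pairwise_separation` is hypothetical.

```python
from itertools import combinations


def pairwise_separation(class_means, sd):
    """Return {(i, j): |mean_i - mean_j| / sd} for every unordered pair of classes."""
    return {
        (i, j): abs(class_means[i] - class_means[j]) / sd
        for i, j in combinations(range(len(class_means)), 2)
    }


# Hypothetical estimated means (e.g., of the intercept factor) for 5 classes,
# and an assumed common SD.
means = [10.0, 14.0, 19.0, 25.0, 32.0]
sd = 2.0

sep = pairwise_separation(means, sd)
closest_pair = min(sep, key=sep.get)             # the least separated pair of classes
well_separated = all(v >= 2.0 for v in sep.values())

if __name__ == "__main__":
    print(closest_pair, sep[closest_pair], well_separated)
```

Looking at the least separated pair is one reasonable summary, since that pair is the hardest to tell apart. It is a choice made for this sketch rather than a rule.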


Just a small follow-up question to clarify the 'difference in means' that you mentioned. Do you mean the mean of the intercept, the slope, the observed values, or the estimated values? Also, given that I have 5 groups in total, do I need to average the mean differences across all 20 (5 x 4) pairs, or only compare the closest pairs (4 pairings), or, more strictly, only the pair that is closest in distance? Would the picture be more complicated if the growth curves of different classes intersect? Thank you very much. Your answers have helped me a lot!


There are no definite rules for this; my comment was just a general one. I was referring to the estimated means. If you have a growth model, the most relevant mean is that of the growth factor that is most influential in defining the classes. Sometimes that is the intercept, sometimes the slope.


