I have a fairly large sample (~10,000) and 4 class indicators. The usual approaches for deciding on the number of classes lead me to 9 classes; however, there is very little difference between the classes in terms of the indicators. (The solution is also difficult to interpret and hard to justify theoretically.) I wondered whether the large sample size was overpowering the VLMR test, and also the AIC and BIC, and whether I should resort to a solution that makes more conceptual sense.
bmuthen posted on Saturday, January 17, 2004 - 12:25 am
Perhaps it is true about the VLMR test, but I don't think AIC and BIC get overpowered by large n since they are not looking at discrepancy functions. I assume you have continuous outcomes, so latent profile analysis (LPA). Two thoughts. First, you may get many and some not so meaningful classes because LPA works with conditional independence within class - perhaps some of your indicators have trivial methods correlation in which case you could include such a correlation and probably reduce the number of classes. Second, you might get more substantively meaningful classes if you add covariates (antecedents) and distal outcomes to the analysis.
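For reference, the standard definitions are AIC = -2 logL + 2p and BIC = -2 logL + p ln(n), where p is the number of free parameters and n the sample size, so n enters BIC only through the ln(n) penalty. A minimal Python sketch of the two criteria follows; the function names are illustrative and are not Mplus output fields.

```python
import math


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: -2 logL + 2p."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2 logL + p * ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def bic_penalty_per_parameter(n_obs: int) -> float:
    """BIC cost of one additional free parameter at sample size n_obs."""
    return math.log(n_obs)
```

At n = 10,000 each additional parameter costs ln(10000), about 9.21 BIC points, compared with 2 AIC points.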
Thanks for the quick response. I did add some covariates to the model, and it made more sense, and there was more agreement between the criteria.
Anonymous posted on Thursday, March 04, 2004 - 2:26 pm
I am having this same problem too. I'm running an LCA with 9 ordered polytomous latent class indicators of substance use (with 3 levels each). I have a data set with approximately 110,000 cases. Tech11 and BIC indicate that the 2, 3, 4, 5, 6, and 7 class models each fit better than a model with k-1 classes (and entropy remains above .8). Predictably, as the number of classes increases, the distinctions between a few of the classes are not very meaningful. I added three covariates (age, gender, and race/ethnicity) and started over, but I am getting the same results for up to 5 classes so far (I'm waiting for the 6 class model with covariates to converge).
I'd like to try your suggestion about including trivial methods correlations between indicators to make sure that this issue is not contributing to finding so many classes. Since I am not sure how to do this, can you please provide an explicit example? For example, how can I tell if a correlation between any two class indicators is due to a trivial methods correlation? Where do I add the correlation into the model statement and how should I write the command (e.g., “u3 ON u7” in the overall model statement)? Do I need to interpret the resulting parameter, or is it sufficient just to account for such correlations in the model building? When I examined the correlations between the 9 indicators, all of them reached significance and were distributed as follows: .24-.3 = 11%, .301-.4 = 36%, .401-.5 = 17%, .501-.6 = 28%, .601-.7 = 3%, .701-.8 = 6% (with a little rounding error). However, given that these are all substance use variables, I expect them to be significantly correlated. How much is too much? Any advice that you can offer would be much appreciated.
bmuthen posted on Friday, March 05, 2004 - 12:28 am
If Tech11 and BIC don't help restrict the number of classes, and covariates don't help, one good way is to bring in a distal outcome and look at predictive validity - do all classes have meaningful differences on the distal?
The within-class methods correlation between indicators is easy to add with LPA (continuous outcomes), but harder with categorical indicators, at least until you have version 3. For categorical indicators you can add a class that only affects the two indicators in question. This is easy in version 3, which has a simplified approach to working with multiple latent class variables. Another option in version 3 is to work with a random effect (a continuous latent variable) influencing both indicators. There is an article by Qu, Tan, Kutner (1996) in Biometrics on these matters.
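To illustrate the random-effect idea, here is a minimal Python simulation sketch (not Mplus syntax) of a single class in which one continuous random effect drives two binary indicators while a third indicator is independent. The loading value, sample size, seed, and function names are all assumptions chosen for illustration.

```python
import math
import random


def simulate_one_class(n, loading, seed=0):
    """Simulate n subjects from ONE class with three binary indicators.

    u1 and u2 share a standard-normal random effect f through a logistic
    link; u3 is independent of both.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # shared random effect
        p_shared = 1.0 / (1.0 + math.exp(-loading * f))
        u1 = 1 if rng.random() < p_shared else 0
        u2 = 1 if rng.random() < p_shared else 0
        u3 = 1 if rng.random() < 0.5 else 0  # unrelated indicator
        rows.append((u1, u2, u3))
    return rows


def pearson(xs, ys):
    """Plain Pearson correlation for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rows = simulate_one_class(n=20000, loading=1.5, seed=1)
u1, u2, u3 = zip(*rows)
r12 = pearson(u1, u2)  # within-class correlation induced by f
r13 = pearson(u1, u3)  # should be near zero
```

In this sketch u1 and u2 are correlated within the class even though everyone belongs to the same class, which is the kind of residual association that a model assuming conditional independence could otherwise try to absorb with extra classes.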
anonymous posted on Monday, April 12, 2010 - 1:12 am
Hello, I have a related issue. When I conduct an LCA on a general population sample (n=10,000) using 6 diagnoses as indicators, I obtain 3 classes differentiated primarily by severity rather than nature of problems. When I restrict the sample to individuals with disorders (n=2000), I obtain 7 classes differentiated primarily by nature of problem (not severity of problem). However, I am confused as to why the same 7 classes that were extracted in the disordered sample (plus an additional class without disorder) are not extracted as the best solution in the general population sample. Is it because only a minority have diagnoses (about 18%), so it is more difficult to differentiate the classes based on diagnoses in the general population sample? Is there any reading that would help me conceptually understand the discrepant results?
Typically some of the subjects without the disorder have a few diagnostic criteria fulfilled and therefore can be a heterogeneous group for which a single class is not sufficient. The 7 classes in the 18% minority are then hard to find when analyzing everyone - the signals are drowned out by the heterogeneity among the majority.
anonymous posted on Tuesday, April 13, 2010 - 3:50 pm
Thanks very much for your response. Since the class indicators are diagnoses (not diagnostic criteria), and the majority do not have these diagnoses, shouldn't the majority be a homogeneous group?
I see, so the majority has zeros for all 6 variables. When you say "best solution" perhaps you are referring to BIC which might not point to 8 classes. Are you saying that you have tried using 8 classes for the total sample of 10K and the 7 non-zero classes are not the same 7 classes as for the subset of 2000 with disorders?
anonymous posted on Saturday, April 17, 2010 - 1:19 am
Yes, that's exactly correct. According to the BIC, the best solution in the subpopulation with psychiatric disorders (n=2000) is 7 classes, whereas the best solution in the entire sample (n=10,000) is 3 classes. In addition, when I examine 8 classes in the entire sample, the model fails to converge. The log likelihood is not replicated, despite increasing the number of random starts.
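Whether the best log likelihood is replicated can be checked from the final-stage log likelihoods of the random starts. Below is a minimal Python sketch, assuming those values have been copied by hand into a list; the tolerance, the function name, and the example values are all hypothetical.

```python
def best_loglik_replicated(logliks, tol=0.01, min_hits=2):
    """Return True if the best log likelihood is reached by at least
    `min_hits` random starts, within an absolute tolerance `tol`."""
    best = max(logliks)
    hits = sum(1 for ll in logliks if best - ll <= tol)
    return hits >= min_hits


# Hypothetical final-stage values, made up for illustration only.
replicated = best_loglik_replicated([-5000.12, -5000.12, -5003.40])
not_replicated = best_loglik_replicated([-5000.12, -5001.90, -5003.40])
```

If the check fails even with many more starts, that is consistent with a solution that is not well identified for that number of classes.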