Class Selection

Mplus Discussion > Latent Variable Mixture Modeling >

Sanjoy posted on Wednesday, November 23, 2005 - 11:34 pm

Prof. Muthen ...

This is the situation: I am running an LCA with covariates, using 6 ordinal indicator variables (3 categories each) and 10 covariates.

                      2 class          3 class          4 class          5 class          6 class
AIC                   2479.396         2450.399         2437.525         2432.189         2389.83
Sample-adjusted BIC   2490.276         2468.119         2462.084         2463.588         2427.76
Entropy               0.862            0.898            0.823            0.849            0.981
Pearson p-value       0.0294 (df=701)  0.0312 (df=687)  0.99 (df=676)    0.99 (df=663)    0.99 (df=0.0865)

Q1. Can you please help me choose the number of latent classes?

- Following AIC and sample-adjusted BIC, 4 classes are better than 2 or 3.
- 5 classes are better by AIC but not by sample-adjusted BIC.
- 6 classes are better still. I have checked the "Nagin table" (the diagonal elements are between .95 and .99). However, in the regression results with 6 classes some of the estimates are astoundingly large, which looks slightly weird.

Q2. Can you please tell us whether this is OK? (There are no warnings in the output apart from +/-15 issues for some of the cells.)

Categorical Latent Variables

C#1 ON
F1 0.907 0.522 1.739
HHSIZE -1.108 0.420 -2.637
F3INC 1.364 0.854 1.598
F4GMHEAR 0.685 0.413 1.660
F5EMPLOY 34.076 15.033 2.267
F6AEDU 16.388 6.456 2.539
F6BSCI 10.737 4.648 2.310
F7AGE 2.728 1.156 2.360
F10GMREA -16.461 6.709 -2.454
F11FSHOP -22.471 9.730 -2.309

C#2 ON
F1 5.969 2.599 2.297
HHSIZE 0.497 0.397 1.251
F3INC -5.860 2.523 -2.322
F4GMHEAR 14.672 5.987 2.451
F5EMPLOY -9.946 5.794 -1.717
F6AEDU 21.408 8.463 2.530
F6BSCI 71.584 32.350 2.213
F7AGE -6.231 2.615 -2.382
F10GMREA -39.392 17.437 -2.259
F11FSHOP -25.250 11.214 -2.252

C#3 ON
F1 20.990 10.401 2.018
HHSIZE -6.373 2.561 -2.489
F3INC -4.458 2.059 -2.165
F5EMPLOY -4.925 4.366 -1.128
F6AEDU 18.131 8.259 2.195
F6BSCI 43.399 19.744 2.198
F7AGE 11.876 5.159 2.302
F10GMREA -22.331 9.846 -2.268
F11FSHOP -32.049 14.271 -2.246

C#4 ON
F1 0.000 0.000 0.000
HHSIZE -2.756 1.080 -2.552
F3INC -5.470 2.349 -2.328
F5EMPLOY 1.834 3.593 0.511
F6AEDU 24.308 10.995 2.211
F6BSCI 59.123 28.020 2.110
F7AGE 7.606 3.424 2.221
F10GMREA -22.938 10.632 -2.158
F11FSHOP -29.166 12.868 -2.267

C#5 ON
F1 0.000 0.000 0.000
HHSIZE -0.886 0.428 -2.069
F3INC 1.456 0.866 1.682
F5EMPLOY 33.593 14.879 2.258
F6AEDU 16.087 6.478 2.483
F6BSCI 8.538 4.791 1.782
F7AGE 2.558 1.168 2.189
F10GMREA -15.794 6.795 -2.324
F11FSHOP -22.203 9.729 -2.282

Intercepts
C#1 89.500 39.580 2.261
C#2 137.887 60.984 2.261
C#3 124.500 51.936 2.397
C#4 129.445 54.820 2.361
C#5 89.779 39.518 2.272

Q3. I ran the Mendell-Rubin test, and it appears that class 5 should be rejected, doesn't it? However, the mean and standard deviation values look huge. Why is this?

TECHNICAL 11 OUTPUT
VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 5 (H0) VERSUS 6 CLASSES
H0 Loglikelihood Value -1178.578
2 Times the Loglikelihood Difference 211.324
Difference in the Number of Parameters 45
Mean 294207.135
Standard Deviation 413877.053
P-Value 1.0000

LO-MENDELL-RUBIN ADJUSTED LRT TEST
Value 210.471
P-Value 1.0000

Thanks and regards

bmuthen posted on Thursday, November 24, 2005 - 8:26 am

Q1. You have some answers in the Muthen (2004) chapter in the Kaplan handbook on our web site. I would also look at the log likelihood and see how it improves as you add classes - you stop adding classes when the improvement starts leveling off. I would place less emphasis on Lo-Mendell-Rubin.
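
The "leveling off" check can be sketched numerically. The log likelihoods for each run were not posted, so this minimal sketch uses the AIC and sample-adjusted BIC values from the table in the first post as a stand-in; the helper names are illustrative and not part of Mplus.

```python
# Fit criteria copied from the table in the first post (2 to 6 classes).
classes = [2, 3, 4, 5, 6]
aic = [2479.396, 2450.399, 2437.525, 2432.189, 2389.83]
sabic = [2490.276, 2468.119, 2462.084, 2463.588, 2427.76]

def successive_drops(values):
    """Return the decrease in a criterion each time one class is added."""
    return [round(prev - cur, 3) for prev, cur in zip(values, values[1:])]

def best_class(values):
    """Return the number of classes with the smallest criterion value."""
    return classes[values.index(min(values))]

aic_drops = successive_drops(aic)
sabic_drops = successive_drops(sabic)

for k, d_aic, d_bic in zip(classes[1:], aic_drops, sabic_drops):
    print(f"{k - 1} -> {k} classes: AIC drop {d_aic:8.3f}, adj. BIC drop {d_bic:8.3f}")
print("Smallest AIC:", best_class(aic), "classes")
print("Smallest adj. BIC:", best_class(sabic), "classes")
```

With these values the drops shrink steadily up to 5 classes (the sample-adjusted BIC even rises slightly from 4 to 5) and then jump again at 6 classes, so the smallest value alone does not settle the question.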

Q2. Large slopes of c on x are not a problem, but they are more likely to occur with more classes, because you are more likely to find people who have little variation on x in certain classes (so with x=0/1, you may have almost all x=1 in one class), and you may also have a class probability of almost 1 for certain x values.
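
To see why a very large slope goes together with a class probability of almost 1, here is a small sketch of the multinomial logit that links c to x, with the last class as the reference class. The intercepts and slopes are hypothetical values chosen for illustration, not estimates from the output above.

```python
import math

def class_probabilities(intercepts, slopes, x):
    """Multinomial logit: the last class is the reference (logit fixed at 0)."""
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 3-class model with one binary covariate x (assumed values).
intercepts = [-1.0, 0.5]
slopes = [30.0, 1.0]  # a very large slope for class 1

p_x0 = class_probabilities(intercepts, slopes, 0)
p_x1 = class_probabilities(intercepts, slopes, 1)

print("x=0:", [round(p, 3) for p in p_x0])
print("x=1:", [round(p, 3) for p in p_x1])
```

In this sketch, people with x=1 fall in class 1 with probability essentially 1, which is the situation in which the likelihood keeps pushing the slope toward a very large value.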

Q3. It looks like Tech11 runs into a problem here - hard to say why.

Sanjoy posted on Friday, November 25, 2005 - 7:17 pm

Thank you Professor, let me go through the paper you have suggested.

regards

anonymous posted on Tuesday, December 20, 2005 - 6:59 am

Hello,
How do I correct for design effects when estimating LC regression models when the number of between-units is small (12 in my case)? Would it be appropriate to use the complex procedure?

bmuthen posted on Tuesday, December 20, 2005 - 7:10 am

Simulations indicate that you might need at least 20 clusters for the SEs of the complex procedure to work well.

Elaine Walsh posted on Sunday, November 12, 2006 - 4:31 pm

Dr. Muthen,
I am working on a mixture model and attempting to identify trajectories/classes related to a specific behavior. Is there a command that provides information about which individual cases are placed in each trajectory/class?
Thank you.

Linda K. Muthen posted on Monday, November 13, 2006 - 8:34 am

See the CPROBABILITIES option of the SAVEDATA command.
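
The saved posterior class probabilities are what determine each case's most likely class. As a rough sketch of that assignment step, assuming the probabilities have already been read into memory (the case labels and values below are made up for illustration):

```python
# Hypothetical posterior class probabilities for four cases in a 3-class model.
# In practice these would come from the saved file; the values here are made up.
posteriors = {
    "case_1": [0.92, 0.05, 0.03],
    "case_2": [0.10, 0.85, 0.05],
    "case_3": [0.30, 0.25, 0.45],
    "case_4": [0.02, 0.01, 0.97],
}

def most_likely_class(probs):
    """Return the 1-based index of the class with the highest posterior probability."""
    return probs.index(max(probs)) + 1

assignments = {case: most_likely_class(p) for case, p in posteriors.items()}

# Group case identifiers by their assigned class.
members = {}
for case, cls in assignments.items():
    members.setdefault(cls, []).append(case)

for cls in sorted(members):
    print(f"class {cls}: {members[cls]}")
```

A case such as case_3, whose highest probability is only 0.45, is assigned to a class all the same, so it is worth looking at the probabilities themselves and not only at the assigned class.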

Jon Elhai posted on Wednesday, July 16, 2008 - 6:14 pm

Drs. Muthen,
In running an LCA (ordinal outcome variables and MLR estimation)... I'm wondering how to interpret a p value of 1.000 for the VUONG-LO-MENDELL-RUBIN LR tests that I have obtained for each of my class solutions that I have run thus far (classes of 1 through 5). Is it typical for those first few class solutions to all have p values of 1.000 for those tests?

Bengt O. Muthen posted on Wednesday, July 16, 2008 - 6:41 pm

A p-value of 1 in the 2-class run would suggest that a single-class model is sufficient.

Wen posted on Thursday, October 15, 2009 - 8:41 am

Dear Drs. Muthen,

I'm working on growth mixture modeling with a continuous outcome. The 2-class model has BIC=2180.8 and a significant VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST (p=0.041), whereas the 3-class model has BIC=2180.6 and p-value=0.264 and is more interpretable. Which one should I choose? Do I have to calculate the chi-square value? If yes, can I get it when programming the GMM?
Thank you.

Linda K. Muthen posted on Thursday, October 15, 2009 - 10:37 am

The VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST points to two classes. Substantive interpretability and predictive validity can also be considered in determining the number of classes. See the following papers, which are on the website:

Nylund, K.L., Asparouhov, T., & Muthen, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.

M.O. posted on Tuesday, May 12, 2015 - 2:37 am

Dear Drs. Muthen,

I am comparing a factor mixture model (FMM) and a local independence model (LCA).

LCA worked well, with the 3-class solution showing the smallest BIC (5225).
FMM did not work with the 2-class solution: although it showed a smaller BIC (5160) and the bootstrap LR difference test was significant (p=0.01), all class members ended up in one class (i.e., number of class members for class 1=888 and class 2=0).
The 3-class FMM worked fine, with good interpretability and a BIC (5164) smaller than that of the 3-class LCA. However, the LR difference test was not significant (p=1).

Just for clarification, here are the input files for LCA and FMM.

Considering the results, is it reasonable to choose the 3-class FMM?

---LCA---
TITLE: LCA
DATA: FILE = 'N888.dat';
VARIABLE:
NAMES = u1-u3;
USEVARIABLES = u1-u3;
CATEGORICAL = u1-u3;
CLASSES = C(3);
ANALYSIS:
TYPE IS MIXTURE;
STARTS = 1000 100;

---FMM---
TITLE: fmm
DATA:
FILE = 'N888.dat';
VARIABLE:
NAMES = u1-u3;
USEVARIABLES = u1-u3;
CATEGORICAL = u1-u3;
CLASSES = C(3);
ANALYSIS:
TYPE IS MIXTURE;
STARTS = 10000 1000;
ALGORITHM=INTEGRATION;
MODEL:
%OVERALL%
f1 BY u1-u3;
---

Bengt O. Muthen posted on Tuesday, May 12, 2015 - 7:43 am

I'd go by BIC. But I would also consider a model that is LCA, but adds a few WITH statements. See our new article:

Asparouhov, T. & Muthen, B. (2015). Residual associations in latent class and latent transition analysis. Structural Equation Modeling: A Multidisciplinary Journal, 22:2, 169-177, DOI: 10.1080/10705511.2014.935844