Class sizes
 Anonymous posted on Tuesday, April 30, 2002 - 1:41 pm
I had three GMM models:
Model 1: a simple GMM without predictors of class membership and growth factors.
Model 2: predictors of class membership were included.
Model 3: both predictors of class membership and growth factors were included

Model 2 had exactly the same class classification as Model 1. However, the class sizes changed somewhat in Model 3 when predictors of growth factors were included.
Can I report class classification based on the results of Model 1, and interpret the impact of covariates on growth factors based on the results of Model 3?

Thank you very much for your help.
 Bengt O. Muthen posted on Tuesday, April 30, 2002 - 2:36 pm
No, use Model 3 for classification as well if that is your final model. If you are concerned about the change in classification in Model 3, try to modify Model 3. Perhaps some of the covariates have a direct influence on some of the outcomes. Also, you may want to study the individuals who change class membership to understand why that happens.
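A minimal sketch of one way to find the individuals who change class between two models, assuming the most likely class from each run has been saved and merged by ID. The function names, IDs, and class values are hypothetical. The sketch also assumes the class labels mean the same thing in both runs, which should be checked first, since class order can differ between solutions.

```python
from collections import Counter

def class_transition_table(classes_a, classes_b):
    """Count how many cases fall in each (class in run A, class in run B) pair."""
    return Counter(zip(classes_a, classes_b))

def changed_ids(ids, classes_a, classes_b):
    """Return the IDs of cases whose most likely class differs between runs."""
    return [i for i, a, b in zip(ids, classes_a, classes_b) if a != b]

# Hypothetical data: five cases classified under Model 1 and Model 3
ids = [101, 102, 103, 104, 105]
model1 = [1, 1, 2, 2, 2]
model3 = [1, 2, 2, 2, 1]

table = class_transition_table(model1, model3)
movers = changed_ids(ids, model1, model3)
print(table)
print(movers)
```

The off-diagonal cells of the table show where the reclassification happens, and the list of movers can then be inspected on the covariates.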
 Anonymous posted on Wednesday, May 01, 2002 - 10:04 am
Dear Dr. Muthén:
Thank you so much for your quick answers to my questions. They are very helpful. I have one more question.
Once the predictors of growth factors are included in the model, Mplus does not print the estimated means and S.E.s of the growth factors by default, so I used the TECH4 option to print the estimated means and variances of the growth factors. To my understanding, the estimated variances of the growth factors measure the variation of the random coefficients, and we can use their square roots for significance tests. However, I found that the square roots of the growth factor variances were much larger than expected. I then removed the predictors of the growth factors and ran the model again with TECH4. The square roots of the estimated growth factor variances were much larger than the S.E.s of the growth factors that Mplus provides.
I would like to know how I could test the significance of the growth factors when predictors of the growth factors are included in the model. Thank you very much for your help.
 Bengt O. Muthen posted on Thursday, May 02, 2002 - 7:20 am
I believe that you are confusing the standard error of the parameter estimate and the variance/standard deviation of a growth factor. When covariates are included in the model, a residual variance of a growth factor is estimated rather than the variance of a growth factor. This means that with covariates, significance testing focuses on the residual variance and the slopes for the covariates because this is how the model decomposes the variance of the growth factor. You typically don't test for the growth factor variance being zero. The significance test for any parameter in the model is found in column three of the results, the ratio of the parameter estimate to its standard error.
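A minimal sketch of the ratio described here, assuming the usual large-sample normal reference distribution for the two-sided p-value. The function name and the numbers are hypothetical and do not come from any output in this thread.

```python
import math

def wald_z_test(estimate, std_error):
    """Return (z, two-sided p) for H0: parameter = 0.

    z is the parameter estimate divided by its standard error;
    the p-value uses the standard normal approximation.
    """
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical estimate and standard error
z, p = wald_z_test(0.50, 0.20)
print(round(z, 2), round(p, 4))
```

The standard error used here describes sampling variation in the estimate. The square root of a growth factor variance is a standard deviation across individuals and does not belong in this ratio.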
 Anonymous posted on Friday, May 03, 2002 - 7:23 am
You are right. S.E.s refer to sampling variation in parameter estimates, and variances of random coefficients measure variation of the coefficients across cases. Thanks for pointing out that the variance of a growth factor is decomposed into two parts (explained and unexplained) when covariates are included.
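A minimal sketch of that decomposition, assuming a single covariate and a single class, so that the growth factor variance is slope squared times the covariate variance plus the residual variance. The function name and all values are hypothetical.

```python
def decompose_growth_factor_variance(slope, covariate_var, residual_var):
    """Split a growth factor variance into explained and residual parts.

    Assumes one covariate x with i = alpha + slope * x + residual,
    where the residual is uncorrelated with x.
    """
    explained = slope ** 2 * covariate_var
    total = explained + residual_var
    r_squared = explained / total
    return explained, total, r_squared

# Hypothetical values
explained, total, r_squared = decompose_growth_factor_variance(
    slope=0.5, covariate_var=4.0, residual_var=3.0
)
print(explained, total, r_squared)
```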
 Harald Gerber posted on Saturday, October 04, 2008 - 3:30 am
Sorry for this somewhat simple question, but how can one compute a significance test for the growth factor means given in TECH4 (conditional model)? I'm not sure whether the question above deals with the same issue (it seems to be more about testing growth factor variances). I want to test growth factor means in conditional models, not growth factor (residual) variances.
Thank you for your time!
 Linda K. Muthen posted on Saturday, October 04, 2008 - 11:57 am
You can use MODEL CONSTRAINT to define the mean or run the unconditional model.
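A minimal sketch of what such a defined mean amounts to, assuming one covariate: the model-implied growth factor mean is the intercept plus the slope times the covariate mean. The standard error below uses a delta-method approximation that treats the covariate mean as a fixed constant, which is an assumption of this sketch. The function name and all numbers are hypothetical.

```python
import math

def implied_mean_and_se(alpha, gamma, x_mean, var_alpha, var_gamma, cov_alpha_gamma):
    """Mean of a growth factor in a conditional model with one covariate.

    mean = alpha + gamma * x_mean
    Var(mean) = Var(alpha) + x_mean^2 * Var(gamma) + 2 * x_mean * Cov(alpha, gamma),
    with x_mean treated as known (an assumption of this sketch).
    """
    mean = alpha + gamma * x_mean
    variance = var_alpha + x_mean ** 2 * var_gamma + 2 * x_mean * cov_alpha_gamma
    return mean, math.sqrt(variance)

# Hypothetical estimates and sampling (co)variances
mean, se = implied_mean_and_se(
    alpha=2.0, gamma=0.5, x_mean=4.0,
    var_alpha=0.09, var_gamma=0.01, cov_alpha_gamma=-0.01,
)
print(round(mean, 3), round(se, 3), round(mean / se, 2))
```

The last printed value is the estimate divided by its standard error, the same kind of ratio reported for any other parameter.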
 emm plaza posted on Monday, June 22, 2009 - 1:43 pm

I have a question about minimum number of subjects:

My latent class analyses are based on two variables in two different samples with 1400 and 1000 subjects respectively. My problem is that when I assess model fit, the suggested number of classes yields some very small classes (between 7 and 26 subjects). My worry is that these classes, although the model fits and they are theoretically meaningful, might be too small. Could this be the case? Is there a minimum number of subjects that each class needs in order to be meaningful?

 Linda K. Muthen posted on Monday, June 22, 2009 - 5:31 pm
It's hard to know. I know of no minimum limits.
 Arne Floh posted on Tuesday, November 17, 2009 - 6:19 am

How can I calculate the BLRT mentioned in Nylund/Asparouhov/Muthén (2007) to determine the optimal number of classes?

It should be implemented in Mplus since the 4.1 release.


 Linda K. Muthen posted on Tuesday, November 17, 2009 - 9:19 am
TECH14 of the OUTPUT command gives the BLRT. It cannot be calculated by hand.
 PS posted on Friday, December 09, 2016 - 2:29 pm

We are writing to request help with LCA using covariates. We were able to identify the best-fitting model and use covariates to predict latent class membership using the R3STEP approach. We have saved the classes in a text file for use in SPSS to create a table describing participant demographics by class (e.g., the % of people in each class who are girls). Surprisingly, the number of participants in each class in the SPSS file differs from the number of participants in each class shown in the Mplus output file under “FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS BASED ON THEIR MOST LIKELY CLASS MEMBERSHIP.”

What could account for the different n’s in Mplus compared to the SPSS file? We think it may be uncertainty in class assignment.
If that is the case, is there a way to determine the demographics by class in Mplus instead of saving the file for SPSS?

Or is it satisfactory to report demographics across classes and acknowledge in our demographics table that the values differ due to classification uncertainty?
 Bengt O. Muthen posted on Friday, December 09, 2016 - 5:30 pm
When you say

We have saved the classes in a text file

do you mean that you use Save=cprobs and consider most likely class and that this is from the Step 1 run? If so, please send the relevant files to Support along with your license number.
 PS posted on Saturday, December 10, 2016 - 6:06 am
Thank you for your response.

I used save = Cprob.
R3step is in the auxiliary statement.

Specifically, we used the following syntax in version 7.4.

DATA:
file is data.txt;

VARIABLE:
names = id weight psu strat sex
urban age ind1
ind2 ind3 ind4;

usevariables = sex urban age
ind1 ind2 ind3 ind4;

missing are all (-99);

nominal = ind1 ind2 ind3 ind4;

classes = class (5);

auxiliary = (r3step) sex urban age;

weight = weight;

stratification = strat;
cluster = psu;

DEFINE:
psu = strat*3 + psu;
! code above suggested from to create unique
! values for PSU

ANALYSIS:
type = complex mixture;

SAVEDATA:
file is 5classsolution.txt;
save is cprob;
format is free;

For this syntax, we expected that the n's in "final class counts and proportions for the latent classes based on their most likely latent class membership" would be the same as the frequencies for the class variable in the 5classsolution.txt file. But they are not.

There will be some delay in sending the output. All our output must be approved by staff managing the dataset before it is exported from a remote desktop environment.
 Bengt O. Muthen posted on Saturday, December 10, 2016 - 9:15 am
Try not using weight.
 PS posted on Saturday, December 10, 2016 - 10:34 am
Dr. Muthen,

Thank you for your response. Dropping "weight" results in the same class sizes in the Mplus output file and the SPSS file. Yay!

Now the question is how should I modify the rest of my analyses?

Previously, I ran the syntax above 6 times (changing the number of classes from 1 to 6) and reported regression results from the 5-class solution. Should I re-run the syntax without "weight" for models 1 to 6 and also run it a 7th time with "weight" included to get the results of the regression?
 Bengt O. Muthen posted on Monday, December 12, 2016 - 10:35 am
You should keep using your weight variable. The Mplus mixture analyses use weights but the file that is printed does not (that is, a person is not repeated twice for instance). You can read in the printed file together with your weight variable and do weighted analyses in Mplus like crosstabs.
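A minimal sketch of why the two sets of counts differ, assuming the saved file has one row per person with a most likely class and that the weight variable has been merged back in. The function name and the data are hypothetical.

```python
from collections import defaultdict

def class_counts(classes, weights=None):
    """Sum the weights within each most likely class.

    With weights=None every person counts once, which is what a plain
    frequency table of the saved file gives.
    """
    if weights is None:
        weights = [1.0] * len(classes)
    totals = defaultdict(float)
    for c, w in zip(classes, weights):
        totals[c] += w
    return dict(totals)

# Hypothetical most likely classes and sampling weights for five people
classes = [1, 1, 2, 2, 2]
weights = [2.0, 1.0, 0.5, 0.5, 1.0]

unweighted = class_counts(classes)
weighted = class_counts(classes, weights)
print(unweighted)  # one count per row in the saved file
print(weighted)    # counts after applying the weights
```

The same weighted sums, taken within each class and category of a covariate such as sex, give a weighted crosstab.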
 PS posted on Tuesday, December 13, 2016 - 7:15 am
Thank you very much. This has been tremendously helpful. One last question: how would one handle a continuous variable such as age?
 Bengt O. Muthen posted on Tuesday, December 13, 2016 - 5:33 pm
I think you want to handle age just like you do sex for example.
 PS posted on Wednesday, December 14, 2016 - 6:41 am
Thank you again for your prompt response. I am not sure how to obtain the mean and standard deviation for age in each class.

The crosstabs would result in frequencies for each age category.
 Bengt O. Muthen posted on Wednesday, December 14, 2016 - 11:52 am
Compute the mean and SD for age based on each most likely class membership. In any program.
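A minimal sketch of that computation in Python, assuming age, most likely class, and (optionally) the sampling weight are available per person. The function name and data are hypothetical. The variance is divided by the sum of the weights, with no small-sample correction, which is a choice of this sketch.

```python
import math
from collections import defaultdict

def mean_sd_by_class(values, classes, weights=None):
    """Return {class: (mean, sd)} of values within each most likely class.

    Weights are optional; the variance divides by the sum of weights
    (no n - 1 correction), which is an assumption of this sketch.
    """
    if weights is None:
        weights = [1.0] * len(values)
    groups = defaultdict(list)
    for v, c, w in zip(values, classes, weights):
        groups[c].append((v, w))
    result = {}
    for c, pairs in groups.items():
        w_sum = sum(w for _, w in pairs)
        mean = sum(w * v for v, w in pairs) / w_sum
        var = sum(w * (v - mean) ** 2 for v, w in pairs) / w_sum
        result[c] = (mean, math.sqrt(var))
    return result

# Hypothetical ages, most likely classes, and weights for four people
ages = [10, 12, 14, 16]
classes = [1, 1, 2, 2]
weights = [1.0, 3.0, 1.0, 1.0]

unweighted_stats = mean_sd_by_class(ages, classes)
weighted_stats = mean_sd_by_class(ages, classes, weights)
print(unweighted_stats)
print(weighted_stats)
```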