Peter Tice posted on Tuesday, March 20, 2001 - 11:02 am
What data distribution assumptions does Mplus version 2.0 make in estimating growth mixture models? I understand that with version 1.04 the assumption is that the data used in estimating the mixture models are normally distributed. Is the same assumption made with version 2.0, or is there more flexibility in estimating growth mixture models that use non-normal data?
The data distribution assumptions have not changed between Version 1 and Version 2. The assumption for the continuous y variables is that they are normally distributed within class given x. The marginal distributions of y can be quite non-normal. The latent class indicators are categorical so assumptions of normality are not relevant.
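The point that y can be normal within class yet quite non-normal marginally can be illustrated with a small simulation. This is only a sketch in plain Python, not Mplus output; the two class means (0 and 4), the unit variances, the equal class sizes and the seed are all assumptions chosen for illustration, and x is left out for simplicity.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis (0 for a normal distribution)."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(2001)  # fixed seed so the sketch is reproducible

# Hypothetical classes: y is normal within each class.
class_a = [rng.gauss(0.0, 1.0) for _ in range(5000)]
class_b = [rng.gauss(4.0, 1.0) for _ in range(5000)]

# The marginal distribution of y pools the two classes.
pooled = class_a + class_b

kurt_a = excess_kurtosis(class_a)
kurt_b = excess_kurtosis(class_b)
kurt_pooled = excess_kurtosis(pooled)

print(f"class A excess kurtosis: {kurt_a:.2f}")
print(f"class B excess kurtosis: {kurt_b:.2f}")
print(f"pooled  excess kurtosis: {kurt_pooled:.2f}")
```

Under these assumed settings the within-class values should sit near zero, while the pooled sample is bimodal and clearly platykurtic, even though no within-class normality assumption is violated.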
Anonymous posted on Tuesday, March 27, 2001 - 1:15 pm
Can Mplus deal with unbalanced longitudinal data for mixture modeling?
With TYPE=MIXTURE MISSING, listwise deletion is applied on both time-varying and time-invariant x variables, so any subject with a missing value on one or more x variables is eliminated from the analysis.
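The deletion rule described above can be sketched in plain Python. This is not Mplus code; the record layout, the variable names (`y1`, `y2`, `x1`, `x2`) and the helper `listwise_delete` are hypothetical. The sketch also assumes that a case with missing values only on the y variables is retained, which the reply implies but does not state.

```python
def listwise_delete(records, x_names):
    """Keep only records with no missing (None) value on any x variable."""
    return [
        record
        for record in records
        if all(record.get(name) is not None for name in x_names)
    ]


# Hypothetical data: None marks a missing value.
records = [
    {"id": 1, "y1": 2.1, "y2": 2.9, "x1": 0.5, "x2": 1.0},    # complete
    {"id": 2, "y1": 1.8, "y2": None, "x1": 0.2, "x2": 0.0},   # missing y only
    {"id": 3, "y1": 2.5, "y2": 3.1, "x1": None, "x2": 1.0},   # missing x
]

kept = listwise_delete(records, ["x1", "x2"])
kept_ids = [record["id"] for record in kept]
print(kept_ids)
```

Only the subject with a missing x value is dropped; the subject with an incomplete outcome record stays in the analysis sample.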
Anonymous posted on Tuesday, April 06, 2004 - 12:02 pm
A question about data distribution assumptions in Version 3: You indicate on the web site that mixture models will now be estimable with binary, ordinal, nominal and count data as well as continuous data. What is the distribution assumed of the latent factors when working with the discrete scales? Are mixtures of normals still used at the level of the factors, or some other continuous distribution, or is a point mass approach taken where the factors have means but no variances (similar to what is in GLLAMM)? Thanks in advance for your clarification (guidance to references would also be appreciated if available).
bmuthen posted on Tuesday, April 06, 2004 - 1:07 pm
Mplus allows you to take either of the two approaches that you mention, a mixture of normals, or the non-parametric mass point approach. These two approaches are discussed in examples in chapter 7 of the Version 3 User's Guide. The User's Guide refers to work by Aitkin (1999) in Biometrics for the non-parametric approach. The following paper discusses the factor mixture approach with non-zero factor variances (see the Mplus home page for a pdf):
Lubke, G. & Muthén, B. (2003). Performance of factor mixture models. Under review, Multivariate Behavioral Research.
Anonymous posted on Tuesday, April 06, 2004 - 5:27 pm
I have a few questions about the data requirements for GMM. 1. What assumptions are made about the time structure of the data? Is it the same as conventional growth curve models where the Y variable can be unequally spaced, yet in the complete data individuals must be measured at the same time points? 2. Is it likewise assumed that time-varying covariates have the same distribution across individuals? 3. In version 3, is there still listwise deletion of data with missing values on the covariates? Thanks!
1. You can have a model where the time scores are parameters in the model or where the time scores are treated as data. In the first case, all individuals are measured at the same times. In the second case, individuals can be measured at different times.
2. No, time-varying covariates can also be measured at different times for different individuals.
3. Yes, but if you treat the x-variables as y-variables by mentioning their variances in the MODEL command, cases with missing values on them will not be deleted. The only difference is that the model is then not estimated conditional on the x's; instead, they are part of the model.
Ted Fong posted on Tuesday, March 18, 2014 - 3:02 am
Dear Dr. Muthen,
I have performed a factor mixture analysis on 9 continuous items and found that a two-class (high vs. low), two-factor mixture model fits the data.
However, a reviewer questions the validity of the two classes, as they may arise from violations of assumptions such as heavy skewness and kurtosis. He wants to know how badly behaved the data are, which may in turn give rise to the mixture components.
I have checked that the skewness and kurtosis of the 9 variables are slightly negative (-1 to 0). 6 of the 9 variables appear to have a bimodal distribution. I have also requested TECH13, whose tests of multivariate skewness and kurtosis reject the two-class, two-factor model.
I understand that the overall distribution of variables can be nonnormal in mixture analysis. I am not sure whether my data behave well or badly in the reviewer's view. Overall, he seems to question the use of mixture analysis for finding distinct, meaningful subgroups versus its alternative, indirect use for approximating some unknown distribution.
I have read the papers written by Bauer and your 2003 paper on substantive and statistical checking of mixture models. But I don't know how to proceed or how to respond to the reviewer.
Do you have any views on this matter? I would be grateful for any comments.
You can use BIC to argue on statistical grounds that a two-class factor mixture model is better than a regular 1-class factor model (or a two-class LCA).
You can use substantive theory to argue on subject-matter grounds that your two classes are useful and of substantive interest by relating the class membership to other variables as I argued in my 2003 response. You can use 3-step methods for that: R3STEP to look at antecedents (predictors of the latent class variable) and DCAT/DCON to look at consequences (distal outcomes; predictive validity).
I think you want to make both arguments in a mixture analysis.
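The statistical half of that argument, the BIC comparison, can be sketched as follows. The formula is the standard BIC = -2 logL + p ln(n), with lower values preferred; the log-likelihoods, parameter counts and sample size below are made-up placeholders, not results from the analysis discussed in this thread.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2*logL + p*ln(n). Lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical values for illustration only.
one_class = bic(log_likelihood=-5230.4, n_params=28, n_obs=600)
two_class = bic(log_likelihood=-5150.9, n_params=38, n_obs=600)

preferred = "two-class" if two_class < one_class else "one-class"
print(f"1-class BIC: {one_class:.1f}")
print(f"2-class BIC: {two_class:.1f}")
print(f"preferred by BIC: {preferred}")
```

In practice the log-likelihood, the number of free parameters and the sample size would be read from the output of each fitted model. The two-class model has to improve the log-likelihood by enough to offset the penalty for its extra parameters.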