GMM

Mplus Discussion > Latent Variable Mixture Modeling >

socrates posted on Thursday, July 13, 2006 - 4:01 am

I have two questions concerning GMM:

a) Is there a method available to check for multivariate normal distribution of observed outcome variables?

b) Is it possible to do confirmatory analyses with GMM? That is, is it possible to implement theoretically pre-defined trajectories into a GMM (by fixing the mean growth parameters of the latent classes to certain values) and then to check if the model is valid? If yes, which criteria would you recommend?

Thanks & best regards!

Bengt O. Muthen posted on Friday, July 14, 2006 - 4:46 pm

a) Not a direct method. But if you conclude that 2 or more classes fit better than one, you have also concluded non-normality.

b) Certainly. To test the model, you would check whether the restrictions are suitable, either by 2*logL differences between the restricted and unrestricted models or by a Wald test (MODEL TEST).
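
A minimal sketch of the 2*logL difference test mentioned in (b), assuming you have read the loglikelihoods and the number of free parameters off the two outputs. The function names and the example numbers are hypothetical. The optional scaling correction factors follow the loglikelihood difference-testing formula for MLR as I understand it from the Mplus website; with the defaults of 1.0 it reduces to the plain likelihood-ratio test.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def lr_difference_test(ll_restricted, ll_free, p_restricted, p_free,
                       c_restricted=1.0, c_free=1.0):
    """Return (statistic, df, p-value) for a restricted vs. a less restricted model.

    c_restricted and c_free are the scaling correction factors printed with
    MLR; leave them at 1.0 for ordinary ML.
    """
    df = p_free - p_restricted
    if df <= 0:
        raise ValueError("the restricted model must have fewer free parameters")
    cd = (p_restricted * c_restricted - p_free * c_free) / (p_restricted - p_free)
    stat = -2.0 * (ll_restricted - ll_free) / cd
    return stat, df, chi2_sf(stat, df)


# Hypothetical values: growth factor means fixed (restricted) vs. freely estimated
stat, df, p = lr_difference_test(-1000.0, -995.0, p_restricted=10, p_free=12)
print(f"chi-square difference = {stat:.3f}, df = {df}, p = {p:.4f}")
```

Note that this applies to restrictions within a model with a fixed number of classes; it is not a test for the number of classes itself.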

socrates posted on Wednesday, July 26, 2006 - 12:31 am

Dear Dr. Muthén

Many thanks! To be sure, I would like to ask if the following procedure is sound: I started with a GMM with over 20 theoretically pre-defined latent classes (the growth parameter means were fixed at certain values in this model). By running this model, the individuals were assigned to these patterns on the basis of their repeated measurements. Subsequently, I excluded the empty latent classes and then ran the GMM again. This resulted in a lower BIC, indicating better model fit. However, is this procedure sound, and how can I convince reviewers of it? I ask because, as far as I know, with this procedure the BIC and other fit criteria are always worse than with exploratory GMM.
Perhaps I should also say why I have chosen this procedure: When doing GMMing in an exploratory way, I got the impression that the fit criteria, especially the Lo-Mendell-Rubin criterion, are quite conservative (i.e. they give great attention to parsimony of models). However, I am not only interested in a parsimonious model, but also in the detection of a variety of patterns.

Bengt O. Muthen posted on Wednesday, July 26, 2006 - 3:57 pm

I think it is unusual that one has enough theoretical information to pre-define 20 latent classes by fixing growth parameter means at certain values. I can imagine situations where fixing a few values such as a quadratic mean at zero to presuppose linearity could be given theoretical backing, but what you are referring to is on a much grander scale. Although I don't know your topic/area, and you may have very good reasons for doing what you suggest, as a reviewer I myself would a priori be hard to convince.

Peter Elliott posted on Wednesday, July 07, 2010 - 3:48 am

Dear Dr Muthen,
How do I compare the model fits of GMMs - same sample, just different numbers of classes - when using Estimator = Bayes in Mplus v6?
If the DVs are continuous should I use Point='mean' rather than 'median'?
Thanks. p

Tihomir Asparouhov posted on Wednesday, July 07, 2010 - 11:09 am

You can use PPP. See Section 6.5.3 in
http://statmodel.com/download/BayesAdvantages6.pdf

As far as I know, there is no reason to prefer point=mean just because the DVs are continuous.

Peter Elliott posted on Monday, July 26, 2010 - 8:34 pm

Thank you for your advice. Sect 6.5.3 refers to a simulation study with large n. Unfortunately my n=235. PPP results are: 1 class = .339, 2 classes = .000, 3 classes = .000. When using ML, AIC and BIC suggested that a 3-class model is best. If PPP is not useful, what can I use to compare the fit of GMMs estimated using Bayes?

peter
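
For reference, a minimal sketch of how AIC and BIC are computed from an ML loglikelihood, so that models with different numbers of classes can be lined up side by side. Only n = 235 comes from the post; the loglikelihoods and parameter counts below are hypothetical placeholders, not results.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion; lower is better."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Hypothetical (loglikelihood, free parameters) per number of classes
fits = {1: (-1250.0, 8), 2: (-1230.0, 11), 3: (-1215.0, 14)}
for k, (ll, p) in fits.items():
    print(f"{k} class(es): AIC = {aic(ll, p):.1f}, BIC = {bic(ll, p, 235):.1f}")
```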

Bengt O. Muthen posted on Tuesday, July 27, 2010 - 9:23 am

That's an unusual progression of PP p-values. Can you send your data and input/output to support@statmodel.com?

Tihomir Asparouhov posted on Wednesday, July 28, 2010 - 11:20 am

The PPP addresses a more general question of fit than simply the number of classes, so you have to interpret it accordingly. In your example the PPP does not have the power to reject the 1-class model; we can more or less assume that in your case that power is very low. So the fact that the 1-class model was not rejected doesn't really mean that there is one class; the PPP simply didn't have the power to show that there is more than one. One has to remember what is being compared: the 1-class growth model vs. the 1-class unrestricted model. You have a linear growth model with 3 time points, so the DF here is 1, and it is not that surprising that the PPP did not reject the 1-class model. The low power has more to do with this small DF than with the sample size: the two models differ in only one parameter.
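
The DF of 1 can be checked by counting parameters. This is a sketch under assumed defaults that the post does not spell out: the linear growth model has free intercept and slope means, free intercept and slope variances with their covariance, and one free residual variance per time point. The function names are hypothetical.

```python
def unrestricted_parameter_count(n_timepoints):
    """Means plus all variances and covariances of the observed outcomes."""
    t = n_timepoints
    return t + t * (t + 1) // 2


def linear_growth_parameter_count(n_timepoints):
    """2 growth factor means, 2 variances, 1 covariance, t residual variances."""
    return 2 + 2 + 1 + n_timepoints


def growth_model_df(n_timepoints):
    """Degrees of freedom of the 1-class linear growth model vs. unrestricted."""
    return (unrestricted_parameter_count(n_timepoints)
            - linear_growth_parameter_count(n_timepoints))


print(growth_model_df(3))  # 9 unrestricted - 8 growth parameters
```

With more time points the DF grows quickly (for example, 4 time points give 14 - 9 = 5 under the same assumptions), which is why the 3-wave case offers so little to test.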

The fact that the 2-class model is rejected has more to do with how well the estimated model fits the within-class variance-covariance structure than with determining the number of classes. In particular, it is mostly testing the class invariance of the variance parameters and of the residual variances. If you let the residual variances vary across the 2 classes, then the model is not rejected.

The PPP can be used to determine the number of classes as we did in the paper but it can also be pointing towards within class misspecifications.
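
As a reminder of what the PPP measures, here is a minimal sketch of the generic posterior predictive p-value: the proportion of MCMC draws in which the discrepancy for replicated data exceeds the discrepancy for the observed data. The function name and the input lists are hypothetical; Mplus computes its own chi-square-based discrepancy internally, and this does not reproduce that calculation.

```python
def posterior_predictive_p(observed_discrepancies, replicated_discrepancies):
    """Share of draws where the replicated discrepancy exceeds the observed one.

    Each list holds one value per MCMC draw, evaluated at that draw's
    parameter values. Values near 0 indicate poor fit.
    """
    pairs = list(zip(observed_discrepancies, replicated_discrepancies))
    if not pairs:
        raise ValueError("need at least one MCMC draw")
    exceed = sum(1 for obs, rep in pairs if rep > obs)
    return exceed / len(pairs)


# Hypothetical discrepancy values from four draws
print(posterior_predictive_p([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]))
```

A PPP of .000, as reported for the 2- and 3-class models, would mean the observed-data discrepancy was larger than the replicated one in every draw.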

Each tool for determining the number of classes has its own power for rejecting models with a smaller number of classes. We don't know how the PPP power compares to, say, BIC power in general, but we do know that in some cases that power is 0 (for example, a univariate mixture model) or very small, as in your case.

The question of how many classes there are is not tied to the estimator (ML and Bayes produce almost exactly the same model estimates), so one can use ML and Bayes together in figuring out the number of classes, as well as the within-class models, using all the tools that are available regardless of the estimator.