
socrates posted on Thursday, July 13, 2006  4:01 am



I have two questions concerning GMM: a) Is there a method available to check for multivariate normal distribution of observed outcome variables? b) Is it possible to do confirmatory analyses with GMM? That is, is it possible to implement theoretically predefined trajectories into a GMM (by fixing the mean growth parameters of the latent classes to certain values) and then to check if the model is valid? If yes, which criteria would you recommend? Thanks & best regards! 


a) There is no direct method. But if you conclude that 2 or more classes fit better than one, you have also concluded nonnormality. b) Certainly. To test the model, you would check whether the restrictions are suitable, either by 2*logL differences (a likelihood-ratio difference test) or by a Wald test (MODEL TEST).
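As a rough illustration of the 2*logL difference mentioned above, here is a minimal Python sketch that compares a restricted model (growth parameter means fixed) with an unrestricted one. The log-likelihoods and parameter counts are made-up placeholders, not output from any real analysis, and the function names are hypothetical. It assumes the restricted model is nested in the unrestricted one with the same number of classes and that an ordinary chi-square reference distribution applies; any scaling correction that a robust estimator may require is ignored.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))     # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def lr_difference_test(logl_restricted, logl_unrestricted,
                       n_par_restricted, n_par_unrestricted):
    """Return (statistic, df, p-value) for a 2*logL difference test."""
    stat = 2.0 * (logl_unrestricted - logl_restricted)
    df = n_par_unrestricted - n_par_restricted
    if df <= 0:
        raise ValueError("the unrestricted model must have more free parameters")
    return stat, df, chi2_sf(stat, df)


# Hypothetical values, for illustration only.
stat, df, p = lr_difference_test(-1512.4, -1508.1, 10, 14)
print(f"2*logL difference = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value would suggest that the fixed values are too restrictive for the data.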

socrates posted on Wednesday, July 26, 2006  12:31 am



Dear Dr. Muthén, many thanks! To be sure, I would like to ask whether the following procedure is sound: I started with a GMM with over 20 theoretically predefined latent classes (the growth parameter means were fixed at certain values in this model). By running this model, the individuals were assigned to these patterns on the basis of their repeated measurements. Subsequently, I excluded the empty latent classes and then ran the GMM again. This resulted in a lower BIC, indicating better model fit. However, is this procedure sound, and how can I convince reviewers of it? I ask because, as far as I know, with this procedure the BIC and other fit criteria are always worse than with exploratory GMM. Perhaps I should also say why I have chosen this procedure: when doing GMM in an exploratory way, I got the impression that the fit criteria, especially the Lo-Mendell-Rubin criterion, are quite conservative (i.e., they give great weight to the parsimony of models). However, I am interested not only in a parsimonious model but also in detecting a variety of patterns.


I think it is unusual that one has enough theoretical information to predefine 20 latent classes by fixing growth parameter means at certain values. I can imagine situations where fixing a few values such as a quadratic mean at zero to presuppose linearity could be given theoretical backing, but what you are referring to is on a much grander scale. Although I don't know your topic/area, and you may have very good reasons for doing what you suggest, as a reviewer I myself would a priori be hard to convince. 


Dear Dr Muthen, how do I compare the model fit of GMMs (same sample, just different numbers of classes) when using Estimator = Bayes in Mplus v6? If the DVs are continuous, should I use Point='mean' rather than 'median'? Thanks. p


You can use PPP. See Section 6.5.3 in http://statmodel.com/download/BayesAdvantages6.pdf. As far as I know, there is no reason to prefer point=mean just because the DVs are continuous.


Thank you for your advice. Section 6.5.3 refers to a simulation study with large n. Unfortunately my n=235. PPP results are: 1 class=.339, 2 classes=.000, 3 classes=.000. When using ML, AIC and BIC suggested that a 3-class model is best. If PPP is not useful, what can I use to compare the fit of GMMs estimated using Bayes? peter
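For reference, the AIC and BIC mentioned above are simple functions of the maximized log-likelihood, the number of free parameters and (for BIC) the sample size. The Python sketch below uses n = 235 from the post, but the log-likelihoods and parameter counts are invented placeholders, not the poster's results, and the function names are hypothetical.

```python
import math


def aic(logl, n_par):
    """Akaike information criterion: -2*logL + 2*p."""
    return -2.0 * logl + 2.0 * n_par


def bic(logl, n_par, n):
    """Bayesian information criterion: -2*logL + p*ln(n)."""
    return -2.0 * logl + n_par * math.log(n)


N = 235  # sample size stated in the post

# classes -> (log-likelihood, free parameters); placeholder values only.
candidates = {1: (-1540.2, 8), 2: (-1521.7, 11), 3: (-1509.9, 14)}

for k, (logl, n_par) in candidates.items():
    print(f"{k} class(es): AIC = {aic(logl, n_par):.1f}, "
          f"BIC = {bic(logl, n_par, N):.1f}")

# Lower values indicate a better trade-off between fit and parsimony.
best_bic = min(candidates, key=lambda k: bic(candidates[k][0], candidates[k][1], N))
print("Lowest BIC:", best_bic, "class(es)")
```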


That's an unusual progression of PP p-values. Can you send your data and input/output to support@statmodel.com?


The PPP addresses a more general question of fit than simply the number of classes, so it has to be interpreted accordingly. In your example the PPP does not reject the 1-class model, but its power to do so is most likely very low here. The fact that the 1-class model was not rejected therefore doesn't mean that there is one class; the PPP simply didn't have the power to show that there is more than one. One has to remember what is being compared: a 1-class growth model vs. a 1-class unrestricted model. You have a linear growth model with 3 time points, so the DF is 1, and it is not that surprising that the PPP did not reject the 1-class model. The low power has more to do with this small DF than with the sample size: the two models differ in only one parameter. The fact that the 2-class model is rejected has more to do with how well the estimated model fits the within-class variances and covariances than with the number of classes. In particular, it is mostly testing the class invariance of the variance parameters and of the residual variances. If you let the residual variances vary across the 2 classes, then the model is not rejected. The PPP can be used to determine the number of classes, as we did in the paper, but it can also point to within-class misspecifications. Each tool for determining the number of classes has its own power for rejecting models with too few classes. We don't know how the power of the PPP compares to, say, that of BIC in general, but we do know that in some cases it is 0 (for example, in a univariate mixture model) or very small, as in your case.
The question of how many classes there are is not tied to the estimator (ML and Bayes produce almost exactly the same model estimates), so one can use ML and Bayes together to figure out the number of classes as well as the within-class models, using all the tools that are available regardless of the estimator.
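The DF of 1 mentioned above can be checked by counting parameters. The Python sketch below assumes a single class, continuous outcomes, and a linear growth model with free intercept and slope means, free growth factor variances and covariance, and a free residual variance at each time point. This parameterization is an assumption, since the thread does not show the actual model, and the function names are hypothetical.

```python
def unrestricted_params(t):
    """Free parameters of an unrestricted mean and covariance model for t outcomes."""
    means = t
    variances_and_covariances = t * (t + 1) // 2
    return means + variances_and_covariances


def linear_growth_params(t):
    """Free parameters of the assumed one-class linear growth model."""
    growth_factor_means = 2        # intercept and slope means
    growth_factor_covariance = 3   # two variances and one covariance
    residual_variances = t         # one per time point
    return growth_factor_means + growth_factor_covariance + residual_variances


def growth_model_df(t):
    """Degrees of freedom: unrestricted minus growth model parameters."""
    return unrestricted_params(t) - linear_growth_params(t)


print(growth_model_df(3))
```

With three time points the unrestricted model has 9 parameters and the growth model 8, which leaves a single degree of freedom.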
