My sample consists of 1142 subjects, and I am trying to decide on the optimal number of classes for my growth mixture model (GMM) with linear and quadratic effects. I've read the paper by Nylund et al. (2007) and decided to focus mainly on the BLRT and the BIC for class enumeration.
To choose the number of classes, I started by fitting models without any random effects and with an increasing number of classes. I stopped at 6 classes because the 6-class model produced a class containing less than 1% of the sample. However, the BIC was still decreasing and the BLRT remained significant each time I added a class. I then tried to base my decision on more complex models, fitting models with random effects, and even class-specific random effects, again with an increasing number of classes. Unfortunately, the problem of the ever-decreasing BIC and always-significant BLRT remained, so I do not know how to make a final decision on the number of classes.
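The bookkeeping behind this enumeration can be sketched in a few lines of Python. This is only a minimal illustration, not output from my models: it assumes the log-likelihood, the number of free parameters, and the estimated class proportions have been copied by hand from each model's output. The names `Fit`, `bic`, and `summarize` are hypothetical, and every number in the example list is a placeholder. The BIC is computed as -2 logL + p ln(n) with n = 1142, and a model is flagged when its smallest class falls below 1%.

```python
import math
from dataclasses import dataclass

N_SUBJECTS = 1142  # sample size stated in the post


@dataclass
class Fit:
    """Summary of one fitted model (all values entered by hand)."""
    n_classes: int
    loglik: float        # maximized log-likelihood
    n_params: int        # number of free parameters
    class_props: list    # estimated class proportions, summing to 1


def bic(loglik, n_params, n):
    """BIC = -2 * logL + p * ln(n); smaller is better."""
    return -2.0 * loglik + n_params * math.log(n)


def summarize(fits, n=N_SUBJECTS, min_prop=0.01):
    """Return one row per model, ordered by number of classes."""
    rows = []
    for f in sorted(fits, key=lambda f: f.n_classes):
        smallest = min(f.class_props)
        rows.append({
            "n_classes": f.n_classes,
            "bic": bic(f.loglik, f.n_params, n),
            "smallest_class": smallest,
            "too_small": smallest < min_prop,  # the <1% rule used above
        })
    return rows


# Placeholder values for illustration only; these are NOT real results.
fits = [
    Fit(1, -9000.0, 8, [1.0]),
    Fit(2, -8800.0, 12, [0.7, 0.3]),
    Fit(3, -8700.0, 16, [0.6, 0.395, 0.005]),
]

for row in summarize(fits):
    print(row)
```

With a table like this, the pattern described above shows up as a BIC column that keeps decreasing while the `too_small` flag eventually turns true.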
Do you have any advice on how to proceed? Or does this mean that my data simply cannot be summarized into classes in a good way?
I altered the residual variance structure. Although model fit improved, the class enumeration problem was not solved. The BIC still goes down whenever I add classes, and the BLRT always remains significant. I tried several models, with and without random effects, but the indices remain inconclusive.
I do not know how to interpret this. Do you have any other ideas?
Also, if (hypothetically) a 4-class model fits best but the 5-class model does not converge, then how will I ever know whether the 4-class model is really the best?