We are doing multilevel mixture modelling (MMM) with data in which the sample size is 1668 and the number of clusters is 93. We also have 82 cases that were tested individually, so their cluster number is unknown. These cases are valuable and we need to classify them too. We have tried two ways to solve this.
A 5-class MMM solution was found to fit best to the data of the 1668 children with cluster information available.
In the first approach, to calculate posterior probabilities for the additional single cases (whose class number is unknown), we fixed all the parameters at the values we found previously in the multilevel model without the 82 additional single cases and ran the model again. Each single case was given its own individual cluster number.
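To make the first approach concrete, here is a minimal sketch in Python (not Mplus) of what "posterior probabilities under fixed parameters" amounts to. It assumes a simplified single-level mixture with independent normal indicators within each class, which ignores the between-level part of the actual MMM. The function name and all parameter values are hypothetical and made up for illustration; they are not the estimates from the model described here.

```python
import math

def posterior_class_probs(y, weights, means, variances):
    """Posterior class probabilities for one case, with all parameters held fixed.

    Simplifying assumption: indicators are independent normals within class.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # log class weight plus the log-density of the case in that class
        ll = math.log(w)
        for yj, mj, vj in zip(y, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vj) + (yj - mj) ** 2 / vj)
        log_terms.append(ll)
    # subtract the maximum before exponentiating for numerical stability
    m = max(log_terms)
    unnorm = [math.exp(t - m) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical parameters for a 2-class, 2-indicator model (illustration only)
weights = [0.6, 0.4]
means = [[0.0, 0.0], [2.0, 2.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

probs = posterior_class_probs([1.9, 2.1], weights, means, variances)
```

Because nothing is re-estimated, the new cases cannot shift the class definitions; they are simply assigned to the classes already found for the 1668.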
The other approach is to run the multilevel analysis on the whole data set (1668+82). Each single case was again given its own individual cluster number.
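The data preparation step for the combined file might look like the following sketch. It assumes cluster numbers are integers and that the individually tested cases have their cluster coded as None; the function name and the toy input are hypothetical.

```python
def assign_singleton_clusters(cluster_ids):
    """Give every case with a missing cluster ID (None) its own new, unused ID."""
    known = [c for c in cluster_ids if c is not None]
    next_id = max(known, default=0) + 1  # start above the largest existing ID
    result = []
    for c in cluster_ids:
        if c is None:
            result.append(next_id)
            next_id += 1
        else:
            result.append(c)
    return result

# Toy example: four clustered cases followed by two individually tested cases
combined = assign_singleton_clusters([1, 1, 2, 93, None, None])
print(combined)  # [1, 1, 2, 93, 94, 95]
```

Each individually tested case then forms a cluster of size one, so it contributes no information about within-cluster dependence.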
In both cases we used STARTS 500 20.
These two approaches, however, produced very different solutions. We are now wondering which is the proper way to do the MMM analysis in this case.
Greetings from Jyvaskyla, Minna Torppa & Asko Tolvanen
Perhaps your example is like User's Guide ex 10.4, although perhaps not longitudinal? Or perhaps it uses the features in 4.2 with a between-level latent class variable "cb".
In any case, I would suggest your second approach (1668+82) as more appropriate, given that we don't know that the solution for 1668+82 would be the same as for 1668 alone. Nevertheless, it is a bit surprising that the two approaches would give very different results, given that 82 is a small portion of 1668 - assuming the 82 are similar to the rest. Still, even this second approach is an approximation, because the 82 may belong to (some of) the 93 clusters and therefore not be independent of the other observations in those clusters.