

I am using LCA to examine whether poor fit for a particular measurement model can be explained by heterogeneity in our sample (i.e., the measurement model is not invariant). Our observed variables are ordinal. To do so, I specified the model based on Example 7.27 in the user manual. The results provide support for multiple classes that account for local independence. For example:

Model      BIC           LR χ2        df       p-value
1-class    116786.604    14781.249    279355   1
2-class    114471.559    13950.463    279357   1
3-class    113637.632    12914.465    279329   1
4-class    113334.263    12678.593    279329   1
5-class    waiting for the results

Although I can use the above information to make comparisons between the nested models, I cannot figure out how to evaluate overall model fit based on the available information. In a CFA model I would look at the likelihood-ratio chi-square to start, but the LR chi-square in the LCA models has an extremely large number of degrees of freedom, and I do not know how to interpret this. I have pasted the model syntax for the two-class model below. Could someone provide me with some direction on how to assess overall model fit? Thank you for taking a look at this!

ANALYSIS:
  TYPE = MIXTURE;
  ALGORITHM = INTEGRATION;
  STARTS = 100 10;
VARIABLE:
  NAMES ARE y1-y40;
  USEVARIABLES ARE y1-y7;
  CATEGORICAL ARE y1-y7;
  CLASSES = C(2);
MODEL:
  %OVERALL%
  f BY y1-y7;
  [f@0];
  %C#1%
  f BY y1@1 y2-y7;
  f;
  [y1$1-y7$5];
  %C#2%
  f BY y1@1 y2-y7;
  f;
  [y1$1-y7$5];
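For readers following along, the nested-model comparison mentioned above can be made concrete with a few lines of arithmetic. This is only an illustrative sketch in Python (not Mplus output); the BIC values are copied from the table in the post, and the variable names are made up for the example.

```python
# BIC values copied from the table in the post (lower BIC is preferred).
bic = {1: 116786.604, 2: 114471.559, 3: 113637.632, 4: 113334.263}

# Drop in BIC when moving from k-1 classes to k classes.
bic_drop = {
    k: round(bic[k - 1] - bic[k], 3)
    for k in sorted(bic)
    if (k - 1) in bic
}

for k, drop in bic_drop.items():
    print(f"{k - 1} -> {k} classes: BIC falls by {drop:.3f}")

# Number of classes with the lowest BIC among the models estimated so far.
best_k = min(bic, key=bic.get)
print("Lowest BIC so far:", best_k, "classes")
```

The drops shrink with each added class, which is the kind of relative comparison BIC supports. It says nothing about absolute fit, which is the question being asked.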


You have 40 categorical variables. This means that the fit of the model to the data is very difficult to assess. The unrestricted model would be the multinomial with as many probability parameters (minus 1) as there are cells in the 40-way frequency table. The chi-squares you see (Pearson and LRT) refer to that frequency table. With 40 variables, you have many empty cells, and therefore the two chi-square values are not dependable; you can see that from the Pearson and LRT versions disagreeing greatly. One way to get some appreciation of fit is to request TECH10, which gives standardized residuals for bivariate tables, but there are a great many bivariate tables for 40 categorical variables. In other words, you probably have to settle for seeing that your likelihood and BIC improve greatly when you move from the conventional single-class factor analysis (where you also have no test of fit to the observed data, given 40 categorical outcomes) to using more than one class.


Thank you very much for your prompt reply. I did indeed receive the following message in the output, which is consistent with your note of caution regarding interpreting fit: "** Of the 279936 cells in the latent class indicator table, 493 were deleted in the calculation of chi-square due to extreme values". In addition to examining the standardized residuals, would it also be defensible to follow the latent class analysis up with CFAs for each of the latent classes? I realize that would not allow for statistical cross-model comparisons of fit, but it would provide a picture of the degree to which the measurement model fits within each of the latent classes. Thanks again! PS: my syntax was a bit misleading in that I actually only use 7 of the 40 variables in this particular measurement model.
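As a side note, the cell count in that message can be reproduced by hand, which also shows how many bivariate tables TECH10 would have to cover. The sketch below is plain Python arithmetic, not Mplus output. It assumes each of the 7 items has six ordered categories, which is inferred from the five thresholds ($1 to $5) in the posted syntax rather than stated anywhere in the thread.

```python
from math import comb

n_items = 7        # y1-y7, per the USEVARIABLES line in the posted syntax
n_categories = 6   # assumption: five thresholds ($1-$5) imply six categories

# Cells in the full cross-classification of all items.
cells = n_categories ** n_items

# Number of item pairs, i.e. bivariate tables that would need inspecting.
pairs_7 = comb(n_items, 2)
pairs_40 = comb(40, 2)   # for comparison, if all 40 variables were used

print("cells in the 7-way table:", cells)
print("bivariate tables, 7 items:", pairs_7)
print("bivariate tables, 40 items:", pairs_40)
```

Under that assumption the count matches the 279936 cells reported in the message, and with only 7 items the number of bivariate tables is small enough to inspect directly.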


I am not sure to what degree the fit of CFAs on those classified into the different classes would actually reflect the overall fit of the model. That would seem to be a topic for a simulation study. One reason I think it would not work well is that if the measurement model fits poorly in a class, the class formation that was used would itself be incorrect, so it is like trying to pull yourself up by your bootstraps.


Hello, I am using finite mixture path models and find that measurement invariance across the segments does not hold up. But, given the data, I do expect that. I am a little confused about the nature of inferences I can make given the results. Thanks in advance for your help! Raji 


In our experience, it is most often the case that there is not measurement invariance for the latent variables in the model, particularly for the intercepts. Although this means that you cannot meaningfully compare the means, variances, and covariances of the latent variables across classes, there are still interesting findings due to the fact that the latent variables have different meanings in different classes.
