Jon Heron posted on Thursday, March 25, 2010  7:32 am



Dear Mplussians, I've read quite a few latent class model papers recently and I don't recall any of them mentioning TECH10/bivariate residuals/conditional independence. In my (somewhat limited) experience it's much more difficult to keep one's residuals under control with a cross-sectional mixture model (compared with a longitudinal model), as the interdependence between the items tends to be more complex. It's not unheard of to need 2 or 3 more classes than would be supported by entropy/BIC/BLRT alone. So how important is this? Should I just concentrate on the more familiar assessments of fit and regard low residuals as desirable but not strictly necessary? BTW, I recognise this isn't strictly Mplus-related (although I did mention TECH10). Many thanks, Jon 


We agree with you. We would use TECH10 sparingly and look only at the largest residuals. Rather than adding more classes, we might add residual covariances using the f BY language shown in Example 7.16, although each residual covariance adds one dimension of integration. 

Jon Heron posted on Friday, March 26, 2010  2:12 am



Hi Linda, thanks for the suggestion. I'll check out Qu, Tan, and Kutner (1996). Best wishes, Jon 

Jon Heron posted on Thursday, April 15, 2010  5:09 am



Hi Linda, I recognize that Example 7.16 is merely there to demonstrate a principle; however, I am wondering whether one might have been able to infer that conditional independence was being violated IN ONE CLASS ONLY from some of the Mplus output. As the TECH10 output is not class-specific, I wonder whether the solution to conditional DEpendence is to start with a factor in only one class and then to add it to additional classes if the problem does not go away. 

Jon Heron posted on Thursday, April 15, 2010  5:17 am



I think I've answered this myself; stepping away from the machine for a minute often works wonders. Conditional dependence will show up as one or more high residuals for specific response patterns. Provided those patterns are allocated to the same class with a high class-assignment probability, you know that the CD problem will be in one class but not the other. [possibly premature smiley] 
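The check described here can be sketched numerically: in a fitted 2-class LCA, the posterior class-membership probability of any response pattern follows from Bayes' rule, so one can see directly whether a high-residual pattern is allocated to one class with high probability. A minimal Python sketch, with made-up item-response probabilities that stand in for a fitted model (all parameter values are illustrative, not from any real analysis):

```python
import numpy as np

# Hypothetical parameters of a fitted 2-class LCA with 4 binary items
# (rows = classes, cols = item endorsement probabilities). Illustrative
# values only, not taken from any real model.
p = np.array([[0.9, 0.8, 0.7, 0.9],   # class 1
              [0.2, 0.3, 0.1, 0.2]])  # class 2
class_probs = np.array([0.6, 0.4])    # class proportions

def posterior(pattern):
    """Posterior class-membership probabilities for a response pattern."""
    # Likelihood of the pattern within each class, under conditional
    # independence of items given class.
    lik = np.prod(np.where(pattern, p, 1 - p), axis=1)
    joint = class_probs * lik
    return joint / joint.sum()

# For a pattern flagged by TECH10 as having a high residual, check which
# class it is allocated to and with what probability.
print(posterior(np.array([1, 1, 1, 1])))
```

With these illustrative numbers the all-endorsed pattern is assigned to class 1 almost deterministically, which is the situation described above: the conditional dependence problem can then be localised to that class.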


I believe you have reason to smile; I think this is on the right track. The RESPONSE option (see the User's Guide) might be useful here, as it gives the most likely class membership for each response pattern. The idea of relaxing conditional independence in only one class was used in the Qu, Tan, and Kutner (1996) Biometrics article on a dentistry application. 

Jon Heron posted on Friday, April 16, 2010  12:42 am



Thanks Bengt. Qu, Tan, and Kutner discuss examples in which they feel they can argue for adding direct effects, e.g. distinctive characteristics of slides in a laboratory. I am currently struggling to justify introducing extra terms purely to improve fit, without such non-statistical justification; that has never sat well with me. 

Jon Heron posted on Friday, April 10, 2015  2:00 am



Hi Bengt/Linda, I see it's been five years since I pondered the utility of Tech10. In addition to using Tech10 to search for high standardized residuals, we've been wondering intermittently about pooling the individual Pearson values. It seems to me that if conditional independence were to hold and one were to convert the bivariate Pearson chi-squares to p-values, these p-values should be uniformly distributed. One might then use something like a Kolmogorov-Smirnov test for uniformity. To test this I simulated a 2-class mixture with 20 binary indicators. The resulting distribution of the 190 p-values was strongly skewed towards p = 1.0. There was also no p-value < 0.05, whereas I would have expected a few. Is there an obvious error in my logic? Many thanks, Jon 
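An experiment along these lines can be sketched in Python (parameter values, sample size, and seed are arbitrary choices, not those used above): simulate a 2-class mixture of 20 binary items satisfying conditional independence, compute the 190 bivariate Pearson chi-squares against the model-implied two-way tables, naively convert them to p-values with df = (q-1)^2 = 1, and test the collection for uniformity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulate a 2-class mixture of 20 binary items that satisfies
# conditional independence (illustrative parameter values).
n, k = 5000, 20
pi = np.array([0.5, 0.5])           # class proportions
p = np.array([np.full(k, 0.8),      # item endorsement probs, class 1
              np.full(k, 0.2)])     # item endorsement probs, class 2
cls = rng.choice(2, size=n, p=pi)
y = (rng.random((n, k)) < p[cls]).astype(int)

def bivariate_pearson(i, j):
    """Pearson chi-square comparing the observed 2x2 table for items
    (i, j) with the table implied by the true mixture parameters."""
    stat = 0.0
    for a in (0, 1):
        for b in (0, 1):
            obs = np.sum((y[:, i] == a) & (y[:, j] == b))
            pa = np.where(a, p[:, i], 1 - p[:, i])
            pb = np.where(b, p[:, j], 1 - p[:, j])
            exp = n * np.sum(pi * pa * pb)
            stat += (obs - exp) ** 2 / exp
    return stat

# All 190 pairwise statistics, naively converted to p-values with
# df = (q-1)^2 = 1, then tested for uniformity.
stats_ = [bivariate_pearson(i, j) for i in range(k) for j in range(i + 1, k)]
pvals = stats.chi2.sf(stats_, df=1)
print(len(pvals), stats.kstest(pvals, "uniform").pvalue)
```

The reply below points to why uniformity need not hold even here: the 190 statistics are computed from the same sample and so are not independent draws, and the reference df for a bivariate submodel is not well defined.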


The 190 values are not independent draws, so I think you might have to draw multiple samples (not just one data sample) to determine that distribution. However, I wouldn't expect it to be uniform. We don't actually compute p-values in TECH10 or claim degrees of freedom, because the number of parameters is not a clear concept when you look at a bivariate submodel. 

Jon Heron posted on Monday, April 13, 2015  12:17 am



Hmm, thanks Tihomir, those two short comments pretty much invalidate everything we have done/considered, namely: 
(1) Compare the global Pearson statistic with Chi-square(df = 190*(q-1)^2, alpha = 0.05), where q = #categories for each variable. 
(2) Assess how many individual Pearson values exceed Chi-square(df = (q-1)^2, alpha = 0.05). 
(3) Determine the number of high standardized residuals. 
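For concreteness, the critical values used in checks (1) and (2) are easy to compute; a short Python sketch, assuming the 20-binary-item setting from the earlier post (so 190 pairs and q = 2):

```python
from scipy.stats import chi2

# Assumed setup: 20 binary items -> 190 item pairs, q = 2 categories,
# so each bivariate table contributes (q-1)^2 = 1 df.
q, n_pairs = 2, 190
df_pair = (q - 1) ** 2

# (1) Critical value for the pooled (global) Pearson statistic.
crit_global = chi2.ppf(0.95, df=n_pairs * df_pair)

# (2) Critical value for an individual bivariate Pearson statistic.
crit_single = chi2.ppf(0.95, df=df_pair)

print(crit_global, crit_single)
```

As the reply above notes, treating 190*(q-1)^2 (or (q-1)^2 per pair) as the reference df is exactly the step that is questionable, since the statistics are not independent and the df for a bivariate submodel is not well defined.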
