Jon Heron posted on Thursday, March 25, 2010 - 12:32 pm
I've read quite a few latent-class model papers recently and I don't recall any of them mentioning tech10/bivariate residuals/conditional independence.
In my (somewhat limited) experience it's much more difficult to keep one's residuals under control with a cross-sectional mixture model (compared with a longitudinal model), as the inter-dependence between the items tends to be more complex. It's not unheard of to need to add 2 or 3 more classes compared to what would be supported by entropy/BIC/BLRT alone.
So how important is this? Should I just concentrate on the more familiar assessments of fit and regard low residuals as being desirable but not strictly necessary?
BTW I recognise this isn't strictly Mplus related (although I did mention tech10).
We agree with you. We would use TECH10 sparingly and look at only the largest residuals. Rather than adding more classes, we might add residual covariances using the f BY language shown in Example 7.16, although each residual covariance adds one dimension of integration.
Jon Heron posted on Friday, March 26, 2010 - 7:12 am
Thanks for the suggestion. I'll check out Qu, Tan, and Kutner (1996).
best wishes, Jon
Jon Heron posted on Thursday, April 15, 2010 - 11:09 am
I recognize that Example 7.16 is merely to demonstrate a principle. However, I am wondering if one might have been able to infer that conditional independence was being violated IN ONE CLASS ONLY from some of the Mplus output.
As TECH10 output is not class-specific, I wonder whether the solution to conditional DEpendence is to start with a factor in only one class and then to add it to additional classes if the problem does not go away.
Jon Heron posted on Thursday, April 15, 2010 - 11:17 am
I think I've answered this myself - stepping away from the machine for a minute often works wonders.
Conditional dependence will show up as one or more high residuals for specific response patterns. Provided those patterns are allocated to the same class with a high class-assignment probability, you know that the CD problem will be in that class but not the other.
I believe you have reason to smile - I think this is on the right track. The RESPONSE option (see User's Guide) might be useful here, giving the most likely class membership for each pattern.
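The check described above can be sketched in a few lines of Python. This is only an illustration: the list of patterns, the residuals, the posterior probabilities and the two cut-offs (1.96 and 0.90) are all made up for the example and are not taken from any Mplus output.

```python
# Hypothetical per-pattern summary: response pattern, standardized residual,
# and posterior class probabilities. All values are invented for illustration.
patterns = [
    {"pattern": "1101", "std_resid": 3.4, "post": [0.97, 0.03]},
    {"pattern": "0110", "std_resid": 2.8, "post": [0.92, 0.08]},
    {"pattern": "0011", "std_resid": 0.6, "post": [0.10, 0.90]},
    {"pattern": "1000", "std_resid": -2.5, "post": [0.55, 0.45]},
]

def flag_patterns(rows, resid_cut=1.96, prob_cut=0.90):
    """Group high-residual patterns by modal class when assignment is clear-cut.

    Both cut-offs are assumptions of this sketch, not Mplus defaults.
    """
    by_class, ambiguous = {}, []
    for row in rows:
        if abs(row["std_resid"]) <= resid_cut:
            continue  # residual is not large, nothing to flag
        post = row["post"]
        modal = max(range(len(post)), key=post.__getitem__)
        if post[modal] >= prob_cut:
            by_class.setdefault(modal + 1, []).append(row["pattern"])
        else:
            ambiguous.append(row["pattern"])  # class assignment too uncertain
    return by_class, ambiguous

by_class, ambiguous = flag_patterns(patterns)
print(by_class, ambiguous)
```

If all flagged patterns land in a single class, that class would be the natural first candidate for the extra factor. Patterns in the ambiguous list do not point to either class.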
The idea of relaxing conditional independence in only one class was used in the Qu, Tan, Kutner 1996 Biometrics article on a dentistry application.
Jon Heron posted on Friday, April 16, 2010 - 6:42 am
Qu, Tan, Kutner discuss examples in which they feel they can argue for adding direct effects e.g. distinctive characteristics of slides in a laboratory.
I am currently struggling to justify introducing extra terms purely to improve fit without such non-statistical justification. That has never sat well with me.
Jon Heron posted on Friday, April 10, 2015 - 8:00 am
I see it's been five years since I pondered about the utility of Tech10.
In addition to using Tech10 to search for high standardized residuals, we've been wondering intermittently about pooling the individual Pearson values.
It seems to me that if conditional independence were to hold and one were to convert the bivariate Pearson chi-squares to p-values, these p-values should be uniformly distributed. One might then use something like Kolmogorov-Smirnov to test for uniformity.
To test this I simulated a 2-class mixture with 20 binary indicators. The resulting distribution for the 190 p-values was strongly skewed towards p=1.0. There was also no p-value <0.05 whereas I would have expected a few.
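A minimal sketch of the p-value conversion and uniformity check described in this post, using only the Python standard library. It assumes binary items, so each bivariate Pearson value is referred to a chi-square with (q-1)^2 = 1 df; that reference distribution is an assumption of the sketch. The input values are invented and the KS p-value uses the asymptotic Kolmogorov series.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def ks_uniform(pvals):
    """One-sample KS statistic against Uniform(0, 1) with an asymptotic p-value."""
    xs = sorted(pvals)
    n = len(xs)
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    t = d * math.sqrt(n)
    if t < 0.2:
        return d, 1.0  # the series below is numerically awkward near zero
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))

# Invented chi-square values, all small, mimicking p-values piled up near 1.
chi2_values = [0.001 * i for i in range(1, 191)]
pvals = [chi2_sf_df1(x) for x in chi2_values]
d, p = ks_uniform(pvals)
print(round(d, 3), p < 0.05)
```

The KS test treats the 190 p-values as independent draws, which is a further assumption to keep in mind when reading the result.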
The 190 values are not independent draws. So I think you might have to draw multiple samples (not just one data sample) to determine that distribution.
However, I wouldn't expect it to be uniform. We don't actually compute p-values in TECH10 or claim a DF. We don't do that because the number of parameters is not a clear concept when you look at a bivariate sub-model.
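The suggestion to draw multiple samples could be sketched as below. This is a simplified illustration and does not reproduce TECH10: the class sizes, item probabilities, number of items (6, giving 15 pairs) and sample sizes are all invented, and the expected bivariate frequencies come from the generating parameters rather than from a model re-estimated in each replication. A real check would refit the mixture model to every simulated data set.

```python
import itertools
import random

def simulate(n, class_probs, item_probs, rng):
    """Draw n response vectors from a 2-class latent class model (binary items)."""
    data = []
    for _ in range(n):
        c = 0 if rng.random() < class_probs[0] else 1
        data.append([1 if rng.random() < p else 0 for p in item_probs[c]])
    return data

def bivariate_pearson(data, class_probs, item_probs, j, k):
    """Pearson chi-square for the 2x2 table of items j and k.

    Expected counts use the generating parameters (an assumption of this sketch).
    """
    n = len(data)
    obs = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for row in data:
        obs[(row[j], row[k])] += 1
    chi2 = 0.0
    for (a, b), o in obs.items():
        prob = sum(
            w * (p[j] if a else 1 - p[j]) * (p[k] if b else 1 - p[k])
            for w, p in zip(class_probs, item_probs)
        )
        e = n * prob
        chi2 += (o - e) ** 2 / e
    return chi2

def replicate(n_reps=200, n=500, seed=1):
    """Return one list of pairwise Pearson values per simulated data set."""
    rng = random.Random(seed)
    class_probs = [0.4, 0.6]              # hypothetical class sizes
    item_probs = [[0.8] * 6, [0.2] * 6]   # hypothetical item probabilities
    pairs = list(itertools.combinations(range(6), 2))
    results = []
    for _ in range(n_reps):
        d = simulate(n, class_probs, item_probs, rng)
        results.append([bivariate_pearson(d, class_probs, item_probs, j, k)
                        for j, k in pairs])
    return results

results = replicate(n_reps=50, n=300)
largest = sorted(max(r) for r in results)
print(largest[len(largest) // 2])  # median of the per-sample maximum
```

Keeping the values grouped by replication lets any summary (the maximum, the sum, the count above a cut-off) be compared against its own simulated distribution, without treating the pairs within one sample as independent.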
Jon Heron posted on Monday, April 13, 2015 - 6:17 am
Hmm, thanks Tihomir.
Those two short comments pretty much invalidate everything we have done/considered:
(1) Compare the global Pearson with Chi-square(df = 190*(q-1)^2, alpha = 0.05), where q = #categories for each variable
(2) Assess how many individual Pearson values are > Chi-square(df = (q-1)^2, alpha = 0.05)
(3) Determine the number of high standardized residuals
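For reference, the three checks listed above could be coded roughly as follows. Note the caveat from the reply: TECH10 does not claim a df, so the chi-square reference values here are assumptions. The sketch also assumes the global Pearson is the sum of the pairwise values, uses the Wilson-Hilferty approximation for critical values with more than 1 df, and takes 1.96 as the cut-off for a "high" standardized residual. The function names and inputs are hypothetical.

```python
from statistics import NormalDist

def chi2_crit_df1(alpha=0.05):
    """Critical value for 1 df: the square of a two-sided normal quantile."""
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2

def chi2_crit_wh(df, alpha=0.05):
    """Wilson-Hilferty approximation to the chi-square critical value."""
    z = NormalDist().inv_cdf(1 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1 - a + z * a ** 0.5) ** 3

def three_checks(pearson, std_resid, q=2, alpha=0.05, resid_cut=1.96):
    """Apply checks (1)-(3) to pairwise Pearson values and standardized residuals."""
    df_pair = (q - 1) ** 2
    if df_pair == 1:
        crit_pair = chi2_crit_df1(alpha)
    else:
        crit_pair = chi2_crit_wh(df_pair, alpha)
    crit_global = chi2_crit_wh(len(pearson) * df_pair, alpha)
    return {
        "global_exceeds": sum(pearson) > crit_global,                    # check (1)
        "n_pairs_over": sum(x > crit_pair for x in pearson),             # check (2)
        "n_high_resid": sum(abs(r) > resid_cut for r in std_resid),      # check (3)
    }

# Invented inputs, for illustration only.
summary = three_checks([0.5, 4.2, 1.0], [0.3, 2.5, -2.1, 1.0])
print(summary)
```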