

LPA with highly nonnormal indicators... 



I've run an LPA (n = 5,500) with 3 indicators: frequency of drinking, typical quantity, and frequency of binge drinking (observed skewness 38; kurtosis 898).

1) In my final (normal) model I still have residuals with high skewness/kurtosis. Q1: Is this solution still useful when, e.g., the kurtosis is underestimated by 98% (21.46 vs. 0.42)?

2) Skew-t gave a much better fit than normal/skew-normal/t with 1 class, but higher numbers of classes resulted in nonconvergence or errors, e.g. DUE TO A LOW DF ESTIMATE IN CLASS 1 THE ESTIMATED MEAN IN THAT CLASS IS INFINITY. THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY. ESTIMATES CANNOT BE TRUSTED. RESIDUAL VARIANCES CONVERGING TO 0. Q2: Should I use categorical indicators instead?

3) When I use categorical indicators, the best LL is replicated sufficiently, BUT of the 1000 replications selected, 400 to 950 did not converge to a solution. Q3: How many of the replications should converge for the solution to be interpretable?

Q4: Generally, do BVRs tend to be higher with sparse cell counts and large sample sizes?

Q5: Generally, is there a way to describe the classes by means/percentages of nonindicator variables while taking measurement error into account (entropy is < .8)?

Thanks so much!


Q1: Kurtosis needs a very large sample to be well estimated.
2) You might have strong floor effects, for which skew-t does not work well.
Q2: Treating the variable as ordinal categorical is a good idea and solves any floor effect.
Q3: About 5.
Q4: I don't know what BVR is.
Q5: I don't know what "nonindicator variables" refers to.
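To illustrate the Q1 point that kurtosis is hard to estimate, here is a minimal Python sketch. It is not taken from the thread: it assumes normally distributed data as a simple stand-in (the poster's drinking data are far more skewed), and the function names `excess_kurtosis` and `kurtosis_spread` are made up for this example. It draws repeated samples and reports how much the sample excess kurtosis varies from sample to sample at two sample sizes.

```python
import random
import statistics


def excess_kurtosis(xs):
    """Moment-based sample excess kurtosis: m4 / m2^2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 ** 2) - 3.0


def kurtosis_spread(sample_size, reps=100, seed=1):
    """Mean and SD of the kurtosis estimate over repeated samples.

    Assumption: standard normal data, whose true excess kurtosis is 0.
    """
    rng = random.Random(seed)
    estimates = [
        excess_kurtosis([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
        for _ in range(reps)
    ]
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (50, 2000):
        mean_est, sd_est = kurtosis_spread(n)
        print(f"n={n}: mean estimate {mean_est:.3f}, SD across samples {sd_est:.3f}")
```

For normal data the standard error of the sample kurtosis is roughly sqrt(24/n), so the spread shrinks only slowly with n. Heavy-tailed data like those described above typically make the estimate noisier still.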


Sorry for not being precise.

Q4. I meant the bivariate residuals from the TECH10 output. My standardized residuals are largest for those variable categories with very low probabilities in the TECH10 output, e.g. < 0.005. So I wondered whether residuals tend to be higher with sparse cell counts, or depend on sample size in general. For example, for my indicator "typical quantity" I have only n = 20 people in the highest category, "10 or more drinks".

Q5. I meant variables that are not used to form the classes. E.g., I would like to describe the classes I found by sociodemographics, in terms of means and percentages. Is there a way to take measurement error into account, as entropy is < .8?

Thanks so much for your help.


Q4. It may be that for very sparse cells, the normal approximation used in the z-test is poor, although 20 sounds large enough.

Q5. You can use a 3-step procedure for "distal outcomes"; see Mplus Web Notes 15 and 21 on our website. Web Note 21 has tables at the end suggesting options for this, depending on whether the variable is categorical or continuous.
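The Q4 remark about the normal approximation can be checked numerically. The sketch below is only an illustration under stated assumptions: it treats a single cell count as binomial, uses a Pearson-type standardized residual (which may not match the exact TECH10 formula), and plugs in hypothetical numbers (n = 5,500 from the thread, an assumed model-estimated cell probability of 0.002, and an observed count of 20). It compares the upper-tail probability implied by the z-test with the exact binomial tail.

```python
import math


def binom_logpmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), as 1 minus the lower tail."""
    lower = sum(math.exp(binom_logpmf(j, n, p)) for j in range(k))
    return max(0.0, 1.0 - lower)


def standardized_residual(observed, n, p):
    """(observed - expected) / binomial SD; an assumed, Pearson-type form."""
    return (observed - n * p) / math.sqrt(n * p * (1.0 - p))


def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    n, p, observed = 5500, 0.002, 20   # p and observed are hypothetical values
    z = standardized_residual(observed, n, p)
    print(f"z = {z:.2f}")
    print(f"normal-approximation tail: {normal_upper_tail(z):.4f}")
    print(f"exact binomial tail:       {binom_upper_tail(observed, n, p):.4f}")
```

With an expected count this small, the binomial distribution is right-skewed, so the exact upper tail is heavier than the normal approximation suggests, and large positive z values in sparse cells look more extreme than they are.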


