I have 4 waves of measurement of children's behavioral problem scores (normally distributed) and 4 corresponding measurements of the # of placements experienced between waves for 5500 children in the child welfare system. The placement data are extremely skewed, with many zeros. The percentages of children experiencing 0, 1, 2, and 3 or more placements, respectively, are: wave 1: 85, 10, 3, 1; wave 2: 75, 12, 7, 6; wave 3: 88, 7, 3, 2; wave 4: 95, 2, 1, 2.
I am using a cross-lagged panel design, with the # of placements predicting behavior problems and vice versa. For now, I am treating the # of placements as a 4-category ordinal variable, but I wonder whether the percentage of zeros and/or the skew are such that this model ceases to be viable. How do I assess this?
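For concreteness, the ordinal cross-lagged setup described above might be specified roughly as follows. This is only a sketch: the variable names (bp1-bp4 for behavior problems, pl1-pl4 for placements) are hypothetical, not from the original post, and the handling of categorical variables that serve as both outcomes and predictors depends on the estimator chosen.

```
VARIABLE:
  NAMES ARE bp1-bp4 pl1-pl4;
  CATEGORICAL ARE pl2-pl4;   ! ordinal placement outcomes;
                             ! pl1 enters only as a covariate
ANALYSIS:
  ESTIMATOR = WLSMV;         ! common choice for categorical outcomes
MODEL:
  ! autoregressive paths
  bp2 ON bp1;  bp3 ON bp2;  bp4 ON bp3;
  pl2 ON pl1;  pl3 ON pl2;  pl4 ON pl3;
  ! cross-lagged paths
  bp2 ON pl1;  bp3 ON pl2;  bp4 ON pl3;
  pl2 ON bp1;  pl3 ON bp2;  pl4 ON bp3;
```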
Would a zero-inflated Poisson model be preferred? A problem is that the inflated portion can't be used as a predictor (page 521). Can Mplus build overdispersion into a Poisson model?
The semi-continuous model seems to hold possibilities. If I don't switch to a growth model, can the binary and continuous variables created from the placement variable at an earlier wave function as predictors of outcomes in a subsequent one?
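For what it's worth, Mplus can split each placement variable into its two parts with the DATA TWOPART command; a minimal sketch (variable names hypothetical) might look like this. Note one complication: the continuous part is missing for children with zero placements, which makes its use as a predictor less straightforward.

```
DATA TWOPART:
  NAMES = pl1-pl4;
  CUTPOINT = 0;
  BINARY = bpl1-bpl4;        ! 0 = no placements, 1 = any placements
  CONTINUOUS = cpl1-cpl4;    ! amount when positive (log by default);
                             ! missing when the count is zero
VARIABLE:
  NAMES ARE bp1-bp4 pl1-pl4;
  USEVARIABLES = bp1-bp4 bpl1-bpl4 cpl1-cpl4;
MODEL:
  bp2 ON bp1 bpl1 cpl1;      ! both wave-1 parts as predictors of wave 2
```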
I may exclude wave 4, due to its low variability and other problems. I could also exclude a subgroup of 1000 kids who experienced hardly any placements. Would these changes help?
In Version 5.1, we added several models for count variables. These are documented in the Version 5.1 Language Addendum and the Version 5.1 Examples Addendum, which are on the website with the user's guide. Try the negative binomial model.
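As a rough illustration, the cross-lagged structure with placements treated as negative binomial counts might be set up along these lines (variable names are hypothetical; only the placement variables that serve as outcomes are declared on the COUNT list):

```
VARIABLE:
  NAMES ARE bp1-bp4 pl1-pl4;
  COUNT ARE pl2-pl4 (nb);    ! negative binomial count outcomes;
                             ! pl1 enters only as a covariate
ANALYSIS:
  ESTIMATOR = MLR;           ! maximum likelihood is used with counts
MODEL:
  bp2 ON bp1 pl1;  bp3 ON bp2 pl2;  bp4 ON bp3 pl3;
  pl2 ON pl1 bp1;  pl3 ON pl2 bp2;  pl4 ON pl3 bp3;
```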
I would not use a semicontinuous model. There are not enough values in the tail.
See how the negative binomial model works before you consider other changes.
Is it possible to do a group-based trajectory analysis of cost data? There will be 10+ time points. There are obviously many options for fitting longitudinal cost data, but I wasn't sure whether those options would be limited within the GBTM framework.
How skewed do data have to be such that the MLR estimator is no longer appropriate in the context of LCGA/GMM? I am attempting to identify longitudinal trajectories in my data on ADHD symptoms across 4 ages, in which there are many zeros (over 50%). Using TECH13, both the skew and kurtosis tests are statistically significant (p < 0.0001). If I ignore the skewness and treat the outcome as continuous, I get solutions for a 4-class LCGA with quite high entropy (0.94). But are these results invalidated by the skew of the data? I have also run the analyses with a negative binomial distribution, but the entropy is much lower (about 0.74). If you have any advice about how it would be most appropriate to proceed, I would really appreciate it. Many thanks.
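For reference, a count-based LCGA of the kind described (ADHD symptom counts at 4 ages; variable names hypothetical) might be sketched as follows:

```
VARIABLE:
  NAMES ARE adhd1-adhd4;
  COUNT ARE adhd1-adhd4 (nb);  ! negative binomial for the symptom counts
  CLASSES = c(4);
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 500 50;             ! many random starts to avoid local maxima
MODEL:
  %OVERALL%
  i s | adhd1@0 adhd2@1 adhd3@2 adhd4@3;
  i-s@0;                       ! LCGA: growth factor variances fixed at zero
```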