Extreme skew in longitudinal model
 james rosenthal posted on Thursday, April 23, 2009 - 4:02 pm

I have 4 waves of measurement on children's behavioral problem scores (normally distributed) and 4 corresponding measurements of the number of placements experienced between waves, for 5500 children in the child welfare system. The placement data are extremely skewed and have many zeros. The percentages of children experiencing 0, 1, 2, and 3 or more placements, respectively, are: wave 1: 85, 10, 3, 1; wave 2: 75, 12, 7, 6; wave 3: 88, 7, 3, 2; and wave 4: 95, 2, 1, 2.

I am using a cross-lagged panel design, with the number of placements predicting behavior problems and vice versa. For now, I am treating the number of placements as a 4-category ordinal variable, but I wonder whether the percentage of zeros and/or the skew are such that this model ceases to be viable. How do I assess this?

Would a zero-inflated Poisson model be preferred? A problem is that the inflated portion can't be used as a predictor (page 521). Can Mplus build overdispersion into a Poisson model?

The semi-continuous model seems to hold possibilities. If I don't switch to a growth model, can the binary and continuous variables created from the placement variable at an earlier wave function as predictors of outcomes in a subsequent one?

I may exclude wave 4, due to its low variability and other problems. I can exclude a subgroup of 1000 kids who experienced hardly any placements. Would these changes help?


 Linda K. Muthen posted on Friday, April 24, 2009 - 3:36 pm
In Version 5.1, we added several models for count variables. These are documented in the Version 5.1 Language Addendum and the Version 5.1 Examples Addendum which are on the website with the user's guide. Try the negative binomial model.

I would not use a semicontinuous model. There are not enough values in the tail.

See how the negative binomial model works before you consider other changes.
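
A rough way to see why a negative binomial model is suggested here is to compare each wave's variance with its mean, and the observed share of zeros with what a plain Poisson model would expect. The sketch below is in Python rather than Mplus and uses only the percentages quoted in the first post. It assumes that "3 or more" is exactly 3 (which understates the true variance) and renormalizes percentages that do not sum to 100 because of rounding; the function and variable names are made up for illustration.

```python
import math

# Percentages of children with 0, 1, 2, and 3+ placements, as quoted in the first post.
WAVE_PCTS = {
    1: [85, 10, 3, 1],
    2: [75, 12, 7, 6],
    3: [88, 7, 3, 2],
    4: [95, 2, 1, 2],
}


def dispersion_summary(pcts):
    """Summarize a count distribution given percentages for counts 0, 1, 2, 3.

    Assumption: the top category ("3 or more") is treated as exactly 3,
    so the variance reported here is a lower bound.
    """
    total = sum(pcts)  # wave 1 sums to 99 because of rounding, so renormalize
    probs = [p / total for p in pcts]
    mean = sum(k * p for k, p in enumerate(probs))
    variance = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
    return {
        "mean": mean,
        "variance": variance,
        "dispersion": variance / mean,      # equals 1 under a Poisson model
        "observed_zero": probs[0],
        "poisson_zero": math.exp(-mean),    # P(Y = 0) for a Poisson with this mean
    }


summaries = {wave: dispersion_summary(pcts) for wave, pcts in WAVE_PCTS.items()}

for wave, s in summaries.items():
    print(
        f"wave {wave}: mean={s['mean']:.3f} variance={s['variance']:.3f} "
        f"dispersion={s['dispersion']:.2f} "
        f"zeros observed={s['observed_zero']:.3f} Poisson-expected={s['poisson_zero']:.3f}"
    )
```

In every wave the variance exceeds the mean and there are more zeros than a Poisson with the same mean would produce, which is the pattern a negative binomial model is meant to absorb. This is only a descriptive check, not a substitute for fitting and comparing the models.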
 Kelly Kenzik posted on Tuesday, June 27, 2017 - 1:26 pm

Is it possible to do a group-based trajectory analysis based on cost data?
Time points will be 10+. There are obviously many options for fitting longitudinal cost data, but I didn't know if these options would then be limited within the GBTM framework?

Thank you
 Bengt O. Muthen posted on Tuesday, June 27, 2017 - 5:57 pm
You can do growth mixture modeling. I don't know what GBTM means.
 Jessica Agnew-Blais posted on Wednesday, October 02, 2019 - 1:47 pm
How skewed do data have to be before the MLR estimator is no longer appropriate in the context of LCGA/GMM? I am attempting to identify longitudinal trajectories in my data on ADHD symptoms across 4 ages, in which there are many zeros (over 50%). Using TECH13, both the skew and kurtosis tests are statistically significant (p < 0.0001). If I ignore the skewness and treat the outcome as continuous, I get solutions for a 4-class LCGA with quite high entropy (0.94). But are these results invalidated by the skew of the data? I have also run the analyses with a negative binomial distribution, but the entropy is much lower (about 0.74). If you have any advice on the most appropriate way to proceed, I would really appreciate it. Many thanks.
 Bengt O. Muthen posted on Wednesday, October 02, 2019 - 5:07 pm
When you have that strong a floor effect, you should not treat the variable as continuous. Instead, you can treat it as censored or do two-part modeling - both have examples in the User's Guide.
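
To make the two-part idea concrete, here is a minimal Python sketch of how one floor-heavy variable can be split into two: a binary indicator of whether the score is above the floor, and a continuous part that is defined only for scores above the floor. The cutpoint of 0, the log transform of the positive values, the function name, and the example scores are all assumptions chosen for illustration; they are not taken from this thread and are not a description of Mplus's exact implementation.

```python
import math


def two_part_split(values, cutpoint=0.0):
    """Split one variable into a binary part and a continuous part.

    binary:     1 if the value is above the cutpoint, 0 if at or below it,
                None if the value itself is missing.
    continuous: log(value) if the value is above the cutpoint, otherwise None
                (treated as missing, so floor cases do not enter the
                continuous part of the model).
    The log transform is an assumption; it only works for positive values.
    """
    binary, continuous = [], []
    for v in values:
        if v is None:
            binary.append(None)
            continuous.append(None)
        elif v > cutpoint:
            binary.append(1)
            continuous.append(math.log(v))
        else:
            binary.append(0)
            continuous.append(None)
    return binary, continuous


# Hypothetical symptom counts at one age, with many zeros and one missing value.
scores = [0, 0, 3, 1, 0, 7, None]
binary_part, continuous_part = two_part_split(scores)
print(binary_part)
print(continuous_part)
```

Each measurement occasion would be split this way, giving one binary and one continuous variable per time point, so that the probability of any symptoms and the level of symptoms among those who have them can each get their own growth process.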