I am working with a measure in which individuals are presented with a word list and then immediately asked to recall the words (in any order). The score for each trial is the number of words correctly recalled. A total of 15 words are presented, and there are five consecutive trials.
This is count data. However, the distribution of subjects' “scores” on each trial is symmetric around the mean. Recall of 0 words on any given trial is a rare event. Mean scores are also consistently greater than their variances, and skewness and kurtosis are close to zero.
Oversimplifying, I am modelling the increase in the number of words recalled from trials 1 to 5 using latent growth modelling (LGM). I fit different functional forms of growth to the data and am considering the use of structured LGMs. This has been done in the literature before with similar measures. In all previous studies, the authors treat the scores as continuous and make no adjustments for non-normality in the data.
How should I proceed? Should I follow precedent and treat the data as “normally” distributed (and by implication continuous), or as counts? If counts, how do I deal with the fact that this data is not Poisson distributed? The distribution of this data does not meet the assumptions underlying any of the alternative count distributions supported by Mplus.
You can use the negative binomial model for counts, which allows the variance to differ from the mean. But if the count distribution is almost symmetric with no floor effect at zero, perhaps a normal/continuous approximation is OK. You can compare the results of the two approaches and see. If they are similar, proceed with continuous/normal if that's accepted in the field, adding a statement that you investigated the count alternative.
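Before fitting the growth models, it may help to check each trial's distribution directly. The following is only a rough sketch in Python, not Mplus syntax, and the scores in it are made up for illustration. It computes the mean, variance, variance-to-mean ratio, skewness and excess kurtosis for one trial, and then compares the maximized log-likelihood of a Poisson and a normal fit to the same scores. The function names are hypothetical. Note that comparing a discrete probability with a continuous density is itself an approximation that leans on the scores being integers one unit apart.

```python
import math
from statistics import fmean, variance

def describe_counts(scores):
    """Per-trial descriptives used to judge a normal approximation."""
    n = len(scores)
    m = fmean(scores)
    # Population SD is used for the standardized moments below.
    sd = math.sqrt(sum((x - m) ** 2 for x in scores) / n)
    sample_var = variance(scores)  # n - 1 denominator
    return {
        "mean": m,
        "variance": sample_var,
        # Below 1 means the variance is smaller than the mean.
        "dispersion": sample_var / m,
        "skewness": sum(((x - m) / sd) ** 3 for x in scores) / n,
        "excess_kurtosis": sum(((x - m) / sd) ** 4 for x in scores) / n - 3.0,
    }

def poisson_loglik(scores):
    """Log-likelihood of a Poisson fit at its ML estimate (the mean)."""
    lam = fmean(scores)
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in scores)

def normal_loglik(scores):
    """Log-likelihood of a normal fit at its ML estimates."""
    n = len(scores)
    m = fmean(scores)
    sigma2 = sum((x - m) ** 2 for x in scores) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1.0)

# Hypothetical scores for a single trial (0 to 15 words), NOT real data.
trial_scores = [5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11]

stats = describe_counts(trial_scores)
ll_poisson = poisson_loglik(trial_scores)
ll_normal = normal_loglik(trial_scores)

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
print(f"Poisson log-likelihood: {ll_poisson:.2f}")
print(f"Normal log-likelihood:  {ll_normal:.2f}")
```

A variance-to-mean ratio well below 1 together with near-zero skewness would be in line with the pattern described in the question. This per-trial check does not replace comparing the count and continuous growth models themselves.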