Chelsea Jin posted on Thursday, March 01, 2012 - 9:19 pm
I've recently worked on a project involving correlated errors between a count variable and a continuous variable. For example, I have equations like:
y1 ON x1 - x5; y2 ON x1 - x6;
Say y1 is a count variable; it could be negative binomial or zero-inflated. y2 is approximately normally distributed. I want to specify "y1 WITH y2", but the WITH statement doesn't apply to a count variable paired with a continuous one, so I'm looking for a solution.
Oh, it's another question... I mean there's no residual covariance between y2 and y1 this time. It's just a continuous variable regressed on a count one. Maybe the count one is an outcome of another regression, like "y1 ON x1 - x5; y2 ON x1 - x6 y1;", so y1 is a mediator.
I think I read some notes saying that in Mplus, if a count variable is a predictor, then it is treated as a continuous variable. Even if it's a mediator, it's still treated as continuous. Am I right? Is there any other way to deal with a count variable as a mediator?
Then for the first one, "f BY y1@1 y2 y3", I can get factor loadings on y2 and y3, but how can I know the correlation coefficient of the residuals between y2 and y3? The same question for the second situation.
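One way to get at this is to compute the correlation from the factor parameters, since the factor f is what carries the shared residual part of y2 and y3. The fragment below is only a sketch: the labels l2, l3, psi, th2, th3 and the new parameters cov23 and cor23 are hypothetical names, and the rest of the input (VARIABLE, ANALYSIS, the ON statements) is assumed to stay as it was.

```
MODEL:
  f BY y1@1
       y2 (l2)                ! loading of y2 on f
       y3 (l3);               ! loading of y3 on f
  f (psi);                    ! factor variance
  y2 (th2);                   ! remaining residual variance of y2
  y3 (th3);                   ! remaining residual variance of y3
MODEL CONSTRAINT:
  NEW(cov23 cor23);
  cov23 = l2*l3*psi;          ! covariance induced by f
  cor23 = cov23 / SQRT((l2*l2*psi + th2)*(l3*l3*psi + th3));
```

Under these assumptions cor23 should appear with a standard error among the new/additional parameters in the output. Because y1 is a count variable, the analogous quantity for a pair involving y1 is not a correlation in the usual sense.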
Hello, This is an interesting problem that may apply to a parallel process model I am running. If one growth process is continuous and the other process is specified with a Poisson distribution through the COUNT command, are the covariances between the latent intercepts of each process (the same would apply to the slopes) specified correctly by a simple WITH statement, or would I need to specify them with the BY command as described above? Thank you for your advice.
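For reference, the setup being asked about might look like the sketch below. Variable names (y1-y4 for the continuous process, u1-u4 for the count process), the four time points, and the time scores are all hypothetical. It assumes the growth factors are ordinary continuous latent variables, in which case their covariances would be given with WITH rather than through an extra BY factor; that assumption is worth checking against the output.

```
VARIABLE:
  NAMES = y1-y4 u1-u4;        ! hypothetical variable names
  COUNT = u1-u4;              ! Poisson by default
ANALYSIS:
  ESTIMATOR = MLR;
  ALGORITHM = INTEGRATION;    ! needed for the count process
MODEL:
  iy sy | y1@0 y2@1 y3@2 y4@3;  ! continuous growth process
  iu su | u1@0 u2@1 u3@2 u4@3;  ! count growth process
  iy WITH iu;                 ! covariance between the latent intercepts
  sy WITH su;                 ! covariance between the latent slopes
```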
Hello, I would like to clarify the points above. I am running a multiple-wave autoregressive model with a count variable as a DV. There are five other variables that need to be correlated with the count variable. The advice above indicates that each of the other variables should be specified as a separate two-indicator factor with the count DV (f1 by c@1 y2; f2 by c@1 y3;). With 274 participants, this model either fails to converge or reaches saddle points.
My question is: would a single latent factor with all of the variables as indicators (f by c@1 Y2 Y3 Y4 Y5 Y6) be appropriate for capturing the residual correlations among all of the variables, both with the count variable and with each other? I do not need to be able to see the values of each correlation separately.
One factor would be an approximation where the different loadings would have to pick up the different-sized correlations.
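A minimal sketch of the single-factor version follows. The file name, the covariates x1-x5 and the ON statement are placeholders, not the poster's actual model; only the BY statement comes from the thread.

```
DATA:
  FILE = data.dat;            ! hypothetical file name
VARIABLE:
  NAMES = c y2-y6 x1-x5;      ! hypothetical variable names
  COUNT = c;                  ! Poisson by default
ANALYSIS:
  ESTIMATOR = MLR;
  ALGORITHM = INTEGRATION;    ! one factor = one dimension of integration
MODEL:
  c y2-y6 ON x1-x5;           ! placeholder regressions
  f BY c@1 y2-y6;             ! one factor for all residual correlations
  ! Pairwise alternative from the earlier advice, one factor per pair:
  ! f1 BY c@1 y2;  f2 BY c@1 y3;  ... (one integration dimension per factor)
```

With a single factor the implied residual correlations are tied together through the loadings, which is why this is an approximation. The pairwise version is more flexible but adds a dimension of integration for every factor, which may be part of the convergence trouble at n = 274.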
Note, however, that it isn't clear how an autoregressive model with counts should be defined. Each DV can follow a count regression but the IV is treated as a regular continuous variable. That becomes a strange mix of regression equations which doesn't seem right. And, there is no underlying latent response variable concept for count variables that can resolve it as far as I know (which means there is a chance there might be). One way around this is to treat the count variable as an ordinal variable - although not perfect, it might be a practical way out. In that case you can use WLSMV where you have no problems with these correlations. Or use Bayes with ordinal which handles missing data better than WLSMV. Or ML, but then you have the correlation problem.
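The ordinal workaround could be sketched as follows for a three-wave model. The names c1-c3 (the count DV) and y1-y3 (one of the continuous variables), the file name, and the lag structure are all hypothetical. A count with many distinct values may need to be collapsed into fewer categories before it goes on the CATEGORICAL list.

```
DATA:
  FILE = data.dat;            ! hypothetical file name
VARIABLE:
  NAMES = c1-c3 y1-y3;        ! hypothetical variable names
  CATEGORICAL = c1-c3;        ! counts treated as ordered categories
ANALYSIS:
  ESTIMATOR = WLSMV;
MODEL:
  c2 ON c1 y1;                ! autoregressive and cross-lagged paths
  c3 ON c2 y2;
  y2 ON y1 c1;
  y3 ON y2 c2;
  c1 WITH y1;                 ! wave 1 covariance
  c2 WITH y2;                 ! residual covariances via WITH under WLSMV
  c3 WITH y3;
```

For the Bayes alternative mentioned above, ESTIMATOR = BAYES would replace WLSMV in the ANALYSIS command; whether the same WITH statements carry over unchanged is something to verify in the output.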