Correlated Errors between Count and C...
 Chelsea Jin posted on Thursday, March 01, 2012 - 9:19 pm
I've recently worked on a project involving correlated errors between a count variable and a continuous variable. For example, I have equations like:

y1 ON x1 - x5;
y2 ON x1 - x6;

Say y1 is a count variable; it could be negative binomial or zero-inflated. y2 is approximately normally distributed. I want to specify "y1 WITH y2", but that statement cannot be used between a count variable and a continuous one, so I'm looking for a solution.

Many thanks!
 Linda K. Muthen posted on Friday, March 02, 2012 - 11:37 am
In your situation, each residual covariance requires one dimension of integration. You need to specify them using the BY option, for example,

f BY y1@1 y2;
f@1;
[f@0];

where the factor loading for y2 will contain the residual covariance parameter.
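
As a minimal sketch, the specification above might sit in a complete input like the following. The file name and variable order are hypothetical, and y1 is declared as negative binomial only because that is one of the distributions mentioned in the question. Because f has variance 1 and a loading of 1 on y1, the covariance that f induces between the linear predictor of y1 and y2 equals the loading of y2.

```
DATA:      FILE = data.dat;            ! hypothetical file name
VARIABLE:  NAMES = y1 y2 x1-x6;        ! hypothetical variable order
           COUNT = y1 (nb);            ! negative binomial; (i) or (nbi) for zero-inflated
ANALYSIS:  ESTIMATOR = MLR;
           ALGORITHM = INTEGRATION;    ! one dimension of integration for f
MODEL:     y1 ON x1-x5;
           y2 ON x1-x6;
           f BY y1@1 y2;               ! loading of y2 carries the residual covariance
           f@1;
           [f@0];
```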
 Chelsea Jin posted on Friday, March 02, 2012 - 2:57 pm
Oh, thanks so much, Linda! But does it matter whether y1 is negative binomial, Poisson, or zero-inflated?

In addition, what if I want to regress y2 on y1, say "y2 ON x1 - x6 y1"? Do I need further steps to take into account that y1 is a count variable?

I would appreciate your reply.
 Linda K. Muthen posted on Friday, March 02, 2012 - 4:26 pm
No, it does not matter what type of model you are estimating.

You cannot regress y2 on y1 if you have a residual covariance for y2 and y1. Both parameters cannot be identified.
 Chelsea Jin posted on Friday, March 02, 2012 - 4:35 pm
Oh, it's another question... I mean there's no residual covariance between y2 and y1 this time. It's just a continuous variable regressed on a count one. Maybe the count one is an outcome of another regression, like "y1 ON x1 - x5; y2 ON x1 - x6 y1;", so y1 is a mediator.

I think I read some notes saying that in Mplus, if a count variable is a predictor, it is treated as a continuous variable. Even if it's a mediator, it's still treated as continuous. Am I right? Is there any other way to handle a count variable as a mediator?

Many thanks.
 Linda K. Muthen posted on Friday, March 02, 2012 - 5:38 pm
When a count variable is a mediator, it is treated as a count variable when it is a dependent variable and as a continuous variable when it is an independent variable.
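
A minimal sketch of that mediator setup, assuming a Poisson count and a hypothetical variable order (the ANALYSIS settings are also an assumption, not something stated in the thread):

```
VARIABLE:  NAMES = y1 y2 x1-x6;        ! hypothetical variable order
           COUNT = y1;                 ! Poisson unless another type is requested
ANALYSIS:  ESTIMATOR = MLR;
MODEL:     y1 ON x1-x5;                ! y1 as outcome: count regression
           y2 ON x1-x6 y1;             ! y1 as predictor: treated as continuous
```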
 Chelsea Jin posted on Sunday, March 04, 2012 - 12:06 pm
Hi, I have questions still back to correlated residuals. Now, I have three regressions:

y1 ON x1 - x5;
y2 ON x1 - x6;
y3 ON x1 - x5;

Still, y1 is a count and y2 is continuous. y3 could be either count or continuous. If I want the three residuals to be mutually correlated, should I say:

if y3 is continuous:

f BY y1@1 y2 y3;
f@1;
[f@0];

if y3 is count:

f BY y1@1 y3@1 y2;
f@1;
[f@0];

or do two factors have to be extracted, one from y1 and the other from y3, like:

f1 BY y1@1 y2;
f1@1;
[f1@0];
f2 BY y3@1 y2;
f2@1;
[f2@0];
f1 WITH f2;

I'm not sure which one should be correct...

Then for the first one, "f BY y1@1 y2 y3", I can get factor loadings on y2 and y3, but how can I know the correlation coefficient of the residuals between y2 and y3? The same question for the second situation.

Many thanks.
 Bengt O. Muthen posted on Sunday, March 04, 2012 - 6:08 pm
For each pair of residuals you need one factor.
 Chelsea Jin posted on Sunday, March 04, 2012 - 9:27 pm
Hmmm... but how do I correlate the residuals of two count variables, since both factor loadings are 1...

Thanks~
 Linda K. Muthen posted on Monday, March 05, 2012 - 6:11 am
They are not both one:

f BY c1@1 c2;
f@1;
[f@0];

 Chelsea Jin posted on Monday, March 05, 2012 - 8:23 am
But it's still confusing~ Mplus also estimates the correlations among the factors... How can I know the correlated factors are not the correlated residuals...?

 Bengt O. Muthen posted on Monday, March 05, 2012 - 8:46 am
The factors should be uncorrelated:

f1 WITH f2@0; etc.
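
Putting the last few replies together, a sketch for three outcomes could look like this: one factor per pair of residuals, with the factors fixed to be uncorrelated. It assumes y3 is a count, and the factor names f12, f13, f23 and the variable order are hypothetical. Each factor adds a dimension of integration, so estimation may be slow.

```
VARIABLE:  NAMES = y1 y2 y3 x1-x6;     ! hypothetical variable order
           COUNT = y1 y3;              ! assumes y3 is also a count
ANALYSIS:  ESTIMATOR = MLR;
           ALGORITHM = INTEGRATION;    ! three factors, three dimensions
MODEL:     y1 ON x1-x5;
           y2 ON x1-x6;
           y3 ON x1-x5;
           f12 BY y1@1 y2;  f12@1;  [f12@0];   ! y1 with y2
           f13 BY y1@1 y3;  f13@1;  [f13@0];   ! y1 with y3
           f23 BY y3@1 y2;  f23@1;  [f23@0];   ! y3 with y2
           f12 WITH f13@0 f23@0;               ! factors kept uncorrelated
           f13 WITH f23@0;
```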
 Nicholas Bishop posted on Monday, December 23, 2013 - 2:24 pm
Hello,
This is an interesting problem that may apply to a parallel process model I am running. If one growth process is continuous and the other process is specified with a Poisson distribution through the COUNT command, are the covariances between the latent intercepts of each process (would also apply to the slope) specified correctly by a simple WITH statement, or would I need to specify with the BY command as described above? Thank you for your advice.

Nick
 Bengt O. Muthen posted on Monday, December 23, 2013 - 4:59 pm
The WITH statement correlates the latent variables. You can use the BY approach to correlate observed outcomes beyond what the correlation among their latent variables can explain, that is, a residual correlation.
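
A rough sketch of how the two statements could coexist in a parallel process model, assuming four waves, linear time scores, and hypothetical names (c1-c4 for the count process, y1-y4 for the continuous one, f1 for the residual factor). Only the wave-1 residual covariance is shown, and fixing f1 to be uncorrelated with the growth factors is an assumption carried over from the earlier advice that such factors should be uncorrelated.

```
VARIABLE:  NAMES = c1-c4 y1-y4;        ! hypothetical: four waves per process
           COUNT = c1-c4;              ! Poisson growth process
ANALYSIS:  ESTIMATOR = MLR;
           ALGORITHM = INTEGRATION;
MODEL:     ic sc | c1@0 c2@1 c3@2 c4@3;     ! count process
           iy sy | y1@0 y2@1 y3@2 y4@3;     ! continuous process
           ic WITH iy;                      ! covariances among growth factors
           sc WITH sy;
           f1 BY c1@1 y1;  f1@1;  [f1@0];   ! optional wave-1 residual covariance
           f1 WITH ic@0 sc@0 iy@0 sy@0;     ! f1 uncorrelated with growth factors
```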
 Nicholas Bishop posted on Monday, December 23, 2013 - 8:01 pm
OK that helped, thank you.
 Allecia Reid posted on Tuesday, June 27, 2017 - 8:56 am
Hello, I would like to clarify the points above. I am running a multiple wave autoregressive model with a count variable as a DV. There are five other variables that need to be correlated with the count variable. The advice above indicates that each of the other variables should be specified as a separate, two indicator factor with the count dv (f1 by c@1 y2; f2 by c@1 y3;). With 274 participants, this model won't converge or is reaching saddle points.

My question is: would a single latent factor with all of the variables as indicators (f by c@1 Y2 Y3 Y4 Y5 Y6) be appropriate for capturing the residual correlations among all of the variables, both with the count variable and with each other? I do not need to be able to see the values of each correlation separately.
 Bengt O. Muthen posted on Tuesday, June 27, 2017 - 5:43 pm
One factor would be an approximation where the different loadings would have to pick up the different-sized correlations.

Note, however, that it isn't clear how an autoregressive model with counts should be defined. Each DV can follow a count regression but the IV is treated as a regular continuous variable. That becomes a strange mix of regression equations which doesn't seem right. And, there is no underlying latent response variable concept for count variables that can resolve it as far as I know (which means there is a chance there might be). One way around this is to treat the count variable as an ordinal variable - although not perfect, it might be a practical way out. In that case you can use WLSMV where you have no problems with these correlations. Or use Bayes with ordinal which handles missing data better than WLSMV. Or ML, but then you have the correlation problem.
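
A sketch of the ordinal workaround, assuming three waves and hypothetical names (c1-c3 for the count treated as ordinal, y1-y3 for one of the other variables). The cross-lagged paths are illustrative only, and a count with many distinct values may need to be collapsed into fewer categories first, which should be checked against the Mplus documentation.

```
VARIABLE:  NAMES = c1-c3 y1-y3;        ! hypothetical: three waves
           CATEGORICAL = c1-c3;        ! count treated as ordinal
ANALYSIS:  ESTIMATOR = WLSMV;          ! or ESTIMATOR = BAYES;
MODEL:     c2 ON c1 y1;
           c3 ON c2 y2;
           y2 ON y1 c1;
           y3 ON y2 c2;
           c1 WITH y1;                 ! correlations specified directly,
           c2 WITH y2;                 ! no extra factors needed
           c3 WITH y3;
```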
 Allecia Reid posted on Wednesday, June 28, 2017 - 5:17 am
Thank you Bengt. Much appreciated.