Mplus Discussion >> CFA with zero-inflated neg. binomial paths

Jon Elhai posted on Friday, November 30, 2012 - 6:13 pm

Linda,
Do you have sample input syntax for a CFA measurement model, whereby the observed variables are count variables, with zero-inflated negative binomial regression paths as factor loadings? I could not find zero-inflated syntax examples in the Mplus Manual.

Bengt O. Muthen posted on Saturday, December 01, 2012 - 9:29 am

Use the regression example ex3.8 as guide.

Jon Elhai posted on Saturday, December 01, 2012 - 11:25 am

Bengt. Would I simply substitute "BY" for the "ON" statements in example 3.8? Or is there something else needed to transfer the syntax in 3.8 to a CFA model?

Bengt O. Muthen posted on Saturday, December 01, 2012 - 2:16 pm

I think using BY is all you have to do.

Jon Elhai posted on Tuesday, December 04, 2012 - 7:59 am

Bengt, you mentioned that taking example 3.8 and substituting BY for the ON statements would give me negative binomial or Poisson (or zero-inflated) regression paths within a CFA. In 3.8, however, the variable preceding the word "ON" is the count dependent variable. In a "BY" statement for a CFA, the variable preceding the word "BY" is not the count variable but the latent factor; the variables after the "BY" are the count variables. So how would I turn 3.8 into a CFA with something like negative binomial factor loadings estimated?

Bengt O. Muthen posted on Tuesday, December 04, 2012 - 1:52 pm

f BY y;

is the same as

y ON f;

So you just specify u1-u10, say, as negative binomial count variables:

COUNT = u1-u10(nb);

and then say

f BY u1-u10;
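
Put together as a complete input, the two statements above might sit in a file like the following sketch. The data file name, the MLR estimator, and the ALGORITHM = INTEGRATION setting are assumptions for illustration and are not given in the reply; only the COUNT and BY statements come from it.

```mplus
TITLE:    CFA with negative binomial count indicators (sketch)
DATA:     FILE = counts.dat;          ! hypothetical file name
VARIABLE: NAMES = u1-u10;
          COUNT = u1-u10 (nb);        ! (nb) = negative binomial
ANALYSIS: ESTIMATOR = MLR;            ! assumed; not stated in the reply
          ALGORITHM = INTEGRATION;    ! numerical integration over f
MODEL:    f BY u1-u10;                ! each count is regressed on f
```

By default the first loading in a BY list is fixed at 1 to set the metric of the factor, so the negative binomial "loadings" are estimated for u2-u10.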

Tom Booth posted on Wednesday, June 12, 2013 - 4:13 am

Dear Linda/Bengt,

I am fitting a CFA model with 5 indicators. 3 are ordered categorical and 2 are count variables with a high proportion of zeros. The model is fit with MLR and numerical integration with logit link.

I have a number of questions about this model:

1) Is it reasonable to fit zero-inflated parameters within CFA?

2) If the answer to (1) is yes, does the inflation parameter get included as a factor indicator? e.g.

f by u1 u1#1 u2 u2#1 ....

3) I am also not sure which combination of standardizations I would need to report in this model for the values to be comparable. For example, I assume the inflation params would be STDY as this is binary.

Having run a couple of variants of this model, I tend to receive estimates and p-values in the raw and STD solutions, but estimates of 1.00 and associated -999 values for the count variables in the STDYX results.

Any assistance would be warmly received. As always, apologies if this has been answered elsewhere but I have struggled to find it.

Thanks

Tom

Linda K. Muthen posted on Wednesday, June 12, 2013 - 1:11 pm

1. Yes.
2. You would have factors with count indicators and factors with inflation indicators. You would not use them in the same factors.
3. Standardization is not done with count variables.

Tom Booth posted on Wednesday, June 12, 2013 - 2:17 pm

Thanks Linda.

Can I ask why you would fit different factors for the inflation indicators? Is this because zero-inflation models assume that 2 processes underlie the pattern of responses in the count variables, and thus 2 latent factors are required?

Best

Tom

Bengt O. Muthen posted on Wednesday, June 12, 2013 - 3:27 pm

What influences the inflation probabilities may be different from what influences the number of counts among those in the non-zero class.
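
A minimal sketch of such a two-factor setup for the five-indicator model described above follows. The data file name, the variable names, and the choice of the zero-inflated negative binomial option (nbi) are assumptions; the thread does not say which zero-inflated distribution was used.

```mplus
TITLE:    CFA with separate count and inflation factors (sketch)
DATA:     FILE = items.dat;           ! hypothetical file name
VARIABLE: NAMES = u1-u5;
          CATEGORICAL = u1-u3;        ! ordered categorical indicators
          COUNT = u4 (nbi) u5 (nbi);  ! assumed: zero-inflated neg. binomial
ANALYSIS: ESTIMATOR = MLR;
          ALGORITHM = INTEGRATION;
          LINK = LOGIT;
MODEL:    f  BY u1-u5;                ! count parts load with the other items
          fi BY u4#1 u5#1;            ! inflation parts get their own factor
          f WITH fi;                  ! the two factors are allowed to correlate
```

Here u4#1 and u5#1 refer to the binary inflation part of each count variable, that is, the logit of being in the class that can only take the value zero. With only two inflation indicators, fi relies on its correlation with f for identification, so a model like this may need constraints in practice.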

Tom Booth posted on Wednesday, June 12, 2013 - 11:27 pm

Thanks both. In sum:

1) It's fine to model inflation params in CFA.
2) Model them on a separate factor to model different influences.
3) Report unstandardized values.

Sorry for what may be a further very simplistic question, but is there a good reference/reading for why counts are not standardized?

thanks

Linda K. Muthen posted on Thursday, June 13, 2013 - 8:32 am

In a model where a residual variance is not an estimated parameter, standardization with respect to y cannot be done. You can standardize with respect to x. I know of no article that addresses this.
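
One way to see the point, written in standard factor-model notation that is assumed here rather than taken from the thread: for a continuous indicator the model contains a residual variance, so the variance of y is a model quantity that a loading can be scaled by. For a count indicator the factor enters the log of the mean and no residual variance parameter appears, which leaves only the factor variance to standardize with.

```latex
% Continuous indicator: the residual variance \theta_j is an estimated parameter
y_j = \nu_j + \lambda_j f + \varepsilon_j, \qquad
\operatorname{Var}(y_j) = \lambda_j^2 \psi + \theta_j, \qquad
\lambda_j^{\mathrm{StdYX}} = \frac{\lambda_j \sqrt{\psi}}{\sqrt{\operatorname{Var}(y_j)}}

% Count indicator (Poisson or negative binomial): no \varepsilon_j and no \theta_j
\log \mathrm{E}(u_j \mid f) = \nu_j + \lambda_j f, \qquad
\lambda_j^{\mathrm{Std}} = \lambda_j \sqrt{\psi}
```

Here \psi denotes the variance of the factor f, so the second line standardizes with respect to the predictor only.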

Rob Dvorak posted on Monday, April 21, 2014 - 8:35 am

Hi there,

I'm running some CFA models using behavioral observation data where the variables are counts (the number of X type of utterance), but unlike most count variables, their distribution does not really approach a Poisson or Negative Binomial distribution because it's extremely skewed (e.g., the median may be 2 or 3, but valid cases have counts over 50). In fact, when I run count models in Mplus, it gives a warning that counts exceed 50 and perhaps a continuous model would be better. My sense is that there is no ideal model for data distributed this way (count data that are extremely positively skewed), but I figured I would ask if you have any recommendations.

Linda K. Muthen posted on Tuesday, April 22, 2014 - 10:05 am

Is there a substantive reason for the counts over 50? Do they represent a different subpopulation?

Olivia Hamilton posted on Wednesday, February 26, 2020 - 9:54 am

Dear Linda and Bengt,

I am fitting a CFA model with four indicators: two are continuous variables and two are zero-inflated count variables. The estimator is MLR. I understand from the posts above that it is reasonable to fit zero-inflated parameters within CFA, but that I should model the count and inflation indicators in separate factors. I'm unsure of how to model this. Should I model one factor with all four indicators and a separate one for the inflation indicators? If so, should these factors then load on to an overall factor as below?

COUNT ARE u3 (i) u4 (i);

USEVARIABLES u1 u2 u3 u4;

MODEL:
f1 BY u1 u2 u3 u4;

f2 by u3#1 u4#1;

f3 BY f1 f2;

I would appreciate any advice you might have.

Many thanks.

Bengt O. Muthen posted on Wednesday, February 26, 2020 - 4:37 pm

Q1: That's fine.

Q2: No - you need at least 3 first-order factors to identify a second-order factor. It is sufficient that f1 and f2 correlate.
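
As a sketch, the posted model with the second-order factor dropped and the two first-order factors left to correlate could look like this. The data file name and the ALGORITHM = INTEGRATION line are assumptions; the COUNT and BY statements follow the post above.

```mplus
TITLE:    Two correlated factors instead of a second-order factor (sketch)
DATA:     FILE = mydata.dat;          ! hypothetical file name
VARIABLE: NAMES = u1-u4;
          USEVARIABLES = u1-u4;
          COUNT = u3 (i) u4 (i);      ! (i) = zero-inflated Poisson
ANALYSIS: ESTIMATOR = MLR;
          ALGORITHM = INTEGRATION;    ! assumed; needed for count indicators
MODEL:    f1 BY u1 u2 u3 u4;          ! continuous items plus count parts
          f2 BY u3#1 u4#1;            ! inflation parts
          f1 WITH f2;                 ! correlation replaces f3 BY f1 f2
```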

Olivia Hamilton posted on Tuesday, March 03, 2020 - 3:28 am

Dear Bengt,

Thank you very much for your reply. I have one further question.

I understand that I should report unstandardized factor loadings for the binary variables in this CFA. Should I also report unstandardized coefficients when using this latent variable as a predictor in a regression model?

Many thanks in advance.

Bengt O. Muthen posted on Tuesday, March 03, 2020 - 3:21 pm

Use STD for the loading; this gives the loading for a factor variance of 1. I would use standardized coefficients for the factor as a predictor. The only issue is that you don't want to standardize with respect to Y when Y is a count DV.
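
A hedged sketch of how this might be requested follows. The outcome y (taken to be continuous), the data file name, and the analysis settings are hypothetical; only the factor structure comes from the posts above.

```mplus
TITLE:    Factor with count indicators as a predictor (sketch)
DATA:     FILE = mydata.dat;          ! hypothetical file name
VARIABLE: NAMES = u1-u4 y;            ! y: hypothetical continuous outcome
          COUNT = u3 (i) u4 (i);
ANALYSIS: ESTIMATOR = MLR;
          ALGORITHM = INTEGRATION;
MODEL:    f1 BY u1 u2 u3 u4;
          f2 BY u3#1 u4#1;
          y ON f1;                    ! factor as predictor
OUTPUT:   STANDARDIZED;               ! requests STDYX, STDY and STD results
```

Under this setup the STD column would be the one to read for the count loadings, while the standardized slope of y on f1 can be reported because y is not a count variable.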