Count Data Within Path Models
 Joerg Luedicke posted on Thursday, September 15, 2005 - 1:38 am
Hi all,

Could somebody tell me whether Mplus can combine Poisson-distributed count variables (independent as well as dependent) and normally distributed variables in one path model with observed variables? Many thanks in advance, Joerg.
 Linda K. Muthen posted on Thursday, September 15, 2005 - 8:15 am
In path analysis, observed outcome variables can be continuous, censored, binary, ordered categorical (ordinal), counts, or combinations of these variable types. In addition, for non-mediating outcomes in path analysis, observed outcome variables can be unordered categorical (nominal). Observed independent variables can be binary or continuous.
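As a hedged illustration of the reply above, a minimal input sketch for a path model that mixes a continuous and a count outcome might look as follows. The file name and variable names (example.dat, y1, u1, x1, x2) are hypothetical and do not come from the thread.

```
TITLE:    Path model with one continuous and one count outcome (sketch)
DATA:     FILE = example.dat;      ! hypothetical file name
VARIABLE: NAMES = y1 u1 x1 x2;     ! hypothetical variable names
          COUNT = u1;              ! u1 is treated as a Poisson count outcome
MODEL:    y1 ON x1 x2;             ! linear regression for the continuous outcome
          u1 ON y1 x1 x2;          ! Poisson regression for the count outcome
```

Writing (i) after the variable name on the COUNT list, as in the input posted later in this thread, requests a zero-inflated Poisson model instead.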
 Michela Addis posted on Monday, May 11, 2009 - 7:03 am
I have run a path analysis on a longitudinal dataset that combines continuous and count data, with the counts modeled by Poisson regression and zero-inflated Poisson regression. After comparing a few alternative models, I have chosen the best one (the one with the lowest BIC). Now I would like to cross-validate this model on a second dataset, but I have two questions:
1) How can I evaluate the fit of the model in the cross-validation when Poisson regression is involved? I guess that I cannot use the traditional tests of model fit...
2) What is the syntax for the cross-validation? Are there specific aspects that I should take into account? The syntax for my model follows:
VARIABLE: NAMES ARE y1-y6 u1-u3 x1-x2;
USEVARIABLES ARE y1-y6 u1-u2 x1 x2;
COUNT IS u2 (i);

MODEL:
y1 on x1 x2;
y2 on x1 y1 u1;
y3 on x1 y1 y2 u2;
y4 on x1 x2;
y5 on y4 x2 u1;
y6 on y4 y5 x2 u2;
u1 on x1 x2;
u2 on x1 x2;
u2#1 on x1 x2;
y1 y2 y3 (3);
y4 y5 y6 (4);
y6 with y3@0;
y6 with y1@0;
y6 with y2@0;
y1 with y3@0;
y2 with y1@0;
y2 with y3@0;

Thank you very much for your time and support.
 Linda K. Muthen posted on Tuesday, May 12, 2009 - 9:02 am
I'm sure there is a cross-validation literature out there. I am not familiar with it.

With count variables, chi-square and related fit statistics are not available. Nested models can be compared using -2 times the loglikelihood difference which is distributed as chi-square.
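The nested-model comparison described above can be written out as follows. The symbols are introduced here for illustration: L_0 and L_1 are the loglikelihood values of the more restricted and the less restricted model, and p_0 and p_1 are their numbers of free parameters. As an additional point not stated in the reply, count outcomes are typically estimated with MLR in Mplus, and for MLR the Mplus documentation describes a scaled difference that divides by a correction factor c_d built from the scaling correction factors c_0 and c_1 printed in the two outputs.

```latex
\begin{aligned}
T &= -2\,(\log L_0 - \log L_1) \;\sim\; \chi^2_{\,p_1 - p_0}
  && \text{(ML)} \\
c_d &= \frac{p_0\,c_0 - p_1\,c_1}{p_0 - p_1}, \qquad
T_{\text{scaled}} = \frac{-2\,(\log L_0 - \log L_1)}{c_d}
  && \text{(MLR)}
\end{aligned}
```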

As for cross-validation, I would look at the pattern of significance across the two data sets. You could also do a multiple group analysis.
 Michela Addis posted on Sunday, May 24, 2009 - 1:00 am
Dear Linda,
Thank you very much.
Best, michela
 Tamika Zapolski posted on Wednesday, November 02, 2011 - 2:08 pm
We are running an SEM in which we want to use zero-inflated Poisson regression to predict each of 4 count criterion variables. We have checked each count variable to ensure that it has only integer values. Each time we try to run the analysis, we get an error message saying "There is at least one observation in the data set where a count variable has negative or non-integer values." None of the 4 variables shows anything but 0, 1, 2, 3, 4, or 5. We cannot fix it and need help. Thank you.
 Linda K. Muthen posted on Wednesday, November 02, 2011 - 3:43 pm
It sounds like you have blanks in the data set. This is not allowed with free format data. If you can't figure it out, please send the input, data, output, and your license number to
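One common way to avoid blanks in a free-format file is to write an explicit numeric code for every missing value before exporting the data, and then declare that code in the input. A minimal sketch, assuming a hypothetical file mydata.dat, hypothetical variable names, and -999 as the missing-value code:

```
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = u1-u4 x1 x2;     ! hypothetical variable names
          COUNT = u1-u4 (i);       ! four zero-inflated Poisson outcomes
          MISSING = ALL (-999);    ! -999 is an assumed missing-value code
```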
 sahar shadi posted on Wednesday, August 22, 2012 - 11:21 pm
Dear all, I am constructing an SEM with 1 exogenous variable and 1 final endogenous variable. In between there are 5 key mediators (2 latent and 3 observed). My endogenous variable is a count. I used AMOS 18, but I understand that I should do this analysis in Mplus. Is my analysis completely wrong, or is it just better to do it in Mplus? I am a beginner. Please help me. Thank you very much.
 Linda K. Muthen posted on Thursday, August 23, 2012 - 8:31 am
You should not treat a count variable as a continuous variable. Doing so will give incorrect results.
 sahar shadi posted on Thursday, August 23, 2012 - 10:29 pm
Thanks for the answer.

I also ran this model in SMARTPLS.

Is it wrong in SMARTPLS as well?

Thank you in advance.
 Linda K. Muthen posted on Friday, August 24, 2012 - 6:28 am
It would be incorrect if you do not treat the count variable as a count variable. I don't know anything about SMARTPLS so I can't say.
 sahar shadi posted on Friday, August 24, 2012 - 9:51 am
Thank you very much Linda
 Camilla Overup posted on Tuesday, August 20, 2013 - 7:51 am

I am running a model in which I have a count predictor (IV) and continuous mediators and outcomes. Can I use COUNT = IV in Mplus when the count variable is the predictor (and not the DV)? Would it be appropriate to specify a predictor as a count variable?

Thank you so much
 Linda K. Muthen posted on Tuesday, August 20, 2013 - 8:09 am
The scale of predictors is not taken into account in regression. Only the scale of dependent variables matters. All predictors are treated as continuous variables.
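In input terms, this means a count predictor is simply left off the COUNT list. A minimal sketch with hypothetical variable names (iv for the count predictor, m for a continuous mediator, y for a continuous outcome):

```
VARIABLE: NAMES = iv m y;          ! hypothetical variable names
          ! no COUNT statement: iv is used only as a predictor
MODEL:    m ON iv;                 ! iv enters as an ordinary covariate
          y ON m iv;
```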
 Andy Daniel posted on Thursday, June 05, 2014 - 4:52 am
Two quick questions: Is there any problem with using a count variable as a mediator? If there is no problem, it should also be possible to build a simple Markov chain with several count variables measured at different time points, right?
 Bengt O. Muthen posted on Thursday, June 05, 2014 - 2:12 pm
A count mediator (M) presents a problem as I see it. In the regression with M as the dependent variable, we can specify M as count and do a Poisson regression. But what do we do with M in the regression

Y on M;

? ML estimation in Mplus would treat M as continuous which contradicts the first regression. There is no underlying latent response variable M* for counts so WLSMV and Bayes can't use that approach. I see it as an unresolved research area.
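To make the tension concrete, the two regressions can be written as below. X stands for a generic predictor of M, and the coefficient symbols are introduced here for illustration only. In the first line M is modeled as a Poisson count; in the second line the same observed M enters as an ordinary linear predictor.

```latex
\begin{aligned}
M \mid X &\sim \mathrm{Poisson}(\lambda), \qquad \log \lambda = \alpha_0 + \alpha_1 X \\
Y &= \beta_0 + \beta_1 M + \varepsilon
\end{aligned}
```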
 Daniel Forster posted on Monday, January 18, 2016 - 2:26 pm

I'm trying to perform what I believe is a simple analysis. I have a latent variable predicting a count variable and I want to examine group differences. From what I've seen, I have to use the KNOWNCLASS command, but I can't seem to figure out how to satisfy all of the requirements to get the analysis to run.

This is the model I *want* to run:

USEVARIABLES ARE g2choice ind1-ind3;
GROUPING = GROUP (1 = g1 2 = g2 3 = g3);
COUNT = outcome;

FAC by ind1-ind3;
outcome on FAC;

Could you clarify how I should fix my syntax? Also, with the corrected syntax, will I be able to constrain the ON path across groups to test for differences? I imagine that it won't be quite the same as using the MODEL subgroup commands.

Thank you!
 Bengt O. Muthen posted on Monday, January 18, 2016 - 2:36 pm
As an example of using Knownclass, see UG ex 8.8, where cg is the Knownclass group variable. Ignore the c variable. Then add

FAC BY ....
outcome on FAC@0;

outcome on FAC (p1);
outcome on FAC (p2);
outcome on FAC (p3);

Then you can do any tests you want with p1-p3, for instance test that they are the same using Wald testing in Model Test.
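Putting the pieces together, a fuller input might look like the sketch below. The MODEL statements are those from the reply; everything else is an assumption made for illustration: the file name, a grouping variable named group coded 1 to 3, cg as the latent class name (as in UG ex 8.8), the arrangement of the statements under %OVERALL% and %cg#1% to %cg#3% section labels, and ALGORITHM = INTEGRATION on the assumption that a latent predictor of a count outcome needs numerical integration.

```
DATA:     FILE = mydata.dat;                      ! hypothetical file name
VARIABLE: NAMES = group outcome ind1-ind3;        ! assumed variable order
          USEVARIABLES = outcome ind1-ind3;
          CLASSES = cg (3);
          KNOWNCLASS = cg (group = 1 group = 2 group = 3);
          COUNT = outcome;
ANALYSIS: TYPE = MIXTURE;
          ALGORITHM = INTEGRATION;                ! assumed to be needed here
MODEL:    %OVERALL%
          FAC BY ind1-ind3;
          outcome ON FAC@0;
          %cg#1%
          outcome ON FAC (p1);
          %cg#2%
          outcome ON FAC (p2);
          %cg#3%
          outcome ON FAC (p3);
MODEL TEST:
          0 = p1 - p2;                            ! together these two lines give a
          0 = p1 - p3;                            ! joint Wald test of equal slopes
```

Testing only one of the two MODEL TEST lines would instead compare a single pair of groups.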
 Daniel Forster posted on Monday, January 18, 2016 - 3:25 pm
Thank you so much! That was very helpful. Also, to clarify for anyone else who may come across this, my previous example had a typo; g2choice should have been 'outcome' in the usevariables command.

My model converged and I just want to be sure I am doing this correctly. Could you verify if I used the CLASS and KNOWNCLASS commands correctly?

USEVARIABLES ARE outcome ind1-ind3;
CLASSES = group(3);
KNOWNCLASS = group (group = 1 group = 2 group = 3);
COUNT = outcome;

FAC by ind1-ind3;
outcome on FAC@0;
outcome on FAC(p1);
outcome on FAC(p2);
outcome on FAC(p3);


I also have a follow-up question. I want to know which group has the strongest association between FAC and OUTCOME. Knowing they are different is obviously the first step. Will that look like...


and will I conclude they are different if the Wald test is significant?

Finally, to test which group has the strongest association, I would typically look at something like R-SQUARE. Is there an equivalent for count variables?

Thank you again for all your help!
 Bengt O. Muthen posted on Monday, January 18, 2016 - 5:34 pm
Group should be on the NAMES = list. The latent class name should not be the same as this group variable name.

Mplus does not test for strongest effect, but equality is tested like you mention, although as

0 = p1 - p2;

For counts there is no meaningful R2 of the usual kind.

You want to ask these more syntax-related questions on Support.
 Christoph Weiss posted on Wednesday, August 24, 2016 - 1:15 am
Hi all,

I want to run an SEM with two dependent count variables that are zero-inflated and at least five independent continuous variables, so I use zero-inflated Poisson regression as in example 3.8.
If I use more than two independent variables for one dependent count variable, the error message in the output is:


My Questions are:

1.) If possible, how can I transform zero-inflated dependent count variables into continuous variables (under the Poisson assumption)?

2.) Do I need a PC with more power?

3.) In another post about the same error I read that it is possible to use a two-part model. But that was in the case of an LGM, so I am not sure whether this solution also works in my case.

Thank you and best regards,

 Linda K. Muthen posted on Wednesday, August 24, 2016 - 9:00 am
Please send the output, the data set, and your license number to
 Sarah Arpin posted on Thursday, February 22, 2018 - 10:31 am

I am having the same issues as others have posted with the following error message, when trying to model a DV as count:

There is at least one observation in the data set where a count variable
has negative or non-integer value. Please check your data and format statement.

There are no negative or non-integer values in my DVs, and there are no blanks in the data. Do you have any suggestions for how to proceed? Thank you.
 Bengt O. Muthen posted on Thursday, February 22, 2018 - 3:26 pm
Sounds like there is something off in the input related to data reading. One way to check that your input is correct is to use the Savedata command to see that the analysis variables contain what you expect.
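A minimal sketch of that check: add a SAVEDATA command to the input, run it, and inspect the saved file to confirm that each count variable holds the integers you expect. The file name check.dat is hypothetical.

```
SAVEDATA: FILE = check.dat;        ! hypothetical name; saves the analysis variables as Mplus read them
```

The end of the output lists the order and format of the saved variables, which helps when matching columns in the saved file to variable names.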
 Sarah Arpin posted on Friday, February 23, 2018 - 4:18 pm
Thank you for your fast response and your suggestion. I checked the input file using the Savedata command and all looks fine. I also opened the file within the Mplus Editor, and deleted the strange character at the beginning of the file, resaved, and still received the same error message. There are no blanks in the file. Do you have any other recommendations?

Thank you,

 Bengt O. Muthen posted on Friday, February 23, 2018 - 4:40 pm
Then you need to send your output and data to Support along with your license number.