Mplus Discussion >> SEM for count models?

SEM for count models?

eric baumer posted on Wednesday, January 02, 2002 - 10:05 am

I have a situation in which I really want to estimate a fairly complex structural equation model, yet the key endogenous variable is probably best represented as a count variable with a Poisson distribution (i.e., a count model or negative binomial). Does Mplus handle this type of estimation in an SEM context? Thanks, Eric Baumer

Linda K. Muthen posted on Wednesday, January 02, 2002 - 4:21 pm

Mplus does not have a facility for representing count data using a Poisson distribution. Mplus would treat this data as ordered polytomous.

Patrick Malone posted on Tuesday, June 25, 2002 - 8:34 am

I'm also working with count data, where I have several count items (number of occasions on which the respondent has performed certain behaviors) which I would like to use in a factor model. Unfortunately, I have incomplete data.

Previously, I imputed the count data (using NORM -- yes, I know, it's already a distortion), categorized the counts into ordered polytomous variables, and ran a categorical data model in Mplus, combining across imputations.

I'd like to avoid the multiple imputation step, if possible, not least because I have a lot of individual items and the imputation takes quite a while. Would using the MLR estimator in 2.1 seem like a reasonable approach for these data?

Separately, following up Eric's question, is work being done on Poisson links for this kind of model? I have no idea what the complexities would be, but I'm used to y'all pulling rabbits out of hats :-).

Thanks,
Pat

bmuthen posted on Tuesday, June 25, 2002 - 2:33 pm

Yes, I think MLR or MLM can be seen as a rough approximation. Mplus is developing missing data facilities for more types of outcomes, and is also planning for Poisson outcomes.

Patrick Malone posted on Wednesday, June 26, 2002 - 8:52 am

Excellent! Thank you.

Charles Green posted on Monday, September 16, 2002 - 7:02 am

Dr. Muthen:

I am also working on a relatively complex growth curve model that is based on count data. Do you know of any resources that discuss the biases that might result from analyzing Poisson-distributed data in Mplus, which would treat them as ordered polytomous? Such a resource might save me some time writing simulations. I assume that this would entail the violation of the assumption that there is a continuous latent dimension underlying the categorical polytomous data.

Thank you.

bmuthen posted on Sunday, September 22, 2002 - 9:41 am

I am not aware of any writings on the analysis of Poisson count outcomes using methods for ordered polytomous outcomes. Personally, I would not worry too much because for the simple purpose of regression I don't think the ordered model makes important violations of the nature of count data, although I may be wrong. That is, the ordered polytomous model would certainly not estimate the same parameters as Poisson, but probably fit the data well and point to the same important predictors. Perhaps other Mplus Discussion readers have an opinion here. I don't think of the ordered polytomous model as requiring an assumption of a continuous underlying dimension (a "y*" latent response variable), but merely a model based on proportional odds (see Agresti's book), so that would not be an important consideration to me in this choice.

Tor Neilands posted on Friday, October 01, 2004 - 9:43 am

Greetings,

I am preparing to fit a path analysis model in which there are three sets of observed variables: (a) exogenous variables (either dummy-coded or continuous, approximately normally distributed); (b) mediators that are continuous and approximately normally distributed; and (c) a single outcome that is a count and which appears Poisson distributed (perhaps with zero-inflation).

We will be using Mplus 3.11 to analyze these data. Of interest are the indirect effects of the (a) variable set on (c). I notice that Mplus 3.11 does not support computation of indirect and total effects in models involving count outcomes. Can you tell me how to compute these by hand?

More generally, I'm also curious how much the calculations would change under different, but similar scenarios. For instance, how would the computations change if one of the mediating variables was binary or ordered categorical? What about if continuous latent variables are involved at the exogenous, mediating, or outcome stage of the model?

Thank you for any tips you can offer on this topic.

bmuthen posted on Friday, October 01, 2004 - 1:12 pm

With an ultimate count outcome, the indirect effect could pertain to the log rate that is modeled - so that this is the "y*" variable (in my terms) that we are used to when modeling categorical outcomes. The indirect effect would then simply be the usual product of regression coefficients and SEs computed using the Delta method (see Bollen's book). For count outcomes, Mplus uses ML and in the ML context a mediating variable that is categorical (binary or ordinal) is entering the prediction of the count outcome as an observed variable (not y*), i.e. a score that is treated as continuous. The same approach as above applies. And also for continuous latent variables.
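
As a minimal sketch of the hand calculation described above (Python, not Mplus output): the indirect effect on the log rate is the product of the two unstandardized coefficients, and a first-order Delta-method standard error follows from their standard errors. The function name and all numeric values are hypothetical, and the sampling covariance between the two estimates is assumed to be zero unless it is supplied.

```python
import math

def indirect_effect(a, se_a, b, se_b, cov_ab=0.0):
    """Product-of-coefficients indirect effect with a first-order Delta-method SE.

    a: coefficient of the mediator regressed on x
    b: coefficient of the count outcome's log rate regressed on the mediator
    cov_ab: sampling covariance of the two estimates (assumed 0 if unknown)
    """
    estimate = a * b
    # Var(a*b) is approximately b^2 Var(a) + a^2 Var(b) + 2ab Cov(a, b)
    variance = b**2 * se_a**2 + a**2 * se_b**2 + 2 * a * b * cov_ab
    return estimate, math.sqrt(variance)

# Hypothetical values, for illustration only
est, se = indirect_effect(a=0.50, se_a=0.10, b=0.30, se_b=0.05)
print(f"indirect effect on the log rate = {est:.3f}, SE = {se:.4f}, z = {est / se:.2f}")
```

If the estimated covariance between the two coefficients is available (for example from the asymptotic covariance matrix of the estimates), passing it as cov_ab gives the fuller version of the same approximation.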

Anonymous posted on Thursday, October 14, 2004 - 1:04 pm

Hi Linda,

Does Mplus version 3 handle mediating count variables?
thanks a lot.

bmuthen posted on Thursday, October 14, 2004 - 1:18 pm

Yes. This is done in the new ML estimation framework. Note that when used in the equation where it is an exogenous variable, as opposed to in the equation where it is endogenous, the count variable is treated as a continuous variable (an observed variable rather than the underlying y*-type log rate).

Anonymous posted on Friday, July 15, 2005 - 2:16 pm

I am glad to see that Mplus 3 can run a zero-inflated Poisson model using cross-sectional or longitudinal data. I have a few questions on this issue:

1) It looks like Mplus provides only loglikelihood values and information criteria for the zero-inflated Poisson model. Is there any way to tell whether a model fits the data?
2) Can I use the loglikelihood values Mplus provides to conduct an LR test comparing the standard Poisson model with the zero-inflated Poisson model?
3) Can I run a zero-inflated negative binomial model in Mplus?
4) Can I run other zero-modified (e.g., zero-deflated and zero-truncated) Poisson models in the current version of Mplus?

Thank you very much for your help

bmuthen posted on Friday, July 15, 2005 - 6:41 pm

1) No fit statistics are offered so one has to work with nested models. I believe you can request RESIDUAL in the Output command to check disagreement between observed and estimated.

2) Yes, these models are properly nested I believe.

3) No, Mplus does not yet have negative binomial modeling.

4) No, that is not implemented yet. Unless it is possible to do it via tricks like the one used in the Version 3 User's Guide ex 7.25 where the ZIP is done via explicit 2-class modeling.
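
To make the two-class idea concrete, here is a small sketch (Python, not Mplus) of the zero-inflated Poisson probability function written as a mixture of a class that can only produce zeros and a class that follows an ordinary Poisson. The function name and the parameter values are illustrative assumptions and are not taken from the User's Guide example.

```python
import math

def zip_pmf(k, rate, p_zero_class):
    """P(Y = k) under a zero-inflated Poisson seen as a 2-class mixture.

    p_zero_class: probability of the class that can only produce zeros
    rate: Poisson mean in the other class
    """
    poisson = math.exp(-rate) * rate**k / math.factorial(k)
    if k == 0:
        # Zeros come from both classes
        return p_zero_class + (1 - p_zero_class) * poisson
    # Positive counts can only come from the Poisson class
    return (1 - p_zero_class) * poisson

# Hypothetical parameter values
rate, p_zero_class = 2.0, 0.3
probs = [zip_pmf(k, rate, p_zero_class) for k in range(60)]
mean = sum(k * p for k, p in enumerate(probs))
print(f"P(Y=0) = {probs[0]:.4f}, mean = {mean:.4f}")
```

Setting p_zero_class to 0 recovers the ordinary Poisson, which is the sense in which the standard Poisson model is a special case of the ZIP.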

Anonymous posted on Monday, July 18, 2005 - 9:13 am

Thanks a lot!
Hope more zero-modified Poisson models will be integrated in future versions of Mplus. Great service!

bmuthen posted on Monday, July 18, 2005 - 5:51 pm

Do you have any good writings to recommend with applications of zero-deflated and/or zero-truncated Poisson?

artem prokhorov posted on Wednesday, April 19, 2006 - 11:54 pm

Does Mplus support grouped zero-inflated Poisson indicators, i.e. indicators of the form "1 through 3", "4 through 8", etc.? Thanks

Linda K. Muthen posted on Thursday, April 20, 2006 - 9:08 am

No, Mplus does not support this model.

Daniel E Bontempo posted on Wednesday, July 26, 2006 - 8:51 am

Hello -

I was studying the example 7.25 Linda referenced above.

Is there a reference (or discussion of how the output of the two alternative approaches are interpreted)?

Did Bengt publish a paper where he used this model on alcohol or substance use data?

Thanks

Daniel E Bontempo posted on Wednesday, July 26, 2006 - 9:05 am

Hi again -

I am reading Bengt's posting above on September 22, 2002 where he wrote "I am not aware of any writings on the analysis of Poisson count outcomes using methods for ordered polytomous outcomes."

My question is about the other way around: the analysis of discrete (i.e., 6 response options) data representing unequal count intervals (e.g., 0, 1-2, 3-5, 6-9, 10-19, 20-39, 40+) using continuous Poisson models.

The choice would appear to be between an ordered polytomous outcome (i.e., 0,1,2,3,4,5,6) and a recoded somewhat discrete count-approximation using interval midpoints (or endpoints).

A colleague told me Bengt had written a paper on adolescent alcohol use where a Poisson model was used, although there was some discussion as to whether response options or interval mid-points should have been used. I have been unable to clearly identify this paper, or the subsequent critique. Does this ring any bell?

Also, what about the more general question as the best approach for modeling data obtained from a survey item about number of drinks in the past month ... etc?

The zero-inflated Poisson is attractive because many of these high-school students did not drink and reported zero drinks in the past month. But I am worried that the data is not continuous.

Our respondents are nested within school, so with either the discrete or the continuous model, I would want to appropriately handle the nesting.

Any advice, references to your own work, pointers to Mplus examples, or other references would be greatly appreciated.

Thanks

Bengt O. Muthen posted on Wednesday, July 26, 2006 - 4:11 pm

Answer to your first message of today: No, I have not written about this. One relevant, related ref. is the Roeder et al. (1999) JASA article on ZIP mixture modeling.

Bengt O. Muthen posted on Wednesday, July 26, 2006 - 4:19 pm

My 1999 Biometrics paper with Shedden worked with frequency of heavy drinking as the outcome. The outcome was a categorized count outcome such as 0, 1-2 times, 3-5 times etc. One could argue that this should be handled via some generalized Poisson model - earlier in this thread a "grouped zero-inflated Poisson" model was mentioned, but Mplus does not support that yet. I tend to want to treat such data as ordered categorical. This then takes care of the strong floor effect. Another approach is 2-part modeling which Mplus also supports and that modeling has the advantage of letting covariates have different impact on the probability of engaging in an activity at all vs how much.

Mplus can handle 2-level data in either case.
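
A rough sketch of the data preparation behind two-part modeling (Python, not Mplus), assuming the usual construction: a binary variable for whether the activity occurred at all, and a continuous variable for the amount among those who engaged, set to missing otherwise. The log transformation, the cutpoint of zero, and the variable names are assumptions made for illustration.

```python
import math

def two_part_split(values, cutpoint=0.0):
    """Split a semicontinuous variable into a binary part and a continuous part.

    Returns (binary, continuous): binary is 1 when value > cutpoint, else 0;
    continuous is log(value) when value > cutpoint, else None (missing).
    Input values that are None stay missing in both parts.
    """
    binary, continuous = [], []
    for v in values:
        if v is None:
            binary.append(None)
            continuous.append(None)
        elif v > cutpoint:
            binary.append(1)
            continuous.append(math.log(v))
        else:
            binary.append(0)
            continuous.append(None)
    return binary, continuous

# Hypothetical heavy-drinking frequencies for five respondents
drinks = [0, 3, 0, 12, None]
u, y = two_part_split(drinks)
print(u)
print(y)
```

Because the two parts are separate dependent variables, a covariate can be given one coefficient for the binary part and a different one for the continuous part, which is the advantage mentioned above.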

Daniel E Bontempo posted on Friday, July 28, 2006 - 12:01 pm

Thanks Bengt -

When you refer to two-part modeling, do you mean Example 7.25, which uses two classes? Or do you mean a manual split of the data into drinkers and non-drinkers, where I predict frequency of drinking using only the drinkers?

Or perhaps I misunderstand and you mean something else?

Bengt O. Muthen posted on Friday, July 28, 2006 - 12:38 pm

No, I refer to example 6.16.

marissa hansen posted on Tuesday, January 02, 2007 - 4:47 pm

I have a mediating and outcome variable that are both count variables. The outcome variable is a categorical count variable. The mediating variable is continuous. Below is the syntax that I used to try to run my model, but for some reason the operation cannot be performed. Can you let me know what part of the syntax needs to be adjusted? This is a straightforward regression model with only one LV.

Thanks.

TITLE:1-2-07 Instr mediation model2
DATA: FILE IS "C:\1-2-07.dat";
VARIANCES=CHECK;
VARIABLE: NAMES ARE id a5 b2 d1 h5 h9 h17 needdepr recgende
forsrv lngsocco lngmedia contlang socethn relginf2;
MISSING are all (999);
USEVARIABLES ARE a5 b2 d1 h5 h9 h17 needdepr recgende
forsrv lngsocco lngmedia contlang socethn relginf2;
COUNT ARE h5 h9 h17 forsrv;
CATEGORICAL ARE forsrv;

ANALYSIS: TYPE=GENERAL;

MODEL:
cultural_integration by lngsocco lngmedia contlang socethn b2;
h9 on a5 cultural_integration recgende relginf2;
forsrv on a5 cultural_integration recgende relginf2 h9;
d1 with a5 cultural_integration recgende relginf2 forsrv needdepr;
needdepr with a5 cultural_integration recgende relginf2 forsrv;
OUTPUT: SAMPSTAT RESIDUAL STANDARDIZED;

Linda K. Muthen posted on Monday, January 08, 2007 - 9:59 am

I don't think you can have WITH statements that include count variables. If this is not the problem, please send your input, data, output, and license number to support@statmodel.com.

Bryan T. Karazsia posted on Wednesday, April 23, 2008 - 9:02 am

Drs. Muthen & Muthen,

On an earlier post, you mentioned that "Mplus is developing missing data facilities for more types of outcomes". (posted on Tuesday, June 25, 2002 - 2:33 pm). I am developing some models with a count outcome. These models will be analyzed with longitudinal data, so missing data is an issue. Are there capabilities in the latest version of Mplus for handling missing data on count variables?

Thanks so much for your time!

Bryan

Linda K. Muthen posted on Wednesday, April 23, 2008 - 10:13 am

Yes. Mplus provides maximum likelihood estimation under MCAR (missing completely at random) and MAR (missing at random; Little & Rubin, 2002) for continuous, censored, binary, ordered categorical (ordinal), unordered categorical (nominal), counts, or combinations of these variable types.

Bryan T. Karazsia posted on Wednesday, April 23, 2008 - 11:14 am

Many Thanks!

Lisa Fucito posted on Tuesday, June 24, 2008 - 7:08 am

I would like to model indirect effects using a zero inflated count outcome variable. The IV and mediator are both continuous. I receive an error message when I try to model indirect effects.

Lisa Fucito posted on Tuesday, June 24, 2008 - 7:15 am

Model Constraint with Count Variable:

I have a zero-inflated count outcome variable, 2 IVs and 3 mediators. My dataset has missing data. I am trying to compare the strength of different parameters using bootstrapping to derive confidence intervals. None of the estimators that are available for bootstrapping, however, work with count data or missing data. Can you please advise me of my options?

Linda K. Muthen posted on Tuesday, June 24, 2008 - 9:27 am

MODEL INDIRECT is available only for continuous, binary, and ordered categorical dependent variables.

Bootstrapping is not available when numerical integration is required.

Lindsay Jorgensen posted on Tuesday, October 07, 2008 - 7:09 am

I had a brief question. I created a measurement model using both categorical and continuous variables. My final model has correlated errors between the indicators. I would like to create a full model using a count outcome and my measurement model as the exposure. However, I cannot seem to do this in Mplus. Is there any way in the current version of Mplus to include WITH statements with a count outcome?

Linda K. Muthen posted on Tuesday, October 07, 2008 - 8:00 am

You would need to put a factor behind the two indicators as shown in Example 7.16. Note that each residual covariance requires one dimension of numerical integration.

Lindsay Jorgensen posted on Wednesday, October 08, 2008 - 2:38 pm

Thank you!

Lindsay Jorgensen posted on Thursday, October 09, 2008 - 2:39 pm

Dr. Muthen,

I am having problems with the numerical integration in my model per your note above. I have 6 correlated error covariances in my original measurement model.

c by x1 x2 x3 x4 x5 x6;
f1 by x2 x3;
.
.
.
f6 by x5 x6;
u1 ON c;

Where the x's are my indicators, c is my latent variable of interest and u1 is my count outcome.

What would you recommend I do to facilitate the integration, if anything? Change the INTEGRATION= in the Analysis step? If so, to what? Thank you!

Bengt O. Muthen posted on Thursday, October 09, 2008 - 3:16 pm

Not enough information to say - please send your input, output, data, and license number to support@statmodel.com.

Mary Campa posted on Tuesday, April 21, 2009 - 9:21 am

Hello. I am trying to estimate a multiple mediator path model with a negative binomial distributed Y, one continuous M, one Poisson distributed M, and a three-level X.

The convention in my discipline is to present standardized coefficients; however, Mplus will not provide these with the count mediating variables. Is there any way to get these estimates, or a reference I can provide as to why these estimates are not valid?

Thank you.

Bengt O. Muthen posted on Tuesday, April 21, 2009 - 9:42 am

Regression with a count dependent variable does not involve a residual variance parameter. There is also not an underlying continuous DV for which a residual variance can be conceptualized like with logit or probit regression. That's why you don't see standardized count regression coefficients in the literature.

You can standardize with respect to the other variables.

Note that mediational models are perfectly valid in their raw, unstandardized form.

Mary Campa posted on Thursday, April 23, 2009 - 6:03 am

Thank you for your prompt response. Could you tell me what situation would create a significant raw effect and a very non-significant standardized effect (in both STDYX and STDY)? How should I interpret this?

Bengt O. Muthen posted on Thursday, April 23, 2009 - 1:36 pm

Such big differences are rare, so there is probably something peculiar about this example. Please send the output and license number to support@statmodel.com.

Rob Dvorak posted on Thursday, October 22, 2009 - 11:56 am

Hi Drs. Muthen,

I am running a model in which I have two latent variables and their interaction predicting two zero inflated negative binomial distribution outcomes, with one of the zinb outcomes predicting the final zinb outcome as well. Mplus ran the model and terminated normally, however, I just want to be sure that what I am getting is valid. I am wondering, because I am only getting a dispersion test for one of the zinbs (the one that serves as a mediator).

Bengt O. Muthen posted on Thursday, October 22, 2009 - 12:41 pm

To be able to answer that you need to send your input, output, data, and license number to support@statmodel.com.

Frank Snyder posted on Tuesday, March 16, 2010 - 5:48 pm

I have a SEM with a latent variable mediating the effect of an intervention on 3 outcome variables (2 count outcomes and 1 binary outcome) and want to be sure I'm calculating and interpreting the effects correctly. For the count outcomes, I used an approach recommended in a previous post and calculated the indirect effects by simply calculating the product of the unstandardized regression coefficient of the intervention on the latent mediator times the unstandardized regression coefficient of the latent mediator on the count outcome variable. Then, I calculated the SE using the delta method; this was also done for the binary outcome. Here are my questions:
1) Did I follow your recommended approach?
2) For the binary outcome, is it appropriate to calculate the % mediation by dividing the indirect effect B by the direct effect (without the mediator) B? For example, x-> m-> y B=-1.969 and x->y B=-0.697; -0.697/-1.969= .354, or 35.4% of the effect of the intervention was mediated by the mediator.
3) Could the approach used in question 2 be used to calculate the % mediation for the count outcomes?
4) Your previous post states that WITH statements can't be included with count variables, and I'm unclear why? Is there a reference available for not including the correlations?
5) Would it be useful to exponentiate the product terms (i.e., indirect effects) to get a fractional interpretation?
Thank you

Linda K. Muthen posted on Wednesday, March 17, 2010 - 9:48 am

1. It sounds like you did. Remember that the final dependent variable is the log rate.
2-3. This sounds questionable to me.
4. There is no model estimated variance for a count variable. See one of the Agresti books on categorical data analysis.
5. I think this would work. When you exponentiate a log rate, it becomes a rate.

Alicia Bunger posted on Friday, March 26, 2010 - 8:08 am

Hi - I'm new to path modeling and Mplus (so forgive me if this question is simple!), but I was wondering if there is a way to get fit statistics for path models that have an outcome variable with a negative binomial distribution?

Linda K. Muthen posted on Friday, March 26, 2010 - 9:37 am

Chi-square and related fit statistics are not available for these models because means, variances, and covariances are not sufficient statistics for model estimation. In these cases, people compare nested models using loglikelihood difference testing and also look at BIC.
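
A minimal sketch of the loglikelihood difference test and BIC mentioned above, written in Python with only the standard library. The loglikelihood values, parameter counts, and sample size are hypothetical. The sketch assumes plain ML loglikelihoods from two nested models; with the MLR estimator the raw difference would first need a scaling correction, which is omitted here.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum
        return math.exp(-y) * sum(y**k / math.factorial(k) for k in range(df // 2))
    # Odd df: complementary error function plus a finite sum
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(y**(k - 0.5) / math.gamma(k + 0.5)
                               for k in range(1, (df - 1) // 2 + 1))
    return tail

def lr_test(ll_restricted, k_restricted, ll_full, k_full):
    """Likelihood-ratio test of a restricted model nested in a fuller model."""
    stat = 2.0 * (ll_full - ll_restricted)
    df = k_full - k_restricted
    return stat, df, chi2_sf(stat, df)

def bic(loglik, n_params, n_obs):
    """BIC = -2 logL + (number of free parameters) * ln(sample size)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical loglikelihoods, parameter counts, and sample size
stat, df, p = lr_test(ll_restricted=-1520.4, k_restricted=10, ll_full=-1512.1, k_full=12)
print(f"LR statistic = {stat:.2f} on {df} df, p = {p:.5f}")
print(f"BIC restricted = {bic(-1520.4, 10, 500):.1f}, BIC full = {bic(-1512.1, 12, 500):.1f}")
```

A smaller BIC favors a model, and unlike the difference test it can also be compared across models that are not nested, provided they are fitted to the same data.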

Alicia Bunger posted on Friday, March 26, 2010 - 10:14 am

Thanks for your response - I appreciate it!

james rosenthal posted on Monday, May 03, 2010 - 1:32 pm

Dear Mplus:

I have a multi-wave sample of 5000 children involved with the child welfare system. I have measurements of behavior problems (approximately normally distributed) at three times, T2, T3, and T4. I have counts of the number of out-of-home placements experienced by the children at three time intervals: from T1 to T2, from T2 to T3, and from T3 to T4. The counts are highly skewed. For any given interval, about 85% of children experience zero placements, about 7-8% experience one placement only, and about 7-8% experience multiple placements. I am modeling the effects of: behavior on placement, placement on behavior, behavior on behavior, and placement on placement.

For the regressions on placement counts, I used a zero-inflated negative binomial model. This model worked reasonably well (after appropriate starting values were supplied) and provided sensible/interpretable results. Due to the skewness issue, a journal reviewer has recommended that the placement counts be modeled either as binary or ordered categorical variables. My question is: Is the placement count data so skewed that the zero-inflated negative binomial is not to be recommended and, thus, a binary or ordered categorical model would be preferred?

Just hoping to get an opinion on this. I realize there may not be an "answer."

Thanks in advance.

Jim

Rob Dvorak posted on Monday, May 03, 2010 - 4:39 pm

Hi Drs. Muthen,
I have a latent variable interaction predicting a zinb outcome. There is a significant path from the latent variable interaction to the count portion of the model. I computed the simple slopes a la Aiken & West (i.e., +/- 1 SD on one of the latent variables), and computed the SEs of these slopes using asymptotic covariances. I then exponentiated the simple slope coefficients, making them into incidence rate ratios. Is this an appropriate way to compute simple slopes for non-linear models? If not, do you know of a reference for computing these? Thanks in advance.

Linda K. Muthen posted on Tuesday, May 04, 2010 - 9:35 am

James: I would disagree with the reviewer. A count variable is skewed by definition and the count model is made to handle that.

Linda K. Muthen posted on Tuesday, May 04, 2010 - 9:36 am

Rob: I think what you are doing sounds correct. I don't know of a reference.
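
The calculation Rob describes can be sketched as follows (Python, not Mplus). All coefficient and (co)variance values are hypothetical, and the moderating latent variable is assumed to have a standard deviation of 1, so that +/- 1 SD corresponds to moderator values of +1 and -1.

```python
import math

def simple_slope_irr(b_x, b_xz, z_value, var_bx, var_bxz, cov_bx_bxz):
    """Simple slope of x at a chosen moderator value, on the log-rate scale.

    Returns (slope, se, rate_ratio), where se uses the asymptotic variances and
    covariance of the main-effect and interaction coefficients.
    """
    slope = b_x + b_xz * z_value
    variance = var_bx + z_value**2 * var_bxz + 2 * z_value * cov_bx_bxz
    return slope, math.sqrt(variance), math.exp(slope)

# Hypothetical estimates: main effect, interaction, and their asymptotic (co)variances
for z in (-1.0, 1.0):
    slope, se, irr = simple_slope_irr(b_x=0.20, b_xz=0.10, z_value=z,
                                      var_bx=0.0025, var_bxz=0.0016, cov_bx_bxz=0.0005)
    print(f"moderator = {z:+.0f} SD: slope = {slope:.3f}, SE = {se:.4f}, rate ratio = {irr:.3f}")
```

The standard error here belongs to the slope on the log-rate scale; an interval for the rate ratio can be obtained by exponentiating the endpoints of the interval for the slope.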

Scott C. Roesch posted on Friday, December 16, 2011 - 6:14 am

Using the output from a negative binomial model in Mplus, how can I calculate a rate ratio? Is there an option for this using the Mplus code?

Linda K. Muthen posted on Friday, December 16, 2011 - 9:18 am

You should exponentiate the coefficient. You can do this in MODEL CONSTRAINT and you will obtain a standard error for it.
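
For readers who want to check the arithmetic outside Mplus, a minimal Python sketch: the rate ratio is the exponentiated coefficient, and a first-order Delta-method standard error is the rate ratio times the coefficient's standard error. The coefficient and standard error below are hypothetical, and the choice of 1.96 for a 95% interval is an assumption of this sketch.

```python
import math

def rate_ratio(b, se_b, z_crit=1.96):
    """Rate ratio exp(b) with a Delta-method SE and a confidence interval.

    The interval is built on the log-rate scale and then exponentiated,
    so it stays positive and is not symmetric around the rate ratio.
    """
    rr = math.exp(b)
    se_rr = rr * se_b  # derivative of exp(b) with respect to b is exp(b)
    ci = (math.exp(b - z_crit * se_b), math.exp(b + z_crit * se_b))
    return rr, se_rr, ci

# Hypothetical negative binomial regression coefficient and its SE
rr, se_rr, (lower, upper) = rate_ratio(b=0.405, se_b=0.12)
print(f"rate ratio = {rr:.3f}, SE = {se_rr:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```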

Chris posted on Saturday, September 01, 2012 - 5:35 pm

Is it possible to use a zero inflated variable as a mediator in a SEM? If yes, is there any special treatment on the variable or in the way to interpret the results?

I usually use negative binomial regression when this variable is my DV as there is too much overdispersion for a Poisson regression.

In the SEM my DV is binary.

Thank you.

Linda K. Muthen posted on Sunday, September 02, 2012 - 10:32 am

Yes. In the y on m regression it is treated as a continuous variable. For issues related to the computation of indirect effects, see the following paper which is available on the website:

Muthén, B. (2011). Applications of causally defined direct and indirect effects in mediation analysis using SEM in Mplus.

Chris posted on Sunday, September 02, 2012 - 5:51 pm

Thank you Linda. Much appreciated.

Tom Booth posted on Thursday, August 01, 2013 - 12:27 am

Linda/Bengt,

I have a CFA model which forms part of a larger SEM which uses both count and categorical variables.

I have been asked by a reviewer to provide some information on how well the model fits. As we only have one model, AIC and BIC are not useful. Also, our sample size is moderate to large (700+) so I am not sure how useful the chi-square for the categorical and count portions of the model will be.

I was considering the following as options:

1 - fix all loadings in the CFA to nominally small values (.01), and compare AIC and BIC from a pseudo-model of no association to the CFA with free loadings.

2 - provide the average of the residual matrix.

Does this sound reasonable?

Tom

Linda K. Muthen posted on Thursday, August 01, 2013 - 10:46 am

You might consider looking at TECH10 and also estimating neighboring models.

Tom Booth posted on Thursday, August 01, 2013 - 11:36 am

Thanks Linda. My poor phrasing: by point (2) I was referring to a summary of TECH10.

Please excuse my ignorance, but what do you mean by neighbouring models?

Tom

Linda K. Muthen posted on Thursday, August 01, 2013 - 2:19 pm

A neighboring model would be a model which frees a key parameter that is fixed at zero in the original model.

Jacqueline Homel posted on Wednesday, September 11, 2013 - 1:39 pm

I have a path model with three outcomes measured over three time points. Two of the outcomes are continuous and one is a count, and I am using numerical integration. I need to test whether some parameters are significantly different (e.g. stronger) than others. I know that with count outcomes standardized estimates are not available, and I also read above that bootstrapping is not available when numerical integration is required. Would it be feasible to answer this question by comparing the fit of two models, one where the parameters to be compared are constrained to be equal, and another where they are free?

Linda K. Muthen posted on Wednesday, September 11, 2013 - 3:23 pm

You can use chi-square difference testing to test the difference between parameters with the same scale.

Jacqueline Homel posted on Wednesday, September 11, 2013 - 4:34 pm

Thank you very much - but I cannot use chi-square difference testing to test the difference between parameters that do not have the same scale? If not, is there any way to compare these parameters?

Linda K. Muthen posted on Wednesday, September 11, 2013 - 4:53 pm

No, this is not possible.

Yvonne LEE posted on Tuesday, April 22, 2014 - 8:54 am

As a novice to Mplus, I am conducting modeling on 'rape tendency' on a group of offenders and have made enquiries earlier.

I have 2 indicators for my rape tendency DV i.e. self-report number of rape and official record of rape count. The former indicator scored 0 on 64% of the cases and the latter scored 0 on 79%. Because of the very low frequency on higher count, I collapse these 2 indicators into 4 levels. For the self-report rape, I collapse into 0, 1, 2, >=3 and the same for the official rape count.

Question 1: Should I treat the 2 indicators as categorical or count data? The estimation model terminated normally when treating as count data but error message 'THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.' appears if treating as categorical data. Pls advise.

Question 2: Should I use two-part modeling ? I have tried and the estimation model terminated normally.

Linda K. Muthen posted on Wednesday, April 23, 2014 - 10:51 am

You should treat this as a categorical variable not a count variable.

I would not recommend two-part modeling.

Yvonne LEE posted on Wednesday, April 23, 2014 - 5:16 pm

Thanks for your advice. I have run analysis treating them as categorical variable but error message of 'non-positive definite covariance matrices' was shown, involving variable PAbuse. Pls advise how to go about.

MODEL:
f1 by EAbuse PAbuse SAbuse ENeglect CTS_PV;
f2 by Rape0123 SES0123;
f2 on f1;

One thing is that both categorical indicators have a U-shaped distribution. Does it matter? Major problem observed in TECH4 is correlation between F2 and SES0123 is 1.14. Also, Rsquare for SES0123 is undefined.

Linda K. Muthen posted on Wednesday, April 23, 2014 - 6:41 pm

Categorical data methodology can handle variables with both floor and ceiling effects.

Please send the output and your license number to support@statmodel.com.

Kyoko Shimamoto posted on Thursday, January 29, 2015 - 3:23 pm

Dear Linda,

I would like to seek your advice on a convergence problem. I have a count outcome, with three latent mediators, with survey data. I am unable to get the model to converge when I treat the mediators as latent variables. However, the model converges with summary measured variables (treating them as observed).

Could you take a look at my code? Please let me know anything that I could try.

Thank you very much.

Kyoko

count is v201;
Categorical are hlt purc visit gooutr neglr arguer refsr burnfr negsex negcon;
Cluster = v021;
Weight = weigh;
Stratification = v022;
Subpopulation = subpop_fa eq 1;
Analysis:
Type = complex ;
integration = montecarlo (75);
Parameterization=theta;
Model:
v511 ON v133 v012 work2 head urban house2 house3 house4 house5 polyfir polysec reage reeduc2;
f1 by hlt purc visit;
f1 ON v133 v511 v012 work2 head urban house2 house3 house4 house5 polyfir polysec reage reeduc2;
f2 by negsex negcon;
f2 ON v133 v511 v012 work2 head urban house2 house3 house4 house5 polyfir polysec reage reeduc2;
f3 by gooutr neglr arguer refsr burnfr;
f3 ON v133 v511 v012 work2 head urban house2 house3 house4 house5 polyfir polysec reage reeduc2;
v201 ON v133 v511 f1 f2 f3 v012 work2 head urban house2 house3 house4 house5 polyfir polysec access reage reeduc2;
f1 with f2; f2 with f3; f1 with f3;

Linda K. Muthen posted on Thursday, January 29, 2015 - 5:51 pm

Please send the output with the problem and your license number to support@statmodel.com.

Kim Kiely posted on Tuesday, March 17, 2015 - 8:52 pm

Hi Linda and Bengt,

Many years ago (Sept 22 2002), Bengt posted in this thread:

"I am not aware of any writings on the analysis of Poisson count outcomes using methods for ordered polytomous outcomes. Personally, I would not worry too much because for the simple purpose of regression I don't think the ordered model makes important violations of the nature of count data, although I may be wrong. That is, the ordered polytomous model would certainly not estimate the same parameters as Poisson, but probably fit the data well and point to the same important predictors. "

I was wondering if your thoughts on this remain the same?

I have the following outcome data, which strictly speaking are counts (and which I also expect are underdispersed, with mean = 1.17 and variance = 0.78):

Cat	n	%
0	4947	24.3
1	8447	41.6
2	5550	27.3
3	1251	6.1
4	112	0.5
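
The mean and variance quoted above can be checked directly from the frequency table. A short sketch in Python (not Mplus); the n - 1 denominator for the variance is an assumption about how the reported figure was computed, and at this sample size it makes no visible difference.

```python
# Frequency table from the post: count value -> number of respondents
freq = {0: 4947, 1: 8447, 2: 5550, 3: 1251, 4: 112}

n = sum(freq.values())
mean = sum(k * c for k, c in freq.items()) / n
# Sample variance computed from the grouped frequencies
variance = sum(c * (k - mean) ** 2 for k, c in freq.items()) / (n - 1)
# A Poisson variable has variance equal to its mean, so a ratio below 1
# points toward underdispersion
dispersion = variance / mean

print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.2f}, variance/mean = {dispersion:.2f}")
```

This reproduces the mean of 1.17 and variance of 0.78, giving a variance-to-mean ratio of about 0.67.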

I have analyzed as count (with NegBin and Poisson) and as ordinal in Stata, and am about to start replicating these models in Mplus v7. Generally my ordinal models have a relatively better fit - and given the frequencies it seems reasonable to treat as such. But I would be interested to get a second opinion.

Cheers

Bengt O. Muthen posted on Wednesday, March 18, 2015 - 12:28 pm

Yes, I think I'll stay with what I wrote 13 years ago. You could perhaps even use linear regression - counts get approx normal with a high rate (mean).
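
One way to see the remark about high rates, sketched in Python: for a Poisson distribution the skewness is 1/sqrt(rate) and the excess kurtosis is 1/rate, so both shrink toward the normal values of zero as the mean grows. The rates below are arbitrary illustrations, apart from 1.17, which is the mean reported in the question.

```python
import math

def poisson_shape(rate):
    """Skewness and excess kurtosis of a Poisson distribution with the given mean."""
    return 1 / math.sqrt(rate), 1 / rate

for rate in (1.17, 5, 20, 100):
    skew, excess_kurtosis = poisson_shape(rate)
    print(f"rate = {rate:>6}: skewness = {skew:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```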

rongqin.yu@psych.ox.ac.uk posted on Sunday, February 28, 2016 - 9:41 pm

I am building a cross-lagged panel model with one count variable and one continuous variable across four longitudinal waves.
How do I specify the within-wave correlations (T1, T234) between these two variables? Are there any Mplus examples?

rongqin.yu@psych.ox.ac.uk posted on Sunday, February 28, 2016 - 10:53 pm

Following up on my previous question: I will need to use a ZIP model for my data. The relationship between the continuous variable and the zero-inflated part has to be estimated as well. After creating a latent variable for the count variable, the correlation between the count part of the variable and the continuous variable can be estimated, but the model does not recognize the zero-inflated part of the count variable. Any suggestions?

Bengt O. Muthen posted on Monday, February 29, 2016 - 5:55 pm

I assume you specified a factor as measured by the continuous outcome and the count outcome where the factor variance is fixed at 1 and one loading is free to capture the residual covariance. Seems like you can do the same using the inflation part as an indicator of a second factor.
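
One way to see why a single free loading captures the residual covariance is sketched below in Python. It assumes the continuous outcome is y = f + e with its loading fixed at 1, the count outcome u is Poisson with log rate b0 + lam * f, and f is standard normal. The names `b0` and `lam`, and the choice of which loading is fixed, are assumptions for illustration and are not taken from the post. Under these assumptions Cov(y, u) = lam * exp(b0 + lam^2 / 2), which is zero only when lam is zero. The code checks the closed form against numerical integration.

```python
import math


def closed_form_cov(b0, lam):
    """Cov(y, u) = lam * exp(b0 + lam^2 / 2) when f ~ N(0, 1)."""
    return lam * math.exp(b0 + lam ** 2 / 2.0)


def numeric_cov(b0, lam, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoid rule for E[f * exp(b0 + lam * f)] under the standard normal density."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        f = lo + i * h
        density = math.exp(-0.5 * f * f) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f * math.exp(b0 + lam * f) * density
    return total * h


# Hypothetical values: intercept 0.2 on the log-rate scale, free loading 0.5
print(closed_form_cov(0.2, 0.5), numeric_cov(0.2, 0.5))
```

The same reasoning would apply to a second factor measured by the inflation part, with the caveat that the inflation part is on a logit rather than a log scale, so the closed form above does not carry over unchanged.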

Sindes Dawood posted on Monday, August 01, 2016 - 2:37 pm

Dear Prof Muthens,

Given that my outcome count data are highly zero-inflated, I am interested in comparing different count-based distributional models to determine the best-fitting one, so that I can specify my data accordingly in other analyses. Specifically, I would like to compare Olsen and Schafer's two-part model to Poisson, NB, ZIP, ZINB, and NBH.

Could you please tell me which information criterion in Mplus I should use to compare the two-part model to these other distributional models? BIC? The Vuong test?

Thank you!

Bengt O. Muthen posted on Monday, August 01, 2016 - 4:09 pm

This has many aspects which are discussed at length in our new book

http://www.statmodel.com/Mplus_Book.shtml
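
For reference, AIC and BIC can be computed from any fitted model's log-likelihood, number of free parameters, and sample size. This is a minimal Python sketch. The log-likelihoods, parameter counts, and sample size below are hypothetical placeholders, not results from any model in this thread. Such comparisons presuppose that the likelihoods refer to the same outcome on the same scale, which may need checking when a two-part model is set against count models.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: -2 logL + 2p."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2 logL + p * ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Hypothetical values for illustration only: (log-likelihood, free parameters)
models = {"Poisson": (-1500.0, 3), "NB": (-1450.0, 4), "ZIP": (-1440.0, 6)}
n_obs = 500  # hypothetical sample size

# List models from lowest (preferred) to highest BIC
for name, (ll, p) in sorted(models.items(), key=lambda kv: bic(kv[1][0], kv[1][1], n_obs)):
    print(name, round(aic(ll, p), 1), round(bic(ll, p, n_obs), 1))
```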

Sindes Dawood posted on Sunday, August 14, 2016 - 8:38 pm

Dear Prof Muthen,

Thanks for your suggestion. I have since purchased your new book and read the relevant section, which answered most of my questions. The section on model comparison says that a Vuong test should be used to compare non-nested models, which is what I need to do. Could you please tell me whether Mplus can compute the Vuong test and, if so, how?

If Mplus cannot compute it, could you please let me know which information from the Mplus output is needed to compute the Vuong test by hand?

Thank you

Linda K. Muthen posted on Monday, August 15, 2016 - 6:48 am

On which page of the book do you find this suggestion?

Sindes Dawood posted on Monday, August 15, 2016 - 7:10 pm

Hi Linda,

The suggestion was on page 264 in the new book.

Linda K. Muthen posted on Monday, August 15, 2016 - 8:54 pm

We have not implemented this test. You would need to read the articles referenced to see what they suggest.
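
For orientation only, the unadjusted statistic from Vuong (1989) can be sketched in Python as below. It assumes per-observation log-likelihood contributions are available for both models on the same cases. Whether and how these can be obtained from Mplus output is not addressed here. The function names are illustrative, the variance divides by n, and no correction for differing numbers of parameters is applied.

```python
import math


def vuong_statistic(loglik1, loglik2):
    """Unadjusted Vuong z statistic from per-observation log-likelihoods."""
    if len(loglik1) != len(loglik2):
        raise ValueError("both models must be evaluated on the same observations")
    m = [a - b for a, b in zip(loglik1, loglik2)]  # pointwise log-likelihood ratios
    n = len(m)
    mean_m = sum(m) / n
    var_m = sum((x - mean_m) ** 2 for x in m) / n  # divides by n, not n - 1
    return math.sqrt(n) * mean_m / math.sqrt(var_m)


def two_sided_p(z):
    """Two-sided p-value under the standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A large positive value favors the first model, a large negative value favors the second, and a value near zero means the test does not distinguish them.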

Lisa M. Yarnell posted on Wednesday, October 05, 2016 - 3:52 pm

Hello, I read in the Mplus User's Guide that residual variances are not estimated for count variables: "The inflated part of censored outcomes, binary outcomes, ordered categorical (ordinal) outcomes, count outcomes, and the inflated part of count outcomes have no variance parameters" (p. 639), and indeed I have seen this in my own Mplus output before--that no R-square is given, for this reason.

However, I am regressing count variables in a latent growth model on time-varying covariates and it is working.

How can I regress a variable that does not have residual variances defined on a TVC, if it is already loading onto the I, S, and Q LGM factors? (I am not using Theta parameterization.) If its residual variance is undefined, what does the regression parameter mean? I am unsure how to interpret this.

Can you explain whether it is possible to regress a count indicator in a factor model on another variable? How can this be possible? Thank you.

Bengt O. Muthen posted on Wednesday, October 05, 2016 - 4:22 pm

Regression analysis with a count DV indeed has no residual variance parameter in the regular Poisson version of count modeling. You can still estimate an intercept and slopes. See for example Chapter 6 in our new book, which also discusses more elaborate models.
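
To make the meaning of the slope concrete: with the usual log link, the intercept and slopes describe the log of the expected count, so exponentiating a slope gives the multiplicative change in the expected count per unit of the covariate. This is a minimal Python sketch with hypothetical coefficient values. It assumes a single covariate and ignores the growth factors that would also enter the linear predictor in a growth model.

```python
import math


def expected_count(intercept, slope, x):
    """Poisson regression with log link: log(mean) = intercept + slope * x."""
    return math.exp(intercept + slope * x)


intercept, slope = 0.2, 0.3  # hypothetical estimates on the log scale

# Ratio of expected counts for a one-unit increase in x
ratio = expected_count(intercept, slope, 1.0) / expected_count(intercept, slope, 0.0)
print(ratio, math.exp(slope))  # the two values agree: the rate ratio is exp(slope)
```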

Lisa M. Yarnell posted on Thursday, October 06, 2016 - 5:58 am

Bengt, the quote from the User's Guide above suggests that the (zero) inflated portion of a count outcome does not have a defined residual variance; but that the COUNT portion (if it has both inflated and count portions) does. Is that true?

This seems intuitive because there is a dispersion parameter for a zero-inflated negative binomial variable.

Could you address this? I will also consult the book.

Bengt O. Muthen posted on Thursday, October 06, 2016 - 9:22 am

I don't see that the quote suggests that the Count portion has a residual variance. The standard Poisson doesn't.

In our book you will see how one can add a residual to the Poisson and how the negative binomial model with its dispersion parameter can be viewed as having a residual.
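
The contrast can be sketched numerically in Python. The sketch assumes the NB2 form of the negative binomial variance, mean + alpha * mean^2, and a normally distributed residual added to the Poisson log rate. The parameterization used in Mplus or in the book may differ, and the function and argument names are illustrative.

```python
import math


def poisson_variance(mean):
    """Standard Poisson: the variance equals the mean, with no extra parameter."""
    return mean


def negbin_variance(mean, alpha):
    """Negative binomial (NB2 form assumed): mean + alpha * mean^2."""
    return mean + alpha * mean ** 2


def poisson_lognormal_moments(eta, resid_var):
    """Poisson with log(rate) = eta + e, e ~ N(0, resid_var): marginal mean and variance."""
    mean = math.exp(eta + resid_var / 2.0)
    variance = mean + mean ** 2 * (math.exp(resid_var) - 1.0)
    return mean, variance


print(poisson_variance(2.0), negbin_variance(2.0, 0.5))
print(poisson_lognormal_moments(0.5, 0.4))
```

In both extended models the extra term is proportional to the squared mean, which is one way to see the dispersion parameter as playing the role of a residual variance. Setting alpha or the residual variance to zero recovers the Poisson case.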