Growth Model for Categorical Outcome
 Elizabeth Ginexi posted on Thursday, March 29, 2001 - 10:16 am
In the Mplus version 2 user's guide, example 22.1D shows a linear growth model for a categorical outcome with time-invariant and time-varying covariates. Can this model handle missing data in the outcome variables or will my participants with missing data need to be dropped? Chapter 23 that discusses modeling with missing data only mentions that it is available for analyses with continuous outcomes. Thanks for any guidance.
 Linda K. Muthen posted on Sunday, April 01, 2001 - 5:43 pm
The TYPE=MISSING option is not available for categorical outcomes. Any observation with one or more missing values on the analysis variables will be dropped.
 Shige posted on Saturday, December 31, 2005 - 5:33 pm
Is this still the case in version 3?

Shige
 bmuthen posted on Saturday, December 31, 2005 - 5:47 pm
No, this post is quite outdated. With categorical outcomes you have 2 major options, ML estimation or limited information weighted least-squares estimation. With the former, you have the usual MAR facility through Type = Missing. With the latter you can now also allow missing data, but the estimation is only MAR when covariates predict missingness; when outcomes predict missingness, MCAR is needed because of the limited (2nd-order) information approach.
 Eisuke Segawa posted on Saturday, July 01, 2006 - 8:01 am
Limited-information weighted least-squares estimation was available only for complete data (no missing values) in older versions of Mplus (ver. 2?). It was extended to data with missing values in newer versions of Mplus. Is there any article describing the extension?

Thank you

Eisuke
 Bengt O. Muthen posted on Sunday, July 02, 2006 - 5:30 am
No, only the user's guide. It is pairwise present without covariates and missingness can be predicted by covariates when covariates are part of the model.
 Miles Taylor posted on Thursday, January 11, 2007 - 7:01 am
I'm running a growth with random onset model, similar to those in the Albert and Shih 2003 piece and Masyn's work/tutorials. Although my model handles missingness on the outcomes (both the binary onset and continuous growth), the program kicks out anyone who's missing on independent variables. I was planning to use multiple imputation since I could not get this to work, but I want to make sure my code is not incorrect first, especially in regard to the appropriate estimators. If a direct maximum likelihood/FIML type option is available then I'd rather go that route. Here's my basic code in Version 4...

ANALYSIS:
TYPE=MISSING;
ESTIMATOR=ML;

MODEL:
f by on1@1;
f by on2@1;
f by on3@1;
f by on4@1;
iy sy| ytot1@0 ytot2@1 ytot3@2;
f ON x1 x2 x3;
iy-sy ON x1 x2 x3;
f@0;
f WITH iy@0;
f WITH sy@0;

Thanks,
Miles
 Linda K. Muthen posted on Thursday, January 11, 2007 - 8:23 am
TYPE=MISSING; with maximum likelihood is FIML. There is no missing data theory for covariates. Estimation is done conditioned on the covariates. Therefore, any observation with a missing value on a covariate is excluded from the analysis. You can include the covariates in the analysis by mentioning their variances in the MODEL command. They are then treated as dependent variables and distributional assumptions are made about them.
 Miles Taylor posted on Thursday, January 11, 2007 - 10:18 am
Hi Linda,

Thanks for your quick response. I just wanted to make sure I hadn't missed something or coded incorrectly. I'll most likely just use imputation for the covariates.

Thanks!

Miles
 Miles Taylor posted on Thursday, February 15, 2007 - 1:38 pm
Hi Linda,
I can't seem to get model fit indices for the model listed above (growth with random onset). I can run the growth portion separately and get RMSEA, etc., but none of these are reported for the model above or for the onset-only (discrete-time survival) model. I would calculate these by hand but my chi-square statistics look suspicious (chi-square = 0, 11 df). I need to find some way to get fit statistics that reviewers are familiar with, even if I have to run the onset and growth models separately. Any suggestions?
 Linda K. Muthen posted on Thursday, February 15, 2007 - 2:55 pm
When means, variances, and covariances are not sufficient statistics for model estimation, chi-square and related statistics cannot be computed. In these cases, nested models are compared using loglikelihoods.
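As a rough illustration of the loglikelihood comparison described above, the sketch below computes a likelihood-ratio test for two nested models. The loglikelihood values and the difference in free parameters are made-up numbers, not output from this thread, and the helper names are hypothetical. It assumes plain ML; with a robust estimator such as MLR the difference needs a scaling correction, which is not shown here.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the
    regularized lower incomplete gamma function (standard library only)."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    h = x / 2.0
    term = 1.0
    total = 1.0
    n = 0
    # Sum h^n / ((a+1)(a+2)...(a+n)) until the terms become negligible.
    while term > 1e-16 * total:
        n += 1
        term *= h / (a + n)
        total += term
    lower = math.exp(-h + a * math.log(h) - math.lgamma(a + 1.0)) * total
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(ll_restricted, ll_full, df_diff):
    """Return (statistic, p-value) for two nested ML models."""
    statistic = 2.0 * (ll_full - ll_restricted)
    return statistic, chi2_sf(statistic, df_diff)


# Hypothetical values: the restricted model fixes 2 parameters of the full model.
stat, p_value = likelihood_ratio_test(ll_restricted=-1234.5, ll_full=-1230.0, df_diff=2)
print(f"LR statistic = {stat:.3f}, p = {p_value:.4f}")
```

The restricted model must be a special case of the full model, and both must be fitted to the same observations, for the difference to be meaningful.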
 Andrew Mackinnon posted on Thursday, June 21, 2007 - 12:08 am
I would be grateful for your comments on an ordinal growth curve model I am working on.

The variables relate to 6 occasions: one taken beforehand, 4 occasions during a process and one afterwards.

The four ‘during’ measures used a 4-point scale. An ordinal growth was fitted to these variables as described in Example 6.4.

The Before and After variables are also ordinal but have unique response categories making it inappropriate to include them in the growth curve. I specified both as categorical in the model and defined a latent continuous variable for Before.

However, this model would not converge until I fixed the first threshold of After. My question is whether this should be necessary and whether this is a reasonable thing to do. Ultimately, I want to elaborate this model by adding more predictors, so I want to be certain that the base model is okay or whether there are better ways of achieving my aims.

Thanks,

Andrew


ANALYSIS:
TYPE=MISSING H1;
MODEL: interc slope | Occ1@0 Occ2@1 Occ3@2 Occ4@3;
B4 by Before; !Create latent continuous variable from ordinal var
b4@1; !Fix variance of latent var for identification
[after$1@0]; !Fix first threshold of ordinal dependent var

interc slope on b4;
after on interc slope;
 Bengt O. Muthen posted on Thursday, June 21, 2007 - 8:57 am
It should not be necessary to fix a threshold for the after variable. Perhaps the convergence problem is due to after being very skewed with some very rare categories?

The

B4 by Before;

approach assumes a normal B4 - why not simply treat Before as a continuous covariate?
 Andrew Mackinnon posted on Sunday, June 24, 2007 - 9:28 pm
Dear Bengt,

Thanks for your advice. I've experimented with collapsing categories of AFTER to eliminate rare categories. I've also done this for the other variables in the model. None of this results in convergence. Neither does treating AFTER as continuous. The parameters concerned drift further away from plausible values as the number of iterations allowed increases. This led me to impose the original constraint. Do you have any suggestions about next steps?

With regard to creating a normally distributed latent variable (B4) from the ordinal variable (BEFORE), I did this so as to treat this variable consistently with AFTER, which is measured on the same 7-point scale and also with the four 'during' measures. It seemed to me that if these variables warranted special treatment due to their ordinal properties, so did BEFORE. Is this a reasonable view or am I being too precious?

Thanks,

Andrew
 Linda K. Muthen posted on Monday, June 25, 2007 - 8:38 am
It seems we need to see the data to understand this. Please send your input, data, output, and license number to support@statmodel.com.
 Emily Blood posted on Sunday, October 05, 2008 - 8:38 am
Hi,
Can you explain more about the difference between the Theta and Delta parameterizations when used with a binary growth curve and the WLSMV estimator? I have read the manual on this point (examples 13.2 and 13.3) but am still not clear on when you would use one over the other, or on how treating the scale factors for the y*'s as parameters versus treating the residual variances of the y*'s as parameters affects the model that is fit. It would be helpful to see how you could reproduce the model results obtained under the DELTA parameterization using the THETA parameterization, or vice versa, to see what each is doing. Is this possible?
Thanks,
Emily
 Linda K. Muthen posted on Sunday, October 05, 2008 - 10:19 am
I would use the default Delta unless the model must be estimated using Theta. See Web Note 4 for a full discussion of this topic.
 Emily Blood posted on Sunday, October 05, 2008 - 2:24 pm
I am generating data outside of Mplus and then fitting the model with Mplus. With the Theta parameterization I am able to recover the parameters I set; with the Delta parameterization I don't recover them, but I don't quite understand why that is. That is why I was asking about the full specification of each parameterization and whether you can get the results of one by specifying certain constraints in the other. Where is "Web Note 4"?
Thanks,
Emily
 Emily Blood posted on Sunday, October 05, 2008 - 2:28 pm
I found Web Note 4 and will read it.
Thanks,
Emily
 Arina Gertseva posted on Wednesday, February 25, 2009 - 5:29 pm
Hi,
I am trying to run a mixture model for a binary outcome measured at four occasions. Below is the syntax for the model I am trying to estimate.
I would like to know whether this syntax looks O.K., because I receive a message that I have negative df.
My next question concerns the possibility of building a mean trend/trajectory for a categorical outcome. Is it possible? Should I estimate an unconditional latent growth model for the overall sample?

Thank you.

DATA: FILE IS C:\Documents and Settings\arina\Desktop\friend7.txt;
NOBSERVATIONS=3245;
VARIABLE: NAMES ARE alcohol1 alcohol2 alcohol3;
USEVARIABLES ARE alcohol1 alcohol2 alcohol3;
MISSING ARE;
CLASSES=c(4);

CATEGORICAL ARE alcohol1 alcohol2 alcohol3;
ANALYSIS: TYPE = MIXTURE;
MODEL: %OVERALL%
%C#1%
[alcohol1$1*0 alcohol2$1*0 alcohol3$1*0];
%c#2%
[alcohol1$1*0 alcohol2$1*0 alcohol3$1*1];
%c#3%
[alcohol1$1*0 alcohol2$1*1 alcohol3$1*1];
%c#4%
[alcohol1$1*1 alcohol2$1*1 alcohol3$1*1];

OUTPUT: SAMPSTAT MODINDICES(10) STAND RESIDUAL TECH4;
plot: type is plot3;
series alcohol1 (1) alcohol2(2) alcohol3 (3);
 Linda K. Muthen posted on Wednesday, February 25, 2009 - 5:53 pm
You have set up an LCA model. With three categorical indicators you cannot identify four classes.

See Example 8.4 for a Growth Mixture Model for a categorical outcome. A good place to start is with an unconditional model specified in the overall part of the MODEL command.
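A quick parameter count shows where the negative degrees of freedom come from. The sketch below assumes an unrestricted latent class model with binary indicators, where each class has one threshold per indicator; the function name is hypothetical. A non-negative count is necessary but not sufficient for identification.

```python
def lca_degrees_of_freedom(n_binary_items, n_classes):
    """Degrees of freedom for an unrestricted LCA with binary indicators."""
    # Independent cells in the observed frequency table.
    data_cells = 2 ** n_binary_items - 1
    # One threshold per indicator per class, plus the class probabilities.
    thresholds = n_classes * n_binary_items
    class_probabilities = n_classes - 1
    return data_cells - (thresholds + class_probabilities)


# Three binary indicators, as in the alcohol1-alcohol3 setup above.
for k in (1, 2, 3, 4):
    print(f"{k} class(es): df = {lca_degrees_of_freedom(3, k)}")
```

With three binary indicators there are only 7 independent cells, so the 15 parameters of the four-class model cannot all be estimated. A growth model in the overall part imposes structure across time and uses far fewer parameters.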
 Arina Gertseva posted on Wednesday, February 25, 2009 - 7:50 pm
Linda,
Thank you very much for a prompt reply.
Drawing on Example 8.4, I modified the initial model (the syntax is below). Does it look correct? After I run the single-class model, I can try to increase the number of classes, right?

Thank you very much.

DATA: FILE IS C:\Documents and Settings\arina\Desktop\friend7.txt;
NOBSERVATIONS=3245;
VARIABLE: NAMES ARE alcohol1 alcohol2 alcohol3;
USEVARIABLES ARE alcohol1 alcohol2 alcohol3;
MISSING ARE;
CLASSES=c(1);
CATEGORICAL ARE alcohol1 alcohol2 alcohol3;
ANALYSIS: TYPE = MIXTURE;
MODEL: %OVERALL%
i s| alcohol1@0 alcohol2@1 alcohol3@2;

OUTPUT: SAMPSTAT RESIDUAL TECH1 TECH8;
plot: type is plot3;
series alcohol1(1) alcohol2(2) alcohol3 (3);
 Linda K. Muthen posted on Thursday, February 26, 2009 - 6:29 am
That looks fine. Yes, the next step would be to increase the number of classes.
 Arina Gertseva posted on Monday, March 02, 2009 - 10:24 pm
Linda,
I am still working on the model for a binary outcome measured at three occasions (the syntax is in my previous message). For some reason I cannot get the sample statistics and the plot for the mean trend in my output.
Could you please advise me what to do?
 Linda K. Muthen posted on Tuesday, March 03, 2009 - 6:26 am
Please send your files and license number to support@statmodel.com.
 J.W. posted on Tuesday, October 13, 2009 - 10:12 am
When the Delta parameterization is used in an LGM with ordinal outcomes, usually:
1) the mean of the latent intercept growth factor is fixed at 0.00;
2) the thresholds are constrained to be invariant across time;
3) the scale factors are free, except at a reference time point where the scale factor is fixed.

I have two questions:

1) Should the scale factor always be fixed at 1.0 at a reference time point? I tried to set it at 0.0, it did not work.

2) Alternatively, one can free the intercept factor mean and fix one threshold (e.g., the first threshold) at all time points. In my model with 6 repeated measures, when I fixed the first threshold to 0.00 across time, the estimated intercept factor mean was 0.446; when I fixed the first threshold to 1.00, the estimate became 1.446, while the other estimates remained the same. How do I interpret the results?

Your help will be appreciated!
 J.W. posted on Tuesday, October 13, 2009 - 2:09 pm
More info for Question 1 I asked:

The model runs when a positive value is specified for the scale factor, but the estimate of the intercept mean varies. How do I interpret the parameter estimate?
Thanks!
 Bengt O. Muthen posted on Tuesday, October 13, 2009 - 6:49 pm
1) A scale factor is the inverted SD for a latent response variable so it needs to be positive.

2) For the time point where you have centered the growth model (say time 1), the term that determines the binary outcome probability is tau - alpha, or in Mplus language [u1$1] - [i]. So you can see why you got the two estimated sets of values. Typically you don't interpret tau or alpha, but merely use them in calculating outcome probabilities. So the choice of parameterization has no real interpretational impact.
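The point about tau - alpha can be checked numerically with the two sets of estimates quoted above. This is a minimal sketch, assuming a probit link and a scale factor of 1 at the centering time point (so that the SD of y* is 1); the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_above_threshold(tau, alpha, scale_factor=1.0):
    """P(y* > tau) when y* has mean alpha and SD 1 / scale_factor."""
    if scale_factor <= 0:
        # The scale factor is an inverted SD, so zero or negative is invalid.
        raise ValueError("scale factor must be positive")
    return normal_cdf(scale_factor * (alpha - tau))


# First threshold fixed at 0.00 versus fixed at 1.00 (values from the post above).
p_first = prob_above_threshold(tau=0.0, alpha=0.446)
p_second = prob_above_threshold(tau=1.0, alpha=1.446)
print(p_first, p_second)
```

Both calls give the same probability because only the difference tau - alpha enters the calculation. The check on the scale factor mirrors the first answer: fixing it at 0.0 cannot work.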
 J.W. posted on Friday, October 16, 2009 - 2:01 pm
I have a couple of questions about an LGM with 4-category ordinal outcome measures:

1) WLSMV estimator and Delta parameterization are used in modeling: I would like to confirm the interpretation of the probabilities calculated from the Probit regression coefficients using the formula on p.406 in Mplus User's Guide. Instead of being probabilities of being in specific categories, they are: probability of y* > threshold 1 (i.e., probability of being in categories 2-4); probability of y* > threshold 2 (i.e., probability of being in categories 3-4); and probability of y* > threshold 3 (i.e., probability of being in category 4), respectively. In addition, does the covariance between the latent growth factors affect the calculation?

2) WLSMV estimator and Theta parameterization are used in modeling: in the unstandardized solution, means and variances/covariance of the latent intercept and slope factors, as well as thresholds and residual variances, all had very large p values; however, the estimates of means and variances/covariance of the latent intercept and slope factors in the standardized solution are very close to the corresponding figures in Delta parameterization. How to explain these? By the way, the threshold estimates, including p-values, in standardized solution are identical in the two parameterizations.
 Bengt O. Muthen posted on Saturday, October 17, 2009 - 12:21 pm
1) Only the mean and the variance of y* come into play.

2) We need to see this - please send input, output, data, and license number to support@statmodel.com.
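To make the first answer concrete, the sketch below turns thresholds plus the mean and variance of y* into cumulative and category probabilities for a 4-category outcome. It assumes a probit link and a linear growth model, y* = i + s*t + e, and every numeric value is made up for illustration. It is not a reproduction of the User's Guide formula, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ystar_moments(t, mean_i, mean_s, var_i, var_s, cov_is, resid_var):
    """Mean and variance of y* = i + s*t + e at time score t."""
    mean = mean_i + mean_s * t
    # The growth factor covariance enters only through the variance of y*.
    var = var_i + 2.0 * t * cov_is + t ** 2 * var_s + resid_var
    return mean, var


def ordinal_probs(thresholds, mean, var):
    """Return (P(y* > tau_k) for each threshold, probability of each category)."""
    sd = math.sqrt(var)
    exceed = [normal_cdf((mean - tau) / sd) for tau in thresholds]
    bounds = [1.0] + exceed + [0.0]
    category = [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]
    return exceed, category


# Hypothetical parameter values for the third time point (t = 2).
mean, var = ystar_moments(t=2, mean_i=0.0, mean_s=0.3, var_i=0.5,
                          var_s=0.1, cov_is=-0.05, resid_var=0.5)
exceed, category = ordinal_probs([-0.5, 0.4, 1.2], mean, var)
print("P(y* > threshold):", exceed)
print("P(category):", category)
```

The first list holds the cumulative probabilities described in the question (categories 2-4, 3-4, and 4); differencing them gives the probability of each specific category.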
 Nicolas Müller posted on Thursday, April 14, 2011 - 10:58 am
I'm fitting an ordinal growth model using the twolevel specification.

I'd like to know if it is possible to test the proportional odds assumption using Mplus. I thought that one way could be to declare my dependent variable as NOMINAL instead of CATEGORICAL, thus having one set of coefficients per category (multinomial regression), in order to check whether this model fits significantly better than the ordinal one where the coefficients are restricted to equality.

Anyway, I get this error when I try to fit the model with a NOMINAL dependent variable: Internal Error Code: MDP1039

Is this because what I'm trying to do makes no sense? Or should I send you my input and data along with my serial number?
 Bengt O. Muthen posted on Thursday, April 14, 2011 - 6:04 pm
A growth model with a nominal outcome is a funny model. You don't have a single outcome as you do with an ordinal outcome, in the sense of having a single slope, so it almost seems like you have to have C-1 growth models for a C-category outcome.

In general, it is a little involved to test for ordinality using a nominal outcome.

If you like, you can send your output with the error message to support.
 Quintana posted on Monday, November 19, 2012 - 1:09 pm
I am running a growth model of substance use at three time points. It is dichotomous (0= haven't consumed in the past year; 1= have consumed at least once in the past year).

When I run the dichotomous growth curve according to the syntax in the example in the user's guide, the model does not run (non-positive definite error). However, if I run the model with that same data and syntax EXCEPT not specifying the drinking variable as categorical in the syntax, the model runs perfectly.

(1) Can I use the output from the analyses where I don't specify the variables as categorical by creating an odds ratio myself using the betas I get from this model?

(2) If not, do you have any suggestions of what could be wrong or what I could do to get my model to work?

Note: I also get the model to work perfectly when I keep the substance use variable continuous (How often have you consumed in the past year on a scale of 1 to 5). However, the variable is very skewed, even when log transformed, so I would like to get the dichotomous model to work as well.

Thank you
 Linda K. Muthen posted on Monday, November 19, 2012 - 1:42 pm
Please send the output where it did not work and your license number to support@statmodel.com.