Longitudinal binary data
 Anonymous posted on Tuesday, November 20, 2001 - 11:13 am
Hi, I am pretty new in this field. I have a data set with the following relationship:

X -> Y -> Z

X, Y and Z are all observed binary longitudinal variables. Can Mplus handle this type of model? If so, which type of analysis (e.g. 'path analysis') should be performed? Many thanks for your help!
 bmuthen posted on Wednesday, November 21, 2001 - 8:26 am
Yes, this can be seen as path analysis. In Mplus language you define y and z as categorical and then say:

z on y;
y on x;
 Silvia Sörensen posted on Sunday, May 18, 2003 - 9:44 am
Is it possible to run a parallel process model for a continuous and a categorical variable? I haven't found any examples that combine the parallel process with the categorical approach.

My continuous variable was measured monthly over 15 months, my categorical (binary) variable at baseline, month 3, month 9 and month 15. I would like to know if one variable "drives" the other.
I'm fairly new at this.
Thanks
 Linda K. Muthen posted on Monday, May 19, 2003 - 8:39 am
Yes, it is possible to do that. Just put together a model where one process is continuous with 15 measures and the other is categorical with four measures. You can put it together using Example 22.1 for the continuous outcome and Example 22.1D for the categorical outcomes, with or without the covariates.
 Anonymous posted on Monday, December 06, 2004 - 11:28 am
How much variability is necessary for Mplus to estimate a model? I have binary outcome data and some low base rates (in terms of the presence of the outcome). I have 20 participants measured over 18 time points. When I attempt to run the LGCA, I get the following error message:

THE WEIGHT MATRIX PART OF VARIABLE T1 IS NON-INVERTIBLE. THIS MAY BE DUE TO ONE OR MORE CATEGORIES HAVING TOO FEW OBSERVATIONS. CHECK YOUR DATA AND/OR COLLAPSE THE CATEGORIES FOR THIS VARIABLE. PROBLEM INVOLVING THE REGRESSION OF T1 ON RACE. THE PROBLEM MAY BE CAUSED BY AN EMPTY CELL IN THE JOINT DISTRIBUTION.

Unfortunately, I get the same message for about half of my time points. (I have no missing data - just mostly 0's with a few 1's interspersed throughout the dataset.) Is this error accurate, or might I have an error in my program?
 bmuthen posted on Monday, December 06, 2004 - 11:41 am
You mention doing LGCA (I assume you mean LCGA), that is ML estimation of a mixture model. But the error message seems to refer to weighted least squares analysis - not a mixture analysis. Please clarify or send input, output, and data to support@statmodel.com.
 Anonymous posted on Wednesday, October 12, 2005 - 12:04 am
Hi,

I am working on an LGM with longitudinal binary variables. It also has time-varying covariates, which are also binary. When I added the time-varying covariates, the program gave me a warning saying that it needs Monte Carlo integration.

Is Monte Carlo integration the way to go in my situation (LGM with binary variables and time-varying covariates)?

Thanks in advance!
 Linda K. Muthen posted on Wednesday, October 12, 2005 - 8:36 am
I would imagine that without covariates, you were using the default WLSMV estimator and when you added the covariates, maximum likelihood was used. Certain models require Monte Carlo integration. If you need additional information, please send your input, data, output, and license number to support@statmodel.com.
 Gina Allen posted on Friday, October 06, 2006 - 11:30 am
We are working on a project in which we are trying to model five categorical indicators at 6 time points. One variable (retirement) designates a permanent transition (once retired, always retired). The rest (work, marriage, etc.) can take any value at each time point. We are planning to run latent class models for each of the five variables over the 6 time points and then do a follow-up latent class analysis of the cross-classification of the resulting trajectories. Is there an example we can follow or a more efficient way to program this?
 Bengt O. Muthen posted on Friday, October 06, 2006 - 6:41 pm
I am not sure what you mean when you say "then do a follow up latent class analysis of the cross-classification of the resulting trajectories". I can imagine several approaches - here are two, both are based on an LCA at each time point. One is latent transition analysis where class membership at time t influences class membership at time t+1. There are several examples of that in the UG. The other is latent class growth analysis where the outcome is the latent class variable at each time point - that is a bit cumbersome, however. The latter is the same as doing a joint LCA of all time points, but then structuring the latent class probabilities according to an LCGA.
 Sarah Dauber posted on Monday, October 09, 2006 - 8:54 am
I am interested in using growth curve analysis to analyze data on monthly rates of abstinence from drug use across 18 time points. Is this possible to do in Mplus?

Thank you for your help.
 Linda K. Muthen posted on Tuesday, October 10, 2006 - 8:37 am
Yes, this is possible in Mplus.
 sara hussain posted on Monday, January 29, 2007 - 10:23 am
I am interested in modelling longitudinal poverty (a binary indicator) over 15 waves of data. I am learning about the LCGA approach, however, please could you tell me what the main differences are between this method and latent Markov analysis (of Langeheine & van de Pol)?

Many thanks

Sara
 Jungeun Lee posted on Thursday, August 09, 2007 - 12:37 pm
Hi,

I am running a latent growth mixture model with 4 binary variables. When the number of classes is greater than 1, Mplus gives me a warning like this:

ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL VARIABLES IN THE MODEL. THE FOLLOWING PARAMETERS WERE FIXED: 3

Parameter 3 is the alpha for the slope of the first class and, probably because of that, the S.E. for the slope of the first class is 0. So I can't tell whether it is statistically significant or not. Could you let me know how problematic it is to get a warning like the one above? Could you please also let me know what I can do in a situation like this?

Thanks in advance.
 Linda K. Muthen posted on Tuesday, August 14, 2007 - 6:41 pm
This seems like a problematic message. Please send your input, data, output, and license number to support@statmodel.com.
 Jungeun Lee posted on Tuesday, August 21, 2007 - 4:10 pm
I just emailed them to you. Thanks!
 Emily Blood posted on Wednesday, September 24, 2008 - 7:23 am
I am fitting a latent growth curve model with a repeated observed binary outcome, a latent continuous intercept, a latent continuous slope, and a logit link between the growth factors and the binary outcomes, using MLR estimation. I just want to be clear on the estimation procedure. Does the likelihood assume normality for the observed outcomes (or for the unobserved continuous variable with the threshold, Y*) and use the normal density, or is the logistic density used in the likelihood? If you could let me know that would be great.
Thanks,
Emily
P.S. I have read the Muthen and Asparouhov 2008 paper, thank you for sending that reference. I still have this one question, though.
 Bengt O. Muthen posted on Wednesday, September 24, 2008 - 9:31 am
The binary growth model with logit link uses regular logistic regression of each binary outcome on the growth factors. Regular logistic regression does not necessitate an underlying y* variable, but simply considers the conditional probability of the binary outcome as a function of the growth factors. However, the logistic regression model can equivalently be expressed in terms of such a y* variable that has a logistic density given the predictors (growth factors).
 Emily Blood posted on Wednesday, September 24, 2008 - 11:30 am
Thank you for the response. So the observed data likelihood in this case is expressed as:
Integral[f(xb) * normal density of random effects], integrated over the random effects, where f(xb)=p(xb)^y * (1-p(xb))^(1-y)? I'm basing this on the estimation section in the Muthen & Asparouhov, 2008 paper and putting in the likelihoods for this specific case. Hopefully I've understood it correctly?
Thanks,
Emily
 Bengt O. Muthen posted on Wednesday, September 24, 2008 - 11:47 am
Yes. You have conditional independence of the y's given the random effects.
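For reference, the likelihood confirmed above can be written out for one subject as follows. This is a sketch in notation that does not appear in the thread: eta = (eta_0, eta_1) are the intercept and slope growth factors, lambda_t are the time scores, tau is the threshold, and alpha and Psi are the assumed mean vector and covariance matrix of the growth factors.

```latex
L_i = \int \Bigl[ \prod_{t=1}^{T} p_{it}(\eta)^{\,y_{it}}
      \bigl(1 - p_{it}(\eta)\bigr)^{1 - y_{it}} \Bigr]
      \, \phi(\eta;\, \alpha, \Psi) \, d\eta,
\qquad
p_{it}(\eta) = \frac{1}{1 + \exp\bigl(\tau - \eta_{0} - \lambda_t \eta_{1}\bigr)}
```

The product over t inside the integral is where the conditional independence of the y's given the random effects enters.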
 Nicholas Bishop posted on Monday, December 14, 2009 - 10:23 am
I am interested in creating a factor-of-curves LGM that utilizes both binary and continuous lower-order curves to estimate the higher order factor. Would this be as simple as defining the lower-order binary curves as categorical (as described in example 6.4 of the user's guide) then creating a model similar to that described by Duncan, Duncan, and Strycker (2006)? Here is a link to their example of the factor-of-curves LGM with only continuous outcomes: http://www.ats.ucla.edu/stat/mplus/examples/ddsla/app52man.inp.txt. Thanks.
 Linda K. Muthen posted on Tuesday, December 15, 2009 - 8:43 am
To change the input from the link to a combination of categorical and continuous variables, add the CATEGORICAL option to specify the variables that are categorical.
 Nicholas Bishop posted on Tuesday, January 05, 2010 - 12:13 pm
I have two questions relating to the factor-of-curves model mentioned above. Is it possible to use mixture modeling on the second-order slope and intercept? I would like to examine the heterogeneity in the second-order curve. Also, I have missing data on the outcome variables (smoking and health screening in a sample of older adults). What would be the most efficient way of accounting for data NMAR in the factor-of-curves model (while also utilizing selection modeling with the common I S)?
 Bengt O. Muthen posted on Tuesday, January 05, 2010 - 3:02 pm
To answer your first question: Yes.

To answer your second question, NMAR is a big topic and is not easy to carry out well; no approach is really "efficient". A first step would be to check if those dropping out have a different mean on the outcome before dropping out than others do. But even if that is so, it doesn't mean that dropout is NMAR; it could still be MAR. I think the pattern-mixture approach is probably the most accessible in terms of exploring the missingness.
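The first-step check described above could be sketched as follows in Python. The data layout (one list per person, None for a missing wave), the function name, and the definition of dropout (observed at a wave and missing at every later wave) are illustrative assumptions, not something specified in the thread.

```python
def dropout_mean_comparison(records, wave):
    """Compare the outcome mean at `wave` for stayers vs. those who drop out after it.

    records: one list per person, outcomes in wave order, None = missing.
    wave: 0-based index of the wave to inspect (should not be the last wave).
    Returns (mean_stayers, mean_dropouts); an entry is None if its group is empty.
    """
    stayers, dropouts = [], []
    for y in records:
        if y[wave] is None:
            continue  # not observed at this wave, so not informative here
        later = y[wave + 1:]
        if all(v is None for v in later):
            dropouts.append(y[wave])  # `wave` is this person's last observed wave
        else:
            stayers.append(y[wave])   # observed again at some later wave

    def avg(values):
        return sum(values) / len(values) if values else None

    return avg(stayers), avg(dropouts)


# Toy binary data (hypothetical): 5 people, 4 waves.
toy = [
    [1, 1, 0, None],
    [0, None, None, None],
    [1, None, None, None],
    [0, 0, 0, 0],
    [1, 0, None, None],
]
stay_mean, drop_mean = dropout_mean_comparison(toy, wave=0)
```

For a binary outcome the two means are simply proportions. As noted above, a difference between them does not by itself show that dropout is NMAR.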
 Nicholas Bishop posted on Thursday, January 07, 2010 - 12:01 pm
Thanks Bengt. What potential roadblocks will I face when attempting to do this with categorical outcomes?
 Bengt O. Muthen posted on Friday, January 08, 2010 - 10:47 am
You get more dimensions of integration due to categorical outcomes, so that can make for heavy computations.
 Regan posted on Wednesday, February 03, 2010 - 7:14 pm
I am interested in following up on Mr. Bishop's questions. I have about 2% of my sample that are non-responders on the four indicators that make up my main independent factor variable. I have done some preliminary regression analyses on the variables in the model, and it seems that one could argue that these missing cases violate the MAR assumptions for FIML or MI. Should I drop these cases from analyses, or how do I use the pattern-mixture approach you mentioned in your response to Mr. Bishop?
Thanks!
 Linda K. Muthen posted on Thursday, February 04, 2010 - 7:46 am
With so little missing data, I would estimate the model under MAR.
 Craig Furneaux posted on Thursday, July 22, 2010 - 7:56 pm
Hi Dr Muthen

I am undertaking an analysis of organisational processes, which are all categorical variables, including an array of binary variables related to these processes.

I am interested in how these processes change over time, and wonder whether Mplus would be suitable for this purpose.

I am particularly trying to prove / disprove change to these processes over time, based on these categorical variables.

Any help would be greatly appreciated, including any examples of this sort of approach.

Craig
 Linda K. Muthen posted on Friday, July 23, 2010 - 11:31 am
You might consider growth modeling, latent transition analysis, or growth mixture modeling. You can read about these in the Topic 3, 4, and 6 course handouts.
 Alain Girard posted on Monday, April 30, 2012 - 10:48 am
Hi,
I am fitting a growth model for binary data with a probit link. I have predictors for the intercept and slope.

Where can I find the equation to compute the predicted probabilities for given values of the predictors?

Thanks
Alain
 Linda K. Muthen posted on Monday, April 30, 2012 - 2:06 pm
See slide 45 of the Topic 3 course handout in conjunction with slides 162-164 of the Topic 2 course handout.
 Alain Girard posted on Tuesday, May 01, 2012 - 6:57 am
Thanks for your answer. I just want to confirm my computation.

I estimated the model :

usevariables = y1 y2 y3 y4 x1 x2;
categorical = y1 y2 y3 y4;
(...)

i s | y1@0 y2@1 y3@2 y4@3;
[y1$1@0]; [y2$1@0]; [y3$1@0]; [y4$1@0];
i on x1 x2;
s on x1 x2;

I want to compute P(yj=1|x1, x2) ; j = 1,2,3,4 for given values of x1 and x2.

I know that P(yj=1|i, s, x1, x2) = 1-F(-i-(j-1)*s),
thus P(yj=1|x1, x2) = E(P(yj=1|i, s, x1, x2)), with the expectation taken over i and s given x1 and x2.

To compute P(yj=1|x1, x2), I simulate (using R) i and s for given values of x1 and x2 and compute P(yj=1|i, s, x1, x2) for each simulated subject. I obtain P(yj=1|x1, x2) by taking the mean of P(yj=1|i, s, x1, x2).

Thanks
Alain Girard
University of Montreal
 Bengt O. Muthen posted on Thursday, May 03, 2012 - 10:01 am
It looks like you are doing numerical integration by simulation. When you don't condition on the growth factors, normally distributed factors together with a logit link require numerical integration to get the probabilities. This is computed in the PLOT command, I believe. With a probit link the numerical integration is not needed - instead you have an explicit expression.
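The distinction above between the probit and logit cases can be sketched in Python. This is a minimal illustration, not the computation Mplus performs. It assumes thresholds fixed at 0 as in the model above, time scores 0 to 3, normally distributed growth factors, and a probit residual variance of 1 at every time point. All function names and parameter values are hypothetical. Here mu_i and mu_s stand for the means of i and s at the chosen x1 and x2 (intercept plus slopes times the covariate values), and var_i, var_s, cov_is for their residual variances and covariance.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_marginal_prob(t, mu_i, mu_s, var_i, var_s, cov_is, resid_var=1.0):
    """Explicit P(y_t = 1 | x) under a probit link with the threshold fixed at 0.

    resid_var is the variance of the probit residual; 1.0 is an assumption.
    """
    mean = mu_i + t * mu_s
    var = var_i + 2.0 * t * cov_is + t * t * var_s + resid_var
    return normal_cdf(mean / math.sqrt(var))


def simulated_marginal_prob(t, mu_i, mu_s, var_i, var_s, cov_is,
                            link="logit", n_draws=100_000, seed=1):
    """Monte Carlo average of P(y_t = 1 | i, s) over simulated (i, s)."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix of (i, s).
    l11 = math.sqrt(var_i)
    l21 = cov_is / l11
    l22 = math.sqrt(var_s - l21 * l21)
    total = 0.0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        eta = (mu_i + l11 * z1) + t * (mu_s + l21 * z1 + l22 * z2)
        if link == "probit":
            total += normal_cdf(eta)                 # unit residual variance assumed
        else:
            total += 1.0 / (1.0 + math.exp(-eta))    # logit link
    return total / n_draws


# Hypothetical estimates, for illustration only.
params = dict(mu_i=-0.5, mu_s=0.3, var_i=1.0, var_s=0.2, cov_is=-0.1)
explicit = [probit_marginal_prob(t, **params) for t in range(4)]
simulated = [simulated_marginal_prob(t, link="probit", **params) for t in range(4)]
```

Under these assumptions the explicit probit values and the simulated averages should agree up to Monte Carlo error. Calling simulated_marginal_prob with link="logit" gives the simulation-based counterpart for which no such closed form is available.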