Two-part modeling

Mplus Discussion > Growth Modeling of Longitudinal Data >

Michael Spaeth posted on Friday, December 14, 2007 - 10:26 am

Hello,

I have a question regarding two-part modeling and the Brown et al. (2005) article, which can be downloaded from your homepage.
1) The authors recommend checking the growth shape in both parts separately. Referring to ex. 6.16, does this mean I can put an exclamation mark in front of the binary model row and the binary "data two-part" row to test the shape for the continuous model, and vice versa for the binary model, or do I have to modify the data and test each part "really" separately?
2) The authors refer to log-likelihood chi-square testing to compare nested models. Where do I find the relevant likelihood values for this testing? Mplus shows only the H0 likelihood value. I'm using MLR with numerical integration.
3) Are there any other strategies to fit two-part models?
4) I plan to fit a two-part model in an intervention group and a control group. After that I want to do multiple-group modeling. Is it necessary that the growth shape of both parts in the intervention group equals that of the corresponding parts of the two-part model in the control group (e.g., both groups have a linear slope in the binary part and a quadratic slope in the continuous part)? I know that this is a necessary condition in normal LGM multiple-group comparison.

Thanks!
Michael

Bengt O. Muthen posted on Friday, December 14, 2007 - 10:49 am

1) You can simply comment out the model part and the variables in the USEV list (and categorical list).

2) The LLs should be used. I don't think using chi-square testing for the continuous part is correct, because that is acting as if the missing data was seen in the data.

3) See the Kim & Muthen 2-part factor analysis paper on this web site.

4) This is the same as for regular growth. I think one type of intervention effect could be that the growth shape changes after the intervention, so I wouldn't restrict myself there.
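
As a rough illustration of point 1, here is how ex. 6.16 might look with the binary part commented out so that only the continuous part is estimated. This is a sketch based on the User's Guide example, not a tested input: the file and variable names are those of the example, and ALGORITHM = INTEGRATION is commented out on the assumption that the continuous-only model does not need it.

```mplus
DATA:     FILE = ex6.16.dat;
DATA TWOPART:
          NAMES = y1-y4;
          BINARY = bin1-bin4;             ! still created, just not analyzed
          CONTINUOUS = cont1-cont4;
VARIABLE: NAMES = x y1-y4;
          USEVARIABLES = cont1-cont4;     ! bin1-bin4 taken off the list
          ! CATEGORICAL = bin1-bin4;
          MISSING = ALL (999);
ANALYSIS: ESTIMATOR = MLR;
          ! ALGORITHM = INTEGRATION;
MODEL:    ! iu su | bin1@0 bin2@1 bin3@2 bin4@3;
          iy sy | cont1@0 cont2@1 cont3@2 cont4@3;
          ! su@0; iu WITH sy@0;
OUTPUT:   TECH1;
```

For the binary part alone, the comments would be reversed: keep the iu su line, the CATEGORICAL list, and the integration option, and comment out the continuous growth model and cont1-cont4.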

Michael Spaeth posted on Sunday, December 16, 2007 - 5:20 am

Thanks a lot Prof. Muthén!

1.) o.k.

2.) The "H0 value"? I have used MLR for both parts, just as described in ex. 6.16. Is it also possible to use ML? Then I could use the values without correction, and it would be less time-consuming. Articles on two-part modeling give no hint regarding this issue...

3.) thanks!

4.) Just to clarify: both corresponding parts in the control and intervention groups must have the same growth shape to do multiple-group analysis? If that is not the case, I would follow the covariate approach described in Brown et al. (2005) to test for intervention effects...

Michael Spaeth posted on Sunday, December 16, 2007 - 8:24 am

2.) I've done some calculations since my last post. Two-part modeling indeed helped to normalize my data, so I'm quite sure that I can modify ex. 6.16 by changing the estimator from MLR to ML and transformation = log to transformation = none, right?

If that is right, only question no. 4 is still a little bit unclear for me :-).

Regards,
Michael

Bengt O. Muthen posted on Sunday, December 16, 2007 - 11:36 am

2) Yes, the H0 LL value. No transformation is ok, but log transformation is more statistically sound. ML might be sufficient.

4) If you have an intervention dummy covariate influencing a slope growth factor, then the intervention and control group should have the same growth shape for this influence to be seen as an intervention effect.
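
To make point 2 concrete: nested models are compared through their H0 loglikelihood values. The scaled difference test below is the one documented on the Mplus website for MLR. The subscripts are labels chosen for this note: 0 is the nested model, 1 the comparison model, L the loglikelihood, c the scaling correction factor printed with it, and p the number of free parameters.

```latex
\begin{align}
c_d  &= \frac{p_0\,c_0 - p_1\,c_1}{p_0 - p_1} \\
TR_d &= \frac{-2\,(L_0 - L_1)}{c_d}, \qquad df = p_1 - p_0
\end{align}
```

With ML both correction factors equal 1, so c_d = 1 and the statistic reduces to -2(L_0 - L_1), which is why ML values can be used without correction.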

Michael Spaeth posted on Monday, December 17, 2007 - 2:31 am

Thanks again! My last questions so far: I applied the log transformation in the two-part model, and after that I was not able to find a sufficiently well-fitting growth shape for the continuous part. In detail: with the log transformation, my alcohol-use mean decreased from t1 to t2, which prevents the growth shape from being modeled in a linear or quadratic fashion. In the untransformed data this was not the case: the mean increased from t1 to t2.
Any possible reasons for that?

4) How can one statistically test for intervention effects if the continuous parts do not have the same shape? (The binary parts do.)

Michael Spaeth posted on Monday, December 17, 2007 - 5:58 am

Sorry, a short question concerning 1.) again. I've done separate growth-shape analyses for both parts. The final solution fitted very well in both parts. Then I combined the final solutions from both parts into one two-part model as in ex. 6.16, to let the growth factors correlate and to introduce my covariate. The unconditional model derived from the separate analyses fits very poorly, and I got a PSI error with no hint of possible reasons in TECH4. The same happens when I introduce my covariate.
In addition: when I comment out the model parts as you recommend under 1.) to do separate analyses, my N is reduced due to missing values on all measurement points, but this warning did not come up when I analyzed the whole model (correlated growth factors as in ex. 6.16), and those analyses are done with the whole N. Also, the estimated means for the continuous part differ slightly between the "whole" model and the separate analyses...
I'm a little bit confused...

Michael Spaeth posted on Monday, December 17, 2007 - 6:05 am

forgot to mention, that my N reduced only in the separate analyses for the continuous part, as expected

Bengt O. Muthen posted on Monday, December 17, 2007 - 6:21 pm

I will try to answer these questions briefly so as not to get into statistical consulting, which we can't offer.

2:31 post:

If a variable increases in mean over time, the log of that variable should also increase over time since log is a monotonic transformation. Perhaps you included the zero part of the variable when you studied the unlogged variable - if the number of zeros decreases over time, the mean might increase.

Regarding intervention effect with different growth shapes for control and tx groups, you can estimate the different mean curves and compute (via Model constraint) the estimated mean difference between the curves at any time point.
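
A hedged sketch of the Model Constraint idea, shown for the continuous part only. It assumes a linear curve in the control group and a quadratic curve in the treatment group, four time points scored 0-3, and a GROUPING statement in the VARIABLE command that defines groups named control and treat. The group names and all parameter labels (i0, s0, i1, s1, q1, diff3) are hypothetical.

```mplus
! Fragment only; DATA, VARIABLE and ANALYSIS commands are omitted.
MODEL:
          iy sy qy | cont1@0 cont2@1 cont3@2 cont4@3;
MODEL control:
          [iy] (i0);  [sy] (s0);
          [qy@0];  qy@0;              ! no quadratic growth in this group
          qy WITH iy@0 sy@0;
MODEL treat:
          [iy] (i1);  [sy] (s1);  [qy] (q1);
MODEL CONSTRAINT:
          NEW(diff3);
          ! estimated mean difference between the curves at time score 3
          diff3 = (i1 + 3*s1 + 9*q1) - (i0 + 3*s0);
```

The output would then list diff3 with a standard error. If the log transformation is in effect, the difference is on the log scale.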

5:58 post:

The models estimated from each part separately should not differ that much in parameter values from the joint model - if big differences occur this might be a sign that the joint model is misspecified. A typical cause might be that more random effects need to be correlated across the processes. This could also resolve the Psi message - which might be due to several (more than 2) latent variables having a linear relation.

The analysis of the joint model benefits from the "MAR" approach to missing data where you always have observations on the binary part but not always on the continuous part. Having observations on at least the binary part keeps the individual in the analysis sample.

Michael Spaeth posted on Tuesday, December 18, 2007 - 3:09 am

Thanks!
Regarding log: No, I only added "Transform none" to the separate continuous model (the binary model was commented out). No zeros are included in the log and non-log analyses; N was reduced equally in both analyses. But I have many values within the range of 0.1 to 0.99.

Leah Rohlfsen posted on Monday, January 28, 2008 - 3:46 pm

I am doing a two-part growth model for males and females. I am trying to fit the model without predictors.
First, is it necessary to use algorithm=integration? It is NOT telling me I have to unless I try to correlate iu with sy, which I don't think I need to do.
Second, when I run the model, I am getting estimates and est./s.e. for the means, thresholds, variances and residual variances but I am not getting the est./s.e. for the growth factors or intercepts. Instead it is giving me .999. Is this problematic or unusual?

I am new to these models, so I really appreciate your help! I have included the syntax for my model statement in case it helps.
Model:
%overall%
iu su | bin3@0 bin4@1.5 bin5@2.5 bin6@3.5 bin7@4.5;
iy sy | cont3@0 cont4@1.5 cont5@2.5 cont6@3.5 cont7@4.5;
%c#1%
cont3-cont7(1);
%c#2%
[bin3$1 bin4$1 bin5$1 bin6$1 bin7$1](2);
cont3-cont7(3);
iy;
sy;

Linda K. Muthen posted on Monday, January 28, 2008 - 3:55 pm

Please send your input, data, output, and license number to support@statmodel.com.

Michael Spaeth posted on Thursday, January 31, 2008 - 6:56 am

In my understanding, Mplus always uses algorithm = integration in the binary part of the model and in the whole model, regardless of whether covariances are specified. This leads to my problem. I want to test for influences of covariates on my outcome (time-varying and time-invariant). These tests include mediational effects. The manual states that it is not possible to use the Model indirect command in conjunction with algorithm = integration. What options do I have to test for mediational effects in two-part models? It would be best if these options included both time-varying and time-invariant mediational testing. Thanks!

Leah Rohlfsen posted on Thursday, January 31, 2008 - 10:04 am

I want to make sure I completely understand the estimate for the mean of the 'iu' parameter in two-part models for two classes. For latent class #1, the estimate is -1.247 and for latent class #2, the estimate is .000. How are these interpreted? Is the estimate for class #1 the log odds of having any of the outcome COMPARED to class #2? Is it appropriate to exponentiate -1.247 (= .287), subtract it from 1 (1 - .287 = .713), and say class #1 is 71% less likely to have any (outcome) compared to class #2?

Similarly, in regular latent class modeling with two groups for an ordinal outcome, the estimate of the mean of 'i' for group #1 is .000 and for group #2 it is -.037. Are these interpreted similar to above (it is an estimate of the intercept in comparison to the other group)? Is this why one group's estimate is .000?

Also, if the estimate for one of the groups (classes) is not significant, then there are no significant differences between the groups (classes) in the intercept (i or iu)?

Bengt O. Muthen posted on Thursday, January 31, 2008 - 4:52 pm

Model Indirect for algo = integration has not been implemented yet because it is more complex. However, for any model you can use Model Constraint to define a mediation parameter, say m = a*b, and this gives you a SE for it so you can test it. But note that it is not obvious (to me) that this product formula holds for the two-part model, or how it would have to be modified. - Perhaps something for a methods paper? Perhaps Dave MacKinnon at ASU knows or is interested.
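
A minimal sketch of the Model Constraint approach to a product term. The mediator m and covariate x are hypothetical observed variables that would have to be on the USEVARIABLES list, and a, b, and ab are labels invented for the illustration. As noted above, whether the simple product is the right mediation quantity in a two-part model is an open question, so this only shows the mechanics.

```mplus
! Fragment only; DATA, VARIABLE and ANALYSIS commands are omitted.
MODEL:
          iy sy | cont1@0 cont2@1 cont3@2 cont4@3;
          m  ON x (a);       ! path a: covariate to mediator
          sy ON m (b);       ! path b: mediator to slope factor
          sy ON x;           ! direct effect
MODEL CONSTRAINT:
          NEW(ab);
          ab = a*b;          ! estimate and SE appear under New/Additional Parameters
```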

Leah Rohlfsen posted on Friday, February 01, 2008 - 2:26 pm

This is a question in regards to the post on Thursday, Jan 31st at 10:04 am. I may not have been clear in my question...is there any other information that I can give to help you answer my question?
Thank you. I really appreciate your help!

Bengt O. Muthen posted on Friday, February 01, 2008 - 3:26 pm

Going back to your question from Jan 31 at 10:04:

[iu] refers to the mean of the random intercept of the binary part of the 2-part model. The binary part is a logistic growth model for experiencing the event in question. Individual differences in development over time of this probability are expressed by random growth factors. So a good place to read more about this is in books on binary random effects growth modeling such as Applied Longitudinal Analysis by Fitzmaurice, Laird and Ware. With the twist that you have two latent classes.

Extracting a quick answer from this for you, note that the probability of u=1 (observing the event) at the time point where the time score is zero (often the first time point) is directly related to [iu] as

(1) P(u=1) = 1/(1 + exp(-L)),

where L = logit = log odds for u=1 vs u=0 = -t + [iu] with t denoting the threshold parameter. Here [iu] = 0 for one class and estimated freely for the others. From (1) you can compute the probability of u=1 for each class when the random intercept is at its mean (note that you have to condition on the mean). And as you say, in line with regular logistic regression with a binary outcome and a binary covariate (the latter being your latent class), you can exponentiate [iu] to get the odds ratio (I would not subtract from 1 etc, but simply talk in terms of odds ratio). But talking in terms of probabilities may be more down to earth.

And, yes, one group having mean zero is to provide this reference group.

And, yes, on your last question.
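
As a worked check of the numbers in the question, assuming the threshold t is the same in both classes and taking class 2 as the reference with [iu] = 0 (the class subscript k is notation added here):

```latex
\begin{align}
L_k &= -t + [iu]_k, \qquad
P(u = 1 \mid \text{class } k) = \frac{1}{1 + \exp(-L_k)} \\
\text{OR}_{1\,\text{vs}\,2} &= \exp\!\big([iu]_1 - [iu]_2\big)
  = \exp(-1.247 - 0) \approx 0.287
\end{align}
```

So the odds of observing the event in class 1 are about 0.29 times the odds in class 2 at the time point with time score zero. The class-specific probabilities additionally require the estimated threshold, which is not given in the post.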

Michael Spaeth posted on Friday, February 08, 2008 - 10:48 am

Bengt, back to your answer from Jan. 31 at 4:52 pm:

Thanks! Yes, it would be best to contact him. Together with Cheong and Khoo, he recommends parallel process modeling for testing mediational effects in LGM (2003). I have a strong theory for my mediational effect, so the fact that the relationship between mediator and outcome is actually correlational wouldn't bother me. Is it possible to use this approach in Mplus two-part modeling? It seems to me that no Model indirect command is needed in the case of parallel process modeling. This would help me investigate mediational effects with my covariates specified as time-invariant (T1 value and time-averaged). I also investigate their mediational effects within my two-part model with the covariates specified as time-varying. For this case, too, I see no way other than asking a specialist like MacKinnon.

Bengt O. Muthen posted on Friday, February 08, 2008 - 5:59 pm

You can have a parallel process growth model in the 2-part modeling framework. But a regular auto-regressive parallel process (cross-lagged model) wouldn't be able to have the 2-part structure.

Michael Spaeth posted on Saturday, February 09, 2008 - 6:15 am

Just to clarify: does it make sense that my first process (the mediator) is a normal LGM and my second process is two-part, and that I regress the growth parameters of both two-part parts on the LGM growth parameters?

Bengt O. Muthen posted on Sunday, February 10, 2008 - 9:31 am

Yes, that seems fine. But build up this complex model in small steps.

Michael Spaeth posted on Monday, February 11, 2008 - 12:37 am

Thanks so far, your comments are very helpful! Yes, this model building promises to be fun...

Michael Spaeth posted on Wednesday, April 09, 2008 - 10:37 am

I have quite large residuals in the continuous part when I combine both parts into a complete model (correlated growth parameters; in a separate continuous model it fits fine). I have tried a lot of modifications and nothing helped. Then I let "iy" predict "sy" to control for the iy baseline. I looked at the residuals and found a substantial improvement, but BIC and AIC (as a rough indicator of the whole two-part model) got worse. What's going on there!?

Linda K. Muthen posted on Wednesday, April 09, 2008 - 10:56 am

Please send your input, data, output, and license number to support@statmodel.com.

Li Lin posted on Wednesday, September 08, 2010 - 1:55 pm

Could you tell me what is wrong with the following code? Mean estimates for the growth factors were not in the output, and all growth factors ON x were 0 for both the population values and the averages over replications. Thanks!
MONTECARLO: NAMES = ...; CUTPOINTS = x(0);...GENERATE = u1-u4(1); MISSING = y1-y4; CATEGORICAL = u1-u4;
MODEL POPULATION:
[x@0]; x@1;
iu su1| u1@0 u2@1 u3@1 u4@1;
iu su2| u1@0 u2@0 u3@1 u4@2;
[u1$1-u4$1*1.7](1);
[iu@1 su1*-.15 su2*0.01];
iu*1.45;
iy sy1| y1@0 y2@1 y3@1 y4@1;
iy sy2| y1@0 y2@0 y3@1 y4@2;
[y1-y4@0]; y1-y4*.5;
[iy*0 sy1*0 sy2*0];
iy*1; y1*.2;
iy WITH sy1*.1;
iu WITH iy*0.9;
iu ON x*-.35;
su1 ON x*-.25;
iy ON x*-.15;
sy1 ON x*-.5;
MODEL MISSING:...;
MODEL: iu su1|...(as in POPULATION)

Bengt O. Muthen posted on Wednesday, September 08, 2010 - 2:58 pm

Please send this question to support@statmodel.com with your input, output, and license number.

matteo giletta posted on Wednesday, January 30, 2013 - 9:21 pm

Hello,

I am conducting some two-part growth curve models and I have a question about the plot of the curve. Before conducting a two-part model I have conducted a model for the frequency part only to check my model fit. The model fit was really good with a significant linear slope. However I cannot understand why when I plot the means of the growth curve, the values range from -0.5 to 0, whereas actually the outcome ranges from 0 to 4. I have the same problem when I try to plot the effect of a covariate on the slope of the frequency part.
How do I explain this?
Thanks a lot!!!

Linda K. Muthen posted on Thursday, January 31, 2013 - 8:39 am

Please send the input, data, output, and your license number to support@statmodel.com.

matteo giletta posted on Thursday, January 31, 2013 - 12:33 pm

Hi Linda,

I think I found out the problem: in two-part models a log transformation is used for the continuous part of the model. That would explain my negative values.

Thanks!
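
A quick check of this explanation, assuming the plotted values are means of log(y) among the positive observations: back-transforming a mean of logs gives a geometric mean on the original scale, so the plotted range of -0.5 to 0 corresponds to roughly 0.61 to 1.

```latex
\begin{align}
\exp\!\big(\mathrm{E}[\log y \mid y > 0]\big) &= \text{geometric mean of } y \text{ given } y > 0 \\
\exp(-0.5) &\approx 0.61, \qquad \exp(0) = 1
\end{align}
```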

xiaoyu bi posted on Tuesday, December 24, 2013 - 2:47 pm

Hi, Linda,
I am running a two-part LGM looking at the number of medical conditions (ranging from 0 to 8) people have over time. I got error messages (please see below) when I ran the code. Any guidance would be greatly appreciated. Thank you so much!
data twopart:
names = medcon1 medcon2 medcon3 medcon4 medcon5;
binary = bin1-bin5;
continuous = cont1-cont5;
transform = none;
analysis: estimator = mlr;
model: ib sb | bin1@0 bin2@1 bin3@2 bin4@3 bin5@4;
ic sc | cont1@0 cont2@1 cont3@2 cont4@3 cont5@4;
ib with sb @0;
output: tech1 tech8;

The error messages I got are:
THE ESTIMATED COVARIANCE MATRIX COULD NOT BE INVERTED.
COMPUTATION COULD NOT BE COMPLETED IN ITERATION 364.
CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE
COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Bengt O. Muthen posted on Tuesday, December 24, 2013 - 5:16 pm

Are you sure you want ib WITH sb@0? Growth factors are likely to be correlated within a process. You may want to try with all 4 across-process WITH statements @0. Or, have all 6 WITHs free. You may also want to analyze the binary part by itself to see if you need a free sb variance.
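
One way the two suggested alternatives might be written for the model in the question. This is a sketch of the MODEL command only, using the poster's variable names; it has not been run against the data, and it assumes the rest of the input stays as posted.

```mplus
! Variant A: growth factors correlate within a process,
!            all 4 across-process covariances fixed at zero
MODEL:    ib sb | bin1@0 bin2@1 bin3@2 bin4@3 bin5@4;
          ic sc | cont1@0 cont2@1 cont3@2 cont4@3 cont5@4;
          ib WITH ic@0 sc@0;
          sb WITH ic@0 sc@0;

! Variant B: replace the two @0 lines above with free covariances
!         ib WITH sb ic sc;
!         sb WITH ic sc;
!         ic WITH sc;
```

Variant B leaves four correlated random effects in the integration, so it can be expected to run noticeably slower than Variant A.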

Mary E. Mackesy-Amiti posted on Wednesday, February 12, 2014 - 12:36 pm

I am using 2-part data with log transformation in a growth model. For the plot of observed and estimated means, I was wondering if there is a way to get the geometric mean instead of the arithmetic mean for the observed values?

Mary E. Mackesy-Amiti posted on Wednesday, February 12, 2014 - 12:43 pm

Please disregard my previous message

xiaoyu posted on Friday, March 07, 2014 - 4:29 pm

Dear Dr. Muthen,
Do you have an example how to fit a two-part model using long format data? I know how to do that using wide format data, but for my current case, I have to use long format data. Otherwise, I have to create 70 more variables with missing values for everyone using wide format data.
Thank you so much,

Bengt O. Muthen posted on Friday, March 07, 2014 - 4:51 pm

I'm sorry, but I don't think I do. That would be using 2 variables, binary and cont's. Should be doable.

B posted on Friday, August 01, 2014 - 5:26 pm

I have a quick question about how intensive a potential model might be for MPlus to handle.

I'm doing an autoregressive latent panel model. One set of factors at 2 time points is continuous. The other set is zero inflated with non-integers, so two-part modeling will be necessary.

All of this data is complex survey, so weights need to be applied, and I'm also going to need to estimate the model in a multilevel framework. This is because I want to see if a level 2 variable moderates the cross-lagged effects at level 1.

Before I go down the rabbit hole I want to make sure I'm not at a threshold of MPlus capability, so is this going to be maxing out the capabilities of MPlus?

Thanks

Bengt O. Muthen posted on Saturday, August 02, 2014 - 2:58 pm

The speed will depend on the number of random effects that you specify and the sample size. The more of them, the slower the speed of the ML numerical integration.

B posted on Sunday, August 03, 2014 - 9:06 am

In regards to speed: What's the status of using MPlus on clusters? Last I knew, it was cost-prohibitive for our University to put MPlus on a cluster with current licensing agreements.

Linda K. Muthen posted on Sunday, August 03, 2014 - 12:43 pm

Mplus has not been developed for clusters.

anonymous Z posted on Friday, February 05, 2016 - 12:28 pm

Drs. Muthen,

I am fitting a two-part model. For Part 1 (the binary part), the variance of the linear slope was not significant. However, when I added treatment condition as one covariate, it showed that the treatment was a significant predictor of the linear slope for the binary part. Does this sound right to you? Can a covariate have significant effects on a parameter estimate without significant variance?

Thanks so much!

Bengt O. Muthen posted on Friday, February 05, 2016 - 6:00 pm

Yes, this is a common finding. Perhaps due to higher power when the covariate is included.

anonymous Z posted on Friday, February 05, 2016 - 8:55 pm

Dr. Muthen,

Thanks for your prompt response. So did you mean that I can explain the results as that the treatment has effects on the slope growth factor for the binary part even though the variance is not significant?

Thanks

Bengt O. Muthen posted on Saturday, February 06, 2016 - 4:54 pm

Yes.

rongqin posted on Thursday, March 03, 2016 - 1:36 am

Dear Dr. Muthen,

can two-part modelling be adopted in a cross-lagged model?

In the Mplus manual, after specifying
DATA TWOPART:
NAMES = y1-y4;
BINARY = bin1-bin4;
CONTINUOUS = cont1-cont4;

there is further specification in the growth model,
su@0; iu WITH sy@0;

I wonder whether anything similar like this (su@0; iu WITH sy@0;) should be done if the data are used in a cross-lagged model. Thank you very much for your answer.

Bengt O. Muthen posted on Thursday, March 03, 2016 - 6:57 pm

Q1. Yes, but the model gets more complex since the two-part variable is both a DV and an IV.

The fixing of slope variance for the binary part is mostly to simplify the model. It is up to you if that is appropriate.

Christoph Weiss posted on Tuesday, September 20, 2016 - 1:53 am

Hello Linda,

I want to run a two-part model. I have used exactly the code from example 6.16, but I get the message: “Categorical variable BIN1 contains less than 2 categories.”

Other Questions are:
- at which point (place in the code) do I implement the other variables (not only the count variable(s)) from the model?
- at which point (place in the code) do I implement the actual model code?

Thank you for helping.

Christoph

Linda K. Muthen posted on Tuesday, September 20, 2016 - 5:56 am

Please send the output, data, and your license number to support@statmodel.com.

Kerry Lee posted on Thursday, August 09, 2018 - 2:37 am

Dear Prof Muthen,

I want to use a cut-point of 3 for creating a two-part binary variable. This works fine when the variable is not transformed (i.e., count for the variable tallies with count from SPSS).

However, when I use the DEFINE command to scale the variable (VAR = VAR/1.5), adjusting the cut-point does not have any effect. The output under "COUNTS FOR CATEGORICAL VARIABLES" shows that a value above the cut-point is included.

Would appreciate your advice.

Sincerely,
Kerry.

Bengt O. Muthen posted on Thursday, August 09, 2018 - 2:09 pm

Please send your full output to Support along with your license number. The data would also be useful to have.