I have a question regarding two-part modeling and the Brown et al. (2005) article, which can be downloaded from your homepage.
1) The authors recommend checking the growth shape in both parts separately. Referring to ex. 6.16, does this mean I can put an exclamation mark in front of the binary-part model line and the binary line of DATA TWOPART to test the shape for the continuous part, and vice versa for the binary part, or do I have to modify the data and test each part "really" separately?
2) The authors refer to log-likelihood chi-square testing to compare nested models. Where do I find the relevant likelihood values for this test? Mplus shows only the H0 likelihood value. I'm using MLR with numerical integration.
3) Are there any other strategies for fitting two-part models?
4) I plan to fit a two-part model in an intervention group and a control group. After that I want to do multiple-group modeling. Is it necessary that the growth shape of both parts in the intervention group equals that of the corresponding parts of the two-part model in the control group (e.g. both groups have a linear slope in the binary part and a quadratic slope in the continuous part)? I know that this is a necessary condition in normal LGM multiple-group comparison.
2.) The "H0 value"? I have used MLR for both parts, just as described in ex. 6.16. Is it also possible to use ML? Then I could use the values without correction, which is less time-consuming. Articles on two-part modeling give no hint on this issue...
4.) Just to clarify: both corresponding parts in the control and intervention groups must have the same growth shape to do multiple-group modeling? If that is not the case, I would follow the covariate approach described in Brown et al. (2005) to test for intervention effects...
2.) I've done some calculations since my last post. Two-part modeling indeed helped to normalize my data, so I'm quite sure that I can modify ex. 6.16 by changing MLR to ML and transformation = log to transformation = none, right?
If that is right, only question no. 4 is still a little bit unclear to me :-).
2) Yes, the H0 LL value. No transformation is ok, but log transformation is more statistically sound. ML might be sufficient.
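For readers following along, here is a minimal sketch of the loglikelihood difference test for two nested models, assuming you have the H0 loglikelihood value, the MLR scaling correction factor, and the number of free parameters from each of the two runs. The function name and the numbers are illustrative only, not taken from this thread. With ML, both correction factors equal 1 and the statistic reduces to the ordinary -2 times the loglikelihood difference.

```python
def scaled_lr_test(ll0, c0, p0, ll1, c1, p1):
    """Scaled chi-square difference test from two loglikelihoods.

    Model 0 is the nested (more restricted) model, model 1 the comparison model.
    ll = H0 loglikelihood, c = scaling correction factor, p = free parameters.
    """
    df = p1 - p0
    if df <= 0:
        raise ValueError("model 1 must have more free parameters than model 0")
    # Difference-test scaling correction
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)
    # Scaled chi-square difference statistic
    trd = -2.0 * (ll0 - ll1) / cd
    return trd, df, cd


# Illustrative (made-up) values for an MLR run
trd, df, cd = scaled_lr_test(ll0=-2606.0, c0=1.450, p0=39,
                             ll1=-2583.0, c1=1.546, p1=47)
print(f"MLR: cd = {cd:.3f}, chi-square = {trd:.3f}, df = {df}")

# With ML there is no correction: set both factors to 1
trd_ml, df_ml, cd_ml = scaled_lr_test(-2606.0, 1.0, 39, -2583.0, 1.0, 47)
print(f"ML:  cd = {cd_ml:.3f}, chi-square = {trd_ml:.3f}, df = {df_ml}")
```

The resulting statistic is referred to a chi-square distribution with df equal to the difference in the number of free parameters.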
4) If you have an intervention dummy covariate influencing a slope growth factor, then the intervention and control group should have the same growth shape for this influence to be seen as an intervention effect.
Thanks again...! My last questions so far: I've done a log transformation in the two-part model, and after that I was not able to find a growth shape that fits the continuous part sufficiently well. In detail: with the log transformation, my alcohol-use mean decreased from t1 to t2, which prevents the growth shape from being modeled in a linear or quadratic fashion. In the untransformed data this was not the case: the mean increased from t1 to t2. Any possible reasons for that?
4) How can one statistically test for intervention effects if the continuous parts do not have the same shape? (The binary parts do.)
Sorry, a short question concerning 1) again... I've done separate growth shape analyses for both parts. The final solution fitted very well in both parts. Then I combined the final solutions from both parts into "one" two-part model as in ex. 6.16, to let the growth factors correlate and to introduce my covariate. The unconditional model derived from the separate analyses fits very poorly, and I got a PSI error with no hint of possible reasons in TECH4. The same happens when I introduce my covariate. In addition: when I comment out the model parts as you recommend under 1) to do separate analyses, my N is reduced due to cases with missing values at all measurement points, but this warning did not come up when I analyzed the whole model (correlated growth factors as in ex. 6.16), and those analyses use the whole N. Also, the estimated means for the continuous part differ slightly between the "whole" model and the separate analyses... I'm a little bit confused...
I will try to answer these questions briefly so as not to get into statistical consulting, which we can't offer.
If a variable increases in mean over time, the log of that variable should also increase over time since log is a monotonic transformation. Perhaps you included the zero part of the variable when you studied the unlogged variable - if the number of zeros decreases over time the mean might increase.
Regarding intervention effect with different growth shapes for control and tx groups, you can estimate the different mean curves and compute (via Model constraint) the estimated mean difference between the curves at any time point.
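As a rough illustration of the arithmetic behind this suggestion, the sketch below evaluates two mean curves with different shapes (linear for control, quadratic for treatment) and their difference at each time score. All growth factor means here are hypothetical. In Mplus the same expression would be written in MODEL CONSTRAINT using labeled parameters, which also yields a standard error for the difference. This sketch only covers the point estimate.

```python
def growth_mean(t, intercept, slope, quad=0.0):
    """Model-implied mean at time score t for a linear or quadratic growth curve."""
    return intercept + slope * t + quad * t ** 2


# Hypothetical growth factor means (not estimates from this thread)
control = {"intercept": 1.0, "slope": 0.30}                  # linear shape
treatment = {"intercept": 1.0, "slope": 0.10, "quad": 0.02}  # quadratic shape

differences = {}
for t in range(4):  # time scores 0, 1, 2, 3
    differences[t] = growth_mean(t, **treatment) - growth_mean(t, **control)
    print(f"t = {t}: treatment - control = {differences[t]:+.3f}")
```

For the continuous part of a two-part model with the log transformation, these means are on the log scale.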
The models estimated from each part separately should not differ that much in parameter values from the joint model - if big differences occur this might be a sign that the joint model is misspecified. A typical cause might be that more random effects need to be correlated across the processes. This could also resolve the Psi message - which might be due to several (more than 2) latent variables having a linear relation.
The analysis of the joint model benefits from the "MAR" approach to missing data where you always have observations on the binary part but not always on the continuous part. Having observations on at least the binary part keeps the individual in the analysis sample.
Thanks! Regarding log: No, I only added "Transform = none" to the separate continuous model (the binary model was commented out). No zeros are included in the log and non-log analyses; N decreased equally in both analyses. But I have many values within the range of 0.1 to 0.99.
I am doing a two-part growth model for males and females. I am trying to fit the model without predictors. First, is it necessary to use algorithm=integration? It is NOT telling me I have to unless I try to correlate iu with sy, which I don't think I need to do. Second, when I run the model, I am getting estimates and est./s.e. for the means, thresholds, variances and residual variances, but I am not getting the est./s.e. for the growth factors or intercepts. Instead it is giving me .999. Is this problematic or unusual?
In my understanding, Mplus always uses algorithm = integration in the binary part of the model and in the whole model, regardless of whether covariances are specified. This leads to my problem. I want to test for influences of covariates on my outcome (time-varying and time-invariant). These tests include mediational effects. The manual states that it is not possible to use the MODEL INDIRECT command in conjunction with algorithm = integration. Which options do I have to test for mediational effects in two-part models? It would be best if these options included time-varying and time-invariant mediational testing. Thanks!
I want to make sure I completely understand the estimate for the mean of the 'iu' parameter in two-part models for two classes. For latent class #1, the estimate is -1.247 and for latent class #2, the estimate is .000. How are these interpreted? Is the estimate for class #1 the log odds of having any of the outcome COMPARED to class #2? Is it appropriate to exponentiate -1.247 (= .287), subtract it from 1 (1 - .287 = .713), and say class #1 is 71% less likely to have any of the outcome compared to class #2?
Similarly, in regular latent class modeling with two groups for an ordinal outcome, the estimate of the mean of 'i' for group #1 is .000 and for group #2 it is -.037. Are these interpreted similarly to the above (as an estimate of the intercept in comparison to the other group)? Is this why one group's estimate is .000?
Also, if the estimate for one of the groups (classes) is not significant, does that mean there are no significant differences between the groups (classes) in the intercept (i or iu)?
Model Indirect for algo = integration has not been implemented yet because it is more complex. However, for any model you can use Model Constraint to define a mediation parameter, say m = a*b, and this gives you a SE for it so you can test it. But note that it is not obvious (to me) that this product formula holds for the two-part model, or how it would have to be modified. - Perhaps something for a methods paper? Perhaps Dave MacKinnon at ASU knows or is interested.
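To make the "m = a*b with a SE" idea concrete, here is a small sketch of the first-order delta-method standard error for a product of two estimates. This is a generic formula, not a statement about what is appropriate for the two-part model (the caveat above still applies), and the numbers are hypothetical. The covariance between the two estimates is an optional input and is assumed to be zero when omitted.

```python
import math


def product_se(a, se_a, b, se_b, cov_ab=0.0):
    """First-order delta-method SE for the product a*b.

    Var(a*b) is approximated by b^2*Var(a) + a^2*Var(b) + 2*a*b*Cov(a, b).
    """
    variance = (b ** 2) * se_a ** 2 + (a ** 2) * se_b ** 2 + 2.0 * a * b * cov_ab
    return math.sqrt(variance)


# Hypothetical path estimates and standard errors
a, se_a = 0.5, 0.10   # covariate -> mediator
b, se_b = 0.4, 0.08   # mediator -> outcome growth factor

m = a * b
se_m = product_se(a, se_a, b, se_b)
z = m / se_m
print(f"m = {m:.3f}, SE = {se_m:.4f}, z = {z:.2f}")
```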
This is a question in regards to the post on Thursday, Jan 31st at 10:04 am. I may not have been clear in my question...is there any other information that I can give to help you answer my question? Thank you. I really appreciate your help!
[iu] refers to the mean of the random intercept of the binary part of the 2-part model. The binary part is a logistic growth model for experiencing the event in question. Individual differences in development over time of this probability are expressed by random growth factors. So a good place to read more about this is in books on binary random effects growth modeling such as Applied Longitudinal Analysis by Fitzmaurice, Laird and Ware. With the twist that you have two latent classes.
Extracting a quick answer from this for you, note that the probability of u=1 (observing the event) at the time point where the time score is zero (often the first time point) is directly related to [iu] as
(1) P(u=1) = 1/(1 + exp(-L)),
where L = logit = log odds for u=1 vs u=0 = -t + [iu] with t denoting the threshold parameter. Here [iu] = 0 for one class and estimated freely for the others. From (1) you can compute the probability of u=1 for each class when the random intercept is at its mean (note that you have to condition on the mean). And as you say, in line with regular logistic regression with a binary outcome and a binary covariate (the latter being your latent class), you can exponentiate [iu] to get the odds ratio (I would not subtract from 1 etc, but simply talk in terms of odds ratio). But talking in terms of probabilities may be more down to earth.
And, yes, one group having mean zero is to provide this reference group.
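A small worked example of formula (1) may help. It uses the two [iu] values quoted in the question (-1.247 and 0) and a threshold of 0.5, which is a made-up value since the actual threshold was not posted. The probabilities are conditional on the random intercept being at its class mean.

```python
import math


def prob_from_logit(logit):
    """P(u = 1) = 1 / (1 + exp(-L)), formula (1) above."""
    return 1.0 / (1.0 + math.exp(-logit))


threshold = 0.5  # hypothetical threshold t; not given in the thread
iu_mean = {"class 1": -1.247, "class 2": 0.0}  # [iu] per class, from the question

# L = -t + [iu], evaluated at time score zero
probs = {c: prob_from_logit(-threshold + m) for c, m in iu_mean.items()}
odds = {c: p / (1.0 - p) for c, p in probs.items()}
odds_ratio = odds["class 1"] / odds["class 2"]

for c in probs:
    print(f"{c}: P(u=1) = {probs[c]:.3f}, odds = {odds[c]:.3f}")
print(f"odds ratio, class 1 vs class 2 = {odds_ratio:.3f}")
```

Whatever threshold is used, the odds ratio equals exp(-1.247), about 0.287, while the two probabilities themselves do depend on the threshold.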
Thanks! Yes, it would be best to contact him! Together with Cheong and Khoo he recommends parallel process modeling for testing mediational effects in LGM (2003). I have a strong theory for my mediational effect, so the relationship between mediator and outcome being, in fact, correlational would not bother me. Is it possible to use this approach in Mplus two-part modeling? It seems to me that no MODEL INDIRECT command is needed in the case of parallel process modeling. This would help me to investigate mediational effects with my covariates specified as time-invariant (T1 value and time-averaged). I also investigate their mediational effects within my two-part model with the covariates specified as time-varying. For this case I also see no other way than asking a specialist like MacKinnon.
I have quite large residuals in the continuous part when I combine both parts into a complete model (correlated growth parameters); in a separate continuous model it fits fine. I have tried a lot of modifications and nothing helped. Then I let "iy" predict "sy" to control for the iy baseline. I looked at the residuals and found a substantial improvement, but BIC and AIC (as a rough indicator for the whole two-part model) got worse. What's going on there!?
Li Lin posted on Wednesday, September 08, 2010 - 1:55 pm
Could you tell me what is wrong with following code? Mean estimate for growth factors was not in output, and all growth factors ON x were 0 for both population and replicates average. Thanks!
MONTECARLO:
NAMES = ...;
CUTPOINTS = x(0);
...
GENERATE = u1-u4(1);
MISSING = y1-y4;
CATEGORICAL = u1-u4;
MODEL POPULATION:
[x@0]; x@1;
iu su1 | u1@0 u2@1 u3@1 u4@1;
iu su2 | u1@0 u2@0 u3@1 u4@2;
[u1$1-u4$1*1.7] (1);
[iu@1 su1*-.15 su2*0.01];
iu*1.45;
iy sy1 | y1@0 y2@1 y3@1 y4@1;
iy sy2 | y1@0 y2@0 y3@1 y4@2;
[y1-y4@0];
y1-y4*.5;
[iy*0 sy1*0 sy2*0];
iy*1;
y1*.2;
iy WITH sy1*.1;
iu WITH iy*0.9;
iu ON x*-.35;
su1 ON x*-.25;
iy ON x*-.15;
sy1 ON x*-.5;
MODEL MISSING: ...;
MODEL:
iu su1 | ... (as in POPULATION)
I am conducting some two-part growth curve models and I have a question about the plot of the curve. Before conducting a two-part model I have conducted a model for the frequency part only to check my model fit. The model fit was really good with a significant linear slope. However I cannot understand why when I plot the means of the growth curve, the values range from -0.5 to 0, whereas actually the outcome ranges from 0 to 4. I have the same problem when I try to plot the effect of a covariate on the slope of the frequency part. How do I explain this? Thanks a lot!!!
I think I found out the problem: in two-part models a log transformation is used for the continuous part of the model. That would explain my negative values.
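A quick numerical check of this explanation, using made-up values: any observation between 0 and 1 has a negative log, so means on the log scale can fall below zero even though the raw outcome is non-negative. Exponentiating a mean of logs gives the geometric mean of the raw values, which is not the same as their arithmetic mean.

```python
import math

# Hypothetical positive outcome values (the zeros belong to the binary part)
values = [0.3, 0.5, 0.9, 2.0]

logs = [math.log(v) for v in values]
mean_log = sum(logs) / len(logs)             # what a log-scale mean describes
geometric_mean = math.exp(mean_log)          # back-transformed mean of logs
arithmetic_mean = sum(values) / len(values)  # mean on the original scale

print(f"mean of logs    = {mean_log:.3f}")
print(f"exp(mean log)   = {geometric_mean:.3f}")
print(f"arithmetic mean = {arithmetic_mean:.3f}")

# Back-transforming the plotted range mentioned above (-0.5 to 0)
plot_range_raw = (math.exp(-0.5), math.exp(0.0))
print(f"exp(-0.5) = {plot_range_raw[0]:.3f}, exp(0) = {plot_range_raw[1]:.3f}")
```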
xiaoyu bi posted on Tuesday, December 24, 2013 - 2:47 pm
Hi, Linda, I am running a two-part LGM looking at the number of medical conditions (ranging from 0 to 8) people have over time. I got error messages (please see below) when I ran the code. Any guidance would be greatly appreciated. Thank you so much!
data twopart:
names = medcon1 medcon2 medcon3 medcon4 medcon5;
binary = bin1-bin5;
continuous = cont1-cont5;
transform = none;
analysis:
estimator = mlr;
model:
ib sb | bin1@0 bin2@1 bin3@2 bin4@3 bin5@4;
ic sc | cont1@0 cont2@1 cont3@2 cont4@3 cont5@4;
ib with sb@0;
output: tech1 tech8;
The error messages I got are: THE ESTIMATED COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 364. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Are you sure you want ib WITH sb@0? Growth factors are likely to be correlated within a process. You may want to try with all 4 across-process WITH statements @0. Or, have all 6 WITHs free. You may also want to analyze the binary part by itself to see if you need a free sb variance.
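For anyone unsure what the two new variables in a setup like this contain, the sketch below mimics the usual two-part split for a single observation, assuming a cutpoint of 0 and either the log or no transformation. The function name and the data are hypothetical and this is not the Mplus implementation, only an illustration of the rule: the binary variable records whether the value is above the cutpoint, and the continuous variable is missing when it is not.

```python
import math


def two_part(value, cutpoint=0.0, transform="log"):
    """Split one observed value into (binary, continuous) parts.

    None stands for a missing value. Values at or below the cutpoint give
    binary = 0 and a missing continuous part.
    """
    if value is None:
        return None, None           # missing stays missing in both parts
    if value <= cutpoint:
        return 0, None              # no event: continuous part is missing
    continuous = math.log(value) if transform == "log" else value
    return 1, continuous


# Hypothetical counts of medical conditions for one person over five waves
medcon = [0, 2, None, 8, 0]
parts = [two_part(v, transform="none") for v in medcon]
print(parts)
```

Note that if every observed value of a variable falls on the same side of the cutpoint, the binary part has only one category.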
I am using 2-part data with log transformation in a growth model. For the plot of observed and estimated means, I was wondering if there is a way to get the geometric mean instead of the arithmetic mean for the observed values?
Dear Dr. Muthen, Do you have an example of how to fit a two-part model using long format data? I know how to do that using wide format data, but for my current case I have to use long format data. Otherwise, with wide format data I would have to create 70 more variables with missing values for everyone. Thank you so much.
I have a quick question about how intensive a potential model might be for Mplus to handle.
I'm doing an autoregressive latent panel model. One set of factors at 2 time points is continuous. The other set is zero inflated with non-integers, so two-part modeling will be necessary.
All of this data is complex survey, so weights need to be applied, and I'm also going to need to estimate the model in a multilevel framework. This is because I want to see if a level 2 variable moderates the cross-lagged effects at level 1.
Before I go down the rabbit hole, I want to make sure I'm not at the limit of what Mplus can do. Is this going to max out the capabilities of Mplus?
anonymous Z posted on Friday, February 05, 2016 - 12:28 pm
I am fitting a two-part model. For Part 1 (the binary part), the variance of the linear slope was not significant. However, when I added treatment condition as one covariate, it showed that the treatment was a significant predictor of the linear slope for the binary part. Does this sound right to you? Can a covariate have significant effects on a parameter estimate without significant variance?
Yes, this is a common finding. Perhaps due to higher power when the covariate is included.
anonymous Z posted on Friday, February 05, 2016 - 8:55 pm
Thanks for your prompt response. So do you mean that I can interpret the results as showing that the treatment has an effect on the slope growth factor for the binary part even though the variance is not significant?
I want to run a two-part model. I have used exactly the code from example 6.16. But I get the message: “Categorical variable BIN1 contains less than 2 categories.”
Other questions are:
- At which point (place in the code) do I add the other variables from the model (not only the count variable(s))?
- At which point (place in the code) do I add the actual model code?
Kerry Lee posted on Thursday, August 09, 2018 - 2:37 am
Dear Prof Muthen,
I want to use a cut-point of 3 for creating a two-part binary variable. This works fine when the variable is not transformed (i.e., count for the variable tallies with count from SPSS).
However, when I use the DEFINE command to scale the variable (VAR = VAR/1.5), adjusting the cut-point does not have any effect. The output under "COUNTS FOR CATEGORICAL VARIABLES" shows that a value above the cut-point is included.