Anonymous posted on Monday, July 16, 2001 - 10:30 am
I have a parallel process model: y-variables==8 x-variables==11 latent var==4
I have run TECH4 in order to obtain the observed means for the slope factors. The output provides model estimated means for 24 model elements, but I'm having difficulty interpreting which mean goes with which variable in the model. Example:
1 (-.12), 2 (.96), ....24 (-1.57). Can you provide some assistance? Thank you.
TECH4 gives means, covariances, and correlations for the latent variables in the model. In TECH4 the variable names are given to identify which variables the means are for. I would need to see the output if this is not the case for your analysis. You can send it to firstname.lastname@example.org.
Anonymous posted on Monday, November 19, 2001 - 7:17 am
Dear Linda & Bengt!
I have been running GGM models with three classes. I have been using the residual output for the estimated means. Recently, I compared the estimated means with the ones I computed by hand. The values are very close, but the ones from the output are always higher. Is this due to a rounding error? Here is an example for two time points in one class:
ac=3.499 bc=0.270 qc=-0.060
bc(t8)= 6.5 bc(t7)=5.5 qc(t8)= 42.25 qc(t7)=30.25
3.499 + (6.5*0.270) + (42.25* (-0.06))=2.719 Estimated mean from the residual output=2.732
3.499 + (5.5*0.270) + (30.25* (-0.06))=3.169 Estimated mean from the residual output=3.178
As you can see, I am equating the variance of the latent factors in the autoregressive model. Is this correct? Is it appropriate to equate the variances? Using HLM and autoregressive covariance structure gives me very different results. What is wrong with my model specification in Mplus?
I would have to see the full output to comment. Include TECH11 in the OUTPUT command so that I can see which parameter it is referring to and send the output to email@example.com.
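The hand calculation in the November 19, 2001 post can be checked with a short Python sketch. It recomputes the two means from the printed three-decimal estimates and compares the gap to the largest error that rounding alone could produce. The value -0.0597 for the unrounded quadratic mean is a hypothetical guess used only for illustration; it is not taken from any output, and the sketch does not prove that rounding is the cause.

```python
# Printed (3-decimal) estimates from the post.
AC, BC, QC = 3.499, 0.270, -0.060

def quadratic_mean(t, ac=AC, bc=BC, qc=QC):
    """Model-implied mean at time score t for a quadratic growth curve."""
    return ac + bc * t + qc * t ** 2

def rounding_bound(t, half_unit=0.0005):
    """Largest error in the mean if each printed estimate is off by at most half_unit."""
    return half_unit * (1 + abs(t) + t ** 2)

# Estimated means reported in the residual output, keyed by time score.
reported = {6.5: 2.732, 5.5: 3.178}

for t, from_output in reported.items():
    by_hand = quadratic_mean(t)
    gap = from_output - by_hand
    print(f"t={t}: by hand={by_hand:.3f}, output={from_output}, "
          f"gap={gap:.3f}, rounding bound={rounding_bound(t):.4f}")

# Hypothetical unrounded quadratic mean (an assumption, not from the output).
QC_GUESS = -0.0597
guess_means = {t: round(quadratic_mean(t, qc=QC_GUESS), 3) for t in reported}
print(guess_means)
```

Both gaps fall inside the rounding bound, and a single slightly less negative quadratic mean moves both hand-computed values upward, which would be consistent with the output means being higher at both time points.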
Anonymous posted on Thursday, December 09, 2004 - 12:07 pm
I am currently running a cross-lagged panel analysis with count data. Multiple time points are modeled using a first order autoregressive structure. I am concerned about deviations from the dispersion that would ordinarily be assumed for Poisson-distributed data. Convexity plots do demonstrate some deviations from Poissonness. It appears that using the zero inflation capability of MPlus is not possible because these variables cannot occur on the right of an "ON" statement. How robust would parameter and fit estimates be, in MPlus, if there is some violation of the distributional assumption? Does the integration algorithm (forgive me because I am out of my depth here) provide any protection against assumption violations?
Thanks for your help.
bmuthen posted on Thursday, December 09, 2004 - 5:16 pm
Ex 7.25 in the Version 3 User's Guide shows how to do zero-inflated Poisson modeling where the inflation and non-inflation parts are explicit. This approach could in principle be used for cross-lagged panel analysis, although it is perhaps cumbersome. Note also that with Poisson outcomes as predictors, the counts are treated as continuous variables (so prediction is not from any latent log rate propensity construct).
Anonymous posted on Monday, April 25, 2005 - 10:22 am
I am trying to learn growth modeling. I got my model into MPLUS but I don't understand much of the output. Can you please recommend some first steps for me to take to interpret it? There's CFI/TLI, RMSEA, Information Criteria, SRMR... How do I go about learning which of these are important and what they mean?
Anonymous posted on Monday, April 25, 2005 - 12:38 pm
related to above -
in a growth model analysis, if the p-value for the 'chi-sq test of model fit' is not significant (e.g. 0.4910), that means that I would not reject the null hypothesis. So, does that mean that my model doesn't do any better than analysing individuals as if they were separate independent cases? In other words, knowing the individuals' previous scores on a given var. doesn't help at all?
I believe all of these fit measures are described in the dissertation by Yu which is available under Mplus Papers on our website. This dissertation also suggests appropriate cutoffs for these fit measures.
The null hypothesis is your growth model. So if you cannot reject the growth model, this implies that it fits the data well.
Anonymous posted on Wednesday, April 27, 2005 - 8:13 pm
Thank you. Okay, so my hypothesized model is always the null hypothesis? This is the same in CFA, right? For example, if I am testing to confirm my hypothesis that there are, say, 3 latent factors, then
(1) my null is that there are 3 latent factors, right?
(2) if I get a p-value > 0.05 then I would not reject the hypothesis that there are 3 latent factors?
(3) how would you state the alternative hypothesis?
3. In CFA for continuous factor indicators, it is the model of free variances and covariances.
Anonymous posted on Tuesday, June 14, 2005 - 8:16 am
Dear Linda and Bengt: I am running a linear LGC model to test the effectiveness of the treatment (1) versus control (0) group. The outcome is a binary variable. According to the manual, I selected the ML estimator to get a logistic model. If group has a significant loading on the slope, say 0.5, is it correct to say that the treatment group is exp(0.5) times more likely to develop the outcome with value 1 than the control group over the whole observation period? It seems the OR depends on the time point; different time points have different ORs, as seen from the formula.
bmuthen posted on Wednesday, June 15, 2005 - 7:47 am
Because the slope is a continuous dependent variable, this is a regular linear regression, not a logit regression.
Anonymous posted on Wednesday, June 15, 2005 - 8:58 am
Thank you very much for your response. Since slope is a latent continuous variable, so the coefficient estimate for slope is regular regression coefficient. But I am still not clear about the level-1 formula of growth curve with binary outcome. Is it logit(p)=b0 + b1time+ e? or Pr(y=1) =F( b0+b1time+e)? or something else? Thank you very much!
I get output that I don't quite understand. I want the covariance between the random intercept and slope, plus its standard error. It's not clear to me where I find the covariance. Is it in the Model Results? or in the "estimated covariance matrix for the latent variables"? I would have thought these two bits of output would match.
You have covariates in your model so the model is estimating a residual covariance not a covariance.
Anonymous posted on Saturday, August 20, 2005 - 11:32 am
I'm not sure how to interpret your response. Am I to assume that the covariances given in the 'model results' section are based on residualized values, and the estimates given in the 'estimated covariance matrix for the latent variables' are not based on residuals? Is that why they don't match? Thanks.
In your growth model, because you regress the growth factors on a set of covariates, the parameters that are estimated in the model for the growth factors are intercepts, residual variances, and residual covariances. These are not the same values as given in TECH4. If you did not regress your growth factors on a set of covariates, then the parameters estimated in the model for the growth factors would be means, variances, and covariances. And these would match the values given in TECH4.
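The distinction between the residual covariance and the total covariance can be illustrated with a small Python sketch for the simplest case of a single covariate x. It assumes i = a_i + g_i*x + r_i and s = a_s + g_s*x + r_s, so that Cov(i, s) = g_i*g_s*Var(x) + Cov(r_i, r_s). All numbers and names are hypothetical and are not taken from any Mplus output.

```python
def implied_moments(a_i, g_i, a_s, g_s, mean_x, var_x, resid_cov):
    """Total growth-factor means and covariance implied by regressions on one covariate.

    a_i, a_s   : intercepts of the growth-factor regressions
    g_i, g_s   : slopes of the growth factors on the covariate x
    resid_cov  : residual covariance between intercept and slope factors
    """
    mean_i = a_i + g_i * mean_x               # total mean of the intercept factor
    mean_s = a_s + g_s * mean_x               # total mean of the slope factor
    cov_is = g_i * g_s * var_x + resid_cov    # total covariance = explained part + residual part
    return mean_i, mean_s, cov_is

# Hypothetical estimates for a binary covariate with mean 0.4 and variance 0.24.
mean_i, mean_s, cov_is = implied_moments(
    a_i=2.0, g_i=0.5, a_s=0.3, g_s=-0.2,
    mean_x=0.4, var_x=0.24, resid_cov=0.10,
)
print(mean_i, mean_s, cov_is)
```

In this sketch the residual covariance is 0.10, while the total covariance differs because it also carries the part explained by the covariate. With no covariates the two quantities coincide.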
Anonymous posted on Saturday, August 20, 2005 - 4:11 pm
Excellent. Thank you.
Anonymous posted on Tuesday, August 23, 2005 - 10:52 pm
Hi Bengt or Linda! I have been trying to run a growth model for two parallel processes over three time points. In the output, I get the warning saying that the residual covariance matrix is not positive definite. The problem involves one of the three trend growth factors. I don't know whether this means that I cannot continue with this model at all, or whether there is anything I can try to do to fix the problem. By the way, is there any chance you'll give a course on Mplus in Europe in the near future? Greetings, and thanks in advance for your help.
bmuthen posted on Wednesday, August 24, 2005 - 6:51 am
Perhaps you have a non-positive variance for one of your trend factors. You can simply fix the variance to zero.
There are no firm plans for an Mplus course in Europe right now. Although we have in mind doing a course there, this will be a short 2-day event rather than our 5-day courses in the US.
I've tried searching the discussion boards on this, but haven't found this issue elsewhere:
I have a simple model for alcohol use growth over 3 waves of data, with 4 time invariant covariates included.
I found that the unconditional growth model fit the data reasonably; however, the conditional model showed TERRIBLE fit. Based on modification indices, I included a covariance in the model (y1drink with y2drink). The fit is now practically perfect (red flag #1), but the slope/intercept covariance estimate, the y1drink/y2drink covariance, and the level 1 residuals all have standard errors that appear to be unestimated (show up in output as asterisks *******).
The remainder of the model estimates appear okay. There are no error messages in the output
1) What do the asterisks mean? 2) Can I legitimately interpret the output despite this? (I'm not actually interested in the covariances or errors for the substantive purpose of the paper.)
I'm running a LCGA with dichotomous DVs (diagnostic data) and wanted to check with the interpretation of some of the plot data.
When looking at the plot of the estimated probabilities - Let's say that for one class at position 1 on the x-axis, the probability value on the y-axis is at .5. Does this mean that at x = 1, 50% of the individuals would have the diagnosis? Or, does it mean that there is a .5 chance that people in that class have a diagnosis? Or, do those mean the same thing?
Dear Linda & Bengt, I am fitting a linear GMM with intercept I and slope S and requested Cprobabilities and FScores to be saved. The output contains the list of variables actually used in the analysis, I, S and C_I and C_S plus the class probabilities and class membership. Can you help me with the interpretation of C_I and C_S, please? All the best, G
Hi, I'm running an unconditional growth curve model of moral disengagement. I have parameterized the slope as 0 1 2 3 (the fit indices are good: chi-square not significant, RMSEA less than .05, etc.). The mean of the slope is negative and significant. I have a problem with the interpretation of the negative correlation between slope and intercept. Is it correct to say that "the higher the level of moral disengagement at T1, the less the change in moral disengagement from T1 to T2"? Or do I have to say "higher initial values were associated with steeper decreases from T1 to T2"? I have the same problem of interpretation when I run a conditional model with a predictor of the slope and intercept. The regression coefficient on the slope is significant and negative. Is it correct to say "the higher the level of aggregation with deviant peers, the less the change in moral disengagement"? Or do I have to say "a higher level of aggregation with deviant peers predicts a greater decrease in moral disengagement from T1 to T2"?
A negative covariance between the intercept and slope growth factors is interpreted as:
As the intercept growth factor increases, the slope growth factor decreases.
The conditional interpretation follows the same pattern.
ywang posted on Wednesday, November 04, 2009 - 8:15 pm
Hello, Drs. Muthen:
I searched "MPlus discussion" for the interpretation of estimates in Latent Growth Modeling on Categorical Variables (dummy variables), but still can not understand how to interpret the coefficients of covariates on intercept or slope. For example, if the slope of overweight on gender (male versus female) is 0.5, it seems that it can be interpreted as that the increase of logit of being overweight per unit of time for males is 0.5 higher than females. Is it correct? But it is still difficult to understand. Can you help me to interpret it in a more understandable way? Or can you recommend some papers which interpreted the results in detail instead of only listed that the coefficients were significant or not? Thanks a lot!!!
Here x_t is time and w is your gender dummy. Inserting 2 in 1, this means that you have
which means that your interpretation is correct. To get a deeper understanding of this you can compute the s mean for the 2 genders and see how the 2 means make for a different change in probability of the binary outcome when time changes one unit. You do this by converting the logit into a probability - see Chapter 13 of the UG.
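The conversion from logit to probability suggested in the reply can be sketched in Python. The sketch assumes a level-1 model of the form logit(p_t) = i + s*x_t and a slope regression s = a + b*w with b = 0.5, as in the question; the values chosen for i and a are hypothetical. The probabilities are evaluated at the growth factor means, so they ignore the growth factor variances.

```python
import math

def inv_logit(z):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def prob_at(time, i, s):
    """Probability of the binary outcome at a time score, given growth factor values."""
    return inv_logit(i + s * time)

I_MEAN = -1.0             # hypothetical intercept factor mean (same for both genders here)
S_FEMALE = 0.2            # hypothetical slope mean for w = 0
S_MALE = S_FEMALE + 0.5   # slope mean for w = 1, using the 0.5 effect from the question

for t in range(4):
    p_f = prob_at(t, I_MEAN, S_FEMALE)
    p_m = prob_at(t, I_MEAN, S_MALE)
    print(f"time={t}: P(female)={p_f:.3f}  P(male)={p_m:.3f}  difference={p_m - p_f:.3f}")
```

The logit difference between the groups grows by a constant 0.5 per unit of time, whereas the probability difference depends on where on the logistic curve each group sits, which is why the probabilities are easier to report than the raw coefficient.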
ywang posted on Thursday, November 05, 2009 - 8:22 pm
Dear Dr. Muthen:
Thank you so much for your swift response. It is very helpful. I am new to LGM and have a follow-up question. As I understand it, the slope is the coefficient of the interaction term between time and a covariate (for example, gender) in latent growth modeling. In a traditional regression model, if you have an interaction term between gender and time, you have to include both gender and time in the same model. Is it the same for Mplus latent growth modeling? If I include the regression of slope on gender, do I have to include the regression of intercept on gender in the same model?
It is most natural to regress all growth factors on the covariates.
AKH posted on Tuesday, November 16, 2010 - 11:27 am
I fit a growth model with a time-varying covariate (TVC). The TVC was person mean centered so that the coefficient for the person-centered TVC was a pure within-person effect. Now I'm confused on how to interpret the within-person effect of the person-centered TVC (depression) on the outcome (abuse). The part of the model copied here:
MODEL: %WITHIN% maleslope | abuse ON time; femaleslope | abuse ON time; abuse ON dep; abuse ON dep;
I think a significant within person effect for person-centered, time-varying depression predicting abuse would mean that on occasions when depression is above the typical person’s average, it is associated with a change in abuse. Right? And would a nonsignificant effect mean that depression is not related to change in abusive behaviors?
A significant coefficient for the regression of abuse on dep says that the two variables have a significant relationship where, when dep changes one unit, abuse changes by the number of units shown in the regression coefficient. A non-significant coefficient says that the two variables are not significantly related.
AKH posted on Tuesday, November 16, 2010 - 4:17 pm
Thanks, but I think the way I wrote the model code example may have caused some confusion. I understand how to interpret typical regression coefficients, but am asking about an analysis of a growth model using a person period dataset with a person centered TVC. I think it might be interpreted as follows (when low abuse and low depression scores are better):
A significant, positive within person effect for person-centered, time-varying depression predicting abuse would mean that on occasions when depression is below the typical person’s average, it is associated with a decrease in abuse. Whereas a significant negative effect would indicate that individuals with above average levels of depression report a decrease in abuse. And a nonsignificant effect would indicate that a person’s time varying depression is not related to her/his abuse trajectory. Does that sound correct?
I would not refer to a "typical person's average". I would keep it simple and say an increase in dep is related to an increase in abuse and a decrease in dep is related to a decrease in abuse.
Anne Chan posted on Saturday, April 02, 2011 - 6:41 pm
I run a LGC model on the amount of time students spend for leisure activity across 5 year and also regressed students' life satisfaction at Year 5 on the intercept and slope.
The slope of leisure activity is negative. The regression coefficient of this slope and life satisfaction is negatively significant.
May I ask, how should I interpret this result? Does it mean that the more the students increase the rate of lowering their leisure time (i.e., a steeper negative slope), the lower their life satisfaction?
When you say "the slope of leisure activity is negative", I think you mean that the mean of the slope growth factor is negative. Irrespective of this, the negative effect of the slope growth factor on Year 5 life satisfaction implies that as the slope growth factor value increases, life satisfaction goes down. Note that saying "as the slope growth factor value increases" could still mean that it is negative, just less so.
Note that regressing on both an intercept and a slope growth factor may run into collinearity if they are highly correlated.
Anne Chan posted on Monday, April 04, 2011 - 9:38 pm
Sorry, I still cannot get it.
Put it into example, do you mean that when the slope change from "-0.15" to "-0.25" (the ABSOLUTE value of slope growth factor increases), life satisfaction goes down, or do you mean that when slope change from "-0.15" to "-0.05" (the value of slope growth factor change to less negative), life satisfaction goes down?
By slope growth factor value increasing, I mean for example going from -0.15 to -0.05. So stated differently, the model says that a person with a higher slope growth factor value (say -0.05 instead of -0.15) has a lower Year 5 life satisfaction.
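The direction of the effect can be made concrete with a small Python sketch. The regression intercept and the negative slope coefficient below are hypothetical values chosen only to show the sign logic; the intercept growth factor is left out, which amounts to holding it fixed.

```python
def predicted_satisfaction(slope_value, b0=5.0, b_slope=-2.0):
    """Predicted Year 5 life satisfaction for a given slope growth factor value.

    b0 and b_slope are hypothetical; b_slope is negative as in the discussion.
    """
    return b0 + b_slope * slope_value

steeper_decline = predicted_satisfaction(-0.15)   # lower slope factor value
flatter_decline = predicted_satisfaction(-0.05)   # higher slope factor value
print(steeper_decline, flatter_decline)
```

With a negative coefficient, the person whose leisure time declines less steeply (-0.05) has the lower predicted life satisfaction, which matches the interpretation given in the reply.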
Anne Chan posted on Wednesday, April 06, 2011 - 10:34 am
Thanks a lot! May I ask another question?
The model did not fit when I regress life satisfaction on both the intercept and the slope growth factor of leisure activity.
I guess I run into the problem of collinearity, do you have any suggestion of how to deal with this problem?
If you have good fit without the distal outcome Year 5 life satisfaction and poor fit when including the distal, this says that the correlations between the repeated measures and the distal are not well fitted by the model so you should investigate why.
An easier way to model the distal is to center the intercept growth factor at the last time point and let only this growth factor predict the distal. That also may make the detection of model misfit easier.
Anne Chan posted on Wednesday, April 06, 2011 - 12:41 pm
Thanks a lot for your reply.
May I ask how to model the distal outcome by centering the intercept growth factor at the last time point? What is the syntax for doing this?
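What centering at the last time point does can be sketched in Python for a linear model with five equally spaced occasions. The sketch assumes time scores 0, 1, 2, 3, 4 are shifted to -4, -3, -2, -1, 0, so that the intercept growth factor becomes the expected level at the last occasion; in Mplus this would presumably be done by changing the time scores on the growth statement. The mean values used are hypothetical.

```python
def implied_means(i_mean, s_mean, time_scores):
    """Model-implied outcome means for a linear growth model."""
    return [i_mean + s_mean * t for t in time_scores]

original_scores = [0, 1, 2, 3, 4]
# Shift the time scores so that the last occasion has time score 0.
centered_scores = [t - original_scores[-1] for t in original_scores]

I_FIRST, S_MEAN = 10.0, -0.15                      # hypothetical growth factor means
I_LAST = I_FIRST + S_MEAN * original_scores[-1]    # intercept mean after recentering

means_original = implied_means(I_FIRST, S_MEAN, original_scores)
means_centered = implied_means(I_LAST, S_MEAN, centered_scores)
print(centered_scores, I_LAST)
print(means_original)
print(means_centered)
```

The implied means are the same under both codings; only the meaning of the intercept growth factor changes, from the level at the first occasion to the level at the last.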
The H0 model is the model specified in the MODEL command.
I don't find any information about the H1 model in the GMM output. This message usually comes about with TYPE=BASIC;
You can do difference testing of a model with and without covariates as long as the set of dependent variables is the same and you don't mention the means, variances, or covariances of the covariates in the MODEL command.
Carolin posted on Thursday, June 16, 2011 - 12:52 am
Thanks a lot so far! I have the following information in the output of a GMM (TYPE = MIXTURE) before the random starts results are listed:
MAXIMUM LOG-LIKELIHOOD VALUE FOR THE UNRESTRICTED (H1) MODEL IS -5468.222
Can you explain to me which model is referred to here?
Concerning the difference testing: how can I do this (is there a special test? Could you tell me the syntax?).
They differ because the sampling distribution of the raw and standardized coefficients are not the same. You should report the significance of the raw for the raw and the standardized for the standardized.
Above, in 2006, you wrote that C_I & C_S (for example) output from a GMM "are the factor scores based on most likely class membership." Just to be OCD, what is the difference between the output I, S, & Q factor scores, and the C_I, C_S, & C_Q factor scores? I'm guessing that the former are therefore estimated for each observation without regard to the latent classes they would be assigned to, and so the best estimates to use would be the C_I, etc? Best, Bruce
Hi. I'm hoping to get a response to Bruce Cooper's question above. I am interested in estimating individual means in a GMM conditional on latent class. I'm not sure which factor scores to use. Thank you.
Bruce was on the right track. The I, S, & Q factor scores are scores mixed over the latent classes, whereas the C_I, C_S, & C_Q factor scores are the scores for the most likely class of the individual. I would use the latter.
Hee-Jin Jun posted on Monday, November 28, 2011 - 3:18 pm
Hi, I am running piecewise growth model. What is the default covariance structure for the growth model? My codes are below. Thanks for your help.
My question is related to Anne Chan's question (2011-06-24). I have regressed future sickness absenteeism (binary) on the intercept and the linear slope of two burnout dimensions exhaustion and disengagement (one LGM/burnout dimension). In the non-standardized results the effect of the slope on future sickness absenteeism is non-significant (p = .143) but in the standardized results it is significant (p = .014).
1) I read that the reason for why these two results differ was that the sampling distribution was different, but should there really be such a big difference in p-value and which of the two effects is "true"?
In the response to Anne's question about which effect to report you said that she should report the significance of the raw for the raw and the standardized for the standardized. However, this means that one can arrive at completely different conclusions depending on which effect is reported.
2) Is there really no "best practice" for how to report these effects or any literature that discuss this issue?
A related question to my previous post is about the effect of time in LGMs. In my case I have two variables and I want to report which of the variables that changes most over time. I therefore want to report the standardized effects.
How do I accurately present the change in the outcome variable in SDs for a time score increase of one (unstandardized) unit, e.g., how many SDs do the outcome variable increase in one year?
You can use ESTIMATOR=BAYES to explore this by looking at the credibility intervals which are not symmetric and looking at the distributions of the parameters.
If the model is linear, the mean of the slope growth factor tells you about the mean of the outcomes. At each time point, you can divide the mean of the slope growth factor by the standard deviation of the outcome at each time point.
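The calculation described in this reply can be sketched in Python: the slope growth factor mean is divided by the outcome standard deviation at each time point, giving the expected change per time unit in SD units. The slope mean and the standard deviations below are hypothetical placeholders; in practice they would be model-estimated quantities.

```python
def change_in_sd_units(slope_mean, outcome_sds):
    """Expected change per unit of time, expressed in SDs of the outcome at each time point."""
    return [slope_mean / sd for sd in outcome_sds]

SLOPE_MEAN = 0.30               # hypothetical mean of the slope growth factor
OUTCOME_SDS = [1.2, 1.5, 2.0]   # hypothetical outcome SDs at T1, T2, T3

standardized = change_in_sd_units(SLOPE_MEAN, OUTCOME_SDS)
for wave, value in enumerate(standardized, start=1):
    print(f"T{wave}: {value:.3f} SD per time unit")
```

Because the outcome SD can differ across time points, the same raw change corresponds to a different number of SDs at each wave, so it helps to state which SD was used when comparing two variables.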
Thank you for your reply. Because I am not familiar with Bayesian estimation I am uncertain when it comes to interpreting the results and specifying the model. Is it standard to include priors when specifying the model? Is it ok to estimate a model without priors? I am interested in the effects of the intercept and slope of burnout on future sickness absenteeism (SA). I have estimated a model without specifying any priors and the magnitudes of the effects are very different when I used BAYES (I on SA=0.329; S on SA=1.405) compared to when I used MLR (I on SA=1.051; S on SA=3.796). Is there any reason for these differences (besides different methods of estimation) and which of the effects is more correct?
In regards to my question about presenting change in SDs I also tried a somewhat different approach compared to the one you suggested. I used the mean and the SD at T1 to standardize the outcome variable at T2 and T3 (i.e., expressing change in SDs), this allowed me to get an average change in SDs for a unit change in time over the period. Does this approach seem ok to you?
I think you said your DV is binary in which case Bayes uses a probit link and the default ML link is logit, so that can explain the difference. If you choose LINK=probit in your ML run the results should be very close.
Yes, you can use Bayes with non-informative priors - this is the Mplus default. And it is the situation where ML and Bayes give asymptotically the same results.
Your change presentation seems reasonable if those means and SDs are model-estimated quantities.
I have run a latent growth model with a binary outcome using the default settings. I have then used dummy variables to predict the intercept and slope. What is the interpretation of these parameters? For example, variable A predicts the slope with a parameter of 0.9 - how do I interpret this - can I simply add it to the threshold of the slope to understand change over time for variable A (in comparison to reference group)? Or is it an odds ratio?
Many thanks, sorry if this question is simplistic!
Growth factors are continuous latent variables. When they are regressed on a set of covariates, linear regression coefficients are estimated. They are interpreted as usual.
Wen-Hsu Lin posted on Tuesday, April 08, 2014 - 6:49 pm
Hi, I am running a parallel LGM (fi fs, pi ps). The estimated means for each intercept (fi pi) and slope (fs ps) are clear (model 1). However, if I let fs be regressed on pi and ps be regressed on fi (model 2), there are no estimated means for ps and fs. How do I get the means? I compared the two models and found that the means for fi and pi are the same, so does that mean that the means of ps and fs will remain the same?
Jon Heron posted on Wednesday, April 09, 2014 - 8:24 am
when any variable becomes dependent (latent or manifest) you get an intercept and a residual variance rather than a mean and variance.
If you are still desperate to see latent variable means and variances then I think tech4 will give you them.
Wen-Hsu Lin posted on Thursday, April 10, 2014 - 6:37 pm
Thank you Jon. Please let me ask a related question. I regressed ps (peer support slope) on fi (family support intercept) and gender. I got an intercept for ps (.59), and the estimated path coefficient for fi is -.024 and for gender is .54. So, the equation is like: y(ps) = .59-.024(fi)+.54(gender). So my questions are: (1) what fi value shall I use in the regression to get y? (2) Is the y value related to the estimated mean for ps? In the regular LGM, the mean for ps is negative. So, does that mean males would increase the path of decline?
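How the intercept and path coefficients in this post relate to a mean of ps can be sketched in Python. For a linear regression, the expected value of the dependent variable is the intercept plus each coefficient times the mean of its predictor, so plugging in predictor means gives the implied mean of ps. The coefficients are the ones quoted in the post; the mean of fi and the proportion coded 1 on gender are hypothetical placeholders.

```python
# Estimates quoted in the post.
INTERCEPT_PS = 0.59
B_FI = -0.024
B_GENDER = 0.54

def expected_ps(fi_value, gender_value):
    """Expected slope factor value for given predictor values (or predictor means)."""
    return INTERCEPT_PS + B_FI * fi_value + B_GENDER * gender_value

MEAN_FI = 3.0       # hypothetical mean of the family support intercept factor
MEAN_GENDER = 0.5   # hypothetical proportion coded 1 on the gender dummy

overall_mean_ps = expected_ps(MEAN_FI, MEAN_GENDER)   # implied overall mean of ps
mean_ps_gender0 = expected_ps(MEAN_FI, 0)             # expected ps for gender = 0
mean_ps_gender1 = expected_ps(MEAN_FI, 1)             # expected ps for gender = 1
print(overall_mean_ps, mean_ps_gender0, mean_ps_gender1)
```

Under these placeholder means, the positive gender coefficient raises the expected slope value for the group coded 1 by .54 relative to the group coded 0 at the same fi value. The model-estimated means reported in TECH4 are the quantities to compare against a hand calculation like this.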
Dear Drs. Muthén: I have similar question to the above post. My model involved with two parallel LGM (ai as; bi bs), one outcome variable (y), and two exogenous variables (x1 x2). My question is that: (1) I regressed as on ai and x1 x2, I got estimated intercept (3.5) and path coefficients for each of ai x1 x2. So, I can have an equation as: as = 3.5+0.8(ai)+3(x1)+4.32(x2). I put estimated mean for ai to get predictive as outcome, is this right? (2) If my estimated mean for as is negative, which means growth is negative. So does that mean the greater value from above equation leads to great decline? Thank you so much.
The estimated mean of my slope (social support) is negative and is positive for the intercept. I then used these two latent variables to predict a distal outcome variable (delinquency). The estimated effect of the intercept on the distal outcome is negative but is positive for the slope. With regard to the intercept, the result means that the higher the initial level of social support, the lower the involvement in delinquency at a later time. This is consistent with theory. However, for the slope, the result showed that as social support gets higher, the individual is involved in more delinquency? But this is somewhat counter-intuitive. So, I am wondering if the true meaning of this result should be interpreted as follows: as the slope of social support gets higher, which means "steeper decline," individuals have a higher level of delinquency? Thank you.
It can be difficult to predict from two growth factors like this. I wrote up some experiences I had in the paper on our website:
Muthén, B., Khoo, S.T., Francis, D. & Kim Boscardin, C. (2003). Analysis of reading skills development from Kindergarten through first grade: An application of growth mixture modeling to sequential processes. Multilevel Modeling: Methodological Advances, Issues, and Applications. S.P. Reise & N. Duan (Eds). Mahwah, NJ: Lawrence Erlbaum Associates, pp.71-89.
See especially section 5.1.
One problem can also be high correlation between i and s.
I have run latent growth models with 3 time points and including time invariant and time varying covariates. The model estimation terminates normally but with most of the models I get this error message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.479D-11. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 35, APFY9
The named problem variables are both outcomes and covariates. Is it possible to find out what this means and whether it is problematic for the models?