I have some questions about conducting multiple group analyses with growth modeling.
Ultimately, I want to examine how family income's intercept and slope are associated with parenting's intercept and slope over three waves of data, and whether these associations vary by a family's poverty status. In other words, are increases in family income more beneficial for parenting in families that are initially poor than in families that aren't poor? Right now, however, I'm just trying to figure out how to understand multiple group analysis of an unconditional LGM.
In light of the fact that I am grouping families based on their initial income status, I expect that the intercepts and slopes will vary by group. My questions are:
1. I have conducted a multiple group LGM for three waves of family income data (there are 5 groups).
Based on my reading of pg. 303 in the user's guide [“The means and intercepts of continuous latent variables are fixed to zero in the first group and are free and not equal across the other groups as the default.”], I had expected that the means of the intercept and slope (because they are continuous latent variables) would be fixed to zero for group 1, but this wasn't the case. In fact, each group has a non-zero mean for the intercept and slope. What am I missing?
2. If the default is to allow the means of continuous latent variables (i and s) to be free and not equal across groups, and I want to demonstrate that they *are* different (compared to the null hypothesis that they are the same across groups), what would the syntax in the group-specific model command be? Or, is the more appropriate way to examine this question to focus on the path coefficients from income to parenting rather than the fact that the means of the intercept and slope vary across groups (especially since there is every expectation that intercepts are going to vary, since that's the basis of the group formation)?
3. Finally, if I want to demonstrate that the mean intercept of group 1 is significantly different from the mean intercept of group 3 (or any other group), is the appropriate way to do this to take the difference in the estimates and divide it by the standard error of the difference?
1. The text you refer to does not refer to a multiple group growth model. With a multiple group growth model, the means of the growth factors are free in all groups.
2. You can use a chi-square difference test to compare a model where the means of the growth factor(s) are free across groups to a model where the means of the growth factors are constrained to be equal across groups.
3. You can also use the chi-square difference test to do this.
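As a rough illustration of the chi-square difference test described in points 2 and 3, the Python sketch below compares a model with growth factor means held equal across groups against one with free means. It assumes a non-robust ML estimator and nested models; the fit statistics are hypothetical values, not results from this thread, and `chi2_sf` and `chi2_difference` are helper names introduced here.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)             # Q(1, t) = exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))  # Q(1/2, t) = erfc(sqrt(t))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1)
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Return (difference, df difference, p-value) for two nested ML models."""
    diff = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    return diff, ddf, chi2_sf(diff, ddf)


# Hypothetical fit statistics (assumed for illustration only).
# Restricted model: means of i and s equal across 5 groups; free model: means free.
diff, ddf, p = chi2_difference(31.7, 13, 6.2, 5)
print(f"chi-square difference = {diff:.2f}, df = {ddf}, p = {p:.4f}")
```

With five groups and two growth factor means, equality across groups adds 2 x (5 - 1) = 8 restrictions, which is where the 8 degrees of freedom in this made-up example come from. Comparing only two groups on one mean would give a 1-df test.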
In the past, when I estimate a model that includes a standardized coefficient that is >1.0, the output includes a message that the psi matrix is not positive definite. I just ran a relatively complex model (in which I have estimated growth factors for 6 variables with panel data and also estimate a structural model using the growth factors) and 4 of the standardized coefficients (StdYX) are >1.0; 3 of these are quite a bit larger than 1 (ranging from 1.4 to 1.6). Why didn't I get an error message? I'm assuming that this is not an acceptable solution--Am I right?
Some standardized coefficients can be greater than one. There is a discussion of this in Karl's Corner at the LISREL website www.ssicentral.com. If you want us to look further into your particular problem, please send your input, data, output, and license number to email@example.com
1. I assume that you mean a model with fixed linear time scores versus a model with some free time scores. To test if the free time scores are significantly different from their linear counterparts, you can subtract them and divide by the standard error of the free time score. Or you can do a chi-square difference test of the model with fixed time scores versus the model with free time scores.
3. For non-robust estimators, yes. For others see instructions on the website or use the DIFFTEST option if appropriate.
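For robust estimators such as MLR, the plain difference of two chi-square values is not chi-square distributed. The sketch below assumes that the website instructions mentioned above are the Satorra-Bentler scaled difference computation, which uses each model's chi-square value, scaling correction factor, and degrees of freedom. The function name and the input numbers are hypothetical.

```python
def scaled_chi2_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference for nested models fitted with a robust estimator.

    Model 0 is the nested (more restrictive) model, model 1 the comparison model.
    t = reported robust chi-square, c = scaling correction factor, d = degrees of freedom.
    """
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)  # scaling correction for the difference test
    if cd <= 0:
        raise ValueError("non-positive scaling correction; the test is not defined")
    trd = (t0 * c0 - t1 * c1) / cd        # scaled difference statistic
    return trd, d0 - d1


# Hypothetical output values (assumed for illustration only).
trd, ddf = scaled_chi2_difference(t0=30.0, c0=1.2, d0=10, t1=20.0, c1=1.1, d1=8)
print(f"scaled difference = {trd:.2f} on {ddf} df")
```

The resulting statistic is referred to a chi-square distribution with d0 - d1 degrees of freedom. When both correction factors equal 1, the formula reduces to the ordinary difference t0 - t1.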
Hi Dr. Muthen, I am estimating a quadratic growth model with multiple groups. However, rather than using the "i s q | y@0 y@1 y@2 y@3" syntax, I have set it up as a latent difference score model (with constant change and autoproportional change). I need to set it up this way because my final model involves another variable predicting the change in y.
I want to be able to use a chi square difference test to examine whether the growth parameters can be constrained to equality across groups (high and low SES). However, the default settings do not seem to allow me to set the mean slopes and levels to be equal across groups. Even when I label [slope] the same way in both model statements, the group 1 value is set to zero and the group 2 value for [slope] is estimated.
Dear Dr. Muthen, I want to run a multiple group LGA with multiple indicators. If I'm doing a multiple group LGA without multiple indicators I get mean values for all growth factors and for all groups (e.g., intercept mean for boys and girls). But if I use a multiple indicator LGA Mplus sets the intercept mean of group 1 equal to 0. I guess that's caused by the default settings. How can I override this default?
Hello, I am also conducting a multiple indicator LGM (with multiple groups) and would like to estimate the mean of the intercept growth factor. Therefore I fixed the intercepts of the factor indicators to 0 and freely estimated the mean of the intercept growth factor [i]. If I am doing a single group analysis everything works fine. However, as soon as I use the "grouping-option" the intercept and slope mean are set to 0 and the model doesn't work. Am I missing something important? Thank you very much for your help.
I'm also doing a multiple group analysis with a growth model of self-concept (4 waves). In fact, I am interested in differences in means and slopes between six groups. However, I also want to add covariates (I want to control for achievement) which means that I can no longer constrain the means of the intercept or slope across groups because they are only available in tech4 output, or is there a way to constrain them anyway? If not, then I should find another way to control for achievement. Do you have some suggestions? I thought about parallel processes, but I do not know if this is what I want to do (maybe even adding achievement as a covariate is not what we want). We want to model the growth for academic self-concept after controlling for achievement and we want to compare this growth across groups. Could I do this by first regressing self-concept on achievement for each time point and then take these 4 residuals as indicators for my latent growth factors?
When you have a conditional model, the intercepts rather than the means are of interest. You should look at the differences in the intercepts across groups. You might also want to compare the regression coefficient involving achievement across groups.
Thank you, Linda, for your quick response! I just want to ask some additional questions with regard to your answer. If intercepts are of interest how do I interpret these? Because they're not the same as the means I think... and can you represent them in a graph in Mplus, because I only get graphs with means? When you refer to the regression coefficient of achievement, do you mean the effect of achievement on the intercept and the slope and compare this across groups? Finally, might I infer from your answer that you do not think it necessary to work with residuals to control for achievement? Thank you for your time!
The intercepts of the growth factors are interpreted as in any linear regression. When you regress the intercept growth factor on achievement, you receive an estimate of the intercept of the intercept growth factor and a regression coefficient. The same is true for the slope growth factor. When you regress it on achievement, you obtain an intercept for the slope growth factor and a regression coefficient. It is these regression coefficients that I refer to. The model estimated values are used to compute the means for the PLOT command. I don't see any need to work with residuals.
Sorry for asking again, but I'm still not clear on how I can interpret my constraints. I do understand how to interpret the effects of my covariates on my slope and intercept, but then I want to add the constraints to my model, to see if there are differences between my groups in intercept and slope. However, because I have a conditional model I can only constrain the intercepts of my intercept and slope, but what would the fact that the intercepts of the intercept/slope are equal across groups mean? Could it not be possible that I find that the intercept of my intercept is equal across two groups when in fact the 'real' mean (total effect on the intercept) found in TECH 4 is not equal across these groups? Maybe I'm getting something wrong here...
Maybe it is helpful for you to think of the analogous situation in ANCOVA. In ANCOVA you have y, x1, and x2, where y is the posttest, x1 is the pretest, and x2 is the group (tx/ctrl). ANCOVA does not look at the y mean differences across groups as ANOVA does, but adjusts for pre-existing differences in x1 means, and considers the intercept as the tx effect. Think two parallel regression lines (assuming group-invariant slopes on x1) with y on the y axis and x1 on the x axis - the intercept is the difference.
In your case, i or s correspond to y, achievement corresponds to x1, and group corresponds to x2.
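The distinction between a growth factor's intercept and its mean in a conditional model can be made concrete with a little arithmetic. The sketch below assumes a single covariate and a linear regression of the growth factor on it, so the model-implied mean is the intercept plus the coefficient times the covariate mean. All numbers and the function name are hypothetical.

```python
def implied_factor_mean(intercept, coefficient, covariate_mean):
    """Model-implied mean of a growth factor regressed on one covariate."""
    return intercept + coefficient * covariate_mean


# Hypothetical estimates: both groups share the same intercept and coefficient
# for the intercept growth factor i, but differ in mean achievement.
group_a = implied_factor_mean(intercept=2.0, coefficient=0.5, covariate_mean=1.0)
group_b = implied_factor_mean(intercept=2.0, coefficient=0.5, covariate_mean=3.0)

print("implied mean of i, group A:", group_a)
print("implied mean of i, group B:", group_b)
print("difference in implied means:", group_b - group_a)
```

In this made-up case the intercepts are equal across groups while the implied means differ, purely because the groups differ in achievement. That is the ANCOVA logic: equal intercepts mean no group difference among students with the same achievement level, not equal unadjusted means.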
Thank you, this makes things more clear for me. However, I'm doubting if I should add achievement as a covariate, because I do not think this is enough to answer my research questions. I would like to compare the self-concept of equally able (or achieving) students across different groups and across time. Is it then okay to just add achievement as a covariate in each group-specific model? Or can you suggest other methods of analysis?
I have some questions on multiple group LGMs. I want to fit LGMs for continuous observed variables measured at 3 equally spaced time points. Participants were randomly assigned to one of four intervention groups, and I indicate these groups using the GROUPING option. I am interested in comparing LGM attributes between these groups.
Since my time points are equally spaced, I began by using fixed time scores (0, 1, 2) in the overall model statements. However, plots and model fit output for some of the observed variable LGMs indicate poor fit. I also received some PSI warnings. Upon closer inspection of the PSI offending groups, it was clear that their LGMs were nonlinear. Therefore, I experimented with different time score approaches (e.g., free time scores, logarithmic time scores, etc.) for groups where 0, 1, 2 scores resulted in poor fit. The resultant models fit much better.
Is this an acceptable approach when addressing differential or non-linear growth across multiple groups with only three measurement points? If so, can I specify differential time scores by using group-specific model statements? Also, I assume that comparing mean slopes and intercepts across different time score groups would not be advised, correct?
As an alternative approach, could I use added growth or quadratic models? Or would these methods not be advised given more than two groups and measures at only three time points?
You should fit a growth model in each group separately. If the same model does not fit in each group, comparisons across groups should not be made.
With only three time points, your options are limited. You have only one degree of freedom so if you free one time score, model fit cannot be assessed. You can fix logarithmic time scores as you suggest. You need four timepoints for a quadratic growth model.
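The degrees-of-freedom argument can be checked by counting. The sketch below assumes a single-group growth model for continuous outcomes with outcome intercepts fixed at zero, free and uncorrelated residual variances, and free growth factor means, variances, and covariances. The function name is introduced here for illustration.

```python
def growth_model_df(n_times, n_factors, n_free_time_scores=0):
    """Degrees of freedom for a single-group growth model under the stated assumptions."""
    # Sample statistics: n means plus n(n + 1)/2 variances and covariances.
    n_stats = n_times + n_times * (n_times + 1) // 2
    # Parameters: factor means, factor variances/covariances, residual variances,
    # plus any freely estimated time scores.
    n_params = (
        n_factors
        + n_factors * (n_factors + 1) // 2
        + n_times
        + n_free_time_scores
    )
    return n_stats - n_params


print("linear, 3 waves:", growth_model_df(3, 2))
print("linear, 3 waves, one free time score:", growth_model_df(3, 2, 1))
print("quadratic, 3 waves:", growth_model_df(3, 3))
print("quadratic, 4 waves:", growth_model_df(4, 3))
```

A negative count, as for the quadratic model with three waves, means the model has more parameters than sample statistics and cannot be identified without further restrictions. In a multiple group analysis without cross-group constraints, the per-group counts simply add up.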
Hi Linda or Bengt, I am going to be analyzing data from an intervention study with stroke victims with chronic language impairments. Our research team is very interested in changes over specific time intervals - specifically pre to post and post to follow-up - so my plan is to use a latent difference score approach. Given that the patients, by definition, have chronic impairments, I think it is reasonable to assume stationarity in the untreated control group and thus am planning on fitting a proportional change model. I was also thinking I would do a multiple-group analysis comparing the treated to the untreated controls but what isn't clear to me is which parameters I would expect to be different in the treated group versus the untreated group - would it be in the coefficients of the autoregressive effect of the pre-treatment score to the first latent difference score? If I weren't going to set it up as a multiple group analysis but rather just entered treatment as a dummy-coded time-invariant predictor it seems clear to me that I would test the treatment effect by testing whether the path from the treatment variable to the first latent difference score were significantly different from zero but I am not quite seeing what the test of the treatment effect would be in the multiple group approach. Thanks in advance for any light you can shed on this for me!
If you do it as a dummy-coded covariate you are saying that the first latent difference score has different means in the treatment and control groups. So that's what you want to mimic in the multiple-group analysis. The latter, of course, can handle many other group differences such as different slopes of the latent difference score regressed on the pre-treatment score.
Thanks Bengt. Perhaps my understanding of the latent difference score model is incorrect but my understanding of the readings I have done on the topic was that one does not directly estimate the means of the latent differences in the LDS approach. Rather, it appeared to me that one had to estimate those means by hand by applying the parameter estimates for (1) the regression of the first latent difference score on the pre-treatment score and (2) the loading of the first latent difference score on the constant change factor (which in the proportional change model I plan to fit would equal zero) into the equation for the latent difference score. For example, in the Mplus code provided in their Appendix A by King, King, McArdle, Shalev and Doron-LaMarca (2009) they constrain the means and variances of the latent difference scores to equal zero. Are those constraints unnecessary and I could instead run one model in which the difference score means are freely estimated and a second model in which the difference score means are constrained to be equal across the groups and then test the difference of those two models?
I am not up on latent difference score modeling, but the fact that King et al. restrict the means at zero (which I assume is for a single group) doesn't mean that you couldn't estimate a mean difference when having two groups. You fix it at zero in a reference group and let it be freely estimated in the other group (to represent the difference - as usual).
I am rather new to mplus and have been teaching it to myself over the last few months. I am trying to modify a latent variable cross lagged script to include twin groups. Here is the base model: EF1 BY secs1* (L1) err1*(L2) per1 * (L3) cc1 * (L4) nogo1 * (L5);
ASB2 ON ASB1 EF1; EF2 ON EF1 ASB1; EF1 WITH ASB1;EF2 WITH ASB2; I need to add the biometric decomposition for each of the latent factors but even just a script/example for a univariate latent variable would be helpful! Thank you in advance!
I am doing multi group analysis. I have four groups, the sample sizes are 450, 200, 99, and 190. As you see, one group is small (n=99) compared to the other groups. Will it be problematic when conducting multi group analysis?
The other question is if I can add exogenous variables predicting intercept and slope in only two of four groups. That is, when conducting multiple group analysis, should the model be the same across groups? Or can I add predictors in some of the groups, or add different predictors across groups?
Dear Dr. Muthen, I am doing multi group analysis LGM of reading skills with two groups and two predictors (cognitive abilities and socio-economic status). A reviewer asked whether I might really say that I am controlling the predictors across groups. He argues that because I did not constrain the regression coefficients to be equal across groups that I am only referring to group-specific values of the predictors. To my knowledge, the LGM results are based on achievement values with the influence of my predictors partialed out (intercepts). It seems reasonable that the regression coefficients of my predictors vary across groups; I do not have any hypothesis about this influence. Therefore, I assumed that I am allowed to compare the achievement development across groups saying that I control for cognitive abilities and SES. Am I missing something? Thank you in advance!
Many thanks for your quick response! Do you suggest any other method of analysis in this case? Is there a better option to control for predictors in multi-group LGM? Unfortunately, matching is not possible in this sample. Thank you!
I can't think of any alternative. You can test whether the regression coefficients are equal across the groups using difference testing or MODEL TEST. If they are, you can compare the intercepts. If they are not, then you should not do this.
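For intuition about what a test of equal regression coefficients across two groups involves, the sketch below computes a simple Wald-type z statistic by hand. It assumes the two group estimates are independent (no parameters constrained equal across groups), so the variance of the difference is the sum of the squared standard errors. The estimates are hypothetical, and a test done inside the software would use the full estimated covariance matrix rather than this shortcut.

```python
import math


def z_test_difference(est1, se1, est2, se2):
    """z statistic and two-sided p-value for est1 - est2, assuming independent estimates."""
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Hypothetical coefficients of the intercept growth factor on a predictor in two groups.
z, p = z_test_difference(est1=0.5, se1=0.1, est2=0.2, se2=0.1)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

If the coefficients cannot be distinguished, comparing intercepts across groups is interpretable as described above; if they clearly differ, the group difference depends on the predictor value and a single intercept comparison is not meaningful.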
I am doing multigroup LGM with two groups (low future orientation and high future orientation) and 4 distal outcomes. I am trying to run a fully constrained model and compare it to a model in which the structural paths are released. From the documentation, I see that the intercepts, thresholds and factor loadings are held equal by default, but that I need to fix the residual variance, factor means, variances, covariances, and regression co-efficients. Below is what I have done so far. Is this correct? I think I am missing the covariances?
Variable: Names are id violob10 .... fut1di fut1di2; Missing are all (-9999) ;
Usevariables are victmiz1 victmiz2 victmiz3 victmiz4 fut1di2 nvdel10 victim10 vioapr10 violb10 ; grouping is fut1di2 (0=low 1=high) Analysis: type=mgroup;
The first thing you should do is estimate the growth model in each group separately to see if the same growth model fits well in both groups. If not, multiple group analysis would not make sense. If you proceed to multiple group analysis, you should first test the residual variances which are measurement parameters not structural parameters. Then test the structural parameters as shown adding the covariance between i1 and s1.
Hello, I am grasping to understand the results from a multiple-group growth model with individually varying time-scores and a count outcome variable. Time was defined as age centered at an early point (50 years of age in a sample of older adults). The outcome measure was a count of physical limitations modeled with a Poisson distribution (COUNT are...).
For the oldest group with a mean age of 80 at initial measurement, the intercept was -1.21 and the slope was .673. When exponentiated, this translates into an initial count of .30 limitations with a slope of 1.96. Age was included as a covariate and had a value of -.06 (exp = .94) on the intercept.
To calculate the estimated number of initial limitations for an 80 year old, I would assume this is correct: =exp(-1.21 + (30*-.06))= .0526. With that said, it is not clear why the oldest group of adults would have such a low count of physical limitations.
The -0.06 was the estimate for the covariate when the intercept of physical limitations was regressed on age.
Previously I had centered the time scores for each cohort on the cohort-mean age, but in this round I centered the time scores for all cohorts on age 50. I am thinking I would take the mean intercept representing the predicted number of limitations for a given cohort at age 50, then add to that the product of the covariate estimate for age multiplied by the number of years I want to move out from age 50.
The model currently allows for the intercept and slope to be estimated freely for each cohort, meaning it is not an accelerated design. Thanks as always for your time.
I would look at the sample count distribution for the subjects of age 80 and compare that to the distribution based on the estimated count mean at age 80. If they don't match well, perhaps the growth model is off.
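The arithmetic in the question can be reproduced directly, since a Poisson growth model works on the log scale and expected counts are obtained by exponentiating the linear predictor. The sketch below uses the rounded estimates quoted in the post (-1.21, .673, and -.06) and assumes the age covariate enters as years beyond the centering age of 50. The function name is introduced here.

```python
import math


def expected_count(log_intercept, coefficient, covariate_value):
    """Expected Poisson count implied by a log-linear predictor."""
    return math.exp(log_intercept + coefficient * covariate_value)


intercept = -1.21    # intercept of the intercept growth factor (log scale), from the post
age_effect = -0.06   # coefficient of age on the intercept growth factor, from the post
slope = 0.673        # slope mean (log scale), from the post

print("count when the covariate is 0:", round(math.exp(intercept), 3))
print("multiplicative change per unit of time:", round(math.exp(slope), 3))
print("count at 30 years past age 50:", round(expected_count(intercept, age_effect, 30), 4))
```

With these rounded inputs the last expression evaluates to roughly 0.049, slightly below the .0526 quoted in the post, which may reflect unrounded estimates. Either way the implied count for an 80-year-old is very low, which is why comparing it with the observed count distribution at that age is a useful check.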
Dear Dr(s) Muthen, I am running a latent difference score analysis, modelling constant change, proportional change and cross-lagged paths to look at the temporal relationship between alcohol use and psychological symptoms over time. This involves four separate models, each looking at the relationship between alcohol use and one group of symptoms. While I can get two of the models to converge using the raw data, I have found that I need to standardize the other two in order to achieve convergence. I have tried dividing one and both of the variables in each model by a constant; however, even though the variances of the variables become more similar (and less than 10), I still get the error message relating to the psi matrix being non-positive definite. I have also tried centering the variables, with the same outcome. Do you see any problems with using standardized variables in this kind of analysis?
I would not standardize to avoid a convergence problem. I would try to determine the cause of the problem. I would also not standardize with a growth model.
Cindy Huang posted on Monday, March 17, 2014 - 11:37 am
Dear Drs. Muthen,
I am doing a multiple group LGM with several predictors, and am wondering if I need to be interpreting the standardized or unstandardized betas for the results. The output is showing different significant predictors depending on whether I'm looking at the unstandardized or standardized results (there are more significant effects when looking at the standardized results). Can you please provide some clarification on this issue?
Raw and standardized coefficients have different sampling distributions so their significance can vary. You need to decide which to use based on practice in your field.
xiaoyu posted on Thursday, April 17, 2014 - 4:28 pm
Dear Dr. Muthen, I was running a multiple group LGM with robust estimators of MLR. The chi-square value of the multiple group is not equal to the sum of two univariate LGMs, but the DF is equal to the sum of the two univariate LGMs. Is this normal for MLR estimate?
I ran a multiple group LGM before with ML estimate. Both the chi-square and DF of the multiple group are equal to the sum of the two univariate LGMs.
xiaoyu posted on Tuesday, April 22, 2014 - 4:22 pm
Dear Dr. Muthen,
Thanks for your help. I have one more question. For the multiple group LGM comparison with covariates, is there any way to plot the graph after controlling for the covariates? I have the plot command (see below), but the plot is the one without any covariates even though these covariates are in my multiple group LGM models.
Simone L. posted on Friday, November 21, 2014 - 4:54 am
Dear Dr. Muthen, I'm running a multiple group LGM and I get a negative residual variance. So I fixed the variance of the variable to zero. But when examining the output, the residual variance of math9 still remains negative, while the residual variance of math7 is zero - am I doing something wrong?
I have a couple questions about an interaction I found using multigroup modeling. I am using the default estimator (ML) to examine differences in a mediational path among individuals with variation in a specific genotype (grouping variable). I have one dichotomous predictor (presence of maltreatment), continuous mediator (slope of emotional reactivity) and continuous outcome (personality pathology).
When comparing an unconstrained versus constrained model using a Chi-Square difference test, the path between the predictor and the mediator significantly differs by group, indicative of an interaction with genotype.
Questions:
1. How would you suggest plotting this interaction? Is it possible to do using multigroup modeling?
1a. I attempted to examine this model using an interaction term, allowing for the plot function. However, the interaction term is then only marginally significant and I am not sure why this would be. Syntax for the model is pasted below.
MODEL: y ON m x z xz; m ON z x xz;
MODEL INDIRECT: y MOD m z (0, 1, 0.1) xz x;
PLOT: TYPE = PLOT2;
OUTPUT: Sampstat stdyx
2. If utilizing the multigroup model, is there a way to test if the total indirect effect also differs by group?
1. The model (1) with z and xz as covariates is a bit different from the (2) multiple-group model. Unless you have group-varying residual variances you can make (1) be specified exactly the same as (2) with the same number of parameters. The results should then agree.
2. You can use Model Constraint to express the total indirect effect for each group and difference between groups in terms of model parameter labels. That difference is given a z-test.
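To show what the group difference in the indirect effect and its z-test amount to numerically, here is a hand calculation using the delta method. It assumes a single mediator, so the indirect effect is the product a*b, independent groups, and a zero covariance between the a and b estimates unless one is supplied. The software's own computation uses the full parameter covariance matrix, so this is only an approximation, and all numbers are hypothetical.

```python
import math


def indirect_effect(a, se_a, b, se_b, cov_ab=0.0):
    """Indirect effect a*b and its delta-method standard error."""
    variance = b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2 + 2.0 * a * b * cov_ab
    return a * b, math.sqrt(variance)


def indirect_difference_z(group1, group2):
    """z statistic for the difference of two indirect effects from independent groups."""
    ind1, se1 = indirect_effect(*group1)
    ind2, se2 = indirect_effect(*group2)
    return (ind1 - ind2) / math.sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical (a, se_a, b, se_b) for each genotype group.
z = indirect_difference_z((0.5, 0.1, 0.4, 0.1), (0.1, 0.1, 0.4, 0.1))
print(f"z for the difference in indirect effects = {z:.3f}")
```

Because products of coefficients are not normally distributed in small samples, a bootstrap or Bayesian interval for the difference is often preferred over the z-test when the sample is modest.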
RuoShui posted on Monday, November 14, 2016 - 3:34 pm
Hi Dr. Muthen,
I have a conditional latent growth curve model with a series of covariates (recoded and centered). I would like to compare across two groups whether the initial status and slopes are different. But I don't think I can use, for example [i] (1) because from what I understand from other threads that the mean of the initial time point only equals the intercept of the intercept growth factor when all the other covariates are zero. If I want to compare the initial status of the whole sample between the two groups, how should I do this?
Try centering the covariates in each group so that the group-specific [i] refers to them being at the group's covariate means.
RuoShui posted on Monday, November 14, 2016 - 6:37 pm
Thank you. My covariates are dichotomous variables coded as 0 and 1 and another covariate is SES, which is a standardized z-score. Do I still need to center the covariates? Could you please provide a hint of the syntax? Thank you very much
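Centering within each group can also be done as a data-preparation step before the analysis. The sketch below is plain Python rather than Mplus syntax and shows what the operation does: each covariate value has its own group's mean subtracted, so that a value of zero refers to the group average. The function name and data are hypothetical.

```python
from collections import defaultdict


def center_within_groups(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    means = {group: totals[group] / counts[group] for group in totals}
    return [value - means[group] for value, group in zip(values, groups)]


# Hypothetical SES scores and a 0/1 dummy for five people in two groups.
group = [0, 0, 0, 1, 1]
ses = [1.0, 2.0, 3.0, 10.0, 20.0]
dummy = [0, 1, 1, 0, 1]

print(center_within_groups(ses, group))
print(center_within_groups(dummy, group))
```

A variable standardized in the whole sample generally does not have mean zero within each group, and a 0/1 dummy centered this way becomes a deviation from the group's proportion of ones. After centering, the group-specific [i] refers to a person at that group's covariate means.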
Muthén, B., Khoo, S.T., Francis, D. & Kim Boscardin, C. (2003). Analysis of reading skills development from Kindergarten through first grade: An application of growth mixture modeling to sequential processes. Multilevel Modeling: Methodological Advances, Issues, and Applications. S.R. Reise & N. Duan (Eds). Mahwah, NJ: Lawrence Erlbaum Associates, pp. 71-89.
Mark Wade posted on Friday, December 22, 2017 - 8:01 am
I'm performing a latent growth model with known classes (multiple group analysis) using the knownclass feature with type=mixture. I have 3 known classes/groups and 3 waves of data collection. I'm using a Bayesian estimator.
I'd like to compare the means and variances of i and s across groups; but I'm unsure if there is an equivalent method to constraining and freeing parameters and doing a chi-square difference test across models as in the case of multiple-group analysis using MLR. Is there a way of testing group differences in the means and variances of i and s using a Bayesian estimator with the knownclass option and type=mixture?
.....
CLASSES = cg(3);
KNOWNCLASS = cg (Group=0 Group=1 Group=2);
MISSING ARE ALL (999);
ANALYSIS: TYPE = MIXTURE; ESTIMATOR = BAYES;
MODEL:
%Overall%
i s | DMSper8@0 DMSper12@1 DMSper16@2;
i s ON Gen BW;
%cg#1%
i s | DMSper8@0 DMSper12@1 DMSper16@2;
%cg#2%
i s | DMSper8@0 DMSper12@1 DMSper16@2;
%cg#3%
i s | DMSper8@0 DMSper12@1 DMSper16@2;