Anonymous posted on Saturday, March 10, 2001 - 7:43 pm
Hi Bengt and Linda
I have 2 groups (males and females) and I'd like to compare their growth trajectories in math over grades 1 - 6. With two different trajectories for the two groups, is it possible to test whether the two expected scores (one for each group) are significantly different at each time point?
Typically in a 2-group growth model you want the time scores (the growth rate factor loadings) to be equal across groups, allowing for differences across groups in the intercept and growth rate factor means. So a test of equality of expected values at all time points could be done by testing equality of means of the growth factors. Testing instead the equality of the mean at a particular time point is a little more involved because the mean is a function of several parameters. You can do this using the "delta method", or you can do this by changing the centering point. The centering point is the time point that you use for determining the intercept factor (where the slope factor loading is zero). At the centering point, the observed mean is the intercept factor mean, so that gives you a test.
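As a rough sketch of the two tests described above (not a definitive setup), the input below labels the growth factor means in each group, uses MODEL TEST for a joint Wald test of equal growth factor means, and uses MODEL CONSTRAINT to define the group difference in the expected mean at grade 3, for which Mplus reports a delta-method standard error. The file name, variable names, group codes, and parameter labels are all assumptions made for illustration.

```
TITLE:    Two-group linear growth (illustrative sketch)
DATA:     FILE = math.dat;            ! hypothetical file name
VARIABLE: NAMES = gender math1-math6; ! hypothetical variable names
          GROUPING = gender (1 = male 2 = female);
MODEL:    i s | math1@0 math2@1 math3@2 math4@3 math5@4 math6@5;
MODEL male:
          [i] (im);                   ! intercept factor mean, males
          [s] (sm);                   ! slope factor mean, males
MODEL female:
          [i] (ifem);                 ! intercept factor mean, females
          [s] (sfem);                 ! slope factor mean, females
MODEL TEST:                           ! joint Wald test, 2 df
          0 = im - ifem;
          0 = sm - sfem;
MODEL CONSTRAINT:
          NEW(diff3);                 ! grade 3 has time score 2
          diff3 = (im + 2*sm) - (ifem + 2*sfem);
```

The centering alternative would instead use time scores such as math1@-2 math2@-1 math3@0 math4@1 math5@2 math6@3, so that the difference in the intercept factor means is itself the grade 3 difference.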
If one has multiple cohorts, should these be modeled separately, or should they be modeled together in one input file even if their trajectories are hypothesized to be different?
Also, can Mplus estimate S-shaped curves, or only linear and U-shaped ones?
bmuthen posted on Wednesday, May 11, 2005 - 2:09 pm
You may want to analyze multiple cohorts in a multiple-group run since often you want to test if development is similar or different.
You can handle s-shaped curves by either estimating time scores (with some fixed for identification purposes) or fixing the time scores according to an s shape (small time score increases early on, larger in the middle, and small later on).
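The two options above might be written roughly as follows. This is only a sketch: the outcome names y1-y6 and the fixed time scores in the second version are made-up values chosen to mimic an S shape, not recommendations.

```
! Option 1: estimate some time scores; the first and last
! are fixed for identification, the rest are free
! (the numbers after * are only starting values)
MODEL:    i s | y1@0 y2*1 y3*2 y4*3 y5*4 y6@5;

! Option 2 (use instead of option 1): fix all time scores
! to an S shape, with small steps early, larger steps in
! the middle, and small steps late
! MODEL:  i s | y1@0 y2@0.3 y3@1.2 y4@3.8 y5@4.7 y6@5;
```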
Hi there - I am dealing with a similar scenario as the person who asked the above question. I am comparing the growth trajectories of learning scores (over 12 evenly spaced time points) of two groups of rats (control vs experimental group). The change in learning scores is non-linear. In fact, it looks hyperbolic with a stable asymptotic level. So far, I haven't come across any mention of hyperbolic growth curves in structural equations modelling. Would you be able to give me some pointers on the type of analysis that is most appropriate for my problem? Thank you very much for your time
Non-linear growth curves have been discussed in the psychometric literature by Browne and du Toit and by Cudeck. The former discussed curves such as the Gompertz and was published in L. Collins & J. Horn (eds.), Best Methods for the Analysis of Change: Recent Advances, Unanswered Questions, Future Directions. Washington DC: American Psychological Association. The latter appeared not long ago (in MBR?). Some of these models can be made to fit into the SEM framework. Often, simple approximations would seem feasible, such as a square root function with a specific ceiling, but I am not sure here.
More generally, growth models that are nonlinear in the growth factors have been discussed in books such as the Chapman-Hall book by Davidian and ?
Hi! I was in your short courses in Montreal! Thank you for the seminar! I've been doing a lot of things since Monday, and I have a basic question: when comparing two groups with BIC and AIC to choose the number of latent classes, is there a maximum value of BIC that counts as a good value? I have a model with a BIC of 11 000, which seems large to me. Thank you for your help, Annie
When you choose the number of classes, you are comparing BIC and other measures for models with different numbers of classes. You want the model with the lowest BIC among other things. You might find Bengt's paper in the Kaplan edited book under Recent Paper on this website helpful.
Scott Ronis posted on Tuesday, January 16, 2007 - 11:50 am
Regarding your comment posted on 3/12/2001, how does one test the difference of growth factors (intercept & slope)? Should we do this by hand with a t-test or ANOVA? If so, should we use the std errors of the int. & slope from the output in the denominator? Thanks.
Hello: I'm examining whether stress and depression growth over time are similar for men and women. The error message is "no convergence, iterations exceeded." What am I doing wrong? I sent the raw dataset under 'stressfinal'. My syntax is as follows:
DATA: FILE is Z:\stressfinal.dat;
FORMAT is FREE;
TYPE is INDIVIDUAL;
VARIABLE: NAMES ARE v103 v2618 v6618 stress1 stress2 stress3 socsum v10916;
USEVARIABLES= v103 v2618 v6618 v10916 stress1 stress2 stress3 socsum;
GROUPING IS V103 (1=Male 2=Female);
TITLE: Growth Model by social status for stress measures;
ANALYSIS: TYPE= MEANSTRUCTURE;
TYPE=MISSING H1;
ESTIMATOR=ML;
H1ITERATIONS=900000;
MITERATIONS=900000;
MODEL: i1 By stress1 stress2 stress3 socsum@1;
s1 BY stress1@0 stress2@1 stress3@2 socsum@3;
i2 BY v2618 v6618 v10916@1;
s2 BY v2618@0 v6618@1 v10916@2;
s1 ON i2;
s2 ON i1;
[stress1-stress3@0 v2618-v10916@0 i1-s2];
OUTPUT: SAMPSTAT MODINDICES (10) RESIDUAL TECH3 TECH4 TECH5 CINTERVAL STANDARDIZED;
Hello, I am looking at PTSD symptomatology outcomes in three intervention groups; data were collected at three points. However, because there are significant differences in baseline scores among the three groups, I want to control for baseline scores. I tried two different ways of doing this (1 and 2 below). The model will not run with option 2; it did with option 1. Am I specifying my model correctly in option 1? That is, does model 1 compare the growth in the three groups controlling for baseline scores (among other things)?
1 Model:
iu su | PDStot21@0 PDStot22@1 PDStot23@2 PDStot21 ;
iu su on Dgroup1 Dgroup2 TxMOD raceeth1 raceeth2 Age controlled1 controlled2 controlled3 ;
2 Model2:
iu su | PDStot21@0 PDStot22@1 PDStot23@2 ;
iu su on Dgroup1 Dgroup2 PDStot21 TxMOD raceeth1 raceeth2 Age controlled1 controlled2 controlled3 ;
I am using only two dummy group covariates (Dgroup1 and Dgroup2); the others are variables I want to control for (TxMOD = treatment modality, race and ethnicity, and being in a controlled environment at the three points of data collection 1, 2, and 3).
So the baseline score is PDStot21. Is that a pre-intervention measure? If so, you can follow the Muthen-Curran (1997) Psych Methods approach. Your first alternative does not control for baseline. Your second alternative is rejected because it uses the baseline score both as a covariate and as an outcome.
If one wants to control for initial status scores in intervention studies, I found two approaches on your Mplus short course slides. One is to regress the intercept (initial status, centered at T1) on, let's say, "treatment". The other is to regress the slope on the intercept. Are there any differences between these two approaches?
The Muthen-Curran (1997) Psych Methods approach is to regress slope on intercept in the treatment group, where intercept is centered at the pre-intervention time point. I don't see how regressing intercept/initial status on treatment would make sense and I don't recall describing such an approach - but perhaps I am misunderstanding.
Sorry, I was wrong with the first approach. I overlooked that you did not center the intercept at T1. Does the second approach make sense, when using two part modeling? (I need the covariance of the intercepts of the two parts).
I have 2 groups (males and females), for each of which a linear function provided the best fit to the offending data. I also included several time-invariant covariates (all measured at baseline). I would like to test whether there are gender differences in how these covariates are related to the growth factors. For example, delinquent friends were significantly associated with both the intercept and the slope of offending for both gender groups. Is it possible to test whether the strength of this association differs across gender groups? Thank you.
Thank you, Linda. Can I ask a related question? I identified 4 sub-classes of boys with distinct offending trajectories. Can I estimate the extent to which the intercepts and the slopes differ across the classes of boys? Which command should I use? Can I also estimate the extent to which time-invariant covariates are associated with the growth factors across latent classes of boys? Arina.
To test the intercept and slope mean differences across the classes you want to use Wald chi-2 testing via Model Test. You can't do a likelihood-ratio chi-2 test because the run holding them equal across classes will change the class formation.
You can do the same to test class-invariance of the influence of time-invariant covariates, although here I think it would be OK to do it alternatively via a likelihood-ratio test.
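A minimal sketch of the Wald test via MODEL TEST in a growth mixture model is shown below. Two classes are used for brevity, and the outcome names y1-y4, the class variable c, and the labels are assumptions for illustration. All restrictions listed under MODEL TEST are tested jointly, so testing the intercept and slope means separately would take separate runs.

```
VARIABLE: NAMES = y1-y4;              ! hypothetical variable names
          CLASSES = c (2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
          %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3;
          %c#1%
          [i] (i1);                   ! class 1 growth factor means
          [s] (s1);
          %c#2%
          [i] (i2);                   ! class 2 growth factor means
          [s] (s2);
MODEL TEST:                           ! joint Wald test, 2 df
          0 = i1 - i2;
          0 = s1 - s2;
```

With four classes, the same idea extends by labeling the means in each class and listing restrictions such as 0 = i1 - i2; 0 = i2 - i3; 0 = i3 - i4;.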
I understand that in categorical LGM's the intercept mean is set to zero and modeled by threshold parameters instead. I have two trajectory classes in a binary growth mixture model and I want to determine whether both intercept means are different. In one class the intercept mean is zero (default) and in the other class sig. different from zero. Does that already suggest a sig. mean difference or do I have to conduct an equality test (set the intercept mean to zero in both classes to test for equality)? Many thanks!
You can look at the multiple indicator growth example in the Topic 2 course handout to see how measurement invariance is tested across time for continuous outcomes. For categorical outcomes, the invariance is related to thresholds and factor loadings. The models described in the user's guide on pages 399-400 for testing measurement invariance across groups for categorical outcomes can be applied across time.
Hi, I have a conditional LGM where the intercept and slope are predicted by gender and the slope is predicted by treatment (randomized control group design). To get estimated growth curves separately for treatment and control, I estimated separate growth curves for each of the two groups and fixed all parameters to the estimates from the overall model. Only the slope means were freely estimated in the separate growth models, to capture treatment effects. All works fine and the growth curves differ as expected, but the pre-test mean estimates (intercepts) differ very slightly. This puzzles me, because treatment and control should have the same mean at pre-test. Is it because the separate models are still conditional on gender (which influences the intercept at T1)? Does my approach correctly reflect the estimated growth curve of the overall conditional model?
OK, as far as I understand, because I have (for instance) slightly more boys in the treatment group, the mean of aggression in the treatment group is slightly higher at T1 in my separate model (even though treatment does not influence the intercept)? That sounds reasonable when considering the regression equation. BTW, adjusted means were not available in my kind of analysis, so this approach was my only chance to get the trajectories.
Maybe you have categorical outcomes inducing numerical integration for which the Adjusted means plot is not available with covariates. An alternative then is to use Type = Mixture with gender and tx as latent class variables that have known class membership.
I am modeling a piecewise regression involving three groups.
When I set up my model, I first tested it with all the data combined (disregarding the grouping factor). The chi-square value for model fit was 14.049, 15 d.f, p >.50.
When I tested for the three different groups, allowing their slopes to be different, I got an overall fit that was much worse, especially in relation to one group in particular. After making some modifications to that group's model, the best I get is chi-square = 52.56, 37 d.f., p = .0465.
Am I correct in interpreting this as saying that the overall model fits much better than the model that allows variation between the groups?
I would say that the model that fits well in the total sample does not fit well for each group. I would start with each group and get a well-fitting model for each group. Only if the groups have the same growth model does it make sense to compare growth factor parameters across groups.
Hello. I am estimating a growth model with categorical outcomes and 3 time points. Estimation of the model based on the overall sample terminates normally. However, when I try to estimate a two-groups model (males and females), I get the following warning: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 49. THE CONDITION NUMBER IS -0.406D-17. Parameter 49 is the intercept term in the alpha matrix. What does this warning mean? How can I fix the problem? Thanks!
Li Lin posted on Thursday, September 09, 2010 - 7:58 pm
Dr. Muthen, For a semicontinuous outcome, suppose we have only pre- and post- measurements and we want to test the change in proportion in the binary part and the change in mean in the continuous part. What kind of model would you suggest? Thanks!
You can use DATA TWOPART to create the continuous and binary parts of the variable. Then estimate the following model:
MODEL: [y1] (p1); [y2] (p2); y1 WITH y2;
[u1$1] (p3); [u2$1] (p4); f BY u1@1 u2; [f@0]; f@1;
where the factor loading for u2 is the covariance between u1 and u2.
MODEL CONSTRAINT: NEW (ydiff udiff); ydiff = p1 - p2; udiff = p3 - p4;
Li Lin posted on Wednesday, September 15, 2010 - 4:38 pm
Thank you very much! The program ran successfully, and both changes were significant with ydiff=.4 and udiff=13. In order to understand the model, I ran another one with code " MODEL: [y1](p1); [y2](p2); f2 BY y1@1 y2; [f2@0]; f2@1; [u1$1](p3); [u2$1](p4); f BY u1@1 u2; [f@0]; f@1; f WITH f2;" This model output had ydiff=.3 (p<.01) and udiff=3 (p=.2). Why the first model instead of the 2nd? The second one seems more direct to me, and I don't understand why no relationship is specified between the binary part and the continuous part in the first model. Could you elaborate on the reason behind the model choice? Thanks!
Li Lin posted on Thursday, September 16, 2010 - 8:02 pm
Another question - what link function is used in the two-part model? Is it probit with MLR and logit with WLS for the binary part, and identity for the continuous part? Thanks!
You are correct that the processes should be correlated. I would use the model with f1 WITH f2. The model should be estimated using maximum likelihood where the default link is logit.
Li Lin posted on Friday, September 17, 2010 - 1:49 pm
Thank you, Linda. If I use results from a pilot study to specify those parameters, can Monte Carlo simulation in Mplus accommodate this kind of model for sample size? I tried, but always got "THE POPULATION COVARIANCE MATRIX THAT YOU GAVE AS INPUT IS NOT POSITIVE DEFINITE AS IT SHOULD BE."
If you get that message, you are either not specifying population parameter values for all parameters in the model in which case the value zero is used or the values you are using result in a population covariance matrix that is not positive definite.
Li Lin posted on Friday, September 17, 2010 - 4:21 pm
Thank you! After correction, I ran MODEL population: [y1*0]; [y2*-.4]; y1-y2*.8; f1 BY y1@1 y2*.6; [f1@0]; f1@1; [u1$1*-3]; [u2$1*-6]; f2 BY u1@1 u2*8; [f2@0]; f2@1; f1 WITH f2*.7; The output had a population value of 0 for both of the intercepts (y1 and y2) and both of the thresholds (u1$1 and u2$1), while the average estimates were close to the values I gave. Also, the population values for udiff and ydiff were .5. What was wrong with my code?
Li Lin posted on Thursday, September 23, 2010 - 12:53 pm
Still on the model posted above: since the research question is not about the factor loadings, the covariance between f1 and f2, or the residual variances, can I fix those values, i.e. use "@" instead of "*"? By doing so, power to detect udiff and ydiff should be increased.
Jessica posted on Tuesday, March 15, 2011 - 6:08 pm
I am working with an existing data set to compare the growth trajectories of two groups of learners (impaired vs typical). As the original data set stands, the impaired group is only about 11% of the population (n=141). Would you suggest:
1) randomly selecting 10% of the typical population so the groups are even (though with a much smaller final n that probably doesn't have sufficient power)?
2) using all participant data (n=1228) despite having significantly unequal groups?
3) based on power analysis (MacCallum, Browne, & Sugawara, 1996) for RMSEA between .05-.1, power .8, alpha .05, 4 dfs, 682 participants are needed - so using the 141 impaired, and randomly selecting 541 typical...which still results in unequal groups?
Thanks for your input! Any suggested references to back up the decision would be greatly appreciated.
Hi, I have a model in which I want to predict a growth process of daily hassles (h) on a growth process of depression (d): id ON ih; sd ON sh; sh ON ih; sd ON id; However, when I look under "model results" the intercept of depression seems very small and is much lower than the estimated mean of the intercept in the TECH 4 section. However, when I allow for correlation between the latent factors of the two processes (instead of regression) the intercepts under "model results" and TECH4 are nearly the same. Do you have an answer to the large difference found in the regression approach as opposed to the covariation approach?
Thank you, what I actually wondered about was the large difference between the intercept mean under "model results" compared to TECH4? Is it because the intercept mean in TECH4 is not affected by the predictors in the model? Are both ML estimates?
You do not get an intercept in TECH4. TECH4 gives the mean. TECH4 gives the model estimated means, variances, and covariances of the latent variables in the model.
xybi2006 posted on Friday, January 16, 2015 - 8:54 pm
Dear Dr. Muthen, I ran three latent growth models: an intercept-only model, a linear LGM, and a quadratic LGM. In addition to using (a) the chi-square difference test and (b) AIC, BIC, and ABIC to select the model, can I also include the significance of the growth factors (i.e., linear slope, quadratic slope) as one of the criteria for selecting the model? If so, do you mention this in any of your publications so that I can cite it in the manuscript that I am working on? Thank you,
Not sure I have anything you can cite, but it seems like BIC is useful here. If you start with a quadratic, the first thing that is often not needed according to BIC is the quadratic variance, while the quadratic mean may be significant. You can't do proper (z-score or) chi-2 diff testing when one alternative has a non-zero variance and the other a zero variance (border problem).
xybi2006 posted on Sunday, January 18, 2015 - 5:51 am
So, normally the deviance statistic test is not appropriate for comparing a linear LGM with a quadratic LGM, if I understand your posting above correctly.
BIC is useful here. How about AIC?
Also, are the linear LGM and the quadratic LGM nested models or not?
AIC may be good also, but doesn't encourage parsimony of the models as much as BIC.
Linear is nested within quadratic, but the assumptions behind chi-2 testing are not fulfilled.
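For reference, the parsimony remark follows from the standard definitions of the two criteria, written here with log L for the maximized log-likelihood, p for the number of free parameters, and n for the sample size (symbols introduced only for this note):

```latex
\[
\mathrm{AIC} = -2\log L + 2p,
\qquad
\mathrm{BIC} = -2\log L + p\,\ln n .
\]
```

The per-parameter penalty is 2 for AIC and ln n for BIC, so BIC penalizes extra parameters more heavily whenever ln n > 2, that is, for any sample with n of 8 or more.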
xybi2006 posted on Sunday, January 18, 2015 - 10:10 pm
Dear Dr. Muthen, Thanks much! Some articles I read use deviance statistics to compare a linear LGM with a quadratic LGM. So those articles do not do the comparisons correctly, right? How about a linear LGM versus a piecewise LGM: do deviance statistics work, or can only AIC and BIC be used?
Is it possible to tell Mplus to analyse only 1 group in a data file? For example, my data file consists of 2 subsamples (coded 0 and 1) and I want Mplus to analyse only the subsample coded 0. If yes, what code should I use?
See the USEOBSERVATIONS option in the user's guide.
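A minimal sketch of that option for the case described, where the subsample indicator is coded 0 and 1; the variable names here are assumptions for illustration:

```
VARIABLE: NAMES = sample y1-y4;       ! hypothetical variable names
          USEVARIABLES = y1-y4;
          USEOBSERVATIONS = sample EQ 0;  ! keep only subsample 0
```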
EunJee Lee posted on Sunday, May 17, 2015 - 7:18 am
Hi, I have two basic questions about LGM and LCA models.
1) I read your comments regarding the model comparison between linear and quadratic models. So far, I have understood that the linear model is nested within the quadratic one, but that the chi-square difference test is not appropriate for model comparison because of the zero-constrained variance, which is usually imposed when analyzing quadratic models. In this case, I can compare the models with BIC and check the significance of the mean of the quadratic term. My question is: if I do not constrain any variance in the quadratic model, can I use the chi-square difference test when I decide between the linear and quadratic models? And have I understood this correctly so far?
2) I learned that it is wrong to use standardized values as indicators when analyzing LGM models. I wonder whether I can use z-scores when modeling LCA, or whether that is also inappropriate. I am trying to model LCA with 7 items indicating the levels of participation in social activities. I summed 3 of the 7 items because, theoretically, they belong to the same category. So 5 indicators were included, and all of them were transformed into z-scores, because one of them is a sum of 3 items and so its scale is different from the others, and another indicator was measured in a different way from the others. I am unsure which is better to use: raw values with different scales or standardized values.
I would use the profile of means across the variables to interpret classes. See LCA applications in the journals.
EunJee Lee posted on Thursday, May 21, 2015 - 1:13 am
I read several articles using latent profile analysis, but some used z-scores, while others used raw data for the analysis but z-scores for interpretation. So I was confused and left a message for you. Could you please recommend a few articles that use continuous variables measured on different scales as LPA indicators, if you know of any?
I tried to interpret the results with unstandardized means, but as mentioned above, one of the variables is a summed score of three activities, so I cannot say that someone with a higher score on this indicator participates in these kinds of activities more than in others without comparing standardized means.
And when I looked at other comments you wrote, there was an answer saying that using z-scores is not appropriate because the analysis is then based on a correlation matrix rather than a covariance matrix. So if I do not impose any constraints, such as equal variances across classes, which means my model is scale free, is it OK to use z-scores?