Anonymous posted on Monday, June 03, 2002 - 1:10 pm
I am working on a latent class growth analysis (LCGA) of binary indicators, similar to the one in example 25.11 of the Version 2 manual. The only differences are (a) three timepoints instead of four, and (b) a non-linear growth pattern represented with Helmert contrast coefficients (i.e., comparing each timepoint to the mean of those following). In performing this analysis, I have encountered two points of confusion. The first is general, the second more specific to my model.
First, what is the scale of the "Factor Means" (e.g., intercept and slope factors)? Are these logits, or something else?
Second, the 2-class model runs normally (though the fit is unexceptional), but I begin to encounter estimation problems with a 3-class solution. These typically involve the Fisher Information Matrix in some way. The program cautions me that my start values may be poor, or the model may be unidentified, or both. So far, changing the start values has changed the details of the intermediate solution, but hasn't solved the problem. Furthermore, the error messages frequently cite "parameter #8," which led me to notice that, with 3 binary indicators, the underlying contingency table would have 8 "patterns", only seven of which represent unique pieces of information. Is the model, therefore, underidentified in the absolute sense (you can't have more parameters in the model than the number of unique pieces of information -- period)? Or is this more likely a sign of empirical underidentification (model is ok in principle but isn't right for the data)?
I would appreciate any insights from those with more experience in LCGA than I have! Thank you.
bmuthen posted on Tuesday, June 04, 2002 - 6:43 am
Yes, the scale of the factor means is logits.
The best way to check for identification is to request the MLF estimator (this is done automatically in Mplus version 2.1). For identification of all parameters you have to have at least as many pieces of information as parameters, so from this point of view you can only have 7 parameters.
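The counting in this exchange can be checked with a short calculation (a sketch; the function name is made up for illustration):

```python
# Number of unique pieces of information from T binary indicators:
# the contingency table has 2**T response patterns, but the pattern
# probabilities must sum to 1, leaving 2**T - 1 free quantities.

def max_identifiable_params(n_timepoints: int) -> int:
    """Upper bound on the number of free model parameters
    identifiable from n_timepoints binary indicators."""
    return 2 ** n_timepoints - 1

# With 3 binary indicators there are 8 patterns but only 7
# independent pieces of information, so a model with 8 or more
# free parameters cannot be identified in the absolute sense.
print(max_identifiable_params(3))  # 7
print(max_identifiable_params(4))  # 15
```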
Dustin posted on Friday, February 04, 2005 - 2:48 pm
I am interested in understanding the similarities and differences in latent growth trajectories that are identified using various statistical approaches. Is LCGA with binary variables in Mplus identical to Nagin's Proc Traj modeling program in terms of extracting groups? In addition, would you expect to obtain different trajectory groups using GGMM in comparison to the previously mentioned techniques? I know a lot of it has to do with stopping points, but if you chose to extract the same number of latent classes, would you anticipate obtaining the same results across the different procedures?
bmuthen posted on Friday, February 04, 2005 - 2:57 pm
See Muthen (2004) from the Kaplan-edited book, which is available in PDF on our website.
I would expect the same number of classes with Mplus or Proc Traj if the models are the same, that is, LCGA. I would expect a different number of classes with GGMM, given that within-class variability is allowed, unlike in LCGA.
Dustin posted on Saturday, February 05, 2005 - 10:07 am
I cannot find the examples for the GGMM model referenced in the Muthen (2004) chapter of the Kaplan-edited book. Have they not been posted yet?
bmuthen posted on Saturday, February 05, 2005 - 1:45 pm
Correct. If you have a specific example you are interested in, I could send it to you.
Dustin posted on Sunday, February 06, 2005 - 10:31 am
That would be great. I am interested in the delinquency GGMM model that was specified in the chapter. I am attempting to do a GGMM using a binary indicator of delinquency across 13 time points from age 7 to 19 with missing data (using linear and quadratic curve components). In your experience, is this model too computationally complex for GGMM? I notice that you fixed the quadratic slope variance to zero to help simplify the book chapter model. Was this done due to the computational complexity of the model? I was planning to let the linear and quadratic slope variances be freely estimated given that both parameters are significant in the conventional one-class growth model.
bmuthen posted on Sunday, February 06, 2005 - 11:55 am
Three random effects are computationally heavy with growth mixtures. You can reduce the burden by using INTEGRATION = 5. But typically you don't need a random quadratic and can set its variance at zero - because you have classes that pick up much of the variation. I would do an LCGA model first (all variances zero), then add intercept variance, then add linear slope variance.
Dustin posted on Monday, February 07, 2005 - 3:59 am
Sounds good. Thanks for the advice. I look forward to receiving your syntax.
bmuthen posted on Monday, February 07, 2005 - 5:08 pm
Dustin posted on Wednesday, February 09, 2005 - 10:39 am
I would actually like the syntax for the GGMM model that was specified in the book. Sorry for the misunderstanding. I have already run the LCGA with my data without any problems. It would be interesting to see the model and output for the different class solutions using GGMM in the book if it is available. Again, my e-mail is email@example.com
bmuthen posted on Wednesday, February 09, 2005 - 10:52 am
Will send it to you.
Anonymous posted on Monday, March 07, 2005 - 4:55 am
I have performed LCGA on 3 identity dimensions simultaneously and LCGA on 2 adjustment dimensions simultaneously. The first LCGA revealed four classes and the second revealed three classes. Now I want to examine how the four identity classes relate to the three adjustment classes, but I am a bit puzzled about how to proceed because group membership is probabilistic. With known groups, one can simply perform chi-square analyses by crosstabulating both group memberships, but I do not know if this can be done using latent classes? Similarly, I wonder if one can save group membership obtained using LCGA and use this group membership as a basis to conduct multigroup analysis?
Many thanks in advance for all helpful suggestions!
You can save the posterior probabilities and most likely class membership. But using most likely class membership in a subsequent analysis would introduce error because each observation has a probability of being in each class. The error is greater as entropy declines.
You can have a model with more than one categorical latent variable. I would suggest that. See Example 7.14 in the Mplus User's Guide. Then you can test a set of nested models.
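The classification error described above can be quantified from the saved posterior probabilities; a minimal sketch with hypothetical probabilities (the function name is made up):

```python
def expected_misclassification(posteriors):
    """Expected proportion misclassified when each person is
    assigned to their most likely (modal) class; the shortfall
    1 - max(p) is that person's chance of being elsewhere."""
    n = len(posteriors)
    return sum(1 - max(p) for p in posteriors) / n

# Two hypothetical people with two-class posterior probabilities:
print(round(expected_misclassification([[0.9, 0.1], [0.8, 0.2]]), 2))  # 0.15
```

As entropy declines, the modal probabilities shrink and this expected error grows, which is why hard classification in a subsequent analysis becomes riskier.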
Anonymous posted on Tuesday, March 08, 2005 - 1:07 am
Is TECH11 (the LMR test) appropriate for determining the optimal number of classes in multivariate LCGA (LCGA performed simultaneously on two or more variables at a time), besides BIC values?
Thank you for your help.
Anonymous posted on Tuesday, March 08, 2005 - 5:22 am
I have a question concerning LCGA. I have performed LCGA on two variables and found a number of classes for variable a and a number of classes for variable b. Is it a viable option to save group membership probabilities for all classes obtained and to correlate them with each other? A positive correlation, for instance, would then indicate that having a high probability of belonging to the first class for variable a is associated with having a high probability of belonging to the first class for variable b.
TECH11 is one piece of information to use in deciding on the number of classes. This issue is discussed in the following paper:
Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.
Using most likely class membership in this way would introduce errors given that observations are in all classes proportionally during model estimation. I suggest having a model which contains two categorical latent variables and look at the correlation between these.
Anonymous posted on Friday, March 25, 2005 - 11:57 am
Hi, I want to use a growth mixture model to analyse a longitudinal dataset which contains 12 waves of a continuous response variable. The problem is that many subjects only have a few waves of data; some have fewer than 4. Can Mplus handle this situation? If yes, what approach is used - are missing data filled in first? Also, in Muthen and Shedden's paper, binary outcomes (u) can be predicted using class membership and covariates. Has this become a procedure in Mplus yet? Many thanks,
bmuthen posted on Friday, March 25, 2005 - 3:54 pm
Yes, Mplus handles missing data by the standard approach of MAR under ML (see introduction paragraphs for the Missing Data section of Mplus Discussion, which is taken from the User's Guide).
Yes, all of what is in Muthen-Shedden is in Mplus, plus much more general models. See for example the 2002 Biostatistics paper by Muthen et al. on the Mplus website.
Similar to a situation described above, I am trying to use a GMM to analyze a longitudinal data set that contains 44 waves of a continuous variable. There is quite a lot of missing data (I think the worst case has about half the data points missing). From several answers I have seen on this discussion board, and from reading the Mplus manual, I understand that Mplus handles missing data.
I'm new to Mplus and am having trouble setting up my model correctly. I have followed ex8.1 from the User's Guide (Version 3). Modifying this dataset to introduce missing data, but keeping the syntax from the book, I am able to run the analysis. However, when I apply this simple syntax to my dataset, I get the following message:
*** ERROR in Model command
Growth factor indicators must be all observed or all latent.
What does this mean? Is there an example (data, syntax) that you can recommend that would demonstrate a GMM or LCGA that incorporates missing data?
The error message means that you are using both observed and latent variables in the same growth model.
If you have missing data and want to estimate the model taking this into account, you add TYPE=MISSING; to the ANALYSIS command.
If this is not sufficient information to help you, please send your input, data, output, and license number to firstname.lastname@example.org so we can see exactly what the problem is.
Anonymous posted on Wednesday, January 18, 2006 - 9:06 am
Dear Drs. Muthen,
I am using GMM to model trajectories as a function of 5 classes and then predict latent trajectory class from covariates.
The examples in the manual that I've found deal with two trajectory classes, and I understand class regressed on a predictor as a logistic regression (class 1 vs. 2).
However, in my case, where there are 5 trajectory classes and I've regressed latent trajectory class #1 on a predictor
c#1 ON x
I only get one regression coefficient in the output. I guess I expected to see four, one each for predicting class 1 vs. 2, 1 vs. 3, 1 vs. 4, and 1 vs. 5, like one would get with ordinary multinomial regression.
What does the regression coefficient from the above statement mean when there are more than two classes? Is it the reference class vs. all others?
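For reference, the c#k ON x coefficients are multinomial logit slopes with the last class as reference (as the later posts in this thread confirm, Mplus excludes the highest-numbered class). A small sketch, with hypothetical logit values, of how such logits map to class probabilities:

```python
import math

def class_probabilities(logits):
    """Convert multinomial logits (the reference class has an
    implicit logit of 0) to class probabilities via the softmax."""
    logits = list(logits) + [0.0]          # append reference class
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical intercept-only logits for classes 1-4 vs. class 5:
probs = class_probabilities([0.5, -0.3, 1.2, 0.0])
print([round(p, 3) for p in probs])  # five probabilities summing to 1
```

Each c#k ON x slope then describes how x shifts the log-odds of class k relative to the reference class, not relative to every other class.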
Anonymous posted on Wednesday, January 18, 2006 - 10:02 am
One additional question on GMM with a covariate.
The class membership probabilities seem to change a bit with the addition of covariates. There are missing data, so it makes sense to me that this could be the case. The covariates are contributing additional information on the scores used as indicators of the growth factors.
Further, it seems to me that the class probabilities generated from models with additional covariates are probably more reliable because more information is brought into the estimation.
Is this a valid interpretation, or should I be concerned about the differing class membership probabilities?
For a discussion of this issue, see the Muthen chapter in the book edited by Kaplan. You can download it from the website.
Anonymous posted on Wednesday, January 18, 2006 - 12:45 pm
With reference to
c#1 ON x; c#2 ON x; c#3 ON x; c#4 ON x;
I am interested in shifting which class is the reference class for some analyses. For example, I set up the following regressions to use trajectory class #2 as the reference class (it's the only one excluded from the regression statements).
c#1 ON x; c#3 ON x; c#4 ON x; c#5 ON x;
However, Mplus says that I can't include the highest numbered class in regression analyses. Is there any way around this, so that I can use some other class than the highest number one (5 in this case) as the reference?
Is the warning message (pasted below) equivalent to negative variance, or would this model be appropriate to interpret?
ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL VARIABLES IN THE MODEL. THE FOLLOWING PARAMETERS WERE FIXED: 14 22
It would depend on which parameters are fixed and the type of model. The best thing is to send your input, data, output, and license number to email@example.com. I would not ignore such a message unless I had more information.
I'm trying to run an LCGA on crime rates over an 11 year span (n=4000). I've run several variants of the model below, but continue to get warning messages similar to those below. I've set different starting values, but still get error statements. Am I missing something obvious? Is there a sound way to determine which starting values to use?
WARNING: MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF NONIDENTIFICATION. THE CONDITION NUMBER IS -0.106D-17.
THE STANDARD ERRORS OF THE MODEL ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO STARTING VALUES BUT MAY ALSO BE AN INDICATION OF NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 15.
This is a data dependent problem. Please send your input, data, output, and license number to firstname.lastname@example.org and we will see if we can help you.
Tony Jung posted on Wednesday, September 06, 2006 - 9:53 am
I'm running an LCGA using a zero-inflated poisson model. How do you control for Time 1?
Model: %overall%
i s q | u2@-2 u3@-1 u4@0 u5@1 u6@2;
i s q ON u1;
But the ON part of the model is giving me error messages:
*** WARNING in Model command
All variables are uncorrelated with all other variables within class. Check that this is what is intended.
*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
I ON U1
S ON U1
Q ON U1
*** ERROR
One or more MODEL statements were ignored. These statements may be incorrect or are only supported by ALGORITHM=INTEGRATION.
What do I need to do to control for Time 1?
Tony Jung posted on Wednesday, September 06, 2006 - 11:21 am
I forgot to add that I am controlling for Time 1 because I have experimental and control groups. Thanks.
Sounds like your model has no random effects as stated - see the UG ex 6.7 - you need to add
algorithm = integration;
and decide on which growth factors are random. As the error message says, I believe algo=int also allows e.g.
i on u1;
But if you want to control for baseline, I think a better way is to center at time 1 so you can work with latent (true) baseline i. Then you can regress s and q on i. See also the Muthen-Curran 1997 article in Psych Meth.
Tony Jung posted on Wednesday, September 13, 2006 - 2:12 pm
Thank you for your helpful suggestions. I have a few more follow-up questions:
First, when I add ALGORITHM=INTEGRATION, it seems to also require INTEGRATION=MONTECARLO when I specify
This has increased the computation time considerably. Is this the correct syntax if I want to go with this approach?
Second, in your chapter, "Latent Variable Analysis" in the Kaplan book, you underscore the importance of including covariates for LCGA. In my case, I take it to mean I should at least include the intervention (COND). However, by doing this, I can no longer get the graphs of the different classes that I used to by doing:
PLOT: SERIES = u1-u6 (s); TYPE = PLOT3 ;
How can I get the estimated means plot of the different classes when doing a conditioned LCGA? Or is this only available for the unconditioned model?
Tony Jung posted on Wednesday, September 13, 2006 - 2:13 pm
Following previous post:
A related question is, if I want to use only waves 2-6 and condition it on wave 1 for the unconditioned LCGA, can I just do:
So what I'm asking is, what's the difference between doing i s q | u1 and i s q ON u1? Am I not accounting for Time 1 either way?
My final question is regarding the stacked model (multiple group) approach outlined in the Muthen & Curran (1997) paper. If I want to apply the approach to my model, do I need to run a separate model line for each path that I want to test, i.e., free one path at a time and run the stacked model?
And, how do I interpret the output, i.e. check for significance? Do I just subtract the two "chi-square contribution" values from control and intervention groups, and compare to chi-square (1 df) criteria?
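The difference test described in this question can be sketched numerically (the chi-square values below are hypothetical; the function name is made up):

```python
# Chi-square difference test with 1 df: the difference between the
# constrained and freed models' chi-square values is compared to the
# critical value of the chi-square distribution with 1 degree of freedom.

CRITICAL_1DF_05 = 3.841  # chi-square critical value, df = 1, alpha = .05

def path_differs(chisq_constrained: float, chisq_free: float) -> bool:
    """True if freeing the path significantly improves fit at .05."""
    return (chisq_constrained - chisq_free) > CRITICAL_1DF_05

# Hypothetical fit values for the two nested stacked models:
print(path_differs(25.3, 20.1))  # difference 5.2 > 3.841 -> True
print(path_differs(25.3, 23.0))  # difference 2.3 <= 3.841 -> False
```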
It seems like you don't need to declare u1 as a categorical variable because it is an IV - this would then avoid the Monte Carlo integration requirement, I think. The plot is not available in the case you refer to.
Answer to Jung's September 13 2:13 questions:
See the UG chapter 16 for explications of growth model statements using the bar ("|") approach. You will see there that the bar statement does not estimate regression slopes. Note also that you cannot identify a quadratic growth model by a single outcome (as in i s q | u1). Regarding your final questions, I would simply use group as a dummy covariate.
I am sorry if this post is a repeat of a message I posted earlier (my computer crashed and so I am not sure if it was sent).
I wish to analyse the patterns of longitudinal poverty (a binary measure) over 15 waves of data. I would like to know the main differences between LCGA and latent Markov analysis (of Langeheine and van de Pol).
LCGA is similar in spirit to regular random effects growth modeling, where the idea is that a growth process affects the development over time. Latent Markov is an auto-regressive type of model where the status at one time point influences the status at the next time point. So the former model type does not specify direct influence between the outcomes over time (although they are certainly allowed to be highly correlated), while the latter does. The substantive application may be more suitable to one or the other model type.
I would like to run a latent class growth analysis to examine differing trajectories of change across 7 waves of data. I would like to use individually-varying time points, as there is a great deal of variation in time of interview at each wave. Is it possible to run latent class analyses using individually-varying time points? Would I be able to compare models with different numbers of classes?
Thanks! This set me on the right path, and I ended up using TYPE=MIXTURE RANDOM MISSING. I am having some trouble interpreting my results, however. 1) without the “means” for intercept and slope, where should I be looking to interpret the overall slope for each class? 2) my results give me negative values for the intercepts of the intercept factor, and positive values for the intercepts of the slope factor. I had been expecting a negative trend for the slope, and do not know how to interpret a negative intercept.
Am I supposed to use the same number of sets of starting values and the same number of optimizations when testing a k-class model against a (k-1)-class model when using the STARTS option in the ANALYSIS command? Thanks a lot.
I am doing latent class growth analysis on a dataset of 206 individuals, where the outcome is adherence to endurance exercise (minutes/week) measured at 4 time points (4, 6, 8 and 12 months) following a 3-month rehabilitation program.
My first objective is to determine the number of latent trajectory classes, without covariates. I have several basic questions:
1) My understanding is that for entropy, a higher value is better. What is an acceptable value?
2) For the Lo-Mendell-Rubin test, a low p-value indicates that the model with one less class is rejected in favour of the estimated model. What is considered a low p-value: < 0.05 or < 0.1?
3) Given that the outcome was measured at 4, 6, 8 and 12 months, should the model syntax specifying the time scores be:
1. Entropy ranges from 0 to 1, so you would want a fairly high value like .8. However, entropy is a summary measure. It may be that some classes are distinguished well, and if those are the classes you are most interested in, then I wouldn't give so much importance to a summary measure. 2. Less than .05. 3. Your time scores should reflect the time between your measurement occasions, for example, 0 1 2 4.
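The entropy referred to in point 1 is a relative entropy computed from the posterior class probabilities; a minimal sketch, assuming the standard formula (1 at perfect separation, 0 when every class is equally likely for everyone):

```python
import math

def relative_entropy(posteriors):
    """Relative entropy for a list of per-person posterior class
    probability vectors: 1 - sum(-p*ln(p)) / (n * ln(K))."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for p in posteriors:
        total += sum(-q * math.log(q) for q in p if q > 0)
    return 1 - total / (n * math.log(k))

# Perfectly separated two-class case -> entropy 1:
print(relative_entropy([[1.0, 0.0], [0.0, 1.0]]))  # 1.0
# Completely ambiguous case -> entropy 0:
print(relative_entropy([[0.5, 0.5], [0.5, 0.5]]))  # 0.0
```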
Thanks very much. Some additional questions re: latent class growth analysis with 4 time points.
1) In the output, 3 classifications are given: i) final class counts and proportions for the latent class patterns based on the estimated model, ii) final class counts and proportions for the latent class patterns based on estimated posterior probabilities, and iii) classification of individuals based on their most likely latent class membership.
i) and ii) appear to be the same, and this is the classification used in the plot of sample and estimated means. iii) appears to differ from i) and ii) in some models, and is the classification saved when save=cprob is requested.
My question is: why is one classification used for the plot, and another saved in the .dat file ?
2) I have tried running my model with homogeneous and heterogeneous (default) variance across time points. Is either acceptable ? When should one be used versus the other ?
3) From the web course on multi-level modeling, it was mentioned briefly that piecewise modeling can be used when there are several time points per growth phase.
Can I assume that my growth model, which has a TOTAL of 4 time points, would NOT be appropriate for piecewise modeling ?
1. The estimated posterior probabilities and most likely class membership are all saved. Estimated means in the plots use estimated probabilities. Observed individual trajectories are based on most likely class membership.
2. We recommend allowing heterogeneous residual variances across time.
3. Four time points is typically too few time points for piecewise.
Because the outcome has a skewed distribution, I have categorized it into 5 categories (0-4). The categories are 0: 0 minutes/wk, 1: 1-60 mins/wk, 2: 61-100 mins/wk, etc. I have then treated the outcome as continuous in a 2-class LCGA model with 4 time-independent covariates. Below is a portion of the output:
My questions are about the interpretation of parameter estimates.
For covariate EXACFU (binary), is the interpretation that individuals with exacfu=1 are exp(1.126)=3.08 times more likely to be in class 1 versus class 2 ?
I am unsure of the interpretation for the estimate -2.287 for Intercepts C#1. Is this the inverse natural log (or odds) of being in class #1 vs class 2, adjusted for the 4 covariates, ie. exp(-2.287)=0.102 ?
If I am incorrect, please explain how these estimates are interpreted.
The sample size needed depends on several factors so it is hard to say without doing a simulation study. In general, one needs fewer observations with repeated measures. See the following paper which you might find helpful:
Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9, 599-620.
I have a question. I am doing growth mixture modeling. Is it OK to compare two different solutions (e.g., 2 classes vs 3 classes) when I have made modifications to one of the models (e.g., fixed the variance of the growth factor to zero in one of the classes, let the residual variances be freely estimated in different classes...)?
sara khanum posted on Wednesday, October 31, 2007 - 12:03 pm
Hi, I am learning about LCGA in Mplus and would be very grateful for your advice about data format.
I have a large panel data set in long format and wish to work in this format in Mplus (I will be adding many covariates for a 15-wave panel, so it is easier). Most of the examples I have seen are in wide format, and I would be grateful if you could help me modify the syntax below (which uses a wide structure) to accommodate this. Do I need to include my ID and wave (time) indicators?
DATA: FILE IS longdat.dat;
VARIABLE: NAMES ARE id gen sup pov1 pov2 pov3 pov4 gpa1 gpa2 gpa3 gpa4;
USEVARIABLES ARE pov1 pov2 pov3 pov4;
CLASSES = c (2);
ANALYSIS: TYPE = mixture;
STARTS = 100 10;
ESTIMATOR = MLR;
MITERATIONS = 5000;
MODEL: %OVERALL%
i s | pov1@0 pov2@1 pov3@2 pov4@3;
i@0; s@0;
OUTPUT: STANDARDIZED MOD TECH11;
SAVEDATA: FILE = lcga_pov.dat;
SAVE = CPROBABILITIES;
PLOT: TYPE = PLOT2;
SERIES = pov1 (1) pov2 (2) pov3 (3) pov4 (4);
sara khanum posted on Wednesday, October 31, 2007 - 4:08 pm
Thank you for your prompt response.
I looked at 9.16 but I couldn't find any LCGA examples for data that are already in long format.
I tried to modify the above syntax for my data but couldn't get it work.
I guess I am not clear where/if to specify the id and wave variables and the growth factors.
I did the following:
USEVARIABLES ARE pid wave pov1 pov2 pov3 pov4;
CLASSES = c (2);
ANALYSIS: TYPE = mixture;
IDVARIABLE = person;
REPETITION = time;
STARTS = 100 10;
ESTIMATOR = MLR;
MITERATION = 5000;
MODEL: %OVERALL%
i s | pov1@0 pov2@1 pov3@2 pov4@3 ON wave;
i@0; s@0;
OUTPUT: STANDARDIZED MOD; TECH11;
SAVEDATA: FILE = lcga_pov.dat;
SAVE = CPROBABILITIES;
PLOT: TYPE = PLOT2;
SERIES = pov;
In your experience, is long data more difficult/inflexible for longitudinal growth analysis?
Your guidance would be most appreciated.
sara khanum posted on Wednesday, October 31, 2007 - 4:29 pm
Dear Professor Muthen
Sorry to bother you again! I have looked at 9.16 again - in order to use long data, does one have to specify a multi-level model? I couldn't adapt 9.16 to my data as I wasn't intending to do a multi-level specification.
Example 9.16 is the only example we have in long format although you can translate any one into this. You can compare the MODEL command from this example to the | statement that specifies the growth model in Example 6.1 to see how they compare.
The wide specification of the growth model is actually more flexible than the long specification. Residual variances can be estimated for each time point and residual covariances can be included in the model. The only time the long format might be desirable is with a very long time series. I would use an LCGA example from Chapter 8.
When using LCGA to determine the best number of classes, how do you reconcile an earlier comparison with a decreasing BIC and a non-significant LMR-LRT (e.g., two versus three classes), and a subsequent comparison with a decreasing BIC and a significant LMR-LRT (e.g., three versus four classes)? I've run into this many times when a priori testing across a range of possible classes (e.g., from 1 to 6).
The syntax you show specifies that the means of the intercept and slope growth factors are free across classes. This is the default so it has no impact. You have also given different labels to the four parameters which also has no impact on model estimation.
There is no default growth model. The growth model has to be specified. See Example 6.1 for a linear growth model. See Example 6.9 for a quadratic growth model. See the discussion of growth modeling in Chapter 16 under The | Symbol.
According to the Mplus web training 'Growth Modeling with Latent Variables using Mplus' (slide 39), the number of free parameters in the H1 unrestricted model is 14 for a linear growth model with 4 time points and no covariates (14 = 4 means + 4(5)/2 variances and covariances).
My questions are:
How is the number of free parameters calculated for the H1 unrestricted model with 4 time points, but with a quadratic growth factor?
Does this calculation apply to latent class growth analysis?
Degrees of freedom are relevant when means, variances, and covariances are sufficient statistics for model estimation. In this case, the degrees of freedom are equal to the difference between the parameters in the H0 and H1 models. When they are not, the number of parameters is used instead. The number of parameters given is for the H0 model.
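The parameter count quoted from the training slides (14 for 4 time points) follows a simple rule that can be sketched as follows (the function name is made up; note the unrestricted H1 model is the same regardless of the growth function fitted under H0):

```python
def h1_free_parameters(n_timepoints: int) -> int:
    """Free parameters in the unrestricted H1 model for continuous
    outcomes: one mean per time point plus all variances and
    covariances among the time points."""
    t = n_timepoints
    return t + t * (t + 1) // 2

# Four time points: 4 means + 4(5)/2 = 10 (co)variances = 14.
print(h1_free_parameters(4))  # 14
```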
jemila seid posted on Friday, January 02, 2009 - 7:35 am
Dear Profs. Muthen,
I am looking for a paper on genetics applications of growth mixture models for a journal club presentation. Could you please recommend a recent paper in this area? I had the opportunity to read one of your GAW16 contributions and I am looking for a similar paper.
Beyond the GAW16 paper, I don't recall having yet seen a GMM paper with a genetics application, except perhaps Irene Rebollo has done some work on that - you may want to google her at the Vrije Universiteit Amsterdam. I have done a cross-sectional version of such a genetics analysis in a twin setting using "factor mixture modeling" - see
Muthén, B., Asparouhov, T. & Rebollo, I. (2006). Advances in behavioral genetics modeling using Mplus: Applications of factor mixture modeling to twin data. Twin Research and Human Genetics, 9, 313-324.
jemila seid posted on Friday, January 02, 2009 - 10:06 am
Thanks a lot, Prof. Muthen, for your prompt reply. I appreciate it. I will have a look at Irene Rebollo's work. Would it be possible to use your GAW16 contribution as an example in my presentation? If so, may I get a complete version of your contribution?
I am trying to replicate in Mplus a LCGA model with a censored normal outcome that successfully ran using Proc Traj, but I cannot get it to run in Mplus. The Mplus code is in the next message. My question is whether I am specifying the model correctly to replicate the Proc Traj analysis. I have read about the differences between the two approaches, but I'm not sure what would be causing this problem. I know numerical integration is computationally intensive, and I've tried all sorts of approaches, but nothing works. Is there a way to specify the estimation to do the same thing Proc Traj does? And, if I'm already doing this correctly, do you have suggestions for why the model converges in Proc Traj but not Mplus?
I am plotting graphs obtained from an LCGA, and although I am fitting a linear model with three classes, the growth curves - two out of three - are not linear but have a curvature, as if I were fitting a quadratic model. Does this have anything to do with the use of a censored (b) distribution?
Yes, linear in the underlying (uncensored) normal variable, but not in the observed variable. Same for categorical outcomes.
Amery Wu posted on Wednesday, April 08, 2009 - 11:44 am
Dear Dr. Muthen,
I am running a piecewise general growth mixture model. I selected a 4-class model based on the unconditional model (without the auxiliary variables), which has sound statistical fit and substantive interpretability. I would like to proceed to add the auxiliary variables to the same 4-class model extracted by the unconditional model, so I fixed the growth factor means to those of the unconditional model while running the conditional model. The results make sense, although the class distribution changed a bit. I also tried freeing the growth factor means in the conditional model; the growth factor means changed a lot due to the addition of the auxiliary variables, and the classes are very different from those of the unconditional model. My question is: Is my approach of fixing the growth factor means justifiable, especially in terms of the estimation? Or should I report and interpret the growth factor means of the conditional model? Thanks a lot, Amery Wu
This is a big topic. Please see the following two papers both of which are available on the website:
Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.
Clark, S. & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis. Submitted for publication.
Amery Wu posted on Wednesday, April 08, 2009 - 8:29 pm
Many Thanks, Dr. Muthen. I'll re-read the 2004 paper, and read the 2009 manuscript, and ask your advice again if necessary.
I'm writing a master's thesis in statistics about latent class growth analysis using Mplus. My outcomes are categorical (4 categories), with 8 timepoints and N = 1000. Can you recommend materials that will help me in my work?
I am trying to do a latent class growth mixture model to describe different classes of individuals in terms of their pubertal trajectory. The goal is to identify individuals according to membership in "early" and "late" trajectory classes, plus some number of intermediate classes depending on the best solution. There are ~400 individuals who were measured across ages 10 through 16. Scores are continuous (based on an average of 5 sub-indices), and there is wide variation among individuals in the number of missing scores. I ran LCGMM for linear, freed linear, quadratic, and cubic slopes, each with one, two, three, four, or five classes. My strategy for choosing the best model has been to look for the lowest BIC, based on Nylund, Asparouhov, and Muthén (2007), and to also consider the percentage of individuals in each class, avoiding solutions with less than 5% of individuals in the smallest class. Is the BIC criterion only meant to be used within a particular slope model, or can it be used to compare, say, the best quadratic versus the best linear class solution? I assumed that the slope and number of classes should be considered simultaneously. Alternatively, should I first decide on the slope category using longitudinal growth models (based on chi-square), and then decide on the number of classes using latent class growth mixture modeling (based on BIC and % class membership)?
You can use BIC to compare models as long as the set of observed variables is the same. Remember that if you use chi-square to compare nested models and the model has any variances fixed at zero, the difference test may not be distributed as chi-square.
I would use theory and an examination of individual trajectories to find the most complex model that might need to be fitted, for example, a quadratic model. I would then use that model and extract classes. If the means and variances of the growth factors in some classes are not significant, I would adjust the growth models for these classes.
Thanks so much for your response. I am very new to MPLUS and have to ask some very basic follow-up questions.
1) I am using Example 8.6 as my model without the u parameter. Where is the information on the significance of means and variances?
2) In terms of making adjustments, I would like to explore models in which class 1 is linear, class 2 is quadratic, class3 is cubic, etc. How can I specify different classes having different slopes in the same model?
3) In the tech4 output for several of my models, I have small negative variances (e.g., -.008) in the estimated covariance matrix of latent variables (among my intercepts). I read in another comment that these may be the reason for non-positive definite psi matrix warnings, and that a solution is to set these to zero. What code specifies that this should be zero in the model?
1. If you are asking about model results, the third column of the output is a z-score and the p-value of this score is given in the fourth column.
2. The most complex model should be specified in the %OVERALL% part of the MODEL command. The class-specific parts of the MODEL command should fix the growth factor variances to zero for the components not part of that class, for example, cubic@0;
3. Is the negative value on the diagonal or the off-diagonal?
Josh Bricker posted on Wednesday, February 17, 2010 - 3:23 pm
3. The negative value is on the diagonal.
Josh Bricker posted on Wednesday, February 17, 2010 - 3:44 pm
2. To be more specific about the coding, would that mean something like this?
You want to fix not only the class-specific growth factor mean to zero but also the class-specific growth factor variance to zero.
Unless you have a strong theory for different curve shapes in different classes, a more common approach is to fit the most general shape (here quadratic) in all classes - then you will find if a growth factor mean/variance is zero in certain classes.
Josh Bricker posted on Wednesday, February 17, 2010 - 8:22 pm
Oh, so my example code was only setting the means to zero--how do I also set the variances to zero?
I am basing my exploration of different curve shapes for different classes on Linda's advice above to first do an overall model and then to make adjustments informed from the output about specific class means/variances.
I also wanted to ask (#3 above) how to set small negative variances (on the diagonal) to zero in the estimated covariance matrix of latent variables (among my intercepts). What code specifies that this should be zero in the model?
Josh Bricker posted on Wednesday, February 17, 2010 - 10:58 pm
A separate additional question in response to Bengt's comment
"You want to fix not only the class-specific growth factor mean to zero but also the class-specific growth factor variance to zero"
Is the decision to fix a parameter to zero always made for both means and variances together even if there is conflicting evidence from the p-value for the Means and Variances?
Sometimes I find that the mean for the Quadratic in a particular class is no different from 0 while its variance is significant--or vice versa--the mean is small but significant while the variance is not different from 0. Are you saying in either case, both the mean and variance should be set to 0?
If you have a quadratic model in the overall part of the MODEL command and want a linear model in one class, fix the mean and variance of the quadratic growth factor to zero in that class, for example,
I would not fix a mean to zero without also fixing the variance to zero. If a variance is significant and a mean is not, I would leave the mean free.
I would not overfit the model to the sample data.
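For reference, the class-specific restriction described above can be sketched as follows (a minimal sketch; the outcome names y1-y4 and the conventional factor name q for the quadratic term are assumptions):

  MODEL:
  %OVERALL%
  i s q | y1@0 y2@1 y3@2 y4@3;
  %c#1%
  [q@0];    ! fix the quadratic growth factor mean to zero in this class
  q@0;      ! also fix its variance to zero, making class 1 linear

The other classes keep the quadratic factor free as specified in %OVERALL%.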
Averdijk posted on Friday, July 16, 2010 - 3:27 am
Dear Drs Muthen,
I performed a 5-group LCGA with multinomial logistic regression in both SAS Proc Traj and MPlus. Although the overall trajectories look similar in both programs, counts for trajectory memberships are somewhat different, and moreover the results for the multinomial logit are different (I defined the same reference category). I must have made a mistake in the syntax. Would you have any suggestions on how to change the syntax?
Many thanks in advance.
usevar = taggr1-taggr4 Emoattr Sex ISEI mighh2 Stablfam Socdes;
Censored = taggr1-taggr4 (b);
missing = all (999);
CLASSES = c(5);
Analysis:
type = MIXTURE;
Estimator = ML;
Starts = 500 20;
Stiterations = 20;
LRTSTARTS = 2 1 50 15;
Model:
%OVERALL%
i s | taggr1@1 taggr2@2 taggr3@3 taggr4@4;
c on Emoattr Sex ISEI mighh2 Stablfam Socdes;
OUTPUT: sampstat TECH14;
They should give the same results if the models are specified to be the same. Check the number of parameters and the loglikelihood. Also, be sure that data are the same. I would think you would want one of your time scores to be zero.
Jamie Vaske posted on Friday, August 06, 2010 - 6:26 am
Hello, A colleague and I are estimating an LCGA with observed variables. Our observed scales, though, have very poor reliability and we were wondering whether a CFA can be incorporated into an LCGA framework. If it can, are we still modeling absolute change in our variable over time, or are we modeling change in one's factor scores over time? If this analysis is possible, do you know of a good reference/article that discusses this type of analysis? Thank you for your time and advice! Jamie
If you have a growth model on the factors, you are modeling change in the factors over time. I don't know of a reference but the Topic 4 course handout and video goes through the steps to do this in detail.
Jahun Kim posted on Thursday, February 03, 2011 - 12:11 pm
I'm trying to do GMM to identify classes of mother's support and to examine whether these classes predict kid's risk behavior, depression, and drug use. All of three outcome variables are continuous and I want to add them, one by one, to the GMM.
I realized that my model is similar to Example 8.6 in the Mplus manual (except my outcomes are continuous). But it doesn't say how to write the model when the classes predict outcomes (hrb).
I used 'on' statement like below, but it did not work.
q5hrb on c#1;
The warning statements I got: *** WARNING in Model command Variable is uncorrelated with all other variables within class: Q5HRB *** WARNING in Model command At least one variable is uncorrelated with all other variables within class. Check that this is what is intended. *** ERROR The following MODEL statements are ignored: * Statements in the OVERALL class: Q5HRB ON C#1
Drs. Muthen, I am running an LCGA model with a continuous distal outcome (Memory). I can find in the output the mean of Memory for each class but don't know where to look for parameter estimates of the regression of Memory on C. In case my code is the problem, I am using the following:
Your input is fine. The regression of Memory on C is expressed by the class-specific means of Memory. Mplus does not allow Memory ON C, and it is not needed. This is analogous to linear regression of Y on a dummy X variable - the different X categories give different Y means. There is no other coefficient.
You can test mean differences of Memory using Model test.
You must specifically mention the means by using a bracket statement. If you continue to have problems, please send your output and license number to support@statmodel.com.
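A minimal sketch of the Model Test suggestion, assuming a 3-class model (the labels m1-m3 are hypothetical):

  MODEL:
  %c#1%
  [memory] (m1);
  %c#2%
  [memory] (m2);
  %c#3%
  [memory] (m3);
  MODEL TEST:
  m1 = m2;
  m2 = m3;

This gives a joint Wald test of equal Memory means across the three classes; test one equality at a time for pairwise comparisons.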
Dana Wood posted on Wednesday, October 05, 2011 - 1:34 pm
Using a sample size of 1,495, I am estimating a growth mixture model. Comparing BIC, AIC, and BLRT points to selection of a 4-class solution over the 3-class solution. However, the 4-class solution has one very tiny class (4% of sample). When I try to run the 5-class solution, it turns out to be inestimable. Is this likely because I have one very tiny class in the 4-class solution (making it difficult or implausible to further split the sample)?
I think the 5-class solution is possible in principle because it could split a large class in the 4-class solution. A class with 4% isn't that small in a sample of your size. So the fact that 5 classes is inestimable is another indication of 4 classes being best.
I'm interested in possibly using a LCGA model and have some conceptual questions that I was hoping the topic 6 lecture video would answer (Berlin 2009). It seems like part 2 of the video ends at slide 69 and part 3 of the video starts at slide 95. Do you know where I can find the portion covering LCGA? Thanks, Diana
Unfortunately the slides alone don't answer my question, so I'm hoping you can help.
I have a categorical variable measured over time. I would like to create "trajectories" that represent patterns over time. However, I don't really want to impose a structure on the trajectories by forcing them to be lines, per se. I want them to be able to have any shape. Does LCGA do this, or should I just use LCA?
I know that LCGA differs from GGMM in that the variances within classes are 0. I'm not sure whether this has any effect on the possible trajectory shapes.
I apologize if this doesn't make sense. I have read a few articles on LCGA and GGMM but I'm still confused. I have also gotten different answer from different people when asking about this, so I would greatly appreciate your input.
Thank you! Diana
Jon Heron posted on Wednesday, February 15, 2012 - 12:09 am
If your data are binary, you have fixed time points, and you're not modelling variance, then you can get the same answer with an LCA that you'd get with an equivalent LCGA.
For instance if you have 4 time points then you can fit an LCA or a cubic polynomial LCGA. I would imagine that a piecewise linear LCGA would also give the same answer although I haven't tried this. They all use four parameters within each class to describe the four probabilities.
Once you get above 4 time points the equivalence should still remain; however, estimation becomes more difficult - the loadings you'd need to apply in a quintic LCGA get a bit on the large side.
I always start off with an LCA and then move towards a potentially more parsimonious LCGA if the patterns look well behaved.
Hi Jon, thanks for your response. Just to clarify, are you saying that to get the same answer (or to have equally flexible models) I will need to add a polynomial term to the LCGA model every time I add a time point? I have 15 time points, so I imagine that's not a realistic option.
I had thought that they might also give different answers (even if they are equally flexible) because the LCA is disregarding time. Have you found that that isn't true?
Jon Heron posted on Wednesday, February 15, 2012 - 6:31 am
you'd get the SAME answer with four time points and either an LCA or a cubic LCGA - they're different parameterizations of the same model (if your data is binary etc..)
In theory you could fit a polynomial of degree 14 to your data and that would give the same answer as an LCA, but that's probably impossible to estimate. You might still be able to fit a model with 14 piecewise linear slopes, and that should once again agree with the LCA.
The only benefit to including time in the model is that you can then fit simpler shapes (as you then know the ordering of the observations). With the 4 time point example, the cubic polynomial would be the same as joining the dots so there's no simplification there.
My recommendation would be to start with an LCA and examine the patterns. You might decide at that point that a cubic or quartic or perhaps 3 or 4 piecewise linear segments would do the job just as well.
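The four-timepoint equivalence described above can be sketched as follows (a minimal sketch, assuming binary indicators u1-u4; the factor names q and cub are made up):

  ! Cubic LCGA that reparameterizes a 4-indicator LCA:
  MODEL:
  %OVERALL%
  i s q cub | u1@0 u2@1 u3@2 u4@3;
  i-cub@0;    ! LCGA: all growth factor variances fixed at zero

Each class then uses four free growth factor means (i, s, q, cub), matching the four class-specific thresholds of the LCA.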
Jon Heron posted on Wednesday, February 15, 2012 - 6:36 am
Some "proof" (I did this as a teaching example once)
I am interested in the gender-specific trajectories of comorbid syndromes during and in the course of a depressive disorder in a sample of N = 222. The data are longitudinal with three follow-ups after the first occurrence of a depression. We used the SCL-90, which consists of nine dimensional subscales (Likert scale). I am a real beginner with Mplus and thought the appropriate analysis would be an LCGA or a GMM (both with the covariate sex), depending on how I handle the outcome (binary or metric). Is that correct? My second question concerns the number of indicators: Can I perform an LCGA using all nine scales simultaneously at each point of measurement (0, 1, 2, 3)? The syntax would look like this: u1@0 u2@0 ... u9@0 u1@1 u2@1 ... Or would you recommend performing the analysis separately for each subscale? If so, how can I bring together the results after I have identified the latent classes for the nine scales separately? Or is there a more appropriate methodological approach to answer my research question? Thank you in advance for your answer!
There are possibilities to do parallel process growth mixture analysis of more than one subscale at a time. In this case you can see if you need one latent class variable in common for the processes or if each process needs its own latent class variable, where the latent class variables are correlated.
But cross that bridge after you've done GMM for each process.
I have a question on growth mixture modeling. Currently, I am testing a GMM with a continuous distal outcome, and I want to test whether the classes predict this continuous outcome.
As you suggested above, I added my outcome to the usevariables list and found the means for each class. As you suggested, I understand these means work in the same manner as regression coefficients with dummy variables. To test the mean differences, you suggested Model Test. However, I am a little confused about whether you meant Model Test or the model constraint approach using chi-square. If you have any advice on my question, please let me know. I appreciate your advice in advance and look forward to hearing back from you.
Thank you for your quick response. However, I have a follow-up question about your answer. If you meant that the model test uses model constraints imposing mean equality, I think I already tried this. However, I couldn't find any chi-square value for model fit in the results; I only got the loglikelihood, BIC, and AIC. If possible, can I ask for your advice on how to do the model test with this information? If you have any suggestions, please let me know. I appreciate your advice on my situation and look forward to hearing back from you.
If you use MODEL TEST, the Wald test results are with the fit statistics. If you use MODEL CONSTRAINT, z-tests for the new parameters come at the end of the results. If you need further help on this, please send the output and your license number to support@statmodel.com.
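The MODEL CONSTRAINT route can be sketched as follows (assuming the class-specific means were labeled m1 and m2 in the MODEL command):

  MODEL CONSTRAINT:
  NEW(diff);
  diff = m1 - m2;   ! z-test for diff appears under New/Additional Parameters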
Thank you for your help. If you allow me to ask just one more question, I would like your advice on how to add a control variable for the outcome in a growth mixture model.
Currently, I am considering a continuous control variable for the outcome. In this case, I think the model looks like an ANCOVA model. That is, y (distal outcome) = intercept + class*b1 + control variable (continuous)*b2 + e.
If I am correct, could I ask how to add this control variable (covariate) in the program?
For this situation, do I only need to add a normal regression statement in the %OVERALL% part of the model specification (e.g., y on x)? For your information, my syntax is as follows:
MODEL:
%overall%
i s | dep90@0 dep91@1 dep92@2 dep94@4;
c on bt0sex fep90 fdep90 mdep90 mmi90 fmi90;
gh on s (covariate);
%c#1%
i;
[gh] (m1);
%c#2%
i@0;
[gh] (m2);
%c#3%
i;
[gh] (m3);
model test: m1 = m2;
This is correct. You may also want to use s as a covariate in the c ON statement.
F Lamers posted on Friday, March 30, 2012 - 2:22 pm
I’m running a LCGA with 5 time points in a sample of 804 persons. Initially, I ran the model without my covariate. This went smoothly and it seemed that a quadratic model fit the data best. However, after adding the dichotomous covariate, I’m running into problems. Models with 1, 2, 3 or 5 classes run OK, but models with 4, 6, and 7 classes give me the following error: ‘THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.349D-11. PROBLEM INVOLVING PARAMETER 40.’
The parameter mentioned in all of these errors involves my covariate (TAU (U)), but I'm not sure what this means. Can you tell me what it means?
For a variable measuring frequency of drug use in lifetime (responses on a Likert scale from none to 40 or more), with about half of the responses 0 (none), can latent growth modeling be used? If not, can latent class growth modeling be used?
You may try "two-part" growth modeling. See our Topic 4 handout and papers on our web site.
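A minimal sketch of the two-part setup (the variable names y1-y4 and the generated names bin1-bin4 and cont1-cont4 are assumptions):

  DATA TWOPART:
  NAMES = y1-y4;             ! original semicontinuous outcomes
  BINARY = bin1-bin4;        ! generated 0/1 any-use indicators
  CONTINUOUS = cont1-cont4;  ! generated amounts, missing when the binary part is 0
  VARIABLE:
  CATEGORICAL = bin1-bin4;

Growth models are then specified for the binary and continuous parts as two parallel processes.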
IYH Boon posted on Tuesday, July 24, 2012 - 1:44 pm
In an LCGA, when Mplus gives the "one or more parameters were fixed to avoid singularity of the information matrix" error, what are the parameters being fixed to? According to TECH1, in the 4-class model that I'm estimating, the mean of the quadratic term for the second latent class is being fixed. Is it being fixed to 0?
Hi, I have estimated a LCGM and wish to start freeing the variances of the intercept and slope. However, whenever I do this for particular classes, the class structure (i.e., the order of the classes) changes, and therefore what I have freed is not always in the same class. I have tried to use starting values based on the intercept and slope means from the LCGM to specify the order of the latent classes, something like this for each class: %c#1% [i*.212 s*.114]; i s;
It doesn't seem to have any effect on the order of the latent classes and the same problem is occurring. Is this the code I should be using to freely estimate the variance for specific classes?
When you free a variance in one class, the program puts the free variance in the class that provides the best overall loglikelihood. You can try freeing all variances or use the SVALUES option of the OUTPUT command to get input statements with starting values and free just one variance. This may help the classes stay in the same order.
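The SVALUES approach can be sketched as follows. First request

  OUTPUT:
  SVALUES;

and then paste the generated MODEL statements back into the input, freeing just one variance, for example (the numbers shown are made up):

  %c#1%
  [i*0.212 s*0.114];
  i;    ! free the intercept variance in this class only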
I would like to use Latent Class Growth Analysis (LCGA) to analyze longitudinal Add Health data, which has a complex sampling design. Is it possible to specify an LCGA model with Add Health strata, cluster, and sampling weight variables? If so, could you provide a syntax example?
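A minimal sketch of the design specification (the variable names strat, clus, and wt are placeholders for the Add Health design variables):

  VARIABLE:
  STRATIFICATION = strat;
  CLUSTER = clus;
  WEIGHT = wt;
  ANALYSIS:
  TYPE = COMPLEX MIXTURE;

The LCGA model itself is then specified in the MODEL command as usual.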
I am trying to use the AUXILIARY option for latent class growth analysis with covariates on the auxiliary variable. In other words, I want to predict an outcome from the classes, but control for other potentially pre-existing differences. My current output results in: *** ERROR in MODEL command Unknown variable(s) in an ON statement: BIN_TM54
A variable cannot appear on both the USEVARIABLES list and the AUXILIARY list. The variables on the AUXILIARY list should not be analysis variables. All variables on the USEVARIABLES list are used in the analysis.
You should not have the BIN_TM54 variable in your MODEL (i.e. BIN_TM54 ON...) - that's what Mplus complains about (you can have it on the USEV list, but it will be removed due to being on the AUXILIARY list).
It looks like you want to control for covariates influencing your distal outcome BIN_TM54. In that case, you may want to do the 3-step "by hand" as described in our web note 15 posted on our web site.
I keep receiving the following error: *** ERROR The following MODEL statements are ignored: * Statements in Class %OVERALL% of MODEL: WJAPWCX5 ON C1#1 WJAPWCX5 ON C2#1 WJAPWCX5 ON C2#2 *** ERROR One or more MODEL statements were ignored. Note that ON statements must appear in the OVERALL class before they can be modified in class-specific models. Some statements are only supported by ALGORITHM=INTEGRATION.
I am investigating the emotional influence of one group member on other group members across time (4 measurement occasions). In order to test for this influence, I used a fully cross-lagged model. I find that group members influence each other's emotions. But I also want to know what the path looks like. Are they influencing each other upwardly or downwardly? How can I test for this? I was thinking about running a growth model for two parallel processes (Example 6.13 from the manual). Would this be the right thing to do? Are there other alternatives that look at influence patterns and paths across time at the same time?
This sounds like a question better served by a general discussion forum like SEMNET.
IYH Boon posted on Tuesday, December 11, 2012 - 9:39 am
I realize that LCGA and GMMs are specifically designed for use with longitudinal data, but I'm wondering about LCA. Are there any specific drawbacks (or benefits) to using LCA when the observed indicators are time-ordered variables (e.g., repeated measures of poverty status)? Wagmiller et al. (2006) did it in the (widely-cited) paper referenced below, but I haven't seen many other examples. Any thoughts or information would be very much appreciated. Thanks in advance.
Wagmiller, Robert L., Mary Clare Lennon, Li Kuang, Philip M. Alberti, J. Lawrence Aber. 2006. "The Dynamics of Economic Disadvantage and Children's Life Chances." American Sociological Review 71(5):847-866.
There is no drawback to using LCA with longitudinal data other than the fact that your findings will be very similar to LCGA, and LCGA is a more parsimonious model. LCA is a great way to see the growth shape for use in an LCGA. See the Topic 6 course handout on the website starting at Slide 76.
I'm conducting some LCGA with 4 indicators using a piecewise model (one slope is defined by 2 indicators only, with variance fixed at zero). I would like to identify my trajectories while controlling for some covariates. My issue is that I am not specifically interested in the effects of the covariates on the trajectory membership. Instead, I need to account for the effects of these covariates while estimating my models because they are related to my indicators. Thus, my question is: should I regress the class membership on my covariates (e.g., c#1 on cov), or would it be better to control for the effects of the covariates on each indicator? Thanks!
I wonder how you know that the covariates influence your indicators and not the class membership. Covariates that influence the class membership have an indirect influence on the indicators, so they are not uncorrelated.
Thanks for your reply, Bengt! I actually do not know whether my covariates influence the class membership but, as you say, I assume they do. However, because I am not specifically interested in these effects, I was wondering what the best way is to estimate trajectories while controlling for the effects of these covariates (as I know that they affect my indicators). From your reply, it seems to me that I should end up with highly similar results either way, right? Thanks!!!
I have a question about the parallel latent class growth modeling. Can we get the output of Odds Ratio instead of proportion (for example, the OR of the m class for outcome Y1, given the n class of outcome Y2)? If so, what is the syntax?
Not automatically. But you can form the odds ratios you want in MODEL CONSTRAINT, creating NEW parameters.
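A minimal sketch (the label b1 is hypothetical and assumes c2 ON c1 is part of the model):

  MODEL:
  %OVERALL%
  c2#1 ON c1#1 (b1);
  MODEL CONSTRAINT:
  NEW(or1);
  or1 = exp(b1);   ! odds ratio, with a standard error via the delta method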
ywang posted on Wednesday, March 27, 2013 - 8:38 pm
Dear Drs. Muthen:
I would like to follow up with the question regarding the odds ratio for parallel latent class growth modeling. The output is as follows. We have three classes for C1 and three classes for C2. We specified "C2 on C1", so it is a multinomial logistic regression.
I am not sure how to interpret. For example, the regression coefficient of C2#1 on C1#1 is -3.318. Exp(-3.318) is 0.036. Can it be interpreted as that the relative risk for the kids to stay in C2#1 class versus C2#3 class is much smaller (only 0.036 times) for the kids in C1#1 class compared to the kids in C1#3 class?
If that is not correct and the relative risk ratio cannot be achieved by directly exponentiating the regression coefficient, how can we get the relative risk ratio?
The exp of c2 ON c1 is the odds ratio. As Wikipedia says:
Relative risk is different from the odds ratio, although it asymptotically approaches it for small probabilities.
Zihan Wei posted on Saturday, May 04, 2013 - 7:14 am
Dear Drs. Muthen, I have two independent variables, X1 and X2 (both binary), and a dependent variable measured repeatedly at 10 time points, Y1-Y10 (continuous). I'm interested in how Y develops and how X1 and X2 affect the development of Y. It was suggested that Y is heterogeneous and may have subgroups, so I chose GMM to analyze my data. I followed the User's Guide and some other papers and wrote my main model syntax as follows: ANALYSIS: TYPE = MIXTURE; starts = 10 2; stiterations = 10; MODEL: %OVERALL% i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6 y8@7 y9@8 y10@9; i-s@0; i s on x1 x2; c on x1 x2; I'm totally new to Mplus, so I have some questions about the analysis. (1) Is this a correct model for my research problem? (2) I found that the path coefficients for the regression of the slope and the intercept on the predictors X1 and X2 are the same in every class. But I suppose the coefficients may differ across classes. If I want to estimate the path coefficients separately in each class, is that possible? Thanks a lot! Regards, Zihan Wei
I would like to follow up on your response regarding the p-values for the Lo-Mendell-Rubin test from a few years ago. In one of your responses, you mentioned that "For the Lo-Mendell-Rubin test, p-value<0.05 indicates the model with one less class is rejected in favour of the estimated model." I am wondering if I can use p<0.1 as a cutoff.
I am working on a paper and have a p-value of 0.07 for the 3-class vs. 2-class model. However, in the 2-class model both slopes are not significantly different from 0, while in the 3-class model some slopes are significantly different from 0 and some are not. The bootstrapped likelihood ratio test shows p-values less than 0.001 both for 3 classes compared to 2 and for 4 compared to 3. BIC keeps dropping. I would like to choose the 3-class model instead of the 2-class model based on (1) the LMR LRT p<0.1 and (2) the fact that the slopes differ across classes. Do you think I can use p<0.10 as justification for my selection of the number of classes? Thanks!
No, I wouldn't fiddle with the alpha level. You could instead argue that the p-value is suggesting 2 classes, but that the 3-class solution adds a substantively meaningful class, while a 4th class does not.
Or, you could investigate why BIC keeps dropping - that is often a sign that a different model is needed.
Thanks a lot for the response. It is a great idea to argue that the 3-class model includes an additional meaningful class. I have a follow-up question. You mention a different model might be needed when BIC keeps dropping (for both LCGA and GMM). Do you mean a different model such as one including a quadratic slope, or do you mean other alternative models?
I am running LCGA and GMM models. I want to compare how parameters and trajectories differ between genders. However, running a multiple-group analysis gives very different results. I assume this is because the smallest trajectory classes for the full sample do not contain a large enough sample of each gender. I am therefore wondering what the best method is. Should I simply add gender as a covariate to the analysis?
If you want to compare trajectories by gender, ideally you will do an LCA by gender as a first step to determine whether you find the same classes. If you regress c ON gender, you assume there are no direct effects from gender to the outcome and the intercept and slope growth factors.
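The by-gender first step can also be run in a single analysis using KNOWNCLASS (a sketch; the class variable name cg and the 0/1 coding of gender are assumptions):

  VARIABLE:
  CLASSES = cg(2) c(3);
  KNOWNCLASS = cg (gender = 0 gender = 1);
  ANALYSIS:
  TYPE = MIXTURE;
  MODEL:
  %OVERALL%
  c ON cg;

Class-specific growth parameters can then be compared across the two known (gender) classes.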
In the output I have that the latent variable “I” for the intercept has mean 0 and I have a threshold of 2.3. As far as I know, the threshold is the same as the intercept times -1, right? I want Mplus to output an intercept instead of a threshold and I added the following line to my code:
This successfully gives the latent variable "I" a mean which corresponds to the intercept I was looking for, and all the thresholds are -15, which I thought meant that I don't have thresholds anymore. However, the results are not exactly the same. The intercept I get is -19.2, but I thought it should be close to -2.3. Also, the coefficient of i on race_h is very different, but everything else seems to be very close with both codes. Is this the appropriate way to ask Mplus to use an intercept instead of a threshold?
Thank you for your time.
Jon Heron posted on Wednesday, April 16, 2014 - 8:07 am
I think you should be fixing your thresholds to zero rather than minus 15 to transfer that value to the intercept.
To get this working, I would be inclined to temporarily remove the covariates from the model. Once you include covariates for i and s, Mplus no longer quotes you the latent variable means. Confusingly, you are now given an intercept for your Intercept. This will only equal the mean if your covariates are centred.
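The threshold-to-intercept switch suggested above can be sketched as follows (assuming binary outcomes smoke1-smoke4, declared as CATEGORICAL, from the later post):

  MODEL:
  %OVERALL%
  i s | smoke1@0 smoke2@1 smoke3@2 smoke4@3;
  [smoke1$1-smoke4$1@0];   ! fix the thresholds to zero ...
  [i];                     ! ... so the intercept factor mean carries the location

With covariates on i and s, the reported value is an intercept rather than a mean unless the covariates are centered.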
This gives the probability at the means of i and s. So a conditional probability, not the marginal. The marginal needs to be integrated over the distribution of i and s given covariates. This is what is shown in the plots that Mplus makes (Adjusted estimated means).
Thanks for your previous response. This makes sense now. I come from a Mixed Models background and displaying probabilities as I mentioned above wouldn't make sense without integrating out the 'random effects'. I have added the following lines to my code:
PLOT: TYPE = PLOT2; SERIES = SMOKE1 (1) SMOKE2 (2) SMOKE3 (3) SMOKE4 (4);
When I open the graph and get the window that says 'Select plot to view', my only options are 'Sample proportions', 'Item characteristic curves', and 'Information curves'. Nowhere can I select 'Adjusted estimated means'. Could you help me with this, please?
I have a longitudinal data set with four waves. At wave 1, the participants' baseline ages range from 12 to 21. I want to estimate trajectory memberships based on age, not wave; that is, a cohort-sequential design. Is it fine to restructure the data, create age-based variables, and then use LCGA to identify the trajectories? Does this approach ignore the effect of cohort? Thanks!
You can restructure the data so time is age. When you do this you make the assumption that all cohorts come from the same population. You can also take a multiple group approach. See Example 6.18.
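If the restructuring route is taken, a hedged sketch of the age-as-time setup (annual assessments and the variable names y12-y24 are assumptions; ages not covered by a cohort's four waves are missing by design):

```mplus
DATA:       FILE = agedata.dat;
VARIABLE:   NAMES = id y12-y24;
            USEVARIABLES = y12-y24;
            MISSING = ALL (-99);    ! unobserved ages coded -99
            CLASSES = c(3);
ANALYSIS:   TYPE = MIXTURE;
MODEL:
  %OVERALL%
  i s | y12@0 y13@1 y14@2 y15@3 y16@4 y17@5 y18@6
        y19@7 y20@8 y21@9 y22@10 y23@11 y24@12;
```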
RuoShui posted on Thursday, March 05, 2015 - 3:24 pm
Dear DRs. Muthen,
I am conducting growth mixture modeling for two big groups, A and B. I found the same number and a similar pattern of trajectory classes. I can compare the mean differences in academic adjustment among subgroups such as A1, A2, A3, and A4 in Mplus. But is there a way to also compare the mean differences between similar subgroups such as A1 and B1 in Mplus?
Statistically, you can do any mean difference estimation using Model Constraint. But perhaps you are asking when it is substantively ok to compare across subgroups.
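A sketch of the Model Constraint route, assuming a factor f whose class-specific means are labeled (with two observed groups, a KNOWNCLASS variable crossing group with trajectory class would let A1 and B1 be labeled the same way):

```mplus
MODEL:
  %C#1%
  [f] (m1);         ! mean of f in class 1
  %C#2%
  [f] (m2);         ! mean of f in class 2
MODEL CONSTRAINT:
  NEW(diff);
  diff = m1 - m2;   ! estimated difference with its own SE and z-test
```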
RuoShui posted on Tuesday, March 10, 2015 - 8:43 pm
Dear Dr. Muthen,
Thank you very much for your response above. My question is more of a technical one. I was able to test whether the residuals of distal outcomes (auxiliary variable: (e)) differ across latent classes within group A or B. But I couldn't figure out how to simultaneously test in Mplus whether the residuals of distal outcomes differ across groups A and B for similar latent classes (such as between A1 and B1, and between A2 and B2). Do you have an example I could follow?
I am running an LCGA model using the 3-step procedure and attempting to look at an outcome specified as a latent variable. The model freely estimates the means and variances of the latent factor for all 4 groups. This produces an error saying that the mean of the latent factor is fixed to zero for the last class (a default solution), but the variance is estimated for that factor.
Does this happen because the model is not identified unless the latent factor mean for one group serves as the contrast (i.e., is fixed to 0)? When this is done, the estimated means for the other groups are difference scores relative to that group, and the significance test for each in turn indicates whether it differs significantly from the class whose mean is fixed to zero (i.e., a relative difference in means).
Thank you for the feedback. Building on the previous question (4/2/2015 at 11:13 AM), we are wondering about the most appropriate way to get estimated means/intercepts and standard errors for each class. If no groups are specified as the contrast group (no means are fixed to 0), the mean for the last class is automatically fixed and there is no standard error estimated for this class (for clarity, we’ll call this Model 1 here).
Is it possible to generate or calculate the standard error for the last class? We tried fixing the mean of the latent outcome for class 1 at the value provided in the output from Model 1. This allows the mean of the latent factor to be freely estimated for the last class, which in turn allows a standard error to be estimated for that class. However, when we do this, the standard errors associated with the latent factor means for the other classes differ from those in Model 1 (although the means and other parameter estimates are basically unchanged). Is it okay to get the standard error for the last class this way?
Alternatively, would it be better to take the square root of the residual variances for the latent outcome variable and present these values with the means (instead of the standard errors)?
You should not want an estimate and SE for the factor mean in the reference group where it is fixed at zero. There is no more information to be obtained. Factors have an arbitrary scale and their metric needs to be set. Mean zero and variance 1 is one way. Only in comparisons across groups or time can you estimate factor means - and then only in comparison with a reference group/timepoint where the mean is fixed (say at zero). There is no disadvantage in this.
Thanks for the reply, Bengt. What we are trying to do is set up a table that summarizes group differences on a latent variable within the context of a 5-class LCGA model. We wanted to set it up like a traditional ANOVA table with means and SDs for each of the groups. We can certainly indicate which group has its factor mean set to 0 so the reader understands the estimated factor means are relative to this group. However, we also wanted to include some metric of the dispersion around the mean value for each class. Is it appropriate to take the square root of the estimated factor variance to get an SD around the mean scores, even for the contrast group? Our model includes covariates, so the variance estimates are adjusted for these factors. Thanks.
Tessa posted on Wednesday, July 01, 2015 - 11:39 am
I am working on a GMM for a continuous outcome measured across 4 years (Y1-Y4) and plan to investigate what variables might differentiate the various classes/profiles that are identified.
My question is, can these variables be measured at the first year (Y1) or do they need to be measured at the previous year (Y0)? These would be static variables such as participant sex as well as potentially time-varying variables such as change in household size.
We’re working on an analysis using the 3-step model to compare four latent classes on various dependent variables. We want to test whether three of the groups, in aggregate, differ from the fourth group on a categorical dependent variable. Would it be appropriate to use the model constraint command to calculate a weighted average threshold (based on size of each group)? And if so, is it appropriate to use the model test command to compare the weighted average threshold of the first three groups with the threshold of the fourth group?
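One hedged way to set this up, with hypothetical labels t1-t4 for the class-specific thresholds of a binary outcome u, and the estimated class proportions (here 0.30, 0.25, 0.20, purely illustrative) plugged in as fixed weights:

```mplus
MODEL:
  %C#1%  [u$1] (t1);
  %C#2%  [u$1] (t2);
  %C#3%  [u$1] (t3);
  %C#4%  [u$1] (t4);
MODEL CONSTRAINT:
  NEW(wavg);
  ! weighted average threshold of classes 1-3
  wavg = (0.30*t1 + 0.25*t2 + 0.20*t3) / 0.75;
MODEL TEST:
  ! Wald test: weighted average of classes 1-3 vs. class 4
  0 = 0.30*t1 + 0.25*t2 + 0.20*t3 - 0.75*t4;
```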
Also, we understand that maximum likelihood estimation within logistic regression can produce sample bias for rare events. Will the same bias be present when contrasting the trajectory groups on rare events under MLR estimation? Are there alternative strategies for dealing with this problem?
Regarding your last question, MLR gets the same parameter estimates as ML so won't help. Don't know how to avoid any such bias.
Jackie Du posted on Friday, October 30, 2015 - 8:19 am
I am a new user. I have 6 time points and a bit under 200 cases. I would like to conduct LCGA. After looking at the linear model, it seemed that a quadratic shape would be a better fit. In error, my initial model was: "i s q | h0@0 h10@10 h15@15 h21@21 h27@27 h39@39; i-s@0;" I later corrected this to "i-q@0;". However, the former "i-s@0" model seems to produce biologically more sensible curves. I thought this was merely because it took better account of the intercept. However, when I instead tried "s ON i", I got this error: THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED FISHER INFORMATION MATRIX. Can you explain if and how the "i-s@0" model is wrong, or whether I can justify its use based on the expected biological behaviour?
I am running latent growth curve analyses using MLR estimation in a sample of 130 subjects. I found that the correlation between the intercept and slope variables of interest is .27, p = .18. Normally a Pearson correlation of this size (.27) would be highly significant in a sample of 130 subjects, but here it is not. Why is this? Does Mplus adjust the p-value in light of the missing-data estimation?
How do you specify a different trend for each class in the LCGA model syntax? E.g. if you wanted to test a linear trend in one class, and a quadratic trend in another? Does there still need to be an overall trend specified? Thanks.
The overall trend should be the most general. In classes where there is not quadratic growth, fix the mean, variance, and covariances with other growth factors to zero.
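A sketch of that advice with a quadratic overall model and the quadratic trend switched off in class 1 (variable names hypothetical; in LCGA the growth factor variances are typically fixed at zero anyway):

```mplus
MODEL:
  %OVERALL%
  i s q | y1@0 y2@1 y3@2 y4@3;
  %C#1%
  [q@0];            ! quadratic mean fixed at zero: linear class
  q@0;              ! quadratic variance fixed at zero
  q WITH i@0 s@0;   ! covariances with other growth factors at zero
  %C#2%
  [q];              ! quadratic mean estimated: quadratic class
```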
sfhellman posted on Thursday, March 03, 2016 - 3:10 pm
I am working on an LCGA of depression trajectories, n = 200, with data at 3 timepoints. I began by attempting to estimate a one-class model with linear and quadratic terms; however, I got the following error message:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.133D-13.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 3, %C#1%: ICEPT BY PHQSUM_6M
I also estimated a model without the quadratic term and got the following error message (though the model estimation terminated normally):
WARNING: THE MODEL ESTIMATION HAS REACHED A SADDLE POINT OR A POINT WHERE THE OBSERVED AND THE EXPECTED INFORMATION MATRICES DO NOT MATCH. AN ADJUSTMENT TO THE ESTIMATION OF THE INFORMATION MATRIX HAS BEEN MADE. THE CONDITION NUMBER IS -0.591D-08.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 3, %C#1%: ICEPT BY PHQSUM_6M
I planned to proceed by estimating models with additional classes, but I am not sure if it is appropriate to do so. Any guidance is appreciated!
Tibor Zin posted on Sunday, April 10, 2016 - 1:53 pm
Dear Dr Muthen,
I am quite new to LCGA and would like to ask a question about my analysis choice; your advice would be very appreciated!
I have three waves of data and would like to find out whether class membership defined by 7 indicators predicts continuous variables, or whether these variables predict class membership. I think that testing whether a continuous variable predicts class membership is not a problem, but I am not sure whether the other way around is also possible. If it is not, could you please recommend another approach to test my hypotheses?
The difference is that when the "continuous variables" are predicted by the latent class variable, they are conditionally independent given the latent class variable. But when the continuous variables predict class membership, they are not conditionally independent. Both models are possible.
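Sketches of the two directions, run as separate models (the names x for the continuous variable and a 2-class model are hypothetical):

```mplus
! Direction 1: x predicts class membership
MODEL:
  %OVERALL%
  c ON x;

! Direction 2: the latent class variable predicts x,
! specified by letting the mean of x vary across classes
MODEL:
  %OVERALL%
  [x];
  %C#1%
  [x];
  %C#2%
  [x];
```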
Tibor Zin posted on Sunday, April 10, 2016 - 11:57 pm
Thanks for the fast reply! I have a problem conducting this analysis. When I regress the variable on class membership, the output reports a problem. Do you know what the problem could be? Could you perhaps recommend a source where this analysis is described?
I am comparing models with different numbers of classes using TECH11. I ran the models with many start values/optimizations, then used OPTSEED when asking for TECH11, and increased the k-1 start values/optimizations if needed. I have 2 questions:
1. When I compared a 2-class to a 3-class model, the difference in the number of parameters indicated in TECH11 is 1, whereas it is 4 for comparisons of models with more classes (e.g., 4 vs. 5, 5 vs. 6). Could you explain why this is the case?
2. The log-likelihood of the H0 model in TECH11 for 3 classes (vs. 4) is higher (-96246) than the best solution for a 3-class model (-96329) obtained with many start values and final optimizations (400 80). If anything, I would expect the difference to go the other way, which could be explained by a local solution being found in TECH11. Do you have an explanation for this?