I would like to ask whether it is possible to include a latent class variable in a multilevel model. For instance, in school effectiveness research, some characteristics could possibly distinguish 5 classes of schools. Would it be possible to use this latent class variable to predict student outcomes at the lowest level? Or is this impossible? If it is possible, are there references to check for more details about this option?
In order to estimate a multilevel mixture model (that is, a multilevel model with a finite mixture random component), I need the base package, the multilevel add-on, and the mixture add-on. Am I correct about this? Thanks!
You would need the combination add-on to estimate a multilevel mixture model.
Joyce T. posted on Monday, April 04, 2005 - 6:36 am
I am running a multilevel model (using ML) which contains 20 dependent variables, 12 independent variables and 3 continuous latent variables. I would like to know how Mplus computes the degrees of freedom for both the chi-square test of model fit and the chi-square test of model fit for the baseline model. Thanks.
BMuthen posted on Wednesday, April 06, 2005 - 3:03 am
The degrees of freedom is the number of parameters in the H1 model minus the number of parameters in the H0 model. The chi-square test of model fit for ML uses as H1 a model with free means, and free variances and covariances for both within and between. The baseline model is a model of free means and variances for between and within.
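As a rough illustration of the counting rule above, here is a minimal Python sketch. It assumes the simplest case only: p outcomes that all vary on both levels and no covariates (with covariates, as in the model asked about, the counts differ). The function names are hypothetical and are not part of Mplus.

```python
def h1_parameter_count(p: int) -> int:
    """Parameters in the unrestricted two-level H1 model for p outcomes.

    Assumption: every outcome varies on both levels and there are no
    covariates, so H1 has p means plus p(p+1)/2 free variances and
    covariances on within and again on between.
    """
    cov_per_level = p * (p + 1) // 2
    return p + 2 * cov_per_level


def baseline_parameter_count(p: int) -> int:
    """Baseline model: p means, p within variances, p between variances."""
    return 3 * p


def degrees_of_freedom(n_h1: int, n_h0: int) -> int:
    """df = number of H1 parameters minus number of H0 parameters."""
    return n_h1 - n_h0


p = 4
df_baseline = degrees_of_freedom(h1_parameter_count(p), baseline_parameter_count(p))
print(df_baseline)  # 12
```

For the test of model fit itself, the same subtraction applies with the number of free parameters of the fitted H0 model, which Mplus reports in the output.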
I am working on a multi-level LCA in a school-based data set. The variables of interest are 1) parent involvement (a child level variable) and 2) classroom quality (a classroom level variable). The ultimate goal is to use the latent classes as independent variables to predict child outcomes such as school readiness.
1) One question that has come up in our discussions is how to deal with important covariates (such as maternal education, child age, child sex, child ethnicity). I was wondering if you could speak to the differences between a) including the covariates in the model that estimates the latent class memberships vs. b) running post-hoc ANOVAS to examine the distribution of the profiles on these important covariates.
2) Also I am interested in how using the Latent Classes as independent variables to predict other outcomes would shift class memberships. Does this often happen when adding a predictive step into LCA/LPA models?
3) Finally we are using a national database with sampling weights, will weighting the data influence the LCA/LPA outcomes?
1. Generally it is best to estimate the full model simultaneously. See the following paper which is available on the website for further information:
Relating latent class analysis results to variables not included in the analysis. Submitted for publication.
2. If you add outcomes other than the original set, this will most likely change class membership and perhaps it should. See the following related paper which is available on the website:
Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.
3. You can and should include complex survey data features in the analysis. See the user's guide under complex survey data to see the options available in Mplus.
I have two follow up questions for the multilevel LCA we are working on.
1) Standardized Scores: Do you suggest running the models with standardized continuous indicators? Or is it acceptable to keep the indicators of profiles in their original metric (even if variances are different among indicators?)
2) Predicting Outcomes from Profile Membership: Also, for estimating the relationship between different profiles and an outcome (say literacy achievement) we have been including the outcome of interest as an indicator of profile membership. We then ran a Wald statistic to examine profile differences on the mean estimates of the outcome. Is this how you would suggest estimating profile differences on an outcome?
I have a question regarding a multi-level LCA too.
I am using a sample of twins in which I would like to identify the genetic-environmental etiology of class membership. I also have predictors at the within and the between level.
Since regressing the measured environmental variable doesn't seem to modify the ACE results (as seen in Turkheimer et al., 2005), I would like to use another strategy developed by Rasgach, O'Connor and Jenkins in which the genetic resemblance is a fixed effect.
Although I think this strategy is the best, I'm not sure how to actually translate the equation into an Mplus input. The equation is: y(ij) = Beta(o) + u(j) + e(ij) + g(ij)
Basically, the only thing that changes from a regular multilevel model is the g(ij) which is the genetic effect for the child (i) in the j'th family which varies for all individuals according to behavior genetic assumptions (it can be used with complex family pedigrees). I am wondering what do I write in the input. I think they use a single group to do the analyses and it departs from the multiple group analyses I am used to with twin samples.
Page 8, bottom, suggests that a covariance for the g(ij) term is a function of known constants. This reminds me of "QTL" modeling which is shown in the UG ex 5.23. This UG example shows how to use the Constraint= approach to moderate a covariance using read-in values. Perhaps that is a path towards doing what you want.
In running the multi-level LPA I mentioned earlier, we see from the output that the means of the indicators in the level 2 profiles are constrained to be equal across profiles. It appears that this is the default in Mplus.
Is it possible to simultaneously estimate a latent profile of level 1 (child level variables) and a latent profile of level 2 (classroom level variables) without the level 2 means being fixed across level 2 profiles?
I attempted to override this estimation with starting values, but am getting repeated error messages:
The following MODEL statements are ignored:
* Statements in Class %CB#1% of MODEL CB on the BETWEEN level:
  SSCS98 LSCS98 ECPERSS98 ECFURNS98 ECLANGS98 ECMOTRS98 ECCREAS98 ECSOCLS98 LTARNS98 INTERS
* Statements in Class %CB#2% of MODEL CB on the BETWEEN level:
  SSCS98 LSCS98 ECPERSS98 ECFURNS98 ECLANGS98 ECMOTRS98 ECCREAS98 ECSOCLS98 INTERS
*** ERROR
One or more MODEL statements were ignored. These statements may be incorrect.
Dear Dr. Muthén, I want to perform LCA on a complex dataset (teachers were rating students), and want to control for clustering effect. However I cannot define Type=complex, since this is already done with Type=mixture. How can I use the clustering or stratification options in this type of analysis? Thanks a lot. Robert
I am working on a two-level LCA where the variables of interest include both child-level (observed child interactions) and class-level variables (classroom quality). I have specified two classes at each level. I would like to see if the resulting four profiles differ on child-level school readiness outcomes. Is there a way to get an outcome mean for each profile? I am only able to get two means (one for each of the within classes) rather than four means (one for each of the four profiles). Thank you!
Take a look at the 2010 Henry-Muthen article in the SEM journal. For the models of figures 1-4 the between classes only make the within classes more or less likely but don't change the profiles of the observed items. In contrast, for figure 5 and on there are item-specific differences across the between-level classes so that would give the profile differences you expect.
K Frampton posted on Wednesday, March 16, 2011 - 11:23 am
I am running a multilevel LPA with continuous parenting indicators, with children nested within families. My goal is to identify parenting profiles, observe how they differ across various factors (mainly SES), and then use profile membership to predict a distal outcome (children's prosocial skills) in interaction with SES.
I first fit the model in a single level, and then in a multilevel. 4 classes were identified. Entropy is .86.
I then added covariates of interest (e.g., age of child, SES variables) to identify what distinguishes these groups. When I do this, the structure of the classes changes significantly. I know this is because of measurement invariance issues. When I regress the parenting indicators on the covariate(s) in a single-level model, the fit of the model improves. However, when I do the same thing in a multilevel model, the computation ran for more than 2 hours and did not converge.
Any suggestions on how to get around this measurement invariance issue in a multilevel model? Because entropy is high, is it feasible in a multilevel framework to save the classes identified and then work with them as an observed variable, as you might do in a single-level model?
Also, with all this in mind - how would you suggest answering my final question - how SES X profile predicts a distal outcome?
It sounds like you have measurement non-invariance and that you add direct effects from covariates to indicators to take this into account. Note that you cannot identify a model with all direct effects.
To see your multilevel problem, you would have to send your input, output, data, and license number to email@example.com.
An SES X profile interaction influencing a distal outcome can be handled by regressing the distal outcome on SES with different slopes in the different profiles.
IYH Boon posted on Monday, June 27, 2011 - 12:58 pm
Are there any examples/code snippets available for situations like the one K Frampton describes, above, where the goal is to (1) identify latent profiles at level two and (2) relate these profiles to a distal outcome observed at level one?
I'm working on a similar problem and am unsure about how to specify the model statement.
I don't think we have that in script or paper form, but you would work along the lines of the below. This creates a between-level (say school) latent class variable cb from the between-level z indicators and cb influences the means of the random intercept for the distal outcome d (which is say a student variable varying on both within and between), which is how the cb influence carries over to the student's distal outcome.
Between = z1-z10 cb; classes = cb(2);
d on x;
I don't think you have to say more in MODEL because the z means vary across the cb classes as the default, and so does the d mean, where on between d is the random intercept in the regression of d on x.
Hope this start helps.
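To make the structure of that sketch concrete, the data-generating process it implies can be simulated in a few lines of Python. This is only an illustration under assumed values: the class probability, intercept means, slope, and residual standard deviations are made up, the z indicators are left out for brevity, and the function name simulate is hypothetical.

```python
import random


def simulate(n_clusters=50, cluster_size=20, class_prob=0.5,
             intercept_means=(0.0, 1.0), slope=0.5, seed=1):
    """Simulate students in schools where a school-level class cb shifts
    the mean of the random intercept of the distal outcome d."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        # Between-level latent class: one value per school.
        cb = 0 if rng.random() < class_prob else 1
        # Random intercept of d; its mean depends on the school's class.
        intercept = rng.gauss(intercept_means[cb], 0.3)
        for _ in range(cluster_size):
            x = rng.gauss(0.0, 1.0)
            # Within-level regression of d on x with a common slope.
            d = intercept + slope * x + rng.gauss(0.0, 1.0)
            rows.append({"cluster": j, "cb": cb, "x": x, "d": d})
    return rows


rows = simulate()
print(len(rows))  # 1000
```

Every student in a school shares that school's cb, so the class effect reaches the student's d only through the school's random intercept.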
Junqing Liu posted on Thursday, August 25, 2011 - 12:50 pm
I am new to Mplus and LPA. I am working on a two-level LPA in a workforce data set. The variables of interest are 1) organization culture (a level 2 latent variable based on five level 2 continuous indicators) and 2) worker demography and practices (level 1 observed variables). The goal is to use the latent classes as independent variables to predict workers' practice, such as using a type of therapy.
1) Do I need to run the LPA first to get the classes (say there will be 2 or 3 categories) before including the latent class variable in the final model to predict the worker outcome?
2) In the final model using the org. culture class membership to predict worker outcome, do I need to include the observed organization id as a predictor to declare that this is a two-level model?
3) What is the output of the final model? Is it separate regression models for each category of org. culture? Or is it one regression?
4) Is there any empirical research reference on cross-sectional multilevel LPA analysis that I can read?
2) Organization would be your Cluster= variable - see UG.
3)-4) You should read
Henry, K. & Muthén, B. (2010). Multilevel latent class analysis: An application of adolescent smoking typologies with individual and contextual predictors. Structural Equation Modeling, 17, 193-215.
which you can find on our web site.
Junqing Liu posted on Friday, August 26, 2011 - 1:48 pm
This is extremely helpful!
I can compare different LPA models and pick one that fits the best as the final latent profile model to do further analysis. I have some follow-up questions about the further analysis.
1. How common is it to use the latent profile variable as a predictor along with other covariates, rather than as a dependent variable?
2. If it is common, should a level 1 latent profile variable be included as a regular categorical covariate (along with other level 1 and level 2 predictors) to predict a level 1 outcome? Or does the way to include it depend on how the latent profile variable is modeled, such as a two-level latent profile model with a level 2 factor on random latent class intercepts and a level 2 factor on random latent class indicators?
3. Is there any empirical research reference on using multilevel latent profile variable as predictor that I can read?
Thank you for your patience with my long-windedness.
1. It is getting more used now that software is available for easy use. There are papers on our web site showing this. But I would not say that it is common yet.
2. A latent class variable should be included as a predictor if substantive theory warrants that. Note, however, that you don't say "y ON c" (for a distal outcome y), but Mplus lets the y means change over the latent classes.
3. I have not seen multilevel latent profile used as a predictor yet in the literature, but there is nothing precluding it. The approach used in Henry & Muthen can easily be expanded to that using Mplus.
Junqing Liu posted on Friday, September 09, 2011 - 6:56 am
Thanks, Bengt. This is very helpful.
I tried the following three-class two-level LPA of org. culture. However, the output does not include results on latent classes. All the results are about correlations and covariances. The five latent class indicators are mean scale scores, with values ranging from 1 to 5.
1) How may I change the following syntax to get output on latent classes?
2) Is it OK to use scale means as latent class indicators? Or is it better to use the items within the scales as indicators?
Junqing Liu posted on Tuesday, September 13, 2011 - 12:49 pm
Thanks, Bengt and Linda. The technical problem is solved.
I have a couple of questions related to the results of the two-level model I mentioned earlier.
1) Are tech 11 and tech 14 applicable to a two-level LPA model? If so, when the LMR p value of tech 11 is not significant but the p value of tech 14 is, should I pick a k-class model as opposed to a k-1 class model?
2) The BIC and AIC of the two-level model are smaller than those of a single-level model, but not by much (e.g., the adjusted BIC is 2542.96 for the two-level model and 2559.12 for the single-level model). In this case, should I still choose the two-level model?
3) What does the following specification mean, especially C#1 WITH C#2?
1) In principle tech14 is more reliable; however, you might also want to look at BIC and AIC for the two models.
2) You should also consider the size of the variance on the between level and see if it is significant. Note also that the performance of BIC would depend on the number of clusters (two-level units) which drives the asymptotics. Ultimately a simulation study would show if AIC and BIC are useful in this context. This is not well studied.
3) These are the interaction parameters in a log-linear model for contingency tables. For example in
These are normally distributed random effects that vary over clusters and allow the class proportions to change over clusters. The covariance term C#1 WITH C#2; is just the covariance between the two random effects and C#1; C#2; are the two variances.
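How such random effects move the class proportions from cluster to cluster can be sketched numerically. The Python snippet below is an illustration only: it assumes a multinomial logit with the last class as reference (logit fixed at zero), and the intercepts and random-effect values are invented for the example.

```python
import math


def class_probabilities(intercepts, random_effects):
    """Class probabilities for one cluster.

    intercepts and random_effects cover classes 1..K-1; the last class is
    the reference class, whose logit is fixed at 0.
    """
    logits = [a + u for a, u in zip(intercepts, random_effects)] + [0.0]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]


# A cluster whose random effects are both zero (an "average" cluster).
average_cluster = class_probabilities([0.0, 0.0], [0.0, 0.0])
# A cluster with a positive C#1 effect and a negative C#2 effect.
shifted_cluster = class_probabilities([0.0, 0.0], [1.0, -0.5])
print(average_cluster)
print(shifted_cluster)
```

The variances C#1; C#2; govern how far these per-cluster shifts typically spread, and C#1 WITH C#2; governs whether clusters that are high on one logit tend to be high or low on the other.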
Mary Campa posted on Thursday, September 15, 2011 - 4:40 pm
Hello Dr. Muthen,
I am reading your paper with Dr. Henry (2010, SEM 17, 193-215) and trying to replicate the analysis titled: Three classes at level 1, Two classes at level 2 random effects model: nonparametric approach (Model 4a from Table 1).
I am using 4 classes at level one but otherwise the model statements are the same. However, I continue to get this error message:
** ERROR in MODEL command Unknown class model name CW specified in C-specific MODEL command.
This can only be answered if you send the output to support.
Junqing Liu posted on Friday, September 16, 2011 - 9:03 am
Thank you very much, Tihomir.
I have a couple of follow-up questions regarding the interpretation of c#1 and c#2.
1) Does the within-level mean of c#1 refer to the random intercept of c#1 as compared to c#3? What does a significant p value for the within-level mean of c#1 indicate?
2) What does a non-significant p value of the between-level variance of c#1 mean?
3) Regarding your earlier response on examining the significance of the between-level variance, should I do a log likelihood ratio test of the one-level and two-level models, or should I use the p values mentioned above, that is, the p values of the between-level variances of c#1 and c#2?
Happy New Year! I hope you had a great holiday! I'm a beginner in the field of multilevel analysis with a lot of questions. My aim is to reveal a typology of musicians who were rated by audience members. I have read your very interesting and inspiring paper (Henry & Muthen, 2010). First of all, I have some questions about your paper that could help me resolve some misunderstandings:
1) To which item or construct does your cluster variable, named LEAID, refer?
2) It would be much easier to follow the steps reported in your article if I could run your code on the dataset. I wrote to Mrs. Henry, but she advised me to simulate such a dataset using the Monte Carlo technique. Unfortunately, I am not able to do this. How does it work?
3) I would like to use a model for my own dataset that is the same as the one presented in Figure 7 (multilevel latent class model, non-parametric approach with level 2 factor on random latent class indicators). Would you give me a hint on how to write this in Mplus code?
4) The final model also seems attractive to me, but it is hard to understand your reported code without any comments on it, especially the lower part (model constraint). I would be happy if you could give us some comments on the code.
I would be very happy if you could help me answer these questions.
You can look at Examples 10.6 and 10.7 in the user's guide. These are similar to what is done in the Henry and Muthen paper and there are data available with each example. A note of caution, if you are a beginner in multilevel analysis, starting with a multilevel mixture model may not be a good idea. Studying both multilevel and mixture modeling as a first step is a good idea.
Angela Urick posted on Thursday, January 19, 2012 - 12:46 pm
I’m working on a two-level LCA (types of teachers in schools with types of principals) with a cw and cb that have different indicators (I have labeled them uw’s and ub’s below). These two sets of indicators have dichotomous and continuous measures. Finally, I want to regress cw on cb with a random intercept (similar to ex. 10.7). Here is my basic code:
Here are my questions:
1. Is an LRT (TECH 11, 14) possible for this model with different indicators for cw and cb? If not, how would you suggest that I assess class fit?
2. Theoretically, there is a two-way relationship between cw and cb. How would you suggest that this be modeled?
1. No. It is an open research question on how to determine number of classes in a multilevel setting such as this. The ordinary BIC, for example, may not be the best approach. Interpretability can be helpful.
2. The relationship between cw and cb is captured by the between-level statement
cw#1-cw#3 on cb;
where the cw random intercepts are influenced by cb.
Angela Urick posted on Saturday, January 21, 2012 - 10:33 am
Good afternoon, Dr. Muthen, I have another question in reference to the model mentioned above. In the results, the means of the level 1 indicators (uw/u) vary across classes as expected. However, the means of the level 2 indicators (ub/z) are the same across all classes. I ran cb as a single level LCA—there should be three different classes. Why would these means remain the same across the between level classes? Do I need to free the between level indicators or make other edits to the code? Thanks again, Angela
Mary Campa posted on Saturday, May 26, 2012 - 12:13 pm
Hello, I am building a model similar to the Henry & Muthen, 2010 model 2, a two-level random effects LCA. I have selected a four-class model as best fitting. My question is about the between-level variances produced by this code:
%BETWEEN%
%OVERALL%
C#1; C#2; C#3;
C#1 WITH C#2 C#3;
C#2 WITH C#3;
I used starting values to switch the ordering of the classes and the estimates and standard errors of the between-level variances (C#1 -C#3) changed. For example, on the initial run, the variance in class 2 was significant but on the next run the variance for the same class (although now a new number) was not. This happened for multiple classes, where the parameter estimates changed based on what class I selected as the reference.
My understanding from the Henry & Muthen paper is that these parameters represent the between-level variance in the class membership. The classes (proportions, probabilities of indicators) remain the same regardless of which is the reference class, so I am not clear why these are changing.
Does this suggest there is something wrong with my model or am I wrong in the interpretation?
The multinomial regression has the coefficients at zero for the last class, the reference class. The between-level variance components add to the coefficients for all but the last class. It makes sense that the size and SEs of these variance components change when you change the order of the classes because it is all relative to the reference class.
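The dependence on the reference class can be checked with a little algebra. The Python sketch below uses a three-class example for brevity (the question above has four classes), with made-up variance values. If u1 and u2 are the random effects of classes 1 and 2 relative to class 3, then making class 1 the reference turns them into u2 - u1 (class 2) and -u1 (class 3), and the variances change accordingly. The function name is hypothetical.

```python
def rereference_to_class1(var_u1, var_u2, cov_u1_u2):
    """Re-express between-level (co)variances when the reference class
    changes from class 3 to class 1 in a three-class model.

    Inputs are Var(u1), Var(u2), Cov(u1, u2) with class 3 as reference.
    Returns (variance for class 2, variance for class 3, their covariance)
    with class 1 as reference.
    """
    var_class2 = var_u1 + var_u2 - 2 * cov_u1_u2  # Var(u2 - u1)
    var_class3 = var_u1                           # Var(-u1)
    cov_23 = var_u1 - cov_u1_u2                   # Cov(u2 - u1, -u1)
    return var_class2, var_class3, cov_23


# Illustrative values only: Var(u1) = 1.0, Var(u2) = 0.25, Cov = 0.2.
print(rereference_to_class1(1.0, 0.25, 0.2))
```

The same underlying model therefore yields different variance estimates, and different standard errors, depending on which class is last, while the class proportions and item profiles stay the same.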
Hi Dr. Muthen, Thank you so much for your help in advance. I'm running a two-level LCA in which the latent class variable and the independent variable (treat) are at the school level, and the dependent variable is at the student level. Given that the number of clusters is 32, isn't the sum of the latent class counts supposed to be 32 as well? Instead, the results report class counts at the student level, not the school level, as shown below:
CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
Although the latent class variable is a between-level variable (varying across schools only), the class counts and proportions printed say how many students are in each class. But there are only 32 schools and the latent classes refer to them.
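If school-level counts are wanted, they can be tallied from saved classifications. The Python sketch below assumes a hypothetical list of (school id, most likely class) pairs, one per student, and assumes that all students in a school carry the same between-level class; it does not read any actual Mplus output format.

```python
from collections import Counter


def class_counts(records):
    """Tally class membership per student and per school.

    records: iterable of (school_id, class_label) pairs, one per student.
    Assumes each school has a single between-level class label.
    """
    records = list(records)
    student_counts = Counter(label for _, label in records)
    school_class = {school: label for school, label in records}
    school_counts = Counter(school_class.values())
    return student_counts, school_counts


# Three hypothetical schools with 30, 25 and 20 students.
records = [("s1", 1)] * 30 + [("s2", 2)] * 25 + [("s3", 1)] * 20
students, schools = class_counts(records)
print(dict(students))
print(dict(schools))
```

The student tally sums to the sample size, while the school tally sums to the number of clusters.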
xiaoshu zhu posted on Wednesday, July 25, 2012 - 10:56 am
I have a question regarding class membership at the group level. I followed the code for the non-parametric MLCA in Henry and Muthen (2010) and specified a model with two student LCs and three group LCs. The output showed that some groups were assigned to two group LCs simultaneously.
How can we deal with this problem? Should we decide the group class membership based on the one with largest proportion of students within the group?
Please send the output and saved data along with your license number to firstname.lastname@example.org. Point to specifically what you see as the problem.
Mike Todd posted on Sunday, January 20, 2013 - 5:14 pm
We have a multilevel dataset (individuals nested with census tracts) that we would like to use in a multilevel latent profile analysis.
The census tracts (Level 2 units) and, in turn, the individuals (Level 1 units) can be grouped into two categories. What we are wondering is whether and how measurement invariance across the two categories can be tested in a multilevel LPA/LCA. Would we test this in the same manner that we would for standard "single-level" LPA? If not, can you point us to a relevant approach that could be applied to results generated by Mplus?
It sounds like you have an observed grouping variable on level 2. This can be handled by defining a between-level latent class variable that is exactly the same as the grouping variable. See UG ex 7.24 for how this is done in the single-level case. The between-level latent class variable has to be declared on the Between = list as in UG chapter 10. Then you can specify and test various degrees of measurement invariance across these between-level classes.
Mike Todd posted on Monday, January 21, 2013 - 9:11 am
I am running a latent class analysis with complex survey data on 9 items aimed at political alienation and willingness to participate in the democratic process.
In my early steps I ran the analysis without type = complex mixture (only type = mixture), and it turned out that the 3- and 4-class solutions seemed the most reasonable (log-likelihood based fit indices were all pretty sobering, but the interpretation was consistent with the substantive theory of the construct).
Later I realised I should be using type = complex mixture, since the data are clustered with n(cluster) = 27 and differing cluster sizes. Hence, I re-ran the analysis for 3, 4 and 5 latent classes.
What struck me was that the estimates did not change at all for the 3- and 5-class solutions, but changed considerably for the 4-class solution, which was the best model interpretation-wise when using only type = mixture and now unfortunately makes far less sense.
How is it possible that the 3- and 5-class solutions did not change, but the 4-class solution did? The estimator is MLR, which I assume somehow weights by cluster size? Could that be "playing against" my beloved 4-class solution, because one of the classes is quite small in comparison to the other 3? Maybe if people in this class come from cluster units with a small weight (due to small cluster unit size), this class cannot be detected well enough?
As you can tell, I only have a very vague understanding of how the estimators work. I apologize for any painful stupidity in the thoughts expressed above.
Do you have any recommendations for choosing an estimator when using type = complex mixture? I've heard there are different options: MLR, ULS ... Are there any papers where I could look things up?
Unless you have weights, your classes should not change when adding COMPLEX. Perhaps you are not replicating the best loglikelihood in all analyses. Or perhaps the order of the classes changed. If you can't see the problem, send the relevant outputs and your license number to email@example.com.
The only estimator available for TYPE=COMPLEX is MLR.
Please limit future posts to one window. If they are longer than that, they are not appropriate for Mplus Discussion.
Thanks for the answer. I want to run the BLRT but I am using the type = complex mixture option. If I compare the 5-class solution using type = complex mixture to the 5-class solution using only type = mixture, the estimates don't change and the p-values do, but only slightly. Would it be acceptable to run the BLRT using type = mixture even though the data are in fact clustered?
Hi, I'm running this multilevel latent class model:
Variable:
Names are indir INDIR1 INDIR2 id GENERE REGOLARE CITT CITT2 NUCLEO1 NUCLEO2 NUCLEO3 LIBRI SP_DOM PC F_ISCED M_ISCED PARED PROF_P PROF_M BFMJ BMMJ HISEI SP_SCOL sp_sc_d cod_scu;
Missing are all (-9999);
auxiliary are id;
usevariables are GENERE REGOLARE CITT2 NUCLEO3 LIBRI SP_DOM PC PARED HISEI;
categorical are GENERE REGOLARE CITT2 NUCLEO3 LIBRI SP_DOM PC PARED HISEI;
Classes = CB(2) CW(3);
within = GENERE REGOLARE CITT2 NUCLEO3 LIBRI SP_DOM PC PARED HISEI;
between = CB;
cluster = cod_scu;
Analysis:
Type = Mixture Twolevel;
Model:
%within%
%overall%
%between%
%overall%
CW on CB;
And I have two questions:
1) I get different threshold estimates within the same within-level class, i.e., the threshold estimates of latent class 1 1 are different from those of latent class 2 1. Is this correct?
2) How can I express the threshold estimates on the probability scale?
1) The thresholds vary across the between-level CB classes as the default. You should think of thresholds as a between-level quantity in line with having means appear on between in regular multilevel modeling.
2) This is tricky because the probabilities involve the random effects and therefore require numerical integration.
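For intuition, here is a hedged Python sketch of the two cases. For a binary item with a logit link and no random effect, a threshold tau translates to P(u = 1) = 1 / (1 + exp(tau)). When a normally distributed random effect shifts the logit, the probability has to be averaged over that distribution; the sketch does this with a simple weighted grid. How the random effect enters (added to the logit, mean zero, standard deviation sd) is an assumption made for illustration, not a description of the computation inside Mplus.

```python
import math


def prob_from_threshold(tau):
    """P(u = 1) for a binary item with logit threshold tau, no random effect."""
    return 1.0 / (1.0 + math.exp(tau))


def marginal_prob(tau, sd, n_points=2001):
    """Average P(u = 1) over a random effect u ~ N(0, sd^2) added to the logit.

    Uses a plain weighted grid over +/- 8 standard deviations as a stand-in
    for numerical integration.
    """
    if sd == 0:
        return prob_from_threshold(tau)
    lo, hi = -8.0 * sd, 8.0 * sd
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    weight_sum = 0.0
    for i in range(n_points):
        u = lo + i * step
        w = math.exp(-0.5 * (u / sd) ** 2)  # unnormalized normal density
        total += w * prob_from_threshold(tau - u)
        weight_sum += w
    return total / weight_sum


print(prob_from_threshold(0.0))  # 0.5
print(marginal_prob(2.0, 1.5))
```

With a positive threshold, the averaged probability sits closer to 0.5 than the value at a random effect of zero, which is one reason the plain logistic conversion can mislead when random effects are present.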
Dear Bengt, thank you for your kind reply. I would have two minor remarks just to be sure to have properly understood your comments.
1) The following is part of my output. According to your examples, the first coefficients in each group of thresholds (e.g., the pairs of GENERE$1) should be equal. As you can see, mine are not. Is there a mistake, or is there a reason underlying these results that I cannot see?
2) OK, I understand, but it's very difficult to interpret the characteristics of the classes by looking at threshold estimates. Would it be a good idea to save the class probabilities and then analyze the classes with descriptive statistics?
I don't think I'm finding a model similar to what I want to explore in Mplus in the user's guide. I have 2 specific questions.
-Question 1: I'm creating a multilevel latent profile model for students in classrooms. I want to see if the random effects (intercepts) for the profiles' latent means, constituted by a battery of student indicators, are predicted by a level 2 latent factor for classroom environment.
Would I specify that latent factor for classroom environment in this part of the model statement?
%OVERALL%
%BETWEEN%
schoolfactor by indicators
c#1 on schoolfactor
c#2 on schoolfactor
- Question 2: Finally, how would I specify a multilevel mixture model where the profile is constituted by child (level1) and classroom (level2) indicators? The profile might have measures of cognitive and social outcomes for children as well as measures of classroom environment.
Q1. You would follow the ideas in the Henry-Muthen multilevel LCA paper on our website. Declare a between-level latent class variable. On between you simply say
schoolfac BY ....;
which will give you the schoolfactor means in the different between classes.
Q2. Just declare the classroom environment variables as Between variables and include them in the BY statement.
CMP posted on Thursday, February 26, 2015 - 3:03 am
Hi, I am running a multilevel mixture analysis with random effects. My variables are y x1 x2 x3 x4 x5. On level 1: y on x1 x2 x3 x4. On level 2, I would like to identify latent classes (cb) using only the random slopes from level 1 (s1 s2 s3 s4). I do not want to include the random intercept (y), as it does not make substantive sense in my case. Following the example in the User’s Guide (example 10.2), I notice the random intercept is included as an indicator of cb. How can I specify that only the random slopes be used?
Hello, I posted earlier about the multilevel analyses I am running. My further questions are:
1) Repeated measures nested in individuals is 2-level by other multilevel standards but considered a 1-level model by Mplus, as I understand from the UG. In my input I specified the model as TYPE = TWOLEVEL MIXTURE RANDOM; is this correct?
2) The ICs (BIC, aBIC) and entropy favour a 3-class solution but the LRT does not. So I am trying to use TECH14 to test the different class solutions by bootstrap, but the model keeps running and never stops (2 days!). What could the problem be? ANALYSIS: lrtstarts= 50 20 50 20;
3) I had done the multilevel analyses without modelling latent classes in HLM. When I run this same model in Mplus, some random slope variances which were significant in HLM become nonsignificant in Mplus. Why is this so?
4) Could this be due to the nonnormal multivariate nature of the data? I wanted to change the estimator to MLM but got an error message. Is this impossible to do?
Your help with these questions will be much appreciated.
1) In Mplus you can do growth as 2-level in long format or as 1-level in wide format. We recommend the latter whenever possible. So if you take the latter approach, Type=Twolevel would refer to some other clustering like students in classrooms.
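The difference between the two layouts is purely a matter of data arrangement. The Python sketch below reshapes long-format records (one row per person per occasion) into wide format (one row per person, one column per occasion). The key names id, time, and y are hypothetical, and the helper is an illustration rather than anything Mplus provides.

```python
def long_to_wide(rows, id_key="id", time_key="time", value_key="y"):
    """Reshape long-format repeated measures into one record per person.

    rows: list of dicts, one per person-occasion.
    Returns a dict mapping person id to {"y1": ..., "y2": ..., ...}.
    """
    wide = {}
    for row in rows:
        person = wide.setdefault(row[id_key], {})
        person[f"{value_key}{row[time_key]}"] = row[value_key]
    return wide


long_rows = [
    {"id": 1, "time": 1, "y": 2.0},
    {"id": 1, "time": 2, "y": 2.5},
    {"id": 2, "time": 1, "y": 1.0},
    {"id": 2, "time": 2, "y": 1.8},
]
print(long_to_wide(long_rows))
# {1: {'y1': 2.0, 'y2': 2.5}, 2: {'y1': 1.0, 'y2': 1.8}}
```

In the wide layout the repeated measures are separate variables on a single level, which leaves the cluster level free for something like classrooms.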
Morgan, G. B. (2014). Mixed mode latent class analysis: An examination of fit index performance for classification. Structural Equation Modeling: A Multidisciplinary Journal, DOI: 10.1080/10705511.2014.935751
E. Cohen posted on Monday, August 24, 2015 - 6:57 am
Dear Drs. Muthén, I have the following multilevel LCA problem: I want to identify subtypes of related cases in a multilevel sampling context (three levels), using a set of seven categorical variables measured on levels 1 and 2 (4 L1 variables, 3 L2 variables).
Two questions: 1. Is it possible (yet) to estimate a MLCA model with latent classes based on indicators/variables measured on different levels? (Henry & Muthén (2010) as well as other articles on MLCA look at models comprising L1 latent classes only.)
2. Is it possible (yet) in Mplus to estimate a three-level LCA (with covariates on two levels predicting class membership)? If not, is there a general estimation/data preparation strategy you would suggest (e.g., ignoring, while acknowledging, the L3 clustering, or aggregating scores on the L1 variables and treating them as L2 variables, then omitting L1 and shifting the level of analysis upwards)?
1. Mplus handles latent class variables on both level 1 and level 2. See, for example, the papers on our website:
Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects. In Fitzmaurice, G., Davidian, M., Verbeke, G. & Molenberghs, G. (eds.), Longitudinal Data Analysis, pp. 143-165. Boca Raton: Chapman & Hall/CRC Press.
Muthén, B. & Asparouhov, T. (2009). Multilevel regression mixture analysis. Journal of the Royal Statistical Society, Series A, 172, 639-657.
2. I think level 1 and level 2 latent class variables can have their random intercepts be predicted by level 2 and level 3 covariates.
Hello, I'm trying like mad to change the reference category of the between-level latent class in a non-parametric two-level LCA as described in Henry & Muthén's paper.
My syntax for the VARIABLE and MODEL sections is:
usevariables = TOLER PROT;
cluster = cntry;
within = TOLER PROT;
classes = cb (5) cw (3);
between = cb;
MODEL:
%within%
%OVERALL%
%between%
%OVERALL%
CW on CB;
Dear Dr. Muthen, I am running a multilevel LPA model, akin to the model developed by Henry & Muthen (2010). With your kind permission, I want to pose three questions about the interpretation of my output with reference to the main article.
1- The probability estimates of level-2 predictors over the level-1 latent class solutions (i.e. Table 3, Henry & Muthen, 2010) refer to the "C#2 on [Level-2 predictor name]" statements in the output, right?
2- The effects of level-2 predictors on the level-1 latent class indicator intercepts (i.e. Table 4, Henry & Muthen, 2010) refer to estimates from "model constraints" such as "POP30, GRW30"? Then what is the function of the total effect variables (i.e. C2POPLOG, C2TOBGRW and C2POVLEV) in the output, given that their significance and effect sizes are not reported in the article? Or should we first check the significance of the total effect (e.g. C2POP*LOG) and direct effect estimates (e.g. C2_POP) on the latent class factors, and then report the effect size of individual cross-level variables (e.g. POP30)?
3- I use LPA with continuous indicators, so I interpret the linear regression results. Should I pay extra attention to any estimate while interpreting the output?
Thank you for your patience to read my message in advance. Now I cross my fingers for your answer.
and I would need to change the reference class from CB#6 to CB#1. I know how to do it with the within-level classes (by fixing the means as in the example above), but I can't find what I should do for the between-level one.
Use SVALUES in your original run to save the estimates. Then use them in a Starts=0 run where you switch the regression coefficients for CW on CB so that CB#1 values are given for the last CB class. You may also have to re-calculate the logit for CB called [cb#1] etc. You do that by first computing all the CB class probabilities from the logits as described in Chapter 14 and then switch the probabilities and then compute the new logits.
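The logit-to-probability step described above can be sketched outside Mplus. This is a minimal Python illustration, assuming the usual multinomial parameterization in which the last class is the reference with a logit of zero. The starting logits are hypothetical values, not estimates from any run in this thread, and the sketch covers only the [cb#k] logits, not the CW on CB coefficients.

```python
import math

def logits_to_probs(logits):
    """Convert K-1 class logits (last class is the reference, logit 0) to K probabilities."""
    full = list(logits) + [0.0]
    m = max(full)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in full]
    total = sum(exps)
    return [e / total for e in exps]

def probs_to_logits(probs):
    """Convert K class probabilities back to K-1 logits relative to the last class."""
    ref = probs[-1]
    return [math.log(p / ref) for p in probs[:-1]]

def swap_first_and_last(logits):
    """Relabel so that the old first class becomes the last (reference) class."""
    probs = logits_to_probs(logits)
    probs[0], probs[-1] = probs[-1], probs[0]
    return probs_to_logits(probs)

# Hypothetical [cb#1]..[cb#4] logits for a 5-class between-level variable.
old_logits = [0.5, -0.2, 1.1, 0.3]
new_logits = swap_first_and_last(old_logits)
print([round(v, 3) for v in new_logits])
```

The new logits would then be entered as starting values in the STARTS = 0 run, together with the correspondingly switched regression coefficients.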
With model 2 in table 1 in Henry and Muthen (2010), the parametric approach to a multilevel 3 class model, do I need the following in the syntax for a 2 class model? Would I just limit that to only "C#1"?
Hi Bengt, sorry for bothering again, but it's still not clear how I can switch the regression coefficients for CW on CB so that CB#1 values are given for CB#5 (the last class) as I cannot mention the last class directly.
Send your run with the original solution and your best attempt at switching the classes to Support along with your license number. Briefly describe where you get stuck.
Brian Knop posted on Monday, March 07, 2016 - 2:26 pm
I have looked through the user guide and Henry and Muthen (2010), but can't figure out how to get odds ratios for level 2 variable effects on latent classes (such as effect of tobacco growing state on smoking type). I can get coefficients (for level 2 variables), but not odds ratios. Thank you.
The categorical variables are random intercepts on level 2, so they are continuous.
Brian Knop posted on Tuesday, March 08, 2016 - 8:23 am
Thank you for your quick response. That explains why the output gives odds ratios for within-level effects, but not between-level effects on the latent classes. But Henry and Muthen (2010) present between-level odds ratios, so it is possible, yes?
The variable would need to be on the BETWEEN list and the CATEGORICAL list to get an odds ratio on between.
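When only a logit-scale coefficient and its standard error are printed, an odds ratio can also be computed by hand as exp(b). The Python sketch below shows that arithmetic with a Wald-type 95% interval. The coefficient and standard error are hypothetical, and whether exponentiating a particular between-level coefficient gives the odds ratio you want to report is an assumption to check against the model specification.

```python
import math

def odds_ratio(coef, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a logit-scale coefficient."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

# Hypothetical estimate and standard error, for illustration only.
or_est, or_low, or_high = odds_ratio(0.693, 0.20)
print(round(or_est, 3), round(or_low, 3), round(or_high, 3))
```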
Brian Knop posted on Wednesday, March 09, 2016 - 9:17 am
Ok, for some reason I still can't get odds ratios when I do that. If I were hypothetically looking at school-level effects on latent classes of test scores and all variables used are categorical, my code looks something like this:
Hello, We are trying to run a two-level LCA. The syntax we are using is
VARIABLE:
NAMES ARE ID agency directservices ServYouth TimeWork gender age RaceBWO education fieldstudy sector RacePriv InstDisc BlatantRace;
MISSING are all (999);
USEVARIABLES = RacePriv InstDisc BlatantRace;
CLASSES = C(4);
CLUSTER = agency;
ANALYSIS:
TYPE = TWOLEVEL MIXTURE;
STARTS = 60 30;
PROCESS = 8(STARTS);
MODEL:
%between%
%overall%
output: tech11 tech14;
Our output AIC and BIC are nearly identical to the AIC and BIC we received when not running it with the second level. Additionally, the output gives us within-level variances but not between-level variances. Is this syntax correct for running a multilevel LCA? Is it actually running it with two levels or just reproducing the single level output? Thank you very much for your help!
You have to mention the random intercepts for the latent class variables on Between. For an example, see UG ex 10.6.
Allison posted on Monday, August 15, 2016 - 9:48 am
Hello Bengt and Linda:
I have a multilevel LPA, with events (Level-1) nested within people (Level-2). Currently, I have a model with 8 within-person profiles, and nothing being specified at the between-person level of analysis.
In traditional multilevel research with a similar structure (events nested within people), it is common to group-mean center the L1 constructs prior to analyses. Would you recommend group-mean centering the L1 variables being specified as profile indicators? Would other centering decisions, such as grand-mean centering or even standardizing the profile indicators within-person, be appropriate?
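For readers unfamiliar with the centering options named in this question, here is a minimal Python sketch of group-mean (within-person) centering versus grand-mean centering. The person IDs and scores are made-up illustration data, and the sketch takes no position on which option suits profile indicators.

```python
from collections import defaultdict

def group_mean_center(values, groups):
    """Subtract each group's own mean from its members' values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    return [v - means[g] for v, g in zip(values, groups)]

def grand_mean_center(values):
    """Subtract the overall mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Made-up data: five events nested in two people.
person = [1, 1, 1, 2, 2]
score = [2.0, 4.0, 6.0, 10.0, 12.0]

within = group_mean_center(score, person)
grand = grand_mean_center(score)
print(within)
```

Group-mean centering removes all between-person differences in level from the indicators, whereas grand-mean centering only shifts the scale.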
Dear Drs. Bengt and Linda Muthen, I want to run a two-level bi-factor IRT model with random effects. 1. Is this the right syntax? 2. How do I calculate the effect of the clustering factor, for example gender? That is, is the gender effect significantly different from zero?
VARIABLE:
NAMES ARE gender lang i3-i148;
USEVARIABLES ARE i3-i27;
CATEGORICAL ARE i3-i27;
MISSING are all (9);
cluster = gender;
ANALYSIS:
TYPE = twolevel random;
estimator = ML;
ALGORITHM = INTEGRATION;
MODEL:
%within%
G by i3-i27;
F1 BY i3@ i4-i15 (4-15);
f2 by i16@1 i17-i27 (17-27);
G with f1-f2@0;
F1 with F2@0;
%between%
G by i3-i27;
F1 BY i3@ i4-i15 (4-15);
f2 by i16@1 i17-i27 (17-27);
G@1; F1@1; F2@1;
G with f1-f2@0;
F1 with F2@0;
OUTPUT: TECH1 TECH8;
I am trying to run a 1-2-1 multi-level mediation model and have variables that are both between level and within level, thus I did not specify them in the BETWEEN ARE or WITHIN ARE statements.
When I run my syntax I get this error:
*** ERROR in MODEL command
Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with: QOLITOT
*** ERROR in MODEL command
Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with: LFQTOT
When I added those variables to the BETWEEN ARE statement, I then got this error:
*** ERROR in MODEL command
Observed variable on the right-hand side of a within-level ON statement must be a WITHIN variable. Problem with: QOLITOT
*** ERROR in MODEL command
Observed variable on the right-hand side of a within-level ON statement must be a WITHIN variable. Problem with: LFQTOT
I am doing a multilevel latent profile analysis (students within classes within schools) but I only want to identify the profiles at the individual level (while accounting for the sampling design).
Is this the correct code?
idvariable is studID; cluster=idschl; ... type=mixture complex;
! no within or between statements
The output for this code provides summary data identifying the correct number of schools but the model output is modeling solely at a single level. In this case, am I getting a model that is adjusted for school? Or am I incorrectly coding for a 2 level (student within school) model?
When I change it to: cluster=ischl studID; within=v1 v2 v3 v4 v5 v6 v7;
I get almost identical loglikelihoods (-143857.684 one cluster and -143848.173 two cluster), similar AIC/BIC/LRT, and identical class counts and proportions of posterior probabilities and entropy. Since I only specified indicator variables at the individual level, is this model and the former similarly adjusting for schools?
If I want to also include class, it states a threelevel model is not possible for LPA. Is there another way to also account for a third level?
I am conducting MLCA. What is the procedure for assigning Level 2 units to their most likely Level 2 class based on the latent class posterior distribution? Could you please provide an example of the Mplus code?
You can use User's guide example 10.5 to see how this works. Add the following command: savedata: file=1.dat; save=cprob;
Take a look at the SAVEDATA INFORMATION section in the output file. The classification variable for the cluster would be CB and you can find those values in the saved data file 1.dat.
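The assignment rule behind a most-likely-class variable is modal assignment: each unit goes to the class with the highest posterior probability. The Python sketch below illustrates that rule only. The cluster IDs and probabilities are invented, and it does not parse 1.dat, whose actual column layout is listed in the SAVEDATA INFORMATION section.

```python
def most_likely_class(posteriors):
    """Return the 1-based index of the class with the largest posterior probability."""
    best = max(range(len(posteriors)), key=lambda k: posteriors[k])
    return best + 1

# Invented posterior probabilities for three level-2 units (clusters).
cluster_posteriors = {
    101: [0.10, 0.85, 0.05],
    102: [0.70, 0.20, 0.10],
    103: [0.05, 0.15, 0.80],
}

assignments = {cid: most_likely_class(p) for cid, p in cluster_posteriors.items()}
print(assignments)
```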
shonnslc posted on Tuesday, August 13, 2019 - 12:36 pm
I am conducting LPA, and the clustering variables come from different classrooms (nested data). However, the number of clusters is only 6, so I don't think MLPA is appropriate. In this case, what can I do to address the nested data structure? Thanks.
shonnslc posted on Wednesday, August 14, 2019 - 11:04 am
Thank you, Muthen!
One follow-up question: I found that the LPA results are identical for LPA without dummies (without controlling for clustering effects) and LPA with dummies using R3STEP. Is this normal? I thought bringing in dummies as covariates should somehow change the LPA solution to some extent.
Yes, it is the point of R3STEP that they are the same - read web note 15.
Covariates do change the LPA solution when you don't do 3-step like R3STEP.
shonnslc posted on Thursday, August 15, 2019 - 1:33 pm
Thank you, Muthen!
1. So, I guess if my goal is to use covariates (i.e., dummies) to control for the nested data structure, the 1-step approach is preferred, since I want the LPA solution to account for the clustering effect. Am I right?
2. I am wondering why we specify the model by adding paths from the dummy variables to the latent class variable (i.e., c on dum1-dum5) but not to the class indicators (u1-u4 on dum1-dum5). Aren't the indicators the dependent variables, and in fixed effects models, aren't dummies added as predictors of the dependent variables?