bmuthen posted on Tuesday, March 26, 2002 - 10:39 am
A frequent question is how 3-level modeling of growth of students within schools compares to HLM-type analysis. Here are some answers.
Actually, these analyses are one and the same. The 3-level HLM formulation describes across-time variation at level 1 involving random intercept and slope coefficients, level 2 studies the across-student variation of those coefficients, while level 3 studies their across-school variation. In the latent variable context, level 1 and level 2 are combined into the "Within" part of the model, i.e. the part describing variation across students. The "Between" part describes across-school variation and corresponds to level 3 of HLM. In this way, the 3-level HLM model is turned into a 2-level latent variable model. The fact that level 1 and level 2 are both considered in the Within part of the latent variable model is due to viewing the level 1 across-time variation as a multivariate observation vector rather than as a univariate repeated observation. Level 1 of the Within part is the latent variable measurement model part and level 2 of the Within part is the structural part.
So in summary, Mplus estimates a random coefficient model, and the Between-level intercept and growth coefficients represent variation across clusters in the coefficients defined for Within (students), just as in HLM analysis. If you use the ML estimator, you get the same estimates.
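A minimal sketch of what treating the across-time variation as a multivariate observation vector can look like in data terms, written in Python for illustration only. The record layout (student, school, time, score) and the function name `long_to_wide` are assumptions, not something taken from the post. Long-format rows (one per student per occasion) are collapsed into one row per student holding a vector of scores, with school kept as the cluster identifier.

```python
def long_to_wide(rows, n_times):
    """Collapse long-format rows (student, school, time, score) into one
    row per student: (student, school, [score_t1, ..., score_tT]).
    Occasions without an observation stay None (missing)."""
    wide = {}
    for student, school, time, score in rows:
        if student not in wide:
            wide[student] = (school, [None] * n_times)
        # time is assumed to be coded 1..n_times
        wide[student][1][time - 1] = score
    return [(student, school, scores)
            for student, (school, scores) in sorted(wide.items())]


# Hypothetical data: three students in two schools, three occasions.
long_rows = [
    (1, "A", 1, 10.0), (1, "A", 2, 12.0), (1, "A", 3, 15.0),
    (2, "A", 1, 9.0), (2, "A", 3, 11.0),   # student 2 missed occasion 2
    (3, "B", 1, 14.0), (3, "B", 2, 14.5), (3, "B", 3, 16.0),
]
wide_rows = long_to_wide(long_rows, n_times=3)
for row in wide_rows:
    print(row)
```

In the wide layout each student contributes a single row, so the repeated measures are no longer a separate level in the data file.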
Hi, I am examining the mediational effect of COLL on the relationship between CL and TS. I have the predictor CL at the school level, the mediator COLL at the school level, and the outcome variable TS at the individual level. Thus, it is a 2 > 2 > 1 design (i.e., CL > COLL > TS, with a direct path from the predictor to the outcome as well).
I am only familiar with the HLM software, through my own reading. I wanted to examine the mediational effect, but I was not able to regress the mediator on the predictor. Regressing the mediator on the predictor is one of the essential equations in a mediational model. Can I estimate this equation using OLS regression in SPSS? The rationale is that the mediator and the predictor are both at the school level. For the other equations, I am using HLM to regress the outcome on the predictor and the outcome on the mediator. If I use SPSS regression to regress the mediator on the predictor, how comparable are the coefficients produced by SPSS and HLM? Is any special implementation required when running the regression? Or is there a function within the HLM software for mediational modeling? I know that Mplus can handle mediational models, but my limited time and limited statistics background will not allow it. The best I can use is a simple 2-level HLM. I appreciate your help.
Another question concerns checking assumptions before using HLM. I checked my data for outliers, normality, and linearity in SPSS, following Tabachnick and Fidell's (2001) discussion of regression assumptions. My question, then, is whether there are any checks specific to HLM and how to run them. Can you refer me to any published work on this? Thank you.
bmuthen posted on Friday, September 30, 2005 - 9:15 am
I think the easiest way to get the correct estimates and particularly the correct standard errors is to use Mplus for this modeling. That would make the analysis very straightforward.
Marc Reis posted on Monday, November 21, 2005 - 12:36 pm
I would like to use Mplus to estimate the effect of the group-level variable organizational climate on an individual-level outcome. Organizational climate is measured with multiple individual-level (continuous) indicators, so there are two sources of error variance for the aggregated organizational climate score for each organizational unit: the variation among the items (due to the fact that they are not perfectly reliable) and the variation among the members of each organizational unit (assuming that there is a "true score" for each group). Since three-level modeling is not yet available, are there any other ways to estimate the model?
Multiple indicators do not count as a level in Mplus because it takes a multivariate approach to multilevel modeling. In Mplus this would be a two-level model if I am understanding you correctly.
Marc Reis posted on Tuesday, November 22, 2005 - 2:46 am
My problem is that organizational climate is defined as a latent group-level-variable that influences a latent individual-level variable. The idea was to specify the following model:
%within%
DV_within BY i1-i6;
%between%
ORG_CLIMATE BY i7-i12;
DV_between BY i1-i6;
DV_between ON ORG_CLIMATE;
I am unsure how Mplus treats i7-i12 in this case. Would ORG_CLIMATE be a "correct measure" of the latent group-level variable? (E.g., would it be reasonable to save a factor score for each group?)
You would specify the model as you have done above. You would also have BETWEEN = i7-i12; in the VARIABLE command.
Marc Reis posted on Tuesday, November 22, 2005 - 1:15 pm
I tried to specify BETWEEN = i7-i12; in the VARIABLE command, but Mplus doesn't allow group-level variables to have within-group variation. Note that i7-i12 are individual-level indicators that I would like to aggregate to the group level. So maybe there's another way, or I made a mistake.
bmuthen posted on Tuesday, November 22, 2005 - 4:36 pm
If i7-i12 are scores on individuals, then you don't put those variables on the Between = list. And, you might want to use a within-level factor, say
wORG by i7-i12;
bORG by i7-i12;
This assumes that you are interested in the within structure of these variables as well as the between structure (and it may not be the same).
Another alternative is to simply aggregate each variable to the between level, i.e. creating cluster means, and then treat these as Between = variables with only between-level variation - and then specify the between-level factor model you have.
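The aggregation alternative can be sketched as a data-preparation step outside Mplus. This is a minimal Python illustration under stated assumptions: the row layout, the `cluster` key, and the function name `cluster_means` are hypothetical, and only two of the indicators (i7, i8) are shown.

```python
from collections import defaultdict
from statistics import mean


def cluster_means(rows, indicators):
    """rows: list of dicts with a 'cluster' key plus one key per indicator.
    Returns {cluster: {indicator: mean over that cluster's members}}."""
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row["cluster"]].append(row)
    return {
        cluster: {name: mean(r[name] for r in members) for name in indicators}
        for cluster, members in by_cluster.items()
    }


# Hypothetical individual-level scores on two indicators.
individuals = [
    {"cluster": 1, "i7": 2.0, "i8": 3.0},
    {"cluster": 1, "i7": 4.0, "i8": 5.0},
    {"cluster": 2, "i7": 1.0, "i8": 4.0},
    {"cluster": 2, "i7": 2.0, "i8": 4.0},
    {"cluster": 2, "i7": 3.0, "i8": 4.0},
]
means = cluster_means(individuals, ["i7", "i8"])
print(means)
```

Each cluster mean is constant within its cluster, which is what makes the aggregated variables eligible to be treated as having only between-level variation.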
Anonymous posted on Wednesday, November 23, 2005 - 1:39 am
I have an idea about the above discussion; maybe this helps. In general there might be two reasons to specify a model with latent variables: the indicators are not perfect measures (--> error variance) and they usually do not measure the factor to the same extent (--> different factor loadings). So requesting factor scores actually means weighting the indicators, doesn't it?
The question is whether it is necessary to weight the members of a group to obtain a better estimate of the group-level variable. Assuming that there is no special sampling procedure, I don't see a rationale for weighting one group member more than others. So from this point of view, it would be reasonable to simply compute the cluster mean as Bengt suggested, maybe based on the individual-level factor scores. But I am unsure whether to compute the factor scores based on the original covariance matrix or the disaggregated within covariance matrix.
Maybe someone would like to comment...
bmuthen posted on Wednesday, November 23, 2005 - 6:37 pm
Comments are invited regarding this. Concerning whether to compute factor scores based on the original cov matrix or within cov matrix, I would say that if factor scores are needed for a multilevel setting, you are better off getting factor scores from a multilevel factor analysis model.
chantanee posted on Thursday, August 10, 2006 - 2:19 am
The purpose of my study was to examine the relationships among multilevel variables (student, classroom, and school variables) affecting the science learning achievement of Thai upper secondary school students. The study had 3 sub-objectives: (1) to identify student variables that directly affect science learning achievement; (2) to identify the direct influences of classroom variables, and the cross-level interactions between classroom variables and student variables, on student science learning achievement; and (3) to identify the direct influences of school variables, and the cross-level interactions between school variables and classroom variables or between school variables and student variables, on student science learning achievement. The study used multi-stage random sampling. The sample consisted of 3 groups: (1) 132 school administrators (principals, assistant principals, and heads of science departments) from 44 public upper secondary schools in Thailand; (2) 132 science teachers who taught Chemistry, Biology, and Physics, from 88 sampled classrooms (2 classrooms per school); and (3) 2,488 Grade 11 science students. If possible, could you kindly give me some advice on these questions? 1. Would the results be sound enough to report? 2. Are two classrooms per school enough to give adequate power for HLM?
This sounds like multilevel modeling would work well - you have enough schools and classrooms. 2 classrooms per school is a bare minimum which does not allow you to study many classroom variables. Mplus does not currently handle this 3-level model.
We have a dataset that includes three waves of data collected on a number of family context constructs. LGM has been our preferred method of analysis, but we're now planning to collect 2 days of cortisol samples from our participants with multiple cortisol samples each day. Most prior studies that use cortisol data tend to model the daily hormone patterns using HLM. I'm confused how we can use our LGMs of the family context constructs to predict the HLM based hormone patterns. Could the cortisol data be modeled in LGM and then used as a dual process predicted by family context?
The SEM and HLM growth models differ in two basic ways. One is the treatment of time scores. In SEM, they are treated as parameters in the model. In HLM, they are treated as data. The second is the treatment of time-varying covariates. The regression coefficients are fixed in SEM and random in HLM. Mplus can have time scores as parameters or data and can have fixed or random coefficients for time-varying covariates. So I think you should be okay.
Frank Gallo posted on Sunday, August 09, 2009 - 8:02 pm
Dear Dr. Muthen
I am a beginner with Mplus. I am using Mplus Version 5.21. I have stratified data: police arrests (n = 3,300) within police departments (n = 16) that serve community population levels (n = 4). The DV, police force, is continuous. I have a mixture (nominal, ordinal, ratio) of 21 covariates at level 1 and none at levels 2 and 3. Community levels are fixed effects. Would the multilevel modeling features of Mplus handle these data? Thank you.
It sounds like you have a two-level cross-sectional model which can be estimated in Mplus. The problem I see is that you have only 16 police departments. It is usually recommended to have a minimum of 30 clusters.
Utkun Ozdil posted on Thursday, December 16, 2010 - 11:28 am
I collected my data from three faculties of a university (Faculty of Education, Faculty of Engineering, and Faculty of Arts and Sciences). Second-, third-, and fourth-grade undergraduates from each of these faculties were involved. So, I have students nested within grade levels within faculties. This led me to analyze a three-level model. Does Mplus handle such an analysis, or is the HLM program more appropriate?
Mplus does not currently have three-level cross-sectional models. HLM does. Your data, however, are not suitable for multilevel modeling given that faculty and grade cannot be considered random modes.
Jing Zhang posted on Wednesday, August 24, 2011 - 3:35 pm
Dear Dr. Muthen, in your post dated March 26, 2002, you talked about how to deal with 3-level modeling of growth of students within schools. You said that level 1 and level 2 are combined into the "Within" part of the model by viewing the level 1 across-time variation as a multivariate observation vector rather than as a univariate repeated observation. My question is: does this mean that data in long format will not work, and that the data have to be converted to wide format? Thanks, Jing
I am working on a three-level CFA model (unbalanced data). The method I used was ESM, to examine the variability of a continuous variable. Because the method is so intense, I used three items to capture the construct. Each person gave ratings on these items three times a day, for five days. Therefore, moments were nested within days, and days within people. These are the 3 levels.
I gave a unique ID to every person, and this unique ID appears in the data 15 times per person.
Example below for one person and a bit of a second person. M stands for moment.
When I run my analysis, Mplus gives me a warning message:
*** WARNING Clusters for DAY with the same IDs have been found in different clusters for RESP_NR. These clusters are assumed to be different because clusters for DAY are not allowed to appear in more than one cluster for RESP_NR.
What does this warning message mean? Can I trust my output despite it, or is it affecting my results? I would appreciate your help!
Luo Wenshu posted on Tuesday, March 24, 2015 - 7:42 pm
Hi Dr. Muthen,
I am using Mplus to run a 2-level HLM model and have the following questions. 1) Is there a default setting for centering of predictors, grand mean or group mean? 2) For random slopes, if we find they are not statistically significant, does this mean we do not need to build a level 2 model with predictors for these random slopes and can just switch to fixed slopes? 3) Do we need to allow random intercepts and slopes to be correlated at Level 2?
2) You may still find significant influence of level-2 predictors on the random slopes.
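The two centering choices named in question 1) differ only in which mean is subtracted. A minimal Python sketch of the arithmetic, for illustration only; it does not show an Mplus setting, and the function and variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def grand_mean_center(values):
    """Subtract the overall mean from every value."""
    overall = mean(values)
    return [v - overall for v in values]


def group_mean_center(values, groups):
    """Subtract each observation's own group (cluster) mean."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


# Hypothetical level-1 predictor for four students in two classes.
x = [1.0, 3.0, 5.0, 7.0]
class_id = ["a", "a", "b", "b"]
x_grand = grand_mean_center(x)            # grand mean is 4.0
x_group = group_mean_center(x, class_id)  # class means are 2.0 and 6.0
print(x_grand, x_group)
```

Grand-mean centering keeps the between-class differences in the predictor, while group-mean centering removes them, so the centered variable varies only within classes.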
Luo Wenshu posted on Thursday, March 26, 2015 - 1:25 am
Thank you very much, Dr. Muthen. For the correlations among random intercepts and slopes at Level 2, what is the default setting in Mplus? It seems that the correlations are fixed at zero by default.
It depends on the analysis setting. You see in the output what is done in each case. If the covariance is not there, add it.
Melody Kung posted on Thursday, August 06, 2015 - 12:28 pm
Hi Drs. Muthen,
I am running a 2-level model with 2 independent, latent, between-level variables.
Using example 9.12 in the manual as a guide, I specified "within" and "between" variable names in the VARIABLE section and then defined the latent variables and their indicators in the MODEL section, followed by the "%WITHIN%" and "%BETWEEN%" statements. The example does not include latent variables, whereas my model does.
Can I include latent variables in the %BETWEEN% statements? When I try to do so, the error message that shows up states that the two latent variables in the BETWEEN option are unknown, even though I specified the latent variables in the MODEL section.
I hope my question makes sense. Thanks for any help you can provide.
I have a longitudinal study where teens completed daily diaries for 14 days each year for 3 years. Thus, days are nested within years which are nested within individuals. I am interested in whether an individual level variable (e.g., sex) moderates certain slopes at the daily level (e.g., conflict --> distress within a day). I'm having trouble figuring out the appropriate analysis to run -- would this be a three-level model such as in example 9.20 of the version 7.6 guidebook?
Having only 3 years on Level-2 is too few for a 3-level analysis. Perhaps you could do a 2-level analysis of days within subject and let year be represented by dummy covariates.
Cynthia Yuen posted on Thursday, February 11, 2016 - 7:41 am
Thanks for the quick response! Would it be better to do something like 9.12 or 9.13 instead and model growth within a two-level model? Two of our main questions are whether the daily relations between events (e.g., conflict --> distress) change as teens age, and whether some individual-level characteristics like ethnicity predict how/whether these slopes change over time. Do you have any advice on how to appropriately model this?
I don't hear that you have a growth model situation but a regression of distress on conflict - where that regression may change over year (I assume, not over the days). If so, I would do a 2-level regression where level 1 is time (the 14 days) and level 2 is subject. Year can be level-2 and can predict the DV distress and perhaps the slope by creating Year*conflict and letting that influence distress. But it is a research question which I don't have enough background in your study to really answer.
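The dummy covariates for year and the Year*conflict product term can be prepared before the analysis. A minimal Python sketch under stated assumptions: the row layout, the choice of year 1 as the reference category, and the generated column names (`year2`, `year2_x_conflict`, and so on) are hypothetical.

```python
def add_year_terms(rows, years=(1, 2, 3), reference_year=1):
    """For each diary row, add a 0/1 dummy per non-reference year and the
    product of that dummy with the day's conflict score."""
    out = []
    for row in rows:
        new_row = dict(row)
        for y in years:
            if y == reference_year:
                continue  # the reference year gets no dummy
            dummy = 1 if row["year"] == y else 0
            new_row[f"year{y}"] = dummy
            new_row[f"year{y}_x_conflict"] = dummy * row["conflict"]
        out.append(new_row)
    return out


# Hypothetical daily diary rows for one subject.
diary = [
    {"subject": 1, "year": 1, "day": 1, "conflict": 2.0, "distress": 3.0},
    {"subject": 1, "year": 2, "day": 1, "conflict": 1.5, "distress": 2.0},
    {"subject": 1, "year": 3, "day": 1, "conflict": 4.0, "distress": 5.0},
]
diary_terms = add_year_terms(diary)
for row in diary_terms:
    print(row)
```

With these columns, the coefficients on the product terms describe how the conflict-to-distress relation in years 2 and 3 differs from the reference year.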
Luo Wenshu posted on Friday, April 15, 2016 - 4:46 pm
Dear Dr. Muthen,
In 2-level HLM analysis (student and class level), I see from the User's Guide that for level 2 predictors, we may use observed (mx) or latent (x). I know the observed level 2 predictor can be obtained by aggregating level 1 scores at the class level. 1) How is the latent Level 2 predictor calculated? 2) Which one is preferred? 3) If I have a level 1 predictor as a Rasch measure (i.e., latent variable), do I still need to use a latent variable for the predictor at Level 2?
Lüdtke, O., Marsh, H.W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203-229.
3. If the Rasch indicators vary over level-2 units, you should express a latent variable model on this level also - the indicators are then the random intercepts of each observed indicator.
Luo Wenshu posted on Friday, April 15, 2016 - 6:11 pm
Thank you for the quick response, Dr. Muthen!
I used latent level 2 predictors in my 2-level random slopes analysis. Using the default estimator MLR, I got the following warning message. Does this mean the result is not trustworthy? Do I need to switch to the MLF estimator as suggested?
WARNING: THE MODEL ESTIMATION HAS REACHED A SADDLE POINT OR A POINT WHERE THE OBSERVED AND THE EXPECTED INFORMATION MATRICES DO NOT MATCH. AN ADJUSTMENT TO THE ESTIMATION OF THE INFORMATION MATRIX HAS BEEN MADE. THE CONDITION NUMBER IS -0.157D-01. THE PROBLEM MAY ALSO BE RESOLVED BY DECREASING THE VALUE OF THE MCONVERGENCE OR LOGCRITERION OPTIONS OR BY CHANGING THE STARTING VALUES OR BY USING THE MLF ESTIMATOR.
In addition, I found that the level 2 predictors had large standard errors. I suspect a multicollinearity issue at level 2, because correlations at level 2 are usually higher than correlations at level 1 between the same set of variables. How can I solve this problem?
We tried to run 3-level analyses (student, class, school) and had some difficulties. We ran an intervention study; within schools, classes were randomly assigned to either one of two intervention conditions or the control condition. Now we have outcome y (measured after the intervention) and would like to know whether the intervention had an effect on y while controlling for the pretest measure. We expect the intervention effect to differ between conditions and between classes within one condition. That is, we want to allow random slopes for the intervention effect on the outcome y in our analyses. The problem is that schools participated with anywhere from 1 to 5 classes, so the distribution of conditions within schools differed between schools. To account for this in our analyses, we would like to let the condition variables ("cond1" and "cond2") vary between schools.
Thus, the cond variable is a variable with model variance at level 3 and no variance at level 1. When we try to implement that in Mplus we get the following error message:
"*** ERROR in MODEL command Variables that have been declared as variables for the BETWEEN CLASS_ID level cannot be used on the BETWEEN SCHOOL_ID level. Variable incorrectly used: COND1".
I am following the recommendations of Nezlek (2016) to estimate the reliability of an ESM measure using three-level analyses. He provides an Mplus 7 example output with a warning that Nezlek says to ignore:
WARNING Clusters for DAYNUM with the same IDs have been found in different clusters for SUBJNUM. These clusters are assumed to be different because clusters for DAYNUM are not allowed to appear in more than one clusters for SUBJNUM.
The problem is that when I repeat it in Mplus 8, I get an error instead of a warning:
ERROR Clusters for BEEP with the same IDs have been found in different clusters for ID. These clusters must have different IDs because clusters for BEEP are not allowed to appear in more than one cluster for ID. Check that the cluster variables are specified in the right order. Alternately, create unique IDs for BEEP in DEFINE based on its original IDs and a multiple of ID.
I multiplied beep by ID and it runs, but I think it is a bit strange to do, and it gives me a between-beep variance for SELF of 0, which is odd.
Has something changed in how Mplus handles this between versions 7 and 8? Is creating unique IDs for BEEP indeed the way to go?
We go back and forth about what to do regarding that issue and indeed in V8.1 we have changed to the more restrictive setup, but in the next version we will go back to what we had in V7.
All you need to do is add this command
to create unique BEEP values for each level 3 cluster.
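The idea in the error message (new BEEP IDs built from the original IDs and a multiple of ID) can be sketched as a data-preparation step. This is a Python illustration only, not the Mplus command referred to above; the row layout, the function name `make_unique_beep_ids`, and the multiplier of 1000 are assumptions.

```python
def make_unique_beep_ids(rows, multiplier=1000):
    """rows: (person_id, beep, value) tuples, with beep numbered within person.
    Returns rows whose beep ID is person_id * multiplier + beep, which is
    unique across persons as long as 0 <= beep < multiplier."""
    beeps = [beep for _, beep, _ in rows]
    if min(beeps) < 0 or max(beeps) >= multiplier:
        raise ValueError("multiplier must exceed the largest beep ID")
    return [(pid, pid * multiplier + beep, value) for pid, beep, value in rows]


# Hypothetical ESM records: two persons, beeps numbered 1..2 within each.
records = [
    (1, 1, 3.2), (1, 2, 2.9),
    (2, 1, 4.1), (2, 2, 3.8),
]
unique_records = make_unique_beep_ids(records)
print(unique_records)
```

A plain product of beep and ID does not guarantee uniqueness (ID 2 with beep 3 and ID 3 with beep 2 both give 6), which is why the sketch adds the original beep to a multiple of ID instead.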
Jim Sloane posted on Thursday, August 30, 2018 - 2:10 pm
Hello. I am trying to fit a multilevel growth model using ANALYSIS: TYPE = THREELEVEL RANDOM. I have data for test scores over time nested within students (L2) nested within schools (L3). Let's say it's the simplest case with a test score variable, "score", and a time variable, "t". I want to estimate the model such that there's a random slope on time at both the student and school levels. However, I'm having trouble doing it in Mplus. My two specific questions are:
1. For this scenario, under Variable:, do I list score and t as (a) WITHIN = score t, (b) BETWEEN = score t, (c) both, or (d) neither?
2. How do I specify something like
s1 | score t s2 | score t
to get slopes at both the student and school levels without getting an error about duplication of terms?
Thank you, and apologies if this is all spelled out somewhere already!
1. (d) Neither - which means variation exists on all 3 levels.
2. You say Within = t and then you
say s | score on t;
and then mention s on the 2 higher levels. Also see V8 UG ex 9.20 and later examples for variations on this theme. See also our Short Course handout and video for Topic 10 on our website where 3-level modeling is discussed.
Jim Sloane posted on Friday, August 31, 2018 - 9:53 am
because s1 refers to one slope, not several. So say
s1 | score ON t;
s2 | score ON t2;
Javed Ashraf posted on Wednesday, September 05, 2018 - 10:34 am
Hi, I have a query: can we conduct three-level multilevel mediation modeling with categorical observed variables and between- and within-subject groups in a latent variable context using Mplus version 8.1?
Is there any solution possible if we don't use sample weights or complex survey methodology in this scenario? Best regards
Yes, Mplus does 3-level with categorical variables using Bayesian estimation - see UG ex 9.21.
Zhi Ye posted on Sunday, September 16, 2018 - 6:25 pm
Dear Dr. Muthen, I am running a three-level interaction model according to example 9.20 in the UG, as follows:
WITHIN = skill1 skill2;
BETWEEN = (classid) prosocial1 prosocial2 (schoolid) climate1 climate2;
DEFINE: CENTER skill1 skill2 (GRANDMEAN);
ANALYSIS: ESTIMATOR = ML; TYPE = THREELEVEL RANDOM; ALGORITHM = INTEGRATION;
MODEL:
%WITHIN%
skill by skill1 skill2;
PVW by PV1 PV2;
s1 | PVW on skill;
%BETWEEN classid%
prosocial by prosocial1 prosocial2;
PVB1 by PV1 PV2;
s2 | PVB1 on prosocial;
s12 | s1 on prosocial;
PVB1 with s1;
%BETWEEN schoolid%
PVB2 by PV1 PV2;
climate by climate1 climate2;
PVB2 on climate;
s1 on climate;
s2 on climate;
s12 on climate;
PVB2 with s1 s2 s12;
s1 with s2 s12;
s2 with s12;
OUTPUT: TECH1 TECH8;
However, Mplus reports the following error:
*** ERROR in MODEL command The following random slope is not allowed for TYPE=THREELEVEL. Problem with: S2 | PVB1 ON PROSOCIAL
The following random slope is not allowed for TYPE=THREELEVEL. Problem with: S12 | S1 ON PROSOCIAL
Could you please help me to fix the problem?