Building a Multilevel Structural Model

chera posted on Wednesday, October 18, 2000 - 9:29 am

I am attempting to figure out how to create a structural model with a one factor two-level independent variable predicting a one-level continuous dependent variable.

Is this possible to do? Should I just build the dependent variable on the within side alone?

Linda K. Muthen posted on Thursday, October 19, 2000 - 9:48 am

Let me restate your question to be sure that I understand it. You have a factor on the cluster (between) level and you want to relate it to an observed variable on the individual (within) level. If this is what you mean, you would state this in the between part of the model as y ON f, for example. Every observed variable on the within level has a between level counterpart that is automatically created by Mplus.
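The "between level counterpart" can be pictured as a decomposition of each within-level observed variable into a cluster component and an individual component. The notation below is a sketch in common two-level notation, not something stated in the thread; the symbols $y_{Bj}$, $y_{Wij}$, $\Sigma_B$, and $\Sigma_W$ are assumed names.

```latex
% i indexes individuals, j indexes clusters
y_{ij} = y_{Bj} + y_{Wij}, \qquad
\operatorname{Cov}(y_{Bj},\, y_{Wij}) = 0, \qquad
\Sigma_T = \Sigma_B + \Sigma_W
```

Under this reading, a statement such as y ON f in the between part refers to the cluster component $y_{Bj}$.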

Anonymous posted on Saturday, January 06, 2001 - 11:40 am

I inadvertently posted this question already under Hierarchical Regression, but it really belongs under this heading. I will repeat it here: If I have found a two-factor model with an EFA and have found it not to converge as a two-factor model in a multilevel framework, are there any cites that I could use to argue that the two-factor model found in the EFA is an artifact of the nested nature of the data?

Bengt O. Muthen posted on Saturday, January 06, 2001 - 4:18 pm

I don't know of any cites related to this. I would first get the multilevel factor analysis model to converge, starting with the within level and then adding the between level. It is known that there are sometimes fewer factors found on the between level than on the within level. See references in Muthen (1989, Psychometrika) on the website. It is certainly possible that one factor on the between level and two factors on the within level could give rise to the well-fitting two-factor EFA that you found. It is also possible that there are two factors on the between level and one on the within level. You should probably work more with the multilevel model before you draw any conclusions about artifactual results from the EFA.

Anonymous posted on Friday, February 02, 2001 - 12:32 am

Is it possible to estimate a multilevel model without Mplus automatically creating a between-level counterpart for every observed variable on the within level?

Linda K. Muthen posted on Friday, February 02, 2001 - 8:28 am

No, this is not possible. However, if the variable is assumed to have no relationship with another variable on the between level, this can be specified in the between model.

Anonymous posted on Monday, August 27, 2001 - 1:42 pm

Are there technical, practical, or computational reasons why Mplus only allows for the calculation of 2-level HLMs? Do you plan on allowing for 3-level HLMs in the future?

Linda K. Muthen posted on Tuesday, August 28, 2001 - 9:34 am

No. We do plan to add 3-level in the future. It is among many planned additions.

Anonymous posted on Tuesday, November 13, 2001 - 6:09 pm

Hello -- I am learning Mplus so that I can estimate some multilevel path models, but I'm afraid I've gotten confused.

In a standard mixed regression model, you can estimate a level-1 regression where x_1 and x_2 predict y, and it is possible to get random components for the intercept and both regression parameters over level-2 units. However, as best as I can tell, in Mplus it is only possible to get a random intercept but NOT random slopes in the same situation.

Is there a straightforward way to understand why this is so? Is the answer to this the same reason why in a multilevel CFA in Mplus you can only get random intercepts in the indicators but the factor loading matrices are forced to be invariant across level-2 units?

Thank you very much.

Bengt O. Muthen posted on Wednesday, November 14, 2001 - 6:52 am

You are correct that random slopes are not part of the Mplus multilevel model for cross-sectional data. Latent variable modeling has traditionally considered mean and covariance structure models. With random slopes, there is no one covariance structure, but the covariance structure changes for each covariate value. See, for example, the Raudenbush chapter in the Collins, Sayer book. In Version 3 of Mplus, random slopes for observed covariates will be included.
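The point that the covariance structure changes with each covariate value can be seen in a one-covariate sketch. The symbols below ($\tau_{00}$, $\tau_{01}$, $\tau_{11}$, $\sigma^2$) follow common multilevel textbook notation and are assumptions, not notation used in this thread.

```latex
% Level 1, with random intercept \beta_{0j} and random slope \beta_{1j}
y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + r_{ij}, \qquad
\operatorname{Var}(\beta_{0j}) = \tau_{00},\;
\operatorname{Var}(\beta_{1j}) = \tau_{11},\;
\operatorname{Cov}(\beta_{0j}, \beta_{1j}) = \tau_{01},\;
\operatorname{Var}(r_{ij}) = \sigma^2

% Conditional variance of the outcome depends on the value of x
\operatorname{Var}(y_{ij} \mid x_{ij}) = \tau_{00} + 2\,\tau_{01}\, x_{ij} + \tau_{11}\, x_{ij}^2 + \sigma^2
```

With $\tau_{11} = 0$ (no random slope) the variance no longer depends on $x_{ij}$, which is the single-covariance-structure case.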

Anonymous posted on Sunday, March 10, 2002 - 12:13 pm

In the Step 4 (estimation of between structure) of the multilevel CFA model building procedure described in Muthen (1994) Sociological Methods and Research article, I am running into the following problem:

*** FATAL ERROR

THE SAMPLE COVARIANCE MATRIX COULD NOT BE INVERTED. THIS CAN OCCUR IF A VARIABLE HAS NO VARIATION, OR IF TWO VARIABLES ARE PERFECTLY CORRELATED, OR IF THE NUMBER OF OBSERVATIONS IS NOT GREATER THAN THE NUMBER OF VARIABLES.
CHECK YOUR DATA. THIS PROBLEM IS DUE TO:
VAR11

How can I tell which of these is causing the real problem? If the problem is due to only one variable as suggested, does that mean that variable has no variance in the Sb matrix? When I checked the ICC of that item, it was not very small (relative to other items in the analysis).

And, is it possible to use the Sb matrix in an exploratory factor analysis in Mplus to get an idea of the factor structure in the between level? Your help is much appreciated.

Linda K. Muthen posted on Monday, March 11, 2002 - 9:21 am

As stated in the article, this is a common problem. Are you analyzing Sb or SigmaB? You will probably have the same problems with both, but SigmaB is recommended. You can save this using SAVEDATA: FILE (SIGB) IS filename; The covariance matrix is saved by default. You can also save the correlation matrix in a separate run by stating FILE (SIGB) is filename; TYPE=CORRELATION;

You can see if any variables have zero variances by looking at the diagonal of the covariance matrix. You can see if any variables have correlations of one by looking at the correlation matrix. The sample size is the number of clusters. If you have more variables than the number of clusters, then you violate the last warning.

You can use the SigmaB correlation matrix in EFA with the ULS estimator. This is the default estimator.
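The three checks described above (zero variances on the diagonal, correlations of one, and more variables than clusters) can be sketched outside Mplus once the matrix has been saved. This is a minimal illustration, not an Mplus feature; the function name `diagnose_between_matrix` and the plain list-of-lists input are assumptions made for the example.

```python
import math


def diagnose_between_matrix(cov, n_clusters, tol=1e-12):
    """Return a list of problems found in a between-level covariance matrix.

    cov is a square list of lists; n_clusters is the between-level sample size.
    """
    p = len(cov)
    problems = []
    # Check 1: the number of clusters must exceed the number of variables.
    if n_clusters <= p:
        problems.append(f"clusters ({n_clusters}) not greater than variables ({p})")
    # Check 2: zero (or negative) variances on the diagonal.
    for i in range(p):
        if cov[i][i] <= tol:
            problems.append(f"variable {i} has no variance")
    # Check 3: correlations at or beyond 1 in absolute value.
    for i in range(p):
        for j in range(i):
            if cov[i][i] > tol and cov[j][j] > tol:
                r = cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
                if abs(r) >= 1.0 - 1e-9:
                    problems.append(f"variables {j} and {i} have correlation {r:.3f}")
    return problems


# Hypothetical 2 x 2 example with an implied correlation of 1.2
print(diagnose_between_matrix([[1.0, 1.2], [1.2, 1.0]], n_clusters=9))
```

Variables are reported by zero-based position, so the output has to be mapped back to the variable names in the analysis by hand.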

Anonymous posted on Monday, March 11, 2002 - 12:15 pm

Thank you very much for your reply.
I am using the SIGB matrix. None of the variables has zero variance, although some are very close. The problem seems to be that some of the correlations are larger than 1.0. I am guessing these correlations are caused by the low item variances. Does this simply mean that there is not enough variance at the group level to model?

The EFA output says:
THE INPUT SAMPLE CORRELATION MATRIX IS NOT POSITIVE DEFINITE. THE ESTIMATES GIVEN BELOW ARE STILL VALID.

I am not sure I understand why the estimates are still valid even though the matrix is not positive definite. Can I legitimately report these estimates in a manuscript? Is there any literature that explains why these estimates are considered valid? Thank you again for your help.

Bengt O. Muthen posted on Monday, March 11, 2002 - 4:49 pm

Correlations greater than one mean that the matrix is not positive definite, which is a common problem with the estimated sigma between matrix, as is mentioned in step 4 of the paper. It does not mean that there is low variance on the group level, but simply that the sigma between matrix is not well-estimated.

EFA estimation using ULS does not depend on the correlation matrix being positive definite. This is just an informational warning. However, in your case with correlations greater than one, I would not trust the results. You may instead want to use the second alternative mentioned in step 4, to analyze the sample between matrix using ULS as an approximation to analyzing the sigma between matrix. You can use these results in the multilevel model.
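Why a correlation greater than one rules out positive definiteness can be shown with any pair of variables. The short argument below is a standard linear algebra fact added for illustration; $r$ denotes the offending correlation.

```latex
% 2 x 2 principal submatrix of the correlation matrix for the offending pair
R_2 = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \qquad
\det(R_2) = 1 - r^2 < 0 \quad \text{whenever } |r| > 1
```

A positive definite matrix has only positive principal minors, so a single pair with $|r| > 1$ is enough to make the whole matrix not positive definite.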

Anonymous posted on Monday, November 11, 2002 - 9:17 am

I'm a student studying multilevel models.

I'm looking for references on multilevel SEM analysis other than the Mplus manual.

Your help is much appreciated.

Linda K. Muthen posted on Monday, November 11, 2002 - 9:24 am

Following are two basic multilevel references:

Raudenbush, S.W. & Bryk, A.S. (2002). Hierarchical linear models: Applications and data analysis methods. Second edition. Newbury Park, CA: Sage Publications.

Snijders, T. & Bosker, R. (1999). Multilevel analysis. An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage Publications.

Following is a reference that uses Mplus in the analysis:

Heck, R. (2001). Multilevel modeling with SEM. In G.A. Marcoulides & R.E. Schumacker (eds.), New Developments and Techniques in Structural Equation Modeling (pp. 89-127). Lawrence Erlbaum Associates.

You can find other multilevel references at www.statmodel.com under References.

Yi-fu Chen posted on Tuesday, April 01, 2003 - 7:51 am

Hi, Dr. Muthen,

We are trying to run a two-level SEM model and encounter problems.
We have 320 subjects nested within 9 counties. Five counties are in the intervention group and four are in the control group. (This means the treatment is at the county level.)
We now want to run a model with three latent constructs. Two latent constructs have four indicators each and one latent construct (intervention) has only one indicator. Here we treat intervention as a between-level variable, so we run the two-level model like this:

Between
intven------>Eta2
Eta1----------^

Within
Eta1------>Eta2

Is this the right way to run this model?
Also, we are a little confused about the sample size at the between level. In our case, is it right to say that we have 9 cases at the between level? Or is the model presented at the between level only the result of adjusting for the cluster effect?

Thank you for your help!

Yi-fu Chen posted on Wednesday, April 02, 2003 - 7:56 am

Hi, Dr. Muthen,

This is a follow up question.
We tried the model using complex sample. We have 9 clusters in the sample. In the model there are 11 observed variables, and we got the following message:

*** FATAL ERROR

THE SAMPLE BETWEEN COVARIANCE MATRIX COULD NOT BE INVERTED. THIS CAN
OCCUR IF A VARIABLE HAS NO VARIATION, OR IF TWO VARIABLES ARE PERFECTLY
CORRELATED, OR IF THE NUMBER OF CLUSTERS IS NOT GREATER THAN THE NUMBER
OF VARIABLES. CHECK YOUR DATA. THE PROBLEM IS DUE TO:

NUMBER OF VARIABLES : 11
NUMBER OF CLUSTERS : 9

So, if we understand correctly, the error occurs because we have more variables than clusters. Does this mean that when running the complex sample model, we should have more clusters than variables?
Do you have any suggestions for dealing with multilevel issues when the number of clusters is small?

Thanks

Linda K. Muthen posted on Wednesday, April 02, 2003 - 8:32 am

I am asking someone with experience with a small number of clusters to answer your question. Fewer than 20 clusters makes the statistical analysis difficult.

booil jo posted on Thursday, April 03, 2003 - 8:50 am

Regarding Yi-fu Chen on Tuesday, April 01, 2003-

I think your model setup is correct given your cluster randomized trial situation and your research question. However, in your situation with only 9 clusters, I don't think it is a good idea to rely on the nonparametric standard errors provided when the COMPLEX command is used. Although simple, the sandwich estimator is known to yield anticonservative coverage probability (i.e., type I error rate higher than the nominal rate) with small numbers of clusters. If the number of clusters per condition is less than 10 (you only have 5 and 4 in each condition), the resulting sandwich estimator is very unreliable. See, for example, Jo et al. (2002) and Murray et al. (1998).

To counter this limitation, several methods such as jackknife sandwich estimates (MacKinnon & White, 1985), adjustment using the t-distribution (Thornquist & Anderson, 1992), and adjustment considering the variance of the sandwich estimate (Kauermann & Carroll, 2001) have been suggested. As far as I know, modification procedures to counter anticonservative sandwich estimates are not embedded in the current version of Mplus.

I wonder if the conventional model-based methods such as mixed effect ANOVA would do any better than the sandwich method in your situation. Another (even simpler) way to deal with this problem would be to treat the clusters as fixed (i.e., dummy covariates) and do regular fixed effect regression analysis. However, the results will be valid only under a strong assumption that the nesting structure is completely explained by these dummy covariates. For more explanation about the disadvantage of this regular regression approach, see Chapter 4 of Snijders & Bosker (1999).

REFERENCES

Jo, B., Muthén, B., Ialongo, N.S., & Brown, C.H. (2002). Cluster randomized trials with nonadherence. Submitted for publication. Can be downloaded from Mplus website.

Kauermann, G., & Carroll, R. J. (2001). A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association, 96, 1387-1396.

MacKinnon, J. G., & White, H. (1985). Some heteroscedasticity-consistent covariance matrix estimators with improved finite sample properties. Journal of Econometrics, 29, 305-325.

Murray, D. M., Hannan, P. J., Wolfinger, R. D., Baker, W. L., & Dwyer, J.H. (1998). Analysis of data from group-randomized trials with repeat observations on the same groups. Statistics in Medicine, 17, 1581-1600.

Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. Thousand Oaks, CA: Sage.

Thornquist, M. D., & Anderson, G. L. (1992). Small sample properties of generalized estimating equations in group-randomized designs with Gaussian response. Paper presented at the annual meeting of American Public Health Association. Washington, D. C.

Anonymous posted on Monday, April 28, 2003 - 5:52 pm

I am attempting to build a 2-level structural model (n=762, number of clusters=108). I am trying to follow the 4 steps given in Muthen (1994), multilevel covariance structure analysis. I am having a difficult time understanding how to accomplish the third step, in which one estimates the pooled within-group covariance matrix (with a sample size of total n minus the number of groups). Can you give some guidance on how to accomplish this in Mplus? Thank you.

Linda K. Muthen posted on Monday, April 28, 2003 - 6:19 pm

In the SAVEDATA command, the FILE (SAMPLE) option will save the pooled-within covariance matrix. See page 90 of the Mplus User's Guide.
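For readers who want to see what the pooled-within covariance matrix contains, here is a minimal sketch of the usual definition: deviations from cluster means, summed over all clusters and divided by the total n minus the number of groups, as the question describes. The function name `pooled_within_cov` and the nested-list data layout are assumptions for illustration; this is not Mplus code and is not meant to replace the saved matrix.

```python
def pooled_within_cov(clusters):
    """Pooled within-cluster covariance matrix.

    clusters: list of clusters; each cluster is a list of observations;
    each observation is a list of p numeric values.
    """
    p = len(clusters[0][0])
    n_total = sum(len(c) for c in clusters)
    n_groups = len(clusters)
    sums = [[0.0] * p for _ in range(p)]
    for c in clusters:
        # Cluster-specific means for each variable
        means = [sum(obs[k] for obs in c) / len(c) for k in range(p)]
        for obs in c:
            dev = [obs[k] - means[k] for k in range(p)]
            for a in range(p):
                for b in range(p):
                    sums[a][b] += dev[a] * dev[b]
    denom = n_total - n_groups  # total n minus number of groups
    return [[sums[a][b] / denom for b in range(p)] for a in range(p)]


# Tiny hypothetical data set: 2 clusters, 2 observations each, 2 variables
data = [
    [[1.0, 2.0], [3.0, 6.0]],
    [[10.0, 0.0], [12.0, 4.0]],
]
print(pooled_within_cov(data))  # [[2.0, 4.0], [4.0, 8.0]]
```

With n=762 and 108 clusters as in the question, the divisor would be 762 - 108 = 654.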

Janet Holt posted on Thursday, February 19, 2004 - 8:44 am

In constructing a multilevel model in Mplus, I understand that random slopes need to be fixed. However, is it possible to model a cross-level interaction even with fixed slopes? In HLM this would be comparable to a gamma11 with no error term (u1j) in the equation.

Linda K. Muthen posted on Thursday, February 19, 2004 - 9:49 am

Random slopes are allowed in Mplus. This is discussed in the Addendum to the Mplus User's Guide which can be found at www.statmodel.com under Product Support.

Jennie Jester posted on Tuesday, December 07, 2004 - 8:38 am

I am developing a model of the relationship of executive functioning between parents and children. I want to do this in a latent variable framework, as I have a number of tests of executive functioning. So I was hoping to have a model where the latent variables executive functioning of the mom and executive functioning of the dad are regressed on executive functioning of the child, with Efmom and Efdad indicated by the tests measuring executive functioning in the moms and dads and Efchild indicated by the tests measuring executive function in the children. The problem I'm running into is that I have siblings in the sample, so when I develop the SEM like this, I am actually duplicating parents in the parent side of the model. This doesn't seem right to me, but I can't quite figure out how to put this into a modeling framework that makes sense.

Thanks for your help,

Jennie

bmuthen posted on Tuesday, December 07, 2004 - 11:31 am

You can handle this in 3 ways.

First, you can use type = complex with cluster=family to get the right SEs and chi-square taking the correlations within family into account.

Second, you can do 2-level modeling with cluster=family, where family variables go on level 2 (between).

Third, you can do multivariate modeling of all siblings jointly - see the Khoo-Muthen paper on the Mplus web site.

Jennie Jester posted on Friday, December 10, 2004 - 11:00 am

First I wanted to look at a CFA of this, so I tried the 2-level modeling using this syntax:
CLUSTER IS family;
BETWEEN = read56m name56m read56f name56f
WCST56mo trlBresm towerfa towermo stopfa stopmo WCST56fa trlBresf;
WITHIN = word45rs colr45rs wcst45rs toh45rs ssrt45 trail45r;
ANALYSIS: TYPE = twolevel;
MODEL:
%BETWEEN%
speedmo by read56m name56m trala56m ;
speedfa by read56f name56f trala56f;
EFmom by towermo stopmo WCST56mo trlBresm;
EFdad by towerfa stopfa WCST56fa trlBresf ;
%WITHIN%
execfunc by toh45rs SSRT45 wcst45rs trail45r ;
speed by word45rs colr45rs ;

where all the variables with m or mo at the end are mother variables, those with f or fa at the end are father variables, and the rest are child variables.
When I run this, I get the result that the intraclass correlations for all of the child variables are all 0.000, although when I look at these with SAS proc mixed, they are not zero. So I assume I'm doing something wrong in my setup here?

Second, I want to look at the mother and father (between variables), EFmom and EFdad, predicting the child (within variable) Execfunc. But I can't see how to do this in the model, because I need to specify between or within and this is both. How can I set this up properly?

Thanks,

Jennie

bmuthen posted on Saturday, December 11, 2004 - 6:00 pm

If you want between-level variation of the child variables - and hence get a non-zero intraclass correlation - you should not put these variables on the Within list because that says they have zero between variance.

You may then also add a between-level version of the exefunc factor:

exefuncb by toh45rs-colr45rs;

where you may find that you need to fix the residual variances at zero.

You can then add the between level statement:

exefuncb on efmom efdad;
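When cross-checking intraclass correlations against another program, as was done here with SAS, a rough hand computation can help. The sketch below uses the one-way ANOVA estimator of the ICC for a single variable. This is a swapped-in, simpler technique than the model-based estimate that Mplus reports, so small differences are expected; the function name `anova_icc` and the input layout are assumptions.

```python
def anova_icc(clusters):
    """One-way ANOVA estimate of the intraclass correlation.

    clusters: list of clusters, each a list of numeric values of one variable.
    """
    n_total = sum(len(c) for c in clusters)
    g = len(clusters)
    grand_mean = sum(sum(c) for c in clusters) / n_total
    # Between- and within-cluster sums of squares
    ss_between = sum(len(c) * (sum(c) / len(c) - grand_mean) ** 2 for c in clusters)
    ss_within = sum((y - sum(c) / len(c)) ** 2 for c in clusters for y in c)
    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (n_total - g)
    # Effective cluster size; equals the common size when clusters are balanced
    n0 = (n_total - sum(len(c) ** 2 for c in clusters) / n_total) / (g - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical families with two children each
print(anova_icc([[1, 2], [3, 4]]))
```

The estimator can be negative when children within a family differ more than children across families; it is not truncated at zero here.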

Anonymous posted on Tuesday, February 15, 2005 - 6:36 am

I am trying to save the within correlation matrix of a two-level CFA. I am using this syntax:

SAVEDATA: SAMPLE IS filename.dat;
TYPE IS correlation;

It does generate a datafile, but it does not contain any data/correlation matrix (0 KB).

The syntax mentioned above (FILE (SIGB) is filename; TYPE=CORRELATION;) does not work anymore.

Also, the empty data file is saved in the WINDOWS registry, how do I specify a path?

What am I doing wrong? ;-)

Thank you for any information.

Linda K. Muthen posted on Tuesday, February 15, 2005 - 6:58 am

The best thing to do with a question like this is to send your output and data to support@statmodel.com.

Anonymous posted on Thursday, April 28, 2005 - 12:51 pm

Hello,
My colleagues and I are working on a multilevel, multi-group analysis, trying to confirm a specific model. I was looking at your document '6 steps for Two-Level SEM' and was wondering if we should be updating our model as we proceed through the steps even if we are doing a confirmatory type of analysis.
Thank you for your help.

bmuthen posted on Thursday, April 28, 2005 - 6:33 pm

Tough question. Seems like the 6 steps are exploratory in nature - otherwise you would simply go straight to the last (confirmatory) step.

Anonymous posted on Thursday, May 05, 2005 - 6:34 am

Can Mplus do a three-level longitudinal model with cross-classifications of level 2 units at level 3? (time is level 1, student level 2, teacher level 3 -- students change teachers)

Linda K. Muthen posted on Thursday, May 05, 2005 - 7:07 am

Yes, Mplus can have three levels when one of them is time.

bmuthen posted on Saturday, May 07, 2005 - 12:00 pm

Cross-classified random effects modeling is not yet available in Mplus.

Eva.Vandegaer@ped.kuleuven.ac.be posted on Tuesday, May 10, 2005 - 3:39 am

Hello,

I am having problems getting an acceptable fit with my data. I am testing a path model where I take into account the fact that my data are clustered, so I use the 'TYPE=complex' command to get accurate SEs. However, when I test the same path model without the 'TYPE=complex' command, the fit (CFI, TLI and RMSEA) is much better. The modification indices do not give me good suggestions to improve my model. At this moment I have a CFI: 0.819, TLI: 0.764 and RMSEA 0.182. According to the rules of thumb these values are not good enough.
Is it possible that merely the specification of the clustered data is responsible for the lowering fit? And do you have any suggestions (besides looking at the MI because these do not help) how to improve the fit?

Thank you

bmuthen posted on Tuesday, May 10, 2005 - 5:53 am

More information is needed to answer this. Typically, taking clustering into account (using type=complex) lowers the chi-square value in the test of model fit, at least if you have substantial intraclass correlations. What were your chi-square values without type=complex and with it? And what were your CFI, TLI, and RMSEA values without type=complex?

Eva.Vandegaer@ped.kuleuven.ac.be posted on Tuesday, May 10, 2005 - 7:02 am

With taking clustering into account:
CFI=0.735
TLI=0.706
RMSEA=0.125
WRMR=2.323
Chi square model fit=219.575 df=9
Chi square model fit for the baseline model:804.756 df=10
Without taking clustering into account:
CFI=0.821
TLI=0.742
RMSEA=0.190
WRMR=4.168
Chi square model fit=990.507 df=18
Chi square model fit for the baseline model:5473.1354 df=26

I have continuous as well as categorical dependent variables. The estimator is WLSMV and I used theta parameterization.

Does the estimator or the type of parameterization have something to do with the poor fit? Are these fit indices biased, or is my model specified incorrectly (but as I said, the MIs do not give meaningful indications of what can be altered)?

Thank you!

bmuthen posted on Tuesday, May 10, 2005 - 5:21 pm

It looks to me like the fit is not good with or without taking clustering into account. I think the CFI should be at least 0.96 and the RMSEA less than 0.05, for example. I would try to revise the model. But you claim that MIs don't help. You say you have a path model - perhaps you could make that just-identified by including all paths to see where your model goes wrong.
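The CFI and TLI values posted above can be reproduced from the posted chi-square values with the standard textbook formulas, which makes it easier to see how the model and baseline chi-squares jointly drive the indices. The sketch below is an arithmetic check on the posted numbers only; it is not a description of how Mplus computes these indices internally, and the function names are assumptions. RMSEA is left out because the sample size is not given in the thread.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 - d_model / d_base if d_base > 0 else 1.0


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index from model and baseline chi-square values."""
    ratio_model = chi2_model / df_model
    ratio_base = chi2_base / df_base
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Values posted for the run that takes clustering into account
print(round(cfi(219.575, 9, 804.756, 10), 3))    # 0.735
print(round(tli(219.575, 9, 804.756, 10), 3))    # 0.706

# Values posted for the run that ignores clustering
print(round(cfi(990.507, 18, 5473.1354, 26), 3))  # 0.821
print(round(tli(990.507, 18, 5473.1354, 26), 3))  # 0.742
```

Because both the model and the baseline chi-square shrink when clustering is taken into account, the CFI and TLI can move in either direction even though the model chi-square itself is lower.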

Anonymous posted on Saturday, August 06, 2005 - 9:07 pm

I am trying to replicate the Step 0 or basic.inp for example 9.8 using the Six Steps for Two-Level SEM. When I run the following syntax:

TITLE: test
DATA: FILE IS ex9.8.dat;
VARIABLE: NAMES ARE y1-y6 x1 x2 w clus;
USEVARIABLES ARE y1-y6 ;
CLUSTER = clus;
ANALYSIS: TYPE = TWOLEVEL BASIC;
SAVEDATA:
SAMPLE = spw.dat;
SIGB = estsigb.dat;
TYPE = CORR;

I get the following error message:

*** WARNING in Savedata command
(Err#: 9)
Error opening SAMPLE save file:
spw.dat
SAMPLE will not be saved.
*** WARNING in Savedata command
(Err#: 9)
Error opening SIGB save file:
estsigb.dat
SIGB will not be saved.
2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

Do you know why I would receive this error message? Thanks for your help.

bmuthen posted on Monday, August 08, 2005 - 1:20 pm

Please send this to support@statmodel.com.

Mpduser1 posted on Sunday, November 06, 2005 - 6:31 pm

I had a question about the default error structure used in Mplus 3.13.

I am attempting to build a model of the following form:

x --> y1
x --> y2
x --> y3
(x, y1, y2, y3) --> z

1. y1, y2, y3, and z all have random intercepts (with sufficient between-level variation);

2. I specify the "x --> y" coefficients as random (and I have sufficient between-level variation);

3. I specify the "x --> z" coefficient as random (and I have sufficient between-level variation).

If I include no special statements in Mplus, Mplus assumes that the random coefficients for the random y's intercepts and the "x --> y" coefficients are correlated.

But I generally have problems estimating the model if I assume that the residual variances for the "x --> y" and the "x --> z" random coefficients are correlated, and thus have to use "@0" to manually restrict these parameters' covariances.

Shouldn't Mplus assume that the "x --> y" and the "x --> z" random coefficients' error structures are uncorrelated ?

On the one hand, I could see making an argument that these terms should be correlated (for example, if I was to write the above model out longhand), but I can rarely get such models to converge in practice.

BMuthen posted on Saturday, November 12, 2005 - 6:00 pm

The Mplus defaults can be overridden by using @0 as you say. It is sometimes the case that it is difficult to estimate a model where the covariance matrix for the random effects has many free elements.

mpduser1 posted on Tuesday, March 14, 2006 - 8:35 am

I have a pair of dummy categorical variables I wish to use as predictors in a series of HLM and OLS models.

The predictors are coded as "0,1" (i.e., 0=male, 1=female) in my sample.

Does Mplus retain the dummy 0/1 coding, or use a 1/2, contrast, or other coding scheme?

Bengt O. Muthen posted on Tuesday, March 14, 2006 - 9:14 am

Mplus retains the coding of predictors.

Ramin Azad posted on Monday, July 17, 2006 - 12:48 am

I am wondering how we can compute the figures below from a SEM when we see that, for example, the structural model explains 41.9%, 31.9%, and 34.1%, respectively, of the variation in X, Y, Z.

I mean, how can we get these figures from a SEM?

Linda K. Muthen posted on Monday, July 17, 2006 - 8:40 am

R-square is the explained variance divided by the total variance as in regular regression.
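Written out, the definition is as follows. The symbols $\hat{y}$ (predicted value) and $e$ (residual) are assumed notation; the second equality holds because predicted values and residuals are uncorrelated in regular regression.

```latex
R^2 = \frac{\operatorname{Var}(\hat{y})}{\operatorname{Var}(y)}
    = 1 - \frac{\operatorname{Var}(e)}{\operatorname{Var}(y)}
```

For example, an R-square of 0.419 corresponds to a residual variance that is 58.1% of the total variance of that variable.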

Ramin Azad posted on Saturday, October 21, 2006 - 8:40 am

Hi

I have some questions:

1) I have to compute a question which asks the respondents to give the number of new ideas that had been adopted by the organization in a period of time. Different firms have given different responses. for example, zero, 7, 8, five to ten, etc. What is the name of this scale?

2) When I want to find out the impact of a five-point Likert scale on the above scale, can I use a Regression? if not, what should I do?

Please accept my thanks in advance.
Hamid

Ramin Azad posted on Saturday, October 21, 2006 - 8:43 am

Hi
I have another question. How can I compute multiple R? Does that mean I have to square the correlation? Then what is the difference between that and R2?
Thanks
Hamid

Linda K. Muthen posted on Saturday, October 21, 2006 - 3:55 pm

If the variable is scored as the number of ideas and is normally distributed, it can be treated as continuous. If you use categories like 5 - 10, then it would be a Likert scale.

You can regress the variable on a Likert scale.

R-square is not multiple R but I'm not sure what multiple R is. You should look it up in a textbook.

Ramin Azad posted on Sunday, October 22, 2006 - 2:07 am

Hi Linda

Thank you for your help.
If the variable which is scored as the number of ideas and is normally distributed, and I treat it as continuous variable,then
(1)can I regress it to find out the impact of a five-point Likert scale on this variable?

Or

(2) should I myself categorize the responses in let's say five categories and then regress it?

(3) Which one is correct?

(4) Should I use Hierarchical regression?

Thank you so much

Linda K. Muthen posted on Sunday, October 22, 2006 - 6:21 am

If you regress a continuous variable on a Likert scaled variable, the Likert scaled variable is treated as a continuous variable. An alternative is to create a set of four dummy variables and use them as covariates. That would have to be your decision. Regular regression is sufficient.
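The set of four dummy variables for a five-point Likert item can be sketched as below. The choice of category 1 as the reference category and the function name `likert_dummies` are assumptions made for the example; any category can serve as the reference.

```python
def likert_dummies(score, n_categories=5, reference=1):
    """Return n_categories - 1 dummy variables (0/1) for a Likert score.

    The reference category is represented by all zeros.
    """
    if not 1 <= score <= n_categories:
        raise ValueError(f"score must be between 1 and {n_categories}")
    return [
        1 if score == level else 0
        for level in range(1, n_categories + 1)
        if level != reference
    ]


print(likert_dummies(1))  # [0, 0, 0, 0]  reference category
print(likert_dummies(3))  # [0, 1, 0, 0]  dummies for categories 2, 3, 4, 5
print(likert_dummies(5))  # [0, 0, 0, 1]
```

Using the dummies as covariates lets each category have its own effect relative to the reference, whereas treating the Likert variable as continuous assumes equal steps between categories.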

kanako ishida posted on Thursday, December 07, 2006 - 9:17 pm

I have a question about cross-level interaction. I have a study design similar to the user's guide example 9.2: two-level regression analysis for a categorical dependent variable. Is it possible to include a cross-level interaction in this model?

Thank you!

Linda K. Muthen posted on Friday, December 08, 2006 - 10:34 am

The model includes a cross-level interaction. It is the random slope.

kanako ishida posted on Friday, December 08, 2006 - 10:55 am

I am not sure if I am asking something too basic about statistics or I didn't phrase my question correctly. I mean, more specifically, I was wondering if I could interact individual's race (level 1 var) and community's SES status (level 2 var) in order to see the different effect of community's SES by ethnicity on the outcome? Or is this theoretically or methodologically irrelevant?

Thank you!

Bengt O. Muthen posted on Friday, December 08, 2006 - 6:30 pm

Look at example 9.2. Race (ethnicity) is x and community SES is w. If you have a random slope on level 1 for y on x, this means that the model includes the cross-level interaction term x*w (see multilevel text books on cross-level interactions). So it sounds to me that you want to do exactly what ex9.2 does.
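The way a random slope regressed on w produces the x*w term can be seen by substituting the level-2 equations into the level-1 equation. The sketch below uses standard HLM-style notation (the $\gamma$ and $u$ symbols are assumptions, matching the gamma11 and u1j mentioned earlier in the thread) and is written for a continuous outcome for simplicity.

```latex
% Level 1
y_{ij} = \beta_{0j} + \beta_{1j}\, x_{ij} + r_{ij}

% Level 2
\beta_{0j} = \gamma_{00} + \gamma_{01}\, w_j + u_{0j}, \qquad
\beta_{1j} = \gamma_{10} + \gamma_{11}\, w_j + u_{1j}

% Combined: \gamma_{11} is the coefficient of the cross-level interaction
y_{ij} = \gamma_{00} + \gamma_{01}\, w_j + \gamma_{10}\, x_{ij}
       + \gamma_{11}\, w_j x_{ij} + u_{0j} + u_{1j}\, x_{ij} + r_{ij}
```

Here $\gamma_{11}$ is the effect of community SES on the race slope, which is the same thing as the coefficient of the product $w_j x_{ij}$.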

kanako ishida posted on Friday, December 08, 2006 - 9:33 pm

Thank you for your response, yes, it makes perfect sense, but I'm not sure what happens if I have several community level variables and individual variables, out of which I am only interested in the interaction between Race and community SES. Am I able to model that way?? Maybe not in a random effect model, but in a random intercept model?

Thank you!

Linda K. Muthen posted on Saturday, December 09, 2006 - 9:05 am

If you specify y ON x; you get a random intercept and a fixed slope. If you specify s | y ON x; you get a random intercept and a random slope.

Ramin Azad posted on Sunday, January 07, 2007 - 11:32 am

Hi

I am wondering if it is possible that the results of regression are different from SEM? If so, what are the reasons? For example, I have found insignificant relationships in regression, whereas SEM shows a very strongly significant relationship with the same data!

Kind regards
Ramin

Linda K. Muthen posted on Monday, January 08, 2007 - 9:36 am

I would have to see the models you are comparing to answer this. Depending on the model, you might see different results. Please send your outputs and license number to support@statmodel.com.

Ramin Azad posted on Tuesday, January 09, 2007 - 10:25 am

Hi
Thank you for your reply. I found my funny mistake. However, I have another question. How can I find R2 in the AMOS diagram or its results?
Kind regards
Ramin

Linda K. Muthen posted on Tuesday, January 09, 2007 - 10:45 am

This is a discussion board for Mplus. I would have no idea how to find something in Amos. You should contact Amos support.

Mike Tobak posted on Sunday, June 10, 2007 - 9:15 am

Hi, Prof. Muthen,

I have several questions about multilevel path analysis and Mplus. I am new to this field and Mplus. I am trying to analyze a two-level path analysis model with random slopes, and binary level-2 covariates. I have 20 path coefficients to estimate for within model and I have 30 clusters.

Q1: I wonder how many random slopes I can estimate at level-2 if I only have 30 clusters. It seemed that I cannot set all of the 20 path coefficients to be random, since I only have 30 clusters. I want to use TYPE=TWOLEVEL.

Q2: Could you please tell me in which of your articles I can find the mathematics, derivations and related algorithms to this specific model? I am new to multilevel SEM, and I found piles of papers from the website (kind of lost). I wanted to start with the key paper for multilevel path analysis (no latent variables) with random slopes.

Thank you for your time!

Linda K. Muthen posted on Monday, June 11, 2007 - 10:37 am

Q1. Each random slope is one dimension of integration so estimating more than 4 becomes computationally very heavy. In our experience, slopes are most often not random. As a first step, you might consider looking at each regression in your path model separately to determine which slopes are random.

Q2. I would start with looking at multilevel regression in, for example, Raudenbush and Bryk and path analysis in, for example, Bollen. Following are three relevant articles:

Bauer et al. (2006). Psych Methods, 11, 142-163.
Kenny et al. (2003). Psych Methods, 8, 115-128.
Krull et al. Multivariate Behavioral Research, 36, 249-277.

Mike Tobak posted on Wednesday, June 13, 2007 - 6:24 pm

Thank you! I will read the articles carefully!

Mike Tobak posted on Thursday, June 28, 2007 - 12:40 pm

Hi, Prof. Muthen,

Thank you for your recommendations. I have read the articles and books. I wonder if you can tell me, besides Mplus User's Guide, what is the key article of Mplus talking about including random slopes in a multilevel SEM.

I know that maximum likelihood with numerical integration is used in Mplus, but I want to know more of the details and mathematics behind it. The Mplus User's Guide does not seem to provide many details on how the estimates for a multilevel SEM with random slopes are obtained.

Thank you for your time and help!!

Bengt O. Muthen posted on Thursday, June 28, 2007 - 2:30 pm

We don't have a paper on random slopes per se, but this topic is included in the technical details of the Muthen-Asparouhov (2006) chapter for the forthcoming Chapman-Hall book, which is on our web site under Papers, within the growth mixture topic.

Scott R. Colwell posted on Tuesday, February 05, 2008 - 1:11 pm

I have a fairly simple SEM model based on a finite sample size. That is, the sample is students, and I probably have over 10% of them in the sample. Given the problems that a finite sample can cause in estimating standard errors and fit statistics, I am wondering if the Type = Complex function will correct for this.

My thinking is that the ill-effects of the finite sample will show up in the estimation of the standard errors (and fit statistics) due to the potential non-independence of the respondents. If I test for both the design effect and the intraclass correlation coefficient (Hech 2001; Muthén and Satorra 1995) and they show a high degree of non-independence, then type = complex should work so long as I have reasonable clusters.

Does that seem sound to you?

Linda K. Muthen posted on Wednesday, February 06, 2008 - 10:34 am

Mplus does not do anything special for finite samples. Using TYPE=COMPLEX would take the nonindependence of observations into account.
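
The design effect mentioned in the question is commonly approximated as 1 + (average cluster size - 1) x ICC. The following is a minimal Python sketch of that arithmetic only. The function name and the example numbers are illustrative assumptions, not values from this thread, and the sketch does not address finite-population corrections.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Approximate design effect: 1 + (average cluster size - 1) * ICC."""
    if avg_cluster_size < 1:
        raise ValueError("average cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return 1.0 + (avg_cluster_size - 1.0) * icc


if __name__ == "__main__":
    # Hypothetical inputs: 25 students per cluster, ICC of 0.10.
    print(round(design_effect(25, 0.10), 3))
```

A value of 1 means the clustering has no effect on the sampling variance; a frequently cited rule of thumb treats values above about 2 as a sign that the non-independence should be modeled.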

Joyce Kwan posted on Monday, May 05, 2008 - 3:39 am

Hi Dr Muthen,

I have a sample of 482 (with 19 clusters, average cluster size = 25). I have 38 observed variables for 7 latent factors. I wonder if it is appropriate for me to fit a multilevel CFA model, because when I run the model I get the following error messages:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

A MATRIX COULD NOT BE INVERTED DURING THE H1 MODEL ESTIMATION.
THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. THE VARIANCE OF N4 APPROACHES 0. FIX THIS VARIANCE AND THE CORRESPONDING COVARIANCES TO 0, DECREASE THE MINIMUM VARIANCE, OR SPECIFY THE VARIABLE AS A WITHIN VARIABLE.
THE H1 MODEL ESTIMATION DID NOT CONVERGE. SAMPLE STATISTICS COULD NOT BE COMPUTED.

Did I run into this problem because the number of clusters is too small, or does the error message suggest something else?

Thanks.

Linda K. Muthen posted on Monday, May 05, 2008 - 9:10 am

This may be because the number of your clusters is small. A minimum of 30-50 is recommended. When Version 5.1 is made available later today or early tomorrow, I suggest trying to run the analysis using Version 5.1.

Joyce Kwan posted on Friday, May 09, 2008 - 1:11 am

Dear Dr Muthen,

Thanks for your answer. I have some follow-up questions. What is the sample size requirement for a multilevel CFA? You suggested a minimum of 30-50 clusters. How about the cluster size? Is there any literature on sample size requirements for multilevel modeling or multilevel CFA that I can refer to?

Thanks.

Linda K. Muthen posted on Friday, May 09, 2008 - 9:33 am

The necessary sample size depends on many factors and is best determined by doing a Monte Carlo simulation study. Joop Hox has written a lot about the number of clusters needed and cluster size. I would search for articles by him.

Kathryn Modecki posted on Saturday, May 10, 2008 - 5:03 pm

Hello,

I'm doing logistic regression analyses in mplus with multiple groups.

To say something about the change in the amount of variance explained, R2-change (R2-model2 minus R2-model1), between two logistic regression analyses, can I simply subtract the R2 scores (all predictors minus just demographic predictors) from 2 different analyses predicting the same outcome?

From other postings, it seems like I can't. If not, how would I answer such a question?

Thank You very much in advance.

Linda K. Muthen posted on Sunday, May 11, 2008 - 10:53 am

I don't think R-square for logistic regression is accepted in all circles and I don't know of a test of the difference in two R-square values. I would try to answer my question in a different way. See the logistic regression literature to see how others do this where I think they work with likelihood ratio differences.
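
The likelihood ratio difference mentioned above can be sketched as follows: twice the difference in loglikelihoods between the nested models is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. This is a minimal standard-library sketch; the loglikelihood values and function names are hypothetical, and with a robust estimator such as MLR the difference would additionally need a scaling correction, which this sketch ignores.

```python
import math


def regularized_lower_gamma(a: float, x: float) -> float:
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= x / (a + n)
        total += term
        if abs(term) < 1e-16 * abs(total):
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with df degrees of freedom."""
    return 1.0 - regularized_lower_gamma(df / 2.0, x / 2.0)


def likelihood_ratio_test(ll_restricted: float, ll_full: float, df_diff: int):
    """Return (test statistic, p-value) for two nested models."""
    stat = 2.0 * (ll_full - ll_restricted)
    if stat < 0.0:
        raise ValueError("the full model should not fit worse than the restricted model")
    return stat, chi2_sf(stat, df_diff)


if __name__ == "__main__":
    # Hypothetical loglikelihoods: demographics only vs. all predictors,
    # where the full model adds 2 parameters.
    stat, p = likelihood_ratio_test(-520.0, -510.0, 2)
    print(stat, p)
```

The restricted model here is the one with only the demographic predictors, so a small p-value suggests the additional predictors improve fit.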

Pamela A. Dooley posted on Tuesday, June 03, 2008 - 11:12 am

Hello,
I am trying to predict my dependent variable with race, poverty, and vars'1--5. Poverty & var5 are level 2 & all others are level 1. I am testing for a cross-level interaction between race and poverty (thus I indicate a random slope/fixed intercept). I'm unsure about how to include the other variables in the program since they should NOT have random slopes. Thanks for your help in advance.

Analysis: Type=Random Twolevel;
Model: %WITHIN%
s | dep on race;
dep ON var1 var2 var3 var4;
%BETWEEN%
dep s ON poverty;
dep ON var5;

Thanks,
Tucker

Linda K. Muthen posted on Tuesday, June 03, 2008 - 2:43 pm

This looks correct. You use the | symbol for random effects. Without the | symbol, fixed effects are estimated.
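
In the model above, the statement s ON poverty in the between part makes the expected slope of dep on race a linear function of poverty, which is the cross-level interaction. A minimal Python sketch of that relationship follows; the coefficient values are hypothetical and stand in for the estimated intercept of s and the estimate for s ON poverty.

```python
def expected_race_slope(slope_intercept: float, slope_on_poverty: float,
                        poverty: float) -> float:
    """Expected within-level slope of dep on race for a cluster with the given poverty value."""
    return slope_intercept + slope_on_poverty * poverty


if __name__ == "__main__":
    # Hypothetical estimates: intercept of s = 0.5, s ON poverty = -0.2.
    for poverty in (0.0, 1.0, 2.0):
        print(poverty, expected_race_slope(0.5, -0.2, poverty))
```

The fixed effects of var1 to var4 do not enter this calculation, because without the | symbol their slopes are the same in every cluster.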

John Hipp posted on Friday, June 06, 2008 - 3:58 pm

Hi- this is probably a simple question, but I did not see the answer above. I'm trying to run a multilevel model where an individual-level measure affects y at the individual level, but its neighborhood-level equivalent (as a latent variable, not a summation) does also.
If the data were swung wide, a simple version of my model would be (if there were four people in every neighborhood):
fy by y1 y2 y3 y4;
fx by x1 x2 x3 x4;
fy on fx;
y1 on x1;
y2 on x2;
y3 on x3;
y4 on x4;

What would my code look like if I used M-Plus's multilevel capability? My guess is something like:
%WITHIN%
y on x;
x on ;
%BETWEEN%
y on x;
I'm guessing that regressing x on an intercept would give a random version of x? And I would not declare x as either a within or between variable.
Is this giving me the model I want to estimate?
thanks much.

Bengt O. Muthen posted on Friday, June 06, 2008 - 5:14 pm

You just need

%WITHIN%
y on x;
%BETWEEN%
y on x;

Here, x gets decomposed into latent within and between parts if it isn't on either the Between= list or the Within=list. See the V5 UG ex 9.1, second part on pages 230-231.
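
To illustrate the idea of splitting x into within and between parts, here is a minimal Python sketch that uses observed cluster means. This is only an analogy and an assumption for illustration: Mplus treats the between part as a latent variable rather than computing the observed mean, so the numbers below are not what Mplus estimates.

```python
from collections import defaultdict


def decompose(values, clusters):
    """Split each value into a cluster-mean (between) part and a deviation (within) part."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cluster in zip(values, clusters):
        sums[cluster] += value
        counts[cluster] += 1
    means = {cluster: sums[cluster] / counts[cluster] for cluster in sums}
    between = [means[cluster] for cluster in clusters]
    within = [value - means[cluster] for value, cluster in zip(values, clusters)]
    return within, between


if __name__ == "__main__":
    # Hypothetical data: four people in two neighborhoods.
    within, between = decompose([1.0, 3.0, 10.0, 14.0], [1, 1, 2, 2])
    print(within)
    print(between)
```

The within part carries individual deviations from the neighborhood level, and the between part carries the neighborhood level itself, which is why y on x can be specified separately in each part of the model.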

Danielle McCarthy posted on Tuesday, August 05, 2008 - 8:53 am

I am using Mplus to test multilevel mediation models. I am interested in testing relations among a binary treatment condition variable (manipulated between subjects), a continuous mediator (assessed repeatedly within subjects on a variable occasion schedule), and a categorical between-subjects treatment outcome (smoking vs. abstinent). I can estimate the treatment effects on the mediator in a two-level Mplus model in which the intercept and slope are random, but I cannot use the estimated intercept or slope variables as a predictor of a categorical outcome between subjects. The outcome is treated as an observed variable (a knownclass variable), not a latent variable. Do I need to add the mixture model program to my base and multilevel program in order to fit such a model, or is there a way I can do this within the multilevel Mplus program? I have opted not to use a growth curve model due to the high number of within-subjects observations (around 50). Thank you in advance for your advice.

Bengt O. Muthen posted on Tuesday, August 05, 2008 - 6:32 pm

Let me see if I interpret this correctly. It sounds like you consider 2-level data where level 1 is occasion and level 2 is individual - you do this as a 2-level, univariate (actually bivariate) outcome model rather than as a 1-level multivariate outcome model due to having many time points. The ultimate outcome is a binary variable (smoking or not). It sounds like you are not considering a growth model but a path analysis model (tx -> mediator -> outcome). The way I hear you, the random intercept and slope that you refer to are for the mediator regression on tx. If I am right so far, then you can specify a random intercept also for the binary outcome and on level 2 let that be predicted by the intercept and slope from the mediator regression. You don't need mixture.

Danielle McCarthy posted on Wednesday, August 06, 2008 - 7:43 am

Thank you very much for your response. I failed to note that the model does include a growth component. I've pasted the syntax and error messages below. The variables are defined as follows:
couns is the binary indicator of treatment group, reltime is the number of days since the stop-smoking day, ability is the mediator assessed repeatedly within subjects, and abst is the binary outcome at the end of treatment (and is not time varying).

The model below does not work, but if I substitute a continuous outcome for abst or drop the CATEGORICAL statement the model runs and converges. Does this additional information change your advice at all? Thank you for your guidance!

CATEGORICAL = abst;
WITHIN = reltime;
BETWEEN = couns abst;
ANALYSIS: TYPE = TWOLEVEL RANDOM;
MODEL:
%WITHIN%
slope | ability ON reltime;
%BETWEEN%
ability slope ON couns;
abst ON couns ability slope;

ERROR in MODEL command
Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with: ABILITY
ERROR The following MODEL statements are ignored: * Statements in the BETWEEN level: ABST ON ABILITY

Linda K. Muthen posted on Wednesday, August 06, 2008 - 10:17 am

Please send your files and license number to support@statmodel.com.

Andrew Burton-Jones posted on Thursday, October 02, 2008 - 12:11 pm

Dear Statmodelers

My model tries to explain the extent to which groups of individuals can fix errors in files. Each group is given one file and has a week to fix the errors in it. In multilevel terms, the file is at Level 2 and the individual members are at Level 1. I am interested in the extent to which individuals in the groups identify errors and fix them. (Note that in my context, the individuals typically make copies of the file and distribute it to each other and then compile their work). After the time period elapses, each group will have a file (at Level 2) that has more or less errors than it did originally.

My model has the form:
Level 2 predictor:
- Number of errors in file at T1

Level 1 endogenous variables:
- Extent to which perceive errors
- Extent to which take actions to fix errors
- Extent of knowledge of file contents (moderator)

Level 2 outcome:
- Number of errors in file at T2

My model predicts that:
- The # of errors (Level 2) affects individuals' perceptions (Level 1), moderated by the individual's knowledge (Level 1)

- At Level 1, perception --> action

- The level 1 variable (action) affects the level 2 outcome (errors in file)

I don't think I can do this in HLM. I wondered whether I might be able to do it with Mplus.

I hope you can help.

Linda K. Muthen posted on Friday, October 03, 2008 - 9:02 am

How many groups do you have? Does each group get the same files?

Andrew Burton-Jones posted on Friday, October 03, 2008 - 10:00 am

Hi Linda - thanks for getting back to me so fast!

About 100 groups of 4 per group. I suspect that this is small. However, I am at the planning stage of my research and can adjust the numbers up if necessary.

Also, I think I solved one of the problems with my design. In my model, I had conceptualized a Level 1 factor moderating the effect of a Level 2 factor on a Level 1 factor. I had thought that this could not be done in HLM. However, I now think that this is the same as a Level 2 factor moderating the effect of a Level 1 factor on another Level 1 factor, which of course is perfectly feasible in multilevel modeling.

I still have a remaining problem, however, in that I have an "upward" effect from a Level 1 factor to a Level 2 outcome. I think that it is not possible to test this in HLM, but I'm not sure about Mplus.

all the best, Andrew.

Bengt O. Muthen posted on Saturday, October 04, 2008 - 11:55 am

Regarding your last paragraph, the way something like this can be accomplished in Mplus is that you use the random intercept (or mean) of the level 1 factor (which is assumed to have variation across level 2 as well) to predict the level 2 outcome.

One remaining issue is that it seems that the file fixing task needs to be the same for all groups to do this 2-level modeling.

Tianyi Yu posted on Thursday, October 09, 2008 - 6:51 am

Hi, Dr. Muthen:

I want to test a multilevel model in which the level 1 is a growth model. But at the level 2, I want to use the intercept and slope as predictors of the other outcome variables.

So, the model looks like:

%within%

s | y on time;

%between%
[s y];
s with y;
z on s;
z on y;

I got a warning message as:

In the MODEL command, the following variable is an x-variable on the BETWEEN
level and a y-variable on the WITHIN level. This variable will be treated
as a y-variable on both levels.

Is that still the case in Mplus version 5 that one variable can only be either x-variable or y-variable on both levels?

If I really want to test such kind of model, is the latent growth model (SEM) the better (or the only way) to do it?

Thanks so much!

Tianyi

Linda K. Muthen posted on Thursday, October 09, 2008 - 9:44 am

Yes, a variable must be treated as dependent or independent at both levels. This is just a warning so you realize that distributional assumptions are being made about the variable.

Ruth Zschoche posted on Friday, April 02, 2010 - 1:15 pm

I am trying to design a study for use with Mplus. New user so please bear with me.

I am attempting to run it as a multilevel SEM, but when I originally designed the study, I had many more clusters than it seems I now have to work with. Is there a minimum number of clusters required for Mplus to run the model? I know that with a low level 2 sample size I will have limited statistical power in my level-two analysis and that, for all practical purposes, I may have to forget about statistical significance at level two and instead focus on the magnitude of the relationships between latent variables. But my question is, can I use Mplus to run the analysis regardless if that's my preference, or will I have to find other software? Will I get an error message with too few clusters? Thanks in advance.

Linda K. Muthen posted on Friday, April 02, 2010 - 2:32 pm

It is recommended to have a minimum of 30-50 clusters for multilevel analysis. This is not specific to Mplus. I think the error message you refer to says you have more parameters than clusters. If you have further questions about this, please send your full output and license number to support@statmodel.com.

Ruth Zschoche posted on Monday, April 05, 2010 - 6:12 pm

Thank you. I am not yet at the point of having data to run, but will send in the output if at that point I encounter a problem. Just to clarify, the error message for more parameters than clusters...which is a problem likely to arise with my small cluster number...will this message pop up INSTEAD of an output, basically keeping me from running the analysis, or will the program just warn me of the problems with an error message while still providing the output for review? I just want to make sure that I can use your program for my analysis at all. Thanks again for the help.

Linda K. Muthen posted on Monday, April 05, 2010 - 6:20 pm

You will be warned of the problem. The analysis will not stop.

maureen dollard posted on Thursday, April 22, 2010 - 5:33 am

Hello Linda, I'm trying to write syntax for a 2-1-1 mediation, with 3 separate factors (la, lm and lam) predicting t, which in turn predicts the latent variable eg (indicated by e1 e2 e3 pe and cy). I am obviously confused, as I get the following output. Can you help please? Thanks, M

VARIABLE: NAMES ARE
y s a g t e1 e2 e3 la lm lam pe cy;
WITHIN = y a t e1 e2 e3 pe cy;
BETWEEN IS la lm lam;
CLUSTER IS s;
CENTERING = GRANDMEAN (y a g t e1 e2 e3 pe cy);
ANALYSIS: TYPE IS TWOLEVEL RANDOM;
MODEL:
%WITHIN%
t;
eg BY e1 e2 e3 pe cy;
eg on t(b);
%BETWEEN%
egb by e1 e2 e3 pe cy;
t eg;
t on lam (a);
t on la;
t on lm;
egb on t(b);
MODEL CONSTRAINT:
NEW(indb);
indb=a*b;
MODEL CONSTRAINT:
OUTPUT: TECH1 TECH8 CINTERVAL;
Example warnings:
*** ERROR in MODEL command
Within-level variables cannot be used on the between level.
Within-level variable used: E1
*** ERROR in MODEL command
Within-level variables cannot be used on the between level.
etc.,
The following MODEL statements are ignored:
* Statements in the BETWEEN level:
T
EG
EGB BY E1
EGB BY E2
EGB BY E3
EGB BY PE
EGB BY CY
EGB ON T
T ON LAM
T ON LA
T ON LM

Linda K. Muthen posted on Thursday, April 22, 2010 - 8:45 am

Variables on the WITHIN list cannot be used in the between part of the model. If you remove them, they can then be used in both parts of the model.

Utkun Ozdil posted on Wednesday, December 01, 2010 - 11:47 am

Hi Drs. Muthen,

I've just begun learning Mplus to conduct multilevel structural modeling. Although I'm familiar with LISREL, Mplus and multilevel modeling are quite new to me. Unfortunately, I have to learn the two via books, articles, and manuals by myself... =( And I got confused at the very beginning as I read and read all these materials.

In the course of learning, I think I need some practical highlights about the framework of multilevel modeling or the step-by-step procedures to follow in data analysis (which would help me handle the core issues more easily)... I would appreciate your recommendations.

And I have one more question: I have a large data set saved as a .sav file in SPSS. Does Mplus allow importing a file like that (as LISREL does)?

Thanks...

Utkun

Linda K. Muthen posted on Wednesday, December 01, 2010 - 2:33 pm

If you go to the website, you will find our course videos and handouts. You might find these useful. Chapter 9 of the user's guide contains many multilevel examples.

Mplus reads only numeric ASCII files.
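
Since only numeric ASCII files can be read, one workable route is to export the SPSS data to CSV first and then strip the header and recode blanks. The following is a minimal Python sketch under those assumptions; the missing-value code -999 and the function name are hypothetical choices, and the variable names returned here would still have to be typed into the NAMES option and the code declared in the MISSING option by hand.

```python
import csv
import io

MISSING_CODE = "-999"  # hypothetical missing-value flag, not an Mplus default


def csv_to_numeric_ascii(csv_text: str, missing_code: str = MISSING_CODE):
    """Return (variable names, free-format numeric text without a header row)."""
    reader = csv.reader(io.StringIO(csv_text))
    names = next(reader)  # header row is removed from the data
    lines = []
    for row in reader:
        cells = []
        for cell in row:
            cell = cell.strip()
            if cell == "":
                cells.append(missing_code)
            else:
                float(cell)  # raises ValueError if the cell is not numeric
                cells.append(cell)
        lines.append(" ".join(cells))
    return names, "\n".join(lines) + "\n"


if __name__ == "__main__":
    names, body = csv_to_numeric_ascii("id,x,y\n1,2.5,\n2,,4\n")
    print(names)
    print(body, end="")
```

Any string-valued variable makes the sketch raise an error, which is intentional: such variables need to be recoded to numbers before the file is read.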

Utkun Ozdil posted on Thursday, February 10, 2011 - 5:11 am

Hi,

While I was watching the course videos about multilevel analysis I noticed that a variable (e.g. f1) is treated as an observed variable in the within part and the same variable is treated as a latent variable in the between part.
I would appreciate if you explain the reason for that.

Thanks...

Utkun

Linda K. Muthen posted on Thursday, February 10, 2011 - 7:06 am

This latent variable decomposition is explained in the second part of Example 9.1.

Carolin Hagelskamp posted on Friday, March 04, 2011 - 1:01 pm

I have a longitudinal data set where time (5 time points) is nested in kids (3000) who are nested in classroom (150) which are nested in schools (64). Does Mplus6 allow me to run a 4-level multi-level model?

Linda K. Muthen posted on Friday, March 04, 2011 - 1:03 pm

No, you could do 3 levels if one of them is time.

Nidhi Kohli posted on Thursday, September 08, 2011 - 10:38 am

I am trying to fit an SEM model on a 3-level nested dataset where n = 322. The dependent variable is unordered categorical variable with 3 categories: 0, 1 & 2. I ran the model in Mplus and got the following messages:

THE ESTIMATED WITHIN COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 1900. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

My questions is how can I address the above mentioned issues in an effective way. Here is the Mplus code:

VARIABLE:
...
CATEGORICAL = alt_dep anx_att anx_diag
dep_diag other_mh_diag;
COUNT = anx_meds dep_meds other_mh_meds;
WITHIN = ...;
CLUSTER = rand_cid rand_mdid;
ANALYSIS:
TYPE = COMPLEX TWOLEVEL;
ALGORITHM = INTEGRATION;
MITERATIONS = 2000;
PROCESSORS = 8;
MODEL:
%WITHIN%
f1 BY anx_diag anx_meds anx_att;
f2 BY dep_diag dep_meds phq2tot;
f3 BY other_mh_diag other_mh_meds;
f1@1 f2@1 f3@1;
[anx_meds@0 anx_att$1@0 anx_diag$1@0];
...
f1 WITH f2;
...
alt_dep ON f1 f2 f3;

Linda K. Muthen posted on Thursday, September 08, 2011 - 1:56 pm

You say you have an unordered categorical variable but I don't see the NOMINAL option.

Nidhi Kohli posted on Thursday, September 08, 2011 - 3:56 pm

Do you mean I should use the NOMINAL statement instead of CATEGORICAL under the ANALYSIS command? I was not aware that this can make a difference. Thank you.

Linda K. Muthen posted on Thursday, September 08, 2011 - 5:07 pm

Yes, you should use NOMINAL. With CATEGORICAL, a multiple category is treated as an ordered categorical variable which would be the wrong model. See the user's guide for further information.

Nidhi Kohli posted on Friday, September 09, 2011 - 8:08 am

I changed to NOMINAL, however, I am still getting the same message, i.e.,

THE ESTIMATED WITHIN COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 138.
CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Here is the code:
VARIABLE:
...
NOMINAL = alt_dep anx_att anx_diag dep_diag other_mh_diag;
COUNT = anx_meds dep_meds other_mh_meds;
WITHIN = anx_att phq2tot anx_meds
dep_meds other_mh_meds
anx_diag dep_diag other_mh_diag;
CLUSTER = rand_cid rand_mdid;
...
ANALYSIS:
TYPE = COMPLEX TWOLEVEL;
ALGORITHM = INTEGRATION;
MITERATIONS = 5000;
PROCESSORS = 8;

MODEL:
%WITHIN%
f1 BY anx_att#1 anx_diag#1 anx_meds;
f2 BY phq2tot dep_diag#1 dep_meds;
f3 BY other_mh_diag#1 other_mh_meds;

f1 f2 f3;

f1 WITH f2;
...;

[anx_att@0 anx_diag@0 anx_meds@0];
...;

alt_dep#1 alt_dep#2 ON f1 f2 f3;

Linda K. Muthen posted on Friday, September 09, 2011 - 11:04 am

Please send the full output and your license number to support@statmodel.com.

Chris Thomas posted on Thursday, January 05, 2012 - 8:41 am

Hi Linda and Bengt,
I have what I hope is a somewhat simple question.

I have seen each of you mention that the between level often doesn't support the same number of factors as the within level. I learned multilevel modeling in Mplus from Bob Vandenberg who also asserted the same thing. In one of his examples he has two factors at the within level that he collapses into a single inclusive factor on the between level.

I am running into a similar situation. I have two work-unit climate constructs (service climate and psychological safety) that load nicely as independent constructs on the within level. But, after many different modeling attempts, it seems that I may be best served to collapse them into an omnibus unit-level climate factor at the between level.

So, here is the question...is there a reference/paper/citation for this phenomenon (i.e., fewer factors on the between level)?

If not, what justification/description would you include in a manuscript to convince reviewers that it is appropriate to model a different factor structure at the between level?

Linda K. Muthen posted on Thursday, January 05, 2012 - 2:20 pm

There is a reference in the following paper to a paper by Harnqvist, Gustafsson, and Muthen that was forthcoming in Intelligence at that time. I think that discusses this issue. You can Google it.

Muthén, B. (1994). Multilevel covariance structure analysis. In J. Hox & I. Kreft (eds.), Multilevel Modeling, a special issue of Sociological Methods & Research, 22, 376-398.

Chris Thomas posted on Friday, January 06, 2012 - 7:49 am

Thank you, Linda. That was just what I needed.

jenny posted on Thursday, January 19, 2012 - 7:59 pm

Hi,
I am attempting to test a moderated-mediation model with the variables in the mediation (x, m, y) conceptually at the group level (level 2) and the moderating variable (z) conceptually at the individual level (level 1). Data for x, m, and y were collected through individual responses (using ICCs and rwgs to justify aggregation to the group level). The model depicts x (level 2) and z (level 1) interacting to affect m (level 2) and y (level 2).

In other words, we are interested in testing a 2-way interaction between a Level 2 IV and a Level 1 moderator on a Level 2 DV, mediated through a Level 2 mediator.

Is it possible to test such a model in MPlus?

Thanks very much!

Bengt O. Muthen posted on Thursday, January 19, 2012 - 8:54 pm

That works in Mplus - you let the between-level component of z moderate the effects on the between level using XWITH since the between-level part of z is latent.

Drew C. Coman posted on Thursday, January 26, 2012 - 10:58 am

Hello Dr. Muthen,

I am just starting to familiarize myself with Multilevel SEM. I apologize for the very simple question, but to confirm, I should be constructing multilevel data sets in the "long" format correct? Is this the optimal way to analyze multilevel data in Mplus? Thanks for the help!

Linda K. Muthen posted on Thursday, January 26, 2012 - 12:41 pm

We recommend using the wide format which is more flexible than the long format. See the examples in Chapter 6 of the user's guide.

Linda K. Muthen posted on Thursday, January 26, 2012 - 2:12 pm

See chapter 9 for multilevel SEM.

Marina Milyavskaya posted on Sunday, February 05, 2012 - 9:05 am

Hi,
I have a data set where we ask participants about 3 important goals (so goal nested within person). I would like to test a model where X1-->y1-->y2-->y3-->Y4,
where all the variables are assessed at the goal (within) level. So I was wondering whether I should model the entire path on both the between and within level (like I would for a mediation analysis, following Preacher 2010), or only on the within level?
When I tried to model it on both levels (specifying the same model on both), some of the paths are significant only on the between and not the within level, and I am not quite sure what that means conceptually?
Also, if I wanted to control for a person-level (between) variable, how would that influence the model, and where would I put that in?

thank you,

Marina

Linda K. Muthen posted on Monday, February 06, 2012 - 1:58 pm

I'm not sure what your cluster variable would be in this analysis. If you have measured all individuals on all variables, multivariate modeling takes into account non-independence of observations.

Marina Milyavskaya posted on Tuesday, February 07, 2012 - 7:35 am

The cluster variable is person - I have multiple goals for each respondent, and all my variables are assessed separately for each goal.

Linda K. Muthen posted on Tuesday, February 07, 2012 - 5:35 pm

What does your data set look like?

person y1 y2 y3 y4

or

person y1
person y2
person y3
person y4

Marina Milyavskaya posted on Wednesday, February 08, 2012 - 8:30 am

My dataset looks like this:
person1 goal1 y1 y2 y3
person1 goal2 y1 y2 y3
person1 goal3 y1 y2 y3
person2 goal1 y1 y2 y3
person2 goal2 y1 y2 y3
...

By the way, would you prefer that I communicate directly with you by email about this rather than on the message board?
Thank you,
Marina

Linda K. Muthen posted on Friday, February 10, 2012 - 10:15 am

Please send the input, data, and your license number to support@statmodel.com.

Laura Stapleton posted on Thursday, April 19, 2012 - 8:19 am

Hello! I am working out some simple examples for a ML-SEM workshop and wanted to demonstrate running a "Maximal" model (Hox, 2002), wherein I covary all within and between variables -- which should just result in the ML covariance estimates at each level (as provided in the SAMPSTAT section).

Syntax for this example:
ANALYSIS:
TYPE IS TWOLEVEL;
MODEL:
%WITHIN%
Q1 WITH Q2-Q7;
Q2 WITH Q3-Q7;
Q3 WITH Q4-Q7;
Q4 WITH Q5-Q7;
Q5 WITH Q6-Q7;
Q6 WITH Q7;

%BETWEEN%
Q1 WITH Q2-Q7;
Q2 WITH Q3-Q7;
Q3 WITH Q4-Q7;
Q4 WITH Q5-Q7;
Q5 WITH Q6-Q7;
Q6 WITH Q7;

What is stumping me is that I am getting a Model chi-sq value greater than 0.
The df = 0, but chi-sq = 0.416 in this case. In the past, I have always obtained chi-sq=0. I removed Q1 and tried again (just to play) and got chi-sq=.2.

Any ideas why chi-sq would not be zero?
Thank you,
Laura

Linda K. Muthen posted on Thursday, April 19, 2012 - 10:12 am

Please send the output to support@statmodel.com.

Eva posted on Thursday, June 07, 2012 - 12:01 am

I want to build my BETWEEN model without predictors at the WITHIN level using the TWOLEVEL analysis type, to compare the paths of this BETWEEN-only model against the final model with both WITHIN and BETWEEN models.

My outcome binary variable is measured at the WITHIN level. How would I specify a WITHIN model with no level-1 predictors that contains only the intercept that could be used as the outcome at the BETWEEN level? (So I want to do the %WITHIN% model as Y ON B-0j; then specify my %BETWEEN% model as Y ON Z;)

Linda K. Muthen posted on Thursday, June 07, 2012 - 10:47 am

Just use %WITHIN% with no entries. A variance is not estimated for a binary variable and the threshold is declared in the between part of the model.

Helen Zhao posted on Thursday, June 21, 2012 - 6:16 pm

Hi Linda, I'm running into an error like this

ERROR:
One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement.

I wonder why it happens? My data is like:

ID Cluster
1 1
2 1
3 2
4 3
5 3
6 3
7 3
8 4
9 4

Could you please help? thx!!

Linda K. Muthen posted on Friday, June 22, 2012 - 6:53 am

The value of a between-level variable by definition is the same for each person in a cluster. This is what generates the message.
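
Before running the analysis, it can help to check which clusters trigger this message. The following minimal Python sketch flags clusters in which a variable intended for the BETWEEN list takes more than one value; the function name and the example data are hypothetical.

```python
def clusters_with_variation(cluster_ids, between_values):
    """Return the sorted cluster ids in which the between-level variable is not constant."""
    first_value = {}
    varying = set()
    for cluster, value in zip(cluster_ids, between_values):
        if cluster in first_value and first_value[cluster] != value:
            varying.add(cluster)
        first_value.setdefault(cluster, value)
    return sorted(varying)


if __name__ == "__main__":
    # Hypothetical data laid out like the question: nine people in four clusters.
    clusters = [1, 1, 2, 3, 3, 3, 3, 4, 4]
    z = [5, 5, 7, 2, 2, 3, 2, 9, 9]
    print(clusters_with_variation(clusters, z))
```

A cluster reported here either contains a data entry error or indicates that the variable is really an individual-level variable and does not belong on the BETWEEN list.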

Paraskevas Petrou posted on Wednesday, October 31, 2012 - 4:26 am

Hi,

I have a 3-wave longitudinal design where all variables are nested within individuals, therefore, I follow previous literature to analyze my data as a multilevel SEM model with Mplus. All variables are within-level (repeated) variables. It is important to repeat exactly the same paths both at the within- and the between-level of my model in the input. Most of the significant paths occur at the between level and I have to discuss that in the discussion of my paper. Are the following statements correct?

"All paths were examined at both levels of analysis. A significant path at the within level means that at measurement times that the independent variable is high, the dependent variable is high too. At the between level, it means that if the aggregate level of the independent variable is high irrespective of time, the aggregate level of the dependent variable is also high."

Thank you!
Paris

Linda K. Muthen posted on Wednesday, October 31, 2012 - 1:05 pm

That sounds correct.

Magdalena Mo Ching Mok posted on Thursday, February 07, 2013 - 6:21 pm

Dear Linda,

Greetings! I have a 64-bit machine and I am using MPLUS v7. When running Multilevel SEM, I got an error message "NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT
INPUT FILE. THE ANALYSIS REQUIRES 5 DIMENSIONS OF INTEGRATION RESULTING
IN A TOTAL OF 0.75938E+06 INTEGRATION POINTS... "

Much obliged if you could help. Many thanks!
Magdalena

Linda K. Muthen posted on Friday, February 08, 2013 - 5:55 am

Try using INTEGRATION = MONTECARLO (5000).
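The size of the problem can be checked with a little arithmetic. The sketch below assumes 15 integration points per dimension (taken here to be the Mplus default for standard numerical integration; the function name is only illustrative). With 5 dimensions this reproduces the 0.75938E+06 figure in the message, and it can be compared with the 5000 Monte Carlo points suggested above.

```python
def total_integration_points(points_per_dimension, dimensions):
    """Grid size when every dimension uses the same number of points."""
    return points_per_dimension ** dimensions

standard_points = total_integration_points(15, 5)  # assumed 15 points per dimension
monte_carlo_points = 5000                          # from INTEGRATION = MONTECARLO (5000)

print(standard_points)                       # 759375, i.e. about 0.75938E+06
print(standard_points / monte_carlo_points)  # 151.875
```

Because the grid grows exponentially with the number of dimensions, each added dimension multiplies the work by the number of points per dimension, whereas the Monte Carlo point count is set directly.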

Magdalena Mo Ching Mok posted on Saturday, February 09, 2013 - 4:53 am

Dear Linda, Thank you very much for your advice. Magdalena

Paraskevas Petrou posted on Tuesday, April 02, 2013 - 6:02 am

Dear Linda,

I am testing a multilevel SEM model with 4 dependent within-level variables and 10 independent between-level variables. 4 of the between-variables are control variables. The rest are 4 predictors (main effects) and their 2 interaction terms computed manually in SPSS. The only relationships that are specified at the within-level are the intercorrelations between the DV's. All other (hypothesized) relationships are at the between-level. I haven't been able to get any results yet. Even if I try to build a very simple model based on the above, I always get an error which, if I interpret it correctly, says that there are negative values in my data file, which is logical since I have standardized the predictors. An example of some of the errors I get:

Invalid symbol in data file:
"-" at record #: 8, field #: 62

Best regards,
Paris

Linda K. Muthen posted on Tuesday, April 02, 2013 - 6:19 am

It sounds like the data may be in fixed format and you are reading it in free format. If you can't see the issue, send the output, data, and your license number to support@statmodel.com.
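A quick way to see what a free-format read might choke on is to scan the file for entries that are not plain numbers. The sketch below is a hypothetical Python check, not part of Mplus: it assumes whitespace-delimited data and uses a helper name chosen for illustration. It reports the record and field of each entry that does not parse as a number; a decimal comma such as -0,52 is one possible culprit.

```python
def find_invalid_entries(lines):
    """Return (record, field, text) for every whitespace-delimited entry that is not a number."""
    problems = []
    for record, line in enumerate(lines, start=1):
        for field, text in enumerate(line.split(), start=1):
            try:
                float(text)
            except ValueError:
                problems.append((record, field, text))
    return problems

# Hypothetical records; the second one uses a decimal comma.
sample = [
    "1 0.25 -1.10",
    "2 -0,52 0.73",
]
print(find_invalid_entries(sample))  # [(2, 2, '-0,52')]
```

The record and field numbers reported this way can be compared with those in the Mplus error message to find the entry in question.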

Paraskevas Petrou posted on Tuesday, April 02, 2013 - 10:34 am

Thank you Linda!
I replaced commas with dots in the data file and it works now.
There is another problem though, I now get:

A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX [...] DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS.

Indeed my clusters are only 30. I have tried to reduce my parameters as much as I can but I still get the error. Can I ignore it or do I really have to reduce the parameters more?

I have also tried ESTIMATOR = BAYES because I have heard it works well with small samples. Then I do not get the error anymore but I am not sure how to interpret the findings. The "p values" are similar to what I was getting previously with the ML estimator. However, I also get "significance" based on the confidence intervals and this is non-significant for nearly all findings. Do I need to look at "p values" or at "significance" and its associated asterisk to interpret my findings?

Best,
Paris

Linda K. Muthen posted on Tuesday, April 02, 2013 - 12:22 pm

I would ignore the message as long as you don't have more than 30 parameters in the between part of the model.

With Bayes, you look at the credibility intervals.

Paraskevas Petrou posted on Wednesday, April 03, 2013 - 2:42 am

Thank you Linda.
I'm now reading Bengt Muthen's (2010) working paper on Bayes but there are a couple of things I am trying to clarify:

1. One of my interaction effects has a standardized estimate of -0.336 and a p value of 0.04 but the CI includes zero (-0.670 to 0.032). Do I have to report this interaction effect as significant or non-significant?

2. In the example of multilevel analysis in this article priors are specified but I do not have this information for my model. Can I run the model with the default without specifying priors?

Best,
Paris

Linda K. Muthen posted on Wednesday, April 03, 2013 - 10:55 am

1. You should use the confidence interval. The p-value is a test of the parameter being positive.

2. Yes.
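How a one-tailed p-value of about 0.04 can coexist with a 95% interval that covers zero can be illustrated numerically. The sketch below is only an illustration under strong assumptions. It treats the posterior as normal with mean -0.336 and a made-up standard deviation of 0.192; neither the normality nor the standard deviation comes from the posted output. It also takes the one-tailed p-value for a negative estimate to be the posterior proportion above zero.

```python
from statistics import NormalDist

# Assumed normal posterior: the mean is the posted estimate, the SD is hypothetical.
posterior = NormalDist(mu=-0.336, sigma=0.192)

# One-tailed p-value for a negative estimate: posterior mass above zero.
p_one_tailed = 1 - posterior.cdf(0)

# Equal-tailed 95% interval: 2.5% of the mass is cut off in each tail.
lower = posterior.inv_cdf(0.025)
upper = posterior.inv_cdf(0.975)

print(round(p_one_tailed, 3))
print(round(lower, 3), round(upper, 3))
```

An equal-tailed 95% interval leaves 2.5% in each tail, so it excludes zero only when the one-tailed p-value is below 0.025; a value of 0.04 is therefore consistent with an interval that includes zero.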

Paraskevas Petrou posted on Wednesday, April 03, 2013 - 11:39 pm

Thank you!
Paris

kja posted on Thursday, April 18, 2013 - 9:42 am

Hello,

I am trying to build a multi-level model in MPLUS but am running into some confusion. I have nine indicators that load on three latent factors, and I want to test whether these three latent factors predict minutes to relapse (DV). Measures are clustered within individuals because we examined the DV in two conditions: deprived and non-deprived. I would also like to examine whether one variable mediates any of these relationships (also measured in both conditions). And last, I would like to see whether deprivation moderates any of these relations. Can all of this be done in MPLUS, and if so is a multi-level model the best way to test these questions? I am trying to set up the most basic model (not accounting yet for mediation or moderation) with the following syntax:
…
CLUSTER = sid;
WITHIN = deprived;
BETWEEN = shs cesd ad gdd gda aa asrs aqr audit;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
%WITHIN%
delay ON deprived;
%BETWEEN%
f1 BY shs cesd ad gdd;
f2 BY gdd gda aa;
f3 BY asrs aqr audit;
delay ON f1 f2 f3;
…
But with this, I get the following error: THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX etc. It seems like I need to change start values, but I wanted to make sure this was the proper set up first.

Thank you very much in advance for your time.

Linda K. Muthen posted on Thursday, April 18, 2013 - 11:28 am

It does not sound like you need a multilevel model. When several variables are measured on each individual, multivariate modeling handles the non-independence of observations.

MODEL:
delay ON deprived;
f1 BY shs cesd ad gdd;
f2 BY gdd gda aa;
f3 BY asrs aqr audit;
delay ON f1 f2 f3;

kja posted on Thursday, April 18, 2013 - 12:10 pm

Hi Linda,

Thank you very much - it sounds like I was making it a bit more complicated than necessary. In this model, the data would then be in wide format, correct?

Linda K. Muthen posted on Thursday, April 18, 2013 - 12:21 pm

Yes, it would be in wide format.

kja posted on Thursday, April 18, 2013 - 1:07 pm

Hello,

When I switched the data back to wide format, I realized I am a little confused as to your response above. Participants have two 'delay' scores - one when nondeprived and one when deprived. I would like to test the overall main effect of the latent factors on the 'delay' score, as well as whether deprivation moderates these links. With the above model,

MODEL:
delay ON deprived;
f1 BY shs cesd ad gdd;
f2 BY gdd gda aa;
f3 BY asrs aqr audit;
delay ON f1 f2 f3;

this only accounts for one delay score with data in wide format. Would I then need to take out the first statement and include both in the model:

MODEL:
f1 BY shs cesd ad gdd;
f2 BY gdd gda aa;
f3 BY asrs aqr audit;
delay_nondeprived ON f1 f2 f3;
delay_deprived ON f1 f2 f3;

And then constrain the pathways to the two different delay scores to test for moderation? I have a relatively small sample size, so I am also trying to figure out how to maximize power and wasn't sure if using long format data with multi-level modeling would do this.

Thank you again, and sorry for the confusion.

Bengt O. Muthen posted on Thursday, April 18, 2013 - 3:49 pm

Ok, so deprivation status does not define groups. Then I would go with your second MODEL, that is, you have 2 delay outcomes (so wide in that regard). And, yes, moderation can be thought of as the differences in their coefficients.
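For the second MODEL, the two delay scores need to sit in separate columns of one row per participant. A minimal Python sketch of that long-to-wide step follows. It assumes long records with the fields sid, deprived (0/1), and delay; the function name and all values shown are made up.

```python
def long_to_wide(rows):
    """Collapse one row per (sid, condition) into one row per sid with two delay columns."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row["sid"], {"sid": row["sid"]})
        suffix = "deprived" if row["deprived"] == 1 else "nondeprived"
        record["delay_" + suffix] = row["delay"]
    return [wide[sid] for sid in sorted(wide)]

# Hypothetical long-format records: two rows per participant.
long_rows = [
    {"sid": 1, "deprived": 0, "delay": 12.0},
    {"sid": 1, "deprived": 1, "delay": 7.5},
    {"sid": 2, "deprived": 0, "delay": 20.0},
    {"sid": 2, "deprived": 1, "delay": 18.0},
]
for record in long_to_wide(long_rows):
    print(record)
```

Person-level measures such as the nine indicators would be carried over once per participant in the same way; they are left out here to keep the sketch short.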

rebecca lazarides posted on Tuesday, July 30, 2013 - 7:13 pm

Hello,
for my multilevel model with latent var
x = level1 (student) predictor and observed
xa (aggregated x) = level2 (classroom) predictor I have 2 questions:

1) %within% Mod indices suggest "xa with xa" and indicate that M.I. would be 310.898.

a) What does this mean? (I guess I should set free the variance of xa across classrooms, but is this not a default?)
b) How can I change my model effectively to improve model fit?

2) With two latent predictors on level 1 (x1, x2) and level 2 (xa1, xa2) I want to test for interaction on level 1.
When including the latent interaction term f | x1 XWITH x2 in the equation, the following error message appears "THE ESTIMATED BETWEEN COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 194.
CHANGE YOUR MODEL AND/OR STARTING VALUES."

a) Is the problem that I can not include latent interactions on level1?
b) How can I test them ?
c) If this is not the problem, what could be wrong?

Thank you very much. I appreciate your helpful comments.

Bengt O. Muthen posted on Wednesday, July 31, 2013 - 11:02 am

1a) Not all MI's make substantive sense; ignore this one.
1b) The usual SEM rules apply: Free parameters with large MIs and when freeing makes substantive sense.

2) Send files to support for diagnosis.

rebecca lazarides posted on Wednesday, July 31, 2013 - 6:15 pm

Dear Bengt,

thank you very much. I will send the files to mplus support. However, I would like to know if it is generally possible to include interaction terms (either observed or latent) only on the %within% or only on the %between% level?

Linda K. Muthen posted on Thursday, August 01, 2013 - 10:38 am

XWITH can appear on both within and between.

hogehoge posted on Friday, August 16, 2013 - 4:28 am

Hello,

I ran the model below and got the following error messages.

Model:
%within%
STRESS on DEMAND CONTROL;
SLOPE | DEMAND on SEX;
%between%
STRESS on DEMAND;
MSICKLR on JUN STRESS;
SLOPE on JUN;

*** ERROR in MODEL command
Observed variable on the right-hand side of a between-level ON statement
must be a BETWEEN variable. Problem with: DEMAND
*** ERROR in MODEL command
Observed variable on the right-hand side of a between-level ON statement
must be a BETWEEN variable. Problem with: STRESS
*** ERROR
The following MODEL statements are ignored:
* Statements in the BETWEEN level:
STRESS ON DEMAND
MSICKLR ON STRESS

But when running the model without random slope, I didn't get such errors.

Model:
%within%
STRESS on DEMAND CONTROL;
DEMAND on SEX;
%between%
STRESS on DEMAND;
MSICKLR on JUN STRESS;

Is it impossible to use observed within-level variables on the right-hand side of between-level ON statement?
How can I change the model with random slope?

Thank you.

Tihomir Asparouhov posted on Friday, August 16, 2013 - 5:12 pm

If all the variables are continuous/normal there is no problem with this model. Are you running version 7.11?

hogehoge posted on Friday, August 16, 2013 - 8:13 pm

Thank you for your help.

I am using version 7.

STRESS, DEMAND and CONTROL are within-level continuous variables.
MSICKLR is a between-level continuous variable.
SEX is a within-level binary variable.
JUN is a between-level binary variable.

Linda K. Muthen posted on Saturday, August 17, 2013 - 7:38 am

Please send the input, data, output, and your license number to support@statmodel.com.

Tihomir Asparouhov posted on Monday, August 19, 2013 - 8:33 am

Remove these specifications

SEX is a within-level binary variable.
JUN is a between-level binary variable.

These variables are covariates and don't need that.

Bengt O. Muthen posted on Monday, August 19, 2013 - 3:33 pm

For the run that you sent to Support, all you have to do is to remove the unnecessary request for integration.

hogehoge posted on Tuesday, August 20, 2013 - 12:41 pm

Dear Bengt,

I'm sorry for the mishap.
Thank you so much.

hogehoge posted on Monday, September 09, 2013 - 9:02 pm

Hello,

When using within-level independent variables as between-level dependent variable,
I got the following warning messages.

Model:
%within%
A on B C;
B with C;
%between%
A on B C;
B on D;
C on E;

*** WARNING in MODEL command
In the MODEL command, the following variable is a y-variable on the BETWEEN
level and an x-variable on the WITHIN level. This variable will be treated
as a y-variable on both levels: B
*** WARNING in MODEL command
In the MODEL command, the following variable is a y-variable on the BETWEEN
level and an x-variable on the WITHIN level. This variable will be treated
as a y-variable on both levels: C

Then I have a question.
Is the within-level correlation between B and C residual correlation?

Linda K. Muthen posted on Tuesday, September 10, 2013 - 10:57 am

When we say B and C are treated as dependent variables, we mean that distributional assumptions are made about them. B WITH C is a covariance not a residual covariance.

hogehoge posted on Tuesday, September 10, 2013 - 9:42 pm

Dear Linda,

Thank you very much for your explanation.

Katherine Taylor posted on Thursday, October 03, 2013 - 3:56 pm

I am trying to develop a twolevel model using school class as the cluster variable. However, I am getting the following error message:

*** ERROR
One or more between-level variables have variation within a cluster for
one or more clusters. Check your data and format statement.

Between Cluster ID with variation in this variable
Variable (only one cluster ID will be listed)

I checked my data and the within cluster values are the same. Is there something else I should do to try and fix this?

Thank you.

Linda K. Muthen posted on Thursday, October 03, 2013 - 6:14 pm

Perhaps you are misreading your data, for example, having blanks in a free format data set. If you can't see the problem, please send the files and your license number to support@statmodel.com.

rebecca lazarides posted on Wednesday, January 15, 2014 - 6:22 am

Dear Drs Muthén,
I am new to multilevel modeling with Mplus. I understood that an observed DV (I have one in my model) on the within-level has a between level counterpart, which is automatically created by Mplus. I would like to know what this counterpart is. In the Mplus user guide v7.0 it is written that "In the within part of the model, the ON statement describes the linear regression of y on the observed individual-level covariate x." (example 9.1; p.262) and "In the between part of the model, the ON statement describes the linear regression of the random intercept y on the observed cluster-level covariates w and xm."

a) Is the between counterpart the random intercept of y?
b) Am I right in stating that on the between-level my (averaged) IVs predict the random intercept of my (individually perceived) DV?
- I use average values as IVs in the between part of the model and list them in the between part of the syntax. I did not average my DV.

Thanks in advance.

Best,
Rebecca

Bengt O. Muthen posted on Wednesday, January 15, 2014 - 10:35 am

a) Yes. You can think of it as the cluster-mean of y.

b) Yes, but IVs can also use the latent variable decomposition into within and between. See

Lüdtke, O., Marsh, H.W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203-229.

Megan Bell posted on Sunday, April 06, 2014 - 8:46 pm

Dear Drs Muthén,

I am a student who is new to MLM in Mplus. I have received conflicting advice on whether I should be running a two- or three-level model for my analysis, and would appreciate your opinion.

I am looking at the impact of child, parent, and neighbourhood characteristics on scores on a developmental outcome measure. My sample is a single population birth cohort (twins excluded), so there is only one child per family.

I have 5 binary (yes/no) outcome variables, and 2 explanatory variables each for children, parents and neighbourhoods.

One person has advised me to run a three-level model with children (L1) nested within families (i.e. parents; L2), nested within neighbourhoods (L3).

Another person has advised me to run a two-level model, with children and parents (L1) nested within neighbourhoods (L2), the reasoning being that there is only one child per family, so parents cannot be on a separate level.

I will be running multigroup models, to compare boys with girls. I have also been advised to run separate models for each outcome variable, rather than to include all outcome variables at level 1 and nest them within individuals.

Your advice on the best way to build my model would be appreciated. Please let me know if anything is unclear.

Linda K. Muthen posted on Monday, April 07, 2014 - 6:28 am

How many families and neighborhoods do you have?

Megan Bell posted on Monday, April 07, 2014 - 7:26 am

Dear Linda,
I have more than 20,000 families and around 80 neighbourhoods.
Many thanks.

Linda K. Muthen posted on Monday, April 07, 2014 - 9:34 am

I would do TWOLEVEL with children and parents nested in neighborhoods. When each cluster has only one observation, there is no ill effect of ignoring that clustering.

I would run the theoretical model using all outcomes for boys and girls separately as a first step.

Megan Bell posted on Monday, April 07, 2014 - 6:04 pm

Thank you Linda, appreciate your advice.

Carolyn CL posted on Friday, October 24, 2014 - 10:11 am

Dear Drs. Muthen,

I am running a TWOLEVEL SEM (N = 1285). I have a large number of clusters (N = 479) but few observations per cluster (Min = 1 (40%), Max = 16, Mean = 2.68). All measures are at the individual-level, but the model takes into account potential school-based clustering (ICC's are mostly low, ranging from 0.5% - 13%, with one variable at 34%). All pathways are estimated at the within and between levels.

IVs: 3 dummy variables (reflecting categories of SES), sex, age
DVs: 2 latent variables (one with 3 continuous indicators, the other with 5 categorical indicators), 2 categorical variables, 1 continuous variable

The model will not run - usually getting stuck during bivariate or univariate estimation. I tried variations in my modeling approach, such as only modeling factors on the within level (in CFA they appear not to fit on the between level), using cluster_mean to create between-level variables, and switching from ML to WLSMV. But nothing works - I always get an error message and no results. Much of the time, it appears that the problems are with bivariate estimation for my 3 dummy IVs (e.g. "SINGULAR INFORMATION MATRIX PROBLEM OCCURRED IN THE BIVARIATE ESTIMATION").

I am hoping you may be able to provide some assistance with why the model is not running.

Bengt O. Muthen posted on Friday, October 24, 2014 - 11:34 am

The error message you report seems to be when you use WLSMV. What happens when you use ML?

I assume that you have first explored parts of the model and made those converge before putting it all together.

Carolyn CL posted on Friday, October 24, 2014 - 1:27 pm

As per your recommendation, I estimated parts of the model using WLSMV to ensure that it ran at a basic level (i.e., each DV regressed on the 3 SES dummies, age and sex); these all seemed to work fine. The next step, however, of adding a second DV and a structural component starts leading to issues, usually involving the estimation of the alpha, beta or psi for certain variables or associations.

For example, I received the following, with results but no standard errors:

NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.
SLOW CONVERGENCE DUE TO PARAMETER 18.
THE FIT FUNCTION DERIVATIVE FOR THIS PARAMETER IS -0.98714615D-02.

The parameter is a Beta linking a categorical DV to another categorical DV.

Trying to run these models instead using ML led to the following error message, with no results:

Observed variable on the right-hand side of a between-level ON statement
must be a BETWEEN variable.

Mark Prince posted on Tuesday, February 10, 2015 - 7:36 am

Hello,

I am trying to run a 2-1-1 MSEM with random slopes and I keep getting the following errors:

*** ERROR in MODEL command
Observed variable on the right-hand side of a between-level ON statement
must be a BETWEEN variable. Problem with: HELPSTRAT
*** ERROR
The following MODEL statements are ignored:
* Statements in the BETWEEN level:
TOTLDRNKS ON HELPSTRAT

Here is my code (below).
CD1 and CD2 are level 2 variables
totldrnks and helpstrat are assessed at level 1

USEVARIABLES ARE
ID CD1 CD2 TOTLDRNKS HELPSTRAT;

BETWEEN = CD1 CD2;
CLUSTER = ID;

ANALYSIS:
TYPE = TWOLEVEL RANDOM;
Algorithm = integration;
Integration = montecarlo;

MODEL:
%WITHIN%
helpstrat totldrnks;

sb1 | totldrnks on helpstrat;

%BETWEEN%

CD1 CD2 helpstrat totldrnks ;

helpstrat on cd1 (a1);
helpstrat on cd2 (a2);

totldrnks on helpstrat (bb1);

totldrnks on cd1;
totldrnks on cd2;

sb1 with helpstrat totldrnks;

[sb1] (bw1);

Model constraint:
New(b1 indb1 indb2);

b1 = bb1+bw1;

indb1 = a1*b1;
indb2 = a2*b1;

Output: cinterval;

Bengt O. Muthen posted on Tuesday, February 10, 2015 - 12:05 pm

This relates to today's posts with Falkenstrom. Because you define a random slope for helpstrat on Within, there is no latent variable decomposition into within and between parts of helpstrat as there is otherwise, so there is no between part of helpstrat to use as a predictor on Between in your statement:

%BETWEEN%

totldrnks on helpstrat (bb1);

You have to create a cluster-level version of helpstrat, say using the Cluster_mean option.
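What such a cluster-level version contains can be sketched outside Mplus. The Python below is a hypothetical illustration: the function name, the new variable name helpstrat_m, and the values are made up, while id and helpstrat follow the input above. It computes the observed mean of helpstrat within each cluster and attaches it to every row of that cluster.

```python
from collections import defaultdict

def add_cluster_mean(rows, cluster_key, var, new_name):
    """Attach the observed cluster mean of `var` to each row under `new_name`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[cluster_key]] += row[var]
        counts[row[cluster_key]] += 1
    return [
        {**row, new_name: totals[row[cluster_key]] / counts[row[cluster_key]]}
        for row in rows
    ]

# Hypothetical level-1 records for two clusters.
rows = [
    {"id": 1, "helpstrat": 2.0},
    {"id": 1, "helpstrat": 4.0},
    {"id": 2, "helpstrat": 5.0},
]
for row in add_cluster_mean(rows, "id", "helpstrat", "helpstrat_m"):
    print(row)
```

The sketch ignores missing values; a real data set would need a rule for clusters with missing helpstrat scores.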

Francis Huang posted on Saturday, February 14, 2015 - 4:53 pm

I am interested in the group level factor analytic results. Just a few questions:

1. When I run a twolevel CFA, should the between group results be the same/similar to the results when I run a single-level analysis BUT using the corrected between group correlation matrix that Mplus generates as the data file (and specifying the correct ns at the group level)?

2. Mplus generates a between group correlation matrix (using a type=basic twolevel specification). How is the corrected covariance matrix (as per Muthen [1994]) scaled into a correlation matrix in this instance (it's not anymore dividing the covariance by the product of the SDs)?

3. How different are the decompositions of the within and between correlation matrices from the WABA (within and between analysis) method described by Dansereau et al., 1984? (my understanding is that there is something off with the between correlation matrix computed using WABA).

Thanks again!

Bengt O. Muthen posted on Monday, February 16, 2015 - 8:37 am

1. Only if you have a large number of clusters.

2. Send the relevant output to support to show the difference you refer to.

3. I am not familiar with WABA.

Yoosoo posted on Saturday, March 28, 2015 - 3:16 pm

Hello,

I have a question regarding the multilevel latent covariate (MLC) model with binary outcome and a formative/aggregated Level 2 contextual variable.

My data has two-level structure, (individuals within community) with low sampling ratio. The outcome is a binary variable (healthy/unhealthy). The independent variables include a binary variable at level 1 (HCARD, possession of health card) and a contextual variable at level 2 (% community population with health card).

I applied TYPE= TWOLEVEL COMPLEX RANDOM with MLC by excluding the HCARD variable in the within/between variable section. Both my within/between models include regression of outcome on HCARD. I'm getting the following error:

*** ERROR in MODEL command
Unrestricted x-variables for analysis with TYPE=TWOLEVEL and ALGORITHM=INTEGRATION must be specified as either a WITHIN or BETWEEN variable. The following variable cannot exist on both levels: HCARD

Do you have any suggestions on what may be wrong with my model? Also is my method of introducing MLC correct, for a formative aggregate L2 contextual variable (that is analogous to community gender composition)?

Linda K. Muthen posted on Saturday, March 28, 2015 - 3:28 pm

I don't think you need TWOLEVEL and COMPLEX if you have individuals nested in community. Use only TWOLEVEL RANDOM with the cluster variable being community.

You will need to create a cluster-level variable for HCARD to use on between. You can use the CLUSTER_MEAN option of the DEFINE command to do this.

Melissa Thibault posted on Sunday, March 29, 2015 - 8:48 am

I have already completed a CFA at the individual level (teachers) but I need to look at the next level (teachers in schools) to determine if there is between-level and within-level variance.

I am attempting to run a Multilevel CFA using a number of resources. One article aligned with my research suggest that I first create within and between matrices and obtain ICC values, then run a confirmatory factor analysis on the within matrix. When I attempt this step (run the CFA referencing the within matrix) I get the error message
*** ERROR
Insufficient data in "WinCov.dat"

Is there a resource you can recommend for me to double-check my language or to ensure that the file generated by the SAVEDATA option SAMPLE IS WinCov.dat; is complete?

thank you.

Linda K. Muthen posted on Sunday, March 29, 2015 - 9:33 am

If you have three levels, you can use TYPE=THREELEVEL.

Please send the relevant files and your license number to support@statmodel.com.

Yoosoo posted on Sunday, March 29, 2015 - 3:20 pm

Dear Linda,

Thank you for your response. I have two follow-up questions.

1) I am trying to use the MLC approach as per the following paper:

Lüdtke, O., Marsh, H.W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203-229.

I was under the understanding that MLC approach uses a single variable to represent both level 1 and level 2 influences. Does using cluster_mean option apply to the MLC approach?

2) Sorry I did not explain it fully. My data were sampled using a stratification method. Would the COMPLEX option be suitable for my model?

Thank you very much for your excellent support as always. You've been a great help to my research!

Bengt O. Muthen posted on Monday, March 30, 2015 - 12:47 pm

1) The cluster_mean option is not using the latent between-variable approach of MLC. It is simply the observed cluster mean. The latent between-variable approach of MLC is not available with algorithm = integration. Algorithm = integration is needed with random slopes.

2) Yes.

Yoosoo posted on Monday, March 30, 2015 - 1:39 pm

Thank you Bengt for the response.

Removing RANDOM from my analysis did not resolve the problem. I believe this is because my outcome is a binary variable (as elaborated in my first post), which I believe requires algorithm=integration.

Would you please suggest if there is any other way to apply MLC on a binary outcome on MPlus? Thank you so much.

Bengt O. Muthen posted on Monday, March 30, 2015 - 4:39 pm

Two alternatives: 2-level WLSMV or ML adding a factor behind the X.

For 2-level WLSMV, see the UG ex 9.9 and the paper:

Asparouhov, T. & Muthén, B. (2007). Computationally efficient estimation of multilevel high-dimensional latent variable models. Proceedings of the 2007 JSM meeting in Salt Lake City, Utah, Section on Statistics in Epidemiology.

For ML, add a factor behind X:

%Within%
f BY x; x@0;

%Between%
fb BY x; x@0;

The factors then capture the latent variable decomposition.

Yoosoo posted on Monday, April 06, 2015 - 9:31 am

Thank you for the response. I have a few follow-up questions.

My x variable being decomposed is a binary variable. My cluster level variable is the fraction of individuals with x=1.

1) Does the suggested ML code still apply, or does setting it as following make sense (since x is binary)?

%Within%
f BY x@1;

%Between%
fb BY x; x@0;

2) Do I need to make further changes to the code since my cluster average x is a ratio and not average (referred to as "formative" aggregate in the following paper):

Lüdtke, O., Marsh, H.W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203-229.

Thank you.

Bengt O. Muthen posted on Monday, April 06, 2015 - 5:11 pm

I would not get into latent variable decomposition with a binary covariate.

Yoosoo posted on Monday, April 06, 2015 - 6:55 pm

Thank you Dr. Muthen for the response.

I wonder if this is better suited for SEMNET, but would you help me understand why binary covariates may be better off without latent decomposition? Is it a theoretical issue, or more of a practical concern (computational)?
My data has low sampling ratio (~0.15) and low number of LV.1 units/cluster. I understand that MLC may still be biased but I wanted to use it as a comparison to MMC. Thank you so much for your patient support as always.

Bengt O. Muthen posted on Tuesday, April 07, 2015 - 7:37 am

The latent variable decomposition assumes that the latent between and within parts are uncorrelated, normal variables.

Cynthia Yuen posted on Monday, September 21, 2015 - 10:31 am

Hello,

I am trying to run a model using daily diary data where the person is the cluster. I'm interested in whether the level 1 random slope b1 between X (obsv) and M (latent) predicts a level 2 outcome Y (latent), such that a steeper slope is related to greater Y. I'm also trying to show that another level 2 variable W (latent) moderates b1, with the goal that b1 predicts Y only at certain levels of W. Does this make sense to do/is it possible? If so, would this be the appropriate code? Thanks a lot!

%within%

Mw BY M1-M5;
B1 | Mw ON X;

%between%

Mb BY M1-M5;
Mb ON X;

Y BY Y1-Y3;
Y ON B1 Mb X;

W BY W1-W3;
B1 ON W;

Bengt O. Muthen posted on Monday, September 21, 2015 - 6:43 pm

This looks reasonable.

Cynthia Yuen posted on Tuesday, September 22, 2015 - 3:07 pm

How would you interpret this if B1 ON W was significant but Y on B1 was not? Does this mean that the slope does not predict Y because the slope is only significant for individuals at a certain level of W? How would you unpack that to find out the relationship between Y and B1 for people who vary in W? Thanks again!

Bengt O. Muthen posted on Wednesday, September 23, 2015 - 11:06 am

B1 ON W nonsig: W does not influence the slope

Y ON B1 nonsig: The slope does not influence the intercept

Cynthia Yuen posted on Wednesday, September 23, 2015 - 4:24 pm

In the case where Y ON B1 was not significant but B1 ON W was sig, does Y ON B1 take into account the effect of W? For example, if I anticipated that B1 would be positive for people high on W but nonsig or negative for people low on W, would the nonsig Y ON B1 apply to the whole sample, regardless of their scores on W? Or could Y ON B1 be significant only for those high on W? If that's the case, how would I test that?

claudia sacramento posted on Wednesday, October 07, 2015 - 11:08 am

A rather basic question on multilevel modeling, I am afraid.

I was under the assumption that by not declaring variables as within or between mplus would separate the variances and analyse the between and within parts independently, thus I was expecting to have the same results with model 1 as with model 2 below:
Cluster = team;
between = ; within = ;
Analysis: Type = twolevel random; Estimator = ml;
!model 1
Model:
%Within%
x y;
%Between%
x on y;

!model 2
Model:
%Within%
x on y;
%Between%
x on y;

However, in the first case the between effect of x on y is significant, coefficient .84, while in the second it is not, dropping to .45 and the within effect of x on y is significant, coefficient .32.

Does this mean that even if we are only interested in the between effects the within effects must be estimated, otherwise the between effect will be biased?
Many thanks,
Claudia

Bengt O. Muthen posted on Wednesday, October 07, 2015 - 5:44 pm

Check that you have the same number of parameters in the 2 models. Also ask for TECH3 so you can see which Within parameters are correlated with which Between parameters. If you find non-zero correlations, misfit on one level can affect misfit on the other level.

claudia sacramento posted on Thursday, October 08, 2015 - 3:05 am

Thank you very much for your answer, I am not quite clear about how to interpret your note or the results though.

Indeed, the number of parameters estimated differs (2w+5b in model 1; 2w+3b in model 2), but I wasn't expecting these to be the same, as in model 2 I am also estimating the within x-->y effect and in model 1 I don't.

Also, the correlations between w and b parameters are not 0 - at least not always. Should they be? What are the implications of this? I have tested the same models and looked at correlations between parameters using datasets from the Mplus website examples (e.g., file ex9.1a.dat) and the same occurs, so I don't think this is something particular to my data. You indicate that 'If you find non-zero correlations, misfit on one level can affect misfit on the other level'. How should this then be addressed?

Finally, would the correlations between within and between parameters be 0, would I then find that the between effect x-->y would remain consistent regardless of estimating or not within x-->y?

Apologies if I am lacking basic notions that would explain this, I have tried to look for this information but couldn't find anything, I am happy to read further is there is basic underlying knowledge I am missing and you can point me to a source.
Thank you again,
Claudia

Bengt O. Muthen posted on Thursday, October 08, 2015 - 4:01 pm

Your model 1 with 2w+5b probably does not allow x and y to correlate, which creates misfit, and this distorts the parameter estimates on Between given the estimated correlations that you see. Say x WITH y.
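
One way to read this suggestion, as a sketch only: model 1 mentions just the variances of x and y on Within, so adding the covariance there frees the x-y relationship on that level. Placing the WITH statement on Within is an assumption here; the reply does not name the level.

```mplus
! model 1 with the Within covariance added (sketch)
MODEL:
  %WITHIN%
  x y;          ! variances, as in the original model 1
  x WITH y;     ! lets x and y covary on Within
  %BETWEEN%
  x ON y;
```

Rerunning this and comparing the number of parameters with model 2 would show whether the two models then line up.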

Ald posted on Monday, December 28, 2015 - 7:52 am

I want to study the cross-interaction effect between x1 and w.
Names are varid u1-u6 x1 x2 w;
Categorical=u1-u6; !u1-u6 are binary
within=x1 x2; ! x1 and x2 are ordinal
between=w; !w is nominal with seven categ.
cluster=varid;
Analysis:
type=twolevel;
Model:
%within%
f1w by u1*-u3;
f2w by u4*-u6;
f1w f2w on x1 x2;
f2w on f1w;
f1w@1;
f2w@1;
%between%
f1b by u1*-u3;
f2b by u4*-u6;
f1b f2b on w;
f2b on f1b;
f1b@1;
f2b@1;
1. Can I consider the above model?
2. I have generated dummy variables. Should I run six models for each dummy variable as between variable?
3. How do I interpret the results?
Thank you very much.

Bengt O. Muthen posted on Monday, December 28, 2015 - 4:10 pm

1. Yes, it is ok but it doesn't have a cross-level interaction between x1 and w. You need random slopes for that.

2. You should create 6 dummy variables for w and include them in a single run.
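
A sketch of how the single run might be set up, assuming w is coded 1 to 7 with no missing values and that category 7 serves as the reference group. The dummy names d1 to d6 are made up for illustration. Variables created in DEFINE are placed at the end of the USEVARIABLES list.

```mplus
VARIABLE:
  Names are varid u1-u6 x1 x2 w;
  Usevariables are u1-u6 x1 x2 d1 d2 d3 d4 d5 d6;
  Categorical = u1-u6;
  Within = x1 x2;
  Between = d1 d2 d3 d4 d5 d6;   ! the dummies take the place of w on Between
  Cluster = varid;
DEFINE:
  ! one 0/1 indicator per non-reference category of w (category 7 = reference)
  IF (w EQ 1) THEN d1 = 1;   IF (w NE 1) THEN d1 = 0;
  IF (w EQ 2) THEN d2 = 1;   IF (w NE 2) THEN d2 = 0;
  IF (w EQ 3) THEN d3 = 1;   IF (w NE 3) THEN d3 = 0;
  IF (w EQ 4) THEN d4 = 1;   IF (w NE 4) THEN d4 = 0;
  IF (w EQ 5) THEN d5 = 1;   IF (w NE 5) THEN d5 = 0;
  IF (w EQ 6) THEN d6 = 1;   IF (w NE 6) THEN d6 = 0;
```

In the Between part of the model, f1b f2b ON d1 d2 d3 d4 d5 d6; would then replace f1b f2b ON w;.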

Ald posted on Tuesday, December 29, 2015 - 5:55 am

Thank you very much for your response.

Following the idea, can I consider a twolevel random slope model with cross-sectional data?

Bengt O. Muthen posted on Tuesday, December 29, 2015 - 5:20 pm

Yes.

Ald posted on Wednesday, December 30, 2015 - 11:32 am

Thank you. I add the random slopes:

Names are varid u1-u6 x1 x2 w;
Categorical=u1-u6; !u1-u6 are binary
within=x1 x2; ! x1 and x2 are ordinal
between=w; !w is nominal
cluster=varid;
Analysis:
type= twolevel random;
estimator= ml;
integration= montecarlo(500);
Model:
%within%
f1w by u1*-u3;
f2w by u4*-u6;
f1w f2w on x2;
f2w on f1w;
f1w@1;
f2w@1;
s1 | f1w on x1;
s2 | f2w on x1;
%between%
f1b by u1*-u3;
f2b by u4*-u6;
f1b f2b on w;
f2b on f1b;
f1b@1;
f2b@1;
s1 s2 on w;

I am wondering whether I have omitted any further model specifications, and whether I should define the two slopes in the same run or one slope at a time.
Thank you very much.

Bengt O. Muthen posted on Thursday, December 31, 2015 - 5:39 pm

Syntax looks ok. Do one random slope at a time as a start.

Yulan Han posted on Wednesday, April 27, 2016 - 2:11 am

Dear Dr. Muthen,

I'm doing a multilevel SEM using type=twolevel. I have 854 individuals from 153 working teams nested in 15 firms. My interest is in how team-level variables influence individual-level variables. But the reviewer asked me to consider the effect of firm using type=complex or dummy variables. I tried dummy variables. But, because there are too many dummy variables (14), the model fit indices became worse, especially CFI and TLI. I also tried type=complex, and the model fit indices are good. Can I use type=complex when I only have 15 firms? I saw you said "Less than 20 clusters makes the statistical analysis difficult" on the discussion board. Does it mean the results I got based on 15 firms are not reliable?

Kind regards,
Yulan

Bengt O. Muthen posted on Thursday, April 28, 2016 - 6:58 pm

You should not use Type=Complex with only 15 firms. Instead, explore why you don't have good fit when using dummy variables.

Alexandra Marx posted on Tuesday, October 04, 2016 - 3:01 am

Dear Drs Muthén,

I am trying to run a multilevel model with imputed data to test for a cross-level interaction. I have covariates on both the first and the second level which have missing data, so I entered the respective variances in the model. However, every time I include the slope in my model, I get the following error message:

SERIOUS PROBLEM IN THE OPTIMIZATION WHEN COMPUTING THE POSTERIOR
DISTRIBUTION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE
COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Unfortunately, I was not able to figure out what's wrong with my model. The syntax is:

[...]
model: %within%
mathe on S_Sex HISEI;
s | S_MIG on mathe;
S_MIG;
S_Sex;
HISEI;

%between%
mathe on sform L_MIG L_Sex Year Year_sc Ausb HISEI_kl MIG_kl S_Sex_kl;

s on L_MIG;
s with mathe;

sform;
L_MIG;
L_Sex;
Year;
L_Sex;
Year_sc;
Ausb;
HISEI_kl;
MIG_kl;
S_Sex_kl;

Any help with this problem would be greatly appreciated.
Thanks in advance,
Alex

Bengt O. Muthen posted on Tuesday, October 04, 2016 - 2:26 pm

Can't say without looking at it closer. Please send your input, output, data and license number to Support.

Anna Gkiouleka posted on Sunday, March 26, 2017 - 10:54 am

Hi, I am running an intercept-only two-level model and the output does not give a variance estimate for either the within or the between level. Standard errors appear, as does the between-level mean. Both variances appear significant though. My sample is small: 130 observations across 29 clusters. Is it possible that I don't get an estimate for the variance because of the small sample?

Thank you!

Bengt O. Muthen posted on Sunday, March 26, 2017 - 6:02 pm

If you don't put your outcome on the Within= or Between= list and you mention the variance on each level, you get the 2 variance estimates.
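
A minimal sketch of such an intercept-only input, with hypothetical variable names (clus for the cluster identifier, y for the outcome):

```mplus
VARIABLE:
  NAMES = clus y;      ! hypothetical names
  USEVARIABLES = y;
  CLUSTER = clus;      ! y appears on neither the WITHIN= nor the BETWEEN= list
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y;                   ! within-level variance of y
  %BETWEEN%
  y;                   ! between-level variance of y
```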

If this doesn't help, please send output to Support along with your license number.

Timothy Ihongbe posted on Tuesday, June 06, 2017 - 4:07 pm

Hi Drs. Muthen,

I am running a multilevel (2-level) analysis (2-2-1 model) and I have 2 problems.

1. My cluster variable is the census tract. However, some census tracts have decimals (e.g. 701.01, 701.02). Mplus seems to recognize this as a variability in the cluster variable and gives me an error message.

2. My outcome variable (y) and other covariates (x1, x2, x3) are individual-level variables and I treated them as both within and between levels variables. When I regress y on x and m (both between level variables), I do not have any problem. But when I include x1, x2, and x3, it tells me x1, x2, x3 cannot exist on both levels and must be specified as either a WITHIN or BETWEEN variable. How do I address this as I intend to control for x1, x2, and x3? My code is below.

VARIABLE:
NAMES ARE clu x m y x1 x2 x3;
USEVARIABLES ARE clu x m y x1 x2 x3;
MISSING ARE ALL (99);
CATEGORICAL ARE y;
BETWEEN ARE x m;
CLUSTER IS clu;

ANALYSIS:
TYPE IS TWOLEVEL RANDOM;

MODEL:
%BETWEEN%
m ON x x1 x2 x3 (a1-a4);
y ON m x1 x2 x3 (b1-b4);
y ON x x1 x2 x3 (c1-c4);

MODEL CONSTRAINT:
NEW(direct indirect total);
indirect = a1*b1;
direct = c1;
total = c1+ a1*b1;

OUTPUT: TECH1 TECH8 cinterval;

Bengt O. Muthen posted on Tuesday, June 06, 2017 - 6:15 pm

1. You can multiply the tract number by 100.

2. We need to see your full output to say - send to Support along with your license number.

SY Khan posted on Thursday, August 10, 2017 - 12:59 pm

Hello,

I am analysing a moderated mediation model using latent constructs and observed variables in a multilevel framework (1-1-2 with level 1 moderator).

The model runs for a long time and gives the following error messages.

THE ESTIMATED WITHIN COVARIANCE MATRIX COULD NOT BE INVERTED.
COMPUTATION COULD NOT BE COMPLETED IN ITERATION 235. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Changing the START Values, gives the following error:

RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES
Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:

Unperturbed starting value run did not converge.
4 perturbed starting value run(s) did not converge.

THE ESTIMATED WITHIN COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 235. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY. ESTIMATES CANNOT BE TRUSTED.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Can you please suggest ways to fix the problem?

Thanks.

Bengt O. Muthen posted on Thursday, August 10, 2017 - 4:35 pm

Perhaps you have some almost zero variances among your non-converged estimates.

If that doesn't help, send to Support along with your license number.

Mitsuko Tanaka posted on Friday, August 11, 2017 - 6:20 pm

I created a multilevel SEM using the following model command (A, B, X, & Y are observed variables):

VARIABLE:
NAMES ARE Group A B X Y z1 z2;
USEVARIABLES ARE A B X Y;
CLUSTER = Group;

ANALYSIS:
TYPE IS TWOLEVEL;
ESTIMATOR IS MUML;
ITERATIONS = 1000;
CONVERGENCE = 0.00005;
COVERAGE = 0.10;

MODEL:

%WITHIN%
A on X Y;
B on X Y;

%BETWEEN%
A on X Y;
B on X Y;

Instead of using the observed variable, Y,
I'd like to use a second-order latent variable, Z consisting of two observed variables, z1 & z2.
Is it possible to use a second-order latent variable in a multilevel SEM?
If possible, what command should I use?

Thank you for your help.

Bengt O. Muthen posted on Saturday, August 12, 2017 - 10:47 am

Yes, this is possible. Just use BY and ON as usual.
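
A sketch of what "BY and ON as usual" might look like for the model posted above, treating Z as a factor measured by z1 and z2 on each level. The factor names Zw and Zb are made up for illustration, and the ANALYSIS command is assumed to stay as posted.

```mplus
VARIABLE:
  NAMES ARE Group A B X Y z1 z2;
  USEVARIABLES ARE A B X z1 z2;   ! Y is replaced by the indicators of Z
  CLUSTER = Group;
MODEL:
  %WITHIN%
  Zw BY z1 z2;      ! within-level factor
  A ON X Zw;
  B ON X Zw;
  %BETWEEN%
  Zb BY z1 z2;      ! between-level factor
  A ON X Zb;
  B ON X Zb;
```

With only two indicators, the factor may depend heavily on its relations to the other variables for identification, so convergence problems would not be surprising.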

Jacqueline Power posted on Wednesday, August 23, 2017 - 1:34 pm

My model is similar to Mplus Short Course 7, slide 81. x and y are level 1, m is level 2. I am not using Monte Carlo. I have random slopes.
I would like to add modifiers to the x->m relationship and the m->y relationship. Can you show me how to do this? Thank you.

Bengt O. Muthen posted on Wednesday, August 23, 2017 - 4:38 pm

Perhaps something like this is what you have in mind, letting xz1 be the interaction between x and the z1 moderator and mz2 be the interaction between m and the z2 moderator (I assume Z1 and Z2 are between level):

Between = m z1 z2;

Model:

%within%
y on x; x;

%Between%

m on x xz1;
y on m mz2;
y on x;

Jacqueline Power posted on Thursday, August 24, 2017 - 8:10 am

Thank you very much. Re: Your message of yesterday, beginning "Perhaps something like this is what you have in mind"

In my model, z1 and z2 are level 1 moderators. In this model, only m is a level 2 variable.

Can you show me how to change the instructions to reflect this change?

Thank you,

Bengt O. Muthen posted on Thursday, August 24, 2017 - 6:49 pm

I am confused - in your first message you talked about

modifiers to the x->m relationship and the m->y relationship.

m is only on the Between level so how can you have a level 1 moderator of that?

Jacqueline Power posted on Friday, August 25, 2017 - 10:23 am

Thank you.
1. x-> m relationship in which there is a level 1 moderator called z1.

We want to predict M which is 'mean_ses for the school'(from slide 36).

Student income (Si) ->Mean_ses, modified by student debt(sd).

Si is x, M is Mean_ses, z1 is sd.

2. m-> y modified by z2
Mean_ses for the school-> self-esteem for student modified by student height or z2

Bengt O. Muthen posted on Friday, August 25, 2017 - 4:20 pm

Let's talk about case 1. (case 2 is analogous). The x-> m relationship is a between-level relationship given that m is a between variable. This means that on Between x has to appear in its cluster version (either by latent variable decomposition as in UG ex 9.2 or by computing its observed cluster mean). And the z1 moderator variable similarly has to appear on Between in its cluster version.

Jacqueline Power posted on Tuesday, August 29, 2017 - 7:50 am

Thank you. If I use the 'observed cluster mean' method, I would compute xmean by taking the average of x for each cluster. For slide 34, I would take the mean for each school. I compute the average of z1 for each school which is z1mean. Same for z2.

Now I have xmean, z1mean, z2mean.
Can you show me how to write the coding for this model if I have random slopes? To find this model, look at August 23.
Thank you,

Bengt O. Muthen posted on Tuesday, August 29, 2017 - 5:03 pm

Note that there are 2 cases of moderation:

(1) Random slopes are defined for Within-level relationships and then used in Between-level modeling. That's what slides 34 and 36 show (Topic 7).

(2) Moderation of Between-level relationships is carried out using regular product terms created, for instance, in the Define command and then entered in the Between-level regression equation.

You said that your moderation is for x-> m and m -> y, which is actually xmean -> m and m -> ymean, so we are talking about case (2), i.e. the Between level.

But you also seem to imply that m is a mean which seems to say that you might have Within-level relationships.

But at this point I don't think I can help you further.

Jacqueline Power posted on Thursday, August 31, 2017 - 7:13 am

Thank you. I am in situation 2 (Moderation of Between-level). My level 2 variable is only measured at level 2. This variable is department morale (DM). I was trying to use the school example as this example is discussed in the slides.
I need to prove to my colleagues that I can use Mplus for this type of situation. I have seen published papers using Mplus for this type of problem.

Can you show me how to handle
xmean->m measured at level2, modified by z1mean [random slope]
Or should I create a product term? Not sure how to start.

Bengt O. Muthen posted on Thursday, August 31, 2017 - 4:55 pm

Yes, create a product term in line with my suggestion (2).
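
A sketch of the product-term approach for the xmean -> m part, assuming xmean and z1mean have already been computed and saved in the data file as described earlier in the thread. The names dept (cluster identifier) and xz1 (product term) are hypothetical.

```mplus
VARIABLE:
  NAMES = dept y m xmean z1mean;            ! hypothetical file layout
  USEVARIABLES = y m xmean z1mean xz1;      ! DEFINE variables go last
  BETWEEN = m xmean z1mean xz1;
  CLUSTER = dept;
DEFINE:
  xz1 = xmean*z1mean;     ! Between-level product term
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y;
  %BETWEEN%
  m ON xmean z1mean xz1;  ! xz1 carries the moderation of xmean -> m
  y ON m;
```

No random slope is involved here: the moderation sits entirely on Between, so TYPE = TWOLEVEL without RANDOM is used in this sketch.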

Guillem Rico posted on Wednesday, October 11, 2017 - 5:01 pm

Dear Dr. Muthén,

I am new to Mplus. I would like to estimate how different job attributes affect interest in a job position. I am using a conjoint study where respondents are asked to rate their interest in different types of jobs. The dataset is structured in stacked form, such that each job rating is a separate observation (there are eight different observations per respondent), with job attributes as (dichotomous) predictors. I want to examine how the effects of these job attributes vary across various latent characteristics of the respondents (associated with different observed variables at the respondent level). I have two questions:

1. What kind of multilevel/complex data approach is more appropriate for this? Please note that I am trying to estimate a cross-level interaction here.

2. Is it feasible to obtain latent variable estimates at level 2 (i.e. respondent) within the same model? If so, could you provide some guidance as to how to do this?

Bengt O. Muthen posted on Wednesday, October 11, 2017 - 5:53 pm

Sounds like a Type = twolevel analysis. Yes, you can request factor scores which gives them also for level 2.

Guillem Rico posted on Friday, October 13, 2017 - 10:22 am

Thank you! My first attempt apparently ran without problems and the results make sense to me, but I am wondering if I am actually doing what I intend to. Specifically, I want to see how the effect of job attributes (sector, service, duration) varies with respondents' latent characteristics (aps, cpv, com, ss). Female, age, and findjob are covariates. I am using this syntax:

Variable:
Names are [omitted] ;
Usevariables = id female age findjob accept duration service sector
mp7_a mp7_b mp7_c mp7_d mp8_a mp8_b mp8_c mp8_d mp9_a mp9_b mp9_c
mp9_d mp10_a mp10_b mp10_c mp10_d ;
Missing are all (-999) ;
Within = duration service sector ;
Between = female age findjob mp7_a mp7_b mp7_c mp7_d mp8_a mp8_b mp8_c
mp8_d mp9_a mp9_b mp9_c mp9_d mp10_a mp10_b mp10_c mp10_d ;
Cluster = id ;

Analysis:
Type = twolevel random ;

Model:
%within%
s1 | accept ON sector ;
s2 | accept ON service ;
s3 | accept ON duration ;

%between%
aps BY mp7_a mp7_b mp7_c mp7_d ;
cpv BY mp8_a mp8_b mp8_c mp8_d ;
com BY mp9_a mp9_b mp9_c mp9_d ;
ss BY mp10_a mp10_b mp10_c mp10_d ;
accept ON female age findjob aps cpv com ss ;
s1 s2 s3 ON aps cpv com ss ;
accept WITH s1 s2 s3 ;

Thanks for your help.

Bengt O. Muthen posted on Friday, October 13, 2017 - 2:29 pm

Looks ok.

Guillem Rico posted on Saturday, October 14, 2017 - 11:23 am

Thank you very much!

Aurelie Lange posted on Friday, April 05, 2019 - 7:16 am

Dear Dr Muthen,

I got a message saying that the estimated covariance matrix could not be inverted. I would like to estimate several random slopes. This is my model:

MODEL:
!cross-lagged paths
conflt22 on criesst1;
s1 | criesst2 on conflt12;

!stability paths
conflt22 on conflt12;
criesst2 on criesst1;

!initial covariance
S2 | criesst1 on conflt12;

!disturbance covariance
S3 | criesst2 on conflt22;

criesst1 on sexet1 aget1;
criesst2 on sexet2 aget2;

s1 on contt1_2;
s2 on contt1_2 contt2_2;
s3 on contt2_2;

The exact error message is:
THE ESTIMATED COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 459. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Am I specifying anything wrongly?

Sincerely,
Aurelie

Bengt O. Muthen posted on Saturday, April 06, 2019 - 12:52 pm

We need to see your full output - send to Support along with your license number.

Aurelie Lange posted on Monday, September 30, 2019 - 12:46 pm

Dear Dr Muthen,

I am trying to set up a multilevel model with several x-variables on the individual level that predict a binary y-variable on the between-level.
I have read in this thread that Mplus automatically creates a between-level counterpart for individual-level variables. Could you let me know if the following input is correct?

between: y

%within%
x1;
x2;
x3;
%between%
y on x1 x2 x3;

Also, when including all x-variables I encounter a problem. Some of the x-variables are dummy-variables. One of them has only a couple of observations. If I include this as dummy in the model, I get the following error message:

A MATRIX COULD NOT BE INVERTED DURING THE BASELINE MODEL ESTIMATION. THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED.
THE VARIANCE OF LIVOTHER APPROACHES 0. FIX THIS VARIANCE AND THE CORRESPONDING COVARIANCES TO 0, DECREASE THE MINIMUM VARIANCE, OR SPECIFY THE VARIABLE AS A WITHIN VARIABLE.

THE BASELINE MODEL ESTIMATION DID NOT CONVERGE. THE CHI-SQUARE VALUE COULD NOT BE COMPUTED.

Should I exclude this variable from the model, or is there some other way around this?

Thank you so much!

sincerely,
Aurelie

Bengt O. Muthen posted on Monday, September 30, 2019 - 2:26 pm

You don't have a model of interest on Within so I suggest you analyze this simply as a single-level model where you use the Cluster_mean to create between-level versions of the x variables. The sample size is then the number of clusters.

Aurelie Lange posted on Thursday, October 03, 2019 - 4:23 am

Dear Dr Muthen,

Thank you for your advice. This simple model indeed didn't have any model of interest on the within level. In the end, however, I do have model statements on both within and between. My model is

%within%
y1 on x1 x2;
x1;
x2;

%between%
y2 on x1 x2;
x1;
x2;

Whereby y1 is binary. When I initially tried this, I got the following message:

Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable when using ML estimators. The BAYES estimator may resolve this problem.

I have therefore used cluster_mean for x1 and x2 and have used these new variables in the between-statements.
Is this a correct way to go about it?

Thank you!

Sincerely,
Aurelie Lange

Aurelie Lange posted on Thursday, October 03, 2019 - 5:18 am

Additional question (sorry for the double post): If one of the x-variables is a dummy-variable, how do you include this on the between-level? You can't take a cluster_mean of a dummy 'cause then it would no longer work as a dummy.
Could you include the k-1 dummies on %within% and then include the cluster_mean of the k-variables on %between% and treat these as separate averages of yes/no questions?

I hope this makes any sense.

Thank you so very much!
Aurelie

Bengt O. Muthen posted on Thursday, October 03, 2019 - 2:18 pm

It is a good approach to use cluster_mean for x1 and x2 on the Between level. For a dummy x variable on Within, this gives an x variable on Between that is the proportion of people in the cluster who have x=1.

When you use cluster_mean x's on Between, you should use group-mean centered x counterparts on Within and put them on the Within= list.
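
A sketch of how this advice could be written for the model posted above, assuming a recent Mplus version in which CLUSTER_MEAN and CENTER are available in the DEFINE command. The names clus, x1m and x2m are hypothetical, and the NAMES statement is omitted.

```mplus
VARIABLE:
  USEVARIABLES = y1 y2 x1 x2 x1m x2m;   ! DEFINE variables go last
  CATEGORICAL = y1;
  WITHIN = x1 x2;                       ! group-mean centered versions
  BETWEEN = y2 x1m x2m;                 ! cluster means
  CLUSTER = clus;
DEFINE:
  x1m = CLUSTER_MEAN(x1);      ! for a dummy x1: proportion with x1 = 1 in the cluster
  x2m = CLUSTER_MEAN(x2);
  CENTER x1 x2 (GROUPMEAN);    ! within-cluster deviations used on Within
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y1 ON x1 x2;
  %BETWEEN%
  y2 ON x1m x2m;
```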

Stanislaw posted on Thursday, November 21, 2019 - 3:50 am

Dear professors,
I want to run a mediation analysis with predictor X (L2), mediator M1 (L2), mediators M2a and M2b (L1), and outcome Y (L1). Could you help me by checking the syntax and answering few questions? Thank you in advance!

1)Is the syntax okay?

2)Are there means to evaluate the model fit? If I compute the same model without random slopes, I find fit indices that indicate good fit and a higher AIC compared to the model with random slopes. Can I interpret this as evidence for good fit of the random slopes model?

3)Is the following interpretation correct? The output says that the effect of M1 on M2a is 0.46, which signifies that the mean of M2a changes by 0.46 if M1 changes by 1.

SYNTAX:
model: %within%
s1 | Y on M2a;
s2 | Y on M2b;
M2a with M2b;
M2a M2b Y;

%between%
M1 on X (a1);
M2a on M1 (d1);
M2a on X (a2);
M2b on M1 (d2);
M2b on X (a3);
Y on M2a (e1);
Y on M2b (e2);
M2a with M2b;
M1 M2a M2b Y;
s1 with Y;
s2 with Y;
s1 with s2;

model constraint:
new (ind);
ind = a1*d1*e1+a1*d2*e2;

Bengt O. Muthen posted on Friday, November 22, 2019 - 3:58 pm

1) Looks ok

2) Use the model with the lowest BIC. There is no overall test of fit when you have random slopes.

3) it signifies that M2a (not its mean) changes by 0.46 etc.

Stanislaw posted on Monday, November 25, 2019 - 12:30 am

Thank you! That helped me a lot!

Stanislaw posted on Monday, November 25, 2019 - 5:16 am

A follow-up question arose: I have a significant effect of M2a on Y on level 1, but not on level 2. On level 2, the indirect effect X->M1->M2a->Y is not significant. Is it possible to consider both levels when calculating the indirect effect?

Bengt O. Muthen posted on Monday, November 25, 2019 - 11:38 am

You can report effects on each level. Only if you have random slopes will the 2 levels interact to affect the effect calculations.

Stanislaw posted on Thursday, December 12, 2019 - 5:02 am

Dear Bengt,
Thank you very much! Could you perhaps indicate how to calculate the indirect effect X -> M1 -> M2a -> Y on both levels? I do not understand how to integrate effects on level 1 between mediator M2a and Y. I would really appreciate your help!

Thank you!

Bengt O. Muthen posted on Thursday, December 12, 2019 - 5:48 am

Take a look at our Short Course video and handout for Topic 7, slides 67-85 and especially the random slope slides 80-85. See also our web page:

http://www.statmodel.com/Mediation.shtml

especially under the heading:

Two-level mediation with random slopes

Roy Konings posted on Friday, March 13, 2020 - 12:00 pm

Dear Dr. Muthen,

I am trying to build a multilevel moderated mediation model but I am not sure whether I am doing this correctly. I have one predictor (X1) on the within level, which influences my dependent variable (Y), both directly and indirectly through a mediator on the within level (M). In addition, I also have a predictor on the between level (X2) which influences the same dependent variable (Y) directly and indirectly through the same mediator (M) on the within level. To conclude, I have a cross-level interaction (I) between x1 and x2 which has a direct and indirect influence (via M) on my dependent variable (Y).

I am not sure how to create the cross-level interaction term. Should I do this with the define option (x1*x2), and include this interaction term as a predictor on the between level (and/or the within level?)? Or do I have to create a random slope for the effect of x1 on M and Y, and use x2 as a predictor for that slope on the between level? If I am correct, I cannot enter the direct/fixed effect of x1 if I also have the random slope of x1 in it. Would it then be correct to create multiple models, one for the random slope (and thus for the interaction) and one for the fixed effect?

Bengt O. Muthen posted on Saturday, March 14, 2020 - 2:57 pm

From your first two sentences, I don't see why you need a cross-level interaction. I think you are saying that X1 influences the Within part of M and X2 influences the Between part of M. But you are not saying that X2 influences the influence of X1 on the Within part of M - that would have been a cross-level interaction that would be handled by having a random slope for the regression of the Within part of M on X1.
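
For contrast, a sketch of what the random-slope version described in the last sentence might look like, in case X2 is in fact meant to change the effect of X1 on M. Variable names are hypothetical, and only the X1 -> M slope is made random here.

```mplus
VARIABLE:
  NAMES = clus y m x1 x2;     ! hypothetical names
  USEVARIABLES = y m x1 x2;
  WITHIN = x1;
  BETWEEN = x2;
  CLUSTER = clus;
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
MODEL:
  %WITHIN%
  s | m ON x1;     ! slope of the Within part of m on x1, random across clusters
  y ON m x1;
  %BETWEEN%
  s ON x2;         ! x2 predicting the slope: the cross-level interaction
  m ON x2;         ! x2 influencing the Between part of m
  y ON m x2;
```

Without the s statements, the model reduces to the case described in the first part of the reply: X1 acting on the Within part of M and X2 on its Between part.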