
Shige Song posted on Wednesday, June 18, 2003  3:01 am



Hi, Is it possible to estimate a two-level model with autoregressive errors at the second level using Mplus? Some other packages (HLM, MLwiN, aML) can estimate multilevel models with autoregressive errors at the first level but not at the second level, as described in DiPrete and Grusky (1990). Since Mplus is more flexible in handling covariance structures, maybe it can do a better job? Thanks! Best, Shige Song, Department of Sociology, UCLA
Reference: DiPrete, Thomas A., and David B. Grusky. 1990. "The Multilevel Analysis of Trends with Repeated Cross-Sectional Data." Sociological Methodology 20:337-368.

bmuthen posted on Wednesday, June 18, 2003  4:38 am



Yes, this is possible. In 3-level modeling of growth (which is handled as a 2-level model in Mplus), the time-specific residual variances for the outcomes are in fact assumed zero on the cluster level (level 3) in multilevel modeling, whereas they can be estimated in Mplus.

Shige Song posted on Wednesday, June 18, 2003  1:41 pm



Thanks for your reply! I did not find the terms "autoregressive error" or "autocorrelation" in the user manual. Could you tell me how to specify them in the Mplus language? Best, Shige


You would use the WITH option. To specify the residual covariance between y1 and y2, you would say: y1 WITH y2; 


If you want to impose a first-order autoregressive residual structure, you respecify the model so that the residual is expressed as a factor. This means that each observed outcome has a factor influencing it, with the loading fixed at 1 and the residual variance fixed at zero.
MODEL:
f1 BY y1@1; y1@0;
f2 BY y2@1; y2@0;
f3 BY y3@1; y3@0;
f3 ON f2 (1);
f2 ON f1 (1);
This gives a first-order autoregressive structure for the residuals, called f1, f2, and f3.
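To make explicit what this specification implies, here is a sketch of the equations behind it. It assumes every loading is fixed at 1 (including f3 BY y3@1) and that the innovations are mutually uncorrelated. The symbols \beta (the common coefficient held equal by the label (1)), \zeta_t (the innovation of f_t) and \psi_t (its variance) are notation introduced here for illustration and do not appear in the posts above.

```latex
\begin{aligned}
y_t &= f_t, \qquad t = 1, 2, 3,\\
f_2 &= \beta f_1 + \zeta_2, \qquad f_3 = \beta f_2 + \zeta_3,\\
\operatorname{Var}(f_1) &= \psi_1, \qquad \operatorname{Var}(\zeta_t) = \psi_t \quad (t = 2, 3),\\
\operatorname{Cov}(y_1, y_2) &= \beta\,\psi_1, \qquad
\operatorname{Cov}(y_1, y_3) = \beta^{2}\,\psi_1, \qquad
\operatorname{Cov}(y_2, y_3) = \beta\,\bigl(\beta^{2}\psi_1 + \psi_2\bigr).
\end{aligned}
```

Each additional lag multiplies the covariance with y_1 by \beta, which is the pattern one expects from a first-order autoregressive process.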

Shige Song posted on Friday, June 20, 2003  3:25 am



Dear Linda, Thanks for the response. I am fairly new to Mplus and am still reading through the User's Guide. In the meantime, maybe you can give me a quick answer that might otherwise take me days to figure out. Suppose I have 8 different surveys collected in one country at different time points, with each survey containing 1000 observations. I pooled them together and want to do a two-level analysis. At level 1 (individual) I have a dependent variable y and two independent variables x1, x2; at level 2 I have two independent variables z1, z2. All 5 variables are continuous. If I ignore the fact that the data set was created by stacking 8 cross-sectional data sets from the same country at different time points, this is a fairly straightforward 2-level model that can be estimated using any multilevel package such as HLM, MLwiN, or aML. The Mplus code probably looks like this:
...
VARIABLE: NAMES ARE y z1 z2 x1 x2 cohort;
CLUSTER IS cohort;
ANALYSIS: TYPE=TWOLEVEL;
MODEL:
%BETWEEN%
y ON z1 z2 x1 x2;
%WITHIN%
y ON z1 z2 x1 x2;
But now I want to take that fact into account by including a first-order autoregressive residual term at the second level. How do I incorporate the code you presented above into this specific problem? Specifically, what are the "y"s in your code: are they the "x"s or the "z"s in my question? Thanks a lot!
MODEL:
f1 BY y1@1; y1@0;
f2 BY y2@1; y2@0;
f3 BY y3@1; y3@0;
f3 ON f2 (1);
f2 ON f1 (1);


For clarification, when you say you have stacked your data, it appears that you have a cross-sectional design with 8 clusters and a total sample size of 8,000. Is this true?

Shige Song posted on Friday, June 20, 2003  11:53 am



That's right. I have 8 cross-sectional data sets (each with a sample size of 1000) and I pooled them into one data set, so the total sample size is 8000.

bmuthen posted on Friday, June 20, 2003  12:05 pm



Let me jump in and ask some more questions. So, you have only 8 clusters then? This seems very small for 2-level analysis. Also, I don't understand how you can have level 2 autocorrelation if level 2 is cohort and the cohorts are 8 independent samples. But perhaps that is explained in the Soc Meth article you referred to? Typically, it is assumed that the units at the highest level, here cohorts, are independently observed. In single-level models, however, I am aware of the use of autocorrelated errors and how that changes the likelihood. I haven't seen that in 2-level models. Mplus handles autocorrelated observations in a multivariate approach, but I don't see immediately how that plays in here.

Shige Song posted on Friday, June 20, 2003  12:35 pm



Hi Linda, 8 cohorts is just an example (a bad one, apparently) to simplify the question. In my real data, I have data from 30 countries, and 20-50 cohorts within each country (for now I want to put the cross-country comparison aside and focus on one country). The reason for a cohort-level autoregressive error, as described in the paper I cited, is that in each country the data were compiled by combining many cross-sectional data sets collected at different time points, and they are not completely independent samples in the sense that they are samples of the same population at different times! As you mentioned, level-1 autoregressive errors can be handled in other multilevel packages (HLM, MLwiN, aML), but they cannot handle autoregressive errors at the aggregate level. It was my hope that the flexibility of Mplus in handling variance/covariance structures could push things one step further. Even if it is not feasible in the current version, maybe it's something to think about for the next version? Thanks! Shige

bmuthen posted on Sunday, June 22, 2003  3:25 pm



I think I have to read the original Soc Meth article to be able to understand what you want to do and to be able to say how/if Mplus can be helpful here. One source of my confusion is your statement that your samples are not independent because they are samples from the same population at different times. I don't see how you get non-independent samples this way, unless the population is very small. Unfortunately, I don't have time right now to study the original article.


I am trying to replicate findings from a growth-curve (intercept and slope) multilevel model I have estimated in both HLM and the multilevel module in LISREL in Mplus 3. At this point, I am just interested in estimating intercepts and slopes with no covariates. In HLM terms, the model is a 3-level model with time (8 annual assessments with some missing values) nested in partner (only 2 per couple), which in turn is nested in couple. Fixing the variance of the slope at the partner level to 0, I can get the following syntax to run just fine in Mplus (using Example 9.12 as a model):
USEVARIABLES ARE sat1 sat2 sat3 sat4 sat5 sat6 sat7 sat8;
ANALYSIS: TYPE = TWOLEVEL MISSING;
MITERATIONS=5000;
MODEL:
%WITHIN%
iw sw | sat1@-7 sat2@-5 sat3@-3 sat4@-1 sat5@1 sat6@3 sat7@5 sat8@7;
sat1-sat8 (1);
sw@0;
%BETWEEN%
ib sb | sat1@-7 sat2@-5 sat3@-3 sat4@-1 sat5@1 sat6@3 sat7@5 sat8@7;
sat1-sat8@0;
My problem is that I want to explore different error covariance structures at "level 1" involving sat1-sat8. I have tried simply adding WITH statements after the last model entry without success. For example, just adding "sat1 WITH sat2" to the above syntax to get the correlated error for sat1 and sat2 yields the following error message:
THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. PROBLEM INVOLVING VARIABLE SAT8.
THE CORRELATION BETWEEN SAT2 AND SAT1 IS 0.996
THE CORRELATION BETWEEN SAT3 AND SAT2 IS 0.994
THE CORRELATION BETWEEN SAT4 AND SAT3 IS 0.991
THE RESIDUAL CORRELATION BETWEEN SAT2 AND SAT1 IS 3.150
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Is using WITH statements the proper way to model the level 1 error covariance matrix? Do I need to change statements for the WITHIN and BETWEEN portions?


I am interested in using intercepts and slopes from 3-level (time, partner, couple) growth curves from MULTIPLE variables to predict a categorical (dichotomous) outcome. More specifically, I want to see whether average values and linear change over 8 annual assessments for satisfaction, commitment, and investment (6 predictors) contribute unique information regarding eventual divorce (no, yes). Usually, I run this kind of analysis in 2 rather tedious steps. In the first step, I run the growth curves for each outcome in a multilevel program and then output the estimates so I can create a new data set from them. Then I merge the files from the separate outcomes and run a logistic (or multinomial) regression. Can I run this problem in one step in Mplus 3? Example 6.13 (2 parallel processes) comes close to what I am looking for, but it is not set in a multilevel context.

bmuthen posted on Tuesday, April 13, 2004  10:26 am



Larry, here is an answer to your first question above about the correlated residuals on level 1. Using a WITH statement is correct. I wonder if you are incorrectly doing this on the Between level instead of the Within level (the Within level is level 1). The error message concerns the between-level covariance matrix and says that you have a problem with residual covariances. You should not have any such covariances on between since you have zero residual variances there.
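A minimal sketch of where the statement would go, based on the model posted above. It assumes the time scores run from -7 to 7 in steps of 2 and adds only the single residual covariance between sat1 and sat2 mentioned in the question; it is meant to show placement, not a recommended error structure.

```mplus
MODEL:
  %WITHIN%
  iw sw | sat1@-7 sat2@-5 sat3@-3 sat4@-1 sat5@1 sat6@3 sat7@5 sat8@7;
  sat1-sat8 (1);      ! residual variances held equal across time
  sw@0;               ! within-level slope variance fixed at zero, as in the post
  sat1 WITH sat2;     ! level 1 residual covariance goes in the WITHIN part
  %BETWEEN%
  ib sb | sat1@-7 sat2@-5 sat3@-3 sat4@-1 sat5@1 sat6@3 sat7@5 sat8@7;
  sat1-sat8@0;        ! residual variances are zero here, so no WITH statements
```

Because MODEL statements belong to whichever level heading precedes them, a WITH statement appended after the last line of the original input would fall under %BETWEEN%, which matches the error about the between covariance matrix.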


Yes, that is exactly the error I made. It also seems to be the case that if I model the error covariance matrix at level 1, I need to set the variances for BOTH the within intercept and the within slope to 0. Is that correct? (I assume so, because the results I got were an exact match to the results I got from a parallel run in LISREL). 

bmuthen posted on Tuesday, April 13, 2004  12:15 pm



Yes, you can do this in a single analysis step in Mplus Version 3. Version 3 allows two-level analysis of growth with a dichotomous distal outcome where you can have random effects that vary across the between-level units.

bmuthen posted on Tuesday, April 13, 2004  12:19 pm



Regarding your question about having to set the variances to 0 for both the within intercept and slope: no, you can still estimate these variances, and you would typically want them free.


Does the Version 3 manual have an example of a two-level analysis of growth with a dichotomous distal outcome where you can have random effects that vary across the between-level units? I can run the growth curve analyses for the multiple outcomes in one analysis, but can't see any examples of how to link the intercepts and slopes from these analyses to a separate and common categorical outcome.

bmuthen posted on Tuesday, April 13, 2004  6:08 pm



Not an explicit example, but you get it if you piece together examples from different chapters. We couldn't fit all the combinations... Here's the idea of how you do it. You specify your iw, sw and ib, sb growth factors on within and between. You also have a distal categorical outcome, say a binary u, and you have within-level covariates x and between-level covariates w. So regarding u, on within you say "u on x;" or "u on iw sw w;", and on between you say "u on ib sb w;", where on between, u is the random intercept (a continuous latent variable) in the logistic regression of u on x etc.

bmuthen posted on Tuesday, April 13, 2004  6:08 pm



I meant to say u on iw sw x; 


Thank you for your suggestions. I tried running the following syntax to predict a dichotomous outcome (sep) by an intercept and slope. I assumed I needed to declare 'sep' as a categorical variable:
TITLE: Trial run
DATA: FILE IS C:\DATA\MPLUS\BARRIERS\gl252.txt;
FORMAT IS 24F3/16F3,5F2,F4;
VARIABLE: NAMES ARE sat1 sat2 sat3 sat4 sat5 sat6 sat7 sat8 alt1 alt2 alt3 alt4 alt5 alt6 alt7 alt8 inv1 inv2 inv3 inv4 inv5 inv6 inv7 inv8 bar1 bar2 bar3 bar4 bar5 bar6 bar7 bar8 cmt1 cmt2 cmt3 cmt4 cmt5 cmt6 cmt7 cmt8 sep wth drp lng fu cpl;
CATEGORICAL=sep;
WITHIN=;
BETWEEN=;
CLUSTER=cpl;
MISSING ARE ALL (9);
USEVARIABLES ARE sat1-sat8 sep;
ANALYSIS: TYPE = TWOLEVEL MISSING;
MODEL:
%WITHIN%
iw sw | sat1@-7 sat2@-5 sat3@-3 sat4@-1 sat5@1 sat6@3 sat7@5 sat8@7;
sat1-sat8 (1);
sep ON iw sw;
%BETWEEN%
ib sb | sat1@-7 sat2@-5 sat3@-3 sat4@-1 sat5@1 sat6@3 sat7@5 sat8@7;
sat1-sat8@0;
sep ON ib sb;
and received the following fatal error message:
*** FATAL ERROR THERE IS NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT INPUT FILE. YOU CAN TRY TO FREE UP SOME MEMORY BY CLOSING OTHER APPLICATIONS THAT ARE CURRENTLY RUNNING. ANOTHER SUGGESTION IS CLEANING UP YOUR HARD DRIVE BY DELETING UNNECESSARY FILES.
I have ample memory and hard drive space, so the error does not make sense to me. Thanks!

bmuthen posted on Thursday, April 15, 2004  11:37 am



When you do this type of analysis, numerical integration is required for maximum-likelihood estimation. You will see this in the screen output if you request Tech8 and in the printed output (for successful runs). Your example uses 4 or 5 dimensions of integration, which is very high and means that the default settings use a very large number of integration points. The memory requirement and computational time go up essentially as a function of the product of integration points and sample size. So even having a lot of RAM won't always be sufficient, and even if it were, the run could take an extremely long time. All this is described in the User's Guide section on numerical integration. To get your analysis going, you can reduce the number of integration points per dimension to, say, 10 or 7, or you can simplify the model, at least as a first step. You want to build up your model in small steps, starting with simple parts of it. For example, you can first do the model without the distal outcome. Perhaps the sb variance is ignorable, which would reduce the integration dimensionality by 1. Building up the model like this also gives you good starting values for the final run, which then goes faster. Let me know how it goes and don't hesitate to send input, output, and data to support@statmodel.com.
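As a rough illustration of why the number of points matters: with a fixed integration grid, the total number of grid points is the number of points per dimension raised to the number of dimensions. Assuming a default of 15 points per dimension (an assumption here; check the numerical integration section of the User's Guide for your version), 4 dimensions give 15^4 = 50,625 grid points and 5 dimensions give 15^5 = 759,375, whereas 7 points per dimension give 7^4 = 2,401 and 7^5 = 16,807. A sketch of how the reduction might be requested, using the INTEGRATION option of the ANALYSIS command with 7 as one of the values suggested above:

```mplus
ANALYSIS:
  TYPE = TWOLEVEL MISSING;
  INTEGRATION = 7;    ! integration points per dimension, reduced from the default
```

Dropping one dimension, for example by fixing the sb variance at zero as suggested above, divides the grid size by the number of points per dimension again.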

bmuthen posted on Saturday, April 17, 2004  11:36 am



Larry, can you please send me your input and data for the run above where you ran out of memory so I can check where this happens?

Zoogah posted on Sunday, April 25, 2004  11:55 am



Are there student rates for Mplus? I am new to Mplus and would want to get it because of its ability to model categorical variables. 

zoogah posted on Sunday, April 25, 2004  11:59 am



bmuthen, I have a model that has continuous indicators on a categorical latent variable (1). The latter in turn relates to another latent variable (2) that is continuous and has continuous indicators. Together, the model involves LPA and LCA. Can I analyze the model at once? I believe I need to do the analysis separately (LPA and LCA). Can you help me? Thanks

bmuthen posted on Sunday, April 25, 2004  12:05 pm



This analysis can be done in a single step in Mplus Version 3. For information about student pricing, see the top of the Mplus home page. There is also a free demo version, as described on the web site.

Anonymous posted on Wednesday, February 23, 2005  8:46 am



Dear Dr. Muthen, I'm trying to fit the following growth curve model with binary outcomes:
DATA: FILE IS "C:\Data\data123.txt";
VARIABLE: NAMES = SCHOOL x1 x2 x3 y1 y2 y3;
USEV = SCHOOL x1 x2 x3 y1 y2 y3;
CATEGORICAL ARE y1 y2 y3;
MISSING ARE ALL (999);
WITHIN = x1 x2 x3;
CLUSTER = SCHOOL;
ANALYSIS: TYPE = TWOLEVEL MISSING H1 RANDOM;
MODEL:
%WITHIN%
iw sw | y1@0 y2@1 y3@2;
iw sw ON x1 x2 x3;
%BETWEEN%
ib sb | y1@0 y2@1 y3@2;
y1-y3@0;
OUTPUT: TECH1 SAMPSTAT CINTERVAL;
Here's the error message I get:
*** FATAL ERROR THERE IS NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT INPUT FILE. YOU CAN TRY TO FREE UP SOME MEMORY BY CLOSING OTHER APPLICATIONS THAT ARE CURRENTLY RUNNING. ANOTHER SUGGESTION IS CLEANING UP YOUR HARD DRIVE BY DELETING UNNECESSARY FILES.
Could you point out why this happens even though I have enough memory and disk space? Thanks.

Thuy Nguyen posted on Wednesday, February 23, 2005  11:29 am



This model requires numerical integration which can be computationally heavy. There is a section in Chapter 13 of the Mplus User's Guide that discusses numerical integration and suggestions for using numerical integration. If you still have problems, please send your input and data to support@statmodel.com. 

bmuthen posted on Sunday, February 27, 2005  11:24 am



In the current version of Mplus you will be informed about the number of dimensions of integration. I think it is 4 in your case, which leads to heavy computations and can cause memory shortage with large sample sizes. You can try integration = montecarlo instead to reduce the computations. 
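A minimal sketch of how that suggestion might look in the ANALYSIS command of the input posted above. The TYPE line is copied from that input; the rest of the input is assumed unchanged.

```mplus
ANALYSIS:
  TYPE = TWOLEVEL MISSING H1 RANDOM;
  INTEGRATION = MONTECARLO;   ! Monte Carlo integration instead of a fixed grid
```

Monte Carlo integration uses randomly generated integration points rather than a full grid over all dimensions, so the results can vary slightly from run to run.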
