

Dear Dr. Muthen, I work with a large sample of twins, and I am using the clustered data option (TYPE=complex) to adjust for nonindependence. I am wondering if I should use TYPE=TWOLEVEL instead. What would be the difference? Is one approach better than the other? My general question is: In which cases do you use TYPE=COMPLEX and in which cases do you use TYPE=TWOLEVEL? Thanks in advance. 


This is described in the introduction to Chapter 9 of the Mplus User's Guide. 


Thank you for your reply. I had already read that chapter, but unfortunately I am still not able to decide when I should use one approach versus the other. What would you recommend in the case of twins, and what would you recommend in the case of siblings and parents (so three family members)? Or does the choice depend on the specific research question, interests, and analyses (I perform GMM), and not so much on the design of the study? I read that the multilevel approach allows random intercepts and random slopes that vary across clusters. Does the cluster approach not do that? Thank you in advance for your time. 


TYPE=COMPLEX computes standard errors and a chi-square test of model fit taking into account stratification, nonindependence of observations due to cluster sampling, and/or unequal probability of selection. TYPE=TWOLEVEL specifies a model for each level of the multilevel data, thereby modeling the nonindependence of observations due to cluster sampling. So it depends on what you want. Do you want only to correct the standard errors and chi-square, or do you want a model for the between level? Your research questions should guide you. 
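To make the contrast concrete, here is a minimal sketch of the two setups for the same regression. The file name twins.dat and the variable names famid, y, and x are hypothetical, and the two inputs would be run as separate files; this is an illustration under those assumptions, not a recommended analysis.

```mplus
! Input 1 (hypothetical): one model for all observations;
! standard errors and chi-square are adjusted for clustering in families.
DATA:     FILE = twins.dat;        ! hypothetical file name
VARIABLE: NAMES = famid y x;       ! hypothetical variable names
          USEVARIABLES = y x;
          CLUSTER = famid;
ANALYSIS: TYPE = COMPLEX;
MODEL:    y ON x;

! Input 2 (hypothetical, separate file): a model for each level;
! the family-level variation in y is itself a model parameter.
DATA:     FILE = twins.dat;
VARIABLE: NAMES = famid y x;
          USEVARIABLES = y x;
          CLUSTER = famid;
          WITHIN = x;              ! x is modeled on the within level only
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
          %WITHIN%
          y ON x;
          %BETWEEN%
          y;                       ! between-family variance of y
```

In the first input there is nothing to interpret at the family level; in the second, the %BETWEEN% part can be extended with family-level predictors or factors if the research question calls for them.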

Student 09 posted on Monday, March 02, 2009  7:34 am



A colleague of mine claims that TYPE=COMPLEX is not appropriate if the aim is to examine the effect of a between-level variable on a dependent variable measured at the individual level. In his view, the TYPE=COMPLEX framework does not take into account that in a two-level data structure there are always fewer observations at the higher level than at the lower level of analysis. His example: if there are n1 = 1000 pupils nested in n2 = 50 schools and one examines the effect of "school denomination" as a level-2 variable on "math achievement" (measured at level 1), TYPE=COMPLEX would not be able to take into account that "school denomination" refers to 50 (level-2) cases, and not to 1000 pupils. Is this right or wrong? Are there any references available explaining the logic of the Mplus robust s.e. in the TYPE=COMPLEX framework? Thanks a lot! 


I believe that the standard errors are correct with TYPE=COMPLEX when some variables are measured on the cluster level and some on the individual level. To be certain, however, you would need to do a Monte Carlo simulation and see if you obtain the correct standard errors. 

Student 09 posted on Monday, March 02, 2009  11:29 am



Dear Dr. Muthen, I like the idea of an MC simulation, but maybe it would be sufficient if the formula for the TYPE=COMPLEX (using MLR) robust s.e.'s were available, just to see whether the s.e.'s take into account the different N's of level-2 vs. level-1 variables? Best wishes, Jan 


See: Asparouhov, T. (2005). Sampling weights in latent variable modeling. Structural Equation Modeling, 12, 411-434. Equation (5) sums over c in the middle of the sandwich, that is, over the number of clusters. This means that independence is assumed only for the C clusters, not all N observations. 
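For orientation, the generic cluster-robust sandwich form can be written as below. This is a simplified sketch that leaves out sampling weights, stratification, and any finite-sample correction; it is not a transcription of Equation (5), and the symbols A, B, s_c, and \ell_{ci} are notation chosen here for illustration.

```latex
\widehat{\operatorname{Var}}(\hat\theta) = A^{-1} B A^{-1},
\qquad
A = -\sum_{c=1}^{C} \sum_{i=1}^{n_c}
      \frac{\partial^2 \ell_{ci}}{\partial\theta\,\partial\theta^{\top}},
\qquad
B = \sum_{c=1}^{C} s_c s_c^{\top},
\qquad
s_c = \sum_{i=1}^{n_c} \frac{\partial \ell_{ci}}{\partial\theta}
```

Here \ell_{ci} is the log-likelihood contribution of observation i in cluster c. The scores are first added up within each cluster, and only the C cluster totals s_c enter the middle matrix B. In the school example above, B would therefore be built from 50 terms, not 1000.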


Dear Dr. Muthen, I am currently looking into the random-intercept cross-lagged panel model (RI-CLPM) (Hamaker, Kuiper, & Grasman, 2015), which makes it possible to separate within- and between-person variance. As I read above, the difference between TYPE=COMPLEX and TYPE=TWOLEVEL is that in the latter you can specify a between-person model. Now my question is whether the cluster approach would also be appropriate in the context of the RI-CLPM, since it allows one to disentangle within- and between-person variance. Many thanks in advance. 


Yes, I think so. 


Dear Dr. Muthen, sorry for the follow-up question, but to be certain: is it sufficient to cluster the data without specifying a between model to capture the essence of the RI-CLPM? Especially if you're interested in the within-person level, this makes the model easier to handle. Many thanks in advance. 


Will get back to you about this before too long. Also, have you seen the new Child Development article by Berry and Willoughby on cross-lagged modeling? 


Let me change my answer to say that you don't use TYPE=TWOLEVEL or TYPE=COMPLEX for the RI-CLPM. It should be done as a single-level, wide-format model as shown on the right-hand side of Figure 1 in Hamaker et al. The model already has within (time) and between (person) level features, where the random effects kappa and omega for the two outcomes represent the between-person variation. 


So if I understand correctly, the model statement in the Hamaker figure would look something like this, with everything on the same level:
x1 by p1
x2 by p2
x3 by p3
y1 by q1
y2 by q2
y3 by q3
Kappa by x1@1 x2@1 x3@1
omega by y1@1 y2@1 y3@1
p2 on p1 q1
p3 on p2 q2
q3 on q2 p2
q2 on q1 p1
q1 with p1
q2 with p2
q3 with p3


You should say p1 by x1; and likewise for the others, and also x1@0; and so on for the other observed variables. 
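Putting the two corrections together, the model might look like the sketch below. It reuses the names from the post above (kappa, omega, p1-p3, q1-q3); the file name riclpm.dat and the column order are hypothetical. The last statement, which fixes the covariances between the random intercepts and the wave-1 within-person factors at zero, is an assumption added here based on the usual RI-CLPM specification; it was not stated in the thread.

```mplus
TITLE:    RI-CLPM sketch, three waves, single-level wide format;
DATA:     FILE = riclpm.dat;       ! hypothetical file name
VARIABLE: NAMES = x1-x3 y1-y3;     ! assumed column order
MODEL:
  ! Random intercepts (between-person part), loadings fixed at 1
  kappa BY x1@1 x2@1 x3@1;
  omega BY y1@1 y2@1 y3@1;

  ! Within-person components: the factor is named first (p1 BY x1)
  p1 BY x1@1;  p2 BY x2@1;  p3 BY x3@1;
  q1 BY y1@1;  q2 BY y2@1;  q3 BY y3@1;

  ! Residual variances of the observed variables fixed at zero
  x1-x3@0;  y1-y3@0;

  ! Autoregressive and cross-lagged effects
  p2 q2 ON p1 q1;
  p3 q3 ON p2 q2;

  ! Within-wave (residual) covariances
  p1 WITH q1;  p2 WITH q2;  p3 WITH q3;

  ! Assumption: random intercepts unrelated to wave-1 within factors
  kappa omega WITH p1@0 q1@0;
```

With the observed residual variances at zero, each observed score is split entirely into a stable person-level part (kappa or omega) and a wave-specific within-person part (p or q), so no CLUSTER variable or TYPE option is needed.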


Dear Dr. Muthen, would it be OK to construct the random intercept factor as a second-order latent variable instead of constructing it directly from the manifest variables? Many thanks in advance. 


Yes, if you set it up so that you get the same model (same number of parameters and same model fit). 
