MODELING WITH COMPLEX SURVEY DATA
There are two approaches to the analysis of complex survey data in Mplus. One approach is to compute standard errors and a chi-square test of model fit taking into account stratification, nonindependence of observations due to cluster sampling, and/or unequal probability of selection. Subpopulation analysis, replicate weights, and finite population correction are also available. With sampling weights, parameters are estimated by maximizing a weighted loglikelihood function. Standard error computations use a sandwich estimator. For this approach, observed outcome variables can be continuous, censored, binary, ordered categorical (ordinal), unordered categorical (nominal), counts, or combinations of these variable types.
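The weighted loglikelihood and sandwich computations described above can be illustrated with a minimal sketch for the simplest possible model, the mean of a normally distributed outcome. This is not the Mplus implementation: the function name and arguments are hypothetical, stratification, finite population correction, and small-sample adjustments are left out, and clusters are treated as the independent sampling units.

```python
import math
from collections import defaultdict


def weighted_mean_with_sandwich_se(y, w, cluster):
    """Weighted ML estimate of a normal mean with a cluster-robust SE.

    Hypothetical helper for illustration only (not an Mplus function).
    y, w, and cluster are equal-length sequences of outcomes, sampling
    weights, and cluster identifiers.
    """
    total_w = sum(w)

    # Maximizing sum_i w_i * log N(y_i | mu, sigma^2) over mu gives the
    # weighted mean, whatever the value of sigma^2.
    mu = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Weighted score contributions for mu, summed within each cluster.
    # The common factor 1 / sigma^2 is omitted because it cancels
    # between the bread and the meat of the sandwich.
    cluster_score = defaultdict(float)
    for yi, wi, ci in zip(y, w, cluster):
        cluster_score[ci] += wi * (yi - mu)

    bread = total_w  # negative second derivative of the weighted loglikelihood
    meat = sum(s * s for s in cluster_score.values())

    variance = meat / bread ** 2  # sandwich: bread^-1 * meat * bread^-1
    return mu, math.sqrt(variance)
```

Calling `weighted_mean_with_sandwich_se([1, 2, 3, 4], [1, 1, 2, 2], ["a", "a", "b", "b"])` returns the weighted mean 17/6 together with a standard error that reflects both the unequal weights and the grouping of observations into two clusters.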
A second approach is to specify a model for each level of the multilevel data thereby modeling the nonindependence of observations due to cluster sampling. This is commonly referred to as multilevel modeling. The use of sampling weights in the estimation of parameters, standard errors, and the chi-square test of model fit is allowed. Both individual-level and cluster-level weights can be used. With sampling weights, parameters are estimated by maximizing a weighted loglikelihood function. Standard error computations use a sandwich estimator. For this approach, observed outcome variables can be continuous, censored, binary, ordered categorical (ordinal), unordered categorical (nominal), counts, or combinations of these variable types.
The multilevel extension of the full modeling framework allows random intercepts and random slopes that vary across clusters in hierarchical data. These random effects can be specified for any of the relationships of the full Mplus model for both independent and dependent variables and both observed and latent variables. Random effects representing across-cluster variation in intercepts and slopes or individual differences in growth can be combined with factors measured by multiple indicators on both the individual and cluster levels. In line with SEM, regressions among random effects, among factors, and between random effects and factors are allowed.
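To make the idea of random intercepts and random slopes concrete, the following sketch generates two-level data in which each cluster draws its own intercept and slope from a normal distribution. It only illustrates the data structure such a model assumes and is not Mplus code. The function name, the parameter names, and all default values are arbitrary assumptions chosen for the example.

```python
import random


def simulate_two_level(n_clusters, cluster_size, seed=0,
                       mean_intercept=1.0, sd_intercept=0.5,
                       mean_slope=0.3, sd_slope=0.2, sd_residual=1.0):
    """Generate (cluster, x, y) rows with cluster-specific coefficients.

    Hypothetical helper for illustration; every default is arbitrary.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        # Cluster level: the intercept and the slope vary across clusters.
        intercept_j = mean_intercept + rng.gauss(0.0, sd_intercept)
        slope_j = mean_slope + rng.gauss(0.0, sd_slope)
        for _ in range(cluster_size):
            # Individual level: regression of y on x using cluster j's
            # own intercept and slope plus an individual residual.
            x = rng.gauss(0.0, 1.0)
            y = intercept_j + slope_j * x + rng.gauss(0.0, sd_residual)
            rows.append((j, x, y))
    return rows
```

Observations that share a cluster share an intercept and a slope, which is the source of the nonindependence that a multilevel model represents through random effects.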
The two approaches described above can be combined. In addition to specifying a model for each level of the multilevel data, thereby modeling the nonindependence of observations due to cluster sampling, standard errors and a chi-square test of model fit are computed taking into account stratification, nonindependence of observations due to cluster sampling, and/or unequal probability of selection. When there is clustering due to both primary and secondary sampling stages, the standard errors and chi-square test of model fit are computed taking into account the clustering due to the primary sampling stage, while the clustering due to the secondary sampling stage is modeled.
Most of the special features listed above are available for modeling of complex survey data.
