

Level 2 random intercept in latent re... 



Dear Mplus experts,

For the latent regression below, may I have some advice about using a random intercept?

A latent trait (eta) is measured by 15 binary responses (u_1-u_15). We test a centre effect on the latent trait, controlling for individual-level covariates (x_1-x_6).

H0: SEM with no clustering.
H1: centre has a random intercept in a 2-level model.

Measurement part = cluster-invariant 1PL IRT. For item m, individual i, centre j:

P(u_{ijm} = 1) = F(eta_{ij} - tau_m)

where tau_m = item threshold and F = logistic.

Structural part:

eta_{ij} = beta_{0j} + beta_1 x_{ij;1} + ... + beta_6 x_{ij;6} + epsilon_{ij}

H0: beta_{0j} = beta_0
H1: beta_{0j} = beta_0 + r_j

Mplus input for H1 (cf. the example on p. 150 of the Topic 7 short course):

VARIABLE: CLUSTER = centre;
          WITHIN = x_1-x_6;
          CATEGORICAL = u_1-u_15;
ANALYSIS: ESTIMATOR = MLR;
MODEL:    %WITHIN%
          eta BY u_1-u_15@1;   ! 1PL: loadings = 1
          eta ON x_1-x_6;
          %BETWEEN%
          etab BY u_1-u_15@1;

H0: the same input without CLUSTER, WITHIN, the %WITHIN% keyword, and the %BETWEEN% section.

1) Is the syntax correct?
2) On level 2, the Mplus output gives the intercept's variance and the 15 thresholds, but not the intercept's mean beta_0. Why?

Thanks very much for any help!
Mo
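For readers who want to see the model in the post above as a data-generating process, here is a minimal Python sketch. It is not part of the original thread. The function name `simulate`, the standard-normal covariates and residual, and all default parameter values are assumptions made for illustration; only the two model equations come from the post.

```python
import math
import random


def logistic(z):
    """Logistic link F used in the 1PL measurement model."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n_centres, n_per_centre, taus, betas, beta0=0.0, sd_r=0.5, seed=1):
    """Draw binary responses under H1: beta_0j = beta_0 + r_j.

    taus  : item thresholds tau_m (one per item)
    betas : slopes beta_1..beta_K for the individual-level covariates
    sd_r  : SD of the centre random intercept r_j (sd_r = 0 gives H0)
    Covariates and the residual epsilon_ij are standard normal here,
    which is an illustrative assumption, not something stated in the post.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_centres):
        r_j = rng.gauss(0.0, sd_r)  # centre-specific shift of the intercept
        for _ in range(n_per_centre):
            x = [rng.gauss(0.0, 1.0) for _ in betas]
            eps = rng.gauss(0.0, 1.0)
            # Structural part: eta_ij = beta_0j + sum_k beta_k * x_ijk + eps_ij
            eta = beta0 + r_j + sum(b * xk for b, xk in zip(betas, x)) + eps
            # Measurement part: P(u_ijm = 1) = F(eta_ij - tau_m)
            u = [1 if rng.random() < logistic(eta - tau) else 0 for tau in taus]
            rows.append({"centre": j, "x": x, "u": u})
    return rows
```

Calling `simulate(9, 90, [0.0] * 15, [0.1] * 6)` gives a data set of roughly the size discussed later in the thread (9 centres, about 800 individuals); setting `sd_r=0` removes the centre effect.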


1. It looks correct. The best check is whether you get the results you are looking for.

2. In a cross-sectional model, factor means are not identified when thresholds are free.
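The identification point in the reply above can be checked numerically: in the 1PL model the response probability depends only on the difference eta - tau_m, so adding the same constant to the factor mean and to every threshold leaves all probabilities unchanged. The sketch below is an illustration only; the threshold values, the value of eta, and the shift are arbitrary assumptions.

```python
import math


def p_correct(eta, tau):
    """1PL response probability P(u = 1) = F(eta - tau), F logistic."""
    return 1.0 / (1.0 + math.exp(-(eta - tau)))


taus = [-1.0, 0.0, 0.8]  # arbitrary illustrative thresholds
eta = 0.3                # arbitrary latent trait value
shift = 0.5              # arbitrary constant added to mean and thresholds

original = [p_correct(eta, t) for t in taus]
shifted = [p_correct(eta + shift, t + shift) for t in taus]

# The data cannot distinguish the two parameterisations.
same = all(math.isclose(a, b, rel_tol=1e-12) for a, b in zip(original, shifted))
```

Because the two parameterisations fit identically, one of them has to be pinned down by a constraint; with the thresholds left free, the factor mean is the quantity that is not separately estimated.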


Dear Dr Muthen,

Thank you very much for answering my inquiry so quickly.

A. Regarding 2), I have followed the example of your short course, but it is unclear to me why the unique random intercept (beta_0j) on level 2 is modeled as 15 thresholds (etab BY u1-u15). Since the measurement part is considered to be invariant across clusters, why should the thresholds (tau_m) be modeled on level 2? Curiously, under H1, the estimated 15 thresholds on level 2 are (almost) identical to the thresholds under H0, with an (almost) constant shift (0.363 to 0.367). Has the cluster-specific random intercept been somehow integrated into the cluster-invariant thresholds?

B. Regarding 1), yes, the results are consistent with what I expected, and suggest a clustering effect. Am I right to use log-likelihood difference testing with the scaling correction factor as described in your technical appendix?

C. However, I get a warning (NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX... THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS...). There are only 9 clusters, totaling 800 individuals, and 22 parameters. A multiple group analysis with the KNOWNCLASS option ran into other numerical problems (message ONE OR MORE MULTINOMIAL LOGIT PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX). How should I address this?


A. Saying etab BY u1-u15; does not mean that the random intercept is modeled as 15 thresholds. It means that the covariation among the u1-u15 random intercepts is explained by etab. The etab parameter is its variance. The thresholds end up on level 2 in line with multilevel modeling, where all means appear on level 2, not level 1. The thresholds are indeed cluster invariant. Means/thresholds are often not different in single-level and two-level analyses.

B. This is a tricky topic; see "Likelihood ratio tests in linear mixed models with one variance component", Crainiceanu & Ruppert (2004), JRSSB 66, Part 1, pp. 165-185. To avoid this, I would simply report your two-level results.

C. 9 clusters is fewer than we recommend. At least 20 are typically needed for good SEs and variance estimation. You should also check that you don't have more between-level parameters than clusters. An alternative is to create 8 dummy variables and use these as covariates in a single-level analysis, that is, to change from random to fixed mode for the clusters.
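The dummy-variable alternative in point C above (9 centres, hence 8 dummies with one centre as the reference) can be prepared outside Mplus before the data file is written. The following Python sketch is one way to do that; the function name `dummy_code`, the `d_` column prefix, and the choice of the lowest centre code as the default reference are all assumptions made for illustration.

```python
def dummy_code(centres, reference=None):
    """Turn a list of centre codes into K-1 indicator columns.

    centres   : one centre code per individual
    reference : centre left out as the baseline (defaults to the lowest code)
    Returns (column names, rows of 0/1 indicators).
    """
    levels = sorted(set(centres))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError("reference centre does not occur in the data")
    others = [lv for lv in levels if lv != reference]
    names = [f"d_{lv}" for lv in others]  # hypothetical naming scheme
    rows = [[1 if c == lv else 0 for lv in others] for c in centres]
    return names, rows


# Nine centres give eight dummy columns; the reference centre is all zeros.
names, rows = dummy_code([1, 2, 3, 4, 5, 6, 7, 8, 9, 3, 1])
```

Each dummy coefficient is then the difference in the latent trait intercept between that centre and the reference centre, so the choice of reference changes the individual estimates but not the overall fit.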


Dear Professor Muthen,

Thank you very much for your prompt and helpful answer. It's really great to have such support from both of you and your team!

A(2). This clears up the subject for me, which leads to C(2a).

C(2a). Maybe it is a very basic question: what, then, are the free parameters at level 2? In the modeling equations of my initial post, the only unknown parameter at level 2 is the variance (+/- mean) of the cluster-specific intercept beta_0j. However, in the Mplus input, the level 2 parameters also include the 15 thresholds (and exceed the number of clusters). Was there a mistake in my equations?

B(2). Thank you for the reference. The technical details are a bit hard for me, but I understand that the LRT may suffer from serious bias in this multilevel setting.

C(2b). Thank you for suggesting the alternative of having centre as a nominal, dummy-coded covariate. It is well suited to our problem, since cross-group variability is on the intercept only. It works fine (results are similar to the ones I obtained with the multilevel approach). Am I right to use log-likelihood difference testing in this setting to infer a centre effect?

A linked question: for this latent regression with binary outcomes, should ML or MLR estimation be preferred? If I use MLR, does this setting lend itself to the scaling correction factor technique?

Many thanks again for your invaluable time and help.


A(2). The between-level parameters are the between-level factor variance and the thresholds.

B(2). It is a tough topic. For an overview, you may also want to read Chapter 3 of the second edition of Joop Hox's Multilevel book.

C(2b). The center effect is simply seen as the Z test for each center dummy. I don't think you will see much difference between ML and MLR for this example, but I would use MLR, and it works with the scaling correction.
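The two tests mentioned in C(2b) above reduce to short arithmetic on numbers read from the output. The Python sketch below is an illustration, not part of the thread. It assumes the commonly used scaled difference formula for MLR, cd = (p0*c0 - p1*c1)/(p0 - p1) and TRd = -2(L0 - L1)/cd, where L, c and p are the log-likelihood, scaling correction factor and number of free parameters of the nested (0) and comparison (1) models. Whether this matches the technical appendix the poster refers to is an assumption. Function names and the example values are hypothetical.

```python
import math


def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled log-likelihood difference test for nested MLR models.

    l0, c0, p0 : log-likelihood, scaling correction factor, number of
                 free parameters of the nested (more restricted) model
    l1, c1, p1 : the same quantities for the comparison model
    Returns (TRd, cd, df); TRd is referred to chi-square with df = p1 - p0.
    """
    if p1 <= p0:
        raise ValueError("the comparison model must have more parameters")
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)  # difference-test scaling correction
    trd = -2.0 * (l0 - l1) / cd
    return trd, cd, p1 - p0


def two_sided_p(estimate, se):
    """Z statistic and two-sided normal p-value for one coefficient."""
    z = estimate / se
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical values, for illustration only (not results from the thread).
trd, cd, df = scaled_loglik_difference(-2606.0, 1.450, 39, -2583.0, 1.546, 47)
z, p = two_sided_p(0.98, 0.5)
```

Here the nested model would be the single-level model without the centre dummies and the comparison model the one with all 8 dummies, so df = 8. This is a test of fixed effects; it does not carry over to testing the random-intercept variance in the two-level model, which is the case the Crainiceanu and Ruppert reference addresses.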


