I am running a multilevel regression analysis using Mplus (with random intercepts and slopes).
In my model, I've entered three continuous level 2 independent variables, two dummy-coded level 1 independent variables and a continuous dependent variable.
My level 2 sample size is only 10 (level 1 sample size is 300). I am using the default estimator, MLR.
The model runs and I get the corresponding output. However, I also get the following error message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.130D-18. PROBLEM INVOLVING PARAMETER 11.
THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS.
To what extent (if at all) can I trust my results (If I counted correctly, the number of estimated parameters is 12)?
Thanks for your help!
Boliang Guo posted on Tuesday, October 18, 2005 - 1:43 am
See Raudenbush's HLM book: the minimum number of level 2 units is 30, although he conducted a meta-analysis with 19 studies. The number of level 2 units in your study is really small, and you were also unlucky not to get a clean result. Could you model a complex OLS equation with only 10 cases? If you have no strong theoretical background for the variance between level 2 units, then try OLS.
Dear Dr. Muthen, I am trying to run a multilevel logistic regression analysis, with the ML estimator, to identify variables that significantly affect the type of school a child attends. At the micro level I have child characteristics and at the macro level I have family characteristics. I have the following questions about the analysis:
1) When I run the null model I get intercept variance at the macro-level but I do not get intercept coefficient. How can I get it? Is threshold the intercept coefficient?
2) Is the fact that my model has only random intercept and no random slope good enough justification (taking into account hierarchical data) for multilevel modelling?
3) Can I use var/s.e. to test whether the intercept in the null model is random? Or will I have to do some other test, such as fixing the variance of the intercept to zero and doing a likelihood ratio test? Is estimate/s.e. for covariates, variances and covariances a z-score or a t-value?
Many thanks Joanna p.s. I hope you had great vacation.
One last question that I could not post due to the post limit.
To test model fit I am using the loglikelihood ratio test. I introduce a covariate in the model and record the loglikelihood value; then, in a second step, I fix the coefficient of that covariate to zero and record the second loglikelihood value. Then I multiply the difference (second - first) of the two recorded values by -2 and do a chi-square test. Is this correct? Or is there another way of checking model fit?
For the LRT, take 2 times the difference in the H0 Value (the loglikelihood) from the model fit information, and take the difference in the number of model parameters as the df. For example, assume the first model has H0 = 5237.921 with 16 parameters and the second model has H0 = 5236.561 with 20 parameters. Then 5237.921 - 5236.561 = 1.36, and 1.36 * 2 = 2.72 with df = 4. The critical value of chi-square with df = 4 at the .05 level is 9.488. Since 2.72 < 9.488 (p > .05), the second model is not a significant improvement over the first model.
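The arithmetic in the example above can be sketched in Python as follows. This is only an illustration: the function names are made up, the H0 values are used exactly as quoted in the post (so the absolute difference is taken), and the chi-square tail probability is computed from standard closed-form series for integer df. Note that with the MLR estimator the difference needs a scaling correction, as described on the statmodel.com/chidiff.shtml page mentioned later in this thread.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def lrt(h0_restricted, h0_full, n_par_restricted, n_par_full):
    """Unscaled likelihood-ratio test from two H0 values and parameter counts."""
    stat = 2.0 * abs(h0_full - h0_restricted)
    df = n_par_full - n_par_restricted
    return stat, df, chi2_sf(stat, df)

# Values quoted in the post above
stat, df, p = lrt(5237.921, 5236.561, 16, 20)
print(round(stat, 2), df, round(p, 3))
```

The p-value is well above .05, which agrees with comparing 2.72 to the critical value 9.488.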
If a post does not fit in the space provided in one window, it is too long for Mplus Discussion. Please do not double post in the future.
1. Yes. 2. Yes. 3. Yes as an approximation. Because you are testing a variance against zero which is on the border of admissible values, this test may have problems. There is a large literature on this topic.
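One adjustment that is often cited in that literature, offered here as a sketch and not as the recommendation of the reply above: when a single variance is tested against zero, the likelihood-ratio statistic is commonly referred to a 50:50 mixture of a chi-square with 0 df and a chi-square with 1 df, which amounts to halving the usual df = 1 p-value. The function name and the example statistic are illustrative assumptions.

```python
import math

def boundary_lrt_pvalue(lr_stat):
    """P-value for testing one variance = 0 using a 50:50 mix of chi2(0) and chi2(1)."""
    if lr_stat <= 0:
        return 1.0
    # P(chi-square with 1 df > lr_stat), written with the complementary error function
    p_chi2_1 = math.erfc(math.sqrt(lr_stat / 2.0))
    return 0.5 * p_chi2_1

# Hypothetical likelihood-ratio statistic, chosen only for illustration
p_mixture = boundary_lrt_pvalue(2.706)
print(round(p_mixture, 3))
```

Whether this mixture applies depends on the model; with more than one variance or covariance under test, the reference distribution is more complicated.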
I think that when you compare two nested models, one where the covariate coefficients are fixed to zero and one where they are free, the test describes the significance of the covariates. I don't think it says that the model with covariates is better.
Dear Linda, Regarding the intercept coefficient in multilevel regression: the UCLA website presents the examples from Snijders and Bosker, chapter 14, using Mplus. In all the examples, the intercept coefficients reported by S & B appear as the threshold for the dependent variable at the between level, but with a negative sign. Could you please clarify whether the intercept coefficient is the threshold or the threshold with a (-) sign? Also, when this intercept coefficient for the null model is converted to a probability, will it be the same as the proportion of 1s in the dependent variable? Many thanks and sorry about the previous post. Joanna
Dear Linda, Following what you told me, I ran the null model with just the dependent variable at the between level. Results are:

                 Estimates     S.E.  Est./S.E.
Between
 Thresholds
    Q2$1             0.709    0.650      1.090
 Variances
    Q2              34.909   12.985      2.688
From what you told me, my intercept coefficient is -0.709. When I convert it to the probability of u=1 it comes out as 0.33, whereas 41.3% of the observations in my sample have u=1. Why am I getting this difference? Shouldn't the two probabilities be the same? Many thanks, Joanna
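One possible reason for the gap, sketched under the assumption that this is a random-intercept logit model with a normally distributed intercept: the value obtained from the threshold is the probability for a cluster whose random intercept is zero, whereas the sample proportion is averaged over all clusters. With a between-level variance as large as 34.909, the two can differ noticeably. The sketch below integrates the logistic function over the random intercept numerically; the function names and grid settings are arbitrary choices, and a model-based marginal probability need not reproduce the observed 41.3% exactly.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_probability(intercept, variance, n_points=20001, width=8.0):
    """Average logistic(intercept + u) over u ~ N(0, variance) with the trapezoid rule."""
    sd = math.sqrt(variance)
    step = 2.0 * width * sd / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = -width * sd + i * step
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * logistic(intercept + u) * density * step
    return total

# Estimates quoted in the post above
threshold, between_variance = 0.709, 34.909
conditional = logistic(-threshold)  # cluster with random intercept equal to zero
marginal = marginal_probability(-threshold, between_variance)
print(round(conditional, 2), round(marginal, 2))
```

The averaged probability lies between the conditional value of about 0.33 and 0.5, that is, it is pulled toward 0.5 as the intercept variance grows.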
I am running a "Random Intercept Random Slope with Level 1 & 2 Predictors". My level 2 predictor is the school SES. The DV is fruit intake. I got the following error message: One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement.
This is the code that I used:
Within = eat_frontof TV; between= SES;
%within% slope |fruit on eat_TV;
%between% fruit slope on eat_TV; fruit with slope;
Any variable on the BETWEEN list must have the same value for every member of a cluster. Apparently SES does not meet this criterion. If you can't figure it out, please send the output, data, and your license number to email@example.com.
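A quick way to check this criterion before running Mplus is to list the clusters in which the between-level variable takes more than one value. The sketch below uses a made-up data set; the column names `school` and `SES` and the helper name are assumptions for illustration only.

```python
from collections import defaultdict

def clusters_with_variation(rows, cluster_key, variable):
    """Return the cluster IDs in which `variable` takes more than one distinct value."""
    values = defaultdict(set)
    for row in rows:
        values[row[cluster_key]].add(row[variable])
    return sorted(c for c, seen in values.items() if len(seen) > 1)

# Hypothetical data: SES should be constant within each school
rows = [
    {"school": 1, "SES": 2.5},
    {"school": 1, "SES": 2.5},
    {"school": 2, "SES": 3.0},
    {"school": 2, "SES": 3.1},  # differs from the other member of school 2
]
bad = clusters_with_variation(rows, "school", "SES")
print(bad)  # [2]
```

Missing-value codes or rounding differences within a cluster would also show up in this list.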
Anne Casper posted on Wednesday, January 20, 2016 - 9:27 am
I am running a multilevel regression analysis with two predictors and their interaction. Both predictors are group-mean centered level 1 variables. The interaction term is significant and I would now like to do simple slope analyses.
Q1: I requested TECH3 to obtain the asymptotic covariance matrix in order to use the Preacher et al. online tool for multilevel simple slopes. All the estimates in TECH3 end with D-01 or D-02 or D-03. Could you tell me what this means? Does it look correct to you?
Q2: Is it also possible to use MODEL CONSTRAINT for calculating the simple slopes? If so, how would I do this?
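On Q1, a suffix such as D-01 is Fortran-style scientific notation, so 0.400D-01 means 0.400 x 10^-1 = 0.04; the condition numbers in the error messages elsewhere in this thread use the same notation. The sketch below converts such values and then computes a simple slope with its standard error by hand from the variances and covariance of the two coefficients. All numbers and names are hypothetical, not taken from any output, and this is only the generic simple-slope formula, not a description of the online tool or of MODEL CONSTRAINT.

```python
import math

def parse_mplus_number(text):
    """Convert Fortran-style notation such as '0.400D-01' to a Python float."""
    return float(text.strip().upper().replace("D", "E"))

def simple_slope(b_x, b_xz, var_x, var_xz, cov_x_xz, z_value):
    """Slope of y on x at moderator value z, with its delta-method standard error."""
    slope = b_x + b_xz * z_value
    variance = var_x + 2.0 * z_value * cov_x_xz + z_value ** 2 * var_xz
    se = math.sqrt(variance)
    return slope, se, slope / se

# Hypothetical TECH3-style entries for the predictor and the interaction term
var_x = parse_mplus_number("0.400D-01")     # 0.04
var_xz = parse_mplus_number("0.100D-01")    # 0.01
cov_x_xz = parse_mplus_number("0.500D-02")  # 0.005

# Hypothetical coefficients; z_value = 1.0 could stand for +1 SD of the moderator
slope, se, z_stat = simple_slope(0.30, 0.20, var_x, var_xz, cov_x_xz, z_value=1.0)
print(round(slope, 3), round(se, 3), round(z_stat, 2))
```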
I have a question about the covariance structure used by Mplus. When I run a model in SPSS, I have to specify the covariance structure, choosing from VARIANCE COMPONENTS, AUTOREGRESSIVE(1), COMPOUND SYMMETRY, and several others. In Mplus, however, I just specify ANALYSIS: Type is twolevel GENERAL. I do not know what kind of covariance structure Mplus uses by default. Could you tell me?
Hi Bengt, thank you for your response. I think I should add some information about my design.
I have 87 individuals, who completed 3 measures a day for 5 days. Therefore, there is dependence in my data because of the repeated measures. In SPSS and R, I can control for this with an autoregressive covariance matrix. I cannot specify this in Mplus, or can I?
Also, I am not sure whether I should specify a three-level model (moments nested within days, nested within individuals) or a two-level model, because in chapter 9 I read that for longitudinal data Mplus uses one level fewer than other software. Therefore, I thought of keeping my long-format data and having two levels only, but how does Mplus know the order of the observations if there is no third level?
I think it is important to know that I am not examining growth or change, just regressions of the DV measured in the moment on predictors that were measured in the same moment.
Thank you for your reply. I am still a bit confused, though. If I use the wide format and regress y1 on x1, y2 on x2, ..., y15 on x15,
then I would have 15 slopes. I would be able to model autocorrelation if I follow the example in the UG, but I want to compare my results with SPSS and R. In these programs, I get one slope that takes the autocorrelation into account. How can I get 1 slope when I have 15 DVs (representing one variable) and 15 predictors (representing one variable)?
Dear Professor Muthen, I'm now performing a Bayes twolevel regression as you suggested a couple of days ago, as I only have 20 clusters in a sample of 1229 individuals. After conducting the first step of the analysis, I found a significant degree of heterogeneity among clinical groups for the intercept in the regression of thought suppression on depression (WBSI and Dep, respectively). I found a median point estimate for the random intercept variance of 0.712, with a 95% credibility interval from .005 to 2.944. Following that, I tried to include level-2 covariates to explain the heterogeneity. Here Mplus gives me an error that I cannot understand. Did I do something wrong in the commands and, if so, what should I do? I'll send you the output file in a second message as the message size is limited. Sincerely, Joana Costa
For a covariate x with a random slope, if it is specified on neither the WITHIN nor the BETWEEN list, is the latent between covariate x actually the cluster mean of the observed covariate x? And the slope of this latent covariate x is the so-called contextual effect, right?
Perhaps you are looking at the "third part" of ex9.2. No, in this case the latent between part of the covariate x is not the cluster mean of x. It is the latent between part of x. They are related but not the same.
The contextual effect has to do with the random intercept regression on the cluster mean. For a definition of contextual effect, see the Raudenbush-Bryk multilevel book, page 140, Table 5.11.
1. So if the covariate x is not mentioned in the within part of the code and it is a random intercept model (e.g. the second part of ex 9.1), then the covariate x is decomposed into two latent variables (the latent within part of x and the latent between part of x). If it is a random slope model (e.g. the third part of ex 9.2), then the covariate x is also modeled in both the within and the between part, but the within part is actually the original observed covariate x (not the latent within part of x) and the between part of x is the latent between part of x?
2. In the second part of ex9.1, why is grand-mean centering used? I checked the Raudenbush-Bryk multilevel book, page 140, Table 5.11: if the contextual effect is calculated as the difference between the between slope and the within slope (as in the second part of ex9.1), group-mean centering should be used. If grand-mean centering is used, the contextual effect is directly the slope in the level-2 random intercept equation, not the difference between the between slope and the within slope. How should I understand this?
1. This is correct. In the forthcoming Mplus version 8.1, the latent variable decomposition of x will be available also for random slopes when using Bayes, that is, the latent within-level part of x will be used with a random slope as well.
2. It is confusing - in retrospect it probably would have been clearer to not have grandmean centered x in Define. The reason that the betac = gamma01 - gamma10 formula is used in Model Constraint is that Mplus does a "latent group-mean centering of the latent within-level covariate" (see the text on pages 274-275). That's why the group-mean centered version of betac is used even though the observed x is not group-mean centered in Define.
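The relation between the two centering choices can be illustrated with a small numerical sketch. It uses observed cluster means and plain OLS on made-up data, not the latent decomposition that Mplus performs, so it only shows the algebra behind betac = gamma01 - gamma10: with a group-mean centered x the contextual effect is the between slope minus the within slope, whereas with an uncentered (or grand-mean centered) x the coefficient of the cluster mean is the contextual effect directly. All helper names and data values are assumptions.

```python
def solve(a, b):
    """Solve a small linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Ordinary least squares via the normal equations; an intercept is added first."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    return solve(xtx, xty)

# Hypothetical (x, y) pairs for three clusters
clusters = {
    1: [(1, 2.0), (2, 2.9), (3, 4.2)],
    2: [(4, 6.1), (5, 6.8), (6, 8.0)],
    3: [(7, 9.5), (8, 10.1), (9, 11.4)],
}
x, y, xbar = [], [], []
for members in clusters.values():
    mean_x = sum(xi for xi, _ in members) / len(members)
    for xi, yi in members:
        x.append(float(xi))
        y.append(yi)
        xbar.append(mean_x)
dev = [xi - mi for xi, mi in zip(x, xbar)]

# Group-mean centered x: coefficients are the within and between slopes
_, within_slope, between_slope = ols([dev, xbar], y)
# Uncentered x: the coefficient of the cluster mean is the contextual effect
_, slope_raw_x, contextual = ols([x, xbar], y)

print(round(between_slope - within_slope, 4), round(contextual, 4))
```

Both printed numbers agree, because the two regressions span the same predictors and differ only in parameterization.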
The authors of these examples used Mplus to test the multilevel indirect effect following the guide of ex9.1 and ex9.2. From your last reply, can I interpret it as it is more intuitive to use group mean centering for the random slope model? If my goal is to get the multilevel mediation effect, can I bypass the estimation of the contextual effect? As long as I get the between slopes with the group mean centering, multiply these between slopes, then I get between level indirect effect, right?
I have a two-level moderated mediation model. The IV is at the workplace level. The moderator, mediator and 4 outcomes are at the individual level, with a few control variables on both levels. I am estimating simple moderation at the individual level, and moderated mediation on path a in the between model. My queries:
1- Overall model fit is bad (CFI=0.456; TLI=-4.444; RMSEA=0.489). Is this a reliable model? What can I do to improve the model fit?
2- To estimate differences between public and private workplaces, I specified the grouping option. It gives within and between results separately, but not the conditional indirect effects separately. However, there is no difference in the estimated between and within regression results. How can I specify the grouping option correctly with two-level data?
There are three variables: Teacher motivation, student self-efficacy, and students' achievement.
I specified teacher as an L2 variable, and I didn't specify self-efficacy and achievement as either between or within variables because they would exist at both models.
Regarding the relations between self-efficacy and achievement, we have two models, one within and one between. Regarding the relation between these variables at between model, which interpretation is correct?
1) class-average self-efficacy is related to class-average achievement or
2) class-average self-efficacy is related to individual students' achievement
I have the same question about teacher motivation and students' achievement:
1) Teacher's motivation is related to class-average achievement or
2) teacher's motivation is related to individual students' achievement
and the last question: are the relations in the between model (teacher's motivation with students' achievement, and self-efficacy (b) with achievement (b)) macro-micro models?
2) and 2). Although Mplus can use the latent variable between part instead of the observed class-average.
The relations on between are macro-macro.
See also the video and handout for our Short Course Topic 7.
anonymous posted on Tuesday, October 23, 2018 - 1:19 am
I am doing multilevel modelling with Mplus (7), in which I compare three groups using dummy variables. I get the different groups' means of the latent variables (always the reference group mean, as the others are controlled for), but the variance (the diagonal of the covariance matrix) is always the same for the different groups. Can I get group-specific variances, from which I can calculate the SD and SE for the different groups? Actually, my main aim is to get the SE for the different groups. How can I get these?
You need to do a multiple-group analysis of your 3 groups to let the variance vary. But a complicating factor is that the group variable is on Level 1 - this means you have to study our Web Note 16 on our web site.
anonymous posted on Tuesday, October 23, 2018 - 10:43 pm
Okay, Thank you!
anonymous posted on Wednesday, October 24, 2018 - 12:54 am
However, do you know, can I use this formula
SD = square root(sum(xi^2)/(N-1)-Mean^2)
to calculate the SD, and SE = SD/square root(n), since I know the means (specific to each group) and I get every term (xi) when I save the factor scores?
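For reference, the usual sample SD is the square root of (sum(xi^2) - N*Mean^2)/(N-1), which differs from the expression quoted above unless N is large. The sketch below applies it per group to made-up scores; the group labels and values are hypothetical. Note also that SDs computed from saved factor scores generally do not equal the model-estimated latent SDs, so this is a descriptive calculation only.

```python
import math
import statistics

def sd_and_se(scores):
    """Sample SD (N - 1 denominator) and standard error of the mean."""
    n = len(scores)
    mean = sum(scores) / n
    variance = (sum(s * s for s in scores) - n * mean * mean) / (n - 1)
    sd = math.sqrt(variance)
    return sd, sd / math.sqrt(n)

# Hypothetical saved factor scores, split by group
group_scores = {
    "group_a": [0.2, -0.1, 0.4, 0.0],
    "group_b": [1.1, 0.7, 0.9, 1.3],
}
results = {group: sd_and_se(scores) for group, scores in group_scores.items()}
for group, (sd, se) in results.items():
    print(group, round(sd, 3), round(se, 3))
```

The hand-computed SD can be checked against `statistics.stdev`, which uses the same N - 1 denominator.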
I am obtaining the following error message when I run a random intercept linear model using FIML. I do not obtain this error when I am not using FIML and estimates are similar between complete case and FIML (with this error) models. Thoughts?
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.979D-16. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 21, %WITHIN%: DAYS34
We need to see your full output - send to Support along with your license number.
Anne Casper posted on Wednesday, August 07, 2019 - 1:21 am
I am running a multilevel regression analysis with one predictor, using the MLR estimator. The regression weight of the predictor is not significant at p=.09 but the model comparison of the null model and the model with that one predictor using the loglikelihoods reaches significance at the .05 level. I followed the recommendations for loglikelihood testing from this site https://www.statmodel.com/chidiff.shtml.
If I re-run my analyses with estimator = ML, the predictor reaches significance at .05 in the regression as well.
z-tests and likelihood-ratio chi-square tests may disagree. The ML estimator may have smaller SEs and therefore give significance - but those SEs may not be as dependable as those from MLR. You can settle the matter by requesting bootstrap confidence intervals.
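For readers who want to reproduce the loglikelihood comparison under MLR, the scaled difference test described on the chidiff page cited in the question can be sketched as follows. The formula is written from that page as I understand it (L = loglikelihood, c = scaling correction factor, p = number of free parameters; index 0 is the nested model, index 1 the comparison model), and every numeric input below is made up for illustration.

```python
def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled chi-square difference from MLR loglikelihoods.

    l0, c0, p0: loglikelihood, scaling correction factor, and number of
    free parameters of the nested (more restricted) model.
    l1, c1, p1: the same quantities for the comparison (less restricted) model.
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)  # difference-test scaling correction
    trd = -2.0 * (l0 - l1) / cd           # scaled difference statistic
    df = p1 - p0
    return trd, df

# Hypothetical values: null model versus model with one added predictor
trd, df = scaled_loglik_difference(l0=-1000.0, c0=1.2, p0=3, l1=-997.0, c1=1.3, p1=4)
print(round(trd, 2), df)  # 3.75 1
```

The resulting statistic is compared to a chi-square distribution with `df` degrees of freedom; it can lead to a different conclusion than the z-test of the coefficient, as noted in the reply above.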
anonymous posted on Friday, November 08, 2019 - 12:10 am
In the double latent model, Marsh et al. (2009) grand-mean centered the indicators that load on the latent variables at the individual level, so the effect of the latent aggregated variable is the aggregated effect at the classroom level, not the contextual effect (which is calculated separately as a beta difference in MODEL CONSTRAINT).
1. Do you know how the contextual effect of a latent aggregated variable can be directly controlled at the classroom level in the double latent model?