Mplus Discussion >> Multilevel model testing


Multilevel model testing

Mplus Discussion > Multilevel Data/Complex Sample >

Joanna Harma posted on Thursday, October 25, 2007 - 9:28 am

Dear Linda,
I am fitting a two-level logistic regression with child characteristics at the lower level and family characteristics at the upper level. I am trying to establish what determines the type of school a child attends. The dependent variable is binary, and the explanatory variables are a mix of categorical and continuous.
Is there any way of testing the fit of my model?
How can I obtain residuals, y-yhat?
Can you suggest any possible graphs for assessing model fit in such an analysis?
Regards
Joanna

Joanna Harma posted on Friday, October 26, 2007 - 6:17 am

Hello,
Can anyone explain the meaning of the factor scores (Empirical Bayes residuals) obtained in multilevel logistic regression, and how they can be used to estimate predicted probabilities?
Thanks

Bengt O. Muthen posted on Friday, October 26, 2007 - 10:08 am

The factor score is an estimate of the random intercept for each cluster. To estimate the overall probability you need numerical integration over the random intercept distribution. To estimate a probability at a certain value of the random intercept distribution, such as the estimated value for a certain cluster, you simply translate the resulting logit into probability by the usual formula

Prob = 1/(1+exp(-Logit))

where the Logit contains the factor score estimate of the cluster random intercept plus the usual level 1 part of the Logit (including the intercept/threshold).
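
As a rough illustration of the two cases described above, the following Python sketch computes (a) the probability at a given value of the random intercept, such as a cluster's factor score, and (b) the overall probability by numerically integrating over a normal random intercept distribution. The function names, the example coefficient values, and the simple trapezoid rule are assumptions made for illustration; this is not Mplus code and not necessarily the integration method Mplus uses. If the model reports a threshold rather than an intercept, the intercept here is assumed to be the negative of the threshold.

```python
import math

def logit_to_prob(logit):
    """Prob = 1 / (1 + exp(-Logit))."""
    return 1.0 / (1.0 + math.exp(-logit))

def cluster_prob(intercept, slopes, x, u_hat):
    """Probability for one child, given the factor score u_hat
    (estimated random intercept) of that child's cluster."""
    logit = intercept + sum(b * xi for b, xi in zip(slopes, x)) + u_hat
    return logit_to_prob(logit)

def marginal_prob(intercept, slopes, x, between_var, n_points=2001, width=8.0):
    """Overall probability: average the conditional probability over
    u ~ N(0, between_var) with a trapezoid rule on a grid spanning
    +/- width standard deviations. Requires between_var > 0."""
    sd = math.sqrt(between_var)
    fixed = intercept + sum(b * xi for b, xi in zip(slopes, x))
    step = 2.0 * width * sd / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = -width * sd + i * step
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * logit_to_prob(fixed + u) * density * step
    return total

# Hypothetical values, for illustration only (not taken from any output).
intercept = -0.5        # minus the threshold, if a threshold is reported
slopes = [0.8, -0.3]    # level 1 coefficients
x = [1.0, 2.0]          # one child's covariate values
u_hat = 1.2             # factor score for that child's family

print(cluster_prob(intercept, slopes, x, u_hat))
print(marginal_prob(intercept, slopes, x, between_var=4.0))
```

With a large random intercept variance the overall probability is pulled toward 0.5 relative to the probability evaluated at a random intercept of zero, which is why the two quantities should not be confused.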

Joanna Harma posted on Monday, October 29, 2007 - 7:56 am

I estimated probabilities for the level 1 observations by adding the level 2 residuals (factor scores) to the usual level 1 equation. However, when I ran a normality test on the residuals, the plot showed a long-tailed curve. Any suggestions on how to deal with this, given that most of my variables are categorical?
Also, the output file gives R-square for levels 1 and 2 when the standardized option is used in the output. Can this be used reliably to assess model fit?
Thanks
Joanna

Bengt O. Muthen posted on Tuesday, October 30, 2007 - 8:02 am

I think you are saying that the random intercept in the 2-level logistic regression has an estimated distribution that looks non-normal with long tails. The end of the forthcoming Chapman-Hall chapter by Muthen & Asparouhov, available on our web site, discusses the sensitivity to violations of normality for random effects and how one can instead use a non-parametric approach to estimating the random effect distribution. This is in the related case of growth modeling with binary outcomes using a logistic link. The sensitivity to the normality assumption of random effects has also been studied by Charles McCulloch at UC San Francisco, as indicated by his recent talk here in LA.

Bengt O. Muthen posted on Tuesday, October 30, 2007 - 8:06 am

P.S. I don't think R-square is useful for indicating model fit with logistic regression.

Joanna Harma posted on Tuesday, October 30, 2007 - 8:34 am

Thanks
I started modelling by adding the level 2 variables first, as I am more interested in the effect of family characteristics on the type of school a child attends, while controlling for the effect of child-level variables. I added the child-level variables after I had added all the relevant family variables. It was surprising to see that the level 2 residual variance shot up from 10 to 34 and a few of the level 2 variables became insignificant. Is this normal? Also, most of my level 2 variables have high S.E.s and t-values around 2.5.
My data has 250 families with 475 children in total, so the clusters are not very big. Do you think I could do logistic regression with this kind of data?
Thanks

Bengt O. Muthen posted on Wednesday, October 31, 2007 - 8:32 am

I think this residual variance behavior has been described in the multilevel literature - see for instance the Snijders & Bosker book. - Any lit. suggestions from other Mplus Discussion readers?

With an average of 2 children per family you don't have many subjects for within-level parameter estimation. But if you for example do 2-level logistic regression you don't have any within-level parameters (not even a level-1 residual variance). 2-level factor analysis, however, would be problematic.

Joanna Harma posted on Wednesday, October 31, 2007 - 9:10 am

Thanks,
I have 250 families, of which 120 send all their children to government schools, 110 send their children to private schools, and only 20 send their children to both types of school (i.e., 1 child to government and 1 to private). The dependent variable is the type of school the child attends, so the lower-level variables relate to child characteristics and the upper-level variables relate to family characteristics. To me this data structure suggests little within-level variance and high between-level variance. In the intercept-only model I get a between variance of 30. Could this be a problem? Also, could the fact that I have little variability at the child level be a problem? Can I still use 2-level logistic regression?
Thanks

Joanna Harma posted on Wednesday, October 31, 2007 - 9:14 am

I forgot to ask. Since my cluster size is relatively small, could I ignore the two-level structure and do ordinary logistic regression with both child-level and family-level variables?
Thanks

Bengt O. Muthen posted on Wednesday, October 31, 2007 - 9:29 am

Given that only 20 families have within-family variation on your binary dependent variable, the binary dependent variable is almost a level 2 outcome. This may be making your analysis more difficult.

As for your second question, instead of Type = Twolevel, you could use Type = Complex which simply corrects SEs for the clustering of children within families (not estimating a level 1 and level 2 model). That is an easier analysis.
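
To put the figures quoted in this exchange on a common scale, here is a small Python sketch. It uses the counts given in the thread (250 families, 475 children, 20 families with children in both school types, between variance of 30 in the intercept-only model). The intraclass correlation formula is an assumption that was not stated in the thread: it is the common latent-response convention for logistic models, in which the level 1 residual variance is taken to be pi^2/3. The function name `latent_icc` is hypothetical.

```python
import math

def latent_icc(between_var):
    """Intraclass correlation on the latent-response scale, assuming
    the logistic level 1 residual variance of pi^2 / 3."""
    level1_var = math.pi ** 2 / 3.0
    return between_var / (between_var + level1_var)

# Counts and variance as reported in the thread.
n_families = 250
n_children = 475
n_mixed_families = 20   # families with children in both school types
between_var = 30.0      # between variance in the intercept-only model

avg_cluster_size = n_children / n_families    # children per family
share_mixed = n_mixed_families / n_families   # families with within-family variation
icc = latent_icc(between_var)

print(f"average children per family: {avg_cluster_size:.2f}")
print(f"share of families with within-family variation: {share_mixed:.2%}")
print(f"latent-scale ICC: {icc:.3f}")
```

Under that assumption, an ICC this close to 1 together with so few families showing within-family variation is consistent with the remark that the outcome behaves almost like a level 2 variable.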

Joanna Harma posted on Friday, November 02, 2007 - 6:52 am

Thanks Bengt,
Another question. Are normality and homoscedasticity assumptions for the level 2 residual in 2-level logistic regression?
Thanks
Joanna

Bengt O. Muthen posted on Friday, November 02, 2007 - 7:03 am

Yes, that is a standard assumption.

SY Khan posted on Sunday, August 06, 2017 - 6:24 am

Hello,

I am analysing a moderated mediation model using latent constructs and observed variables in a multilevel framework (2-levels). The independent variable (MS, 6 indicators) and moderator (JC, 5 indicators) are individual level variables (measured as latent constructs). The mediator is an organisational-level observed variable and there are three organisational-level observed variables.

The model runs fine and I get all the results displayed normally. However, the output gives the following error message, followed by the message "Model Terminated Normally".

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.592D-10. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 78, MEDJC

THE MODEL ESTIMATION TERMINATED NORMALLY

Parameter 78 (MEDJC) is the mean of the moderator (JC), which is 0 because JC is measured as a latent construct. It is specified in the MODEL CONSTRAINT command.

Should this message be ignored, or is it something that needs to be fixed?

Thanks.

Linda K. Muthen posted on Sunday, August 06, 2017 - 6:42 am

Please send the output and your license number to support@statmodel.com.