

Dear Dr. Muthen,

I am working on the data analysis for my dissertation, which uses multilevel regression. While running the analysis in Mplus, I got the error message shown at the end of the output below. I tried to solve the problem by increasing the starting value, but it did not work. May I ask for your advice on how to solve this problem? I look forward to hearing from you. Below is the output that I had. Thank you in advance for your help.

INPUT INSTRUCTIONS

TITLE: Level2 Predictors: F5
DATA: FILE IS Study2.Mplus.Addmean.031706.dat;
VARIABLE: NAMES ARE ID gender disabli subject OverSup LenSup SDgroup ExCogrop cyclenum AcaResp AcaResp1 CmpResp TeInstBe F4 F5 F6 F7 F8 F9 AcsScore AcsScf6 AcsSc6x2 blank SumSDS SumSds1 SumSds2 SumSds3 SumSds4 SumAirE SumAirS Tinsb_m Sds_Tinb checkID AcaR_m Smag_m Comp_m Tmg_m Tfos_m F4_m F5_m F6_m F7_m F8_m F9_m;
USEVARIABLES ARE ID F5 AcaR_m Smag_m Comp_m Tinsb_m Tmg_m Tfos_m SumSDS;
CATEGORICAL = F5;
WITHIN = ;
BETWEEN = AcaR_m Smag_m Comp_m Tinsb_m Tmg_m Tfos_m SumSDS;
CLUSTER = ID;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
%WITHIN%
%BETWEEN%
F5 on AcaR_m Smag_m Comp_m Tinsb_m Tmg_m Tfos_m SumSDS*10000.00;
OUTPUT: TECH1 STANDARDIZED;

INPUT READING TERMINATED NORMALLY

Level2 Predictors: F5

SUMMARY OF ANALYSIS

Number of groups    1
Number of observations    1350
Number of dependent variables    1
Number of independent variables    7
Number of continuous latent variables    0

Observed dependent variables
  Binary and ordered categorical (ordinal)
    F5
Observed independent variables
    ACAR_M SMAG_M COMP_M TINSB_M TMG_M TFOS_M SUMSDS
Variables with special functions
  Cluster variable    ID
  Between variables
    ACAR_M SMAG_M COMP_M TINSB_M TMG_M TFOS_M SUMSDS

Estimator    MLR
Information matrix    OBSERVED
Optimization Specifications for the Quasi-Newton Algorithm for Continuous Outcomes
  Maximum number of iterations    1000
  Convergence criterion    0.100D-05
Optimization Specifications for the EM Algorithm
  Maximum number of iterations    500
  Convergence criteria
    Loglikelihood change    0.100D-02
    Relative loglikelihood change    0.100D-05
    Derivative    0.100D-02
Optimization Specifications for the M step of the EM Algorithm for Categorical Latent variables
  Number of M step iterations    1
  M step convergence criterion    0.100D-02
  Basis for M step termination    ITERATION
Optimization Specifications for the M step of the EM Algorithm for Censored, Binary or Ordered Categorical (Ordinal), Unordered Categorical (Nominal) and Count Outcomes
  Number of M step iterations    1
  M step convergence criterion    0.100D-02
  Basis for M step termination    ITERATION
  Maximum value for logit thresholds    15
  Minimum value for logit thresholds    -15
  Minimum expected cell size for chi-square    0.100D-01
Optimization algorithm    EMA
Integration Specifications
  Type    STANDARD
  Number of integration points    15
  Dimensions of numerical integration    1
  Adaptive quadrature    ON
  Progressive quadrature stages    1
Cholesky    ON

Input data file(s)
  Study2.Mplus.Addmean.031706.dat
Input data format    FREE

SUMMARY OF DATA

Number of clusters    45

Size (s)    Cluster ID with Size s
30    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45

SUMMARY OF CATEGORICAL DATA PROPORTIONS

F5
  Category 1    0.175
  Category 2    0.825

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD.

THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 7 IS 0.18718467D+02.


You can try increasing the number of MITERATIONS as suggested in the error message. If that does not work, please send your input, data, output, and license number to support@statmodel.com. Please don't post output on the discussion board. We try to keep the posts short.


Thank you so much for your kind reply. I am sorry for posting output on this board. I simply thought it would help you understand my problem. I tried to delete it, but I could not. Sorry again. I know this may be a silly question, but I am not good at Mplus: could you please let me know how I can increase the number of MITERATIONS? If there is any syntax for it, please let me know. I hope that this question does not bother you. Thank you for your understanding and help.


Look up MITERATIONS in the Mplus User's Guide. Choose a number larger than the default value. As I said earlier, if you have further problems of this type, you need to contact support@statmodel.com and provide the information I asked for. 
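
For readers with the same question, here is a minimal sketch of where the option goes, assuming the ANALYSIS command from the input posted above. MITERATIONS is an option of the ANALYSIS command. The value 2000 is an arbitrary illustration, chosen only because it exceeds the EM maximum of 500 iterations reported in the posted output. It is not a recommended setting.

```
ANALYSIS:
  TYPE = TWOLEVEL;
  MITERATIONS = 2000;   ! illustrative value; the posted output shows 500 for the EM algorithm
```

Raising the limit only gives the EM algorithm more iterations to converge. If the same message still appears, the advice in the reply above applies: send the input, data, and output to support.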


Reliability correction in regression is possible using:

f BY x@1; x@a; y on f;

where a is the error variance of x: a = (1 - rel)*var.

My aim is to use this in ML regression with Rasch estimates as IVs on both levels. Reliability is calculated using SEs from IRT software. I'd like to apply this with A) latent decomposition of covariates and B) observed group means, and tried:

A) %within% f by x@1; x@0.174; y on f; %between% y on x; (x grand-mean centered)
Output: "this variable will be treated as a y-variable on both levels: x"

B) between = xb; %within% f by xw@1; xw@0.174; y on f; %between% y on xb; (xw group-mean centered, xb group mean)

Comparing the results of A and B with regression on LVs (2-level Rasch), there's much higher conformity than without correction.

A) How is the variance decomposed between levels? Is it OK to have Mplus decompose the variance, or should I use observed group means?

B) With group centering, I still use x@0.174 for correction. Should I use the within variance instead of the total? Must the reliability of the group means be taken into account (group sizes ~20)?
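
As a worked illustration of the error-variance formula in this post: with the loading fixed at 1, the model implies x = f + e, so the variance of x splits into the variance of f plus the fixed residual variance a. The symbols below and the numbers 0.80 and 0.87 are assumptions, chosen only because their product reproduces the 0.174 used in the post. The post does not state the actual reliability or variance.

```latex
\operatorname{Var}(x) = \operatorname{Var}(f) + a,
\qquad
a = (1 - \rho_{xx})\,\operatorname{Var}(x)

% hypothetical values: reliability 0.80, observed variance 0.87
a = (1 - 0.80) \times 0.87 = 0.174
```

Which variance enters the formula (total or within-level) is exactly the question raised in B) and is not settled by this sketch.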


You may find your answer in Web Note 11 at the following link: http://www.statmodel.com/examples/webnote.shtml 


I read the web note but still need some clarification.

1) I understand that if I use latent covariates, the predictor is a LV and has the "within-between status" (web note). But in the model I described (Model A), the regression is on f at level 1 and on x at level 2. f is defined by x but is not actually x. I thought about using f as the predictor on both levels, but this doesn't work. I'm not sure if my model specification is correct, since I use the same predictor corrected for unreliability on within (f) and not corrected on between (x).

2) In Model B I use regular group centering, so the predictor variance on within should be the variance of the deviation scores. I wonder whether I may then calculate the error variance as (1 - reliability) * within variance, with reliability calculated from IRT SEs.

Thanks for your patience.


I think your Model A approach is most straightforward. I would think that you get very similar results to Model A if instead you declare Within = x and use an observed between-level, centered "xb" on Between. That is similar to your B approach, although you would have to declare Within = xw and not do group centering.
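
One possible reading of the alternative described in this reply, sketched as Mplus input. The names y, x, xb, and clus are placeholders, xb is assumed to be an observed cluster mean that was computed and centered before the analysis, and 0.174 is carried over from the earlier post. This is a sketch under those assumptions, not a confirmed specification.

```
VARIABLE:
  USEVARIABLES = y x xb;   ! placeholder variable names
  WITHIN = x;              ! x is used on the within level only, not group centered
  BETWEEN = xb;            ! xb: observed, centered cluster mean (assumed precomputed)
  CLUSTER = clus;          ! placeholder cluster identifier
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  f BY x@1;                ! f stands for x without measurement error
  x@0.174;                 ! error variance fixed at (1 - reliability) * variance
  y ON f;
  %BETWEEN%
  y ON xb;
```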

Yan Liu posted on Wednesday, November 28, 2012 - 9:28 pm



Dear Dr. Muthen, I am working on a multilevel regression analysis with a random intercept only, for a continuous outcome variable. I have two questions. (1) In the output, I see that the variance-covariance (correlation) matrix is provided at both the within and between levels. How are the var-cov matrices computed? Are they computed like those in multilevel SEM, which are additive? (2) Does Mplus use pseudo-maximum likelihood for multilevel regression analysis? Thanks a lot! Yan


1. Yes. 2. What type of pseudo-ML are you thinking about? There is a pseudo-ML for complex survey data. See Asparouhov, T. (2005). Sampling weights in latent variable modeling. Structural Equation Modeling, 12, 411-434, and other complex survey data papers at http://www.statmodel.com/resrchpap.shtml

Yan Liu posted on Tuesday, December 04, 2012 - 5:16 am



Dear Dr. Muthen, Thanks a lot! That is very helpful! I just want to confirm my understanding. So basically, pseudo-ML is used in Mplus if we use the MLR, MLM, or MLMV estimators?


We use the term PML for type=complex or with weights. 

Li posted on Monday, May 12, 2014 - 2:54 am



I ran a 2-level regression with a dichotomous variable as the outcome. This is an intercept-only model with all the level-2 slopes fixed. The sample size is around 2,000. Mplus automatically used MLR as the estimator. I am mostly interested in whether a level-1 predictor is significant or not. The results of the Wald test are reasonable: this predictor is significant in about 10 out of 30 cases, which agrees with substantive knowledge. However, if I use the -2 loglikelihood difference test, this predictor is significant in all 30 cases. I did use the scaling factor when calculating the scaled chi-square difference. While the difference in degrees of freedom is only 1, the -2 loglikelihood drops by at least 200 when this particular predictor is added to the model. I always thought the Wald test and the loglikelihood ratio test should produce more or less equivalent results with a large sample size, so I am very confused by the drastically different results in this case. Have you ever heard of or experienced such a thing? What could have gone wrong in your view? Many thanks for your comments. I really appreciate it. Hongli
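
For reference, the scaled difference test that this post appears to describe is usually computed from the two loglikelihoods and their scaling correction factors as sketched below. The notation is an assumption, since the post does not give the formula that was actually used. Here L0, c0, and p0 are the loglikelihood, scaling correction factor, and number of free parameters of the nested model (without the predictor), and L1, c1, and p1 are those of the comparison model (with the predictor).

```latex
c_d = \frac{p_0 c_0 - p_1 c_1}{p_0 - p_1},
\qquad
TR_d = \frac{-2\,(L_0 - L_1)}{c_d}
```

TR_d is referred to a chi-square distribution with p_1 - p_0 degrees of freedom, which is 1 in the comparison described in the post.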


What do you mean by 30 cases? How many observations do you have at each level?
