
Anonymous posted on Friday, September 06, 2002  10:58 pm



I am running two path models, one nested within the other, and I want to compare them using a chi-square difference test. Some variables are not normal, so as I understand it, I will need an estimator with robust statistics. All variables are continuous. Can you recommend which estimators are most appropriate?


The only robust estimator for continuous outcomes that can be used for difference testing is MLM. A scaling correction factor is given in the Mplus output and an explanation of how to do difference testing using MLM is given on the Mplus website. 
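The difference test described above can be sketched in a few lines. This is a minimal illustration of the scaled chi-square difference computation documented on the Mplus website, assuming you have read the scaled chi-square, the scaling correction factor, and the degrees of freedom for each model from the output. The function name and the example numbers are hypothetical, not taken from this thread.

```python
def scaled_chisq_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference test from MLM output.

    t0, c0, d0: scaled chi-square, scaling correction factor, and degrees
                of freedom for the nested (more restrictive) model.
    t1, c1, d1: the same quantities for the comparison model.
    Returns (scaled difference statistic, difference in df, cd).
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Difference-test scaling correction.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive scaling correction; the test is not defined")
    # Multiplying each scaled chi-square by its correction factor recovers
    # the uncorrected ML chi-square before the difference is rescaled.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1, cd


# Hypothetical values, for illustration only.
trd, df_diff, cd = scaled_chisq_difference(50.0, 1.2, 10, 30.0, 1.1, 8)
print(trd, df_diff, cd)
```

The resulting statistic is referred to a chi-square distribution with d0 - d1 degrees of freedom. A simple difference of the two scaled chi-squares is not chi-square distributed, which is why the correction factors are needed.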

Anonymous posted on Monday, November 18, 2002  1:21 pm



Hello, Linda. In psychology, exponentially distributed variables are quite common in practice. If an SEM includes categorical, normal, and exponential variables, is Mplus still useful? Which procedure do you recommend? If not, could you create a special program (for variables that are a mixture of categorical, normal, and exponential) for common use in SEM? Daniel

bmuthen posted on Monday, November 18, 2002  5:19 pm



I think you are referring to a variable with an exponential distribution. Mplus does not handle such outcomes. Unless a transformation can be done, I am not sure that approximating this as continuous or categorical is recommendable. With enough uses for these kinds of outcomes, the Mplus team might be motivated to add this feature to their list of future expansions. 

Mahyar posted on Wednesday, March 26, 2003  4:06 pm



I'm new to Mplus and wanted to know whether Mplus has a bootstrap option. If so, what is it? Also, are there any good recent papers on bootstrapping as implemented in Mplus?

bmuthen posted on Wednesday, March 26, 2003  4:08 pm



The current version of Mplus does not have a bootstrap option, but it is on a future wish list. If many users are interested in this, the Mplus team will consider the request. 


I'm interested in learning more about the MLR estimator that is suitable for use with clustered or nonclustered nonnormal continuous or categorical observed variables. I've read Yuan and Bentler's 2000 article and understand that MLR traces its origins to White's 1982 sandwich variance estimator as well as Satorra and Bentler's seminal work that led to the widely used T_MLM estimator. Can you tell me how MLR differs from the usual White sandwich estimator that is used routinely in packages such as SAS (e.g., PROC GENMOD) and Stata (via the robust option in most estimation commands)? I know that MLR assumes incomplete data arise from a MAR process whereas the White sandwich estimator assumes incomplete data arise from an MCAR process, but I am curious to learn, in broad strokes, the differences in computation between these two approaches. Also, if I wanted to compute T_MLR and the associated standard errors manually from the output of an SEM program such as Mplus, how would I go about doing it? With many thanks, Tor Neilands

BMuthen posted on Sunday, June 19, 2005  5:14 am



The MLR formulas are in Technical Appendix 8 on the website. It is the sandwich approach, which is the same as White's. The missing data aspect comes in by using an observed instead of an expected information matrix in this expression. MAR vs. MCAR affects the parameter estimates, not the standard errors. I don't think you can easily compute T_MLR and the associated standard errors from the output.
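To make the "sandwich" idea concrete, here is a minimal sketch of White's heteroskedasticity-robust (HC0) covariance for a simple linear regression with one predictor. It is not the Mplus MLR computation, which works with the observed information matrix of the full model (see Technical Appendix 8); it only shows the bread-meat-bread structure. The function names are hypothetical.

```python
import math


def mat2_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ols_sandwich(x, y):
    """Fit y = b0 + b1*x by least squares.

    Returns ((b0, b1), model_based_se, sandwich_se).
    """
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # "Bread": inverse of X'X for the design matrix [1, x].
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    e2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # "Meat": sum over cases of e_i^2 * x_i x_i'.
    m01 = sum(e * xi for e, xi in zip(e2, x))
    meat = [[sum(e2), m01], [m01, sum(e * xi * xi for e, xi in zip(e2, x))]]
    cov = mat2_mul(mat2_mul(bread, meat), bread)
    # Model-based covariance uses the ML residual variance (divisor n).
    sigma2 = sum(e2) / n
    model_se = [math.sqrt(sigma2 * bread[i][i]) for i in range(2)]
    sandwich_se = [math.sqrt(max(cov[i][i], 0.0)) for i in range(2)]
    return (b0, b1), model_se, sandwich_se


coef, model_se, sandwich_se = ols_sandwich([0, 1, 2, 3, 4], [0.1, 0.9, 2.5, 2.8, 5.2])
print(coef, model_se, sandwich_se)
```

When the model-based and sandwich standard errors differ noticeably, that is a sign the distributional assumptions behind the model-based ones are not holding.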

Manuel posted on Tuesday, August 05, 2008  7:40 am



I noted a slight (but for me important) difference between the third and fifth editions of the User's Guide. The third edition says that "the MLR chi-square test statistic is also referred to as the Yuan-Bentler T2* test statistic" (p. 368), while the fifth edition states that "the MLR chi-square test statistic is ASYMPTOTICALLY equivalent to the Yuan-Bentler T2* test statistic" (p. 484). a) What is the reason for this change? If in a finite sample the MLR chi^2 is NOT equivalent to the Yuan-Bentler T2*, how do the two differ? b) What is the correct reference for the MLR chi^2 (Yuan & Bentler, 2000?)? I checked Appendix 8, which contains no reference to Yuan and Bentler. c) On a related note, am I correct to assume that the MLM chi-square test statistic is equivalent to the Satorra-Bentler chi-square (even non-asymptotically)? What is the correct reference for the MLM chi^2 (Satorra & Bentler, 1994?)? Thank you very much for any insights!


a) They are supposed to be equivalent, but Mplus and EQS report slightly different results in some cases. I think the main variation in the different computations of the robust statistic is due to the way the information matrix is computed: observed or expected. b) Full details on MLR are given in Asparouhov, T. and Muthén, B. (2005). Multivariate Statistical Modeling with Survey Data. Proceedings of the Federal Committee on Statistical Methodology (FCSM) Research Conference. http://www.fcsm.gov/05papers/Asparouhov_Muthen_IIA.pdf c) Yes.

Joonmo Son posted on Sunday, May 23, 2010  4:48 am



I employ an endogenous variable, number of volunteering hours per month, which might be regarded as a censored measure because one cannot spend infinite time on volunteering activities. Is the MLM estimator appropriate for such a censored, nonnormal endogenous variable? Or should I use a different, specialized estimator (e.g., Tobit)? Please note that I am using multiply imputed data sets for path analysis.


If the variable does not have a piling up of values at one end, I would use MLM or MLR, which are robust to nonnormality. If it does, I would use the CENSORED option to obtain Tobit modeling.

Joonmo Son posted on Sunday, May 23, 2010  7:27 pm



Thanks for your kind and helpful answer. Would you give me one or two reference(s) dealing with the issue of how MLM takes care of censored continuous dependent variables? 


MLM/MLR are for skewed variables without piling up at the end points. CENSORED is for when there is piling up. The model used by CENSORED with ML is described in the regression literature under Tobit regression (Google it), and for factor analysis I have a paper: see Muthén, B. (1989). Tobit factor analysis. British Journal of Mathematical and Statistical Psychology, 42, 241-250.
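For readers unfamiliar with the Tobit (censored-normal) model, the following is a minimal sketch of its log-likelihood for a variable censored from below. It illustrates the textbook model only and is not the Mplus implementation. The function names, the censoring limit of 0, and the example values are assumptions made for illustration.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loglik(y, mu, sigma, lower=0.0):
    """Log-likelihood of a censored-from-below normal model.

    y:     observed values (values at or below `lower` count as censored)
    mu:    model-implied means of the latent response, one per case
    sigma: residual standard deviation of the latent response
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi <= lower:
            # Censored case: probability that the latent response is at
            # or below the censoring limit.
            total += math.log(norm_cdf((lower - mi) / sigma))
        else:
            # Uncensored case: ordinary normal density.
            total += norm_logpdf((yi - mi) / sigma) - math.log(sigma)
    return total


# Hypothetical data: two cases piled up at 0 and two observed above it.
print(tobit_loglik([0.0, 0.0, 2.5, 4.0], [0.5, 1.0, 2.0, 3.5], 1.5))
```

The censored cases contribute a probability rather than a density, which is why a pile-up at the limit is handled differently from ordinary skewness.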

Joonmo Son posted on Monday, May 24, 2010  1:11 am



I see your point, and thanks for the reference (I found it!). Would you mind if I raise a further query to make things clear? My dependent variable (volunteering hours per month) is skewed to the left because many respondents (844 out of 1,523, 55% of the respondents) did not volunteer at all, as shown below.

volhours      N
0           844
1            25
2            62
3            35
4            76
5            44
...
80            3
82            1
110           1
112           1
120           1
125           1
200           1
Total     1,523

However, it should be regarded as a right-censored variable because people have a physical limit on the time they can spend volunteering. But as you can see, there is no piling up at the right end point. Do you think that I can use MLM?

Joonmo Son posted on Monday, May 24, 2010  6:25 am



Following on from the previous posting: I do not think that my dependent variable involves left "censoring", because 0 hours of volunteering literally means that the respondents chose not to volunteer. Rather, is it plausible to think of it as a count measure with zero inflation, for which I may employ negative binomial regression? The variable is indeed overdispersed (mean=6.6, S.E.=13.9). If the MLM estimator can take care of my dependent variable, that would be great. Thanks in advance for your answer.


You should not treat this variable as a regular continuous variable using MLM or MLR. You have a strong floor effect (844 subjects at 0), which needs a different model than the regular linear model. You can use either censored-normal or count modeling. Treating it as a count variable is probably best. You can use Poisson, zero-inflated Poisson, or negative binomial. For a discussion of those options, see the Web Talk video on regression with a count dependent variable at http://www.statmodel.com/webtalks.shtml
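A quick back-of-the-envelope check shows why a plain Poisson model is unlikely to fit the figures quoted in this exchange. The sketch below uses the numbers from the posts (mean 6.6, 13.9, N = 1,523, 844 zeros) and assumes that the reported 13.9 is a standard deviation rather than a standard error. The function name is hypothetical.

```python
import math


def poisson_zero_check(mean, sd, n, observed_zeros):
    """Compare a count variable's summary statistics with a Poisson model.

    Under a Poisson distribution the variance equals the mean and the
    probability of a zero is exp(-mean).
    """
    return {
        # Values well above 1 indicate overdispersion.
        "dispersion_ratio": sd ** 2 / mean,
        # Number of zeros a Poisson with this mean would predict.
        "expected_zeros": n * math.exp(-mean),
        "observed_zero_share": observed_zeros / n,
    }


# Figures quoted in the posts; treating 13.9 as an SD is an assumption.
check = poisson_zero_check(mean=6.6, sd=13.9, n=1523, observed_zeros=844)
for name, value in check.items():
    print(name, round(value, 3))
```

A variance far above the mean, together with many more zeros than a Poisson with that mean would predict, points toward the zero-inflated or negative binomial options mentioned in the reply.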


Hello, I am following the discussion of Tobit models. Does Mplus estimate only the standard Tobit model, or also the Tobit type 2 model? How would I specify the second? Thanks a lot.


Remind me what Tobit type 2 is. It has been about 30 years since I looked at that. Was that estimating the censoring point? Mplus only does the classic Tobit of Tobin.


Hi Bengt, the type 2 model adds a second y* for an additional process that determines the choice part (0/1):

y1 = 1 if y1* > 0
y1 = 0 if y1* <= 0
y2 = y2* if y1* > 0
y2 = 0 if y1* <= 0

with two separate regressions for y1* and y2*. It is a bivariate sample selection model, equivalent to a Tobit model with a stochastic threshold, and is also called a probit selection equation. See Cameron, Adrian Colin; Trivedi, Pravin K. (2009): Microeconometrics. Methods and applications. 8. printing. Cambridge: Cambridge Univ. Press. p 547. Maybe I can replicate this with the two-part model?


Don't know the answer, but the two-part thinking is interesting, with a censored-normal (regular Tobit) model for the y2 part instead of the two-part missingness for y2 when y1=0. But you have to check that you get the right likelihood that way, or check against known results for a data set.


Apologies if the answers can be found elsewhere in these posts; I tried to locate them but could not. I am conducting a path analysis (all observed variables) with continuous predictors and a continuous outcome. Two of my predictors are count variables and are highly skewed (piling up on the left end: most participants had few negative events and few alcoholic drinks in the past week). My outcome is a hormone concentration and is usually log-transformed to correct for nonnormality. There are missing data I need to account for. Several questions regarding Mplus: 1) Can Mplus "handle" my two skewed count predictors (piling up on the left end) and if so, which estimator would you recommend? 2) Given the answer to #1, does it matter that the outcome/dependent variable has been log-transformed? 3) Can missing data (MAR or MCAR) be handled by Mplus if the estimator you recommended in #1 is used? Thanks in advance!


1-2. There are no distributional assumptions made about covariates. For both the count and continuous variables, transformations that yield a linear relationship might be considered. You might consider not transforming the continuous variable but instead treating it as censored. I would use the MLR estimator. 3. Yes.

Ted Barker posted on Wednesday, April 25, 2012  11:05 am



Hello, Is there a range of values of skew in which the MLR estimator functions best? Many thanks! Ted 


I know of no articles on this topic. It is a research question. 

Tracy Witte posted on Thursday, April 11, 2013  9:14 am



Given that no distributional assumptions are made about covariates, is there any difference in Mplus between a model that specifies a predictor variable as a count variable and one that leaves that statement out?


The COUNT option is for dependent variables only. In regression, all covariates are treated as continuous variables. 


Dear Drs. Muthén and Muthén, I am very new to SEM and Mplus. I am stuck and would really appreciate your help. I am running several mediation models in which the independent variables are latent variables and the mediators are latent variables. The 4 original outcome variables include (1) 5-category ordered categorical variables and (2) latent variables, for example alcohol problems, where alcohol problems = mean(x1, x2, x3) and x1, x2, x3 are 5-category ordered categorical variables. I converted all outcome variables to a real-valued scale so that they become continuous variables. However, they are very skewed (the skewness is about 9.4, and 7.4 even after a square root transformation). My questions: (1) If the outcome variables are very skewed, can I still model them with estimator = MLR? (2) If not, I am also considering zero-inflated negative binomial regression. Is it OK to use this model? If yes, which outcome variables should I use (the original variables or the transformed ones)? If no, please suggest what kind of model I should use. Many thanks and regards,


(1) If your strong skew is due to strong floor or ceiling effects, MLR won't help, because you may need a nonlinear model instead of the standard linear one. (2) Using ZINB assumes that you have a count outcome, which it doesn't sound like you have. You can treat the outcomes as ordered categorical (ordinal) variables and let them be indicators of a latent variable.

Howard Li posted on Friday, June 03, 2016  10:50 am



Dear Madam or Sir: I am using path analysis in my study, evaluating the mediating role of social norms and self-efficacy in the association between community engagement and condom use. The study has four variables, as follows: 1. community engagement, 8 items, dichotomous (yes or no); 2. social norm, 6 items, 5-point Likert scale; 3. self-efficacy, 7 items, 5-point Likert scale; 4. condom use frequency with different partners, 4 items, each item being its own outcome. My question is: what estimator should I use for this path analysis in Mplus? Thank you so much!!!


When one or more dependent variables are categorical, WLSMV is the default. You can also use ML. 


Dear Drs. Muthen, We may use results from Mplus for an article. I think there are definitional misunderstandings among the coauthors, and you can help to clarify them. First, we have a simple model with five manifest variables: x1 (number of persons), x2 (number of doctors), x3 (a dummy-coded variable for operating room), and y1 and y2 (concentrations of specific particles). We have included the resulting 2-way and 3-way interaction terms: x1x3, x2x3, x1x2, x1x2x3. Because the data are not normally distributed, we used the MLM estimator:

ANALYSIS: ESTIMATOR=MLM;
y1 y2 ON x1 x2 x3 x1x3 x2x3 x1x2 x1x2x3;

1) A coauthor asks what estimation method is used in the model. In my understanding, we use the maximum likelihood approach and, because the data are not normally distributed, we chose the Satorra-Bentler corrections. Is that right, or might something else be meant?
2) Which references can we cite?
3) All variables are manifest. Should we call the model OLS regression or SEM?
4) Because the model is "saturated", fit indices will show perfect fit (either zero or 1.0 for most fit indices) and these statistics cannot be used to determine how well the model fits. Which indices should we report?

Best regards, Martin


1) You are right. 2) See references on our webpage: http://www.statmodel.com/chidiff.shtml 3) I would call it multivariate regression (OLS and ML estimates are the same). 4) I would do the usual regression checks of outliers and plots of residuals. 


I thank you very much! 

Rhyan posted on Thursday, April 12, 2018  7:05 am



Hi. I am running a large path analysis with 5 continuous and 2 dichotomous hypothesized mediators between an ordinal exogenous measure and a dichotomous outcome measure. I'm currently using the Estimator = WLSMV and FIML to account for the different types of variables and missing data. My question is: 4 of the continuous measures are not normally distributed. I've seen mixed thoughts online about transforming these variables. What would you recommend? Thank you. 


The main issue is whether the nonnormal variables have strong floor or ceiling effects, in which case you might want to use the CENSORED option. Note also that mediation with categorical mediators and outcomes is optimally handled by counterfactually defined indirect and direct effects. See the Mediation page on our website.

shonnslc posted on Thursday, May 16, 2019  7:24 pm



Hi, I am running a path model with a severely nonnormal endogenous variable (skewness: 5.45, kurtosis: 36.80). This variable is proportion data (0 to 1). Around 80% of my participants have 0. I found that a cube root transformation is effective in making the variable relatively normal. At the same time, I am wondering whether other models, such as censored regression or censored-inflated regression, are also appropriate for my data. If yes, should I use the transformation or one of these other models? Thank you.


Just use the censored approach; transforming doesn't solve it with an 80% floor effect.
