Choice of estimator
Mplus Discussion > Categorical Data Modeling
 Leigh Roeger posted on Sunday, January 30, 2000 - 9:51 pm
I am working on a multigroup mean-structure analysis. There are 2 groups (boys and girls) who rated their mothers on a 25-item, 4-point (strongly agree, agree, etc.) rating scale. The items are very skewed. The scale consists of three subscales or factors. When the items are simply summed into subscale scores, girls (on average) rate their mothers better than boys on all three subscales.

I have been perplexed by the results produced by different estimators when testing the latent means. In particular, with WLS (when factor loadings and thresholds are invariant between the groups) one of the latent means goes negative, indicating that girls (the second group) rate their mothers more negatively than boys on this factor, despite the raw data saying the opposite.

Any ideas on why or how this happens would be much appreciated.
 Linda K. Muthen posted on Tuesday, February 01, 2000 - 9:17 am
The only thing that comes to mind is that perhaps girls are not the second group. Do they have the higher code on the gender variable? If so, can you send your input or output and data so we can take a look at it and give you a better answer?
 Anonymous posted on Wednesday, June 01, 2005 - 2:21 pm
I don't know why there are differences between MPLUS probit regression and STATA probit regression. Is it because the default MPLUS probit is estimated by weighted least square while STATA probit is estimated by maximum likelihood?

If I specify "ANALYSIS: ESTIMATOR=ML;" then the coefficients and standard errors of the MPLUS logistic regression are the same as those of the STATA logit regression. Can I get the same probit regression results in both MPLUS and STATA?

Thanks!
 bmuthen posted on Wednesday, June 01, 2005 - 5:59 pm
The Mplus sample statistics (requested with SAMPSTAT in the OUTPUT command) give the ML probit regression estimates when there is a single dependent variable; these should agree with STATA. The sample statistics represent the first stage of the Mplus weighted least squares estimator.
 Marleen de Moor posted on Monday, September 05, 2005 - 4:12 am
Dear Linda and Bengt,

I have a few questions concerning categorical data and the TYPE=TWOLEVEL option.

1. Is it true that Mplus uses a logistic regression for all multilevel analyses (TYPE=TWOLEVEL) with a categorical outcome variable, because estimators available are MLR, ML and MLF, and not WLSMV? Is it therefore correct to interpret the beta coefficient as the log odds ratio?

2. In my model I would like to correlate the errors of my two dependent variables, of which one is normal and the other categorical. Is that somehow possible with the option TYPE=TWOLEVEL, or is the only way out using the options TYPE=COMPLEX with ESTIMATOR=WLSMV?

3. Do you have any plans to make it possible to use censored data with TYPE=TWOLEVEL in Mplus in the future?

Thank you very much in advance!
Kind regards, Marleen de Moor
 BMuthen posted on Monday, September 05, 2005 - 2:46 pm
1. Yes.

2. You cannot use WITH to specify a residual covariance when one or more outcomes are categorical in TWOLEVEL analysis with maximum likelihood. You could consider putting a factor behind the two variables as shown in Example 7.16.

3. Yes.
 Sally Czaja posted on Thursday, October 12, 2006 - 1:58 pm
I am testing a path model with 1 independent variable predicting 2 intermediate variables which predict a dependent variable. Each of the endogenous variables has 2-4 control variables. One of the intermediate variables is dichotomous, which makes the default estimator WLSMV. I've read in the Mplus manual and discussion board that this gives a probit regression and that I can specify the estimator as ML to get logistic regression, which makes sense for the dichotomous DV.

But what kind of regression is done with the continuous DVs (i.e., what are these path coefficients/how are they to be interpreted?)?

(continued in 2nd post)
Sally
 Sally Czaja posted on Thursday, October 12, 2006 - 2:10 pm
(continued from prior post re path model with 1 IV predicting 2 intermediate variables which predict a DV)

The path coefficients differ, sometimes substantially:
For the path                                        WLSMV coefficient   ML coefficient
from IV to the dichotomous variable                 .04 (n.s.)          .15 (p<.001)
from dichotomous variable to the final DV           .28 (p<.001)        .53 (p<.001)
from IV to the other intermediate variable          .10 (p<.01)         .07 (p.05)
from other intermediate variable to the final DV    .20 (p<.001)        .14 (p<.001)

What accounts for these differences? They are both more and less than the approx. 1.7 scale difference between logistic and probit. I would have thought the pattern of significance would be the same, even with different methods.

Finally, on what basis do I choose an estimator? The dichotomous variable has a 76/24 split and skewness & kurtosis statistics are n.s., which suggests it could be treated as normally distributed. But if I don't declare it categorical, the fit becomes awful.

I'd really appreciate your help in understanding this area.
Sally
 Linda K. Muthen posted on Thursday, October 12, 2006 - 2:40 pm
The regression coefficients for the continuous dependent variables are simple linear regression coefficients.

The coefficients will differ between WLSMV and ML because one is probit and the other is logit. They are on a different scale. You should be comparing the ratios.

I would choose WLSMV with a 76/24 split.
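
The scale difference between logit and probit mentioned in this exchange can be checked numerically. The sketch below is only an illustration, not Mplus output: it compares the standard normal CDF (probit) with a logistic CDF whose argument is multiplied by 1.7, the conventional approximation constant cited earlier in the thread. The grid range and step size are arbitrary assumptions.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF (inverse of the probit link)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def logistic_cdf(x: float) -> float:
    """Standard logistic CDF (inverse of the logit link)."""
    return 1.0 / (1.0 + math.exp(-x))


SCALE = 1.7  # conventional approximation constant, not an exact conversion

# Largest gap between the probit curve and the rescaled logit curve on [-6, 6].
grid = [i / 1000.0 for i in range(-6000, 6001)]
max_gap = max(abs(normal_cdf(x) - logistic_cdf(SCALE * x)) for x in grid)
print(f"max |Phi(x) - logistic(1.7 x)| = {max_gap:.4f}")
```

The two curves stay close but never coincide, so a logit coefficient is only roughly 1.7 times the corresponding probit coefficient, and differences of other sizes between two fitted models need a different explanation.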
 Sally Czaja posted on Friday, October 13, 2006 - 12:46 pm
Hi Linda
Sorry, but what ratios are you referring to in your 2nd paragraph?

Could you elaborate on why I should use WLSMV? I'll have to explain this to someone else.

Thanks.
 Linda K. Muthen posted on Friday, October 13, 2006 - 2:47 pm
The ratio of a parameter estimate to its standard error. It is the third column of the results.

It seems you want residual covariances. You can't have more than four with maximum likelihood because a model with four dimensions of integration is probably the maximum you can estimate. This is why I recommended WLSMV.
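
To make the est./S.E. ratio concrete, here is a minimal sketch with hypothetical numbers (none of them come from the poster's output). It assumes the logit estimate and its standard error are exactly 1.7 times their probit counterparts, which real fits only approximate, and shows that the ratio and its two-sided p-value are unaffected by such a rescaling.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z ratio under a standard normal reference."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical probit estimate and standard error.
probit_est, probit_se = 0.50, 0.15

# Idealized logit counterparts: both quantities rescaled by the same factor.
SCALE = 1.7
logit_est, logit_se = SCALE * probit_est, SCALE * probit_se

z_probit = probit_est / probit_se
z_logit = logit_est / logit_se

print(f"probit: est = {probit_est:.3f}, z = {z_probit:.3f}, p = {two_sided_p(z_probit):.4f}")
print(f"logit:  est = {logit_est:.3f}, z = {z_logit:.3f}, p = {two_sided_p(z_logit):.4f}")
```

This is why the ratios, rather than the raw coefficients, are the quantities to compare across the two estimators.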
 Sally Czaja posted on Monday, October 16, 2006 - 12:37 pm
Hi Linda
Thank you for your quick responses last week. I have 2 more related questions:

If, as I understand, the coefficients for predictors of continuous DVs are simple linear regression coef. regardless of the estimator (WLSMV or ML), shouldn't they be identical? For 2 paths, I get .20 in WLSMV vs .14 in MLR (both p<.001); and -.13 (p<.01) in WLSMV vs -.05 (p<.05) in MLR (and smaller differences on other paths).

Also, for a predictor of the dichotomous variable, MLR gives an OR of 2.26 and est./SE of 4.92, while WLSMV gives an OR of 2.55 (using exp(Estimate*1.7)) with est./SE of 2.98. Should they be this far apart?

Thanks for your help.
 Linda K. Muthen posted on Tuesday, October 17, 2006 - 7:46 am
They should be the same. You would need to send me your inputs, data, outputs and license number to support@statmodel.com for me to see why they are not.

Odds ratios cannot be computed for probit regression coefficients.
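
One way to see why a single odds ratio does not follow from a probit coefficient is sketched below. Under a logit link the odds ratio for a one-unit change equals exp(b) at every starting point, whereas under a probit link the implied odds ratio changes with the starting point. The logit slope is taken from the odds ratio of 2.26 quoted above; the probit slope of 0.55 and the baseline values are hypothetical, chosen only for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF (inverse of the probit link)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def logistic_cdf(x: float) -> float:
    """Standard logistic CDF (inverse of the logit link)."""
    return 1.0 / (1.0 + math.exp(-x))


def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)


B_LOGIT = math.log(2.26)  # slope implied by the odds ratio of 2.26 quoted above
B_PROBIT = 0.55           # hypothetical probit slope

# Hypothetical values of the linear predictor before the one-unit change.
baselines = [-1.0, 0.0, 1.0]

logit_ors = [odds(logistic_cdf(a + B_LOGIT)) / odds(logistic_cdf(a)) for a in baselines]
probit_ors = [odds(normal_cdf(a + B_PROBIT)) / odds(normal_cdf(a)) for a in baselines]

for a, lo, po in zip(baselines, logit_ors, probit_ors):
    print(f"baseline {a:+.1f}: logit OR = {lo:.3f}, probit-implied OR = {po:.3f}")
```

The logit column is constant while the probit column is not, so exp(Estimate*1.7) is at best a rough approximation rather than a true odds ratio.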
 Ramzi Mabsout posted on Wednesday, October 15, 2008 - 4:39 am
Hi

As of version 5, I see that WLSMV can be used with TWOLEVEL. In a CFA with categorical variables and no covariates, are the loadings probit coefficients?

Why can I not conduct a multi-group analysis with a TWOLEVEL categorical CFA and WLSMV? Is my only alternative to use integration in that case?

Thank you very much.
 Linda K. Muthen posted on Wednesday, October 15, 2008 - 10:06 am
Your only option in this case is numerical integration.
 Ramzi Mabsout posted on Wednesday, October 15, 2008 - 10:42 am
I also cannot conduct analysis with integration: I am requested to use KNOWNCLASS & MIXTURE. Why?
 Linda K. Muthen posted on Wednesday, October 15, 2008 - 11:27 am
When numerical integration is required, multiple group analysis uses the KNOWNCLASS option and TYPE=MIXTURE.
 Richard Rivera posted on Tuesday, June 16, 2009 - 8:13 pm
I am conducting multiple logistic regression on a binary outcome. I have missing data, so I am allowing the default to use missing data theory, and I also included INTEGRATION=MONTECARLO;.

I would like to get unbiased estimates of confidence intervals, and I know that I can't use bootstrap CIs when I am using Monte Carlo integration.

For logistic regression, there are two options for the estimation procedure (ML and MLR). For both of these, I asked for confidence intervals in the output.

When I use ESTIMATOR = ML I get the same point estimates as when I use ESTIMATOR = MLR, so I assume that I get log odds (or odds ratios) with either ML estimator.

However, I get different standard errors, which estimator should I use?
 Richard Rivera posted on Tuesday, June 16, 2009 - 8:25 pm
What I meant to ask:

When conducting multiple logistic regression with missing data, which estimation procedure would give me the least biased estimates of the standard errors (or confidence intervals)?

Thanks
 Paul Silvia posted on Wednesday, June 17, 2009 - 5:59 am
When ML and MLR diverge in their SE estimates, MLR is generally more trustworthy. Broadly, though, this is often a sign to explore residuals, distributions, and possible influential cases.
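
As a rough illustration of how model-based and robust (sandwich) standard errors can diverge for the same point estimate, consider the toy case below. It swaps in an intercept-only Poisson model with hypothetical overdispersed counts instead of the logistic regression discussed in the thread, and it is not the Mplus implementation; it only shows the general idea of a sandwich estimator.

```python
import math

counts = [0, 0, 0, 0, 1, 1, 2, 3, 8, 15]  # hypothetical overdispersed counts
n = len(counts)
rate_hat = sum(counts) / n  # ML estimate of the Poisson rate (the sample mean)

# Model-based variance: inverse of the observed information, which is n / rate_hat.
information = n / rate_hat
model_se = math.sqrt(1.0 / information)

# Sandwich variance: A^-1 * B * A^-1, where B sums the squared per-case scores.
scores = [y / rate_hat - 1.0 for y in counts]
meat = sum(s * s for s in scores)
robust_se = math.sqrt(meat / information ** 2)

print(f"rate estimate  = {rate_hat:.3f}")
print(f"model-based SE = {model_se:.3f}")
print(f"sandwich SE    = {robust_se:.3f}")
```

Both standard errors belong to the same estimate; the sandwich version is larger here because the data vary more than the Poisson model assumes, which is the kind of mismatch worth investigating when the two disagree.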
 Cecily Na posted on Monday, February 07, 2011 - 3:07 pm
Hi Professors,
I am new to Mplus. I used the syntax TYPE = BASIC; ESTIMATOR = ML; to generate a covariance matrix in Mplus. It was not the same as the one produced in SPSS. What is the reason (assuming I treated all variables as continuous)?

Also, when can I use ML? Can I use it for ordered categorical variables?

Thanks!
 Linda K. Muthen posted on Monday, February 07, 2011 - 3:54 pm
It is likely that the sample sizes are not the same. If they are, you may be reading the data incorrectly and should send the problem along with your license number to support@statmodel.com.

Yes, ML can be used for ordered categorical data. See the ESTIMATOR option in the user's guide where there is a table that shows the cases when each estimator can be used.
 burak aydin posted on Tuesday, May 10, 2011 - 4:00 pm
Hi,
An article titled "Propensity score adjustment for multiple groups SEM" (Hoshino, Kurata & Shigemasu, 2006) uses a weighted M-estimator, with propensity scores as the weights.
I wonder whether the WLS estimator does the same job.
Thanks.
 Bengt O. Muthen posted on Tuesday, May 10, 2011 - 5:23 pm
The Mplus WLS estimator is not based on propensity scores. M estimators are sometimes connected with GEE. The connection between GEE and WLSM is shown in

Muthén, B., du Toit, S.H.C. & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Unpublished technical report.

which is on our web site under Papers, SEM.
 Bengt O. Muthen posted on Wednesday, May 11, 2011 - 3:33 pm
Perhaps this can be done using weighted ML, which we call quasi-ML in some of Asparouhov's writing on complex survey data analysis on our web site?
 burak aydin posted on Wednesday, May 11, 2011 - 4:06 pm
I did some further searching and found that a residual-based GLS estimator is what I need. I know Mplus has the traditional GLS estimator. Is there a way to modify the GLS estimator into a residual-based GLS estimator? (Yuan & Bentler, 1997, Mean and covariance structure analysis: Theoretical and practical improvements)

Furthermore, I'd like to know whether there is an estimator that is robust to both non-normality and outliers.
Thanks.
 Bengt O. Muthen posted on Thursday, May 12, 2011 - 9:52 am
Don't know the answer to that. The Mplus GLS does not allow weights.

Outlier detection is available in Mplus - see the UG. MLR is in principle robust to model mis-specification, but how well that works with outliers I'm not sure of.
 Heike B. posted on Thursday, October 20, 2011 - 3:28 am
Dear Drs. Muthen,

I intend to build a manifest path model containing two exogenous variables and 5 endogenous variables. Three of them are mediators.

The observed variables are means from four-step likert scales (two variables actually are single items). That's why I wanted to treat the data as ordinal.

My sample is small (230 objects), and the data are skewed and not normally distributed.

I tried to estimate the model using WLSMV; however, I would now like to add an interaction.

In addition, one endogenous variable ended up with eleven categories, so Mplus did not allow me to declare it as categorical.

Given all this -

1. which estimator would you recommend?

2. If an ML-based estimator is recommended, should I declare all my variables as continuous?

Many thanks in advance.
Heike
 Linda K. Muthen posted on Thursday, October 20, 2011 - 2:00 pm
If the original Likert variables have floor or ceiling effects, I would not recommend summing them.

I think you want an interaction between two observed variables. You can create that as the product of the two variables using the DEFINE command.

Both weighted least squares and maximum likelihood estimation can be used with categorical dependent variables.
 Miho Tanaka posted on Monday, February 13, 2012 - 11:01 am
Hi,

I have been working on an SEM for my dissertation. The primary outcome in my model is binary (whether or not the participant had a hepatitis B screening). The predictors are three latent variables measured by non-normally distributed continuous factor indicators. By default, Mplus uses the WLSMV estimator for both the structural and the measurement part. I would like to know what happens to the measurement model if I keep the default estimator (WLSMV), that is, when WLSMV is applied to non-normally distributed continuous factor indicators. For the CFA (the measurement part only), I may choose to use MLR rather than WLSMV. Is there any significant difference between these two estimators? I understand both estimators are robust to non-normality.

Thanks for your advice.
 Linda K. Muthen posted on Wednesday, February 15, 2012 - 10:25 am
WLSMV is not robust to non-normality of continuous variables. I would use MLR.
 Owis Eilayyan posted on Tuesday, March 20, 2012 - 5:08 pm
Hello,

I am doing a path analysis. I have 5 intermediate continuous variables and one dependent variable.

I am not sure which estimator I should use.

Thanks
Owis
 Bengt O. Muthen posted on Tuesday, March 20, 2012 - 6:32 pm
I would use ML or MLR.
 Owis Eilayyan posted on Tuesday, March 20, 2012 - 9:06 pm
Hi again,

Thanks for your response. I used MLR and I got this error message:

"*** FATAL ERROR
THIS MODEL CAN BE DONE ONLY WITH MONTECARLO INTEGRATION."

is that because i have missing values?

Thanks
Owis
 Linda K. Muthen posted on Wednesday, March 21, 2012 - 7:08 am
Yes, you must have missing values on a mediator. Add INTEGRATION=MONTECARLO; to the ANALYSIS command.
 Owis Eilayyan posted on Wednesday, March 21, 2012 - 7:11 am
OK, if I remove the cases with missing values, can I use the MLR or ML estimator? I don't want to use WLSMV.

Thanks
Owis
 Bengt O. Muthen posted on Wednesday, March 21, 2012 - 7:39 am
When you add Integration=MonteCarlo you are still doing ML/MLR, it's just that you specify a certain algorithm for doing it.

Your dependent variable must have been categorical or count, in which case missing on mediators leads to numerical integration with MonteCarlo when using the ML or MLR estimator.
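
The idea that Monte Carlo integration is just an algorithm for evaluating the same likelihood can be sketched with a toy case. The example below assumes a probit-type outcome model and a standard normal mediator whose value is unobserved, with hypothetical intercept and slope; it is not how Mplus is implemented. The probability of the outcome is obtained by averaging over random draws of the mediator and is compared with the closed form that happens to exist in this special case.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


A, B = -0.5, 0.8  # hypothetical intercept and slope on the mediator

# Monte Carlo: average the conditional probability over draws of the unobserved mediator.
rng = random.Random(20120321)  # fixed seed so the sketch is reproducible
n_draws = 200_000
mc_prob = sum(normal_cdf(A + B * rng.gauss(0.0, 1.0)) for _ in range(n_draws)) / n_draws

# Closed form for this special case: E[Phi(A + B*Z)] = Phi(A / sqrt(1 + B^2)).
exact_prob = normal_cdf(A / math.sqrt(1.0 + B * B))

print(f"Monte Carlo estimate: {mc_prob:.4f}")
print(f"Closed form:          {exact_prob:.4f}")
```

In realistic models no closed form is available, which is why the integral over the missing values has to be approximated numerically while the estimator itself remains ML or MLR.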
 Owis Eilayyan posted on Wednesday, March 21, 2012 - 7:46 am
Actually my independent variables have these missing values.

Thanks a lot
Owis
 Owis Eilayyan posted on Wednesday, March 21, 2012 - 10:01 am
Hello again,

I used Integration=MonteCarlo with the ML/MLR estimator, but I did not get a chi-square value or RMSEA in the output. Is that normal? Also, I got different results (i.e., a different direction of the relationships between variables) with ML/MLR versus WLSMV!
 Linda K. Muthen posted on Wednesday, March 21, 2012 - 11:00 am
When means, variances, and covariances are not sufficient statistics for model estimation, chi-square and related fit statistics are not available.

Please send the two outputs and your license number to support@statmodel.com.
 Owis Eilayyan posted on Wednesday, March 21, 2012 - 11:18 am
When i use WLSMV estimation, i get the fit statistics.

I am using my supervisor's program, and neither of us knows the license number. Where is it usually written?

Thanks
Owis
 Linda K. Muthen posted on Wednesday, March 21, 2012 - 1:08 pm
With WLSMV, the statistics for model estimation are thresholds and correlations.

You can login to your account on the website and see it.
 Owis Eilayyan posted on Wednesday, March 21, 2012 - 1:17 pm
Sorry for bothering you,

but does that mean with WLSMV, i get a wrong result?

i got a good fit model with WLSMV!

Thanks
Owis
 Linda K. Muthen posted on Wednesday, March 21, 2012 - 3:53 pm
We don't make a habit of giving wrong results. WLSMV gives chi-square and related fit statistics.
 Owis Eilayyan posted on Wednesday, March 21, 2012 - 4:18 pm
One more question,
So with WLSMV we get chi-square and related fit statistics, while with ML/MLR we don't. Is that true?
Also, if I use ML or WLSMV I should get similar results, shouldn't I? That is what I understood from your video!

Thanks
Owis
 Bengt O. Muthen posted on Wednesday, March 21, 2012 - 4:19 pm
To understand the different aspects of testing model fit in this situation, see

Muthén, B. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen, & J. S. Long (Eds.), Testing Structural Equation Models (pp. 205-243). Newbury Park, CA: Sage

which is paper #45 at

http://pages.gseis.ucla.edu/faculty/muthen/full_paper_list.htm

This chapter makes the distinction between testing the underlying structure (as WLSMV does) versus testing the model against the data (which isn't always feasible as presumably in your case).
 Bengt O. Muthen posted on Wednesday, March 21, 2012 - 4:24 pm
ML and WLSMV tend to give similar results when the missing data are MCAR (missing completely at random) or MAR as a function of covariates.
 Mauricio Garnier-Villarreal posted on Thursday, April 19, 2012 - 8:04 am
Hi

I am running a simulation study with categorical indicators using the BAYES estimator. I have heard that Mplus uses two methods for handling categorical variables: tetrachoric correlations and direct ML. In the specific case of the BAYES estimator, which method does Mplus use?

thank you
 Bengt O. Muthen posted on Thursday, April 19, 2012 - 10:57 am
Bayes does not use tetrachorics and does not use ML. But like ML, Bayes is a "full-information" estimator that uses all available data in an optimal way. It is equivalent to ML in its missing data handling. Bayes is an estimator in its own right. So Mplus offers 3 major estimators: WLSMV (which builds on tetrachorics/polychorics), ML, and Bayes.
 Owis Eilayyan posted on Monday, April 30, 2012 - 7:16 pm
Hello,

I would like to ask a technical question with Mplus. When i use WLSMV estimator, i get Chi-Square, RMSEA, and CFI values automatically.
My question is: can i get a Chi-Square, RMSEA, and CFI values with ML estimator?

Thanks
Owis
 Linda K. Muthen posted on Tuesday, May 01, 2012 - 10:31 am
With maximum likelihood and categorical variables means, variances, and covariances are not sufficient statistics for model estimation. Because of this, chi-square and related fit statistics are not available.
 Gabriel Nagy posted on Friday, March 01, 2013 - 11:29 am
Dear all,
I have some questions regarding the ODLL algorithm implemented in Mplus.
I'm running a large IRT model including many nonlinear parameter constraints (around 700). ML estimation on basis of the EM algorithm is no longer feasible and the constraints are not supported in the Bayes framework. I've tried out different algorithms and found out that ODLL (in combination with MLF) works well in reasonable time. Unfortunately, I was not able to find any documentation of the ODLL algorithm. I only found out that ODLL optimizes the observed data likelihood directly.

Is ODLL something like JML (Joint Maximum Likelihood)?

Is ODLL an iterative algorithm (Tech 8 doesn't report an iteration history for ODLL)?

What is ODLL exactly doing? Are there any references about this algorithm that might be cited in a manuscript?

What about the performance of ODLL relative to other algorithms, such as EM? I suspect that there might be some reasons that the much slower EM algorithm is routinely used in the IRT framework.

Thank you for your help!
 Tihomir Asparouhov posted on Monday, March 04, 2013 - 11:00 am
ODLL stands for Observed data log-likelihood. The algorithm optimizes the log-likelihood using the Quasi-Newton method.

http://en.wikipedia.org/wiki/Quasi-Newton_method
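
For readers unfamiliar with the idea, here is a minimal one-parameter sketch of a quasi-Newton iteration on a log-likelihood. It is not the ODLL code in Mplus. It uses the secant method, which is the one-dimensional form of a quasi-Newton update, to maximize a Bernoulli log-likelihood in the logit parameter. The data (24 successes out of 100), starting values, and tolerance are hypothetical.

```python
import math


def score(theta: float, successes: int, n: int) -> float:
    """Derivative of the Bernoulli log-likelihood with respect to the logit theta."""
    return successes - n / (1.0 + math.exp(-theta))


def secant_mle(successes: int, n: int, theta0: float = 0.0, theta1: float = 0.1,
               tol: float = 1e-12, max_iter: int = 100) -> float:
    """Solve score(theta) = 0, approximating the second derivative from two scores."""
    s0 = score(theta0, successes, n)
    s1 = score(theta1, successes, n)
    for _ in range(max_iter):
        if s1 == s0:  # slope estimate unavailable; stop at the current iterate
            break
        step = s1 * (theta1 - theta0) / (s1 - s0)
        theta0, s0 = theta1, s1
        theta1 = theta1 - step
        s1 = score(theta1, successes, n)
        if abs(step) < tol:
            break
    return theta1


theta_hat = secant_mle(successes=24, n=100)
closed_form = math.log(0.24 / 0.76)  # known answer for this simple model
print(f"quasi-Newton estimate: {theta_hat:.6f}")
print(f"closed-form logit:     {closed_form:.6f}")
```

Unlike Newton's method, no analytic second derivative is needed; the curvature is estimated from successive gradient evaluations, which is the property that multi-parameter quasi-Newton methods generalize.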

You can look at Tech5 for the iterations.

Use the Mplus manual as a reference.

My experience is that in most cases (but definitely not in all cases) the default EMA algorithm is faster. EMA actually contains ODLL within it and is occasionally deployed.

My suggestion is to spend time simplifying your model constraints. There are 3 types of constraints listed in order of complexity

1) New parameters = function of model parameters

2) Dependent parameters = function of independent parameters

3) anything else

Try to use 1 and 2 as much as you can instead of 3. Model constraints can be written in many different ways and using the most optimal way can improve the estimation dramatically.
 Anna posted on Sunday, June 02, 2013 - 11:21 pm
Hello,

I have a model with five observed variables, A, B, C, D, and E. E is categorical. The model proposes an indirect link, A->C->D->E, while B moderates the A->C path. There are missing values on A, B, C, D. Sample size is around 250.

I would like to know which estimator is more appropriate for testing this kind of model: categorical outcome, aims to test moderated mediation effect, has missing values.

I have tried WLSMV, MLR, and BAYES. The results estimated through these three estimators are actually comparable, and the fit indices in WLSMV and the Bayesian PPC and PSR indicate good fit. I tend to favor Bayesian estimation because it handles missing data well and it does not require normal distribution. But I am not sure to what extent it is favored against the other two estimators in my situation. (I don't have specific estimation of the priors.)

Thank you very much for your help!
 Linda K. Muthen posted on Monday, June 03, 2013 - 10:55 am
Bayes and ML handle missing data in the same way. I would choose them above WLSMV if there is a lot of missing data. You can use non-informative priors in Bayes.
 Anna posted on Monday, June 03, 2013 - 11:44 am
Dear Linda,

Thank you!

I would like to ask more about these estimators. Beside the difference in handling missing data, are there any other concerns in choosing among these methods?

1. Is WLSMV robust for models with interaction terms and a non-normal distribution of indirect effects (e.g., the a*b term)? I read the Muthen, du Toit, and Spisic (1997) technical report and I think that WLSMV often underestimates SEs when the sample is small and skewed.

2. I also wonder if I should correlate the IVs with the interaction term (and perhaps correlate the exogenous covariates) because WLSMV does not automatically do so in the sequential modeling.

3. For MLR, since bootstrapping is not allowed with numerical integration, will this be a big deal for estimation of indirect effects with nonnormal distribution?

Thanks!
 Linda K. Muthen posted on Wednesday, June 05, 2013 - 1:31 pm
1. You can use bootstrap with WLSMV.

2. The model is estimated conditioned on the exogenous variables. Their means, variances, and covariances should not be mentioned in the MODEL command. To obtain these values, do a TYPE=BASIC with no MODEL command.

3. If they have a non-normal distribution, this will not be taken into account.
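
For context on why bootstrapping is attractive for indirect effects, the sketch below shows a generic percentile bootstrap of an a*b product. It is only an illustration under loud assumptions: the data are simulated, ordinary least squares is swapped in for WLSMV or MLR, there is no direct path and no moderator, and the sample size, path values, and number of resamples are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope of y on a single predictor x (intercept included)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def indirect(rows):
    """Product of the x->m slope (a) and the m->y slope (b)."""
    x, m, y = zip(*rows)
    return slope(x, m) * slope(m, y)


rng = random.Random(2013)  # fixed seed so the sketch is reproducible

# Simulated data: x -> m with a = 0.5, m -> y with b = 0.4 (hypothetical values).
rows = []
for _ in range(200):
    x_val = rng.gauss(0.0, 1.0)
    m_val = 0.5 * x_val + rng.gauss(0.0, 1.0)
    y_val = 0.4 * m_val + rng.gauss(0.0, 1.0)
    rows.append((x_val, m_val, y_val))

point = indirect(rows)

# Resample cases with replacement and recompute a*b each time.
boot = sorted(indirect([rng.choice(rows) for _ in rows]) for _ in range(1000))
lower, upper = boot[24], boot[974]  # approximate 2.5th and 97.5th percentiles

print(f"indirect effect = {point:.3f}, 95% percentile CI = [{lower:.3f}, {upper:.3f}]")
```

The interval comes from the empirical distribution of the product, so it does not rely on a*b being normally distributed, which is the concern raised in questions 1 and 3.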