Daniel posted on Tuesday, December 16, 2003 - 6:39 am
Hi, is there a way to determine whether a model is linear in the logit for continuous predictor variables in Mplus? I am following the Hosmer and Lemeshow criteria for variable selection, section 4.2, in the second edition (2000) of their text.
bmuthen posted on Tuesday, December 16, 2003 - 10:03 am
You can always plot the sample logits against x to see if they are approximately linear in x. See Muthen (1993) in the Bollen-Long book - ref on the Mplus web site.
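One way to get the sample logits mentioned above is to group cases into bins of x and compute the log odds of y = 1 within each bin. The following is only a minimal sketch outside Mplus, in Python; the function name `empirical_logits`, the equal-count binning, and the 0.5 continuity correction are assumptions made for illustration, not something prescribed in this thread.

```python
import math

def empirical_logits(x, y, n_bins=5):
    """Return (mean x, empirical logit) for equal-count bins of x.

    x: continuous predictor values; y: 0/1 outcomes of the same length.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    result = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        events = sum(yi for _, yi in chunk)
        size = len(chunk)
        # Adding 0.5 to both counts keeps the logit finite when a bin
        # contains only events or only non-events (an assumed correction).
        logit = math.log((events + 0.5) / (size - events + 0.5))
        mean_x = sum(xi for xi, _ in chunk) / size
        result.append((mean_x, logit))
    return result
```

Plotting the returned logits against the bin means of x should give a roughly straight line if the model is linear in the logit.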
daniel posted on Wednesday, December 17, 2003 - 4:39 am
Don posted on Saturday, August 27, 2005 - 10:27 am
Hi, I have a big logistic model:
i. 1 binary response (dependent, y)
ii. 20 explanatory variables (independent; c1-c15 dichotomous, f16-f20 continuous)
iii. 5 interactions, including "dichotomous*continuous" and "dichotomous*dichotomous"
iv. 7 coefficients that have to be fixed in the logistic regression
v. missing values are everywhere, coded with the "." symbol in the SAS file
When I ran it yesterday, several issues came up:
1. I couldn't use "TYPE=LOGISTIC"; the error message suggested using "TYPE=RANDOM". Can I use "TYPE = general missing h1"?
2. I couldn't use "c1xf16 | c1 XWITH f16" and "c1xc2 | c1 XWITH c2"; the error message suggested using the DEFINE command. I tried "DEFINE: c1xf16 = c1 * f16 c1xc2 = c1 * c2;", but the error message said that "c1xf16" and "c1xc2" are unknown, so it didn't work. Could you please tell me how to create interaction terms in Mplus? Or is there a simple way to specify an interaction, as in S+ ("c1:f16") or SAS ("c1*c2")?
3. Can I use "OUTPUT: standardized sampstat"? If I can't, does it really matter for interpreting the model estimates?
4. Does Mplus report -2*loglikelihood under H0 and Ha? We need the likelihood ratio test to select significant terms, but I couldn't find it in the output.
5. Does Mplus have a built-in function for stepwise model selection based on BIC or AIC? I know S+ has a function of this kind called "stepwise".
1. TYPE=LOGISTIC; is just for univariate logistic regression. If you have categorical outcomes and are using the maximum likelihood estimator, you are estimating logistic regressions. Yes, you can use TYPE=GENERAL MISSING H1; with ESTIMATOR = ML or MLR;
2. If you define new variables, you need to put them at the end of the USEVARIABLES statement. If you did that and still get the message, send your input, data, output, and license number to email@example.com.
3. Yes, you can use this in most cases. You will get sample statistics and standardized parameter estimates.
4. You get a loglikelihood value for your H0 model.
Please send further questions of this type along with your license number to firstname.lastname@example.org. We try to reserve Mplus Discussion for questions that do not fall under Mplus support.
Don posted on Wednesday, August 31, 2005 - 12:53 pm
I am working with a data set of fewer than 18,000 subjects (18,000 rows in the SAS data file), of which fewer than 12,000 are complete. In the future, I am told, I will have to deal with 300,000 - 600,000 subjects (rows in the SAS data file).
Could Mplus handle a data set this large? Could you please tell me the limit on rows or records in Mplus?
Hi, I have a problem with calculating propensity scores: I want to predict a binary (treatment/non-treatment) outcome to obtain the probabilities of group membership given the persons' covariate differences. Problem: I have a hierarchical data set with some non-ignorable missing data, so Mplus seemed to be the option. But I encountered two problems that I couldn't solve with the manual alone:
1) Theoretically I think a "TYPE = LOGISTIC MISSING H1 CLUSTER"-analysis would be appropriate but it seems that this combination doesn't work.
2) I need to have the estimated probabilities to which outcome class my cases belong. [In the manual I just found an option for latent class analysis (these CPROBABILITIES)]. I couldn't find a similar option for the logistic regression.
So my question would be: Can Mplus solve these two problems? How can I do that?
1. I think what you want is TYPE=COMPLEX MISSING; with ESTIMATOR = ML; This will give you logistic regression when you have categorical outcomes. TYPE=LOGISTIC; is only for univariate logistic regression and is limited in which options can be used with it.
2. You would need to calculate probabilities from the logistic regression coefficients. How to do this is described in Chapter 13.
Thank you, that was very helpful already. The model seems to work (although with MLR instead of ML, but that should be fine, I guess). As for 2.), Chapter 13 did not address the problem I have. I need the individual values for predicting class membership, and I can't find a command to request these from Mplus (or to get the odds or something similar). To illustrate what I need: I am referring to the values that one gets from the SPSS logistic regression with the option "regression -> binary regression -> save -> predicted values -> probabilities", or in syntax: LOGISTIC REGRESSION y_binary /METHOD = ENTER x_contin /SAVE = PRED DEV
I hope that I was able to explain myself a bit better...
Thank you again!
BMuthen posted on Thursday, September 15, 2005 - 2:42 pm
Mplus does not have an automatic way to compute the values that you want. You would need to compute them using DEFINE in a second run. Let's say that there is one predictor x and therefore one estimated slope b equal to .5 say and one intercept equal to .2 say.
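The calculation described here can be sketched as follows, using the slope of .5 and intercept of .2 from the post. This is a small Python illustration of the arithmetic only; the function name `predicted_probability` is made up for the example, and in Mplus the same two expressions would be written in DEFINE with the estimates from the first run.

```python
import math

def predicted_probability(x, intercept=0.2, slope=0.5):
    """Predicted probability of y = 1 for one case with predictor value x."""
    logit = intercept + slope * x          # linear predictor on the logit scale
    return 1.0 / (1.0 + math.exp(-logit))  # inverse logit

# Example: probabilities for three cases with x = -1, 0 and 2.
probabilities = [predicted_probability(x) for x in (-1.0, 0.0, 2.0)]
```

Note that Mplus reports a threshold rather than an intercept for a binary outcome, so the intercept used here is the negative of the reported threshold.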
For logistic and proportional odds (ordered categorical I assume), a latent response variable variance is used. Its residual has variance pi squared divided by three. For multinomial logistic regression, standardized coefficients are not provided.
David Bard posted on Thursday, December 07, 2006 - 11:58 am
Hmmm, I've tried using that value but cannot replicate the M+ values. For example, my point estimate for a single latent variable predictor's logistic coefficient is .277. The variance of this latent predictor is .78. The reported Std coefficient is .245 which does = .277*sqrt(.78). But the reported StdYX is .128 which by my hand calculation does not equal .277*sqrt(.78)/(pi/sqrt(3))=.135. Am I setting up the calculation correctly?
I am using a bivariate panel model across four time points with one dichotomous (R) and one continuous (S) variable. I assume an autoregressive process for R and use S as a time-varying covariate.
VARIABLE: NAMES ARE R03 R04 R05 R06 S03 S04 S05;
          CATEGORICAL ARE R03 R04 R05 R06;
ANALYSIS: ESTIMATOR = ML;
MODEL:    R06 ON R05 S05;
          R05 ON R04 S04;
          R04 ON R03 S03;
          !S05 WITH S04 S03;
          !S04 WITH S03;
Two quick questions:
1. The model is exactly what I want, but unfortunately the only fit statistics I get are LL(H0), AIC, BIC and adj. BIC, which are more or less helpful in comparing slightly different specifications, but not enough evidence for my reviewers;-) What other options (fit indices, etc.) do I have to justify its use?
2. Regarding the last two omitted ("!") statements: It is my understanding that by default Mplus assumes all exogenous variables to be correlated (as in standard regression). That is what I also assume for all S variables, although I am not particularly interested in this correlation. Parameter estimates are the same with or without the WITH statement, but fit indices differ, so should I include or omit the WITH statement when assessing & reporting model fit?
1. With ML, this is all you get. You can use WLSMV and probit regression if you want to see more traditional fit statistics.
2. The means, variances, and covariances of the exogenous observed variables are not parameters in the model unless you bring them into the model as you do with your WITH statements. When you bring them in, you make distributional assumptions about them.
In a path analysis with binary outcomes,
y1 on x z;
y2 on x z;
y3 on y1 y2 z;
1. Is it possible to estimate indirect effects and get an odds ratio interpretation? Using ML gives an error message.
2. Is the indirect effect of x on y3 through y1 equivalent to the product of the coefficients for y1 on x and y3 on y1?
1. Yes. 2. Unstandardized. 4. If probabilities are not part of the results, there is no option to compute them.
The formula is the same for continuous or binary covariates. You simply select an appropriate value of the covariate.
fritz posted on Monday, November 17, 2008 - 11:02 am
Thanks a lot. I've got another follow up question:
I computed a logistic regression with SPSS and Mplus. The results are quite comparable. The only striking difference is the algebraic sign of the constant/intercept. In SPSS it is negative, in Mplus positive; in both programs it is about +/-9. I don't understand this.
In addition, when computing logit and prob, I get a zero variance for prob. I tried the same calculation for the logit with a negative sign for the Mplus intercept. This led to probabilities comparable to those I got in SPSS. Do I have to multiply the intercept by -1?
I conducted a multiple logistic regression with the default procedure (FIML to estimate parameters) for data with missingness. Moreover, I used MLR estimation with Monte Carlo numerical integration.
I would like to calculate the Satorra-Bentler chi-square difference test between the model (with multiple indicators) and a model with just the intercept. However, I need to know how to get the loglikelihood of the intercept-only model.
Rebeca posted on Friday, January 06, 2012 - 2:27 pm
I conducted a logistic regression analysis with a dichotomous moderating variable (i.e. gender). My results showed two significant interactions but I am having trouble figuring out how to interpret these findings in MPLUS.
I tried to probe the interactions similarly to how I would in SPSS (i.e., create a separate variable for boys and one for girls, create new interaction term with variable, and re-run the analyses separately for boys and girls); however, this did not seem to work as the results showed that nothing was significant, which doesn't make sense.
Also, I thought of exporting the data to SPSS so that I could probe and graph the interactions there, but I don't think that the values in the output or any other scores that I could save would allow me to do that.
I've looked through all the forums and wasn't able to find anything that was especially helpful for this particular issue. Do you have any other suggestions? Thank you.
It sounds like you created an interaction variable as x1*x2, where x2 is dichotomous, and you are interested in knowing what the slopes of x1 are for the different genders in the regression
y = a + b1*x1+b2*x2+b3*x1*x2+e.
If that is the case and x2 is scored 0/1, you simply use your regular regression knowledge to find that for x2=0 the x1 slope is b1 and for x2=1 the x1 slope is b1+b3.
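As a worked illustration of this rule, the two group-specific slopes and the corresponding odds ratios can be computed directly from the estimates. The sketch below is in Python, and the values of b1 and b3 are hypothetical placeholders, not results from this thread.

```python
import math

def x1_slope(b1, b3, x2):
    """Slope of x1 at a given value of the 0/1 moderator x2."""
    return b1 + b3 * x2

# Hypothetical estimates, for illustration only.
b1, b3 = 0.40, -0.25

slope_x2_0 = x1_slope(b1, b3, 0)  # equals b1
slope_x2_1 = x1_slope(b1, b3, 1)  # equals b1 + b3

# On the logit scale, exponentiating a slope gives the odds ratio
# for a one-unit increase in x1 within that group.
odds_ratio_x2_0 = math.exp(slope_x2_0)
odds_ratio_x2_1 = math.exp(slope_x2_1)
```

Testing whether the x2 = 1 slope differs from zero requires the standard error of b1 + b3, which depends on the covariance of the two estimates and is not shown here.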
Rebeca posted on Tuesday, January 10, 2012 - 9:09 am
Thank you. Just to clarify, I know that a=the constant but do I get this number from the threshold section? If so, because there are two values here do I assume that they are the two values of my dichotomous outcome variable (dropout$1;graduated or dropout$2;dropout) such that I would choose the value in the estimate column that corresponds to the value I am most interested in (i.e., dropout)?
Relatedly, I know that e=the error term, but where in the Mplus output would I be able to locate this value? Thank you again for all of your help, it is greatly appreciated.
With a dichotomous outcome, the a term is obtained as the negative of the threshold for this outcome. But it sounds like you have more than 2 outcome categories and have either an ordinal or nominal outcome.
You don't use e when you plot interactions - the expected y does not include that residual. Note that the equation I gave is for a continuous dependent variable, which is the logit DV in your case. Translating that to probabilities for the dichotomous outcome is described in our Topic 2 short course - see handout and video on our web site.
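To make the translation to probabilities concrete, here is a minimal Python sketch that takes a as the negative of the threshold, builds the expected logit from the regression equation without the residual e, and applies the inverse logit. The function name and all numeric inputs are assumptions for illustration; the short course handout mentioned above remains the reference for the actual procedure.

```python
import math

def probability_from_threshold(threshold, b1, b2, b3, x1, x2):
    """Probability of the higher outcome category for given x1 and x2.

    The intercept a is the negative of the threshold reported for a
    dichotomous outcome; the residual e is not part of the expected value.
    """
    a = -threshold
    logit = a + b1 * x1 + b2 * x2 + b3 * x1 * x2
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical estimates: one curve per gender over a grid of x1 values.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
curve_x2_0 = [probability_from_threshold(1.0, 0.4, 0.3, -0.25, x1, 0) for x1 in grid]
curve_x2_1 = [probability_from_threshold(1.0, 0.4, 0.3, -0.25, x1, 1) for x1 in grid]
```

Plotting the two curves against the grid shows the interaction on the probability scale.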
Emil Coman posted on Thursday, January 12, 2012 - 2:04 pm
I am also trying to save the probabilities using Bengt's suggestion above, and I have, e.g.,
DEFINE:
logit = 0.267*sex - 0.011*age;
prob = 1 / (1 + exp(-logit));
Now, what do I ask for in SAVEDATA, and how? I tried
SAVEDATA:
FILE IS logistic_1.csv;
MNAMES logit prob subject;
MSELECT = logit prob;
but the saved file does not contain the 2 newly defined variables. Any suggestions? Thanks, Emil
You would need to put the new variables, logit and prob, at the end of the USEVARIABLES list. Remove the MNAMES and MSELECT options. They are for merging two data sets.
Rebeca posted on Tuesday, January 17, 2012 - 8:03 am
The outcome variable only has two values that are labeled but I think that the problem might be that during the analysis, some participants have missing data for the outcome variable and it is being interpreted as being a third value. Consequently, it is assuming that this is an ordinal variable.
However, I was under the assumption that when you run an analysis in MPLUS and run it with syntax to estimate missing data, it would automatically identify the missing value code and not interpret it as an additional value.
If the missing value code is on the MISSING list for that variable, it will not be treated as a legitimate value. It sounds like you are not reading the data correctly. Please send the input, data, output, and your license number to email@example.com.
You can't use a latent variable (intent) in Define - it has to be an observed variable in your data.
Instead, use TECH4 in the run where you got the estimates and get the mean and variance of intent. Then use the formula you have (I assume 1.106 is the threshold) for different points on the intent scale (say mean, 1 SD below the mean, 1 SD above the mean).
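The hand calculation suggested here might look like the following Python sketch. The threshold 1.106 is taken from the post; the slope, mean, and variance of intent are hypothetical stand-ins for the values one would read from the estimates and from TECH4, and the logit form (negative threshold plus slope times intent) is an assumption about the formula being referred to.

```python
import math

def probabilities_at_points(threshold, slope, mean, variance):
    """Probability of the outcome at the mean of a latent predictor and
    at 1 SD below and above it."""
    sd = math.sqrt(variance)
    points = {"-1 SD": mean - sd, "mean": mean, "+1 SD": mean + sd}
    result = {}
    for label, value in points.items():
        logit = -threshold + slope * value  # intercept = -threshold (assumed form)
        result[label] = 1.0 / (1.0 + math.exp(-logit))
    return result

# Hypothetical slope, mean and variance of intent; threshold from the post.
probs = probabilities_at_points(threshold=1.106, slope=0.8, mean=0.0, variance=1.5)
```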
Hi, I am predicting the default of companies with variables that are time-varying. For example:
Year  Default  Comp.  1Variable  2variable
1999  0        ABC    200        10
2000  0        ABC    50         7
2001  1        ABC    20         2
1999  0        KKK    201        5
2000  1        KKK    100        5
How can I run a logit regression where 1 = defaulted in year t and 0 = not?
health on childsep childhealth kiddum1 kiddum3 kiddum4 ;
I get an error message :
One or more variables have a variance greater than the maximum allowed of 1000000. Check your data and format statement or rescale the variable(s) using the DEFINE command. Then it indicates a problem with kiddum variables.
How can I fix this? It doesn't matter which dummies or binaries I add in, I get the same problem. Do I need to rescale all the dummy/binary variables? Is there some important bit of code I have missed out?
I would really appreciate any suggestions you have.
Hi, Thanks for the quick reply, Sorry I did not save the output. However I tried re-creating the dummies using the define option and it worked fine. Not sure why. Now I have another query. I want to run a model with latent DV, 2 latent IVs and many more categorical/continuous IVs. The only way I got this to fit was to use the WLSMV estimator (otherwise it says I have too many integration points), but ideally I want to use logistic regression and get odds ratios. Is there any way I can fit the model but obtain odds ratios?
ML with the default logit link is the only way to get odds ratio results. Check the output (also TECH8) to see how many dimensions of integration you have and if more than say 5, try to simplify your model.
Briana Chang posted on Wednesday, January 07, 2015 - 7:14 am
I am running a series of linear regression and logistic regression models in Mplus. It is my understanding that for simple linear regression with manifest variables the output "Chi-Square Test of Model Fit for the Baseline Model" indicates whether or not the estimation of a regression model is meaningful (i.e., when significant, the baseline model is rejected and the regression results are meaningful).
Is there a commensurate set of parameters provided in the output for running a logistic regression?
The baseline model sets the x-y covariances to zero, that is, the regression slopes are all zero. You can fit the same restricted model for logistic regression and compare it with your model by a likelihood ratio test, 2 times the LL difference. To convince yourself, try this in the continuous case and check that it reproduces the baseline results.
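The likelihood ratio test described above can be sketched as follows. This is a plain-Python illustration assuming ML estimation; the loglikelihood values and degrees of freedom are hypothetical, and the helper `chi2_sf` is written here only for the example, valid for positive integer degrees of freedom. With MLR the difference needs a scaling correction, which this sketch does not include.

```python
import math

def lr_statistic(ll_baseline, ll_model):
    """2 times the loglikelihood difference between nested models."""
    return 2.0 * (ll_model - ll_baseline)

def chi2_sf(stat, df):
    """Upper-tail chi-square probability for positive integer df."""
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # exact value for df = 1
    else:
        q, k = math.exp(-half), 2              # exact value for df = 2
    while k < df:                              # step up two df at a time
        k += 2
        q += half ** (k / 2 - 1) * math.exp(-half) / math.gamma(k / 2)
    return q

# Hypothetical loglikelihoods: slopes fixed at zero vs. slopes estimated.
stat = lr_statistic(ll_baseline=-520.3, ll_model=-512.1)
p_value = chi2_sf(stat, df=3)  # df = number of slopes fixed at zero (assumed 3)
```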
I am running a censored regression based on 10 imputed data sets (MI done in SPSS, TYPE = IMPUTATION under DATA in Mplus). I have 2 latent factors (borderline dimensions) as exogenous predictors, and several dichotomous variables (diagnoses) and one continuous variable (depression level) as mediating variables, which are declared as categorical. The outcome is hospitalized self-harm (in somatic hospitals). Mplus does not report standardized regression weights in this model, even though it did in a model with two observed predictors of the censored variable. Is there anything I can do about this?
I keep getting the following error messages when trying to compute probabilities for a two-group model.
*** WARNING in MODEL command
Variable is uncorrelated with all other variables: MVLOGIT
*** WARNING in MODEL command
Variable is uncorrelated with all other variables: MVPROBIT
*** ERROR
.165+(.030* INTRA_CT) +(-.015*INTER_CT) +(.072* acad) +(.069* SES) +(.0000* TXCSTD) +(-.18*
I tried to follow the instructions above, but I must be missing something. Here is my input for the first group.
MVlogit=.165+(.030* Intra_ct)+ (-.015*inter_ct)+(.072* acad) +(.069* SES)+(.0000* TXCSTD) +(-.18*nonwhite)+ (.257* opsexrto)+
(.422* -.047)+ !avg Mis
(-.02*-.027)+ (.063*.131)+ !avg fr2
(.127*.004); !avg lsc
MVprobit=1/(1+exp(-MVlogit));
In a logistic regression model in MPlus, is there any way to output a Receiver Operating Curve (ROC) as a measure of classification ability of the model? And/or sensitivity and specificity? Or might there be a way to calculate these from the output MPlus gives?
My Code so far is (I'm estimating missing data):
CATEGORICAL IS C;
USEVARIABLES ARE A B C;
MISSING ARE ALL (9999);
ANALYSIS: ALGORITHM=INTEGRATION;
integration = montecarlo;
MODEL: C ON A B;
[A B];
OUTPUT: STD STDY STDYX CINTERVAL;
I am running a logistic regression with 20 imputed datasets. However, when I run this, it does not report the odds ratios. I see that these are provided when running the logistic regression with just one dataset. Is there a way to receive ORs when running logistic regression with multiply imputed datasets?
Try expressing this in Model Constraint using model parameter labels.
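Assuming this reply concerns odds ratios, the quantity to express is simply the exponentiated logistic slope. The Python sketch below shows that arithmetic for a single coefficient; the estimate, its standard error, and the function name are hypothetical, and the confidence limits are obtained by exponentiating the limits on the logit scale, which is one common choice rather than something stated in this thread.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Odds ratio exp(b) with an approximate 95% interval from b +/- z*se."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical pooled slope and standard error from the imputation run.
odds_ratio, lower, upper = odds_ratio_with_ci(b=0.5, se=0.2)
```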
May Gong posted on Sunday, July 21, 2019 - 7:54 am
We are running a logistic regression with latent independent variables and categorical dependent variables; however, the standardized regression results are higher than 1. Also, which index should be checked instead of the ORs? Thanks in advance.