Post hoc power analysis using monte c...
 Xu, Man posted on Thursday, October 10, 2013 - 4:13 pm
Dr. Muthen,

I am trying to conduct a post hoc power analysis using the parameters saved from my real data analysis (as in Example 12.7).

In my analysis an observed continuous predictor is used to predict a few latent factors based on observed categorical indicators. I tried a few specifications and I found that:

1. I must use theta parameterisation in order to save the parameters,

2. In order for the simulation input file to use the saved parameters, I must specify the variance and mean of the predictor in both the real data analysis and the simulation input file.

Could I check with you that both points are necessary and correct?

Thank you!

Kate
 Linda K. Muthen posted on Friday, October 11, 2013 - 10:51 am
1. A better approach with WLSMV is to use the SVALUES option, which gives you input statements that include the final estimates as starting values. For categorical variables, you need to include the variances of the latent response variables underlying the categorical variables. See a Monte Carlo example that uses WLSMV to see how this is done.

2. You need to give the means, variances, and covariances of the covariates in MODEL POPULATION but not in MODEL.
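The two points above can be sketched as a single Monte Carlo input. This is only an illustration under assumptions: the variable names (u1-u4, x, f), the sample size, and every population value are hypothetical, and the GENERATE settings and latent response variable variances should be checked against an official WLSMV Monte Carlo example before use.

```mplus
TITLE:  Monte Carlo power sketch, WLSMV (names and values hypothetical)
MONTECARLO:
        NAMES = u1-u4 x;
        GENERATE = u1-u4 (1);      ! (1) = one threshold, i.e. binary items
        CATEGORICAL = u1-u4;
        NOBSERVATIONS = 500;
        NREPS = 1000;
        SEED = 1234;
ANALYSIS:
        ESTIMATOR = WLSMV;
        PARAMETERIZATION = THETA;
MODEL POPULATION:
        [x@0]; x@1;                ! covariate mean and variance: only here
        f BY u1-u4*0.8;
        f@1;                       ! residual variance of f
        u1-u4@1;                   ! residual variances of u* (theta)
        [u1$1*0.5 u2$1*0.5 u3$1*0.5 u4$1*0.5];   ! thresholds
        f ON x*0.3;
MODEL:
        f BY u1-u4*0.8;            ! no mean or variance for x here
        f@1;
        [u1$1*0.5 u2$1*0.5 u3$1*0.5 u4$1*0.5];
        f ON x*0.3;
```

In a real application the values after * and @ would be taken from the SVALUES output of the real-data run, and the covariate mean and variance from its sample statistics.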
 Xu, Man posted on Friday, November 15, 2013 - 5:10 am
Thank you very much for your suggestions (and sorry for the delayed follow up).

Could I please first check about standardised output with ordinal indicators under the WLSMV estimator? I noticed that if the variance of the covariate predictor (for latent variables based on ordinal items) is specified in the model, I am able to obtain standardised output together with p values and standard errors. This is exactly what I want, but I just wanted to double check: are these p values trustworthy?

Regarding point 1 from your previous reply, I tried SVALUES to get values for a potential post hoc power analysis population model. But I don't think it gives values for the variances of the latent response variables either, so I am not sure how to proceed in this regard...

Also, I actually have a few hundred models like this that I need to do post hoc power analysis on. So even if I could work out the specifications of the latent response variable variances, it would probably be too much work to repeat for each one. If the parameters are saved in a separate file, then I can write scripts for it using MplusAutomation in R. This would let me run all the power analyses relatively simply.

All in all, do you think WLSMV with theta (so that I can save parameters in a separate file to feed in the power analysis population model) could be viable for me?

Thank you very much!
 Xu, Man posted on Friday, November 15, 2013 - 5:32 am
I use complete data, by the way, so no missing values were present.
 Linda K. Muthen posted on Friday, November 15, 2013 - 11:46 am
With WLSMV you should not add the variances of the observed exogenous variable to the MODEL command as you can do with maximum likelihood estimation. It changes the results. Wait for Version 7.2 if you need these values.

You get the values for the latent response variables in the R-square section where residual variances are printed.

This could be viable.
 Xu, Man posted on Monday, November 18, 2013 - 5:43 am
Thank you. Is it a problem even for complete data? When will V 7.2 become available please?

I added the variances of the observed exogenous variables to the MODEL command because this seemed to be the only way for me to make use of the two-step power analysis function under WLSMV (with theta parameterisation). Although the two-step approach works perfectly well under the ML estimator (if I do not declare the indicators to be categorical) without adding the covariate variance, once I switch to WLSMV it no longer works (unless the covariate variance is estimated).

Regarding the values for the latent response variables in the R-square section, do you mean under delta parameterisation? Under theta I think it is something called the Scale Factors.
 Xu, Man posted on Monday, November 18, 2013 - 6:24 am
Just in case I have not explained myself clearly enough: because I have hundreds of models to analyse, using the two-step approach is the most straightforward way for me with the R package MplusAutomation.

Although working with the SVALUES option plus the residual variances would certainly be possible if I had only a few models, for as many as hundreds I would need to write much more complex scripts, so I prefer the two-step approach...
 Bengt O. Muthen posted on Monday, November 18, 2013 - 8:34 am
By "Two-step", do you mean what's on slide 113 of our Topic 4 handout?
 Xu, Man posted on Monday, November 18, 2013 - 8:44 am
Sorry for the confusion. Slide 113 and the rest are actually really helpful for me. But here I meant the "two-step" as in Example 12.7's step 1 and step 2, and specifically the problem with step 1 discussed in this thread. Thanks a lot!
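For readers unfamiliar with that two-step pattern, a minimal sketch follows. It assumes continuous indicators and no covariate, so it sidesteps the WLSMV issue discussed in this thread; the file names, variable names, sample size, and number of replications are all hypothetical.

```mplus
TITLE:    Step 1 sketch: real-data analysis that saves its estimates
DATA:     FILE = real.dat;             ! hypothetical file name
VARIABLE: NAMES = y1-y6;
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
SAVEDATA: ESTIMATES = estimates.dat;   ! hypothetical file name
```

```mplus
TITLE:    Step 2 sketch: Monte Carlo study fed by the saved estimates
MONTECARLO:
          NAMES = y1-y6;
          NOBSERVATIONS = 500;         ! hypothetical sample size
          NREPS = 1000;
          SEED = 1234;
          POPULATION = estimates.dat;  ! values used to generate data
          COVERAGE = estimates.dat;    ! values used to compute coverage
MODEL POPULATION:
          f1 BY y1-y3;
          f2 BY y4-y6;
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
```

Because step 2 needs no hand-typed parameter values, only the file names change from model to model, which is what makes this pattern convenient to script across many models.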
 Bengt O. Muthen posted on Monday, November 18, 2013 - 9:32 am
Ah, that 2-step.

When you bring the x variables into the model with categorical outcomes and WLSMV, it makes for different model assumptions. You no longer assume normality for the y* outcomes conditional on the x's; instead you assume joint normality of the y*'s and the x's. That is, you move from Case B to Case A, using the terms in the Muthen (1984) Psychometrika article.
 Xu, Man posted on Monday, November 18, 2013 - 10:11 am
Thank you - I see, and that is probably why the step 2 Monte Carlo input file would not read in the saved parameter value file under WLSMV unless the covariate variances were specified in the MODEL command in step 1.

I will try to read the article (I expect it to be quite difficult given my maths limitations...). But could I first ask: would this joint normality assumption usually lead to serious bias in model estimation?

At first glance, the results (including p-values) seem to be quite similar across the models with and without estimating the covariate variances. If the joint normality issue is not usually too serious, then it would be great for me to have both p values for the standardised model results and the ease of using the two-step power analysis function.
 Xu, Man posted on Friday, November 29, 2013 - 1:48 am
I'd like to ask a follow-up question to the power analysis please.

So I have conducted a power analysis based on the parameters obtained from the data analysis. The original model has 5 latent factors (with ordinal indicators) predicted by a continuous covariate. My key interests are the regression paths from the observed predictor to the latent factors. I noticed that in the real data analysis, the factor with the largest standardised regression coefficient has a larger p value than a factor with a smaller standardised regression coefficient. This was also reflected in the power analysis: the factor with the smaller effect size has larger power.

I am a bit puzzled because I would have thought that given the same sample size the effect size should be in proportion with p value, and that it should also correspond to power.

However, I already know that the factor loadings for the factor with the large effect were a bit low, so maybe it's the unreliability that caused the high p value and the low power in the simulation?

Your comments will be greatly appreciated!

Thank you very much!
 Linda K. Muthen posted on Friday, November 29, 2013 - 9:40 am
Effect size is not all that determines p-values or power. Standard errors must also be considered.
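To spell this out with a rough sketch (normal approximation, two-sided test at the 5% level; the symbols are generic and not taken from the thread): the Est./S.E. column in the output is the estimate divided by its standard error, and both the p-value and the power follow from that ratio rather than from the estimate alone.

```latex
\[
z = \frac{\hat{\theta}}{\mathrm{SE}(\hat{\theta})}, \qquad
p = 2\,\bigl(1 - \Phi(|z|)\bigr), \qquad
\text{power} \approx \Phi\!\left(\frac{|\theta|}{\mathrm{SE}(\hat{\theta})} - 1.96\right)
\]
```

Two paths estimated on the same sample can therefore rank differently on effect size and on p-value. A factor measured by weakly loading indicators can have a larger standard error for its regression path, which lowers z and power even when its standardised coefficient is the larger one.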
 Xu, Man posted on Friday, November 29, 2013 - 10:00 am
Thanks. Sorry, my basic stats has got rusty :-( I thought a simple correlation's s.e. is also a function of the correlation itself and the sample size, whereas here the effect size is essentially the simple correlation (because only one predictor is present, and the sample size is fixed). But perhaps with latent variables it gets more complicated.

Regarding the previous reply on doing power analysis (Friday, October 11, 2013 - 10:51 am), I used the values from SVALUES as you suggested previously (without specifying the variance of the covariate in the analysis model). But it seems that I still need to specify population values for the covariate's variance and mean. So I just ran SAMPSTAT using only this covariate as input (I use complete data) and then used that for the population values in the Monte Carlo.

Is this OK?

Thanks!
 Bengt O. Muthen posted on Friday, November 29, 2013 - 5:25 pm
That's fine. I would specify the covariate means, variances, and covariances in the Model Population command but not in the Model command.
 Xu, Man posted on Monday, December 02, 2013 - 2:24 am
Thank you. Yes, I specified the covariate means, variances, and covariance (in the form of a regression coefficient) in MODEL POPULATION, and I didn't put the covariate means or variances in the MODEL command.

However, I noticed that the power for that regression coefficient has become super high. The unstandardised regression coefficient remained the same, but the standard errors have decreased dramatically. In the real data analysis, the StdYX regression coefficient was only about 0.1 with a p value of about 0.005, but it became 0.36 in one of the simulated datasets, with a z value of 9...

It seems that the observed covariate variances might be different from the one used on calculating the standard errors.
 Xu, Man posted on Monday, December 02, 2013 - 2:43 am
Just to report the variances from adding and not adding the covariate when doing the real data analysis, in case this could be of help.

1. Analysis with covariate variances specified in the real data analysis

Var of Cov: 0.227
Residual Var of latent variables:
f1 0.365; f2 0.052; f3 0.158; f4 0.221; f5 0.224

2. Analysis withOUT covariate variances specified in the real data analysis

Var of Cov: 2.33 (from descriptive stats)
Residual Var of latent variables:
f1 0.363; f2 0.055; f3 0.159; f4 0.221; f5 0.221
 Xu, Man posted on Monday, December 02, 2013 - 5:05 am
Ah, actually now I see that I entered the variance of the covariate a decimal place too big compared with the sample stats...

Now I have corrected it, and the power has returned to the expected level.

Still, adding the variance of the covariate in the analysis model seems to make quite little difference to the results. So maybe the multivariate normality assumption does not have too much impact.
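A back-of-the-envelope check of the decimal-place slip, assuming a single covariate x with variance \sigma_x^2, an unstandardised slope b, a factor residual variance \psi, and the usual StdYX definition (the symbols and simplifications are assumptions, not taken from the output):

```latex
\[
\beta_{\mathrm{StdYX}} = b\,\frac{\sigma_x}{\sigma_f}, \qquad
\sigma_f^2 = b^2\sigma_x^2 + \psi
\quad\Longrightarrow\quad
\beta_{\mathrm{StdYX}}^2 = \frac{b^2\sigma_x^2}{b^2\sigma_x^2 + \psi}
\]
\[
\beta_{\mathrm{StdYX}} = 0.1
\;\Rightarrow\;
\frac{b^2\sigma_x^2}{\psi} = \frac{0.01}{0.99} \approx 0.0101
\]
\[
\sigma_x^2 \to 10\,\sigma_x^2 \ (\text{same } b,\ \psi)
\;\Rightarrow\;
\frac{b^2\,(10\,\sigma_x^2)}{\psi} \approx 0.101
\;\Rightarrow\;
\beta_{\mathrm{StdYX}} \approx \sqrt{\frac{0.101}{1.101}} \approx 0.30
\]
```

So a population variance ten times too large would be expected to roughly triple a StdYX coefficient of 0.1, which is in line with the 0.36 seen in one simulated dataset. If the standard error of b also shrinks roughly in proportion to 1/\sigma_x, as it does in ordinary regression, then z would grow by about \sqrt{10} \approx 3.2, from about 2.8 (p of about 0.005) to about 8.9, close to the reported z of 9.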
 Alissa Beath posted on Saturday, December 06, 2014 - 3:25 pm
Hi Linda,
I'm trying to calculate power using the procedure specified in your 2002 paper (and Thoemmes et al., 2010), but all the SDs from the model results are 0.
Input:
montecarlo:
names = ei_1 ei_2 reap sch_1 sch_2;
nobs = 100;
nreps = 10000;
cutpoints = ei_1(-0.5) ei_2 (0.5);
seed = 1234;
Model population:
[ei_1 @ 0.33]; [ei_2 @ 0.33]; ei_1 @ 0.25; ei_2 @ 0.25;
[reap@29]; reap@35.6;
[sch_1@19]; sch_1@1040;
[sch_2@-8]; sch_2@676;
reap on ei_1@-8.7 ei_2@-2.5;
sch_1 on reap@1.2 ei_1@-11 ei_2@4.2;
sch_2 on sch_1@-0.28;
Model:
[reap@29];reap@35.6;
[sch_1@19];sch_1@1040;
[sch_2@-8]; sch_2@676;
reap on ei_1 @ -8.7 ei_2 @ -2.5;
sch_1 on reap @ 1.2 ei_1 @ -11 ei_2 @ 4.2;
sch_2 on sch_1 @ -0.28;
Model indirect:
sch_1 IND ei_1; sch_1 IND ei_2;
Many thanks,
Alissa
 Bengt O. Muthen posted on Saturday, December 06, 2014 - 4:21 pm
In your Model statement you should change @ to * so the parameters are free to be estimated.
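Applied to the input posted above, the change would look like the sketch below. The values are copied from that post. In a Monte Carlo run, the numbers after * in MODEL serve as starting values and as the population values used for coverage.

```mplus
Model:
[reap*29]; reap*35.6;              ! * frees the parameter; @ would fix it
[sch_1*19]; sch_1*1040;
[sch_2*-8]; sch_2*676;
reap on ei_1*-8.7 ei_2*-2.5;
sch_1 on reap*1.2 ei_1*-11 ei_2*4.2;
sch_2 on sch_1*-0.28;
```

With every parameter fixed by @, nothing is estimated, so the estimates are identical in every replication and their standard deviation across replications is 0. Model population can keep its @ values, since there they only define the data-generating model.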
 Alissa Beath posted on Saturday, December 06, 2014 - 4:41 pm
Thank you so much Bengt, that solved it!
Much appreciated.