1. A better approach with WLSMV is to use the SVALUES option, which gives you input statements that include the final estimates as starting values. For categorical variables, you need to include the variances of the latent response variables underlying the categorical variables. See a Monte Carlo example that uses WLSMV to see how this is done.
2. You need to give the means, variances, and covariances of the covariates in MODEL POPULATION but not in MODEL.
Xu, Man posted on Friday, November 15, 2013 - 5:10 am
Thank you very much for your suggestions (and sorry for the delayed follow up).
Could I please first check about standardised output with ordinal indicators under the WLSMV estimator? I noticed that if the variance of the covariate predictor (for latent variables based on ordinal items) is specified in the model, then I am able to obtain standardised output together with p values and standard errors. This is exactly what I want, but I just wanted to double check: are these p values trustworthy?
Regarding point 1 from your previous reply, I tried SVALUES to get values for a potential post hoc power analysis population model. But I don't think it gives values for the variances of the latent response variables either. So I am not sure how to proceed in this regard...
Also, I actually have a few hundred models like this that I need to do post hoc power analysis on. So even if I could work out the specification of the latent response variable variances, it would probably be too much work to repeat for every model. If the parameters are saved in a separate file, then I can write scripts for it using MplusAutomation in R. This would let me run all the power analyses relatively simply.
All in all, do you think WLSMV with the theta parameterization (so that I can save parameters in a separate file to feed into the power analysis population model) could be viable for me?
Thank you very much!
Xu, Man posted on Friday, November 15, 2013 - 5:32 am
I use complete data, by the way, so no missing values were present.
With WLSMV you should not add the variances of the observed exogenous variables to the MODEL command as you can do with maximum likelihood estimation. It changes the results. Wait for Version 7.2 if you need these values.
You get the values for the latent response variables in the R-square section where residual variances are printed.
This could be viable.
Xu, Man posted on Monday, November 18, 2013 - 5:43 am
Thank you. Is it a problem even for complete data? When will V 7.2 become available please?
I added the variances of the observed exogenous variables to the MODEL command because this seemed to be the only way for me to make use of the two-step power analysis function under WLSMV (with the theta parameterisation). Although the two-step approach works perfectly well under the ML estimator (if I do not declare all indicators to be categorical) without adding the covariate variance, once I switch to WLSMV it no longer works (unless the covariate variance is estimated).
Re values for latent response variables in the R-square section, do you mean under the delta parameterisation? Under theta I think it is something called the Scale Factors.
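The delta/theta distinction raised here can be illustrated with a small sketch. It rests on the usual textbook relations for a single-group factor model without covariates, y* = loading * f + e, which are an assumption here and are not quoted from this thread: under theta the residual variance of y* is fixed at 1 and the scale factor is 1/sqrt(Var(y*)), while under delta Var(y*) is fixed at 1 and the residual variance is the remainder. The function names and numbers are hypothetical.

```python
import math


def ystar_theta(loading, factor_var):
    """Theta parameterization (assumed): residual variance of y* is fixed at 1.

    Returns the implied variance of y* and the matching scale factor.
    """
    ystar_var = loading ** 2 * factor_var + 1.0
    scale_factor = 1.0 / math.sqrt(ystar_var)
    return ystar_var, scale_factor


def residual_var_delta(loading, factor_var):
    """Delta parameterization (assumed): Var(y*) is fixed at 1.

    The residual variance is what remains after the factor's contribution.
    """
    return 1.0 - loading ** 2 * factor_var


# Hypothetical values, not taken from any model in this thread.
theta_var, theta_scale = ystar_theta(loading=1.0, factor_var=0.5)
delta_resid = residual_var_delta(loading=0.7, factor_var=1.0)
print(theta_var, theta_scale, delta_resid)
```

Under these assumptions, either quantity (the residual variance under delta, or the scale factor under theta) is enough to recover the variance of the latent response variable for a population model.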
Xu, Man posted on Monday, November 18, 2013 - 6:24 am
Just in case I have not explained myself clearly enough: because I have hundreds of models to analyse, using the two-step approach is the most straightforward way for me under the R package MplusAutomation.
Although working with the SVALUES option plus the residual variances would certainly be possible if I had only a few models, for as many as hundreds I would need to write much more complex scripts, so I prefer the two-step approach...
By "Two-step", do you mean what's on slide 113 of our Topic 4 handout?
Xu, Man posted on Monday, November 18, 2013 - 8:44 am
Sorry for the confusion. Slide 113 and the rest are actually really helpful for me. But here I meant "two-step" as in Example 12.7's Step 1 and Step 2, and specifically the problem with Step 1 discussed in this thread. Thanks a lot!
When you bring the x variables into the model with categorical outcomes under WLSMV, it makes for different model assumptions. You no longer assume normality for the y* outcomes conditional on the x's, but you assume joint normality of the y*'s and the x's. That is, you move from Case B to Case A, using the terms in the Muthen (1984) Psychometrika article.
Xu, Man posted on Monday, November 18, 2013 - 10:11 am
Thank you - I see, and probably that's why the Step 2 Monte Carlo input file would not read in the saved model parameter value file under WLSMV unless the covariate variances were explicitly specified in the Model command in Step 1.
I will try to read the article (assuming it is going to be quite difficult to read due to my maths limitations...). But could I first ask, would this joint normality assumption usually lead to serious bias in model estimation?
At first glance, the results (including p-values) seem to be quite similar across the models with and without estimating the covariate variances. If the joint normality issue is not usually too serious, then it would be great for me to have both p values for the standardised model results and the ease of using the two-step power analysis functions.
Xu, Man posted on Friday, November 29, 2013 - 1:48 am
I'd like to ask a follow-up question about the power analysis, please.
So I have conducted a power analysis based on the parameters obtained from the data analysis. The original model has 5 latent factors (with ordinal indicators) predicted by a continuous covariate. My key interests are the regression paths from the observed predictor to the latent factors. I noticed that in the real data analysis, the factor with the largest standardised regression coefficient has a larger p value than a factor with a smaller standardised regression coefficient. This was also reflected in the power analysis: the factor with the smaller effect size has larger power.
I am a bit puzzled because I would have thought that, given the same sample size, a larger effect size should go with a smaller p value, and that it should also correspond to higher power.
However, I already know that the factor loadings for the factor with the large effect were a bit low, so maybe it's the unreliability that caused the high p value, and the low power in the simulation?
Effect size is not all that determines p-values or power. Standard errors must also be considered.
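A minimal sketch of this point, assuming a two-sided Wald z-test at alpha = 0.05 and a normal sampling distribution for the estimate. The estimates and standard errors below are hypothetical, not values from the models in this thread, and `wald_power` is an illustrative helper rather than anything Mplus provides.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def wald_power(estimate, se, z_crit=1.959964):
    """Approximate power of a two-sided Wald z-test.

    Power depends on the ratio estimate / se, not on the estimate alone.
    """
    ncp = estimate / se
    return normal_cdf(ncp - z_crit) + normal_cdf(-ncp - z_crit)


# Hypothetical coefficients: the larger effect is estimated less precisely.
large_effect_power = wald_power(estimate=0.30, se=0.20)
small_effect_power = wald_power(estimate=0.20, se=0.05)
print(large_effect_power, small_effect_power)
```

In this illustration the smaller coefficient has the higher power because its standard error is much smaller, which is the same pattern as a weakly measured factor showing a larger coefficient but a larger p value.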
Xu, Man posted on Friday, November 29, 2013 - 10:00 am
Thanks. Sorry, my basic stats has got rusty. I thought a simple correlation's s.e. is also a function of the correlation itself and the sample size, whereas here the effect size is essentially the simple correlation (because only one predictor is present, and the sample size is fixed). But perhaps with latent variables it gets more complicated.
Regarding the previous reply on doing power analysis (Friday, October 11, 2013 - 10:51 am), I used the values from SVALUES as you suggested previously (without specifying the variance of the covariate in the analysis model). But it seems that I still need to specify population values for the covariate's variance and mean. So I just requested sample statistics using only this covariate as input (I use complete data) and then used those as the population values in the Monte Carlo run.
That's fine. I would specify the covariate means, variances, and covariances in the Model Population command but not in the Model command.
Xu, Man posted on Monday, December 02, 2013 - 2:24 am
Thank you. Yes, I specified the covariate means and variances, and the covariance (in the form of a regression coefficient), in Model Population, and I did not put the covariate means or variances in the Model command.
However, I noticed that the power for that regression coefficient has become super high. The unstandardised regression coefficient remained the same, but the standard errors have decreased dramatically. In the actual real data analysis, the stdyx regression coefficient was only about 0.1 with a p value of about 0.005, but it became 0.36 in one of the simulated datasets, with a z value of 9...
It seems that the observed covariate variance might be different from the one used in calculating the standard errors.
Xu, Man posted on Monday, December 02, 2013 - 2:43 am
Just to report the variances from adding and not adding the covariate when doing the real data analysis, in case this could be of help.
1. Analysis with the covariate variance specified in the real data analysis
Var of Cov: 0.227. Residual Var of latent variables: f1 0.365; f2 0.052; f3 0.158; f4 0.221; f5 0.224
2. Analysis without the covariate variance specified in the real data analysis
Var of Cov: 2.33 (from descriptive stats). Residual Var of latent variables: f1 0.363; f2 0.055; f3 0.159; f4 0.221; f5 0.221
Xu, Man posted on Monday, December 02, 2013 - 5:05 am
Ah, actually now I see that I entered the covariate variance from the sample statistics with the decimal point one place off, making it ten times too large...
Now I have corrected it, and the power has now returned to the expected performance.
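The size of this effect can be checked with a rough sketch. It assumes a single-predictor regression in which Var(y) = b^2 * Var(x) + residual variance, and an OLS-like approximation in which the standard error of the slope is inversely proportional to the standard deviation of x. The two covariate variances (0.227 and 2.33) and the residual variance 0.365 are values reported in this thread. Pairing that residual variance with the 0.1 coefficient is an assumption, and the helper names are hypothetical.

```python
import math


def stdyx(b, var_x, resid_var):
    """Standardized slope for y = b*x + e with a single predictor."""
    var_y = b ** 2 * var_x + resid_var
    return b * math.sqrt(var_x) / math.sqrt(var_y)


def slope_for_stdyx(target, var_x, resid_var):
    """Unstandardized slope that yields a given standardized slope."""
    return math.sqrt(target ** 2 * resid_var / (var_x * (1.0 - target ** 2)))


var_x_right = 0.227   # covariate variance reported in the thread
var_x_wrong = 2.33    # value entered with the decimal point one place off
resid_var = 0.365     # residual variance of f1; pairing it with 0.1 is assumed

# Slope that gives stdyx = 0.1 under the correct covariate variance.
b = slope_for_stdyx(0.1, var_x_right, resid_var)
std_right = stdyx(b, var_x_right, resid_var)
std_wrong = stdyx(b, var_x_wrong, resid_var)

# A two-sided p value of 0.005 corresponds to z of about 2.807. If the
# slope's SE scales with 1 / sd(x), z grows by sqrt(var_wrong / var_right).
z_right = 2.807
z_wrong = z_right * math.sqrt(var_x_wrong / var_x_right)
print(std_right, std_wrong, z_wrong)
```

Under these assumptions the same unstandardised slope gives a standardised coefficient of roughly 0.3 and a z value close to 9, in line with the inflated power seen before the correction.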
Still, adding the variance of the covariate to the analysis model seems to make little difference in the results. So maybe the multivariate normality assumption does not have too much impact.
Alissa Beath posted on Saturday, December 06, 2014 - 3:25 pm
I tried running a post hoc power study with a large number of categorical variables. I used SVALUES in my CFA and copied the output into both the MODEL MONTECARLO (Model Population) command and the Model command. I specified the variables as categorical in the Montecarlo input section (for example, item1(1)).
However I get this message:
*** FATAL ERROR THE POPULATION COVARIANCE MATRIX THAT YOU GAVE AS INPUT IS NOT POSITIVE DEFINITE AS IT SHOULD BE.
However, the original CFA had no error messages. Do I need to specify WLSMV as the estimator in the Montecarlo or is it something else?
That sounds like you haven't specified a variance for a variable in the model, for instance for a covariate (which wouldn't show in the run that gives you SVALUES). See how we do Monte Carlo studies by looking at the Monte Carlo counterparts to the UG examples - you find them with our UG on our web site.
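Why a missing variance produces this error can be sketched as follows. The sketch assumes that a variance left out of the population model is treated as zero, and it uses a hypothetical two-variable model y = b*x + e rather than the poster's CFA. The positive-definiteness check is a plain Cholesky attempt, not the routine Mplus itself uses.

```python
import math


def implied_cov(b, var_x, resid_var):
    """Population covariance matrix of (x, y) for y = b*x + e."""
    cov_xy = b * var_x
    var_y = b ** 2 * var_x + resid_var
    return [[var_x, cov_xy], [cov_xy, var_y]]


def is_positive_definite(matrix):
    """Attempt a Cholesky factorization; a non-positive pivot means not PD."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


# Hypothetical slope and residual variance; 0.0 stands for an omitted variance.
missing_variance = is_positive_definite(implied_cov(0.5, 0.0, 1.0))
with_variance = is_positive_definite(implied_cov(0.5, 0.227, 1.0))
print(missing_variance, with_variance)
```

With the covariate variance at zero the matrix has a zero on its diagonal and fails the check, while supplying a positive variance makes it positive definite.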