Daniel posted on Monday, August 16, 2004 - 8:19 am
Hi, I'm working on a proposal where I am looking at the effects of physical activity on smoking. However, I only have two time points and a relatively small budget. The issue is that I need to have as small a sample as possible to find an adequate result. Since I only have two time points, I'm thinking the benefits of repeated measures are not there like there would be had I had a larger sample. Do you have any suggestions of how I would approach this question? By the way, my outcome will be an ordered categorical variable.
I would have to see the saved data set to know for sure, but I suspect that you are reading it with a free format and have blanks in the data for missing values causing it to be read incorrectly. You would need to send the input and data to email@example.com for me to give a definite answer.
Todd Huschka posted on Tuesday, September 28, 2004 - 11:41 am
Is there a way to perform Power Calculations with Mplus?
On the website, we are given the following SAS code to calculate power after obtaining an estimate of the noncentrality parameter:
DATA POWER;
  DF = 1;
  CRIT = 3.841459;
  LAMBDA = 9.286;
  POWER = 1 - PROBCHI(CRIT, DF, LAMBDA);
RUN;
I don't have access to SAS and am hoping someone can give me some code that will allow me to make this calculation in SYSTAT, SPSS or even Excel. Or, can someone recommend a free-standing executable or a web-based java program I can use for this?
(Of course, I can always ask colleagues who use SAS to run this for me, but I'd prefer to be able to do the calculation myself if possible.)
Now, there is an online non-central chi-square calculator at UCLA at http://calculators.stat.ucla.edu/cdf. Can I just use that to perform the power calculation described in the "How To" on the Mplus web site?
I'm not sure what the X Value parameter is in the calculator web form. I'm trying to replicate the result in the Mplus power calculation example as a check to make sure I can do this correctly. I assume that DF=1, that the Noncentrality Parameter=9.286, and that I should leave Probability as a ? to be solved, but I don't know what the X Value represents. Is it the sample size?
I apologize if this is a really basic question. I couldn't find any help on the calculator, and I'm hoping it will solve my problem.
Thanks for your patience.
bmuthen posted on Sunday, October 30, 2005 - 6:37 pm
I would think X is the chi-square value (the value on the x axis). Have you tried using X = CRIT?
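For readers without SAS, the same quantity can be sketched in plain Python. This is only a minimal sketch, not code from Mplus or from the UCLA calculator: the function names `chi2_cdf`, `ncx2_cdf`, and `power` are made up for illustration, and the noncentral chi-square CDF is computed as a Poisson-weighted sum of central chi-square CDFs. The inputs are the DF, CRIT, and LAMBDA values from the SAS code quoted earlier in the thread.

```python
import math


def chi2_cdf(x, df):
    """Central chi-square CDF, using the power series for the
    regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # first series term: 1 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)  # next term: z^n / (a (a+1) ... (a+n))
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def ncx2_cdf(x, df, lam, max_terms=500):
    """Noncentral chi-square CDF as a Poisson(lam / 2) mixture of
    central chi-square CDFs with df, df + 2, df + 4, ... degrees of freedom."""
    half = lam / 2.0
    weight = math.exp(-half)  # Poisson probability of j = 0
    total = 0.0
    for j in range(max_terms):
        total += weight * chi2_cdf(x, df + 2 * j)
        weight *= half / (j + 1)  # Poisson probability of j + 1
    return total


def power(crit, df, lam):
    """Probability that a noncentral chi-square variate exceeds crit."""
    return 1.0 - ncx2_cdf(crit, df, lam)


if __name__ == "__main__":
    # Same inputs as the SAS example: DF=1, CRIT=3.841459, LAMBDA=9.286
    print(round(power(3.841459, 1, 9.286), 3))
```

Here `crit` plays the role of the X value in the web form, and the function returns one minus the probability the calculator reports. With the example inputs the result is about 0.86.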
When I set X=3.841459, df=1, and NCP=9.286, the online calculator returns p=0.138, which gives a power of 1 - 0.138 = .862.
I assume that the difference between that estimate and the .85 you show in the table in the example is due to a difference in precision of the web-based calculator and the higher precision calculations performed in SAS. Does that sound right?
By the way, would you consider adding a utility/procedure for non-central chi-square calculations into a future version of Mplus? I don't know how much of an effort that would be or how it fits into your vision of what belongs in Mplus, but looking at the code for the online calculator on the UCLA site it doesn't look like it would be a major undertaking.
Thanks again for all your help and for this wonderful program.
bmuthen posted on Monday, October 31, 2005 - 2:59 pm
Yes, rounding matters. The 4 values on our web site should be:
I am currently estimating power/sample size for a study in which the outcomes are longitudinal binary measures. I'm planning on using a growth curve for the anticipated analysis, and I may elect to model the treatment effect in the form of a multi-group analysis. If I simulate a full and a restricted model (to test a treatment effect), can the difference between the -2*loglikelihoods be used to approximate the non-centrality parameter used in the Satorra-Saris approach to estimating power?
Satorra-Saris was developed for continuous-normal outcomes where the mean vector and covariance matrix are sufficient statistics, whereas with binary outcomes all moments are needed. I don't know of studies to simulate the non-centrality parameter. If you simulate, it would seem that using the last column in the Mplus Monte Carlo output would be most straightforward - the proportion of replications rejecting the zero value for the key parameter.
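The idea of reading power off as the proportion of replications that reject the zero value can be illustrated outside Mplus with a toy example. The sketch below is an assumption-laden stand-in, not a growth model and not Mplus output: it simulates a single binary outcome in two groups, tests the difference in proportions with a Wald z-test, and counts rejections. The names `wald_rejects` and `simulated_power`, the proportions, and the sample sizes are all hypothetical.

```python
import math
import random


def wald_rejects(y0, y1, crit=1.959964):
    """Two-sided Wald z-test of equal proportions at the 5% level."""
    n0, n1 = len(y0), len(y1)
    p0, p1 = sum(y0) / n0, sum(y1) / n1
    se = math.sqrt(p0 * (1 - p0) / n0 + p1 * (1 - p1) / n1)
    if se == 0.0:
        return False  # degenerate sample: counted as a non-rejection
    return abs((p1 - p0) / se) > crit


def simulated_power(p_control, p_treated, n_per_group, reps=500, seed=2005):
    """Proportion of simulated replications in which the test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y0 = [1 if rng.random() < p_control else 0 for _ in range(n_per_group)]
        y1 = [1 if rng.random() < p_treated else 0 for _ in range(n_per_group)]
        if wald_rejects(y0, y1):
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Hypothetical design: 30% vs 50% response, 100 subjects per group
    print(simulated_power(0.3, 0.5, 100))
    # Under no treatment effect the rejection rate should sit near 0.05
    print(simulated_power(0.3, 0.3, 100))
```

Running the same function with equal proportions in both groups gives a rough check on the Type I error rate, which is worth looking at before trusting the power estimate.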
My sample in a paper has 133 observations and a reviewer doubts whether I can run structural equation modeling on such a small sample size (I have three equations, each of which has about 2-3 endogenous variables and 6-8 exogenous variables – MPLUS finishes the computing in a normal manner). I remember that MPLUS is particularly useful for small-sample SEM, but I think the reviewer wants more technical details – could you please give me suggestions on this issue?
Critical factors are if the outcomes are continuous or categorical, how skewed the outcomes are, how many parameters the model has, how much missing data there is, etc. n=133 may or may not be sufficient depending on these factors. Also, see the web site's Muthen & Muthen (2002) article on Monte Carlo simulation to determine if n is large enough.
I have a question regarding an error message that I received when running a model: "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.180D-16. PROBLEM INVOLVING PARAMETER 35. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE SAMPLE SIZE IN ONE OF THE GROUPS."
I have a sample size of 172 and I am doing a group comparison and one of the groups has n=35. Is this error message regarding my sample size or something else? Thanks!
Hi, I am doing a multiple group SEM with continuous (actually 5-point Likert) indicators. I have a small sample size (N=161) with 49 in one group and 112 in the other. I am examining a multiple group model that estimates about 80 free parameters. Everything seems fine: the model terminates normally, the solution is positive definite, etc. Model fit is generally marginal (e.g. CFI = .89, RMSEA = .07). My question is whether there is anything wrong with this. I know this is sometimes called "empirical under-identification", but with no convergence problems, do I need to worry?
In the case that I do need to worry, would you recommend that I test how many parameters I can (tenably) constrain to equality across groups, so that I can (perhaps) get the number of free parameters under 49 (the n of the smallest group)? Maybe I can also impute the loadings and residuals for my latent variables by obtaining them from a related CFA, to get the number of free parameters down as well? (I already know that measurement invariance (loadings) is tenable across groups.)
Last thing - I am using the MLR estimator in this case, which I believe to be the best for a small-sample SEM with continuous indicators. Is this accurate, or should I be using another estimator?
When you say several more parameters "in a group", I assume you mean the number of parameters estimated for that group and not overall (e.g. 80 parameters being estimated overall, but 40 parameters for each group)?
Another option I have to increase N is to include more observations that have missing data on 2 of the 3 waves. But this means that some of my indicators will have up to 50% missing data at waves 2 & 3. Can FIML handle this? Is there any indication, perhaps a reference, of how much missing data is allowable, or how much FIML (or any other imputation method) can handle?
I am estimating two SEM models that are similar in all respects except the final outcome (dependent) variable. In model 1 this variable is continuous. In model 2 it is categorical (3 categories). By default, MPLUS estimates the first model with ML and the second with WLSMV.
I also know that the final outcome categorical variable (in model 2) has four missing cases.
However, when I estimate the models, the difference in the number of observations between models 1 and 2 is 16 (not four). How is this possible?
alia aishah posted on Wednesday, April 04, 2012 - 11:15 am
WHEN I RUN MY PATH ANALYSIS THIS MESSAGE CAME UP = THE MINIMUM COVARIANCE COVERAGE WAS NOT FULFILLED FOR ALL GROUPS. CATEGORICAL VARIABLE ATLEASTO HAS ZERO OBSERVATIONS IN CATEGORY 1.
I noticed that the Mplus output showed that my outcome variable had 4 categories, but in truth it is a binary variable. What do I do? Also, I used the USEOBS command, which means I only used a subsample with no missing data, but the output said that there were missing data for x, and the number of observations noted in the output did not tally with my subset. How do I solve this? Am I missing something?
It sounds like you are not reading your data correctly. This could be caused by blanks in the data set. If you can't figure it out, please send your input, data, output, and license number to firstname.lastname@example.org.
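Why blanks cause trouble with free-format reading can be shown with a small Python sketch. This is an illustration of the general mechanism only, not a reproduction of how Mplus parses data: the helper names `read_free_format` and `recode_blanks`, the three-variable layout, the column widths, and the -999 missing-value code are all hypothetical.

```python
def read_free_format(line):
    """Split a record on whitespace, as a free-format reader would."""
    return [float(v) for v in line.split()]


def recode_blanks(line, widths, missing_code="-999"):
    """Cut a fixed-width record into fields and replace blank fields
    with an explicit numeric missing-value code."""
    fields, start = [], 0
    for w in widths:
        field = line[start:start + w].strip()
        fields.append(field if field else missing_code)
        start += w
    return " ".join(fields)


if __name__ == "__main__":
    # Hypothetical record with three variables, each 4 columns wide;
    # the second variable is missing and was left blank.
    record = "1       0   "

    # Free-format reading sees only two values, so the third variable's
    # value is taken as the second variable.
    print(read_free_format(record))

    # After recoding the blank, all three variables line up again.
    fixed = recode_blanks(record, widths=[4, 4, 4])
    print(fixed)
    print(read_free_format(fixed))
```

A shifted record like this is one way a binary variable can appear to have extra categories or unexpected missing values, since it picks up values that belong to a neighboring variable.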