Daniel posted on Wednesday, February 16, 2005 - 10:24 am
Hi, I'm running an analysis type=meanstructure, with option "missing" added, and some covariates. What determines the sample size? I ask this question because my sample size is inflated above the true sample size. My four repeated measures have sample sizes of s9,n=1115; s10,n=1068; s11,n=1043; s12, n=1002. However, my sample with the option "missing" is 1133.
The sample size should be the total number of observations. I would have to see the output and data to understand what is going on. Please send them to firstname.lastname@example.org. You may be reading your data incorrectly.
The data set you sent has 1143 observations. Seven cases were eliminated because all variables to be used in the analysis had missing data. This results in 1136 cases being used in the analysis. You have 26 variable names in the NAMES statement and 27 variables in the data set. Perhaps you are not using the data that you mean to be using.
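The counting rule described in the reply above can be illustrated with a small sketch. This is not Mplus's implementation, and it does not try to reproduce how Mplus treats covariates. The toy data and the helper names (`per_variable_n`, `analysis_n`, `listwise_n`) are hypothetical. It only shows why the analysis sample size can exceed the n of every individual variable: a case is dropped only when all analysis variables are missing.

```python
# Hypothetical toy data; None marks a missing value.
rows = [
    {"s9": 1.2, "s10": None, "s11": 0.7, "s12": None},
    {"s9": None, "s10": 2.1, "s11": None, "s12": 1.9},
    {"s9": None, "s10": None, "s11": None, "s12": None},  # missing on everything
    {"s9": 0.4, "s10": 0.5, "s11": 0.6, "s12": 0.8},
]
analysis_vars = ["s9", "s10", "s11", "s12"]


def per_variable_n(rows, variables):
    """Number of non-missing observations for each variable separately."""
    return {v: sum(r[v] is not None for r in rows) for v in variables}


def analysis_n(rows, variables):
    """Cases kept when only cases missing on ALL analysis variables are dropped."""
    return sum(any(r[v] is not None for v in variables) for r in rows)


def listwise_n(rows, variables):
    """Cases kept under listwise deletion (complete cases only), for comparison."""
    return sum(all(r[v] is not None for v in variables) for r in rows)


print("per-variable n:", per_variable_n(rows, analysis_vars))
print("analysis n:", analysis_n(rows, analysis_vars))
print("listwise n:", listwise_n(rows, analysis_vars))
```

In this toy example every variable has fewer observed values than the number of cases retained, which is the same pattern as in the question.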
Anonymous posted on Monday, May 30, 2005 - 8:17 pm
If I have non-normal data but a very large sample size (>9000) am I ok if using MLE? I found that: "GLS (generalized least squares) is the second most popular method after MLE. GLS works well for large samples (n>2500) even for non-normal data." Thank you
I answered the first part of the question earlier. Conventional GLS is not robust to non-normality. The so-called ADF version of GLS is robust to non-normality, but it does need very large samples for this robustness to come into effect. "ADF" is obtained using the Mplus WLS estimator with continuous outcomes.
Anonymous posted on Friday, July 15, 2005 - 1:23 pm
I have seen references to 10/1 and 5/1 ratio guidelines for "sample size" adequacy in SEM (among other discussion of the issue). Is the reference referring to sample size/# parameters in the measurement and structural models or degrees of freedom/# parameters in the measurement and structural models, or some other ratio?
For example, suppose a paper is using a sample of 180 and is estimating a model with 25 indicators of 6 latent variables in the structural model. If the output indicates 80 parameters are being estimated, is the relevant ratio 180/80 = 2.25 or 325/80 = 4.06, where 325 = (25*26)/2?
Do you have a good reference with a straightforward discussion of this issue?
I believe these refer to the number of observations per parameter in the model. So for a model with 80 parameters, using 10 observations per parameter would require 800 observations. I don't think these rules of thumb have been studied extensively, and they probably don't give a very good estimate of the necessary sample size because this depends on the model and the data. In the following paper, Bengt and I suggest a way to determine sample size using a Monte Carlo study that is tailored to your model and data:
Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 4, 599-620.
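The arithmetic in this exchange can be checked with a short sketch. The numbers (180 cases, 25 indicators, 80 parameters) come from the question; the function and variable names are hypothetical. It only reproduces the ratios being discussed and does not endorse either rule of thumb.

```python
def unique_moments(p):
    """Number of distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2


n_obs = 180          # sample size from the question
n_indicators = 25    # observed indicators from the question
n_params = 80        # estimated parameters from the question

# Ratio the reply identifies as the one the guidelines refer to.
obs_per_param = n_obs / n_params

# The alternative ratio raised in the question (sample moments per parameter).
moments_per_param = unique_moments(n_indicators) / n_params

# Sample sizes implied by the 10/1 and 5/1 guidelines for 80 parameters.
required_n_10_to_1 = 10 * n_params
required_n_5_to_1 = 5 * n_params

print(f"observations per parameter: {obs_per_param:.2f}")
print(f"moments per parameter:      {moments_per_param:.2f}")
print(f"N needed at 10/1: {required_n_10_to_1}, at 5/1: {required_n_5_to_1}")
```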
I have a dataset I'm running analyses on where only 660 people out of ~710 have information on the outcome variable, but there are 664 observations reported in the output. I believe this is possible due to the default in Mplus to estimate the model under missing data theory using all available data (which is stated throughout the manual), but I was wondering where I could find a more detailed explanation of this for reporting purposes?
I used USEOBSERVATIONS to select cases for an MLM analysis in Mplus (so as to only include consented students, only include classrooms with 70% participation rates, and only include classrooms with at least one participating English language learner student).
Here is the useobservations syntax that I used, in case it is helpful:
USEOBSERVATIONS ARE consent eq 1 and fsgte70 eq 1 and numELp1 gt 0 and numELp3 gt 0;
The Mplus output shows the resulting sample for this analysis is 501. However, when I use these exact same selection criteria in SPSS and SAS, the sample size meeting these criteria is only 421. Could you help me understand why this discrepancy might be happening?
This is regarding your paper Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 4, 599-620. The paper is beautifully written and very informative.
I have the following questions: 1) Can I calculate power if I have only one factor in CFA? As mentioned in the paper "The focus of the power investigation in the CFA model is the factor correlation."
2)How to select the population values in MC simulation if there is not enough or no literature to provide population values? Any reference regarding this would be of great help.
One way is to do a Monte Carlo study for the particular model and sample size that you have. See our UG examples, which have Monte Carlo counterparts on our web site. See also the Muthén & Muthén (2002) article on Monte Carlo studies on our website.
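The general idea of a Monte Carlo power study can be sketched outside Mplus. The sketch below is a deliberately simplified stand-in, not the procedure from the paper: instead of fitting a CFA, it draws two standard normal variables with an assumed population correlation `rho` and tests the sample correlation with a Fisher z test. The function names, the seed, and the values `rho = 0.25` and `n = 150` are illustrative assumptions. Power is estimated as the proportion of replications in which the null hypothesis of zero correlation is rejected.

```python
import math
import random


def sample_corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_power(rho, n, reps=2000, seed=12345):
    """Estimate power to reject H0: correlation = 0 (two-sided, 5% level)."""
    rng = random.Random(seed)
    crit = 1.959964  # two-sided 5% critical value of the standard normal
    rejections = 0
    for _ in range(reps):
        xs, ys = [], []
        for _ in range(n):
            # Generate (x, y) with population correlation rho.
            x = rng.gauss(0.0, 1.0)
            e = rng.gauss(0.0, 1.0)
            xs.append(x)
            ys.append(rho * x + math.sqrt(1.0 - rho * rho) * e)
        # Fisher z statistic for the sample correlation.
        z = math.atanh(sample_corr(xs, ys)) * math.sqrt(n - 3)
        if abs(z) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Illustrative (assumed) population value and sample size.
    print("estimated power:", simulate_power(rho=0.25, n=150))
```

When no literature is available to suggest population values, one option is to rerun such a simulation over a range of plausible values (for example, several values of `rho`) and see how sensitive the estimated power is to that choice.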