Parameters and sample size


Anonymous posted on Monday, April 08, 2002 - 4:21 pm

We are running a structural equation model and have missing data on some of the measured indicators. We notice in the Mplus output that, when there are missing data, the total number of estimated parameters increases by the number of indicators in the model. Does this have any implications for sample size? Specifically, if one follows the guideline of 10 cases per parameter set forth by others, should these additional parameters be included in the total number of parameters?

Linda K. Muthen posted on Tuesday, April 09, 2002 - 8:30 am

Means are included in the model with missing data. These are the extra parameters that you see. Having means in the model does not change the degrees of freedom of the model: the means increase the number of sample statistics and the number of parameters equally. If you are using the rule of 10 cases per parameter, I would think that these means should not be counted. This part of the model is unstructured and therefore does not affect the other results.
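
The degrees-of-freedom argument above can be checked with a little arithmetic. The following is a minimal Python sketch, not Mplus code; the function name and the example model (6 indicators, 13 covariance-structure parameters) are hypothetical and chosen only for illustration.

```python
def sem_degrees_of_freedom(n_observed, n_model_params, with_means):
    """Degrees of freedom = number of sample statistics - number of free parameters."""
    # Variances and covariances of the observed variables.
    n_stats = n_observed * (n_observed + 1) // 2
    n_free = n_model_params
    if with_means:
        n_stats += n_observed  # one sample mean per observed variable
        n_free += n_observed   # one unstructured mean/intercept per observed variable
    return n_stats - n_free


# Hypothetical model: 6 indicators, 13 covariance-structure parameters.
df_without = sem_degrees_of_freedom(6, 13, with_means=False)
df_with = sem_degrees_of_freedom(6, 13, with_means=True)
print(df_without, df_with)  # 8 8
```

Adding the means raises both counts by the number of observed variables, so the difference between them is unchanged.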

Anonymous posted on Monday, September 30, 2002 - 10:07 pm

I'm running an SEM in Mplus where I have a set of endogenous latent variables (K) that are regressed on another set of endogenous latent variables (I). Mplus gives a standardized effect (Std) greater than 1 for the regression of one of the I's on one of the K's. Is this possible? The remaining model output and parameters look fine.

bmuthen posted on Tuesday, October 01, 2002 - 9:17 am

Please see Mplus Discussion under Structural Equation Modeling, Standardized Coefficients.

Anonymous posted on Tuesday, June 03, 2003 - 7:55 am

Provided one has sufficient RAM and hard disk space, is there a limit on the size of the data file Mplus can accept, in terms of variables, cases, or total size?

Thanks.

Linda K. Muthen posted on Tuesday, June 03, 2003 - 8:53 am

We do have a limit of 500 variables. There is no limit on the number of observations.

Anonymous posted on Monday, November 15, 2004 - 7:05 pm

Hi -

I am working with a data set of over 80,000 participants. I ran an SEM mediation model and then conducted multi-group invariance analyses. Of course, because of the sample size, the indirect effect is significant, as are all chi-square difference tests for measurement invariance. Do you have any ideas about how to handle the large sample size problem? Is there a way to convert my parameter estimates to effect sizes? Is there a practical (rather than statistical) rule of thumb for examining differences in parameter estimates across groups?

Thank You!

Linda K. Muthen posted on Tuesday, November 16, 2004 - 9:43 am

With 80,000 observations, it seems that randomly dividing the sample into two or more samples, so that you can cross-validate your results, would be a good idea. You can convert to effect sizes. See a statistics text for the formula. The practical versus statistical significance issue can be considered. What is practically significant would be guided by your substantive area. These issues are the same for SEM as for other types of analysis.
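
A random split of this kind could be set up before the data are read into Mplus. The following is a minimal Python sketch under stated assumptions: the function name, the seed, and the two-way split are illustrative choices, not part of the advice above, and the indices would still have to be used to write out separate data files.

```python
import random


def split_half(n_cases, seed=2004):
    """Randomly assign case indices 0..n_cases-1 to two halves of equal size."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    indices = list(range(n_cases))
    rng.shuffle(indices)
    half = n_cases // 2
    # Sort each half so the row order of the original file is preserved.
    return sorted(indices[:half]), sorted(indices[half:])


# Hypothetical use with 80,000 cases: fit the model on one half,
# then check whether the results hold up in the other half.
calibration, validation = split_half(80000)
print(len(calibration), len(validation))  # 40000 40000
```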

Anonymous posted on Friday, January 21, 2005 - 6:48 am

Is the rule of thumb of 10 cases per parameter followed in Structural Equation Modelling as it is with simpler models? Is it suitable for both observed and latent endogenous dependent variables? Is it suitable for a combination of observed and latent dependent variables? I would very much appreciate answers, suggestions, or suitable references.

Linda K. Muthen posted on Saturday, January 22, 2005 - 3:09 pm

Rules of thumb for sample size are usually not very useful. The necessary sample size varies depending on so many factors. See the Muthen and Muthen 2000 paper in the SEM journal for determining sample size using Monte Carlo simulations.

Anonymous posted on Wednesday, January 26, 2005 - 2:34 am

Re your response to a previous message: "Rules of thumb for sample size are usually not very useful. The necessary sample size varies depending on so many factors. See the Muthen and Muthen 2000 paper in the SEM journal for determining sample size using Monte Carlo simulations."
I am sorry, but I am unable to find this article. Further details would be much appreciated.
Thanks

bmuthen posted on Wednesday, January 26, 2005 - 1:28 pm

It is actually a 2002 SEM article. You can also find a pdf of it on the web site under Research Papers Using Mplus.

Anonymous posted on Tuesday, March 15, 2005 - 6:42 am

Is WRMR size-dependent like the p-value in AMOS? I tested a model with 8900 cases, and CFI (.97), TLI (.97), and RMSEA (0.050) indicated a good fit, while WRMR was too high (1.4). When I split the whole sample into subsamples of about 500 cases, I got good WRMR values, well below 0.9.

Linda K. Muthen posted on Tuesday, March 15, 2005 - 8:22 am

Yes, this is a power issue. With more observations, you have more power to reject the model. WRMR can be unreliable in some situations. I would ignore it if all other tests of fit look good.

Henri Bonnabau posted on Monday, October 17, 2005 - 7:58 am

Hello,

I am a beginner with SEM. Can you help me determine whether a sample size of 65 cases is enough? I saw that it depends on the number of parameters. How do I determine this number (paths, variances, disturbances, others)?

Sincerely

Linda K. Muthen posted on Monday, October 17, 2005 - 8:05 am

Sample size depends on many factors. You may find the following paper helpful:

Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 4, 599-620.

Henri Bonnabau posted on Monday, October 17, 2005 - 8:57 am

I can't find this paper; I don't have access.
What can I do?

Linda K. Muthen posted on Monday, October 17, 2005 - 9:15 am

You can download it from our website. See Papers under Analyses/Research.

Henri Bonnabau posted on Wednesday, October 26, 2005 - 8:31 am

Hello,
I would like to know whether normalising some of the data with different transformations could affect the minimum sample size.
Thanks

Linda K. Muthen posted on Wednesday, October 26, 2005 - 9:22 am

I have never heard anything about the effect of transformations on minimum sample size. Perhaps someone else has.

Shimul Melwani posted on Friday, December 01, 2006 - 6:22 am

Hi, I am running an SEM analysis. My model is very large (about 110 parameters), but I have a dataset of only 100 cases. I know this is a problem. When I try to reduce the number of parameters by taking out latent variables and replacing them with single means (from factor analyses), I run into a problem with degrees of freedom: the df drop from 100 to 12-15. What would you suggest that I do? Thanks so much!

Linda K. Muthen posted on Friday, December 01, 2006 - 9:49 am

When you have fewer observed variables because you have created sum scores, you will have fewer degrees of freedom. There is nothing you can do about this.

Jennifer M. Jester posted on Wednesday, March 07, 2007 - 12:39 pm

I would like to report an effect size for chi-square difference tests. I found a Psych Methods article by MacCallum (2006) that gives a formula: the difference between the two models in the product of degrees of freedom and squared RMSEA.
For instance, I have a nested model design where one model has 8 d.f. and RMSEA = .033 and the other model has 9 d.f. and RMSEA = .0375, giving me an effect size of .0039. This seems like a small number. I wanted to know if this seems like a reasonable way to calculate effect size and what are reasonable values for effect size.
Thanks!
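
The arithmetic in this post can be reproduced in a few lines. The following is a minimal Python sketch that follows the formula exactly as worded in the post (df times squared RMSEA, differenced across the two models); it has not been checked against the cited article, and the function name is hypothetical.

```python
def df_times_rmsea_squared(df, rmsea):
    """Product of degrees of freedom and squared RMSEA for one model."""
    return df * rmsea ** 2


# Values quoted in the post: 9 d.f. with RMSEA = .0375, and 8 d.f. with RMSEA = .033.
effect = df_times_rmsea_squared(9, 0.0375) - df_times_rmsea_squared(8, 0.033)
print(round(effect, 4))  # 0.0039
```

This confirms the value of .0039 reported in the post; whether that value is small or large is a separate question that the calculation does not answer.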

Linda K. Muthen posted on Wednesday, March 07, 2007 - 6:17 pm

I am not familiar with this approach and therefore have no opinion.

Jennifer M. Jester posted on Thursday, March 08, 2007 - 9:16 am

Do you know of another approach to effect size for chi-square difference tests? What about effect size for parameters in a model (for instance, a regression coefficient)? Would this be just the same as for a regular regression?

Thanks,

Jennie

Linda K. Muthen posted on Thursday, March 08, 2007 - 9:26 am

For a parameter, it would be the same as in regular regression. I don't know of any approach for effect size of chi-square.

sammy posted on Sunday, February 17, 2008 - 11:41 pm

Hello,

I've got a question concerning the appropriateness of using ML estimates in my path analysis:
I've got N=193 and, depending on the model, 4 to 8 manifest variables (2 dependent variables; the rest are independent variables). As my data are not normally distributed, I should not use ML or GLS. Unfortunately, using WLS generates a non-positive definite weight matrix.
Do you have any advice for me? Is ML robust enough, or should I calculate the Satorra-Bentler corrected chi-square?

Thanks in advance!

Linda K. Muthen posted on Monday, February 18, 2008 - 6:21 am

What do you mean by your data are not normally distributed? Are the dependent variables skewed continuous variables or categorical variables with floor or ceiling effects?

sammy posted on Monday, February 18, 2008 - 1:09 pm

Dear Dr Muthen,

My first dependent variable is continuous with a skewness of +1.75 (and a kurtosis of 3.3). The second one is categorical, measured with a 7-point rating scale. Its skewness is only .16, but its kurtosis is -1.2.
I'm not sure whether I should stick to ML nonetheless.

Thank you very much for your help!
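
Skewness and kurtosis figures like those quoted above can be computed from the raw scores. The following is a minimal Python sketch using the simple moment-based formulas and reporting excess kurtosis (0 for a normal distribution); it is an assumption that the values in the post are on that scale, statistical packages often apply small-sample corrections that give slightly different numbers, and the example scores are made up.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness (m3 / m2^1.5) and excess kurtosis (m4 / m2^2 - 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Made-up scores with a long right tail, for illustration only.
scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 10]
skew, kurt = skewness_and_excess_kurtosis(scores)
print(skew > 0)  # True: the long right tail gives positive skewness
```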

Bengt O. Muthen posted on Monday, February 18, 2008 - 6:03 pm

I would treat the dependent variables as continuous and use the MLM or MLR estimators to take into account non-normality - unless the categorical variable has a strong floor or ceiling effect in which case I would use WLSMV (I would not use WLS).

cecilia posted on Thursday, September 15, 2011 - 3:25 pm

Regarding the ratio of sample size to estimated parameters in an SEM where I am more interested in the structural relations than in the other estimated parameters: should I take into account all estimated parameters, or is it possible to take into account only the structural paths (relations between latent variables)? Thanks

Bengt O. Muthen posted on Thursday, September 15, 2011 - 5:57 pm

I don't think that ratio rule is a good one, so I don't know how to answer you. See instead

Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 4, 599-620.

which you find on our web site.

John Saldanha posted on Friday, November 16, 2012 - 6:58 pm

I have an SEM for interaction effects with the MLR estimator. I am importing N=694 cases with 49 variables, of which I am using 18, from an ASCII comma-separated DAT file. The file does not have any variable (column) labels or other text values.

The problem I am experiencing is that the number of observations reported in the output is 661. Apart from that, the analysis completes successfully; however, some parameter estimates differ from a previous run on a colleague's computer.

I have imported and examined the data in Excel. The original data set was created in SAS 9.2, and I am working in Windows 7 64-bit.

I am not sure what to do.

Linda K. Muthen posted on Saturday, November 17, 2012 - 10:13 am

Please send your output, data, and license number to support@statmodel.com so I can see what is happening.

Steve Swanson posted on Tuesday, April 09, 2013 - 3:59 pm

Hello,

I am new to Mplus, and I was wondering if it is still possible to run Multivariate Multiple Regression with Mplus (as listed on UCLA Stats page http://www.ats.ucla.edu/stat/mplus/dae/mvreg.htm)? And if so - is there a way to calculate the minimum sample size needed? (I am working with a small sample of about 60)

Many thanks.

Steve

Linda K. Muthen posted on Wednesday, April 10, 2013 - 1:32 pm

Yes, this is still possible. See the following paper which is available on the website:

Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 4, 599-620.

Christoph J. posted on Wednesday, June 12, 2013 - 3:36 am

Hi,

Above you wrote that WRMR is dependent on the sample size. Do you know of an article I can cite to justify ignoring the value even though it is recommended for categorical data?

I have estimated the same model for two different samples, all "classic" GoF indices are fine, except the WRMR for the bigger sample:
For n = 417:
RMSEA 0.064
CFI 0.992
TLI 0.977
WRMR 0.457

For n = 12571
RMSEA 0.038
CFI 0.997
TLI 0.991
WRMR 1.200

Is there any simulation study you know of that I can use?

Kind regards,
Christoph

Linda K. Muthen posted on Wednesday, June 12, 2013 - 8:04 am

WRMR is an experimental fit statistic. I would not use it. It has not proven to work that well. It has never been written about as far as I know except in the Yu dissertation on the website.

Steve posted on Thursday, July 11, 2013 - 4:28 am

Hello,

My question is with regard to the ratio of sample size to parameters estimated. I have N>1000 (using MLR) and have justified my study's sample size based on a 5:1 ratio (Bentler & Chou, 1987). However, when I calculated this minimum sample size, I did not factor in that Mplus estimates the intercepts of the observed variables by default. When I apply the 5:1 ratio to the much larger number of free parameters listed in the output, my sample size no longer meets this requirement. Therefore, I am wondering:

1) Is it possible to justify that my study meets the 5:1 ratio when using the Mplus default analysis (somehow say that the observed variable intercepts don't apply to this calculation)? If not,

2) Is there a way to conduct the analysis in Mplus without estimating the observed variable intercepts? I tried setting observed variable means and intercepts at zero (e.g., x1@0 [x1@0] ), but this didn't seem to work (i.e., a lot worse fit).

Your help would be much appreciated. Thanks!

Linda K. Muthen posted on Thursday, July 11, 2013 - 6:57 am

I think the best way to see if your sample size is sufficient is to do a Monte Carlo study and see if you can recover your parameters using your sample size.

You can add MODEL=NOMEANSTRUCTURE to the ANALYSIS command in some cases.
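
The idea of a Monte Carlo study, namely generating many data sets of a given size from known parameter values and checking how well those values are recovered, can be illustrated outside Mplus. The following is a minimal Python sketch for a single regression slope, not the Mplus Monte Carlo facility and not a substitute for simulating the actual model; the population slope of 0.3, the 500 replications, the seed, and the normal-approximation cutoff of 1.96 are all assumptions made for illustration.

```python
import math
import random


def monte_carlo_slope(n, beta=0.3, reps=500, seed=12345):
    """Return (average estimated slope, share of replications with |z| > 1.96)."""
    rng = random.Random(seed)  # fixed seed so the study can be reproduced
    estimates = []
    significant = 0
    for _ in range(reps):
        # Generate one data set from the assumed population model y = beta*x + e.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
        mx = sum(x) / n
        my = sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        b = sxy / sxx                      # least-squares slope
        a = my - b * mx                    # least-squares intercept
        sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
        se = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
        estimates.append(b)
        if abs(b / se) > 1.96:             # normal approximation to the 5% test
            significant += 1
    return sum(estimates) / reps, significant / reps


# Compare parameter recovery and power at a few candidate sample sizes.
for n in (30, 100, 200):
    mean_b, power = monte_carlo_slope(n)
    print(n, round(mean_b, 3), round(power, 3))
```

The average estimate shows whether the population value is recovered at a given sample size, and the share of significant replications serves as an estimate of power.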

Steve posted on Thursday, July 11, 2013 - 10:42 am

Dear Linda,

Thank you. I will look into the Monte Carlo study as you suggest.

I just ran my CFA with MODEL=NOMEANSTRUCTURE added to the ANALYSIS command as you suggested. However, the output still adds the number of observed variables to the 'Number of Free Parameters'. Any ideas why this is the case, or how the analysis could be conducted without estimating the intercepts of the observed variables?

Thanks.

Linda K. Muthen posted on Thursday, July 11, 2013 - 2:11 pm

The default of using all available data requires means. You can do listwise deletion by saying LISTWISE=ON in the DATA command.

Steve posted on Thursday, July 11, 2013 - 3:24 pm

Hi Linda,

It appears that it may not be possible to stop Mplus from estimating the intercepts of the observed variables (which raises the number of free parameters substantially in my case). I have no missing data and have now run my CFA multiple ways: with MODEL=NOMEANSTRUCTURE in the ANALYSIS command, with LISTWISE=ON in the DATA command, and with both at the same time. Each time, Mplus still adds the number of observed variables to the free parameters. Thank you for all of your help, and please let me know if anything else comes to mind to lower the number of free parameters to what is normally calculated.

Linda K. Muthen posted on Thursday, July 11, 2013 - 3:48 pm

Then your model requires means.

Steve posted on Friday, July 12, 2013 - 3:57 am

Dear Linda,

I would like to conduct a Monte Carlo study as you suggest. I have reviewed your paper on how to use a Monte Carlo study to decide on sample size, and also the chapter in the handbook on this subject. However, after reading these I have many questions and do not feel equipped to specify the analysis correctly. Is there another reference that I could purchase which would detail how to proceed with the Monte Carlo study for my large model?

Many thanks.

Linda K. Muthen posted on Friday, July 12, 2013 - 10:17 am

Try asking on SEMNET. I don't know of anything else.

Steve posted on Friday, July 12, 2013 - 1:00 pm

Thank you Linda. I will do as you suggest.

WEN Congcong posted on Monday, July 13, 2020 - 9:48 am

Dear professors,

Hello! I am now running an LTA model with 2 dummy covariates. The sample size of the 0 reference category is just 41, and I want to use a Monte Carlo simulation study to test whether this size is sufficient.

I think the specification is like that of a normal regression, but how do I specify the size of a specific response category? It is not a multiple group analysis, so we cannot use the Ngroup option.

Thank you in advance!

Bengt O. Muthen posted on Monday, July 13, 2020 - 4:36 pm

Try the Cutpoints option used in UG ex 12.1.