Message/Author 

Anonymous posted on Monday, April 08, 2002  4:21 pm



We are running a structural equation model, and have missing data on some of the measured indicators. We notice on the mplus printout that the total number of parameters estimated increases when there is missing data, by the number of indicators in the model. Are there any implications of this for sample size? Specifically, if one is following the 10 cases to each parameter guideline set forth by others, should these additional parameters be included in the calculation of the total number of parameters? 


Means are included in the model with missing data. These are the extra parameters that you see. Having means in the model does not change the degrees of freedom of the model. The means increase the number of sample statistics and the number of parameters equally. If you are using the rule of 10 cases per parameter, I would think that these means should not be counted. This part of the model is unstructured and therefore does not affect the other results.
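To illustrate the bookkeeping above (a sketch, not Mplus output): with p observed variables, the covariance structure supplies p(p+1)/2 sample statistics, and including the mean structure adds p sample means and p mean parameters, so the degrees of freedom are unchanged. The number of indicators here is hypothetical.

```python
def sample_stats(p, with_means):
    """Number of sample statistics for p observed variables:
    p(p+1)/2 variances/covariances, plus p means if the mean
    structure is included."""
    cov_terms = p * (p + 1) // 2
    return cov_terms + (p if with_means else 0)

# With missing data, Mplus adds p mean parameters AND p sample
# means, so model degrees of freedom are unchanged.
p = 6  # hypothetical number of indicators
extra = sample_stats(p, True) - sample_stats(p, False)
print(extra)  # 6: one extra statistic (and one extra parameter) per indicator
```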

Anonymous posted on Monday, September 30, 2002  10:07 pm



I'm running an SEM in Mplus where I have a set of endogenous latent variables (K) that are regressed on another set of endogenous latent variables (I). Mplus gives the standardized effect (Std) for the regression of one of the K's on one of the I's as greater than 1. Is this possible? The remaining model output/parameters look fine.

bmuthen posted on Tuesday, October 01, 2002  9:17 am



Please see Mplus Discussion under Structural Equation Modeling, Standardized Coefficients. 

Anonymous posted on Tuesday, June 03, 2003  7:55 am



Provided one has sufficient RAM and hard disk space, is there a limit on the size of data file Mplus can accept, either in terms of variables, cases, or total size? Thanks.


We do have a limit of 500 variables. There is no limit on the number of observations. 

Anonymous posted on Monday, November 15, 2004  7:05 pm



Hi, I am working with a data set of over 80,000 participants. I ran an SEM mediation model and then conducted multigroup invariance analyses. Of course, the indirect effect is significant due to the sample size, as are all chi-square difference tests for measurement invariance. Do you have any ideas about how to handle the large sample size problem? Is there a way to convert my parameter estimates to effect sizes? Is there a practical (rather than statistical) rule of thumb for examining differences in parameter estimates across groups? Thank you!


With 80,000 observations, it seems randomly dividing the sample into two or more samples so that you can cross-validate your results would be a good idea. You can convert to effect sizes. See a statistics text for the formula. The practical versus statistical significance issue can be considered. What is practically significant would be guided by your substantive area. These issues are the same for SEM as for other types of analysis.
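The random split suggested above can be sketched as follows (the case count and number of subsamples are just illustrative; in practice you would split the data file itself):

```python
import random

def split_sample(n_cases, k=2, seed=12345):
    """Randomly partition case indices 0..n_cases-1 into k roughly
    equal subsamples for cross-validation."""
    idx = list(range(n_cases))
    rng = random.Random(seed)
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

halves = split_sample(80000, k=2)
print(len(halves[0]), len(halves[1]))  # 40000 40000
```

Fitting the model in one half and checking whether the results replicate in the other guards against capitalizing on chance with such a large sample.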

Anonymous posted on Friday, January 21, 2005  6:48 am



Is the rule of thumb of 10 cases per parameter followed with structural equation modelling as with simpler models? Is that suitable for both situations of observed and latent endogenous dependent variables? Is it suitable for a combination of observed and latent dependent variables? I very much appreciate answers/suggestions/suitable references!


Rules of thumb for sample size are usually not very useful. The necessary sample size varies depending on so many factors. See the Muthen and Muthen 2000 paper in the SEM journal for determining sample size using Monte Carlo simulations. 

Anonymous posted on Wednesday, January 26, 2005  2:34 am



Re: your response to a previous message: "Rules of thumb for sample size are usually not very useful. The necessary sample size varies depending on so many factors. See the Muthen and Muthen 2000 paper in the SEM journal for determining sample size using Monte Carlo simulations." I am sorry, but I am unable to find this article! Further details would be most appreciated. Thanks.

bmuthen posted on Wednesday, January 26, 2005  1:28 pm



It is actually a 2002 SEM article. You can also find a pdf of it on the web site under Research Papers Using Mplus. 

Anonymous posted on Tuesday, March 15, 2005  6:42 am



Is WRMR size-dependent like the p-value in AMOS? I tested a model with 8,900 cases, and CFI (.97), TLI (.97), and RMSEA (.050) indicated a good fit, while WRMR was too high (1.4). When I split the whole sample into subsamples of about 500 cases, I got good WRMR values, well below 0.9.


Yes, this is a power issue. With more observations, you have more power to reject the model. WRMR can be unreliable in some situations. I would ignore it if all other tests of fit look good.


Hello, I am a new user of SEM. Can you help me to know if a sample size of 65 cases is enough? I saw that it depends on the number of parameters. How do I determine this (paths, variances, disturbances, other)? Sincerely


Sample size depends on many factors. You may find the following paper helpful: Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620.
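The logic of the Monte Carlo approach in that paper can be sketched in miniature (this is a stripped-down illustration using a single regression slope, not the SEM version described in the paper; the effect size and sample sizes are hypothetical): simulate many data sets at a candidate sample size, refit the model to each, and record how often the parameter of interest is detected.

```python
import math
import random

def power_for_slope(n, beta, n_reps=1000, seed=7):
    """Monte Carlo power check, reduced to a simple regression slope:
    simulate data at sample size n, refit by OLS, and count how often
    the slope is significant at the 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
        a = my - b * mx
        resid_var = sum((yi - (a + b * xi)) ** 2
                        for xi, yi in zip(x, y)) / (n - 2)
        se = math.sqrt(resid_var / sxx)
        if abs(b / se) > 1.96:
            hits += 1
    return hits / n_reps

# Power rises with n for a fixed effect size (here beta = 0.3):
print(power_for_slope(50, 0.3), power_for_slope(200, 0.3))
```

In the full SEM version, the same idea is applied to every parameter of the model at once, which is why no simple cases-per-parameter rule can substitute for it.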


I can't find this paper; I don't have access... What can I do?


You can download it from our website. See Papers under Analyses/Research. 


Hello, I want to know whether normalizing some data by different transformations could have some effect on the minimal sample size. Thanks.


I have never heard anything about the effect of transformations on minimum sample size. Perhaps someone else has.


Hi, I am running an SEM analysis. My model is very large (about 110 parameters) but I have a dataset of 100 cases. I know this is a problem, and when I try to reduce parameters by taking out latent variables and replacing them with single means (from factor analyses), I run into a problem with degrees of freedom: the df drop from 100 to 12-15. What would you suggest that I do? Thanks so much!


When you have fewer observed variables because you have created sum scores, you will have fewer degrees of freedom. There is nothing you can do about this.


I would like to report effect sizes for chi-square difference tests. I found a Psych Methods article by MacCallum (2006) which gives a formula: the difference between the two models in the product of degrees of freedom and RMSEA squared. For instance, I have a nested model design where one model has 8 df and RMSEA = .033 and the other model has 9 df and RMSEA = .0375, giving me an effect size of .0039. This seems like a small number. I wanted to know if this seems like a reasonable way to calculate effect size and what reasonable values for effect size would be. Thanks!
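The arithmetic the poster describes can be written out directly (a sketch of the poster's own calculation, using the df and RMSEA values from the question; the function name is ours):

```python
def rmsea_effect_size(df1, rmsea1, df2, rmsea2):
    """Effect size for a chi-square difference test as the difference
    of df * RMSEA^2 between two nested models, per the formula
    described in the question."""
    return df2 * rmsea2 ** 2 - df1 * rmsea1 ** 2

# Values from the question: 8 df, RMSEA .033 vs. 9 df, RMSEA .0375
es = rmsea_effect_size(8, 0.033, 9, 0.0375)
print(round(es, 4))  # 0.0039
```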


I am not familiar with this approach and therefore have no opinion. 


Do you know of another approach to effect size for chi-square difference tests? What about effect size for parameters in a model (for instance, a regression coefficient)? Would this be just the same as for a regular regression? Thanks, Jennie


For a parameter, it would be the same as in regular regression. I don't know of any approach for effect size of chi-square.

sammy posted on Sunday, February 17, 2008  11:41 pm



Hello, I've got a question concerning the appropriateness of using ML estimates in my path analysis. I've got N=193 and, depending on the model, 4 to 8 manifest variables (2 dependent variables, the rest IVs). As my data are not normally distributed, I should not use ML or GLS. Unfortunately, using WLS generates a non-positive definite weight matrix. Do you have any advice for me? Is ML robust enough, or should I calculate the Satorra-Bentler corrected chi-square? Thanks in advance!


What do you mean by your data are not normally distributed? Are the dependent variables skewed continuous variables or categorical variables with floor or ceiling effects?

sammy posted on Monday, February 18, 2008  1:09 pm



Dear Dr. Muthen, my first dependent variable is continuous with a skewness of +1.75 (and a kurtosis of 3.3). The second one is categorical, measured with a 7-point rating scale. Its skewness is only .16, but its kurtosis 1.2. I'm not sure whether I should stick to ML nonetheless. Thank you very much for your help!


I would treat the dependent variables as continuous and use the MLM or MLR estimators to take into account non-normality, unless the categorical variable has a strong floor or ceiling effect, in which case I would use WLSMV (I would not use WLS).
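For readers wanting to screen their own variables the way the poster did, sample skewness and excess kurtosis can be computed from raw moments (a quick screening sketch; conventions and small-sample corrections vary across packages, so values may differ slightly from other software):

```python
def skew_kurtosis(x):
    """Sample skewness and excess kurtosis from raw moments
    (population-moment convention, no small-sample correction)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt

import random
# Hypothetical right-skewed data: squared standard normals
rng = random.Random(1)
data = [rng.gauss(0, 1) ** 2 for _ in range(5000)]
s, k = skew_kurtosis(data)
print(s > 1, k > 0)  # chi-square(1) data show strong positive skew and kurtosis
```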

cecilia posted on Thursday, September 15, 2011  3:25 pm



Considering the relationship between sample size and estimated parameters in an SEM where I'm more interested in the structural relations than in the other estimated parameters: should I take into account all estimated parameters, or is it possible to take into account only the structural paths (relations between latent variables)? Thanks


I don't think that ratio rule is a good one, so I don't know how to answer you. See instead Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620, which you can find on our web site.


I have an SEM for interaction effects with the MLR estimator. I am importing N=694 cases with 49 variables, of which I am using 18, from an ASCII comma-separated DAT file. The file does not have any variable (column) labels or other text values. The problem I am experiencing is that in the output the number of observations being returned is 661. Other than that, the analysis completes successfully; however, some parameter estimates differ from a previous run on a colleague's computer. I have imported and examined the data in Excel. The original data set was created in SAS 9.2, and I am working in Windows 7 64-bit. Not sure what to do?


Please send your output, data, and license number to support@statmodel.com so I can see what is happening. 


Hello, I am new to Mplus, and I was wondering if it is still possible to run multivariate multiple regression with Mplus (as listed on the UCLA Stats page http://www.ats.ucla.edu/stat/mplus/dae/mvreg.htm)? And if so, is there a way to calculate the minimum sample size needed? (I am working with a small sample of about 60.) Many thanks. Steve


Yes, this is still possible. See the following paper, which is available on the website: Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620.


Hi, above you have written that WRMR is dependent on the sample size. Do you know of an article I can cite to argue why I ignore the value even though it is recommended for categorical data? I have estimated the same model for two different samples; all "classic" GoF indices are fine except the WRMR for the bigger sample. For n = 417: RMSEA = 0.064, CFI = 0.992, TLI = 0.977, WRMR = 0.457. For n = 12,571: RMSEA = 0.038, CFI = 0.997, TLI = 0.991, WRMR = 1.200. Is there any simulation study you know of that I can use? Kind regards, Christoph


WRMR is an experimental fit statistic. I would not use it. It has not proven to work that well. It has never been written about as far as I know except in the Yu dissertation on the website. 

Steve posted on Thursday, July 11, 2013  4:28 am



Hello, My question is with regard to the ratio of sample size to parameters estimated. I have N>1000 (using MLR) and have justified my study's sample size based on a 5:1 ratio (Bentler & Chou, 1987). However, when I calculated this minimum sample size, I did not factor in that Mplus estimates the intercepts of the observed variables by default. When applying the 5:1 ratio to this much larger number of free parameters listed in the output, my sample size no longer meets this requirement. Therefore, I am wondering: 1) is it possible to justify that my study meets the 5:1 ratio when using the Mplus default analysis (somehow say that the observed variable intercepts don't apply to this calculation)? If not, 2) is there a way to conduct the analysis in Mplus without estimating the observed variable intercepts? I tried setting observed variable means and intercepts at zero (e.g., x1@0 [x1@0]), but this didn't seem to work (i.e., a lot worse fit). Your help would be much appreciated. Thanks!


I think the best way to see if your sample size is sufficient is to do a Monte Carlo study and see if you can recover your parameters using your sample size. You can add MODEL=NOMEANSTRUCTURE to the ANALYSIS command in some cases. 
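The intercept bookkeeping behind the poster's ratio problem can be illustrated for a simple one-factor CFA (a sketch with hypothetical numbers, assuming clean simple structure, first loading fixed at 1, and no cross-loadings):

```python
def cfa_free_parameters(p, with_means=True):
    """Rough free-parameter count for a one-factor CFA with p
    indicators: p-1 free loadings (first fixed at 1), one factor
    variance, p residual variances, and p intercepts when the
    mean structure is estimated."""
    loadings = p - 1
    factor_var = 1
    residuals = p
    intercepts = p if with_means else 0
    return loadings + factor_var + residuals + intercepts

# Hypothetical example: 10 indicators, N = 1000
print(cfa_free_parameters(10, with_means=True))   # 30
print(cfa_free_parameters(10, with_means=False))  # 20
print(round(1000 / cfa_free_parameters(10), 1))   # 33.3 cases per parameter
```

As the thread notes, because the mean structure is saturated, whether the intercepts "count" toward such a ratio is debatable; a Monte Carlo study sidesteps the question entirely.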

Steve posted on Thursday, July 11, 2013  10:42 am



Dear Linda, Thank you. I will look into the Monte Carlo study as you suggest. I just ran my CFA with MODEL=NOMEANSTRUCTURE added to the ANALYSIS command as you suggested. However, it still added the number of observed variables to the 'Number of Free Parameters' in the output. Any ideas why this is the case, or how the analysis could be conducted without estimating the intercepts of the observed variables? Thanks.


The default of using all available data requires means. You can do listwise deletion by saying LISTWISE=ON in the DATA command. 

Steve posted on Thursday, July 11, 2013  3:24 pm



Hi Linda, It appears that it may not be possible to have Mplus not estimate the intercepts of the observed variables (which therefore raises the number of free parameters substantially in my case). I have no missing data and have now run my CFA multiple ways, including with MODEL=NOMEANSTRUCTURE in the ANALYSIS command, LISTWISE=ON in the DATA command, and both at the same time. Each time, Mplus still adds the number of observed variables to the free parameters. Thank you for all of your help, and please let me know if anything else comes to mind to lower the number of free parameters to what is normally calculated.


Then your model requires means. 

Steve posted on Friday, July 12, 2013  3:57 am



Dear Linda, I would like to conduct the Monte Carlo study as you suggest. I have reviewed your paper on how to use a Monte Carlo study to decide on sample size, and also the chapter in the handbook on this subject. However, after reading these I have many questions and do not feel equipped to specify the analysis correctly. Is there another reference that I could purchase which details how to proceed with a Monte Carlo study for my large model? Many thanks.


Try asking on SEMNET. I don't know of anything else. 

Steve posted on Friday, July 12, 2013  1:00 pm



Thank you Linda. I will do as you suggest. 


Dear professors, Hello! I am running an LTA model with 2 dummy covariates. The sample size of the 0 reference category is just 41, and I want to use a Monte Carlo simulation study to test whether this size is sufficient. I think it's like a normal regression specification, but how do I specify the size of a specific response category? It is not a multiple group analysis, so we cannot use the NGROUPS option. Thank you in advance!


Try the Cutpoints option used in UG ex 12.1. 
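When a generated normal covariate is dichotomized at a cutpoint, the expected proportion in the reference category follows from the normal CDF, so the cutpoint for a target proportion can be backed out from the quantile function (a sketch; the total Monte Carlo sample size of 300 is a hypothetical value, not from the question):

```python
from statistics import NormalDist

def cutpoint_for_proportion(prop, mean=0.0, sd=1.0):
    """Threshold z such that P(X < z) = prop for X ~ N(mean, sd);
    cases below the cutpoint would fall in category 0 when the
    covariate is dichotomized."""
    return NormalDist(mean, sd).inv_cdf(prop)

# Hypothetical: total N = 300, targeting about 41 cases (13.7%)
# in the 0 reference category on average across replications.
z = cutpoint_for_proportion(41 / 300)
print(round(z, 3))
```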
