Monte-carlo

Mplus Discussion > Structural Equation Modeling >

Anonymous posted on Wednesday, February 28, 2001 - 7:42 pm

I have two questions.
First, in the Monte Carlo output, what is the meaning of 95% Cover?
Second, if I select 1000 replications, the output gives only the average and std. dev. Can I get all 1000 estimates?

Linda K. Muthen posted on Friday, March 02, 2001 - 9:21 am

95% coverage is the proportion of replications for which the 95% confidence interval contains the true parameter value.
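A minimal Python sketch of this definition, for anyone post-processing per-replication results outside Mplus. The function name, the example numbers, and the 1.96 normal-theory multiplier are illustrative assumptions, not a description of Mplus internals.

```python
def coverage_95(estimates, std_errors, true_value, z=1.96):
    """Proportion of replications whose 95% CI contains true_value."""
    hits = 0
    for est, se in zip(estimates, std_errors):
        lower = est - z * se  # lower confidence limit for this replication
        upper = est + z * se  # upper confidence limit for this replication
        if lower <= true_value <= upper:
            hits += 1
    return hits / len(estimates)


# Hypothetical estimates and standard errors from four replications.
est = [0.48, 0.55, 0.30, 0.52]
se = [0.05, 0.05, 0.05, 0.05]
cover = coverage_95(est, se, true_value=0.5)
print(cover)  # 0.75: three of the four intervals contain 0.5
```

With a well-behaved estimator and accurate standard errors, this proportion should be close to 0.95 over many replications.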

If you are using mixture Monte Carlo, you can save the results from each replication. If you are doing regular Monte Carlo, you cannot. If you generate data outside of Mplus, you can save all of the results. A Monte Carlo utility is available on the website for this purpose.

Anonymous posted on Sunday, March 04, 2001 - 11:13 pm

My simulations were made for testing general SEM (continuous data) and WLSMV (ordinal data) models. Can I get the simulated dataset for each replication ?

In an Mplus Monte Carlo input file, Mplus only needs the true mean and covariance matrices. How can Mplus know the true parameter values for the specific parameter structure of an SEM model?

Linda K. Muthen posted on Monday, March 05, 2001 - 8:43 am

No, you can save data from only the first replication in Mplus MONTECARLO. If, however, you generate data outside of Mplus, we provide a Monte Carlo utility on the Mplus website to aid in running the Monte Carlo simulation and saving results for further analysis.

For the estimator coverage statistics to be correct, Mplus picks up the true population values of the parameters from the starting values of each parameter in the MODEL command.

Mike Willoughby posted on Friday, January 07, 2005 - 11:26 am

I am trying to follow the procedures listed in Muthen & Muthen, 2002, concerning monte carlo for power estimation.

In that paper (p. 603), you describe 3 steps for generating non-normal data:
1 - generate data for 2 classes;
2 - run analysis w/ 1 replication;
3 - solve resid variances & use these as pop values for data generation.

Can you clarify how many files are needed to accomplish this?

For example, do I create data using 1 input file (using save command), analyze that saved data using a second input file (with replication = 1), and then use results of 2nd input to establish correct population values (for the aggregate sample) that are used as start values in the 'MODEL' portion of the first input file?

If this is correct, I'm assuming that only the last of these 3 files was listed in the appendix of that paper (and is avail for download on your website)?

Thanks for any clarification you might provide.

bmuthen posted on Friday, January 07, 2005 - 5:53 pm

The 3 steps listed are used as initial steps based on which the Monte Carlo input file (for many replications) shown in the paper is done.

The 3 initial steps can be done via a single input file that looks much like the one in the paper, but uses the large sample mentioned and only 1 replication. You save the data and check the skewness and kurtosis, etc.

Girish Mallapragada posted on Thursday, February 02, 2006 - 9:04 pm

Hi,

I am running a Monte-Carlo analysis on a SEM.

I am interested in testing the effects of construct reliability, R-squared and sample size on the structural coefficients (2 parameters).

I have finished my runs with the various test conditions and stored my results. However, to run my ANOVAs I need the values of only the two parameters I am interested in from my Monte Carlo results file.

Is there a way I can specify the results command so that only these two parameters are stored?

Girish

http://www.personal.psu.edu/users/g/z/gzm108/

Linda K. Muthen posted on Friday, February 03, 2006 - 6:42 am

No, you can't do this.

Girish Mallapragada posted on Saturday, February 04, 2006 - 11:00 am

Appreciate the clarification.

Girish Mallapragada posted on Thursday, February 16, 2006 - 8:28 pm

Hi,

I am running Monte- Carlo analysis on a SEM.
I am interested in testing the effects of construct reliability, R-squared and sample size on the structural coefficients (2 parameters).

I am fixing R-squared of my regression equations by fixing the residual variance of the dependent variable. i.e., If my dependent construct say is f1, is it alright to specify f1@0.3 in my model population, to fix the r-squared of the equation to 0.7?

Thanks in advance.

Linda K. Muthen posted on Friday, February 17, 2006 - 6:30 am

In MODEL POPULATION, by specifying f1@0.3 or f1*0.3, data will be generated with a residual variance of 0.3. So this is correct for an R-square of 0.7.

kimberly arcoleo posted on Monday, January 15, 2007 - 3:03 pm

I'm trying to run a Monte Carlo study to generate data for use in replicating my dissertation model which was under-powered. I've run into a glitch in that my model has a grouping variable (Clinic Yes/No) and I haven't been able to figure out the syntax for generating this variable under the Monte Carlo study. Any help you can give me is greatly appreciated! Kim

Linda K. Muthen posted on Monday, January 15, 2007 - 3:33 pm

If you mean a grouping variable as in multiple group analysis, see the Monte Carlo counterparts of Examples 5.15-5.18. If you mean a grouping variable as in a covariate, see Example 11.1.

kimberly arcoleo posted on Tuesday, January 16, 2007 - 2:17 pm

I'll try that - thanks so much!

kimberly arcoleo posted on Monday, January 22, 2007 - 10:37 am

I now have another issue. I have saved the parameter estimates from a previous analysis to use as the population values in the MC simulation. However, I'm getting an error message that there's insufficient data in the POPULATION file. What would cause this error? Thanks!

Linda K. Muthen posted on Monday, January 22, 2007 - 11:07 am

It may be that you changed something in the ANALYSIS command between Step 1 and Step 2 of the analysis. If this doesn't help, send all relevant files and your license number to support@statmodel.com.

Christian Geiser posted on Wednesday, March 21, 2007 - 2:09 pm

My apologies if this is a stupid question...

This afternoon I was working with an Mplus MC input and found the following issue quite puzzling. I always thought that the MODEL MONTECARLO statement is the one used to specify the true population values for a model, while the MODEL option is used to specify the model to be estimated in the 500 or so MC replications (as described in the Muthén & Muthén 2002 SEM paper). However, what I found today was that Mplus actually seems to interpret the values provided in the MODEL statement as population values (when I provided different values following an asterisk (*) which I thought would be taken only as starting values under the MODEL statement, the population values in the output changed accordingly). On the other hand, population values in the output did not change when I changed the values following an asterisk in the MODEL MONTECARLO command. Am I missing something?

Linda K. Muthen posted on Wednesday, March 21, 2007 - 3:32 pm

The use of the values in MODEL MONTECARLO and MODEL is described in the first Monte Carlo example in Chapter 11. MODEL POPULATION provides the values for data generation. MODEL provides the values that are used as both the true population values for computing coverage and as starting values. The reason for this is so the models for data generation and analysis can be different.

marissa hansen posted on Wednesday, May 16, 2007 - 1:55 pm

Hi - I am conducting a path analysis with count variables and receive an error message stating that I must use Monte Carlo integration in the analysis. Why is this a default in the program? See below for syntax and error message.

Thanks!

Mplus VERSION 3.12
MUTHEN & MUTHEN
05/16/2007 1:37 PM

MISSING are all (999);
USEVARIABLES ARE a5 h9 h17 recgende lngaccul relginf2 d1 needdepr forsrv
instlng emolng;

COUNT ARE h9 h17 FORSRV;

ANALYSIS: TYPE=GENERAL MISSING;

MODEL:forsrv on a5 recgende relginf2 lngaccul h9 h17 d1 needdepr emolng instlng;
OUTPUT: SAMPSTAT RESIDUAL STANDARDIZED;

INPUT READING TERMINATED NORMALLY

4-11-07 mplus

*** FATAL ERROR THIS MODEL CAN BE DONE ONLY WITH MONTECARLO INTEGRATION.

Linda K. Muthen posted on Wednesday, May 16, 2007 - 2:15 pm

Monte Carlo integration is required when the number of dimensions of integration varies across individuals due to missing data. I recommend upgrading from Version 3.12 to the most recent version of Mplus. There have been many changes and improvements since Version 3.12.

marissa hansen posted on Thursday, May 24, 2007 - 3:54 pm

Thank you for the information. Is this due to the count nature of my mediator and outcome variables? Is the FIML default turned off, so to speak, when Monte Carlo simulation is used to address the missing data?

Thanks for the information as I am a novice Mplus user.

Linda K. Muthen posted on Thursday, May 24, 2007 - 4:13 pm

In this situation, Monte Carlo is a type of numerical integration, not a simulation. There is a brief description of numerical integration in Chapter 13 of the user's guide, which is on the website.

Marvella Bowman posted on Tuesday, November 13, 2007 - 12:35 pm

Hello, I am new to Mplus, and I am using the demo version in attempts to determine the sample size and power that would be necessary for my dissertation research. My hypothesized model posits mediated-moderation, and there are 6 continuous variables involved (4 predictors (2 pairs of interacting variables) and two outcomes (one is a proposed mediator, the other the outcome).

Which would be the best monte carlo simulation study to run using the demo version?

Linda K. Muthen posted on Tuesday, November 13, 2007 - 1:17 pm

Use the Monte Carlo counterpart of Example 3.11.

Scott R. Colwell posted on Thursday, November 15, 2007 - 12:35 pm

I'm running a Monte Carlo study to examine sample size requirements. I have one continuous variable that is an interaction between X1 and X2. The define command does not seem to be available for monte carlo. (ie: if I were to create Z1 = X1*X2)

Does it matter in the calculation of sample size if the variable Z1 is not defined as a function of X1*X2? If it doesn't then I assume I would just add Z1 to the names are command.

Linda K. Muthen posted on Thursday, November 15, 2007 - 1:34 pm

In this situation, you would need to generate the data outside of Mplus and use external Monte Carlo in Mplus to analyze the data.

Jon Elhai posted on Monday, March 10, 2008 - 3:46 pm

Drs. Muthen,
In a Monte Carlo analysis, where I'm trying to estimate observed power after having conducted a confirmatory factor analysis... I'm wondering:
1) When freeing parameters with an asterisk, I assume the numbers I insert after an asterisk are the parameter estimates I obtained in my CFA?
2) If so, is this the standardized estimate, such as from the STDYX column in my CFA output?
3) If most of my Monte Carlo output's estimates in the % Signif Coeff column are 1.000, could I have done something wrong, or is this possible?

Linda K. Muthen posted on Monday, March 10, 2008 - 4:19 pm

The values placed after an asterisk or @ symbol in the MODEL POPULATION command are used as population values for data generation. The values placed behind an asterisk in the MODEL command are used to compute coverage and as starting values. It sounds like you do not have values in the MODEL command. See Example 11.1.

Heiko Schimmelpfennig posted on Thursday, June 12, 2008 - 4:55 am

Hello, I would like to generate data for a simple SEM with two latent variables. However, how can I specify a nonlinear, e. g. concave and monotonically increasing, relationship between the variables?

Linda K. Muthen posted on Thursday, June 12, 2008 - 12:38 pm

Mplus cannot specify non-linearity of this type. It can do x squared or x1 times x2.

Kathryn Degnan posted on Wednesday, October 22, 2008 - 7:40 am

Hi,
I am trying to run a monte carlo simulation to determine the power needed for inclusion in our grant proposal. My model is a path analysis with 3 x's, a mediator (y1), and a categorical outcome (u1). I have read through the manual on monte carlo simulations and am using example 3.14 as a guide as well, but I have a couple of questions.

1) Am I correct in assuming that the path estimates to u1 are in logits (if I am using ML)?

2) If I am using ESTIMATOR = ML, do I have to estimate a residual variance for u1? Isn't it zero?

3) Is it possible to simulate patterns of missingness when using the ML estimator?

Thank you.

Linda K. Muthen posted on Wednesday, October 22, 2008 - 11:43 am

1. Yes, the default for ML is logistic regression.

2. With logistic regression, the residual variance is not estimated. It is fixed to pi squared divided by 3.

3. Yes. See Examples 11.1 and 11.2.
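As a quick numeric check of the constant in point 2, the sketch below computes pi squared divided by 3 in Python. The helper `latent_response_r2` is a hypothetical illustration of how that fixed residual variance could enter an R-square for the underlying latent response variable. It is an assumption for illustration, not a statement of what Mplus prints.

```python
import math

# Residual variance implied by the standard logistic distribution.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def latent_response_r2(explained_variance):
    """Hypothetical helper: R-square on the latent response scale,
    using the fixed logistic residual variance in the denominator."""
    return explained_variance / (explained_variance + LOGISTIC_RESIDUAL_VARIANCE)


print(round(LOGISTIC_RESIDUAL_VARIANCE, 2))  # 3.29
```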

Kathryn Degnan posted on Friday, October 24, 2008 - 8:01 am

Thank you for your help. I have now gotten the monte carlo simulation to run with missing patterns, although I had to add MONTECARLO = integration to get it to run. When looking at examples for this language I saw that those examples (e.g., 3.17) used ESTIMATOR = MLR. However, my model also runs with ESTIMATOR ML and I can't see what the difference is.
My model is a path analysis with 3 x's, a mediator (y1), and a categorical outcome (u1). Is one of these estimators better than the other for my simulation?

Thank you.

Linda K. Muthen posted on Friday, October 24, 2008 - 8:38 am

The parameter estimates are the same for ML and MLR. The standard errors and fit statistics are different. ML has conventional standard errors and fit statistics. MLR has standard errors and fit statistics that are robust to non-normality. Our default in most cases is MLR.

Kathryn Degnan posted on Friday, October 24, 2008 - 8:56 am

Thank you for the clarification. Since my outcome variable is categorical and doesn't have a 50/50 split, MLR sounds like the better option for me.

Kathryn Degnan posted on Saturday, October 25, 2008 - 10:51 am

In the results of my monte carlo simulation using MLR, there is a log likelihood estimated across the replications for H0, but not one for H1. How do I assess model fit?
Since I am doing this mainly for a power analysis of the paths I am not sure what the comparison model would be - where all of the paths are estimated at zero?

Thanks for your help.

Bengt O. Muthen posted on Saturday, October 25, 2008 - 12:16 pm

With a combination of a continuous mediator and a categorical distal outcome plus ML estimation, there is no model fit assessment. This is because there is no relevant H1 model with an unrestricted covariance matrix. Such H1 models are relevant only for continuous outcomes.

Eun Sook Kim posted on Sunday, November 02, 2008 - 4:45 am

Hi.
I am trying to generate binary data. For the further use of the generated data in IRT programs, I want to format the data without decimals and spaces. Monte carlo command does not have format subcommand. Is there any way to format the data?

Linda K. Muthen posted on Sunday, November 02, 2008 - 8:16 am

No, there is no way to change the format of the data saved from a Monte Carlo study.

sorya posted on Wednesday, January 07, 2009 - 9:42 am

Dear Prof. Muthen,

just a very short question: Conducting a Montecarlo study does it make any difference whether an asterisk or the @ symbol is used in the "Model Population" command?

Thanks!

Linda K. Muthen posted on Wednesday, January 07, 2009 - 10:25 am

No. But it does make a difference in the MODEL command.

Tae Seok Yang posted on Monday, April 13, 2009 - 2:34 pm

Dear Dr. Muthen,

I am new to Mplus. I'm trying to run Monte Carlo now. After running, I saved the results in the text file. However, I don't know the order of results. The only guiding message I can see in the output file is "Parameter estimates
(saved in order shown in Technical 1 output)".

Can you tell me where I can get 'Technical 1 output', Please? How about technical 5 output?

Amir Sariaslan posted on Monday, April 13, 2009 - 2:49 pm

Hi Tae,

Did you ask for the outputs? You can retrieve them by specifying the following:

OUTPUT:
TECH1 TECH5;

For more information on this, see Chapter 17 in the User's Guide.

Sincerely,
Amir

Tae Seok Yang posted on Monday, April 20, 2009 - 1:19 pm

Hi, Thank you, Amir for your answer.

I have another question. I ran simulations with Mplus v. 3.2. Recently, I re-ran the same simulations (I mean the syntax was identical) in Mplus v.5.

I expected the same result. However, degree of freedom was different in two versions.

Is there any reason for having two different dfs in the different versions of Mplus?

Bengt O. Muthen posted on Monday, April 20, 2009 - 3:24 pm

Check that you have the same estimator in both runs and that the Tech1 outputs agree, and if not go with the one you want.

Daniel Rodriguez posted on Thursday, September 17, 2009 - 5:55 am

Hi, can the multiple processors function be used with monte carlo analysis to speed up the process?

Linda K. Muthen posted on Thursday, September 17, 2009 - 7:12 am

Yes.

Scott R. Colwell posted on Friday, September 25, 2009 - 9:53 am

May I ask a question for clarification regarding the Monte Carlo procedure.

If I want to specify what the true values are in the population but don't want to save the data, I use the command MODEL MONTECARLO.

If I want to specify what the true values are in the population as I did above, but this time I want to save the data, then I use the MODEL POPULATION command.

Is that correct?

Linda K. Muthen posted on Friday, September 25, 2009 - 10:23 am

MODEL POPULATION and MODEL MONTECARLO are the same and can be used interchangeably. The values given in them are used for data generation.

Daniel Rodriguez posted on Monday, September 28, 2009 - 7:57 am

OK, How would I code the multiple processors option to speed up the Monte Carlo analysis? Would I merely write processors=4 within the analysis option, or do I have to add something else?

Linda K. Muthen posted on Monday, September 28, 2009 - 9:19 am

It would be faster if you put:

PROCESSORS = 4 (STARTS);

Daniel Rodriguez posted on Monday, September 28, 2009 - 9:42 am

Thanks, I'll give it a shot.

Scott R. Colwell posted on Tuesday, September 29, 2009 - 5:50 pm

I have a question regarding the assessment of power with a Monte Carlo simulation.

Suppose in the Model Population command I fix a path of F2 on F1@.60 and do the same in the Model command. The % Sig Coeff is the estimate of power and even with a large sample size it is showing at 0. Is this because the parameter is fixed in the Model command?

Bengt O. Muthen posted on Tuesday, September 29, 2009 - 6:16 pm

Yes. You can't estimate power for a fixed parameter because it doesn't obtain an estimate/SE ratio that the power estimate is based on.
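A rough Python sketch of the idea, assuming the power estimate is the proportion of replications in which the absolute estimate/SE ratio exceeds the two-sided .05 critical value of 1.96. The function name and the numbers are made up for illustration.

```python
def proportion_significant(estimates, std_errors, critical=1.96):
    """Share of replications where |estimate / SE| exceeds the critical value."""
    flags = [abs(est / se) > critical for est, se in zip(estimates, std_errors)]
    return sum(flags) / len(flags)


# Hypothetical estimates and standard errors from four replications.
est = [0.60, 0.55, 0.10, 0.70]
se = [0.20, 0.25, 0.20, 0.30]
power = proportion_significant(est, se)
print(power)  # 0.75: ratios are 3.0, 2.2, 0.5 and about 2.33
```

A parameter fixed with @ in the MODEL command has no standard error, so the ratio cannot be formed. Use * in the MODEL command for any parameter whose power is of interest.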

Scott R. Colwell posted on Wednesday, September 30, 2009 - 11:24 am

Suppose when specifying a CFA model in the Model Population Command you have 3 indicators (x1-x3) of Factor 1. If you set the residual variances of all three to .51, for example X2@.51 shouldn't it automatically set the loading of Lambda11, Lambda12, and Lambda 13 at a standardized value of .70? Given that 1 - (.70^2) = .51? Currently it seems to require you to provide the loading.

Thank you.

Linda K. Muthen posted on Wednesday, September 30, 2009 - 4:00 pm

No population parameter value is set automatically. The value zero is used for any parameter for which you do not give a population value.

Scott R. Colwell posted on Thursday, October 01, 2009 - 4:52 pm

To follow-up on the question regarding the Model Population command in a Monte Carlo simulation, I have 2 questions:

(1) If I want to set a latent factor with 3 indicators to have an average r-square of 60%, then should I set the population parameter values for the unstandardized residual variance of all three items at .40 (eg. x1@.40 x2@.40 and x3@.40) in Model Population .

(2) If that is correct? Do I have to hand calculate what the unstandardized factor loading for each observed variable (item) would be or can I set the variance for the factor at 1 (F1@1) and then set the factor loading to start somewhere say F1 by x1-x3*.75 assuming that the program will calculate the proper loading based on the residual variance being set at .40 and the factor variance being set at 1.

At the end of the day, I want to be able to simulate data with different factors that have differing r-square values.

Thank you

Linda K. Muthen posted on Friday, October 02, 2009 - 3:42 pm

R-square = (lambda squared * factor variance) / ((lambda squared * factor variance) + residual variance)

If the factor variance and factor loadings are one, then

R-square = 1 / (1 + residual variance)

Solving for the residual variance:

Residual variance = (1 - R-square) / R-square

For an R-square of .6, the residual variance is

(1 - .6) / .6 = .667
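The same algebra as a small Python sketch, generalized to loadings and factor variances other than one. The function names are illustrative. The general form follows from solving the R-square formula above for the residual variance.

```python
def r_square(loading, factor_variance, residual_variance):
    """R-square of an indicator: explained variance over total variance."""
    explained = loading ** 2 * factor_variance
    return explained / (explained + residual_variance)


def residual_variance_for_r2(r2, loading=1.0, factor_variance=1.0):
    """Residual variance that yields the target R-square."""
    explained = loading ** 2 * factor_variance
    return explained * (1.0 - r2) / r2


theta = residual_variance_for_r2(0.6)
print(round(theta, 3))  # 0.667, matching the hand calculation
```

For example, with a loading of 0.75 and a factor variance of 1, `residual_variance_for_r2(0.6, loading=0.75)` gives the value to place after the indicator name in MODEL POPULATION.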

Scott R. Colwell posted on Wednesday, October 28, 2009 - 11:27 am

Am I reading this correctly from the manual page 346-347 regarding external Monte Carlo.

If data are generated using Mplus and saved as replist.dat, then used in the Type = Monte Carlo, the population values in the output come from the Model command (in the external Monte Carlo) and the average values come from the replist.dat. Is that correct? In this sense one could misspecify the model in the Model command and compare the results of population to average.

Linda K. Muthen posted on Wednesday, October 28, 2009 - 4:40 pm

The population values used for coverage are taken from the MODEL command for both internal and external Monte Carlo. The saved data sets are analyzed and the values are averaged across the data sets.

Scott R. Colwell posted on Monday, November 02, 2009 - 10:36 am

I am running a simulation to look at the bias due to attenuation when averaging items. So I have 2 factors with 5 indicators each. In the population I have modelled them to correlate at .40. I average them using the Mean command in Define. Then when I run the external Monte Carlo (which I need because I am using the Define command) I put the original value (.40) in the Model command: F1 with F2@.40. In tech9 the average values will be those using the means of F1 and F2, comparing them against the population values. Are there any problems in reading the output that way when using the Define command? It seems right to me.

Bengt O. Muthen posted on Monday, November 02, 2009 - 5:56 pm

Not sure I understand the intent here. Here are a couple of observations which may be of use:

If the indicators are correlated 0.40, their averages won't correlate 0.40. I don't know why F1 with F2 is fixed at .40 in the MODEL command. Typically, the MODEL command would use *. You mention average values using the means of F1 and F2 comparing them against pop values - why would that be of interest? I thought you were getting at the attenuated correlation between means of indicators as compared to factor correlation.
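To illustrate the attenuation point, here is a Python sketch of the population correlation between two item averages under simplifying assumptions that are not stated in the thread: equal loadings within each factor, unit factor variances, and mutually uncorrelated residuals. The loading of 0.7 and residual variance of 0.51 are example values only.

```python
def correlation_of_item_means(factor_corr, loading, residual_variance, n_items):
    """Correlation between two item averages when each average is
    loading * factor + mean of n_items independent residuals."""
    # Covariance of the two averages comes only from the factors.
    covariance = loading ** 2 * factor_corr
    # Variance of one average: factor part plus averaged residual part.
    variance = loading ** 2 + residual_variance / n_items
    return covariance / variance


attenuated = correlation_of_item_means(
    factor_corr=0.40, loading=0.7, residual_variance=0.51, n_items=5
)
print(round(attenuated, 3))  # 0.331, below the factor correlation of 0.40
```

Under these assumptions, a sensible population value for the correlation between the two averaged variables is the attenuated figure rather than .40.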

Scott R. Colwell posted on Tuesday, November 03, 2009 - 7:11 am

Sorry I wasn't very clear. What I have done is create two factors each with 5 indicators. In the population I have modelled them to correlate at .40 with varying sample sizes and 1000 reps. I want to examine the attenuation due to measurement error when you average the items in the factor to create one observed variable (call it OV) for each.

In the external Monte Carlo, I use the define command to create the averages so factor 1 becomes OV1 = Mean(x1-x5) and factor 2 becomes OV2 = Mean(x6-x10). I put the original values in the Model command OV1 with OV2*.40

In tech9 I am assuming that the average values in the columns will be those using the means of OV1 and OV2 comparing them against the population values. Is there any problems in reading the output that way when using the Define command? It seems right to me.

On the same topic, but different scenario, in the model command, is there anyway of specifying F1@(1-.65)*Var(x) without specifying what the value Var(x) is? This would be needed if you are correcting for attenuation and are running external Monte Carlo with Type = Monte Carlo.

Thank you,

Bengt O. Muthen posted on Wednesday, November 04, 2009 - 12:20 pm

The average values in the output give you the average over the replications of the estimate for whatever parameter is printed. So for example the average estimate of OV1 WITH OV2. And also for the mean parameters of OV1 and OV2 which it sounds like you refer to.

The answer to your V(x) question is no. Perhaps this can instead be done via Model Constraint. Perhaps using the Constraint = option in the VARIABLE command.

Scott R. Colwell posted on Thursday, November 05, 2009 - 8:17 am

That works great. I've never used this function before, but it's great.

One question. In example 5.20 and the discussion of it, it states that you specify "standardization" in the output command for the R-sqr values. But when I use the model constraint command it states that:

"STANDARDIZED (STD, STDY, STDYX) options are not available when specific constraints are used in MODEL CONSTRAINT."

One other question. Does using the model constraint command with the New function change the analysis model at all?

Bengt O. Muthen posted on Thursday, November 05, 2009 - 8:34 am

When standardized with R-square is not provided you can do it yourself in Model Constraint.

NEW parameters do not change the analysis model.

Scott R. Colwell posted on Thursday, December 10, 2009 - 9:03 am

Is there a way to save covariance matrices and/or correlation matrices for each replication (in a separate file) in a Monte Carlo run. For example, instead of generating the raw data for each replication it generates the matrix for each replication. - Thanks,

Linda K. Muthen posted on Thursday, December 10, 2009 - 9:42 am

No.

Martin Schultze posted on Monday, January 18, 2010 - 7:36 am

Hi,
I have a quick question concerning the PROCESSORS command with Monte-Carlo-Studies. I'm currently running a Monte-Carlo analysis on different computers using the TYPE=MONTECARLO subcommand under DATA:. I also used the PROCESSORS command in the ANALYSIS section. Now my problem is: while running these analyses on the 32-bit version it actually uses multiple processors. However, when I run the same inputs on the 64-bit version and I check the task-manager I can see that the procedures simply skip from processor to processor and the CPU-usage always hovers around 25% (on a computer with 4 processors).

Am I missing something? I tried it with and without the (STARTS) subcommand but the result is always the same.

Thanks in advance for your help!

Linda K. Muthen posted on Tuesday, January 19, 2010 - 2:46 pm

We have not had this experience. Can you send the input and data along with your license number to support@statmodel.com. We can run it on our 64-bit computer and see if we have the same experience.

Robert Wickham posted on Thursday, March 11, 2010 - 10:07 pm

Hello Drs. Muthen,
I am using Mplus to generate and analyze data for a Monte Carlo study, but I would also like to import said datasets into SAS for additional analyses. Given that I am using NREPS = 10000, I would like to automate this process using a macro. Unfortunately, I have been unable to figure out a way to import the dat files produced by MPlus using a PROC IMPORT statement.
I realize that this is primarily a SAS-related issue, however I was curious if there was any way to modify the format or type of output datafile produced by the Monte Carlo option in Mplus.
Thanks

Linda K. Muthen posted on Friday, March 12, 2010 - 5:49 am

There is no option to change the format of datasets saved via the REPSAVE and SAVE options of the MONTECARLO command.

Alexander Kapeller posted on Tuesday, April 27, 2010 - 8:12 am

Hello,

I want to specify R2 for a latent dependent variable in a mc simulation to 0.3. therefore I fix the residual variance to 0.7. correct?

By doing this, do I also affect the reliability of the measurement construct of the DV? Can I specify residual variance and variance of the latent dependent variable separately?

Thanks
Alex

Linda K. Muthen posted on Tuesday, April 27, 2010 - 12:47 pm

If the variance of the variable is one, then a residual variance of .7 reflects an R-square of .3.

You specify a variance or residual variance depending on the parameter that is estimated. For example, in a conditional model, a residual variance is specified. In an unconditional model, a variance is specified.

Alexander Kapeller posted on Tuesday, April 27, 2010 - 2:29 pm

Hello Linda,

thanks for clarifying. Then i have a follow up question.

my model

dv on iv ;
dv by v1-v3@0.8;

dv@0.7; !R-square = 0.3

v1-v3@0.4

is then indicator reliability
(0.8**2 * 0.3 )/(0.8**2 * 0.3 + 0.4)= 0.32

thanks in advance
alex

Linda K. Muthen posted on Tuesday, April 27, 2010 - 4:07 pm

You need a few more values in MODEL POPULATION:

iv*1;
dv on iv*.55;
dv by v1-v3@0.8;
dv@0.7; !R-square = 0.3
v1-v3@0.4

This results in:

var(dv) = (.55**2)*1 + .7 = 1.0025, approximately 1
var(v) = (.8**2)*1 + .4 = 1.04

reliability (v) = .64/1.04 = .62
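The arithmetic above can be checked with a few lines of Python. The variable names are illustrative. Unlike the rounded hand calculation, the sketch carries the exact value of var(dv) through to the reliability.

```python
iv_var = 1.0      # iv*1
beta = 0.55       # dv on iv*.55
dv_resid = 0.7    # dv@0.7
loading = 0.8     # dv by v1-v3@0.8
ind_resid = 0.4   # v1-v3@0.4

# Total variance of the latent dependent variable and its R-square.
dv_var = beta ** 2 * iv_var + dv_resid
dv_r2 = beta ** 2 * iv_var / dv_var

# Total variance and reliability of each indicator.
ind_var = loading ** 2 * dv_var + ind_resid
reliability = loading ** 2 * dv_var / ind_var

print(round(dv_var, 4), round(dv_r2, 2), round(reliability, 2))  # 1.0025 0.3 0.62
```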

Alexander Kapeller posted on Wednesday, April 28, 2010 - 10:10 am

Thanks again,

I understand the condition that the variance of the dependent variable has to be 1 to specify the R-square=1-residual variance.

my path coefficient
dv on iv*.55;
will probably not be 0.55. so I will get a variance(dv) <> 1. how can I then specify R-square.

Best

Alex

Linda K. Muthen posted on Wednesday, April 28, 2010 - 10:23 am

Then you have to figure total variance and variance explained and compute R-square from that.

Alexander Kapeller posted on Thursday, April 29, 2010 - 10:18 am

Hi Linda,

so eg. in the case of 2 IV I use the formula for R2=

(b1**2 * var(IV1) + b2**2 * var(IV2) + 2*b1*b2*cov(IV1,IV2)) / (b1**2 * var(IV1) + b2**2 * var(IV2) + 2*b1*b2*cov(IV1,IV2) + residual(DV)) = R2

is that correct? and to calculate this from a normal Mplus output I take the unstandardized values?

then I have another question belonging to the input parameters I specify in the monte carlo command. Are the parameter values standardized or unstandardized? I think they are unstandardized. is that correct?

Thanks a lot

alex

Linda K. Muthen posted on Thursday, April 29, 2010 - 2:07 pm

Yes. Use the unstandardized values.

You should specify population parameter values using unstandardized estimates.
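A small Python sketch of the two-predictor R-square calculation discussed above, using unstandardized values. The function name and the input numbers are hypothetical.

```python
def r_square_two_predictors(b1, b2, var1, var2, cov12, resid_var):
    """R-square of a DV regressed on two IVs, from unstandardized values."""
    explained = b1 ** 2 * var1 + b2 ** 2 * var2 + 2 * b1 * b2 * cov12
    return explained / (explained + resid_var)


# Hypothetical population values.
r2 = r_square_two_predictors(
    b1=0.4, b2=0.3, var1=1.0, var2=1.0, cov12=0.2, resid_var=0.7
)
print(round(r2, 3))  # 0.299
```

Working backwards, a target R-square can be reached by holding the slopes and IV (co)variances fixed and solving for the residual variance: resid_var = explained * (1 - R2) / R2.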

Alexander Kapeller posted on Sunday, May 02, 2010 - 2:33 pm

hi Linda,

I am now conducting the MC Simulations. Is there a way to have an output with the R-square results. And if not, is there a special reason they cannot be calculated or something else?

Best Alex

Linda K. Muthen posted on Sunday, May 02, 2010 - 4:33 pm

R-square is not available for Monte Carlo. There is no particular reason.

Alexander Kapeller posted on Wednesday, May 05, 2010 - 9:20 am

Hi Linda,

I conducted the MC analysis and now I am a bit confused about the output. in the parameter specification and the population values the path coefficients are reported under the headline beta.
1)Are those standardized or unstandardized values? 2)Or are they both, depending what I specify as variances in the population and model part of the input.

thanks
Alex

Linda K. Muthen posted on Wednesday, May 05, 2010 - 9:26 am

We do not standardize them. They depend on the population parameter values that you give.

Alexander Kapeller posted on Wednesday, May 05, 2010 - 10:22 am

Linda,

thanks a lot

Alex

Ginger Lockhart posted on Friday, August 27, 2010 - 9:55 am

Hello,
I'm wondering whether Mplus has a way of exporting information on non-positive definite theta matrices in the save data command. I have successfully saved out the results for each replication but I would like to also identify which ones are not positive definite.
Thanks,
Ginger

Linda K. Muthen posted on Friday, August 27, 2010 - 10:42 am

You can see that if you add TECH9 to the OUTPUT command.

Ginger Lockhart posted on Friday, August 27, 2010 - 11:52 am

Thanks Linda; I'm wondering, though, if this information can be included in the data file produced in the save data command, (for example, a binary term indicating whether the theta matrix for each replication was non-positive definite).

Linda K. Muthen posted on Friday, August 27, 2010 - 3:23 pm

There is no way to request this if we don't save it with the other results and I don't think we do.

ywang posted on Thursday, September 16, 2010 - 12:44 pm

Hello,

To calculate the required sample size for grant proposals. Mplus needs specific information such as the mean and variance of slope and intercept of growth in the Monte Carlo study, but where can we get the information from the literature? The related paper usually has only part of the information, but not all information. If we do not have complete information, what can we do?

Thanks a lot in advance!

Linda K. Muthen posted on Thursday, September 16, 2010 - 2:24 pm

You need plausible population parameter values for all parameters in the model. If they are not provided by theory, you can estimate a model using your data and use these values as population parameter values.

Alexandre Morin posted on Wednesday, October 06, 2010 - 3:21 am

Hi,
This might be a stupid question (sorry for that) but I can't seem to find the answer (maybe due to a cold I am stuck with).
When generating data under "Monte Carlo Population", what is the impact on the generated data of using @ (fix) or * (start) when providing population model values?
Thank you very much!

Linda K. Muthen posted on Wednesday, October 06, 2010 - 6:48 am

There is no difference between @ and * when they are used in MODEL POPULATION. This is documented in two places: under MODEL POPULATION and in the first example in the Monte Carlo examples chapter.

Alexandre Morin posted on Sunday, October 24, 2010 - 12:52 pm

Hi,
I would like to generate a series of data sets using the Monte Carlo facility (let's say 500) and then analyse them via the EXTERNAL facility. Why? Because I want to run different types of models on the same data sets (and to save the time required to generate the data each time).
For instance, I want to generate a set of data that fits a CFA model with cross-loadings and see how to best recover the population parameters (ESEM, CFA, PVs, etc.).
Is there a way to still specify the population parameters somewhere to get the coverage and other Monte Carlo indices?
Thanks

Linda K. Muthen posted on Monday, October 25, 2010 - 11:41 am

The coverage values are taken from the MODEL command for both internal and external Monte Carlo.

Richard E. Zinbarg posted on Friday, November 19, 2010 - 2:50 pm

Hi,
I would like to generate some simulated data sets to compare conventional regression analysis with SEM. When creating the factor structure underlying simulated data, in the past I have typically just set the variance of the factors equal to 1. As I am interested in comparing unstandardized coefficients in this particular set of simulations, I need to set the metric of the factors another way (otherwise my factors are standardized and so the regression coefficient relating them is also standardized and I am interested in unstandardized coefficients for this particular project). Am I correct that all I need to do as an alternative is set one of the factor loadings for each factor equal to one as in the syntax below?
Model Population:
f1 BY y1@1 y2-y4*.707;
f2 BY y5@1 y6-y8*.707;
f1 on f2*.5;
y1-y8*1;
Or are there any other constraints I need to add to the model? Many thanks!

Bengt O. Muthen posted on Saturday, November 20, 2010 - 7:35 am

That's it.

Richard E. Zinbarg posted on Saturday, November 20, 2010 - 6:24 pm

thanks Bengt! And I have what is probably a very basic follow-up question. Using the above specification, how do I know what the variances of the factors are?

Bengt O. Muthen posted on Sunday, November 21, 2010 - 9:48 am

Ask for TECH4 in the OUTPUT command.

Richard E. Zinbarg posted on Sunday, November 21, 2010 - 3:37 pm

many thanks Bengt for the speedy and helpful response (as always)!

Richard E. Zinbarg posted on Monday, November 22, 2010 - 7:15 pm

I must be doing something wrong but can't figure out what. Unless I specify the variance of the factors in addition to the factor loadings in my model population statement, I run into a problem such that only a relatively small subset of the replications that I request are actually completed. As long as I also specify the variance of the factors, then the number of replications I request is the number completed.

Bengt O. Muthen posted on Tuesday, November 23, 2010 - 7:56 am

You have to give population values for the factor variances even if you don't fix them, so e.g.

f1-f2*1;
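
To see roughly what TECH4 would report as model-implied variances under these population values, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch, assuming the values from the syntax earlier in this exchange (f1 ON f2*.5, a variance of 1 for f2, a residual variance of 1 for f1, loadings of .707 and indicator residual variances of 1); the function name is hypothetical and this is not Mplus code.

```python
def implied_variance(slope, predictor_variance, residual_variance):
    """Variance of an outcome with one predictor:
    V(y) = slope^2 * V(x) + V(residual)."""
    return slope ** 2 * predictor_variance + residual_variance

# Population values taken from the MODEL POPULATION statements in this thread.
var_f2 = 1.0                                    # f2 is exogenous: f2*1 is its variance
var_f1 = implied_variance(0.5, var_f2, 1.0)     # f1*1 is a residual variance

# Indicator y2 loads .707 on f1 and has residual variance 1 (y1-y8*1).
var_y2 = implied_variance(0.707, var_f1, 1.0)

print(round(var_f1, 4))   # total variance of f1
print(round(var_y2, 4))   # total variance of y2
```

Note that for the dependent factor f1 the value given in f1-f2*1 is a residual variance, so its total variance is larger than 1.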

Erika Wolf posted on Tuesday, December 07, 2010 - 2:23 pm

Can you provide more detail on using the Monte Carlo approach to generate non-normal data for power analyses? I have read your 2002 paper on this but I am not clear on how to determine the appropriate population values to achieve the desired level of non-normality in the generated data. In a message on this string dated 1/7/05, you state that the initial 3 steps in this process can be completed in 1 input file. Can you provide an example of that script? Thanks.

Linda K. Muthen posted on Wednesday, December 08, 2010 - 9:39 am

The approach we used in the paper was using mixture modeling with trial and error. I think we describe this in the paper and the inputs are part of the paper and/or on the website.

Erika Wolf posted on Monday, December 13, 2010 - 2:22 pm

I think there is only script available for the last step (the actual power analysis) in the paper. I'm trying to determine how you arrived at the values that you specified in the script.

As I understand, you started with data generation for a 2 class model with .8 factor loadings, indicator error of .36 and factor intercorrelation of .25 for both classes. (a) Is this correct?

The paper describes 3 steps before getting to the script provided in the paper.
Step 1: Generate data for 10,000 cases with 2 latent classes (which will eventually be analyzed as 1). When you generate non-normal data by having 2 latent classes, (b) are the only differences that you specify between the 2 classes that the factor 2 mean is set to 15 and the factor 2 variance is set to 5 in Class 1? Are all other parameters (factor loadings, indicator error, factor correlation) the same in the overall and class-specific models?

I generated data for 10,000 cases with 2 classes (12% and 88%), each with .8 loadings, .36 indicator error, .25 factor correlation, and class 1 mean and variance of 15 and 5, respectively. But when I pulled the saved data into SPSS, the skewness and kurtosis values for the factor 2 indicators differed from what you report in the 2002 paper. I'm trying to replicate what you did before I move on to running the power analysis I need to run for my paper. Thanks!

Linda K. Muthen posted on Monday, December 13, 2010 - 3:28 pm

The two inputs for generating non-normal data are shown on the examples page under the headings:

CFA model with non-normal continuous factor indicators without missing data.

CFA model with non-normal continuous factor indicators with missing data.

The trick is to generate using two classes and analyze using one class. All of the steps are contained in the inputs. If you run those inputs, you should get the same results as in the paper.
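
The mechanism behind "generate using two classes and analyze using one class" can be illustrated outside Mplus. The sketch below is only an illustration under stated assumptions: it mixes two normal distributions for a single variable (12% with mean 15 and variance 5, as mentioned in the question; 88% with mean 0 and variance 1, which is an assumption) and reports the skewness and excess kurtosis of the pooled sample. It does not reproduce the indicator-level values in the 2002 paper, and the function names are hypothetical.

```python
import random

def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def generate_two_class_sample(n, p_class1, mean1, var1, mean2, var2, seed):
    """Draw n values from a two-class normal mixture, pooled into one sample."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        if rng.random() < p_class1:
            sample.append(rng.gauss(mean1, var1 ** 0.5))
        else:
            sample.append(rng.gauss(mean2, var2 ** 0.5))
    return sample

# Class 1 values come from the question above; class 2 values are assumed.
sample = generate_two_class_sample(10000, 0.12, 15.0, 5.0, 0.0, 1.0, seed=2002)
skew, kurt = skewness_and_excess_kurtosis(sample)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

Each class is normal on its own; the non-normality appears only when the classes are pooled, which is why the class proportions and the separation of the class means act as the tuning knobs in the trial-and-error step.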

Yo In'nami posted on Friday, December 17, 2010 - 6:01 am

Dear Muthen and Muthen,

Having read Muthen and Muthen (2002) and related materials using this method (the Mplus user's guide; Thoemmes et al., 2010), I am planning to conduct a post-hoc power analysis on a variety of models found in the field of language education. Since both sources illustrate many examples of input commands, I will use these examples as a starting guide. Is it correct to assume that the power of any model can be calculated as long as the model can be programmed into Mplus? Or, in other words, is there any model whose power cannot be calculated using Mplus?

Yo

Muthen, L. K., & Muthen, B. O. (2002). How to use a monte carlo study to decide on sample size and determine power. Structural Equation Modeling, 9, 599-620.

Muthen, L. K., & Muthen, B. O. (2007). Chapter 11: Monte Carlo simulation studies. Mplus user's manual 5th edition.

Thoemmes, F., MacKinnon, D. P., & Reiser, M. R. (2010). Power analysis for complex mediational designs using Monte Carlo methods. Structural Equation Modeling, 17, 510-534.

Linda K. Muthen posted on Friday, December 17, 2010 - 6:09 am

You can use this approach on any model that can be specified using the MONTECARLO command. Note that the power is for one parameter of the model not the entire model.

Yo In'nami posted on Friday, December 17, 2010 - 9:13 am

Linda,

Thank you very much for a very useful comment! I have been considering this question for the past several months. Now I can go ahead with my work.

Yo

Yo In'nami posted on Wednesday, April 13, 2011 - 7:19 am

Muthen and Muthen (2002) explain that the Monte Carlo output column labeled "% Sig Coeff" refers to power: for each parameter, it shows the proportion of replications for which the null hypothesis that the parameter is equal to zero is rejected at the .05 level. I have conducted several post-hoc power analyses on published models by specifying the MODEL POPULATION and the MODEL commands to be identical. If parameter estimates in published models are reported to be statistically significant and these estimates are specified in both the MODEL POPULATION and the MODEL commands, will the "% Sig Coeff" of these parameter estimates always be over .80? If so, conducting a post-hoc (not a priori) power analysis just seems to be establishing/reconfirming what is already apparent--that there was sufficient power to detect statistical significance of parameters. In other words, is it correct to say that there is no need to conduct "post-hoc" power analyses if parameters of interest have already been reported to be statistically significant?

Yo

Muthen, L. K., & Muthen, B. O. (2002). How to use a monte carlo study
to decide on sample size and determine power. Structural Equation
Modeling, 9, 599-620.

Muthen, L. K., & Muthen, B. O. (2007). Chapter 11: Monte Carlo
simulation studies. Mplus user's manual 5th edition.

Linda K. Muthen posted on Wednesday, April 13, 2011 - 9:35 am

I would think this type of power analysis is usually done in the planning of a study to determine the necessary sample size or perhaps after to see if non-significance is due to lack of power. However, even if you find significance, you don't know with what power you find it. Perhaps your power was .3. You may have been lucky to find significance in your sample but may not be so lucky in another sample of the same size.
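
The point that a significant result can come from a low-power design can be illustrated with a small simulation. This is a minimal sketch and not the Mplus procedure: it assumes a much simpler setting (a z-type test that a single mean is zero, unit variance) and made-up values for the true mean, sample size, and number of replications. The proportion of significant replications plays the role of the "% Sig Coeff" column.

```python
import math
import random

def simulate_power(true_mean, n_obs, n_reps, seed, critical_z=1.96):
    """Proportion of replications in which H0: mean = 0 is rejected."""
    rng = random.Random(seed)
    n_significant = 0
    for _ in range(n_reps):
        data = [rng.gauss(true_mean, 1.0) for _ in range(n_obs)]
        mean = sum(data) / n_obs
        variance = sum((x - mean) ** 2 for x in data) / (n_obs - 1)
        z = mean / math.sqrt(variance / n_obs)   # estimate divided by its standard error
        if abs(z) > critical_z:
            n_significant += 1
    return n_significant / n_reps

# Hypothetical design: small effect, modest sample size.
power = simulate_power(true_mean=0.2, n_obs=50, n_reps=2000, seed=599)
print(f"estimated power = {power:.3f}")
```

Under these assumed values only a minority of replications reject the null hypothesis, so any single significant sample says little about how often a sample of that size would yield significance.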

radanielina-hita marie louise posted on Wednesday, April 13, 2011 - 10:53 pm

Hi,

I am running three sets of Monte Carlo simulation studies for my dissertations: one for my measurement model, one for a follow-up structural portion and another one for a parallel growth curve process.
I could make the CFA run but I do not see the commands for a parallel growth curve process in the Mplus guide. Any advice on how I can do this will be appreciated.
Thanks.

Bengt O. Muthen posted on Thursday, April 14, 2011 - 8:01 am

See UG ex 6.13.

Martin Schultze posted on Monday, May 16, 2011 - 2:19 am

Hi,

we're currently preparing a rather large simulation study and are desperately looking for ways to speed the process up. We had the idea that, while we want to have Chi-Square, SRMR, and RMSEA, we're not really interested in CFI/TLI. Is there any way to switch off the baseline model estimation while keeping the H1 model estimation?

Thank you so much in advance!

Martin

Linda K. Muthen posted on Monday, May 16, 2011 - 8:31 am

There is no way to do this. If you send your input and license number to support@statmodel.com, we will see if we can make other suggestions to speed things up.

Yo In'nami posted on Saturday, May 21, 2011 - 2:33 am

I am using a Monte Carlo power analysis and received an error message that the model is unidentified although it is identified with other SEM programs. The degrees of freedom are 17 in Mplus but 19 in other programs. I have been unsuccessful rectifying the syntax. I am grateful for your generous help!

MONTECARLO:
NAMES ARE X1-X8;
NOBSERVATIONS = 259; ! SAMPLE SIZE OF INTEREST
NREPS = 10000;
SEED = 53567;
MODEL POPULATION:
IAVLE BY X1*.89 X2*.69 X3*-.75;
SRC BY X4*.81 X5*.83 X6*.71 X7*.79 X8*.43;
SRC ON IAVLE*.67;
X1@1 X4@1;
X1*.21; X2*.52; X3*.44; X4*.34; X5*.31;
X6*.50; X7*.38; X8*.82;
SRC*.55;
MODEL:
(Note. Exactly the same as the Model Population above and thus omitted to shorten the message);
ANALYSIS: ESTIMATOR = ML;
OUTPUT: TECH9;

Linda K. Muthen posted on Saturday, May 21, 2011 - 6:57 am

Please send the full output including TECH1 and your license number to support@statmodel.com.

Linda K. Muthen posted on Sunday, May 22, 2011 - 9:52 am

You have freed all factor loadings but forgot to fix the factor variances to one.
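
The difference between 17 and 19 degrees of freedom is consistent with a simple parameter count. The sketch below is one plausible accounting, not something confirmed by the output in this thread: it assumes a covariance structure only (means and intercepts cancel out) and that the two extra free parameters are the variance of IAVLE and the residual variance of SRC. The function name is hypothetical.

```python
def degrees_of_freedom(n_indicators, n_free_parameters):
    """df = number of unique variances/covariances minus free parameters."""
    n_moments = n_indicators * (n_indicators + 1) // 2
    return n_moments - n_free_parameters

n_indicators = 8
loadings = 8             # all eight factor loadings free
residual_variances = 8   # X1-X8
structural_paths = 1     # SRC ON IAVLE
factor_variances = 2     # IAVLE variance and SRC residual variance, if left free

df_all_free = degrees_of_freedom(
    n_indicators,
    loadings + residual_variances + structural_paths + factor_variances,
)
df_variances_fixed = degrees_of_freedom(
    n_indicators,
    loadings + residual_variances + structural_paths,
)
print(df_all_free)         # 17
print(df_variances_fixed)  # 19
```

With all loadings free and the factor variances also free, the factor metric is not set, which is the identification problem described above; fixing the two variances to one removes two parameters and gives the 19 degrees of freedom reported by the other programs.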

mpduser1 posted on Thursday, June 23, 2011 - 4:12 pm

I've been running some Monte Carlo power analyses in Mplus 6.11 and I'm wondering if the results make sense. My input file is:

MONTECARLO:
NAMES ARE Y T;
NOBSERVATIONS = 280;
NREPS = 1000;
SEED = 4533;
GENERATE = Y (n 2);
NOMINAL ARE Y;
CUTPOINTS = T(0);

MODEL POPULATION:
T*.50 ;
[T*.25] ;
Y#1 on T*.3;
Y#2 on T*.3;
[Y#1*.21];
[Y#2*1.41];

ANALYSIS: TYPE = GENERAL;
ESTIMATOR = ML;
MODEL:
Y#1 on T;
Y#2 on T;

<results>

Linda K. Muthen posted on Friday, June 24, 2011 - 9:22 am

The input looks fine.

mpduser1 posted on Friday, June 24, 2011 - 2:04 pm

Okay, I guess I was thrown by the fact that Mplus doesn't report the "population" parameter values, such that the "% Sig. Coeff" column provides the parameter-specific power estimates, unless MODEL COVERAGE is specified.

So, to get the estimated power figures directly, I have to specify my population model twice, once in MODEL POPULATION and once again in MODEL COVERAGE. Is this correct?

Linda K. Muthen posted on Friday, June 24, 2011 - 4:16 pm

The population parameter values for coverage should be in the MODEL command not MODEL POPULATION. See Example 12.1.

Anonymous posted on Wednesday, June 29, 2011 - 11:04 am

BOOTSTRAP is not allowed with MONTECARLO. Is there any way to obtain the bootstrap standard errors and confidence interval for parameter estimates in a monte carlo study by using Mplus?

Linda K. Muthen posted on Wednesday, June 29, 2011 - 2:42 pm

Not unless you analyze each data set separately after data generation.

Leslie Rutkowski posted on Friday, September 16, 2011 - 1:38 pm

Hi, I just ran a fairly simple path model and tried to save the estimates to use for a subsequent MC simulation. I am getting the following error:
"Saving of ending values for the ESTIMATES option is not available for models with covariates. Request for ESTIMATES is ignored."

Thanks,
Leslie

Linda K. Muthen posted on Friday, September 16, 2011 - 3:13 pm

Please send the full output and your license number to support@statmodel.com.

Jak posted on Monday, October 10, 2011 - 8:18 am

Hello,

In a montecarlo analysis, 1 of the 500 replications was not completed.
I saved the results to file, which has the results from the 499 completed replications.

For each replication, I want to compare the fit of this model with the fit of another model, for which I have results for all 500 replications.

How do I find out which replication is missing in the first file?

Thanks in advance, Suzanne

Linda K. Muthen posted on Monday, October 10, 2011 - 11:14 am

Ask for TECH9 in the OUTPUT command and you will see which replication had the problem.

elementary posted on Friday, October 28, 2011 - 2:48 am

Hello,
I want to run a power analysis using the Monte-Carlo Approach for a planned survey using a two-stage sampling procedure.

As my hypotheses are on level-1 only, I do not plan multilevel-analyses, but rather want to run a "normal" SEM, adjusting standard errors with cluster/type = complex.

I tried to model this using the Mplus montecarlo option as described in the CFA-textbook of Brown (2006, p. 420ff)(To get started, at the moment I ignore the mean structure and the multiple group design, which have to be added in a later stage). When assuming that data are not nested, everything worked.

However, I did not manage modeling the clustering as well. Type = complex seems not to be available within montecarlo. And using NCSIZES and CSIZES is obviously available only with type=twolevel, which does not seem to be applicable in this case.

Thus, I thought of finally correcting the sample size I get under the assumption of a random sample, using a correction formula for effective sample size, (cf. the Multilevel-textbook of Hox, 2002, p.5). Is this reasonable and are there alternatives to that?

Thanks!

Linda K. Muthen posted on Friday, October 28, 2011 - 8:02 am

You need to generate the data in one step using TWOLEVEL and analyze it using external Monte Carlo in a second step. This is described in Example 12.6.
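
For reference, the effective sample size correction raised in the question is commonly written as n_eff = n / (1 + (m - 1) * ICC), where m is the average cluster size and ICC is the intraclass correlation. The sketch below assumes this standard design-effect formula is the one meant; the input values are made up and the function name is hypothetical. It is a rough approximation and not a substitute for the two-step approach described above.

```python
def effective_sample_size(n_total, cluster_size, icc):
    """Effective sample size under the design effect 1 + (m - 1) * ICC."""
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_total / design_effect

# Hypothetical survey: 1000 respondents in clusters of 20, ICC of .10.
n_eff = effective_sample_size(n_total=1000, cluster_size=20, icc=0.10)
print(round(n_eff, 1))
```

Even a modest intraclass correlation shrinks the effective sample size considerably when clusters are large, so a power analysis that ignores clustering can be quite optimistic.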

Eric Teman posted on Tuesday, January 24, 2012 - 1:12 pm

When running a Monte Carlo simulation, is there a way to monitor or output convergence failure information?

Linda K. Muthen posted on Tuesday, January 24, 2012 - 5:51 pm

Ask for TECH9 in the OUTPUT command.

elementary posted on Wednesday, January 25, 2012 - 6:20 am

Thanks for your quick reply to our posting from October 28 regarding (external) montecarlo! We first tried to understand the examples (which took us some time). For this purpose, we reran example 9.12 and subsequently tried to identify the numbers in the output that have been used as input for example 12.6. Unfortunately, we often were not able to identify them. For example, in the unstandardized output for example 9.12, the residual variance for sw was 0.473 and for sb it was 0.214, while in example 12.6 input, it was entered as 0.2 for sw and 0.1 for sb . These differences generalize to other parameters. Do you have any suggestions on that? That might help us to locate possible further errors we made in subsequent steps of our analyses. Thanks!

Linda K. Muthen posted on Wednesday, January 25, 2012 - 6:42 am

Please send an output that shows your question along with your license number to support@statmodel.com. Specify exactly where in the output the numbers you refer to can be found.

Lois Downey posted on Thursday, February 09, 2012 - 9:26 pm

Chapter 12 of the User's Manual includes a discussion of Monte Carlo output related to the chi-square test of model fit (pp. 362-3). In the example, the critical value of chi-square at the 0.05 level was exceeded in 0.058 of the replications. The discussion indicates that 0.058 is close to the expected value of 0.05, thus indicating that the chi-square distribution is well approximated. I'm not clear on how much greater than .05 the value can be, and still be acceptable. What would you use as an upper limit before concluding that the chi-square distribution is NOT well approximated?

Thanks.

Linda K. Muthen posted on Friday, February 10, 2012 - 9:55 am

I would say less than .10 but it is really your choice as to how precise you want to be.
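
One way to judge how far from .05 the observed proportion can reasonably fall is to look at the Monte Carlo sampling error of a proportion, which depends on the number of replications. This is a minimal sketch using the usual binomial standard error; the replication counts below (500 and 10000) are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

def rejection_rate_interval(nominal_rate, n_reps, z=1.96):
    """Approximate 95% band for an observed rejection proportion
    when the true rejection rate equals the nominal rate."""
    se = math.sqrt(nominal_rate * (1.0 - nominal_rate) / n_reps)
    return nominal_rate - z * se, nominal_rate + z * se

low_500, high_500 = rejection_rate_interval(0.05, 500)
low_10000, high_10000 = rejection_rate_interval(0.05, 10000)

print(f"500 replications:   {low_500:.3f} to {high_500:.3f}")
print(f"10000 replications: {low_10000:.3f} to {high_10000:.3f}")
```

With 500 replications an observed value of 0.058 is well within sampling error of 0.05, whereas with 10000 replications the same value would fall outside the band, so the tolerance one accepts should take the number of replications into account.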

Gavin T L Brown posted on Tuesday, February 28, 2012 - 2:50 pm

Dear MPlus
We are studying DIF under small sample conditions to determine the degree of bias in parameter and se values. We are using a MIMIC model for DIF detection. In the simulation we have varied the sample size (400 to 1000) and expected the bias values to decrease as sample size increased which was not the case. Consequently, we are concerned that we may have misspecified our simulation model.
We have split the population as being either high=2 or low=1 motivation (r_att_1). The factor Improvement (Improv) has 6 items each of which has ordered categorical responses (6 options, with 5 thresholds).
We appreciate your comments as to whether the parameter values (esp. the residual variances) have been specified correctly.
MODEL POPULATION:
[r_att_1 @1.5]; ! mean of the IV
r_att_1 @.25 ; ! variance of the IV
Improv BY i1@1 i2-i6*.6;
Improv ON r_att_1*.8;
i1 ON r_att_1*.24; ! i6 not included
i2 ON r_att_1*.24;
i3 ON r_att_1*.24;
i4 ON r_att_1*.24;
i5 ON r_att_1*.24;
i1-i6 *.76;
Improv*.6;
The threshold values for all items are:
! thresholds
[i1$1*-.5];
[i1$2*0];
[i1$3*.5];
[i1$4*1];
[i1$5*1.5];

Bengt O. Muthen posted on Wednesday, February 29, 2012 - 8:24 am

If you use the WLSMV estimator you have to make sure that the population values of the variances of the DVs conditional on the IVs are 1 in order to get the parameters in the metric that WLSMV uses. Your DVs are the i variables and your IV is r_att.

If this doesn't help, send complete output to Support.

Gavin T L Brown posted on Wednesday, February 29, 2012 - 5:59 pm

Hi Bengt.
we checked the documentation on how to calculate the variance for the i variables and found this: residual variance = 1-R2. Since each i variable has 2 predictors (r_att_1 and Improv), we have squared the beta values and then summed them before subtracting from 1 to get the amount for the error variance. In the example provided, we have beta paths of .6 (Improv) and .24 (r_att_1), so the R2 = .36 and .0576 respectively. When summed we get R2 = .4176. Subtracting from 1 gives .5824, which would lead to i1-i6 *.58;
is that right?
thanks for clarifying

Bengt O. Muthen posted on Wednesday, February 29, 2012 - 6:20 pm

That would only be correct if Improv and r_att_1 are uncorrelated, but r_att_1 influences Improv. Instead, express the i variables in terms of r_att_1, so Improv is no longer in the picture. That is, using the fact that i is a function of Improv which is a function of r_att_1 and a residual.

Gavin T L Brown posted on Thursday, March 01, 2012 - 4:17 pm

Bengt
thanks. I agree that my calculation depends on the assumption of independence which is not the case in our model. However, your explanation still leaves me uncertain as to how to calculate an appropriate value for the i variable residual.
Should we adjust the Improv beta weight (.6) by the beta weight from r_att_1 to Improv (.8) before determining residual variance?
This is what I understand your comment means: (square(.6*.8))=.23 as the new effect of Improv and add it to .0576 the effect of r_att_1. This would give us a residual of .7124.

This seems awfully high to me as a residual but if it's right then we go with it.
Please advise if I've understood correctly.

Bengt O. Muthen posted on Thursday, March 01, 2012 - 4:58 pm

You have in general notation

(1) y = lam*f + gam*x + e

(2) f = beta*x + d

You insert (2) into (1) and get the "reduced-form" expression

y = lam*beta*x+gam*x+lam*d+e

= (lam*beta+gam)*x+lam*d+e

from which you get the variance of y easily in the usual way because all 3 terms are uncorrelated (x, d, and e are uncorrelated).

Bengt O. Muthen posted on Thursday, March 01, 2012 - 6:17 pm

I should add that the variance you want to be one is the conditional variance

V(y | x) = V(lam*d+e) = lam*lam*V(d)+V(e).

Gavin T L Brown posted on Monday, March 05, 2012 - 6:04 pm

Hi Bengt
Please check that we have understood this correctly from your instructions.
The formula for e is
e=1-(lam*beta+gamma)*x-(lam*d)
lam=regression from f to I var = .60
beta=regression from x to Improv = .80
gamma=regression from x to I var = .24
x=variance of x =.25
d=residual of f = .36
Thus, e=1-(.6*.8+.24)*.25-(.60*.36)=.604
Please advise if we got this right
Thanks

Bengt O. Muthen posted on Monday, March 05, 2012 - 8:16 pm

It is not e that you want, but its variance V(e):

V(e) = V(y|x) - lam*lam*V(d)
= 1 - lam*lam*V(d)
= 1- 0.60*0.60*0.6
= 0.784.
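
The calculation above can be checked numerically. This is a minimal sketch using the notation from the reduced-form expression in this exchange (lam, beta, gam, V(x), V(d), V(e)) and the population values posted in the question; the function names are hypothetical.

```python
def residual_variance_for_unit_conditional_variance(lam, var_d):
    """V(e) such that V(y | x) = lam^2 * V(d) + V(e) equals 1."""
    return 1.0 - lam ** 2 * var_d

def total_variance(lam, beta, gam, var_x, var_d, var_e):
    """V(y) from the reduced form y = (lam*beta + gam)*x + lam*d + e."""
    return (lam * beta + gam) ** 2 * var_x + lam ** 2 * var_d + var_e

lam, beta, gam = 0.60, 0.80, 0.24   # loading, Improv ON r_att_1, direct effect
var_x, var_d = 0.25, 0.60           # variance of r_att_1, residual variance of Improv

var_e = residual_variance_for_unit_conditional_variance(lam, var_d)
print(round(var_e, 3))                                                   # 0.784
print(round(total_variance(lam, beta, gam, var_x, var_d, var_e), 4))     # 1.1296

# i1 has its loading fixed at 1, so its residual variance differs.
print(round(residual_variance_for_unit_conditional_variance(1.0, var_d), 3))  # 0.4
```

The conditional variance is 1 by construction, while the unconditional variance is slightly larger because it also includes the part explained by r_att_1.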

Christoph Weber posted on Thursday, August 09, 2012 - 12:27 pm

Dear Dr. Muthen,
I have run a complex two-wave model (df=800, N = 595) with a second-order factor. The ratio of the sample size to estimated parameters is about 5. This is not consistent with rules of thumb as given by Kline (1998). Would it be correct to use a Monte Carlo analysis, such as example 12.7, to prove that estimates are "correct"?
thanks
Christoph Weber

Linda K. Muthen posted on Friday, August 10, 2012 - 10:23 am

Yes, you can use a Monte Carlo study to do this.

burak aydin posted on Wednesday, September 19, 2012 - 5:10 pm

Hello,
I generate the data externally. I use monte carlo facility to analyze them. I save both results and the output. I can see which iterations did not run, but I would like to automate this process. I would like to see an iteration indicator on the results file or a more efficient way to see failed iterations rather than visual check (ctrl+f "did not"). Is this possible with Mplus? or do you know any other way?
Thank you very much.

Linda K. Muthen posted on Thursday, September 20, 2012 - 9:44 am

The results file includes a replication number. If you ask for TECH9, you will see error messages related to each replication.

burak aydin posted on Thursday, September 20, 2012 - 12:59 pm

Hello Dr. Linda,
We use version 6.11, the results file does not include a rep number. These are what it saves in our case

RESULTS SAVING INFORMATION

Order of data

Parameter estimates
(saved in order shown in Technical 1 output)
Standard errors
(saved in order shown in Technical 1 output)
Chi-square : Value
Chi-square : Degrees of Freedom
Chi-square : P-Value
CFI
TLI
H0 Loglikelihood
H0 Scaling Correction Factor for MLR
H1 Loglikelihood
H1 Scaling Correction Factor for MLR
Number of Free Parameters
Akaike (AIC)
Bayesian (BIC)
Sample-Size Adjusted BIC
RMSEA : Estimate
SRMR : Between level
SRMR : Within level
Condition Number

and yes I ask for tech9 and find which iteration did not terminate normally. I do it by cntrl+f "did not". What I need to accomplish is having j rows in the result file if I run j iterations. If iteration number i did not run, I want to insert -99s in row i.
Thank you

Linda K. Muthen posted on Thursday, September 20, 2012 - 2:32 pm

Try Version 6.12 which is the latest version. I am 99% sure there is a replication number saved.
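
If a replication number is available for each saved row, identifying the failed replications and padding them with -99 can be automated outside Mplus. The sketch below is only an illustration: it assumes the results have already been read into a dictionary keyed by replication number (how the file is parsed depends on the saved layout and is not shown), and the function names are hypothetical.

```python
def find_missing_replications(completed, n_reps):
    """Replication numbers in 1..n_reps that have no saved results."""
    return sorted(set(range(1, n_reps + 1)) - set(completed))

def pad_results(rows_by_replication, n_reps, n_values, fill=-99.0):
    """Return one row per requested replication, using fill values
    for replications that did not complete."""
    padded = []
    for rep in range(1, n_reps + 1):
        row = rows_by_replication.get(rep, [fill] * n_values)
        padded.append(list(row))
    return padded

# Made-up example: 4 replications requested, replication 3 did not complete.
rows = {1: [0.51, 0.10], 2: [0.48, 0.11], 4: [0.53, 0.09]}
missing = find_missing_replications(rows.keys(), n_reps=4)
padded = pad_results(rows, n_reps=4, n_values=2)

print(missing)
print(padded[2])
```

The padded list then has one row per requested replication, which makes it straightforward to line up results from two models replication by replication.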

Margarita Olivera Aguilar posted on Tuesday, January 08, 2013 - 9:54 am

Dear Dr. Muthen,
I am using the Monte Carlo facility in MPLUS just to generate data but I am not fitting any model to the data. However, I get the following warning for all replications when I have a sample size of 100:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX.

Since I am not fitting any model to the data I am not sure how to interpret the warning. I wonder if it is a default warning when the number of parameters is larger than the sample size.

Thank you.

Linda K. Muthen posted on Tuesday, January 08, 2013 - 12:19 pm

Please send the output and your license number to support@statmodel.com.

Ray Cheung posted on Tuesday, January 08, 2013 - 11:16 pm

Hi,

I generate 20 indicators using montecarlo but I want to use only the first 18 for subsequent analysis. Is there a way for me to ask Mplus to ignore the last 2 indicators? Thanks!

Linda K. Muthen posted on Wednesday, January 09, 2013 - 6:39 am

You can use external Monte Carlo where you generate the data in the first step and analyze it in the second step. See Example 12.6.

Ray Cheung posted on Thursday, January 10, 2013 - 12:00 am

Thank you very much.
In addition, I understand that TYPE=MONTECARLO and SAVEDATA: SAVE=FScore together. I would like to ask if Mplus can save factor scores in each dataset analyzed. Thank you.

Linda K. Muthen posted on Thursday, January 10, 2013 - 6:09 am

I don't think you can do this. Try adding it to the input to be sure.

Scott R. Colwell posted on Thursday, January 10, 2013 - 6:47 am

Is it possible to obtain the average covariance matrix or correlation matrix for all replications as opposed to just the first replication?

Thanks,

Linda K. Muthen posted on Thursday, January 10, 2013 - 10:08 am

No, this is not an option at this time.

Ray Cheung posted on Thursday, January 10, 2013 - 5:43 pm

Hi Linda,

When I use TYPE=montecarlo and request tech9, there are condition codes in some of the replications. Is there a way to ask Mplus to report the average results ignoring those replications with error codes? Thank you

Linda K. Muthen posted on Thursday, January 10, 2013 - 6:14 pm

The averages are over all replications that converge. There is no way to change this.

Samuel McAbee posted on Thursday, January 17, 2013 - 11:32 am

Hello Drs. Muthen,

I am interested in saving the means and standard deviations for the raw data in each replication of a multi-group Monte Carlo analysis with 10 variables. I would like to be able to produce a matrix of this information for use in an external program. Is this currently possible at this time? Thank you.

Linda K. Muthen posted on Thursday, January 17, 2013 - 1:54 pm

This is not possible. You can save the data from each replication but not the means and standard deviations.

Geumju LEE posted on Sunday, March 03, 2013 - 9:02 pm

Hello.
I'm trying to simulate an unconstrained latent interaction model by Monte Carlo.
First I generated the 1000 data sets of y1-y3, x1-x3, z1-z6.
And the population values of main effects of X and Z are set to 0.4 both,
the population correlation between X and Z is set to 0.3, and the population interaction effect is manipulated as 0.2.
And then, I tried to analyze the 1000 data sets I generated.
This is the part of output.

MODEL RESULTS

                       ESTIMATES                 S. E.
          Population    Average    Std. Dev.    Average
X BY
  X1         1.000       1.0166      0.0685      0.0668
  X2         1.000       0.8779      0.0616      0.0612
  X3         1.000       0.8275      0.0594      0.0591

.
.
.
.
.
.
.

Y ON
  X          0.000       0.4077      0.1011      0.0951
  Z          0.000       0.4077      0.0973      0.0940
  XZ         0.000       0.2005      0.0955      0.0895

X WITH
  Z          0.000       0.2960      0.0734      0.0711

I don't understand what the column labeled "POPULATION" means.
I didn't set either 1.000 or 0.000.
How did I get these values?
Please explain the meaning of these.
Thanks in advance.

Linda K. Muthen posted on Monday, March 04, 2013 - 7:03 am

The population values are taken from the MODEL command. They are used to compute coverage.

Geumju LEE posted on Wednesday, March 06, 2013 - 6:04 am

Thank you for your reply.
However it couldn't answer my question.
So I'm writing my question again with more detailed information.

Actually, the average values are similar with the values that I set in DATA GENERATION stage.
(I conducted monte carlo simulation in separate stages of 'data generation'
and 'analysis of unconstrained approach'.)
I set from x1 to x3 as 1.02, 0.88, and 0.83 respectively when I generate data,
and I got the AVERAGE values of the ANALYSIS output.
The other values of indicators that I set in data generation are also similar with the average values.

Although I've set the POPULATION values neither 1s nor 0s in Model command,
I've got 1s and 0s in unconstrained ANALYSIS output.
I don't understand what 'POPULATION' means and how I got these.
By any chance, aren't these values just for filling the blanks?

Again, Thank you so much.

Linda K. Muthen posted on Wednesday, March 06, 2013 - 6:55 am

Please send the output and your license number to support@statmodel.com.

John Plake posted on Monday, July 29, 2013 - 9:39 am

I am running a monte carlo simulation for sample size and power (Muthen & Muthen, 2002) on a hypothesized four-factor CFA model with 12 indicators. When I run it according to the description in the literature, it works flawlessly. Parameter and SE bias along with power indicate that N > 44 should work.

However, when trying to extend that simulation to include a single second-order factor in place of the first-order correlations, the output gets wonky. Standard error bias is off the charts, even with N = 2,000. I'm sure I'm mis-specifying something, but I can't find any sample syntax for a second-order CFA in a montecarlo simulation.

The specific problem I'm having is in the theta matrix. Here is the syntax I'm using...

MODEL POPULATION:
CPERF by CP_P1-CP_P3*0.8;
PARTNER BY PA_P1-PA_P3*0.8;
TPERF BY TP_P1-TP_P3*0.8;
TWORK BY TW_P1-TW_P3*0.8;
MJP BY CPERF-TWORK*.6;
MJP@1;
CP_P1-TW_P3*.36;
CPERF-TWORK*.64;

MODEL:
CPERF by CP_P1-CP_P3*0.8;
PARTNER BY PA_P1-PA_P3*0.8;
TPERF BY TP_P1-TP_P3*0.8;
TWORK BY TW_P1-TW_P3*0.8;
MJP BY CPERF-TWORK*.6;
MJP@1;
CP_P1-TW_P3*.36;
CPERF-TWORK*.64;

Linda K. Muthen posted on Monday, July 29, 2013 - 1:04 pm

It looks like all of the first-order factors have all factor loadings free. In this case, you must fix the factor variances to one.
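As a side check on the population values posted above, here is a minimal Python sketch (not Mplus code) of the variances they imply. It assumes a plain second-order structure with no cross-loadings or correlated residuals, so that each variance is loading^2 times the variance of the factor above it, plus the residual variance. The function name is illustrative only.

```python
# Model-implied variances for the posted second-order CFA population values.
# Assumption: no cross-loadings or correlated residuals, so each variance is
# loading**2 * (variance of the factor above) + residual variance.

def implied_variance(loading, factor_variance, residual_variance):
    """Variance of a variable that loads on a single factor."""
    return loading ** 2 * factor_variance + residual_variance

# The second-order factor MJP has its variance fixed at 1 (MJP@1).
var_mjp = 1.0

# First-order factors: loading .6 on MJP, residual variance .64.
var_first_order = implied_variance(0.6, var_mjp, 0.64)

# Indicators: loading .8 on a first-order factor, residual variance .36.
var_indicator = implied_variance(0.8, var_first_order, 0.36)

print(f"first-order factor variance: {var_first_order:.3f}")
print(f"indicator variance:          {var_indicator:.3f}")
```

Under these assumptions the posted values put every factor and indicator on a unit-variance scale, which is consistent with a setup where all loadings are free and the scale is set through the (residual) variances rather than through a fixed loading.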

John Plake posted on Monday, July 29, 2013 - 1:52 pm

Thanks, Linda! Somehow I forgot that you can't estimate both loadings and variances at the same time. [facepalm]

Wang Shan posted on Monday, October 28, 2013 - 8:59 pm

Hi Linda,
I am doing a CFA Monte Carlo simulation study. And I have a question here.
As we know, in the MODEL POPULATION part, the values such as factor loadings are fixed to true values. But I'm not sure whether, in the MODEL part, I should give the true values as the starting values for analysis or just use ordinary values such as 1, -1 and so on. If we use true values as the starting values, will it influence the estimated results?
So here is my syntax, I'm not sure which one to choose.

(1)Use ordinary values as starting values
MODEL POPULATION:
......
Trait1 BY
y1@0.397
y2@-0.559;
......
y1@.58
y2@.497;
......
MODEL:
Trait1 BY
y1*1
y2*-1;
......
y1@1;
y2@1;
......
(2)Use true values
MODEL POPULATION:
......
Trait1 BY
y1@0.397
y2@-0.559;
......
y1@.58
y2@.497
......;
MODEL:
Trait1 BY
y1*0.397
y2*-0.559;
......
y1@.58
y2@.497
......;

Linda K. Muthen posted on Tuesday, October 29, 2013 - 6:43 am

In the MODEL command you should give the population values. The values given in the MODEL command are the values that are used for coverage. See Example 12.1 where the MODEL POPULATION and MODEL command are described.
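To illustrate why the values in the MODEL command matter for coverage, here is a minimal Python sketch (not Mplus code). It assumes a symmetric interval of estimate plus or minus 1.96 standard errors; the replication estimates and standard errors are made-up numbers, and only the loading 0.397 and the starting value 1 come from the post above.

```python
# Coverage: the share of replications whose 95% confidence interval
# contains the value that is treated as the population value.
# Assumption: interval = estimate +/- 1.96 * standard error.

def coverage(estimates, standard_errors, population_value, z=1.96):
    hits = 0
    for est, se in zip(estimates, standard_errors):
        lower, upper = est - z * se, est + z * se
        if lower <= population_value <= upper:
            hits += 1
    return hits / len(estimates)

# Hypothetical results from five replications for one factor loading.
estimates = [0.41, 0.38, 0.52, 0.35, 0.60]
standard_errors = [0.05, 0.05, 0.05, 0.05, 0.05]

# Coverage judged against the value used to generate the data (0.397) ...
coverage_true_value = coverage(estimates, standard_errors, 0.397)
# ... versus coverage judged against an arbitrary starting value of 1.
coverage_start_value = coverage(estimates, standard_errors, 1.0)

print(coverage_true_value, coverage_start_value)
```

The estimates themselves are the same in both calls; only the reference value changes. That is the reason option (2) in the question is the one that yields meaningful coverage.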

Wang Shan posted on Tuesday, October 29, 2013 - 8:40 pm

I have checked the user's guide. And find that I misunderstood the MODEL command before.

Thank you so much for your reply!

Andrew Grotzinger posted on Monday, January 13, 2014 - 9:06 pm

Hello,

I am running a Monte Carlo simulation for a growth model with 10 time points (y1-y10) and a single dichotomous predictor (tx). I am attempting to use the MODEL MISSING command to generate a steady increase in missing data that ends in roughly 20% missing data by the final time point. However, in the section of the output that lists the summary of missing data for the first replication, a few of the missing data patterns appear to show close to 90% or 100% missing across all time points. Is this right, or have I completely misunderstood how to go about coding for missingness? Thanks for your help!

MODEL MISSING:
[y1-y10@-15];
y1 on tx@0;
y2 on tx*10.405;
y3 on tx*11.108;
y4 on tx*11.523;
y5 on tx*11.821;
y6 on tx*12.248;
y7 on tx*12.5576;
y8 on tx*13.0075;
y9 on tx*13.3417;
y10 on tx*13.613;

Bengt O. Muthen posted on Wednesday, January 15, 2014 - 10:58 am

Perhaps your tx slopes are too high. And are you sure you used CUTPOINTS on tx? Try it out for one replication with a huge sample like 10,000.
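A rough way to see what the posted MODEL MISSING values imply is to treat each line as a logistic regression for the probability of a missing value, which is how the User's Guide describes MODEL MISSING. The Python sketch below (not Mplus code) uses the posted intercept of -15 and the y10 slope of 13.613; the function name and the tx values tried are illustrative assumptions.

```python
import math

def missing_probability(intercept, slope, tx):
    """Probability of a missing value under a logistic model:
    logit(P(missing)) = intercept + slope * tx."""
    logit = intercept + slope * tx
    return 1.0 / (1.0 + math.exp(-logit))

intercept = -15.0      # from [y1-y10@-15]
slope_y10 = 13.613     # from y10 on tx*13.613

# If tx is truly dichotomous (0/1), only the tx = 1 group has missing data.
p_tx0 = missing_probability(intercept, slope_y10, 0)
p_tx1 = missing_probability(intercept, slope_y10, 1)

# If tx is left continuous, values above 1 occur and push the
# probability of missingness towards 1.
p_tx2 = missing_probability(intercept, slope_y10, 2)

print(f"tx = 0: {p_tx0:.6f}")
print(f"tx = 1: {p_tx1:.3f}")
print(f"tx = 2: {p_tx2:.6f}")
```

With a 0/1 covariate the slope gives roughly 20% missing at y10 in the tx = 1 group and essentially none in the tx = 0 group. With a continuous tx the same slope produces cases that are almost certainly missing, which may be what the near-100% patterns reflect.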

Tracy Zhao posted on Thursday, January 23, 2014 - 3:04 pm

Hi, I am trying to learn how to do Monte Carlo studies using Mplus. I read the user guide, version 7, and have a question. Chapter 12, page 418, you have the example:

MODEL POPULATION: [x1-x2@0];
x1-x2@1;
f BY y1@1 y2-y4*1;
f*.5;
y1-y4*.5;
f ON x1*1 x2*.3;
MODEL: f BY y1@1 y2-y4*1;
f*.5;
y1-y4*.5;
f ON x1*1 x2*.3;

I wanna know why "f BY y1@1 y2-y4*1;" in both MODEL POPULATION and MODEL command? What would be the difference if I write it as "f BY y1-y4*1;" in both commands?

Thanks!

Linda K. Muthen posted on Thursday, January 23, 2014 - 3:53 pm

Either one factor loading or the factor variance must be fixed at one to set the metric of the factor.

Tracy Zhao posted on Thursday, January 23, 2014 - 6:23 pm

Oh, so two questions follow:

1. If I write it as "f BY y1*1;", it is possible that 1 is just a starting value that can change to any other number in the modeling process, right? But if I write it as "@1" then it will be fixed to 1 instead of other value? Is my understanding correct?

2. Why do I need starting value in MODEL POPULATION? Am I not just specifying the model parameters? Under what circumstance would I want the model parameters I specified to change to other numbers? Perhaps you can recommend some readings if this is too hard to explain here.

Thanks again!

Linda K. Muthen posted on Friday, January 24, 2014 - 9:35 am

In MODEL POPULATION there is no difference between @ and *. In MODEL there is. * designates a starting value for a free parameter. @ fixes a parameter to the value that follows.

MODEL POPULATION gives the population parameter values for data generation. See Example 12.1. All of the commands and options are explained. See also the MONTECARLO command in the user's guide.

Tracy Zhao posted on Friday, January 24, 2014 - 10:22 am

I see. Thanks a lot Dr. Muthen!

Tracy Zhao posted on Tuesday, January 28, 2014 - 9:11 am

Hi, I don't know where this question should go, so I am just going to ask it here: if I am batch running Mplus for a large simulation study, can I still run Mplus for a different project (using the editor and run from the editor)? Thanks!

Linda K. Muthen posted on Tuesday, January 28, 2014 - 11:52 am

I would not recommend this.

Jan posted on Sunday, June 29, 2014 - 12:26 pm

In a simple twolevel model with WLSMV estimation I obtain the message:

*** WARNING in SAVEDATA command
Saving of ending values for the ESTIMATES option is not available for
models with covariates. Request for ESTIMATES is ignored.

How to solve this problem?

Jan posted on Sunday, June 29, 2014 - 12:28 pm

I would like to save these estimates for a monte carlo study. When I enter values 'manually' to the monte carlo input, I obtain the message:

*** FATAL ERROR
A POPULATION VARIANCE FOR A COVARIATE IS ZERO.

However, the twolevel regression model works fine. I would appreciate any suggestion.

Linda K. Muthen posted on Monday, June 30, 2014 - 6:36 am

Use the SVALUES option of the OUTPUT command. You will receive input with the final estimates as starting values.

A model is estimated conditioned on the covariates. Their means, variances, and covariances are not model parameters. However, you need to specify them in MODEL POPULATION to generate data. Do a TYPE=TWOLEVEL BASIC with no MODEL command to find these values.

Jan posted on Monday, June 30, 2014 - 6:45 am

Thank you very much Linda!

Jan posted on Monday, June 30, 2014 - 7:21 am

Linda, I entered variances/means to the Model Population accordingly with your suggestion, but how to enter the covariances if 2 variables belong to the within level and one to the between level?

I get this error message:

*** ERROR in MODEL POPULATION command
Between-level variables cannot be used on the within level.
Between-level variable used: g
*** ERROR in MODEL POPULATION command
Between-level variables cannot be used on the within level.
Between-level variable used: g
*** ERROR
The following MODEL POPULATION statements are ignored:
* Statements in the WITHIN level:
STIMTYPL WITH g
ORD WITH g

Linda K. Muthen posted on Monday, June 30, 2014 - 9:40 am

Put the covariance between the 2 within variables in the within part of the model. The between part of the model will have only a mean and variance for the between-level covariate.

Jan posted on Monday, June 30, 2014 - 9:51 am

Perfect, thanks very much.

mpduser1 posted on Wednesday, August 13, 2014 - 3:16 pm

I am trying to do a power analysis for a negative binomial regression model in Mplus. I am using the following syntax:

MONTECARLO:
NAMES ARE y x;
NOBSERVATIONS = 275;
NREPS = 10000;
CUTPOINTS = x(0);
COUNT = y (nb);

MODEL POPULATION:
[y@1];
y@2;

[x@0];
x@1;

y on x@.35;

MODEL:
y on x*.35;

The error message I get is:

*** ERROR in MONTECARLO command
A COUNT(nb) variable in the analysis must be generated as a negative binomial
variable. Variable cannot be analyzed as COUNT(nb): Y

Specifying the mean and dispersion of y on the POPULATION command does not work either.

Is this type of power analysis not possible in Mplus?

Bengt O. Muthen posted on Wednesday, August 13, 2014 - 6:42 pm

Your COUNT statement specifies how to analyze Y. You need to specify how to generate Y, adding

GENERATE = u1(nb);

This is like in the second part of UG ex 3.8. Note that all the UG examples have Monte Carlo versions which are posted on our website under Mplus User's Guide Examples.

mpduser1 posted on Wednesday, August 13, 2014 - 7:03 pm

That's very helpful. Thank you.

mpduser1 posted on Thursday, August 14, 2014 - 9:11 am

Professor Muthen,

Via the error message I noted above, am I correct in understanding that I cannot introduce a model misspecification where I specify a negative binomial for a POPULATION model, but a normally-distributed Y for the analytic model?

Bengt O. Muthen posted on Thursday, August 14, 2014 - 1:38 pm

You can do that using Mplus Monte Carlo in two steps: First generate the data by one model; then analyze the data by another (called "external Monte Carlo"). See the User's Guide chapter 12 for examples of 2-step Monte Carlo approaches.

Yueqi Yan posted on Monday, August 25, 2014 - 5:35 pm

Hi Dr. Muthen,

I am doing a Monte Carlo simulation on the efficiency of planned missing data designs. In the attached syntax, there are 84 missing data patterns, and I got an error message as follows.

*** ERROR in MONTECARLO command
The number of sets of PATMISS variables does not match the number of
patterns in PATPROBS.

I tried to reduce the number of patterns, and found that when there are 64 or fewer patterns there seems to be no problem, but when there are more than 64, the error message comes back.

I was wondering if Mplus has a limit on the number of missing data patterns. Or is there something wrong with my code? I ask because in my study I also have a design that needs 2*64 patterns.

Linda K. Muthen posted on Tuesday, August 26, 2014 - 12:11 pm

Are you using Version 7.2? I think that limit has been increased.

thanoon younis posted on Thursday, September 11, 2014 - 1:07 am

Hi Dr. Muthen,
i want your help to simulate data with multiple group nonlinear SEMs with the following model.
NAMES = Y1-Y10;
ANALYSIS: TYPE = RANDOM;
ALGORITHM = INTEGRATION;
ANALYSIS: ESTIMATOR = ML;
MODEL:
x1 BY Y1 Y2 Y3 Y4;
x2 BY Y5 Y6;
x3 BY Y7 Y8;
x4 BY Y9 Y10;
x4 on x1 x2 x3;
x1 with x2;
x2 with x3;
X x1x2 | x1 XWITH x2;
X x1x1 | x1 XWITH x1;
X x2x2 | x2 XWITH x2;
x4 ON Xx1x2;
x4 ON Xx1x1;
x4 ON Xx2x2;

where all values of lambda are 0.8
all values of gama are 0.6
all values of mu.y1-y10 are 0.

i hope to help me to do that because i am new on mplus.
i appreciate your help and your time.

Linda K. Muthen posted on Thursday, September 11, 2014 - 6:07 am

All examples come with a Monte Carlo counterpart. See mcex5.13.inp as a starting point.

thanoon younis posted on Thursday, September 11, 2014 - 8:44 pm

Hi Dr. Muthen
I am trying to get the code below to work:
montecarlo:
names = y1-y10;
generate y1-y10(1);
categorical = y1-y10;
ngroups = 2;
nobs = 200 200;
nreps = 100;
SEED = 53487;
save = mplus.dat;
analysis:
type = random;
model population: g1

x1 by y1@1 y2-y4*0.8;
x2 by y5@1 y6*0.8;
x3 by y7@1 y8*0.8;
x4 by y9@1 y10*0.8;

x4 ON x1*0.6;
x4 ON x2*0.6;
x4 ON x3*0.6;

y1-y10*0.0;

X x1x2 | x1 XWITH x2;
X x1x1 | x1 XWITH x1;
X x2x2 | x2 XWITH x2;
x4 ON Xx1x2*0.6;
x4 ON Xx1x1*0.6;
x4 ON Xx2x2*0.6;

model population-g2:

model: g2
x1 by y1@1 y2-y4*0.8;
x2 by y5@1 y6*0.8;
x3 by y7@1 y8*0.8;
x4 by y9@1 y10*0.8;

x4 ON x1*0.6;
x4 ON x2*0.6;
x4 ON x3*0.6;
y1-y10*0.0;
X x1x2 | x1 XWITH x2;
X x1x1 | x1 XWITH x1;
X x2x2 | x2 XWITH x2;
x4 ON Xx1x2*0.6;
x4 ON Xx1x1*0.6;
x4 ON Xx2x2*0.6;
output:
tech9;

many thanks in advance

Bengt O. Muthen posted on Friday, September 12, 2014 - 6:03 pm

You didn't say what the problem was.

thanoon younis posted on Friday, September 12, 2014 - 6:51 pm

Hi Dr. Muthen
I changed the previous Mplus code to this code to run a Monte Carlo simulation. Is that correct? When I ran this code I got this error:

*** ERROR in ANALYSIS command
ALGORITHM=INTEGRATION is not available for multiple group analysis.
Try using the KNOWNCLASS option for TYPE=MIXTURE.

i appreciate your help and your time.many thanks

Bengt O. Muthen posted on Sunday, September 14, 2014 - 4:59 pm

Your model has categorical outcomes and continuous factors and XWITH. This means that ML needs to be used with numerical integration. Multiple-group analysis in this case is done with Type=Mixture and Knownclass - see examples in the User's Guide for how to do this.

thanoon younis posted on Sunday, September 14, 2014 - 6:04 pm

thank you so much for your help
Actually, I just want to get simulated data without applying any estimator.
After adding TYPE = MIXTURE and KNOWNCLASS I still get an error:
*** ERROR in MONTECARLO command
Unknown option:
KNOWNCLASS
****************************************

title: this is an example of a multiple group nonlinear SEM
with categorical variables

montecarlo:
names = y1-y10 group;
generate y1-y10(1);
categorical = y1-y10;
ngroups = 2;
nobs = 200 200;
nreps = 100;
SEED = 53487;
save = mplus.dat;
CLASSES = c(2);
KNOWNCLASS = c(group = 1-2);

many thanks again

Bengt O. Muthen posted on Monday, September 15, 2014 - 4:54 pm

The Mplus Version 7.1 Language Addendum says:

The NGROUPS option of the MONTECARLO command has been
extended for use with TYPE=MIXTURE. It is used to specify the
number of classes to be used for data generation and in the
analysis. The program automatically assigns the label %g#1% to
the first class, %g#2% to the second class, etc. These labels are
used in the MODEL POPULATION and MODEL commands.

So you can say for example:

MONTECARLO:
NAMES = y1-y10;
ngroups = 40;
NOBSERVATIONS = 40(500);
NREPS = 1;

ANALYSIS:
TYPE =MIXTURE;
ESTIMATOR = ml;

MODEL POPULATION:
%OVERALL%
f1 BY y1-y10*1;
[y1-y10*0];
[f1*0];
f1*1;
y1-y10*.5;

%g#1%
f1 BY y1*.7 y2-y10*1 (lam1_1-lam1_10);
[y1*.5 y2-y10*0] (nu1_1-nu1_10);
[f1*0];
f1*1;
etc

thanoon younis posted on Monday, September 15, 2014 - 5:46 pm

Thank you for your help, but my program still does not work. Can you help me correct it?

Linda K. Muthen posted on Monday, September 15, 2014 - 5:58 pm

Please send your output and license number to support@statmodel.com.

Mika S. posted on Tuesday, November 25, 2014 - 6:23 am

Hi! I was asked by a reviewer to conduct a post-hoc power analysis for bivariate cross-lagged models because some of the cross-lagged coefficients were of rather small magnitude but significant (e.g., STDXY cross-lagged betas were around .08 and p was below .05). My sample size was, in my opinion, quite in a normal range for such studies (N = 700). And my first question is whether a post-hoc power analysis really makes sense in this case. My initial thought was that adding 95% CIs to the betas would be a more suitable way "to better understand the small but significant effects" (core request of the reviewer)!?

Anyway, a first look at post-hoc power analysis (EX 12.7) revealed some problems for me and my specific data set. My initial bivariate cross-lagged model was based on complex/cluster analysis (data are clustered in schools). I do not want to conduct multilevel analysis because school effects are not the main aim of my study. My second question thus is: Is there any way to conduct the 2-step EX 12.7 with the original complex/cluster analysis (since monte carlo, at a first try, does not want "complex/cluster" in step 2), and how should I do this!?

Third question: Is there any chance to use a subset of variables in EX 12.7 for analyses or do I have to "cut" the datasets for these analyses in SPSS? "Usev" does not seem to work. Many thanks!

Bengt O. Muthen posted on Thursday, November 27, 2014 - 3:25 pm

You would have to generate the data using a Type=two-level model in step 1 to get the complex survey features, then analyze in step 2 using Type=Complex. So that doesn't sound desirable since step 1 would then go beyond your initial model (which is not twolevel). If the Type=Complex SEs in your real-data run aren't that different from a run where you ignore Complex you are in a better position.

I don't know that the simulation adds much. I guess it can tell you: If I generate data with my sample size assuming my analysis model and parameter values are correct, do I get these low SEs/these p-values? If you don't, that's telling you that your model has some misspecification that causes small SEs.

Daniel E Bontempo posted on Tuesday, January 06, 2015 - 11:19 am

Can I do an internal montecarlo for a simple path model where the population model specifies an exogenous variable with a non-normal distribution?

Specifically, I am thinking about a power study that included age as an exogenous variable and there is a fairly uniform distribution of 5 (60 mo.) to 7 (84 mo) year old kids. Or, where the exogenous variable is a proportion and the distribution is slightly U-shaped?

I can come up with a variance, but I do not want a normal distribution in either case.

Bengt O. Muthen posted on Tuesday, January 06, 2015 - 2:19 pm

I don't know off-hand how you would generate a uniform distribution. A U-shape can be obtained using a mixture of two normals with means sufficiently apart. Non-normals can also be generated using the new skew-t techniques discussed in the paper on our website:
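The mixture idea can be sketched outside Mplus. The Python snippet below is only an illustration of drawing from an equal mixture of two normals with well-separated means; the means (0.2 and 0.8), the standard deviation (0.1), the seed, and the function names are all assumptions chosen for a proportion-like covariate, and the draws are not clipped to the 0 to 1 range.

```python
import random

def draw_u_shaped(n, mean_low=0.2, mean_high=0.8, sd=0.1, seed=12345):
    """Draw n values from a 50/50 mixture of two normal distributions."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        # Pick one of the two components with equal probability.
        mean = mean_low if rng.random() < 0.5 else mean_high
        values.append(rng.gauss(mean, sd))
    return values

def share_within(values, center, half_width):
    """Share of values that fall within center +/- half_width."""
    inside = sum(1 for v in values if abs(v - center) <= half_width)
    return inside / len(values)

sample = draw_u_shaped(10000)
share_low = share_within(sample, 0.2, 0.05)
share_mid = share_within(sample, 0.5, 0.05)
share_high = share_within(sample, 0.8, 0.05)

print(f"near 0.2: {share_low:.3f}")
print(f"near 0.5: {share_mid:.3f}")
print(f"near 0.8: {share_high:.3f}")
```

Most of the mass sits near the two component means and very little in the middle, which gives the U-like shape. Moving the means closer together or increasing the standard deviation flattens it.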

Asparouhov, T. & Muthén, B. (2014). Structural equation models and mixture models with continuous non-normal skewed distributions. Web Note 19. Version 2. Forthcoming in Structural Equation Modeling.

But, I wouldn't think your power results really are very strongly dependent on such deviations from a normal covariate.

Daniel E Bontempo posted on Tuesday, January 06, 2015 - 2:30 pm

Thanks. I'll look at #19, and also think about just planning power with normal covariate.

Daniel E Bontempo posted on Wednesday, January 21, 2015 - 10:40 am

I am going to assume normally distributed co-variates as you suggested earlier. It keeps things much simpler.

I am now wondering about the power for testing a hypothesis of no effect.

I am using a path model with 3 tests regressed on a single covariate. The tests are correlated because they all measure the same capability. The covariate captures the predominance of language#1 or language#2 in bilingual households.

Two of the tests are expected to be slightly biased by the match between the test's language and the dominant language in the household. I have chosen effect sizes of interest and used them in my data generation model, and can easily determine power from the rightmost column in the monte carlo output.

However I used a near zero effect for the third test because it is supposed to be immune to household language dominance.

So I really want to know the power that this coefficient is zero.

In an ordinary path model I could use MODEL TEST, but I am not sure how to do this in the monte carlo model. I considered using MODEL CONSTRAINT to make two new parameters that expressed the difference of the 3rd test with each of the others - but this is almost the same as the power testing I was already doing.

If I pick a deviation from zero that is of clinical interest, how can I estimate the power to detect that the regression of test#3 on the covariate is less than that deviation? Is there any relevant example?

Bengt O. Muthen posted on Wednesday, January 21, 2015 - 11:01 am

If the true coefficient is zero, then the power column gives you the Type I error rate.
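The point about the power column can be illustrated with an idealized Python sketch (not Mplus code). It assumes each replication's estimate is normally distributed around the true value with a known standard error and that significance is judged by |estimate / SE| > 1.96; the effect sizes, standard error, seed, and function name are made up for the example.

```python
import random

def rejection_rate(true_value, se, n_reps, seed=2015, z=1.96):
    """Share of replications in which the coefficient is significant
    at the .05 level (two-sided z test)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        # Idealized replication: estimate ~ Normal(true_value, se).
        estimate = rng.gauss(true_value, se)
        if abs(estimate / se) > z:
            rejections += 1
    return rejections / n_reps

# True coefficient of zero: the rejection rate is the Type I error rate.
type1_rate = rejection_rate(0.0, 0.1, 20000)
# Nonzero true coefficient: the same quantity is the power.
power = rejection_rate(0.3, 0.1, 20000)

print(f"true value 0.0: {type1_rate:.3f}")
print(f"true value 0.3: {power:.3f}")
```

In this idealized setting the rejection rate for a true value of zero lands close to the nominal .05, so the same column reads as power when the population value is nonzero and as Type I error when it is zero.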

Model Test is not available with Monte Carlo.

You may want to ask this general Monte Carlo question on SEMNET.

Naomi Friedman posted on Thursday, March 12, 2015 - 6:00 pm

I have run a power analysis for a single parameter with the Monte Carlo simulation and also by estimating the noncentrality parameter as described at http://statmodel.com/power. I'm getting a higher power estimate (.98 vs. .85) in the Monte Carlo analysis and I'm just wondering if this is to be expected, or if I have done something wrong?

Naomi Friedman posted on Thursday, March 12, 2015 - 6:20 pm

Please disregard my previous post; I answered my own question (I did something wrong)!

Sierra Bainter posted on Friday, April 17, 2015 - 6:10 pm

Is there a way to save the true factor score values for generated cases using Mplus' monte carlo utility?

Linda K. Muthen posted on Saturday, April 18, 2015 - 11:07 am

Factor scores cannot be saved with Monte Carlo.

Bilgin Navruz posted on Wednesday, May 13, 2015 - 1:26 pm

Dear Dr. Muthen,

In a two level Monte Carlo study with categorical outcomes (WLSM estimator), how can I get WRMR fit indices in the output? When I generate and analyze the data in the same input file, the output only provides Chi-Square and RMSEA. However, if the data are first generated and then analyzed using the generated data sets, the output gives Chi-Square, RMSEA, CFI, TLI, SRMR-W and SRMR-B, but does not give WRMR.

My first question: How can I get WRMR in two level SEM using WLSM?

Second: Is there any way to see all fit indices in the output file when using just one input file to simulate and analyze all replications?

Thank you.

Bengt O. Muthen posted on Thursday, May 14, 2015 - 11:00 am

1. WRMR has not been sufficiently studied for two-level.

2. These are not available due to an output glitch.

QianLi Xue posted on Thursday, May 21, 2015 - 9:46 am

On page 803 of User's Guide, there is this sentence about the SAVE command within the Monte Carlo Statement: "The variables are not always saved in the order that they appear in the NAMES statement." If there are no clear rules on how the variables simulated in the Monte Carlo are stored, how to tell which is which in the output dataset?

Linda K. Muthen posted on Thursday, May 21, 2015 - 11:15 am

This information is given at the end of the output where you generate and save the data sets.

Rakotoasimbola posted on Monday, August 24, 2015 - 6:24 am

Hi Dr Muthén,

I would like to know if there is some package on Mplus on how to generate non-normal variables from an implied variance covariance matrix of a SEM model.

Thank you in advance!
Regards

Hasina posted on Monday, August 24, 2015 - 7:01 am

Following the previous message on Monte Carlo Simulation "how to generate non-normal variables from an implied variance covariance matrix of a SEM model. ", I would like to know if there is a package on Mplus for categorical variables.

Many thanks!!

Bengt O. Muthen posted on Tuesday, August 25, 2015 - 6:32 am

Mplus generates non-normal data using either mixtures of normals or using Distribution = t, skewnormal, skewt. There is not a way to generate according to a covariance matrix. Instead the model's relationships between the variables generates the data.

Mplus also generates data on categorical variables.

Andrea Norcini Pala posted on Tuesday, December 15, 2015 - 6:29 pm

Hi,

a colleague of mine has analyzed a longitudinal (2 time point) dataset (Long Format) with missing data using generalized linear model (ML estimator). A reviewer is asking to compute the observed power associated with the interaction between Time and Conditions (3 groups). I thought of running a monte carlo simulation. Would you suggest an example with long format dataset and repeated measures?
Thank you

Linda K. Muthen posted on Wednesday, December 16, 2015 - 7:35 am

See the Monte Carlo counterpart to Example 9.16. You can find this on the website. It is also downloaded when Mplus is installed.

Tor Neilands posted on Thursday, February 25, 2016 - 8:09 am

I would like to simulate data to perform a power analysis for a multiple linear regression model with 7 x variables included as main effects. I will also need to simulate the interactions of the first 3 x variables with the 7th to obtain the power for investigating moderation (10 predictors total in the model).

In 2007, someone asked above about a similar situation and Linda noted the data would need to be generated outside of Mplus. However, I'm wondering if one could now use some of the newer features as demonstrated in, e.g., Ex 3.18, where data with interaction are generated via TYPE = RANDOM? If so, can anyone point me to a worked example and also an explanation of how using TYPE = RANDOM to generate the data yields an interaction variable? I am trying to understand not only how to do the simulation, but also how it works.

Thanks.

- Tor Neilands

Bengt O. Muthen posted on Thursday, February 25, 2016 - 1:49 pm

Such simulations can use a random slope approach. I will send you a pdf that describes this.

Nicole Brocato posted on Saturday, June 04, 2016 - 8:19 pm

Hi:

I have a question about using Monte Carlo studies to determine sample sizes for ESEMs: Do the criteria recommended in the Muthén & Muthén 2002 article (pp 605-606, e.g., biases < 10%) apply to the EFA portion of ESEMs?

I ask because I have run a set of Monte Carlo studies to determine the sample size for an ESEM that has an EFA component in which 60 items are allowed to freely load onto 6 factors.

For the EFA portion of the model, I have not been able to find sample sizes that produce biases within the recommended ranges.

I have had success with the CFA and structural portions of the model.

Many thanks.

Bengt O. Muthen posted on Sunday, June 05, 2016 - 12:03 pm

Yes I think so.

EFA is not always easy to handle in Monte Carlo simulations. Check out our FAQ:

Lambda is not compatible with the notion of simplicity of the rotation criterion

Nicole Brocato posted on Sunday, June 05, 2016 - 7:40 pm

Ah! That explains why some pattern coefficient biases were getting worse with increasing observation numbers.

Many thanks.

sfhellman posted on Friday, August 26, 2016 - 1:29 pm

I am trying to simulate a cross-lagged model in order to determine sample size needed and power where I expect group to moderate the cross-lagged relations. How can I get a chi-square difference test between a model where the paths are constrained to be equal across groups versus a model in which the bidirectional paths are allowed to vary across groups? Sample syntax:
MODEL MONTECARLO:
[CINT1@7 CINT2@8 CINT3@9];
[PDEP1@10 PDEP2@12 PDEP3@14];
CINT1@1 CINT2@.4 CINT3@.4;
PDEP1@1 PDEP2@.4 PDEP3@.4;
CINT3 on CINT2*.5 PDEP2*.2;
PDEP3 on PDEP2*.5 CINT2*.2;
CINT2 on CINT1*.5 PDEP1*.2;
PDEP2 on PDEP1*.5 CINT1*.2;

CINT3 WITH PDEP3*.3;
CINT2 WITH PDEP2*.3;
CINT1 WITH PDEP1*.3;

MODEL: [same as above]

MODEL G2:
CINT3 on CINT2*.5 PDEP2*.01;
PDEP3 on PDEP2*.5 CINT2*.01;
CINT2 on CINT1*.5 PDEP1*.01;
PDEP2 on PDEP1*.5 CINT1*.01;
CINT3 WITH PDEP3*.01;
CINT2 WITH PDEP2*.01;
CINT1 WITH PDEP1*.01;

OUTPUT: TECH9;

Bengt O. Muthen posted on Friday, August 26, 2016 - 5:18 pm

You can get the power for each constraint by using Model Constraint to express each group difference using parameter labels given in the Model command.

Stig Hebbelstrup Rye Rasmussen posted on Thursday, September 22, 2016 - 11:39 pm

Is it possible to generate data using the MODEL POPULATION command and then estimate two models instead of just one? I want to use the first model as a reference model and then test the second (nested) model's fit compared to the first model using a chi-square test.

I found out that you can do this by generating and saving multiple datasets and then estimating the models sequentially and then testing the decrease in fit but it would be nice if there was some way this could be done within the monte carlo functionality as this is much faster

Linda K. Muthen posted on Friday, September 23, 2016 - 6:37 am

This cannot be done in one step.

Stig Hebbelstrup Rye Rasmussen posted on Monday, September 26, 2016 - 1:50 am

Hi Linda

Thank you for your quick response.

Eunsoo Lee posted on Sunday, November 13, 2016 - 9:26 pm

Hi Linda,

I'm trying to simulate 3-level growth model using "TYPE = threelevel random".

This is the equation:

Level 1 : Ytij = π0ij + π1ij*TIME + etij
Level 2: π0ij = b00j + b01j*Xij + r0ij
π1ij= b10j + b11j*Xij + r1ij
Level 3: b00j=r000+r001*Zj+u00j
b10j=r100+r101*Zj+u10j

And I want to generate data with "long format".

school student time x z
1 1 0 2.4 3.6
1 1 1 2.4 3.6
1 1 2 2.4 3.6
1 2 0 2.1 3.6
1 2 1 2.1 3.6
1 2 2 2.1 3.6
...
2 1 0 2.8 4.0
2 1 1 2.8 4.0

However, I cannot find Monte Carlo counterpart to Example 9.16.

Could you recommend any paper which explains generating long-format time data?

Thank you!

- Eunsoo

Bengt O. Muthen posted on Monday, November 14, 2016 - 5:06 pm

Just use the s | approach for the slope on time. And let s vary on both the second and third level.

Eunsoo Lee posted on Tuesday, November 15, 2016 - 9:52 am

Thank you for your response!

Xuan Chen posted on Tuesday, April 04, 2017 - 8:10 am

Hi Dr. Muthen,

As in the examples, it says The summary of the analysis results includes the population value for each parameter, the average of the parameter estimates across
replications. The column labeled Average gives the average of the
parameter estimates across the replications of the Monte Carlo
simulation study.

My question: is it possible to get the "median" of the parameter estimates across the replications of the Monte Carlo
simulation study.

Many thanks

Bengt O. Muthen posted on Wednesday, April 05, 2017 - 3:43 pm

No.

Hakan Atılgan posted on Friday, April 21, 2017 - 11:33 pm

Dear Muthen,
I tried to generate data that are very important for me, but I failed. The data will be used for a multidimensional graded response model. Can you please help and share the syntax with me? The data's properties:
N= 500
polytomous 5 likert type (1 2 3 4 5)
15 items
3 dimension (5 items per dimension)
correlation between dimensions must be r=0.30
ajhs [0,.8, 2]
bjks [-2,2]
Im waiting for your answer
thank you for your help
Dr. Atilgan

Bengt O. Muthen posted on Saturday, April 22, 2017 - 5:28 pm

Check the UG example that comes closest to this and then look up the corresponding Monte Carlo version which is posted on our website for all the UG examples.

Justin posted on Wednesday, May 31, 2017 - 1:52 pm

Hi,

I'm running an external Monte Carlo analysis (I generated the data elsewhere) with the Bayes estimator, and I keep receiving segmentation faults. I can run a batch file of individual inputs with nearly identical syntax, however.

Is ESTIMATOR = BAYES available for TYPE=MONTE CARLO? Are additional commands required to make the input work? Below is my input:

DATA:
FILE = ida_bifactor_full_dat_list.txt;
TYPE = MONTECARLO;
VARIABLE:
NAMES are y1-y8 snp sex trueAgg mean;
USEVARIABLES = y1-y8 snp sex;
MISSING ARE all (-999);
ANALYSIS: ESTIMATOR=BAYES;
COVERAGE=0;

Thank you,

Justin

Bengt O. Muthen posted on Wednesday, May 31, 2017 - 5:33 pm

Please send your output to Support along with your license number so we can see what's going on.

Christoph Weber posted on Thursday, August 10, 2017 - 2:24 pm

Dear Mplus-Team!

In the case of 2 IVs the residual variance of the DV is:

Var(F3) - b1²*Var(F1) + b2²*Var(F2) + 2b1b2*Cov(F1F2) = Var(d)

How can I determine the residual variance, when more than 2 IVs are used?

is it:

Var(F4) - b1²*Var(F1) + b2²*Var(F2) + b3²*Var(F3) + 2b1b2*Cov(F1F2) + 2b1b3*Cov(F1F3) + 2b2b3*Cov(F2F3) = Var(d)

Thanks
Christoph

Bengt O. Muthen posted on Thursday, August 10, 2017 - 4:31 pm

Right.

Christoph Weber posted on Thursday, August 10, 2017 - 11:30 pm

Thanks a lot, but I forgot the brackets, didn't I?
Var(F3) - (b1²*Var(F1) + b2²*Var(F2) + 2b1b2*Cov(F1F2)) = Var(d)
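The bracketed form generalizes to any number of predictors: the explained variance is the double sum of b_i * b_j * Cov(F_i, F_j) over all pairs, which reproduces the squared terms on the diagonal and the 2*b_i*b_j*Cov terms off it. A minimal Python sketch follows, with made-up slopes and covariances for three predictors; the function name is illustrative.

```python
def residual_variance(total_variance, slopes, predictor_cov):
    """Var(d) = Var(DV) - sum_i sum_j b_i * b_j * Cov(F_i, F_j).

    predictor_cov is the full covariance matrix of the predictors,
    with variances on the diagonal.
    """
    explained = 0.0
    k = len(slopes)
    for i in range(k):
        for j in range(k):
            explained += slopes[i] * slopes[j] * predictor_cov[i][j]
    return total_variance - explained

# Hypothetical values: three predictors with unit variances.
slopes = [0.3, 0.2, 0.4]
predictor_cov = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]

res_var = residual_variance(1.0, slopes, predictor_cov)
print(f"residual variance: {res_var:.3f}")
```

A negative result would mean the chosen slopes and covariances explain more than the intended total variance of the dependent variable, so the population values would need to be adjusted before being used for data generation.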

Thomas Rodebaugh posted on Thursday, September 21, 2017 - 8:40 am

I just discovered last night how comparatively simple it is to run a Monte Carlo simulation in Mplus to estimate power! Now I'm wondering if it's possible to run a simulation to determine power for DSEM. I haven't seen it mentioned in documentation, so I'm guessing no, but thought I would ask.

Bengt O. Muthen posted on Thursday, September 21, 2017 - 3:51 pm

Yes, this is possible - see the Schultzberg-Muthen paper at

http://www.statmodel.com/TimeSeries.shtml

Bengt O. Muthen posted on Thursday, September 21, 2017 - 3:52 pm

Note that all User's Guide examples also exist in a Monte Carlo version at

http://www.statmodel.com/ugexcerpts.shtml

Thomas Rodebaugh posted on Friday, September 22, 2017 - 2:47 pm

Very helpful, thanks!

Rochelle Nafatali posted on Tuesday, November 07, 2017 - 6:10 pm

Hi there, I noticed someone had posted earlier that the ANALYSIS option PROCESSORS = 4 (STARTS); was not working on a 64-bit machine. I have the same problem. Has anyone found the cause and a solution yet? Thank you

Bengt O. Muthen posted on Thursday, November 09, 2017 - 4:58 pm

This should work. Send your output to Support along with your license number and explain why you think it doesn't work for you.

Doug Hemken posted on Thursday, December 21, 2017 - 9:33 am

I am generating and analyzing Monte Carlo data. I use the same model population and seed, but two different analysis models.

I save the data generated.

The data saved is different - shouldn't it be the same?

Bengt O. Muthen posted on Thursday, December 21, 2017 - 2:57 pm

If set up correctly, they should be the same - send to Support along with your license number.

Doug Hemken posted on Friday, December 22, 2017 - 1:53 pm

I sent this in, thanks.

Fred posted on Monday, January 08, 2018 - 12:31 am

I am trying to conduct a MC study with different models. For example one of the models is a regression model. Now I was wondering if my input command is correct. I am having problems with specifying the variance of my independent variables.

MONTECARLO:
Names are x1 y1 z3 z4;
NOBSERVATIONS = 750;
NREPS = 5;
SEED = 3611;
GENERATE y1 (4) x1 (4) z3 (1) z4 (3);
CATEGORICAL z4 z3 y1 x1;
MISSING = y1 z4 z3;
REPSAVE=ALL;
SAVE = M2_S1_0*.dat;

MODEL POPULATION:
y1 ON x1@.1 z3@.3 z4@.5;
x1@1;
z3@1;
z4@1;
y1@.538;
z3 with x1@.2 z4@.2;
z4 with x1@.4;
[x1$1*-1.5 x1$2*-0.5 x1$3*0.5 x1$4*1.5
z3$1*0
z4$1*-1.25 z4$2*0.0 z4$3*1.25
y1$1*-1.5 y1$2*-0.5 y1$3*0.5 y1$4*1.5];

Thank you for your help.

Bengt O. Muthen posted on Monday, January 08, 2018 - 5:33 pm

Send the output that shows the problem to Support along with your license number.

Mplus User posted on Wednesday, January 17, 2018 - 3:07 pm

I ran a monte carlo power analysis and got this :

THE POPULATION COVARIANCE MATRIX THAT YOU GAVE
AS INPUT IS NOT POSITIVE DEFINITE AS IT SHOULD BE.

Can you help?

MONTECARLO:
NAMES ARE y1-y3 x1-x9 ;
NOBSERVATIONS = 500;
NREPS=500;
SEED=4533;
MODEL POPULATION:
[y1-y3@0]; !mean for y variables
[x1-x9@0]; !mean for x variables
y1 - y3@1; !variance for y variables
x1 - x9@1; !variance for x variables
F1 BY y1*.853;
F1 BY y2*.795;
F1 BY y3*.783;
F2 BY x1*.765;
F2 BY x2*.692;
F2 BY x3*.769;
F3 BY x4*.750;
F3 BY x5*.926;
F3 BY x6*.804;
F4 BY x7*.883;
F4 BY x8*.849;
F4 BY x9*.916;

INT | F2 XWITH F3;
F1 ON F2 *.315;
F1 ON F3 *.053;
F1 ON F4 *.132;
F1 ON INT *.553;

F2 WITH F3 *.735;
F4 WITH F2 * .665;
F4 WITH F3 * .696;
F1@1;
F2@1;
F3@1;

ANALYSIS:
TYPE = RANDOM;
ALGORITHM=INTEGRATION;
MODEL:
F1 BY y1-y3;
F2 BY x1-x3;
F3 BY x4-x6;
F4 BY x7-x9;
INT | F2 XWITH F3;
F1 ON F2 F3 F4 INT;
OUTPUT: TECH9;

Bengt O. Muthen posted on Wednesday, January 17, 2018 - 3:46 pm

Try it first without the interaction.

Ti Zhang posted on Wednesday, February 07, 2018 - 9:30 am

Hi, Dr. Muthen,
The Mplus User's Guide says "The column labeled % Sig Coeff gives the proportion of replications for which the null hypothesis that a parameter is equal to zero is rejected at the .05 level".
If a researcher specifies nonzero values in the "population" column, does the null hypothesis become that the parameter is equal to the value specified? For example, the population value given in the "model" command for the latent mean difference is 0.2.
Would the null hypothesis become that the observed latent mean difference equals 0.2?
The Mplus output is below (in the "model population-g2" command, I specified a latent mean difference of 0.2; in the "model g2" command, I also specified it as 0.2). This means the simulated data for the two groups have a latent mean difference of 0.2, and the analysis model also uses 0.2 as the value.

Means
F1 0.200 0.2425..... 1.000 0.400
The last column above gives a value of 0.400. So, can I say we cannot reject H0?
Then, the Mplus manual says "For parameters with population values different from zero, this value is an estimate of power with respect to a single parameter, that is, the probability of rejecting the null hypothesis when it is false."
This sentence confuses me because I do not see how 0.4 can be the power when we could not reject H0 previously. Could you tell me which part I misunderstood? Thank you.

Bengt O. Muthen posted on Wednesday, February 07, 2018 - 4:32 pm

On your first question (your first ?): No, we are still testing against zero (that's what "significant" means).

Regarding our statement which ends with "the probability of rejecting the null hypothesis when it is false", the null hypothesis is that the parameter value is zero which is false since your population value is 0.2.
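
To make the distinction concrete, here is a minimal Python sketch of how the two columns can be computed from per-replication results. The estimates and standard errors are invented toy values, and the 1.96 critical value assumes a two-sided z-test at the .05 level.

```python
def pct_sig_coeff(estimates, std_errors, crit=1.96):
    """Proportion of replications in which H0: parameter = 0 is rejected."""
    hits = sum(1 for est, se in zip(estimates, std_errors)
               if abs(est / se) > crit)
    return hits / len(estimates)


def coverage(estimates, std_errors, population, crit=1.96):
    """Proportion of replications whose 95% CI contains the population value."""
    hits = sum(1 for est, se in zip(estimates, std_errors)
               if est - crit * se <= population <= est + crit * se)
    return hits / len(estimates)


# Toy results from five hypothetical replications; population value is 0.2.
estimates = [0.25, 0.10, 0.31, 0.18, 0.22]
std_errors = [0.10, 0.10, 0.17, 0.11, 0.10]

power = pct_sig_coeff(estimates, std_errors)   # test against zero
cover = coverage(estimates, std_errors, 0.2)   # CI around the true value
print(power, cover)
```

Coverage is judged against the population value (0.2 here), while % Sig Coeff is judged against zero, so high coverage and modest power can occur together.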

Morgan Kawamura posted on Tuesday, April 17, 2018 - 3:43 pm

Hi Drs. Muthen & Muthen,

Please excuse me if my question is misdirected. I'm hoping you can assist me or direct me to any simulation papers/resources that might help.

I am simulating a 3 mediator model with the Monte Carlo feature where 2 of my mediators are continuous and 1 is binary. I am having trouble knowing what I need to set my variance to for the binary mediator and how that variance impacts the variance of my other 2 continuous mediators. Do you have any resources that address this issue? Thank you!

Bengt O. Muthen posted on Tuesday, April 17, 2018 - 4:40 pm

See our Topic 11 Short course video and handout - also available on YouTube. And/or read our book Regression and Mediation Analysis using Mplus which shows many such Monte Carlo runs (book script examples also posted on our website).

Mary M Mitchell posted on Thursday, June 07, 2018 - 7:51 am

Dear Drs. Muthen,

I am running a Monte Carlo with example 3.7, the Poisson distribution, for a power analysis in a grant application. I kept the [u1*1]; code to define the intercept. I'm surprised there is no definition of mean and standard deviation for the outcome variable in the code. My results indicated that for an intercept of 1, the power is .87 for an unstandardized beta of .2. Is this what reviewers will be looking for? What throws me off is that we have two groups for the x1 variable, which is study arm, and then a continuous count outcome for number of sessions attended. We expect the mean of sessions for the control group will be 12 and the mean of the intervention group will be at least 13, but none of this information is conveyed in the sample Monte Carlo that I used.

Thanks for your help!

Mary

Bengt O. Muthen posted on Thursday, June 07, 2018 - 5:48 pm

Check out the Topic 11 YouTube video and handout, which refer to our RMA book (chapter 6 is on counts). You see there that for counts you model

log(mean) = a + b*x

So the scale is not in counts, but in log(mean) for the count distribution. And the log(mean) is the same as the intercept a only if x=0.
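
As a worked example of that scale change, the group means from the question (12 sessions for control, 13 for intervention) can be converted to an intercept and slope on the log scale. This is a minimal Python sketch that assumes x is coded 0 for control and 1 for intervention; the function name is made up for illustration.

```python
import math


def poisson_loglinear(mean_x0, mean_x1):
    """Return (a, b) such that log(mean) = a + b*x for a 0/1 covariate x."""
    a = math.log(mean_x0)                      # log(mean) when x = 0
    b = math.log(mean_x1) - math.log(mean_x0)  # change in log(mean) for x = 1
    return a, b


a, b = poisson_loglinear(12.0, 13.0)
print(round(a, 3), round(b, 3))

# Back-transforming recovers the expected counts in each group.
print(round(math.exp(a), 3), round(math.exp(a + b), 3))
```

Under that coding, an intercept of 1 corresponds to an expected count of exp(1), about 2.7, in the x = 0 group, which is far from 12.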

Mary M Mitchell posted on Friday, June 08, 2018 - 9:35 am

This was extremely helpful! Thank you!

Behzad posted on Sunday, July 22, 2018 - 11:53 am

According to rule-of-thumb sample size recommendations, one should have 10 participants per parameter (Kline, 2011, suggests 20). But, from your point of view and according to the Monte Carlo approach (Muthen & Muthen, 2002), several criteria should be examined to determine sample size in SEM, including: "parameter and standard error biases, the standard error bias for the parameter, and the coverage remains between 0.91 and 0.98". In my study, I have 286 participants (n=286). I have one predictor (a scale with 5 items), two mediators (each scale has 5 items), and three outcomes [two of the scales (latent) have 3 items, and one is an observed scale (measured with one item)]. I have also included gender (male and female), background (yes or no) and two dummy variables as covariates. With this number of parameters, the rule of thumb recommends examining the hypotheses through path analysis rather than SEM, because of the small sample size. What is your point of view? I don't know how to apply the Monte Carlo approach to determine the required sample size for SEM. You only provided examples for CFA and growth models in your paper, and I was wondering if you could let me know how to set up a Monte Carlo study for a latent variable model (SEM) in Mplus. Would you also please let me know if you have an example input for this?

Bengt O. Muthen posted on Sunday, July 22, 2018 - 2:32 pm

I think it is better to do your own more specific Monte Carlo simulation study than go by rules of thumb. You find Monte Carlo scripts for all the UG examples at

http://www.statmodel.com/ugexcerpts.shtml
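
Once a Monte Carlo run is done, the criteria quoted in the question can be checked per parameter. The Python sketch below is only an illustration: the 10% and 5% bias limits are how I read Muthen & Muthen (2002) and should be treated as assumptions, the function names are hypothetical, and the numbers are toy values rather than Mplus output.

```python
from statistics import mean, stdev


def summarize(population, estimates, std_errors):
    """Relative parameter bias and relative standard error bias."""
    avg_est = mean(estimates)
    sd_est = stdev(estimates)   # empirical SD of the estimates over replications
    avg_se = mean(std_errors)   # average of the estimated standard errors
    return {
        "param_bias": (avg_est - population) / population,
        "se_bias": (avg_se - sd_est) / sd_est,
    }


def meets_criteria(param_bias, se_bias, cover, focal=True):
    """Check bias and coverage limits (limits are assumptions, see text)."""
    se_limit = 0.05 if focal else 0.10   # stricter for the parameter of interest
    return (abs(param_bias) <= 0.10
            and abs(se_bias) <= se_limit
            and 0.91 <= cover <= 0.98)


# Toy replication results for one parameter with population value 0.5.
s = summarize(0.5, [0.48, 0.52, 0.50, 0.47, 0.53], [0.026] * 5)
ok = meets_criteria(s["param_bias"], s["se_bias"], cover=0.95)
print(s, ok)
```

Repeating the run at several sample sizes and applying the same check shows the smallest n at which every parameter passes.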

Margarita posted on Friday, August 31, 2018 - 4:26 am

Hi Dr. Muthen,

Can I confirm that for a Monte Carlo study the R2 of an endogenous latent that is predicted by let's say 1 exogenous observed (X1) and 2 endogenous latents (X2,X3) would be:

R2 = b1^2*Var(X1) + b2^2*Residual Var(X2) + b3^2*Residual Var(X3) + 2*b1*b2*Corr(x1,x2) +
2*b1*b3*Corr(x1,x3) +
2*b2*b3*Corr(x2,x3)

I just want to make sure I define things properly, especially given that exogenous observed are in the mix

Thanks!

Bengt O. Muthen posted on Friday, August 31, 2018 - 2:11 pm

You need the variance of X2 and X3, not just their residual variances. And you should use the covariances not the correlations.

Margarita posted on Friday, August 31, 2018 - 2:15 pm

Apologies for posting again, thought I would save you some time. I already confirmed the formula, also realised I wrote Res var instead of Var. Thank you anyway :-)

Margarita posted on Friday, August 31, 2018 - 2:19 pm

I think I was typing while you sent the above :-)

Yes, I realised I made an error there. I wrote correlation because I am using standardized betas in the Monte Carlo (based on previous literature). Thank you for your reply.

Nicholas Bishop posted on Friday, November 01, 2019 - 8:46 am

Hello,
I'm trying to produce estimates for a relatively simple growth model to use for subsequent MC simulation. I'm receiving the following error:
"Saving of ending values for the ESTIMATES option is not available for models with covariates. Request for ESTIMATES is ignored."

Example 12.7 Step 1 appears to contain covariates, so I'm unsure of the issue. Any support is appreciated.

Nick

Linda K. Muthen posted on Friday, November 01, 2019 - 5:18 pm

A much better approach that was developed after the ESTIMATES option is to use the SVALUES option of the OUTPUT command. This gives you a MODEL command with the ending values of the analysis as starting values. You can then use this in the Monte Carlo analysis.

Anton Dominicson posted on Sunday, December 22, 2019 - 11:48 am

Dear Mplus team, I ran a simple regression with two groups, and with Model Test I tested whether the slopes were different across groups (Model Test: b1=b2; with Model Constraint I used: diff = b1-b2). Now I want to run an MC simulation that includes the Model Test part. I understand Model Test isn't available in MC, so I tried Model Constraint instead, but I'm not sure how to set up the Model Constraint part in MC. I tried the following syntax but I get a fatal error message (A POPULATION VARIANCE FOR A COVARIATE IS ZERO)

MONTECARLO:
NAMES ARE HbA1c x1;
NGROUPS = 2;
NOBSERVATIONS = 123 119;
NREPS = 10000;
MODEL POPULATION:
hba1c ON x1*-0.01799 (b1);
[ hba1c*9.32303 ];
hba1c*3.03982;
MODEL POPULATION-g2:
hba1c ON x1*-0.01763 (b2);
[ hba1c*9.16726 ];
hba1c*2.15833;
MODEL:
hba1c ON x1*-0.01799 (b1);
[ hba1c*9.32303 ];
hba1c*3.03982;
MODEL g2:
hba1c ON x1*-0.01763 (b2);
[ hba1c*9.16726 ];
hba1c*2.15833;
MODEL CONSTRAINT:
NEW(diff*-0.00036);
diff= b1-b2;

How can I fix this? Or is there some other approach in which Mplus could handle this? Thanks.

Bengt O. Muthen posted on Monday, December 23, 2019 - 2:25 pm

The x1 variable needs to be given a variance (and a mean) in Model Population, otherwise it defaults to zero as the message says.

Yue Yin posted on Wednesday, February 26, 2020 - 10:35 am

I am trying to use multiple-group MIMIC modeling with categorical variables to generate data. One condition is that the residual variances of the items differ between the two groups (.3 for one group, .1 for the other). If I use Delta to generate the data, I can't see the residual variances in the output. But if I use Theta, the residual variances of one group are always fixed to 1. Below is part of the code. Could you help me with it? Thank you!
Model population:
[x1@0]; x1@1;
f by u1@.9 u2*.7 u3*.6 u4*.8 u5*.7 u6*.6;
f*.75;
f on x1*.5;
u4 on x1*.4; u5 on x1*.5;
u1@.3 u2@.3 u3@.3 u4@.3 u5@.3 u6@.3;
[u1$1*-.15];
[u2$1*.25];
[u3$1*.15];
[u4$1*-.25];
[u5$1*-.10];
[u6$1*.10];
model population-g2:
f*.45;
u1@.1 u2@.1 u3@.1 u4@.1 u5@.1 u6@.1;
Analysis:
parameterization=theta;

Bengt O. Muthen posted on Wednesday, February 26, 2020 - 4:38 pm

You can fix the Theta parameters to values other than 1 - e.g. 0.3.

Yue Yin posted on Wednesday, February 26, 2020 - 6:55 pm

Do you mean fix the default Theta parameters? I fixed the residual variance to .3 for one group and .1 for another group using "@", but in the output, the residual variance of one group is still 1. Do you know how to fix the Theta parameters? Thank you!

Bengt O. Muthen posted on Thursday, February 27, 2020 - 11:55 am

Right. But fix it in both the Model Population and Model commands.

If that doesn't help, send output to Support along with your license number.

Yue Yin posted on Friday, February 28, 2020 - 1:04 pm

Yes, I sent the output, but I still haven't received a reply. Could you check it? Thank you!

Bengt O. Muthen posted on Saturday, February 29, 2020 - 5:58 am

You need to explicitly give the fixed values not only in your second group but also in the first group, so using

Model population-g1: u1@.3 u2@.3 u3@.3 u4@.3 u5@.3 u6@.3;

and

Model g1: u1@.3 u2@.3 u3@.3 u4@.3 u5@.3 u6@.3;

I am not sure, however, why you give these fixed values. It merely means that you choose different scales of the estimates than when the default of 1 is used. The standardized solutions will be the same. No new information is obtained.

Yue Yin posted on Saturday, February 29, 2020 - 12:46 pm

Yes, that's what I did, and I sent the output to Support. I fixed those values in both Model Population and Model, but the residual variance is still 1 in the output. The reason I want to fix those values is that I want to check whether different versus equal residual variances across the two groups affect some DIF detection methods. Some previous DIF studies fixed the residual variance to .3 for both groups, so I want to use those values in my study. I still can't figure out why it is not working.

Bengt O. Muthen posted on Saturday, February 29, 2020 - 5:03 pm

In what you sent Support, you did not do what I suggest here - you did not specify it explicitly for group 1.

Jen posted on Thursday, April 30, 2020 - 2:52 pm

Hello, I am hoping for help with a monte carlo power simulation. I need to test the power for testing an interaction. We are likely to use manifest variables, but I am trying to use the XWITH function for the power simulation since I don't understand how to use two steps. The entire model is quite complex, but I've tried to simplify it as much as possible here. I have been searching this forum and other resources all week but have not been able to fix my model -- I keep getting MCONVERGENCE errors related to many different model parameters. Is there something obvious I'm missing? Thanks for any guidance!

MODEL POPULATION:
stigwf BY stigw@1;
stigwf*1;
stigw@.0001;

modwf BY modw@1;
modwf*1;
modw@.0001;

medw ON stigwf*.3;

stigwf WITH modwf*.0001;
medw ON modwf*-.2;

int | stigwf XWITH modwf;

medw ON int*-.2;

[medw*0];
medw*1;

Bengt O. Muthen posted on Friday, May 01, 2020 - 4:28 pm

I assume you also use a Model command, not only a Model Population command.

We need to see your full output to tell what's going on - send to Support along with your license number.

Brandon Goldstein posted on Wednesday, June 10, 2020 - 1:37 pm

Hello,
I am attempting to learn how to use Mplus for power calculations, starting with simple scenarios like multiple regression with 2 and 3 variables. Is there a good resource for learning how to do that?
Additionally, I am trying to understand whether there are rules for selecting values for the various parameters and their relationships to one another. For instance, if I change the variance of a predictor, should I change other values as well, or is that unnecessary?

Bengt O. Muthen posted on Wednesday, June 10, 2020 - 3:29 pm

Here are two sources:

Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 4, 599-620. Mplus inputs and outputs used in this paper can be viewed and/or downloaded from the Examples page.

Simulations for regression and mediation is thoroughly covered in our 2016 book Regression and Mediation Analysis Using Mplus:

http://www.statmodel.com/Mplus_Book.shtml

Brandon Goldstein posted on Monday, August 17, 2020 - 7:22 am

Thanks again for the assistance. In the 2016 book (which is great!), I am interested in the model with the exposure-mediator interaction (Case 3 in chapters 2 and 3). I am hoping to conduct a handful of simulations to estimate power to detect various combinations of small, medium and large effects at different sample sizes with x, m and y all treated continuously. I have two main questions so far.

1. What is the appropriate equation, for the purposes of determining r-squared for the equation for Y in Case 3 (and for this equation is it necessary to determine the variance of the mx product, if so how is that variance calculated)?

2. If I am interested in understanding the coefficients in the model as standardized coefficients, would this be accomplished by always having the total variances for m and y be equal to 1? I am thinking that in order to be able to fairly compare necessary sample sizes for models with "small" vs. "medium" effect sizes, the way to do this would be to balance increases of the coefficient sizes, with decreases in the residual variances of m and y, while keeping the total variances for m and y equal to 1 across these different scenarios.

Bengt O. Muthen posted on Monday, August 17, 2020 - 4:45 pm

1. See intro formulas for products of variables in the paper on our website (see Recent Papers):

Asparouhov, T. & Muthén, B. (2019). Bayesian estimation of single and multilevel models with latent variable interactions. Forthcoming in Structural Equation Modeling.

2. That's right.

Brandon Goldstein posted on Thursday, August 27, 2020 - 5:54 am

Thank you as always!
In the 2016 regression book, you describe a second-step Monte Carlo study. Could you explain what this second step does and whether it is necessary?
The way the text is written suggests that the second step is important for examining power at different sample sizes.
However, if I wanted to examine power for two different sample sizes, wouldn't it be sufficient to conduct two separate runs of the first step of the Monte Carlo study and simply change the NOBSERVATIONS value?
Thanks for helping to clarify as always.

Bengt O. Muthen posted on Thursday, August 27, 2020 - 4:25 pm

You do 2 steps typically when you want to generate the data one way and analyze it another. See UG ex 12.6. A second step also makes it possible to use Define, for instance if you want to create an interaction.

Brandon Goldstein posted on Thursday, August 27, 2020 - 5:25 pm

That absolutely makes sense. But what would be the reasons for needing to create the interaction in the second step? Wouldn't the method you describe, using the random slope with a fixed variance, be sufficient to test power for the interaction? What exactly is added by doing it with the Define command in the second step?

Bengt O. Muthen posted on Saturday, August 29, 2020 - 4:33 pm

Nothing in a simple case but you may have more complex interactions (3-way?). Also, Define can be used for many other transformations.

Brandon Goldstein posted on Monday, August 31, 2020 - 10:59 am

Ok,
I had not thought about a three-way interaction. Now that you mention it, would the initial model syntax for a three-way model look like this:

model population:
x@1;
M@1;
z@1;

y on x*0.1; !main effect of x

beta2 | y on m; !main effect of m
[beta2*0.2];
beta2@0;

beta3 | y on z; !main effect of z
[beta3*0.2];
beta3@0;

beta2 on x*0.1 z*0.1; !mx and mz terms;

beta4 | beta3 on m; !zm term;
[beta4*0.1];
beta4@0;
beta4 on x*0.1; ! 3-way with x "predicting" zm term

Y*0.6; !residual of y

Bengt O. Muthen posted on Monday, August 31, 2020 - 3:56 pm

Try it out and if you have problems, send output to Support along with your license number.

Brandon Goldstein posted on Wednesday, September 02, 2020 - 8:56 am

Another question.
I understand that by using Cut(0), together with x@1 and [x@0] in the Model Population section, we get a dichotomized variable that has a mean of 0.5 and a variance of 0.25.

What if we wanted a dichotomous variable that did not have a 50/50 split? Could you please provide some guidance about how to tell Mplus to do that? I imagine that we would need to scale the mean and variance in the population command in some way, but it is not clear to me how to do that.
Thank you as always

Bengt O. Muthen posted on Friday, September 04, 2020 - 3:53 pm

If you stick with X mean=0, variance=1 (which is before the cut), you use z-values in CUT(z) to get the probability you want (see Z-value tables).
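
Instead of a printed z-table, the cut can be computed directly. This minimal Python sketch assumes the generated X is standard normal and that values above the cut are coded 1, so check the coding direction against your own output; the target proportion of 0.30 is a made-up example.

```python
from statistics import NormalDist


def cut_for_proportion(p_one):
    """z-value cut on a N(0, 1) variable so that P(X > cut) = p_one."""
    return NormalDist().inv_cdf(1.0 - p_one)


p_one = 0.30                     # hypothetical target share of 1s
cut = cut_for_proportion(p_one)  # value to use in place of 0 in the cut
mean_binary = p_one              # mean of a 0/1 variable
var_binary = p_one * (1.0 - p_one)

print(round(cut, 4), mean_binary, round(var_binary, 2))
```

A 50/50 split gives a cut of 0, which reproduces the mean of 0.5 and variance of 0.25 mentioned in the question.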

Brandon Goldstein posted on Thursday, October 01, 2020 - 7:49 am

Hello,
I am interested in power analysis for moderation analyses (all variables continuous), in which I can vary the correlation between X and M. It should be the case that higher correlations between X and M lead to greater power to detect the interaction. I am having trouble specifying a model that will allow me to set the correlation. Using a with statement like M with X*.5; or M with X@.5; is producing error messages. The syntax works fine when I do not try to write a line for the correlation. How could I go about doing this? Is it enough to factor the correlation of X and M into the calculation for the residual variance of Y?
Alternatively, would it be appropriate to specify the M and X relationship with a regression statement (as in Case 3 from the Mplus regression book)? If so, this would mean that testing power for a regular moderation model is equivalent to testing power for Case 3. Does that seem correct to you?

My population generation syntax looks like this (the correlation of .5 between M and X is taken into account in the residual variance of Y, so that the overall variance of Y is 1):
Model Population:
x@1;
M@1;

y on x*.3;
beta1 | y on m;
beta1 on x*.25;
[beta1*.3];
beta1@0;

y*0.651875;

Bengt O. Muthen posted on Friday, October 02, 2020 - 4:06 pm

You vary the correlation between X and M by varying the regression coefficient of M on X.
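
With Var(X) = 1 and the total variance of M held at 1, the coefficient of M on X equals the X-M correlation, and the residual variance of M is 1 minus that coefficient squared. The Python sketch below works through the arithmetic using the coefficients from the question. It assumes X and M are bivariate normal, so that Var(X*M) = 1 + rho^2 and the product is uncorrelated with X and M; the function names are hypothetical.

```python
def residual_var_m(b_mx, var_x=1.0, total_var_m=1.0):
    """Residual variance of M so that Var(M) stays at total_var_m."""
    return total_var_m - b_mx ** 2 * var_x


def residual_var_y(b_x, b_m, b_xm, rho, total_var_y=1.0):
    """Residual variance of Y for y = b_x*x + b_m*m + b_xm*x*m + e.

    Assumes x and m are standard bivariate normal with correlation rho.
    """
    main = b_x ** 2 + b_m ** 2 + 2 * b_x * b_m * rho   # x and m terms
    product = b_xm ** 2 * (1 + rho ** 2)               # x*m term
    return total_var_y - (main + product)


rho = 0.5
res_m = residual_var_m(rho)                   # M ON X coefficient of 0.5
res_y = residual_var_y(0.3, 0.3, 0.25, rho)   # coefficients from the question
print(res_m, round(res_y, 6))
```

The Y residual variance reproduces the 0.651875 used in the question, so that value is consistent with a correlation of .5 once M is actually generated with that correlation.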

Brandon Goldstein posted on Tuesday, October 06, 2020 - 6:47 am

Thank you. I have now been able to get these models to work using either with or on.
Part of my original question was about which syntax is correct to use to characterize the M and X relation when running an interaction with correlated predictors. Should it be on or with?
When keeping everything else the same, I see that power and coverage differ between the two approaches (is that simply simulation error?). When I look at the summary statistics, the pattern of correlations between M, X and Y is quite different. Somewhat surprisingly to me, when I use the on statement for the model generation, the correlation coefficient for M and X is almost exactly what I specify, but when I use the with statement, the summary statistics show an inflated correlation for M and X.

This might suggest using on as opposed to with, but how do I know which approach should be used?

Bengt O. Muthen posted on Tuesday, October 06, 2020 - 5:33 pm

Have a look at the RMA book examples for Table 2.26 at

http://www.statmodel.com/mplusbook/chapter2.shtml