Dual/multi-core processor

Mplus Discussion > Structural Equation Modeling >

Maria Caridad García-Cepero posted on Tuesday, May 29, 2007 - 11:39 am

I just bought a new computer (AMD Athlon 64 X2 4000+).
I am not sure how to take advantage of the multiprocessing speed gains. Do I need a special setup for Mplus, or does Mplus 4.21 use both cores automatically?
I have noticed that when I run my models (for instance, some MIMIC models with zero-inflated Poisson endogenous indicators in a sample of 4000 subjects), only one of the processors is used, never both. Is this normal?
Thanks
Caridad

Thuy Nguyen posted on Tuesday, May 29, 2007 - 11:59 am

Yes, Mplus only uses one processor by default. You need to specify the number of processors Mplus should use with the PROCESSORS option in the ANALYSIS command.

For example:

ANALYSIS:
PROCESSORS = 2;

Matthew Cole posted on Friday, July 13, 2007 - 7:23 pm

Fantastic!

This really speeds up the analysis, particularly when running the mixture analyses with random starts.

Spencer James posted on Friday, February 17, 2012 - 7:31 am

Is there a limit to how many processors Mplus can/will use?

Linda K. Muthen posted on Friday, February 17, 2012 - 1:43 pm

No.

Anonymous posted on Thursday, July 05, 2012 - 10:55 am

I have a quad-processor desktop with 4MB, and would have assumed that the maximum number of processors I could specify would be 4. However, I tried 8 and this seemed to work. Is this expected?

Linda K. Muthen posted on Friday, July 06, 2012 - 11:19 am

We have no way of knowing how many processors a computer has. If you say you have more than you do, this could cause Mplus to run inefficiently. You should state the number of processors correctly.

Robert Nichols posted on Friday, April 01, 2016 - 3:26 pm

When using the PROCESSORS option should I specify the number of cores or the number of logical processors? I have 4 cores and 8 logical processors.

Thuy Nguyen posted on Saturday, April 02, 2016 - 1:29 pm

You should specify the number of logical processors that you want to use. So on a quad-core computer with hyperthreading, you can specify PROCESSORS=8.
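
As noted earlier in the thread, Mplus cannot detect the processor count itself, so the number has to be looked up on the machine. The following is a minimal Python sketch of one way to do that. It relies on `os.cpu_count()`, which reports logical processors (so 8 on a quad-core machine with hyperthreading). The helper name `analysis_block` is hypothetical and only formats the text to paste into an input file.

```python
import os


def analysis_block(n_processors):
    """Return the text of an Mplus ANALYSIS command with a PROCESSORS setting."""
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    return f"ANALYSIS:\n  PROCESSORS = {n_processors};"


# os.cpu_count() counts logical processors; it can return None when the
# count cannot be determined, in which case this sketch falls back to 1.
logical = os.cpu_count() or 1
print(analysis_block(logical))
```

On a machine that is shared with other work, a smaller value than the detected count may be more appropriate.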

Peter posted on Sunday, January 29, 2017 - 8:28 am

Dear Drs. Muthen,

1) Could you provide an updated list (as of Jan 30, 2017) of the specific Mplus commands that run faster on multiple cores, i.e., with

PROCESSORS = 12

2) Have all of the Mplus commands listed in 1) above been parallelized in your code so that they can take advantage of multiple cores?

Would you please comment on the following:

3) In benchmark timing tests with Stata and Geekbench, we found faster parallel processing times when only hardware cores were used, that is, when virtual cores were left free.

Our best explanation for this difference, from watching Mac OS X and Windows 8 allocate the workload among multiple processors (on the same dual-boot machine), is that it is best to leave all virtual cores free for operating system calls and other "housekeeping" chores that are required while floating-point matrix algebra computation is ongoing on the hardware cores.

Have tests within Mplus on your multi-core machines confirmed or refuted point 3) above?

Thank you,

Paul

Tihomir Asparouhov posted on Monday, January 30, 2017 - 9:23 am

1) All the up-to-date information on that is in the Mplus User's Guide, pages 648-650.

2) yes

3) No. However, there is such a great variety of processors and operating systems that I wouldn't generalize too much here. On Intel CPUs with Windows, using virtual cores is faster.

John B. Nezlek posted on Thursday, August 24, 2017 - 1:21 am

Hi Mplus,

If I understand Thuy Nguyen's post of April, 2016 properly, the processor command can "pick up" threads and cores as well as physical processors. So, even on a single processor machine with hyperthreading, I can specify processors = 2. If I had a dual core (single physical CPU) machine with hyperthreading, I could specify processors = 4.

I am considering purchasing a new machine primarily to do some multilevel mixture models with some larger samples (e.g., 3,000 level 1, and 250 level 2). I assume the chips will provide hyperthreading. Do you have any sense about where (in terms of the number of processors) the performance curve dips? That is, I assume 2 is better than 1, 4 is better than 2, and so forth; however, the law of diminishing returns may apply here.

Along these lines, what is the impact of RAM? Of course, more is usually better, but is there an inflection point in terms of performance here also? I assume a starting point of 16 GB, but I will be running Windows 7 Pro, which allows up to 192 GB.

Thank you for considering these questions, and please accept my apologies if they are poorly informed.

John

Bengt O. Muthen posted on Friday, August 25, 2017 - 5:06 pm

While we hesitate to comment on hardware, we feel that for Mplus it is better to get the fastest multi-core processor rather than the processor with the greatest number of cores. Not all Mplus computations are done in parallel, so having the fastest single-processor speed avoids running into a computational bottleneck.

The ideal amount of RAM depends on the size of the model and data and on how much multitasking is done while Mplus is running. Our best computer has 32GB of RAM. We have run into memory limitations with that much RAM, but the number of variables in those models was very large.
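
The point that not all computations run in parallel can be illustrated with Amdahl's law, a general result about parallel speedup rather than anything specific to Mplus. The sketch below is a minimal Python illustration. The parallel fraction of 0.5 is a purely hypothetical value chosen for the example, not a measured property of any Mplus analysis.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Theoretical speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be parallelized (0 to 1).
    n_processors: number of processors working on the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Hypothetical workload: half of the run time is parallelizable.
for n in (1, 2, 4, 8, 16):
    print(f"{n:>2} processors: {amdahl_speedup(0.5, n):.2f}x")
```

With a fixed serial share, the speedup levels off as processors are added, which is why a faster individual core can matter more than a higher core count.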

Kurt Beron posted on Tuesday, August 29, 2017 - 12:24 pm

I thought I'd add to this thread (sorry) as I've been dealing with the question of the optimal number of processors/threads myself. Below are my benchmarks for my 4-core/8-thread PC with 32 GB of RAM. I can request quite a few Mplus threads, but a number that matches my CPU works better than a larger one. Obviously I am running a specific program, but I repeated this with a different program and got similar results:

4 processors

Beginning Time: 13:27:02
Ending Time: 13:30:36
Elapsed Time: 00:03:34

8 "processors"

Beginning Time: 13:31:33
Ending Time: 13:34:25
Elapsed Time: 00:02:52

16 "processors"

Beginning Time: 13:35:06
Ending Time: 13:38:39
Elapsed Time: 00:03:33

32 "processors"

Beginning Time: 13:39:28
Ending Time: 13:43:03
Elapsed Time: 00:03:35

Kurt
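
The timings above can be compared directly with a short script. This is a minimal Python sketch that recomputes the elapsed times from the reported beginning and ending times and expresses each run relative to the PROCESSORS = 4 run. The names `RUNS`, `elapsed_seconds`, and `summarize` are introduced here for illustration only.

```python
from datetime import datetime

# (PROCESSORS setting, beginning time, ending time) as reported in the post.
RUNS = [
    (4, "13:27:02", "13:30:36"),
    (8, "13:31:33", "13:34:25"),
    (16, "13:35:06", "13:38:39"),
    (32, "13:39:28", "13:43:03"),
]


def elapsed_seconds(begin, end):
    """Whole seconds between two HH:MM:SS clock times on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(begin, fmt)
    return int(delta.total_seconds())


def summarize(runs):
    """Map each setting to (elapsed seconds, speedup relative to the first run)."""
    baseline = elapsed_seconds(runs[0][1], runs[0][2])
    result = {}
    for setting, begin, end in runs:
        seconds = elapsed_seconds(begin, end)
        result[setting] = (seconds, baseline / seconds)
    return result


summary = summarize(RUNS)
for setting, (seconds, speedup) in summary.items():
    print(f"PROCESSORS = {setting:>2}: {seconds} s, {speedup:.2f}x vs. PROCESSORS = 4")
```

For these four runs, only the setting of 8 is noticeably faster than the baseline (172 s against 214 s); the runs with 16 and 32 take about as long as the run with 4.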

John B. Nezlek posted on Tuesday, August 29, 2017 - 12:52 pm

Hi Kurt:

Thanks for the post. As I understand things, a quad-core processor with hyperthreading gives you 8 logical processors, so it makes sense that specifying 8 processors provides the most efficient solution. That is, although you can specify more processors than you have, it seems that it does no good.

Thanks again,
John

Bengt O. Muthen posted on Tuesday, August 29, 2017 - 5:12 pm

That's right. There is a penalty in the form of overhead for specifying additional processors, so if you don't have the hardware behind them, all you will get is the penalty.

Denise Kerkhoff posted on Friday, June 14, 2019 - 12:50 am

Dear Linda, dear Bengt,
I have a few questions regarding the processor specifications for THREELEVEL analysis.
1) Is it correct that for TYPE=THREELEVEL, I can specify the number of processors, e.g. PROCESSORS=4, but for TYPE=THREELEVEL RANDOM, I need to specify the number of processors and threads in conjunction with the STARTS option, e.g. PROCESSORS = 4 2; STARTS= 200 20; Otherwise, only one processor would be used?
2) If this is correct, how would I be able to increase computation speed for TYPE=THREELEVEL RANDOM if I use a multi-core processor without hyperthreading? Would PROCESSORS = 4 1; in conjunction with the STARTS option work?
3) If I use a quad-core processor with hyperthreading, would a sensible specification of processors and threads be PROCESSORS = 4 2; for TYPE=THREELEVEL RANDOM?
4) I am not sure if I understand the STARTS option correctly. If I do not specify this option, what are the default values for the number of random sets of starting values and the number of optimizations to use in the final stage in TYPE=THREELEVEL RANDOM? Can I find them in the output?

Thank you very much for your help,
Denise

Tihomir Asparouhov posted on Saturday, June 15, 2019 - 2:58 pm

1) This does not sound correct. The STARTS option is not usually used with TYPE=THREELEVEL. It is mostly used for mixture models. The PROCESSORS option for TYPE=THREELEVEL is available for the Bayes estimator and for the ML estimator when numerical integration is performed, for example with categorical variables. It is not available with the ML estimator and continuous variables, where the computation is usually quite fast. You should be able to see the number of processors Mplus uses in the task manager.

2) If the PROCESSORS option is available (Bayes, or ML with algo=int), then specifying processor=4 would help. You might also be experiencing slow convergence due to too many random slopes. What can help in that case is to run one random slope at a time to see if it is really needed (significant variance) and keep only those that are needed in the final model.

3) For ML with numerical integration I would recommend proc=8 for your computer. For Bayes I would recommend proc=4.

4) We would use the default starting values (see tech1) and perform just one optimization. Starts is generally not needed for type=threelevel but is available.
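
The recommendation in 3) can be written down as a small lookup. This is a minimal Python sketch that assumes the advice for this quad-core machine with hyperthreading generalizes as "logical processors for ML with numerical integration, physical cores for Bayes". That generalization is an assumption, not something stated in the reply. The function name and the labels "bayes" and "ml_integration" are hypothetical and are not Mplus keywords.

```python
def suggested_processors(estimator, physical_cores, logical_processors):
    """Suggest a PROCESSORS value following the pattern in the reply above.

    estimator: "bayes" or "ml_integration" (hypothetical labels).
    """
    key = estimator.strip().lower()
    if key == "bayes":
        # The reply recommends proc=4 on a quad-core machine.
        return physical_cores
    if key == "ml_integration":
        # The reply recommends proc=8 with hyperthreading on the same machine.
        return logical_processors
    raise ValueError(f"no recommendation for estimator {estimator!r}")


# Quad-core processor with hyperthreading, as in the question.
print(suggested_processors("ml_integration", 4, 8))
print(suggested_processors("bayes", 4, 8))
```

The reply also notes that PROCESSORS has no effect for ML with continuous variables, so that case is deliberately left out of the lookup.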