Random Starts

Mplus Discussion > Growth Modeling of Longitudinal Data >

Jungeun Lee posted on Friday, December 14, 2007 - 4:38 pm

Hi,

I am working on a growth mixture model (outcome=continuous, estimator=MLR, type=Mixture Missing). When I increased the number of classes to 4, I got the following warning. My current Mplus input for this model has STARTS = 500 20. I am puzzled about what I can do about this...

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE
SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE
NUMBER OF RANDOM STARTS.

Bengt O. Muthen posted on Friday, December 14, 2007 - 5:51 pm

You can increase the number of random starts further until the best LL is replicated. If you have problems replicating it for many random starts, this might indicate that you are trying to extract too many classes - the data don't show signs of that many classes.
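
A minimal sketch of what raising the random starts could look like in the ANALYSIS command, assuming the setup described in the question (TYPE = MIXTURE MISSING, MLR). The values 2000 and 500 are illustrative assumptions, not numbers recommended in this reply:

```mplus
ANALYSIS:
  TYPE = MIXTURE MISSING;
  ESTIMATOR = MLR;
  ! Illustrative values only (the question used STARTS = 500 20).
  ! First number  = initial stage random starts.
  ! Second number = final stage optimizations.
  STARTS = 2000 500;
```

If the warning persists as these numbers grow, the reply above points to the number of classes rather than the starts as the likely problem.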

linda beck posted on Monday, August 18, 2008 - 3:07 am

I have a very complex 3 class model and used starts = 1500 50.

my loglikelihoods were:

-8082.751
-8083.564
-8083.641
-8083.641
-8083.642
-8083.642
-8083.642
-8083.642
-8083.642

Since I get no "loglikelihood warning message", can I trust this solution even though the values are only imprecisely replicated? I'm aware of the possibility of extracting too many classes with this 3 class solution (as posted above). I only want to use the BIC and TECH11 here, in order to have arguments for my preferred 2 class solution, where I had no problems replicating the loglikelihood values exactly.

thanks, linda

Bengt O. Muthen posted on Monday, August 18, 2008 - 6:37 am

The way to check if the first LL is close enough to the second is to check if it gives approximately the same solution in terms of parameter estimates. This in turn can be determined by using the seed for the second LL as "OPTSEED" in a new run where you inspect the parameter estimates and compare them to those of the first LL run.
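
The check described above might be set up as follows. This is only a sketch: the seed 123456 is a placeholder for whatever seed the first run's output lists beside the second loglikelihood value.

```mplus
ANALYSIS:
  TYPE = MIXTURE;
  ! Placeholder seed: replace with the seed printed next to the
  ! second-best loglikelihood in the first run's output.
  OPTSEED = 123456;
```

The parameter estimates from this run are then compared by eye with those from the run that produced the best loglikelihood.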

linda beck posted on Tuesday, August 19, 2008 - 2:10 am

Thanks for that advice. The second LL isolates different kinds of classes, which was often the case when I tried to compute 3 class models with these data.
Although that instability also speaks for 2 classes, what can one do to force LL replication? I have heard of increasing both values in the STARTS option, increasing STITERATIONS, increasing the number of integration points... Did I miss an option?

Linda K. Muthen posted on Tuesday, August 19, 2008 - 7:28 am

You can increase starts to as many as 5000 100 and decrease MCONVERGENCE. If this does not help, you may need to use a simpler model.
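
As a sketch, the two suggestions above could be combined like this. The MCONVERGENCE value shown is an assumption chosen for illustration; the point is only that it should be smaller than the value currently in effect for your model.

```mplus
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 5000 100;
  ! Illustrative value: choose something stricter (smaller) than
  ! the criterion your current run is using.
  MCONVERGENCE = 0.0000001;
```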

linda beck posted on Tuesday, August 19, 2008 - 9:44 am

Thanks, I've done exactly what you recommended before you posted it, funny... It is still running. One additional question: How can one deal with nonconverging perturbed starting values? There was a message that some did not converge under the list of LL-values (I guess this is for the final stage optimizations) and above the list of LL-values.

linda beck posted on Tuesday, August 19, 2008 - 9:47 am

Addendum: in another post Bengt recommended switching to STSCALE=1. Is that an option?

Linda K. Muthen posted on Tuesday, August 19, 2008 - 11:05 am

If you increase the starts to 5000 it will take more time than 100. If you have further questions about output, send your files and license number to support@statmodel.com.

You could try STSCALE=1; You can also send your files and license number to support@statmodel.com.
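
For illustration, STSCALE goes in the ANALYSIS command alongside STARTS. The STARTS values here are carried over from the earlier suggestion and are not specific to STSCALE:

```mplus
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 5000 100;
  ! STSCALE sets the scale of the random perturbation that is
  ! applied to the starting values.
  STSCALE = 1;
```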

Carey posted on Sunday, February 08, 2015 - 10:21 am

I am slightly confused with how many random starts to use in my LPA. Should the random starts change the solution? If it does, what does it mean?

Linda K. Muthen posted on Sunday, February 08, 2015 - 11:28 am

You should use enough random starts so that you replicate the best loglikelihood. If you do not, you have reached a local solution. Use, for example, STARTS = 200 50 or more where the second number is 1/4 of the first number.

J.D. Haltigan posted on Friday, September 07, 2018 - 4:51 am

Is it more generally the case that the more complex the model, the larger the number of random start sets one is likely to need to replicate the best LL?

Bengt O. Muthen posted on Friday, September 07, 2018 - 2:24 pm

Yes because the likelihood is not as smooth and pointed.

Margarita posted on Monday, October 08, 2018 - 8:54 am

Hi Dr. Muthen,

I wonder if there is a specific reason why you suggest (in the output) running the same model twice with twice the random starts? I understand that this is good advice for avoiding local maxima. However, is this always a requirement? For instance, would 2 runs of the same model (1000 250 and then 2000 500) offer more information than running it only once with as many starts as the 2nd run, e.g. 2000 500?

Thank you,

Bengt O. Muthen posted on Monday, October 08, 2018 - 10:39 am

Q1: No.

Q2: No.

This output suggestion is simply meant to avoid using a solution that is based on premature stoppage by PSR. If you have a very long run and have seen PSR close to 1 for many iterations, you don't need to do this follow-up check.

Margarita posted on Monday, October 08, 2018 - 2:04 pm

That's great, thank you!

Apologies, just realised that this thread is under growth modelling and MCMC and not general random starts. I assume the same reasoning applies to ML-EM with integration, but one would have to see close to zero abs & rel change?

Bengt O. Muthen posted on Monday, October 08, 2018 - 2:55 pm

To some extent it is a similar story but with ML it is possible to have the more exact convergence criterion of close to zero first-order derivatives.

Andrea Roman Alfaro posted on Thursday, January 31, 2019 - 9:18 am

Hi, I am doing an LCA with 20 variables and a sample of 3324 individuals. When I run the analysis, I keep getting the following output for TECH14:
TECHNICAL 14 OUTPUT

Random Starts Specifications for the k-1 Class Analysis Model
   Number of initial stage random starts                                      0
   Number of final stage optimizations                                        0

Random Starts Specification for the k-1 Class Model for Generated Data
   Number of initial stage random starts                                      0
   Number of final stage optimizations for the initial stage random starts    0

Random Starts Specification for the k Class Model for Generated Data
   Number of final stage optimizations                                      150
   Number of bootstrap draws requested                                   Varies

Is this normal? This is my analysis input:
Type=mixture;
Processors = 4(starts);
starts = 0;
LRTSTARTS = 0 0 600 150;

Bengt O. Muthen posted on Thursday, January 31, 2019 - 4:18 pm

Please send your output to Support along with your license number.

See also web note 14:

Asparouhov, T. & Muthén, B. (2012). Using Mplus TECH11 and TECH14 to test the number of latent classes. Mplus Web Notes: No. 14. May 22, 2012.

namer posted on Friday, March 15, 2019 - 7:08 am

Hello,

Is there a rule of thumb regarding a reasonable number of random starts? I was always under the impression that 1000 250 was a generous starting point but my current situation leads me to question that -

I have a model at the moment in which the LL replicates very well with 5000 random starts, and the solution looks sound, but if I run the same model with 10,000 random starts, I get a different solution (its LL is also replicated many times, and it also seems to be sound, just different).

Does this indicate the 5000 random starts solution is actually a local solution?

Also, above you mention checking that PSR is close to 1. Can you clarify what PSR stands for?

Thank you!

Bengt O. Muthen posted on Friday, March 15, 2019 - 11:39 am

Typically, 1000 250 is quite sufficient. Your model must be complex (perhaps overly complex) where the likelihood is a bit flat here and there so that some parameters are not well pinned down by the information in the data.

Yes, the 5000 solution must then be a local max. You can check to see if the 2 sets of estimates (from 5000 and from 10000) are much different; they may not be.

PSR stands for potential scale reduction - you should watch the Bayes segment of our Topic 11 Short Course video (also on YouTube). It is important to know how to work with the PSR.

namer posted on Friday, March 15, 2019 - 3:34 pm

Thank you very much for your response!

The model is a 3 variable LPA, and all variables have a 5 pt response scale, though we do include sampling strata and weights, so perhaps the complex sample is where the complexity comes from? :-)

I will definitely check out PSR. Thanks again!

Bengt O. Muthen posted on Friday, March 15, 2019 - 5:51 pm

I don't think so. More likely is having a (too) high number of classes or allowing variances to vary across classes.
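
To make the second possibility concrete, here is a hedged fragment showing how class-varying variances are typically requested in an LPA. The indicator names y1-y3 and the three-class setup are hypothetical, and the DATA command is omitted. By default Mplus holds the indicator variances equal across classes; mentioning them in a class-specific part of the MODEL command frees them.

```mplus
VARIABLE:
  NAMES = y1 y2 y3;      ! hypothetical indicator names
  CLASSES = c(3);        ! hypothetical number of classes
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 1000 250;
MODEL:
  %OVERALL%
  ! Variances are equal across classes unless mentioned below.
  %c#1%
  y1-y3;                 ! frees the variances in class 1
  %c#2%
  y1-y3;                 ! frees the variances in class 2
  %c#3%
  y1-y3;                 ! frees the variances in class 3
```

Dropping the three class-specific statements returns the model to equal variances across classes, which is the simpler specification.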

Chris Giebe posted on Thursday, March 21, 2019 - 5:48 am

Hi,

I would like to model a mediation using the Model Indirect function, but am receiving the following error:

*** ERROR
MODEL INDIRECT is not available for analysis with ESTIMATOR=BAYES.

I'm using Version 7.3. Is this option not available for my version?

Bengt O. Muthen posted on Friday, March 22, 2019 - 2:07 pm

This is available in our current version of the program.

Alice Wickersham posted on Tuesday, February 25, 2020 - 3:00 am

Regarding the above advice to make the second number a quarter of the first number when specifying initial stage starts and final stage optimizations (e.g. STARTS=200 50): is this a general rule of thumb? Or should deciding what to make these two numbers be informed by something? I've seen many instances where the second number is not a quarter of the first (e.g. 100 20, 200 20, 200 40), and I'm curious to know whether these can result in improper solutions in a way I haven't thought of.

Bengt O. Muthen posted on Tuesday, February 25, 2020 - 2:44 pm

It is a very rough rule of thumb. There is nothing in any specific modeling that supports that ratio. The choice of ratio cannot result in improper solutions. That is more related to the number of starts: too small a number may lead to a poor local solution.