Anonymous posted on Tuesday, July 06, 2004  2:06 pm



Hello, I am writing a paper in which I estimated a mixture model using the randomization procedure new to Mplus 3.0. I want to describe specifically how the start values are generated, but I cannot find documentation either in the new manual or in the online technical appendix. Can you give me some information regarding how this is done? Thank you.


We have not yet written up all of the technical information related to Version 3. As we do, we will add it to the Technical Appendices on the website. If you look under STARTS in the Index of the Mplus User's Guide, there is a brief description. 

Anonymous posted on Friday, July 23, 2004  7:19 am



I really appreciate the addition of random starts to Mplus. A couple of questions so that I understand how to use this better: In the manual it indicates that random starts are random perturbations around the user-specified or automatic start values for all parameters except variances and covariances. So are variances and covariances held constant at the user-specified or automatic values? Also, the default for STSCALE is 5. Does that control the dispersion of the random perturbations? Is it the width of a uniform distribution, or is the metric in SDs of a normal distribution, or something else? 5 is indicated to be a medium value, but I don't understand the scale.

Anonymous posted on Friday, July 23, 2004  10:11 pm



The variances and covariances will get the same starting values across the different perturbation runs. STSCALE controls the dispersion of the random perturbations by multiplying a perturbation in the SD metric by a uniform distribution with that width. The rule of thumb is that if the default scale doesn't produce enough diverse solutions you would increase STSCALE; if it produces too many improper solutions, like classes collapsing and singular variances, you would decrease it. You can also get the perturbed starting values with OPTSEED and MITER=1. Tihomir
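
As a rough sketch of the rule of thumb above (not an official example), the ANALYSIS command might be adjusted as follows. STSCALE and STARTS are the options discussed in this thread; the specific values are assumptions chosen only for illustration.

```
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 100 10;   ! illustrative: 100 initial-stage starts, 10 final-stage optimizations
  STSCALE = 2;       ! illustrative: below the default of 5, for when too many
                     ! perturbed runs end in collapsed classes or singular variances
```

If the runs instead all land on nearly the same loglikelihood, the same fragment with a value above 5 would widen the perturbations.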

Anonymous posted on Friday, November 18, 2005  6:42 am



Hi, I am trying to use the OPTSEED option to specify a start seed for a latent class analysis, but I cannot get Mplus to run more than the default 500 iterations. I've tried using MITERATIONS = 2000, but this doesn't seem to work with the OPTSEED option. Could you please help? Thank you.


Please send your input, data, output, and license number to support@statmodel.com so we can see what is happening. Also, include the output where you found the seed that you are using. 

anonymous posted on Thursday, March 02, 2006  5:16 am



I have a question with regard to starting values. I have been running LCA with different sets of starting values in order to examine whether there exist different local maxima. I am a little unsure how to evaluate the TECH8 output. Do I simply check the column labelled 'loglikelihood at local maxima', and then examine whether there are vast differences between the values? In all my runs (with different sets of starting values), the loglikelihood values in that column are almost identical, and the estimated loglikelihood value listed along with the fit statistics is the same across all runs. Does this mean I can be confident in the obtained solution, in that it is not reaching too many different local maxima?


I'm not sure if you are changing the starting values yourself. It sounds that way given that you are looking at TECH8 for each solution. You can use the STARTS option, which will randomly generate sets of starting values. This option and related options are described in the user's guide on pages 436-438. On pages 325-328, you will find a description of how to know if you have found a good solution. Note that the new user's guide is available online.


I have a question concerning the Mplus output with regard to random starts. I noticed that while Mplus versions 3 and 4 always provided the loglikelihood values, seeds, and initial stage start numbers for *all* sets of starting values (initial + final stage of the optimization), Mplus 5 provides this information only for the final stage of the optimization. Is there a way to make the initial stage starting value information available in Mplus 5 in addition to the final stage values (other than TECH8)? (I am teaching LCA with Mplus 5 and for didactic reasons, it would be nice to have both sets in the output to illustrate what is done in order to avoid local maxima.) Thank you very much in advance! 


We decided to delete the initial starts because with faster computers and more complex models many starts are being used and the output is lengthy. Plus we typically didn't look at the values of the initial starts. For instance, the final set of values tells us how many fewer random starts we might have been able to get away with. There is not a way to make them available in Version 5. So the pedagogical presentation would have to draw on displays outside the output. 


Professor(s) Muthen, I am running an "LCA with covariates" model. The model estimation terminated normally. However, I got the following warning: "WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS." I tried four different STARTS settings: (500 10), (1000 10), (500 20), (1000 20), with STITERATIONS=20; however, I keep getting the same warning. With (1000 10), the loglikelihood values at local maxima, seeds, and initial stage start numbers are the following:
  1076.170  863691  481
  1083.083  366533  484
  1083.703  928287  197
  1085.223  349263  263
  1085.545   61587  400
  1086.163  352277   42
  1090.509  479273  156
  1091.846  275475  413
  1094.520  888905  444
  1100.841  879211  453
Q1. How diverse are these likelihood values? Can I submit the result claiming that we reached a global rather than a local maximum? If not, how should I proceed? Q2. How should I use the "OPTSEED" option and analyze the result? Could you suggest an example? I am using Mplus 4. Thanks and regards


There have been so many changes between Version 4 and Version 5.1 that it is difficult to say what is happening. I recommend getting Version 5.1. 


Hello, Regarding "Anonymous posted on Friday, July 23, 2004  10:11 pm", what does "diverse solutions" mean in the sentence "the default scale doesn't produce enough diverse solutions"? Diverse seeds? Diverse LL? Or something else? Thanks!!


Diverse LL values, which are a function of the diversity of the starting values.

Ruixue Wang posted on Tuesday, February 15, 2011  12:37 pm



Hi, I have a couple of questions about TECH8 and starting values.
1. In the output:
RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES
Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:
  4545.804  569131  26
  4545.804  608496   4
Here I know the final stage uses start numbers 26 and 4 from the initial stage. But what are seeds, and how does Mplus generate them? Can I use the seeds to identify the exact starting values?
2. In TECH8:
TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 1
  ITER  LOGLIKELIHOOD    ABS CHANGE    REL CHANGE  CLASS COUNTS      ALGORITHM
     1  0.60111874D+04     0.0000000    0.0000000  339.588  160.412  EM
     2  0.45545041D+04  1456.6833360    0.2423287  337.311  162.689  EM
     3  0.45536646D+04     0.8394853    0.0001843  335.247  164.753  EM
How can I find the starting values in set 1? Are they random values? Can I specify the starting values? There are 3 iterations. Does Mplus run only 3 iterations, or does it stop when the absolute change is small enough?


The seed is a random variable that determines what the starting values will be. So for your first LL value of 4545.804 the seed is 569131. You can use this seed in a new run to get those starting values, saying OPTSEED = 569131; You can see the starting values in the Tech1 output. Mplus runs iterations until the first-order derivatives are small enough.
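
A minimal sketch of such a follow-up run, assuming the rest of the input (DATA, VARIABLE, MODEL) is unchanged from the run that reported the seed:

```
ANALYSIS:
  TYPE = MIXTURE;
  OPTSEED = 569131;   ! seed printed next to the best loglikelihood in the earlier run
OUTPUT:
  TECH1 TECH8;        ! TECH1 shows the starting values, TECH8 the iteration history
```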

Ruixue Wang posted on Tuesday, February 15, 2011  2:22 pm



Thank you. Is it possible to decide the start values myself?


Yes. Just give them and say STARTS=0. 

Ruixue Wang posted on Tuesday, February 15, 2011  2:48 pm



Sorry, I don't understand. I mean how to set the start values from the very beginning. Do you mean use STARTS=0, and then give the start values (such as 39 15 3) to something? To what?


No, you give start values for the parameters in the usual way, such as MODEL: [i*1.5]; 
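
To make this concrete, here is a hedged sketch for a two-class growth mixture model. The variable names y1-y4, the growth specification, and the second-class value are assumptions for illustration; only STARTS = 0 and [i*1.5] come from the replies above. It also assumes CLASSES = c(2) is declared in the VARIABLE command.

```
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 0;                    ! no random starts; only the values given in MODEL are used
MODEL:
  %OVERALL%
  i s | y1@0 y2@1 y3@2 y4@3;     ! hypothetical linear growth model
  %c#1%
  [i*1.5];                       ! starting value for the intercept mean in class 1
  %c#2%
  [i*3];                         ! hypothetical starting value for class 2
```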

Ruixue Wang posted on Tuesday, February 15, 2011  4:41 pm



Thanks. If I want to set up several start values, is it [i*1.5 1.6]; or [i*1.5]; [i*1.6];? Can you tell me which part of the user's guide introduces this concept of random or fixed start values?


You can give only one starting value for a parameter. 


Dear Bengt and Linda, I have been trying to compare a 2 versus a 3 class mixture model. The 2 class model converged appropriately, showed a repetition of the best log likelihood value, had lower BIC, and had a significant bootstrapped likelihood ratio test, all favoring the 2 versus 1 class model. To attempt to obtain a repetition of the best log likelihood value for the 3 class model, I increased the initial random starts to 1000 and final stage optimizations to 10. I also increased the initial stage iterations to 20. This still did not result in a repetition of the best log likelihood value. Therefore, I followed the suggestion in the user guide and used the OPTSEED option to examine the parameter estimates in the model solutions. These model solutions showed different estimates for each seed. The Mplus user guide suggests that this indicates that the model is not well-defined, possibly due to there being too many classes. Are there any other steps or recommendations that you would have for trying to achieve a replicated best log likelihood value before I conclude that the 3 class model is not well-defined and go with the 2 class model?


The final stage optimizations should be about one fourth of the random starts. I would try 1000 250 or even 2000 500. 
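
Written out as input, the suggestion might look like the sketch below. STITERATIONS = 20 is carried over from the question; it is not part of the recommendation itself.

```
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 2000 500;     ! initial-stage starts, then final-stage optimizations (about 4 to 1)
  STITERATIONS = 20;     ! initial-stage iterations, as set in the question
```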


Linda, thank you for your reply. I was able to achieve model replication by trying your suggestion of increasing the starts to 2000 and final stage optimizations to 500. I will keep this ratio in mind for future analyses.

Stata posted on Thursday, March 08, 2012  7:06 pm



Dear Bengt and Linda, I am trying to follow Example 7.6 for my study. 1) How should I determine starting values? In addition, the example in the manual assigns negative values to some variables but not others. I can't find further explanation about this in the manual. 2) Do Mplus automatic starting values with random starts take care of the problem associated with converging on local solutions? Thank you.


1. This example uses starting values but not random starts. This would usually occur when an analysis is in the final stages and one does not want a lot of random starts, so one uses starting values and no random starts to speed things up. You would take the starting values from the analysis. 2. Using random starts and the Mplus default starting values will help to avoid local solutions.

Stata posted on Friday, March 09, 2012  7:47 pm



Linda, Thank you very much for this prompt reply. 

Stata posted on Saturday, March 10, 2012  9:05 am



Dear Professor Muthens, I previously did not mention that there are 60 variables with 4-point, 3-point, and binary indicators in my study. When I used default starting values with random starts, I got the following message: WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS. I am really confused about choosing threshold starting values. The manual uses 1 and 1 for binary indicators; 0.5 and 1, 0.5 and 0 for three-category indicators. I am not sure whether I have assigned correct threshold starting values to my 9 indicators with a 4-point scale:
  [u1-u5$1*0.33 u6-u9$1*0.33];
  [u1-u5$2*0.66 u6-u9$2*0];
  [u1-u5$3*1 u6-u9$3*0.33];
Thank you


If you do not have knowledge of what the starting values should be, use the default starting values. The message means you should increase the random starts using the STARTS option, for example, STARTS = 1000 250; 

Stata posted on Saturday, March 10, 2012  9:49 am



Thank you. 

Stata posted on Saturday, March 10, 2012  10:57 am



I got an error message after adding STARTS = 500 125;
  *** ERROR in VARIABLE command
  Unknown option: STARTS
I tried different ways to get rid of the problem, but it kept showing the same error message.


STARTS belongs in the ANALYSIS command. 
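
A sketch of the corrected layout. The variable names and the number of classes are assumptions for illustration; only the placement of STARTS is the point.

```
VARIABLE:
  NAMES = u1-u60;          ! assumed names for the 60 indicators
  CATEGORICAL = u1-u60;
  CLASSES = c(3);          ! assumed number of classes
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 500 125;        ! placed under ANALYSIS, not VARIABLE
```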

Julia Lee posted on Tuesday, April 17, 2012  7:30 am



I am running an LTA with 5 classes and I am using STARTS = 800 40; The output did not include the final stage loglikelihood values. Is it because the model estimation did not terminate normally? I did not see the message about normal termination. From covariance, the output jumped straight to model fit information. Thank you for your advice.


I would need to see the output to understand why this happened. Please send it and your license number to support@statmodel.com. 


Hello, I am testing several latent class mixture models with categorical indicators for 2 latent factors (u1-u21 have 3 categories and u22-u36 have 4 categories). Sample setup code for allowing the variances and covariance to vary across classes looks like this:
  Analysis:
    Type=mixture; algorithm=integration; integ=7; estimator=mlr;
    starts = 1000 250;
  Model:
    %overall%
    f1 by u1-u21;
    f2 by u22-u36;
    [f1-f2@0];
    %c#1%
    f1-f2;
    f1 with f2;
    %c#2%
    f1-f2;
    f1 with f2;
    %c#3%
    f1-f2;
    f1 with f2;
I have had success with having the means vary (variances and covariance invariant across classes), but I am trying to determine whether there is a way to use my start values from the basic latent class model to speed up the analysis. I know the SVALUES option gives this information, and I have used it previously to change my reference class. Are these start values usable for the more complex variations? Thanks for any help!


You can use them, but I don't think they are useful here because your solution with free factor covariance matrices may be quite different. But are you holding the factor means at zero and the thresholds invariant across classes? That seems very restrictive.


Thanks for the quick response. I had a feeling that would be the case. I am looking at different variations; my next model frees factor loadings, thresholds, and variances. I was just concerned because with the current model, 4 classes is taking upwards of 8-10 hrs, and I assume the fewer restrictions I place, the longer the model will take.


Dr. Muthen, just to amend/update my previous post (and to thank you): part of my 10 hr run was a technical problem, I just realized. Also, per your comment on being too restrictive, I started to think that was the reason the loglikelihoods were not replicating even with max random starts. I allowed the thresholds and means to vary, and the model completed with the LL replicated within 3 hours for 4 classes. Does this just mean my data do not fit the more restricted models? And should I report that the LL values were not replicated in the restricted models? Thank you!!


Note that when you allow the thresholds to vary across the classes, the factor means must be fixed at zero in all classes for the model to be identified. I think your earlier model had the thresholds equal across classes and the factor means fixed at zero for all classes. Such a model does not seem meaningful so should not be considered. 

