Unperturbed starting values
Mplus Discussion > Latent Variable Mixture Modeling >
 Bruce A. Cooper posted on Thursday, May 22, 2008 - 4:19 pm
Sorry, but perhaps I've answered my own question -- I just re-ran my model setting STARTS = 0 with the same starting values I'd used, specified 100 iterations, and got the identical results. So, I take it that in addition to running the # of models with random seeds, Mplus also runs one model with the specified starting values, and that one happened to give me the smallest LL? If so, what does it mean that that one is not replicated even with as many as 300 random starts?
Thanks,
Bruce
 Bengt O. Muthen posted on Thursday, May 22, 2008 - 5:32 pm
Yes on your 1st question.

Your 2nd question is hard to answer without seeing the problem run. Sometimes the model is just too complex for the data and doesn't replicate - a simpler model should then be chosen. Sometimes there are messages that not all perturbed runs converged, in which case switching to stscale=1 might help. Sometimes more numerical precision is needed with numerical integration and it helps to increase the number of integration points.
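The remedies mentioned above correspond to ANALYSIS command options along these lines (the option names are standard Mplus; the particular values shown are illustrative only, not recommendations):

```
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 500 50;      ! initial-stage / final-stage random starts
  STSCALE = 1;          ! gentler perturbation of starting values (default is 5)
  STITERATIONS = 40;    ! more iterations per initial-stage start (default is 10)
  INTEGRATION = 30;     ! more integration points when numerical
                        ! integration is used (default is 15)
```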
 Bruce A. Cooper posted on Thursday, May 22, 2008 - 6:14 pm
Thanks so much for your quick reply, Bengt! I think my first posting didn't get posted because I failed to click the correct button after reviewing it, but the results that generated my question were that I got the lowest LL = -3873.278 with the unperturbed starting values, then (asking for STARTS = 300 20, STITERATIONS = 40) got 19 subsequent solutions with the identical LL = -3878.458. So, the solution did replicate for all the models with random seeds, but the one with the unperturbed starting values was smaller. (Only 2 out of 300 perturbed starting values did not converge in the initial stage.) I'll try STSCALE=1 and then numerical integration if that doesn't work. Best, Bruce
 Linda K. Muthen posted on Friday, May 23, 2008 - 9:33 am
Please send your input, data, output, and license number to support@statmodel.com.
 nina chien posted on Friday, June 27, 2008 - 3:23 pm
In running an LCA with 600 10 random starts, the largest loglikelihood is always the unperturbed starting value (and of course is not replicated). The next 9 best loglikelihoods are identical. Does this indicate the model is unreliable?

Thanks,
nina
 Linda K. Muthen posted on Friday, June 27, 2008 - 3:54 pm
You may need more random starts. You should replicate the best loglikelihood. If more random starts do not help, please send your input, data, output, and license number to support@statmodel.com.
 Bruce A. Cooper posted on Wednesday, July 02, 2008 - 5:00 pm
Thank you for your prompt reply way back last May, Linda. Rather than burden you with unnecessary work, I decided to do more reading and more analyses before bothering you. So, I was able to get the smallest LL of -3874.155 to repeat -- twice in addition to the unperturbed solution, with

STARTS = 1500 50;
STITERATIONS = 100;

The two next smallest LL = -3877.816, then the next 45 were LL = -3878.458. Although the solution with the smallest LL is interpretable, such results raise the question of how trustworthy a solution is with only 3/1500 replications. Doesn't that suggest a local maximum even so? Wouldn't the LL with so many solutions be more reliable?
 Linda K. Muthen posted on Wednesday, July 02, 2008 - 5:35 pm
What do you mean by 3/1500 replications? Is this a Monte Carlo study?
 Bruce A. Cooper posted on Wednesday, July 02, 2008 - 6:02 pm
Thanks for your fast response!

Sorry, wrong term. I mean I got the same smallest LL in only 3 out of 1500 starts, with 2 being random starts and the 3rd being with unperturbed starting values -- all with starting values I specified based on a prior model. This was for a 4-class GMM. (Is that the right way to say it?)
 Linda K. Muthen posted on Wednesday, July 02, 2008 - 6:06 pm
If the largest loglikelihood is replicated (note the values are negative), you should be fine. I would not provide starting values. Try it without starting values.
 Bruce A. Cooper posted on Wednesday, July 02, 2008 - 6:16 pm
Right -- smallest in absolute value, but largest among the negative numbers (being closest to zero)! I'll do it again without starting values but with 1500 random starts and see if I get the same results! Thanks!
Bruce
 Bruce A. Cooper posted on Wednesday, July 09, 2008 - 12:56 pm
I ran the model as you suggested without starting values, allowing intercept and linear slope variances to differ across classes. (I set quadratic slope variances to zero for the model.) I used 5000 random starts and got the same largest LL AND class assignments as for only 100 random starts, and in both models got the same warning about the covariance matrix in one class being non-positive definite. This is not the smallest class, but it is small (C4 n=17; the other three classes are C1=215, C2=8, and C3=13). This sort of problem is why I specified starting values before, and solved the problem with the one class by setting slope variance = 0, which seemed reasonable given the tests for the slope variances -- all had p-values < 0.06 except for the problem class, for which the p-value was 0.945.

I want to be able to go to the next step and test whether 4 is better than 3 classes (the plots look interesting and plausible) but can't without getting the model to fit and replicate. It seems to me that the best next step is to use the estimates for the means and variances from this model as starting values, specifying the slope variance as zero for the problem class. However, that's what I did before that led to my question on July 2nd. The model fit then, but with only a very small number of replicated largest LL even with 1500 starts -- only two in addition to the unperturbed, specified starting values.

Thanks!
 Bengt O. Muthen posted on Wednesday, July 09, 2008 - 1:30 pm
I would first run with the default of class-invariant variances. Then plot the model-estimated mean curve together with the observed individual curves classified into most likely class to see if there is a need for class-specific variance for a class. If so, free say the intercept variance only for that class. Models with all variances class varying can be hard to replicate.
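The setup described above might be sketched as follows; the class label and variable names are made up for illustration, and the key point is that growth factor variances are class-invariant by default under TYPE = MIXTURE, so mentioning a variance in a class-specific part frees it for that class only:

```
MODEL:
  %OVERALL%
  i s | y1@0 y2@1 y3@2 y4@3;   ! growth factors; variances held equal
                               ! across classes by default

  %c#3%
  i;                           ! free the intercept variance in class 3 only
```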

Another trick is to force variances to be greater than zero. This is done by labeling a variance parameter such as the random slope variance:

s (s);

and then use

Model constraint:

s>0;
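Put together, the variance-labeling trick above looks something like this in a full input (the growth model and variable names are illustrative; the essential pieces are the parameter label and the MODEL CONSTRAINT inequality):

```
MODEL:
  %OVERALL%
  i s | y1@0 y2@1 y3@2 y4@3;
  s (vs);              ! label the slope variance parameter

MODEL CONSTRAINT:
  vs > 0;              ! keep the slope variance strictly positive
```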
 Bruce A. Cooper posted on Wednesday, July 09, 2008 - 2:35 pm
Thanks very much, Bengt. This is very helpful. But in the interest of time -- assuming you have more time for a response! -- I understand that you are suggesting that I use a GMM with i, s & q variances the same across classes (the default model), then examine the actual individual (spaghetti) plots against the mean plots for the classes. If any look like there is within-class variation not accounted for by the default model, then free up the intercept variances, one at a time, as suggested by the empirical vs. estimated mean plots, and examine fit for improvement in the less constrained model. Stop when model fit doesn't improve much -- say -- using the BIC and/or SABIC, and perhaps the CFI and RMSEA?

Also, thanks for the note about how to prevent negative slope variances (a type of Heywood case, right?) if nothing else seems to work. Although I assume that, as with other latent variable models, this strategy would not be as good as fixing a problem leading to the negative variance?
 Bengt O. Muthen posted on Thursday, July 10, 2008 - 8:56 am
Regarding your first paragraph - the idea is to see if one class has less individual scatter than the rest. The rest of your statement is ok.

Right.
 Bruce A. Cooper posted on Monday, July 14, 2008 - 2:54 pm
Thanks very much. I've been working on the problematic data set, starting over with the default model and working my way up through the class numbers to find the model that fits best. The individual spaghetti plot variations around the class mean plots do not look that different in the amount of variation across classes.

I have another question now about selecting the number of classes, but I think that should be in a different thread.

Thanks for all your help with this set of questions!
Bruce
 Jungeun Lee posted on Tuesday, June 23, 2009 - 3:06 pm
Hello!

I am running a LCA with count variables. When I increased # of classes=5, I got the following error.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY. ESTIMATES CANNOT BE TRUSTED.

I also got the following errors.

97 perturbed starting value run(s) did not converge in the initial stage
optimizations.

Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:

100 perturbed starting value run(s) did not converge.

Given my STARTS = 5000 100, it basically tells me that none of the perturbed starting value runs [from the final stage] converged.

In this case, can I say that the 5 class solution model is too much for the data and then the simpler model [# of classes=4] should be chosen?

Thanks!!
 Bengt O. Muthen posted on Tuesday, June 23, 2009 - 3:28 pm
It probably means that the data cannot support as many as 5 classes. But it doesn't mean that the 4-class model should necessarily be chosen. For example, if a model different from LCA is the right model, then reducing the number of classes is not sufficient. As one example of a model alternative, factor mixture modeling may be used, adding a factor to the LCA that explains correlations among items within class. Our web site has several papers on FMM.
 Jungeun Lee posted on Tuesday, June 23, 2009 - 4:44 pm
Thank you for a quick response!! Just to be clear... the 4 class model is the best fitting model. Does your answer hold the same with this new piece of info?
 Bengt O. Muthen posted on Tuesday, June 23, 2009 - 5:18 pm
Yes. That's what I understood.
 mpn posted on Tuesday, July 06, 2010 - 2:45 am
Dear Bengt and Linda,

I am running an LCA with covariates, using 500+ starts. The lowest LL is consistently replicated but with one of the seeds 'unperturbed'. I cannot find an explanation for this in the User Guide. Could you please suggest a reference, and advise if the following model is acceptable?

Regards,

Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:

-4213.884 596257 405
-4213.884 unperturbed 0
-4301.005 140849 515
-4302.228 140987 1093
-4302.228 211281 292
-4302.228 779009 1458
-4375.415 50887 389
-4377.010 928287 197

THE MODEL ESTIMATION TERMINATED NORMALLY
 Linda K. Muthen posted on Tuesday, July 06, 2010 - 8:10 am
The model is estimated once with the default starting values. Then these values are randomly perturbed to obtain other sets of starting values.
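Once a solution has been judged trustworthy, the seed printed next to its loglikelihood can be reused so a later run goes straight to that optimization. A sketch using the seed 596257 from the output above:

```
ANALYSIS:
  TYPE = MIXTURE;
  OPTSEED = 596257;   ! rerun only the optimization that produced the best
                      ! loglikelihood; random starts are then not needed
```

This is useful for getting plots or saved class assignments from the best solution without repeating the full random-starts search.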
 Yuchun Peng posted on Tuesday, February 15, 2011 - 8:12 am
Hi Bengt and Linda,

I am running a 2-cluster LCGA model with 11 binary variables. The warning message I get is '60 perturbed starting value run(s) did not converge', regardless of the starting values and STITERATIONS used.

Although the highest likelihood replicated many times.

I tried to solve this problem by specifying the scale of the random perturbation to 2 and it worked.

Is there anything I need to be aware when specifying a low level scale of perturbation?

Thanks
Vicky
 Bengt O. Muthen posted on Tuesday, February 15, 2011 - 11:56 am
By 2-cluster LCGA I think you mean an analysis with 2 latent class variables, rather than a 2-level model with only 2 clusters. Models with more than one latent class variable sometimes need a more gentle perturbation of the random starts to not get wild starting values. You may want to experiment with also using a little larger value between 2 and the default 5 to make sure you don't find a better loglikelihood. Other than that, it should be fine.
 Melissa Kimber posted on Tuesday, February 21, 2012 - 8:39 am
Hello,
I am running exploratory LPA right now and have started with 3 classes.
The following is the input

CLASSES = c (3);
Analysis: TYPE = MIXTURE;

I got a number of error messages including the following.

1. Final stage loglikelihood values at local maxima, seeds & initial stage start numbers.

2. Unperturbed starting value run did not converge

3. Best loglikelihood not replicated.

4. One or more parameters fixed to avoid singularity

5. Standard errors of model parameter estimates may not be trustworthy.

6. Entropy is .925

The continuous indicators are decimal numbers because they are proportions. Do you recommend rescaling the numbers for easier estimation? (For example, one indicator is the proportion of lone mothers in the individual's neighbourhood, which is being combined with other neighbourhood variables to run in the LPA for a latent variable of community adversity.)
Any suggestions you have would be helpful.
***Melissa
 Linda K. Muthen posted on Tuesday, February 21, 2012 - 8:50 am
Please send your output and license number to support@statmodel.com.
 Melissa Kimber posted on Tuesday, February 21, 2012 - 9:12 am
Hi Dr. Muthen,
I cannot send my output given that it is government protected data. Do you have any other suggestions?
 Linda K. Muthen posted on Tuesday, February 21, 2012 - 1:45 pm
You can increase the number of starts to replicate the best loglikelihood. I would need to see the output to comment on the identification message.
 Melissa Kimber posted on Friday, February 24, 2012 - 5:59 am
That worked! I had to increase it by a lot, but it worked! Thank you very much.
 John Woo posted on Monday, August 17, 2015 - 2:23 pm
Hi, a beginner-level question here.. The best LL replicated but there were some starting values that did not converge. Is this a problem? I ask because I ran Type=imputation and saw in Tech 9 this message:
"Errors for replication with data file ice1.dat:
THE MODEL ESTIMATION TERMINATED NORMALLY"

When I ran the model using this particular imputed data (ice1.dat), I found that the best LL is replicated but some starting values did not converge. How serious a problem is this? [I used Start = 1000 10]

Thank you in advance for your help.
 John Woo posted on Monday, August 17, 2015 - 2:32 pm
To be more specific (re above), the best LL includes the unperturbed starting value. And only some of the perturbed starting values did not converge. [I see Dr. Muthen's answer for a similar question, but I am still a bit unclear.] Thank you.
 Linda K. Muthen posted on Monday, August 17, 2015 - 2:39 pm
As long as the best loglikelihood is replicated, you are fine.

Please keep your post to one window.
 Amy Syvertsen posted on Tuesday, June 26, 2018 - 11:47 am
We are working on a 4-class LCA model with covariates and a distal outcome using 20 multiply imputed datasets. The main problem is that as we increase the number of random starts we receive the following messages (or messages similar to these, depending on the number of random starts - this is for STARTS = 5000 1300):

2090 perturbed starting value run(s) did not converge in the initial stage optimizations.
[List of LLs truncated.]
Unperturbed starting value run did not converge.
607 perturbed starting value run(s) did not converge.
THE BEST LOGLIKELIHOOD VALUE HAS BEEN REPLICATED. RERUN WITH AT LEAST TWICE THE
RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED.

What does the lack of convergence for some perturbed starting values in the initial and final stages suggest? What about the unperturbed starting value run that did not converge?
 Bengt O. Muthen posted on Tuesday, June 26, 2018 - 3:26 pm
This means that the model is hard to estimate. The data have little information on the parameters of the 4-class model. Perhaps you are trying to extract too many classes, or the model is too flexible.
 Alice Wickersham posted on Thursday, February 13, 2020 - 5:10 am
I have read the above threads but am still trying to get my head around the extent to which perturbed starting value runs represent a problem with the modelling. I am running a 4-class GMM, and receive the following message when STARTS=100 20:

"2 perturbed starting value run(s) did not converge or were rejected in the third stage."

When I increase the number of starts, the number of non-converging perturbed starting value runs also increases, and I still get some with STSCALE=1, STSCALE=2, STSCALE=3 or STSCALE=4. However, the same best log-likelihood value is consistently replicated with more starts, irrespective of the STSCALE I specify. With that in mind, do the non-converging perturbed starting value runs indicate a problem with the model which should preclude me from accepting that solution, or is it sufficient that my best log-likelihood value appears to be fairly robust?

Many thanks!
 Bengt O. Muthen posted on Thursday, February 13, 2020 - 2:21 pm
It is ok if some perturbed starting value sets don't converge - as long as you get a couple of replications of the best loglikelihood value.
 Monita Karmakar posted on Friday, March 13, 2020 - 12:06 pm
Hello,

I am working on a 4-class parallel process LCGA using 3 measures of health of which one is categorical (5 values) and two are count variables. I am using a zero-inflated count model for the two count variables. I have used START = 2000 200.

In my output, I have the following error messages:


1196 perturbed starting value run(s) did not converge or were rejected in the third stage.
Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:

[log likelihood list omitted]

183 perturbed starting value run(s) did not converge or were rejected in the third stage.

THE BEST LOGLIKELIHOOD VALUE HAS BEEN REPLICATED. RERUN WITH AT LEAST TWICE THE
RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED.

THE MODEL ESTIMATION TERMINATED NORMALLY


My understanding from some of the posts above is that since my log likelihood was replicated, my model should be okay. Am I safe in assuming that?

Thanks in advance.

Monita
 Bengt O. Muthen posted on Saturday, March 14, 2020 - 3:01 pm
Sounds like the model is difficult to estimate given the large number of non-convergences. This could be a sign that the model isn't the best for the data. But if you have the best loglikelihood replicated a couple of times, the results should be trustworthy.
 Monita Karmakar posted on Saturday, March 14, 2020 - 3:13 pm
Thank you for the quick response. I have a follow-up question. I am trying to replicate the model on a virtual desktop to access some restricted data for the second part of the analysis, where I will be using the class membership to predict a distal outcome. This virtual desktop uses an earlier version of the software (v8.2), while I used v8.4. However, the model will not converge; the output says the loglikelihood was not replicated and no loglikelihood estimates are reported. Could this be because of the difference in versions?
 Bengt O. Muthen posted on Saturday, March 14, 2020 - 3:26 pm
Don't think so. We need to see the output to guide you - send to Support along with your license number.