abc posted on Tuesday, September 22, 2015 - 6:59 am
Dear all, I'm trying to analyze a model with two latent variables and their latent interaction predicting a nominal variable at a subsequent data collection. Due to the design, I have a huge amount of missingness in the indicators of my predictor variables. Since the data are missing by design, it might be okay to estimate the parameters for the whole sample nevertheless (I'm not decided yet; for now I'm just exploring the data). I understood that I have to use integration = montecarlo;
I reran my model on two different computers with different Mplus versions (7.2 and 7.31). Surprisingly, I got very different results. In the output I noticed that the default number of integration points differs (200 in Mplus 7.2 and 2000 in 7.31). When I specify the number of integration points in the input, the results are, of course, identical.
However, I'm now confused about how to decide how many integration points lead to trustworthy results. Since my results differ so much between the two defaults, I have no idea which results I should trust (is more always better?). Or maybe this indicates that my model is not trustworthy at all?
We did change the default between Versions 7.2 and 7.31 because most people now have more powerful computers, and more integration points is generally better. To be certain that you have reached a global solution, the best loglikelihood should be replicated, or you should rerun with more integration points to check that you get the same solution.
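The intuition behind "more integration points is generally better" can be seen in a toy example. Below is a minimal Python sketch of plain Monte Carlo integration (an illustration of the general principle, not Mplus's actual algorithm): the error of an average over random draws shrinks roughly as 1/sqrt(n), so more points give a more stable approximation of the integral.

```python
import random

def mc_integral(f, n_points, seed=0):
    """Approximate E[f(Z)] for Z ~ N(0, 1) by Monte Carlo:
    average f over n_points standard-normal draws."""
    rng = random.Random(seed)
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n_points)) / n_points

# E[Z^2] = 1 exactly; the Monte Carlo error shrinks roughly as 1/sqrt(n),
# so the approximation settles down as the number of points grows.
for n in (200, 2000, 20000):
    print(n, mc_integral(lambda z: z * z, n))
```

This is also why replicating the best loglikelihood matters: with too few points, the approximated loglikelihood itself is noisy, and two runs can land on different solutions.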
Please send both outputs and your license number to email@example.com so we can take a closer look at your situation.
abc posted on Wednesday, September 23, 2015 - 12:04 am
Thank you, I think checking the replication of the loglikelihood will help. If I keep having trouble, I'll send you the outputs and license numbers.
Samuli Helle posted on Wednesday, December 02, 2015 - 11:51 pm
I also have trouble deciding how many integration points to use. Most of the model parameters remain the same when I change the number of integration points, but the SE of one factor loading (all indicators of this factor are categorical) seems to depend heavily on the number of integration points (see below). Should I trust the solution that gives the smallest SE, or should I increase the number of integration points further? Please note that using e.g. 5,000 or 8,000 integration points results in a convergence failure.
#points   loglikelihood   est      se
6000      -10477.13       10.068   65.156
7000      -10477.292      13.304    8.221
9000      -10477.415      13.787   17.786
The loglikelihood is probably rather flat in the direction of that parameter. With an increased number of integration points, you could try to sharpen the convergence criterion to get a better logL. On the other hand, that parameter does not look significantly different from zero, so why chase it? Alternatively, you may want to fix it at zero.
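Why a flat loglikelihood makes the SE so unstable can be illustrated with a small Python sketch (using hypothetical quadratic loglikelihoods, not this model): the SE comes from the curvature of the loglikelihood, SE = 1/sqrt(-logL''), so a nearly flat direction has tiny curvature, and small numerical noise in the approximated logL (e.g. from Monte Carlo integration) translates into large swings in the SE.

```python
import math

def se_from_curvature(loglik, theta, h=1e-3):
    """Standard error from the observed information: SE = 1/sqrt(-logL'').
    The second derivative is approximated by central differences."""
    d2 = (loglik(theta + h) - 2.0 * loglik(theta) + loglik(theta - h)) / h**2
    return 1.0 / math.sqrt(-d2)

# A sharply curved (well-identified) direction vs. a nearly flat one:
sharp = lambda t: -0.5 * (t / 0.5) ** 2    # curvature -4      -> SE = 0.5
flat = lambda t: -0.5 * (t / 50.0) ** 2    # curvature -0.0004 -> SE = 50
print(se_from_curvature(sharp, 0.0))  # ~0.5
print(se_from_curvature(flat, 0.0))   # ~50.0
```

In the flat case the curvature is so small that even tiny perturbations of the loglikelihood values change the SE dramatically, which matches the jumpy SEs in the table above.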
I increased the number of integration points to 11,000 and the model converged much faster than with 9,000 (22 min vs. 1 h 13 min), although the logL stayed more or less the same (-10477.471). Now the estimate for this loading was clearly significant (b = 10.9, SE = 2.7). But when I increased the number of integration points further to 12,000, the running time again exceeded an hour, and the estimate (b = 35376.5) and its standard error (SE = 30345.3) became huge (the threshold parameters for this indicator were also huge, 18940.7), while the logL increased a bit to -10472.772.
I'm interested in this parameter because I expect it to have a significant loading. Can a high correlation with another indicator cause such instability in this parameter?