Problems computing standardized estimates
 Andrew Johnson posted on Thursday, January 30, 2020 - 8:28 pm
When receiving the error:

WARNING: PROBLEMS OCCURRED IN SEVERAL ITERATIONS IN THE COMPUTATION OF THE STANDARDIZED ESTIMATES FOR SEVERAL CLUSTERS.

At what point should I view the standardized results as untrustworthy? I'm running a bivariate RDSEM with a linear time effect (to account for non-stationarity), and 95/95 clusters are listed as having draws removed.

Is there a way to identify how many draws were removed? If only 1 out of 5,000 were removed, that seems like less of a concern than 4,000.

Model code:
VARIABLE:
NAMES = ID TREAT TIME Y2 Y1;
MISSING = ALL(-99);
CLUSTER = ID;
USEOBS = (TIME < 15);
USEVAR = Y2 Y1 TIME2;
LAGGED = Y2(1) Y1(1);
TINTERVAL = TIME(1);
WITHIN = TIME2;

DEFINE:
TIME2 = TIME;

ANALYSIS:
PROCESS = 8;
TYPE = TWOLEVEL RANDOM;
ESTIMATOR = BAYES;
BITER = (5000);
THIN = 10;
CHAINS = 4;

MODEL:
%WITHIN%
Y1 ON TIME2;
Y2 ON TIME2;

AR_Y2 | Y2^ ON Y2^1;
CL_Y2 | Y2^ ON Y1^1;

AR_Y1 | Y1^ ON Y1^1;
CL_Y1 | Y1^ ON Y2^1;

%BETWEEN%
Y2 Y1 AR_Y2 CL_Y2 AR_Y1 CL_Y1;
 Tihomir Asparouhov posted on Friday, January 31, 2020 - 12:22 pm
We don't yet provide that kind of additional information, but I think you can make some progress on the question. I would consider the standardized results untrustworthy if there is a large discrepancy between the observed and estimated within-level variances, i.e., I would compare the variance given by
OUTPUT: RESIDUAL (CLUSTER);
for the above model vs. the corresponding quantities for a model like this

%within%
v | y1;
%between%
y1;

Those should be close to the observed values. But if you have missing data, all bets are off: a time series model may provide unbiased estimates of the within-level variance, while the sample quantities (based on listwise deletion) could be biased.
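To make the comparison concrete, here is a minimal sketch (outside Mplus, in Python/pandas) of the "observed" side of that check: the pooled within-cluster sample variance, computed by demeaning each cluster and taking the variance of the centered values. The data frame and column names are hypothetical stand-ins for long-format data like the original; the number printed would be compared against the within-level variance from RESIDUAL (CLUSTER).

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: one row per cluster-by-time observation.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ID": np.repeat(np.arange(10), 12),
    "Y1": rng.normal(size=120),
})

# Pooled within-cluster sample variance: subtract each cluster's own mean,
# then take the variance of the centered values (dropna mimics listwise
# deletion, which is where the missing-data bias Tihomir mentions enters).
centered = df["Y1"] - df.groupby("ID")["Y1"].transform("mean")
within_var = centered.dropna().var(ddof=1)
print(within_var)  # compare against the model-estimated within-level variance
```

With complete data this sample quantity and the model estimate should roughly agree; with missing data they can legitimately diverge, which is the caveat above.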

The iterations that are deleted are those where the model becomes non-stationary. That usually happens for very short time series, where the AR posterior distribution is very wide, or for clusters where the AR coefficients are large.
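The short-T mechanism can be illustrated with a quick simulation (this is a frequentist analogue, not Mplus's Bayesian machinery): even for a cluster whose true AR(1) coefficient is a comfortably stationary 0.5, lag-1 regression estimates from only T = 12 observations are spread so widely that some land at or beyond the |phi| >= 1 stationarity boundary.

```python
import numpy as np

# Simulate many short AR(1) series with phi = 0.5 and T = 12, and estimate
# phi from each by no-intercept lag-1 least squares. The spread of the
# estimates stands in for the width of the per-cluster AR posterior.
rng = np.random.default_rng(1)
T, phi, n_sims = 12, 0.5, 5000
estimates = np.empty(n_sims)
for s in range(n_sims):
    y = np.empty(T)
    y[0] = rng.normal(scale=1 / np.sqrt(1 - phi**2))  # draw from stationary dist.
    for t in range(1, T):
        y[t] = phi * y[t - 1] + rng.normal()
    estimates[s] = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])

print(estimates.std())                   # large spread relative to phi itself
print(np.mean(np.abs(estimates) >= 1))   # share landing in the non-stationary region
```

With longer series the spread shrinks and essentially no estimates cross the boundary, which is why the warning is tied to short T rather than to N.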

It might also be useful to check with plots that Y1 and Y2 are not subject to non-linear trends in Time.
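Besides eyeballing plots, one simple numeric version of that check is to fit a quadratic in time to a cluster's series and look at the curvature term. The sketch below uses simulated data with a purely linear trend, so the quadratic coefficient comes out near zero; a clearly non-zero curvature on real data would suggest the linear TIME2 effect is not enough.

```python
import numpy as np

# One hypothetical cluster: T = 12 time points, linear trend plus noise.
rng = np.random.default_rng(2)
time = np.arange(12, dtype=float)
y = 0.3 * time + rng.normal(scale=0.5, size=12)

lin = np.polyfit(time, y, 1)    # [slope, intercept]
quad = np.polyfit(time, y, 2)   # [curvature, slope, intercept]
print(quad[0])  # near 0 here, since the simulated trend is linear
```

Running this per cluster (or on the pooled, cluster-demeaned data) gives a rough screen for curvature that the linear detrending would miss.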
 Andrew Johnson posted on Friday, January 31, 2020 - 8:06 pm
Thanks Tihomir! The observed and model-estimated variances do differ, but I also have missing data, so that's inconclusive. The data definitely have a linear trend over time, though.

The data have N=105 and T=12, so that might be causing the larger AR posteriors. Would a possible fix be to put tighter priors on these parameters?
 Tihomir Asparouhov posted on Saturday, February 01, 2020 - 12:50 pm
You won't be able to put priors on the random effects, just on their means and variances, and that won't be enough. But pretty much yes, T=12 is clearly the source of the issue. N is not very relevant.

I wouldn't worry about the message being printed at all. I would say that the standardized estimates we print are the best you can get with that data. Typically not many iterations are thrown away; I have never actually seen a case with more than 10%. If the process were non-stationary in 50% of the iterations, a more severe problem would occur and the model wouldn't converge.

You can also do observed standardization: standardize the variables within each cluster. You can do that with a couple of auxiliary runs and the CLUSTER_MEAN option in the DEFINE command.
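For readers preparing data outside Mplus, here is a sketch of what observed (per-cluster) standardization amounts to, in Python/pandas: center and scale each variable within each cluster using that cluster's own sample mean and SD. The CLUSTER_MEAN route in Mplus gives the centering piece; the scaling here is an external analogue, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data, as in the original model setup.
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "ID": np.repeat(np.arange(5), 12),
    "Y1": rng.normal(size=60),
    "Y2": rng.normal(size=60),
})

# Within each cluster, subtract the cluster mean and divide by the cluster SD.
for col in ["Y1", "Y2"]:
    g = df.groupby("ID")[col]
    df[col + "_std"] = (df[col] - g.transform("mean")) / g.transform("std")

# After this, every cluster has sample mean ~0 and sample SD ~1 on the
# standardized columns, so within-cluster coefficients are on a common scale.
cluster_means = df.groupby("ID")["Y1_std"].mean()
cluster_sds = df.groupby("ID")["Y1_std"].std()
print(cluster_means.abs().max(), cluster_sds.round(6).tolist())
```

The standardized columns could then be fed back into the same RDSEM model in an auxiliary run.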
 Andrew Johnson posted on Monday, February 03, 2020 - 2:57 am
Great, thanks again for the help Tihomir!