Anonymous posted on Thursday, January 20, 2005 - 12:18 pm
I have been working in Mplus on several two-level structural equation models, and sometimes the residual variance of my observed dependent variables is negative. The models fit well, but I am not sure how to interpret or fix these residual variances so that they are no longer negative. Please advise. Thank you.
BMuthen posted on Thursday, January 20, 2005 - 7:57 pm
If the negative residual variances are large, this is a sign that your model is not appropriate for your data and you need to change your model. If they are small, you may want to fix them to zero. Residual variances are often small on the between level of multilevel models.
matthew posted on Sunday, November 27, 2005 - 3:14 am
Hi, I am a new user of SEM and face a similar problem. What are the causes of a negative residual variance? I mean, how can an inappropriate model cause this problem without it showing up in the model-fit indices? I would like to ignore it (simply delete it). Is there any guideline or significance level for the value that would make it acceptable to do this?
There is too much to say on this topic, and checking books is helpful. In sum, reasons for negative variances include small sample size (so the estimate can be negative even if the population value is positive), model misspecification, and very skewed variables (floor effects). Also see my other answer today.
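As a rough illustration of how a negative estimate can arise, here is a minimal sketch. It assumes a one-factor model with three continuous indicators, which is just-identified, so the residual variance of the first indicator can be solved directly from the covariances as s11 - s12*s13/s23. The function name and the covariance values are made up for illustration and do not come from any poster's data.

```python
def residual_variance_first_indicator(s11, s12, s13, s23):
    """Residual variance of indicator 1 in a just-identified one-factor,
    three-indicator model: total variance minus the factor-explained part."""
    explained = s12 * s13 / s23  # equals lambda1^2 * psi in this model
    return s11 - explained

# Hypothetical sample covariances: indicator 1 covaries strongly with
# indicators 2 and 3 relative to how strongly 2 and 3 covary with each other.
theta1 = residual_variance_first_indicator(s11=1.0, s12=0.8, s13=0.8, s23=0.6)
print(f"estimated residual variance: {theta1:.4f}")
```

With these made-up numbers the explained part (about 1.067) exceeds the total variance of 1.0, so the estimate is slightly negative. Sampling error in a small sample or a misspecified model can produce this kind of pattern.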
Could you advise me on the following problem, please?
Using CFA, I established discriminant validity between 6 latent variables (x1-x6). Then using SEM, I regressed y1, a latent variable with 3 continuous indicators on x1-x6. The model fit very well.
However, on addition of a final path, regressing u1, a binary observed variable, on y1, I received a warning that the residual covariance matrix is not positive definite. On inspection of the results, I realised there was a negative residual variance on one of the indicators of y1, and the model fit indices are very poor. What should I do? I tried dichotomising the indicators of y1, as they are negatively skewed, but the model fit is worse.
I found a negative error variance for one of my ordinal outcomes (estimator WLSMV), and I'd like to test the inequality H0: error variance is greater than or equal to zero using the Wald test. Is this possible in Mplus?
I don't know what your model is, but unless you have a longitudinal or multi-group situation, the residual variances for ordinal outcomes are not free parameters. They are printed as remainders when requesting a standardized solution - perhaps that's where you see the negative value. So since they are functions of other parameters, you cannot do a test on them in a straightforward fashion.
If you, for instance, consider a single factor and no covariates (in a cross-sectional, single-group setting), the residual variance remainder is theta,
theta = 1 - lambda*lambda*psi
so you would have to do the Wald test on the new parameter theta (testing against = 0, not > 0).
Thanks, Bengt, for your answer. I did indeed see the negative residual variance when doing ESEM with WLSMV and the Delta parameterization on a 2-factor model with a total of 5 ordinal indicators.
1. But if one uses the Theta parameterization, the residual variances should be free parameters, shouldn't they? In that case, could the Wald test be used directly on the residual variance?
2. About the Wald test, I was wondering whether it is possible to do a one-sided test of H0: residual >= 0 (as opposed to the two-sided test H0: residual = 0), as suggested in the article by Kolenikov & Bollen, "Testing negative error variances: is a Heywood case a symptom of misspecification?" http://web.missouri.edu/~kolenikovs/
Hope these questions make sense! Thank you very much for your help.
1. With the theta parameterization the residual variance is fixed to 1 (unless you have a multiple-group situation), so in a way this gives you the residual variance > 0 condition. The residual variance is not a free parameter because it is still not identified, so it has to be fixed to a value that determines the parameterization. For the theta parameterization that value is 1.
2. In principle yes - this amounts to dividing the p-value you get by 2, but again with the theta parameterization you cannot do this at all because the residual variance is fixed to one. In the delta parameterization you can do this using the method Bengt outlined above, i.e., by making a new parameter in model constraints that is equal to the residual variance parameter. The residual variance parameter in the model is not really a regular parameter - it is a dependent constrained parameter that you cannot access directly, so you have to make your own duplicate of it.
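The arithmetic behind the remainder formula and the one-sided test discussed above can be sketched as follows. This is only an illustration, not Mplus code: the estimates and sampling variances are hypothetical, the function names are made up, and the standard error uses a first-order delta-method approximation, which is an assumption of this sketch.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def residual_remainder(lam, psi):
    """theta = 1 - lambda*lambda*psi (single factor, no covariates)."""
    return 1.0 - lam * lam * psi

def delta_method_se(lam, psi, var_lam, var_psi, cov_lam_psi):
    """Approximate SE of theta from the sampling (co)variances of lambda and psi."""
    g_lam = -2.0 * lam * psi   # d theta / d lambda
    g_psi = -lam * lam         # d theta / d psi
    var_theta = (g_lam ** 2 * var_lam + g_psi ** 2 * var_psi
                 + 2.0 * g_lam * g_psi * cov_lam_psi)
    return math.sqrt(var_theta)

def wald_p_values(theta, se):
    """Return z, the two-sided p (H0: theta = 0), and the one-sided p
    (H0: theta >= 0 against H1: theta < 0)."""
    z = theta / se
    p_two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))
    p_one_sided = normal_cdf(z)
    return z, p_two_sided, p_one_sided

# Hypothetical estimates: loading 1.1, factor variance fixed at 1 (so no
# sampling variance for psi), sampling variance of the loading 0.01.
lam, psi = 1.1, 1.0
theta = residual_remainder(lam, psi)
se = delta_method_se(lam, psi, var_lam=0.01, var_psi=0.0, cov_lam_psi=0.0)
z, p_two, p_one = wald_p_values(theta, se)
print(f"theta={theta:.3f} se={se:.3f} z={z:.3f} p2={p_two:.3f} p1={p_one:.3f}")
```

When the estimate is negative, the one-sided p-value equals half of the two-sided one, which is the "divide by 2" rule mentioned above. For a positive estimate the one-sided p-value for this hypothesis would be larger than 0.5.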
RDU posted on Saturday, February 13, 2010 - 12:29 am
I have a question concerning the residual variance provided in the standardized model results for categorical outcomes. Since this is a standardized solution, then are the residual variances listed standardized residual variances or are they unstandardized? Furthermore, if they are standardized then how exactly does one obtain the unstandardized residual variances (Is it similar to what it would be in regular regression?). Thanks.
I think you are asking about the residual variances that are printed with R-square. These are raw coefficients that are computed as a remainder from the model estimated results. They are not estimated as part of the model. Categorical outcomes do not have variance parameters.
RDU posted on Saturday, February 13, 2010 - 10:32 am
To better clarify my question, I was referring to the residual variances that are provided using Theta parameterization (WLSMV estimation) with categorical data, where standardized model results can be requested. As part of the standardized output, residual variances are given along with each item's R^2.
Thus I was wondering whether the residual variance is standardized since it is part of the standardized model output, or whether it is an unstandardized estimate.
If it is in fact a standardized residual variance, I also wanted to know if an unstandardized estimate could be obtained or calculated by hand. Thank you for your response.
With the Theta parameterization, scale factors are given with R-square. If you have further questions about this, please send the full output as an attachment and your license number to email@example.com.
RDU posted on Saturday, February 13, 2010 - 11:37 am
Yes, I apologize for the confusion. I believe I mistook the scale factors from Theta parameterization for the residual variances provided in delta parameterization.
RDU posted on Saturday, February 13, 2010 - 1:40 pm
Given the previous question I am also curious as to whether the residual variances given for delta parameterization are standardized or unstandardized, as theta parameterization was said to provide scale factors and not residual variances...Is this correct?
Neither the scale factors nor the residual variances presented with R-square are standardized.
Hemant Kher posted on Thursday, April 21, 2011 - 8:04 am
Hello Professor Muthen,
I have a question, and I hope that you can provide some insights. My question is related to a multiple-indicator latent curve model. Using the CFA approach, I estimate a latent construct for 4 different time points (factors F1, F2, F3 and F4 -- each estimated using the same 4 items). I followed directions to establish measurement invariance (same scale indicator at each time, loadings for non-scale items constrained equal across time, and equal intercepts for non-scale items). The model with CFA works fine with a good fit.
However, when I fit a growth model on the factors, I get a negative residual variance for the first factor (F1); the residual variance is small and statistically insignificant (-0.026, Z=-0.354, p=0.723). When I fix this residual variance for F1 to zero (f1@0;), the change in model fit is negligible and not statistically significant. But I am not sure if doing this (setting factor residual variance to zero) is reasonable. Your thoughts at your convenience would be appreciated.
Negative residual variances typically reflect a mis-specified model. For instance, perhaps a non-linear growth model is more suitable.
Also, instead of fixing the residual variance at zero, you could try holding the residual variances equal across time.
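As a small cross-check of the figures quoted in the question, the two-sided p-value can be recomputed from the reported z statistic. This sketch assumes the reported p-value is a two-sided normal-theory p-value; the function name is illustrative.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# z value reported in the post for the residual variance of F1
p = two_sided_p(-0.354)
print(round(p, 3))
```

A p-value this large means the estimate of -0.026 cannot be distinguished from zero, which is consistent with the negligible change in fit when the variance is fixed at zero.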
Hemant Kher posted on Thursday, April 21, 2011 - 8:26 am
Professor Muthen -- Thank you for a quick response.
Holding the factor residual variances equal across time has solved the problem.
Katja posted on Monday, January 14, 2013 - 1:14 am
I have a question regarding a negative residual variance. There was a post: "If the negative residual variances are large, this is a sign that your model is not appropriate for your data and you need to change your model. If they are small, you may want to fix them to zero. Residual variances are often small on the between level of multilevel models."
What is considered a large or small negative variance? I have a negative residual variance of -.083. Can I fix it to zero?
There seem to be two approaches to handling negative residual variances. The first is to fix the residual variance to 0 or a small positive value. The second is to use 'model constraint' to constrain the variance to be greater than zero. These approaches seem to be different because the first one yields an additional degree of freedom whereas the second does not.
So I am wondering which approach you consider more appropriate and why.
It is common practice to fix small insignificant negative residual variances to zero or constrain them to be greater than zero. These approaches are basically the same. Neither is optimal. A better choice is to change the model.
Linh Nguyen posted on Saturday, July 19, 2014 - 9:31 pm
I am also having the negative residual variance problem with my measurement and structural models.
The problem is with the second-order construct, which has only 2 first-order indicators. When I run a CFA for this second-order construct, it is unidentified (my sample size is 364). So I have set the factor loadings to be equal (tau equivalence) in order to run the CFA and validate the construct individually. The CFA model then fits very well, with an insignificant chi-square, CFI = .995, RMSEA = .04, SRMR = .02. The overall measurement model (with 5 other constructs, 32 observed variables) also fits adequately.
In my structural model, if I did not set the factor loading of the above construct to be equal, I would get 1 negative residual variance (from the above construct's indicator). The negative residual variance was -.017, insignificant. If I continued to employ Tau-equivalence assumptions, the model would be fine. Can I set the construct's factor loadings to be equal like in the CFA model for my structural model to solve the problem?
Thank you very much!
I look forward to hearing from you.
Kind regards Linh
Linh Nguyen posted on Saturday, July 19, 2014 - 10:35 pm
With regard to the second-order construct mentioned above, even though it has 2 first-order indicators, each first-order indicator has 3 observed variables.
Thanks so much for your quick response! It really helps. Sorry for my late reply; I could not post any message on the website until now.
I am writing up my analysis, just wondering if you can give me some references for using the CFA model of equal loadings in the structural model? I have been searching for journal articles about the issue but could not find the proper one.
I don't know of any reference. A second-order factor with only two first-order factors is not identified unless you impose a constraint such as equal loadings. Ideally you would have more first-order factors so that this is not necessary.
Ted Fong posted on Friday, August 01, 2014 - 2:12 am
Dear Dr. Muthén,
I understand that a 2nd-order factor model with just two first-order factors is typically unidentified. When the model is made identified with a model constraint such as equal loadings on the 2nd-order factor, this model should have the same df as the two-factor correlated model.
My question is: is it possible for an identified 2nd-order factor model with two first-order factors to have a lower df than the two-factor correlated model? I have recently come across a paper where the former model has 2 df fewer than the latter one.
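The df equality stated above can be checked by counting parameters. The sketch below is only an illustration under stated assumptions: two first-order factors with 3 continuous indicators each (as in the earlier post), a covariance structure only, the first loading of each factor fixed at 1, and both second-order loadings fixed at 1 (equal, with one serving as the scale setter). The function names are made up.

```python
def covariance_moments(n_indicators):
    """Number of distinct variances and covariances among the indicators."""
    return n_indicators * (n_indicators + 1) // 2

def df_two_correlated_factors(indicators_per_factor=3):
    p = 2 * indicators_per_factor
    free_loadings = 2 * (indicators_per_factor - 1)  # first loading fixed at 1
    residual_variances = p
    factor_part = 3  # two factor variances + one factor covariance
    n_params = free_loadings + residual_variances + factor_part
    return covariance_moments(p) - n_params

def df_second_order_equal_loadings(indicators_per_factor=3):
    p = 2 * indicators_per_factor
    free_loadings = 2 * (indicators_per_factor - 1)
    residual_variances = p
    # Second-order loadings both fixed at 1, leaving the second-order factor
    # variance and the two first-order factor residual variances.
    factor_part = 3
    n_params = free_loadings + residual_variances + factor_part
    return covariance_moments(p) - n_params

print(df_two_correlated_factors(), df_second_order_equal_loadings())
```

Under these assumptions both models use 13 parameters against 21 moments, giving 8 df each. A second-order model with fewer df than the correlated-factor model would therefore need extra free parameters somewhere that this count does not include.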
Hi Dr Muthen- I am running a full structural equation model (with categorical outcome) and am having trouble. While it runs and is terminated normally, I still get the error message about a negative residual variance: "THE MODEL ESTIMATION TERMINATED NORMALLY
WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE....PROBLEM INVOLVING VARIABLE DLMORAL."
After reading online, I fixed the residual variance to 0 (DLMORAL@0), but I am still getting this error message.
Am I able to still use the estimates provided or is there something else I can do to fix this problem? Although having it fixed @0 means I do not have estimates for two correlations involving this variable. This is the starting model so I want to make sure there are no issues before I modify it. I have run CFA on all independent variables before running the full model and there were no problems with this variable.
I should add that my sample size is only 320 but that has not been a problem for several other models (including one similar) that use this variable. Does this mean that this issue is caused by the addition of one latent variable and one observed variable (that does not correlate with the variable in question)? Sorry about the second post.
Linda Lin posted on Tuesday, March 10, 2015 - 3:04 pm
Can you explain how Mplus decomposes the total outcome variance and calculates the level-2 residual variance?
I tried to demonstrate as below: the total variance of outcome variable can be decomposed as [ level-1 residual variance ]+ [level-1 explained variance] + [level 2 residual variance] + [level-2 explained variance].
That is, for example
usevariables are class x y; within=; between=; cluster=class;
Analysis: type=twolevel;
Model:
%within% y on x (a); y (b); x (c);
%between% y (d); x (e); y on x (a);
I expect that the total Var(y) should equal b + c*a^2 + d + e*a^2. However, I found that this formula does not reproduce the total variance of y accurately.
I don't know if you compare your formula to the total variance in the sample or the model-estimated total variance. If you don't have perfect fit, those two are different.
The formula is correct.
You use the label "a" on both levels - is that what you intend?
If this doesn't help, send output to support@statmodel along with your license number.
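The decomposition in the question can be written out as a short calculation. This is a sketch with hypothetical parameter values, not estimates from any real output; the function name is illustrative, and the within and between slopes are passed separately so that the equality constraint (label a on both levels) is optional.

```python
def two_level_variance(a_within, b, c, a_between, d, e):
    """Model-implied variance of y in a two-level regression of y on x.

    b, d: within- and between-level residual variances of y
    c, e: within- and between-level variances of x
    """
    within = b + c * a_within ** 2      # level-1 residual + level-1 explained
    between = d + e * a_between ** 2    # level-2 residual + level-2 explained
    return within, between, within + between

# Hypothetical values, with the slope held equal across levels
within, between, total = two_level_variance(
    a_within=0.5, b=1.0, c=2.0, a_between=0.5, d=0.3, e=0.4
)
print(f"within={within:.2f} between={between:.2f} total={total:.2f}")
```

The result is the model-estimated total variance. It matches the sample total variance only when the model fits perfectly, which is the distinction raised in the reply above.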
Linda Lin posted on Wednesday, March 11, 2015 - 7:12 pm
Thanks for your answer, Dr. Muthen. To clarify my question above: 1) I intend to constrain the coefficients to be the same across both levels. 2) I compared the total variance estimated from the model below with the total variance estimated from the model above.
usevariables are class y; within=; between=; cluster=class;
Analysis: type=twolevel;
Model:
%within% y (e);
%between% y (f);
total variance = e + f. I expected the total variances to be the same. Is this e + f the sample variance you mentioned?