Negative variance
Mplus Discussion > Growth Modeling of Longitudinal Data
 Daniel posted on Thursday, August 14, 2003 - 10:51 am
I ran a parallel process LGM with multiple groups. Everything is fine except that when I request standardized values I get a negative residual variance for one of my categorical observed variables, and an undefined r-square. Does this invalidate my results? How can I fix it?


R-SQUARE

Group LOW

Observed    Residual
Variable    Variance    R-Square

SOMA9                   0.372
DEPAFF9                 0.769
POSAFF9                 0.207
INTERP9                 0.529
SOMA10                  0.496
DEPAFF10                0.898
POSAFF10                0.384
INTERP10                0.628
SOMA11                  0.552
DEPAFF11                0.860
POSAFF11                0.468
INTERP11                0.578
SMOKE9      -0.018      Undefined   0.10153E+01
SMOKE10F     0.073      0.937
SMOKE10S     0.069      0.941
SMOKE11      0.009      0.993

Latent
Variable    R-Square

F1          0.877
F2          0.520
F3          0.723
SLEVEL      0.168
STREND      0.149

Group HIGH

Observed    Residual
Variable    Variance    R-Square

SOMA9                   0.380
DEPAFF9                 0.930
POSAFF9                 0.246
INTERP9                 0.632
SOMA10                  0.461
DEPAFF10                0.944
POSAFF10                0.365
INTERP10                0.682
SOMA11                  0.482
DEPAFF11                0.870
POSAFF11                0.433
INTERP11                0.610
SMOKE9       0.167      0.862
SMOKE10F     0.107      0.925
SMOKE10S     0.070      0.955
SMOKE11      0.297      0.854

Latent
Variable    R-Square

F1          0.781
F2          0.523
F3          0.794
SLEVEL      0.252
STREND      0.647
 Linda K. Muthen posted on Saturday, August 16, 2003 - 4:37 pm
The negative residual variance seems to be for the first occasion where it is likely that there is very little smoking. Negative residual variances in growth models are common for variables with strong floor/ceiling effects.

It looks like you have a multiple indicator growth model, not a parallel process model. I would need to see your full output and possibly the data to comment further.
 Anonymous posted on Monday, August 25, 2003 - 12:28 pm
I am fitting an LGM using categorical outcomes (intercept free; thresholds fixed to zero; slope free; quadratic latent variable free). I obtained a 2.36 residual variance for one outcome (compared to .39 to .55 for the other variables) and an undefined r-square for the offending variable. I thought that an undefined r-square was usually caused by negative variance. What does this suggest? How might I fix this?

In addition, I continually receive a standardized estimate of the intercept (i.e., threshold) that is greater than one. Can you explain this?


Further, in a different LGM model (same model but with a second slope added and the quadratic term removed) the standardized estimate of the mean of the first slope is greater than one. Again, is this possible?

Also, I tend to have high standardized correlations between the first slope and second slope (e.g., -.61; not significant), but negligible unstandardized correlations (e.g., -.009). What do you make of this? I also see this same pattern when I replace the second slope with a quadratic latent variable.

Perhaps, there is a strong linear relationship between the slope and quadratic which is influencing this pattern? Any suggestions?

Thanks in advance!
 Linda K. Muthen posted on Tuesday, August 26, 2003 - 10:07 am
The undefined r-square can happen for a variety of reasons. I would need to see the full output to answer this.

Standardized values can be greater than one when they do not correspond to correlation coefficients.
The value of the threshold can be greater than one just like a z-score can be greater than one.
Slope means can also be greater than one because they are not interpreted as correlation coefficients.

The negligible unstandardized value (the covariance) comes about from small variances.
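As a rough worked illustration of that last point: only the values -.009 and -.61 come from the question above, and the variances of the two slope factors were not posted, so the product of standard deviations below is implied rather than reported. A correlation is the covariance divided by the product of the two standard deviations:

```latex
\[
r_{s_1 s_2}
  = \frac{\operatorname{cov}(s_1, s_2)}
         {\sqrt{\operatorname{var}(s_1)\,\operatorname{var}(s_2)}}
\quad\Longrightarrow\quad
\sqrt{\operatorname{var}(s_1)\,\operatorname{var}(s_2)}
  = \frac{-0.009}{-0.61} \approx 0.0148 .
\]
```

So a covariance of -.009 and a correlation of -.61 are consistent with each other whenever the two slope standard deviations multiply to roughly 0.015, that is, when the slope variances are very small.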
 silvia_sorensen posted on Sunday, January 18, 2004 - 2:08 pm
Hi Linda,
I'm running an LGM over every other of 15 time points. This model fits very nicely, except for a negative variance for pdamn01.


USEVARIABLES ARE pda pdamn01 pdamn03
pdamn05 pdamn07 pdamn09
pdamn11 pdamn13 pdamn15 ;
MISSING ARE . ;
DEFINE:
pda = pda*10; pdamn01 = pdamn01*10;
pdamn03 = pdamn03*10; pdamn05 = pdamn05*10;
pdamn07 = pdamn07*10; pdamn09 = pdamn09*10;
pdamn11 = pdamn11*10; pdamn13 = pdamn13*10;
pdamn15 = pdamn15*10;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: PDAint BY PDA-PDAmn15@1;
PDAslope BY PDA@0 PDAmn01@1 PDAmn03@1 PDAmn05@1 PDAmn07@1 PDAmn09@1 PDAmn11@1 PDAmn13* PDAmn15*;
PDAquad BY PDA@0 PDAmn01@1 PDAmn03@4 PDAmn05@9
PDAmn07@10 PDAmn09* PDAmn11* PDAmn13* PDAmn15*;
[PDA-PDAmn15@0 PDAint PDAslope PDAquad];
!constraints on model
PDAint@0;
! PDAmn01@0;

!correlated residuals: all adjacent residuals are correlated
PDA WITH PDAmn01;
PDAmn01 WITH PDAmn03;
PDAmn03 WITH PDAmn05;
PDAmn05 WITH PDAmn07;
PDAmn07 WITH PDAmn09;
PDAmn09 WITH PDAmn11;
PDAmn11 WITH PDAmn13;
PDAmn13 with PDAmn15;

OUTPUT: SAMPSTAT MODINDICES (10) STANDARDIZED tech1;

You mentioned above that negative residual variances in growth models are common for variables with strong floor/ceiling effects. I presume that a model with a negative residual variance on an observed variable is not really "reportable." How do you suggest getting rid of this problem? Setting the pdamn01 variable variance to zero doesn't work (model won't converge). Already had to set the intercept variance to 0 b/c that was coming up negative too. Is transforming the variable an option you would suggest? Wouldn't I need to transform all the observed variables then?

Thanks, Silvia
 Linda K. Muthen posted on Sunday, January 18, 2004 - 4:55 pm
I don't know how you started this analysis but the growth model you have ended up with is a little odd given that the quadratic factor should have time scores that are the square of the slope growth factor. Your growth slope factor has time scores of:

0 1 1 1 1 1 1 free free

The quadratic has:

0 1 4 9 10 free free free free

This does not make sense to me and could be causing problems. I am not sure how this model would be interpreted.
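For comparison, a minimal sketch of a conventional quadratic growth specification in the BY style used in this thread, where each quadratic time score is the square of the corresponding linear time score. The variable names y1-y4 and the time scores 0 1 2 3 are placeholders chosen for illustration and are not the poster's variables or design.

```mplus
MODEL:
  i BY y1-y4@1;                  ! intercept factor: all loadings fixed at 1
  s BY y1@0 y2@1 y3@2 y4@3;      ! linear time scores    0 1 2 3
  q BY y1@0 y2@1 y3@4 y4@9;      ! quadratic time scores 0 1 4 9 (squares of the above)
  [y1-y4@0 i s q];               ! outcome intercepts at zero, growth factor means free
```

With this pairing the three growth factors keep their usual interpretation as level, linear change, and curvature.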
 silvia_sorensen posted on Monday, January 19, 2004 - 12:29 pm
Oh! Looks like I was unaware of the exact nature of the relationship between growth and quadratic factor. Now I wish I had posted anonymously! :-) (the 10 in the quad was a typo)

The growth pattern, based on means, looks something like this:
3.005 8.647 8.356 7.889 7.752 7.611 7.587 7.536 7.544

What I was seeing here is a decreasing linear pattern, with a quadratic overlay at the beginning (treatment effect). I guess I am still confused about time score usage. You had suggested 01111... for an earlier problem I had with a similar pattern. But maybe a very different model is indicated here? I am actually interested in relating this variable to another one in a dual growth model, so I want to get the simple growth pattern as uncomplicated as possible. Thanks for your assistance!
silvia
 Linda K. Muthen posted on Monday, January 19, 2004 - 1:53 pm
I suggested 0 1 1 1 1 ... because I see a big jump between time 1 and time 2 and then a flattening out. I guess one question is whether the difference between 8.6 and 7.5 is meaningful. Is that type of decline important enough to model? If not, I would fit a model with one growth factor 0 1 1 1 .... Is the increase from time 1 to time 2 important to model or can you start the time series at time 2?
 silvia_sorensen posted on Sunday, January 25, 2004 - 11:23 am
Thanks for your previous note.

Yes the difference between 8.6 and 7.5 is meaningful in our context and worth modeling.

The time 1 value is the baseline before intervention (percent days abstinent from alcohol). If I leave it out for this model, what exactly does the intercept factor mean?
Could I account for baseline abstinence by using the time 1 variable as a covariate (PDAslope on PDA)? How is the interpretation of this information different from having the baseline be an indicator of the intercept factor in the growth model (i.e., PDAslope on PDAint)?

I did try taking out the first time point and got a barely acceptable fit. The modindices suggest that time 15 is the most problematic. If I free this, what are the implications for interpreting the drop-off from 8.6 to 7.5?

here are the commands used.
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: PDAint BY PDAmn01-PDAmn15@1;
PDAslope BY PDAmn01@0 PDAmn03@1 PDAmn05@2
PDAmn07@3 PDAmn09@4 PDAmn11@5 PDAmn13@6 PDAmn15@7;
[PDAmn01-PDAmn15@0 PDAint PDAslope];
!correlated residuals: all adjacent residuals are correlated
PDAmn01 WITH PDAmn03;
PDAmn03 WITH PDAmn05;
PDAmn05 WITH PDAmn07;
PDAmn07 WITH PDAmn09;
PDAmn09 WITH PDAmn11;
PDAmn11 WITH PDAmn13;
PDAmn13 with PDAmn15;

!effects of interest
PDAslope on PDA;

thanks for your help.
 Linda K. Muthen posted on Monday, January 26, 2004 - 1:23 pm
I think the reason you are having problems is that the model after the first time point is not linear. It seems like you have two things going on looking at all time points. First you have the big jump -- from 3 to 8.6. Then a small series of declines and then a leveling out. You could consider a piecewise model with two pieces -- the first representing the initial growth (up to 8.5) and the second representing the decline and leveling out. So you could have timescores of

0 1 1 1 1 1 1 1 1 for the first slope and
0 0 1 2 3 4 5 5 5 for the second slope. I think this reflects your means:

3.005 8.647 8.356 7.889 7.752 7.611 7.587 7.536 7.544

So you would have one intercept factor and two growth factors. I don't think the variance of the first slope factor is identified so fix it to zero. Also fix its covariance with the other growth factors to zero. Then you could have different predictors of the initial jump and the decline.

Also, don't put the residual covariances in until you fit this model.
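To make the two sets of time scores concrete, here is a sketch of the means they imply. The symbols are notation introduced only for this illustration: alpha for the intercept factor mean, kappa_1 and kappa_2 for the means of the first and second slope factors, and lambda for the time scores listed above.

```latex
\[
\mu_t = \alpha + \lambda_{1t}\,\kappa_1 + \lambda_{2t}\,\kappa_2 ,
\qquad
\lambda_{1} = (0,1,1,1,1,1,1,1,1), \quad
\lambda_{2} = (0,0,1,2,3,4,5,5,5)
\]
\[
\mu_1 = \alpha, \qquad
\mu_2 = \alpha + \kappa_1, \qquad
\mu_t = \alpha + \kappa_1 + (t-2)\,\kappa_2 \ \ (t = 3,\dots,7), \qquad
\mu_8 = \mu_9 = \alpha + \kappa_1 + 5\,\kappa_2 .
\]
```

Under these scores kappa_1 carries the initial jump between the first two occasions, kappa_2 carries a per-occasion change from the second through the seventh occasion, and the last two occasions are held at the seventh-occasion level.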
 Linda K. Muthen posted on Monday, January 26, 2004 - 1:25 pm
Also, leave the ON statement out until you get the growth model fit.
 silvia sorensen posted on Tuesday, January 27, 2004 - 9:07 am
Hi again,
That model ran (hurray); fit is poor (CFI= .89), but not hopeless.

ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: PDAint BY PDA-PDAmn15@1;
PDAslop1 BY PDA@0 PDAmn01@1 PDAmn03@1 PDAmn05@1 PDAmn07@1 PDAmn09@1 PDAmn11@1 PDAmn13@1 PDAmn15@1;
PDAslop2 BY PDA@0 PDAmn01@0 PDAmn03@1 PDAmn05@2 PDAmn07@3 PDAmn09@4 PDAmn11@5 PDAmn13@5 PDAmn15@5;
[PDA-PDAmn15@0 PDAint PDAslop1 PDAslop2];
!constraints on model
PDAslop1@0;
PDAslop1 with PDAslop2@0;

I'm sending you the output via email. Some of the big modindices are puzzling to me - e.g.,

PDASLOP2 BY PDA 124.764 2.576 1.382 0.463

Why would it want me to free the first indicator of the second growth factor?
 Peter Elliott posted on Sunday, August 28, 2005 - 5:39 pm
Hi Linda,

I too am having Mplus (v3) report negative variances with simple linear growth modeling. The code for one example is:
Analysis: TYPE = meanstructure missing h1;
Model: i by cog0@1 Cog2@1 Cog6@1;
s by cog0@0 cog2@2 cog6@6;
[Cog0 - Cog6@0 i s];
i s on group;
Output: Samp;

The negative variances are normally associated with s. Otherwise the output looks good. Indeed, when I plot out the predicted means for the 2 groups using the Mplus parameters they are pretty good, despite the negative variances. Is it OK to just ignore the negative variances? Can I just set the offending variances to zero? Any advice would be well received.
There are only 3 time points (0, 2, & 6 months) and only about 100 cases (some with missing data). Group is a dichotomous variable. Cog is continuous (fairly normal).

Many thanks,

Peter
 bmuthen posted on Sunday, August 28, 2005 - 8:20 pm
This may simply imply that you have no individual variation in the slopes and should therefore treat it as fixed, i.e. fix its variance at zero.
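A minimal sketch of that suggestion applied to the model posted above; the variable names are taken from that post, and this is an illustration rather than advice checked against those data. Because s is regressed on group, s@0 fixes the residual variance of the slope. Writing out the covariance restriction explicitly is an assumption about what is needed; a later reply in this thread notes that a variable with variance fixed at zero must also have its covariances fixed at zero.

```mplus
MODEL:
  i BY cog0@1 cog2@1 cog6@1;
  s BY cog0@0 cog2@2 cog6@6;
  [cog0 - cog6@0 i s];
  i s ON group;
  s@0;           ! slope (residual) variance fixed at zero: slope treated as fixed
  i WITH s@0;    ! covariance with the intercept fixed at zero as well
```

The slope mean and its regression on group are still estimated; only the individual variation around them is removed.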
 Andy posted on Thursday, November 03, 2005 - 6:15 pm
Hi,

I have a random intercepts and slopes growth model as in Ex 6.1 on p. 75 of the V3 manual, except that I have 3 time points (0, 1, 2). I have a continuous outcome with 80 individuals.

There are 6 covariance df to play with, and I'm fitting a model using all 6 (i.e., random intercept and slope variances, their covariance, and 3 unconstrained residual variances). Equating the sample covariance matrix to the model covariance parameters and solving the 6 equations in 6 unknowns gives the solution as reported in Mplus.

As with others, I get a negative residual variance estimate for the error variance at my time 2, and the numerical value reported in Mplus matches the algebraic calculation. Denoting the sample covariance elements by Sij, the estimate of the error variance at my time 2 is S33+S13-2S23. My data just happens to have S13 as the smallest element in the matrix and S23 approx the same as S33, so this gives the negative variance estimate.
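The algebra behind that expression can be sketched as follows. The symbols are notation introduced here for illustration: psi_ii, psi_ss, psi_is for the intercept variance, slope variance, and their covariance, and theta_1 to theta_3 for the residual variances, with time scores 0, 1, 2.

```latex
\begin{align*}
S_{11} &= \psi_{ii} + \theta_1, &
S_{12} &= \psi_{ii} + \psi_{is}, &
S_{13} &= \psi_{ii} + 2\psi_{is}, \\
S_{22} &= \psi_{ii} + 2\psi_{is} + \psi_{ss} + \theta_2, &
S_{23} &= \psi_{ii} + 3\psi_{is} + 2\psi_{ss}, &
S_{33} &= \psi_{ii} + 4\psi_{is} + 4\psi_{ss} + \theta_3 .
\end{align*}
Solving the three off-diagonal equations first:
\begin{align*}
\psi_{is} &= S_{13} - S_{12}, &
\psi_{ii} &= 2S_{12} - S_{13}, &
\psi_{ss} &= \tfrac{1}{2}\left(S_{12} + S_{23} - 2S_{13}\right),
\end{align*}
and substituting into the diagonal equations:
\begin{align*}
\theta_1 &= S_{11} - 2S_{12} + S_{13}, &
\theta_2 &= S_{22} - \tfrac{1}{2}\left(S_{12} + S_{23}\right), &
\theta_3 &= S_{33} + S_{13} - 2S_{23} .
\end{align*}
```

The last expression reproduces the one quoted above: theta_3 is negative whenever 2 S23 exceeds S33 + S13, which is the situation described, with S23 close to S33 and S13 the smallest element.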

The algebra indicates that these negative variances can occur quite frequently in this model with a saturated covariance parameterisation. Have you found this to be so in your experience with many datasets? What is the usual recommended action? Is it to set the offending error variance to zero, or instead add some constraints to the error variances?

This has reinforced to me some of the limitations with growth modelling with 3 time points. Do you recommend against fitting such an unconstrained error variance model with 3 time points?

Many thanks.
 bmuthen posted on Friday, November 04, 2005 - 7:54 am
In my experience, negative residual variances often happen with strongly skewed outcomes. If your residual variance comes out negative you can hold the variances equal across time points, or fix it at zero. I recommend having more than 3 time points for reasons of being able to have a more flexible and realistic model (correlated residuals, free time scores...).
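The two options mentioned here can be sketched as below for a three-occasion model. The variable names y1-y3 are placeholders, and the growth-language line assumes an Mplus version that supports the | symbol.

```mplus
MODEL:
  i s | y1@0 y2@1 y3@2;
  y1-y3 (1);     ! option 1: hold the residual variances equal across time
  ! y3@0;        ! option 2 (use instead of option 1): fix the offending
                 !   residual variance at zero
```

Option 1 frees up two degrees of freedom relative to the saturated covariance parameterisation, so the model is no longer just-identified in its covariance part.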
 Andy posted on Sunday, November 06, 2005 - 8:44 pm
Dear Dr Muthen,

Thanks for your quick reply. My data was a bit right skewed, but the negative variance persists with square-root and log-transform of my data.

I also tried the model with the classic Potthoff and Roy 1964 growth dataset involving the pituitary gland in boys and girls, and with a few other datasets. Whenever I restricted the model to 3 time points I got a negative variance somewhere in the model. These datasets had constant sample variance or increasing variances with time, and constant off-diagonal correlations or decreasing correlations. All gave negative variances somewhere.

So now I'm wondering whether it is the exception rather than the rule to be able to estimate all 3 residual variances unconstrained when you have only 3 time points?

I'm also curious why Mplus does not implement non-negative variance constraints in the estimation routines, e.g. by reparameterising a variance sigma^2 as exp(A) where A = ln(sigma^2), and estimating A rather than sigma^2 directly?

Thanks again.
 BMuthen posted on Saturday, November 12, 2005 - 6:24 pm
I don't think this is an issue that comes up because of three time points. I have experienced many such applications without problems. If you want us to look into your particular data sets, please send the inputs, data, outputs, and license number to support@statmodel.com.

We do not have non-negative variance constraints automatically built in because we do not want to hide this information from the analyst. An analyst can fix a residual variance to zero or constrain the variance to be non-negative if they want.
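One way such a user-imposed constraint might be written is sketched below. It assumes an Mplus version that provides the MODEL CONSTRAINT command and parameter labels, which may not be the version discussed in this exchange; the variable names y1-y3 and the labels v3 and a are placeholders.

```mplus
MODEL:
  i s | y1@0 y2@1 y3@2;
  y3 (v3);             ! label the residual variance of y3

MODEL CONSTRAINT:
  v3 > 0;              ! inequality constraint on the labelled variance

  ! Alternative in the spirit of the exp(A) idea raised in the question
  ! (use instead of the inequality above):
  ! NEW(a);
  ! v3 = EXP(a);
```

If the constrained estimate ends up at or very near the boundary, that is itself the information the reply above says should not be hidden from the analyst.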
 Jeff Cookston posted on Tuesday, March 14, 2006 - 2:07 pm
We're preparing a dual process model looking at differences in growth between parents and children, however, we obtained a negative variance for one of the slope factors. Does anything about the syntax below suggest an overfitted or problematic model?

USEVAR ARE achvl sachvl tachvl pchvl spchvl tpchvl sex gen2;
MISSING ARE ALL (999);

ANALYSIS:
TYPE = meanstructure;

MODEL:
int_a by achvl@1 sachvl@1 tachvl@1;
slope_a by achvl@0 sachvl@1 tachvl@2;
int_p by pchvl@1 spchvl@1 tpchvl@1;
slope_p by pchvl@0 spchvl@1 tpchvl@2;
[achvl@0 sachvl@0 tachvl@0 pchvl@0 spchvl@0 tpchvl@0 int_a slope_a
int_p slope_p];
int_p slope_p int_a slope_a on sex;
int_p slope_p int_a slope_a on gen2;
int_p with slope_p@0;
int_a with slope_a@0;
int_p with int_a;
slope_a with slope_p;
slope_p with int_a;
slope_a with int_p;
OUTPUT:
standardized sampstat modindices;
 Bengt O. Muthen posted on Tuesday, March 14, 2006 - 5:52 pm
The statements

int_p with slope_p@0;
int_a with slope_a@0;

are unusual because intercepts and slopes of the same process are typically correlated. A negative slope variance may simply mean that the slope variation is almost zero, in which case you want to fix its variance at zero and consider it a fixed effect rather than a random one.

Also, I would recommend switching to the newer growth language to simplify your input.

You may also want to correlate contemporaneous residuals using WITH statements.
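A sketch of how the posted model might look in the growth language, with contemporaneous residuals correlated across the two processes. Variable and factor names are taken from the post above. The statement that the | specification fixes the outcome intercepts at zero and frees the growth factor means and covariances by default is my understanding of the defaults and should be checked against the output.

```mplus
MODEL:
  int_a slope_a | achvl@0 sachvl@1 tachvl@2;
  int_p slope_p | pchvl@0 spchvl@1 tpchvl@2;
  int_a slope_a int_p slope_p ON sex gen2;
  ! residual covariances across processes at each time point
  achvl  WITH pchvl;
  sachvl WITH spchvl;
  tachvl WITH tpchvl;
```

The two WITH statements that fixed the within-process intercept-slope covariances at zero are simply left out here, so those covariances are no longer restricted.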
 Amber Watts posted on Thursday, October 29, 2009 - 11:58 am
I have a similar problem to Peter above. Bengt suggested
"This may simply imply that you have no individual variation in the slopes and should therefore treat it as fixed, i.e. fix its variance at zero."

When I fix the slope variance to zero I still get a problem, but if I fix the intercept-slope covariance to 0 the model runs fine. Is this an acceptable thing to do?
 Linda K. Muthen posted on Thursday, October 29, 2009 - 12:04 pm
It sounds like you are using an old version of the program. If a variance is fixed to zero, all covariances with the variable need to be fixed to zero. This has been done automatically in Mplus for some time.
 Amber Watts posted on Thursday, October 29, 2009 - 12:25 pm
I am using Mplus version 5. If I fix slope@0, the TECH4 output says that the level-level covariance is negative even though the level-slope covariance is 0.00.

If I fix level@0 the tech4 says the slope-slope covariance is negative even though the level-slope covariance is 0.00

If I fix the level WITH slope@0, I no longer get the error message and both level-level covariance and slope-slope covariance become positive.

What does this mean?
 Linda K. Muthen posted on Thursday, October 29, 2009 - 2:24 pm
You need to send the outputs and your license number to support@statmodel.com.
 Victor Heh posted on Friday, October 30, 2009 - 8:00 am
cfa approach
Means
I 12.404 0.791 15.681 0.000
S -0.655 0.208 -3.151 0.002
Variances
I 44.827 9.051 4.952 0.000
S -1.624 0.795 -2.043 0.041

USING TSCORES
Means
I 12.474 0.809 15.424 0.000
S -0.672 0.237 -2.832 0.005
Variances
I 49.958 9.755 5.121 0.000
S 0.253 1.097 0.231 0.818

My concern is about the negative variance in the CFA approach. How do I specify my model with CFA to obtain the same result as the two-level approach?
 Linda K. Muthen posted on Friday, October 30, 2009 - 9:28 am
I would need to see the full outputs to understand exactly what you are doing. Please send them and your license number to support@statmodel.com.
 Hossein Khalili posted on Friday, November 04, 2011 - 12:55 pm
I just have the following question/s for you:
What does a zero slope variance mean (assuming a true zero rather than one forced to zero)? Does it mean that the inter-individual differences remain the same as those at the beginning of the study, or does it mean that the inter-individual differences come close to zero over the course of the study?
 Bengt O. Muthen posted on Friday, November 04, 2011 - 8:36 pm
The former.
 Christopher T. Allen posted on Thursday, December 01, 2011 - 2:01 pm
Hello Drs. Muthen,

I'm doing an ESEM model with categorical indicators. When I try to run the model, I get an error message stating that one of my variables has a negative residual variance. Since the dataset I was using is one of several datasets created using multiple imputation (to address missing data), I decided to try running the model in one of the other datasets to see if the model would converge. It did; I then proceeded to run the model in a few more datasets. What I'm seeing is that the model converges in about half of the datasets. When the model converges (with good fit indices; CFI = .99, TLI = .98, RMSEA = .07) the residual variance is about .01. When the model does NOT converge, the residual variance is about -.01. How would you suggest I proceed with interpreting and/or adjusting this model?

Best,

Chris
 Bengt O. Muthen posted on Friday, December 02, 2011 - 9:42 am
Please send to support@statmodel.com.
 Andreas Hirschi posted on Tuesday, July 10, 2012 - 1:59 am
I have computed latent growth models and have come across some problems I don’t seem to be able to resolve. I have two scales for which item data was collected at three time points. I modeled each of these scales separately in univariate LGM models, using the items as indicators for the three latent construct variables at three time points. I obtained sensible estimates for each LGM. However, when specifying a multivariate model where the slopes and intercepts of the two scales are allowed to covary to examine potential relationships of growth, I obtained the following warning:
WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A
LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES.
CHECK THE TECH4 OUTPUT FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE S_CAL. [S_CAL is the slope of one of the two scales]

I also obtained a negative value for the variance of S_CAL (which did not occur in the univariate LGM model of this scale). Please can you advise how to solve this problem?
 Linda K. Muthen posted on Tuesday, July 10, 2012 - 10:24 am
Sometimes with a parallel process model, there is a need for residual covariances across processes at each time point.
 Andreas Hirschi posted on Tuesday, July 10, 2012 - 10:57 am
Dear Linda,

Many thanks for your quick reply. If I understand you correctly, you advise me to allow the residuals of the latent variables of the two constructs at each time point to covary. I have already done this. Sorry for not making this explicit. However, it does not help to obtain a model that converges.
 Linda K. Muthen posted on Tuesday, July 10, 2012 - 1:15 pm
Please send the output and your license number to support@statmodel.com.
 Ginnie posted on Thursday, April 25, 2013 - 2:53 pm
Dear Dr. Muthen,

I am preparing a growth model for parallel processes; however I obtain an error message regarding the variance of s2.

variable: names are y1-y35;
usevariables y7-y10 y22-y25;
missing = all (999);
analysis: estimator = ml;
model: i1 s1 | y7@0 y8@1 y9*2 y10*3;
i2 s2 | y22@0 y23@1 y24*2 y25*3;
s1 on i2;
s2 on i1;
output: tech4; MODINDICES (ALL);

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A
LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES.
CHECK THE TECH4 OUTPUT FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE S2.

Could you kindly advise me how to fix it? Thanks!

Ginnie
 Linda K. Muthen posted on Thursday, April 25, 2013 - 3:53 pm
If the variance is a small not significant negative value, you can fix it to zero, for example,

s2@0;
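In the context of the model posted above, that statement would sit in the MODEL command as sketched here. Because s2 is regressed on i1, s2@0 fixes the residual variance of s2. This is only appropriate under the condition stated in the reply, namely a small and nonsignificant negative estimate.

```mplus
model: i1 s1 | y7@0 y8@1 y9*2 y10*3;
       i2 s2 | y22@0 y23@1 y24*2 y25*3;
       s1 on i2;
       s2 on i1;
       s2@0;      ! residual variance of s2 fixed at zero
```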
 Celia Matte-Gagné posted on Wednesday, August 14, 2013 - 12:04 pm
Hi,

In a multilevel growth model, I want to fix the variance of the slope at zero (random intercept only model) but I want to keep the correlation between the intercept and the slope. How can I do that?
 Bengt O. Muthen posted on Wednesday, August 14, 2013 - 2:39 pm
I don't recommend doing this because a correlation implies a relationship between two variables and if one of them has variance zero it is not a variable.
 Celia Matte-Gagné posted on Wednesday, August 14, 2013 - 3:42 pm
Then, if the slope has a non-significant variance close to zero and the covariance between the intercept and the slope is significant, I would have to keep both (the variance of the slope and the relation between the intercept and the slope) even if the fit is worse with the variance of the slope?

Thank you very much for your help.
 Celia Matte-Gagné posted on Wednesday, August 14, 2013 - 5:01 pm
And sorry for my english...
 Bengt O. Muthen posted on Thursday, August 15, 2013 - 1:24 pm
You would either keep the covariance and variance for the slope, or get rid of both.