Message/Author 

Daniel posted on Thursday, August 14, 2003  10:51 am



I ran a parallel process LGM with multiple groups. Everything is fine except that when I request standardized values I get a negative residual variance for one of my categorical observed variables, and an undefined R-square. Does this invalidate my results? How can I fix it?

R-SQUARE

Group LOW
  Observed    Residual
  Variable    Variance    R-Square
  SOMA9                   0.372
  DEPAFF9                 0.769
  POSAFF9                 0.207
  INTERP9                 0.529
  SOMA10                  0.496
  DEPAFF10                0.898
  POSAFF10                0.384
  INTERP10                0.628
  SOMA11                  0.552
  DEPAFF11                0.860
  POSAFF11                0.468
  INTERP11                0.578
  SMOKE9      0.018       Undefined  0.10153E+01
  SMOKE10F    0.073       0.937
  SMOKE10S    0.069       0.941
  SMOKE11     0.009       0.993

  Latent
  Variable    R-Square
  F1          0.877
  F2          0.520
  F3          0.723
  SLEVEL      0.168
  STREND      0.149

Group HIGH
  Observed    Residual
  Variable    Variance    R-Square
  SOMA9                   0.380
  DEPAFF9                 0.930
  POSAFF9                 0.246
  INTERP9                 0.632
  SOMA10                  0.461
  DEPAFF10                0.944
  POSAFF10                0.365
  INTERP10                0.682
  SOMA11                  0.482
  DEPAFF11                0.870
  POSAFF11                0.433
  INTERP11                0.610
  SMOKE9      0.167       0.862
  SMOKE10F    0.107       0.925
  SMOKE10S    0.070       0.955
  SMOKE11     0.297       0.854

  Latent
  Variable    R-Square
  F1          0.781
  F2          0.523
  F3          0.794
  SLEVEL      0.252
  STREND      0.647


The negative residual variance seems to be for the first occasion, where it is likely that there is very little smoking. Negative residual variances in growth models are common for variables with strong floor/ceiling effects. It looks like you have a multiple-indicator growth model, not a parallel process model. I would need to see your full output and possibly your data to comment further.

Anonymous posted on Monday, August 25, 2003  12:28 pm



I am conducting an LGM using categorical outcomes (intercept free; thresholds fixed to zero; slope free; quadratic latent variable free). I obtained a residual variance of 2.36 for one outcome (compared to .39 to .55 for the other variables) and an undefined R-square for the offending variable. I thought that an undefined R-square was usually caused by a negative variance. What does this suggest? How might I fix it? In addition, I continually receive a standardized estimate of the intercept (i.e., threshold) that is greater than one. Can you explain this? Further, in a different LGM (the same model but with a second slope added and the quadratic term removed), the standardized estimate of the mean of the first slope is greater than one. Again, is this possible? Also, I tend to have high standardized correlations between the first and second slopes (e.g., .61; not significant) but negligible unstandardized correlations (e.g., .009). What do you make of this? I also see this same pattern when I replace the second slope with a quadratic latent variable. Perhaps there is a strong linear relationship between the slope and quadratic terms that is influencing this pattern? Any suggestions? Thanks in advance!


The undefined R-square can happen for a variety of reasons; I would need to see the full output to answer this. Standardized values can be greater than one when they do not correspond to correlation coefficients. The value of a threshold can be greater than one, just as a z-score can be greater than one. Slope means can also be greater than one because they are not interpreted as correlation coefficients. The negligible unstandardized correlation comes about from small variances.


Hi Linda, I'm running an LGM over every other one of 15 time points. This model fits very nicely except for a negative variance for pdamn01.

USEVARIABLES ARE pda pdamn01 pdamn03 pdamn05 pdamn07
    pdamn09 pdamn11 pdamn13 pdamn15;
MISSING ARE .;
DEFINE:
  pda = pda*10;
  pdamn01 = pdamn01*10;
  pdamn03 = pdamn03*10;
  pdamn05 = pdamn05*10;
  pdamn07 = pdamn07*10;
  pdamn09 = pdamn09*10;
  pdamn11 = pdamn11*10;
  pdamn13 = pdamn13*10;
  pdamn15 = pdamn15*10;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  PDAint BY PDA-PDAmn15@1;
  PDAslope BY PDA@0 PDAmn01@1 PDAmn03@1 PDAmn05@1 PDAmn07@1
      PDAmn09@1 PDAmn11@1 PDAmn13* PDAmn15*;
  PDAquad BY PDA@0 PDAmn01@1 PDAmn03@4 PDAmn05@9 PDAmn07@10
      PDAmn09* PDAmn11* PDAmn13* PDAmn15*;
  [PDA-PDAmn15@0 PDAint PDAslope PDAquad];
  ! constraints on model
  PDAint@0;
  ! PDAmn01@0;
  ! correlated residuals: all adjacent residuals are correlated
  PDA WITH PDAmn01;
  PDAmn01 WITH PDAmn03;
  PDAmn03 WITH PDAmn05;
  PDAmn05 WITH PDAmn07;
  PDAmn07 WITH PDAmn09;
  PDAmn09 WITH PDAmn11;
  PDAmn11 WITH PDAmn13;
  PDAmn13 WITH PDAmn15;
OUTPUT: SAMPSTAT MODINDICES (10) STANDARDIZED TECH1;

You mentioned above that negative residual variances in growth models are common for variables with strong floor/ceiling effects. I presume that a model with a negative residual variance on an observed variable is not really "reportable." How do you suggest getting rid of this problem? Setting the pdamn01 residual variance to zero doesn't work (the model won't converge). I already had to set the intercept variance to 0 because that was coming up negative too. Is transforming the variable an option you would suggest? Wouldn't I need to transform all the observed variables then? Thanks, Silvia


I don't know how you started this analysis, but the growth model you have ended up with is a little odd, given that the quadratic factor should have time scores that are the squares of the slope factor's time scores. Your slope growth factor has time scores of 0 1 1 1 1 1 1 free free. The quadratic has 0 1 4 9 10 free free free free. This does not make sense to me and could be causing problems. I am not sure how this model would be interpreted.
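For reference, a conventional quadratic growth specification for these nine occasions might look like the sketch below. This is only an illustration, assuming equally spaced time points and using the variable names from Silvia's post; with the | growth language, Mplus fixes the quadratic time scores to the squares of the linear ones automatically.

```
MODEL:
  ! linear time scores 0-8; quadratic scores become 0 1 4 9 16 25 36 49 64
  i s q | pda@0 pdamn01@1 pdamn03@2 pdamn05@3 pdamn07@4
          pdamn09@5 pdamn11@6 pdamn13@7 pdamn15@8;
```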


Oh! It looks like I was unaware of the exact nature of the relationship between the slope and quadratic factors. Now I wish I had posted anonymously! :) (The 10 in the quad was a typo.) The growth pattern, based on means, looks something like this: 3.005 8.647 8.356 7.889 7.752 7.611 7.587 7.536 7.544. What I was seeing here is a decreasing linear pattern, with a quadratic overlay at the beginning (treatment effect). I guess I am still confused about time-score usage. You had suggested 0 1 1 1 1 ... for an earlier problem I had with a similar pattern. But maybe a very different model is indicated here? I am actually interested in relating this variable to another one in a dual growth model, so I want to keep the simple growth pattern as uncomplicated as possible. Thanks for your assistance! Silvia


I suggested 0 1 1 1 1 ... because I see a big jump between time 1 and time 2 and then a flattening out. I guess one question is whether the difference between 8.6 and 7.5 is meaningful. Is that type of decline important enough to model? If not, I would fit a model with one growth factor and time scores 0 1 1 1 .... Is the increase from time 1 to time 2 important to model, or can you start the time series at time 2?


Thanks for your previous note. Yes, the difference between 8.6 and 7.5 is meaningful in our context and worth modeling. The time 1 value is the baseline before intervention (percent days abstinent from alcohol). If I leave it out of this model, what exactly does the intercept factor mean? Could I account for baseline abstinence by using the time 1 variable as a covariate (PDAslope ON PDA)? How is the interpretation of this different from having the baseline be an indicator of the intercept factor in the growth model (i.e., PDAslope on PDAint)? I did try taking out the first time point and got a barely acceptable fit. The modindices suggest that time 15 is the most problematic. If I free this, what are the implications for interpreting the drop-off from 8.6 to 7.5? Here are the commands used:

ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  PDAint BY PDAmn01-PDAmn15@1;
  PDAslope BY PDAmn01@0 PDAmn03@1 PDAmn05@2 PDAmn07@3
      PDAmn09@4 PDAmn11@5 PDAmn13@6 PDAmn15@7;
  [PDAmn01-PDAmn15@0 PDAint PDAslope];
  ! correlated residuals: all adjacent residuals are correlated
  PDAmn01 WITH PDAmn03;
  PDAmn03 WITH PDAmn05;
  PDAmn05 WITH PDAmn07;
  PDAmn07 WITH PDAmn09;
  PDAmn09 WITH PDAmn11;
  PDAmn11 WITH PDAmn13;
  PDAmn13 WITH PDAmn15;
  ! effects of interest
  PDAslope ON PDA;

Thanks for your help.


I think the reason you are having problems is that the model after the first time point is not linear. Looking at all of the time points, it seems like you have two things going on. First you have the big jump from 3 to 8.6. Then a small series of declines, and then a leveling out. You could consider a piecewise model with two pieces: the first representing the initial growth (up to 8.6) and the second representing the decline and leveling out. So you could have time scores of 0 1 1 1 1 1 1 1 1 for the first slope and 0 0 1 2 3 4 5 5 5 for the second slope. I think this reflects your means: 3.005 8.647 8.356 7.889 7.752 7.611 7.587 7.536 7.544. So you would have one intercept factor and two growth factors. I don't think the variance of the first slope factor is identified, so fix it to zero. Also fix its covariance with the other growth factors to zero. Then you could have different predictors of the initial jump and the decline. Also, don't put the residual covariances in until you have fit this model.


Also, leave the ON statement out until you get the growth model fit. 


Hi again, That model ran (hurray); the fit is poor (CFI = .89) but not hopeless.

ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  PDAint BY PDA-PDAmn15@1;
  PDAslop1 BY PDA@0 PDAmn01@1 PDAmn03@1 PDAmn05@1 PDAmn07@1
      PDAmn09@1 PDAmn11@1 PDAmn13@1 PDAmn15@1;
  PDAslop2 BY PDA@0 PDAmn01@0 PDAmn03@1 PDAmn05@2 PDAmn07@3
      PDAmn09@4 PDAmn11@5 PDAmn13@5 PDAmn15@5;
  [PDA-PDAmn15@0 PDAint PDAslop1 PDAslop2];
  ! constraints on model
  PDAslop1@0;
  PDAslop1 WITH PDAslop2@0;

I'm sending you the output via email. Some of the big modindices are puzzling to me, e.g.:

  PDASLOP2 BY PDA    124.764    2.576    1.382    0.463

Why would it want me to free the first indicator of the second growth factor?


Hi Linda, I too am having Mplus (v3) report negative variances with simple linear growth modeling. The code for one example is:

ANALYSIS: TYPE = MEANSTRUCTURE MISSING H1;
MODEL:
  i BY cog0@1 cog2@1 cog6@1;
  s BY cog0@0 cog2@2 cog6@6;
  [cog0-cog6@0 i s];
  i s ON group;
OUTPUT: SAMP;

The negative variances are normally associated with s. Otherwise the output looks good. Indeed, when I plot the predicted means for the two groups using the Mplus parameters, they are pretty good despite the negative variances. Is it OK to just ignore the negative variances? Can I just set the offending variances to zero? Any advice would be well received. There are only 3 time points (0, 2, and 6 months) and only about 100 cases (some with missing data). Group is a dichotomous variable. Cog is continuous (fairly normal). Many thanks, Peter

bmuthen posted on Sunday, August 28, 2005  8:20 pm



This may simply imply that you have no individual variation in the slopes and should therefore treat it as fixed, i.e. fix its variance at zero. 
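In Mplus syntax, fixing the slope variance at zero for a model like Peter's might be sketched as below (variable names taken from his post; the note about older versions reflects the point made later in this thread that newer versions fix such covariances automatically).

```
MODEL:
  i BY cog0@1 cog2@1 cog6@1;
  s BY cog0@0 cog2@2 cog6@6;
  [cog0-cog6@0 i s];
  s@0;            ! no individual variation in slopes: treat as a fixed effect
  i WITH s@0;     ! needed by hand in older versions when a variance is fixed at zero
  i s ON group;
```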

Andy posted on Thursday, November 03, 2005  6:15 pm



Hi, I have a random intercepts and slopes growth model as in Ex 6.1 on p. 75 of the V3 manual, except that I have 3 time points (0, 1, 2). I have a continuous outcome with 80 individuals. There are 6 covariance df to play with, and I'm fitting a model using all 6 (i.e., random intercept and slope variances, their covariance, and 3 unconstrained residual variances). Equating the sample covariance matrix to the model covariance parameters and solving the 6 equations in 6 unknowns gives the solution as reported in Mplus. As with others, I get a negative residual variance estimate for the error variance at my time 2, and the numerical value reported in Mplus matches the algebraic calculation. Denoting the sample covariance elements by Sij, the estimate of the error variance at my time 2 is S33 + S13 - 2*S23. My data just happens to have S13 as the smallest element in the matrix and S23 approximately the same as S33, and this gives the negative variance estimate. The algebra indicates that these negative variances can occur quite frequently in this model with a saturated covariance parameterisation. Have you found this to be so in your experience with many datasets? What is the usual recommended action? Is it to set the offending error variance to zero, or instead to add some constraints to the error variances? This has reinforced to me some of the limitations of growth modelling with 3 time points. Do you recommend against fitting such an unconstrained error variance model with 3 time points? Many thanks.
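As a check on Andy's algebra, a sketch using the standard linear-growth parameterisation: intercept variance ψ00, slope variance ψ11, their covariance ψ01, time scores 0, 1, 2, and θ3 the residual variance at the last occasion.

```latex
\begin{aligned}
S_{13} &= \psi_{00} + 2\psi_{01} \\
S_{23} &= \psi_{00} + 3\psi_{01} + 2\psi_{11} \\
S_{33} &= \psi_{00} + 4\psi_{01} + 4\psi_{11} + \theta_{3} \\
\Rightarrow\; \theta_{3} &= S_{33} + S_{13} - 2\,S_{23}
\end{aligned}
```

So when S23 is close to S33 and S13 is the smallest element, θ3 ≈ S13 − S33 < 0, matching the reported negative estimate.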

bmuthen posted on Friday, November 04, 2005  7:54 am



In my experience, negative residual variances often happen with strongly skewed outcomes. If your residual variance comes out negative, you can hold the residual variances equal across time points or fix the offending one at zero. I recommend having more than 3 time points so that you can fit a more flexible and realistic model (correlated residuals, free time scores, ...).
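In Mplus syntax, holding the residual variances equal across time uses an equality label. A minimal sketch, with hypothetical variable names y1-y3:

```
MODEL:
  i s | y1@0 y2@1 y3@2;
  y1 y2 y3 (1);    ! same label: residual variances held equal over time
```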

Andy posted on Sunday, November 06, 2005  8:44 pm



Dear Dr Muthen, Thanks for your quick reply. My data was a bit right-skewed, but the negative variance persists with square-root and log transforms of my data. I also tried the model with the classic Potthoff and Roy (1964) growth dataset involving the pituitary gland in boys and girls, and with a few other datasets. Whenever I restricted the model to 3 time points I got a negative variance somewhere in the model. These datasets had constant sample variances or variances increasing with time, and constant off-diagonal correlations or decreasing correlations. All gave negative variances somewhere. So now I'm wondering whether it is the exception rather than the rule to be able to estimate all 3 residual variances unconstrained when you have only 3 time points? I'm also curious why Mplus does not implement nonnegative variance constraints in the estimation routines, e.g., by reparameterising a variance sigma^2 as exp(A) where A = ln(sigma^2), and estimating A rather than sigma^2 directly? Thanks again.

BMuthen posted on Saturday, November 12, 2005  6:24 pm



I don't think this is an issue that comes up because of three time points. I have seen many such applications without problems. If you want us to look into your particular data sets, please send the inputs, data, outputs, and your license number to support@statmodel.com. We do not have nonnegative variance constraints automatically built in because we do not want to hide this information from the analyst. An analyst can fix a residual variance to zero if they want, or constrain the variance to be nonnegative.


We're preparing a dual process model looking at differences in growth between parents and children; however, we obtained a negative variance for one of the slope factors. Does anything about the syntax below suggest an overfitted or problematic model?

USEVAR ARE achvl sachvl tachvl pchvl spchvl tpchvl sex gen2;
MISSING ARE ALL (999);
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  int_a BY achvl@1 sachvl@1 tachvl@1;
  slope_a BY achvl@0 sachvl@1 tachvl@2;
  int_p BY pchvl@1 spchvl@1 tpchvl@1;
  slope_p BY pchvl@0 spchvl@1 tpchvl@2;
  [achvl@0 sachvl@0 tachvl@0 pchvl@0 spchvl@0 tpchvl@0
   int_a slope_a int_p slope_p];
  int_p slope_p int_a slope_a ON sex;
  int_p slope_p int_a slope_a ON gen2;
  int_p WITH slope_p@0;
  int_a WITH slope_a@0;
  int_p WITH int_a;
  slope_a WITH slope_p;
  slope_p WITH int_a;
  slope_a WITH int_p;
OUTPUT: STANDARDIZED SAMPSTAT MODINDICES;


The statements "int_p WITH slope_p@0;" and "int_a WITH slope_a@0;" are unusual because the intercept and slope of the same process are typically correlated. A negative slope variance may simply mean that the slope variation is almost zero, in which case you want to fix the variance at zero and consider the slope a fixed effect rather than random. Also, I would recommend switching to the newer growth language to simplify your input. You may also want to correlate contemporaneous residuals using WITH statements.
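With the newer growth language, the dual process model above might be sketched as follows. This is an illustration only, using the variable and factor names from the original post; the WITH statements for contemporaneous residuals are one possible choice, not a required specification.

```
MODEL:
  int_a slope_a | achvl@0 sachvl@1 tachvl@2;
  int_p slope_p | pchvl@0 spchvl@1 tpchvl@2;
  int_a slope_a int_p slope_p ON sex gen2;
  ! contemporaneous residuals correlated across the two processes:
  achvl WITH pchvl;
  sachvl WITH spchvl;
  tachvl WITH tpchvl;
```

With the | language, the growth factor means, intercepts, and covariances are given their conventional defaults, so the earlier @0 covariance constraints are no longer imposed.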

Amber Watts posted on Thursday, October 29, 2009  11:58 am



I have a similar problem to Peter above. Bengt suggested: "This may simply imply that you have no individual variation in the slopes and should therefore treat it as fixed, i.e. fix its variance at zero." When I fix the slope variance to zero I still get a problem, but if I fix the intercept-slope covariance to 0 the model runs fine. Is this an acceptable thing to do?


It sounds like you are using an old version of the program. If a variance is fixed to zero, all covariances with the variable need to be fixed to zero. This has been done automatically in Mplus for some time. 

Amber Watts posted on Thursday, October 29, 2009  12:25 pm



I am using Mplus Version 5. If I fix slope@0, the TECH4 output says that the level-level covariance is negative even though the level-slope covariance is 0.00. If I fix level@0, TECH4 says the slope-slope covariance is negative even though the level-slope covariance is 0.00. If I fix level WITH slope@0, I no longer get the error message, and both the level-level and slope-slope covariances become positive. What does this mean?


You need to send the outputs and your license number to support@statmodel.com. 

Victor Heh posted on Friday, October 30, 2009  8:00 am



CFA approach:

           Estimate     S.E.   Est./S.E.   P-Value
Means
  I          12.404    0.791      15.681     0.000
  S           0.655    0.208       3.151     0.002
Variances
  I          44.827    9.051       4.952     0.000
  S           1.624    0.795       2.043     0.041

Using TSCORES:

Means
  I          12.474    0.809      15.424     0.000
  S           0.672    0.237       2.832     0.005
Variances
  I          49.958    9.755       5.121     0.000
  S           0.253    1.097       0.231     0.818

My concern is about the negative variance in the CFA approach. How do I specify my model with CFA to obtain the same result as the two-level approach?


I would need to see the full outputs to understand exactly what you are doing. Please send them and your license number to support@statmodel.com. 


I just have the following question for you: what does a zero slope variance mean (considering a real zero, rather than one forced to zero)? Does it mean that the interindividual differences at the end of the study are the same as those at the beginning? Or does it mean that the interindividual differences come close to zero over the course of the study?


The former. 


Hello Drs. Muthen, I'm doing an ESEM model with categorical indicators. When I try to run the model, I get an error message stating that one of my variables has a negative residual variance. Since the dataset I was using is one of several datasets created using multiple imputation (to address missing data), I decided to try running the model in one of the other datasets to see if the model would converge. It did; I then proceeded to run the model in a few more datasets. What I'm seeing is that the model converges in about half of the datasets. When the model converges (with good fit indices: CFI = .99, TLI = .98, RMSEA = .07), the residual variance is about .01. When the model does NOT converge, the residual variance is about -.01. How would you suggest I proceed with interpreting and/or adjusting this model? Best, Chris


Please send to support@statmodel.com. 


I have computed latent growth models and have come across some problems I don't seem to be able to resolve. I have two scales for which item data were collected at three time points. I modeled each of these scales separately in univariate LGM models, using the items as indicators of the latent construct variables at the three time points. I obtained sensible estimates for each LGM. However, when specifying a multivariate model where the slopes and intercepts of the two scales are allowed to covary, to examine potential relationships between the growth processes, I obtained the following warning:

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A
LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES.
CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE S_CAL.

[S_CAL is the slope of one of the two scales.] I also obtained a negative value for the variance of S_CAL (which did not occur in the univariate LGM of this scale). Please can you advise how to solve this problem?


Sometimes with a parallel process model, there is a need for residual covariances across processes at each time point. 


Dear Linda, Many thanks for your quick reply. If I understand you correctly, you advise me to allow the residuals of the latent variables of the two constructs at each time point to covary. I have already done this. Sorry for not making this explicit. However, it does not help to obtain a model that converges. 


Please send the output and your license number to support@statmodel.com. 

Ginnie posted on Thursday, April 25, 2013  2:53 pm



Dear Dr. Muthen, I am preparing a growth model for parallel processes; however, I obtain an error message regarding the variance of s2.

VARIABLE:
  names are y1-y35;
  usevariables y7-y10 y22-y25;
  missing = all (999);
ANALYSIS: estimator = ml;
MODEL:
  i1 s1 | y7@0 y8@1 y9*2 y10*3;
  i2 s2 | y22@0 y23@1 y24*2 y25*3;
  s1 on i2;
  s2 on i1;
OUTPUT: tech4; MODINDICES (ALL);

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A
LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES.
CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE S2.

Can you kindly advise me how to fix it? Thanks! Ginnie


If the variance is a small, nonsignificant negative value, you can fix it at zero, for example: s2@0;
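Applied to Ginnie's input, the fix might be sketched as below. Here s2 is a dependent variable (regressed on i1), so s2@0 fixes its residual variance; this is an illustration, not the only possible remedy.

```
MODEL:
  i1 s1 | y7@0 y8@1 y9*2 y10*3;
  i2 s2 | y22@0 y23@1 y24*2 y25*3;
  s1 ON i2;
  s2 ON i1;
  s2@0;    ! fix the small negative (residual) variance at zero
```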


Hi, in a multilevel growth model I want to fix the variance of the slope at zero (a random-intercept-only model), but I want to keep the correlation between the intercept and the slope. How can I do that?


I don't recommend doing this because a correlation implies a relationship between two variables and if one of them has variance zero it is not a variable. 


Then, if the slope has a nonsignificant variance near zero and the covariance between the intercept and the slope is significant, would I have to keep both (the variance of the slope and the relation between the intercept and the slope) even if the fit is worse with the slope variance included? Thank you very much for your help.


And sorry for my English...


You would either keep the covariance and variance for the slope, or get rid of both. 
