
Greg Norman posted on Tuesday, March 11, 2008  1:36 pm



How are the degrees of freedom determined for a path analysis model? My model has 6 observed variables (5 dependent, 1 independent) and I estimated 18 parameters (9 regression paths, 4 correlation paths, 5 residual variances). My model chi-square test has 2 degrees of freedom. I thought that with 6(7)/2 = 21 elements minus 18 parameters, I would have 3 degrees of freedom. Thanks, Greg


The information available is different for dependent and independent variables. You have 5*6/2 = 15 variances and covariances for the dependent variables and five covariances between the independent variable and the five dependent variables. This is a total of 20 elements, which with 18 estimated parameters gives 2 degrees of freedom.
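The counting rule in the reply above can be sketched in a few lines of Python. This is only an illustration: the function name `h1_count` and its arguments are made up for this note and are not part of Mplus. Means are left out on both sides, as in the reply; including them would add five means to H1 and five intercepts to H0, which cancel.

```python
def h1_count(n_dv, n_iv, with_means=False):
    """Pieces of information (H1 parameters) when the model is conditioned on the IVs.

    Hypothetical helper for illustration; not an Mplus function.
    """
    dv_varcov = n_dv * (n_dv + 1) // 2       # variances and covariances among the DVs
    dv_iv_cov = n_dv * n_iv                  # covariances between each DV and each IV
    dv_means = n_dv if with_means else 0     # DV means, only if a mean structure is counted
    return dv_varcov + dv_iv_cov + dv_means

# Greg's model: 5 dependent variables, 1 independent variable
h1 = h1_count(n_dv=5, n_iv=1)   # 15 + 5 = 20
h0 = 9 + 4 + 5                  # regression paths + correlations + residual variances = 18
df = h1 - h0
print(f"H1 = {h1}, H0 = {h0}, df = {df}")  # H1 = 20, H0 = 18, df = 2
```

The variance of the independent variable never enters the count, which is why the total is 20 rather than 21.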

Greg Norman posted on Tuesday, March 11, 2008  2:22 pm



very helpful! Thanks, Greg 

Greg Norman posted on Wednesday, March 12, 2008  2:32 pm



After some further discussion with my colleague about the degrees of freedom in the path model we ran the same model in EQS. In EQS it estimated 19 parameters (rather than 18 in Mplus) with the one additional parameter being the variance of the observed independent variable. The model chi-square also had 2 degrees of freedom [21 - 19] (rather than 20 - 18 in Mplus). The chi-square value and the parameter estimates were very similar but not exact between the two programs. It looks like Mplus and EQS are taking different approaches to how the information in the variance-covariance matrix is used. Is this true? Is it the case that Mplus doesn’t use or estimate the variance of strictly exogenous independent variables since the variance isn’t technically part of the specified model? Thanks, Greg


It seems that EQS is treating all variables as dependent variables and estimating the variance of the independent variable in this case. My guess is that the estimate is the same as the sample variance. We do not do this because a model is estimated conditioned on covariates. Means, variances, and covariances are not estimated for independent variables as part of the model. The difference you see in chi-square is most likely due to Mplus using n and EQS using n - 1.
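To get a feel for how small the n versus n - 1 difference is, here is a rough sketch. It assumes that both programs reach the same minimum of the ML fit function and that the test statistic is simply the multiplier times that minimum; the sample size and chi-square value below are invented for illustration and do not come from Greg's output.

```python
def rescale_chi_square(chi2_n, n):
    """Convert a chi-square computed as n * F into one computed as (n - 1) * F.

    Hypothetical helper; assumes the same fit-function minimum F in both programs.
    """
    return chi2_n * (n - 1) / n

n = 200                    # made-up sample size
chi2_n_based = 4.0         # made-up chi-square using the n multiplier
chi2_n_minus_1 = rescale_chi_square(chi2_n_based, n)
print(round(chi2_n_minus_1, 4))  # 3.98
```

The two values differ by a factor of (n - 1)/n, so the gap shrinks as the sample grows.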


I have a simple path model and tried following the above steps to determine how the degrees of freedom were calculated. I have 1 independent variable and 4 dependent variables. I applied the above formula and calculated (4*5)/2 = 10 variances and covariances for the DVs and 4 covariances between the IV and the 4 DVs. This results in a total of 14 elements. I believe the following parameters should be estimated in the model: 4 regression paths (the IV to each of the DVs), 3 correlations (the 4 DVs are autocorrelated repeated measures) and 4 residual variances for the DVs, for a total of 11. Therefore, the degrees of freedom should be 3, but I am getting a just-identified model with 0 df in my output. Am I missing some parameters in my calculation of df? Thank you very much.


How many parameters does the output say are estimated? Which estimator are you using? 


I am not sure how to tell how many parameters are being estimated in the output. It says that there are 4 dependent variables and 1 independent variable and under information criteria the number of free parameters is 18. I do not think this is what you are asking for though. Would you mind directing me as to how to determine the number of parameters being estimated? I am using MLR. Thank you. 


Besides the fourteen parameters that you give, there are also four means of the dependent variables for a total of 18. If 18 parameters are being estimated, the degrees of freedom are zero. You can see the 18 parameters that are being estimated by looking at the Results or TECH1. If there are parameters being estimated by default that you do not want in the model, you can fix them at zero.


I am running a path model (i.e., a cross-lagged panel design) and I have missing data. I approached this in two ways: 1) I used 10 imputed datasets and had Mplus run the path models and combine results appropriately (TYPE=IMPUTATION) and 2) I used Mplus with the dataset with missing data and used FIML. I wanted to see whether results differed much between the two methods. I have 9 observed variables, 22 paths, and 6 dependent variables. As expected, results did not differ much, but I am trying to figure out why the degrees of freedom differ between the 10 imputed datasets and the one dataset with FIML for the exact same model. Why should the number of free parameters differ at all and how are the df's calculated? Using FIML: number of free parameters = 36, degrees of freedom = 13. Using 10 imputed datasets: number of free parameters = 34, degrees of freedom = 11. If I do not use FIML (and just use the model with listwise deletion), the number of free parameters is the same as with the 10 imputed datasets (34 free). Also, I would have preferred to use the 10 imputed datasets but I cannot get Mplus to run MODEL INDIRECT using imputed datasets (and I cannot see how using constraints actually 'tricks' Mplus into doing it). Thank you.


Please send the two outputs and your license number to support@statmodel.com. Be sure they both include TECH1. 

Tinchi Lin posted on Thursday, October 17, 2013  10:06 am



Hi Linda and Bengt, Example 3.1 has three observed variables: one DV and two IVs. Why is the model just-identified (0 df in the chi-square test for goodness of fit)? How does Mplus determine the number of known pieces of information in this special case - why is it not (3*(3+1))/2 = 6? Or, referring to the first two posts in this thread, why isn't the number of known pieces of information 3, which = 1 (variance of the DV) + 2 (covariances between y and the x's)? Example 3.1 is a simple linear regression and four parameters are to be estimated, including the DV's mean. Sometimes Mplus estimates a DV's mean and sometimes it does not. How does Mplus determine whether to estimate a DV's mean or not? Many thanks,


The H1 model is a mean and variance for the dependent variable and two covariances between the dependent variable and the independent variables for a total of four parameters. Regression models are estimated conditioned on the covariates. The means, variances, and covariances of the observed exogenous independent variables are not model parameters. The H0 model has one intercept, one residual variance, and two regression coefficients for a total of four parameters so the degrees of freedom are zero. In a conditional model, the intercept of the dependent variable is always estimated unless TYPE=NOMEANSTRUCTURE is used. 
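The two parameter lists in the reply above can be written out item by item. The sketch below only tallies the counts named in the reply; the dictionary labels are informal descriptions, not Mplus output names.

```python
# H1: unrestricted model for y, conditioned on x1 and x2
h1 = {
    "mean of y": 1,
    "variance of y": 1,
    "covariances of y with x1 and x2": 2,
}

# H0: the regression model of Example 3.1
h0 = {
    "intercept of y": 1,
    "residual variance of y": 1,
    "regression coefficients on x1 and x2": 2,
}

n_h1 = sum(h1.values())
n_h0 = sum(h0.values())
df = n_h1 - n_h0
print(n_h1, n_h0, df)  # 4 4 0
```

Neither list contains the means, variances, or covariance of x1 and x2, which is why the count is 4 rather than 6 (or 9 with means).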


Hello, I would like help calculating degrees of freedom. I receive this error message: “THE DEGREES OF FREEDOM FOR THIS MODEL ARE NEGATIVE. THE MODEL IS NOT IDENTIFIED. NO CHI-SQUARE TEST IS AVAILABLE. CHECK YOUR MODEL.” I am trying to run a mediation model with all manifest variables: 1 predictor (X), 1 mediator (M), 1 outcome variable (Y), and 4 control variables (C1-C4) on which I have regressed M and Y and which I have correlated with X. I have missing data on C4 which I want to handle with FIML, so I specified its variance and covariances. I am also creating an indirect effect variable a*b for bootstrapping. Thus, I thought I would be estimating 23 parameters (see code below), and with 7(8)/2 = 28 elements, my degrees of freedom should be 5. However, the chi-square degrees of freedom in the output is 2 and the number of free parameters = 37. I also noticed that means and intercepts are included in the output although I did not specify this.
MODEL:
M ON X (a);
Y ON M (b);
Y ON X;
M ON C1 C2 C3 C4;
Y ON C1 C1 C2 C3 C4;
X WITH C1 C2 C3 C4;
C4; ! Variance
C4 WITH M Y C1 C2 C3; ! Covariances
MODEL CONSTRAINT:
NEW (ab1);
ab1 = a*b;


The same variables cannot be used in both an ON and WITH statement, for example, m ON c4 and m WITH c4. They cannot both be identified. 


Dear Linda, Thank you for your response. I removed M and Y from the C4 WITH statement and the model now converges. However, I am still wondering how degrees of freedom are calculated in this situation. I thought I had 7(8)/2 = 28 elements and am now specifying 21 parameters for a df of 7. But my output states the number of free parameters is 35 and the chi square df is 0. Thank you kindly 


Please send the full output and your license number to support@statmodel.com so I can see exactly what you are looking at. 


I am having trouble, like many others before, connecting what Rex Kline defines as model degrees of freedom with what Mplus defines as model degrees of freedom. My model is the following:
Y1 ON Y2 X1 X2 IV1;
Y2 ON Y1 X1 X2 IV2;
Y1 WITH Y2;
[Y1@0];
[Y2@0];
I have 6 observed variables, thus using the v(v+1)/2 rule I get 21 model observations. I can count that I have 20 model statistics (8 effect arrows + 6 variances (one for each non-dependent variable and one for each error term) + 6 covariances between independent variables plus the error term (IV1/X1 IV1/X2 IV2/X1 IV2/X2 X1/X2 e1/e2) = 20). Thus, I should have 21 - 20 = 1 df for the model. Mplus reports 11 free parameters and 2 degrees of freedom. Can you please help me identify the discrepancy?


The v(v+1)/2 rule refers to the covariance structure and does not work when you have a mean structure, which you have imposed when you say [y1@0], [y2@0]. Instead you should go by the simple and general rule of df = h1 - h0, where h1 and h0 refer to the number of parameters that those two models have. H1 can be computed as the 8 covariances between the 2 y's and the 4 x's plus the 5 mean, variance, and covariance parameters of the two y's. That is 13 H1 parameters. H0 has the 11 free parameters of your model (I think you know which they are). So 13 - 11 = 2. Alternatively, you can bring the 4 x's into the model and into the counting of the h1 and h0 parameters, but these extra parameters for the x part cancel out in the df calculation.
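The cancellation mentioned at the end of the reply above can be checked with a little arithmetic. This is a sketch under the counts given in the reply (2 y's, 4 x's, 11 free H0 parameters); the variable names are invented for this note.

```python
n_y, n_x = 2, 4

# Conditional counting (x part left out), as in the reply
y_x_cov = n_y * n_x                       # 8 covariances between y's and x's
y_part = n_y + n_y * (n_y + 1) // 2       # 2 means + 3 variances/covariances = 5
h1_cond = y_x_cov + y_part                # 13
h0_cond = 11                              # free parameters reported by Mplus
df_cond = h1_cond - h0_cond

# Bringing the x's in: their means, variances, and covariances
# are added to BOTH H1 and H0, so they drop out of the difference
x_part = n_x + n_x * (n_x + 1) // 2       # 4 means + 10 variances/covariances = 14
h1_full = h1_cond + x_part                # 27
h0_full = h0_cond + x_part                # 25
df_full = h1_full - h0_full

# Cross-check: 6 variables with means give v + v(v+1)/2 = 27 pieces of information
v = n_y + n_x
assert h1_full == v + v * (v + 1) // 2

print(df_cond, df_full)  # 2 2
```

Either way of counting gives 2 degrees of freedom. The v(v+1)/2 = 21 figure misses the means, which matter once the two intercepts are fixed at zero.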


Thanks Bengt! I realize my error. Can you advise me how I estimate this model without a mean structure? I thought by imposing @0 I was removing the mean structure, but I see now that I am imposing it at 0 in my syntax. 


Follow up: Actually, can you refer me to the advantages or disadvantages of fixing the means at zero? I thought that I had to do this in order to correctly estimate the two simultaneous feedback paths (through and) between Y1 and Y2. Again, my model is the following: Y1 ON Y2 X1 X2 IV1; Y2 ON Y1 X1 X2 IV2; Y1 WITH Y2; Many, many thanks. 


I don't know that fixing the intercepts like you do has any usefulness in this modeling. I would not do that, unless there is literature I don't recall at the moment. And, I'm not sure that the WITH parameter is identified.


Thanks Bengt. If I run the model without the WITH parameter, then the covariance between e_Y1 and e_Y2 is automatically set to zero (version 7.4). I thought I wanted the intercepts out for two reasons. The first is that the example nonrecursive models with a simultaneous feedback loop in Rex Kline (2011) and Pamela Paxton et al (2011) both are discussed without mean structure/intercepts. The second is that I worry about two nonzero constants in the equations somehow biasing the feedback loop. I haven't found any literature yet but each time I trace a circular path through Y1 and Y2 I would have the impact of the constants, and this worries me about the long term trajectory of the system. Do you know anything about this? 


If you fix the intercept to zero by saying [y@0] you are introducing a mean structure, not removing it. Don't do that. Kline and Paxton don't fix these intercepts to zero, I assume.


Good afternoon Dr. Bengt and Linda Muthen, I am trying to figure out where I am going wrong in calculating the degrees of freedom for my model. I have 12 dependent variables and 10 independent variables in a CLPM design. By using Kenny's (2011) formula of known values, k(k+1)/2, I get 253 known values for my model. Mplus states there are 69 free parameters and 141 df. Clearly 253 - 69 is not 141, so am I using the wrong formula in this instance? Thank you for your help and any information that can be provided.


Mplus does not estimate the parameters of the IV marginal part of the model, just like in regression analysis, so that part is not counted as pieces of information (H1 parameters) or as model parameters. You get those from the Sampstat. So the number of pieces of information are the 12 DV means, the 12*13/2 DV variances and covariances, and the 12*10 covariances between the DVs and the IVs. This gives you df=141.
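Spelling out the arithmetic in the reply above, next to the k(k+1)/2 figure from the question, may make the gap easier to see. The variable names below are chosen for this note only.

```python
n_dv, n_iv = 12, 10

# Rule applied to all 22 variables, as in the question
k = n_dv + n_iv
all_variables_rule = k * (k + 1) // 2     # 253

# Counting used in the reply: the IV marginal part is left out
dv_means = n_dv                           # 12
dv_varcov = n_dv * (n_dv + 1) // 2        # 12*13/2 = 78
dv_iv_cov = n_dv * n_iv                   # 12*10 = 120
h1 = dv_means + dv_varcov + dv_iv_cov     # 210

h0 = 69                                   # free parameters reported by Mplus
df = h1 - h0
print(all_variables_rule, h1, df)  # 253 210 141
```

The 253 figure adds the 55 variances and covariances among the IVs and omits the 12 DV means, which is how it ends up 43 above the 210 that Mplus uses.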


Thank you much for your detailed explanation. 


Hello, I am running a CLPM with 6 dependent variables and 3 independent variables. My syntax is as follows:
y1 on m1 m2 x1;
y2 on m1 m2 m3 x2;
y3 on m2 m3 x3;
m1 on x1 x2;
m2 on x1 x2 x3;
m3 on x2 x3;
y1 with y2 y3;
y2 with y3;
m1 with m2 m3;
m2 with m3;
x1 with x2 x2;
x2 with x3;
I have included within-wave correlations for the independent variables because I'd like to use FIML to manage the missing data on these variables. By my count, there are 21 variances/covariances for the dependent variables and 18 covariances between independent and dependent variables, for a total of 39 elements. For estimated parameters, I'm counting 17 pathways, 9 correlations, and 6 residual variances. This would give me 39 - 32 = 7 df. However, Mplus calculated 10 df with 44 estimated parameters. Can you help me understand how Mplus calculated these df? Any help would be much appreciated.


We need to see your full output. Please send your output to Mplus Support along with your license number.
