The degrees of freedom are not calculated in the regular way for MLMV. They are instead estimated according to formula 109 on page 281 of the Mplus User's Guide.
Anonymous posted on Monday, January 10, 2000 - 4:39 pm
For scholars doing applied research, formula 109 is hard to understand. Could you explain in plain English what affects the degrees of freedom? For instance, I noticed that using the same model specification but different data (with the same number of cases) can result in different degrees of freedom. What is the reason for that? Thanks a lot in advance.
I haven't found a good intuitive way to explain the degrees of freedom with the mean- and variance-adjusted approach. I don't think you will find one in Satorra's work either. The d.f. is data dependent because it draws on the estimates, their derivatives, and the asymptotic covariance matrix of the sample statistics with the aim of choosing the degrees of freedom that gives a trustworthy chi-square-based p value. But that's not very intuitive. We'll post a good intuitive explanation if and when we find one. Others?
I have a situation where I have a group of latent factors (Group A) that are descriptive of the work environment. There is a second group of factors (Group B) that are descriptive of safety levels in the workplace. I have a group of 3 criterion (Group C) variables. The pattern of relationships in the model is A->B->C. In my model, I have some non-significant paths between Group B and Group C. When I drop one of the paths between Group B and Group C, I actually lose degrees of freedom - it appears as though Mplus is automatically correlating residuals among my Group C variables, and adding some correlations among other variables. Any thoughts on what is going on?
I would have to see the output to say exactly. This could be due to the estimator. WLSMV and MLMV do not calculate degrees of freedom in the regular way. Or as you suggest, it could be due to certain default residual covariances. In the output, it shows how many parameters are in the model. You can check that in both cases and then look in the results section to see which parameters differ between the two runs.
Mplus defaults are described on pages 158-160 of the Mplus User's Guide. If you cannot figure it out, please send the two outputs to firstname.lastname@example.org.
If there are parameters in the model by default, you can fix them to zero by, for example,
Anonymous posted on Friday, November 05, 2004 - 9:09 am
I have a path model with 10 continuous variables. Due to non-normality of the variables, I am using the MLMV estimator. From my output the number of free parameters is 38, but I have only 37 parameters estimated. Am I missing something? How do you calculate free parameters for the MLMV estimator?
LMuthen posted on Friday, November 05, 2004 - 10:22 am
The degrees of freedom for the MLMV estimator are not computed in the regular way. See formula 110 in Technical Appendix 4.
Anonymous posted on Thursday, June 30, 2005 - 10:17 am
I have a sample size of 24 and have estimated a model with 9 dependent variables and 6 independent variables. Using ML and fitting a particular model, the DF is 70. It seems MPlus disregards the sample size when calculating DF. I realize such a complicated model should not be fit to such a small sample size, but why does Mplus allow it?
In covariance structure models, degrees of freedom are based on the number of restrictions imposed on the covariance matrix, not on sample size. Unless the covariance matrix cannot be inverted, a model will be estimated. However, with such a small sample size, power would be low and standard errors large.
Thank you for your response. I have a further question related to this model.
I am trying to use the MODEL TEST option to test for measurement invariance by gender. In order to do so I wish to label all parameters. When I specify labels in my model command, my model is no longer just-identified. Why is this so?
Here is my model: Y1 ON X1 - X10 (p1 - p10); Y2 ON X1 - X10 (p11 - p20); Y3 ON X1 - X10 (p21 - p30) Y1 (p31) Y2 (p32); Y4 ON X1 - X10 (p33 - p42) Y1 (p43) Y2 (p44); Y5 ON X1 - X10 (p45 - p54) Y1 (p55) Y2 (p56); Y1 WITH Y2 (p57); Y3 WITH Y4 (p58) Y5 (p59); Y4 WITH Y5 (p60);
I would need more information to answer your question. Please send your input, data, output, and license number to email@example.com. Also, please explain to me what you mean by measurement invariance. This usually refers to latent variables.
You can see which parameters are estimated as the default by looking at the results. In your example, you have five degrees of freedom because you have 15 sample statistics and 10 parameters. The fifteen sample statistics come about because you have 6 variances and covariances among the 3 dependent variables and 9 covariances among the 3 dependent and 3 independent variables. The parameters estimated are 7 regression coefficients and three residual variances.
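The count in the reply above can be retraced with a short Python sketch. The function name `conditional_sample_stats` is illustrative and not part of Mplus; the sketch assumes continuous outcomes, a model estimated conditional on the covariates, and no mean structure, which matches the numbers quoted in the reply.

```python
def conditional_sample_stats(n_dep, n_indep, means=False):
    """Count H1 sample statistics for n_dep outcomes given n_indep covariates.

    Illustrative helper (not an Mplus function): variances/covariances among
    the outcomes, covariances between outcomes and covariates, and optionally
    one mean per outcome.
    """
    var_cov_dep = n_dep * (n_dep + 1) // 2   # among the dependent variables
    cov_dep_indep = n_dep * n_indep          # dependent-by-independent block
    mean_terms = n_dep if means else 0
    return var_cov_dep + cov_dep_indep + mean_terms


# Numbers from the reply: 3 dependent and 3 independent variables.
stats = conditional_sample_stats(3, 3)   # 6 + 9
free_parameters = 7 + 3                  # regression coefficients + residual variances
df = stats - free_parameters
print(stats, free_parameters, df)
```

Adding the means would add one sample statistic and one intercept per outcome, so the difference would stay the same.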
Hi, I have the following path model: y ON x1 x2 x3; x1 ON x2 x3;
Mplus tells me that I have 9 free parameters and 0 degrees of freedom. The covariance matrix of variables y and x1-x3 includes 10 elements, so I would have expected to have 1 df. Can you please explain why for Mplus degrees of freedom equal 0?
1 df would imply 1 restriction in the model. But this model has no restrictions - all paths are drawn.
The 9 parameters Mplus reports are for the distribution of y and x1 given x2, x3. The covariates x2 and x3 do not contribute parameters because their distribution is not part of the model (just like in regression). So you should not consider 10 elements, but instead 3 variances-covariances among y and x1 plus 4 covariances between y, x1 and x2, x3, which gives 7 covariance matrix elements. So that's eliminating the 3 covariance matrix elements corresponding to x2, x3. The Mplus parameters are the 5 slopes, the 2 residual variances, and the 2 intercepts. So 9 parameters. There are actually a total of 7 + 2 sample statistics because you add the means of y and x1, since those are part of their distribution. Therefore you get 0 df.
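As a rough check of the bookkeeping in this reply, the following Python sketch tallies both sides for the model `y ON x1 x2 x3; x1 ON x2 x3;`. The variable names are illustrative; the counts themselves are the ones given in the reply.

```python
# Model from the post: y ON x1 x2 x3; x1 ON x2 x3;
dependent = ["y", "x1"]      # variables whose distribution is modeled
covariates = ["x2", "x3"]    # conditioned on, so they contribute no parameters

n_dep, n_cov = len(dependent), len(covariates)

# H1 (unrestricted) sample statistics for the dependent variables given covariates
h1_var_cov = n_dep * (n_dep + 1) // 2   # variances/covariances among y, x1
h1_cross = n_dep * n_cov                # covariances of y, x1 with x2, x3
h1_means = n_dep                        # means of y and x1
h1_total = h1_var_cov + h1_cross + h1_means

# H0 (model) parameters
slopes = 3 + 2                          # y ON x1 x2 x3; x1 ON x2 x3
residual_variances = n_dep
intercepts = n_dep
h0_total = slopes + residual_variances + intercepts

df = h1_total - h0_total
print(h1_total, h0_total, df)
```

The three elements that were in the original count of 10 are the variances and covariance of x2 and x3, which never enter either tally.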
Wei Chun posted on Tuesday, May 11, 2010 - 11:15 pm
I am running a path model and I get a chi-square value of 0, df=0, and CFI=1 and RMSEA=0. It means that the model is just-identified. How to make it to be identified? Thanks for your advice.
A just-identified model is identified. Fit cannot be assessed. To obtain an overidentified model for which fit can be assessed, some paths need to be fixed at zero. Theory should decide on which of these paths are fixed at zero.
I have compared MPLUS and EQS/LISREL output for a three-factor model with 9 indicators. The model has 24 df. Data are highly non-normal.
However, MPLUS does not produce the same ML estimates as EQS and LISREL. Is this a known issue, i.e. is this possible in practice? Note that the ML chi-squares coincide, given that EQS multiplies the fit function by (n-1) while MPLUS multiplies it by (n).
Moreover the MLMV chi-square in MPLUS still has 24 df, while EQS reduces the df to 16.
I could send you the .out files from EQS and MPLUS if you think this is interesting..
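The (n) versus (n-1) scaling mentioned in this post can be illustrated with a small Python sketch. The fit function minimum and sample size below are made-up values for illustration only, and `ml_chi_square` is a hypothetical helper, not a function from any of the programs.

```python
import math


def ml_chi_square(f_min, n, use_n_minus_1):
    """Scale a minimized ML fit function value into a chi-square statistic.

    use_n_minus_1=True follows the (n - 1) convention the poster attributes
    to EQS; False follows the (n) convention attributed to Mplus.
    """
    multiplier = (n - 1) if use_n_minus_1 else n
    return multiplier * f_min


f_min = 0.12   # hypothetical fit function minimum
n = 500        # hypothetical sample size

chi_n = ml_chi_square(f_min, n, use_n_minus_1=False)
chi_n_minus_1 = ml_chi_square(f_min, n, use_n_minus_1=True)

# The two statistics differ only by the factor n / (n - 1).
print(chi_n, chi_n_minus_1, chi_n / chi_n_minus_1)
```

With the same minimized fit function, the two conventions give chi-squares that agree after rescaling by n/(n-1), which is what the poster means by the values coinciding.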
I have never seen a situation where you would obtain the same chi-square in Mplus and Eqs/Lisrel and different parameter estimates. I would have to see that to figure it out.
If you are not using Version 6 or 6.1, the degrees of freedom for MLMV in Mplus will not be the regular degrees of freedom. With MLMV prior to Version 6, the chi-square test statistic and degrees of freedom were adjusted to obtain a correct p-value. Only the p-value is to be interpreted. Starting with Version 6, a new adjustment is used which results in the expected degrees of freedom.
jane jin posted on Wednesday, June 29, 2011 - 3:53 pm
I am doing an analysis using MIMIC model. The following are the input:
f1@1; f1 by y1 - y10*; f1 on group; y1 - y10 on group;
I have one latent variable (f1) and 10 indicators and one covariate (group). The results indicated that I have 31 free parameters, 10 direct effect of indicators on the covariate, 10 thresholds (indicators are dichotomous), 10 loadings and 1 direct effect of the covariate on the latent variable. I got the following message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.282D-11. PROBLEM INVOLVING PARAMETER 21.
I am confused about the model identification problem. I think I should have 65 sample statistics, and the model should be identified, but the message above said the model is not identified. Any thoughts? My question in general is how to compute df in a MIMIC model: are there any special constraints, and what is the known information (if not 65)?
Hi, I have a quick question about equality constraints and degrees of freedom. I am running a multi-group analysis with Mplus (it is a SEM with 2 groups). I have a restrictive model where all paths are set to be equal across groups. To do a chi-square difference test I compare the restrictive model with less restrictive models, which means that I compare it with models where I freed one path coefficient at a time. The problem is that I have a loss of 2 degrees of freedom although I just freed one path. So I would think that I should lose just one degree of freedom (for the freed path). I can't find any hints in the output which could explain the loss of one extra df. Do you have an idea? Thanks a lot!
By freeing one parameter, a default parameter may have also become free. Check either the results to see how the two models differ or check TECH1 for both models.
Jinseok Kim posted on Thursday, November 24, 2011 - 10:59 pm
my path model is as following:
y on x1 x2 x3; x1 on x2 x3; x2 x3 on x4;
I think that there are 15 elements in covariance matrix of y, x1-x4, and that there are 11 free parameters in the model described above, which I think should give 4 degrees of freedom.
Mplus output correctly reports 14 as the number of free parameters. However, it says 3 degrees of freedom. Can you help me understand why the sum of the number of free parameters and degrees of freedom is not same as the number of elements in covariance matrix?
Jinseok Kim posted on Thursday, November 24, 2011 - 11:22 pm
Sorry there has been an error in the previous posting by me. It should have been "Mplus output correctly reports 11 (instead of 14 as in the original posting) as the number of free parameters". Sorry for the confusion.
You have four dependent variables and one independent variable. Therefore your H1 model has four variances and six covariances among the dependent variables and four covariances between the independent variable and the dependent variables. This is a total of 14. The H0 model therefore has three degrees of freedom.
Jinseok Kim posted on Friday, November 25, 2011 - 9:34 pm
Thanks for your prompt response.
From your answer, I suppose that the definition of H1 model in mplus may be different from the full (or saturated) model where the variances of all of the observed variables and covariances between every pair of observed variables are estimated. I think that is what popular sem textbooks (e.g., Kline) say when it comes to the total degrees of freedom (or p=v(v+1)/2, where p is total df, v is number of observed variables).
In the H1 model of mplus you described, the variance of the observed independent variable was not considered when you count the total degrees of freedom (or 14 in this case). Would you explain why the difference is (necessary) in mplus, or direct me to references that I can learn more about the logic in mplus?
The degrees of freedom are the same if you treat the model as an all y model as Kline probably does because you have more parameters. For example, with your model, there are 15 variances and covariances and five means in the H1 model for a total of 20. The H0 model has 5 intercepts, 5 residual variances, and 7 regression coefficients. The degrees of freedom are three. There are not different ways of determining the degrees of freedom for a model.
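To see that the two ways of counting agree for this model, here is a minimal Python sketch. The helper `n_var_cov` and the variable names are illustrative; the counts are the ones stated in the replies above (14 and 11 when conditioning on x4 without means, 20 and 17 for the all-y version with means).

```python
def n_var_cov(p):
    """Number of distinct variances and covariances among p variables."""
    return p * (p + 1) // 2


# Model from the post: y ON x1 x2 x3; x1 ON x2 x3; x2 x3 ON x4;
slopes = 3 + 2 + 2

# "y given x" counting: 4 dependent variables (y, x1, x2, x3), 1 covariate (x4),
# means left out of both H1 and H0.
h1_conditional = n_var_cov(4) + 4 * 1   # 10 var-covs + 4 covariances with x4
h0_conditional = slopes + 4             # slopes + 4 residual variances
df_conditional = h1_conditional - h0_conditional

# "All y" counting: all 5 variables are modeled and means are included.
h1_all_y = n_var_cov(5) + 5             # 15 var-covs + 5 means
h0_all_y = slopes + 5 + 5               # slopes + 5 intercepts + 5 (residual) variances
df_all_y = h1_all_y - h0_all_y

print(df_conditional, df_all_y)
```

Each extra quantity brought into the model (the variance of x4, or a mean) adds one element to H1 and one parameter to H0, so the difference does not change.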
Jinseok Kim posted on Saturday, November 26, 2011 - 4:58 pm
Sorry for these nagging questions but I just want to understand clearly.
In the model in the original posting I could have modeled the variance of x4(or independent variables) as following:
y on x1 x2 x3; x1 on x2 x3; x2 x3 on x4; x4;
By adding the last line in the model statement, the total degrees of freedom gets 15, yet the model degrees of freedom remains 3.
I guess that my confusion is around "all y model" vs. "y and x model(if you will)" distinction as it appeared in your message. Would you explain why the distinction is necessary? Putting differently, what are the benefits of the distinction? Also, please recommend a few references that I can learn more about this topic.
I am asking this because other sem software such as amos does not seem to make such a distinction and treat the model as "all y model" by default.
One totally different question: I was told that "meanstructure" was one of default settings in Ver. 5. Is "meanstructure" still a default setting in ver. 6.12?
The "all y" vs "y given x" distinction is mentioned in the Version History for version 6.1 under the heading "Analysis cond'l on covariates". The distinction becomes important when the y variables are not continuous. For example, with categorical y's you don't have to assume joint normality of x, y*, where y* are the latent response variables for y - instead you only have to assume y* | x normal (see my 1984 Psychometrika article where I made a distinction between case A and B). This is actually the same as in regular regression, where the model does not concern the x's, but only y | x. Unfortunately, SEM started off working with y and x jointly and it didn't matter since in the continuous case the two approaches are the same (see 1975 JASA article by Joreskog & Goldberger). In most of statistical modeling outside SEM, x's are conditioned on (not included in the model or appearing as model parameters), just like Mplus does it.
So with continuous y's, Mplus does not include the parameters for x as free parameters in the optimization to obtain the model parameter estimates, but nevertheless in line with regression allows them to be freely correlated. You can think of it as Mplus knowing that those var-covs are estimated by the sample var-covs (assuming no missing) and can therefore fix the parameters at those values.
When you add x to the model by mentioning its variance - that is, you have 5 y's instead of 4 y's and 1 x - you add 1 parameter to both H1 and to H0, resulting in the same df.
Meanstructure is still the default in 6.12. So with 4 y's you have 4 H1 mean parameters and 4 H0 mean parameters - again not influencing the degrees of freedom. Again, with no missing data the sample means provide the estimates.
I am running a structural equation model with 1 latent factor in MPLUS, but I am a little bit confused about the calculation of the number of (free?) estimated parameters. The total number of pieces of information in a model = (total number of manifest variables*(total number of manifest variables + 1))/2. In my analysis I have 38 manifest variables, so 741 pieces of information. I have 137 degrees of freedom, which means that 604 (741-137) parameters are estimated. These are normally all my intercepts, factor loadings, path coefficients, variances and covariances. When I take a look in the Mplus output, only 48 parameters are estimated. That is exactly the sum of my intercepts, factor loadings and path coefficients. Why aren't my variances and covariances "freely" estimated? I do, however, get estimated variances and covariances in my output. Or am I missing something here?
Please send your output and license number to firstname.lastname@example.org so I can see your full model and explain the calculation of the degrees of freedom.
Eric Teman posted on Thursday, January 26, 2012 - 1:32 pm
When using LISREL versus Mplus to estimate a model, the degrees of freedom are the same, as would be expected, but why are the number of free parameters different? And how does this difference still lead to the same values for the parameter estimates?
I also have a question about calculating the degrees of freedom in Mplus. Is it true that Mplus calculates the total number of data points in the model as p(p+1)/2 + p where p is # of variables and the "+p" accounts for the inclusion of means?
If yes, does this mean that Mplus allows for the potential of having more degrees of freedom in a model compared to other programs? For example, if in Mplus, means and intercepts are fixed to zero, and the same model with no means/intercepts was estimated in another SEM program, would the Mplus model have more degrees of freedom?
It makes sense that the model would be misspecified if you purposely fixed means/intercepts to zero to obtain dfs. But, what if the means or intercepts were not significantly different from zero? Removing them would make the model more parsimonious, but would also give additional degrees of freedom. Would this be an acceptable practice?
Dear Drs. Muthen. I understand that I cannot assess the model fit of a just-identified model. But how can I compare the performance of a just-identified model with other nested models that have fewer free parameters (e.g., some fixed to zero)? Thank you.
Does the Mplus output provide the number of parameters estimated in the model?
Could we calculate this for the unstandardized model as n(n+1)/2 minus the degrees of freedom, where n is the total number of dependent and independent variables (which Mplus does list toward the top of its default output)?
We are using MLR estimation; I'm not positive whether it's important for me to mention that or not.
I'm running a moderated mediation measured variable path analysis using Mplus version 7 and maximum likelihood. I've come across some degrees of freedom issues that I just can't figure out.
In my model, the mediator interacts with the moderator to predict the outcome and I have four predictors. I want to free the covariances between the moderator/interaction term and the mediator's residual variance.
If I run a model constraining these two covariances to zero, I have df=2 as expected. As such, I would expect that freeing two covariances would result in a just-identified model with df=0.
However, Mplus reports that this model has df=8! Furthermore, the output lists means and variance estimates for the moderator and interaction term.
The best explanation I can come up with is that Mplus is including means as the "knowns," which would account for the 8 degrees of freedom given that I have 8 measured variables. Can you explain why this is happening rather than estimating a just-identified model?
If all covariances between exogenous measured variables are manually freed, the df=0, but the means and variances of the exogenous variables are still estimated.
I want to make sure that I completely understand what's going on so I don't make an error when analyzing this data, so I appreciate any thoughts you have or help you can provide.
Regression models are estimated conditioned on the observed exogenous covariates. Their means, variances, and covariances are not model parameters and no distributional assumptions are made about them. If you bring the covariates into the model by mentioning either their means, variances, or covariances, the variables are treated as dependent variables in that distributional assumptions are made about them. Their means, variances, and covariances are estimated in this situation. Degrees of freedom are not affected because both the H1 and H0 model are affected in the same way.
Thank you very much for your response. I understand the first part clearly and that explains why the means/variances are estimated.
I'm not so clear on the last part, however. If I'm understanding you correctly, specifying the covariance between two exogenous and one endogenous variable brings the means/variances/covariances of the exogenous variables into the model. If degrees of freedom are not affected, though, why do my degrees of freedom bump up to 8 instead of reducing to 0 when freeing the only two fixed paths?
The degrees of freedom are not affected by non-normality.
Coda Derrig posted on Sunday, October 20, 2013 - 3:25 pm
Hi Dr. Muthen,
I am a doctoral student using Mplus for my dissertation. I am new to SEM overall. I am looking at the measurement model (which is hierarchical) and structural model for a mediational model (f1-->intern -->BD). This may be elementary, but I am wondering why my measurement and structural models have identical fit indices, including chi-square and df?
Here is my syntax :
family by piqd piqi parcel3; peer by parcel1 parcel2; media by infot presst parcel4; f1 by family peer media; intern by genp1 genp2 genp3; bd by bew bea bsqt;
The structural part of the model is just-identified. It has zero degrees of freedom.
Chiho Song posted on Tuesday, March 10, 2015 - 1:32 pm
Dear Dr. Muthen,
I have a question about my measurement model. Is it necessary to have at least four indicators for my latent construct? Two indicators under one latent factor gave me a non-positive definite matrix. Thank you.
I have run a growth curve model with 3 time points using this syntax: MODEL: intercept linear | Y3_IN_EX@0 Y5_IN_EX@ Y9_IN_EX@6; [Y3_INT-Y9_INT@0]; [intercept linear]; To improve model fit I freed the time scores at year 5, which gives: RMSEA C.I. 0.043 0.129; Probability RMSEA <= .05: 0.085; CFI 0.962; TLI 0.886; Chi-Square Test of Model Fit for the Baseline Model: Value 260.243, Degrees of Freedom 3, P-Value 0.0000; SRMR Value 0.025. I want to improve the fit but I am limited because I have zero degrees of freedom. Freeing a different time point gives similar results. In your Topic 3 video you mention that adding residual covariances can improve model fit and that they can be held equal in order to get around the degrees of freedom problem. When I add intercept WITH linear@0; I get the error message: WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE… The model fit is also much worse. Is it possible to find out what I might be doing wrong and how I might improve model fit? I would also like to add covariates to the model, so any advice on this would also be appreciated. Many thanks.
You say you want to improve fit but have zero degrees of freedom. This statement does not make sense - if you have zero df, then RMSEA = 0, CFI = 1 etc. In other words, fit cannot be improved; in fact, it cannot be assessed. A 3-time point model is limited in what you can modify - this is why we recommend at least 4. With 3 you quickly get to zero df, but the problem is that there are several different ways to get to zero df and the model estimates are different.