One feature of Mplus is that it automatically frees/constrains some parameters. While I can see the point for new Mplus users, other people might be annoyed by this behavior. In my experience, this issue is the number one complaint I read about Mplus.
I was wondering if Mplus has some kind of “Option Explicit” to make this implicit freeing of parameters stop. Once set, this option would require the user to specify each parameter individually. For example, exogenous variables would be uncorrelated unless manually specified otherwise, and different classes would be constrained to have the same means unless otherwise specified.
My impression is that such a feature would give users more flexibility in their Mplus modeling. It would also squash what I think is one of the most frequent complaints about Mplus.
Your input on this is appreciated. Program settings are always to some extent a matter of taste. Here is how we think about it.
The Mplus defaults are chosen to save people time and get model settings that are identified and fit data in many applications. The defaults are designed not only to help new users, but are indispensable for experienced users doing real-data analyses. With typical real-data models there are too many parameters to “specify each parameter individually” as you suggest. Think of common models with around 50 parameters. I would also argue that the default parameter settings are typically much closer to the model one desires than starting from a model where nothing is initially free, as you suggest. One can use TECH1 to see what one wants to change.
Even long-time analysts make mistakes in their parameterization unless guided by the defaults. Growth modeling is a case in point. Before the | specification was introduced I don’t know how many times I made errors in my specification when I moved away from the most simple cases.
In our customer support we don’t hear about the default settings as a frequent, serious complaint. True, sometimes users are surprised that they get, say, a free covariance between dependent variables, but this is clear from the output and can be modified if desired. More often, users have been alerted to a parameter that they overlooked but that is really needed in the model.
An “option explicit” would perhaps be of interest to some technical users with small-sized models. But to settle on what a neutral starting point should be for the parameters is not straightforward. To use your two examples, I would argue that most people don’t like their exogenous variables to be uncorrelated (if they are correlated in regular regression why shouldn’t they be in other models?) and equality of means across classes implies a non-identified model that would always have to be modified. To satisfy users, one would have to offer customized settings but as this discussion illustrates those settings would have to have a very wide variety of choices.
Also note that Mplus allows a zero covariances option in the ANALYSIS command, namely MODEL=NOCOVARIANCES.
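As a minimal sketch of how this option and TECH1 fit together, the input below turns off the default covariances and requests TECH1 so that the free and fixed parameters can be inspected. The file name, variable names, and the two-factor model are hypothetical, chosen only for illustration.

```text
TITLE:    Hypothetical two-factor CFA without default covariances
DATA:     FILE = example.dat;       ! hypothetical file name
VARIABLE: NAMES = y1-y6;            ! hypothetical variable names
ANALYSIS: MODEL = NOCOVARIANCES;    ! covariances are not freed by default
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
          f1 WITH f2;               ! free this covariance explicitly
OUTPUT:   TECH1;                    ! shows which parameters are free
```

Without the WITH statement, the covariance between f1 and f2 would stay fixed at zero under this setting.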
Thoughts by other users?
Paul Silvia posted on Tuesday, February 17, 2009 - 12:50 am
I can relate to the urge to specify each parameter individually, but that urge usually goes away when the model has two levels, a mix of latent and observed variables, random effects, and count, ordinal, and continuous outcomes.
For what it's worth, I wouldn't place this near the top of my "Mplus wish list." (Expanded plotting options for multilevel models, a la HLM 6's L1-L2 model graphs, would be nice.)
I agree with your assessment of the situation. I personally find Mplus defaults to be really helpful most of the time, and especially in more complex models. I would not modify that.
However, there are additional issues that might be considered.
Indeed, although the defaults are very helpful most of the time, they may also be a pain when they have to be turned off, for (from what I see) two reasons.
First, most of the input examples (at least those that are easily accessible) are built from the defaults. So, when we need to go behind the scenes, guidelines are hard to get (the easiest example is the full specification of an LCM model without using the | function). As a solution, and without modifying Mplus per se, it might be helpful to add full input examples as a complement to the reduced (defaults) ones for selected models (not all of them) in future versions of the manual.
Second, some of the defaults are painful to turn off. This is the case with the NOMEANSTRUCTURE option. NOMEANSTRUCTURE is unavailable with the default TYPE=MISSING (while it was available before). Likewise, NOMEANSTRUCTURE is no longer available with the MLR estimator (I believe it was before). Similarly, to obtain it, we have to modify another default, INFORMATION=OBSERVED (here I lost track of what was possible in earlier versions). All of that for a simple request for "NOMEANSTRUCTURE". I agree this is minor, but there are other examples in which defaults are interrelated. In a related way, why is TECH13 (with Mardia coefficients) only available with mixture models (and LISTWISE) when their most apparent use is to assess multivariate normality in basic CFA-SEM models?
Don't get me wrong, these are only fine-tuning issues. I am generally quite happy with the current defaults and with the almost bi-annual adjustment you make to the program, and I will continue to recommend it on SEMNET :)
This being said, I would not necessarily agree with Guillaume's request to create a new function on the basis of these points. The "experienced users" I know already specify what they want to specify (or use R) quite freely and are also very happy with the time they save thanks to the defaults.
I must add, in answer to Guillaume's implicit question, that everything is presently quite explicit in the various sections (and TECH) of the output. I usually work from an initial output as a guide to refine the model.
It is good to discuss this so perceptions can be clarified. Here are some comments.
You mention that you would like to see the “behind the scenes” guidelines for topics such as growth modeling when not using the | approach to specifying the model. Note that the Mplus User’s Guide Chapter 16, pp. 542-547 has an extensive list of explicit model statements showing the relationship between the BY and | language for growth models.
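To illustrate the kind of correspondence those pages describe, here is a sketch of a linear growth model for four time points written both ways. The variable names y1-y4 and the time scores 0 to 3 are assumptions made for illustration; the User's Guide remains the authoritative source for the exact equivalence.

```text
! Short form using the | language
MODEL:  i s | y1@0 y2@1 y3@2 y4@3;

! Explicit form using BY (use one MODEL command or the other)
MODEL:  i BY y1-y4@1;              ! intercept factor, loadings fixed at 1
        s BY y1@0 y2@1 y3@2 y4@3;  ! slope factor, loadings are time scores
        [y1-y4@0];                 ! outcome intercepts fixed at 0
        [i s];                     ! growth factor means free
```

Requesting TECH1 for both versions is one way to check that they produce the same pattern of free and fixed parameters.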
You also mention the NOMEANSTRUCTURE option. In this area we are trying to move the field forward a bit. The approach of not including means in a model is mostly of historic interest. Due to EFA being based on correlations and CFA expanding this to covariances, the means got left behind. This is strange from a general statistical perspective where one first focuses on means and then variances, correlations, and covariances. Note that including unstructured means does not hinder your analysis in any way, and our recommendation, and therefore the default in Mplus, is to have them in the model. As long as the model does not impose a structure on the means, the results for the other parts of the model are identical to the old-school correlation- or covariance-structure results. One example where one may want to test a covariance structure only is with multiple groups and equality of factor indicator loadings, but not the factor indicator intercepts. Such a model makes it possible to study group invariance of factor covariance matrices. Here too, one can use the default of including means, while letting the factor indicator intercepts be different across the groups so as not to impose a mean structure. Or, if the model falls within the framework of TYPE=GENERAL, MODEL=NOMEANSTRUCTURE; can be used.
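The point that unstructured means leave the rest of the model untouched can be checked by counting. As a sketch for a single-group model with p continuous observed variables and q free covariance-structure parameters (both symbols are introduced here only for illustration):

```latex
\begin{align*}
\text{covariances only:} \quad
  & df = \frac{p(p+1)}{2} - q \\
\text{with unstructured means:} \quad
  & df = \left[\frac{p(p+1)}{2} + p\right] - (q + p)
       = \frac{p(p+1)}{2} - q
\end{align*}
```

Including the means adds p sample statistics and p free intercepts, so the count of estimated parameters goes up by p while the degrees of freedom stay the same.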
TYPE=MISSING has always required that means be included in the model. MODEL=NOMEANSTRUCTURE; can be used only when means, variances, and covariances are sufficient statistics for model estimation.
TECH13 was not intended for testing normality, but to test whether a mixture model fits higher-order moments or not. Here too, we have not moved towards Mardia testing of normality because such a normality test seems outdated in the light of available non-normality robust standard errors and chi-square tests of model fit via MLR. That is, if you worry that your data are non-normal, why do you need a normality test – why not just use MLR? If those SEs are different from the SEs under ML, then your data are non-normal.
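A sketch of that comparison: run the same input twice, changing only the ESTIMATOR setting, and compare the standard errors in the two outputs. The rest of the input (DATA, VARIABLE, MODEL) is assumed to be identical across the two runs.

```text
! Run 1: maximum likelihood with conventional standard errors
ANALYSIS: ESTIMATOR = ML;

! Run 2: same model with non-normality robust standard errors
ANALYSIS: ESTIMATOR = MLR;
```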
Similar ideas of moving the field forward are behind our decision to no longer have listwise as the default and no longer forcing people to say TYPE=MISSING, which gives MAR treatment. Because MAR is the standard in statistics, this is now the default. It takes a little bit longer computationally, but is the way to go.
Thank you once again for these detailed answers (and for the time and effort you devote to this discussion board).
I perfectly agree with your point of view on these topics. I was mostly providing examples that I have encountered, mainly when answering questions, of cases where dealing with the defaults can become difficult.
For |, my example was badly chosen, and pp. 542-547 are clearly in the direction I was suggesting. What would be even nicer is to add to the current examples (only some of them, in chapters 1 to 10) a comparison of the simplified and detailed syntax.
For NOMEANSTRUCTURE, I agree with you. But it is still a good example of how complicated it can get to turn off defaults.
Finally, for the normality issue, I only partly agree. Yes, if we suspect non-normality, the MLR estimator is perfect. But Mardia coefficients are still useful to verify this suspicion. Also, the use of ML (instead of MLR) in papers remains simpler and more consistent with what people are used to. It also avoids the need to explain to readers (and especially to suspicious reviewers) what the scaling factors are for when we do not submit to methodologically oriented journals.
As for moving the field forward, I do also agree with you. For me, Mplus did and is still doing a lot and I thank you for that.
How do I have to define a model with categorical data, in which the factor loadings and the intercepts of the observed variables should be restricted in order to test configural invariance (equal pattern of factor loadings)?
I tried the following two ways, which both do not seem to work. Are there mistakes in the syntax? Do you have an idea, what is wrong?
(1) If I add "MODEL = NOMEANSTRUCTURE" - because only the covariances are needed - the following warning occurs: "MODEL=NOMEANSTRUCTURE is not allowed in conjuction with TYPE=MISSING.Request for MODEL=NOMEANSTRUCTURE will be ignored."
(2) If I try to free the intercepts and factor loadings manually, the model does not converge (standard errors of the model parameters could not be computed. The model may not be identified. Check your model.).
Hi! Is it true that the default in Mplus is for exogenous variables to be correlated, but for endogenous variables not to be correlated unless otherwise specified? I am pretty sure this is true, but want to check. Thanks, Lisa
Linda, if I use the NOMEANSTRUCTURE option, wouldn't I be estimating fewer parameters?
This was important in my current study because it is a large model, and we were receiving messages from Mplus such as "the number of estimated parameters exceeds the number of observations". We have a 2-group model. While the number of estimated parameters for our full model is perhaps not greater than the number of observations in our entire sample, it was greater than the number of observations in one of our groups, BOYS. So we are trying to alleviate this complication by: 1) estimating one portion of our model at a time (a couple of latent factors at a time), and 2) excluding means.
We thought that by using the NOMEANSTRUCTURE option, we would be lessening the number of parameters estimated. Isn't that true?