Hello, I have 2 questions regarding correlating variables in multilevel models.
1) In addition to having 2 latent variables at level 1, I have a measured predictor variable (no error). I was wondering why, at the second level, this measured variable (which becomes latent at level 2) is not correlated with the other latent exogenous variables by default. I tried correlating it through syntax, but that made the model fit considerably worse.
2) If I want a variable at level 1 to correlate only with the error terms of the other variables at that level, how would that be stated in Mplus syntax? In the manual, I could only figure out how to correlate the variables themselves, not a variable with the error terms of other variables.
Please send your questions, along with the output you are referring to and your license number, to firstname.lastname@example.org. I am not sure exactly what you are asking. Please be specific in your questions, referring to the parameters in question by name.
I am using multilevel growth curve modeling to examine whether any of my covariates explain variance in the growth factors at the within and between levels. Why would the model fit (chi-square value) change when I explicitly specify (with the WITH command) the correlations/covariances among the predictors?
Means, variances, and covariances of observed exogenous variables (covariates) are not part of a regression model. When you mention these variables using WITH statements, they are treated as dependent variables in the model. Distributional assumptions are made about them, and their means, variances, and covariances are estimated.
Thank you for your answer. I am not sure I really understand, though. Does it mean that if I don't use WITH statements, the exogenous variables are considered to be orthogonal to each other?
What also confuses me is that when I run a simple path model (not a multilevel model), the chi-square value stays the same whether or not I include WITH statements to specify correlations among the exogenous variables.
No, it does not mean that the covariances are zero. You can think about it as though the covariances are fixed at the sample values.
When TYPE=GENERAL is used with continuous outcomes, it just so happens that whether observed exogenous variables are treated as independent variables in the model or dependent variables in the model, the results are the same. This is not the case in other situations like multilevel modeling.
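To make the distinction concrete, here is a minimal sketch of a two-level input in which the covariates stay exogenous. The names clus, y, x1, and x2 are hypothetical, and treating x1 and x2 as within-only variables is an assumption made only for illustration.

```
VARIABLE:
  CLUSTER = clus;        ! hypothetical cluster ID
  WITHIN = x1 x2;        ! assumed: x1 and x2 vary only within clusters
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y ON x1 x2;            ! x1 and x2 are exogenous covariates here
  ! x1 WITH x2;          ! uncommenting this line brings x1 and x2 into
  !                      ! the model as dependent variables
  %BETWEEN%
  y;                     ! between-level variance of y
```

With the WITH line commented out, the model is estimated conditional on x1 and x2. With it active, their means, variances, and covariance become model parameters, so the fit statistics need not match those of the first run.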
So, I have not used WITH statements to specify covariances among the exogenous variables in my multilevel model. However, if I want to present the covariances and correlations of these variables, should I use the estimates from the output (using the SAMPSTAT option)? And how do I get the significance values for those?
For correlations among exogenous factors, you can look at the STD solutions. A general approach is to express the correlations in terms of the model parameters by giving those parameters labels that are then referred to in MODEL CONSTRAINT. For guidance, see User's Guide example 5.20.
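As a rough sketch of that labeling approach, assuming two exogenous factors f1 and f2 measured by hypothetical indicators y1-y6 (the labels v1, v2, c12 and the new parameter r12 are also made up for illustration):

```
MODEL:
  f1 BY y1-y3;
  f2 BY y4-y6;
  f1 (v1);               ! label the variance of f1
  f2 (v2);               ! label the variance of f2
  f1 WITH f2 (c12);      ! label the covariance between f1 and f2
MODEL CONSTRAINT:
  NEW(r12);              ! new parameter: the correlation
  r12 = c12 / SQRT(v1*v2);
```

The new parameter r12 is reported with its own standard error and p-value under the new/additional parameters in the output. For endogenous factors the labeled variances are residual variances, so the expression for the correlation would need to be built from the full set of model parameters instead.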
Anabel posted on Monday, February 07, 2011 - 3:48 am
Thanks a lot for your response.
But I think I have to follow up on my question. For my SEM models, I need the significance of the correlations among the endogenous and exogenous factors and among the manifest variables, respectively.
I don't really understand how to calculate those through MODEL CONSTRAINT. Could you please give an example?
With an SEM, why don't you focus on the structural parameter estimates (perhaps in standardized form) that the model specifies rather than the factor correlations? If you want the factor correlations, why not formulate a CFA model? Those estimates and their significance should be close to the SEM if the model fits well.
I would like to know why, when I estimate correlations among many variables simultaneously (x with y z b), I get different estimates than when I estimate the correlation between only two variables (x with y). In both instances, I should be estimating bivariate correlations (and there are no missing data).
Please send your output to support so we know what your situation is.
Eric Deemer posted on Sunday, September 29, 2013 - 11:18 am
I'm fitting a multilevel mediation model with just 3 variables--X, M, and Y. M has variation on both levels. I want to estimate the between-level correlations but I know that Mplus doesn't provide SEs with correlation output. I was thinking of separately regressing Y on M and M on X since these regression coefficients would be the same as correlation coefficients. Would this be true in the ML framework?
Eric Deemer posted on Sunday, September 29, 2013 - 12:35 pm
We are examining the validity of therapist scores from a new measure; many therapists rate more than one client. ICCs for the therapist scores range from .21 to .34, and are lower for other variables. We wish to model correlations at the client (within) level to examine construct validity.
We used the following (tid = therapist ID):
Cluster = tid;
Analysis: Type = twolevel;
Model: %within% X with Y;
Output: standardized;
We could also use type=complex to examine the same issue. I have run a few correlations with both approaches, and the results differ more than I thought they would (e.g., r=.25 vs. .32). Why would the correlations differ so much? With a very simple analysis such as this, what should we consider in choosing one approach over the other?
I have 5-day data from both partners of couples. I've ordered the data with couple being one case (so five rows for five days of a couple) and variables for each spouse in one row.
I have two questions: 1) Because of the way I structured the data, I think I reduced the three levels (couples, spouses, days) to two levels (couples, days). All variables are measured at the daily level, though. What is the best analysis strategy for these data? Should the type be twolevel or complex? I've tested both (estimator = MLR) and the results do not differ substantially. For now, I've chosen to go with Type = Twolevel, estimator = MLR; I modeled the regression paths on level 1 and let Mplus estimate only the variance of the dependent variable at level 2. Is this okay, or would Type = Complex make more sense since I do not model anything on level 2?
2) What is the best way to get means, standard errors and correlations for the descriptives table? The reviewers want correlations for level 1 and level 2. If I estimate an empty model (only variance of DV at level 1 and level 2) the correlations are quite different from the estimates I get when I estimate the regression model. I do have some missing data.
Hopefully you can help me with these questions. Thanks very much in advance,
1) It sounds like you have the data ready for a two-level analysis in long format where days represent level-1 (describing what varies across time) and couples represent level-2 (what varies across couple). A variable measured each day can have components of variation on both levels.
2) Report the twolevel sample statistics which you can obtain also using Type=Twolevel Basic.
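A minimal sketch of such a run, assuming a long-format file with one row per couple-day. The file name diary.dat, the variable names, and the missing-data code -999 are all hypothetical.

```
TITLE:    Two-level sample statistics (sketch);
DATA:     FILE = diary.dat;          ! hypothetical file name
VARIABLE: NAMES = couple day x m y;  ! hypothetical variable names
          USEVARIABLES = x m y;
          CLUSTER = couple;
          MISSING = ALL (-999);      ! assumed missing-data code
ANALYSIS: TYPE = TWOLEVEL BASIC;
```

No MODEL command is needed for a BASIC run. The output contains the estimated within-level and between-level sample statistics, which can be used for a descriptives table.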
Thanks for your quick reply. To follow up on each answer: 1) Is it wrong to use type = complex for the long format? And if I do use a twolevel model but my hypotheses concern only level 1 effects, is it enough to only estimate variance of the dependent variable on level 2?
2) If I use Type = Twolevel Basic, what model do I specify at levels 1 and 2? Thanks!
1) Type=complex does not model the level-2 relationships so it depends on what your model looks like. For instance, if it is a growth model over the 5 time points that is your primary interest, that is a level-1 focus and Type=complex is ok. But if you have a model with relationships between couples you need level 2.
Thanks, issue two is now solved. The other issue is a little more difficult: 1) I don't have a growth model but an actor partner model. More specifically, I want to know how wife's job demands and husband's job demands affect how much support they each give to each other at home, and their rating of family quality. It is a mediation model (job demands -> support given -> family quality) and as I have those three variables for husbands and wives I model actor as well as partner effects.
I measured all variables on five consecutive days. There are very strong level 2 correlations and somewhat weaker but similar correlations at level 1. I wonder if I should model this as a level 1 model, but not center at the group mean (which I usually would do to examine day level effects) so that the between level effects are still there. I don't think I can model this at level 2 because I only have 26 couples. I'd like to make use of the fact that I measured each variable 5 times even if that means that I can only test couple level effects that are measured reliably. Any thoughts on what the best way to model this would be?
You could just do the following (level 1 is time, level 2 is subject):
%Within%
y on x m;
m on x;
x;
%Between%
y on x m;
m on x;
where x, m, and y correspond to your 3 variables for one spouse (you have to extend to both spouses according to your actor partner model). That means that a latent variable decomposition is done of the 3 variables into within and between components. The variables are correlated across time due to their random intercepts and means on the between level.
26 couples is a little low for such twolevel analysis, but it can be tried.
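One possible way to extend the single-spouse statements to both spouses is sketched below. The names xh, mh, yh (husband) and xw, mw, yw (wife) are hypothetical, and the particular actor and partner paths are an assumption about the intended actor partner model, not a recommendation.

```
MODEL:
  %WITHIN%
  mh ON xh xw;           ! actor (xh) and partner (xw) effects on husband's m
  mw ON xw xh;           ! actor (xw) and partner (xh) effects on wife's m
  yh ON mh mw xh xw;
  yw ON mw mh xw xh;
  mh WITH mw;            ! residual covariance between spouses' m
  yh WITH yw;            ! residual covariance between spouses' y
  xh xw;                 ! mirrors the "x;" statement in the one-spouse version
  %BETWEEN%
  mh ON xh xw;
  mw ON xw xh;
  yh ON mh mw xh xw;
  yw ON mw mh xw xh;
  mh WITH mw;
  yh WITH yw;
```

The number of parameters grows quickly once both spouses are included, which matters when the number of clusters is small.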
I've modeled it as you suggested, but the model is too complex (I actually have 2 x variables per spouse). Mplus warns that there are more parameters than clusters and suggests reducing the number of parameters.
One solution I'm considering is to keep the model on the within level and control for the correlations between the spouses' two support variables (m) and between the spouses' family quality ratings (y). This works better, but Mplus still suggests reducing the number of parameters.
What if I go with Type = complex? This model fits well and does not give any warnings. Or is it not possible to use type=complex for this kind of data?
Great, that will work well. Thank you so much for taking the time to read and answer all these questions. One final question: if I use type=complex, do I interpret the relationships as within or between? In other words, would I say for a significant effect: on days on which men had high job demands (as compared to days on which they had low job demands), they provided less support at home. Or would I say: men who had high job demands (as compared to men with low job demands) gave less support to their wives at home?
With Type=Complex you are not dividing variables into within and between (the assumption is that this is not important) so you are looking at the relationships between the total observed variables. So the latter interpretation holds.
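For reference, a minimal sketch of what a Type=Complex setup might look like for a three-variable mediation model. The names couple, x, m, and y are hypothetical.

```
VARIABLE: CLUSTER = couple;          ! hypothetical cluster ID
ANALYSIS: TYPE = COMPLEX;
          ESTIMATOR = MLR;
MODEL:    m ON x;                    ! no %WITHIN% / %BETWEEN% sections:
          y ON m x;                  ! one model for the total variables
```

Here the clustering enters only through the standard errors and the chi-square test, which are adjusted for the non-independence of observations within a cluster.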
101 question: I was wondering if I use the right method to get the correlations and their significance at both levels for a study with 2 DV's and x IV's. Could you maybe say if I got it right/wrong? Thanks in advance.
USEVARS = DV1 DV2 IV1 IV2 IV3 IV4; WITHIN = IV1 IV2 etc; BETWEEN = IV3 IV4 etc;
%WITHIN% DV1 WITH DV2 IV1 IV2; DV2 WITH IV1 IV2; IV1 WITH IV2; etc.
%BETWEEN% DV1 WITH DV2 IV3 IV4; DV2 WITH IV3 IV4; IV3 WITH IV4; etc.
Are there any theoretical or methodological reasons to include or exclude the exogenous variables as dependent variables in the model (using the WITH statement)? Usually, I specify the means of the predictors to make sure that all the data are being used (and Mplus then automatically estimates the correlations). For the results, however, it seems to make quite a difference whether or not the correlations between predictors are estimated.
Therefore, I would like to know on what grounds one should decide to include these correlations.
Generally speaking, I think bringing x's into the model is often done too casually. You are better off generally if you don't have to bring the x's into the model because when you do, you are adding assumptions to your modeling. But with much missing data on the x's, you may not have a choice. In that case, there are several considerations, particularly when some x's are binary. It's a long story. We discuss the issues in our RMA book chapters 9 and 10 where we show simulation studies indicating that certain ways of bringing x's into the model are better than other ways.