Multilevel CFA
Message/Author
 Antje Schmitt posted on Monday, December 01, 2008 - 2:31 am
Hello,
I am attempting to conduct a multilevel CFA with Mplus. Repeated measures data from daily surveys are nested within individuals. So, I've got a two-level model with a series of repeated measures on the within-individual level and individual differences on the between-individual level. Now, I'd like to conduct a CFA with level-1 variables that were measured 18 times. I guess the basic syntax for my analysis might be Example 9.6 in the handbook but I have a very simple question concerning this syntax:

TITLE: this is an example of a two-level CFA with
continuous factor indicators and
covariates
DATA: FILE IS ex9.6.dat;
VARIABLE: NAMES ARE y1-y4 x1 x2 w clus;
WITHIN = x1 x2;
BETWEEN = w;
CLUSTER = clus;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
%WITHIN%
fw BY y1-y4;
fw ON x1 x2;
%BETWEEN%
fb BY y1-y4;
y1-y4@0;
fb ON w;

I am not sure what variable I should use for "w" in my data. Actually, I am interested in testing variables y1-y11 (measured on level 1) and would like to figure out whether a one-factor or a two-factor model is appropriate. So, I do not understand what variables in our model would parallel "w" in this example.
 Bengt O. Muthen posted on Monday, December 01, 2008 - 6:48 am
It doesn't sound like you have a w or an x in your example. And given that you can treat longitudinal data as single-level data, you may also not have a need for two-level modeling. It is not clear how many items you measure per time point. If only 1 item, then the number of items is the number of time points in your single-level factor analysis. If a set of items, then you have a factor model at each time point, like the top part of UG ex 6.14.
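For readers with the same question who do want the two-level version: a two-level CFA without covariates keeps only the BY statements of Example 9.6 and drops x1, x2 and w entirely. The following is a minimal sketch under stated assumptions, not the original poster's setup: the file name and the cluster variable name id are hypothetical.

```mplus
TITLE: two-level CFA without covariates (sketch)
DATA: FILE IS daily.dat;          ! hypothetical file name
VARIABLE: NAMES ARE y1-y11 id;    ! id = person identifier (assumed)
  USEVARIABLES ARE y1-y11;
  CLUSTER = id;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  fw BY y1-y11;                   ! one-factor model on within
  %BETWEEN%
  fb BY y1-y11;
  y1-y11@0;                       ! as in Example 9.6
```

A two-factor alternative would replace `fw BY y1-y11;` with, for example, `fw1 BY y1-y6; fw2 BY y7-y11;` (a hypothetical split of the items), and the fit of the two runs could then be compared.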
 Andrea Vocino posted on Friday, February 12, 2010 - 9:02 pm
I am asking advice on how I should model a multilevel CFA. I have a model where subjects observe variables under specific conditions hence I have repeated measures (conditions are within subjects). The dataset is structured as follows:

Subj  VarX  Condition
111   ...   11
...   ...   12
...   ...   13
...   ...   ...
222   ...   21
...   ...   22
...   ...   23
...   ...   ...
333   ...   11
...   ...   12
...   ...   13
...   ...   ...

The total number of conditions is 16 and each subject observes only 8 of them (random block). What is the most appropriate example I should follow in the manual? I am avoiding using complex sample (w/ sandwich estimation) techniques because I want to capture the variance given by subjects as well as the variance explained by the different conditions.

 Linda K. Muthen posted on Saturday, February 13, 2010 - 9:51 am
With data in the wide format, multivariate modeling takes care of the fact that several variables are measured for each person. The 8 conditions not measured should be represented as missing data. There is no need for multilevel modeling.
 Andrea Vocino posted on Saturday, February 13, 2010 - 3:57 pm
I don't understand exactly what you mean. If you mean to model all the variables in the CFA after having transposed the dataset such as

Subj VarXCond11 VarXCond12 . . . .

then my model would have an incredible number of variables and won't converge as I will have more parameter estimates than observations. Instead I was thinking about using a two level CFA where I would cluster the subject but I still need another level to look after the condition. Is there any possibility to run a kind of random effects CFA model?
 Linda K. Muthen posted on Sunday, February 14, 2010 - 10:06 am
The parameter reductions you get by using the long format impose measurement invariance, something you cannot then test. You should consider this. You can impose those same restrictions in the wide format.
 Andrea Vocino posted on Sunday, February 14, 2010 - 11:27 am
It's an impossible model to run (in the wide format) as I would have 16 conditions x 32 variables and I would end up w/ 512 variables having ~220 observations. I think the CFA would have >1,500 parameter estimates.

What kind of restrictions were you suggesting? Loadings, error variances and factor variance to be equal across the items/variables and to be freely estimated across the conditions (being orthogonal?)
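The size of the unrestricted wide-format model can be checked with a rough count. This is a sketch that assumes one factor per condition, measured by all 32 variables with the first loading fixed at 1; the thread does not state the factor structure, so the figures are illustrative only.

```latex
\begin{aligned}
\text{per condition:}\quad & \underbrace{31}_{\text{loadings}} + \underbrace{32}_{\text{intercepts}} + \underbrace{32}_{\text{residual variances}} + \underbrace{1}_{\text{factor variance}} = 96 \\
\text{16 conditions:}\quad & 16 \times 96 = 1536 \\
\text{factor covariances:}\quad & \binom{16}{2} = 120 \\
\text{total:}\quad & 1536 + 120 = 1656
\end{aligned}
```

Under these assumptions, holding loadings and intercepts equal across the 16 conditions would remove 15 x (31 + 32) = 945 measurement parameters, while typically freeing 15 factor means. That is the kind of restriction referred to above.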
 Bengt O. Muthen posted on Monday, February 15, 2010 - 12:55 pm
It sounds like you have 32 variables per condition and 8 conditions per person. If so, do the 32 variables measure a single factor for all conditions? If so, can it be assumed that the factor indicators (the 32 variables) have measurement invariance (at least loadings, perhaps also intercepts) across all 16 conditions?
 John Barile posted on Monday, March 01, 2010 - 4:32 pm
Hello Mplus Team:

I am running the following multilevel mediational model:
observed within and between level covariates --> Multilevel CFA (leading to their respective parts) --> 2 individual measured outcome variables (with random intercepts; fixed slopes).
I have grand-mean centered the within-level covariates and I have manifest group averages of the same covariates at the between level.
As I understand it, by default a multilevel CFA is essentially constructed by having group-mean-centered within-level indicators, while the between-level indicators are latent group means that contain both within- and between-group variance. Please correct me if I am wrong here.
By having the within-level covariates grand-mean centered, am I getting contextual effects at the between level without putting any constraints on the multilevel CFA indicators?
If I do need to put constraints on the MLCFA, what would be the best way to go about doing this?

Ideally, I would like to get only within level variation on the within level and between level variation on the between level. Thank you for any help.
 Bengt O. Muthen posted on Tuesday, March 02, 2010 - 1:07 pm
Multilevel FA formulates a model for the population SigmaW and SigmaB, corresponding to within and between variation decomposition. In terms of the analysis, the factor indicators are not centered or averaged. The decomposition is in line with 1-way random effects anova:

y_{ij} = \eta_j + \epsilon_{ij}

where \eta_j has between variance and \epsilon_{ij} within variance.

In line with Raudenbush & Bryk (2002), page 140, you get contextual effects on the between level when you grand-mean center, as opposed to group-mean center, the covariate.
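Written out for a vector of indicators, the decomposition and the two factor structures look as follows. The symbols \Lambda, \Psi and \Theta for loadings, factor covariances and residual covariances are standard notation assumed here, not taken from the post. The last two lines restate the centering point for a single outcome and covariate: with group-mean centering the coefficient on the cluster mean is \beta_B, while with grand-mean centering it is the contextual effect \beta_C = \beta_B - \beta_W.

```latex
\begin{aligned}
\mathbf{y}_{ij} &= \boldsymbol{\eta}_j + \boldsymbol{\epsilon}_{ij},
  \qquad \operatorname{Cov}(\mathbf{y}_{ij}) = \Sigma_B + \Sigma_W,\\
\Sigma_W &= \Lambda_W \Psi_W \Lambda_W^{\top} + \Theta_W,
  \qquad \Sigma_B = \Lambda_B \Psi_B \Lambda_B^{\top} + \Theta_B,\\
\text{group-mean centered:}\quad
  y_{ij} &= \beta_0 + \beta_W (x_{ij} - \bar{x}_j) + \beta_B\, \bar{x}_j + u_j + e_{ij},\\
\text{grand-mean centered:}\quad
  y_{ij} &= \gamma_0 + \beta_W (x_{ij} - \bar{x}) + \beta_C\, \bar{x}_j + u_j + e_{ij},
  \qquad \beta_C = \beta_B - \beta_W .
\end{aligned}
```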
 Sandra N. posted on Wednesday, September 21, 2011 - 8:16 am
Hi,
I conducted a multilevel CFA with 6 latent factors, each specified by 4 indicators on the within and between levels (12 latent variables and 24 manifest variables in total). I used data from 620 students nested in 45 classes. This model converged fine and the model fit was satisfactory (RMSEA 0.035, CFI/TLI 0.928, 0.917, SRMR within 0.039, SRMR between 0.125). However, I obtained the following error:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.125D-15. PROBLEM INVOLVING PARAMETER 46.
THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS.
I tried to reduce the number of parameters but it seems I cannot reduce them below 45, so I am not sure how to deal with that error. Is the model trustworthy (loadings, s.e.)?
Furthermore, I obtained 3 negative residual variances (-0.007 to -0.028) on the between level. As those values are small I fixed them to zero. Could you give me some advice on how to deal with those Heywood cases in reporting the results, as I am currently working on a publication of these analyses? Is the model trustworthy despite the Heywood cases?
Sandra
 Linda K. Muthen posted on Wednesday, September 21, 2011 - 9:18 am
It is common to have small residual variances on the between level and common to fix these to zero. See the following paper which is available on the website:

Muthén, B. & Asparouhov, T. (2011). Beyond multilevel regression modeling: Multilevel analysis in a general latent variable framework. In J. Hox & J.K. Roberts (eds), Handbook of Advanced Multilevel Analysis, pp. 15-40. New York: Taylor and Francis.

Regarding the other problem, it is not known what the effect of having more parameters than clusters has on model results. This has not been studied. You would need to do a simulation to see this.
 sarah posted on Tuesday, August 28, 2012 - 12:20 pm
Hi
I am doing something similar to this topic's first message posted by Antje. I have 20 individuals measured every month for 30 years (30 x 12 = 360 observations). I have 3 items and want to do a CFA. I don't think I can do this as a single-level analysis because of the many observations, so I think I need a multilevel model. I looked at ex. 6.14 and also 9.15 and 9.16, but it's unclear.
My data is set up in a long format with each line containing one individual for each time period (with 3items, time, cluster ID). In 9.16 WITHIN = time a3;
BETWEEN = x1 x2; What would be my MODEL syntax for within and between?
 Linda K. Muthen posted on Wednesday, August 29, 2012 - 9:40 am
There will be a new method in Version 7 that can handle your situation.
 sarah posted on Wednesday, September 05, 2012 - 12:20 pm
Hi I need to do what I wrote in the previous post ASAP for my dissertation. When is Version 7 coming out? Is it going to be in September or October? Is it going to allow me to conduct multi level analysis and still test for measurement invariance across time? This will definitely solve my problem.
 Bengt O. Muthen posted on Wednesday, September 05, 2012 - 12:54 pm
While waiting, you may want to study the handouts from the Version 7 course last week in Utrecht posted at

http://www.statmodel.com/v7workshops.shtml
 Linda K. Muthen posted on Wednesday, September 05, 2012 - 1:31 pm
Late summer or early fall.
 Ok-young, JI posted on Friday, April 15, 2016 - 4:47 am
Hi Drs. Muthen,

I am currently analyzing a two-level CFA.
It is a one-factor model.

My code is:

title: this is mCFA for daily ego depletion, job engagement, CWB
data: file is mCFA.txt ;
variable: names = x1-x3 y1-y3 z1-z4 id ;
usevariables = x1-x3 y1-y3 z1-z4 id ;
cluster = id ;

analysis: type=twolevel ;
estimator = MLR ;
model:
%within%
fw BY x1-x3 y1-y3 z1-z4 ;
%between%
fb BY x1-x3 y1-y3 z1-z4 ;
x1-x3@0 ; y1-y3@0 ; z1-z4@0 ;

output: tech1 tech8 ;

Then the output said:

THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL
AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE
COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

What does this mean?

 Linda K. Muthen posted on Friday, April 15, 2016 - 6:55 am
 Corinna Ziegler posted on Sunday, March 05, 2017 - 7:51 am
Dear Dr. Muthens,
I am trying to analyze a two-level CFA with two second-order factors with the same lambdas on both levels:

analysis: type = twolevel;
model:
%within%
kh_w by ...

dk_w by ...

cc_w by kh_w@1
dk_w (cc2);

%between%
kh_b by ...
dk_b by ...

cc_b by kh_b@1
dk_b (cc2);

***

I get the following:
MAXIMUM LOG-LIKELIHOOD VALUE FOR THE UNRESTRICTED (H1) MODEL IS -118332.266

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED
FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE
DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES
BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION
NUMBER IS -0.135D-12.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 43.

(Parameter 43 is Psi of cc_w with cc_w).

Can you help me?
Kind regards.
 Bengt O. Muthen posted on Sunday, March 05, 2017 - 2:52 pm
You can't identify a second-order factor with only 2 first-order factors as indicators; you need at least 3.
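The counting behind this answer can be written out. A sketch, assuming the usual setup where one second-order loading is fixed at 1 for scaling; f_1, f_2 (first-order factors), g (second-order factor) and \zeta (first-order residuals) are illustrative symbols, not names from the post.

```latex
\begin{aligned}
& f_1 = g + \zeta_1, \qquad f_2 = \lambda_2\, g + \zeta_2 \\
\text{information:}\quad & \operatorname{Var}(f_1),\ \operatorname{Var}(f_2),\ \operatorname{Cov}(f_1, f_2) && (3)\\
\text{parameters:}\quad & \lambda_2,\ \operatorname{Var}(g),\ \operatorname{Var}(\zeta_1),\ \operatorname{Var}(\zeta_2) && (4)
\end{aligned}
```

Holding \lambda_2 equal across the two levels, as the label (cc2) does, gives 7 parameters against 6 pieces of information, so the second-order part is still short by one. With three first-order factors the same count is 6 against 6. With only two first-order factors, simply letting them covary (a WITH statement) uses all the available information.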
 Joao Garcez posted on Wednesday, September 27, 2017 - 12:29 pm
Dear Dr Muthens,

I want to explore the model fit of latent variables prior to testing a structural model and I want to account for the clustering in the data (45 clusters). I saw that in the manual this is how a multilevel CFA is defined:

%within%
YW BY X1-X5;

%between%
YB BY X1-X5;
X1-X5@0;

However, I am not interested in modelling the measurement model in the between level, I just want to account for the clustering. Would it be wrong to just have it in the within level, thus leaving the between empty and allowing mplus to only calculate the variances in the between level like below?

%within%
YW BY X1-X5;

%between%

Thank you.
 Bengt O. Muthen posted on Wednesday, September 27, 2017 - 3:29 pm
You want to allow the X1-X5 random intercepts to correlate on between. A 1-factor model is a good way to accomplish this.
 Joao Garcez posted on Thursday, September 28, 2017 - 12:20 am
Dear Drs Muthen & Muthen,

Thank you for your quick answer and availability. I really appreciate it.

Have a nice day.
 Thula_Chelvan posted on Friday, January 19, 2018 - 12:14 pm
I'm attempting to run a CFA of a scale (2 factors of ~10 indicators each). I have 4 repeated measures of this scale however.

Running a multiple indicator CFA in wide-form is resulting in more parameters estimated than observations. Can I just run the CFA at the first time point without considering the repeated measures? What are my other options?
 Bengt O. Muthen posted on Friday, January 19, 2018 - 1:21 pm
To reduce the number of parameters, you can assume measurement invariance across the 4 time points (but it's a strong assumption). Under this assumption, you can do the analysis in a 2-level, long format with time as level 1 and subject as level 2.

Or, you can do 2 timepoints at a time in wide format.
 Thula_Chelvan posted on Sunday, January 21, 2018 - 8:43 am
Thank you for your response, Dr. Muthen.

Would the 2-level approach look something like this, where all the x items are in long form, indexed by time?

USEVARIABLES ARE id x1-x9;
MISSING ARE ALL (999);

CLUSTER = id;

ANALYSIS:
TYPE = TWOLEVEL;
ESTIMATOR = MLR;

MODEL:

%WITHIN%

fwithin1 by x1@1
x2 (1)
x3 (2)
x4 (3);

fwithin2 by x5@1
x6 (4)
x7 (5)
x8 (6)
x9 (7);

%BETWEEN%
fbetween1 by x1@1
x2 (1)
x3 (2)
x4 (3);

fbetween2 by x5@1
x6 (4)
x7 (5)
x8 (6)
x9 (7);
 Bengt O. Muthen posted on Sunday, January 21, 2018 - 5:26 pm
That's one way of doing it. Assuming invariance of loadings across levels as you do, you may instead want to hold all loadings equal, instead setting the metric by fwithin1@1, fwithin2@1, and let the fbetween factors have free variances to be estimated.
 Thula_Chelvan posted on Monday, January 22, 2018 - 8:07 am
Thank you! Do you mean something like this? From what I understand, the code below allows you to estimate the factor variation at the between level (and not the within level, because we have assumed invariance across time).

ANALYSIS:
TYPE = TWOLEVEL;
ESTIMATOR = MLR;

MODEL:

%WITHIN%

fwithin1 by
x1 (4)
x2 (1)
x3 (2)
x4 (3);

fwithin1@1;

fwithin2 by
x5 (8)
x6 (4)
x7 (5)
x8 (6)
x9 (7);

fwithin2@1;

%BETWEEN%
fbetween1 by
x1 (4)
x2 (1)
x3 (2)
x4 (3);

fbetween2 by
x5 (8)
x6 (4)
x7 (5)
x8 (6)
x9 (7);
 Bengt O. Muthen posted on Monday, January 22, 2018 - 10:40 am
Right.
 Bengt O. Muthen posted on Monday, January 22, 2018 - 10:41 am
You can then see how much smaller/larger the between factor variances are relative to the fixed unit within factor variances.
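One caution for readers reusing the syntax above: Mplus fixes the first loading of each BY statement at 1 by default, so the first indicator needs an asterisk to be freed, and in the posted code the label (4) appears on both x1 and x6, which would also tie those two loadings to each other. A hedged sketch of the same idea with the first loadings freed and a distinct label for every loading (the label names lam1-lam9 are hypothetical):

```mplus
MODEL:
  %WITHIN%
  fwithin1 BY x1* (lam1)
              x2 (lam2)
              x3 (lam3)
              x4 (lam4);
  fwithin1@1;                  ! within factor variance sets the metric
  fwithin2 BY x5* (lam5)
              x6 (lam6)
              x7 (lam7)
              x8 (lam8)
              x9 (lam9);
  fwithin2@1;
  %BETWEEN%
  fbetween1 BY x1* (lam1)      ! same labels = same loadings across levels
               x2 (lam2)
               x3 (lam3)
               x4 (lam4);
  fbetween2 BY x5* (lam5)
               x6 (lam6)
               x7 (lam7)
               x8 (lam8)
               x9 (lam9);
  ! fbetween1 and fbetween2 variances are left free
```

With equal loadings across levels, the estimated between factor variances can be compared directly with the within variances fixed at 1.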
 Christina Bader posted on Wednesday, September 05, 2018 - 2:45 am
Dear Dr. Muthen,

I'm very new to using Mplus and struggling a lot with the multilevel CFA. I used a diary study to collect my data so that I now have repeated measures (5 consecutive days). That's why I think I would have to analyze my CFA with a two-level model with Level 1 being the subject and Level 2 being the day. Am I correct to think that then the cluster should be the variable "day"? My code looks as follows:

Analysis: Type = twolevel;
estimator=mlf ;
ALGORITHM=EM;

Model: %Within%
Perm_home_w by PE01_01 PE01_02R
PE01_03 PE01_04R PE01_05R PE01_06 PE01_07 PE01_08 ;

Perm_work_w by PW01_01R PW01_02R PW01_03 PW01_04 PW01_05R PW01_06R PW01_07 PW01_08;

Perm_home_w with Perm_work_w;

%Between%

Perm_home_b by PE01_01 PE01_02R
PE01_03 PE01_04R PE01_05R PE01_06 PE01_07 PE01_08;

Perm_work_b by PW01_01R PW01_02R PW01_03 PW01_04 PW01_05R PW01_06R PW01_07 PW01_08;

Perm_home_b with Perm_work_b;

But I receive this error message:

THE VARIANCE OF PE01_02R APPROACHES 0. FIX THIS VARIANCE AND THE
CORRESPONDING COVARIANCES TO 0, DECREASE THE MINIMUM VARIANCE, OR
SPECIFY THE VARIABLE AS A WITHIN VARIABLE.
[....]

I tried to run the analysis without the Item PE01_02R but it makes no difference. I still have an error message then, "complaining" about other Items.

 Bengt O. Muthen posted on Thursday, September 06, 2018 - 3:00 pm
Yes, cluster=day.

Try fixing all between-level residual variances at zero. They are typically not very big.

Try using MLR instead of MLF.
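Applied to the posted model, the two suggestions could look like the sketch below. It shows only the lines that change and assumes the variable names from the post; the %WITHIN% part stays as it was.

```mplus
ANALYSIS: TYPE = TWOLEVEL;
  ESTIMATOR = MLR;               ! instead of MLF
MODEL:
  %WITHIN%
  ! within-level statements as in the post
  %BETWEEN%
  Perm_home_b BY PE01_01 PE01_02R PE01_03 PE01_04R
                 PE01_05R PE01_06 PE01_07 PE01_08;
  Perm_work_b BY PW01_01R PW01_02R PW01_03 PW01_04
                 PW01_05R PW01_06R PW01_07 PW01_08;
  Perm_home_b WITH Perm_work_b;
  ! between-level residual variances fixed at zero
  PE01_01@0 PE01_02R@0 PE01_03@0 PE01_04R@0;
  PE01_05R@0 PE01_06@0 PE01_07@0 PE01_08@0;
  PW01_01R@0 PW01_02R@0 PW01_03@0 PW01_04@0;
  PW01_05R@0 PW01_06R@0 PW01_07@0 PW01_08@0;
```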
 shuang posted on Sunday, November 18, 2018 - 11:53 pm
Dear Dr. Muthen,

I did a multilevel CFA. However, SRMR at the between level is very high. Could you advise what the problem is and how to fix it, please? Thank you.

SRMR (Standardized Root Mean Square Residual)
Value for Within 0.058
Value for Between 0.424

The code I used is as follows.

ANALYSIS: TYPE = TWOLEVEL;
H1ITERATIONS = 5000;
MODEL:
%WITHIN%
TPerfW BY TPERF1 TPERF2 TPERF3 TPERF4 TPERF5 TPERF6;
CPerfW BY CPERF1 CPERF2 CPERF3 CPERF4 CPERF5 CPERF6 CPERF7;
SCDW BY SCD1 SCD2 SCD3 SCD4 SCD5 SCD6;
EEW BY EE1 EE2 EE3 EE4;
EEW WITH SCDW TPerfW CPerfW;
SCDW WITH TPerfW CPerfW;
TPerfW WITH CPerfW;

%BETWEEN%
TPerfB BY TPERF1 TPERF2 TPERF3 TPERF4 TPERF5 TPERF6;
CPerfB BY CPERF1 CPERF2 CPERF3 CPERF4 CPERF5 CPERF6 CPERF7;
TPERF1 TPERF2 TPERF3 TPERF4 TPERF5 TPERF6 @ 0;
SCD1 SCD2 SCD3 SCD4 SCD5 SCD6 @ 0;

OUTPUT: SAMPSTAT RESIDUAL TECH1 STANDARDIZED;
 Bengt O. Muthen posted on Monday, November 19, 2018 - 4:16 pm
SRMR on between can be high when the number of clusters is not large while at the same time the chi-square for the model is still good. You may want to consult SEMNET for analysis strategies.
 Bharath Shashanka Katkam posted on Wednesday, May 01, 2019 - 6:03 am
Hello Mplus Team,

In Example 9.6, "TWO-LEVEL CFA WITH CONTINUOUS FACTOR INDICATORS AND COVARIATES", there are two covariates (x1, x2) at the within level.
But at the between level, how are the two covariates transformed into a single covariate (w)?
 Bengt O. Muthen posted on Wednesday, May 01, 2019 - 4:32 pm
w is a different variable than x1 and x2. The Within- and Between-levels can have different variables. See our Short Course Topic 7 video and handout on our web site.
 Bharath Shashanka Katkam posted on Wednesday, May 01, 2019 - 11:29 pm
Okay Sir...
 Bharath Shashanka Katkam posted on Saturday, May 25, 2019 - 7:39 am
Hello Mplus Team,

Assumptions of Structural Equation Modeling (SEM) given by David Kaplan, state that, the Independent Variable values & the Error terms should not be correlated.
But, in my research, the Independent Variable values & the Error terms are correlated.
So, in that case, can I use SEM?
 Bengt O. Muthen posted on Sunday, May 26, 2019 - 5:15 pm
Standard regression makes this assumption too. How do you know it is violated?
 Bharath Shashanka Katkam posted on Monday, May 27, 2019 - 4:59 am
The literature in my research area says that the independent variables and the error terms of the variables in my study are correlated.
So, in this case, what is the best parameter estimation technique other than maximum likelihood?
Could that be Bayesian parameter estimation?
 Bharath Shashanka Katkam posted on Friday, September 27, 2019 - 7:08 am
Hello Mplus Team,

I have a few questions about the results section of a multilevel CFA. The following are the results from my Mplus CFA output:

              Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
Within Level
FW BY
Y1               1.000    0.000     999.000     999.000
Y2               0.983    0.019      50.528       0.000
Y3               0.984    0.024      40.825       0.000
Y4               0.999    0.021      46.872       0.000

Between Level
FB BY
Y1               1.000    0.000     999.000     999.000
Y2               1.131    0.154       7.357       0.000
Y3               1.320    0.288       4.580       0.000
Y4               1.086    0.126       8.602       0.000

How do I interpret the Between level factor loadings, when the values are more than 1.00?
 Bengt O. Muthen posted on Friday, September 27, 2019 - 11:32 am
That's perfectly ok because the DV on Between does not necessarily have variance 1. Loadings less than 1 are relevant only if both the factor variance and the DV (indicator) variance are 1. If you don't like the look of it, just change the loading fixed to 1 to the largest loading and re-run.
 Bharath Shashanka Katkam posted on Saturday, September 28, 2019 - 12:11 am
Thank you Sir. Do you mean that we have to standardize the loadings?
 Bengt O. Muthen posted on Saturday, September 28, 2019 - 12:34 pm
No. Loadings are like other regression coefficients - they can be greater than 1.

Perhaps you are thinking of EFA where loadings refer to factors and factor indicators with variance 1 which typically results in loadings less than 1.
 Bharath Shashanka Katkam posted on Sunday, September 29, 2019 - 3:39 am
Thank you Sir. Could we get the Standardized factor loadings (correlation values) in the CFA, similar to the EFA?
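For reference, standardized estimates are requested through the OUTPUT command. A minimal sketch: STDYX standardizes with respect to the variances of both the factor and the indicator, so for an indicator that loads on a single factor the STDYX loading can be read as a factor-indicator correlation. In a two-level model the standardized results are reported separately for the within and between levels.

```mplus
OUTPUT: STDYX;      ! or STANDARDIZED to request all standardizations
```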
 Shahid Khan posted on Tuesday, July 07, 2020 - 6:14 pm
Dear Muthen
Mplus by default provides ICCs value in multilevel model's output under the section title "Estimated Intraclass Correlations for the Y Variables". Can you please tell me whether these ICCs are ICC1 or ICC2 values?
Also can you please provide me with a syntax calculating ICC1 and ICC2 for the predictor variable?

Regards SK
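As general background for this question: an intraclass correlation defined as between variance over total variance corresponds to what the organizational literature calls ICC(1), and the intraclass correlations printed by Mplus are generally understood to be of this kind. ICC(2), the reliability of the cluster means, follows from the same two variances and the cluster size. A hedged sketch for a single predictor, where the variable name x, the cluster variable clus, the labels vw and vb, and the average cluster size of 10 are all assumptions:

```mplus
VARIABLE: NAMES ARE x clus;      ! hypothetical names
  USEVARIABLES ARE x;
  CLUSTER = clus;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  x (vw);                        ! within-cluster variance
  %BETWEEN%
  x (vb);                        ! between-cluster variance
MODEL CONSTRAINT:
  NEW(icc1 icc2);
  icc1 = vb/(vb + vw);
  icc2 = vb/(vb + vw/10);        ! 10 = assumed average cluster size
```

The second expression is the Spearman-Brown form k*ICC1 / (1 + (k-1)*ICC1) rewritten in terms of the two variances, with k = 10 here.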
 Bengt O. Muthen posted on Wednesday, July 08, 2020 - 11:51 am