EFA IN A CFA FRAMEWORK: Identificatio...
 Irene Rebollo posted on Wednesday, June 29, 2005 - 2:17 pm
Dear Dr Muthen,
I am trying to do EFA in a CFA framework. I am new to Mplus.
In order to identify the model I want to use a constraint suggested by D. Hessen and C. Dolan in a recent paper (David J. Hessen, Conor V. Dolan, and Jelte M. Wicherts (2004). Multi-group exploratory factor analysis and the power to detect uniform bias. Applied Psychological Research, in press):

To achieve identification, the matrix Lambda(T)*Theta~*Lambda can be constrained to be diagonal (LISREL notation: (T) for transpose, ~ for inverse). With this method it is not necessary to use anchor constraints.

I have managed to use this constraint successfully in Mx and LISREL. In Mx it is straightforward with matrix algebra, and in LISREL I used non-linear constraints and additional parameters. But due to my inexperience I cannot work out how to do it in Mplus.
Could you please give me a hint?

Thanks in advance,
Irene Rebollo
Biological Psychology Department
Vrije Universiteit
Amsterdam
 Linda K. Muthen posted on Saturday, July 02, 2005 - 5:40 pm
Have you tried setting up the constraints used in LISREL using MODEL CONSTRAINT in Mplus? You should be able to do the same thing as you did in LISREL this way.
 Irene Rebollo posted on Monday, July 04, 2005 - 8:01 am
Thank you for your answer, Linda.

Yes, I just managed today. Easier than in LISREL. This is the Mplus equivalent of what I have done in LISREL:

*************************************************
TITLE: Try for Identification Constraint: EFA in CFA Conor's simulation
DATA:
FILE IS const2.cor;
TYPE IS FULLCORR;
NOBSERVATIONS=1000;
VARIABLE:
NAMES ARE V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12;



MODEL CONSTRAINT:
!Dummy parameters D1-D6 contain the elements of the matrix Lambda'*Theta~*Lambda
!D1-D3 are the diagonal elements: should be free
!D4-D6 are the off diagonal elements that should be constrained to be zero

D1=((L11**2)/T1+(L21**2)/T2+(L31**2)/T3+(L41**2)/T4+(L51**2)/T5+
(L61**2)/T6+(L71**2)/T7+(L81**2)/T8+(L91**2)/T9+(L101**2)/T10+
(L111**2)/T11+(L121**2)/T12);

D2=((L12**2)/T1+(L22**2)/T2+(L32**2)/T3+(L42**2)/T4+(L52**2)/T5+
(L62**2)/T6+(L72**2)/T7+(L82**2)/T8+(L92**2)/T9+(L102**2)/T10+
(L112**2)/T11+(L122**2)/T12);

D3=((L13**2)/T1+(L23**2)/T2+(L33**2)/T3+(L43**2)/T4+(L53**2)/T5+
(L63**2)/T6+(L73**2)/T7+(L83**2)/T8+(L93**2)/T9+(L103**2)/T10+
(L113**2)/T11+(L123**2)/T12);

D4=((L11*L12)/T1+(L21*L22)/T2+(L31*L32)/T3+(L41*L42)/T4+(L51*L52)/T5
+(L61*L62)/T6+(L71*L72)/T7+(L81*L82)/T8+(L91*L92)/T9+(L101*L102)/T10
+(L111*L112)/T11+(L121*L122)/T12)=0;

D5=((L11*L13)/T1+(L21*L23)/T2+(L31*L33)/T3+(L41*L43)/T4+(L51*L53)/T5
+(L61*L63)/T6+(L71*L73)/T7+(L81*L83)/T8+(L91*L93)/T9+(L101*L103)/T10
+(L111*L113)/T11+(L121*L123)/T12)=0;

D6=((L12*L13)/T1+(L22*L23)/T2+(L32*L33)/T3+(L42*L43)/T4+(L52*L53)/T5
+(L62*L63)/T6+(L72*L73)/T7+(L82*L83)/T8+(L92*L93)/T9+(L102*L103)/T10
+(L112*L113)/T11+(L122*L123)/T12)=0;


MODEL:

F1 BY V1*(L11)
V2 (L21)
V3 (L31)
V4 (L41)
V5 (L51)
V6 (L61)
V7 (L71)
V8 (L81)
V9 (L91)
V10 (L101)
V11 (L111)
V12 (L121);
F2 BY V1*(L12)
V2 (L22)
V3 (L32)
V4 (L42)
V5 (L52)
V6 (L62)
V7 (L72)
V8 (L82)
V9 (L92)
V10 (L102)
V11 (L112)
V12 (L122);
F3 BY V1*(L13)
V2 (L23)
V3 (L33)
V4 (L43)
V5 (L53)
V6 (L63)
V7 (L73)
V8 (L83)
V9 (L93)
V10 (L103)
V11 (L113)
V12 (L123);
V1 (T1);
V2 (T2);
V3 (T3);
V4 (T4);
V5 (T5);
V6 (T6);
V7 (T7);
V8 (T8);
V9 (T9);
V10 (T10);
V11 (T11);
V12 (T12);

F1-F3@1;
F1 F2 F3 PWITH F2@0 F3@0 F1@0;

DUM1 BY V1@0;
DUM2 BY V1@0;
DUM3 BY V1@0;
DUM4 BY V1@0;
DUM5 BY V1@0;
DUM6 BY V1@0;

DUM1 (D1);
DUM2 (D2);
DUM3 (D3);
DUM4 (D4);
DUM5 (D5);
DUM6 (D6);

DUM1 WITH DUM2-DUM6@0;
DUM2 WITH DUM3-DUM6@0;
DUM3 WITH DUM4-DUM6@0;
DUM4 WITH DUM5-DUM6@0;
DUM5 WITH DUM6@0;
DUM1-DUM6 WITH F1-F3@0;



OUTPUT: TECH1

*******************************************
But it's not working. It still behaves as if there were no constraint, because the model is not identified. I get the same loadings for all three factors:
F1 BY
V1 0.238
V2 0.215
V3 0.228
V4 0.215
V5 0.337
V6 0.349
V7 0.322
V8 0.304
V9 0.324
V10 0.271
V11 0.292
V12 0.310

Variances
F1 1.000
F2 1.000
F3 1.000
DUM1 1.381
DUM2 1.381
DUM3 1.381
DUM4 0.000
DUM5 0.000
DUM6 0.000

This is what I get with Lisrel and Mx:

MATRIX Lambda

-0.1594 -0.4969 -0.4776
-0.1408 -0.4387 -0.4218
-0.1506 -0.4702 -0.4513
-0.1408 -0.4387 -0.4218
-0.2161 0.3305 -0.5866
-0.2263 0.3459 -0.6133
-0.2041 0.3134 -0.5548
-0.1912 0.2923 -0.5180
0.4151 0.0138 -0.5720
0.3370 0.0115 -0.4635
0.3674 0.0120 -0.5053
0.3936 0.0132 -0.5411

[=L'*(T*T)~*L]

1.5144E+00 5.6877E-09 -2.3397E-07
5.6877E-09 2.3275E+00 -7.8191E-09
-2.3397E-07 -7.8191E-09 5.8483E+00
********************************************
In LISREL it was necessary to write the constraint before the model. That's why I had to redefine the whole model with additional parameters in PHI.
But in Mplus the order doesn't make a difference.

Is the =0 at the end of the constraint allowed? I am afraid that this might be the problem.
With LISREL I just added a second constraint saying, e.g., D4=0, but in Mplus this is not allowed.

thank you again!
Irene
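
For reference, the matrix Lambda'*Theta~*Lambda can be recomputed from the LISREL/Mx loadings printed above. The sketch below is in Python rather than Mplus. It assumes (this is not stated in the post) that the analysis was run on a correlation matrix, so that each residual variance equals 1 minus the communality. The names LAMBDA, residual_variances and constraint_matrix are illustrative only.

```python
# Loadings copied from the LISREL/Mx output quoted above (12 variables x 3 factors).
LAMBDA = [
    [-0.1594, -0.4969, -0.4776],
    [-0.1408, -0.4387, -0.4218],
    [-0.1506, -0.4702, -0.4513],
    [-0.1408, -0.4387, -0.4218],
    [-0.2161, 0.3305, -0.5866],
    [-0.2263, 0.3459, -0.6133],
    [-0.2041, 0.3134, -0.5548],
    [-0.1912, 0.2923, -0.5180],
    [0.4151, 0.0138, -0.5720],
    [0.3370, 0.0115, -0.4635],
    [0.3674, 0.0120, -0.5053],
    [0.3936, 0.0132, -0.5411],
]


def residual_variances(lam):
    """Assumption: standardized variables, so theta_i = 1 - sum_j lambda_ij**2."""
    return [1.0 - sum(l * l for l in row) for row in lam]


def constraint_matrix(lam, theta):
    """Return Lambda' * Theta^-1 * Lambda for a diagonal Theta given as a list."""
    m = len(lam[0])
    return [
        [sum(row[j] * row[k] / t for row, t in zip(lam, theta)) for k in range(m)]
        for j in range(m)
    ]


THETA = residual_variances(LAMBDA)
D = constraint_matrix(LAMBDA, THETA)
for row in D:
    print("  ".join(f"{x:9.4f}" for x in row))
```

Under that assumption the diagonal comes out close to the reported 1.5144, 2.3275 and 5.8483, and the off-diagonal elements are zero up to the rounding of the printed loadings.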
 bmuthen posted on Monday, July 04, 2005 - 11:09 am
Our expert on this new Model Constraint feature will be back this Thursday, but let me try to give you some feedback in the meantime. Also, Model Constraint will be further extended in the future to make all of this easier. I see two things to modify.

I think your model has 3 factors so it seems like in addition to Psi = I, there are 3 constraints that you want to impose on your matrix product - namely the zero values for D4, D5, and D6. I therefore don't see that the D1-D3 statements are needed.

As you surmised, Mplus cannot currently have the "=0" part added at the end of a constraint statement, but instead you can simply rewrite this restriction. For example, for

y = a + b = 0;

you can instead say

b = -a;

So for example with D6, you could solve for T12 and have it on the LHS.
 Irene Rebollo posted on Tuesday, July 05, 2005 - 4:34 am
I think it's great if you extend the Model Constraint feature.

The problem with solving for T12 is that I then cannot use T12 on the RHS in the other two. So making 3 constraints by solving for three values of T becomes recursive.

I'll keep trying other possibilities, and wait for your expert's suggestion.

Thanx!
 bmuthen posted on Tuesday, July 05, 2005 - 5:23 pm
I see your point about T12 being on the RHS in the D5 and D4 expressions, so that would mean having to solve for T12 in those equations in terms of the T12 = equation, etc, which is messy.
 Irene Rebollo posted on Wednesday, July 06, 2005 - 5:03 am
This is the last idea I came up with:

*************************************************
TITLE: Try for Identification Constraint: EFA in CFA Conor's simulation
DATA:
FILE IS const.cor;
TYPE IS FULLCOV;
NOBSERVATIONS=1000;
VARIABLE:
NAMES ARE V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 DUM1 DUM2 DUM3;

ANALYSIS:
ITERATIONS= 3000
STARTS= 20 2
MODEL:

F1 BY V1*.3 (L11)
V2*.3 (L21)
V3*.3 (L31)
V4*.3 (L41)
V5*.3 (L51)
V6*.3 (L61)
V7*.3 (L71)
V8*.3 (L81)
V9*.3 (L91)
V10*.3 (L101)
V11*.3 (L111)
V12*.3 (L121);
F2 BY V1*.3(L12)
V2*.3 (L22)
V3*.3 (L32)
V4*.3 (L42)
V5*.3 (L52)
V6*.3 (L62)
V7*.3 (L72)
V8*.3 (L82)
V9*.3 (L92)
V10*.3 (L102)
V11*.3 (L112)
V12*.3 (L122);
F3 BY V1*.3(L13)
V2*.3 (L23)
V3*.3 (L33)
V4*.3 (L43)
V5*.3 (L53)
V6*.3 (L63)
V7*.3 (L73)
V8*.3 (L83)
V9*.3 (L93)
V10*.3 (L103)
V11*.3 (L113)
V12*.3 (L123);
V1*1 (T1);
V2*1 (T2);
V3*1 (T3);
V4*1 (T4);
V5*1 (T5);
V6*1 (T6);
V7*1 (T7);
V8*1 (T8);
V9*1 (T9);
V10*1 (T10);
V11*1 (T11);
V12*1 (T12);

F1-F3@1;
F1 F2 F3 PWITH F2@0 F3@0 F1@0;

! DUM1-DUM3 are dummy variables included in the data file that have 0
! correlations with all the other variables and 0.00001 variance. This
! way D1, D2 and D3 (constraints below) are forced to be zero.

DUM1 (D1);
DUM2 (D2);
DUM3 (D3);


MODEL CONSTRAINT:
!*******************************************************************************
!Dummy parameters contain the elements of the matrix Lambda'*Theta~*Lambda
!D1-D3 are the off diagonal elements that should be constrained to be zero

D1=L11*L12/T1+L21*L22/T2+L31*L32/T3+L41*L42/T4
+L51*L52/T5+L61*L62/T6+L71*L72/T7+L81*L82/T8
+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12;

D2=L11*L13/T1+L21*L23/T2+L31*L33/T3+L41*L43/T4
+L51*L53/T5+L61*L63/T6+L71*L73/T7+L81*L83/T8
+L91*L93/T9+L101*L103/T10+L111*L113/T11+L121*L123/T12;

D3=L12*L13/T1+L22*L23/T2+L32*L33/T3+L42*L43/T4
+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8
+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12;



OUTPUT: TECH1
*************************************************
I think the constraint might be working now. But the model still doesn't converge. I am not sure why:
*************************************************

NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.


MODEL RESULTS

Estimates

F1 BY
V1 -0.002
V2 -0.002
V3 -0.001
V4 -0.002
V5 0.001
V6 0.001
V7 0.000
V8 0.000
V9 0.001
V10 0.000
V11 0.000
V12 0.000

F2 BY
V1 -0.002
V2 -0.002
V3 -0.001
V4 -0.002
V5 0.001
V6 0.001
V7 0.000
V8 0.000
V9 0.001
V10 0.000
V11 0.000
V12 0.000

F3 BY
V1 -0.002
V2 -0.002
V3 -0.001
V4 -0.002
V5 0.001
V6 0.001
V7 0.000
V8 0.000
V9 0.001
V10 0.000
V11 0.000
V12 0.000

F1 WITH
F2 0.000

F2 WITH
F3 0.000

F3 WITH
F1 0.000

Variances
DUM1 0.000
DUM2 0.000
DUM3 0.000
F1 1.000
F2 1.000
F3 1.000

Residual Variances
V1 1.154
V2 1.135
V3 1.143
V4 1.135
V5 1.107
V6 1.121
V7 1.094
V8 1.085
V9 1.135
V10 1.094
V11 1.102
V12 1.116
****************************************
Will keep trying...
 Tihomir Asparouhov posted on Thursday, July 07, 2005 - 11:18 am
The above code will not work since the dummy parameters are not zero. You have to solve the 3 equations for 3 parameters and have these 3 parameters be dependent while all other parameters are independent. This in principle is not hard but the result is very messy because your equations are complicated. Here are the steps you can go through:

1. Multiply eq 1 by L13/L12 and subtract it from eq 2 (thus you will have L11 only in eq 2)

2. Solve eq 1 for L11

3. Multiply eq 3 by L21/L22 and subtract it from eq 2

4. Solve eq 3 for L23

5. Solve eq 2 for L33

6. Substitute the solution for L33 back in eq 2 so that L33 doesn't appear on RHS of eq 2

0=L11*L12/T1+L21*L22/T2+L31*L32/T3+L41*L42/T4
+L51*L52/T5+L61*L62/T6+L71*L72/T7+L81*L82/T8
+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12;

0=L11*L13/T1+L21*L23/T2+L31*L33/T3+L41*L43/T4
+L51*L53/T5+L61*L63/T6+L71*L73/T7+L81*L83/T8
+L91*L93/T9+L101*L103/T10+L111*L113/T11+L121*L123/T12;

0=L12*L13/T1+L22*L23/T2+L32*L33/T3+L42*L43/T4
+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8
+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12;
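
To illustrate what "solve the 3 equations for 3 parameters" amounts to, here is a minimal sketch in Python (not Mplus syntax). Note that it uses a different and simpler choice of dependent parameters than the steps listed above: L11 is obtained from the first equation, and L13 and L23 are then obtained from the second and third equations, which are linear in those two loadings. The function names and all numeric values are made up for illustration.

```python
def off_diagonals(lam, theta):
    """Off-diagonal elements (1,2), (1,3), (2,3) of Lambda' * Theta^-1 * Lambda."""
    def elem(j, k):
        return sum(row[j] * row[k] / t for row, t in zip(lam, theta))
    return elem(0, 1), elem(0, 2), elem(1, 2)


def impose_constraints(lam, theta):
    """Overwrite L11, L13 and L23 so that the three off-diagonals are zero.

    Illustrative choice of dependent parameters, not the one used in the thread.
    """
    lam = [list(row) for row in lam]
    t1, t2 = theta[0], theta[1]
    if lam[0][1] == 0:
        raise ValueError("L12 must be nonzero to solve equation 1 for L11")
    # Equation 1 is linear in L11: L11*L12/T1 + sum_{i>=2} Li1*Li2/Ti = 0.
    rest12 = sum(r[0] * r[1] / t for r, t in zip(lam[1:], theta[1:]))
    lam[0][0] = -t1 * rest12 / lam[0][1]
    # Equations 2 and 3 are linear in (L13, L23):
    #   a*L13 + b*L23 = -s2   and   c*L13 + d*L23 = -s3
    s2 = sum(r[0] * r[2] / t for r, t in zip(lam[2:], theta[2:]))
    s3 = sum(r[1] * r[2] / t for r, t in zip(lam[2:], theta[2:]))
    a, b = lam[0][0] / t1, lam[1][0] / t2
    c, d = lam[0][1] / t1, lam[1][1] / t2
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("2x2 system is singular for these values")
    # Cramer's rule for the 2x2 system.
    lam[0][2] = (-s2 * d + b * s3) / det
    lam[1][2] = (-a * s3 + c * s2) / det
    return lam


# Made-up values: 5 variables, 3 factors.
EXAMPLE_LAMBDA = [
    [0.5, 0.4, 0.3],
    [0.6, -0.3, 0.2],
    [0.4, 0.5, -0.4],
    [0.7, 0.2, 0.5],
    [0.3, -0.6, 0.1],
]
EXAMPLE_THETA = [0.5, 0.6, 0.4, 0.3, 0.7]

constrained = impose_constraints(EXAMPLE_LAMBDA, EXAMPLE_THETA)
print(off_diagonals(EXAMPLE_LAMBDA, EXAMPLE_THETA))  # before: generally nonzero
print(off_diagonals(constrained, EXAMPLE_THETA))     # after: zero up to rounding
```

In an actual run the three dependent loadings would be recomputed from the free parameters at every iteration, which is what expressing them on the LHS of a constraint is meant to achieve.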
 Irene Rebollo posted on Friday, July 08, 2005 - 5:41 am
Thank you for the solution! It really looks like this is the way to go. But...

I solved the equations as you suggested:
***********************************************
L11 = -T1/L12*(L21*L22/T2+L31*L32/T3+L41*L42/T4+L51*L52/T5+L61*L62/T6+L71*L72/T7
+L81*L82/T8+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12);

L33=-L22*T3/(L22*L31-L21*L32)*((L41*L43/T4+L51*L53/T5+L61*L63/T6+L71*L73/T7
+L81*L83/T8+L91*L93/T9+L101*L103/T10+L111*L113/T11+L121*L123/T12)
-(L13/L12*(L21*L22/T2+L31*L32/T3+L41*L42/T4+L51*L52/T5+L61*L62/T6+L71*L72/T7
+L81*L82/T8+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12))-
(L21/L22*(L12*L13/T1+L42*L43/T4+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8
+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12)));

L23=-T2/L22*(L12*L13/T1+L32*(-L22*T3/(L22*L31-L21*L32)*((L41*L43/T4+L51*L53/T5
+L61*L63/T6+L71*L73/T7+L81*L83/T8+L91*L93/T9+L101*L103/T10+L111*L113/T11
+L121*L123/T12)-(L13/L12*(L21*L22/T2+L31*L32/T3+L41*L42/T4+L51*L52/T5
+L61*L62/T6+L71*L72/T7+L81*L82/T8+L91*L92/T9+L101*L102/T10+L111*L112/T11
+L121*L122/T12))-(L21/L22*(L12*L13/T1+L42*L43/T4+L52*L53/T5+L62*L63/T6
+L72*L73/T7+L82*L83/T8+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12))))
/T3+L42*L43/T4+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8+L92*L93/T9
+L102*L103/T10+L112*L113/T11+L122*L123/T12);
*************************************************
Apparently Equation 3 becomes too large and I get this warning:


*** WARNING in Model Constraint command
Statement exceeds line limit. Break your statement into several parts.
L23=-T2/L22*(L12*L13/T1+L32*(-L22*T3/(L22*L31-L21*L32)*((L41*L43/T4+L51*L53/T5

Is there any way to solve this problem? It is not about the length of each line, which is already adjusted to 80 characters, but about the whole statement.

Thank you again for the effort that you are making to solve it!

Irene
 Linda K. Muthen posted on Friday, July 08, 2005 - 8:40 am
I don't think there is any way to solve this. The maximum length is 510 in Version 3.12. We will increase this in the next update. In the meantime, if you send your input, data, and license number to support@statmodel.com, we can run it for you with our developmental version and send you the output. We plan to have the update available by the end of the month or shortly thereafter.
 Irene Rebollo posted on Wednesday, July 13, 2005 - 7:38 am
Hi again,
I changed the names of the parameters to make them shorter, which solved the length problem.

But it still does not work. I have solved the equations many times for different parameters and checked for mistakes.

I think that the solution proposed might not be entirely good, and might alter the constraint. Because:

EQ1=0
EQ2=0
EQ1-EQ2=0
----> EQ1=EQ2
We do something like this:
EQ1=0
EQ2=0
EQ1*a=0
EQ1*a-EQ2=0
----> EQ1=EQ2/a

Anyway, it is possible to solve it progressively by just solving for three parameters and replacing back in the other equations, without multiplying by anything, but then the equations become larger than 510.

I will just wait for the update with the improvements in the MODEL CONSTRAINT. Hope it comes soon.

Thank you!
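
On the concern about multiplying an equation before subtracting: as long as EQ1 itself is kept, replacing EQ2 by EQ1*a-EQ2 leaves the set of solutions unchanged, because EQ1=0 then forces EQ2=0 again. The Python sketch below checks this on two made-up equations. The functions eq1 and eq2 are purely illustrative and are not the constraints from this thread.

```python
def eq1(x, y):
    return x * y - 2.0       # made-up constraint EQ1


def eq2(x, y):
    return x + y - 3.0       # made-up constraint EQ2


def original_system(x, y):
    return eq1(x, y), eq2(x, y)


def transformed_system(x, y, a):
    # Keep EQ1, replace EQ2 by a*EQ1 - EQ2 (the kind of step used in an elimination).
    return eq1(x, y), a * eq1(x, y) - eq2(x, y)


def is_solution(residuals, tol=1e-12):
    return all(abs(r) < tol for r in residuals)


# Two solutions of the original system and two points that are not solutions.
points = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (0.0, 3.0)]
agreement = []
for x, y in points:
    before = is_solution(original_system(x, y))
    after = is_solution(transformed_system(x, y, a=0.7))
    agreement.append(before == after)
    print((x, y), before, after)
```

Every point is classified the same way by both systems. The constraint would only be altered if EQ1 were dropped after forming the combination.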
 bmuthen posted on Wednesday, July 13, 2005 - 9:19 am
Can you please email me your latest input, output and data so we can try it here?
 David Bard posted on Tuesday, March 06, 2007 - 8:00 pm
I'm trying to fit the CFA version of an EFA described in the Jöreskog (1979) addendum to his original (1969) specification. With 6 variables, I'm having difficulty fitting a 3- or even a 2-factor EFA solution. Below is the 3-factor syntax. In both models, estimation does not reach convergence. Do you notice anything wrong with the syntax? The variables are highly skewed; in fact, I'd really like to fit this as a Tobit SEM using the Mplus censored-normal option, but for now, one step at a time. Your thoughts?

Model:
F1 BY NVAB1*
NVAB2@1
!NVAB3
!NVAB4
NVAB5
NVAB6;
F2 BY NVAB1*
!NVAB2
NVAB3@1
!NVAB4
NVAB5
NVAB6;
F3 BY NVAB1*
!NVAB2
!NVAB3
NVAB4@1
NVAB5
NVAB6;
F1-F3;
F1 with F2-F3;
F2 with F3;
 Linda K. Muthen posted on Wednesday, March 07, 2007 - 7:13 am
The convergence problem may come about because the starting values for factor loadings are one and with EFA in a CFA framework many factor loadings are small. You may want to start the smaller loadings at zero. We have found convergence problems when we don't do this.
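
For reference, the free parameters in a specification like the one above can be tallied against the number of distinct variances and covariances. The Python sketch below assumes a covariance-structure model only (means are not counted), continuous indicators, and one anchor item per factor whose loading is fixed at 1 on its own factor and at 0 on the others. The name efa_in_cfa_df is illustrative.

```python
def efa_in_cfa_df(p, m):
    """Degrees of freedom for an m-factor EFA-in-CFA model on p indicators.

    Assumes m anchor items, each with all m of its loadings fixed
    (m*m restrictions), free factor variances and covariances, and
    free residual variances.
    """
    moments = p * (p + 1) // 2         # distinct variances and covariances
    free_loadings = p * m - m * m      # all loadings minus the fixed anchor rows
    factor_cov = m * (m + 1) // 2      # factor variances and covariances
    residuals = p                      # one residual variance per indicator
    return moments - (free_loadings + factor_cov + residuals)


for m in (2, 3):
    print(m, "factors on 6 variables: df =", efa_in_cfa_df(6, m))
```

With 6 indicators this tally gives 4 degrees of freedom for two factors and 0 for three, so the three-factor model shown above uses up all 21 variances and covariances.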
 Carla Zelaya posted on Friday, August 03, 2007 - 7:04 am
Hello,

I am doing an EFA in a CFA framework. I have been learning how to do this from the online classes at UCLA. I am having trouble running my model. I have a large dataset (n=2367) and I am using the following analysis:

ANALYSIS:
Estimator = ml;
TYPE = MISSING;
INTEGRATION = Montecarlo;
MITERATIONS = 1000

I also have 22 dependent variables (indicators), and 4 continuous latent variables.

This is the error message that I am receiving:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD.

THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED.CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS.ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 78 IS 0.10762752D+05.

I am not sure what to do. Should I stop using ML and go to WLSMV? I would like to stay with ML because then I can have logistic regression, and I would like chi-square statistics or other test statistics. I also want to go on to CFA with covariates. Is it possible to use WLSMV and still get chi-square statistics for CFA with covariates?

Your help would be much appreciated. I can also send my input or output if that helps.

Regards,
Carla Zelaya
PhD Candidate, Dept. of Epidemiology,
JHSPH
 Carla Zelaya posted on Friday, August 03, 2007 - 7:07 am
I forgot to mention that my dependent variables (22) are all categorical (ordered, 5-point Likert scale).

Thanks Carla
 Linda K. Muthen posted on Friday, August 03, 2007 - 8:26 am
You can try using the MITERATIONS option to increase the number of iterations as suggested by the error message. With four factors, weighted least squares may be a better estimator choice. You will obtain fit statistics with WLSMV with and without covariates.

If you continue to have problems, please send your input, data, output, and license number to support@statmodel.com.
 Kevin K. Makino posted on Thursday, May 05, 2011 - 12:02 pm
Hello hello -

I am currently a PhD candidate working on my dissertation, but am new to latent variable analyses... I have been trying to bring myself up to speed on methodology, and have a question about EFA in a CFA framework - from the on-line courses and handouts, it seems that the recommended approach is to create the "m" restrictions using one anchor item per factor. Is there a reason why this approach is superior to creating the restrictions by setting loadings at exactly what the output from the EFA specifies? This way, you wouldn't need to rely on identifying a good anchor item...

Thank you!
 Linda K. Muthen posted on Thursday, May 05, 2011 - 1:35 pm
You cannot give all of the factor loadings from the EFA. To identify the model, you must place restrictions equal to the number of factors squared. The choice of the anchor item determines the rotation.
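
The count of "number of factors squared" restrictions can be checked for the two identification schemes that appear in this thread. The Python sketch below is only a tally, not Mplus input, and the function names are illustrative.

```python
def restrictions_anchor(m):
    """Anchor-item scheme: per factor, one scale-setting restriction (a fixed
    factor variance or a fixed anchor loading) plus m - 1 zero cross-loadings
    on that factor's anchor item."""
    return m * 1 + m * (m - 1)


def restrictions_orthogonal(m):
    """Psi = I plus a diagonal Lambda' * Theta^-1 * Lambda."""
    fixed_factor_cov = m * (m + 1) // 2     # m variances at 1, m(m-1)/2 covariances at 0
    zero_off_diagonals = m * (m - 1) // 2   # off-diagonal elements set to zero
    return fixed_factor_cov + zero_off_diagonals


for m in range(1, 6):
    print(m, restrictions_anchor(m), restrictions_orthogonal(m), m * m)
```

Both schemes impose m*m restrictions for m factors, for example 9 restrictions for the 3-factor models discussed earlier in the thread.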
 Kevin K. Makino posted on Thursday, May 05, 2011 - 2:48 pm
Hi Linda - thank you for your quick response! However, what if you place restrictions equal to the number of factors squared, and define those restrictions on the basis of the actual loadings from the EFA? In the "standard" approach using anchor items, it seems to me that what we are doing is to approximate the loadings from the EFA by setting the cross-loadings for the anchor items to zero - so shouldn't it also work to structure the CFA based on the actual loadings from the EFA? (I hope my attempt at an explanation makes things clearer, and apologize if this is not the case.)

Thank you again,
Kevin
 Linda K. Muthen posted on Thursday, May 05, 2011 - 2:55 pm
I think you will find without good anchor items, you will not recover the EFA as well as with good anchor items. Try it and see.
 rosalynd denise upton posted on Thursday, March 29, 2012 - 8:01 pm
Dear Professors,

I am trying to test a unidimensional factor structure using multilevel EFA with 7 ordinal items. Yet I think multilevel EFA is not applicable with categorical data, and so I was wondering if I should use multilevel EFA in a CFA framework with Mplus?



Second, I was wondering if you would recommend that I estimate this model in Mplus by fixing the within- and between- factor variances to 1 and freely estimating the factor loadings?



Third, are there any other parameters I should constrain/freely estimate in Mplus to accurately test a unidimensional factor structure using multilevel E/CFA?
 Bengt O. Muthen posted on Friday, March 30, 2012 - 6:26 am
You can do 2-level EFA with categorical variables in Mplus. See the Topic 7 short course handout and video of 3/29/11 on our website, slides 97-106.