

Dear Dr Muthen, I am trying to do EFA in a CFA framework. I am new to Mplus. In order to identify the model I want to use a constraint suggested by D. Hessen and C. Dolan in a recent paper (David J. Hessen, Conor V. Dolan, and Jelte M. Wicherts (2004). Multigroup exploratory factor analysis and the power to detect uniform bias. Applied Psychological Research, In press.): to accomplish identification, the matrix Lambda(T)*Theta~*Lambda can be constrained to be diagonal (LISREL notation; (T) for transposed, ~ for inverse). With this method it is not necessary to use anchor constraints. I have managed to use this constraint successfully in Mx and LISREL. In Mx it is straightforward with matrix algebra, and in LISREL I used nonlinear constraints and additional parameters. But due to my inexperience I cannot come up with a way to do it in Mplus. Could you please give me some clue? Thanks in advance, Irene Rebollo, Biological Psychology Department, Vrije Universiteit Amsterdam
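To make the constraint in the question concrete, here is a minimal Python sketch (not Mplus syntax) that forms Lambda(T)*Theta~*Lambda for a diagonal Theta and lists its distinct off-diagonal elements, which are the quantities the constraint sets to zero. The function names and the small 3-variable, 2-factor example are hypothetical and chosen only so the arithmetic can be checked by hand.

```python
def lt_theta_inv_l(lam, theta):
    """Return Lambda' * Theta^-1 * Lambda, with Theta diagonal.

    lam   : list of rows, one row of loadings per observed variable
    theta : list of residual variances (the diagonal of Theta)
    """
    m = len(lam[0])
    return [
        [sum(row[j] * row[k] / t for row, t in zip(lam, theta)) for k in range(m)]
        for j in range(m)
    ]


def off_diagonals(matrix):
    """Distinct off-diagonal elements (upper triangle) of a symmetric matrix."""
    m = len(matrix)
    return [matrix[j][k] for j in range(m) for k in range(j + 1, m)]


# Hypothetical loadings and residual variances, for illustration only.
lam = [[1.0, 1.0], [1.0, -1.0], [2.0, 0.0]]
theta = [1.0, 1.0, 2.0]

product = lt_theta_inv_l(lam, theta)
print(product)                 # [[4.0, 0.0], [0.0, 2.0]]
print(off_diagonals(product))  # [0.0]
```

With m factors there are m(m-1)/2 distinct off-diagonal elements, so a 3-factor model such as the one discussed in this thread needs three such constraints.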


Have you tried setting up the constraints used in LISREL using MODEL CONSTRAINT in Mplus? You should be able to do the same thing as you did in LISREL this way. 


Thank you for your answer, Linda. Yes, I just managed today. Easier than in LISREL. This is the Mplus equivalent to what I have done in LISREL:
*************************************************
TITLE: Try for Identification Constraint: EFA in CFA Conor's simulation
DATA: FILE IS const2.cor;
TYPE IS FULLCORR;
NOBSERVATIONS=1000;
VARIABLE: NAMES ARE V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12;
MODEL CONSTRAINT:
!Dummy parameters D1-D6 contain the elements of the matrix Lambda'*Theta~*Lambda
!D1-D3 are the diagonal elements: should be free
!D4-D6 are the off diagonal elements that should be constrained to be zero
D1=((L11**2)/T1+(L21**2)/T2+(L31**2)/T3+(L41**2)/T4+(L51**2)/T5+
(L61**2)/T6+(L71**2)/T7+(L81**2)/T8+(L91**2)/T9+(L101**2)/T10+
(L111**2)/T11+(L121**2)/T12);
D2=((L12**2)/T1+(L22**2)/T2+(L32**2)/T3+(L42**2)/T4+(L52**2)/T5+
(L62**2)/T6+(L72**2)/T7+(L82**2)/T8+(L92**2)/T9+(L102**2)/T10+
(L112**2)/T11+(L122**2)/T12);
D3=((L13**2)/T1+(L23**2)/T2+(L33**2)/T3+(L43**2)/T4+(L53**2)/T5+
(L63**2)/T6+(L73**2)/T7+(L83**2)/T8+(L93**2)/T9+(L103**2)/T10+
(L113**2)/T11+(L123**2)/T12);
D4=((L11*L12)/T1+(L21*L22)/T2+(L31*L32)/T3+(L41*L42)/T4+(L51*L52)/T5
+(L61*L62)/T6+(L71*L72)/T7+(L81*L82)/T8+(L91*L92)/T9+(L101*L102)/T10
+(L111*L112)/T11+(L121*L122)/T12)=0;
D5=((L11*L13)/T1+(L21*L23)/T2+(L31*L33)/T3+(L41*L43)/T4+(L51*L53)/T5
+(L61*L63)/T6+(L71*L73)/T7+(L81*L83)/T8+(L91*L93)/T9+(L101*L103)/T10
+(L111*L113)/T11+(L121*L123)/T12)=0;
D6=((L12*L13)/T1+(L22*L23)/T2+(L32*L33)/T3+(L42*L43)/T4+(L52*L53)/T5
+(L62*L63)/T6+(L72*L73)/T7+(L82*L83)/T8+(L92*L93)/T9+(L102*L103)/T10
+(L112*L113)/T11+(L122*L123)/T12)=0;
MODEL:
F1 BY V1*(L11) V2 (L21) V3 (L31) V4 (L41) V5 (L51) V6 (L61)
V7 (L71) V8 (L81) V9 (L91) V10 (L101) V11 (L111) V12 (L121);
F2 BY V1*(L12) V2 (L22) V3 (L32) V4 (L42) V5 (L52) V6 (L62)
V7 (L72) V8 (L82) V9 (L92) V10 (L102) V11 (L112) V12 (L122);
F3 BY V1*(L13) V2 (L23) V3 (L33) V4 (L43) V5 (L53) V6 (L63)
V7 (L73) V8 (L83) V9 (L93) V10 (L103) V11 (L113) V12 (L123);
V1 (T1); V2 (T2); V3 (T3); V4 (T4); V5 (T5); V6 (T6);
V7 (T7); V8 (T8); V9 (T9); V10 (T10); V11 (T11); V12 (T12);
F1-F3@1;
F1 F2 F3 PWITH F2@0 F3@0 F1@0;
DUM1 BY V1@0; DUM2 BY V1@0; DUM3 BY V1@0;
DUM4 BY V1@0; DUM5 BY V1@0; DUM6 BY V1@0;
DUM1 (D1); DUM2 (D2); DUM3 (D3); DUM4 (D4); DUM5 (D5); DUM6 (D6);
DUM1 WITH DUM2-DUM6@0;
DUM2 WITH DUM3-DUM6@0;
DUM3 WITH DUM4-DUM6@0;
DUM4 WITH DUM5-DUM6@0;
DUM5 WITH DUM6@0;
DUM1-DUM6 WITH F1-F3@0;
OUTPUT: TECH1;
*************************************************
But it's not working. It still behaves as if there were no constraint, because the model is not identified. I get these same loadings for the three factors:
F1 BY
V1 0.238
V2 0.215
V3 0.228
V4 0.215
V5 0.337
V6 0.349
V7 0.322
V8 0.304
V9 0.324
V10 0.271
V11 0.292
V12 0.310
Variances
F1 1.000
F2 1.000
F3 1.000
DUM1 1.381
DUM2 1.381
DUM3 1.381
DUM4 0.000
DUM5 0.000
DUM6 0.000
This is what I get with LISREL and Mx:
MATRIX Lambda
0.1594 0.4969 0.4776
0.1408 0.4387 0.4218
0.1506 0.4702 0.4513
0.1408 0.4387 0.4218
0.2161 0.3305 0.5866
0.2263 0.3459 0.6133
0.2041 0.3134 0.5548
0.1912 0.2923 0.5180
0.4151 0.0138 0.5720
0.3370 0.0115 0.4635
0.3674 0.0120 0.5053
0.3936 0.0132 0.5411
[=L'*(T*T)~*L]
1.5144E+00 5.6877E-09 2.3397E-07
5.6877E-09 2.3275E+00 7.8191E-09
2.3397E-07 7.8191E-09 5.8483E+00
*************************************************
In LISREL it was necessary to write the constraint before the model. That's why I had to redefine the whole model with additional parameters in PHI. But in Mplus the order doesn't make a difference. Is the =0 at the end of the constraint allowed? I am afraid that this might be the problem. With LISREL I just added a second constraint saying e.g. D4=0, but in Mplus this is not allowed. Thank you again! Irene

bmuthen posted on Monday, July 04, 2005 - 11:09 am



Our expert on this new Model Constraint feature will be back this Thursday, but let me try to give you some feedback in the meantime. Also, Model Constraint will be further extended in the future to make all of this easier. I see two things to modify. I think your model has 3 factors, so it seems that in addition to Psi = I, there are 3 constraints that you want to impose on your matrix product, namely the zero values for D4, D5, and D6. I therefore don't see that the D1-D3 statements are needed. As you surmised, Mplus cannot currently have the "=0" part added at the end of a constraint statement, but instead you can simply rewrite this restriction. For example, for y = a + b = 0; you can instead say b = -a; So for example with D6, you could solve for T12 and have it on the LHS.
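The "solve for one parameter" rewrite suggested above can be sketched numerically. The following is a minimal Python illustration (not Mplus syntax) of turning a single zero constraint of the D6 type into an explicit expression for the last residual variance. The function name and the 3-variable numbers are hypothetical; the thread's model has 12 variables.

```python
def solve_last_theta(lam_a, lam_b, theta_rest):
    """Solve sum_i lam_a[i]*lam_b[i]/theta[i] = 0 for the last theta.

    lam_a, lam_b : loadings of every variable on two factors
    theta_rest   : residual variances of all variables except the last one
    """
    partial = sum(a * b / t for a, b, t in zip(lam_a, lam_b, theta_rest))
    # a + b = 0 becomes b = -a: move the partial sum to the other side.
    return -lam_a[-1] * lam_b[-1] / partial


# Hypothetical values, for illustration only.
lam_a = [0.5, 0.4, 0.3]
lam_b = [0.2, -0.5, 0.6]
theta_rest = [0.5, 0.8]

theta_last = solve_last_theta(lam_a, lam_b, theta_rest)
theta = theta_rest + [theta_last]
constraint = sum(a * b / t for a, b, t in zip(lam_a, lam_b, theta))
print(theta_last, constraint)
```

One caveat worth keeping in mind with this kind of rewrite: nothing in the algebra forces the solved residual variance to be positive, so whether the result is admissible depends on the values of the other parameters.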


I think it's great if you extend the Model Constraint feature. The problem with solving for T12 is that I then cannot use T12 on the RHS in the other two. So to make 3 constraints, by solving for three values of T, it becomes recursive. I'll keep trying other possibilities, and wait for your expert's suggestion. Thanks!

bmuthen posted on Tuesday, July 05, 2005 - 5:23 pm



I see your point about T12 being on the RHS in the D5 and D4 expressions, so that would mean having to solve for T12 in those equations in terms of the T12 = equation, etc, which is messy. 


This is the last idea I came up with:
*************************************************
TITLE: Try for Identification Constraint: EFA in CFA Conor's simulation
DATA: FILE IS const.cor;
TYPE IS FULLCOV;
NOBSERVATIONS=1000;
VARIABLE: NAMES ARE V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 DUM1 DUM2 DUM3;
ANALYSIS: ITERATIONS= 3000;
STARTS= 20 2;
MODEL:
F1 BY V1*.3 (L11) V2*.3 (L21) V3*.3 (L31) V4*.3 (L41) V5*.3 (L51) V6*.3 (L61)
V7*.3 (L71) V8*.3 (L81) V9*.3 (L91) V10*.3 (L101) V11*.3 (L111) V12*.3 (L121);
F2 BY V1*.3(L12) V2*.3 (L22) V3*.3 (L32) V4*.3 (L42) V5*.3 (L52) V6*.3 (L62)
V7*.3 (L72) V8*.3 (L82) V9*.3 (L92) V10*.3 (L102) V11*.3 (L112) V12*.3 (L122);
F3 BY V1*.3(L13) V2*.3 (L23) V3*.3 (L33) V4*.3 (L43) V5*.3 (L53) V6*.3 (L63)
V7*.3 (L73) V8*.3 (L83) V9*.3 (L93) V10*.3 (L103) V11*.3 (L113) V12*.3 (L123);
V1*1 (T1); V2*1 (T2); V3*1 (T3); V4*1 (T4); V5*1 (T5); V6*1 (T6);
V7*1 (T7); V8*1 (T8); V9*1 (T9); V10*1 (T10); V11*1 (T11); V12*1 (T12);
F1-F3@1;
F1 F2 F3 PWITH F2@0 F3@0 F1@0;
! DUM1-DUM3 are dummy variables included in the data file that have 0
! correlations with all the other variables and 0.00001 variance. This
! way D1, D2 and D3 (constraints below) are forced to be zero.
DUM1 (D1); DUM2 (D2); DUM3 (D3);
MODEL CONSTRAINT:
!*******************************************************************************
!Dummy parameters contain the elements of the matrix Lambda'*Theta~*Lambda
!D1-D3 are the off diagonal elements that should be constrained to be zero
D1=L11*L12/T1+L21*L22/T2+L31*L32/T3+L41*L42/T4
+L51*L52/T5+L61*L62/T6+L71*L72/T7+L81*L82/T8
+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12;
D2=L11*L13/T1+L21*L23/T2+L31*L33/T3+L41*L43/T4
+L51*L53/T5+L61*L63/T6+L71*L73/T7+L81*L83/T8
+L91*L93/T9+L101*L103/T10+L111*L113/T11+L121*L123/T12;
D3=L12*L13/T1+L22*L23/T2+L32*L33/T3+L42*L43/T4
+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8
+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12;
OUTPUT: TECH1;
*************************************************
I think the constraint might be working now. But the model still doesn't converge. I am not sure why:
*************************************************
NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.
MODEL RESULTS
Estimates
      F1 BY   F2 BY   F3 BY
V1    0.002   0.002   0.002
V2    0.002   0.002   0.002
V3    0.001   0.001   0.001
V4    0.002   0.002   0.002
V5    0.001   0.001   0.001
V6    0.001   0.001   0.001
V7    0.000   0.000   0.000
V8    0.000   0.000   0.000
V9    0.001   0.001   0.001
V10   0.000   0.000   0.000
V11   0.000   0.000   0.000
V12   0.000   0.000   0.000
F1 WITH F2 0.000
F2 WITH F3 0.000
F3 WITH F1 0.000
Variances
DUM1 0.000
DUM2 0.000
DUM3 0.000
F1 1.000
F2 1.000
F3 1.000
Residual Variances
V1 1.154
V2 1.135
V3 1.143
V4 1.135
V5 1.107
V6 1.121
V7 1.094
V8 1.085
V9 1.135
V10 1.094
V11 1.102
V12 1.116
*************************************************
Will keep trying...


The above code will not work since the dummy parameters are not zero. You have to solve the 3 equations for 3 parameters and have these 3 parameters be dependent while all other parameters are independent. This in principle is not hard, but the result is very messy because your equations are complicated. Here are the steps you can go through:
1. Multiply eq 1 by L13/L12 and subtract it from eq 2 (thus you will have L11 only in eq 1)
2. Solve eq 1 for L11
3. Multiply eq 3 by L21/L22 and subtract it from eq 2
4. Solve eq 3 for L23
5. Solve eq 2 for L33
6. Substitute the solution for L33 back in eq 3 so that L33 doesn't appear on the RHS of eq 3
The three equations (eq 1, eq 2 and eq 3, in that order) are:
0=L11*L12/T1+L21*L22/T2+L31*L32/T3+L41*L42/T4
+L51*L52/T5+L61*L62/T6+L71*L72/T7+L81*L82/T8
+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12;
0=L11*L13/T1+L21*L23/T2+L31*L33/T3+L41*L43/T4
+L51*L53/T5+L61*L63/T6+L71*L73/T7+L81*L83/T8
+L91*L93/T9+L101*L103/T10+L111*L113/T11+L121*L123/T12;
0=L12*L13/T1+L22*L23/T2+L32*L33/T3+L42*L43/T4
+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8
+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12;
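A minimal numeric sketch of the elimination described above, written in Python rather than Mplus so it can be checked independently. It treats L11, L23 and L33 as the dependent parameters, computes them from the remaining loadings and residual variances, and then verifies that all three off-diagonal sums vanish. The function names, the 5-variable loading matrix and the residual variances are hypothetical; only the three zero constraints come from the thread.

```python
def cross(lam, theta, j, k, rows):
    """Sum over the given rows of lam[i][j] * lam[i][k] / theta[i]."""
    return sum(lam[i][j] * lam[i][k] / theta[i] for i in rows)


def off_diagonals(lam, theta):
    """The three off-diagonal elements of Lambda' Theta^-1 Lambda (eq 1, 2, 3)."""
    rows = range(len(lam))
    return (cross(lam, theta, 0, 1, rows),
            cross(lam, theta, 0, 2, rows),
            cross(lam, theta, 1, 2, rows))


def solve_dependent(lam, theta):
    """Return a copy of lam with L11, L23, L33 set so that eq 1-3 hold."""
    lam = [row[:] for row in lam]
    rest = range(3, len(lam))          # variables 4..p enter only as fixed sums
    s12 = cross(lam, theta, 0, 1, rest)
    s13 = cross(lam, theta, 0, 2, rest)
    s23 = cross(lam, theta, 1, 2, rest)
    l12, l13 = lam[0][1], lam[0][2]
    l21, l22 = lam[1][0], lam[1][1]
    l31, l32 = lam[2][0], lam[2][1]
    t1, t2, t3 = theta[0], theta[1], theta[2]
    a = l21 * l22 / t2 + l31 * l32 / t3 + s12   # eq 1 without its L11 term
    b = l12 * l13 / t1 + s23                    # eq 3 without its L23, L33 terms
    # eq 2 minus (L13/L12)*eq 1 minus (L21/L22)*eq 3 contains only L33.
    l33 = -l22 * t3 / (l22 * l31 - l21 * l32) * (s13 - l13 / l12 * a - l21 / l22 * b)
    l23 = -t2 / l22 * (b + l32 * l33 / t3)      # eq 3 solved for L23
    l11 = -t1 / l12 * a                         # eq 1 solved for L11
    lam[0][0], lam[1][2], lam[2][2] = l11, l23, l33
    return lam


# Hypothetical free parameters; the 0.0 entries are placeholders that get overwritten.
lam = [[0.0, 0.5, 0.4],
       [0.3, 0.6, 0.0],
       [0.4, 0.2, 0.0],
       [0.5, 0.3, 0.2],
       [0.6, -0.1, 0.3]]
theta = [0.5, 0.6, 0.7, 0.8, 0.9]

solved = solve_dependent(lam, theta)
print(off_diagonals(solved, theta))  # three values at floating-point zero
```

Each step only adds a multiple of one equation to another or isolates one parameter, so the original three constraints are satisfied exactly, provided L12, L22 and L22*L31-L21*L32 are non-zero. The same expressions become very long once written out for 12 variables.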


Thank you for the solution! It really looks like this is the way to go. But... I solved the equations as you suggested:
*************************************************
L11 = -T1/L12*(L21*L22/T2+L31*L32/T3+L41*L42/T4+L51*L52/T5+L61*L62/T6+L71*L72/T7
+L81*L82/T8+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12);
L33=-L22*T3/(L22*L31-L21*L32)*((L41*L43/T4+L51*L53/T5+L61*L63/T6+L71*L73/T7
+L81*L83/T8+L91*L93/T9+L101*L103/T10+L111*L113/T11+L121*L123/T12)
-(L13/L12*(L21*L22/T2+L31*L32/T3+L41*L42/T4+L51*L52/T5+L61*L62/T6+L71*L72/T7
+L81*L82/T8+L91*L92/T9+L101*L102/T10+L111*L112/T11+L121*L122/T12))
-(L21/L22*(L12*L13/T1+L42*L43/T4+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8
+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12)));
L23=-T2/L22*(L12*L13/T1+L32*(-L22*T3/(L22*L31-L21*L32)*((L41*L43/T4+L51*L53/T5
+L61*L63/T6+L71*L73/T7+L81*L83/T8+L91*L93/T9+L101*L103/T10+L111*L113/T11
+L121*L123/T12)-(L13/L12*(L21*L22/T2+L31*L32/T3+L41*L42/T4+L51*L52/T5
+L61*L62/T6+L71*L72/T7+L81*L82/T8+L91*L92/T9+L101*L102/T10+L111*L112/T11
+L121*L122/T12))-(L21/L22*(L12*L13/T1+L42*L43/T4+L52*L53/T5+L62*L63/T6
+L72*L73/T7+L82*L83/T8+L92*L93/T9+L102*L103/T10+L112*L113/T11+L122*L123/T12))))
/T3+L42*L43/T4+L52*L53/T5+L62*L63/T6+L72*L73/T7+L82*L83/T8+L92*L93/T9
+L102*L103/T10+L112*L113/T11+L122*L123/T12);
*************************************************
Apparently equation 3 becomes too large and I get this warning:
*** WARNING in Model Constraint command
Statement exceeds line limit. Break your statement into several parts.
L23=-T2/L22*(L12*L13/T1+L32*(-L22*T3/(L22*L31-L21*L32)*((L41*L43/T4+L51*L53/T5
Is there any way to solve this problem? It is not about the length of each line, which is already adjusted to 80 characters, but about the whole statement. Thank you again for the effort that you are making to solve it! Irene


I don't think there is any way to solve this. The maximum length is 510 in Version 3.12. We will increase this in the next update. In the meantime, if you send your input, data, and license number to support@statmodel.com, we can run it for you with our developmental version and send you the output. We plan to have the update available by the end of the month or shortly thereafter. 


Hi again, I changed the names of the parameters to make them shorter, which solved the length problem. But it still does not work. I have solved the equations many times for different parameters and checked for mistakes. I think that the solution proposed might not be entirely good, and might alter the constraint. Because:
EQ1=0
EQ2=0
EQ1-EQ2=0 -> EQ1=EQ2
We do something like this:
EQ1=0
EQ2=0
EQ1*a=0
EQ1*a-EQ2=0 -> EQ1=EQ2/a
Anyway, it is possible to solve it progressively by just solving for three parameters and substituting back into the other equations, without multiplying by anything, but then the equations become longer than 510. I will just wait for the update with the improvements in MODEL CONSTRAINT. Hope it comes soon. Thank you!

bmuthen posted on Wednesday, July 13, 2005 - 9:19 am



Can you please email me your latest input, output and data so we can try it here? 

David Bard posted on Tuesday, March 06, 2007 - 8:00 pm



I'm trying to fit the CFA version of an EFA described in the Joreskog (1979) addendum to his original (1969) specification. With 6 variables, I'm having difficulty fitting a 3 or even 2 factor EFA solution. Below is the 3 factor syntax. In both models, estimation does not reach convergence. Do you notice anything wrong with the syntax? The vars are highly skewed; in fact, I'd really like to fit this as a Tobit SEM using the M+ censored normal option, but for now, one step at a time. Your thoughts?
Model:
F1 BY NVAB1*
NVAB2@1
!NVAB3
!NVAB4
NVAB5 NVAB6;
F2 BY NVAB1*
!NVAB2
NVAB3@1
!NVAB4
NVAB5 NVAB6;
F3 BY NVAB1*
!NVAB2
!NVAB3
NVAB4@1
NVAB5 NVAB6;
F1-F3;
F1 with F2-F3;
F2 with F3;


The convergence problem may come about because the starting values for factor loadings are one and with EFA in a CFA framework many factor loadings are small. You may want to start the smaller loadings at zero. We have found convergence problems when we don't do this. 


Hello, I am doing an EFA in a CFA framework. I have been learning how to do this from the online classes at UCLA. I am having trouble running my model. I have a large dataset (n=2367) and I am using the following analysis:
ANALYSIS: Estimator = ml; TYPE = MISSING; INTEGRATION = Montecarlo; MITERATIONS = 1000
I also have 22 dependent variables (indicators), and 4 continuous latent variables. This is the error message that I am receiving:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD. THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 78 IS 0.10762752D+05.
I am not sure what to do. Should I stop using ML and go to WLSMV? I would like to stay in ML, as then I can have logistic regression, and I would like chi-square statistics or other test statistics. I also want to go on to CFA with covariates. Is it possible to use WLSMV and still get chi-square statistics for CFA with covariates? Your help would be much appreciated. I can also send my input or output if that helps. Regards, Carla Zelaya, PhD Candidate, Dept. of Epidemiology, JHSPH


I forgot to mention that my dependent variables (22) are all categorical (ordered, 5-point Likert scale). Thanks, Carla


You can try using the MITERATIONS option to increase the number of iterations as suggested by the error message. With four factors, weighted least squares may be a better estimator choice. You will obtain fit statistics with WLSMV with and without covariates. If you continue to have problems, please send your input, data, output, and license number to support@statmodel.com. 


Hello, I am currently a PhD candidate working on my dissertation, but am new to latent variable analyses... I have been trying to bring myself up to speed on methodology, and have a question about EFA in a CFA framework. From the online courses and handouts, it seems that the recommended approach is to create the "m" restrictions using one anchor item per factor. Is there a reason why this approach is superior to creating the restrictions by setting loadings at exactly what the output from the EFA specifies? This way, you wouldn't need to rely on identifying a good anchor item... Thank you!


You cannot give all of the factor loadings from the EFA. To identify the model, you must place restrictions equal to the number of factors squared. The choice of the anchor item determines the rotation. 
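The "number of factors squared" count can be tallied for the two identification schemes that come up in this thread. This is a minimal Python sketch with hypothetical function names. It assumes the usual anchor-item setup (factor variances fixed at 1, each anchor item's loadings on the other factors fixed at 0) and, for the orthogonal scheme discussed earlier in the thread, Psi = I plus a diagonal Lambda'*Theta~*Lambda.

```python
def anchor_item_restrictions(m):
    """m factor variances fixed at 1, plus (m - 1) zero loadings for each of m anchor items."""
    return m + m * (m - 1)


def orthogonal_diagonal_restrictions(m):
    """Psi = I fixes m variances and m(m-1)/2 covariances;
    a diagonal Lambda' Theta^-1 Lambda adds m(m-1)/2 zero constraints."""
    return m * (m + 1) // 2 + m * (m - 1) // 2


for m in (2, 3, 4):
    print(m, anchor_item_restrictions(m), orthogonal_diagonal_restrictions(m), m * m)
# 2 4 4 4
# 3 9 9 9
# 4 16 16 16
```

For the 3-factor model earlier in the thread this gives 6 restrictions from Psi = I and 3 from the off-diagonal constraints, 9 in total.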


Hi Linda, thank you for your quick response! However, what if you place restrictions equal to the number of factors squared, and define those restrictions on the basis of the actual loadings from the EFA? In the "standard" approach using anchor items, it seems to me that what we are doing is approximating the loadings from the EFA by setting the cross-loadings for the anchor items to zero, so shouldn't it also work to structure the CFA based on the actual loadings from the EFA? (I hope my attempt at an explanation makes things clearer, and apologize if this is not the case.) Thank you again, Kevin


I think you will find without good anchor items, you will not recover the EFA as well as with good anchor items. Try it and see. 


Dear Professors, I am trying to test a unidimensional factor structure using multilevel EFA with 7 ordinal items. Yet I think multilevel EFA is not applicable with categorical data, and so I was wondering if I should use multilevel EFA in a CFA framework with Mplus? Second, I was wondering if you would recommend that I estimate this model in Mplus by fixing the within and between factor variances to 1 and freely estimating the factor loadings? Third, are there any other parameters I should constrain/freely estimate in Mplus to accurately test a unidimensional factor structure using multilevel E/CFA? 


You can do 2-level EFA with categorical variables in Mplus. See the Topic 7 short course handout and video of 3/29/11 on our website, slides 97-106.


I noticed the messages from Irene above about the model constraint command. I am trying to run a model that uses several characters, and I have written code to ensure that each model constraint is not longer than 80 characters. However, I run into the same problem that Irene did in 2005, and I get the following error: *** WARNING in MODEL CONSTRAINT command Statement exceeds line limit. Break your statement into several parts. Again, none of the lines exceeds 80 characters (most of them are under 30), but they do span several lines (perhaps more than 100). What is the limit to the length of the model constraint here? 


Please send the full output and your license number to support@statmodel.com. 
