
Anonymous posted on Friday, April 23, 2004  5:45 am



I have recently run a confirmatory factor analysis with several constructs related to the development of antisocial behavior. Although fit indices and chi-square change statistics indicate that the proposed four-factor solution is the best-fitting model, the factors are highly correlated (.75-.84). Will statisticians question the independence of these constructs because of their high correlation? In addition, do you run into the same multicollinearity problems in SEM when including highly correlated factors in a regression together? 

bmuthen posted on Friday, April 23, 2004  6:51 am



I don't think this is problematic. Factors are often naturally correlated, not independent. Multicollinearity is always a risk. One way around this is to postulate a second-order factor behind highly correlated factors. 

Anonymous posted on Wednesday, April 28, 2004  5:22 am



Thanks for the info. A few follow-up questions. Do you have recommendations on how to check for multicollinearity in Mplus 3.0? In addition, when a second-order factor is used to handle the issue, do you simultaneously regress the DV onto the second-order factor and all first-order factors, or do you simply regress the DV onto all first-order factors without including the second-order factor? 

bmuthen posted on Thursday, April 29, 2004  6:15 pm



I think multicollinearity can be checked just like in regular regression with observed variables. With a 2nd-order factor, you can simply regress the DV on that factor only. 
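
A minimal sketch of the second-order approach described above, assuming four first-order factors with three indicators each. The indicator names y1-y12, the factor names, and the outcome dv are hypothetical and not taken from the original poster's data.

```mplus
! Hypothetical names: y1-y12 are indicators, dv is the outcome.
MODEL:
  f1 BY y1-y3;
  f2 BY y4-y6;
  f3 BY y7-y9;
  f4 BY y10-y12;
  g  BY f1-f4;    ! second-order factor behind the four correlated factors
  dv ON g;        ! the DV is regressed on the second-order factor only
```

With this specification the first-order factors are related to the DV only through g, so they are not entered as competing predictors.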

Anonymous posted on Sunday, November 21, 2004  6:18 pm



Hi, I just have a question about the correlation between factors in a confirmatory factor analysis. I wonder if a rotation is used when doing a CFA. If yes, is it promax? If no, how are the correlations between factors calculated? Another question: if the observed variables are categorical, does it make a difference for the correlation between factors? Thanks 


CFA does not use a rotation. The covariances are estimated as part of the model. Whether the factor indicators are continuous or categorical, the factor is continuous. 

Boliang Guo posted on Saturday, September 03, 2005  11:32 am



Hi Linda, how do I set the factor correlation to 1? Thank you for your kind attention. 


Boliang, I know the question was directed at Linda, but the answer is: f1 WITH f2@1.0; where f1 = factor 1 and f2 = factor 2. 


Note that this is only a correlation if the metric of the factor is set by fixing the factor variances to one. If the metric of the factors is fixed by setting one factor loading to one, then this is a covariance. 
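
To make the distinction concrete, here is a sketch of the two ways of setting the factor metric, shown as two alternative MODEL commands. The indicator names y1-y6 are hypothetical.

```mplus
! Alternative 1: factor variances fixed at one, all loadings free.
! Here the f1 WITH f2 parameter is a correlation.
MODEL:
  f1 BY y1* y2 y3;
  f2 BY y4* y5 y6;
  f1@1 f2@1;
  f1 WITH f2;

! Alternative 2 (the Mplus default): first loading of each factor fixed at one.
! Here the f1 WITH f2 parameter is a covariance, so f1 WITH f2@1
! would fix the covariance, not the correlation, to one.
MODEL:
  f1 BY y1 y2 y3;
  f2 BY y4 y5 y6;
  f1 WITH f2;
```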

yufang posted on Thursday, October 06, 2005  11:39 am



Does anyone know of references that recommend how large a correlation between factors must be to be considered moderate or high, and at which point we should check for multicollinearity? 

bmuthen posted on Saturday, October 08, 2005  11:55 am



I do not.  Others? 

yang posted on Monday, April 17, 2006  12:30 pm



Why are the covariance matrix and the correlation matrix identical for factors in CFA? Thanks. 


Your factor variances must be one. 

yang posted on Friday, June 23, 2006  6:53 am



I ran a CFA and got some correlation coefficients among the factors with absolute values greater than 1. Thanks to Linda for telling me that this means the corresponding factors are not statistically distinguishable. Now I have another question: how can the correlation coefficients among the factors have absolute values greater than 1? Are we using some different formula to calculate these coefficients? Shouldn't the correlation coefficients range from -1 to 1? Thanks a lot. 


Correlations are not calculated using a formula. The correlations are estimated as part of the model. When variables correlate one, model estimation is thrown off and values greater than one can occur. This is why the results are inadmissible in this case. 

yshing posted on Tuesday, November 14, 2006  9:53 am



I'm running a multiple-group CFA with 2 factors. For theoretical reasons I don't want to fix the variance of the latent constructs to one. For one of my groups I would like to fix the correlation of the two latent constructs to 1. I understand that a nonlinear constraint is needed in the model. How can I do this? Thank you for your help. 


See MODEL CONSTRAINT in the user's guide. You can create a new parameter that is the covariance divided by the product of the two standard deviations and make the new parameter equal to one. 
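
One way this constraint might be written is sketched below. It is an assumption-laden illustration, not the exact syntax from the reply: it uses the algebraically equivalent form "covariance equals the product of the two standard deviations" rather than defining a new parameter, and the indicator names y1-y6, the group label g1, and the parameter labels v1, v2, c12 are all hypothetical.

```mplus
! Sketch only: y1-y6, g1, v1, v2 and c12 are hypothetical names.
MODEL:
  f1 BY y1-y3;
  f2 BY y4-y6;
MODEL g1:
  f1 (v1);             ! variance of f1 in group g1
  f2 (v2);             ! variance of f2 in group g1
  f1 WITH f2 (c12);    ! covariance of f1 and f2 in group g1
MODEL CONSTRAINT:
  c12 = SQRT(v1*v2);   ! covariance = product of SDs, i.e. correlation = 1
```

Because the labels are given only in the group-specific MODEL command, the constraint applies to that group alone and the factor variances stay free.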


Hi, I am trying to run a confirmatory factor analysis with categorical variables only: MODEL: f1 BY fte1*0.600 fte2*0.526 fte6*0.745 fte7*0.654 fte8*0.690 fte10*0.639 fte11*0.655 ftt16*0.409 ftt17*0.460 ftt18*0.463 ftf18*0.408; f2 BY ftt19*0.405 ftf1*0.745 ftf2*0.749 ftf3*0.628 ftf8*0.472 ftf9*0.428 ftf13*0.484 ftf17*0.402; f1@1 f2@1; f1 WITH f2@0; Under this model I would assume that the factors have variance 1 and are uncorrelated. However, the saved factor scores are still correlated and have a variance unequal to one (around 0.5). Do I have to specify more? Thanks for any help, Sigbert 


Factor scores are not identical to the factors in the estimated model. Deviations can occur when the factors do not have high factor determinacies. This is one reason that it is better to work with factors in a simultaneous model rather than work with factor scores. 


Hi, so what do I save with the scores? The deviations are pretty large for a two-factor model. Could this be because the model fits badly (CFI 0.171, TLI 0.202, RMSEA 0.243)? I wanted to test a 2-factor model before I go to the full 9-factor model. Can I check somewhere how many iterations the CFA needed? Maybe the model has not converged; it took some minutes to finish, although estimating each factor separately was very quick. Since we want to estimate an SEM or a multilevel model with these factors in another program, I need reliable factor scores, if possible. Thanks in advance, Sigbert Klinke 


If the model did not converge, it would say so in the output. I think the differences can be attributed to a poorly fitting model. 

Daniel Shen posted on Tuesday, March 13, 2007  8:19 am



Prof. Muthen, In CFA, can we obtain standard errors of factor correlation estimates, in order to build a 95% CI? Thanks, Daniel 


You can do this in two ways. You can set the metric of the factors by setting the factor variances to one. Then you obtain correlations among the factors rather than covariances. Or you can use MODEL CONSTRAINT to create a correlation from the covariance. 
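
A sketch of the second approach, assuming the default metric (first loading fixed at one). The indicator names y1-y6 and the labels v1, v2, c12, r12 are hypothetical. The new parameter r12 is reported with a standard error, and requesting CINTERVAL in the OUTPUT command should add confidence intervals for it.

```mplus
! Sketch only: y1-y6 and all parameter labels are hypothetical.
MODEL:
  f1 BY y1-y3;
  f2 BY y4-y6;
  f1 (v1);              ! factor variance of f1
  f2 (v2);              ! factor variance of f2
  f1 WITH f2 (c12);     ! factor covariance
MODEL CONSTRAINT:
  NEW(r12);
  r12 = c12 / SQRT(v1*v2);   ! correlation = covariance / (SD1 * SD2)
OUTPUT:
  CINTERVAL;
```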


I would be very grateful for your advice: 1) Assuming I do a CFA and want to test the orthogonal model, would "f1 WITH f2@0" be the correct syntax to use? 2) Is it possible to obtain CFI, TLI, RMSEA, WRMR values for the independence model (standard control in CFA), and what would the relevant syntax be? (I have 23 indicators that are ordinal/categorical data and 2 latent variables.) I tried MODEL: f1 BY v1 f2 BY v2 ... f23 BY v23 but it obviously didn't work. Many thanks, Ioanna 


1) Yes. 2) The independence model is called the "baseline model" in Mplus and is automatically included so that CFI etc. can be reported; see the output. 


Thanks very much. Unfortunately, I cannot see any CFI/TLI/RMSEA/WRMR values for the baseline model in the output. I can see the relevant chi-square, df and p value though. Where should I look for the CFI/TLI/RMSEA/WRMR of the baseline model? Many thanks, Ioanna 


You don't get CFI etc. for the baseline model because that would just compare the baseline model to itself. The baseline model is merely used to compute CFI. If you are interested in the chi-square fit of the baseline model, you would have to specify it yourself in the MODEL command: y1-yp WITH y1-yp@0; where p is the number of variables. This will then give you a chi-square test of how well (probably how very badly) the baseline model fits relative to an unrestricted model. Again, CFI is not relevant since both the H1 and H0 models are the baseline model. 


This is very clear, thanks very much. 

Kihan Kim posted on Wednesday, January 28, 2009  10:11 pm



Dear Dr. Muthen, I'm testing a two-factor CFA model with 7 items (4 items for F1, and 3 items for F2). I wanted to perform a chi-square difference test between (M1) a two-factor CFA in which the factors are allowed to correlate, and (M2) a two-factor CFA with the interfactor correlation fixed at 1. When I run M2, I keep receiving the following convergence problem. I looked over the User Manual regarding "Convergence Problems," and am still not sure what I should try. Could you help me resolve this problem? NO CONVERGENCE. SERIOUS PROBLEMS IN ITERATIONS. ESTIMATED COVARIANCE MATRIX NON-INVERTIBLE. CHECK YOUR STARTING VALUES. Kihan 


Unless you have set the metric of the factor by freeing all factor loadings and fixing the factor variance to one, you are fixing the covariance to one not the correlation. Fixing a parameter to an incorrect value may cause convergence problems. If you want to test if a covariance or correlation is equal to one, use MODEL TEST. See the user's guide for further information. 

Kihan Kim posted on Friday, January 30, 2009  9:04 am



Thank you for your answer. I was able to fix the interfactor correlation to 1 by freeing all factor loadings and fixing the factor variances to one. I was also trying to use MODEL TEST, but I'm still not clear on how to use it. Could you suggest a command so that I can test the interfactor correlation against 1 using the MODEL TEST command for the following MODEL command of a 2-factor CFA? MODEL: f1 by id1 id2 id3 id4; f2 by fit1 fit2 fit3; 


The following code tests if the factor correlation is one. It does not fix it to one. MODEL: f1 BY id1* id2 id3 id4; f1@1; f2 BY fit1* fit2 fit3; f2@1; f1 WITH f2 (p1); MODEL TEST: 0 = 1 - p1; 


Dear Linda & Bengt, I wonder how you can get the significance level of oblimin factor correlations. Would be great if you helped me. Thanks a lot in advance, Tina 


You can't get these with TYPE=EFA but you can get them with the new EFA using the MODEL command. See the Version 5.1 Examples and Language Addendums on the website with the Mplus User's Guide. 

Tracy Witte posted on Thursday, March 04, 2010  9:57 am



I am running a CFA with the WLSMV estimator. Similar to the original person on this thread, the correlation between my factors is very high (in my case, approximately .92). However, when I use the DIFFTEST procedure to compare the two-factor to the one-factor model, the results show that the fit of the model is significantly worsened when I specify a one-factor model. Is my next step to determine if the factors have differential predictive validity? It seems unlikely with such a high correlation between the two of them. From my reading of the above thread, it looks like I should specify a higher-order factor and regress the DV onto lower-order factors. Is this correct? (Thank you!) 


I would try an EFA to see if your CFA model is even in the ballpark for these data. 


I want to confirm I am interpreting my output correctly, as I am unsure exactly how the unstandardized vs. STDYX output is calculated within Mplus. Within a larger model, I have multicollinearity between two latent predictors (Parental Support and Peer Support) of the latent outcome variable, educational expectations. The issue is that the standard errors (SEs) associated with these direct effects are much larger in the unstandardized output than in the standardized STDYX output. I know inflated SEs are a symptom of multicollinearity and have addressed this by constraining the paths of peer and parental support -> educational expectations to be equal. (I'll spare the details.) Even after doing so, why are the SEs associated with these two direct effects so much larger in the unstandardized section of the output than in the STDYX output? 


See the STANDARDIZED option in the user's guide and Technical Appendix 3 on the website for information about standardizations used in Mplus. It sounds like your efforts to get around your multicollinearity are not succeeding. I suggest using only one of the variables involved. 


Hi, I would like to test whether factor correlations are equal to each other in CFA, and I have specified in Mplus: F1 with F2 (1); F2 with F3 (1); F3 with F4 (1); F5 with F6 (1); In the output all covariances are 0.044, but the correlations from the STDYX standardization range from 0.338 to 0.512. LR test from EQS: Chi^2(3)=6.89, P=.07. LR test from Mplus: Chi^2(3)=7.80, P=.05. Have I performed the right test in Mplus, if I want to replicate the results from EQS? Thanks in advance. 


The difference in results between Mplus and EQS is likely due to Mplus using n and EQS using n-1 for the chi-square computation. This difference shows up for small samples, which I assume you have. 


Hi, In this post I read that when the correlation between factors is greater than 1, this means the corresponding factors are not statistically distinguishable. But does it also mean that the correlation is actually 1.00, and the model is simply estimating a somewhat higher correlation? Additionally, does this mean that when I have 2 factors with a correlation of 1.16, I can assume a one-factor model fits the data better? When I did a chi-square difference test, it indicated that the two-factor model was better. 


A correlation estimate of 1 means that the factors are indistinguishable. A correlation estimate higher than 1 means that the model does not make sense for the data because correlations should not be higher than 1. So even if chi-square says that two factors fit better, you should not choose that model. Instead, another model should be explored. 


Hi, I am new to Mplus and I am still learning the syntax to run the analysis. I am running a five-factor CFA model and wanted to remove non-significant correlations between 2 latent factors. Could you please tell me how this is done in Mplus? Thanks for your help. 


If you want to fix the factor covariance to zero, you do so by f1 WITH f2@0; 


model: fc by f1 f6 f11 f16; ex by f7 f17; cf by f3 f8 f13 f18; au by f4 f9 f14 f19; lf by f5 f10 f15 f20; lf with fc@0; Does this look correct? 


Yes. 


Hello. I have a simple question: I am doing a 7-factor CFA and my model results indicate some non-significant factor covariances (i.e. WITH statements of 0.443 and 0.177). What do you recommend doing about it? What is the effect if I set them to 0, and what is the effect if I simply do nothing about it? Global fit indices are all very good and all BY statements are statistically significant. Thanks in advance, Jan 


I would not fix them to zero. I would leave them as is. 

Xiaolu Zhou posted on Tuesday, November 08, 2011  1:22 pm



I am a new user. Could you help me with my CFA syntax? My data are binary. I am not sure whether this syntax is correct for checking the correlation between the 2 factors. My syntax is: TITLE: c scale VARIABLE: NAMES ARE country c16 c17 c18 c19 c20 c21 c22 c23 c24 c25 c26 c27 c28 c29 c30; USEVARIABLES ARE c16 c18 c19 c20 c21 c22 c23 c26 c27 c30; CATEGORICAL ARE c16 c18 c19 c20 c21 c22 c23 c26 c27 c30; MISSING are all (999); MODEL: b BY c16* c18 c23 c26 c27 c30; v BY c19* c20 c21 c22; b@1; v@1; b With v; ANALYSIS: ESTIMATOR = WLSMV; OUTPUT: standardized MODINDICES (3.5); For the output, I have two questions: 1. Where can I find the correlation between the two factors? 2. For model fit, if chi-square, CFI and RMSEA are good and only TLI is less than .95, can we still call the model fit acceptable? Many thanks! 


The correlation is in the results under b WITH v. If most fit statistics show good fit, that should be acceptable. 

Xiaolu Zhou posted on Wednesday, February 01, 2012  10:19 am



Hi Linda, I have another question about the CFA of binary data: I found some non-significant thresholds. What do these mean? Do they matter to my model? If they matter, what should I do with them? Thanks a lot! 


Thresholds are used in the computation of probabilities and to test for measurement invariance. I would not be concerned with their significance. 

Sarah posted on Friday, October 05, 2012  12:49 pm



Hi, I have a quick and probably very easy question. In my SEM model I wish to obtain the correlations between my latent factors. I used the "tech4" option which provides such correlations. But how do I know if the correlations are significant? Put differently how can I obtain the level of significance of the correlations? Thank you very much. Sarah 


We don't provide this. You could define the correlations in MODEL CONSTRAINT and get a standard error that way. 

sailor cai posted on Wednesday, November 07, 2012  2:35 am



Hi Linda, One question about standardization: I am testing a latent interaction in an SEM model. The relation between two factors is two-way. So, how can I standardize correlation coefficients? E.g. F4 on F1 F2 F3; F1 with F2 F3; F2 with F3; So, how do I standardize, say, r12? Thanks! Sailor 


If f1-f4 are factors defined using the BY option, you can ask for STD in the OUTPUT command. I don't see any interaction in the MODEL command. 

sailor cai posted on Wednesday, November 07, 2012  7:40 pm



Hi Linda, thanks for the direction. Sorry for not giving the whole syntax. My original commands are: Analysis: Type=random; Algorith=integration; Model: F1 by b1 b2 b3 b4; F2 by l1 l2; F3 by s1 s2 s3 s4 s5 s6; F4 by t1 t2 t3 t4; F4 on F1 F2 F3 F1xF2; F1 with F2 F3; F2 with F3; F1xF2 | F1 XWITH F2; As TYPE=random is used, STD cannot be requested. Is there an equation to calculate the standardized correlations? Thanks again! 


This is not possible if numerical integration is required. 

sailor cai posted on Thursday, November 08, 2012  9:34 pm



By "not possible", do you mean asking for standardized correlation is only not possible: 1) using Mplus commands? 2) yet to be developed in the literature? 3) theoretically not possible? Your further clarification will be appreciated! 


1. Yes. 2-3. See the FAQ "Latent Variable Interactions" on the website. 


Dear Sir, please guide me. I have used a second-order factor model: F1 BY y1 y2 y3; F2 BY y4 y5 y6; F3 BY y7 y8 y9; F4 BY F1 F2 F3; F5 BY x1 x3 x4 x5; F6 BY v1 v2 v3 v4; F5 ON F4; F6 ON F5; MODEL INDIRECT: F6 IND F5 F4; The fit indices are CFI=.912, TLI=.90, RMSEA=.038, SRMR=.05. The problem is that F4, which comprises three factors, shows a non-significant relationship with F2. I checked the correlation of the scale with each subscale and got highly significant correlations. No doubt structural equation modeling deals better with measurement error and takes residual covariances into account, but is there any way out of this problem? F1 has a standardized estimate of .832 with F4. F2 has a non-significant correlation of .022 with F4, and its residual variance estimate is 1 and significant. F3 has a standardized estimate of .899. How should I report this, given that one factor is non-significant but the model fit indices are good enough? 


This general SEM question is better suited for SEMNET. 

Kwan K. posted on Sunday, June 30, 2013  11:27 am



Hi Linda, I saw that you previously replied to Sarah (question above) that Mplus does not provide the level of significance of the correlations. However, I have found several articles using Mplus and reporting the level of significance of the factor correlations. I was wondering if there is any other way to get it or I missed something here. Best, Kwan 


These are now available in TECH4 of the OUTPUT command in most cases. 

Kwan K. posted on Monday, July 01, 2013  1:21 am



Thank you Linda. Do you mean these are available in the latest version of Mplus? since it apparently does not work for my Mplus VERSION 6.11. Best, Kwan 


Yes, they are available in the latest version of Mplus. 


Dear Professors, I am trying to check whether there is multicollinearity with my two highly correlated factors (.86) in a regression. As suggested, I simply regressed both factors on a second-order factor. I obtained VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE, where my second-order factor and a DV had a correlation greater than 1. Is this an indicator of multicollinearity? All best, Hugo 


You need more than two factors to define a second-order factor in a solid way. It isn't identified without restrictive specifications. A correlation of 0.86 most likely gives rise to multicollinearity problems. 


Hi, I want to make sure that what I'm doing is the correct way of correlating factors with each other and factors with observed variables. I ran an SEM, and for publication I want to report a table of correlations between all variables used (observed and latent). For factor correlations the syntax is: f1 with f2; For factor and observed correlations: f1 with age; Then I examine the respective STDYX "with" estimates? Thanks! 


Yes, but make sure you don't change your SEM model by adding these correlations. You find latent variable correlations in TECH4. If you want correlations between latent and observed variables not included in TECH4 you have to put a factor behind the observed variables. Typically, however, you report the model parameters, not these correlations. 


I have a simple question about including multiple correlation (WITH) statements in a model. I have a latent factor that I am correlating with the five domains of personality, modeled as separate indicators. I have run five separate models with one personality domain correlated with the latent factor at a time, but I want to correlate all 5 of them concurrently with the latent factor. This runs fine and the values are similar to those obtained when I run the separate models. My question regarding the concurrent model is: are these correlations independent of each other, or will they be controlling for the other indicators that are being correlated with the latent factor at the same time? 


They will be regular correlations if you also allow correlations among your 5 personality domain variables so that this part of the model is just-identified (saturated). 
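
A sketch of what such a saturated specification could look like, assuming a latent factor f with indicators y1-y4 and five observed domain scores dom1-dom5. All of these names are hypothetical.

```mplus
! Sketch only: y1-y4 and dom1-dom5 are hypothetical variable names.
MODEL:
  f BY y1-y4;
  f WITH dom1-dom5;        ! factor correlated with all five domains at once
  dom1 WITH dom2-dom5;     ! all pairwise covariances among the domains
  dom2 WITH dom3-dom5;     ! are freed, so this part of the model is
  dom3 WITH dom4 dom5;     ! just-identified (saturated)
  dom4 WITH dom5;
```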


Thanks. Otherwise, would they be semipartial correlations, or something else? (That is, if I don't allow them to be correlated.) 


No, in that case they are just ignorable due to a misfitting model. 

M.O. posted on Friday, May 01, 2015  11:28 pm



I am comparing 1- and 2-factor models for factor mixture analysis. In the 2-factor model, the factor correlation turned out to be 1. I suspect this means factors 1 and 2 are not statistically distinguishable, and the 2-factor model should not be considered. However, AIC and BIC are smaller in the 2-factor model than in the 1-factor model (AIC, 10111 vs 10012; BIC, 10273 vs 10189). Does this mean that the 2-factor model is the better model for my data? Is there anything I could do to improve the model? Just to clarify, here is the input for the 2-factor model. Thanks a lot for your advice. INPUT: TITLE: LCACFA DATA: FILE = '35itemN888.dat'; VARIABLE: NAMES = u1-u35; USEVARIABLES = u14-u17 u30-u32; CATEGORICAL = u14-u17 u30-u32; CLASSES = C(4); ANALYSIS: TYPE IS MIXTURE; ALGORITHM=INTEGRATION; MODEL: %OVERALL% f1 by u14-u17; f2 by u30-u32; OUTPUT: STANDARDIZED; OUTPUT: STANDARDIZED MODEL RESULTS Estimate S.E. Est./S.E. P-Value F2 WITH F1 1.000 0.000 ********* 0.000 


You could check which factor loadings are big, and for pairs of items with large loadings you can replace the factor with WITH statements for those pairs of items. Use Parameterization=Rescov. See the new article on our website: Asparouhov, T. & Muthen, B. (2015). Residual associations in latent class and latent transition analysis. Structural Equation Modeling: A Multidisciplinary Journal, 22(2), 169-177, DOI: 10.1080/10705511.2014.935844 
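
A rough sketch of what that suggestion might look like for the input above, with the factors dropped and a residual association added for one item pair. The pair u14 WITH u15 is purely hypothetical; in practice the pairs would be chosen from the items with large loadings, and the exact options should be checked against the user's guide.

```mplus
! Sketch only: the item pair u14 WITH u15 is a hypothetical choice.
VARIABLE:
  NAMES = u1-u35;
  USEVARIABLES = u14-u17 u30-u32;
  CATEGORICAL = u14-u17 u30-u32;
  CLASSES = C(4);
ANALYSIS:
  TYPE = MIXTURE;
  PARAMETERIZATION = RESCOV;
MODEL:
  %OVERALL%
  u14 WITH u15;    ! residual association replacing the factor for this pair
```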
