Confirmatory factor analysis
Mplus Discussion > Confirmatory Factor Analysis >
 Anonymous posted on Monday, August 26, 2002 - 6:38 pm
My thesis topic is confirmatory factor analysis. Can you suggest the best type of data to apply it to, and in what field of interest?
 bmuthen posted on Tuesday, August 27, 2002 - 10:15 am
There are so many application areas. I think you should study the literature and explore the area that you are most interested in and that is also agreeable to your mentor and your department.
 Anonymous posted on Tuesday, August 27, 2002 - 10:12 pm
Thank you very much!
 Anonymous posted on Monday, September 02, 2002 - 11:11 am
I have conducted a multigroup factor analysis in Mplus (using categorical indicator variables).

I want to output the Mplus factor scores (FSs) to a file and then match them to my original data set.

I'm having a great deal of difficulty because Mplus does not save a case ID to its output files. Furthermore, Mplus appears to re-sort the input data by group ID and by other criteria before producing the FS output file. I know this because even after I re-sort my input data by group ID and case ID, the weights in the input file are ordered differently than in the Mplus output file.

Is there any way to sort the Mplus FS output file so that I can reliably merge the FSs back into my original data set?
 Linda K. Muthen posted on Monday, September 02, 2002 - 11:38 am
Mplus Version 2.0 and later allows the inclusion of an ID variable. The IDVARIABLE option is part of the VARIABLE command.
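For example, a minimal sketch of an input file that saves factor scores together with a case ID (the file and variable names here are hypothetical):

```
VARIABLE: NAMES = caseid group u1-u6;
          USEVARIABLES = u1-u6;
          CATEGORICAL = u1-u6;
          GROUPING = group (1 = g1 2 = g2);
          IDVARIABLE = caseid;    ! carried through to saved files
MODEL:    f BY u1-u6;
SAVEDATA: FILE = fscores.dat;
          SAVE = FSCORES;
```

The saved file then contains caseid alongside the factor scores, so the scores can be merged back on caseid regardless of how Mplus orders the records.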
 Anonymous posted on Tuesday, September 03, 2002 - 10:50 am
Perfect. I'd consulted the wrong part of the manual. Works great.
 Anonymous posted on Tuesday, September 10, 2002 - 12:03 am
Is it possible to include an indirect effect when examining measurement invariance of a single-factor measure in a multiple-group model? Thanks!
 bmuthen posted on Tuesday, September 10, 2002 - 8:02 am
Yes. I assume you mean that you have an x variable that influences the factor and therefore the indicators indirectly.
 Hervé CACI posted on Monday, February 24, 2003 - 2:13 am
In some recent exchanges on SEMNET, Stan Mulaik argued that his parsimony ratio should be taken into consideration in fit testing. I don't see how it can work with WLSMV, since the number of degrees of freedom reflects both the number of parameters to be estimated and the data. Neither Stan nor anybody else on the list answered my question. Is it a worthless thought?

 bmuthen posted on Tuesday, February 25, 2003 - 9:42 am
I think you might want to use WLS for this.
 Anonymous posted on Tuesday, June 07, 2005 - 6:44 am
What is the maximum number of dichotomous items Mplus can handle when doing a CFA? When I ran a model with 147 dichotomous items, it kept running. Thanks!
 Linda K. Muthen posted on Tuesday, June 07, 2005 - 7:04 am
The maximum number of variables allowed in Mplus is 500. With categorical outcomes, the analysis can take some time with 147 items depending on the speed of your computer.
 Eric Buhi posted on Tuesday, January 31, 2006 - 10:38 am
According to the APA manual, I need to report means/SDs for all the variables I include in my modeling. I get variable means with SAMPSTAT, but how do I produce the standard deviations? Thanks!
 Linda K. Muthen posted on Tuesday, January 31, 2006 - 10:45 am
Take the square root of the variances that are also reported in the sample statistics. For example, a reported variance of 1.44 corresponds to a standard deviation of 1.2.
 Eric Buhi posted on Tuesday, January 31, 2006 - 11:16 am
Thank you for your reply. Do you mean the covariances on the diagonal (following the means results)?
 Linda K. Muthen posted on Tuesday, January 31, 2006 - 1:05 pm
The variances are on the diagonal of a variance/covariance matrix. The off-diagonal elements are covariances.
 anonymous posted on Wednesday, January 10, 2007 - 11:42 am

I have performed a CFA with 7 factors. In the output, is it possible to get eigenvalues for each of these factors, similar to an SPSS or Stata output for factor analyses?

 Bengt O. Muthen posted on Thursday, January 11, 2007 - 8:36 am
The short answer is no. A longer answer follows.

Mplus gives eigenvalues for exploratory factor analysis, and these eigenvalues are for the sample correlation matrix, used to guide the choice of the number of factors. Many researchers have used the amount of variance explained in the observed variables by a factor as a descriptive measure of the quality of the factor solution. This amount of variance is the sum of the squared loadings in a column (for a factor) when the factors are uncorrelated. This amount is related to the eigenvalue - it would be the eigenvalue if the estimation method were principal component analysis (which is not a great estimator for factor models). Also, one could compute the eigenvalues for the model-estimated correlation matrix.

However, I would question the value of eigenvalue information for factor analysis beyond the EFA purpose of guiding the choice of the number of factors. To decide on a well-fitting model in CFA, we have better fit-measure alternatives (and eigenvalues are not fit measures anyhow). And since factor analysis is not designed to maximize variance explained (but rather to capture the correlation structure), the descriptive value of an eigenvalue is also not clear.
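As a worked illustration of the sum-of-squared-loadings point above: with uncorrelated factors, the variance in the standardized observed variables explained by factor $j$ is

```latex
\text{Var explained by factor } j \;=\; \sum_{i} \lambda_{ij}^{2}
```

so a factor with standardized loadings of .8, .7, and .6 accounts for $0.64 + 0.49 + 0.36 = 1.49$ units of standardized variance - the quantity that would equal the eigenvalue if the solution were a principal component analysis.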
 anonymous posted on Tuesday, January 16, 2007 - 8:21 pm
Does this mean that the value of the variance for each factor in the output is the variance explained by the factor? I am a little confused as to what it represents.

 Linda K. Muthen posted on Wednesday, January 17, 2007 - 9:07 am
No, the factor variance is how much variability there is in the factor. Variance explained refers to how much variance of the factor indicators is explained by the factor. You can find this by looking at the R-square values of the factor indicators.
 Reetu Kumra posted on Monday, February 26, 2007 - 10:53 am

I have a few questions:

1. In a confirmatory factor analysis output, how is the column labeled StdYX (the last column) interpreted? Is this the correlation between the latent construct and the actual variable?
Please help.

2. When doing a CFA on two groups within a sample, what is the difference in doing a multi-group analysis and doing a CFA on these two groups separately?

 Linda K. Muthen posted on Monday, February 26, 2007 - 3:54 pm
1. This is a raw coefficient standardized using both latent variable and observed variable variances.

2. If you analyze both groups together with all parameters free across groups, you will obtain the same estimates as if you analyzed the two groups separately. Usually, the two groups are analyzed together so that equality constraints can be used to test for measurement invariance.
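For instance (a sketch with hypothetical variable names, assuming continuous indicators), one run can keep the Mplus two-group defaults of loadings and intercepts held equal across groups, while a second run frees them in the second group:

```
VARIABLE: GROUPING = g (1 = group1 2 = group2);
MODEL:    f BY y1-y4;      ! defaults: loadings and intercepts equal across groups
MODEL group2:
          f BY y2-y4;      ! free the loadings in group2 (y1 stays fixed at 1)
          [y1-y4];         ! free the intercepts in group2
```

Comparing the fit of the constrained and unconstrained runs (e.g., with a chi-square difference test) is the usual test of measurement invariance.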
 Reetu Kumra posted on Tuesday, February 27, 2007 - 11:09 am
Thanks Linda!

One last question: Once the CFA is complete, is there a way to make the latent constructs created into a measurable variable? (i.e. Can we somehow get something equivalent to data for the latent constructs?)
 Linda K. Muthen posted on Tuesday, February 27, 2007 - 11:26 am
Are you asking if you can obtain factor scores? If so, you can do this using the FSCORES option of the SAVEDATA command.
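A minimal sketch of that command (the file name is hypothetical):

```
SAVEDATA: FILE = fscores.dat;
          SAVE = FSCORES;
```

The estimated factor scores are appended to the analysis variables in the saved file, and the order of the saved variables is listed at the end of the Mplus output.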
 Reetu Kumra posted on Tuesday, February 27, 2007 - 12:08 pm
Hi Linda.

I have two more questions:

1. How exactly is the StdYX derived? When you say standardized, please clarify how the raw coefficients are standardized.

2. How exactly are the factor scores created? Is this an overall measure of the raw data that go into the factor?

 Linda K. Muthen posted on Tuesday, February 27, 2007 - 1:28 pm
1. See Technical Appendix 3 which is on the website.
2. See Technical Appendix 11 which is on the website.
 Derek Kosty posted on Wednesday, August 20, 2008 - 10:25 am

I have noticed that the numbers of free parameters in Mplus Version 4 and Version 5.1 disagree. When running the model:


Version 4 counts 9 free parameters and Version 5.1 counts 18. What is the reason behind this?

 Linda K. Muthen posted on Wednesday, August 20, 2008 - 11:11 am
With Version 5, TYPE=MEANSTRUCTURE became the default; this is the cause. You can add MODEL=NOMEANSTRUCTURE; to the ANALYSIS command to override this default.
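That is, adding the following to the input reproduces the Version 4 default of fitting only the covariance structure:

```
ANALYSIS: MODEL = NOMEANSTRUCTURE;
```

With the mean structure dropped, the observed-variable means and intercepts (the additional 9 parameters in this example) no longer enter the free-parameter count.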
 Joykrishna Sarkar posted on Friday, July 31, 2009 - 11:37 am
I was trying to save the standardized output for configural, metric, scalar, and complete invariance tests. But Mplus does not save the standardized output for the factor loadings in the metric invariance test; for the intercepts and factor loadings in the scalar invariance test; or for the intercepts, factor loadings, and residual variances in the complete invariance test. Instead of saving the standardized output for the parameters mentioned above, Mplus saves 999 (missing). Any help on how to save these standardized outputs would be appreciated.
 Linda K. Muthen posted on Friday, July 31, 2009 - 4:20 pm
Mplus does not save standardized parameter estimates that are constrained to be equal.
 ehsan malek posted on Wednesday, July 14, 2010 - 11:17 am

I have a CFA model with two latent variables.
I calculated the average variance extracted (AVE) for each of the two variables, and it is around .3 for each. Composite reliability is around .7 for each of the two latent variables.
I have around 500 cases. The model fit indices are OK (almost OK; chi-square is not, and I think that is because of the large sample size).
What can I do about the AVE (as its recommended value is >.5)? Does it have something to do with the sample size? As other things are OK with the model, can I accept it?

 Linda K. Muthen posted on Thursday, July 15, 2010 - 7:57 am
I would look at factor determinacy. It is probably correlated with AVE. Can you give a reference for AVE? I would also not discount chi-square with a sample size of 500. This is not large.
 Christopher Bratt posted on Monday, July 19, 2010 - 3:29 pm
Linda, AVE is average variance extracted in factor analysis.
(It would be great if Mplus could compute AVE...)
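In the meantime, AVE can be computed by hand in Mplus with MODEL CONSTRAINT. A sketch for one factor with three continuous indicators (hypothetical names; the factor variance is fixed at 1, so each term below is a squared standardized loading, i.e., the indicator's R-square):

```
MODEL:    f BY y1* (l1)
               y2  (l2)
               y3  (l3);
          f@1;
          y1 (e1);
          y2 (e2);
          y3 (e3);
MODEL CONSTRAINT:
          NEW(ave);
          ave = ( l1**2/(l1**2 + e1)
                + l2**2/(l2**2 + e2)
                + l3**2/(l3**2 + e3) ) / 3;
```

The NEW parameter ave is the average of the indicator R-squares, which is the usual definition of average variance extracted, and it is reported with a standard error in the output.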

Chris B.
 Morayo Ayodele posted on Tuesday, July 03, 2012 - 10:29 am
Hello Dr. Muthen,

Is there a reason why a model would run without errors in one sample but not in another, irrespective of sample size? I am trying to run a four-factor model in four independent samples of N = 234, 296, 334, and 568. It returned errors for the N = 296 and N = 334 samples.

F1 by sgl3 sgl17 sgl25 sgl68;
F2 by sgl41 sgl42 sgl67 sgl76 sgl100;
F3 by sgl5 sgls8 sgl78 sgl84 sgl94 sgl96 sgl98;
F4 by sgl30 sgl40 sgl55 sgl83 sgl92 sgl97 sgl102;

Output: Sampstat standardized mod tech4;

WARNING: The latent variable covariance matrix (psi) is not positive definite. This could indicate a negative variance/residual variance for a latent variable, a correlation greater or equal to one between two latent variables, or a linear dependency among more than two latent variables. Check the tech4 output for more information. Problem involving variable F2.

I did observe a correlation greater than 1 between two latent variables (F2 & F4). Is there any way of fixing this problem?

Thank you
 Linda K. Muthen posted on Tuesday, July 03, 2012 - 10:48 am
The same model might not be correct for different data sets. It sounds like that is the case.

A correlation greater than one means the model is inadmissible. You need to change the model.
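One common respecification check, using the model from the post above: refit with F2 and F4 collapsed into a single factor and compare fit. If the merged model fits about as well, the two factors are not empirically distinct in that sample. A sketch:

```
F1 by sgl3 sgl17 sgl25 sgl68;
F24 by sgl41 sgl42 sgl67 sgl76 sgl100
       sgl30 sgl40 sgl55 sgl83 sgl92 sgl97 sgl102;
F3 by sgl5 sgls8 sgl78 sgl84 sgl94 sgl96 sgl98;
```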
 Mahdi posted on Tuesday, April 22, 2014 - 9:12 am
Dear Drs. Muthén,
My data are on components of the metabolic syndrome and its indicators (e.g., HDL, LDL, BMI, TC, ...), and the sample size is almost 800. I ran a CFA on my data, but the chi-square test of model fit is significant and X2/df is greater than 8. Also, my RMSEA is greater than 0.1, but CFI and TLI are greater than 0.9 and SRMR is less than 0.05. Is the model correct for analysis? Our purpose is genetic association analysis on selected markers.

Thanks a lot.
 seungjin lee posted on Tuesday, April 22, 2014 - 10:02 am
I have fit a three-factor CFA model to a binary data set under both ULI and UVI identification, just to compare (WLSMV).

However, the signs of the correlations among the latent variables are completely opposite between ULI and UVI, even though the absolute values are identical. To my eye, this does not make sense.

Also, the signs of the loadings for one latent variable are completely opposite between ULI and UVI as well.

Do you know why?
 Linda K. Muthen posted on Wednesday, April 23, 2014 - 10:15 am

You can look at modification indices to see where fit can be improved or you can do an EFA to see if your CFA is reasonable.
 Linda K. Muthen posted on Wednesday, April 23, 2014 - 10:20 am

This is not a problem. You can use positive starting values for the factor that gets negative loadings. Then you will get a positive correlation.
 Mahdi posted on Wednesday, April 23, 2014 - 11:30 am
Dear Dr.Muthen
Based on our medical theory I can't change the CFA model .
 seungjin lee posted on Wednesday, April 23, 2014 - 2:05 pm
Hello Muthen,

Thank you so much!
However, I am still confused.

1. Then, between ULI and UVI, which sign of the correlations among the latent variables can I trust? In fact, I was expecting positive correlations among the latent variables.

2. I did not use any starting values in my model. In fact, I am not familiar with this concept. In order to use positive starting values for the factor that gets negative loadings, which syntax should I use?

3. Also, why should I use starting values for the factors and not for the loadings?

4. Finally, in which cases should I consider starting values?

Thank you so much in advance!

 Linda K. Muthen posted on Wednesday, April 23, 2014 - 3:08 pm

If you can't change the model, you can't improve the fit.
 Linda K. Muthen posted on Wednesday, April 23, 2014 - 3:16 pm

In factor analysis, the factor is the same when all factor loadings are positive or all factor loadings are negative. If you prefer the positive interpretation of the factor, you can change the factor loadings using starting values, for example, if the factor loadings are -.4 and -.3, say

f BY y1 y2*.4 y3*.3;

When the factor has negative loadings, it correlates negatively with the other factor. If you reverse the sign of the loadings, it will correlate positively. Both interpretations are valid.
 seungjin lee posted on Thursday, April 24, 2014 - 10:07 am
Dear DR. Muthen;

Thanks so very much!!

I have learned a lot already!!

Have a nice weekend in advance!!!!

 Sara Aguilar-Barrientos posted on Monday, June 25, 2018 - 12:34 pm
When testing measurement invariance in two groups, the factor correlation is too high (more than 0.9). How can I fix this? Besides, when regressing one latent variable on the other with the ON command, the effect is not statistically significant, with estimates higher than 1. This makes no sense.
 Bengt O. Muthen posted on Monday, June 25, 2018 - 12:56 pm
A high factor correlation is often due to cross-loadings that are fixed at zero but should be free - check with modification indices (MODINDICES).
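For example, requesting modification indices above the 1-df chi-square critical value:

```
OUTPUT: MODINDICES (3.84);
```

Large M.I. values on BY statements (e.g., F1 BY y5 for an item currently loading only on the other factor; item and factor names here are hypothetical) point to candidate cross-loadings worth freeing.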