Mplus Discussion >> How MPLUS Computes Factor Scores

How MPLUS Computes Factor Scores

Mplus Discussion > Confirmatory Factor Analysis >

RDU posted on Monday, December 01, 2008 - 5:54 pm

I was looking at the online technical appendix and happened upon the formula used to compute factor scores for CFA/SEM models. After looking at this, I came up with two questions.

1.) It appears that Mplus takes the posterior moment approach, which uses the posterior distribution, i.e., the conditional distribution of the latent factors (Y) given the observed data (X):
f(Y|X)

Furthermore, the means of the posterior distribution are used as estimates of the factor scores. Am I correct in my assumption here?

2.) I know that SPSS uses Bartlett's method and the Anderson-Rubin method to compute factor scores. Are these methods different from what is used in Mplus because they use the alternative solution as opposed to the posterior moment or Bayesian solution?

Thanks, and I hope all of this is clear.

Bengt O. Muthen posted on Monday, December 01, 2008 - 6:17 pm

1) For continuous outcomes and for categorical outcomes with weighted least squares estimation, Mplus uses the maximum a posteriori method. For continuous outcomes, this is also called the regression method - probably the most commonly used method for factor score estimation. For categorical outcomes this is also called MAP (e.g. in IRT).

For ML with categorical and other non-normal outcomes, Mplus uses EAP, the expected a posteriori method.

2) Those methods are different.

RDU posted on Tuesday, December 02, 2008 - 9:31 am

Thank you for the quick response. As a follow-up question, for ML with continuous outcomes, does Mplus use EAP, MAP, or something different?

Best,

RDU

Bengt O. Muthen posted on Tuesday, December 02, 2008 - 10:01 am

MAP - the regression method.
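
As an illustration of the answer above: with continuous, normally distributed outcomes the posterior distribution of the factors given the data is itself normal, so its mode (MAP) and its mean coincide, and both equal the regression-method score. The sketch below is a minimal NumPy version of that textbook formula, not Mplus's own code. The function name, argument names, and all parameter values are hypothetical.

```python
import numpy as np

def regression_factor_scores(Y, lam, phi, theta, nu, alpha):
    """Regression-method (posterior mean = posterior mode) factor scores
    for a linear factor model with normal factors and residuals.

    Y: n x p data, lam: p x m loadings, phi: m x m factor covariance,
    theta: p x p residual covariance, nu: p intercepts, alpha: m factor means.
    """
    sigma = lam @ phi @ lam.T + theta            # model-implied covariance of the indicators
    coef = phi @ lam.T @ np.linalg.inv(sigma)    # m x p scoring coefficients
    mu = nu + lam @ alpha                        # model-implied indicator means
    return alpha + (Y - mu) @ coef.T

# Hypothetical one-factor model with three indicators (values chosen for illustration only)
lam = np.array([[1.0], [0.8], [0.6]])
phi = np.array([[1.0]])
theta = np.diag([0.5, 0.5, 0.5])
nu = np.zeros(3)
alpha = np.zeros(1)

Y = np.array([[1.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])
scores = regression_factor_scores(Y, lam, phi, theta, nu, alpha)
print(scores)
```

A person sitting exactly at the indicator means gets a score equal to the factor mean, and every other score is pulled toward that mean relative to the raw data.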

Maren Winkler posted on Friday, June 18, 2010 - 4:53 am

Dear Drs. Muthén,

is it possible to estimate factor scores in Mplus 5.21 using weighted likelihood estimation (WLEs)? If so, what would one have to specify in the input file?

Thanks for your help!

Linda K. Muthen posted on Friday, June 18, 2010 - 9:50 am

Sampling weights can be used in the estimation of the model from which factor scores are derived.

Maren Winkler posted on Wednesday, June 23, 2010 - 6:04 am

Dear Drs. Muthén,

I have estimated factor scores for a measurement model with 7 indicators. All load on a common factor (i.e., F1 BY a b c d e f g;). Additionally, I have established a nested factor for four indicators (i.e., F2 BY d e f g;), with F1 and F2 not being correlated (F1 WITH F2@0;).

When I correlate the factor scores for F1 and F2, the correlation amounts to .289. Why would that happen? Is there a way to specify that their correlation is 0?

Thanks for your help!

Linda K. Muthen posted on Wednesday, June 23, 2010 - 10:57 am

Factor scores and factors are not the same so this correlation is not surprising. You cannot specify the correlation is zero for the purposes of generating factor scores.
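
To see why this is not surprising, one can work out the covariance matrix of regression-method factor scores for a model with the same structure as in the question. The sketch below assumes continuous indicators and uses made-up loadings, so it will not reproduce the .289 reported above. It only shows that the scores are correlated even though the factor covariance is fixed at zero.

```python
import numpy as np

# Hypothetical parameter values for a model of the form
#   F1 BY a b c d e f g;  F2 BY d e f g;  F1 WITH F2@0;
# with both factor variances set to 1 for simplicity.
lam = np.array([
    [0.7, 0.0],   # a
    [0.7, 0.0],   # b
    [0.7, 0.0],   # c
    [0.7, 0.5],   # d
    [0.7, 0.5],   # e
    [0.7, 0.5],   # f
    [0.7, 0.5],   # g
])
phi = np.eye(2)                                   # uncorrelated factors
theta = np.diag(1.0 - (lam ** 2).sum(axis=1))     # residual variances giving unit-variance indicators

sigma = lam @ phi @ lam.T + theta                 # model-implied indicator covariance
coef = phi @ lam.T @ np.linalg.inv(sigma)         # regression-method scoring coefficients

cov_scores = coef @ sigma @ coef.T                # covariance matrix of the estimated scores
sd = np.sqrt(np.diag(cov_scores))
corr_scores = cov_scores / np.outer(sd, sd)

print("factor correlation (model):", phi[0, 1])
print("factor score correlation:  ", corr_scores[0, 1])
```

The scores for F1 and F2 are both weighted sums of the shared indicators d to g, which is what induces the correlation.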

Heather Pearce posted on Wednesday, November 10, 2010 - 6:56 pm

Hello,
I have been provided with factor scores for my categorical variable factors, whose items were all measured on 5-point likert scales. While I appreciate the utility of these and will be using them for the correlational analyses I need to perform, I'm also expected to provide raw means and standard deviations for these factors. Can you please tell me how I rescale these factor scores to have the same Ms and SDs as my original scale?
Thanks,
Heather

Bengt O. Muthen posted on Thursday, November 11, 2010 - 7:54 am

You won't get factor scores coming out scored as 5-point Likert scales. Factors are specified as continuous normal variables. With continuous outcomes the mean of the factor is also different from the mean of the factor indicators due to the inclusion of measurement intercepts which pick up the indicator means. Also, means and variances of estimated factor scores are not the same as those of the estimated model parameters. Furthermore, the estimated factor scores do not have the same correlations with other variables as the true factors do. These are some of the reasons behind the need to do a 1-step SEM instead of the multi-step procedure it appears you have in mind. Only if you have many good indicators would the estimated factor scores work like true factor scores. It doesn't help to rescale the estimated factor scores.

It sounds like you have categorical factor indicators which adds another layer of complexity - you can't put a continuous factor score into a categorical scale.
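
The point about variances and "many good indicators" can be illustrated for the continuous-outcome case. The sketch below computes the variance of regression-method scores for a one-factor model whose factor variance is 1, using hypothetical equal loadings and residual variances. The score variance stays below the factor variance and approaches it only as indicators are added. It does not cover categorical indicators.

```python
import numpy as np

def score_variance(n_indicators, loading=0.7, resid_var=0.51, factor_var=1.0):
    """Variance of regression-method factor scores for a one-factor model
    with identical indicators (all values hypothetical)."""
    lam = np.full((n_indicators, 1), loading)
    phi = np.array([[factor_var]])
    theta = resid_var * np.eye(n_indicators)
    sigma = lam @ phi @ lam.T + theta
    coef = phi @ lam.T @ np.linalg.inv(sigma)
    return float((coef @ sigma @ coef.T)[0, 0])

variances = {p: score_variance(p) for p in (3, 10, 30)}
for p, v in variances.items():
    print(f"{p:2d} indicators: score variance = {v:.3f} (factor variance = 1)")
```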

Sarah Phillips posted on Wednesday, December 12, 2012 - 2:59 pm

Hello,

I am using multilevel data (students nested in classrooms nested in teachers) and have computed the factor score of a classroom-level scale using TYPE=COMPLEX. I am hoping to use these factor scores as independent variables in an HLM model.

I understand from reading Linda's response to others that the factor scores MPLUS generates are not standardized but do have a mean of zero.

I am wondering how to interpret any significant results for this scale in my HLM models because I am not sure what the units of the factor score are and whether it would be correct to say that they are standardized around the grand mean. For example, can I say that a score that is 1 standard deviation above the grand mean is associated with an X standard deviation change in my dependent variable (this variable is standardized around the grand mean)?

Thanks for your help!

Bengt O. Muthen posted on Wednesday, December 12, 2012 - 7:15 pm

You can standardize your estimated factor scores before you use them in your model.

But why do you take this 3-step approach (FA-factorscores-HLM)? You can do this in 1 step in Mplus.
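
If the multi-step route is taken anyway, standardizing the saved scores is an ordinary z-score transformation. A minimal sketch, assuming the scores have already been read into an array with one row per classroom (the values below are invented):

```python
import numpy as np

def standardize(scores):
    """Return z-scores: subtract the sample mean and divide by the sample SD."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)

# Hypothetical classroom-level factor scores
fscores = np.array([-0.42, 0.10, 0.35, -0.08, 0.61, -0.30])
z = standardize(fscores)
print(z.mean(), z.std(ddof=1))
```

After this step a one-unit difference corresponds to one sample standard deviation of the estimated scores, which is not the same thing as one standard deviation of the latent factor.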

deana desa posted on Wednesday, January 15, 2014 - 1:40 am

Does Mplus use MAP or EAP for computing factor scores when MLR is used?

Linda K. Muthen posted on Wednesday, January 15, 2014 - 9:51 am

We use MAP for MLR continuous and EAP for MLR categorical.

Elina Dale posted on Wednesday, January 22, 2014 - 9:16 pm

Dear Dr. Muthen,

In the write-up by Dr. Skrondal on MPlus website, it says the following: "With continuous variables, Mplus estimates factor scores as the maximum of the posterior distribution of the factor, which is the same as the Regression Method for factor score estimation. With this method, using factor scores as predictors gives unbiased regression slopes, but using factor scores as dependent variables gives biased slopes."

Does the last part of the statement, i.e. using factor scores as predictors gives unbiased slopes, also apply to MAP method used in MPlus with categorical outcomes?

Thank you!

Linda K. Muthen posted on Thursday, January 23, 2014 - 10:18 am

With MAP, slopes will be biased whether the factor scores are used as independent or dependent variables.

Alysia Blandon posted on Monday, February 02, 2015 - 7:46 am

I have a question regarding factor scores from a CFA. I remember reading on the discussion board that they are centered, so they will have a mean of zero. Is that right?

In my analyses, the factors were based on original variables that were difference scores so zero was meaningful in interpreting the direction of the effect. Am I correct in thinking that the zero point for the factor scores cannot be interpreted the same way?

Bengt O. Muthen posted on Monday, February 02, 2015 - 10:42 am

Factor scores are not centered, that is, Mplus does not subtract their means. Their means may not be zero (although typically close to it) because even if the model-estimated factor mean parameter is zero the estimated factor scores don't behave exactly the same way.

Alysia Blandon posted on Monday, February 02, 2015 - 11:34 am

Hi Dr. Muthen,

Thanks for the clarification.

Just to confirm, am I correct in thinking that the factor scores cannot be interpreted the same way as the difference scores for the original variables? Specifically, for the difference scores, zero indicated that there was no change between the two scores used to create the difference score. Or, since the factors were created from multiple difference scores, can I assume that a factor score of zero indicates on average no change?

Bengt O. Muthen posted on Monday, February 02, 2015 - 11:56 am

If all your factor indicators have zero sample means (as it sounds like your difference score variables do), then you would expect the estimates of your factor indicator intercepts to be zero (your factor mean parameters are fixed at zero as the default) to give estimated indicator means of zero, and you would expect your estimated factor scores to have close to zero means in your sample. The factor measures what the indicators have in common. So I would say that factor scores of zero can be interpreted as an overall difference score.

Justine Loncke posted on Friday, May 13, 2016 - 5:35 am

Dear all,

I wondered if there is an easy way to calculate Bartlett Factor scores in Mplus for a hierarchical CFA model. I understand that the method that is described in the User Guide is to obtain Regression factor scores.

Thanks in advance

Bengt O. Muthen posted on Friday, May 13, 2016 - 10:04 am

Not easy to do with Mplus but with some matrix algebra calculations outside Mplus it should not be hard.
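
One way to sketch the matrix algebra mentioned above: Bartlett scores are the generalized least squares estimate of the factors given the loadings and residual covariance matrix. The code below assumes continuous indicators and covers only the first-order factors, using estimated loadings, residual variances, and intercepts copied by hand from the output. It does not address how to score the higher-order factor. All numbers and names are hypothetical.

```python
import numpy as np

def bartlett_coefficients(lam, theta):
    """Bartlett scoring coefficients: (L' T^-1 L)^-1 L' T^-1."""
    theta_inv = np.linalg.inv(theta)
    return np.linalg.solve(lam.T @ theta_inv @ lam, lam.T @ theta_inv)

def bartlett_scores(Y, lam, theta, nu):
    """Apply the Bartlett coefficients to data after removing the intercepts."""
    return (Y - nu) @ bartlett_coefficients(lam, theta).T

# Hypothetical estimates for two first-order factors with three indicators each
lam = np.array([[1.0, 0.0], [0.9, 0.0], [0.8, 0.0],
                [0.0, 1.0], [0.0, 0.7], [0.0, 0.6]])
theta = np.diag([0.4, 0.5, 0.6, 0.3, 0.5, 0.7])
nu = np.array([2.0, 2.1, 1.9, 3.0, 3.2, 2.8])

Y = np.array([[2.5, 2.4, 2.0, 3.1, 3.0, 2.9],
              [1.5, 1.9, 1.7, 3.6, 3.5, 3.1]])
coef = bartlett_coefficients(lam, theta)
scores = bartlett_scores(Y, lam, theta, nu)
print(scores)
```

A defining property of these coefficients is that multiplying them by the loading matrix gives the identity matrix, which is a convenient check on hand-copied estimates.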

Dzifa Adjaye-Gbewonyo posted on Thursday, February 09, 2017 - 8:18 pm

I have done CFA with categorical indicators, and I would also like to generate factor scores from this to use in future regression analyses. When generating the factor scores, does it make a difference whether the CFA was run as unstandardized vs. standardized? In other words, will the factor scores still be valid if I do CFA with the default (first path fixed at 1) as opposed to fixing the variances? Also, do factor scores assume that measurement is invariant throughout the sample/data used in the CFA such that it is based on the overall relationship between the items and the factors?

Bengt O. Muthen posted on Friday, February 10, 2017 - 10:35 am

Q1: The factor scores will come out in different metrics.

Q2: Yes, they will be valid but come out in different metrics.

Q3: Yes, everyone is assumed to follow the same model.
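
What "different metrics" means can be shown for the simpler continuous-indicator case, where scores are a linear function of the data. In the sketch below the same one-factor model is identified two ways: factor variance fixed at 1, or first loading fixed at 1. The resulting regression-method scores differ only by a constant multiplier. The categorical case discussed above does not use this linear formula, and all parameter values are hypothetical.

```python
import numpy as np

def regression_scores(Y, lam, phi, theta):
    """Regression-method scores for mean-centered data Y."""
    sigma = lam @ phi @ lam.T + theta
    return Y @ (phi @ lam.T @ np.linalg.inv(sigma)).T

theta = np.diag([0.4, 0.5, 0.6])
Y = np.array([[0.5, -0.2, 1.0],
              [-1.0, 0.3, 0.1]])          # hypothetical mean-centered data

# Identification 1: factor variance fixed at 1, all loadings free
lam_std = np.array([[0.8], [0.7], [0.6]])
phi_std = np.array([[1.0]])

# Identification 2: first loading fixed at 1, factor variance free
first = lam_std[0, 0]
lam_marker = lam_std / first
phi_marker = phi_std * first ** 2         # implies the same indicator covariance matrix

scores_std = regression_scores(Y, lam_std, phi_std, theta)
scores_marker = regression_scores(Y, lam_marker, phi_marker, theta)
print(scores_std.ravel(), scores_marker.ravel())
```

Because one set of scores is a constant multiple of the other, correlations with other variables are unchanged. Only unstandardized slopes are affected.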

Dzifa Adjaye-Gbewonyo posted on Tuesday, February 28, 2017 - 8:20 pm

I am also using complex survey data in my analysis (which has stratification, clustering, and weights). When calculating the factor scores to use in future analysis, should I run a regular CFA as opposed to a complex CFA if I will be accounting for the survey design in the future analysis? (i.e. would calculating the factor scores using Type=complex and then analyzing the scores later as complex survey data duplicate the design-based analysis?)

Bengt O. Muthen posted on Wednesday, March 01, 2017 - 7:58 am

No, you should use Type=Complex when getting the factor scores. Otherwise, the model estimates on which the scores are based won't be the right ones.

Anders Hofverberg posted on Thursday, March 30, 2017 - 4:42 am

Dear professors,

I have two questions regarding the metric of factor scores and latent factor means.
First, I have read in other posts that the metric of latent factor means will be identical to the metric of the indicator with fixed factor loading (though centered around zero). What if there is no fixed factor loading, but the variance of the factor is fixed instead? What determines the metric then?
Second, is the metric of the factor scores similarly scaled?

Thank you in advance.

Bengt O. Muthen posted on Thursday, March 30, 2017 - 9:11 am

Q1-Q2: Mean zero, var 1.

Q3: No.

Anders Hofverberg posted on Friday, March 31, 2017 - 12:08 am

Thank you for your quick reply.
A follow-up to my last question: how are factor scores scaled? How to interpret a difference of e.g. 0.5 in factor scores?

Bengt O. Muthen posted on Saturday, April 01, 2017 - 4:38 pm

You see the factor score mean and variance - that gives you the scale for interpretations.

Ali posted on Tuesday, June 26, 2018 - 9:32 pm

I am doing measurement invariance before I extract factor scores for a latent variable to do a regression analysis.
In the data, measurement invariance holds at the configural, loadings, thresholds, and residual steps.

At which step should I extract the factor scores for the regression analysis?

Thank you :-)

Bengt O. Muthen posted on Wednesday, June 27, 2018 - 4:06 pm

I would use the most restrictive model that you can't reject.

Katharine Buek posted on Thursday, November 08, 2018 - 2:03 pm

Can I get output showing the factor scoring coefficients? I want to be able to apply the scoring parameters to another data set for other analyses. Thank you!

Bengt O. Muthen posted on Friday, November 09, 2018 - 1:10 pm

Ask for FSCOEFF.
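
Once scoring coefficients are in hand, applying them to a second data set is a matrix product, at least for continuous indicators where the scores are linear in the data. The sketch below assumes the coefficients have been typed in as a factors-by-indicators array, that the estimated indicator intercepts are subtracted first, and that factor means are zero. Both the layout and the numbers are assumptions, not a description of the Mplus output.

```python
import numpy as np

def apply_scoring_coefficients(Y_new, coef, intercepts):
    """Score new observations with fixed coefficients from an earlier model.

    Y_new: n x p data, coef: m x p scoring coefficients,
    intercepts: p estimated indicator intercepts (all hypothetical here).
    """
    Y_new = np.asarray(Y_new, dtype=float)
    return (Y_new - intercepts) @ coef.T

coef = np.array([[0.40, 0.32, 0.24]])      # hypothetical coefficients for one factor
intercepts = np.array([1.0, 2.0, 3.0])     # hypothetical intercept estimates

Y_new = np.array([[2.0, 3.0, 4.0],
                  [1.0, 2.0, 3.0]])
new_scores = apply_scoring_coefficients(Y_new, coef, intercepts)
print(new_scores)
```

This carries over the assumption that the second data set follows the same measurement model as the first.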

Katharine Buek posted on Wednesday, November 14, 2018 - 8:23 am

Hi there,
I have longitudinal factors, and I want to obtain factor scores based on the time 1 parameters for all the time points (I do not want the scores rescaled at each time point or they won't show growth). Would it be appropriate to fix the factor loadings for time2-time4 to the time 1 values, and request fscores in the output for each year? (I also do not want to include the full measurement model in my growth model, as it will be far too complex.)

Bengt O. Muthen posted on Wednesday, November 14, 2018 - 5:02 pm

You can fix loadings like that. But if the loadings and intercepts really aren't invariant over time, it seems that you don't measure the same construct and growth is not possible to study.

Daniel Lee posted on Thursday, January 24, 2019 - 8:22 am

Hello, I was wondering if it would be possible to save factor scores (from cfa) to look for optimal cutpoints (e.g., diagnostic thresholds) using ROC analysis. Would there be any issues with this analysis?

Thank you!

Bengt O. Muthen posted on Friday, January 25, 2019 - 10:43 am

You can use

Save = Fscores;

But it is hard to find clear cut points. You may want to discuss on SEMNET.
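
For readers who try this anyway, one common heuristic in ROC analysis is to pick the cut-point that maximizes Youden's J (sensitivity + specificity - 1). The sketch below applies it to saved factor scores and an external diagnosis indicator. The function name and data are invented, and because factor scores are estimates with error, any cut-point found this way is approximate at best.

```python
import numpy as np

def youden_cutpoint(scores, diagnosis):
    """Return the score threshold maximizing Youden's J, and J itself.
    Observations with score >= threshold are classified as positive."""
    scores = np.asarray(scores, dtype=float)
    diagnosis = np.asarray(diagnosis, dtype=bool)
    best_cut, best_j = None, -np.inf
    for cut in np.unique(scores):               # candidate thresholds in ascending order
        predicted = scores >= cut
        sensitivity = np.mean(predicted[diagnosis])
        specificity = np.mean(~predicted[~diagnosis])
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical saved factor scores and known diagnoses (1 = case)
fscores = [-1.2, -0.5, 0.1, 0.4, 0.6, 0.9, 1.5]
diagnosis = [0, 0, 0, 1, 0, 1, 1]
cut, j = youden_cutpoint(fscores, diagnosis)
print(cut, j)
```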