I was looking at the online technical appendix and happened upon the formula used to compute factor scores for CFA/SEM models. After looking at this, I came up with two questions.
1.) It appears that Mplus takes the posterior moment approach, which uses the posterior distribution, i.e. the conditional distribution of the latent factors (Y) given the observed data (X): f(Y|X)
Furthermore, the means of the posterior distribution are used as estimates of the factor scores. Am I correct in my assumption here?
2.) I know that SPSS uses Bartlett's method and the Anderson-Rubin method to compute factor scores. Are these methods different from what is used in Mplus because they use the alternative solution as opposed to the posterior moment or Bayesian solution?
1) For continuous outcomes and for categorical outcomes with weighted least squares estimation, Mplus uses the maximum a posteriori method. For continuous outcomes, this is also called the regression method - probably the most commonly used method for factor score estimation. For categorical outcomes this is also called MAP (e.g. in IRT).
For ML with categorical and other non-normal outcomes, Mplus uses EAP, the expected a posteriori method.
2) Those methods are different.
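To make the difference concrete, here is a minimal Python sketch for a one-factor model with continuous indicators. It is not Mplus code and not the formula from the technical appendix; the loadings, residual variances, data values, and function names are all made up for illustration. Under normality the posterior mean and the posterior mode coincide, so the regression method can be read either way.

```python
def regression_weights(loadings, residual_vars, factor_var=1.0):
    """Regression-method weights for one factor: w = psi * Sigma^{-1} * lambda.

    Uses the closed form for Sigma = psi * lambda lambda' + Theta (Theta diagonal).
    """
    # q = lambda' Theta^{-1} lambda
    q = sum(l * l / t for l, t in zip(loadings, residual_vars))
    scale = factor_var / (1.0 + factor_var * q)
    return [scale * l / t for l, t in zip(loadings, residual_vars)]


def bartlett_weights(loadings, residual_vars):
    """Bartlett weights: w = (lambda' Theta^{-1} lambda)^{-1} Theta^{-1} lambda."""
    q = sum(l * l / t for l, t in zip(loadings, residual_vars))
    return [l / (t * q) for l, t in zip(loadings, residual_vars)]


def score(weights, centered_x):
    """Factor score as a weighted sum of mean-centered indicators."""
    return sum(w * x for w, x in zip(weights, centered_x))


# Hypothetical parameter values and one hypothetical (centered) observation
loadings = [0.8, 0.7, 0.6]
residual_vars = [0.36, 0.51, 0.64]
x = [1.0, 0.5, -0.2]

w_reg = regression_weights(loadings, residual_vars)
w_bart = bartlett_weights(loadings, residual_vars)

# Shrinkage factor linking the two methods in the one-factor case
shrink = sum(w * l for w, l in zip(w_reg, loadings))

print("regression score:", score(w_reg, x))
print("Bartlett score:  ", score(w_bart, x))
print("shrinkage factor:", shrink)
```

In this one-factor case the regression score is the Bartlett score multiplied by a factor between 0 and 1, so the two methods give different values for the same person.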
RDU posted on Tuesday, December 02, 2008 - 9:31 am
Thank you for the quick response. As a follow-up question, for ML with continuous outcomes, does Mplus use EAP, MAP, or something different?
I have estimated factor scores for a measurement model with 7 indicators. All load on a common factor (i.e. F1 BY a b c d e f g;). Additionally, I have specified a nested factor for four of the indicators (i.e. F2 BY d e f g;), with F1 and F2 uncorrelated (F1 WITH F2@0;).
When I correlate the factor scores for F1 and F2, the correlation amounts to .289. Why would that happen? Is there a way to specify that their correlation is 0?
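One way to see why this can happen, assuming continuous indicators and regression-method scores: the covariance matrix of the estimated scores is Psi * Lambda' * Sigma^{-1} * Lambda * Psi, which is generally not diagonal even when Psi is. The Python sketch below uses made-up parameter values for a 7-indicator model of this form; the numbers and the function name are hypothetical and are not taken from the poster's output.

```python
from math import sqrt


def score_covariance(loadings_f1, loadings_f2, residual_vars):
    """Covariance of regression-method scores for two uncorrelated,
    unit-variance factors. With A = Lambda' Theta^{-1} Lambda, the
    score covariance matrix equals I - (I + A)^{-1}."""
    a = sum(l1 * l1 / t for l1, t in zip(loadings_f1, residual_vars))
    b = sum(l1 * l2 / t for l1, l2, t in zip(loadings_f1, loadings_f2, residual_vars))
    c = sum(l2 * l2 / t for l2, t in zip(loadings_f2, residual_vars))
    det = (1.0 + a) * (1.0 + c) - b * b  # determinant of I + A
    var1 = 1.0 - (1.0 + c) / det
    var2 = 1.0 - (1.0 + a) / det
    cov12 = b / det
    return var1, cov12, var2


# Hypothetical values: F1 BY a-g, F2 BY d-g, F1 WITH F2@0
loadings_f1 = [0.7] * 7
loadings_f2 = [0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5]
residual_vars = [0.51, 0.51, 0.51, 0.26, 0.26, 0.26, 0.26]

var1, cov12, var2 = score_covariance(loadings_f1, loadings_f2, residual_vars)
corr = cov12 / sqrt(var1 * var2)
print("correlation between estimated scores:", corr)
```

The factor correlation is fixed at zero in the model, yet the estimated scores correlate because both sets of scores are weighted sums of the shared indicators d through g.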
Hello, I have been provided with factor scores for my categorical variable factors, whose items were all measured on 5-point Likert scales. While I appreciate the utility of these and will be using them for the correlational analyses I need to perform, I'm also expected to provide raw means and standard deviations for these factors. Can you please tell me how I rescale these factor scores to have the same Ms and SDs as my original scale? Thanks, Heather
You won't get factor scores coming out scored as 5-point Likert scales. Factors are specified as continuous normal variables. With continuous outcomes, the mean of the factor is also different from the mean of the factor indicators due to the inclusion of measurement intercepts, which pick up the indicator means. Also, means and variances of estimated factor scores are not the same as those of the estimated model parameters. Furthermore, the estimated factor scores do not have the same correlations with other variables as the true factors do. These are some of the reasons behind the need to do a 1-step SEM instead of the multi-step procedure it appears you have in mind. Only if you have many good indicators would the estimated factor scores work like true factor scores. It doesn't help to rescale the estimated factor scores.
It sounds like you have categorical factor indicators which adds another layer of complexity - you can't put a continuous factor score into a categorical scale.
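The point about variances and about needing many good indicators can be illustrated with a small Python sketch. It assumes a one-factor model with continuous indicators and regression-method scores; the loading of 0.7, the residual variance of 0.51, and the function name are hypothetical choices for illustration only.

```python
def regression_score_variance(loadings, residual_vars, factor_var=1.0):
    """Population variance of regression-method scores for one factor:
    Var(f_hat) = psi * (psi * q) / (1 + psi * q), q = lambda' Theta^{-1} lambda."""
    q = sum(l * l / t for l, t in zip(loadings, residual_vars))
    return factor_var * (factor_var * q) / (1.0 + factor_var * q)


# The factor variance parameter is 1.0 in every case below
ratios = {}
for n_indicators in (3, 10, 50):
    loadings = [0.7] * n_indicators        # hypothetical loadings
    residual_vars = [0.51] * n_indicators  # hypothetical residual variances
    ratios[n_indicators] = regression_score_variance(loadings, residual_vars)
    print(n_indicators, "indicators -> score variance", round(ratios[n_indicators], 3))
```

The score variance stays below the factor variance parameter and only approaches it as the number of indicators grows, which is one reason rescaling the scores afterwards does not recover the true factor.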
I am using multilevel data (students nested in classrooms nested in teachers) and have computed the factor score of a classroom-level scale using TYPE=COMPLEX. I am hoping to use these factor scores as independent variables in an HLM model.
I understand from reading Linda's response to others that the factor scores Mplus generates are not standardized but do have a mean of zero.
I am wondering how to interpret any significant results for this scale in my HLM models because I am not sure what the units of the factor score are and whether it would be correct to say that they are standardized around the grand mean. For example, can I say that a score that is 1 standard deviation above the grand mean is associated with an X standard deviation change in my dependent variable (this variable is standardized around the grand mean)?
We use MAP for MLR continuous and EAP for MLR categorical.
Elina Dale posted on Wednesday, January 22, 2014 - 9:16 pm
Dear Dr. Muthen,
In the write-up by Dr. Skrondal on the Mplus website, it says the following: "With continuous variables, Mplus estimates factor scores as the maximum of the posterior distribution of the factor, which is the same as the Regression Method for factor score estimation. With this method, using factor scores as predictors gives unbiased regression slopes, but using factor scores as dependent variables gives biased slopes."
Does the last part of the statement, i.e. that using factor scores as predictors gives unbiased slopes, also apply to the MAP method used in Mplus with categorical outcomes?
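For the continuous case described in the quoted passage, the claim can be checked with population quantities in a short Python sketch. It assumes a one-factor model with known parameters and regression-method weights; all numeric values and names are hypothetical, and it says nothing about the categorical case asked about here.

```python
# Hypothetical one-factor measurement model: x_i = lambda_i * f + eps_i
loadings = [0.8, 0.7, 0.6]
residual_vars = [0.36, 0.51, 0.64]
psi = 1.0  # factor variance

# Regression-method weights, w = psi * Sigma^{-1} * lambda (closed form)
q = sum(l * l / t for l, t in zip(loadings, residual_vars))
weights = [psi / (1.0 + psi * q) * l / t for l, t in zip(loadings, residual_vars)]

# Moments of the score f_hat = w'x implied by the model
w_lambda = sum(w * l for w, l in zip(weights, loadings))
var_fhat = psi * w_lambda ** 2 + sum(w * w * t for w, t in zip(weights, residual_vars))
cov_f_fhat = psi * w_lambda

# Scores as predictor: true model y = beta * f + e
beta = 0.5
slope_predictor = beta * cov_f_fhat / var_fhat

# Scores as outcome: true model f = gamma * z + zeta, with z independent of eps
gamma = 0.5
slope_outcome = w_lambda * gamma

print("slope with scores as predictor:", slope_predictor, "(true", beta, ")")
print("slope with scores as outcome:  ", slope_outcome, "(true", gamma, ")")
```

With the scores as predictor the slope equals beta, because Cov(f, f_hat) equals Var(f_hat) for regression-method weights; with the scores as outcome the slope is shrunk toward zero by the factor w'lambda.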
I have a question regarding factor scores from a CFA. I remember reading something on the discussion board saying that they are centered, so they will have a mean of zero. Is that right?
In my analyses, the factors were based on original variables that were difference scores so zero was meaningful in interpreting the direction of the effect. Am I correct in thinking that the zero point for the factor scores cannot be interpreted the same way?
Factor scores are not centered, that is, Mplus does not subtract their means. Their means may not be zero (although typically close to it) because even if the model-estimated factor mean parameter is zero the estimated factor scores don't behave exactly the same way.
Just to confirm, am I correct in thinking that the factor scores cannot be interpreted the same way as the difference scores for the original variables? Specifically, for the difference scores, zero indicated that there was no change between the two scores used to create the difference score. Or, since the factors were created from multiple difference scores, can I assume that a factor score of zero indicates on average no change?
If all your factor indicators have zero sample means (as it sounds like your difference score variables do), then you would expect the estimates of your factor indicator intercepts to be zero (your factor mean parameters are fixed at zero as the default) to give estimated indicator means of zero, and you would expect your estimated factor scores to have close to zero means in your sample. The factor measures what the indicators have in common. So I would say that factor scores of zero can be interpreted as an overall difference score.
I wondered if there is an easy way to calculate Bartlett Factor scores in Mplus for a hierarchical CFA model. I understand that the method that is described in the User Guide is to obtain Regression factor scores.
I have done CFA with categorical indicators, and I would also like to generate factor scores from this to use in future regression analyses. When generating the factor scores, does it make a difference whether the CFA was run as unstandardized vs. standardized? In other words, will the factor scores still be valid if I do CFA with the default (first path fixed at 1) as opposed to fixing the variances? Also, do factor scores assume that measurement is invariant throughout the sample/data used in the CFA such that it is based on the overall relationship between the items and the factors?
I am also using complex survey data in my analysis (which has stratification, clustering, and weights). When calculating the factor scores to use in future analysis, should I run a regular CFA as opposed to a complex CFA if I will be accounting for the survey design in the future analysis? (i.e. would calculating the factor scores using Type=complex and then analyzing the scores later as complex survey data duplicate the design-based analysis?)
I have two questions regarding the metric of factor scores and latent factor means. First, I have read in other posts that the metric of latent factor means will be identical to the metric of the indicator with fixed factor loading (though centered around zero). What if there is no fixed factor loading, but the variance of the factor is fixed instead? What determines the metric then? Second, is the metric of the factor scores similarly scaled?
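As a rough illustration of how the two ways of setting the metric relate, the Python sketch below computes regression-method scores for a one-factor model with continuous indicators under both parameterizations. It assumes the two parameterizations describe the same fitted model; the parameter values, the data, and the function name are hypothetical.

```python
from math import sqrt


def regression_score(loadings, residual_vars, factor_var, centered_x):
    """Regression-method score for one factor, f_hat = psi * lambda' Sigma^{-1} x."""
    q = sum(l * l / t for l, t in zip(loadings, residual_vars))
    scale = factor_var / (1.0 + factor_var * q)
    return sum(scale * l / t * x for l, t, x in zip(loadings, residual_vars, centered_x))


residual_vars = [0.5, 0.6, 0.7]  # hypothetical residual variances
x = [1.2, 0.4, -0.3]             # hypothetical mean-centered observation

# Metric set by fixing the first loading at 1 (factor variance estimated)
marker_loadings = [1.0, 0.9, 0.8]
marker_factor_var = 0.5

# Same model with the factor variance fixed at 1 (all loadings estimated)
std_loadings = [l * sqrt(marker_factor_var) for l in marker_loadings]

score_marker = regression_score(marker_loadings, residual_vars, marker_factor_var, x)
score_std = regression_score(std_loadings, residual_vars, 1.0, x)

print("score, first loading fixed at 1:", score_marker)
print("score, factor variance fixed at 1:", score_std)
```

Under these assumptions the two scores differ only by the constant sqrt(marker_factor_var), so the ordering of individuals and any correlations computed from the scores are unchanged.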
I am testing measurement invariance before I extract factor scores for a latent variable to use in a regression analysis. In my data, measurement invariance holds at the configural, loadings, thresholds, and residuals steps.
At which step should I extract the factor scores for the regression analysis?
Hi there, I have longitudinal factors, and I want to obtain factor scores based on the time 1 parameters for all the time points (I do not want the scores rescaled at each time point or they won't show growth). Would it be appropriate to fix the factor loadings for time2-time4 to the time 1 values, and request fscores in the output for each year? (I also do not want to include the full measurement model in my growth model, as it would be far too complex.)