Obtaining factor scores

Mplus Discussion > Confirmatory Factor Analysis >

Stephanie West posted on Friday, May 14, 2004 - 8:19 am

Hello! I have used Mplus to confirm a scale which I developed. Now I want to use my data and the scale to compare groups (demographics) according to the latent variables. I think what I need to do is to get a factor score for each of the observed variables using Mplus then transfer the data to SPSS and do the following:

Example: I have 9 Latent Variables. The first Latent Variable (FAC1) is comprised of the following observed variables: C7, C11, C20 and C24.

1. Total the OV factor scores for each of the OV that contribute to a particular LV.
Example: (FS for C7) + (FS for C11) + (FS for C20) + (FS for C24) = FAC1 FS Sum
2. Divide each OV factor score by the total of OV factor scores to get a "proportional factor score" - so that all of the OV factor scores will equal 1 when summed - to give a % of the impact of each OV.
Example: (FS for C7)/(FAC1 FS Sum) = PFS for C7, where PFS = Proportional factor score
3. Use the proportional factor scores in a weighted sum formula. This would allow me to give meaning to the values b/c they would be on the same scale as the responses, 1-5.
Example: FAC1 = ((PFS for C7)*C7) + ((PFS for C11)*C11) + ((PFS for C20)*C20) + ((PFS for C24)*C24), where C7 is the respondent's answer to question C7 and FAC1 is now the calculated response for that person's beliefs about Factor 1 (Latent Variable 1)

Perhaps there is a better way to do this. Your advice would be greatly appreciated.

Stephanie West

Linda K. Muthen posted on Friday, May 14, 2004 - 9:27 am

You can compare the means, variances, and covariances of the latent variables using multiple group analysis and chi-square difference testing. Is this what you mean?
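For readers who have not run a multiple group analysis before, a minimal sketch of such an input follows. The file name survey.dat, the grouping variable gender, and its codes are assumptions made for illustration; only fac1 and the items C7, C11, C20, and C24 come from the question above, and the items are treated as continuous here.

```mplus
TITLE:    Two-group CFA sketch (hypothetical file and group names)
DATA:     FILE IS survey.dat;
VARIABLE: NAMES ARE gender c7 c11 c20 c24;
          USEVARIABLES ARE c7 c11 c20 c24;
          GROUPING IS gender (1 = male 2 = female);
MODEL:    fac1 BY c7 c11 c20 c24;
          ! Default in multiple group analysis: loadings and intercepts
          ! are held equal across groups; the factor mean is fixed at
          ! zero in the first group and free in the other group.
MODEL female:
          [fac1@0];   ! restricted run: factor mean fixed at zero here too
```

Running the input once with and once without the MODEL female lines gives two nested models. The difference between their chi-square values, referred to the difference in degrees of freedom, tests whether the factor mean differs between the groups. With estimators other than plain ML, the difference may need a scaling correction rather than simple subtraction.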

Stephanie posted on Friday, May 14, 2004 - 9:56 am

Yes, but I don't know how to do that.

Stephanie West posted on Friday, May 14, 2004 - 10:07 am

I don't know how to use multiple group analysis so I was going to use the above method. Would that work and if so, how can I get the factor scores?

Linda K. Muthen posted on Friday, May 14, 2004 - 10:22 am

You can get the factor scores using the SAVEDATA command. See the user's guide for details. If you are interested in seeing the steps we use to test measurement invariance and population heterogeneity using multiple groups, you could purchase the Day 1 handout from our short courses.
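A minimal sketch of an input that saves factor scores is given here. The file names survey.dat and fac1scores.dat are hypothetical, and the four items are treated as continuous for simplicity.

```mplus
TITLE:    Saving factor scores (hypothetical file names)
DATA:     FILE IS survey.dat;
VARIABLE: NAMES ARE c7 c11 c20 c24;
MODEL:    fac1 BY c7 c11 c20 c24;
SAVEDATA: FILE IS fac1scores.dat;   ! where the saved data are written
          SAVE = FSCORES;           ! request the estimated factor scores
```

The saved file contains the analysis variables followed by the estimated factor scores, one row per observation.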

Stephanie West posted on Friday, May 14, 2004 - 12:22 pm

I have ordered the handout and will try that. In the meantime, I'd like to get those factor scores. But when I used the SAVEDATA command, it said that it couldn't find the data. Any suggestions on what I need to do?

*** ERROR in Savedata command
Only sample correlation matrix may be saved when there is at least
one categorical dependent variable.

Stephanie posted on Friday, May 14, 2004 - 12:39 pm

Latest error message:

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE.
PROBLEM INVOLVING VARIABLE F3.

THE MODEL COVARIANCE MATRIX IS NOT POSITIVE DEFINITE.
FACTOR SCORES WILL NOT BE COMPUTED. CHECK YOUR MODEL.

Linda K. Muthen posted on Friday, May 14, 2004 - 2:02 pm

You must have a negative residual variance or a correlation greater than one. Please send your output to support@statmodel.com and I will look at it.

Paul Kim posted on Monday, August 09, 2004 - 9:42 pm

I'm trying to create factor scores but I get the following error message:

MINIMIZATION FAILED WHILE COMPUTING FACTOR SCORES FOR THE FOLLOWING
OBSERVATION(S) :
457 FOR VARIABLE V32011
484 FOR VARIABLE V32001
860 FOR VARIABLE V32001

Is this a data cleaning problem? I've looked through my data and cleaned it the best I could and checked for missing data, but I can't seem to fix this. Any suggestions?

Linda K. Muthen posted on Friday, August 13, 2004 - 3:49 pm

These observations must have unusual patterns, for example, getting the easy items wrong and the difficult items correct. This makes optimization difficult.

Anonymous posted on Monday, February 14, 2005 - 9:39 am

What are the 8 columns one gets from the Savedata: save=fscores option? thanks

Linda K. Muthen posted on Monday, February 14, 2005 - 9:53 am

If you look at the end of your output, the format and order of the variables in SAVEDATA file are described.

Girish Mallapragada posted on Monday, March 21, 2005 - 8:50 am

Hello Dr. Muthen,

Which regression method is used to obtain factor scores in Mplus: Anderson-Rubin or Bartlett, or is there some other?

regards

bmuthen posted on Monday, March 21, 2005 - 5:39 pm

Mplus uses the regression method (see e.g. Lawley-Maxwell's FA book) - also known as the modal posterior estimator - for continuous outcomes and for categorical outcomes with WLSMV. In other cases, it uses the expected posterior distribution approach.
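For continuous outcomes, the regression method mentioned here can be written out as follows. The notation is an assumption borrowed from common SEM usage and does not come from the post: ν denotes the intercepts, Λ the loadings, Ψ the factor covariance matrix, Θ the residual covariance matrix, and α the factor means.

```latex
\begin{aligned}
y_i &= \nu + \Lambda \eta_i + \varepsilon_i,
  \qquad \eta_i \sim N(\alpha, \Psi), \quad \varepsilon_i \sim N(0, \Theta), \\
\Sigma &= \Lambda \Psi \Lambda^{\top} + \Theta, \\
\hat{\eta}_i &= \alpha + \Psi \Lambda^{\top} \Sigma^{-1}
  \left( y_i - \nu - \Lambda \alpha \right).
\end{aligned}
```

Under this model the posterior distribution of η given y is normal, so its mode and its mean coincide. That is why the regression estimator can also be described as the modal posterior estimator.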

Girish Mallapragada posted on Monday, March 21, 2005 - 7:06 pm

Hello Dr. Muthen,

Thanks for the clarification.

Shuang Wang posted on Tuesday, June 14, 2005 - 3:43 pm

Hi Linda,

I used SAVEDATA: SAVE = FSCORES to save factor scores from the CFA, but I found all the factor scores in the output file are either 0 or 1. Can I ask what goes wrong here? Or if it is what I should get, how should I understand it?

Thank you!
Shuang

Shuang Wang posted on Tuesday, June 14, 2005 - 3:49 pm

Sorry Linda, please disregard my last question. There are three columns that have values that look like factor scores. Is the first column the factor score for the first factor, the second column the factor score for the second factor, and the third column the residual?
Thank you!
Shuang

Shuang Wang posted on Tuesday, June 14, 2005 - 4:14 pm

Hi Linda,

I have one more follow-up question. The number of individuals in the factor score file is different from the number of individuals in the input file. Is there any way to match these two files to know which factor score is for which individual?

Thank you!
Shuang

bmuthen posted on Wednesday, June 15, 2005 - 7:41 am

Please send your output, saved file, and data to support@statmodel.com.
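On the matching question, one option that may help is the IDVARIABLE setting of the VARIABLE command, which carries an identifier into the saved file. The sketch below is only an illustration: the file names, the variable id, and the item names y1-y6 are hypothetical, and it does not explain why the number of cases differed in this particular run.

```mplus
TITLE:    Keeping an ID with the saved factor scores (hypothetical names)
DATA:     FILE IS mydata.dat;
VARIABLE: NAMES ARE id y1-y6;
          USEVARIABLES ARE y1-y6;
          IDVARIABLE IS id;         ! identifier written to the saved file
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
SAVEDATA: FILE IS scores.dat;
          SAVE = FSCORES;
```

The saved file can then be merged with the original data on id. The order and format of the saved variables are listed at the end of the output.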

Anonymous posted on Wednesday, June 29, 2005 - 11:41 am

Hi, Linda,

I have a question about the factor scores obtained from CFA. I used the SAVE=FSCORES option. When I changed the sequence of indicators in the BY statement, the factor scores also changed. What happened? Which factor scores should be used?

Thanks

Linda K. Muthen posted on Saturday, July 02, 2005 - 5:42 pm

When you change the sequence of the factor indicators, a different factor loading is fixed to one to set the metric of the factor. This is why you get different factor scores. Both sets of scores are valid and should have a correlation of one with each other.
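If scores that do not depend on the order of the indicators are wanted, one common alternative is to set the metric through the factor variance instead of a loading. A minimal sketch, with hypothetical file and variable names:

```mplus
TITLE:    Metric set by the factor variance (hypothetical names)
DATA:     FILE IS mydata.dat;
VARIABLE: NAMES ARE y1-y4;
MODEL:    f BY y1* y2 y3 y4;   ! the asterisk frees the first loading
          f@1;                 ! factor variance fixed at one
SAVEDATA: FILE IS scores.dat;
          SAVE = FSCORES;
```

With either way of setting the metric the model is the same, and the scores differ essentially by a rescaling.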

samruddhi posted on Tuesday, September 06, 2005 - 9:56 am

I have 10 categorical variables and am predicting one factor using CFA.
here is the output:
CFI/TLI
CFI 0.971
TLI 0.986

Number of Free Parameters 10

RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.036

SRMR (Standardized Root Mean Square Residual)
Value 0.041

WRMR (Weighted Root Mean Square Residual)
Value 1.281

I see that all but the WRMR value shows a good fit. Could you, please, share with me your insight on how to interpret these results in light of WRMR value being so much higher than what indicates a good fit for my model?

Thanks much.

Anonymous posted on Tuesday, September 06, 2005 - 9:58 am

can I get a factor score with Mplus if all my indicators are categorical? Mplus guide says this is not possible, any suggestions are most welcome.

thanks!

Linda K. Muthen posted on Tuesday, September 06, 2005 - 10:34 am

Our recent experiences with WRMR have not shown it to be as good as we had hoped. I would not worry about it. But I would look at my chi-square.

Linda K. Muthen posted on Tuesday, September 06, 2005 - 10:35 am

Factor scores for factors with categorical indicators have been available for several years. You must be looking at a very old user's guide.

Anonymous posted on Tuesday, September 06, 2005 - 10:39 am

thanks, linda, i am using 2001 user's guide reprinted in 2002. i will find a copy of the new one and use that. many thanks.

samruddhi posted on Tuesday, September 06, 2005 - 10:46 am

Thanks, Linda, for your quick response re: WRMR. Chi-square is 113.157 (p=0.0000) but, with a sample size of 2444, my understanding is that chi-square is not a good measure. let me know if you think otherwise.

could you, also, verify for me if the following values can be used to assess model fit:
SRMR < 0.08 (for sample sizes greater than 250)
TLI > 0.95
CFI > 0.95
RMSEA < 0.06

thanks much for your guidance.

samruddhi posted on Tuesday, September 06, 2005 - 10:52 am

In the previous message, I meant
SRMR < 0.08 (for sample sizes greater than 250)

thanks.

samruddhi posted on Tuesday, September 06, 2005 - 10:54 am

In the previous message, I meant
SRMR <0.08
works for sample greater than 250.

thanks.

Linda K. Muthen posted on Tuesday, September 06, 2005 - 11:32 am

Chi-square can be sensitive to sample size but that is not a reason to ignore it. You can do a sensitivity study where you free parameters until chi-square is acceptable and see if in doing so the parameters in your original model stay the same. If they do, then the poor chi-square was probably due to its sensitivity. If the original parameters change a lot, then the model probably does not fit well.

For acceptable cutoffs for fit measures, see the Hu and Bentler article from several years ago in Psych Methods and also the Yu dissertation on our website.

Elke Pari posted on Saturday, October 01, 2005 - 7:01 am

Hi,
I have a problem getting factor scores. My dataset consists of 9 variables, which are categorical. If I try to get factor scores through the command "SAVE=FSCORES" I get the following error message:
"THE MODEL ESTIMATION TERMINATED NORMALLY
THE MODEL COVARIANCE MATRIX IS NOT POSITIVE DEFINITE.
FACTOR SCORES WILL NOT BE COMPUTED. CHECK YOUR MODEL."
What do you suggest could be wrong?

Linda K. Muthen posted on Saturday, October 01, 2005 - 8:29 am

It is difficult to say exactly what the problem is without more information. Please send your input, data, output, and license number to support@statmodel.com.

Marco posted on Wednesday, November 16, 2005 - 5:25 am

Hello Drs. Muthén,

I would like to conduct a multilevel regression analysis with Mplus. Since my indicators are not tau-parallel or even tau-equivalent (bad fit for a CFA with equality constraint on the factor loadings), the simple mean/sum of the indicators isn't the best estimator of a person's true score. I know that it is possible to estimate a multilevel model with latent variables, but I would prefer to keep it simple. Therefore...
...would it be advisable to use the factor scores as predictors? I found another estimator in a book from Roderick McDonald (Test Theory, 1999), which takes the factor loadings and the residual variance into account, but reduces the variance of the estimated true score dramatically (compared to the factor scores of Mplus).
...I have a hierarchical sample, so I will estimate the factor scores with TYPE=COMPLEX. Does this option affect the estimation of the factor scores? (Maybe it is problem to take the nonindependence twice into account: first within the CFA with TYPE=COMPLEX and afterwards within the multilevel regression analysis)

Thanks a lot for your help!
Marco

Marco posted on Wednesday, November 16, 2005 - 5:41 am

Sorry, I forgot one question: It seems that Mplus estimates factor scores even for cases with missing values on all of the corresponding indicators. How is that possible?

(I used MLR with TYPE=COMPLEX MISSING H1)

Thanks, Marco

bmuthen posted on Wednesday, November 16, 2005 - 8:45 am

Perhaps you have covariates in the model.

bmuthen posted on Wednesday, November 16, 2005 - 8:49 am

Regarding your 5:25 AM question, if you are going to use factor scores in a multilevel setting it is best if you base those on a multilevel factor analysis model. So you would use type=twolevel. For type = complex to be formally correct, you have to assume that the factor loadings are (approximately) the same on both the within and between level, which is often not the case.

Marco posted on Wednesday, November 16, 2005 - 12:52 pm

Thanks a lot for your reply, but I'm not sure whether I understand you correctly. Does that mean that the factor structure obtained by type=complex is in fact a mixture of two factor structures, one between- and one within-group?

bmuthen posted on Wednesday, November 16, 2005 - 4:45 pm

Yes. Type = Complex estimates the same parameters as in a regular (single-level) factor analysis. The aim of Type = Complex is to give SEs and chi-square corrected for the non-independence of the observations due to clustering. Sometimes this is not sufficient and Type = Twolevel is needed.

Marco posted on Friday, November 18, 2005 - 12:38 am

Thanks for the clarification, but that produces a practical question. My group-level observations are limited to 38 clusters and there are 19 Indicators on the individual-level (although most of them have little between-variance), so probably I will have to reduce the number of between-parameters. Are there some practical guidelines about how to do this best? For example, is there a rule-of-thumb about the minimum between- variance needed? In which cases would it be reasonable to fix the residual variances on the between-level to zero or constrain the loadings on both levels to be equal? I guess that this is a wide topic, so maybe you could provide a reference.

Many thanks!

bmuthen posted on Friday, November 18, 2005 - 5:29 am

To have success with your two-level factor analysis, I would recommend reading and following the 5 steps of

Muthén, B. (1994). Multilevel covariance structure analysis. In J. Hox & I. Kreft (eds.), Multilevel Modeling, a special issue of Sociological Methods & Research, 22, 376-398. (#55)

It may be that you need only a 1-factor model on between and perhaps zero between-level residual variances.
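A two-level factor model of the kind suggested here, with one factor on each level and between-level residual variances fixed at zero, could be sketched as follows. The file name, the cluster variable team, and the six items are hypothetical; the question above involves 19 indicators and 38 clusters.

```mplus
TITLE:    Two-level CFA sketch (hypothetical names)
DATA:     FILE IS teams.dat;
VARIABLE: NAMES ARE team y1-y6;
          USEVARIABLES ARE y1-y6;
          CLUSTER IS team;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:    %WITHIN%
          fw BY y1-y6;
          %BETWEEN%
          fb BY y1-y6;
          y1-y6@0;     ! between-level residual variances fixed at zero
SAVEDATA: FILE IS twolevel_fs.dat;
          SAVE = FSCORES;
```

Fixing the between-level residual variances reduces the number of between-level parameters, which matters when the number of clusters is small.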

Jantine Spilt posted on Tuesday, January 03, 2006 - 7:45 am

Hi!

I've a question about the estimation of factor means within a multiple group analyses (model: 3 factors with 4 indicators each).

As a default the factor means are fixed to zero in the first group, while in the second group factor means are free. I want to overrule the default and estimate the factor means of both groups by fixing the intercept of an observed dependent variable to zero (one for each factor). But it doesn't work. I hope you can help me solving this problem.

thank you in advance,

Jantine

Linda K. Muthen posted on Tuesday, January 03, 2006 - 8:45 am

This should work, and I am not sure what you mean when you say it does not work. I suspect that you are not freeing the factor mean in the group-specific MODEL command for the first group. If this does not help you solve your problem, please send your input, data, output, and license number to support@statmodel.com.
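A minimal sketch of the specification being discussed is shown here for a single factor. The file name, the grouping variable grp, and the group labels g1 and g2 are hypothetical.

```mplus
TITLE:    Factor means free in both groups (hypothetical names)
DATA:     FILE IS twogroups.dat;
VARIABLE: NAMES ARE grp y1-y4;
          USEVARIABLES ARE y1-y4;
          GROUPING IS grp (1 = g1 2 = g2);
MODEL:    f1 BY y1-y4;
          [y1@0];      ! intercept of the first indicator fixed at zero
MODEL g1: [f1*];       ! free the factor mean in the first group
```

The factor mean in the second group is free by default, so only the first group needs the extra statement.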

Jinseok Kim posted on Tuesday, January 03, 2006 - 10:25 pm

I am trying to estimate a latent interaction model in Mplus. Schumacker introduced an approach by Joreskog that uses "latent variable scores" to estimate an SEM with a latent interaction (http://www.ssicentral.com/lisrel/techdocs/lvscores.pdf). It seems attractive to me, but his explanation is all in LISREL language. So, I was wondering if I can do the same modeling using Mplus. Any of your thoughts and suggestions will be greatly appreciated. Thanks.

bmuthen posted on Wednesday, January 04, 2006 - 8:53 am

Using estimated factor scores only leads to approximate solutions. A better alternative is to use the Mplus ML approach to latent variable interaction modeling which is in line with the Klein-Moosbrugger method from Psychometrika.
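The ML approach to latent variable interactions is requested in Mplus through the XWITH option. A minimal sketch, with hypothetical file, item, and factor names:

```mplus
TITLE:    Latent interaction sketch (hypothetical names)
DATA:     FILE IS study.dat;
VARIABLE: NAMES ARE x1-x3 z1-z3 y1-y3;
ANALYSIS: TYPE = RANDOM;
          ALGORITHM = INTEGRATION;
MODEL:    fx BY x1-x3;
          fz BY z1-z3;
          fy BY y1-y3;
          fxz | fx XWITH fz;    ! defines the latent interaction term
          fy ON fx fz fxz;
```

The interaction is estimated within the model, so no factor scores have to be computed in a separate step.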

Karen Kaczynski posted on Wednesday, April 19, 2006 - 2:10 pm

Hello!

I am attempting to generate factor scores for a 2 factor model with combined continuous and categorical (binary) indicators. The model fits adequately. I have been using the save = fscores command, but for some reason the program only saves the raw data. It does not seem to be generating factor scores. Is there a problem generating factor scores with combined categorical and continuous indicators?

Thank you in advance for your help!

Linda K. Muthen posted on Wednesday, April 19, 2006 - 2:39 pm

No, this is not a problem. The only reason factor scores would not be computed is if the model did not converge or there was a warning about negative residual variances or some other problem. The only way to know is to send your input, data, output, and license number to support@statmodel.com.

Karen Kaczynski posted on Thursday, April 20, 2006 - 2:42 pm

It looks like the residual variance for one of the categorical variables is negative. Is there a way to fix this? Or do I need to alter my model?

Thanks!

Karen

Linda K. Muthen posted on Thursday, April 20, 2006 - 2:58 pm

I would suggest modifying the model. Did you do an EFA on these items? If not, that is a good place to start.

chennel huang posted on Wednesday, July 12, 2006 - 4:24 am

Hi, it's huang.

1.
I tried to save the factor scores, but the SAVEDATA information in the output stated the following:

"Factor scores were not computed. No data were saved."

The following is my syntax:

data:
file is mi1.dat;
variable:
names are cl edu inc gen bmi year at1-at7 ba1-ba5;
usevariables are at1-at7;
categorical are at1-at7;
model:
!set cfa due to varimax.
!set reference due to loading.
fat1 by at1*;
fat1 by at2@1;
fat1 by at3;
fat2 by at4*;
fat2 by at5;
fat2 by at6@1;
fat2 by at7;
fat1 with fat2;
output:
sampstat standardized residual modindices (0) tech1 tech2 tech4;
savedata:
file is 0618withfc.dat;
save=fscores;

Did I do something wrong?

2.
If I change the model to modify the relations between indicators and/or factors, will the factor scores also have different values?

3. Can the factor score be interpreted as each respondent's value on the concept/factor? How is it produced from all the categorical observed values?

4. Also, how do I read the modification indices (MI) to decide whether to free a parameter or not?

thanks for your time.

chennel huang posted on Thursday, July 13, 2006 - 1:13 am

For the last question point 2, I meant the covariance b/w factor and/or indicator's measurement error.

Bengt O. Muthen posted on Friday, July 14, 2006 - 4:35 pm

This can be diagnosed if you send your output, data and license number to support@statmodel.com. Typically, factor scores are not computed when a model has inadmissible parameter estimate values.

chennel huang posted on Sunday, July 16, 2006 - 7:22 am

It was my carelessness. I found that the (standardized) residual variance of one indicator (at2) is negative, like this: "AT2 -0.433 Undefined 0.14327E+01". Is there any suggestion you can offer to solve this problem? Thank you.

chennel huang posted on Sunday, July 16, 2006 - 7:31 am

To add to the previous statement, the correlation (standardized loading) between indicator at2 and factor fat1 is greater than 1 (i.e., 1.197).

Bengt O. Muthen posted on Sunday, July 16, 2006 - 12:11 pm

This suggests that the model is not appropriate for the data - so the model needs to be modified. Possible modifications can be suggested by modification indices for cross-loadings and residual correlations.

Also check that you get only 1 loading fixed to 1 for each of the 2 factors.
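One way to make this check easier is to write each factor in a single BY statement. In the input posted earlier in this thread, each loading sits in its own BY statement. If the first indicator named in every BY statement is fixed at one by default, which is an assumption to be verified against the TECH1 output and not a confirmed diagnosis, then more than one loading per factor would end up fixed. A sketch of the single-statement form, reusing the variable and file names from that input:

```mplus
DATA:     FILE IS mi1.dat;
VARIABLE: NAMES ARE cl edu inc gen bmi year at1-at7 ba1-ba5;
          USEVARIABLES ARE at1-at7;
          CATEGORICAL ARE at1-at7;
MODEL:    fat1 BY at1* at2@1 at3;       ! only at2 fixed at one
          fat2 BY at4* at5 at6@1 at7;   ! only at6 fixed at one
          fat1 WITH fat2;
SAVEDATA: FILE IS 0618withfc.dat;
          SAVE = FSCORES;
```

Here at2 and at6 are the only loadings fixed at one, matching the reference indicators chosen in the original input.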

chennel huang posted on Monday, July 17, 2006 - 5:38 am

Thanks for your suggestion. After modifying the relation b/w measurement errors, the problem is solved.

chennel huang posted on Monday, July 31, 2006 - 5:36 am

Following the question of Paul Kim posted on Monday, August 09, 2004 - 9:42 pm

MINIMIZATION FAILED WHILE COMPUTING FACTOR SCORES FOR THE FOLLOWING
OBSERVATION(S) :

Is there any way to solve this problem? Does modifying the relation b/w measurement errors mean anything about this problem?

Linda K. Muthen posted on Monday, July 31, 2006 - 10:22 am

This means that for some observations, the pattern of values is contradictory, for example, a person who gets the easy items incorrect and the difficult items correct. There is nothing you can do other than change the model. I suggest using Version 4.1 if you are not.

chennel huang posted on Tuesday, August 01, 2006 - 3:43 am

Given the subject of my research, I will not change the loading pattern between the factors and indicators, so I modified the model by adding measurement error covariances wherever the residual correlation matrix showed values above .1. After doing this step by step, all residual correlations are below .1, the CFI, RMSEA, and WRMR keep improving, and the number of observations for which factor scores could not be computed dropped from 8 to 1. So I decided to delete this observation from the research sample. Is this procedure appropriate? Thank you.

Linda K. Muthen posted on Tuesday, August 01, 2006 - 9:22 am

I don't think adding residual covariances with no substantive reason to obtain factor scores for problematic individuals is defensible. Although your theory may say one thing, your data may not be valid measures of the constructs in your theory. I would do an EFA to see if the variables are behaving as expected.

chennel huang posted on Wednesday, August 02, 2006 - 4:49 am

Based on my understanding of your statement about the minimization failure: initially, factor scores could not be computed for 8 observations out of a sample of 1357. If I simply delete these 8 observations and run the CFA again to save the factor scores, is that acceptable? Sorry for my strange question.

chennel huang posted on Wednesday, August 02, 2006 - 5:11 am

I am sorry for this stupid question. After trying the method just mentioned, I see that the failure simply moves to other observations; I had not noticed that.

Linda K. Muthen posted on Wednesday, August 02, 2006 - 8:46 am

If you are not using Version 4.1, you should upgrade to that. If you need further assistance, you should send your input, data, output, and license number to support@statmodel.com.

owen fisher posted on Friday, October 20, 2006 - 3:47 am

It's carey.

Looking at a single indicator, every level of the indicator has its own meaning. Does the value of the factor score mean anything if I try to describe its distribution?

Linda K. Muthen posted on Friday, October 20, 2006 - 7:09 am

Can you describe this in more detail?

owen fisher posted on Friday, October 20, 2006 - 10:15 am

Sorry, I should describe it this way. There is one factor and three categorical indicators, and the loadings are .455, 1, and .386. If these loadings are probit regression coefficients, how do I interpret the relation between the factor and the indicators? Thank you.

owen fisher posted on Friday, October 20, 2006 - 10:25 am

Adding to the previous question, the numbers of levels of these three indicators are 3, 4, and 4. I hope this makes the question clearer.

Linda K. Muthen posted on Friday, October 20, 2006 - 10:52 am

If you are using WLSMV, the factor loadings are probit regression coefficients. You can interpret their sign and significance. That is probably what is most important if they are used as factor indicators.

owen fisher posted on Sunday, October 22, 2006 - 12:11 pm

Hi, it's carey again.

I describe my question in this way, and sorry for my illiteracy.

There is one factor and three categorical indicators, and the loadings are .455, 1, and .386. With robust weighted least squares, these loadings are probit regression coefficients.

My teacher told me to describe the distribution of the factor score, but as I understand it (if I have not misunderstood the factor score), it is an index for which I cannot find meaningful cutoff points. As in a binary logistic or ordered logit model, the factor score can only be converted into the probability of being in a given level of the indicator.

After reading "Logit and Probit: Ordered and Multinomial Models" from the Sage University Papers series, I got a rough idea of the cutoff point of the factor score corresponding to the boundary between one level of the indicator and the next or previous one.

owen fisher posted on Sunday, October 22, 2006 - 12:15 pm

Following the previous statement.

If that is the case, I will describe the distributions of these three indicators directly, without the mental fatigue of searching for good cutoff points for the factor score. However, I still need your advice.

Is my reasoning for this decision okay?

By the way, first, how do I calculate the probability of being in a given level of the indicator from the factor score? Second, how do I calculate the cutoff point of the factor score corresponding to the boundary between one level of the indicator and the next or previous one?

Thanks for your patience.

Linda K. Muthen posted on Sunday, October 22, 2006 - 3:34 pm

Even when factor indicators are categorical, the factors are continuous. You should save the factor scores and plot them.

See Chapter 13 for how to turn probit regression coefficients into probabilities.
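For an ordered categorical indicator with a probit link, the conversion from coefficients to probabilities can be sketched as follows. The symbols are assumptions introduced for illustration: u is an indicator with categories 0, ..., C-1, τ_1 < ... < τ_{C-1} are its thresholds, λ is its loading, η is the factor value, θ is the residual variance of the underlying latent response, and Φ is the standard normal distribution function.

```latex
\begin{aligned}
P(u \le k \mid \eta) &= \Phi\!\left( \frac{\tau_{k+1} - \lambda \eta}{\sqrt{\theta}} \right),
  \qquad k = 0, \dots, C-2, \\
P(u = k \mid \eta) &= \Phi\!\left( \frac{\tau_{k+1} - \lambda \eta}{\sqrt{\theta}} \right)
  - \Phi\!\left( \frac{\tau_{k} - \lambda \eta}{\sqrt{\theta}} \right),
  \qquad \tau_0 = -\infty, \quad \tau_C = +\infty .
\end{aligned}
```

Substituting an estimated factor score for η gives the model-implied probability of each response category for that person.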

owen fisher posted on Monday, October 23, 2006 - 3:15 am

Thanks for the suggestion, but how about the cutoff points of the factor score?

Linda K. Muthen posted on Monday, October 23, 2006 - 6:40 am

There is no set way that I know of to find cutoff points for factor scores.

owen fisher posted on Tuesday, October 24, 2006 - 5:40 am

Thanks for your patience.

Anja Weiß posted on Monday, April 16, 2007 - 4:02 am

Dear Mr. and Mrs. Muthen,

I have a Confirmatory factor model with ordinal variables and 7 factors. With SAVEDATA I have saved the factor scores.
How are these factor scores calculated? What is the theoretical background and where do I find it?

kind regards
anwe

Linda K. Muthen posted on Monday, April 16, 2007 - 7:56 am

See Technical Appendix 11.

Jessica Schumacher posted on Tuesday, November 13, 2007 - 1:20 pm

I saw in previous posts and in the technical appendices that Mplus uses the regression method for estimating factor scores for categorical outcomes with WLSMV. If I output the factor scores using the FSCORES option of the SAVEDATA command, can they be interpreted as any factor score would (i.e. if a respondent had a higher value, it indicates they have a higher level on that given factor)? Are the factor scores included on the saved dataset standardized? Thank you for your help and patience. Jessica

Bengt O. Muthen posted on Tuesday, November 13, 2007 - 5:41 pm

For both continuous as well as categorical outcomes with weighted least squares, the factor scores are obtained using the maximum of the posterior distribution. For continuous outcomes this approach has been given the name "Regression method", but I don't think that is used with categorical outcomes. For categorical outcomes, the method is iterative.

The answers to your questions are yes and yes.

Courtney Bagge posted on Thursday, December 13, 2007 - 3:17 pm

Hi there,

I am trying to output factor scores for one latent variable with 14 indicators. I am using type = complex missing h1. I have 13,570 in my data set. However, I am using the subpopulation command which specifies a subsample of 11,488. When examining the factor score output I have factor scores for 13,557 individuals. I thought I would have factor scores for only the 11,488.

Is the missing command generating scores for those not within the subpopulation? I wish to have factor scores for only those within the subpopulation (and not based on the whole population). However, if I do not specify the missing option then I will kick out those individuals who are missing at least one of the indicator variables. I want to make sure that my SEs are correct for the subsample.

Thanks,

Courtney

Joanna Harma posted on Friday, December 14, 2007 - 9:10 am

Hello,
I did confirmatory factor analysis with mplus and computed factor scores. When I checked the score I found that lot of people have negative factor scores. And remaining had positive factor scores. Now my supervisor wants an explanation for the negative score. Do you think negative score is an error? What is the possible reason for the negative score?
Many thanks
Joanna

Bengt O. Muthen posted on Friday, December 14, 2007 - 5:42 pm

Typically, the mean of a factor is set at zero so it is natural that some estimated factor scores are negative and some positive. It is not an error.

Tihomir Asparouhov posted on Monday, December 17, 2007 - 8:49 am

This is a response to Courtney Bagge's post. The factor scores that you have obtained are based on the model estimated from the subpopulation (not the entire population). So you can just ignore the factor scores computed for elements not in the subpopulation.

Tihomir Asparouhov

Joanna Harma posted on Friday, December 21, 2007 - 7:40 am

hello,
I computed factor scores after CFA and used these factor scores as an independent variable in a logit regression in the next stage of my analysis. Now my supervisor wants to know the procedure by which Mplus computes factor scores. Do you think you could explain the procedure to me in a not too mathematical way? Also, when I compute quintiles of these factor scores I get 2 people in the first quintile, 4 in the second, 30 in the third, 120 in the fourth, and 94 in the fifth. The first three groups have very few people, so does it make sense to combine them into one group and do the regression with 3 groups instead of 5?
I shall be most grateful for your help.
Joanna

Linda K. Muthen posted on Friday, December 21, 2007 - 12:44 pm

Factor score estimation depends on the scale of the variables. The algorithms are described in Technical Appendix 11 which is on the website.

Why are you using factor scores? Why do you not simply estimate the full model in one step. It is always preferable to do an analysis in one step if possible.
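A one-step analysis of this kind puts the measurement model and the regression in the same input. The sketch below is only an illustration under stated assumptions: the file name and the variables famid, enrol, age, and a1-a6 are hypothetical, the outcome is binary, and clustering is handled with TYPE = COMPLEX. With the default estimator for categorical outcomes, the regression would be a probit regression, not a logit regression.

```mplus
TITLE:    One-step sketch: factor and regression together
          (hypothetical names)
DATA:     FILE IS children.dat;
VARIABLE: NAMES ARE famid enrol age a1-a6;
          USEVARIABLES ARE enrol age a1-a6;
          CATEGORICAL ARE enrol;
          CLUSTER IS famid;
ANALYSIS: TYPE = COMPLEX;
MODEL:    asset BY a1-a6;       ! measurement model for the latent index
          enrol ON asset age;   ! outcome regressed on factor and covariate
```

Because the factor enters the regression directly, no factor scores need to be saved or grouped.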

Joanna Harma posted on Friday, December 21, 2007 - 2:28 pm

Thanks,
Well I have computed the asset index by CFA for the family level data and then did the logit regression (with cluster option) with child level variable as dependent variable and both child level and family level variables (including asset index) as explanatory variables. My supervisor wants me to use quintiles of factor score rather than just scores as computed by CFA. Hence I computed factor score in the first step and then did the second step with quintiles of factor scores. Now I am not certain if it is alright to compute quintiles of factor score as they are standardised. Do you think we could do the whole analysis in one step? Is it alright to compute quintiles of factor score?
Thanks
Joanna

Linda K. Muthen posted on Saturday, December 22, 2007 - 1:30 pm

In general, if you can avoid the intermediate step of estimating factor scores, you are at an advantage. I'm not sure why you feel you need factor scores nor quintiles of them.

Goran Milacevic posted on Saturday, July 12, 2008 - 1:20 am

Hello,
I am trying to get factor scores for a Latent Difference Score Model (McArdle). Unfortunately Mplus doesn't give me any output when using the SAVE IS FSCORES; line. I have already checked the path (savedata) as well as the data set. However, I get an output with all the estimates (and no error messages) without the FSCORES command. Any comments how to solve this problem would be highly appreciated.

Thanks!

Linda K. Muthen posted on Saturday, July 12, 2008 - 6:52 am

Do you also use the FILE statement in the SAVEDATA command to specify where to save the factor scores?

Goran Milacevic posted on Saturday, July 12, 2008 - 7:13 am

Yes, I use the FILE statement and the values of my observed variables are saved in this file, but after adding the FSCORE statement I don't even get an output (with an error message, i.e. that the factor scores were not saved).

Linda K. Muthen posted on Saturday, July 12, 2008 - 8:57 am

Please send your input, data, output, and license number to support@statmodel.com.

Delfino Vargas posted on Friday, June 26, 2009 - 5:04 pm

I am using a CFA model in which I specified a single factor (F1) based on seven continuous manifest variables. After requesting the factor scores using the option SAVE=FS, I observed that the mean of the scores is zero. How can I calculate the factor scores for each observation so that the mean is not zero but the actual estimated mean of F1?
Thanks

Bengt O. Muthen posted on Friday, June 26, 2009 - 5:16 pm

Typically, the mean parameter for the factor is fixed at (standardized to) zero, unless you have multiple groups. So there isn't a non-zero estimated factor mean, and the estimated factor scores having mean zero is then desirable.

Delfino Vargas posted on Monday, June 29, 2009 - 9:27 am

Thanks, Bengt, for your response. Actually, in my case I do want to estimate the actual mean of the factor F1 and incorporate it into the factor scores for each observation.
I also have a second factor (F2), with similar variables at time 2, and analogously I have the corresponding factor scores. Since both sets of scores have a mean of zero, I cannot compare them. My goal is to compare the mean scores from F1 and F2. I was thinking of adding the estimated intercepts from F1 to the scores. Does that make sense? Any other suggestions?

Bengt O. Muthen posted on Monday, June 29, 2009 - 10:54 am

If you impose measurement invariance across time for items that are the same at the two time points you can identify a factor mean difference for the two time points. You can fix the factor mean at zero for the first time point and free it and let it be estimated for the second time point. The estimated factor scores for the second time point will then take this non-zero estimated factor mean into account (this occurs automatically in the prior of the posterior computations for the estimated factor scores).
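
A minimal Python sketch of how a non-zero factor mean enters the estimated scores, assuming a single factor with continuous indicators and the regression method (posterior mean under normality). All parameter values are made up, and `regression_factor_score` is a hypothetical helper, not an Mplus function:

```python
def regression_factor_score(y, nu, lam, theta, alpha, psi):
    """Posterior mean of a single factor given continuous indicators y.

    nu: intercepts, lam: loadings, theta: residual variances,
    alpha: factor mean (the prior mean), psi: factor variance.
    """
    # s = lambda' Theta^{-1} lambda for a diagonal residual covariance matrix.
    s = sum(l * l / t for l, t in zip(lam, theta))
    shrink = psi / (1.0 + psi * s)
    # Weighted residuals after removing the model-implied indicator means.
    resid = sum(l / t * (yi - n - l * alpha)
                for yi, n, l, t in zip(y, nu, lam, theta))
    return alpha + shrink * resid

# Hypothetical time-invariant measurement parameters.
nu = [3.0, 2.5, 2.8]
lam = [1.0, 0.9, 1.1]
theta = [0.4, 0.5, 0.3]

# Factor mean fixed at 0 at time 1 and (hypothetically) estimated at 0.5 at time 2.
y = [3.5, 2.95, 3.35]  # equals nu + lam * 0.5, the model-implied time-2 means
score_t1 = regression_factor_score(y, nu, lam, theta, alpha=0.0, psi=1.0)
score_t2 = regression_factor_score(y, nu, lam, theta, alpha=0.5, psi=1.0)
```

In this sketch a person whose responses sit exactly at the model-implied time-2 means gets a time-2 score equal to the factor mean, while the same responses scored with a zero prior mean are pulled toward zero.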

James L. Lewis posted on Monday, September 28, 2009 - 6:27 pm

Do the factor scores extracted from a CFA in Mplus have the same desirable properties as an IRT score? Would the factor score for each individual be equal to the IRT score for each individual (say from a graded response IRT model)? If not, is there a consensus on which type of score is superior? Thanks.

Linda K. Muthen posted on Monday, September 28, 2009 - 6:48 pm

Yes, they are the same.

Katayoun Safi posted on Friday, October 30, 2009 - 4:09 am

Hi,
how can I export the saved factor scores in Mplus for further analysis, for example in stata?

Linda K. Muthen posted on Friday, October 30, 2009 - 6:06 am

Use the FSCORES option of the SAVEDATA command. See the user's guide for further information.

nina chien posted on Wednesday, November 04, 2009 - 10:22 am

Hi,

I saved out factor scores for 2 factors, closeness and stress. The factor scores for closeness range from -2.51 to 1.08 (distribution is somewhat negatively skewed), and for stress from -.71 to 2.86 (distribution is very positively skewed). But the original items are on a scale from 1 to 5. Did Mplus automatically center the factor scores? I see that each factor has a mean of 0.00. Thank you for your help.

Linda K. Muthen posted on Wednesday, November 04, 2009 - 5:44 pm

Factor scores are not centered. They need a metric and factor score estimation gives them a mean of zero.

James L. Lewis posted on Wednesday, December 30, 2009 - 7:06 pm

I wanted to learn more about the factor scores created from a CFA in MPLUS. Here are a few questions. Thanks!

(1) Would it be fair to say that they contain "no error" the way we think of it when we model everything in an SEM framework?

(2) How do we know they are generally better than using a regression method or summing/taking means of items to create a composite construct/score? Is there something that can be cited in general and perhaps in particular with regard to the (relatively better) properties of these scores.

(3) Would you still refer to the construct represented by these scores as "latent?"

(4) I have 2 correlated constructs for which I am generating the scores from the CFA, one is a 3-item and the other is a 4-item. As usual I constrain the variance of each factor to 1.0 and freely estimate each loading. Am I wrong that this would essentially standardize the factors? I am getting a mean of (essentially) zero, but a SD of around .91 no matter what I do. Is there any problem with standardizing them?

I have read Appendix 11, but I was hoping to learn more about their general properties.

I use Mplus very regularly and love it. This discussion board is a tremendous resource. Thank you very much.

Bengt O. Muthen posted on Thursday, December 31, 2009 - 11:58 am

It has been established that estimated factor scores do not behave like factors. See for instance

Skrondal, A. and Laake, P. (2001). Regression among factor scores. Psychometrika 66, 563-575.

This shows the distortion in the means, variances, and relations with other variables when using estimated factor scores. Especially with a small number of items I would recommend instead using SEM, which also makes it possible to test that the item sets are unidimensional.
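
On point (4) of the question above, one possible reason for a standard deviation below 1 is shrinkage: with the regression method, the variance of the estimated scores equals the factor variance times the squared factor determinacy. A minimal Python sketch for a single factor with uncorrelated residuals (the loadings and residual variances are made up, and the two-factor case in the question would need matrix algebra not shown here):

```python
import math

def factor_score_sd(lam, theta, psi=1.0):
    """Model-implied SD of regression-method scores and the factor determinacy.

    lam: loadings, theta: residual variances, psi: factor variance.
    """
    s = sum(l * l / t for l, t in zip(lam, theta))
    determinacy_sq = psi * s / (1.0 + psi * s)  # squared corr(factor, score)
    return math.sqrt(psi * determinacy_sq), math.sqrt(determinacy_sq)

# Three hypothetical items with standardized loadings of 0.8.
sd, determinacy = factor_score_sd([0.8, 0.8, 0.8], [0.36, 0.36, 0.36])
```

With the factor variance fixed at 1, the implied standard deviation of the scores equals the determinacy, so it stays below 1 unless the factor is measured without error. Rescaling the scores to unit variance afterwards does not remove that error.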

James L. Lewis posted on Friday, January 01, 2010 - 6:01 pm

Thank you.

I would prefer to use SEM but, among other things, there are cross-classified random effects in my models, which I am able to deal with in the "mixed model" framework. I used CFA to test (multi)dimensionality and to examine measurement invariance across 2 groups. What is the best way to get a factor score or an "observed" score for my constructs if I can't/don't use SEM? Is there a better way than the method used in Mplus? I looked over the article - both of my variables are explanatory and one is a DV in one case. I am not sure I have the resources to carry out the method described there - is it the best way? Perhaps two separate unidimensional IRT GRM models (but I was under the impression that the CFA method in Mplus gives the same scores)? Thanks.

Bengt O. Muthen posted on Saturday, January 02, 2010 - 9:29 am

I am not sure if your items are continuous or categorical. For continuous items, Mplus uses the regression method of factor score estimation, which is equivalent to the Maximum A Posteriori (MAP) method of IRT. For categorical items, Mplus uses EAP (expected a posteriori), which is standard in IRT. With only 3 and 4 items you won't get good factor score estimates; IRT typically works with many more items per factor, say 20 items or more. With continuous items the problem shows up in terms of a low factor determinacy, and with categorical items it shows up in terms of poor information functions. I am not aware of literature comparing summed scores to factor scores with a small number of items, but it probably exists. See, however, my dissertation paper

Muthén, B. (1977). Some results on using summed raw scores and factor scores from dichotomous items in the estimation of structural equation models. Unpublished Technical Report, University of Uppsala, Sweden.

at

http://www.gseis.ucla.edu/faculty/muthen/articles/Muthen_Unpublished_01.pdf

Anyone else?

So, it sounds like you have to make a compromise in terms of deciding which feature to ignore: measurement error or cross-classified random effects.

James L. Lewis posted on Monday, January 04, 2010 - 10:38 am

Thank you much for the valuable input. Yes, that is the dilemma (CCREs vs. Measurement Error - not the first time or likely not the last). My scores are Likert (5pt). Most people who I have asked think that ignoring the CCREs is a more problematic offense.

Bengt O. Muthen posted on Monday, January 04, 2010 - 5:14 pm

CCREs are on our list of future additions.

CEKIC Sezen posted on Saturday, February 27, 2010 - 1:45 am

Hello,
My problem is the following:
I have to complete analyses that have already been done, i.e., calculate factor scores from a CFA that was carried out with TYPE=COMPLEX and CLUSTER=SUBJECT.
My first question is the following:
the number of observations used in estimating the model is 1936, although my initial database was composed of 1556 observations.
1. What are the criteria Mplus uses for eliminating observations?
Then I tried to obtain factor scores for the analyses already done:
2. Is it possible to obtain factor scores directly from a CFA with TYPE=COMPLEX and CLUSTER=SUBJECT?
As I could not do it, I reran the analysis (keeping the same model as before) as a CFA with TYPE=GENERAL and IDVARIABLE ARE SUBJECT.
I computed the factor scores from this last analysis.
Unfortunately, the parameters, their standard errors, and the fit indices estimated by this last analysis don't correspond exactly to those of the first analysis performed with TYPE=COMPLEX and CLUSTER=SUBJECT.

3. If it is not possible to obtain factor scores with an analysis using TYPE=COMPLEX and CLUSTER=SUBJECT, can the factor scores obtained from the TYPE=GENERAL and IDVARIABLE ARE SUBJECT analysis be interpreted in the framework of the first analysis, even if the estimates of the two models are not exactly identical?

I hope that my questions are clear and that you can help me.
Cordially

Bengt O. Muthen posted on Saturday, February 27, 2010 - 6:15 am

Mplus uses ML under "MAR" which is sometimes called "FIML" and means that all subjects who have data on any of the analysis variables are used in the analysis. So perhaps your 1556 observations are the listwise present group, while 1936 is what ML under MAR uses.

You can get factor scores directly in a Type=complex, Cluster=subject analysis.

If you still have problems, please send your input, output, data, and license number to support@statmodel.com.

Edelyn Verona posted on Monday, April 12, 2010 - 12:59 pm

My question has to do with latent variable means in a multi-group CFA. Given that by default the latent variable means are set to zero in the first group and freely estimated in the second group, does that mean that I can't compare the means across groups using a chi-square difference test? Basically, I want to conduct analyses in the context of the multiple group model to test whether or not specific latent variable means are significantly different across the two groups. How do I do this and have an identified model?

Bengt O. Muthen posted on Monday, April 12, 2010 - 1:35 pm

You fix the factor means to zero in the second group. The chi-2 difference between the models will then test factor mean equality across groups.

Edelyn Verona posted on Monday, April 12, 2010 - 2:21 pm

So, the unconstrained model has the factor means set to zero in only the first group (freely estimated in the second group), and the constrained model sets the factor means for both groups to zero. Is that correct? And the chi-square diff between those two models tests equality of means across groups?

Bengt O. Muthen posted on Monday, April 12, 2010 - 2:57 pm

Right.
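
The arithmetic of that comparison can be sketched in Python as follows. The chi-square values and degrees of freedom are made up, the helper names are hypothetical, and the plain difference is only appropriate for ordinary ML chi-square values; scaled statistics (for example from MLR or WLSMV) need a corrected difference test instead:

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_difference_test(chi2_restricted, df_restricted, chi2_free, df_free):
    """Difference test of a restricted model against a less restricted one."""
    diff = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    return diff, ddf, chi2_sf(diff, ddf)

# Hypothetical results: factor mean zero in both groups (restricted)
# versus zero in group 1 and free in group 2 (less restricted).
diff, ddf, p_value = chi2_difference_test(58.3, 27, 51.1, 26)
```

With one factor and two groups the two models differ by a single parameter, so the difference is referred to a chi-square distribution with 1 degree of freedom.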

Edelyn Verona posted on Tuesday, April 13, 2010 - 12:08 pm

Thanks! How do I get the estimated means for the latent variables in Group 1, then, if they are always set to zero? And the estimated means for Group 2 are only in reference to Group 1, correct?

Bengt O. Muthen posted on Wednesday, April 14, 2010 - 4:30 pm

You don't need estimated factor means for Group 1. It is only the difference in factor means that is identified and meaningful to discuss. And that difference is captured by the group 2 factor means.

Note that this does not imply that every person's factor value is zero in group 1.

Daiwon Lee posted on Sunday, May 23, 2010 - 4:28 pm

Hello,

My advisor wants me to run a model where I saved the factor scores and then merge the factor scores into a new data set.
I think I know how to save factor scores, but I don't know how to merge with original data set to use them in the analysis. Please help me.
Many thanks in advance.

Linda K. Muthen posted on Sunday, May 23, 2010 - 9:18 pm

See the new merge options in the SAVEDATA command for Version 6. The most recent user's guide is on the website.

Daiwon Lee posted on Monday, May 24, 2010 - 6:28 am

Hi Dr.Muthen,

Thank you for the note. However, could you please tell me how to merge saved factor scores with original data in version 5.21?
I also tried to save factor scores in "dta" to merge with original data using stata program but stata failed to read the mplus saved factor score file.

Thanks a lot in advance.

Linda K. Muthen posted on Monday, May 24, 2010 - 6:36 am

See the FSCORES option of the SAVEDATA command. If you want variables saved other than the analysis variables and the factor scores, use the AUXILIARY option to name these variables.
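
If the merge has to be done outside Mplus, one option is to save an ID variable along with the factor scores and join on it. A minimal Python sketch, assuming both files have already been read into lists of records; the field names `id` and `fscore` and the helper `merge_by_id` are hypothetical:

```python
def merge_by_id(original_rows, score_rows):
    """Left-join factor scores onto the original records using the 'id' key."""
    scores = {row["id"]: row["fscore"] for row in score_rows}
    # Cases without a saved score get None rather than being dropped.
    return [dict(row, fscore=scores.get(row["id"])) for row in original_rows]

# Made-up records standing in for the original data and the saved scores.
original = [{"id": 1, "age": 34}, {"id": 2, "age": 51}, {"id": 3, "age": 47}]
saved = [{"id": 2, "fscore": 0.41}, {"id": 1, "fscore": -0.87}]

merged = merge_by_id(original, saved)
```

Joining on an ID rather than on row position guards against the two files being in a different order or having a different number of cases.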

Bart Meuleman posted on Wednesday, June 23, 2010 - 1:17 am

Hello,

I encountered some strange results when saving factor scores for a multi-group CFA.

I have a very simple model, with 2 groups and 3 items loading (strongly; >.70) on one latent variable. Factor scores are saved using the SAVEDATA option. In the newly created dataset, however, I find near-zero correlations (<.10) between the obtained factor scores and the original items for the second group. For the first group, correlations between items and factor scores are as expected.

I checked beforehand, and factor loadings are about equally strong in both groups. The same strange pattern is found irrespective of the order of the groups, whether I impose or release equality constraints on loadings and/or intercepts, use different items...

Am I doing something wrong? This is the syntax I use (for a model with equality constraints on factor loadings and intercepts). FYI: I am still using an older version of Mplus (version 4), maybe this has something to do with it...

-----------------------------
variable:
names are item1 item2 item3 country;
usevariables are item1 item2 item3;
grouping is country (1=BE 7=DK);

model:
Y by item1 item2 item3;

output:
standardized; modindices;

savedata: file is testout.sav; save = fscores;
----------------------------------

Thanks beforehand

Bart

Linda K. Muthen posted on Wednesday, June 23, 2010 - 6:05 am

Please send the files to support@statmodel.com.

Li Lin posted on Monday, July 19, 2010 - 8:47 am

What are the "*"s in the saved factor scores data set? Thanks.

Linda K. Muthen posted on Monday, July 19, 2010 - 8:52 am

The asterisk (*) is the missing value flag.

Li Lin posted on Monday, July 19, 2010 - 1:46 pm

Thanks! Another question - what are the first few columns in the saved factor score data set for? For example, I had observed ordinal variables x1 to x4 (values = 1, 2, 3, 4, and 5), and the factor score data set includes 4 columns called "x1" to "x4" in front of the id and factor score. Comparing these columns to the original observed data, it appears that x1 in the factor score data equals the original observed x1 minus 1.

Linda K. Muthen posted on Monday, July 19, 2010 - 2:32 pm

Mplus requires the data to have the lowest value of zero, so the data are automatically recoded to 0, 1, 2, 3, and 4. The recoded data are saved. See the CATEGORICAL option in the user's guide for more information about this recoding.
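
When reading such a saved file in other software, both points from this exchange matter: the asterisk has to be treated as missing, and the saved item values are shifted relative to the original coding. A minimal Python sketch, assuming whitespace-separated records with the items first, followed by the id and the factor score as described above; the actual column order should be checked against the SAVEDATA information in the output, and `parse_saved_line` is a hypothetical helper:

```python
def parse_saved_line(line, n_items):
    """Split one saved record; '*' becomes None (missing).

    The first n_items fields are assumed to be items recoded to start at 0,
    so 1 is added back to recover an original coding that started at 1.
    """
    fields = [None if tok == "*" else float(tok) for tok in line.split()]
    items = [None if v is None else int(v) + 1 for v in fields[:n_items]]
    rest = fields[n_items:]
    return items, rest

# A made-up record: x1-x4 (recoded), then id and factor score.
items, rest = parse_saved_line("0.000 3.000 * 4.000 17.000 -0.412", n_items=4)
```

Adding 1 back is only right when the original categories were consecutive integers starting at 1, as in the example in this thread.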

Dallas posted on Saturday, July 31, 2010 - 8:53 am

Linda,

Good morning. I have a question about the comment you make above in replying to James (James L. Lewis posted on Monday, September 28, 2009 - 6:27 pm).

James asks two questions it seems to me. 1) Do factor scores have the same properties as IRT scores? And, 2) are IRT and factor scores the same.

You indicate yes to both. For properties, this makes sense (assuming, of course, a factor model that corresponds to an IRT model).

However, it doesn't seem true that factor scores EQUAL IRT scores. Correct? In other words, if we used EAP scoring and the loadings, etc. from the factor model, we'd get one set of factor scores. If we then converted the factor model parameters into IRT parameters, and again used EAP scoring to get IRT scores, it doesn't seem to me the factor scores would EQUAL the IRT scores. They would have similar measurement properties, but it doesn't seem like they'd be identical scores (in value).

If I am right, can you or Bengt provide a formula to convert factor scores into IRT scores, like formulas do to convert the parameters?

Thanks!

Bengt O. Muthen posted on Saturday, July 31, 2010 - 4:36 pm

IRT often refers to the 2PL model estimated under maximum likelihood. The 2PL model is the logistic model in Mplus for binary items and using a single factor where you free the factor loadings and fix the factor variance at one. With 2PL and ML estimation, Mplus gets the same loglikelihood as IRT programs. And it computes factor scores using the same EAP (expected a posteriori) method. So this is the same as in IRT and the same factor score values should be expected.

ML probit is also possible, and again EAP is used.

WLSMV probit is also possible, in which case MAP is used.
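
A minimal Python sketch of EAP scoring for one response pattern under a 2PL model with a standard normal prior. It uses a simple fixed grid rather than whatever integration scheme Mplus or an IRT program actually uses, the item parameters are made up, and `eap_score` is a hypothetical helper. The parameters are in the IRT metric (discrimination a, difficulty b), assuming the logit link with factor mean 0 and variance 1:

```python
import math

def eap_score(responses, discriminations, difficulties,
              n_points=61, lo=-6.0, hi=6.0):
    """EAP estimate of theta for one 0/1 response pattern under a 2PL model."""
    step = (hi - lo) / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        theta = lo + k * step
        weight = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for u, a, b in zip(responses, discriminations, difficulties):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            weight *= p if u == 1 else 1.0 - p   # likelihood of this response
        num += theta * weight
        den += weight
    return num / den  # posterior mean of theta

# Three hypothetical items.
a = [1.2, 0.8, 1.5]
b = [0.0, 0.0, 0.0]
high = eap_score([1, 1, 1], a, b)
low = eap_score([0, 0, 0], a, b)
```

The MAP estimate would instead take the value of theta at which the same posterior is highest, which is why EAP and MAP scores can differ when the posterior is skewed.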

Dallas posted on Saturday, August 07, 2010 - 5:51 am

Dr. Muthen. Thanks for your reply. You replied so quickly it took me a few days to notice! Yes, I was thinking about using probit and ML, and also thinking about WLSMV probit. In those cases, it seems one would have scores with similar measurement properties (e.g., one would still inherit the properties of IRT models), but not scores identical to those from the "traditionally" estimated IRT model. It seems, though, that a general formula for converting scores from the probit metric to the logistic metric does not exist?

And, thanks for the nudge regarding logistic and ML. It does make sense, and I do agree, that with logistic and ML (and an appropriately identified model), one would achieve the same results as the IRT model.

Thanks.

craig neumann posted on Monday, August 23, 2010 - 11:52 am

When obtaining factor scores via the savedata command, is it possible to also output other variables (such as subject ID) that were not in the CFA?

craig neumann posted on Monday, August 23, 2010 - 11:56 am

oops... never mind the post from craig neumann, just found the idvariable option.

Maren Winkler posted on Sunday, October 17, 2010 - 10:19 am

Are the factor scores that Mplus estimates with the MLR estimator corrected for measurement error?

Linda K. Muthen posted on Monday, October 18, 2010 - 9:30 am

Yes.

James L. Lewis posted on Tuesday, October 19, 2010 - 4:00 pm

Hello,
I have a question about obtaining factor scores. The way I understand it Mplus factor scores from a CFA with Likert items and the WLSMV estimator will give me factor scores that are equivalent to Graded Response Model (GRM) IRT scores (EAP estimation [with a normal prior I presume?]).

1. Is this correct?

My other question is with regard to local independence and unidimensionality.

2. I have 15 Likert items (5 point) with which I am trying to measure a single construct. It is pretty clear to me, however, that there are local dependencies and maybe ultimately >1 factor among these items. If I appropriately specify a bi-factor model or otherwise appropriately estimate correlations among the residuals however (to get rid of or "account for" the local dependencies), will my estimated factor scores for the GENERAL factor essentially be equal to GRM person scores (theta) from a model where the unidimensionality and local independence assumptions are satisfied?

If I could I would just stay within the latent variable framework, but for my application here I really need the factor scores.

I hope this makes sense. Any citations would be tremendous also. I love Mplus. Thanks.

Bengt O. Muthen posted on Wednesday, October 20, 2010 - 10:23 am

1. With WLSMV it is actually MAP (maximum a posteriori) that is used. With ML it is EAP. For WLSMV factor score estimation, see the Technical Appendix for Version 2 on our web site.

2. Yes. I know of no citation for this.

James L. Lewis posted on Wednesday, October 20, 2010 - 10:59 am

Thanks much.

MAP with a normal prior?

I apologize, I cannot locate Technical Appendix Version 2 on the website, only Version 3. It looks like Appendix 11 in the latter only covers estimation of factor scores for continuous and binary data, not ordered categorical(?).

Thanks.

Bengt O. Muthen posted on Wednesday, October 20, 2010 - 11:36 am

Yes, normal prior.

You find it under Technical Appendices (see left margin of the home page) and once on that page, it is the first link which gets you to:

http://www.statmodel.com/download/techappen.pdf

Appendix 11 - see (229) and below.

James L. Lewis posted on Wednesday, October 20, 2010 - 4:06 pm

Thanks much. I see it now.

I have one final question.

MAP and EAP scoring of course require the specification of a prior distribution. I am a bit confused: if I specify a normal prior (for either method), should I not expect that the resulting distribution of Theta-Hat will be normal or close to it, particularly if the population distribution of Theta is indeed normal? What if the population distribution of Theta is not normal? I am getting conflicting reports on all this. Can you perhaps clarify? Thanks much.

Bengt O. Muthen posted on Thursday, October 21, 2010 - 1:53 pm

The posterior distribution can be quite non-normal even with a normal prior. For example, if the items are too difficult or too easy we can't discriminate between people who are high/low and the posterior (theta-hat) will be skewed.

ywang posted on Friday, February 11, 2011 - 8:17 am

Dear Drs. Muthen,

Does IRT model usually stand alone? I included IRT in the SEM, but cannot find the model fitness criteria for the SEM. What model fitness criteria can we get for the SEM with IRT? Also can you refer any paper that describes the model of SEM with IRT? Can it be described in similar way as the CFA model in SEM except for that the indicator variables are categorical?

Linda K. Muthen posted on Friday, February 11, 2011 - 10:28 am

It sounds like you are using maximum likelihood estimation and categorical outcomes. You will not receive chi-square and related fit statistics in this case. If you use weighted least squares estimation, you will. IRT is CFA with categorical outcomes. There are many IRT books. We have some papers on our website under Papers/IRT.

xstudylab posted on Wednesday, February 23, 2011 - 10:09 am

I just switched to Version 6.1 and now I can't get factor scores for my model... I get the error message 'FACTOR SCORES CAN NOT BE COMPUTED FOR THIS MODEL DUE TO A REGRESSION ON A DEPENDENT VARIABLE' but indicators and factors are only regressed on exogenous variables.

I ran the same input using Version 5 and it gave me the same estimates as Version 6.1, but it produced factor scores without an error message.

Is there something different about how Version 6.1 produces factor scores?

Linda K. Muthen posted on Wednesday, February 23, 2011 - 11:59 am

Please send the output and your license number to support@statmodel.com. This may be a problem in Version 6.1.

Dena Pastor posted on Wednesday, February 23, 2011 - 1:52 pm

I'm running a regular CFA model and a little uncertain how to interpret the information provided in the output under SAMPLE STATISTICS FOR ESTIMATED FACTOR SCORES obtained using PLOT3.

Linda K. Muthen posted on Wednesday, February 23, 2011 - 2:38 pm

When there are factors in the model, the PLOT command provides factor scores. The descriptive statistics are for the factor scores.

Walter Paczkowski posted on Wednesday, March 02, 2011 - 10:02 am

I'm interested in buying MPlus Base because of the ability to handle binary and ordinal variables in CFA. I do have a question regarding the factor scores that I hope you can address. The scores are continuous, yet I need to convert them to either binary or ordinal -- the same scales as the original variables used in the CFA. Does MPlus do this or is there a way to program this conversion (does MPlus, in fact, have programming capability?)? Any papers on converting? Technical Appendices? Notes?

Bengt O. Muthen posted on Wednesday, March 02, 2011 - 10:50 am

Mplus does not do this. There would need to be further information in order to define such a conversion. Mplus does not have a programming capability. There is an IRT literature on conversions such as that related to NAEP with writings by Mislevy and others.

ywang posted on Wednesday, March 02, 2011 - 10:59 am

I am working on an SEM with a latent factor by IRT (3 dummy indicator variables) as an independent variable. Other researchers have asked me for a description of the latent construct. They believe that the latent variable must have some sort of values and would like the latent construct described in terms of its range and distribution.

It seems that the latent variable does not have a metric and it is not possible to be described in such a way as an indicator variable. I am wondering whether it is appropriate to describe the latent construct using factor score instead. However, I have some concerns since(1) estimated factor scores differ between the stand-alone IRT model and the SEM model, and (2) factor score is not exactly the latent construct and still has measurement error.

Do you have any suggestions on how to describe the latent factor in the SEM?

Thank you very much for your help!

Linda K. Muthen posted on Wednesday, March 02, 2011 - 11:44 am

In a cross-sectional model, a factor has a mean of zero and an estimated variance.

ywang posted on Wednesday, March 02, 2011 - 12:32 pm

Thank you very much for the reply. I have a follow-up question. In the stand-alone IRT model, I got a factor variance of 0.122. However, when the IRT was included in the SEM, the factor variance changed to 0.202. Which variance should I report? Is this inconsistency due to the SEM not fitting the data well (CFI: 0.873, TLI: 0.762)?

Bengt O. Muthen posted on Wednesday, March 02, 2011 - 1:04 pm

If the factor is an independent variable in the SEM then the estimated variances should be close within their SEs. If not, as you say, the SEM may be ill-fitting.

I would not use estimated factor scores here given that you have only 3 indicators. The factor metric of the SEM is clear: your model postulates a normal variable with a mean of zero and a certain variance (or you can fix the variance at 1 to get a z score, and then free the first loading).

ywang posted on Wednesday, March 02, 2011 - 2:07 pm

Thanks a lot. Your reply greatly helped me. I have another question about factors by subgroups such as gender. If I have to list the mean and variance of the factor score for males, for females, and for the whole sample in one table, as well as the p value for the difference in factor score between males and females, what should I do?

From your previous discussion with other researchers, I understand that multi-group analysis should be used to test whether the factor mean differs between males and females. In the model you previously mentioned, the factor mean in one group (e.g. males) is fixed at 0 and the mean is freely estimated in the other group (e.g. females). For that table, I need to list the mean and variance for males, females, and the overall sample. How can I free the means for both groups in the multi-group analysis? Thanks!

Linda K. Muthen posted on Thursday, March 03, 2011 - 6:48 am

In multiple group analysis, a test of factor mean differences is a difference test between a model with factor mean zero in one group and free in the other groups versus a model with factor mean zero in all groups. Please see Slide 223 of the Topic 1 course handout.

Fernando H Andrade posted on Thursday, March 17, 2011 - 1:16 pm

Dear Dr. Muthen,
I am running a CFA with categorical indicators, and I requested that Mplus compute the factor scores. The model fits well, except that Mplus cannot compute the factor scores. This is the message:

THE MODEL ESTIMATION TERMINATED NORMALLY

FACTOR SCORES CAN NOT BE COMPUTED FOR THIS MODEL DUE TO
A REGRESSION ON A DEPENDENT VARIABLE.

is there a way to overcome this and get the factor scores
thank you
fernando

Linda K. Muthen posted on Thursday, March 17, 2011 - 1:56 pm

I think you will find that factor scores are saved in spite of this message. Check that. There is an incorrect error check in Version 6.1 that produces this message but still gives valid factor scores.

Fernando H Andrade posted on Thursday, March 17, 2011 - 2:04 pm

thank you, but i could not find the file with the scores. in the output i had this message:

SAVEDATA INFORMATION

Factor scores were not computed.
No data were saved.

this is the command i used to save factor scores
savedata:
file is E:\fandrade\pisa\paper1\revised soc of educ\scpfscores;
save is fscores;

Linda K. Muthen posted on Thursday, March 17, 2011 - 2:35 pm

Please send your input, data, output, and license number to support@statmodel.com.

Jan Ivanouw posted on Thursday, March 31, 2011 - 12:52 am

Hi,

I wonder if Mplus can give SE's for Factor Scores?

Linda K. Muthen posted on Thursday, March 31, 2011 - 5:54 am

Yes, for factors with continuous factor indicators.

Helen Zhao posted on Tuesday, May 17, 2011 - 12:16 am

Hi, professors Muthen,

I wonder is it possible to obtain factor scores for error terms in Mplus?

Thanks,
Helen

Linda K. Muthen posted on Tuesday, May 17, 2011 - 6:53 am

You should be able to do this. See the FAQ on the website called Regressing on a residual.

Michelle Little posted on Monday, June 20, 2011 - 6:38 pm

Hello,

Reading over this post stream, I understand that SEM is generally preferable to using factor scores. I have a related question.

When use of all items, or parcels, is prohibited because of sample size limitation, are there any advantages to using factor scores in lieu of total scores for first order factors in an SEM model? In other words, I was thinking of using factor scores to represent first order manifest scales within latent factors in an SEM model. I was hoping to reduce some second-order factors to first order factors in this way. Are there any advantages of accuracy and reduced error when using factor scores in this way, as opposed to using total scores?

Thanks in advance for your help.

Linda K. Muthen posted on Tuesday, June 21, 2011 - 10:22 am

In your case, when you are using the factor scores or sum scores as factor indicators, measurement error is taken into account. I can't think of a reason why one would be preferable to the other.

Stefanie Hinkle posted on Wednesday, July 27, 2011 - 8:22 am

Hi. I am working on obtaining factor scores. The problem that I have run into is that when I look at the output data set, the values of my weight variable are not the same as the original weights that I input. The cluster and strata values were not altered. Does Mplus alter the weights when it uses them?

For example the original weight variable ranged from ~4 to ~3000. Now in the output dataset it ranges from 0 to ~4.

I appreciate any thoughts.

Thanks!

Here is my code:
WEIGHT IS W1;

STRATIFICATION IS STR;

CLUSTER IS NEWPSU;

SUBPOPULATION IS EX EQ 1;

MISSING ARE .;

IDVARIABLE IS id;

ANALYSIS:

TYPE = GENERAL COMPLEX;

MODEL:

!MEASUREMENT MODEL
F1 BY V1 V2 V3 V4 V5 ;
F2 BY V6 V7 V8 V9 V10 ;

Linda K. Muthen posted on Wednesday, July 27, 2011 - 11:03 am

If the sum of the sampling weights is not equal to the total number of observations in the analysis data set, the weights are rescaled so that they sum to the total number of observations.
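
The rescaling described here amounts to multiplying every weight by N divided by the sum of the weights. A minimal Python sketch with made-up weights (`rescale_weights` is a hypothetical helper, and how a SUBPOPULATION statement interacts with the rescaling is not covered):

```python
def rescale_weights(weights):
    """Rescale sampling weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]

# Hypothetical original weights on a population-count scale.
original = [4.0, 250.0, 1200.0, 3000.0, 546.0]
rescaled = rescale_weights(original)
```

The relative sizes of the weights are unchanged, so the original values can be recovered by multiplying the saved weights by the original total divided by N.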

Michael Becker posted on Wednesday, October 12, 2011 - 11:03 am

Dear Drs. Muthen,

I tried to estimate an IRT-Model, which works, yet it does not return factor scores, but strangely enough there is also no error message. It seems like this or some similar problem was addressed in the forum before, at least there is a thread from
Goran Milacevic posted on Saturday, July 12, 2008 - 1:20 am
which sounds similar. Can you reconstruct what the solution was back then?

Thank you very much in advance for your advice!
Michael

Linda K. Muthen posted on Wednesday, October 12, 2011 - 11:16 am

Please send your input, data, output and license number to support@statmodel.com

Michael Becker posted on Wednesday, October 12, 2011 - 9:07 pm

While I was waiting, I found a solution: rerunning the model with an increased number of integration points returned the FSCORES output.
Thanks,
Michael

Evelyn posted on Friday, October 28, 2011 - 3:33 pm

I have done an EFA and would like to save the factor scores and use them in a path model (the N of the dataset is too low to include the measurement model in the path analysis).

I've read it is not possible to use "save=fscores" with the EFA command,
so I have used

Analysis:
Type = COMPLEX; !data from students in classes
Model:
f1-f3 by es_vroth-es_marry (*1); !EFA indicated 3 factor model
Savedata:
file is moroc_fscores.dat;
save = fscores;

I've compared the GEOMIN ROTATED LOADINGS of the EFA and the CFA and noticed slight differences. Why is that? Could it be problematic?

Also, I was wondering how I can name the factors for easy identification.

thank you

Evelyn posted on Friday, October 28, 2011 - 3:38 pm

Apologies: when I compare the standardised loadings they are identical.

Bengt O. Muthen posted on Friday, October 28, 2011 - 8:12 pm

Loadings and therefore factor scores are different in EFA than CFA because the models are different, with different degrees of freedom. EFA presents standardized loadings because a correlation matrix is analyzed whereas CFA uses a covariance matrix so that only the standardized solution is close to the EFA.

You should watch the video of our Topic 1 course to learn more about these matters.

Kerry Lee posted on Sunday, November 20, 2011 - 8:25 pm

Dear Drs Muthen,

I am running a CFA. Depending on the indicator to which the scale of the factor is fixed, the variance of the factor does not always attain significance. Specifically, when I fixed the metric of the latent variable to the indicator with the largest loading (i.e., when the factor variance was fixed at one), it failed to attain significance (p = .076). It attains significance when its metric is fixed to one of the other two indicators. To obtain more information on the distribution of the latent variable, I generated some histograms using the PLOT2 command. When I asked to "view descriptive statistics", the variance value (.133) differed from that in the text output (.147). Would you have some suggestions on why there is a discrepancy? My second question relates to the kurtosis value. Has it been rescaled, with zero denoting no departure from normality?

Sincerely,
Kerry.

Kerry Lee posted on Sunday, November 20, 2011 - 8:34 pm

Regarding my previous post, the histograms were generated using PLOT3, not 2.

Kerry.

Bengt O. Muthen posted on Sunday, November 20, 2011 - 8:39 pm

I think what you are seeing is the difference between the estimated factor variance parameter and the variance of the estimated factor scores. They are not expected to be the same. Yes, a kurtosis of zero denotes no departure from normality.
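
To see why the two variances differ, here is a small Python sketch for a one-factor model with continuous indicators. It assumes regression-method factor scores, for which the model-implied variance of the estimated scores equals the factor variance multiplied by the score reliability and is therefore smaller than the factor variance parameter. All parameter values are invented for illustration and are not taken from the poster's output.

```python
# One-factor model with continuous indicators; all values are invented.
psi = 0.147                      # estimated factor variance parameter
loadings = [1.0, 1.3, 0.9]       # factor loadings (lambda_j)
resid_vars = [0.05, 0.04, 0.06]  # residual variances (theta_j)

# Information about the factor carried by the indicators.
info = sum(lam * lam / theta for lam, theta in zip(loadings, resid_vars))

# Reliability of the regression-method score (squared factor determinacy).
rho = psi * info / (1.0 + psi * info)

# Model-implied variance of the estimated factor scores: shrunken below psi.
score_var = psi * rho

print(round(rho, 4), round(score_var, 4))
```

The gap between the two numbers shrinks as the factor determinacy approaches one.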

Kerry Lee posted on Sunday, November 20, 2011 - 8:54 pm

Thanks very much for the quick reply. If I want to say something about whether there is a significant amount of variance in the latent factor, should I report the value from the text output (I assume this is the estimated factor variance parameter)?

Kerry.

Bengt O. Muthen posted on Monday, November 21, 2011 - 7:34 am

Yes, use the estimated factor variance parameter and its SE in the printed output.

nanda mooij posted on Wednesday, December 07, 2011 - 7:48 am

Dear Drs. Muthen,

I have a model with first, second and third order factors, and I want to estimate the factor scores of the first and second order factors with the calculated item parameters of the third order factors. Now I am wondering if I could get these factor scores through putting the item parameters of the third order factors in the input. The items have 3 categories, so are polytomous and are non-ordered. I saw the appendix about the estimation of factor scores, but that's only about dichotomous or continuous y-variables...
So can I put item parameters in the input and if not, where can I find an appendix about estimating factor scores of categorical, non-ordered, y-variables?

Thanks in advance,
Nanda

Linda K. Muthen posted on Wednesday, December 07, 2011 - 12:22 pm

You can obtain factor scores for all of the factors using the FSCORES option of the SAVEDATA command. See Technical Appendix 11 on the website.

nanda mooij posted on Thursday, December 08, 2011 - 9:21 am

Dear Drs. Muthen,

I know that I can obtain the factor scores for all the factors at once, but I actually want to obtain the factor scores of the total scale using the item parameters of the subscales (so I first estimate the item parameters per subscale, and then I want to use these item parameters to estimate the factor scores of the total scale). So what I want to do is what is done in MULTILOG, where I must provide a file containing the item parameters I want to use. I'm wondering whether it is also possible in Mplus to put a reference to the item parameters in the input.
I hope I'm explaining myself better now.

Thanks a lot,
Nanda

Bengt O. Muthen posted on Thursday, December 08, 2011 - 8:16 pm

You can fix the item parameters at the values you want in the MODEL command and only request estimation of the factor scores.

Roger E. Millsap posted on Monday, April 02, 2012 - 1:28 pm

We are running a CFA with categorical indicators in which some loadings and thresholds are fixed (and others are not). We want to obtain factor scores. Mplus tells us that "FACTOR SCORES CAN NOT BE COMPUTED FOR THIS MODEL DUE TO A REGRESSION ON A DEPENDENT VARIABLE". What does this mean? We have no regressions in the model, or other path structure, apart from a CFA.

Linda K. Muthen posted on Monday, April 02, 2012 - 3:18 pm

If you are not using Version 6.12, download it. This error message came out in error in an earlier version.

Richard E. Zinbarg posted on Thursday, April 05, 2012 - 10:00 am

We want to obtain factor scores from a CFA of categorical indicators to use in a subsequent path analysis because the full model, including the measurement model, would be too large for our sample size. We want to use a single-indicator SEM approach for the path analysis, however, so we want to estimate the reliabilities of the factor scores. We have found a very sensible formula on the internet in which we would divide the factor variance by the sum of the factor variance plus the factor score variance (based on the notion that the factor score variance estimates the error variance). That post on the web indicated that Mplus could give us output corresponding to the within-subject factor score variance (which is constant across subjects). Is this correct? If so, how do we get it?

Linda K. Muthen posted on Thursday, April 05, 2012 - 1:37 pm

If you use SAVE=FSCORES you will obtain factor scores and standard errors of the factor scores.
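
The reliability formula described in the question can be sketched in Python as follows. This is only an illustration of the poster's formula, under the assumption that the squared standard error of a factor score is used as the error variance; the function name and the numbers are hypothetical.

```python
def factor_score_reliability(factor_var, score_se):
    """Factor variance divided by factor variance plus error variance."""
    error_var = score_se ** 2
    return factor_var / (factor_var + error_var)


# Invented values: factor variance 1.0, factor score standard error 0.5.
rel = factor_score_reliability(factor_var=1.0, score_se=0.5)
print(rel)
```

With continuous indicators and no missing data the standard error is the same for every subject, so a single reliability value results.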

Alison R Alden posted on Sunday, April 08, 2012 - 4:22 pm

This is Rick Zinbarg's graduate student, Alison, following up on the question that he asked about our analyses.

While it is true that when you use SAVE=FSCORES for models of _continuous_ items, you obtain standard errors for the factor scores, we have specified that the items in our data set are categorical.

When we do this and use the SAVE=FSCORES command, MPLUS still gives us factor scores but no longer gives us their standard errors.

Is there a way to get the within-subject factor score variance for categorical data?

Our goal in getting this information is to calculate the reliabilities for the factor scores we are obtaining in order to use them in a single-indicator SEM approach for path analysis.

Thanks for your help.

Linda K. Muthen posted on Monday, April 09, 2012 - 8:19 am

With categorical outcomes, the standard errors vary as a function of the factor values. You need to take those values from the plot of the information function you get using PLOT2 in the PLOT command. The standard errors are computed as 1/square root of the information function value.
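
The conversion from information to standard error can be sketched in Python like this. The (factor value, information) pairs are invented stand-ins for values read off an information curve, and the linear interpolation between grid points is an added assumption for illustration, not something Mplus does for you.

```python
import math

# Hypothetical (factor value, information) pairs read off an information curve.
info_curve = [(-2.0, 1.2), (-1.0, 2.5), (0.0, 4.0), (1.0, 2.5), (2.0, 1.2)]


def se_from_information(information):
    """Standard error = 1 / square root of the information value."""
    return 1.0 / math.sqrt(information)


def information_at(theta, curve):
    """Linearly interpolate the information between tabulated grid points."""
    for (t0, i0), (t1, i1) in zip(curve, curve[1:]):
        if t0 <= theta <= t1:
            frac = (theta - t0) / (t1 - t0)
            return i0 + frac * (i1 - i0)
    raise ValueError("theta lies outside the tabulated range")


# Standard error for a subject whose factor score is 0.5.
se_example = se_from_information(information_at(0.5, info_curve))
print(round(se_example, 4))
```

Applied to a column of saved factor scores, the same two functions would give one standard error per subject.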

Alison R Alden posted on Tuesday, April 17, 2012 - 5:38 pm

Thanks for the info about getting the information functions.

Unfortunately, we neglected to mention in our earlier posts that we are trying to obtain the within-subject factor score variance for a group factor in a hierarchical model (e.g., one in which items load upon both group factors and a general factor).

We are able to use the PLOT2 option to get Mplus to give us the information functions for both a unidimensional model of the items comprising one group factor and a unidimensional model of all items.

However, is there any way to get Mplus to give us information functions for each factor in a multi-dimensional model in which items are categorical?

We are hoping to use this information to calculate the reliability of factor scores generated from our model.

Linda K. Muthen posted on Wednesday, April 18, 2012 - 9:15 am

You have a choice of showing individual factors in PLOT2. Go through the windows and you will find one where you can choose which factor to plot.

Wen-Hsu Lin posted on Sunday, May 27, 2012 - 7:33 pm

Hi, when I conduct a CFA with categorical data, one part of the output also shows IRT results. I am wondering whether the saved factor scores can be used as IRT scores. Thank you.

Linda K. Muthen posted on Monday, May 28, 2012 - 8:35 am

Yes, the saved factor scores can be used as IRT scores.

Wen-Hsu Lin posted on Tuesday, May 29, 2012 - 12:19 am

Thanks, Linda. A follow-up on creating IRT scores: I have, for example, four waves of data, and for each wave I create the IRT score from the saved factor score. I would like to use the saved scores from these four waves to fit an LGC model. How do I do this in Mplus?
Thank you very much.

Linda K. Muthen posted on Tuesday, May 29, 2012 - 8:38 am

You can save the factor scores using the SAVEDATA command and then use the saved data to estimate the LGC model. Unless factor determinacy is one, the factor scores are not the same as using the factors in the model. I would suggest a multiple indicator growth model instead.

Wen-Hsu Lin posted on Sunday, June 03, 2012 - 8:35 pm

Thanks Linda, perhaps I did not state my question clearly. I know how to save the factor scores. How do I combine the saved scores for each wave? Or should I run the CFA for all waves at once and save the scores, so that I will have only one file?

Linda K. Muthen posted on Monday, June 04, 2012 - 5:43 am

If you want to use the factor scores from all of the waves in a single analysis, you need to save all of the factor scores in one file. This can be accomplished by running all four waves together and saving the factor scores. You would want to do this as a first step to determine measurement invariance across time.

Dmitriy Poznyak posted on Monday, February 11, 2013 - 7:26 pm

Dear Linda and Bengt,

I need to calculate the standard errors of the factor scores for the CFA model with categorical outcomes. From your explanation above, it looks like in categorical CFA, std. errors can not be computed along with the factor scores in a single step but need to be further derived from the plot of the information function. This is where I am getting lost.

Could you please describe the sequence of steps one needs to follow in order to produce the std. errors for the latent factor scores for each respondent (or refer me to an example)? I also wonder whether there is a way to output these scores in a txt or dat format in order to append them to the vector of factor scores.

Thank you!

Bengt O. Muthen posted on Tuesday, February 12, 2013 - 3:51 pm

Factor score SEs have been implemented in more cases in Version 7. Still, WLSMV doesn't give them yet.

The information function gives the inverse of the square of the SE at different factor score values.

Dmitriy Poznyak posted on Tuesday, February 12, 2013 - 4:54 pm

Dr. Muthen,

Thank you for your prompt reply. Just to clarify, in v. 7, I would be able to get the errors if I used the ML or Bayes estimators instead of WLSMV, correct?

Unfortunately, since I want to calculate S.E. for each unique value of the latent scores (n=3400), using the information function graph would be impractical.

Bengt O. Muthen posted on Tuesday, February 12, 2013 - 5:46 pm

If you send Support your input, we can check if that case gives SEs.

Christoph J. posted on Thursday, June 20, 2013 - 7:42 am

Hi,

I am estimating a model using WLSMV and want to compute factor scores. Mplus (Version 6.1) gives me the following error message:

FACTOR SCORES CAN NOT BE COMPUTED FOR THIS MODEL DUE TO A REGRESSION ON A DEPENDENT VARIABLE.

I have four ordinal indicators. What does the error message mean and is there a way to solve this problem?

Kind regards,
Christoph

Linda K. Muthen posted on Thursday, June 20, 2013 - 7:51 am

You cannot obtain factor scores for your model using weighted least squares. You can use maximum likelihood instead.

Christoph J. posted on Thursday, June 20, 2013 - 8:18 am

Thanks for this really quick reply! Could you maybe give me a short explanation what the error message means and why WLSMV cannot be used in such a case? I would like to understand why it is not working.

Thanks in advance!
Christoph

Linda K. Muthen posted on Thursday, June 20, 2013 - 8:21 am

You must have a situation where you regress something on a dependent variable. Factor scores have not been developed for this model.

Christoph J. posted on Thursday, June 20, 2013 - 8:31 am

Hi, again thanks for the quick reply. I am not sure I understand your response. In my case the model is simply:

Model:
Y1 by f_oper99 f_kino99 f_sport99 f_freVerw99;

Christoph

Linda K. Muthen posted on Thursday, June 20, 2013 - 8:52 am

Oh, I think there was a version where this message came out in error. You should use a newer version.

Jason Bond posted on Tuesday, July 09, 2013 - 10:31 am

Hi,

I had a question regarding estimated factor scores obtained from a CFA on polytomous factor indicators. For continuous factor indicators, my understanding is that the estimated factor scores are standardized to have mean 0 and variance 1. However, estimated factor scores from polytomous factor indicators have, in my case (a majority of the items are skewed), a negative mean, which naturally raises the question among those I am working with of what the scale (or normalization) is for these factor scores. I found this question difficult to answer with Tech appendix 11. Thanks for any help,

Jason

Linda K. Muthen posted on Wednesday, July 10, 2013 - 12:29 pm

Factor scores are not standardized. They can have any mean or variance. With categorical items that are skewed, you will see this in the factor scores.

J.W. posted on Sunday, August 18, 2013 - 12:27 pm

To my knowledge, when items are continuous, Mplus uses the regression method to estimate factor scores; when items are categorical, Mplus uses Expected A Posteriori (EAP). I have a CFA with ordinal items measured on a 5-point scale:
1) Treating the items as continuous measures and saving the estimated factor scores in Mplus;
2) Treating the items as categorical measures and saving the estimated factor scores in Mplus.

Both models fit the data very well, and the factor scores estimated from the two models are highly correlated (r=0.98). I am wondering if the model results indicate that I can simply treat my 5-point Likert scale items as continuous measures for CFA? If so, any reference? It is much easier to use CFA with continuous items for measurement invariance testing. As I recall, the ALIGNMENT option is only available for continuous or binary items in the current version of Mplus. Will the option be available for categorical items in a future version of Mplus?
Your help will be appreciated!

Bengt O. Muthen posted on Sunday, August 18, 2013 - 1:57 pm

When items are categorical, WLSMV uses Maximum A Posteriori (MAP) and ML uses EAP.

Typically, it is ok to approximate Likert scale variables as continuous and use linear models unless there are strong floor or ceiling effects. You asked for references - here are 2 classic ones:

Muthén, B., & Kaplan D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.

Muthén, B., & Kaplan, D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical and Statistical Psychology, 45, 19-30.

Yes, alignment method is likely to be expanded in various directions.

J.W. posted on Monday, August 19, 2013 - 8:15 am

Great help! Thank you so much!

Jessica Kay Flake posted on Tuesday, September 24, 2013 - 4:52 pm

Hello,
I am using the alignment method, new to version 7.11. I have attempted to use the SAVEDATA command to save the results/import the output to a data file, but no data file appears or is empty when I open it. Are there SAVEDATA options for the alignment method? I have thousands of pages of output and am trying to find ways to make it easier to interpret/manipulate/analyze the output. Any suggestions you have for this will be much appreciated. Thank you for your time.

Linda K. Muthen posted on Wednesday, September 25, 2013 - 4:21 pm

We are not aware of any specific problems involving the RESULTS option and alignment. Please send your input, data, output, and license number to support@statmodel.com.

deana desa posted on Thursday, November 07, 2013 - 7:52 am

Hi,

Is there any way to tell Mplus to keep cases with missing values on all variables as they are in the input data, even though these cases won't be used in the modeling when computing factor scores for the other cases?

Linda K. Muthen posted on Thursday, November 07, 2013 - 8:56 am

There is no way to do this. The analysis data set is saved and it does not contain these cases.

Li posted on Tuesday, November 19, 2013 - 2:37 pm

Dear Drs. Muthen,
I have a question about factor scores derived from CFA.
I ran a four-factor CFA with 16 items. So each factor has only a few indicators (dichotomous, yes or no). The purpose of the CFA is mainly to confirm the factor structure for some follow-up multilevel analysis. I understand I can use one-step multilevel SEM for this, but I can't do that due to sample size and other issues.

I compared the sum scores and factor scores for the four factors and found them very similar. So I used sum scores for the follow-up multilevel regression analysis. A colleague thinks factor scores are superior to sum scores because they take into account the correlations among the factors. BTW, the correlations among the four factors in our CFA range from .48 to .75. According to this colleague, if I use factor scores in the follow-up analysis then multicollinearity is already dealt with.

I read the Mplus posts and the technical appendix but still can't tell whether the factor scores from CFA already incorporate the correlations among the factors or not. What do you think? In particular, do you have a reference for this?
I really appreciate it.
Li

Linda K. Muthen posted on Wednesday, November 20, 2013 - 1:39 pm

The factor scores are computed based on the model that is estimated. If the model includes correlations among the factors, the factor scores incorporate them.

Aidan G. Wright posted on Tuesday, February 18, 2014 - 8:37 pm

I had a question about how Mplus calculates factor scores when cases have some missing data on continuous indicators. As I've read, I gather that Mplus uses "regression or Bartlett methods" when estimating factor scores for continuous data, and it uses "all available data" to estimate these scores when data are missing. However, it's not very clear exactly how this is accomplished. Could you please say more about how Mplus manages to calculate a factor score when a case has missing data?

Thanks,

Aidan

Bengt O. Muthen posted on Wednesday, February 19, 2014 - 8:48 am

If a subject has missing data on all variables in the model, a factor score for that subject cannot be computed; in fact, that subject is not even included in the model estimation. But if the subject has some of the variables observed a factor score is estimated. For instance, if you have a longitudinal model with 1 factor at each of 3 time points and a subject is not present at time point 2, his factor score at time point 2 can be estimated because he has observations at time 1 and 3 and the estimated model says how much the time 2 factor correlates with the time 1 and time 3 factors. The SE of the estimated time 2 factor score for that person is going to be higher than for a person who had no missing.

Aidan G. Wright posted on Wednesday, February 19, 2014 - 9:25 am

Thanks, Bengt. So to clarify, when there is partially missing data for an individual, Mplus uses the individual's available scores and the model estimated relationships among the variables to estimate the missing factor score. Right? And, if it were a 1-factor model, with 4 items loading on it, and for an individual 1 is missing, then I assume that Mplus will use the model implied covariance among the items to estimate the score as if the value were present?

Would you say this would result in similar values to using EM based imputation?

Thanks for the clarification.

Bengt O. Muthen posted on Wednesday, February 19, 2014 - 11:30 am

I agree on the first paragraph. EM-based imputation typically concerns the missing data on the items, not the factor scores.

Tom Booth posted on Tuesday, March 04, 2014 - 8:54 am

Linda/Bengt,

I have conducted a CFA on 7 items with a binary response format based on data with a family structure. As such I have included clustering based on a family identifier in the model commands.

I need to produce factor scores for subsequent analyses. I wanted to confirm that the clustering does not impact these scores? As I understand it corrects SE, and as a result would not impact on the computation of the score. However, I am not 100% confident on how the factor scores for binary responses are computed in order to be sure the above is correct.

Thanks

Tom

Bengt O. Muthen posted on Tuesday, March 04, 2014 - 7:20 pm

It sounds like you use Type=Complex to take care of clustering. In which case you are right about the factor score estimates.

If you use Type = Twolevel you can get factor scores on both levels.

Tom Booth posted on Monday, March 17, 2014 - 6:32 am

Thanks Bengt

Jeremy Cochran posted on Monday, March 17, 2014 - 8:34 am

Hi,

I saw previously on this thread that there might be an issue with version 6.1 and reporting factor scores from a CFA. We're having that issue right now - we're running a CFA with categorical indicators and requested factor scores, but each time we're getting the same message: FACTOR SCORES CAN NOT BE COMPUTED FOR THIS MODEL DUE TO A REGRESSION ON A DEPENDENT VARIABLE. Is there a specific solution we should use?

Linda K. Muthen posted on Monday, March 17, 2014 - 9:56 am

Yes, there was an error in a check that was introduced in Version 6.1. There is no workaround for this.

Tatiana Trifan posted on Thursday, March 27, 2014 - 6:18 am

Dear Dr. Muthen,

We performed CFAs on several measures having from 3 to 9 indicators each, measured on three different scales: 1-3, 1-4, and 1-5. Within each measure, the scale is the same. We treated the measures as continuous. Our purpose is to perform a latent class analysis using the resulting factor scores. We have noticed that the range of the factor scores is wider than the original measurement scale. For example, for a measure with 4 indicators (scale 1-4), our factor scores range from -0.22 to 10.86. We rescaled our items so that the range is 1 to 3 for all of them. However, when we re-calculated the factor scores using the re-scaled indicators, the factor scores were identical to the FS calculated with the original metric. Given that cluster analysis is sensitive to the metric of the measures, we wanted to ask you how the factor scores are calculated in Mplus, and whether the metric of the indicators has an influence on the factor scores.
Thank you.

Linda K. Muthen posted on Thursday, March 27, 2014 - 1:47 pm

Please send the output and your license number to support@statmodel.com.

Sara Manganelli posted on Wednesday, July 09, 2014 - 3:04 am

Dear drs. Muthen,
I did a multiple group CFA with equality constraint on the factor loadings and I computed the factor scores. When I checked the factor scores distribution, I found that they had a very small variance (0,015 for the first group and 0,018 for the second group) and a very reduced range of variation (between -0,346 and 0,342 for the first group). My questions are:
Are the estimated factor scores standardized? How is the scale of the factor scores defined (mean and variance)?

Thank you very much for your help.

Sara

Linda K. Muthen posted on Wednesday, July 09, 2014 - 9:19 am

Factor scores are not standardized. The scale of the factor score is determined by the estimated model.

Please send your output including SAMPSTAT and your license number to support@statmodel.com.

Tomasz Żółtak posted on Saturday, October 04, 2014 - 7:54 am

I observed that in multiple group models (with known group membership, specified with KNOWNCLASS option) with categorical dependent variables and full measurement invariance assumed (only expected values and variances of latent variable are set as free parameters in non reference groups), estimated with MLR in Mplus 7.2 I don't obtain standard errors of estimated factor scores.

Is it planned to enable estimation of factor scores standard errors in such a case (and in multiple group models estimated with MLR in general) in future versions of Mplus?

Linda K. Muthen posted on Monday, October 06, 2014 - 10:28 am

It is correct that standard errors for factor scores are not available in this case. This is on our list but won't be in the next update.

You can get the standard errors by running one group at a time with parameters fixed at the estimated values. You can use the SVALUES option of the OUTPUT command to get input with the estimated values as starting values and change the * to @.
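
Changing * to @ by hand is tedious for a large model, so here is a small Python sketch of the text substitution. The sample lines are only shaped like SVALUES output and their values are invented; the helper name is hypothetical, and the pattern assumes every free starting value is written as * followed directly by a number.

```python
import re

# Lines shaped like SVALUES output; parameter values are invented.
svalues_lines = [
    "f BY u1@1;",
    "f BY u2*0.84210;",
    "[ u1$1*-0.31200 ];",
    "f*1.20500;",
]


def fix_starting_values(line):
    """Turn free starting values (*value) into fixed values (@value)."""
    return re.sub(r"\*(?=-?\d)", "@", line)


fixed_lines = [fix_starting_values(line) for line in svalues_lines]
for line in fixed_lines:
    print(line)
```

It is worth checking the converted statements by eye before pasting them into the MODEL command, since anything that was already fixed with @ is left unchanged.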

Lance Rappaport posted on Wednesday, September 09, 2015 - 10:44 am

Dear Professors,

I apologize for posting several questions (I posted one on a different topic yesterday). I am trying to obtain factor scores for a two-level MSEM model with three factors on level 1 and two factors on level 2. However, the variables are all ordinal, so I am using the WLSMV estimator as you recommend elsewhere for ordinal data.

The trouble is that Mplus gives me an error that "Factor scores cannot be computed for TYPE=TWOLEVEL with the estimators ULSMV, WLS, WLSM, and WLSMV." Is it possible to obtain factor scores for a two-level model with ordinal data?

Sincerely,
Lance Rappaport

Linda K. Muthen posted on Wednesday, September 09, 2015 - 11:32 am

You would need to use maximum likelihood to obtain factor scores for this model.

Trang Q. Nguyen posted on Wednesday, October 28, 2015 - 9:44 am

Dear Drs. Muthen,

I have a one-factor model with a mix of ordinal and continuous indicators, with a residual correlation between two continuous indicators, fit using WLSMV.

USEVAR = x1 x2 x3 x4 x5 x6 x7;
CATEGORICAL = x1 x2 x3 x4 x5;

ANALYSIS:
ESTIMATOR = WLSMV;

MODEL:
f BY x1 x2 x3 x4 x5 x6 x7;
x6 WITH x7;

I am interested in generating factor scores from this model. Will the factor scores in this case incorporate the correlation of the error terms of x6 and x7? I read in tech appendix 11 that when one indicator is binary/categorical, residual correlations are assumed to be zero in the computation of the MAP factor scores. In this case, the two items with residual correlation are continuous. I am hoping that the factor score computation, while assuming zero residual correlations for all the ordinal indicators, preserves the residual correlation between the two continuous variable.

Thank you!

Bengt O. Muthen posted on Wednesday, October 28, 2015 - 2:51 pm

The WLSMV factor score does not include x6 WITH x7, but you can do it using ML. Just replace

x6 WITH x7;

to instead capture the residual covariance as

f BY x6 x7; f@1;

Trang Q. Nguyen posted on Wednesday, October 28, 2015 - 9:13 pm

Thank you so much! This is great to know.

shaun goh posted on Monday, January 18, 2016 - 3:21 am

Dear Drs Muthen,

Are saved factor scores from the following WLSMV one-factor model equivalent/or at least a reasonable proxy in scale to the scores estimated by IRT of theta? (i.e. a saved factor score of 1.2 would correspond to 1.2 SD of theta)

model:
f by u1* u2* u3* u4*; ! Where u1 to u4 are binary
[f@0];
f@1;

Bengt O. Muthen posted on Monday, January 18, 2016 - 2:24 pm

Yes. It's just a different estimator and probit instead of logit link. Note that you can use ML to get the usual theta scores.
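
To give a feel for how close the probit and logit links are, here is a short Python comparison of the two response curves for one binary item. It relies on the well-known scaling constant of about 1.7 that aligns the logistic curve with the normal ogive; the discrimination and difficulty values are invented, and the function names are hypothetical.

```python
import math


def probit_prob(a, b, theta):
    """Normal-ogive probability of endorsing the item."""
    z = a * (theta - b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_prob(a, b, theta, scale=1.702):
    """Logistic probability, with the slope rescaled to the probit metric."""
    z = scale * a * (theta - b)
    return 1.0 / (1.0 + math.exp(-z))


# Largest gap between the two curves on a grid of factor values from -4 to 4.
grid = [x / 10.0 for x in range(-40, 41)]
max_gap = max(abs(probit_prob(1.0, 0.0, t) - logit_prob(1.0, 0.0, t)) for t in grid)
print(round(max_gap, 4))
```

The two curves never differ by as much as 0.01 in probability, which is why scores from the two links tend to rank subjects almost identically.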

Helen Norman posted on Friday, July 22, 2016 - 7:06 am

Hi

I have managed to save my factor scores with the following command

savedata: file is fscores1.dat;
save = fscores;

And under the VARIABLE command in my model, I have identified which is the ID variable:

IDVARIABLE IS mcsid;

so that I can match up which factor score goes with which id. However, when I run my model, all the Mplus id variables are 0 (and a few are 1) - they don't match up to the mplus dataset like they should (and how they appear in my dataset inp file) (i.e. they should run from 1 - 5899)

can you help?

Linda K. Muthen posted on Friday, July 22, 2016 - 9:48 am

Please send the files and your license number to support@statmodel.com.

Rick Borst posted on Monday, July 25, 2016 - 4:11 am

Dear drs. Muthen,

I first developed a measurement model using CFA. Everything ran fine. Then I started to relate the variables to one another (structural model) and I received the following message:

THE MODEL ESTIMATION TERMINATED NORMALLY

MINIMIZATION FAILED WHILE COMPUTING FACTOR SCORES FOR THE FOLLOWING
OBSERVATION(S) :

1 FOR VARIABLE EFFIC1
2 FOR VARIABLE EFFIC4
3 FOR VARIABLE EFFIC1
4 FOR VARIABLE EFFIC1
5 FOR VARIABLE EFFIC1
6 FOR VARIABLE EFFIC1
7 FOR VARIABLE PSM2
8 FOR VARIABLE PROACT2

etc.

Why did it run properly when I did not relate the factors to one another, but not now? What can I do about it?

Thanks in advance!

Linda K. Muthen posted on Monday, July 25, 2016 - 6:12 am

Please send the two outputs and your license number to support@statmodel.com.

Rick Borst posted on Tuesday, July 26, 2016 - 4:24 am

I tried to send you my data files but keep receiving this message by e-mail:

For the following reason:

Mail size limit exceeded.

However the size is merely 500 kb. Is there an alternative way to send the files?

Linda K. Muthen posted on Tuesday, July 26, 2016 - 8:44 am

They have been received. It must be a problem with our mail server. It is being looked into.

Rick Borst posted on Friday, August 19, 2016 - 1:39 am

Hello,

I am trying to conduct moderation analysis. And I have a few questions:

1. The latent variable moderation with LOOP plot example (following UG ex 5.13) has two asterisks after the indicators on the right-hand side of the BY statements of the moderator and the independent variable. a) Why is that? b) Do all the indicators in the BY statement need an asterisk or just one indicator of every variable?

2. I have a mediation analysis with 5 IVs, 1 mediator and 2 DVs. They are all latent variables consisting of categorical variables. I want to check whether the moderator (also a latent variable) influences the effect of 2 IVs on the mediator. I need the R-square of the mediator (with and without the interactions) and after that the LOOP plot of both interactions (so two LOOP plots). Is this feasible? At the moment I keep getting stuck (errors such as: too many dimensions, the model has reached a saddle point, the model estimation did not terminate normally due to a non-zero derivative... check your starting values... the loglikelihood derivative for parameter .. is -0.86 etc.).

Rick Borst posted on Friday, August 19, 2016 - 1:40 am

I am sorry, this is in the wrong thread. I will ask it in another area.

Bengt O. Muthen posted on Friday, August 19, 2016 - 12:07 pm

Answered in the other spot.

Justine Loncke posted on Tuesday, October 04, 2016 - 2:29 am

Dear all,

Somewhere in this forum it is noted that Mplus calculates factor scores even for cases that have missing data on some of the continuous indicators. I understand that Mplus uses information from the other indicators to estimate the factor scores.

How is this precisely calculated?

More specifically, I want to calculate Bartlett factor scores. Given that Mplus does not estimate these, I use matrix calculations, but I always end up with missing factor scores for the subjects that have some missingness on their indicators.

Thanks in advance.

Bengt O. Muthen posted on Tuesday, October 04, 2016 - 2:16 pm

To estimate the f value for a subject, Mplus maximizes the likelihood function

g(y | f) * g(f)

where the first term factors into a product of univariate densities g(y_j | f). If y_j is missing for a subject, that term doesn't contribute to the estimation of f. I haven't looked at how Bartlett would be done with missing data.
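
For continuous indicators and a single normally distributed factor, the maximizer of g(y | f) * g(f) has a closed form, which makes the role of missing indicators easy to see. The following is a minimal Python sketch under those assumptions, not Mplus's implementation; the function name `map_factor_score` and all parameter values are hypothetical.

```python
def map_factor_score(y, nu, lam, theta, alpha=0.0, psi=1.0):
    """Posterior mode of f for one subject in a one-factor model.

    Assumed model (illustrative): y_j | f ~ N(nu_j + lam_j * f, theta_j),
    f ~ N(alpha, psi). Missing indicators are passed as None and skipped.
    """
    precision = 1.0 / psi          # contribution of the prior g(f)
    weighted = alpha / psi
    for y_j, nu_j, lam_j, theta_j in zip(y, nu, lam, theta):
        if y_j is None:            # a missing y_j adds no term to g(y | f)
            continue
        precision += lam_j ** 2 / theta_j
        weighted += lam_j * (y_j - nu_j) / theta_j
    return weighted / precision


# Hypothetical parameter values for four indicators.
nu = [0.0, 0.0, 0.0, 0.0]
lam = [1.0, 0.8, 0.9, 0.7]
theta = [0.5, 0.6, 0.4, 0.7]

complete = map_factor_score([1.2, 0.9, 1.1, 0.5], nu, lam, theta)
partial = map_factor_score([1.2, None, 1.1, None], nu, lam, theta)
all_missing = map_factor_score([None] * 4, nu, lam, theta)
print(complete, partial, all_missing)
```

A subject with some indicators missing still gets a score from the remaining ones, and a subject with every indicator missing falls back to the factor mean `alpha`. Dropping the `1 / psi` and `alpha / psi` terms gives the maximizer of g(y | f) alone, which for this model is a Bartlett-type estimate based on the observed indicators only; whether that is an appropriate treatment of missing data is not settled here.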

Luc Watrin posted on Thursday, October 13, 2016 - 1:09 am

Dear Drs. Muthen,

is there a way to save factor scores with more than 3 decimal places?

Thanks in advance!

Linda K. Muthen posted on Thursday, October 13, 2016 - 9:52 am

No.

Christoph Weber posted on Tuesday, March 14, 2017 - 7:56 am

Dear Mplus-team,
In the description of factor score estimation in Mplus (Factor scores.pdf) it is mentioned that FS (regression method) used as predictors will yield unbiased regression slopes.

I'm testing a complex moderation model (... F1*F2), where all fully latent approaches (LMS, unconstrained product indicators, ...) show convergence problems (...). So I'm wondering if it would be reasonable to use FS for the predictors (and the product terms) and use a measurement model for the dependent variable.

Christoph Weber

Bengt O. Muthen posted on Tuesday, March 14, 2017 - 6:08 pm

That result holds only for linear models, where the biases in the numerator and denominator of the usual slope formula cancel out.
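
The cancellation in the linear case can be checked with population moments. The sketch below assumes a one-factor model with regression-method scores and an outcome z = beta * f + d whose residual d is independent of the indicators; the function name and all numbers are hypothetical, and this is an illustration rather than anything Mplus computes.

```python
def regression_score_moments(lam, theta, psi=1.0):
    """Population moments of regression-method factor scores, one factor.

    Assumed model (illustrative): y_j = lam_j * f + e_j with Var(f) = psi,
    Var(e_j) = theta_j, and the score f_hat = sum_j w_j * y_j.
    Returns (Var(f_hat), Cov(f_hat, f)).
    """
    info = sum(l * l / t for l, t in zip(lam, theta))
    # Regression-method weights: w_j = psi * lam_j / theta_j / (1 + psi * info)
    w = [psi * l / t / (1.0 + psi * info) for l, t in zip(lam, theta)]
    w_lam = sum(wj * l for wj, l in zip(w, lam))
    var_fhat = psi * w_lam ** 2 + sum(wj * wj * t for wj, t in zip(w, theta))
    cov_fhat_f = psi * w_lam
    return var_fhat, cov_fhat_f


beta = 0.5  # hypothetical true slope of the outcome on the factor
lam = [1.0, 0.8, 0.9, 0.7]
theta = [0.5, 0.6, 0.4, 0.7]

var_fhat, cov_fhat_f = regression_score_moments(lam, theta)
# Slope of z on f_hat: Cov(z, f_hat) / Var(f_hat) = beta * Cov(f, f_hat) / Var(f_hat)
slope_on_scores = beta * cov_fhat_f / var_fhat
print(var_fhat, cov_fhat_f, slope_on_scores)
```

Both moments shrink by the same factor relative to the factor variance, so their ratio recovers `beta`. The argument relies on the outcome being linear in f, which is why it does not carry over to a product term such as F1*F2.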

Christoph Weber posted on Wednesday, March 15, 2017 - 2:24 pm

Thanks a lot! I know that the estimation of the model is quite problematic. It seems to be a question of which approach gives less bias. In this regard, would it be preferable to use FS or a measurement model for the dependent variable?

Bengt O. Muthen posted on Wednesday, March 15, 2017 - 6:05 pm

If you can analyze in a single step, that is best.

AT Jothees posted on Saturday, March 25, 2017 - 1:27 am

Dear all,

I am very new to psychometrics and Mplus, so I am trying to understand the difference between latent scores generated from factor analysis and from IRT.

In my understanding, there is no real difference in terms of the range, -3 to +3. Is this correct?

Thank you very much in advance.

Regards, J

Linda K. Muthen posted on Saturday, March 25, 2017 - 9:31 am

No difference.

Tom Clarke posted on Sunday, June 04, 2017 - 2:37 pm

Dear Professors,

Many thanks for this forum; it is a fantastically useful resource. Could I ask whether there is a way to obtain non-standardised factor scores from Mplus? I am running an analysis using categorical indicators where it would be useful to have raw factor scores output.

Many thanks,
Tom

Bengt O. Muthen posted on Sunday, June 04, 2017 - 4:53 pm

The factor scores are estimated from a model where you can choose the metric of the factor. For instance, the model's factor variance parameter need not be 1, although that seems a natural metric. The factor score estimates themselves are not standardized.

Daria Gerasimova posted on Monday, August 21, 2017 - 12:19 pm

To follow up on the question in the previous post -- is there a way to scale factor scores back to the scale of the indicators? The reason it would be beneficial is that the indicator scale has a meaning (e.g., 1 means Not at all, 4 means A Lot, etc.); thus, scores on that scale are easier to interpret than a number of standard deviations above/below the mean... Thank you in advance.

Daria Gerasimova posted on Monday, August 21, 2017 - 12:25 pm

Also, I wanted to ask about the latent variable scores regression coefficient matrix that I obtain through the CFA analysis -- these are used to create composite scores as far as I understand. Is this matrix different from the factor score coefficient matrix that is produced by the EFA analysis?

If I want to compute composite scores by hand using the factor score coefficient matrix from EFA, I can do so by multiplying coefficients by standardized item values and then adding them together. But this approach doesn't seem to work this way with the latent variable scores regression coefficient matrix from CFA. Why is that?

Bengt O. Muthen posted on Monday, August 21, 2017 - 5:08 pm

Answer to 12:25 post:

They refer to analogous things but the CFA coefficient matrix refers to the unstandardized (raw) data.
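
A small Python sketch of what that difference in metric means for a hand computation. It assumes the score is a weighted sum of mean-centered raw items (factor mean fixed at zero); the weights, means, and SDs are made-up numbers, not output from any Mplus run.

```python
def score_from_raw(y, means, w_raw):
    """Factor score from raw items: sum_j w_raw_j * (y_j - mean_j)."""
    return sum(w * (v - m) for v, m, w in zip(y, means, w_raw))


def score_from_standardized(y, means, sds, w_raw):
    """Same score from z-scores, using rescaled weights w_raw_j * sd_j."""
    z = [(v - m) / s for v, m, s in zip(y, means, sds)]
    w_std = [w * s for w, s in zip(w_raw, sds)]
    return sum(w * zj for w, zj in zip(w_std, z))


# Hypothetical values for three items.
w_raw = [0.30, 0.25, 0.20]     # raw-metric score coefficients
means = [2.5, 3.0, 2.8]
sds = [0.9, 1.1, 1.0]
y = [3.0, 4.0, 2.0]

raw_score = score_from_raw(y, means, w_raw)
std_score = score_from_standardized(y, means, sds, w_raw)
# Mixing metrics: raw-metric weights applied to z-scores gives a different number.
mixed = sum(w * (v - m) / s for v, m, s, w in zip(y, means, sds, w_raw))
print(raw_score, std_score, mixed)
```

Under these assumptions, raw-metric coefficients reproduce the score only when applied to the centered raw items, or after each coefficient is multiplied by the item's SD for use with standardized items.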

Bengt O. Muthen posted on Monday, August 21, 2017 - 5:13 pm

Answer to the 12:19 post:

Your model has a continuous factor so it doesn't give only a limited number of distinct values like your observed ordinal variables. Attempting to get back to the observed scale by some kind of categorization would throw away information. It is true, however, that some measurement instruments like NAEP (google it) bring factor values back to an ordinal scale (basic, etc.), but that is a difficult process that involves understanding which items tend to exceed a certain threshold at which factor value.

My advice - stick with SDs for the latent variable.

Daria G. posted on Monday, August 21, 2017 - 5:28 pm

Thank you so much! That's really helpful.

Just two more quick questions:

1. I know that I can obtain composite scores through EFA (via a number of methods, such as Regression, Bartlett, Anderson-Rubin, etc.) or through CFA. I am a little confused -- which way should I choose?

2. I intend to do a cluster analysis on latent variables. Do I understand correctly that I can do that only using composite scores? In other words, there is no way to do the analysis "in one step"? (CFA + cluster without computing composite scores?)

Bengt O. Muthen posted on Tuesday, August 22, 2017 - 5:59 pm

1. See our FAQ:

Factor scores

2. You can do the clustering together with the CFA in a single analysis. We refer to that as factor mixture modeling - see the Papers section of our web site and also the Topic 5 short course handout and video on our website.

Luo Wenshu posted on Friday, December 22, 2017 - 4:21 pm

Dear Dr. Muthen,

I saved factor scores obtained in CFA for subsequent regression analyses. I found that the correlations between the factor scores are higher than the correlations between the factors obtained in CFA. Can you help explain why? Thank you very much.

Bengt O. Muthen posted on Friday, December 22, 2017 - 4:39 pm

See the FAQ on our website:

Factor scores
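
One way to see how score correlations can exceed the model's factor correlation is to compute the population covariance of regression-method scores for a simple two-factor model. The Python sketch below is an illustration under stated assumptions (standardized factors, simple structure, regression-method scores); the function name and the inputs are hypothetical and are not taken from the FAQ.

```python
def score_correlation(r, m1, m2):
    """Correlation between regression-method scores of two factors.

    Assumed model (illustrative): two standardized factors with
    correlation r, simple structure, and m_k = sum of lam_j**2 / theta_j
    over the indicators of factor k. Uses
    Var(f_hat) = Psi - inverse(inverse(Psi) + diag(m1, m2)).
    """
    det_psi = 1.0 - r * r
    # inverse(Psi) + diag(m1, m2), written out for the 2x2 case
    a11 = 1.0 / det_psi + m1
    a22 = 1.0 / det_psi + m2
    a12 = -r / det_psi
    det_a = a11 * a22 - a12 * a12
    # Posterior covariance of the factors: inverse of the matrix above
    p11, p22, p12 = a22 / det_a, a11 / det_a, -a12 / det_a
    # Covariance matrix of the scores
    v11, v22, v12 = 1.0 - p11, 1.0 - p22, r - p12
    return v12 / (v11 * v22) ** 0.5


modest = score_correlation(0.5, 3.0, 3.0)      # moderately informative indicators
precise = score_correlation(0.5, 1e6, 1e6)     # near-perfect measurement
print(modest, precise)
```

With a factor correlation of 0.5 and m1 = m2 = 3, the score correlation comes out at about 0.61, and it approaches 0.5 as the indicators become more informative.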

Luo Wenshu posted on Saturday, December 23, 2017 - 5:34 am

Thank you, Dr. Muthen. I have read the FAQ and relevant article. Does this mean that regression based on factor scores is never recommended due to potential bias in slope estimation?

Bengt O. Muthen posted on Saturday, December 23, 2017 - 2:47 pm

I would try to avoid it, except as an approximation when you have very good measurement of the factor.

Sarah Phillips posted on Tuesday, March 27, 2018 - 8:45 am

Hi,

I'm running a type=twolevel model and trying to save off the factor scores.

The model is converging fine, and the output looks like it's generating the factor scores. But when I open my text file, it's blank.

Here are the relevant bits of my code:

Variable:
cluster=class;

Analysis:
type=twolevel;

Savedata:
file is 7_3_fscores.txt;
save = fscores;

Am I doing something wrong?

Thanks,

Sarah

Bengt O. Muthen posted on Tuesday, March 27, 2018 - 10:34 am

Please send your output to Support along with your license number.

Michael Strambler posted on Wednesday, April 04, 2018 - 2:09 pm

Hi,
I have generated factor scores for teacher-level CFAs adjusting for school clustering. I did this instead of traditional SEL because I wanted to create school-level values (school n=27) from the teacher factor scores and then use these values in a MLM to predict student-level outcomes, which I can only link to schools (not teachers/classrooms). The CFA items are Likert scales that I'm treating as ordinal. I have two questions:

(1) Is this a reasonable way to handle such data/questions?

(2) I noticed that the factor score mean does not equal zero. Do you know why this is the case?

Here's an example of a CFA model:

USEVAR u1 u2 u3 u4;
CATEGORICAL ARE u1 u2 u3 u4;
CLUSTER=School;
SUBPOP=(TRole EQ 2);
ANALYSIS:
TYPE=COMPLEX;
PROCESSOR = 2;
MODEL:
f BY u1 u2 u3 u4;
OUTPUT:
SAMP STDYX;
SAVEDATA:
FILE IS TEACHERS.dat;
SAVE = FSCORES;

Bengt O. Muthen posted on Wednesday, April 04, 2018 - 3:31 pm

Type=Complex affects only SEs, not parameter estimates and therefore not factor scores. Instead, you can use type=Twolevel and define a factor on between:

fb by u1-u4;

You can then use the estimated factor scores for fb.

Factor scores don't have exactly the same metric as the factors in the model. They also don't have exactly the same relationships to other variables. See e.g. our FAQ: Factor scores

Michael Strambler posted on Wednesday, April 04, 2018 - 7:37 pm

Thank you for clarifying the Type=Complex issue. I previously considered the 2-level approach you mentioned but thought that with a between sample size of 27, it might be too small of a number to generate trustworthy factor scores for a between CFA. My thinking was that doing the CFA at the teacher-level (where I had larger sample size) and aggregating to schools would be more appropriate. But it sounds like taking the two-level approach is reasonable with this size. Is that correct? Would using Bayesian estimation also help with the small size?

Lastly, If I take the two level approach and use the factor scores as predictors, do I interpret the unstandardized output as if it were standardized?

Bengt O. Muthen posted on Thursday, April 05, 2018 - 2:31 pm

Q1: Yes.

Q2: Not really in this case.

Q3: No. Your model-estimated factor variance is probably not 1, nor is the variance of the estimated factor scores.

Dr. Gurpreet Rekhi posted on Thursday, April 26, 2018 - 10:41 pm

Hi, I am doing CFA for a rating scale with 16 items and 5 factors. I need to allow correlation of errors between two of the items to get acceptable fit of the model. When I try to generate factor scores using this model, I get the following message:

THE MODEL CONTAINS A NON-ZERO CORRELATION BETWEEN DEPENDENT VARIABLES.
SUCH CORRELATIONS ARE IGNORED IN THE COMPUTATION OF THE FACTOR SCORES.
THE MODEL COVARIANCE MATRIX IS NOT POSITIVE DEFINITE.
FACTOR SCORES WILL NOT BE COMPUTED. CHECK YOUR MODEL.

Please advise what should I do.

Many thanks
Gurpreet

Bengt O. Muthen posted on Friday, April 27, 2018 - 9:01 am

Please send your output to Support along with your license number.

Lina posted on Friday, May 04, 2018 - 9:47 am

Dr. Muthen,
I am estimating a model using WLSMV and want to compute factor scores. Mplus (version 7.4) output showed:

Factor scores were not computed.
No data were saved.

Here is the command (the model reported favorable fit):

Title: THIS IS AN EXAMPLE OF matched sample;
Data: FILE IS 396ANE.dat;
Variable: Name are NO GENDER ps CARF JOBF STAY WPV OE OI rebo1-rebo22;
IDVARIABLE = NO;
Usev are NO GENDER JOBF STAY WPV OE OI E1-E3;
Categorical=STAY;
Missing=All(-9);
DEFINE: E1=(rebo1+rebo2+rebo3)/3;
E2=(rebo6+rebo8+rebo13)/3;
E3=(rebo14+rebo16+rebo20)/3;
Model: ANE by WPV OE OI;
EE by E1 E2 E3;
STAY on JOBF EE GENDER;
JOBF on EE GENDER;
EE on ANE GENDER;
ANE on GENDER;
Model indirect: STAY ind ANE;
JOBF ind ANE;
output: SAMPSTAT TECH1 TECH4 Stdyx Mod;
SAVEDATA: FILE IS ANE.sav;
SAVE IS fscores;
FORMAT IS free;

Is anything wrong with my "SAVEDATA" command? How can I compute factor scores with such a WLSMV-estimated model?
Any advice will be much appreciated.

Best regards!
Lina

Tihomir Asparouhov posted on Friday, May 04, 2018 - 11:26 pm

This may be due to negative residual variances (Heywood cases) or a factor correlation bigger than 1. If that is the case, simplify the model to avoid the problem. If that is not the case, send your example to support@statmodel.com

Lina posted on Sunday, May 06, 2018 - 4:13 am

On checking, I found no negative residual variances and no factor correlation bigger than 1.

Thank you so much for the advice!

Lina

Bengt O. Muthen posted on Sunday, May 06, 2018 - 3:35 pm

Send your output to Support along with your license number.

mboer posted on Monday, November 05, 2018 - 9:11 am

Dear Prof. Muthén,

I have a longitudinal model with 1 factor at each of three time points (wide format), where there is drop-out. The factor has categorical items.

I want to calculate factor scores from the measures in all three time points using the 'save=fscores' command, and I understand that factor scores are being calculated for drop-out cases when using ML, based on information from previous waves.

However, I noticed that factor scores for drop-out cases are also computed when WLSMV is used. My question is, how are factor scores computed with WLSMV? I know that WLSMV uses pairwise present data in dealing with missing data, but I don't understand how pairwise present information can accommodate factor scores for drop-out cases.

Thank you in advance.

Bengt O. Muthen posted on Monday, November 05, 2018 - 3:52 pm

WLSMV uses pairwise present data to estimate the sample statistics (see Muthen 1984 in Psychometrika) to which the model is fitted. But the factor scores are based on the estimated model parameters (and the data), not on these pairwise sample statistics.

Katharine Buek posted on Thursday, November 29, 2018 - 11:31 am

I have five uncorrelated factors (varimax rotated) and am trying to get factor scores that are similarly uncorrelated. I tried putting the items into a CFA fixing the factor loadings to the values from the EFA output and requesting Fscores. However, I'm still getting factor scores that are correlated > .90!

I'm confused because when I use the factor scoring in Stata, the scores are uncorrelated (r < .1). Why, when I fix the loadings in Mplus, are the scores now so highly correlated?

thanks for any guidance you can offer.

Bengt O. Muthen posted on Thursday, November 29, 2018 - 2:41 pm

CFA typically gets higher factor correlations than EFA due to forcing cross-loadings to be zero.

You can send your EFA factor score output and data to Support along with your license number.

Katharine Buek posted on Friday, November 30, 2018 - 8:40 am

Thank you for your response. To clarify, I'm talking about factor scores, not factors, and the correlations between scores go from <.1 to >.9 in Mplus when I fix the factor loadings to the EFA output values. Would cross-loaders be responsible for such a huge change?

Bengt O. Muthen posted on Friday, November 30, 2018 - 4:40 pm

Try getting the factor scores by using ESEM.

Snigdha Dutta posted on Monday, May 20, 2019 - 10:13 am

I wish to use factor scores obtained from scalar models of 8 of my constructs.
I have exported the factor scores onto SPSS.
I noticed that the factor scores for the constructs are negative. If I have to use them as ON statements, Mplus will not be recognising negative scores. How then can I use the factor scores in my model?

Bengt O. Muthen posted on Monday, May 20, 2019 - 5:26 pm

Mplus does recognize negative scores.

Snigdha Dutta posted on Wednesday, June 19, 2019 - 6:10 am

How can I obtain factor scores using Bayesian estimation?

I know the SAVEDATA command for factor scores, but what command should go under ANALYSIS?

Bengt O. Muthen posted on Wednesday, June 19, 2019 - 12:15 pm

See page 838 of the Version 8 UG and also the paper on our website:

Asparouhov, T. & Muthén, B. (2010). Plausible values for latent variables using Mplus. Technical Report.

Richard E. Zinbarg posted on Thursday, October 24, 2019 - 3:28 pm

I've run a CFA model, using categorical indicators, in a new sample; the model is otherwise identical to one I've run in a different sample, including syntax to obtain factor scores as I've done in the past. In the current analysis, factor scores are not saved and the output includes the following message:
SAVEDATA INFORMATION

Factor scores were not computed.
No data were saved.

Any idea what might be going wrong?
Thanks!

Bengt O. Muthen posted on Thursday, October 24, 2019 - 4:08 pm

There should be a message saying why they weren't computed.

If you can't find it, please send the output to Support along with your license number.

Richard E. Zinbarg posted on Friday, October 25, 2019 - 8:52 am

thanks Bengt, I found the message and know what I can try to correct this

Susan South posted on Thursday, May 28, 2020 - 10:24 am

Hello. I wanted to confirm that if I run a one factor CFA with continuous indicators and set the latent factor to 1.0, if I extract factor scores they are standardized. If that's correct, why am I getting a SD of less than 1.0 when I extract the factor scores to SPSS?

Bengt O. Muthen posted on Saturday, May 30, 2020 - 11:38 am

Fixing the factor variance at 1 does not mean that the factor scores will have variance 1 (nor mean zero if the factor mean is fixed at zero). It is well known that factor scores don't behave exactly like factors. See e.g. our FAQ:

Factor scores
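
A quick simulation illustrates the point about the variance. This Python sketch assumes a one-factor model with continuous indicators, factor variance fixed at 1, known parameters, and regression-method scores; the function name, loadings, residual variances, sample size, and seed are all hypothetical choices.

```python
import random
import statistics


def simulate_score_sd(lam, theta, n=20000, seed=1):
    """Simulate a one-factor model with Var(f) = 1 and return
    (SD of the true factor values, SD of the regression-method scores).
    """
    rng = random.Random(seed)
    info = sum(l * l / t for l, t in zip(lam, theta))
    # Regression-method weights for Var(f) = 1
    w = [l / t / (1.0 + info) for l, t in zip(lam, theta)]
    f_true, f_hat = [], []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)
        y = [l * f + rng.gauss(0.0, t ** 0.5) for l, t in zip(lam, theta)]
        f_true.append(f)
        f_hat.append(sum(wj * yj for wj, yj in zip(w, y)))
    return statistics.pstdev(f_true), statistics.pstdev(f_hat)


lam = [0.7, 0.7, 0.7, 0.7]        # hypothetical standardized loadings
theta = [0.51, 0.51, 0.51, 0.51]  # matching residual variances
sd_true, sd_hat = simulate_score_sd(lam, theta)
print(sd_true, sd_hat)
```

In this setup the scores are shrunken toward the factor mean, so their SD is noticeably below the SD of the factor itself; the gap narrows as the indicators measure the factor more precisely.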

Rachel Liebman posted on Monday, June 08, 2020 - 9:17 am

Hello, I'd like to save the variances from an unconditional growth model, along with the parameter estimates. When I use the SAVEDATA = FSCORES command, it only seems to save the slope and intercept mean and standard error. Is there a way to save the variance estimates too?

thanks

Bengt O. Muthen posted on Monday, June 08, 2020 - 5:15 pm

When you say "variances", I assume that you mean the model-estimated variances. They can be saved like all parameter estimates using the Savedata command and Results = file;

Lian van Vemde posted on Sunday, October 11, 2020 - 12:04 pm

Hi!

I used the save data command to save my factor scores obtained in the alignment method for measurement invariance.

However, Mplus only seems to do this for half of my sample (half of the sample in each group).

So when I want to merge this with the rest of my dataset for subsequent analyses, I have lots of missing data...

Is there a way to solve this?

Tihomir Asparouhov posted on Monday, October 12, 2020 - 9:51 am

This can happen in the analysis if all factor indicators are missing. The factor score value in that case would be the factor mean which one could enter manually. Alternatively you can do this trick: add a non-missing indicator in the data - any random number would do or existing data column that doesn't have missing values - and add that as an indicator to the factor and fix the loading to 0.

Lian van Vemde posted on Monday, October 12, 2020 - 9:57 am

Thanks for the answer!

I also already figured it out later. I had extra variables listed that were no longer in the data set and caused the problem.

Richard E. Zinbarg posted on Sunday, October 18, 2020 - 3:21 pm

I have been saving factor scores for a longitudinal study in which the measurement model is constrained to be scalar invariant at the later time points with respect to the first time point. The factor scores saved just fine for some of the waves, but for other waves Mplus is systematically omitting the factor scores for several of my subjects. At one wave, for example, I have data for 241 subjects but Mplus says that the number of observations equals 237 and only saves factor scores for 237 subjects. Any help you can provide will be most appreciated - given that we have data for these subjects at these time points, we would like to include their factor scores in our analyses. Many thanks!

Richard E. Zinbarg posted on Sunday, October 18, 2020 - 3:25 pm

I think you can ignore my last post. Reading the previous few led me to examine my data more closely and it looks like we have an extra variable in the data set

Richard E. Zinbarg posted on Sunday, October 18, 2020 - 3:29 pm

I was wrong. There are 101 variables in the data set in addition to the id# and that is how many I name on my variables name statement. I am baffled as to why Mplus is only giving me factor scores for 237 of my subjects rather than for all 241.

Bengt O. Muthen posted on Sunday, October 18, 2020 - 4:20 pm

We need to see your output and data to diagnose this - send to Support along with your license number.

Richard E. Zinbarg posted on Sunday, October 18, 2020 - 6:17 pm

False alarm Bengt - there were a few missing values that were left blank rather than being converted to the missing data code. When I corrected that, it solved the problem. Many thanks for the speedy reply and sorry to have troubled you for no good reason!