

Given the following factor model:

CATEGORICAL = ANSWER1 ANSWER2 ANSWER3 ANSWER5 ANSWER6 ANSWER7 ANSWER8;
CLUSTER = id;
ANALYSIS: TYPE IS TWOLEVEL;
ESTIMATOR IS BAYES;
Model:
%within%
F1 BY answer1* answer2 answer3;
F2 BY answer4* answer5 answer6;
F1@1 F2@1;
F1 WITH F2;
F1 WITH F3;
F2 WITH F3;
%between%
answer1-answer5 with answer6;
answer1-answer3 with answer5;
answer1-answer2 with answer3;
answer1 with answer2;

I generated 5 multiple imputations using Bayesian estimation with plausible factor scores saved. Using the 5 imputed data sets, I ran a multilevel CFA with WLSMV, repeating the same model that was used for the imputation. I want to examine how these plausible factor scores correspond to the behaviour of the true scores from my factor model. Where I am confused is in how to go about reproducing the within-level factor correlations from the CFA model using my plausible factor scores. I would like to see how closely the plausible factor score correlations correspond to the correlations between the latents in my within-level CFA. Any advice on how to proceed would be very welcome.


The line of syntax reading F2 WITH F3; is an error; please ignore it.


It seems like there are at least 3 ways to find the correlation between the 2 factors. (1) The two-level Bayes run that you show gives a Bayes factor correlation estimate. (2) The WLSMV run based on the imputed data gives a WLSMV factor correlation estimate. (3) The plausible values that your Bayes run generates can be used to compute the factor correlation. As for (3), you get N*n factor scores for each factor, where N is the number of plausible draws and n is your sample size. You get the factor correlation from those N*n factor scores. Also, take a look at this paper on our website: Asparouhov, T. & Muthén, B. (2010). Plausible values for latent variables using Mplus. Technical Report.
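Approach (3) above amounts to stacking all N*n plausible draws and correlating the two columns. A minimal sketch in Python, where the draws are simulated stand-ins for the plausible-value data the Bayes run saves, and all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N plausible-value draws for each of n people on 2 factors.
# In a real analysis these come from the saved plausible-value data sets;
# here we just simulate correlated draws for illustration.
N, n = 5, 200                       # N = number of draws, n = sample size
true_corr = 0.4
cov = np.array([[1.0, true_corr], [true_corr, 1.0]])
draws = rng.multivariate_normal([0, 0], cov, size=(N, n))  # shape (N, n, 2)

# Stack all N*n factor scores per factor and correlate the two factors.
f1 = draws[:, :, 0].ravel()
f2 = draws[:, :, 1].ravel()
r = np.corrcoef(f1, f2)[0, 1]
print(round(r, 2))   # close to the generating correlation of 0.4
```

With only N = 5 draws per person the estimate is noisy for small n; the stacked correlation stabilizes as N*n grows.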


Thanks Bengt, I think I didn't explain myself clearly. I was interested in examining how the plausible factor score estimates correlate with their true scores, that is, the within-level factors from my CFA. Is there any reason why I could not enter the plausible factor scores into my multilevel CFA as covariates (as below) in order to see how well they correlate with their respective true factor scores?

Model:
%within%
F1 BY answer1* answer2 answer3;
F2 BY answer4* answer5 answer6;
F1@1 F2@1 PlausibleF1@1 PlausibleF2@1;
F1 WITH F2;
F1 WITH PlausibleF1; (correlation of plausible factor score with true factor score)
F2 WITH PlausibleF2; (correlation of plausible factor score with true factor score)
PlausibleF1 WITH PlausibleF2; (correlation of the plausible factor scores)
%between%
answer1-answer5 with answer6;
answer1-answer3 with answer5;
answer1-answer2 with answer3;
answer1 with answer2;
PlausibleF1 WITH answer1 to answer6;
PlausibleF2 WITH answer1 to answer6;

Any advice on how to proceed would be very welcome.


I am not sure that a high correlation from that approach is indicative of high-quality plausible values. I would instead look at how the plausible values for the two factors relate to each other and to other variables, in comparison to estimating those quantities directly in the model. If you are doing a Monte Carlo study, you could see how true scores (generated factor values) compare to plausible values.


I agree that looking at how the plausible values for the two factors relate to other variables, in comparison to estimating those quantities directly in the model, is useful. In spite of having a good-fitting model, I need to provide day-to-day users of the instrument (e.g., clinicians) with confidence that a scoring method has correlational accuracy and univocality, and that the scoring method (factor score estimates or plausible values) correlates with the true scores. This is because even for highly determinate factors (which mine are) I could still end up choosing a poor set of factor score estimates; these need to be evaluated somehow. On factor score estimates, Nunnally writes: "If the multiple correlation [the proportion of determinacy in the factor] is less than .70, one is in trouble. In that instance the error variance in estimating the factor would be approximately the same as the valid variance. At a very minimum, one should be quite suspicious of factor estimates obtained with a multiple correlation of less than .50, because in that case less than 25 percent of the variance of factor scores can be predicted from the variables. Then one could not trust the variables as actually representing the factor." (1978, p. 426). Is it that the method I suggested is technically incorrect for answering the question, or is it that you think it is not a useful test, or not as useful as the other method you proposed?
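The multiple correlation Nunnally refers to, the factor determinacy of regression-method scores, can be checked directly from the model-implied covariance matrix. A minimal one-factor sketch with hypothetical standardized loadings:

```python
import numpy as np

# Factor determinacy for regression-method scores of a single factor:
# rho^2 = lambda' Sigma^{-1} lambda, with Sigma the model-implied covariance
# matrix of the indicators.  Loadings here are hypothetical.
lam = np.array([0.7, 0.6, 0.5])             # assumed standardized loadings
psi = 1.0 - lam**2                          # unique variances
sigma = np.outer(lam, lam) + np.diag(psi)   # model-implied covariance matrix

rho = np.sqrt(lam @ np.linalg.solve(sigma, lam))
print(round(rho, 3))   # about .81, just above Nunnally's .70 threshold
```

So even moderate loadings of .5-.7 on three indicators land near the borderline of Nunnally's rule of thumb; more or stronger indicators raise the determinacy.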


Are you interested in the quality of the scores in terms of the quality of individuals' scores, or in terms of the quality of summary measures and relationships with other variables? Section 4.2 of this paper relates to the latter: http://www.statmodel.com/download/Plausible.pdf I also recommend reading von Davier, M., Gonzalez, E. & Mislevy, R. (2009). What are plausible values and why are they useful? IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments. IER Institute, Educational Testing Service. If you can't find it, I am happy to send it.


Both, actually, but for the moment the former: the quality of scores in terms of individuals' scores. I tried regression weighting methods using the factor score coefficient matrix generated from a within-level polychoric correlation matrix with ML estimation, and several unit-weighted and coarse factor score methods, but these all seem to perform very poorly using the method I pasted as syntax. The factor score estimates all correlate with their true scores no higher, and often a lot lower, than r = .65 (that's r, not r-squared). I take it from your question that plausible values may not be suitable for individual use? I suppose this would make sense if we're using imputations, as these are designed to estimate population parameters and standard errors, not individuals' responses. I will continue to read on the subject of plausible values, but I'm still stuck with the problem of having a good approximate fit but no method for users to score the instrument with confidence. Few people ever seem to bother evaluating their factor scoring methods for instruments, so I wanted to make an extra effort to learn about it.


I think plausible values have an advantage over regular estimated factor scores in that each individual gets a distribution of values, so that the uncertainty is clearly presented and the shape of the distribution is clear (an estimated factor score for an individual at most gets an SE). This is useful to have when you compute the variance over people and when you compute the relationship to other variables (see section 4.2 of our imputation paper that I referred to). I don't know that I have seen that the average plausible value for a certain individual is any better than the estimated factor score for that individual. Factor scores (and plausible values) have different strengths and weaknesses for different uses, as was discussed already by Tucker in the 1950s. For example, ranking individuals is one use, regression another. For a recent discussion, see Skrondal, A. and Laake, P. (2001). Regression among factor scores. Psychometrika, 66, 563-575. I think the evaluation approach you suggest may have a variation on the Heisenberg problem: in evaluating the plausible values you alter the factor meaning. The factors become determined also by the F WITH Plausible; statements, which is not what you want. Again, to really see how well the factor scores or plausible values work in the model and intended use situation, you want to do a Monte Carlo study so that you can compare the true, generated scores with your estimated ones.
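The Monte Carlo check suggested above can be sketched as follows. The loadings are hypothetical, and continuous indicators with regression-method scores are used for simplicity; a real study would generate categorical items from the fitted multilevel model.

```python
import numpy as np

# Monte Carlo sketch: generate true factor scores, generate item responses
# from them, estimate factor scores, and correlate estimates with the known
# true scores.  All parameter values are hypothetical.
rng = np.random.default_rng(1)
n = 5000
lam = np.array([0.7, 0.6, 0.5])             # assumed standardized loadings
psi = 1.0 - lam**2                          # unique variances

eta = rng.standard_normal(n)                # true (generated) factor scores
y = np.outer(eta, lam) + rng.standard_normal((n, 3)) * np.sqrt(psi)

# Regression-method factor scores: eta_hat = lambda' Sigma^{-1} y
sigma = np.outer(lam, lam) + np.diag(psi)
w = np.linalg.solve(sigma, lam)             # factor score weights
eta_hat = y @ w

r = np.corrcoef(eta, eta_hat)[0, 1]
print(round(r, 2))   # approaches the factor determinacy (about .81 here)
```

The true-score correlation recovered this way converges to the factor determinacy, which ties this check back to the Nunnally criterion discussed earlier in the thread.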


Thank you Bengt, I appreciate your thoughts. Jonathon


We asked people to evaluate 3 different profiles across 12 items (2x6) with wording corresponding across the items. We want to treat each set of 6 items as a latent variable and compare scores across the 3 profiles. We tried using example 11.7 to get factor scores but got an error message. Any guidance with this would be much appreciated, many thanks, Caoimhe 


Please send your output and license number to support@statmodel.com. 


hi all, i'm in receipt of a review in which the reviewer states (in part): "The authors also fail to mention known limitations of the three-step approach to examining relationships among latent variables (i.e., known biases in the effect estimates for the hierarchical regressions conducted on factor scores; see recent papers by Skrondal & Laake, and Lu & Thomas, and further back Tucker, 1971)." what i actually did was output plausible values based on the entire item/latent factor set, allowing the latent variables to correlate. for various reasons we then conducted regressions on the medians of those plausible values. thus, we did not use the three-step approach i've seen described elsewhere. further, as far as i can tell from the 2010 paper on the website about plausible values, the opinion on the mplus side would seem to be that what we did should be basically ok, leading to relatively unbiased estimates in the resulting regressions. i am not 100% sure of my interpretation, though. any input from anyone knowledgeable in this area would be much appreciated. i'm about to use this basic approach again, and don't want to commit an error multiple times!


The advantage of having plausible values is to treat them as multiple imputations, that is, perform m repeated analyses with them when there are m plausible value data sets. It sounds like what you did lost that advantage. Factor scores, essentially only one draw instead of m, cause biases as the reviewer states. See further the plausible value writings we refer to in our plausible value paper among our posted Bayesian papers. 
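The "m repeated analyses" logic described above can be illustrated with Rubin's pooling rules. The estimates and within-imputation variances below are made-up numbers, standing in for one parameter estimated once per plausible-value data set:

```python
import numpy as np

# Rubin's rules: analyze each of the m plausible-value data sets separately,
# then pool.  These per-draw estimates and squared SEs are illustrative only.
estimates = np.array([0.38, 0.42, 0.40, 0.44, 0.36])       # estimate per draw
variances = np.array([0.004, 0.005, 0.004, 0.006, 0.005])  # squared SE per draw

m = len(estimates)
pooled = estimates.mean()               # pooled point estimate
within = variances.mean()               # within-imputation variance
between = estimates.var(ddof=1)         # between-imputation variance
total = within + (1 + 1/m) * between    # Rubin's total variance
pooled_se = np.sqrt(total)
print(round(pooled, 3), round(pooled_se, 3))
```

The between-imputation term is exactly what is lost when the draws are collapsed to a single score (or a median) before analysis: the pooled SE then understates the uncertainty that the plausible values were meant to carry.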


thanks, that's helpful. in this case we already had MI data sets (5) in which we had tested factor structure using WLSMV and type=imputation. we ultimately analyzed, through type=imputation, 5 median plausible value data sets, one from each of the MI data sets. thus, there were 5 draws rather than 1, although 1 draw from each of 5 data sets imputed through other means (amelia II, as it happens). would the resulting factor scores still be biased? your answer is cuing me to realize that there are probably better, or at least more elegant, ways to do what we did, but it would still be helpful to know how far off we were, exactly, to begin with. thanks!


The answers are somewhat complex, but found in those plausible values references: [4] Mislevy, R., Johnson, E., & Muraki, E. (1992). Scaling procedures in NAEP. Journal of Educational Statistics, 17(2), Special Issue: National Assessment of Educational Progress, pp. 131-154. [6] von Davier, M., Gonzalez, E. & Mislevy, R. (2009). What are plausible values and why are they useful? IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments. IER Institute, Educational Testing Service. There are probably other refs as well.


thanks for the specific references! 


Here are 2 more:

1. The effect of not using plausible values when they should be: An illustration using TIMSS 2007 grade 8 mathematics data. DRAFT, June 1, 2010. Ralph Carstens, IEA Data Processing and Research Center, ralph.carstens@iea-dpc.de; Dirk Hastedt, IEA Data Processing and Research Center, dirk.hastedt@iea-dpc.de.

2. Predictive Inference Using Latent Variables with Covariates (under review, Psychometrika). Lynne Steuerle Schofield, Department of Mathematics and Statistics, Swarthmore College; Brian Junker, Department of Statistics, Carnegie Mellon University; Lowell J. Taylor, Heinz College, Carnegie Mellon University; Dan A. Black, Harris School, University of Chicago. June 26, 2014. Abstract: Plausible values (PVs) are a standard multiple imputation tool for analysis of large education survey data that measures latent proficiency variables. When latent proficiency is the dependent variable, we reconsider the standard institutionally generated PV methodology and find it applies with greater generality than shown previously. When latent proficiency is an independent variable, we show that the standard institutional PV methodology produces biased inference because the institutional conditioning model places restrictions on the form of the secondary analysts' model. We offer an alternative approach that avoids these biases based on the mixed effects structural equations (MESE) model of Schofield (2008).
