

Dear Dr Muthen, I am planning to purchase Mplus for my doctoral research, but I have a few questions before making the decision. I have almost 25 variables that I plan to use for poverty modelling in my research area. These variables are continuous, ordinal and nominal, so I could not use traditional factor analysis in SPSS. Do you think Mplus is the right software for mixed data like this? If yes, how does it create the correlation matrix? Is it possible to create a score on each factor for each individual household, to decide who is poor and who is not in my analysis? Can you also do cluster analysis with Mplus? Many thanks, Gaurav


Factor indicators in Mplus can be continuous, censored, binary, ordered categorical (ordinal), counts, or combinations of these variable types. Nominal factor indicators would need to be turned into a set of binary variables. The sample statistics depend on the scale of the variables. Factor scores are available. Cluster analysis is available via mixture modeling. 
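
Turning a nominal indicator into a set of binary variables can be done while preparing the data file. A minimal Python sketch of that recoding, assuming a made-up nominal variable `roof_type`; the variable, its categories, and the helper `to_binary_indicators` are illustrative only and are not part of Mplus:

```python
def to_binary_indicators(values, categories):
    """Return one 0/1 column per category, keyed by category name."""
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical nominal variable with three unordered categories.
roof_type = ["thatch", "tin", "tile", "tin", "thatch"]
indicators = to_binary_indicators(roof_type, ["thatch", "tin", "tile"])

for name, column in indicators.items():
    print(name, column)
```

Whether all of the binary variables are kept or one category is left out as a reference is a modelling decision that this sketch does not address.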


Dear Dr. Muthen, Thanks for your reply. Could you also tell me how EFA in Mplus deals with missing values in both continuous and ordinal variables? Thanks


For continuous variables, it is full-information maximum likelihood. For binary and ordinal variables, it is pairwise present.
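
"Pairwise present" means that each pair of variables uses the cases observed on both members of the pair, so different pairs can rest on different subsets of the sample. A small Python sketch of the bookkeeping, assuming missing values are coded as `None`; the data and the helper `pairwise_present_counts` are invented for illustration:

```python
from itertools import combinations


def pairwise_present_counts(rows, names):
    """Count, for every pair of variables, the cases with both values observed."""
    counts = {}
    for i, j in combinations(range(len(names)), 2):
        n = sum(1 for row in rows if row[i] is not None and row[j] is not None)
        counts[(names[i], names[j])] = n
    return counts


names = ["u1", "u2", "u3"]
rows = [
    (1, 0, None),
    (0, None, 1),
    (1, 1, 1),
    (None, 0, 0),
    (0, 0, 1),
]

counts = pairwise_present_counts(rows, names)
# Cases with no missing value at all, which is what listwise deletion would keep.
complete_cases = sum(1 for row in rows if None not in row)

print(counts)
print("complete cases:", complete_cases)
```

In this toy data every pair has three usable cases, while only two cases are complete on all three variables.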


Thanks. I have a few more questions: How are the factor scores calculated in Mplus? How could I obtain factor scores after performing EFA on a group of continuous, ordinal and dichotomous variables? Could I then perform cluster analysis on the estimated factor scores in Mplus? Many thanks, Gaurav


See Technical Appendix 11, which is available on the website. Factor scores are available only through CFA in Mplus at this time. It sounds like you want to do a factor mixture model. Doing it in one step is superior to doing it in two steps. See Example 7.20.


Thanks. The ordinal variables that I have to analyse have different numbers of levels: 3, 4 or 5. So will I have to standardise the data before doing EFA in a CFA framework? If not, where does the standardisation take place to account for the different numbers of levels? Is the EFA in a CFA framework for ordinal and continuous data done on the covariance matrix or the correlation matrix? What type of correlation is used to create the matrix for this kind of nonnormal data? What are the assumptions underlying this kind of analysis? Regards, Gaurav


You should not standardize categorical variables. The numbers represent categories. They have no numeric value. The measure of association used in model estimation takes into account the nature of the variable, for example, it is a Pearson correlation for two continuous variables, a tetrachoric correlation for two binary variables, a polychoric correlation for two ordered polytomous variables, etc. There are several references on the website under the heading Categorical Outcomes that discuss estimation and assumptions for methods for categorical outcomes. 
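
As a compact restatement of the rule above, here is a Python lookup from a pair of variable types to the name of the association measure. The first three entries come from the reply; the entries for mixed pairs are my assumption, based on the conventional textbook names, and are not taken from the reply or from Mplus documentation. The function name is hypothetical.

```python
def association_measure(type_a, type_b):
    """Name the correlation conventionally used for a pair of variable types.

    Each type is "continuous", "binary", or "ordinal".
    """
    table = {
        frozenset(["continuous"]): "Pearson",
        frozenset(["binary"]): "tetrachoric",
        frozenset(["ordinal"]): "polychoric",
        # Assumed conventional names for mixed pairs (not stated in the thread).
        frozenset(["binary", "ordinal"]): "polychoric",
        frozenset(["continuous", "binary"]): "biserial",
        frozenset(["continuous", "ordinal"]): "polyserial",
    }
    return table[frozenset([type_a, type_b])]


print(association_measure("continuous", "continuous"))
print(association_measure("binary", "binary"))
print(association_measure("ordinal", "continuous"))
```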


I noticed that the new version of Mplus gives two new outputs in EFA: factor determinacy and factor structure. Could you please explain their significance and how to use them? Thanks, Gaurav


The factor score determinacy ranges from zero to one and describes how well the factor is measured with one being the best value. The factor structure matrix shows the correlation between the items and the factors. This indicates which items measure the factors best. 
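
To make the zero-to-one range concrete, here is a Python sketch of the textbook determinacy for regression-method factor scores in a one-factor model. It assumes continuous indicators, a factor variance of 1, and uncorrelated residuals, in which case the determinacy reduces to sqrt(s / (1 + s)) with s equal to the sum of loading squared over residual variance. This is a standard formula under those assumptions, not a description of how Mplus computes its output, and the loadings below are made up.

```python
import math


def determinacy_one_factor(loadings, residual_variances):
    """Correlation between regression-method factor scores and the factor.

    Assumes one factor with variance 1, continuous indicators, and
    uncorrelated residuals (a textbook setup, not taken from Mplus).
    """
    s = sum(lam * lam / theta for lam, theta in zip(loadings, residual_variances))
    return math.sqrt(s / (1.0 + s))


# Hypothetical standardized solutions: three weak items versus three strong items.
weak = determinacy_one_factor([0.4, 0.4, 0.4], [0.84, 0.84, 0.84])
strong = determinacy_one_factor([0.9, 0.9, 0.9], [0.19, 0.19, 0.19])

print(round(weak, 3), round(strong, 3))
```

With these numbers the weakly measured factor has a determinacy of about 0.60 and the strongly measured one about 0.96.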


Dear Linda, I just did my first CFA modelling in Mplus. The very first problem is that I get WARNING: VARIABLE Q2 MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS. Q2 is an income variable. It is a continuous variable. Please advise me what to do. Regards


You should send your input, data, output, and license number to support@statmodel.com. You may be reading your data incorrectly. 


Dear Linda, I hope you got the data file I emailed you. When I run EFA with the same data file I get a perfect result. However, CFA gives the same message: WARNING: VARIABLE Q2 MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS. Regards


Another quick question. When I was doing EFA with all my 23 variables, I got a message about a non-positive definite matrix. So I tried doing EFA with a small number of variables, adding one at a time to see which variable was causing the problem. I found that the analysis goes smoothly until I have added 20 variables, and when I added the 21st variable I got the same message again. So I dropped the 21st variable and added the 22nd and 23rd variables, and got the same message. Then I removed one variable from the first 20 in my list and added the 21st, 22nd or 23rd variable, and the analysis showed no error. Does that mean the number of variables affects whether the matrix is positive definite? If not, what could be the reason? Regards


With categorical outcomes, the sample correlations are estimated pairwise. This can result in a nonpositive definite matrix. It is not a direct function of the number of variables although as the number of variables increases, the probability of this occurring increases. It does not affect the results. 
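
The reason pairwise estimation can produce such a matrix is that nothing forces separately estimated correlations to be jointly consistent. A short Python sketch that checks positive definiteness by attempting a Cholesky factorisation; both matrices are invented for illustration, and the first is deliberately extreme so that the inconsistency is easy to see:

```python
import math


def is_positive_definite(matrix):
    """Attempt a Cholesky factorisation; it succeeds only for positive definite input."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    return False  # a non-positive pivot means the matrix is not positive definite
                lower[i][j] = math.sqrt(acc)
            else:
                lower[i][j] = acc / lower[j][j]
    return True


# Hypothetical correlations, each legal on its own but impossible together:
# variable 2 cannot correlate 0.9 with both variable 1 and variable 3
# while variables 1 and 3 correlate -0.9.
inconsistent = [
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
]
consistent = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]

print(is_positive_definite(inconsistent))
print(is_positive_definite(consistent))
```

Every additional variable adds a row of pairwise estimates that must fit with all the existing ones, which is in line with the point that the chance of the problem grows with the number of variables.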

yang posted on Thursday, September 20, 2007  12:17 pm



Drs. Muthen, I am doing a CFA on a set of binary (0/1) variables for a unidimensional structure. I got the factor scores for each of the subjects, and I assume that Mplus is calculating the factor scores based on the tetrachoric correlation matrix instead of Pearson correlation matrix. However, I am not 100% sure. Would you mind kindly confirming this assumption? Thanks. 


Technical Appendix 11 on the website describes factor score estimation. 

robertav posted on Friday, September 21, 2007  6:43 am



Dear authors, I'm carrying out a simple EFA with 9 ordinal indicators. With the 2-factor solution I obtain this result:
FACTOR DETERMINACIES
               1         2
           ________  ________
    1        1.002     0.939
You said in this topic that the factor score determinacy ranges from zero to one and describes how well the factor is measured, with one being the best value. Why do I obtain the value 1.002? And how is the factor determinacy calculated? Many thanks


I'm not sure how you got factor determinacies with EFA because I don't think they are available with EFA. Please send your input, data, output, and license number to support@statmodel.com. Note that factor determinacies are available for only continuous outcomes so I think you may not be using the CATEGORICAL option of the VARIABLE command to specify that your outcomes are categorical. 


Dear Dr. Muthen, I would greatly appreciate your help with the following question: How is the latent score in CFA calculated for observations with missing values on one or more indicators? Brief background: I run a simple CFA model with 5 binary indicators. I export the factor score and use it in regressions (I know it would be better/more efficient to estimate a full SEM in Mplus). I read the technical appendix 11 and googled extensively for answer but couldn't find it. Many thanks in advance for your reply, Anna Zajacova 


The latent variable score for an individual is computed using the posterior distribution of the latent variable. This is based on (1) the model and (2) the data for the person. (1) The estimated model parameters take missing data into account by the usual approach of ML under "MAR", that is, using all available data. (2) The data for the person appears in the posterior for each variable that is observed for this person; missing indicators don't contribute.


Bengt, Thank you so much for your answer. Would it be correct to say that the estimation effectively imputes the value of the missing indicator(s) for a given individual based on covariances of the nonmissing observations from other individuals plus the observed data for the individual, and then calculates the factor score based on the observed values of the nonmissing indicators and the 'imputed' values of the missing indicators? Many thanks in advance, Anna 


Estimated factor scores are obtained from the posterior = prior + data. The prior becomes the estimated model using all available data. Data is the observed data for the individual. So instead of your statements, I would say that for a person with missing data on some of the indicators, the estimated model (the prior) is relied on more due to the missing data. There is never an imputation done (although conceptually one might think of that) for either the model estimation step or the factor score estimation step. The important missing data aspect comes in during estimation of the model parameters; that's when you draw on information from correlations with variables without missing to estimate parameters for variables with missing (MAR theory).
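
The "relied on more" point can be illustrated with a simplified case that has a closed form: one normal factor and continuous indicators with normal residuals. The binary-indicator model discussed here does not have such a closed form, so this Python sketch is only an analogy. The function name and all parameter values are hypothetical, and the parameters are treated as already estimated.

```python
def posterior_factor_score(y, intercepts, loadings, residual_variances,
                           prior_mean=0.0, prior_variance=1.0):
    """Posterior mean and variance of a normal factor given observed indicators.

    Entries of y that are None are treated as missing and simply skipped,
    so no value is ever imputed for them.
    """
    precision = 1.0 / prior_variance
    weighted = prior_mean / prior_variance
    for value, nu, lam, theta in zip(y, intercepts, loadings, residual_variances):
        if value is None:
            continue  # a missing indicator contributes nothing to the posterior
        precision += lam * lam / theta
        weighted += lam * (value - nu) / theta
    return weighted / precision, 1.0 / precision


# Hypothetical estimated parameters for four indicators.
intercepts = [0.0, 0.0, 0.0, 0.0]
loadings = [0.8, 0.8, 0.8, 0.8]
residual_variances = [0.36, 0.36, 0.36, 0.36]

full_mean, full_var = posterior_factor_score(
    [1.0, 1.0, 1.0, 1.0], intercepts, loadings, residual_variances)
partial_mean, partial_var = posterior_factor_score(
    [1.0, 1.0, None, None], intercepts, loadings, residual_variances)

print(round(full_mean, 3), round(full_var, 3))
print(round(partial_mean, 3), round(partial_var, 3))
```

With two of the four indicators missing, the score moves from about 1.10 toward the prior mean of zero (about 0.98) and the posterior variance grows from about 0.12 to about 0.22.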


Dear Bengt, Thank you again for your answer; things are crystal clear now. Your reply is much appreciated! Anna


Hello Linda, I just wanted to see if the statements below still hold. I am getting factor determinacies with EFA specifying it as categorical data. If they do, then I suppose I'm doing something wrong. Also, could you point to a reference as to why a nonpositive definite matrix will not affect the results in Mplus? Thank you! Tom
"I'm not sure how you got factor determinacies with EFA because I don't think they are available with EFA. Please send your input, data, output, and license number to support@statmodel.com. Note that factor determinacies are available for only continuous outcomes so I think you may not be using the CATEGORICAL option of the VARIABLE command to specify that your outcomes are categorical."


The statement below is no longer valid. 


Dear Linda! I would like to come back to Thomas' request. I have nested data and ran an EFA based on the estimated between correlation matrix (using version 5.1). I specified two factors. Like Thomas, I got factor determinacies greater than 1. What does that mean? One of my items has a negative error variance. Might that be the problem? If yes, how can I handle that? Thank you very much in advance! Janine


A negative error variance makes the results inadmissible so the results are not interpretable. You might want to try the new EFA feature in the MODEL command that came out with Version 5.1. See the Version 5 Examples and Language Addendums on the website. If the negative residual variance is small and not significant, you could fix it at zero.

dkim posted on Tuesday, February 03, 2009  1:30 pm



Dear Linda, I have 1015 dichotomous variables. I tried to run both EFA with 1 factor using ML and CFA with 1 factor using both the default estimator and ML. All three analyses ran without any errors. In SPSS with the ML extraction method, after the EFA run, I can get a residual matrix (observed minus model-predicted). In Mplus I can get this from CFA with the default estimator (WLSMV) but not with the ML estimator. Is there any way I can get the residual matrix, either from values in the Mplus output or by specifying Mplus options in the input file? I have read the manual but I can't find the info. Thank you


In SPSS, the variables are treated as continuous. If you treat them as continuous in Mplus, you will obtain residuals also. In Mplus, treating the variables as categorical with maximum likelihood estimation requires numerical integration. Sample statistics are not sufficient for model estimation. The raw data are used. 


I'm trying to produce a correlation matrix for a large number of model variables, some of which are continuous (factor scores) and some of which are ordinal (items not included in the factor scores). When I run Analysis: TYPE=BASIC with both continuous and categorical variables included, M+ outputs only the continuous variables in the resulting correlation matrix. Is there a way to produce a correlation matrix that includes both continuous and categorical variables?


I just did a TYPE=BASIC with continuous and categorical outcomes and I get a correlation matrix containing all variables. Please send your output and license number to support@statmodel.com. 


Hello, Doing EFA in SPSS lets us suppress small loadings, below .40 according to some references. What is the equivalent value in Mplus? In other words, what is the cutoff value for retaining or excluding factors in EFA using Mplus? One more question: do I have to run the EFA in Mplus, or can I rely on the factors from the SPSS factor analysis, especially since they show a good fit when I enter them in a CFA in Mplus? Thanks,


Mplus gives you estimated standard errors in order to decide which loadings are significant or ignorable. No arbitrary cutoff is necessary. I would do the EFA in Mplus for the reason above and also because Mplus allows Geomin rotation and modification indices. See also the recent article on ESEM: Asparouhov, T. & Muthén, B. (2009). Exploratory structural equation modeling. Structural Equation Modeling, 16, 397-438, which you can find on our web site.
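
For readers used to a fixed cutoff, here is a Python sketch of the alternative: dividing each loading by its standard error and comparing the ratio with a critical value. The estimates and standard errors are invented, the helper name is hypothetical, and the 1.96 threshold (two-sided 5% level under a normal approximation) is my assumption rather than something stated in the reply.

```python
def loading_z_tests(estimates, standard_errors, critical=1.96):
    """Return (z, significant) per loading; 1.96 is an assumed two-sided 5% cutoff."""
    results = []
    for est, se in zip(estimates, standard_errors):
        z = est / se
        results.append((z, abs(z) > critical))
    return results


# Hypothetical loadings and standard errors copied by hand from an output file.
estimates = [0.62, 0.08, -0.35]
standard_errors = [0.07, 0.09, 0.10]

tests = loading_z_tests(estimates, standard_errors)
for est, (z, significant) in zip(estimates, tests):
    print(f"loading={est:+.2f}  z={z:+.2f}  significant={significant}")
```

In this made-up example the loading of -0.35 would be hidden by a .40 suppression rule even though its ratio to the standard error is -3.5.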


Hi, I have one binary dependent variable (yes/no) and 2 independent latent variables (continuous, as they are measured on a Likert scale of 1-5). I am trying to measure the impact of those two independent variables on the dependent variable. I have written the syntax as:
data: file is....
variable: names are x1-x39 u1;
categorical is u1;
model:
f1 by x1-x4;
f2 by x5-x9;
f3 by x10-x15;
f4 by x16-x21;
f5 by x22-x25;
f6 by x26-x30;
f7 by x31-x36;
f8 by x37-x39;
f9 by f1-f4;
f10 by f5-f8;
u1 on f9 f10;
Is this syntax correct for measuring the relationship above, or do I need to add anything else? The model fit results are: RMSEA=0.019 CFI=0.915 TLI=0.909 WRMR=0.761. Do these fit indices show a good model fit? What other fit indices do I have to calculate? Many thanks indeed,


Your model setup looks correct. The best way to know if you get what you want is to estimate the model and see which parameters are estimated or look at TECH1. All available fit statistics are given as the default.


Do the results of the model in this case refer to u1 = 1 or to u1 = 0? Thanks


One. 

EFried posted on Friday, December 07, 2012  1:08 pm



Quick question: I am running EFA with categorical data and am interested in the factor loadings that are generally reported in papers. I am confused as to why neither the "geomin rotated loadings" nor the "factor structure" values for an item add up to 1 across factors. I am used to that kind of output from publications and other programs, in which an item has 1 point of variance to "give" that is distributed across factors. Am I misinterpreting the Mplus output? Thank you


Adding up to 1 is a feature of principal component analysis, which in the early days was used to estimate a factor model. In, say, ML EFA there is no such scaling to 1. PCA has a focus on variance explanation, but factor analysis instead has a focus on explaining correlations.
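
A two-item toy case shows the difference, assuming the quantity in question is an item's sum of squared loadings. In this Python sketch the correlation of 0.5 is made up, the component loadings use the known eigenvalues and eigenvectors of a 2 x 2 correlation matrix, and the one-factor model is identified by assuming equal loadings, which is my simplification.

```python
import math

r = 0.5  # hypothetical correlation between two standardized items

# Principal components of [[1, r], [r, 1]]: eigenvalues 1 + r and 1 - r,
# eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
# Loading of item 1 on each component = eigenvector entry * sqrt(eigenvalue).
pca_loadings_item1 = [math.sqrt((1 + r) / 2), math.sqrt((1 - r) / 2)]
pca_total = sum(lam * lam for lam in pca_loadings_item1)

# One-factor model with equal loadings that reproduces the same correlation:
# r = loading * loading, so the item's communality equals r.
fa_loading = math.sqrt(r)
fa_communality = fa_loading ** 2
fa_residual_variance = 1 - fa_communality

print("PCA sum of squared loadings:", round(pca_total, 6))
print("FA communality:", round(fa_communality, 6))
print("FA residual variance:", round(fa_residual_variance, 6))
```

When all components are kept, the squared component loadings use up the item's full unit variance. The factor model only has to reproduce the correlation of 0.5, so half of the item's variance stays in the residual.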

EFried posted on Monday, December 10, 2012  8:13 am



Thank you Bengt. 
