

Hi Linda, I would like to fit an SEM model to a set of ordered categorical variables that have L- or J-shaped distributions. That is, the assumption that the latent variable is normally distributed is not valid, and there are some missing values (up to 2 percent). 1. Can Mplus handle this properly? 2. Can it accommodate a formative construct? Thank you in advance, haihong


Factors with categorical factor indicators are not necessarily nonnormal. 1. Yes. 2. Yes. 

hai hong li posted on Tuesday, August 01, 2006 - 2:03 pm



You have written that "Factors with categorical factor indicators are not necessarily nonnormal". So if I have five-point Likert scales that are extremely skewed (L or J shape), then for the analysis of the data you do not assume that the underlying latent variable is normally distributed. That is, you do not substitute polychoric correlations for the covariances to analyze the data. Is this correct? Thanks, haihong


There are two things here. First, the fact that your Likert scale items are skewed does not mean that the factor must be skewed. Take the example of very extreme attitude items where most people disagree. This gives skewed items, but the factor may still be normal. The observed nonnormality may simply be due to the extremeness of the item wording. Second, the default assumption in Mplus is that the factor is normal. It does not have to be normal, however, if you work with mixture modeling. As for your last sentence, you may be interested in the following statement from our short courses: Note that by assuming normal factors and using probit links, ML uses the same model as WLSMV. This is because normal factors and probit links result in multivariate normal u* variables. For model estimation, WLSMV uses the limited information of first- and second-order moments: thresholds and sample correlations of the multivariate normal u* variables (tetrachoric, polychoric, and polyserial correlations), whereas ML uses full information from all moments of the data.
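To make the first point concrete, here is a minimal simulation sketch in Python (not Mplus). It draws a standard normal factor, builds a continuous response u* with a probit-style residual, and cuts u* into five categories using thresholds placed high on the scale, as with an extreme attitude item. The loading, thresholds, sample size, and seed are illustrative assumptions, not values from this thread.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate(n=20000, loading=0.7, thresholds=(0.8, 1.3, 1.8, 2.3), seed=1):
    """Normal factor -> normal u* -> five-category item with extreme thresholds.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    resid_sd = (1 - loading ** 2) ** 0.5  # keeps Var(u*) equal to 1
    factor, item = [], []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)                          # normal factor
        u_star = loading * eta + rng.gauss(0.0, resid_sd)  # normal u*
        category = 1 + sum(u_star > t for t in thresholds)  # observed 1..5
        factor.append(eta)
        item.append(category)
    return factor, item


factor, item = simulate()
factor_skew = skewness(factor)
item_skew = skewness(item)
print("factor skewness:", round(factor_skew, 2))
print("item skewness:  ", round(item_skew, 2))
```

Under these assumptions most simulated respondents land in the lowest category, so the item is strongly L-shaped even though the factor that generated it is normal.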

Xu, Man posted on Thursday, July 12, 2012 - 11:23 am



Could I please follow up on this lead? Suppose one's latent variable is not normally distributed, regardless of whether the items are continuous or ordinal. For example, scores on most psychiatric screening instruments aren't normally distributed, and a binary cutoff is often applied to the sum scores. It is now quite common to apply CFA to these ordinal item-level data (either as continuous under RML or as ordinal under WLSMV) and to use the resulting latent factors as predictors or outcomes in SEM. I was wondering what the implications are for results based on this kind of measurement model, and whether there is a way to deal with this. Thanks!


Not sure what your major concern is; perhaps it is that you don't believe your latent variable is normally distributed. You seem to take a nonnormal distribution of a psychiatric screening instrument as an argument that the corresponding latent variable is not normal, but per my discussion above, I don't think that necessarily follows. For a related, recent article, see Wall, M. M., Guo, J., & Amemiya, Y. (2012). Mixture factor analysis for approximating a nonnormally distributed continuous latent factor with continuous and dichotomous observed variables. Multivariate Behavioral Research, 47(2), 276-313, where a nonnormal latent variable is obtained using the Mplus mixture approach.
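The general idea behind a mixture approach can be sketched in a few lines of Python: a factor that is normal within each latent class can be clearly nonnormal once the classes are pooled. This is only an illustration of that idea, not the procedure in Wall et al. (2012) or the Mplus implementation; the class proportions, means, and seed are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def mixture_factor(n=20000, weight_small=0.2, mean_small=2.5, seed=3):
    """Two-class mixture: N(0, 1) for the majority, N(mean_small, 1) otherwise.

    The class sizes and means are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        in_small_class = rng.random() < weight_small
        class_mean = mean_small if in_small_class else 0.0
        draws.append(rng.gauss(class_mean, 1.0))  # normal within class
    return draws


pooled = mixture_factor()
pooled_skew = skewness(pooled)
print("pooled factor skewness:", round(pooled_skew, 2))
```

With a small class shifted upward, the pooled factor distribution has a long right tail, which is the kind of shape one might expect for a screening instrument in a mostly healthy population.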

Xu, Man posted on Friday, July 13, 2012 - 2:17 am



Yes, I am worried that the factor is not normally distributed. I checked the data I have (3 instruments). One of them has normally distributed factor scores (as you say). The other two do not. I declared all items to be categorical and used the default estimator. At the item level, I found that the instrument with normally distributed factor scores has more items that are normally distributed. The two with skewed factor score distributions have mostly highly skewed items. Does this mean that the latent variable approach is not suitable for the other two instruments? Thank you for the paper. I shall read it.


Take a look at slide 117 of our Topic 2 handout and you can see why an observed score can be nonnormal at the same time as a latent score being normal. A latent variable can be normal and still give a nonnormal estimated factor score distribution. This is because of items that don't capture the tails of the factor distribution well, for instance items that are too easy or too hard. I would guess that the issue you are concerned with is likely of less importance than other aspects of your modeling.
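The role of item targeting can be illustrated with a small Python simulation. The same normal factor is measured by ten binary items, once with thresholds at the center of the factor distribution and once with "too hard" items whose thresholds sit far to the right. A simple sum score stands in for the observed score here; it is not the factor score estimator Mplus uses, and the loading, thresholds, and seed are assumptions.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sum_scores(threshold, n=20000, n_items=10, loading=0.7, seed=2):
    """Sum of binary items that all share one threshold on a normal factor.

    The threshold plays the role of item difficulty; values are hypothetical.
    """
    rng = random.Random(seed)
    resid_sd = (1 - loading ** 2) ** 0.5  # keeps Var(u*) equal to 1
    scores = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)  # normal factor value for this person
        endorsed = 0
        for _item in range(n_items):
            u_star = loading * eta + rng.gauss(0.0, resid_sd)
            endorsed += u_star > threshold  # True counts as 1
        scores.append(endorsed)
    return scores


centered_skew = skewness(sum_scores(threshold=0.0))  # well-targeted items
hard_skew = skewness(sum_scores(threshold=1.2))      # too-hard items
print("sum-score skewness, centered items:", round(centered_skew, 2))
print("sum-score skewness, hard items:    ", round(hard_skew, 2))
```

In this sketch the factor is identical in both runs; only the item difficulty changes, yet the hard items pile scores up near zero and produce a skewed observed distribution.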


Transforming skewed data in ESEM
Hello! In my ESEM model I am trying to estimate the relationship between two variables: the explanatory variable is positively skewed (a Likert scale on preference) and the dependent variable is negatively skewed (consumption data). I wonder whether I need to use a log transformation to normalize the data, or whether it is enough to use the WLSMV estimator and skip the transformations. Another reason for choosing the WLSMV estimator is that I have some other categorical variables in the model. Thanks, Jazgul


I would not transform variables unless that makes the linearity specification more realistic. The MLR estimator handles nonnormality, or if you don't want to assume continuous variables, then WLSMV takes care of it. 


Thanks! If I have both continuous and dichotomous variables in the model, can I specify both estimators? /Jazgul


You cannot use more than one estimator in an analysis. Both weighted least squares and maximum likelihood can be used for a model with a combination of continuous and dichotomous variables. 


Hello Linda, I was hoping you can help me. I am new to Mplus, but I have read many discussions here and also the Mplus manual. However, I am failing to find a way to improve my CFA model fit (incremental stats used already). I am trying to fit a 2-factor CFA (2 latent variables). I am using factor indicators that were calculated as the means of specific items measured on a Likert scale. Thus the factor indicators are not normally distributed (histograms and normality tests support my assumption). On top of the nonnormality, my data have quite a large amount of missing data, and neither imputation nor transformation improved the normality. I have tried to use WLSMV as well as WLS and am getting errors stating that I have no categorical variables present (I cannot declare the factor indicators as categorical because they are not integers). When I use MLM or MLR I get an error message that I have to use listwise deletion, which is impossible due to the amount of missing values. I have used FIML so far, but the model fit is poor when I evaluate all fit indices other than chi-square (TLI = 0.8). Have you got any suggestions? Thank you in advance for your help. Lucie


Try an EFA to see if the two factors you are specifying are supported by the data. You could consider going back to the original items. 
