
Anonymous posted on Friday, February 13, 2004  11:13 am



Is it necessary that the data are normally distributed for confirmatory factor analysis or linear growth modelling if the outcome variables are Likert-type scales and continuous?


Factor indicators or outcomes in a growth model can be continuous, binary, or ordered categorical in the current version of Mplus. If they are not normally distributed, there are estimators that are robust to nonnormality. In Version 3, factor indicators and outcomes can also be censored or count variables. In factor analysis, combinations of different variable types are allowed.

J.W. posted on Monday, February 04, 2008  2:13 pm



I am testing multivariate normality of the observed continuous variables before running a CFA model with 18 variables loading on 3 factors. The Mardia measures for skewness and kurtosis estimated in SAS using the SAS macro MULTNORM are the following:

Mardia Skewness   9844    p<.0001
Mardia Kurtosis   110.3   p<.0001

By specifying a single group in a mixture analysis and Option 13 in Mplus, I have:

TWO-SIDED MULTIVARIATE SKEW TEST OF FIT
  Sample Value          85.942
  Mean                  27.862
  Standard Deviation     1.182
  P-Value                0.0000

TWO-SIDED MULTIVARIATE KURTOSIS TEST OF FIT
  Sample Value         481.889
  Mean                 357.220
  Standard Deviation     2.995
  P-Value                0.0000

Why are the sample values so different in the Mplus and SAS results? How can we test multivariate normality in Mplus? Thanks!


Mplus computes Mardia (1970) definitions of multivariate skew and kurtosis. Mplus uses the actual sample statistic as defined in Mardia, Kent, Bibby (1979, pg 21). SAS uses definitions from Mardia (1974). 
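
For readers who want to see the quantities under discussion, here is a minimal sketch of Mardia's (1970) multivariate skewness b1,p and kurtosis b2,p in plain Python. It is an illustration under stated assumptions, not the Mplus or SAS implementation: the function names are made up for this example, the covariance matrix uses divisor n, and the test statistics are the usual textbook large-sample ones. A program may print either the raw coefficients or statistics derived from them, so values from two programs are not necessarily on the same scale.

```python
import math


def _solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * m for a, m in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def mardia(rows):
    """Return (b1p, b2p) for a list of equal-length numeric rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Covariance matrix with divisor n (an assumption of this sketch).
    S = [[sum(x[j] * x[k] for x in X) / n for k in range(p)] for j in range(p)]
    Z = [_solve(S, x) for x in X]  # Z[i] = S^{-1} x_i
    # Skewness: average of cubed cross-products x_i' S^{-1} x_j over all pairs.
    b1 = sum(sum(a * b for a, b in zip(xi, zj)) ** 3 for xi in X for zj in Z) / n**2
    # Kurtosis: average of squared Mahalanobis distances x_i' S^{-1} x_i.
    b2 = sum(sum(a * b for a, b in zip(xi, zi)) ** 2 for xi, zi in zip(X, Z)) / n
    return b1, b2


def mardia_tests(b1, b2, n, p):
    """Textbook large-sample statistics: (chi-square for skew, its df, z for kurtosis)."""
    skew_chi2 = n * b1 / 6.0
    skew_df = p * (p + 1) * (p + 2) // 6
    kurt_z = (b2 - p * (p + 2)) / math.sqrt(8.0 * p * (p + 2) / n)
    return skew_chi2, skew_df, kurt_z


# Toy data, symmetric around the origin (not from the thread).
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
b1, b2 = mardia(data)
```

Under multivariate normality b2,p is expected to be close to p(p+2), which is 360 for the 18 variables in the question above.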

J.W. posted on Tuesday, February 05, 2008  9:46 am



Linda, Thanks a lot for your prompt reply to my question. In a mixture analysis, the P-value provided by TECH13 in the Mplus output is for comparing the sample value and the model-estimated value of the Mardia skewness and kurtosis measures. A small P-value (e.g., <0.05) indicates that the model (i.e., the single-group model in my case) does not fit the data. To test the hypothesis of multivariate normality in the observed measures, can I report the Sample Value and P-value in the Mplus output? That is, if the P-value is <0.05, then reject the hypothesis of multivariate normality. However, it seems to me that the P-value in the Mplus output is for a model fit test. Is it also for testing multivariate normality of the observed variables? Appreciate your help!


TECH13 does not provide tests of multivariate skewness and kurtosis. It is a test of the model-generated skewness and kurtosis against observed variable skewness and kurtosis. Multivariate normality is not needed when using the MLR and MLM estimators.


Hi Linda, What kind of output do I need to request in order to obtain the actual sample statistic as defined in Mardia, Kent, Bibby (1979, pg 21)?


This is TECH13, which is available only for TYPE=MIXTURE; You can use TYPE=MIXTURE; with CLASSES = c(1); if you are not doing mixture modeling.

kirby posted on Friday, May 16, 2008  6:30 am



Dear all, Linda wrote "TECH13 does not provide tests of multivariate skewness and kurtosis. It is a test of the model-generated skewness and kurtosis against observed variable skewness and kurtosis." and "Mplus computes Mardia (1970) definitions of multivariate skew and kurtosis. Mplus uses the actual sample statistic as defined in Mardia, Kent, Bibby (1979, pg 21)." I am sorry, but I do not really understand which definition Mplus uses and what exactly TECH13's Mardia coefficient can tell me. Does it help me to assess multivariate normality of my indicator variables or not? If one can use the Mplus TECH13 coefficient, how can I interpret it? I tried to find some guidance, e.g. in the semnet archive, but there was nothing that really helped me. I assume the following result indicates that my indicator variables are not multivariate normally distributed, right? But how can I evaluate whether the multivariate kurtosis is low, moderate, or heavy?

TWO-SIDED MULTIVARIATE KURTOSIS TEST OF FIT
  Sample Value         551.903
  Mean                 481.730
  Standard Deviation     2.579
  P-Value                0.0000

Thanks a lot for your help!! PS: Another thought: (how) can I use the Satorra-Bentler scaling correction factor to assess multivariate normality?


Tech13 was primarily developed for mixture models to see if the estimated model would capture the skewness and kurtosis in the data. I think that is the background for Linda's statement. With a single class, however, this gives a standard test of multivariate normality. The actual skewness and kurtosis values are obtained in the output using Tech12. I have not seen the Satorra-Bentler correction factor used to test/assess multivariate normality. My own opinion is that tests of multivariate normality are of less importance now that we have nonnormality-robust techniques using MLR or MLM in Mplus. Experience indicates that under nonnormality the normality-based ML parameter estimates are quite robust, the SEs that MLR and MLM give are very good, and the MLR/MLM chi-square test of model fit is also very good. Normality testing seems to have been advocated in earlier days when these robust techniques hadn't been implemented. So in this sense, the focus on testing normality in the context of latent variable modeling is in my view a bit outdated.

kirby posted on Sunday, May 18, 2008  6:12 am



Dear Bengt, thanks a lot for your detailed answer! Please allow one follow-up question: are there any criteria by which I can decide whether I should use MLM or MLR? On the one hand, MLM is more popular and there is more information available from other authors, but on the other hand it only works with listwise deletion... Is there anything which points to MLM or MLR (I do not have non-independent observations)? Have a nice Sunday!


MLR and MLM usually give very similar SEs and they are asymptotically equivalent (even for nonnormal data). 

kirby posted on Tuesday, May 20, 2008  6:38 am



Dear Bengt, thanks for that pleasant answer! Then I would prefer MLR, due to my missing values. Do you maybe know any paper which supports your statement that MLM and MLR are very similar? I have had a look on the website but have not found anything. Thanks!


I don't know of any such paper. 

kirby posted on Wednesday, May 21, 2008  7:48 am



I found some simulation studies which evaluate the performance of MLM and compare it to 'normal' ML estimation. However, I did not come across such studies for MLR. Do you maybe know one/some? That would be great! 


I don't know of any. 

Sean Mullen posted on Friday, November 06, 2009  2:19 pm



Dear list, Continuing on with the discussion above... how does Numerical Integration compare (to MLM and MLR) as a robust estimator in the presence of nonnormal data? I have 2 nonnormal distal outcome variables (model 1) and 3 nonnormal continuous indicator variables (model 2), but Mplus requires integration with GMM and missing data. Should I worry about nonnormality, or should I definitely reflect and transform my negatively skewed variables? Thanks in advance!


Numerical integration is an algorithm, not an estimator, so it cannot be compared to, say, MLR. With mixture models such as GMM you should not worry about nonnormality of outcomes: the mixture creates nonnormality in the outcomes. So if you transform nonnormal outcomes you may not find the mixture that generated the data.

finnigan posted on Thursday, July 22, 2010  12:28 pm



Linda/Bengt, Can a Box-Cox transformation be used to normalize skewed ordinal categorical data? Thanks


I don't think this would be appropriate. Categorical data methodology is developed to deal with floor and ceiling effects so no transformation is necessary. 

finnigan posted on Friday, August 06, 2010  8:13 am



Linda/Bengt, I have a sample size of 129 individuals responding to a 56-item survey using a 1-5 point Likert scale. Previous research has shown that a rather poorly fitting five-factor model underpins the data. I cannot validate the five-factor model in my sample because it is too small and the data are skewed. Previous research has shown that 18 to 25 of the 56 items do not load on a factor. The five factors do not correlate, and I would like to do a CFA on each of the factors separately to assess the loadings in my sample, with a view to reducing the number of items. Would you see any problems with this approach if only the loadings are of interest? Thanks


This seems reasonable. 


Dr. Muthen, Continuing on your statement above that given MLR, worries about nonnormality are outdated, are you saying that, in the context of path modeling where the dependent variable has a zero-inflated Poisson-like distribution, specifying the DV as such is not necessary? I am hoping your answer is yes: that treating the DV as continuous, but using MLR estimation, yields similar results. Jennifer


If you have a zero-inflated count variable, you should treat it as such by using the COUNT option.

finnigan posted on Thursday, December 30, 2010  3:17 am



Linda/Bengt, I am trying to run a one-factor model using 10 measured variables scored on a Likert scale from 1-5. The scale has been previously used in published research. My data are skewed and z scores for the indicators range from 5.7217 – 00372. Z scores for kurtosis range from 2.07 to 4.34. Mplus gave the following warning: WARNING in MODEL command. All variables are uncorrelated with all other variables in the model. Check that this is what is intended. Given that previous research has shown that these items load on a factor and Cronbach's alpha is .72, it's surprising that this warning has arisen. I'm wondering if the warning has emerged because the skewness and kurtosis have affected the Pearson correlation coefficients. Is there any way in Mplus to deal with this warning?


This warning comes when variables on the USEVARIABLES or NAMES list are not used in the MODEL command to warn you if this is unintended. These variables are all considered analysis variables. If you can't see the reason, please send the full output and your license number to support@statmodel.com. 


Dear all, having read several posts on the topic of the Mardia test, I learned that it is possible to get the Mardia test by specifying a mixture model and using TECH13. I would like to know if there is an alternative way of getting this test in the current version (I use version 6). (I know that for my analyses I should use MLR or MLM with nonnormal data, but I am afraid I have to underpin this decision by reporting a test statistic like Mardia's coefficient in my diploma thesis...) Does anyone know a paper or other text where I can get information to help me decide whether to use MLR or MLM? (A text that says something about the differences/advantages of the two options?) Thanks a lot for any comment, Katharina


There is nothing wrong with always using MLR (or MLM when no missing data) instead of ML. If you have reasons to suspect that your data are nonnormal, I would simply use MLR/MLM and see what difference (compared to ML) that makes to SEs and chi-square. To me, that is the ultimate test of nonnormality. I don't find it necessary to do a Mardia test (I never do); that was needed in the old days when we didn't have MLR/MLM, but only ML. I can't point to a paper where ML, MLR, and MLM are thoroughly compared. Anyone?


Dear Professor Muthen, thank you very much for your helpful advice. So I will use MLR with my data (because I have some missings). Kind regards, Katharina 

finnigan posted on Monday, March 12, 2012  2:47 pm



Linda/Bengt I am looking at the univariate skew and kurtosis in mplus. I have three waves of data and I'm running a CFA per wave consisting of a one factor model. However different indicators from different CFAs are showing the same values for skewness and kurtosis of observed indicators. Is there any reason that different indicators would have identical skewness and kurtosis values? Thanks 


Please send the output and your license number to support@statmodel.com so we can see exactly what you are talking about. 


Hi, I am testing three different models using CFA (N=71000): model 1: 25 items and 5 first-order factors; model 2: adding 2 second-order factors, two to each of two first-order factors; model 3: adding 1 second-order factor to the same four first-order factors as in model 2. My 25 items are highly skewed (the most skewed distribution 98, 1.5, 0.5). I get some fine overall fit statistics (RMSEA = .36, but CFI/TLI below .9 and obviously a very high chi-square, >6000), but I also get some factor estimates >1 (because I haven't got enough variance, I assume). I am treating my data as categorical and I have also tried dichotomising my answer categories. But it still doesn't work: I still get factor estimates >1. I hope you can advise me on what to do/try next in order for me to be able to run my models without getting these overfitted models. Thanks a lot...


Factor loadings can be greater than one with correlated factors. Your concern should be negative residual variances. 


Is there any adjustment to Pearson correlations that can be made when observed indicators are not normally distributed? Thanks


There is no adjustment. If the variables are censored, you can use the CENSORED option. Otherwise, there are estimators like MLR that are robust to nonnormality. 

Ashley posted on Saturday, August 04, 2012  3:13 pm



Dear Drs. Muthen, I am running a growth curve model using a drinking variable that is quite skewed. Because of the nonnormality, I am using the MLR estimation you explained earlier in this thread. On a path from a predictor variable to the intercept of drinking, I am getting different p-values between the unstandardized and standardized results. I know that the betas are supposed to change when they are standardized, and the p-values may change very slightly, but the difference in p-values in this case is about .05, so the path is significant in the unstandardized results but not in the standardized results. Have you experienced this before when using MLR, or am I using it in an incorrect circumstance? If it is possible to get different p-values, which results should I rely on? Thank you very much for your time.


The p-values differ between standardized and unstandardized results because the sampling distributions are different. It has nothing to do with MLR. In all cases with so many simultaneous tests, you should be conservative. You should use the one you want to report.

Ashley posted on Monday, August 06, 2012  10:29 am



Thank you very much. I appreciate the time and effort you both put into the discussion board, as it is one of the most useful tools. Thank you. 


Dear Dr. Muthen, I am running a CFA with 9 items loading on two factors and would like to use TECH11 or TECH13 to get information about the skewness. As Linda posted earlier, this is only possible with TYPE=MIXTURE and CLASSES = c(1). How do I have to specify my MODEL command, which includes BY statements? Thanks a lot in advance. Anne


It is TECH13. You need to add %OVERALL% after MODEL and before the BY statements. 

Anne posted on Monday, August 20, 2012  4:59 am



Thanks a lot for your help! 


Hello, In the situation where a dependent variable is a zero-inflated count, but the addition of the COUNT option produces a model that will not run due to computational complexity, how much trust can we put in results when using MLM/MLR? Would this be an instance where transforming the variable may be suitable? The warning I am receiving is below. Nick (THERE IS NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT INPUT FILE. THE ANALYSIS REQUIRES 4 DIMENSIONS OF INTEGRATION RESULTING IN A TOTAL OF 0.50625E+05 INTEGRATION POINTS)


The message has to do with not enough memory because the model has 4 dimensions of integration. Try INTEGRATION = MONTECARLO (5000). If this does not help, send the output and your license number to support@statmodel.com. 
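
The arithmetic behind the message can be checked with a short sketch. The figure 0.50625E+05 equals 15 to the 4th power, which is consistent with a grid of 15 points in each of the 4 dimensions; the value 15 is inferred from the number in the message, and the function name is made up for this example.

```python
def total_integration_points(points_per_dimension, dimensions):
    """Grid size when every dimension uses the same number of points."""
    return points_per_dimension ** dimensions


reported = float("0.50625E+05")            # value printed in the error message
grid = total_integration_points(15, 4)     # 15 per dimension is an assumption
print(grid, reported == grid)

# The grid grows exponentially with the number of dimensions.
for dims in range(1, 6):
    print(dims, total_integration_points(15, dims))
```

If the 5000 in INTEGRATION = MONTECARLO (5000) is read as the total number of points, it is roughly a tenth of the 50625 grid points, which would explain the lower memory demand.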

Cecily Na posted on Thursday, August 08, 2013  11:42 am



Dear Professor, Although the MLR estimator can handle nonnormal data, up to what degree of nonnormality is MLR effective? For instance, skewness < 2 and kurtosis < 7, or can the skewness and kurtosis be arbitrarily large? Thank you.


I don't know of any studies that have looked at this specifically. You would need to do your own simulation study to see. 

Anne posted on Wednesday, September 04, 2013  6:13 am



Dear Ms. Muthen, although you say reporting Mardia coefficient is outdated, I need to report it in my thesis. Could you be so kind and explain to me what the mean and what the sample value is? What is the difference? Thank you in advance! 


Please send your output and license number to support@statmodel.com. 


Hello, I have evidence of a nonnormal distribution in my data, so I've used MLM. It's an SEM with two factors as IVs, two mediator variables, and two dependent variables. Then I ran a multigroup analysis. I expected the d.f. of the multigroup model to be double that of the original model. However, I have more degrees of freedom than I expected. I guess that some parameters are being constrained by default. Which ones? And how can I freely estimate these parameters? As a second step, I want to constrain some paths in the model and test whether there are differences between groups. As I'm using MLM, I think I have to test each path separately and calculate the scaled chi-square difference with the Bryant and Satorra (2012) macro. Am I right? Is there any way to test all the differences in the paths at the same time when the MLM estimator is used? Thank you in advance!


Please see the Topic 1 course handout under multiple group analysis. Factor loadings and intercepts are held equal as the default in multiple group analysis. You can see how to relax these equalities in the handout. You can do a joint test of them all by comparing a model with all coefficients free versus all coefficients held equal or you can do them one at a time. You can also use MODEL TEST. 
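
For the scaled chi-square difference mentioned in the question, the widely used Satorra-Bentler scaled difference computation can be sketched as follows. This is a minimal illustration, not the Bryant and Satorra (2012) macro; the function name and all input numbers are hypothetical. Here t is the scaled chi-square printed under MLM or MLR, c is the scaling correction factor, and d is the degrees of freedom, with model 0 nested in model 1.

```python
def scaled_chi2_difference(t0, c0, d0, t1, c1, d1):
    """Scaled difference test for a nested model (0) against a comparison model (1).

    t = scaled chi-square, c = scaling correction factor, d = degrees of freedom.
    Returns (difference statistic, difference in df, difference scaling cd).
    """
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # scaling correction for the difference
    trd = (t0 * c0 - t1 * c1) / cd         # t * c recovers the unscaled chi-square
    return trd, d0 - d1, cd


# Hypothetical numbers, for illustration only.
trd, df, cd = scaled_chi2_difference(t0=120.0, c0=1.2, d0=50,
                                     t1=100.0, c1=1.1, d1=45)
print(round(trd, 3), df, round(cd, 3))
```

The result is referred to a chi-square distribution with d0 - d1 degrees of freedom. In some samples cd can come out negative, in which case this simple version should not be used.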


Hi, Following on from the discussion regarding normality tests, Dr. Bengt O. Muthen wrote: "There is nothing wrong with always using MLR (or MLM when no missing data) instead of ML. If you have reasons to suspect that your data are nonnormal, I would simply use MLR/MLM and see what difference (compared to ML) that makes to SEs and chi-square. To me, that is the ultimate test of nonnormality. I don't find it necessary to do a Mardia test (I never do); that was needed in the old days when we didn't have MLR/MLM, but only ML." I am wondering whether there is a rule of thumb, based on your experience, for how large the differences in SEs and chi-square (I presume you meant the chi-square in the model fit indices?) between the ML and MLM outputs must be to warrant a conclusion that the data are nonnormal. Please advise. Many thanks.


MLR is robust to nonnormality and model misspecification. You can't tell why the standard errors are different. If I were doing maximum likelihood estimation, I would use MLR. 


Dear Drs. Muthen, I am using Mplus 7.11 with the multilevel add-on. How can I get the skewness and kurtosis output? Isn't it available for this version? Note that all observed variables are continuous. Thank you


You can find skewness and kurtosis using the PLOT command. Look at the histograms. When you right click on the histogram you can view the descriptive statistics which include skewness and kurtosis. 


I could view one observed variable at a time. Can I view descriptive statistics for all observed variables? Thanks a lot Dr. Linda 


TYPE=BASIC and the SAMPSTAT option give descriptive statistics for a set of variables. They do not however provide skewness and kurtosis. 
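
If skewness and kurtosis are wanted for many variables at once, they can also be computed outside Mplus. The sketch below uses the simple moment-based definitions (skewness m3/m2^1.5 and excess kurtosis m4/m2^2 - 3); the formulas behind the Mplus histogram statistics are not stated in this thread and may differ, for example by a small-sample correction. Function names and data are made up for this example.

```python
def skew_kurtosis(values):
    """Moment-based skewness and excess kurtosis of one variable."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def describe(columns):
    """Apply skew_kurtosis to every variable in a {name: values} mapping."""
    return {name: skew_kurtosis(vals) for name, vals in columns.items()}


# Toy data, not from the thread.
toy = {"y1": [1, 2, 3, 4, 5], "y2": [1, 1, 1, 2, 5]}
for name, (skew, kurt) in describe(toy).items():
    print(f"{name}: skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```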


Hello. I am trying to determine whether to use the ML or MLM estimator for my data. My x2 value with ML is 638.00 and my x2 value with MLM is 619.419. The Scaling Correction Factor is 1.030. I know that values >1 are indicative of nonnormality; however, is this a hard and fast rule? Or is there some judgement involved? To me, 1.030 is very close to 1. 


You may also see if the SEs differ substantially. A good choice is MLR which is robust to nonnormality as well as some other misspecifications. 
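
As a quick check on the numbers in the question, the scaling correction factor is the ratio of the ML chi-square to the MLM chi-square. The snippet only reproduces that arithmetic from the values quoted above; the variable names are made up.

```python
ml_chi2 = 638.00     # chi-square reported with ML
mlm_chi2 = 619.419   # scaled chi-square reported with MLM

implied_factor = ml_chi2 / mlm_chi2
print(round(implied_factor, 3))  # 1.03, matching the reported 1.030

# Relative change in the test statistic caused by the correction.
print(round(100 * (ml_chi2 - mlm_chi2) / ml_chi2, 1), "percent")
```

A factor this close to 1 means the correction changes the chi-square by about 3 percent, which is why comparing the SEs is suggested as a further check.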
