Factor indicators or outcomes in a growth model can be continuous, binary, or ordered categorical in the current version of Mplus. If they are not normally distributed, there are estimators that are robust to non-normality. In Version 3, factor indicators and outcomes can also be censored or count variables. In factor analysis, combinations of different variable types are allowed.
J.W. posted on Monday, February 04, 2008 - 2:13 pm
I am testing multivariate normality of the observed continuous variables before running a CFA model with 18 variables loading to 3 factors. The Mardia measures for skewness and kurtosis estimated from SAS using SAS macro MULTNORM are the following:
Mplus computes Mardia (1970) definitions of multivariate skew and kurtosis. Mplus uses the actual sample statistic as defined in Mardia, Kent, Bibby (1979, pg 21). SAS uses definitions from Mardia (1974).
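For reference, the Mardia (1970) sample measures mentioned above are usually written as below. This is a sketch of the textbook definitions, not something taken from Mplus output or documentation. It assumes n observations x_i on p variables, with sample mean \bar{x} and sample covariance matrix S computed with divisor n.

```latex
b_{1,p} = \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n}
          \left[ (x_i - \bar{x})^{\top} S^{-1} (x_j - \bar{x}) \right]^{3},
\qquad
b_{2,p} = \frac{1}{n} \sum_{i=1}^{n}
          \left[ (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x}) \right]^{2}
```

Under multivariate normality b_{1,p} is close to 0 and b_{2,p} is close to p(p+2), so with 18 variables the reference value for kurtosis would be about 18 x 20 = 360.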
J.W. posted on Tuesday, February 05, 2008 - 9:46 am
Thanks a lot for your prompt reply to my question.
In Mixture analysis, the P-value provided by Tech13 in Mplus output is for comparing the sample value and model estimated value in regard to Mardia Skewness and Kurtosis measures. A small P-value (e.g., <0.05) indicates that the model (i.e., the single group model in my case) does not fit the data. To test the hypothesis of multivariate normality in the observed measures, can I report the Sample Value and P-value in Mplus output? That is, if the P-value is <0.05, then reject the hypothesis of multivariate normality. However, it seems to me that the P-value in Mplus output is for model fit test. Is it also for testing multivariate normality of the observed variables? Appreciate your help!
TECH13 does not provide tests of multivariate skewness and kurtosis. It is a test of the model-generated skewness and kurtosis against observed variable skewness and kurtosis. Multivariate normality is not needed when using the MLR and MLM estimators.
Dear all, Linda wrote "TECH13 does not provide tests of multivariate skewness and kurtosis. It is a test of the model-generated skewness and kurtosis against observed variable skewness and kurtosis." and "Mplus computes Mardia (1970) definitions of multivariate skew and kurtosis. Mplus uses the actual sample statistic as defined in Mardia, Kent, Bibby (1979, pg 21)."
I am sorry, but I do not really understand which definition Mplus uses and what exactly Tech13's Mardia coefficient can tell me. Does it help me to assess multivariate normality of my indicator variables or not?
If one can use Mplus Tech13's coefficient, how can I interpret it? I tried to find some guidance, e.g. in the semnet archive, but there was nothing that really helped me. I assume the following result indicates that my indicator variables are not multivariate normally distributed, right? But how can I evaluate whether the multivariate kurtosis is low, moderate, or heavy?
TWO-SIDED MULTIVARIATE KURTOSIS TEST OF FIT Sample Value 551.903 Mean 481.730 Standard Deviation 2.579 P-Value 0.0000
Thanks a lot for your help!!
PS: Another thought: (how) can I use the Satorra-Bentler scaling correction factor to assess multivariate normality?
Tech13 was primarily developed for mixture models to see if the estimated model would capture the skewness and kurtosis in the data. I think that is the background for Linda's statement. With a single class, however, this gives a standard test of multivariate normality. The actual skewness and kurtosis values are obtained in the output using Tech12.
I have not seen the Satorra-Bentler correction factor used to test/assess multivariate normality.
My own opinion is that tests of multivariate normality are of less importance now that we have non-normality robust techniques using MLR or MLM in Mplus. Experience indicates that under non-normality the normality-based ML parameter estimates are quite robust, the SEs that MLR and MLM give are very good, and the MLR/MLM chi-square test of model fit is also very good. Normality testing seems to have been advocated in earlier days when these robust techniques hadn't been implemented. So in this sense, the focus on testing normality in the context of latent variable modeling is in my view a bit outdated.
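To get a feel for the size of the discrepancy in the TECH13 kurtosis output quoted earlier in this thread, the sample value can be turned into a z-score against the reported mean and standard deviation. This is a minimal sketch that assumes a standard normal reference distribution; whether Mplus computes its P-Value exactly this way is an assumption, not something stated in the thread.

```python
import math

# Values copied from the TECH13 kurtosis output quoted earlier in the thread.
sample_value = 551.903
reference_mean = 481.730
reference_sd = 2.579

# Assumption: the sample value is compared with the reference mean
# in standard-deviation units and referred to a standard normal.
z = (sample_value - reference_mean) / reference_sd

# Two-sided p-value under the standard normal, via the complementary
# error function: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}")
print(f"two-sided p = {p_two_sided:.3g}")
```

The sample kurtosis is roughly 27 standard deviations above the reference mean, which is why the output prints a P-Value of 0.0000. The z-score says how strongly normality is rejected, not how much the non-normality matters for the model results.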
Please allow one follow-up question: are there any criteria along which I can decide whether I should use MLM or MLR?
On the one hand MLM is more popular and there is more information available from other authors, but on the other hand it only works with listwise deletion... Is there anything which points at MLM or MLR (I do not have non-independent observations)?
I found some simulation studies which evaluate the performance of MLM and compare it to 'normal' ML estimation. However, I did not come across such studies for MLR. Do you maybe know one/some? That would be great!
Sean Mullen posted on Friday, November 06, 2009 - 2:19 pm
Dear list, Continuing on with the discussion above...how does Numerical Integration compare (to MLM and MLR) as a robust estimator in the presence of non-normal data? I have 2 non-normal distal outcome variables (model 1) and 3 non-normal continuous indicator variables (model 2)--but Mplus requires integration with GMM and missing data. Should I worry about non-normality, or should I definitely reflect and transform my negatively skewed variables? Thanks in advance!
Numerical integration is an algorithm, not an estimator, so it cannot be compared to, say, MLR. With mixture models such as GMM you should not worry about non-normality of outcomes - the mixture creates non-normality in the outcomes. So if you transform non-normal outcomes you may not find the mixture that generated the data.
finnigan posted on Thursday, July 22, 2010 - 12:28 pm
Can a Box-Cox transformation be used to normalize skewed ordinal categorical data?
I don't think this would be appropriate. Categorical data methodology is developed to deal with floor and ceiling effects so no transformation is necessary.
finnigan posted on Friday, August 06, 2010 - 8:13 am
I have a sample of 129 individuals responding to a 56-item survey using a 1-5 point Likert scale. Previous research has shown that a rather poorly fitting five-factor model underpins the data. I cannot validate the five-factor model in my sample because it is too small and the data are skewed. Previous research has shown that 18 to 25 of the 56 items do not load on a factor.
The five factors do not correlate, and I would like to do a CFA on each of the factors separately to assess the loadings in my sample, with a view to reducing the number of items. Would you see any problems with this approach if only the loadings are of interest?
Continuing on your statement above that given MLR, worries about non-normality are outdated, are you saying that -- in the context of path modeling where the dependent variable has a zero-inflated Poisson-like distribution -- specifying the DV as such is not necessary?
I am hoping your answer is yes. That treating the DV as continuous, but using MLR estimation, yields similar results.
If you have a zero-inflated count variable, you should treat it as such by using the COUNT option.
finnigan posted on Thursday, December 30, 2010 - 3:17 am
Linda/Bengt, I am trying to run a one-factor model using 10 measured variables scored on a Likert scale from 1-5. The scale has been previously used in published research. My data are skewed and z scores for the indicators range from 5.7217 – 00372. Z scores for kurtosis range from 2.07 to 4.34. Mplus gave the following warning: WARNING in MODEL command. All variables are uncorrelated with all other variables in the model. Check that this is what is intended.
Given that previous research has shown that these items load on a factor and Cronbach's alpha is .72, it's surprising that this warning has arisen. I’m wondering if the warning has emerged because the skewness and kurtosis have affected the Pearson correlation coefficients.
Is there any way in Mplus to deal with this warning?
This warning comes when variables on the USEVARIABLES or NAMES list are not used in the MODEL command to warn you if this is unintended. These variables are all considered analysis variables. If you can't see the reason, please send the full output and your license number to email@example.com.
Having read several posts on the topic of the Mardia test, I learned that it is possible to get the Mardia test by specifying a mixture model and using tech13. I would like to know whether there is an alternative way of getting this test in the current version (I use version 6). (I know that for my analyses I should use MLR or MLM with non-normal data, but I am afraid I have to underpin this decision by reporting a test statistic like Mardia’s coefficient in my diploma thesis...)
Does anyone know a paper or other text where I can get information to help me decide whether to use MLR or MLM? (A text that says something about the differences/advantages of the two options?)
There is nothing wrong with always using MLR (or MLM when no missing data) instead of ML. If you have reasons to suspect that your data are non-normal, I would simply use MLR/MLM and see what difference (compared to ML) that makes to SEs and chi-square. To me, that is the ultimate test of non-normality. I don't find it necessary to do a Mardia test - I never do - that was needed in the old days when we didn't have MLR/MLM, but only ML.
I can't point to a paper where ML, MLR, and MLM are thoroughly compared. Anyone?
Thank you very much for your helpful advice. So I will use MLR with my data (because I have some missing values).
Kind regards, Katharina
finnigan posted on Monday, March 12, 2012 - 2:47 pm
I am looking at the univariate skew and kurtosis in Mplus. I have three waves of data and I'm running a CFA per wave consisting of a one-factor model. However, different indicators from different CFAs are showing the same values for skewness and kurtosis of observed indicators. Is there any reason that different indicators would have identical skewness and kurtosis values?
I am testing three different models running CFA (N=71000): model 1: 25 items and 5 first order factors; model 2: adding 2 second order factors - two to each of two first order factors; model 3: adding 1 second order factor to the same four first order factors as in model 2. My 25 items are highly skewed (the most skewed distribution 98, 1.5, 0.5). I get some fine overall fit statistics (RMSEA = .36, but CFI/TLI below .9 and obviously a very high chi-square, >6000), but I also get some factor estimates >1 (because I haven't got enough variance, I assume). I am treating my data as categorical and I have also tried dichotomising my answer categories. But it still doesn't work - I still get factor estimates >1. I hope you can advise me on what to do/try next in order for me to be able to run my models without getting these overfitted models.
There is no adjustment. If the variables are censored, you can use the CENSORED option. Otherwise, there are estimators like MLR that are robust to non-normality.
Ashley posted on Saturday, August 04, 2012 - 3:13 pm
Dear Drs. Muthen,
I am running a growth curve model using a drinking variable that is quite skewed. Because of the non-normality, I am using the MLR estimation you explained earlier in this thread. On a path from a predictor variable to the intercept of drinking, I am getting different p-values between the unstandardized and standardized results. I know that the betas are supposed to change when they are standardized, and the p-values may change very slightly, but the difference in p-values in this case is about .05, so the path is significant in the unstandardized results but not in the standardized results. Have you experienced this before when using MLR, or am I using it in an incorrect circumstance? If it is possible to get different p-values, which results should I rely on? Thank you very much for your time.
The p-values differ between standardized and unstandardized because the sampling distributions are different. It has nothing to do with MLR. In all cases with so many simultaneous tests, you should be conservative. You should use the one you want to report.
Ashley posted on Monday, August 06, 2012 - 10:29 am
Thank you very much. I appreciate the time and effort you both put into the discussion board, as it is one of the most useful tools. Thank you.
I am running a CFA with 9 items loading on two factors and would like to use tech11 or tech13 to get information about the skewness. As Linda posted earlier this is only possible with type mixture and classes = c(1). How do I have to specify my model command which includes by-commands?
Hello, In the situation where a dependent variable is a zero-inflated count, but the addition of the COUNT option produces a model that will not run due to computational complexity, how much trust can we put in results when using MLM/MLR? Would this be an instance where transforming the variable may be suitable? The warning I am receiving is below.
(THERE IS NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT INPUT FILE. THE ANALYSIS REQUIRES 4 DIMENSIONS OF INTEGRATION RESULTING IN A TOTAL OF 0.50625E+05 INTEGRATION POINTS)
The message has to do with not enough memory because the model has 4 dimensions of integration. Try INTEGRATION = MONTECARLO (5000). If this does not help, send the output and your license number to firstname.lastname@example.org.
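The number of integration points in the error message can be reproduced with a short calculation. This sketch assumes 15 quadrature points per dimension; that value is inferred from the message itself (15 to the 4th power matches it exactly) and is not stated anywhere in the thread.

```python
# Assumption: 15 quadrature points per dimension of integration,
# inferred from the total reported in the error message.
points_per_dimension = 15
dimensions = 4

# With a full grid, the total grows as points ** dimensions.
total_points = points_per_dimension ** dimensions
print(total_points)  # 50625, i.e. 0.50625E+05 as in the message

# Monte Carlo integration with 5000 points, as suggested in the reply,
# uses a fixed number of points regardless of the number of dimensions.
monte_carlo_points = 5000
print(total_points / monte_carlo_points)  # 10.125
```

Under this assumption the grid has about ten times as many points as INTEGRATION = MONTECARLO (5000), which is why switching can relieve the memory problem.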
Cecily Na posted on Thursday, August 08, 2013 - 11:42 am
Dear Professor, Although the MLR estimator can handle non-normal data, how much non-normality can it tolerate and still be effective? For instance, must skewness be < 2 and kurtosis < 7, or can the skewness and kurtosis be infinite?
Hello, I have evidence of non-normal distribution in my data, so I've used MLM. It's an SEM with two factors as IVs, two mediator variables and two dependent variables. Then I ran a multi-group analysis. I expected the d.f. of the multi-group model to be double those of the original model. However, I have more degrees of freedom than I expected. I guess that some parameters are being constrained by default. Which ones? And how can I freely estimate these parameters?
As a second step, I want to constrain some paths in the model and test whether there are differences between groups. As I'm using MLM, I think I have to test each path separately and calculate the scaled chi-square difference with the Bryant and Satorra (2012) macro. Am I right? Is there any way to test all the path differences at the same time when the MLM estimator is used?
Please see the Topic 1 course handout under multiple group analysis. Factor loadings and intercepts are held equal as the default in multiple group analysis. You can see how to relax these equalities in the handout.
You can do a joint test of them all by comparing a model with all coefficients free versus all coefficients held equal or you can do them one at a time. You can also use MODEL TEST.
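When the two models in such a joint test are estimated with MLM, their chi-square values cannot simply be subtracted; the usual approach is the Satorra-Bentler scaled difference. The sketch below implements that standard formula as a plain function. The function name and the input numbers are hypothetical and serve only as illustration; they do not come from this thread.

```python
def scaled_chi_square_difference(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled chi-square difference for nested models.

    Model 0 is the more restricted (nested) model, model 1 the less
    restricted comparison model.
    t: scaled chi-square as reported by the robust estimator
    c: scaling correction factor
    d: degrees of freedom
    """
    if d0 <= d1:
        raise ValueError("model 0 must have more degrees of freedom than model 1")
    # Difference-test scaling correction.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive difference correction; the test is not usable as is")
    # t * c recovers each model's uncorrected chi-square.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical values for illustration only.
trd, df_diff = scaled_chi_square_difference(
    t0=120.0, c0=1.20, d0=50,  # all paths held equal across groups
    t1=100.0, c1=1.25, d1=46,  # all paths free across groups
)
print(round(trd, 3), df_diff)
```

The resulting value is referred to a chi-square distribution with df_diff degrees of freedom. The function refuses to return a result when the difference correction is not positive, since the statistic is then not interpretable.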
Following on from the discussion regarding normality test, Dr. Bengt O. Muthen wrote:
"There is nothing wrong with always using MLR (or MLM when no missing data) instead of ML. If you have reasons to suspect that your data are non-normal, I would simply use MLR/MLM and see what difference (compared to ML) that makes to SEs and chi-square. To me, that is the ultimate test of non-normality. I don't find it necessary to do a Mardia test - I never do - that was needed in the old days when we didn't have MLR/MLM, but only ML."
I am wondering whether there is a rule of thumb, based on your experience, for how large the differences in SEs and chi-square (I presume you meant the chi-square in the model fit indices?) between the outputs estimated by ML and MLM must be to warrant a conclusion that the data are non-normal. Please advise. Many thanks.
Hello. I am trying to determine whether to use the ML or MLM estimator for my data. My x2 value with ML is 638.00 and my x2 value with MLM is 619.419. The Scaling Correction Factor is 1.030. I know that values >1 are indicative of nonnormality; however, is this a hard and fast rule? Or is there some judgement involved? To me, 1.030 is very close to 1.
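The three numbers in the post above are tied together by the Satorra-Bentler scaling: the MLM chi-square is the ML chi-square divided by the scaling correction factor. A minimal check, using only the values quoted in the post (the variable names are mine):

```python
# Values quoted in the post above.
ml_chi_square = 638.00
mlm_chi_square_reported = 619.419
scaling_correction_factor = 1.030

# Scaled (MLM) chi-square = ML chi-square / scaling correction factor.
mlm_chi_square_recomputed = ml_chi_square / scaling_correction_factor
print(round(mlm_chi_square_recomputed, 3))

# Conversely, the factor implied by the two chi-square values.
implied_factor = ml_chi_square / mlm_chi_square_reported
print(round(implied_factor, 4))
```

The recomputed value agrees with the reported one up to rounding of the printed factor. A factor of 1.030 therefore means the robust correction shrinks the ML chi-square by about 3 percent.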