

Greetings, the new skew SEM possibilities are really great! I have some questions regarding practical implementation.
(1) Typically, models use a number of variables that may fit normality assumptions to various degrees.
(a) Thus, how would you recommend we pick among the three available distributions (skewnormal, skewt, t)? Maybe start with SKEWT and look at the indicators of skewness and df? Would a SKEWT analysis be in any way biased if the data are more SKEWNORMAL or T (I think not, as even normal data seem to be correctly estimated with SKEWT based on your paper)? Would you recommend any specific guideline for selecting the distribution?
(b) I also saw that you can select specific variables in a single model for which to use the SKEW distributions (but not the others). Is there any guideline for determining the level of skewness that justifies dropping normality assumptions for a specific variable?
(2) Many studies use Likert items (ordered-categorical). Simulations have shown that treating these variables as continuous and using ML/MLR estimation is robust as long as there are more than 5 answer categories, whereas WLSMV tends to be better with fewer categories. You mention that the SKEW estimators are designed for continuous data. How does that translate to Likert items? For instance, would there be any problems in using MLR with a SKEW distribution with Likert items with 6-7 answer categories?


I'd try a sequence: Normal, t, skewnormal, skewt. If the skewness and kurtosis are small (say, within ±0.5?), you might get a better BIC with normal. But using, say, skewt would not hurt; it would simply give a skew of approximately 0 for the more normal variables. Skewnormal can't accommodate skewness beyond ±1. We've had problems fitting skewt with Likert scales, at least when not all answer categories have high frequency, but more experience is needed. Courses dealing with these new features are listed on our website. Handouts from these courses will be posted on our website shortly. There is a handout posted already from my 5/6/14 presentation to PSMG. The July Psychometric Society 1-day training on this will be videotaped, and we will post the video.

Ted Fong posted on Friday, May 16, 2014  2:58 am



Dear Dr. Muthén, I have tried factor mixture analysis with non-normal distributions using the newly released version 7.2 and have a few questions regarding this new analysis:
1) In your mixture examples in Webnote 19, MLF and ML were used as estimators instead of the default MLR. Which estimator(s) do you deem suitable for non-normal mixture modeling? Tech11 is not available without using MLR. Do you think it is essential to use MLR to obtain Tech11, or is it okay to simply rely on BIC for model comparison?
2) The 2-class, 2-factor FMA with t gives a much smaller BIC than the FMA with normal, and its results make substantial sense. The model warns about the low df parameters (2.77 and 2.91) and infinite skewness in both classes. May I ask what the distributional assumptions of the skewt distribution are, and how one should interpret the infinite skewness?
3) On page 10 of Webnote 19, it is written that ‘models with v < 3 should be used only for modeling data with substantial heavy tails and outliers’. Does this mean that I should not go on to perform FMA with skewnormal or skewt, as their results would not be valid in my case of infinite skewness, and should instead focus on the FMA results with normal and t?


1) I would use MLR. At first we had only MLF available, so that's why it was used in some of the early runs posted. I would simply use BIC.
2) Come to our UConn Mplus Version 7.2 course on Monday and we'll talk about it. It's a long story. Briefly, you can still describe (and plot) the estimated distribution, so in that sense it is OK. Small df can come from smallish class sizes; I think your total sample size was n=197, which is not a lot in this context. Also, we do need more practical experience.
3) Look at your histograms to see if you have heavy tails, that is, a large positive kurtosis value (see View Descriptive Stats). You can certainly try skewnormal and see how the BIC compares, but skewt may be out of reach for this small sample.


I am trying to run an SEM growth model using the newly released version 7.2 with skewed distributions. I am not able to get the models to run. The output states the following: THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. Do you have any suggestions? I have been investigating zero-inflated models for these data because about 30% of the density is at 0. Thank you.


Please send your output and license number to support@statmodel.com. 


I have the same error as Janice Kooken, who posted on Thursday, May 29, 2014. I am curious whether you have any suggestions for dealing with this error. I initially tried to run the measurement model for two scales with the latent factors predicting some skewed outcomes. Running everything in one model produced errors, but even taking the observed scores and attempting to predict the skewed outcomes one by one produced errors. I also tried increasing the number of random starts. In addition to skew, I have some floor and ceiling effects. I am wondering whether these data aren't appropriate for these options, or whether I am not using the options correctly!


Floor and ceiling effects were the issue that Kooken had, if I remember correctly. It appears that skew-SEM can have problems with that, since it is an extreme case of skewness that is probably better modeled in other ways, such as two-part modeling with parameters describing the probability of being at the floor or ceiling. If you like, you can send the data and input to Support so we can all learn more about this.


Dr. Muthen, thanks for your reply. You are correct about Kooken; I spoke with her directly! We have considered the zero-inflated Poisson model for one of our outcomes, but were interested in learning more about these new skew options. I will send along the data and input to get your thoughts!


Hi friends, Can someone please post some example syntax for the new distribution functions from 7.2? I'm doing a CFA with one factor and four indicators, all my indicators are positively skewed, and I theorize the latent construct to be positively skewed as well. How can I model this? Using Skewt? Is there some example syntax available? Thanks! 


See the handout from my version 7.2 workshop in Madison, July 21, 2014, at http://www.statmodel.com/7_2_presentations.shtml. Video will be posted shortly. I would start with the default setting of applying the skew to only the factor(s). So all you need to say is Distribution = skewt; in the Analysis command. Make sure the best logL is replicated by using Starts = x y; for some small x and smaller y. Note, however, that your indicators need to be continuous. 5-category Likert scales may not give sufficient information.
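As a rough illustration of the advice above, a one-factor, four-indicator CFA with a skew t factor might be set up along the following lines. The file name, variable names, and STARTS values are placeholders chosen for this sketch and do not come from the handout; option names should be checked against the User's Guide for your Mplus version.

```mplus
TITLE:    One-factor CFA with a skew t factor (illustrative sketch)
DATA:     FILE = example.dat;     ! hypothetical data file
VARIABLE: NAMES = y1-y4;          ! hypothetical continuous indicators
ANALYSIS: ESTIMATOR = MLR;
          DISTRIBUTION = SKEWT;   ! default: skew applied to the factor only
          STARTS = 20 5;          ! placeholder values for x and y
MODEL:    f BY y1-y4;
```

If the best loglikelihood is not replicated across the final-stage runs, the usual remedy is to rerun with larger STARTS values.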


Hi, I am trying to run a Monte Carlo analysis on simulated data using skewt mixture SEMs. The problem that occurs is that the commands for estimating skew and df seem to be ignored, as the output shows that they were not estimated and population values do not correspond to my starting values. However, if I apply the same input to the first dataset only it works well. Also if I combine the analysis with the data generation in one input it works well. Do you have any idea why that might be? Thanks! 


We need to see your Monte Carlo output; send it to Support along with your license number.


Hello, I read through the handout and have some questions. 1. Should the DISTRIBUTION options be considered for simple linear regressions with skewed outcomes? So far, most examples I've seen concern more complex models. 2. If so, which estimators should be used and why? Does MLR still cover certain aspects of non-normality that ML doesn't? What if multiple imputation is used or if there is missing data? 3. Lastly, are the DISTRIBUTION options still too experimental for those who are not doing research on statistics? Could their use be recommended for non-statistics research? Thanks.


Hello, I read the handout and have some doubts. 1. Are the DISTRIBUTION options still considered experimental and mostly for their use in statistical methods research? Or could they be used freely for all types of research? 2. Most examples I've seen concern complex models like mixture growth modeling. Could the DISTRIBUTION options be used in a simple regression? 3. If that is the case, which estimators should be used and why? Does MLR still cover aspects of nonnormality that ML wouldn't? What if there is missing data or multiple imputation is used? Thanks. 


1. This can be used for all types of research. We have found that it doesn't work well for variables with strong floor or ceiling effects.
2. Yes.
3. Q1: Use estimators in the ML family. Q2: Yes. Q3: Missing data is OK (ML under MAR is used as usual). MI is done under a normality assumption, so it is not quite consistent with skewt.


I see. Thank you very much! 


Hi, I have fitted a two-part model. Yet the continuous part of the model shows a rather strong floor effect. I had considered using the skew t distribution to handle that, but apparently this may not work too well. If so, what are my options in Mplus? Should I use bootstrapping to get the "best" possible standard errors for my point estimates?


With a two-part model you should see no floor effect at all. Check the input and the data. Perhaps you have a lowest value that is lower than the value with the high percentage, in which case you can use the CUTPOINT option (see V8 UG page 585).

Samuli Helle posted on Monday, September 03, 2018  12:27 pm



If the data has a strong floor effect to begin with? We have theoretical interest to model the zero and the continuous parts separately. Am I missing something here? 


I mean that after you have used DATA TWOPART to split the variable with a strong floor effect into a continuous part and a binary part, there should be a vastly reduced floor effect for the continuous part. 
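To make the splitting step concrete, here is a minimal sketch of a DATA TWOPART setup for a single outcome with a pile-up at zero. The variable names (y, x, u, ypos), the file name, and the cutpoint are assumptions made for illustration, and the regression on x only stands in for whatever model is of interest.

```mplus
DATA:     FILE = example.dat;     ! hypothetical data file
DATA TWOPART:
          NAMES = y;              ! original variable with the floor effect
          CUTPOINT = 0;           ! values at or below this are the floor
          BINARY = u;             ! new variable: 1 if y is above the cutpoint
          CONTINUOUS = ypos;      ! new variable: log of y if above, else missing
VARIABLE: NAMES = y x;
          USEVARIABLES = x u ypos;
          CATEGORICAL = u;
ANALYSIS: ESTIMATOR = MLR;
MODEL:    u ON x;                 ! binary part
          ypos ON x;              ! continuous part
```

If the lowest observed value is not the value carrying the large percentage, setting CUTPOINT to the piled-up value is what moves that mass into the binary part.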

Samuli Helle posted on Tuesday, September 04, 2018  1:45 am



Yes, but the floor effect still clearly exists, making the residuals of the continuous part strongly non-normal. Any advice on how strong a floor effect the skew t distribution can handle?


I don't recommend using skewt for a variable with a floor effect. The BMI distribution is a perfect example of what skewt handles well. If you have a strong floor for the continuous Y part of the two-part model, the default log Y assumption is not good. I would suggest categorizing it into an ordinal variable treated as Categorical.


Hello, I am wondering when it would make more sense to use normal mixture modeling rather than skew mixture modeling (even if the BIC is worse). In the Asparouhov & Muthén (2016) "Structural Equation Models and Mixture Models With Continuous Nonnormal Skewed Distributions" paper, it says "Modeling with the skew t distribution in general requires larger sample sizes than modeling with the normal distribution... If the sample size is not sufficient, the additional skewness parameters in the skew t distribution will not be statistically significant and in that case they should be eliminated from the model to preserve model parsimony and minimize the standard errors for the remaining model parameters." If there are 6 continuous indicators for a LPA, of which 2 have skewness & kurtosis and 1 has skewness only... what would be a sample size that is sufficient? Are there other reasons to forgo skew mixture modeling aside from small sample size? Thank you. 


Hard to say which N is required without doing a Monte Carlo simulation study for your particular situation; that can be done. Perhaps N > 500. Also, skewt is generally not suitable for variables with strong floor or ceiling effects.


Thank you, Dr. Muthen. I would love to try a simulation study to answer this question but am unsure how to start. Would you have a user guide recommendation for testing whether I have a large enough sample to conduct modeling with skewt?


I can email a Monte Carlo script to you. 


Thank you, Dr. Muthen. I have one more, naive, follow-up question. Is it possible to specify estimation of skew parameters for just the few indicators that have skew, rather than producing an extra parameter for all of the indicators? Can the syntax be adjusted in some way to do this? It would be helpful to reduce the number of parameters estimated, since I have a limited number of indicators I can use for the LPA. But I'm unsure whether this makes sense methodologically, and whether it is possible in Mplus.


It is possible to specify estimation of skew parameters for just the few indicators that have skew. If you do not mention the {Y} parameter in the model it is assumed that the skew parameter for Y is fixed to 0 and no skewness is modeled for that variable. 
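A minimal sketch of what this could look like for an LPA with six indicators, three of them skewed. The variable names, file name, and number of classes are hypothetical; the curly-brace notation for the skew parameter follows the usage elsewhere in this thread.

```mplus
DATA:     FILE = example.dat;     ! hypothetical data file
VARIABLE: NAMES = y1-y6;          ! hypothetical indicators
          CLASSES = c(3);         ! hypothetical number of profiles
ANALYSIS: TYPE = MIXTURE;
          DISTRIBUTION = SKEWT;
MODEL:    %OVERALL%
          {y1}; {y2}; {y3};       ! skew estimated for these three only
          ! y4-y6 are not mentioned in braces, so their skew is fixed at 0
```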


Thank you, I understand now. Similarly, is it possible to specify a non-normal distribution (tdist) for one timepoint only, when doing an LTA with multiple timepoints where only one may need the non-normal distribution for finding latent profiles?


Yes. 


I'm not sure I am writing the syntax correctly. Would this be how to specify tdist for only one timepoint, one class? (I reduced to only two timepoints to fit the limits of posting size.)
...
usevariables are y1t1 y2t1 y3t1 y4t1 y5t1 y6t1 y7t1 y1t2 y2t2 y3t2 y4t2 y5t2 y6t2 y7t2;
classes = c1(3) c2(2);
Analysis:
  type = mixture;
  distribution = tdist;
Model:
  %OVERALL%
  [c1#1-c2#1];
  c2 on c1;
Model c1:
  %c1#1%
  {df*1};
  [y1t1]; [y2t1]; [y3t1]; [y4t1]; [y5t1]; [y6t1]; [y7t1];
  %c1#2%
  {df@0};
  [y1t1]; [y2t1]; [y3t1]; [y4t1]; [y5t1]; [y6t1]; [y7t1];
  %c1#3%
  {df@0};
  [y1t1]; [y2t1]; [y3t1]; [y4t1]; [y5t1]; [y6t1]; [y7t1];
Model c2:
  %c2#1%
  {df@0};
  [y1t2]; [y2t2]; [y3t2]; [y4t2]; [y5t2]; [y6t2]; [y7t2];
  %c2#2%
  {df@0};
  [y1t2]; [y2t2]; [y3t2]; [y4t2]; [y5t2]; [y6t2]; [y7t2];
Thank you again.


To convert the t-distribution to a normal distribution, you have to fix the DF parameter to a large value, not 0. For example, {df@50}.
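Applied to the class-specific parts of the model in the question, the correction would look roughly like the fragment below. Only the df statements are shown; the mean statements stay as posted. The value 50 is the example value from the reply, not a required constant.

```mplus
MODEL c1:
  %c1#1%
  {df*1};      ! df estimated in this class (start value from the question)
  %c1#2%
  {df@50};     ! large fixed df: effectively normal
  %c1#3%
  {df@50};
MODEL c2:
  %c2#1%
  {df@50};
  %c2#2%
  {df@50};
```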


Hello, is there a way of converting between the values for the skewnormal and skew t distributions and skewness and/or kurtosis statistics? I.e. if I have a particular value and df of skew t, what skewness/kurtosis values would that imply? Best, Fredrik Falkenström 


Hi again, the reason I asked my previous question was that I want to use the Monte Carlo function to generate data with one variable at certain levels of skewness, while other variables should be normal. Not sure how I would go about choosing the values of the skewness parameters to achieve this, though? Fredrik Falkenström 


See (21)-(23) in http://statmodel.com/download/Skew.pdf. Mplus will compute these for you with OUTPUT: RESIDUAL. Essentially, you would just fix the skew parameter to 0 for all but one variable. For example,
ANALYSIS: DISTRIBUTION = SKEWNORMAL;
MODEL POPULATION:
  Y1-Y2*1;
  Y1 with Y2*0.5;
  {Y1*3};
  {Y2@0};
will generate a skewed Y1 and a normal Y2. You can then analyze the generated sample with a separate input file with OUTPUT: RESIDUAL to see the level of skewness.
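For readers who want the whole workflow, the two steps described above might be laid out as two separate input files, sketched here in one listing. The sample size, seed, and file name are placeholders, and the choice to fit the skew-normal model again in the second file (so that model-implied skewness is reported) is an assumption of this sketch.

```mplus
! ---- File 1: generate one data set ----
MONTECARLO: NAMES = y1 y2;
            NOBSERVATIONS = 5000;   ! placeholder sample size
            NREPS = 1;
            SEED = 12345;           ! placeholder seed
            SAVE = skewgen.dat;     ! hypothetical output file
ANALYSIS:   DISTRIBUTION = SKEWNORMAL;
MODEL POPULATION:
            y1-y2*1;
            y1 WITH y2*0.5;
            {y1*3};
            {y2@0};
MODEL:      y1-y2*1;
            y1 WITH y2*0.5;
            {y1*3};
            {y2@0};

! ---- File 2: analyze the saved data set ----
DATA:       FILE = skewgen.dat;
VARIABLE:   NAMES = y1 y2;
ANALYSIS:   DISTRIBUTION = SKEWNORMAL;
MODEL:      y1 WITH y2;
            {y1};
OUTPUT:     RESIDUAL;
```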


Thank you very much! It seems that if I set {Y1*10} I get skewness close to 1. I'm analysing a misspecified model that doesn't take skewness into account, and I'd like to keep the variance of Y1 = 1 in that model. However, when I specify Y1*1 and {Y1*10} the variance of Y1 explodes to almost 40 (although curiously the correlation between Y1 and Y2 is correctly estimated as .50). How can I specify the model so that the variance of Y1 becomes 1? 


The skew parameter {Y1*10} (that is, the delta parameter in formulas 21-23) is different from the skewness of the variable. In this modeling framework it is not possible to generate bivariate data where Var(Y1)=Var(Y2)=1, Skew(Y1)=1, Skew(Y2)=0, and Cov(Y1,Y2)=0.5. This is explained in the paragraph between formulas (23) and (24). If you drop Cov(Y1,Y2)=0.5, you can generate it using Y1*0.001; {Y1*1.6}; but that is probably not that useful. I would recommend using a skewness smaller than 1 so that some correlation can be accommodated. I would also not worry about having variance=1. You can use define: standardize y1; in the subsequent analysis to get that. The standardization will of course preserve the skewness.
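The difference between the skew parameter and the skewness can be checked by hand in the univariate skew-normal case. The sketch below assumes the representation Y = mu + delta|U0| + U1, with U0 standard normal and U1 normal with variance sigma^2; this notation is an assumption here and should be checked against the formulas in Skew.pdf.

```latex
\begin{aligned}
\operatorname{Var}(Y)  &= \sigma^2 + \left(1 - \tfrac{2}{\pi}\right)\delta^2, \\
\operatorname{Skew}(Y) &= \frac{\sqrt{2/\pi}\,\left(\tfrac{4}{\pi} - 1\right)\delta^3}
                               {\left[\sigma^2 + \left(1 - \tfrac{2}{\pi}\right)\delta^2\right]^{3/2}} .
\end{aligned}
```

With sigma^2 = 1 and delta = 10 this gives a variance of about 37.3 and a skewness of about 0.96, in line with the values reported in the question. As delta grows relative to sigma, the skewness approaches roughly 0.995, which is why it cannot exceed 1 in absolute value under the skew-normal.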


Ok, thanks! Is there any way of generating samples with larger skewness than 1 in one of the variables while also keeping the correlation between variables? 


Not with the skew method implemented in Mplus. You can perhaps do other things, like combining a categorical and a continuous variable, to get bigger skewness. See http://mathworld.wolfram.com/BernoulliDistribution.html and formula (20) in http://www.diva-portal.org/smash/get/diva2:302313/FULLTEXT01.pdf


Ok, thanks. One more question: is it possible to fix the level of skewness in the data generation process, so that the skewness level is constant across samples? I tried {X1@3} but it seems that skewness is still sampled rather than fixed. 


I am not really sure what you are seeing, but the value 3 doesn't correspond to a sample statistic: the skew parameter is not the same as the sample skewness. Apart from that, the sample is generated from a particular distribution that is specified in the MODEL POPULATION command. For a finite sample size, the sample skewness will differ from the true skewness, just as the sample mean differs from the true mean. For larger samples the differences will be negligible.
