Skew SEM
 Alexandre J.S. Morin posted on Wednesday, May 14, 2014 - 8:49 pm
Greetings, The new skew SEM possibilities are really great! I have some questions regarding practical implementation.
(1) Typically, models use a number of variables that may fit normality assumptions to various degrees.
(a) Thus, how would you recommend we pick one of the three available distributions (skewnormal, skewt, t)? Maybe start with SKEWT and look at the indicators of skewness and df? Would a SKEWT analysis be in any way biased if the data are closer to SKEWNORMAL or T (I think not, as even normal data seem to be correctly estimated with SKEWT based on your paper)? Would you recommend any specific guideline for selecting the distribution?
(b) I also saw that you can select specific variables in a single model for which to use the SKEW distributions (but not the others). Is there a guideline for determining the level of skewness that justifies dropping the normality assumption for a specific variable?
(2) Many studies use Likert items (ordered-categorical). Simulations have shown that treating these variables as continuous and using ML/MLR estimation is robust as long as there are more than 5 answer categories, whereas WLSMV tends to be better with fewer categories. You mention that the SKEW estimators are designed for continuous data. How does that translate to Likert items? For instance, would there be any problems in using MLR with a SKEW distribution with Likert items with 6-7 answer categories?
 Bengt O. Muthen posted on Thursday, May 15, 2014 - 6:03 am
I'd try a sequence: Normal, t, skew-normal, skew-t. If the skew and kurtosis are small (say less than plus minus 0.5?) you might get a better BIC with normal. Using, say, a skew-t would not hurt, though; it would simply give skew approx = 0 for the more normal variables. Skew-normal can't accommodate skews larger than plus minus 1. We've had problems fitting skew-t with Likert scales, at least when not all answer categories have high frequency, but more experience is needed.

Courses dealing with these new features are listed on our website. There will be handouts from these courses posted on our website shortly. There is a handout posted already from my 5/6/14 presentation to PSMG. There will be a videotaping of the July Psychometric Society 1-day training on this that we will post.
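
The rule of thumb above (skew and kurtosis within about plus minus 0.5, and the limit of roughly plus minus 1 on skew-normal skewness) can be checked numerically before fitting anything. The following is a minimal sketch in Python, not Mplus syntax and not part of the original replies. The function names, the 0.5 cutoff default, and the wording of the notes are illustrative assumptions; the skew-normal skewness formula is the standard one in terms of the shape parameter alpha.

```python
import math


def skew_normal_skewness(alpha: float) -> float:
    """Skewness of a skew-normal variable with shape parameter alpha."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    m = delta * math.sqrt(2.0 / math.pi)  # mean of the standardized form
    return (4.0 - math.pi) / 2.0 * m ** 3 / (1.0 - m * m) ** 1.5


# Limit as alpha -> infinity (delta -> 1): the largest attainable skewness.
SKEW_NORMAL_MAX = (
    (4.0 - math.pi) / 2.0
    * (2.0 / math.pi) ** 1.5
    / (1.0 - 2.0 / math.pi) ** 1.5
)


def sample_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments divided by n."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def screening_notes(values, small=0.5):
    """Apply the two rules of thumb; `small` is the assumed 0.5 cutoff."""
    skew, kurt = sample_skew_kurtosis(values)
    notes = []
    if abs(skew) < small and abs(kurt) < small:
        notes.append("skew and kurtosis are small; normal may give the best BIC")
    if abs(skew) > SKEW_NORMAL_MAX:
        notes.append("skew exceeds what skew-normal can represent")
    return skew, kurt, notes
```

The constant SKEW_NORMAL_MAX evaluates to about 0.995, which is where the "plus minus 1" limit comes from. Such screening only suggests where to start; the BIC comparison across the fitted models remains the deciding step.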
 Ted Fong posted on Friday, May 16, 2014 - 2:58 am
Dear Dr. Muthén,

I have tried factor mixture analysis with non-normal distributions using the newly released version 7.2 and have a few questions regarding this new analysis:

1) In your mixture examples in Webnote 19, MLF and ML were used as estimators instead of the default MLR. Which estimator(s) do you deem suitable for non-normal mixture modeling? Tech11 is not available without using MLR. Do you think it is essential to use MLR to obtain Tech11, or is it okay to simply rely on BIC for model comparison?

2) The 2-class, 2-factor FMA with t gives a much smaller BIC than the FMA with normal, and its results make substantive sense. The model warns about the low df parameters (2.77 and 2.91) and infinite skewness in both classes. May I ask what the distributional assumptions of the skew-t distribution are and how one should interpret the infinite skewness?

3) On page 10 of Webnote 19, it is written that ‘models with v < 3 should be used only for modeling data with substantial heavy tails and outliers’. Does this mean that I should not go on to perform FMA with skew-normal or skew-t, as their results would not be valid in my case of infinite skewness, and should instead focus on the FMA results with normal and t?
 Bengt O. Muthen posted on Saturday, May 17, 2014 - 11:34 am
1) I would use MLR. At first we had only MLF available so that's why that was used in some early runs posted. I would simply use BIC.

2) Come to our UConn Mplus Version 7.2 course on Monday and we'll talk about it. It's a long story. Briefly, you can still describe (and plot) the estimated distribution so in that sense it is ok. Small df can come from smallish class sizes - I think your total sample size is n=197, which is not a lot in this context. Also, we do need more practical experience.

3) Look at your histograms to see if you have heavy tails - that is, a large positive kurtosis value (see View Descriptive Stats). You can certainly try skew-normal and see how the BIC compares, but skew-t may be out of reach for this small sample.
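
As background to the low-df warning discussed above: for a standard Student t distribution, the moment of order k is finite only when df exceeds k. With df estimates of 2.77 and 2.91 the third moment does not exist, which is consistent with a report of infinite skewness. The sketch below is in Python and encodes only this textbook property of the symmetric t; the function names are illustrative, and whether the Mplus skew-t parameterization uses exactly the same thresholds is an assumption not confirmed in this thread.

```python
import math


def student_t_moments(df: float) -> dict:
    """Report which moments of a Student t with `df` degrees of freedom are finite."""
    return {
        "mean": df > 1,      # first moment
        "variance": df > 2,  # second moment
        "skewness": df > 3,  # needs a finite third moment
        "kurtosis": df > 4,  # needs a finite fourth moment
    }


def student_t_excess_kurtosis(df: float):
    """Excess kurtosis of a Student t: 6/(df-4), infinite for 2 < df <= 4."""
    if df > 4:
        return 6.0 / (df - 4.0)
    if df > 2:
        return math.inf
    return None  # undefined when the variance itself does not exist


# The df values reported in the question above.
for df in (2.77, 2.91):
    print(df, student_t_moments(df), student_t_excess_kurtosis(df))
```

For both reported df values the variance is finite while the skewness and kurtosis are not, so the fitted distribution can still be described and plotted even though those summary moments cannot be reported.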
 Janice Kooken posted on Thursday, May 29, 2014 - 7:11 pm
I am trying to run an SEM growth model using the newly released 7.2 with skewed distributions. I am not able to get the models to run. The output states the following:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE
COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Do you have any suggestions?
I have been investigating zero inflated models for this data because it has about 30% of density at 0.
Thank you.
 Linda K. Muthen posted on Friday, May 30, 2014 - 11:29 am
 Jessica Kay Flake posted on Tuesday, June 24, 2014 - 10:50 am
I have the same error as Janice Kooken, who posted on Thursday May 29, 2014. I am curious if you have any suggestions for dealing with this error.
I initially tried to run the measurement model for two scales with the latent factors predicting some skewed outcomes. Running everything in one model produced errors, but even taking the observed scores and attempting to predict the skewed outcomes one by one produced errors. I also tried increasing the number of random starts. In addition to skew, I have some floor and ceiling effects. I am wondering whether these data aren't appropriate for these options, or whether I am not using the options correctly!
 Bengt O. Muthen posted on Tuesday, June 24, 2014 - 11:26 am
Floor and ceiling effects were the issue that Kooken had, if I remember correctly. It appears that skew-SEM can have problems with that since it is an extreme case of skewness that is probably better modeled in other ways, such as two-part modeling with parameters describing the probability of being at the floor or ceiling. If you like, you can send the data and input to Support so we can all learn more about this.
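
To illustrate the two-part idea mentioned above: the outcome is split into a binary indicator (at the floor or above it) and a continuous part that is defined only for cases above the floor. The following is a minimal data-preparation sketch in Python, not Mplus syntax. The function name, the default floor of 0, and the optional log transform of the distance above the floor are illustrative assumptions.

```python
import math


def two_part_split(values, floor=0.0, log_transform=True):
    """Split an outcome with a floor into a binary part and a continuous part.

    Returns (indicators, continuous), where indicators[i] is 1 if values[i]
    is above the floor and 0 otherwise, and continuous[i] is None (missing)
    for cases at the floor.
    """
    indicators, continuous = [], []
    for v in values:
        above = v > floor
        indicators.append(1 if above else 0)
        if not above:
            continuous.append(None)  # treated as missing in the continuous part
        elif log_transform:
            continuous.append(math.log(v - floor))  # assumed transform
        else:
            continuous.append(v)
    return indicators, continuous


# Hypothetical outcome with a pile-up of zeros.
u, y = two_part_split([0.0, 0.0, 2.5, 1.0], log_transform=False)
print(u)  # [0, 0, 1, 1]
print(y)  # [None, None, 2.5, 1.0]
```

The binary part then carries the probability of being at the floor, and the continuous part is modeled only where it is observed. A ceiling can be handled the same way by mirroring the comparison.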
 Jessica Kay Flake posted on Wednesday, June 25, 2014 - 1:23 pm
Dr. Muthen,
Thanks for your reply. You are correct about Kooken; I spoke with her directly! We have considered the zero-inflated Poisson model for one of our outcomes, but were interested in learning more about these new skew options. I will send along the data and input to get your thoughts!
 Eric Thibodeau posted on Friday, August 08, 2014 - 1:18 pm
Hi friends,

Can someone please post some example syntax for the new distribution functions from 7.2? I'm doing a CFA with one factor and four indicators, all my indicators are positively skewed, and I theorize the latent construct to be positively skewed as well. How can I model this? Using Skewt? Is there some example syntax available?

Thanks!
 Bengt O. Muthen posted on Friday, August 08, 2014 - 4:02 pm
See the handout from my version 7.2 workshop in Madison, July 21, 2014. See

http://www.statmodel.com/7_2_presentations.shtml

Video will be posted shortly.

I would start with the default setting of applying the skew to only the factor(s). So all you need to say is

Distribution = skewt;

in the Analysis command. Make sure the best logL is replicated by using Starts = x y; for some small x and smaller y.

Note, however, that your indicators need to be continuous. 5-category Likert scales may not give sufficient information.
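
The advice above to make sure the best logL is replicated can be checked mechanically once the final log-likelihood values from the random starts have been copied out of the output. This is a minimal sketch in Python; the function name, the tolerance, and the example values are hypothetical and do not come from any Mplus run.

```python
def best_loglik_replicated(logliks, tol=1e-3, min_count=2):
    """Check whether the best log-likelihood is reached by several starts.

    Returns (best, count, replicated), where count is the number of starts
    whose final log-likelihood lies within `tol` of the best one.
    """
    best = max(logliks)
    count = sum(1 for ll in logliks if best - ll <= tol)
    return best, count, count >= min_count


# Hypothetical final-stage log-likelihoods from four starts.
best, count, ok = best_loglik_replicated([-1523.412, -1523.412, -1530.870, -1523.4121])
print(best, count, ok)
```

If the best value is reached by only one start, the usual response is to increase both numbers in Starts and rerun before interpreting the solution.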
 Jana Holtmann posted on Tuesday, April 25, 2017 - 7:57 am
Hi, I am trying to run a Monte Carlo analysis on simulated data using skew-t mixture SEMs. The problem that occurs is that the commands for estimating skew and df seem to be ignored, as the output shows that they were not estimated and population values do not correspond to my starting values. However, if I apply the same input to the first dataset only it works well. Also if I combine the analysis with the data generation in one input it works well. Do you have any idea why that might be? Thanks!
 Bengt O. Muthen posted on Tuesday, April 25, 2017 - 5:26 pm
We need to see your Monte Carlo output - send to Support along with your license number.
 Cristina Ramirez posted on Friday, September 29, 2017 - 8:35 pm
Hello, I read through the handout and have some questions.

1. Should the DISTRIBUTION options be considered for simple linear regressions with skewed outcomes? So far, most examples I've seen concern more complex models.

2. If so, which estimators should be used and why? Does MLR still cover certain aspects of non-normality that ML doesn't? What if multiple imputation is used or if there is missing data?

3. Lastly, are the DISTRIBUTION options still too experimental for those who are not doing research in statistics? Could their use be recommended for applied, non-statistical research?

Thanks.
 Cristina Ramirez posted on Saturday, September 30, 2017 - 6:55 pm
Hello, I read the handout and have some questions.

1. Are the DISTRIBUTION options still considered experimental and mostly for their use in statistical methods research? Or could they be used freely for all types of research?

2. Most examples I've seen concern complex models like mixture growth modeling. Could the DISTRIBUTION options be used in a simple regression?

3. If that is the case, which estimators should be used and why? Does MLR still cover aspects of non-normality that ML wouldn't? What if there is missing data or multiple imputation is used?

Thanks.
 Bengt O. Muthen posted on Sunday, October 01, 2017 - 12:33 pm
1. This can be used for all types of research. We have found that it doesn't work well for variables with strong floor or ceiling effects.

2. Yes

3.

Q1: Use estimators in the ML family.

Q2: Yes

Q3: Missing data is ok (uses ML under MAR as usual). MI is done under a normality assumption, so it is not quite consistent with skew-t.
 Cristina Ramirez posted on Tuesday, October 03, 2017 - 5:41 pm
I see. Thank you very much!