Skew SEM
 Alexandre J.S. Morin posted on Wednesday, May 14, 2014 - 8:49 pm
Greetings, The new skew SEM possibilities are really great! I have some questions regarding practical implementation.
(1) Typically, models use a number of variables that may fit normality assumptions to various degrees.
(a) Thus, how would you recommend we pick one of the three available distributions (skewnormal, skewt, t)? Maybe start with SKEWT and look at the indicators of skewness and df? Would a SKEWT analysis be biased in any way if the data are closer to SKEWNORMAL or T (I think not, as even normal data seem to be correctly estimated with SKEWT based on your paper)? Would you recommend any specific guideline for selecting the distribution?
(b) I also saw that you can select specific variables in a single model for which to use the SKEW distributions (but not the others). Any guideline to suggest to determine the level of skewness that justifies dropping normality assumptions for a specific variable?
(2) Many studies use Likert items (ordered-categorical). Simulations have shown that treating these variables as continuous and using ML/MLR estimation is robust as long as there are more than 5 answer categories, whereas WLSMV tends to be better with fewer categories. You mention that the SKEW estimators are designed for continuous data. How does that translate to Likert items? For instance, would there be any problems in using MLR with a SKEW distribution for Likert items with 6-7 answer categories?
 Bengt O. Muthen posted on Thursday, May 15, 2014 - 6:03 am
I'd try a sequence: Normal, t, skew-normal, skew-t. If the skew and kurtosis are small (say, within plus or minus 0.5?), you might get a better BIC with normal. But using, say, skew-t would not hurt; it would simply give skew approximately 0 for the more normal variables. Skew-normal can't accommodate skews larger than plus or minus 1. We've had problems fitting skew-t with Likert scales, at least if not all answer categories have high frequency, but more experience is needed.
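
The sequence above could be run as four otherwise identical inputs that differ only in the DISTRIBUTION setting, with the BIC compared across runs. The following is a minimal sketch, not an official example: the file name, variable names and one-factor model are hypothetical placeholders.

```mplus
TITLE:    Distribution comparison (sketch; file, variables and
          model are hypothetical placeholders)
DATA:     FILE = mydata.dat;
VARIABLE: NAMES = y1-y4;
ANALYSIS: DISTRIBUTION = NORMAL;            ! run 1
          ! run 2: DISTRIBUTION = TDISTRIBUTION;
          ! run 3: DISTRIBUTION = SKEWNORMAL;
          ! run 4: DISTRIBUTION = SKEWT;
MODEL:    f BY y1-y4;
```

Only one DISTRIBUTION line should be active per run; the BIC reported in each output is what gets compared.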

Courses dealing with these new features are listed on our website. There will be handouts from these courses posted on our website shortly. There is a handout posted already from my 5/6/14 presentation to PSMG. There will be a videotaping of the July Psychometric Society 1-day training on this that we will post.
 Ted Fong posted on Friday, May 16, 2014 - 2:58 am
Dear Dr. Muthén,

I have tried factor mixture analysis with non-normal distributions using the newly released version 7.2 and have a few questions regarding this new analysis:

1) In your mixture examples in Webnote 19, MLF and ML were used as estimators instead of the default MLR. Which estimator(s) do you deem suitable for non-normal mixture modeling? Tech11 is not available without using MLR. Do you think it is essential to use MLR to obtain Tech11, or is it okay to simply rely on BIC for model comparison?

2) The 2-class, 2-factor FMA with t gives a much smaller BIC than the FMA with normal, and its results make substantial sense. The model warns about the low df parameters (2.77 and 2.91) and infinite skewness in both classes. May I ask what the distributional assumptions of the skew-t distribution are, and how one should interpret the infinite skewness?

3) On page 10 of Webnote 19, it is written that ‘models with v < 3 should be used only for modeling data with substantial heavy tails and outliers’. Does this mean that I should not go on to perform FMA with skew-normal or skew-t, as their results would not be valid in my case of infinite skewness, and should instead focus on the FMA results with normal and t?
 Bengt O. Muthen posted on Saturday, May 17, 2014 - 11:34 am
1) I would use MLR. At first we had only MLF available so that's why that was used in some early runs posted. I would simply use BIC.

2) Come to our UConn Mplus Version 7.2 course on Monday and we'll talk about it. It's a long story. Briefly, you can still describe (and plot) the estimated distribution, so in that sense it is ok. Small df can come from smallish class sizes - I think your total sample size is n=197, which is not a lot in this context. Also, we do need more practical experience.

3) Look at your histograms to see if you have heavy tails - that is, a large positive kurtosis value (see View Descriptive Stats). You can certainly try skew-normal and see how the BIC compares, but skew-t may be out of reach for this small sample.
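
The advice in points 1) and 3) might translate into an input along the following lines. This is a sketch under assumptions: the file name, indicator names, factor structure and STARTS values are hypothetical and not taken from the poster's model.

```mplus
DATA:     FILE = mydata.dat;          ! hypothetical file
VARIABLE: NAMES = y1-y6;              ! hypothetical indicators
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
          ESTIMATOR = MLR;
          DISTRIBUTION = SKEWNORMAL;  ! compare BIC with NORMAL and
                                      ! TDISTRIBUTION runs
          STARTS = 200 50;            ! illustrative values only
MODEL:    %OVERALL%
          f1 BY y1-y3;
          f2 BY y4-y6;
```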
 Janice Kooken posted on Thursday, May 29, 2014 - 7:11 pm
I am trying to run an SEM growth model using the newly released 7.2 with skewed distributions. I am not able to get the models to run. The output states the following:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE
COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Do you have any suggestions?
I have been investigating zero inflated models for this data because it has about 30% of density at 0.
Thank you.
 Linda K. Muthen posted on Friday, May 30, 2014 - 11:29 am
 Jessica Kay Flake posted on Tuesday, June 24, 2014 - 10:50 am
I have the same error as Janice Kooken, who posted on Thursday May 29, 2014. I am curious if you have any suggestions for dealing with this error.
I initially tried to run the measurement model for two scales with the latent factors predicting some skewed outcomes. Running everything in one model produced errors, but even taking the observed scores and attempting to predict the skewed outcomes one by one produced errors. I also tried increasing the number of random starts. In addition to skew, I have some floor and ceiling effects. I am wondering whether these data aren't appropriate for these options, or whether I am not using the options correctly!
 Bengt O. Muthen posted on Tuesday, June 24, 2014 - 11:26 am
Floor and ceiling effects were the issue that Kooken had, if I remember correctly. It appears that skew-SEM can have problems with them, since they are an extreme case of skewness that is probably better modeled in other ways, such as two-part modeling with parameters describing the probability of being at the floor or ceiling. If you like, you can send the data and input to Support so we can all learn more about this.
 Jessica Kay Flake posted on Wednesday, June 25, 2014 - 1:23 pm
Dr. Muthen,
Thanks for your reply. You are correct about Kooken; I spoke with her directly! We have considered the zero-inflated Poisson model for one of our outcomes, but were interested in learning more about these new skew options. I will send along the data and input to get your thoughts!
 Eric Thibodeau posted on Friday, August 08, 2014 - 1:18 pm
Hi friends,

Can someone please post some example syntax for the new distribution functions from 7.2? I'm doing a CFA with one factor and four indicators, all my indicators are positively skewed, and I theorize the latent construct to be positively skewed as well. How can I model this? Using Skewt? Is there some example syntax available?

Thanks!
 Bengt O. Muthen posted on Friday, August 08, 2014 - 4:02 pm
See the handout from my version 7.2 workshop in Madison, July 21, 2014. See

http://www.statmodel.com/7_2_presentations.shtml

Video will be posted shortly.

I would start with the default setting of applying the skew to only the factor(s). So all you need to say is

Distribution = skewt;

in the Analysis command. Make sure the best logL is replicated by using Starts = x y; for some small x and smaller y.

Note, however, that your indicators need to be continuous. 5-category Likert scales may not give sufficient information.
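
Put together, the suggestion above might look like the following input for a one-factor CFA with four continuous indicators. This is only a sketch: the file name, variable names and STARTS values are hypothetical, and the commented-out `{y1};` line relies on the brace notation for skew parameters that appears later in this thread.

```mplus
DATA:     FILE = mydata.dat;      ! hypothetical file
VARIABLE: NAMES = y1-y4;          ! four continuous indicators
ANALYSIS: DISTRIBUTION = SKEWT;   ! by default skew applies to the factor
          STARTS = 20 5;          ! placeholder values for "x y"
MODEL:    f BY y1-y4;
          ! {y1};                 ! optional: skew parameter for an indicator
```

If the best loglikelihood is not replicated across the final-stage starts, the STARTS values can be increased.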
 Jana Holtmann posted on Tuesday, April 25, 2017 - 7:57 am
Hi, I am trying to run a Monte Carlo analysis on simulated data using skew-t mixture SEMs. The problem that occurs is that the commands for estimating skew and df seem to be ignored, as the output shows that they were not estimated and population values do not correspond to my starting values. However, if I apply the same input to the first dataset only it works well. Also if I combine the analysis with the data generation in one input it works well. Do you have any idea why that might be? Thanks!
 Bengt O. Muthen posted on Tuesday, April 25, 2017 - 5:26 pm
We need to see your Monte Carlo output - send to Support along with your license number.
 Cristina Ramirez posted on Friday, September 29, 2017 - 8:35 pm
Hello, I read through the handout and have some questions.

1. Should the DISTRIBUTION options be considered for simple linear regressions with skewed outcomes? So far, most examples I've seen concern more complex models.

2. If so, which estimators should be used and why? Does MLR still cover certain aspects of non-normality that ML doesn't? What if multiple imputation is used or if there is missing data?

3. Lastly, are the DISTRIBUTION options still too experimental for those who are not researching on statistics? Could their use be recommended for non-statistics-research?

Thanks.
 Cristina Ramirez posted on Saturday, September 30, 2017 - 6:55 pm
Hello, I read the handout and have some doubts.

1. Are the DISTRIBUTION options still considered experimental and mostly for their use in statistical methods research? Or could they be used freely for all types of research?

2. Most examples I've seen concern complex models like mixture growth modeling. Could the DISTRIBUTION options be used in a simple regression?

3. If that is the case, which estimators should be used and why? Does MLR still cover aspects of non-normality that ML wouldn't? What if there is missing data or multiple imputation is used?

Thanks.
 Bengt O. Muthen posted on Sunday, October 01, 2017 - 12:33 pm
1. This can be used for all types of research. We have found that it doesn't work well for variables with strong floor or ceiling effects.

2. Yes

3.

Q1: Use estimators in the ML family.

Q2: Yes

Q3: Missing data is ok (uses ML under MAR as usual). MI is done under normality assumption so not quite consistent with skew-t.
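
For the simple-regression case discussed above, a minimal sketch could look as follows. The file and variable names are hypothetical. The `{y};` statement is an assumption based on the brace notation for skew parameters used later in this thread, where an unmentioned skew parameter is described as fixed at 0.

```mplus
DATA:     FILE = mydata.dat;      ! hypothetical file
VARIABLE: NAMES = y x1 x2;        ! y is the skewed outcome
ANALYSIS: ESTIMATOR = MLR;        ! an estimator from the ML family
          DISTRIBUTION = SKEWT;
MODEL:    y ON x1 x2;
          {y};                    ! skew parameter for y (assumed needed)
```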
 Cristina Ramirez posted on Tuesday, October 03, 2017 - 5:41 pm
I see. Thank you very much!
 Samuli Helle posted on Friday, August 31, 2018 - 3:04 am
Hi,

I have fitted a two-part model. Yet, the continuous part of the model shows a rather strong floor effect. I had considered using the skew-t distribution to handle that, but apparently this may not work too well. If that is true, what are my options in Mplus? Should I use bootstrapping to get the "best" possible standard errors for my point estimates?
 Bengt O. Muthen posted on Friday, August 31, 2018 - 2:01 pm
With a two-part model you should see no floor effect at all. Check the input and the data. Perhaps you have a lowest value that is lower than the value with the high percentage, in which case you can use the CUTPOINT option (see V8 UG page 585).
 Samuli Helle posted on Monday, September 03, 2018 - 12:27 pm
If the data has a strong floor effect to begin with? We have theoretical interest to model the zero and the continuous parts separately. Am I missing something here?
 Bengt O. Muthen posted on Monday, September 03, 2018 - 2:05 pm
I mean that after you have used DATA TWOPART to split the variable with a strong floor effect into a continuous part and a binary part, there should be a vastly reduced floor effect for the continuous part.
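
A minimal sketch of the DATA TWOPART split described above, assuming a hypothetical outcome y with a pile-up at 0 and one covariate x. The new variable names u and ycont are chosen for illustration only.

```mplus
DATA:     FILE = mydata.dat;      ! hypothetical file
DATA TWOPART:
          NAMES = y;              ! variable with the floor effect
          CUTPOINT = 0;           ! split point (0 is the default)
          BINARY = u;             ! new binary part
          CONTINUOUS = ycont;     ! new continuous part
VARIABLE: NAMES = y x;
          USEVARIABLES = x u ycont;
          CATEGORICAL = u;
ANALYSIS: ESTIMATOR = MLR;
MODEL:    u ON x;                 ! at the floor vs. above it
          ycont ON x;             ! amount, given above the floor
```

If the pile-up is not at the lowest observed value, CUTPOINT would need to be set to the value where it occurs, as suggested earlier in the thread.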
 Samuli Helle posted on Tuesday, September 04, 2018 - 1:45 am
Yes, but the floor effect still clearly exists, making the residuals of the continuous part strongly non-normal. Any advice on how strong a floor effect the skew-t distribution can handle?
 Bengt O. Muthen posted on Tuesday, September 04, 2018 - 2:52 pm
I don't recommend using skew-t for a variable with a floor effect. The BMI distribution is a perfect example of what skew-t handles well.

If you have a strong floor for the continuous Y part of 2-part, the default logY assumption is not good. I would suggest categorizing into an ordinal variable treated as Categorical.
 Jacqueline Kim posted on Tuesday, September 04, 2018 - 3:55 pm
Hello, I am wondering when it would make more sense to use normal mixture modeling rather than skew mixture modeling (even if the BIC is worse).

In the Asparouhov & Muthén (2016) "Structural Equation Models and Mixture Models With Continuous Nonnormal Skewed Distributions" paper, it says "Modeling with the skew t distribution in general requires larger sample sizes than modeling with the normal distribution... If the sample size is not sufficient, the additional skewness parameters in the skew t distribution will not be statistically significant and in that case they should be eliminated from the model to preserve model parsimony and minimize the standard errors for the remaining model parameters."

If there are 6 continuous indicators for a LPA, of which 2 have skewness & kurtosis and 1 has skewness only... what would be a sample size that is sufficient?

Are there other reasons to forgo skew mixture modeling aside from small sample size?

Thank you.
 Bengt O. Muthen posted on Tuesday, September 04, 2018 - 5:49 pm
Hard to say which N is required without doing a Monte Carlo simulation study for your particular situation - that can be done. Perhaps N > 500. Also, skew-t is generally not suitable for variables with strong floor or ceiling effects.
 Jacqueline Kim posted on Wednesday, September 05, 2018 - 2:59 pm
Thank you, Dr. Muthen. I would love to try a simulation study to answer this question, but I am unsure how to start. Would you have a user's guide recommendation for testing whether I have a large enough sample to conduct modeling with skew-t?
 Bengt O. Muthen posted on Thursday, September 06, 2018 - 3:07 pm
I can email a Monte Carlo script to you.
 Jacqueline Kim posted on Wednesday, September 26, 2018 - 11:59 am
Thank you Dr. Muthen.

I have one more, naive, follow-up question. Is it possible to specify estimation of skew parameters for just the few indicators that have skew? Rather than producing an extra parameter for all of the indicators? Can the syntax be adjusted in some way to do this?

It would be helpful to reduce the number of parameters estimated, since I have a limited number of indicators I can use for the LPA. But, I'm unsure if this makes sense to do methodologically, and if this is possible in Mplus.
 Tihomir Asparouhov posted on Thursday, September 27, 2018 - 2:07 pm
It is possible to specify estimation of skew parameters for just the few indicators that have skew. If you do not mention the {Y} parameter in the model it is assumed that the skew parameter for Y is fixed to 0 and no skewness is modeled for that variable.
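
As an illustration of this reply, a latent profile input with skew parameters for only three of six indicators might look like the sketch below. The variable names, the number of classes and the placement of the statements in %OVERALL% are assumptions. The remaining skew parameters are fixed at 0 explicitly so that the intent does not depend on defaults.

```mplus
DATA:     FILE = mydata.dat;          ! hypothetical file
VARIABLE: NAMES = y1-y6;
          CLASSES = c(3);             ! illustrative number of profiles
ANALYSIS: TYPE = MIXTURE;
          DISTRIBUTION = SKEWT;
MODEL:    %OVERALL%
          {y1}; {y2}; {y3};           ! skew estimated for these
          {y4@0}; {y5@0}; {y6@0};     ! no skew modeled for these
```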
 Jacqueline Kim posted on Tuesday, October 09, 2018 - 12:43 pm
Thank you, I understand now.

Similarly, is it possible to specify non-normal distribution (tdist) for one timepoint, if doing an LTA with multiple timepoints and only one may need the non-normal distribution for finding latent profiles?
 Bengt O. Muthen posted on Wednesday, October 10, 2018 - 8:23 am
Yes.
 Jacqueline Kim posted on Wednesday, October 10, 2018 - 9:47 am
I'm not sure I am writing the syntax correctly. Would this be how to only specify tdist for one timepoint, one class? (I reduced to only two timepoints to fit limits of posting size)
...
usevariables are
y1t1 y2t1 y3t1 y4t1 y5t1 y6t1 y7t1
y1t2 y2t2 y3t2 y4t2 y5t2 y6t2 y7t2;

classes = c1(3) c2(2);

Analysis:
type = mixture;
distribution = tdist;

Model:
%OVERALL%
[c1#1-c2#1];
c2 on c1;

Model c1:
%c1#1%
{df*1};
[y1t1];
[y2t1];
[y3t1];
[y4t1];
[y5t1];
[y6t1];
[y7t1];
%c1#2%
{df@0}
[y1t1];
[y2t1];
[y3t1];
[y4t1];
[y5t1];
[y6t1];
[y7t1];
%c1#3%
{df@0}
[y1t1];
[y2t1];
[y3t1];
[y4t1];
[y5t1];
[y6t1];
[y7t1];

Model c2:
%c2#1%
{df@0}
[y1t2];
[y2t2];
[y3t2];
[y4t2];
[y5t2];
[y6t2];
[y7t2];
%c2#2%
{df@0}
[y1t2];
[y2t2];
[y3t2];
[y4t2];
[y5t2];
[y6t2];
[y7t2];

Thank you again.
 Tihomir Asparouhov posted on Wednesday, October 10, 2018 - 11:10 am
To convert the T-distribution to a normal distribution you have to fix the DF parameter to a large value, not 0. For example {df@50}.
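
Applied to the input in the preceding post, the correction would look roughly like this. It is a sketch showing only the df statements; the mean statements stay as posted. Each statement also needs its terminating semicolon.

```mplus
MODEL c1:
          %c1#1%
          {df*1};      ! t-distribution with df estimated in this class
          %c1#2%
          {df@50};     ! large fixed df: approximately normal
          %c1#3%
          {df@50};

MODEL c2:
          %c2#1%
          {df@50};
          %c2#2%
          {df@50};
```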
 Fredrik Falkenström posted on Sunday, May 19, 2019 - 12:08 pm
Hello, is there a way of converting between the values for the skewnormal and skew t distributions and skewness and/or kurtosis statistics? I.e. if I have a particular value and df of skew t, what skewness/kurtosis values would that imply?

Best,

Fredrik Falkenström
 Fredrik Falkenström posted on Saturday, May 25, 2019 - 12:57 pm
Hi again, the reason I asked my previous question was that I want to use the Monte Carlo function to generate data with one variable at certain levels of skewness, while other variables should be normal. Not sure how I would go about choosing the values of the skewness parameters to achieve this, though?

Fredrik Falkenström
 Tihomir Asparouhov posted on Tuesday, May 28, 2019 - 10:49 am
See (21-23) in
Mplus will compute these for you in output:residual.

Essentially, you would just fix the skew parameter to 0 for all but one variable. For example

ANALYSIS: DISTRIBUTION = SKEWNORMAL;

MODEL POPULATION:
Y1-Y2*1;
Y1 with Y2*0.5;
{Y1*3}; {Y2@0};

will generate a skew Y1 and a normal Y2. You can then analyze the generated sample with a separate input file with output:residual to see the level of skewness.
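
The two-step procedure described above could be sketched as two input files. The sample size, seed and file name are illustrative assumptions; the population values are the ones from the example.

```mplus
! ---- Input file 1: generate and save one data set ----
MONTECARLO: NAMES = y1 y2;
            NOBSERVATIONS = 5000;     ! illustrative
            NREPS = 1;
            SEED = 12345;             ! illustrative
            SAVE = skewgen.dat;       ! hypothetical file name
ANALYSIS:   DISTRIBUTION = SKEWNORMAL;
MODEL POPULATION:
            y1-y2*1;
            y1 WITH y2*0.5;
            {y1*3}; {y2@0};
MODEL:      y1-y2*1;
            y1 WITH y2*0.5;
            {y1*3}; {y2@0};

! ---- Input file 2 (separate run): analyze the saved data ----
DATA:       FILE = skewgen.dat;
VARIABLE:   NAMES = y1 y2;
ANALYSIS:   DISTRIBUTION = SKEWNORMAL;
MODEL:      y1-y2;
            y1 WITH y2;
            {y1}; {y2@0};
OUTPUT:     RESIDUAL;                 ! reports the estimated skewness
```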
 Fredrik Falkenström posted on Friday, May 31, 2019 - 12:16 pm
Thank you very much! It seems that if I set {Y1*10} I get skewness close to 1. I'm analysing a misspecified model that doesn't take skewness into account, and I'd like to keep the variance of Y1 = 1 in that model. However, when I specify Y1*1 and {Y1*10} the variance of Y1 explodes to almost 40 (although curiously the correlation between Y1 and Y2 is correctly estimated as .50). How can I specify the model so that the variance of Y1 becomes 1?
 Tihomir Asparouhov posted on Friday, May 31, 2019 - 3:22 pm
The skew parameter {Y1*10} (that is the delta parameter in formulas 21-23) is different from the skewness of the variable.

In this modeling framework it is not possible to generate bivariate data where
Var(Y1)=Var(Y2)=1
Skew(Y1)=1
Skew(Y2)=0
Cov(Y1,Y2)=0.5

This is explained in the paragraph between formulas (23) and (24).

If you drop
Cov(Y1,Y2)=0.5
you can generate it using
Y1*0.001; {Y1*1.6};
but that is probably not that useful.

I would recommend using skewness smaller than 1 so that some correlation can be accommodated. I would also not worry about having variance=1. You can use
define: standardize y1;
in the subsequent analysis to get that. The standardization will of course preserve the skewness.
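
The numbers in this exchange can be reproduced under one common representation of the skew-normal distribution. The following is a sketch under an assumption that this thread does not confirm: that the formulas referred to above correspond to Y = mu + delta*|U0| + U1 with independent U0 ~ N(0,1) and U1 ~ N(0, sigma^2), where sigma^2 is the value given in Y1*1 and delta is the value given in {Y1*10}.

```latex
\begin{align*}
\operatorname{Var}(Y)  &= \sigma^2 + \delta^2\left(1 - \tfrac{2}{\pi}\right), \\
\operatorname{Skew}(Y) &= \frac{\tfrac{4-\pi}{2}\left(\tfrac{2}{\pi}\right)^{3/2}\delta^3}
                               {\operatorname{Var}(Y)^{3/2}}.
\end{align*}
% sigma^2 = 1, delta = 10:
%   Var(Y)  = 1 + 100 (0.3634)       \approx 37.3
%   Skew(Y) = 218.0 / 37.3^{3/2}     \approx 0.96
% sigma^2 -> 0:
%   Skew(Y) -> \sqrt{2}(4-\pi)/(\pi-2)^{3/2} \approx 0.995
```

Under this assumption, the variance of about 37 matches the reported "almost 40", the skewness of about 0.96 matches "close to 1", and the limit of about 0.995 matches the statement that skew-normal cannot accommodate skewness beyond plus or minus 1.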
 Fredrik Falkenström posted on Saturday, June 01, 2019 - 9:47 am
Ok, thanks! Is there any way of generating samples with larger skewness than 1 in one of the variables while also keeping the correlation between variables?
 Tihomir Asparouhov posted on Saturday, June 01, 2019 - 12:05 pm
Not with the skew method implemented in Mplus. You can perhaps do other things, like using a categorical plus a continuous variable to get bigger skewness.

See
http://mathworld.wolfram.com/BernoulliDistribution.html
and formula (20) in
http://www.diva-portal.org/smash/get/diva2:302313/FULLTEXT01.pdf
 Fredrik Falkenström posted on Sunday, June 02, 2019 - 11:14 pm
Ok, thanks. One more question: is it possible to fix the level of skewness in the data generation process, so that the skewness level is constant across samples? I tried {X1@3} but it seems that skewness is still sampled rather than fixed.
 Tihomir Asparouhov posted on Monday, June 03, 2019 - 8:29 am
I am not really sure what you are seeing, but the value 3 doesn't correspond to a sample statistic: the skew parameter is not the same as the sample skewness. Apart from that, the sample is generated from a particular distribution that is specified in the MODEL POPULATION command. For a finite sample size, the sample skewness will differ from the true skewness, just like the sample mean differs from the true mean. For larger samples the differences would be negligible.
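
As a rough guide to how much the sample skewness varies from sample to sample, the textbook large-sample approximation for normally distributed data can be used. This is only a sketch: the approximation assumes normality, so for skewed populations the actual variability will differ.

```latex
\[
\operatorname{SE}(\text{sample skewness}) \approx \sqrt{\frac{6}{n}}
\]
% n = 200:    \sqrt{6/200}   \approx 0.17
% n = 20000:  \sqrt{6/20000} \approx 0.017
```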