Mario posted on Monday, January 11, 2016 - 11:28 pm
We are trying to define the distributions of our dependent variables, and we know that there are three main options for that: CATEGORICAL, NOMINAL, and COUNT. However, we are encountering two problems: 1) One of our mediators is nominal, but when we define it as NOMINAL the program gives the error message “*** ERROR in MODEL command A nominal variable may not appear on the right-hand side of an ON statement: SP”. We know that we must define the distribution of mediators and that mediators appear on the right-hand side of an ON statement, so we don’t understand how a mediator can ever be defined as NOMINAL.
2) Another dependent variable has a left-skewed distribution but is not a count variable (it has many zeros and its values are not integers). No transformation makes the distribution normal. In a regular GLM, we would use a Gamma distribution (after changing the “0” values to “0.00000001”), but we could not find a way in Mplus to define a Gamma distribution for the dependent variable. Could you offer us an alternative solution?
ANALYSIS: TYPE = random; ALGORITHM = INTEGRATION; ESTIMATOR = MLR;
MODEL: dens GApdic Mycf Bartf bin1 cont1 on sand; Myc Bart on dens GApdic MycF bbin1 cont1 Bartf;
However, I receive the following message: *** ERROR Categorical variable BIN1 contains less than 2 categories.
Mario posted on Wednesday, February 03, 2016 - 3:16 am
I did not define a correct CUTPOINT.
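For readers who hit the same error, here is a minimal Python sketch of how a two-part split behaves. It assumes the rule that values above the cutpoint get binary = 1 and a log-transformed continuous value, while all other values get binary = 0 and a missing continuous value. The function names and the data are hypothetical, and this only illustrates the idea, not Mplus internals. It shows how a poorly chosen cutpoint, or zeros replaced by a tiny positive number such as 0.00000001, can leave the binary part with a single category.

```python
import math

def two_part_split(values, cutpoint=0.0):
    """Split each value into a (binary, continuous) pair.

    Assumed rule: value > cutpoint -> binary 1, continuous log(value);
    otherwise -> binary 0, continuous missing (None).
    """
    binary, continuous = [], []
    for v in values:
        if v > cutpoint:
            binary.append(1)
            continuous.append(math.log(v))
        else:
            binary.append(0)
            continuous.append(None)
    return binary, continuous

def n_categories(binary):
    """Number of distinct values observed in the binary part."""
    return len(set(binary))

# Hypothetical outcome with many exact zeros and non-integer positives.
y = [0.0, 0.0, 1.5, 3.2, 0.0, 7.1]

# Cutpoint at zero: both categories appear in the binary part.
good_bin, good_cont = two_part_split(y, cutpoint=0.0)

# Zeros replaced by a tiny positive number: everything exceeds the
# cutpoint, so the binary part has only one category.
y_shifted = [1e-8 if v == 0 else v for v in y]
shifted_bin, _ = two_part_split(y_shifted, cutpoint=0.0)

# Cutpoint above every observed value: again only one category.
high_bin, _ = two_part_split(y, cutpoint=10.0)

print(n_categories(good_bin), n_categories(shifted_bin), n_categories(high_bin))
```

If the binary part reports a single category, checking the cutpoint against the smallest and largest observed values is a reasonable first step.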
Mario posted on Wednesday, February 03, 2016 - 6:26 am
Dear Dr. Muthen, your previous answer was really helpful for running the model with a Gamma-distributed dependent variable.
However, the output shows the statistics (Est., S.E., p-value, …) for the binary (bin1) and continuous (cont1) variables created from our Gamma-distributed variable (named “Reproductive success”). In order to include the results in a paper, we wonder whether it is possible to combine the values obtained for bin1 and cont1 to obtain the Est. and S.E. for the original “Reproductive success” variable.
I would not try to combine the effects. The nice part of two-part modeling is that you get a richer answer, one for each part.
Mario posted on Friday, February 05, 2016 - 2:11 am
Dear Dr. Muthen, thanks a lot for your answer. However, although it is statistically interesting to show the answers from the two variables, we need to give a biological answer using the original variable. So, sorry to insist, but is there any way to combine the values of the two newly created variables to get values for the original one?
If you used a Gamma distribution, maybe you would have a single answer, but two-part modeling is different. You can use BIC to tell how much better two-part fits than Gamma. I think it is the nature of the two-part model that you don't get one answer, but a more detailed two-part answer. I can see your dilemma if a Gamma distribution and its regression coefficient is the standard approach in your field, but I can't think of a way to come up with a single combined answer from the two-part results. If instead you used a censored-normal model you would get a single answer, but a censored-inflated model might fit better, and then again you have two answers.
The expected value of the two-part outcome involves both parts of the model.
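To make that last point concrete, here is a minimal Python sketch of one way the expected value can be written, E[Y] = P(Y > 0) × E[Y | Y > 0]. It rests on assumptions that are not stated in this thread: a logit link for the binary part, and a log-transformed continuous part treated as normal, so that the positive values are log-normal. All function and parameter names are hypothetical, and the sketch gives a point value only, without a standard error.

```python
import math

def prob_positive(eta):
    """P(Y > 0) under an assumed logit link, with linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))

def mean_given_positive(mu_log, var_log):
    """E[Y | Y > 0] when log(Y) is assumed normal(mu_log, var_log)."""
    return math.exp(mu_log + 0.5 * var_log)

def two_part_mean(eta, mu_log, var_log):
    """E[Y] = P(Y > 0) * E[Y | Y > 0]; both parts of the model enter."""
    return prob_positive(eta) * mean_given_positive(mu_log, var_log)

# Hypothetical parameter values, for illustration only.
baseline = two_part_mean(eta=0.0, mu_log=0.5, var_log=0.2)
higher_prob = two_part_mean(eta=1.0, mu_log=0.5, var_log=0.2)
higher_size = two_part_mean(eta=0.0, mu_log=1.0, var_log=0.2)
print(baseline, higher_prob, higher_size)
```

A covariate can change this expected value through either part, which is why a single coefficient does not summarize its effect on the original scale.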
John C posted on Saturday, February 15, 2020 - 6:53 pm
I have a similar question to the above. In my case, the dependent variable is a count variable in the context of GMM with 3 latent classes. This variable is a month-to-month count within a year, ranging from zero to 12, so that there are 13 categories, and is highly left-skewed.
Can I do two-part modeling with such a variable? My understanding of the syntax is that I would have to specify the category of interest, in this case category #13. Can this be done?
So count=12 has the highest frequency? If so, a count model would not fit unless you work with 2 classes. Treating the counts as interval scaled, you could use two-part. If easier to handle, you can turn the scale around so that 12 becomes zero and 0 becomes 12.
John C posted on Monday, February 17, 2020 - 9:06 pm
Yes, thanks, count 12 is the highest, followed by count zero. Because of this I thought it might be better to preserve the order after recoding the last category, so that the recoding would be as follows: 12->0, 0->1, 1->2, ...,11->12. Would this be ok for a two-part model, even though the binary and continuous parts were not in the same direction? Or would it still be advisable to just reverse code all the categories?
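For clarity, the two recoding schemes being compared can be sketched in Python as follows. The function names are hypothetical, and the sketch only shows what each mapping does to the 13 categories; it does not settle which one is preferable for the two-part model.

```python
def shift_recode(count, max_count=12):
    """Move the top category to zero and shift the rest up by one:
    12 -> 0, 0 -> 1, 1 -> 2, ..., 11 -> 12."""
    return (count + 1) % (max_count + 1)

def reverse_recode(count, max_count=12):
    """Turn the whole scale around: 12 -> 0, 11 -> 1, ..., 0 -> 12."""
    return max_count - count

counts = list(range(13))  # original categories 0..12
shifted = [shift_recode(c) for c in counts]
reversed_scale = [reverse_recode(c) for c in counts]

for c, s, r in zip(counts, shifted, reversed_scale):
    print(c, s, r)
```

Under the shift scheme the nonzero recoded values keep the original ordering of counts 0 to 11, whereas under full reversal larger recoded values always mean fewer months. In both schemes the original count of 12 becomes the zero category.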