Mario posted on Monday, January 11, 2016 - 11:28 pm
We are trying to define the distributions of our dependent variables, and we know that there are three main options for that: CATEGORICAL, NOMINAL, and COUNT. However, we have run into two problems:
1) One of our mediators is nominal, but when we define it as NOMINAL the program gives the error message “*** ERROR in MODEL command A nominal variable may not appear on the right-hand side of an ON statement: SP”. We know that we must define the distribution of mediators and that mediators appear on the right-hand side of an ON statement, so we don’t understand how one can ever define a nominal dependent variable.
2) Another dependent variable has a left-skewed distribution but is not a count variable (it has many zeros and the values are not integers). No transformation makes the distribution normal. In a regular GLM, we would use a Gamma distribution (after changing the “0” values to “0.00000001”), but I could not find a way in Mplus to define a Gamma distribution for the dependent variable. Could you offer an alternative solution?
Thanks, Mario
1) Nominal mediators are an advanced topic that I cover in my 2011 paper on our website. 2) You can use two-part modeling, which is described in our UG examples.
Mario posted on Wednesday, February 03, 2016 - 1:23 am
Dear Dr. Muthen, coming back to point 2 of my question, I will try to illustrate it with an example. The original model I want to run is this one:
    USEVARIABLES sand GApdic dens Fbrdn Myc Bart Bartf Mycf RepSucc;
    CATEGORICAL GApdic Myc Bart Bartf Mycf;
    COUNT Fbrdn;
    MISSING ARE *;
    ANALYSIS: TYPE = random; ALGORITHM = INTEGRATION; ESTIMATOR = MLR;
    MODEL: dens GApdic Mycf Bartf RepSucc on sand;
    Myc Bart on dens GApdic MycF RepSucc Bartf;
Following the two-part modeling example from 16.6, I tried the following:
    DATA TWOPART: NAMES = RepSucc; BINARY = bin1; CONTINUOUS = cont1;
    USEVARIABLES sand GApdic dens Fbrdn Myc Bart Bartf Mycf bin1 cont1;
    CATEGORICAL GApdic Myc Bart Bartf Mycf bin1;
    COUNT Fbrdn;
    MISSING ARE *;
    ANALYSIS: TYPE = random; ALGORITHM = INTEGRATION; ESTIMATOR = MLR;
    MODEL: dens GApdic Mycf Bartf bin1 cont1 on sand;
    Myc Bart on dens GApdic MycF bin1 cont1 Bartf;
However, I receive the following message:
    *** ERROR Categorical variable BIN1 contains less than 2 categories.
Mario posted on Wednesday, February 03, 2016 - 3:16 am
Solved it! I had not defined a correct CUTPOINT. Thanks
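For readers who hit the same error, here is a minimal Python sketch (not Mplus code) of the kind of split that DATA TWOPART performs. It assumes the rule described in the Mplus User's Guide: values above the cutpoint get binary = 1 and a log-transformed continuous value, values at or below it get binary = 0 and a missing continuous value. The function names and the data are hypothetical, and the scenario (zeros recoded to a tiny positive number, cutpoint left at zero) is only one possible way to end up with a one-category binary variable; the post does not say what the actual cause was.

```python
import math

def two_part_split(values, cutpoint=0.0):
    """Split a semicontinuous variable into a binary part and a continuous part.

    Assumed rule: v > cutpoint -> (1, log(v)); v <= cutpoint -> (0, missing);
    a missing input stays missing in both parts.
    """
    binary, continuous = [], []
    for v in values:
        if v is None:
            binary.append(None)
            continuous.append(None)
        elif v > cutpoint:
            binary.append(1)
            continuous.append(math.log(v))
        else:
            binary.append(0)
            continuous.append(None)
    return binary, continuous

def n_categories(binary):
    """Number of distinct non-missing values in the binary part."""
    return len({b for b in binary if b is not None})

# Hypothetical data in which zeros were recoded to 1e-8 beforehand.
data = [1e-8, 1e-8, 0.7, 2.5, 1e-8, 4.1]

# With the cutpoint at 0, every value is "positive": only one category.
bin_default, _ = two_part_split(data, cutpoint=0.0)
print(n_categories(bin_default))

# Moving the cutpoint to the recoded value restores both categories.
bin_fixed, cont_fixed = two_part_split(data, cutpoint=1e-8)
print(n_categories(bin_fixed))
```

Checking the number of distinct values in the binary part before estimation is a cheap way to confirm that the chosen cutpoint actually separates the data.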
Mario posted on Wednesday, February 03, 2016 - 6:26 am
Dear Dr. Muthen, your previous answers were really helpful for running the model with a Gamma-distributed dependent variable. However, the output shows the statistics (Est., S.E., P-value, …) for the binary (bin1) and continuous (cont1) variables created from our Gamma-distributed variable (named Reproductive success). In order to include the results in a paper, we wonder whether it is possible to combine the values obtained for bin1 and cont1 to obtain the Est. and S.E. for the original “Reproductive success” variable. Thanks, Mario
I would not try to combine the effects. The nice part of two-part modeling is that you get a richer answer, one for each part.
Mario posted on Friday, February 05, 2016 - 2:11 am
Dear Dr. Muthen, thanks a lot for your answer. However, although it is statistically interesting to show the answers from the two variables, we need to give a biological answer using the original variable. So, sorry to insist, but is there any way to combine the values of the two newly created variables to get values for the original one? Thanks a lot
If you used a Gamma distribution, maybe you would have a single answer, but two-part modeling is different. You can use BIC to tell how much better two-part fits than Gamma. I think it is the nature of the two-part model that you don't get one answer, but a more detailed two-part answer. I can see your dilemma if a Gamma distribution and its regression coefficient are the standard approach in your field, but I can't think of a way to come up with a single combined answer from the two-part results. If instead you used a censored-normal model you would get a single answer, but a censored-inflated model might fit better, and then again you have two answers.
The expected value of the two-part outcome involves both parts of the model.
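To make that remark concrete, the following is a small Python sketch of how a model-implied mean could be assembled from the two parts. It is not Mplus output and rests on several assumptions not stated in the thread: the cutpoint is zero, so the "zero" part contributes nothing to the mean; the binary part is a logistic regression; and the continuous part is normal on the log scale, so the positive part is lognormal. The function name, argument names, and coefficient values are all hypothetical.

```python
import math

def two_part_mean(eta_bin, mu_log, var_log):
    """E[Y] = P(Y > 0) * E[Y | Y > 0] under the assumptions stated above.

    eta_bin : linear predictor of the binary part on the logit scale
    mu_log  : conditional mean of log(Y) among positive values
    var_log : residual variance of log(Y) among positive values
    """
    p_positive = 1.0 / (1.0 + math.exp(-eta_bin))          # logistic part
    mean_if_positive = math.exp(mu_log + 0.5 * var_log)    # lognormal mean
    return p_positive * mean_if_positive

# Hypothetical coefficients: intercept and slope for each part.
a0, a1 = -0.4, 0.9    # binary part (logit scale)
b0, b1 = 0.2, 0.3     # continuous part (log scale)
resid_var = 0.5       # residual variance of the log-scale part

for x in (0.0, 1.0):
    mean_y = two_part_mean(a0 + a1 * x, b0 + b1 * x, resid_var)
    print(f"x = {x}: expected outcome = {mean_y:.3f}")
```

The change in the expected outcome between two predictor values depends on the coefficients of both parts at once, which is why it does not reduce to a single regression slope.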
John C posted on Saturday, February 15, 2020 - 6:53 pm
Hello, I have a question similar to the one above. In my case, the dependent variable is a count variable in the context of GMM with 3 latent classes. This variable is a month-by-month count within a year, ranging from 0 to 12, so that there are 13 categories, and it is highly left-skewed. Can I do two-part modeling with such a variable? My understanding of the syntax is that I would have to specify the category of interest, in this case category #13. Can this be done?
So count=12 has the highest frequency? If so, a count model would not fit unless you work with 2 classes. Treating the counts as interval scaled, you could use two-part. If it is easier to handle, you can turn the scale around so that 12 becomes zero and 0 becomes 12.
John C posted on Monday, February 17, 2020 - 9:06 pm
Yes, thanks, count 12 is the most frequent, followed by count zero. Because of this, I thought it might be better to preserve the order after recoding the last category, so that the recoding would be as follows: 12->0, 0->1, 1->2, ..., 11->12. Would this be OK for a two-part model, even though the binary and continuous parts would not be in the same direction? Or would it still be advisable to just reverse code all the categories?
Hmm; I think your scoring would be ok too. I would do both and check.
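As an aid to "doing both", here is a short Python sketch of the two recodings discussed: full reversal (12 becomes 0 and 0 becomes 12) and the order-preserving variant (12->0, 0->1, ..., 11->12). The function names and the sample counts are hypothetical, and the binary indicator simply assumes a cutpoint of zero.

```python
def reverse_all(y, top=12):
    """Full reversal: 12 -> 0, 11 -> 1, ..., 0 -> 12."""
    return top - y

def move_top_to_zero(y, top=12):
    """Order-preserving variant: 12 -> 0, and 0..11 -> 1..12."""
    return 0 if y == top else y + 1

# Hypothetical monthly counts with a pile-up at 12 and a second one at 0.
counts = [12, 12, 0, 3, 12, 11, 0, 7]

scheme_a = [reverse_all(y) for y in counts]
scheme_b = [move_top_to_zero(y) for y in counts]

# Binary part with a cutpoint of zero: 1 if the recoded value is positive.
binary_a = [int(v > 0) for v in scheme_a]
binary_b = [int(v > 0) for v in scheme_b]

print(scheme_a)
print(scheme_b)
print(binary_a == binary_b)
```

Both schemes send 12 to zero, so the binary part is the same under either; they differ only in the continuous part, where full reversal flips the ordering of the remaining counts and the second scheme keeps it.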