
Mario posted on Monday, January 11, 2016  11:28 pm



We are trying to define the distributions of our dependent variables, and we know that there are three main options for that: CATEGORICAL, NOMINAL, and COUNT. However, we are encountering two problems:

1) One of our mediators is nominal, but when we define it as NOMINAL the program gives this error message:

“*** ERROR in MODEL command A nominal variable may not appear on the right-hand side of an ON statement: SP”

We know that we must define the distribution of mediators, and that mediators appear on the right-hand side of an ON statement, so we don’t understand how one can ever define a nominal mediator.

2) Another dependent variable has a left-skewed distribution but is not a count variable (it has many zeros and the values are not integers). No transformation makes the distribution normal. In a regular GLM we would use a Gamma distribution (after changing the “0” values to “0.00000001”), but I could not find a way in Mplus to define a Gamma distribution for the dependent variable. Could you offer an alternative solution?

Thanks, Mario


1) Nominal mediators are an advanced topic that I cover in my 2011 paper on our website. 2) You can use two-part modeling, which is described in our User's Guide examples.

Mario posted on Wednesday, February 03, 2016  1:23 am



Dear Dr. Muthen, coming back to point 2 of my question, I will try to illustrate it with an example. The original model I want to run is this one:

    USEVARIABLES sand GApdic dens Fbrdn Myc Bart Bartf Mycf RepSucc;
    CATEGORICAL GApdic Myc Bart Bartf Mycf;
    COUNT Fbrdn;
    MISSING ARE *;
    ANALYSIS:
    TYPE = random;
    ALGORITHM = INTEGRATION;
    ESTIMATOR = MLR;
    MODEL:
    dens GApdic Mycf Bartf RepSucc on sand;
    Myc Bart on dens GApdic MycF RepSucc Bartf;

Following the two-part modeling example from 16.6, I tried the following:

    DATA TWOPART:
    NAMES = RepSucc;
    BINARY = bin1;
    CONTINUOUS = cont1;
    USEVARIABLES sand GApdic dens Fbrdn Myc Bart Bartf Mycf bin1 cont1;
    CATEGORICAL GApdic Myc Bart Bartf Mycf bin1;
    COUNT Fbrdn;
    MISSING ARE *;
    ANALYSIS:
    TYPE = random;
    ALGORITHM = INTEGRATION;
    ESTIMATOR = MLR;
    MODEL:
    dens GApdic Mycf Bartf bin1 cont1 on sand;
    Myc Bart on dens GApdic MycF bin1 cont1 Bartf;

However, I receive the following message:

*** ERROR Categorical variable BIN1 contains less than 2 categories.

Mario posted on Wednesday, February 03, 2016  3:16 am



Solved it! I had not defined a correct CUTPOINT. Thanks
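The post does not say what was wrong with the CUTPOINT, so the following is only a plausible illustration of how a binary part can end up with a single category. It is a minimal Python sketch, not Mplus's implementation: it assumes that values above the cutpoint get a binary value of 1 and a log-transformed continuous value, and that values at or below the cutpoint get a binary value of 0 and a missing continuous value. The data, `two_part_split`, and `n_categories` are hypothetical. If zeros had earlier been recoded to 0.00000001 (as one would do for a Gamma GLM), a cutpoint of zero would leave no observations in the 0 category.

```python
import math

def two_part_split(values, cutpoint=0.0):
    """Split a semicontinuous variable into a binary and a continuous part.

    Assumed rule (illustrative, cutpoint >= 0):
      value >  cutpoint -> binary 1, continuous log(value)
      value <= cutpoint -> binary 0, continuous missing (None)
      missing value     -> both parts missing (None)
    """
    binary, continuous = [], []
    for v in values:
        if v is None:
            binary.append(None)
            continuous.append(None)
        elif v > cutpoint:
            binary.append(1)
            continuous.append(math.log(v))
        else:
            binary.append(0)
            continuous.append(None)
    return binary, continuous

def n_categories(binary):
    """Count the distinct non-missing values of the binary part."""
    return len({b for b in binary if b is not None})

# Hypothetical outcome with many exact zeros.
rep_succ = [0.0, 0.0, 1.5, 3.2, 0.0, 0.7]
# Same data after replacing zeros with a tiny positive number.
recoded = [v if v > 0 else 1e-8 for v in rep_succ]

bin_ok, cont_ok = two_part_split(rep_succ, cutpoint=0.0)
bin_bad, cont_bad = two_part_split(recoded, cutpoint=0.0)

print(n_categories(bin_ok))   # 2: both 0 and 1 occur
print(n_categories(bin_bad))  # 1: every value exceeds the cutpoint
```

In this sketch the second case reproduces the "less than 2 categories" situation: either the zeros need to stay exact zeros, or the cutpoint needs to sit above the replacement value.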

Mario posted on Wednesday, February 03, 2016  6:26 am



Dear Dr. Muthen, your previous answers were really helpful for running the model with a Gamma-distributed dependent variable. However, the output shows the statistics (Est., S.E., P-value, …) for the binary (bin1) and continuous (cont1) variables created from our Gamma-distributed variable (named Reproductive success). In order to include the results in a paper, we wonder if it is possible to combine the values obtained for bin1 and cont1 to obtain the Est. and S.E. for the original “Reproductive success” variable. Thanks, Mario


I would not try to combine the effects. The nice part of two-part modeling is that you get a richer answer, one for each part.

Mario posted on Friday, February 05, 2016  2:11 am



Dear Dr. Muthen, thanks a lot for your answer. However, although it is statistically interesting to show the answers for the two variables, we need to give a biological answer using the original variable. So, sorry to insist, but is there any way to combine the values of the two newly created variables to get values for the original one? Thanks a lot


If you used a Gamma distribution maybe you would have a single answer, but two-part modeling is different. You can use BIC to tell how much better two-part fits than Gamma. I think it is the nature of the two-part model that you don't get one answer, but a more detailed two-part answer. I can see your dilemma if a Gamma distribution and its regression coefficient is the standard approach in your field, but I can't think of a way to come up with a single combined answer from the two-part results. If instead you used a censored-normal model you would get a single answer, but a censored-inflated model might fit better, and then again you have two answers.
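For reference, the BIC comparison mentioned here can be sketched in a few lines of Python. The formula BIC = -2 logL + p ln(n) is standard; the log-likelihoods, parameter counts, and sample size below are hypothetical placeholders, not results from this thread. The comparison is only meaningful when both log-likelihoods refer to the same outcome on the same scale.

```python
import math

def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: -2*logL + p*ln(n). Lower is better."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)

# Hypothetical values, for illustration only.
bic_model_a = bic(log_likelihood=-512.3, n_parameters=6, n_observations=200)
bic_model_b = bic(log_likelihood=-498.7, n_parameters=9, n_observations=200)

# A positive difference favors model B, a negative one favors model A.
print(bic_model_a - bic_model_b)
```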


The expected value of the two-part outcome involves both parts of the model.
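One common way to write this expected value is E[Y] = P(Y > 0) × E[Y | Y > 0]. The Python sketch below is a hedged illustration under assumptions that are not stated in the thread: a cutpoint of zero, a logistic model for the binary part, and a log-transformed continuous part with normally distributed residuals (so the positive part is log-normal). The function name and arguments are hypothetical. It returns only a point value; a standard error for this nonlinear combination would need something extra, such as the delta method or bootstrapping.

```python
import math

def two_part_mean(logit_positive, mean_log, var_log):
    """Expected value of a two-part outcome on the original scale.

    Assumptions (illustrative):
      logit_positive : linear predictor (logit) for P(Y > 0)
      mean_log       : predicted mean of log(Y) given Y > 0
      var_log        : residual variance of log(Y) given Y > 0
    """
    p_positive = 1.0 / (1.0 + math.exp(-logit_positive))
    # Mean of a log-normal variable: exp(mu + sigma^2 / 2).
    mean_if_positive = math.exp(mean_log + 0.5 * var_log)
    return p_positive * mean_if_positive

# Hypothetical predictor values, for illustration only.
print(two_part_mean(logit_positive=0.0, mean_log=math.log(2.0), var_log=0.0))
```

Evaluating the function at two covariate values and taking the difference gives the change in the expected original-scale outcome, which draws on the coefficients of both parts at once.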

John C posted on Saturday, February 15, 2020  6:53 pm



Hello, I have a similar question to the above. In my case, the dependent variable is a count variable in the context of GMM with 3 latent classes. This variable is a month-to-month count within a year, ranging from zero to 12, so that there are 13 categories, and it is highly left-skewed. Can I do two-part modeling with such a variable? My understanding of the syntax is that I would have to specify the category of interest, in this case category #13. Can this be done?


So count=12 has the highest frequency? If so, a count model would not fit unless you work with 2 classes. Treating the counts as interval scaled, you could use two-part. If it is easier to handle, you can turn the scale around so that 12 becomes zero and 0 becomes 12.

John C posted on Monday, February 17, 2020  9:06 pm



Yes, thanks, count 12 is the highest, followed by count zero. Because of this I thought it might be better to preserve the order after recoding the last category, so that the recoding would be as follows: 12->0, 0->1, 1->2, ..., 11->12. Would this be ok for a two-part model, even though the binary and continuous parts would not be in the same direction? Or would it still be advisable to just reverse code all the categories?


Hmm; I think your scoring would be ok too. I would do both and check. 
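The two scorings discussed above can be written out side by side. This is a small Python sketch with hypothetical function names: `reverse_code` turns the whole scale around (12 becomes 0, 0 becomes 12), while `shift_code` moves only the top category to zero and shifts every other count up by one, which keeps the original ordering among counts 0 to 11.

```python
def reverse_code(count, max_count=12):
    """Reverse the scale: max_count -> 0, ..., 0 -> max_count."""
    return max_count - count

def shift_code(count, max_count=12):
    """Send max_count to 0 and shift every other count up by one."""
    return 0 if count == max_count else count + 1

# Print original count, reversed score, and shifted score for 0..12.
for original in range(13):
    print(original, reverse_code(original), shift_code(original))
```

Under either scoring the most frequent original value (12) becomes the zero that the binary part separates out; the scorings differ in the direction of the remaining values, so the sign of the continuous-part coefficients needs to be read accordingly.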
