Mplus Discussion >> Sample selection, censoring and truncation

Elisabeth Wells posted on Friday, October 07, 2005 - 8:17 am

I have been looking at Examples 3.2 and 3.3 in the Mplus Version 3 manual in relation to two analysis issues in a large, complex cross-sectional sample.
1) Analysis of a measure of hazardous use of alcohol in the past 12 months (AUDIT). Those who did not drink in the last twelve months are assigned a score of zero. I think this requires a censored-inflated regression (Example 3.3), since they could not have experienced problems if they did not drink, so y=0.
2) Analysis of either alcohol dependence or alcohol dependence symptom counts (lifetime). Not everyone was asked these questions; whether they were asked depended on consumption and other questions. Consequently it cannot be assumed that those not asked had no dependence, so when y is missing it cannot be set to 0. From my reading, I think this is a sample selection problem, not a censoring or truncation problem, and that Mplus therefore does not have a way of analysing it.

I would be very grateful for comments.
Thank you.
Elisabeth Wells
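
For point 1), the distinction between a censored and a censored-inflated regression can be sketched through the per-observation log-likelihoods. The following is a minimal Python illustration, not Mplus syntax and not taken from the manual. It assumes a normal outcome censored from below at 0; the function names and the `pi_zero` parameter (the probability of belonging to a class that can only take the value 0, such as non-drinkers) are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_logpdf(y, mu, sigma):
    """Log density of a normal(mu, sigma) variable at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def censored_loglik(y, mu, sigma):
    """Censored-from-below-at-0 normal model: a zero means 'latent y* <= 0'."""
    if y <= 0.0:
        return math.log(norm_cdf((0.0 - mu) / sigma))
    return norm_logpdf(y, mu, sigma)


def censored_inflated_loglik(y, mu, sigma, pi_zero):
    """Assumed two-class mixture: with probability pi_zero the case can only
    be 0; otherwise it follows the censored normal model above."""
    if y <= 0.0:
        return math.log(pi_zero + (1.0 - pi_zero) * norm_cdf((0.0 - mu) / sigma))
    return math.log(1.0 - pi_zero) + norm_logpdf(y, mu, sigma)
```

With `mu = 0` and `sigma = 1`, a zero has probability 0.5 under the censored model and 0.75 under the inflated model with `pi_zero = 0.5`; setting `pi_zero = 0` makes the two models coincide.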

bmuthen posted on Saturday, October 08, 2005 - 2:27 pm

1) Both a censored and a censored-inflated analysis could be considered here since both acknowledge the y=0 situation. There is a large literature on modeling with zeros, particularly in the health literature.

2) If consumption and other variables determine whether the symptom questions are asked, you might treat the missingness of y as a function of these variables, which falls under the "MAR" (missing at random) case of ML estimation. This would imply that those who weren't asked are included in a Type = Missing analysis with y set to the missing-value flag (for example, y = 999).
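
The MAR argument in point 2) can be illustrated with a small deterministic toy example in Python. This is only a sketch under stated assumptions: the data are invented, y is an exact linear function of a consumption variable x, and regression-based prediction stands in for full ML estimation. It is not what Mplus does internally. The value 999 is used as the missing-value flag, as in the reply above.

```python
MISSING = 999  # missing-value flag for cases that were not asked

# Hypothetical data: x is consumption, y is the symptom score.
# The symptom questions are asked only when x >= 5, so missingness
# depends on the observed x alone (the MAR situation).
x = list(range(10))
y_true = [2 + 3 * xi for xi in x]
y = [yi if xi >= 5 else MISSING for xi, yi in zip(x, y_true)]


def mean(values):
    return sum(values) / len(values)


observed = [(xi, yi) for xi, yi in zip(x, y) if yi != MISSING]

# Naive estimate: average y over the cases that were asked.
complete_case_mean = mean([yi for _, yi in observed])

# Least-squares regression of y on x among the observed cases.
mx = mean([xi for xi, _ in observed])
my = complete_case_mean
slope = (sum((xi - mx) * (yi - my) for xi, yi in observed)
         / sum((xi - mx) ** 2 for xi, _ in observed))
intercept = my - slope * mx

# Estimate that uses x from everyone, including those not asked.
mar_mean = intercept + slope * mean(x)

true_mean = mean(y_true)
```

Here the complete-case mean (23) overstates the true mean (15.5) because only high-consumption cases were asked, while the estimate that uses x for all cases recovers it. The recovery relies on the assumed linear relation holding for the cases that were not asked.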

Fernando Terrés de Ercilla posted on Thursday, May 29, 2008 - 1:08 pm

Does the new Mplus 5.1 have any other means to cope with sample selection in the case of count data?
One of my variables is visits to the doctor, and the other is how many of these visits are work-related (asked only if the first is not 0).
By the way, both variables are also right-censored (0, 1, 2, 3, 4 or more), with over 40% in the 0 count. Is there any way to cope with this type of censoring?
Many thanks in advance,
Fernando.

Fernando Terrés de Ercilla posted on Friday, May 30, 2008 - 1:19 am

I forgot to add that my sample size is 5236 and count frequencies are:

count	v1 (%)	v2 (%)
0	40.2	81.8
1	21.1	8.7
2	16.6	3.2
3	7.8	1.2
4 or more	13.7	2.7
missing	0.6	2.6

Bengt O. Muthen posted on Friday, May 30, 2008 - 7:30 am

Seems like you could formulate your dependent variable as "How many of your visits to the doctor (in the last x months) are work-related?" And then you could use a suitable count model - see the Mplus Web Talk on count modeling in 5.1 (see home page).

Fernando Terrés de Ercilla posted on Friday, May 30, 2008 - 9:16 am

So then v2 should only be considered for cases with a positive value of v1?
In that case the proportion of zeros in v2 drops to 42.9%, but with 41.4% missing values (which could be modeled).

Bengt O. Muthen posted on Sunday, June 01, 2008 - 1:00 am

No, I was thinking you would include the zeros, since not all visits are work-related. This way you could also have separate predictions of zeros and non-zeros (see my web talk).
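
One possible reading of a count model with a separate prediction of zeros, combined with the "4 or more" top category mentioned earlier in the thread, is a zero-inflated Poisson whose last category collects the upper tail. The Python sketch below is an assumption-laden illustration, not Mplus syntax and not necessarily the model described in the web talk (a hurdle model would be another option). The function names and the parameters `lam` and `pi_zero` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_category_probs(lam, pi_zero, top=4):
    """Probabilities of the categories 0, 1, ..., top-1 and 'top or more'
    under a zero-inflated Poisson with extra-zero probability pi_zero."""
    probs = [(1.0 - pi_zero) * poisson_pmf(k, lam) for k in range(top)]
    probs[0] += pi_zero              # structural zeros add to the 0 category
    probs.append(1.0 - sum(probs))   # right-censored tail: 'top or more'
    return probs


def grouped_loglik(counts, lam, pi_zero, top=4):
    """Log-likelihood of category counts (one count per category)."""
    probs = zip_category_probs(lam, pi_zero, top)
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)
```

Calling `grouped_loglik` with the number of respondents in each of the five categories gives a function of `lam` and `pi_zero` that could be maximized numerically. In a regression setting both parameters would depend on covariates, which this sketch leaves out.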