Hi, I have an assessment tool with 57 items and 5 factors. The problem is that the tool contains 2 gated questions for each factor. For the items on a factor to receive any score other than "Not Applicable", at least one of that factor's gated questions has to be answered yes. Otherwise, the factor's items are all filled in as "Not Applicable". As a result, half of all the entries in the data are NA responses. The other possible responses are (No Problem), (Managed Risk), (Problem), (Severe Problem). My distributions are non-normal.
I have read a thread on here where Dr. Bengt Muthen suggested that, since a gated question accounts for these NAs, the data are essentially MAR. The suggestion was to use TYPE=MISSING. However, that is a lot of data to impute.
Prior approaches: Collapsing the categories to binary values, or to 3 or 4 categories, and using WLSMV produces empty cells and correlations reaching 1.00.
The only approach that has worked is to arbitrarily scale the assessment as ordered categorical, from NA = -1 through Severe Problem = 3, and to estimate with MLR. This allows us to estimate the model and do some MI testing.
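For concreteness, the scoring can be sketched roughly as below. This is only an illustration in Python, not our actual processing code: the function names are made up, and the intermediate values (No Problem = 0, Managed Risk = 1, Problem = 2) are assumed from the -1 to 3 range described above.

```python
# Illustrative coding of the five response categories.
# Only NA = -1 and Severe Problem = 3 are stated in the post;
# the values in between are assumed to run 0, 1, 2 in order.
SCORES = {
    "Not Applicable": -1,
    "No Problem": 0,
    "Managed Risk": 1,
    "Problem": 2,
    "Severe Problem": 3,
}


def score_factor(gate_answers, responses):
    """Score one factor's items for one respondent (hypothetical helper).

    gate_answers: booleans for the factor's gated questions.
    responses: response labels for the factor's items.
    If no gated question was answered yes, every item is scored as NA.
    """
    if not any(gate_answers):
        return [SCORES["Not Applicable"]] * len(responses)
    return [SCORES[r] for r in responses]


# Both gates answered no, so all items become -1.
closed = score_factor([False, False], ["Problem", "No Problem"])
# One gate answered yes, so the items keep their own scores.
opened = score_factor([True, False], ["Problem", "Severe Problem", "No Problem"])
```

The point of the sketch is that -1 here is a structural code set by the gate, not a severity level the respondent chose, which is why treating it as the lowest ordered category is arbitrary.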
Because we are concerned about this approach, we have been exploring semicontinuous models as an alternative, but these are a computational bear. What are your thoughts on the above, and are there any other approaches you would consider suitable? Thanks. JB
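To show what I mean by the semicontinuous idea, here is a rough Python sketch of the data split, assuming the usual two-part layout: a binary indicator for whether the item applies, plus a severity score that is missing whenever it does not. The function name is hypothetical and -1 is assumed to be the NA code; this is only the recoding step, not the model itself.

```python
NA_CODE = -1  # assumed code for "Not Applicable"


def two_part_split(scores):
    """Split item scores into a binary part and a severity part.

    Returns (applicable, severity):
      applicable[i] is 1 if the item was scored, 0 if it was NA;
      severity[i] is the original score, or None (missing) if NA.
    """
    applicable = [0 if s == NA_CODE else 1 for s in scores]
    severity = [None if s == NA_CODE else s for s in scores]
    return applicable, severity


applicable, severity = two_part_split([-1, 0, 3, -1, 2])
```

With this layout the NA responses inform only the binary part, and the severity part is modeled just for the cases where the gate was open.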