Phil Wood posted on Wednesday, August 03, 2011 - 8:37 am
This is a simple syntax question. I have a data set with five variables, continuously measured. Based on leverage statistics, I want to delete a particular observation, say #335. Writing a useobservations statement is cumbersome, given that the variables are all continuously measured. Is there any way to easily delete a given observation in the file without listing its particular values? This would make it easier to use the influence plots and rerun the analysis without influential observations. If it's not current syntax, it might be a useful thing to add. Just a thought. Thanks!
If you have an id variable, you can delete using the number the person has on the id variable.
Phil Wood posted on Wednesday, August 03, 2011 - 11:32 am
That's true, but in this case, it's just the variables. I think the best answer at this point is to savedata the influence diagnostics and then useobservations based on that, including just usevariables for only the analysis variables. It's a little cumbersome for large datasets. That's why I suggested that one might consider a reserved variable for the observation number (analogous to the _N_ variable in SAS), which would index the observations. Thanks.
It might be worthwhile to add an id variable to your data set. Then if you identify it using the IDVARIABLE option, you will see the ID number when you hold the cursor on the outlier in the plot and you can easily exclude the observations.
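For a data file that has no id variable yet, one could be added before the file is read into Mplus. The following is only a minimal sketch in Python, not part of Mplus: it assumes a free-format, whitespace-delimited text file with one observation per line, and the function name `add_id_column` and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def add_id_column(src, dst, start=1):
    """Prepend a sequential id to each non-blank line of a free-format data file.

    Assumption: one observation per line, whitespace-delimited values.
    Returns the number of observations written.
    """
    out = []
    next_id = start
    for line in Path(src).read_text().splitlines():
        if not line.strip():
            continue  # assumption: blank lines are not observations
        out.append(f"{next_id} {line.strip()}")
        next_id += 1
    Path(dst).write_text("\n".join(out) + "\n")
    return next_id - start
```

Calling `add_id_column("raw.dat", "with_id.dat")` would write a copy whose first column is the row number. That new column would then have to be listed first in the NAMES statement and could be named in the IDVARIABLE option.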
Phil Wood posted on Thursday, October 20, 2016 - 10:06 am
OK, so I tried the following approach: I defined an ID variable, scrid: idvariable is scrid; and then added the following: define: if (scrid eq 2758.0) then delete; I verified that this number appears in the file (it's actually 2758.00000000), but I'm dismayed to see that this observation still appears in the influence plots for the analysis. I tried moving it before and after the model statement, but nothing seems to affect it. Any ideas?
Hello, I have used this thread to try to help me exclude observations that are missing on a certain variable.
Long story short, I am running two analyses, and I am imputing for both of them. However, in the second analysis, I would like to exclude participants who are missing on a certain variable, but I want to use the same imputed data that I had created from the full sample. As suggested above, I have tried to include the following in my syntax:
USEOBSERVATIONS ARE (CONFLICT /= *);
However, I keep receiving this error:
*** ERROR ( (CONFLICT /= *) ) ^ ERROR
*** WARNING in VARIABLE command When a subpopulation is analyzed with TYPE=COMPLEX, standard errors may be incorrect. Use the SUBPOPULATION option instead of the USEOBSERVATIONS option to obtain correct standard errors.
*** ERROR ( (CONFLICT /= *) ) ^ ERROR
*** ERROR Missing matching left parenthesis.
*** ERROR Left numeric operand cannot be found.
Is there something I am missing in the coding? Thank you very much.
Looks like you are trying to compare against the missing symbol * in USEOBSERVATIONS and you can't do that. You can only do the comparison if the missing values are numeric. You can change * in the data to some number like 99999.
Dr. Muthen, I actually have a follow-up now that I've thought about where the * came from. When I imputed, the missing values were automatically assigned the * in all 100 imputed datasets. It would be difficult to change all missing values to a number in 100 different datasets. Is there a way to tell MPLUS to change it somewhere?
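Outside Mplus, one possible way to recode the * symbol in many files at once is a short script. The sketch below is in Python and rests on assumptions: the imputed data sets are whitespace-delimited text files, the file pattern `imp*.dat` and both function names are hypothetical, and 99999 does not occur as a real value in the data.

```python
from pathlib import Path


def recode_missing(path, out_path, old="*", new="99999"):
    """Replace every token that is exactly `old` with `new` in one data file.

    Returns how many tokens were recoded. Values are rewritten
    separated by single spaces (assumes free-format data).
    """
    recoded = 0
    out_lines = []
    for line in Path(path).read_text().splitlines():
        tokens = line.split()
        recoded += tokens.count(old)
        out_lines.append(" ".join(new if t == old else t for t in tokens))
    Path(out_path).write_text("\n".join(out_lines) + "\n")
    return recoded


def recode_all(in_dir, out_dir, pattern="imp*.dat"):
    """Recode every file in `in_dir` that matches `pattern` (hypothetical naming).

    Recoded copies are written to `out_dir` under the same file names.
    Returns a dict mapping file name to the number of recoded tokens.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    return {
        p.name: recode_missing(p, out / p.name)
        for p in sorted(Path(in_dir).glob(pattern))
    }
```

For example, `recode_all("imputed", "imputed_recoded")` would leave the original files untouched and write recoded copies to a second folder. The file that lists the imputed data sets would then need to point at those copies, and 99999 would presumably have to be declared as the missing value code in the input file.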