I am running a CFA in Mplus with complex survey data. The weight variable that I am using is missing for some participants, because the study used a two-phase sampling approach in which only a subsample of cases progressed from Part 1 to Part 2. When I run the data in Mplus, I receive the following: *** ERROR Weight variable has missing value at observation 4651.
I would like to run this model without turning on LISTWISE, because otherwise there is an extreme amount of missing data.
I have the same problem with a longitudinal dataset (Times 1, 2, 3). I would like to use a longitudinal weight that spans Time 1 through Time 3.
I agree that eliminating observations without a weight value would address this problem, but I'm concerned that doing so might reduce covariance coverage & the amount of usable data.
I am analyzing a subpopulation, so I planned to run the analyses under FIML to make use of data points from Time 1 for participants who were not observed at Time 3, for example. (These participants may have left the sample between Time 1 and Time 3 due to death, non-response, etc.)
My concern is that if I eliminate observations without weights, I will also eliminate usable data from times 1 and 2, for example.
Is there another work-around for cases without a weight value that doesn't entail losing information in this way?
Thanks, Linda. Just to provide closure to the story, in case others run into this same problem:
We tried running the model with and without the weight variable. (There is no clustering or stratification in this data set).
When we deleted cases with a missing value for the weight, our model would not converge in Mplus. We're not sure whether this is due to the reduction in sample size or to some problem introduced by the weight itself. It may also be because we are analyzing a subpopulation.
When we run the model without the weight, everything works (relatively) OK and fits relatively well.
However, not using the weight means that our sample a) is not adjusted for attrition, non-response, and differential probability of selection, and b) is no longer nationally representative. Because we are not using the weight, we are going to use a more conservative significance criterion (.01) and live with these limitations.
I just updated to Mplus 6. I have a dataset of 16,500 subjects. I ran regression analyses using sample weights and MLR in Mplus 5.2, and the output reported the number of observations as my full dataset sample size. The same models in Version 6 seem to be dropping cases that do not have data for the dependent variable. Now my number of observations is smaller (I assume reflecting the attrition in this longitudinal dataset). I also get the following warnings:
*** WARNING
  Data set contains cases with missing on x-variables.
  These cases were not included in the analysis.
  Number of cases with missing on x-variables:  6067
*** WARNING
  Data set contains cases with missing on all variables except x-variables.
  These cases were not included in the analysis.
  Number of cases with missing on all variables except x-variables:  5
   2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
Could you clarify exactly what has changed in Version 6? Thanks.
In Version 5.21 the patterns of missing data included auxiliary variables. In Version 6, the patterns depend on only analysis variables. If this is not your case, please send the two outputs and your license number to firstname.lastname@example.org.
nina chien posted on Monday, August 23, 2010 - 8:47 am
I also updated to Mplus 6 (like William above). Whereas in Mplus 5 all cases were used, I am now receiving the following warning messages:
*** WARNING
  Data set contains cases with missing on x-variables.
  These cases were not included in the analysis.
  Number of cases with missing on x-variables:  5
*** WARNING
  Data set contains cases with missing on all variables except x-variables.
  These cases were not included in the analysis.
  Number of cases with missing on all variables except x-variables:  16
   2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
Accordingly, whereas my number of observations using Mplus 5 was 100, my number of observations using Mplus 6 is only 79.
Questions: 1) Are the Mplus 5 or the Mplus 6 results more "correct"? 2) *IF* the Mplus 6 results are more correct, does that mean I have to report an N of only 79 (instead of the previous 100)? Of course, we would much prefer reporting 100. Also, we are worried that we wouldn't be able to estimate as many parameters with 79 cases as we would with 100.
In Version 6 for models with all continuous variables using maximum likelihood estimation, we started estimating the model conditioned on x to be in line with the rest of the Mplus program. To obtain Version 5 results, mention the variances of the covariates in the MODEL command. With no missing data, the Version 5 and 6 results are identical.
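For anyone following along, here is a minimal sketch of what "mention the variances of the covariates" might look like in an input file. The variable names y, x1, and x2 are hypothetical placeholders, not taken from either poster's model, and the rest of the input (DATA, VARIABLE, ANALYSIS) is assumed to stay as it was.

```
MODEL:
  y ON x1 x2;   ! regression of the outcome on the covariates
  x1 x2;        ! mentioning the covariate variances brings x1 and x2
                ! into the model, so cases with missing values on
                ! them are no longer excluded
```

Note that once the covariates are brought into the model in this way, they are treated like the other modeled variables, which means distributional assumptions (normality, under maximum likelihood) are made about them as well. Whether that is acceptable depends on the covariates in question.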