I am running a CFA in Mplus with complex survey data. The weight variable that I am using is missing for some participants, because the study used a two-phase sampling approach in which only a subsample of cases progressed from Part 1 to Part 2. When I run the data in Mplus, I receive the following: *** ERROR Weight variable has missing value at observation 4651.
I would like to run this model without turning on LISTWISE, because otherwise there is an extreme amount of missing data.
I have this same problem with a longitudinal dataset (Times 1, 2, 3). I would use a longitudinal weight that spans Time 1 to Time 3.
I agree that eliminating observations without a weight value would address this problem, but I'm concerned that doing so might reduce covariance coverage & the amount of usable data.
I am analyzing a subpopulation and so planned to run analyses under FIML conditions to make use of datapoints from Time 1 for participants who may not be sampled at Time 3, for example. (These participants may've left the sample due to death, non-response, etc. from time 1 to time 3).
My concern is that if I eliminate observations without weights, I will also eliminate usable data from times 1 and 2, for example.
Is there another work-around for cases without a weight value that doesn't entail losing information in this way?
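One way to gauge how much information is at stake is to count, before deleting anything, how many cases lack the weight yet still have observed data at earlier waves. A minimal Python sketch of that check, run on the data file before it goes to Mplus. The column names (`wt`, `y_t1`, `y_t2`, `y_t3`), the toy rows, and the -99 missing code are assumptions for illustration, not taken from the poster's data:

```python
MISSING = -99  # assumed missing-value code

# Hypothetical rows: one longitudinal weight plus one outcome per wave.
rows = [
    {"wt": 1.4, "y_t1": 3.0, "y_t2": 2.5, "y_t3": 2.0},
    {"wt": MISSING, "y_t1": 4.0, "y_t2": 3.5, "y_t3": MISSING},
    {"wt": MISSING, "y_t1": 2.0, "y_t2": MISSING, "y_t3": MISSING},
    {"wt": 0.8, "y_t1": 1.0, "y_t2": MISSING, "y_t3": 1.5},
]

def lost_information(rows, weight="wt", outcomes=("y_t1", "y_t2", "y_t3")):
    """Count the cases, and the observed outcome values within them,
    that would be discarded by deleting cases with a missing weight."""
    dropped_cases = 0
    dropped_values = 0
    for row in rows:
        if row[weight] == MISSING:
            dropped_cases += 1
            dropped_values += sum(row[v] != MISSING for v in outcomes)
    return dropped_cases, dropped_values

cases, values = lost_information(rows)
print(f"{cases} cases and {values} observed outcome values would be dropped")
```

If the second count is small relative to the full data set, deleting the unweighted cases costs little; if it is large, the concern about reduced covariance coverage is well founded.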
Thanks, Linda. Just to provide closure to the story, in case others run into this same problem:
We tried running the model with and without the weight variable. (There is no clustering or stratification in this data set).
When we deleted cases with a missing value for the weight, our model would not converge in Mplus. We're not sure if this is due to the reduction in sample size or to some problem introduced by the weight itself. It may also be because we are analyzing a subpopulation.
When we run the model without the weight, everything works (relatively) OK and fits relatively well.
However, not using the weight means that our sample a) would not be adjusted for attrition, non-response, and differential probability of selection, and b) would no longer be nationally representative. Because we are not using the weight, we are going to use a more conservative significance criterion (.01) and will have to live with these limitations.
I just updated to Mplus 6. I have a dataset of 16,500 subjects. I ran regression analyses using sample weights and MLR in Mplus 5.2, and the output reported the number of observations as my full dataset sample size. The same models in Version 6 seem to be dropping cases that do not have data for the dependent variable. Now my number of observations is smaller (I assume reflecting the attrition in this longitudinal dataset). I also get the following warnings:
*** WARNING
Data set contains cases with missing on x-variables. These cases were not included in the analysis.
Number of cases with missing on x-variables: 6067
*** WARNING
Data set contains cases with missing on all variables except x-variables. These cases were not included in the analysis.
Number of cases with missing on all variables except x-variables: 5
2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
Could you clarify exactly what has changed in Version 6? Thanks.
In Version 5.21 the patterns of missing data included auxiliary variables. In Version 6, the patterns depend on only analysis variables. If this is not your case, please send the two outputs and your license number to email@example.com.
nina chien posted on Monday, August 23, 2010 - 8:47 am
I also updated to Mplus 6 (like William above). Whereas in Mplus 5 all cases were utilized, now I am receiving the following warning messages:
*** WARNING
Data set contains cases with missing on x-variables. These cases were not included in the analysis.
Number of cases with missing on x-variables: 5
*** WARNING
Data set contains cases with missing on all variables except x-variables. These cases were not included in the analysis.
Number of cases with missing on all variables except x-variables: 16
2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
Accordingly, whereas my number of observations using Mplus 5 was 100, my number of observations using Mplus 6 is only 79.
Questions: 1) Is either set of results (Mplus 5 or Mplus 6) more "correct"? 2) *IF* the Mplus 6 results are more correct, does that mean I have to report an N of only 79 (instead of the previous 100)? Of course, we would much prefer reporting 100. Also, we are worried that we wouldn't be able to estimate as many parameters with 79 as we would with 100.
In Version 6 for models with all continuous variables using maximum likelihood estimation, we started estimating the model conditioned on x to be in line with the rest of the Mplus program. To obtain Version 5 results, mention the variances of the covariates in the MODEL command. With no missing data, the Version 5 and 6 results are identical.
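The counts in the two warnings can be reproduced by hand. A rough Python sketch of the exclusion logic described in this reply, assuming one covariate `x`, two outcomes `y1` and `y2`, and `None` for a missing value (all hypothetical names and data). It illustrates the counting only and is not Mplus's actual implementation:

```python
# Hypothetical data: None marks a missing value.
cases = [
    {"x": 1.0, "y1": 2.0, "y2": 3.0},    # complete: kept
    {"x": None, "y1": 2.0, "y2": 3.0},   # missing on x: excluded
    {"x": 0.5, "y1": None, "y2": None},  # only x observed: excluded
    {"x": 0.2, "y1": 1.0, "y2": None},   # outcomes partly observed: kept
]

def count_exclusions(cases, x_vars=("x",), y_vars=("y1", "y2")):
    """Return (missing on x, missing on all but x, kept) when the
    model is estimated conditional on the x-variables."""
    missing_x = 0
    missing_all_y = 0
    kept = 0
    for case in cases:
        if any(case[v] is None for v in x_vars):
            missing_x += 1
        elif all(case[v] is None for v in y_vars):
            missing_all_y += 1
        else:
            kept += 1
    return missing_x, missing_all_y, kept

print(count_exclusions(cases))
```

Applied to the post above, the same bookkeeping gives 100 - 5 - 16 = 79 cases, which matches the N that Mplus 6 reported.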
I have read people's posts about missing data on replicate weights above from 2008 and 2010 and the responses saying there is no solution to this problem.
I am having this problem now and thought it worth checking whether there is now a solution (apart from deleting cases with missing replicate weights)?
Specifically the problem is this:
I am trying to apply a basic weight and 33 replicate weights for a longitudinal data set. Some values in the replicate weights are missing when participants were missing from the data collection wave. I have told Mplus that missing data are -99 but when I run the analysis the error message says: *** ERROR Weight variable has missing value at observation 3.
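Before running the analysis, it can help to see which weight columns trigger this error and at which observations. A small Python sketch, assuming the weights have been read into lists keyed by hypothetical names (`w0` for the basic weight, `rw1` and `rw2` standing in for the replicate weights) and that -99 is the missing code, as in the post:

```python
MISSING = -99  # missing-value code, as declared to Mplus in the post

# Hypothetical weights: one list per weight column, one entry per case.
weights = {
    "w0":  [1.2, 0.9, 1.1, 1.0],
    "rw1": [1.3, 0.8, MISSING, 1.0],
    "rw2": [1.1, MISSING, MISSING, 0.9],
}

def missing_weight_report(weights, missing=MISSING):
    """Map each weight column to the 1-based observation numbers
    at which its value is missing; columns with none are omitted."""
    report = {}
    for name, values in weights.items():
        obs = [i for i, v in enumerate(values, start=1) if v == missing]
        if obs:
            report[name] = obs
    return report

print(missing_weight_report(weights))
```

The observation numbers are 1-based so that they can be compared with the observation number in the Mplus error message.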
Greetings. I have read your above information that weight variables with missing values should be deleted. I am doing a two way complex and my user manual has indicated NOT to delete any participants because it would impact the weight structure. Would an option be to exclude them using the subpop command?
I also have cases with zero for weights but my understanding from your other posts is that these are different than missing weights.
You can probably find out from the data publisher how the weight is constructed and it might be possible to do multiple imputation for the variables used for the weights construction. So MI followed by weight construction seems like a solid method.
If that is not possible, I would either ignore the weight variable or use the average weight for those observations with a missing weight. Neither approach is perfect. In some circumstances (a small number of observations with a missing weight), I would recommend listwise deletion.
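The last two fallbacks can be sketched as a data-preparation step before the file is passed to Mplus. A minimal Python illustration, assuming -99 as the missing code and a toy weight vector (both hypothetical). It shows the mechanics only and does not remove the caveats stated above:

```python
from statistics import mean

MISSING = -99  # assumed missing-value code

def fill_with_average_weight(weights, missing=MISSING):
    """Replace each missing weight with the mean of the observed weights."""
    observed = [w for w in weights if w != missing]
    if not observed:
        raise ValueError("no observed weights to average")
    avg = mean(observed)
    return [avg if w == missing else w for w in weights]

def listwise_keep(weights, missing=MISSING):
    """Return the 0-based indices of cases retained when cases
    with a missing weight are deleted."""
    return [i for i, w in enumerate(weights) if w != missing]

weights = [1.0, MISSING, 2.0, 3.0]
print(fill_with_average_weight(weights))
print(listwise_keep(weights))
```

Comparing `len(listwise_keep(weights))` with `len(weights)` gives a quick sense of whether the number of cases with a missing weight is small enough for listwise deletion to be reasonable.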