 William Martinez posted on Monday, March 21, 2016 - 5:54 pm

I ran the multiple imputation analysis and now want to use the 10 imputed datasets to run a series of multilevel models. However, I received an error message about an unknown character in my dataset. When I inspected the imputed datasets, I noticed that the saved files include "*" for any missing values in the auxiliary variables (which is fine, I can easily delete these). The bigger issue is that one of my auxiliary variables is my cluster ID variable, and any value longer than 6 characters is written in these datasets as a string of asterisks "***************". This seems to happen because Mplus writes this variable to three decimal places, possibly making it too wide to be displayed.

Two questions:

1) Can I assign a missing data value when saving the imputed files?

2) How do I resolve the issue with the cluster ID variable? Is there a way to force Mplus to treat this variable as categorical when saving the imputed files, or to have Mplus save this particular variable without three decimal places?

Thanks for your help.
 Linda K. Muthen posted on Tuesday, March 22, 2016 - 2:21 pm
1. No. An asterisk is used. You should not delete them. You should use MISSING=*; when you read those files.

2. The cluster ID variable should be on the AUXILIARY list. It should not be used to impute data nor should data be imputed for it.
 William Martinez posted on Tuesday, March 22, 2016 - 2:44 pm
Thank you for your reply. I realized I had originally made a mistake in the MISSING command, which is why MISSING = * was not working for me.

To clarify, I did have the cluster ID variable on the AUXILIARY list. However, in the imputed files, many of the values are replaced by *********** because the IDs are too long. The reason is that Mplus writes all values in the imputed files to three decimal places (even the auxiliary variables, of which my cluster ID is one). Values of up to 6 digits are displayed, but it seems that values of 8 digits or more are not.
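
One possible reading of the symptom described above is a fixed-width numeric field that is filled with asterisks when the formatted number does not fit. The Python sketch below only illustrates that general behaviour. The function name `fixed_width` and the field width of 10 characters are assumptions chosen for the example, not Mplus's documented output format.

```python
def fixed_width(value, width=10, decimals=3):
    """Format a number in a fixed-width field with a set number of decimals.

    If the formatted text is wider than the field, return a run of
    asterisks instead. The width of 10 is an assumed value for
    illustration only.
    """
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        return "*" * width
    return text.rjust(width)


# A 6-digit ID plus ".000" is exactly 10 characters, so it fits.
print(fixed_width(123456))    # 123456.000
# An 8-digit ID plus ".000" needs 12 characters, so the field overflows.
print(fixed_width(12345678))  # **********
```

Under this assumption, the three appended decimal places take four of the available characters, which would leave room for only six digits of the ID.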

So then I tried converting these IDs into numbers with three decimal places, so that Mplus would not need to add three more decimal places (while keeping the identification system in place, albeit with a decimal point). The resulting output has truncated values for all my cluster IDs: the IDs are shown only up to the 5th digit and no more.

How do I get around these issues? I can't use the multiple imputation function in an easy manner without having a cluster ID attached to this data.

 William Martinez posted on Tuesday, March 22, 2016 - 3:08 pm
Actually, I just realized what is happening. In the multiple imputation files, the cluster IDs are being saved correctly to three decimal places. However, I just remembered that cluster ID variables have to be integers, which is why everything after the decimal point is truncated.

Since I have only 657 unique clusters, I used Excel to write a formula that assigns each cluster a value from 1 to 657. This kept the IDs under the 8-digit limit, and it worked. However, if there is an easier way to do this in Mplus, such as setting the default for the auxiliary variables in the saved imputed files, that would be very helpful.
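
The recoding step described above (done in Excel in the post) could also be sketched in Python. This is a minimal illustration, not the poster's actual formula: the function name `recode_cluster_ids` and the sample ID values are hypothetical.

```python
def recode_cluster_ids(ids):
    """Map each distinct cluster ID to a consecutive integer starting at 1.

    IDs are numbered in order of first appearance. Returns the recoded
    list and the lookup table from original ID to new integer.
    """
    mapping = {}
    recoded = []
    for raw in ids:
        if raw not in mapping:
            mapping[raw] = len(mapping) + 1
        recoded.append(mapping[raw])
    return recoded, mapping


# Hypothetical long cluster IDs, kept as strings so no digits are lost.
original_ids = [
    "2016030100417",
    "2016030100417",
    "2016030100532",
    "2016030199001",
    "2016030100532",
]

new_ids, lookup = recode_cluster_ids(original_ids)
print(new_ids)  # [1, 1, 2, 3, 2]
```

Keeping the `lookup` table makes it possible to map the short integer IDs back to the original identifiers after the analysis.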
 Linda K. Muthen posted on Wednesday, March 23, 2016 - 12:40 pm
We need to see what you are doing to understand the problem, which we have not seen before. Please send the output, data, and your license number to
