I am using complex data (PISA). There are 5 plausible values for test scores in math and reading. I am estimating a path model, where reading is an indep. var and math is the dep. var.
I am using the type = imputation command in conjunction with the cluster, weight and complex command.
1.) Is it necessary to use 25 Data sets (i.e.: Dat1 = mathpv1, readpv1; Dat2 = mathpv1, readpv2; ...; Dat6 = mathpv2, readpv1, ...) or would it also be correct to use just 5 data sets (Dat1 = mathpv1, readpv1; Dat2 = mathpv2, readpv2, ...)?
2.) What happens if I additionally use multiple imputation for the other variables (x1 - x10) of the model? Are 5 data sets (mathpv1, readpv1, x1imput1, x2imput1, ...) enough? Or do I also have to create data sets for each combination of imputed values?
Yes. And typically you would have imputed data sets where each includes all your latent variables (math and reading in your case). So you would have say 5 of those, not 5 for one of the latents and 5 for the other.
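The paired layout described in this answer can be sketched in Python as a data-preparation step. This is only an illustration under stated assumptions: the column names (mathpv1, readpv1, ...), the file names (pisa_pv1.dat, implist.dat) and the helper build_paired_datasets are hypothetical, and the files are written as plain whitespace-delimited text with no header row.

```python
import os
import tempfile


def build_paired_datasets(rows, n_pv, out_dir, other_cols=()):
    """Write one data file per plausible-value index (hypothetical helper).

    File k holds mathpv<k> and readpv<k> together, plus any other columns,
    so n_pv files are produced rather than n_pv * n_pv combinations.
    A list file naming the data files, one per line, is written as well.
    """
    names = []
    for k in range(1, n_pv + 1):
        name = f"pisa_pv{k}.dat"  # assumed file naming scheme
        cols = [f"mathpv{k}", f"readpv{k}", *other_cols]
        with open(os.path.join(out_dir, name), "w") as fh:
            for row in rows:
                fh.write(" ".join(str(row[c]) for c in cols) + "\n")
        names.append(name)
    with open(os.path.join(out_dir, "implist.dat"), "w") as fh:
        fh.write("\n".join(names) + "\n")
    return names


# Toy records with two plausible values per domain and one weight column.
rows = [
    {"mathpv1": 501.2, "mathpv2": 498.7, "readpv1": 510.4, "readpv2": 507.9, "w": 1.5},
    {"mathpv1": 455.0, "mathpv2": 460.3, "readpv1": 470.1, "readpv2": 466.8, "w": 0.8},
]

with tempfile.TemporaryDirectory() as tmp:
    print(build_paired_datasets(rows, n_pv=2, out_dir=tmp, other_cols=("w",)))
```

The list file is what DATA: FILE would point to when TYPE = IMPUTATION is used. If other variables are multiply imputed as well, the k-th imputation of each would go into file k in the same way.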
Hi, I have a small question. The DATA IMPUTATION command has an option for rounding the number of decimals for imputed continuous variables (ROUNDING=). But how do I set the number of decimals for plausible values, i.e. imputations for a latent variable? When I put the name of the latent variable in the rounding option (for instance ROUNDING = f1 (5); ) it does not work. Thank you, Artur
Dear Dr. Muthen: I am doing a bifactor analysis. I want to get plausible values for a bifactor model. Can Mplus do that? I have tried adding the plausible value code to the bifactor model, but it doesn't work. Am I doing something wrong, or can Mplus not do that?
Jan Zirk posted on Tuesday, September 25, 2012 - 6:14 pm
Dear Bengt or Tihomir,
In Asparouhov & Muthen (2010; Plausible Values for Latent Variables Using Mplus) you mention the plausible values (PVs), but when they are extracted for a categorical variable or a latent factor, Mplus gives us mean, median, SD and CI values. Did you mean the mean or the median PVs in Table 3?
In one of my PhD studies I carried out a secondary analysis of Swedish PISA data. The SEM models were tested with Mplus using the maximum likelihood parameter estimator (MLR) with the two-level complex analysis type. PISA 2009 data were used.
When I performed the analysis I was not aware of the possibility to use all 5 plausible values offered for each student. Instead I used one of the plausible values and tested the models with each PV at a time and compared the five outputs (that did not differ much).
However, one of the anonymous reviewers of my manuscript informed me about the possibility to perform the testing of the CFA models and estimating the parameters using all the five PVs in Mplus in order to get correct standard errors (since group-differences are in focus).
How is this done? Should I use type=imputation as referred to above by Christoph? What will the input instruction look like when it is supposed to call for five different data files? Is it all done in one analysis?
Yes, use TYPE=IMPUTATION and you will get averaged estimates and correct SEs and chi-2 using all the imputations. See UG ex 11.8, part 2. You find the data, including the "implist" file on our website under the User's Guide examples.
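To make "averaged estimates and correct SEs" concrete, here is a minimal Python sketch of the pooling arithmetic, assuming the standard Rubin (1987) combining rules for multiply imputed data. Mplus does this internally under TYPE=IMPUTATION; the function name pool_rubin and the numbers are made up for illustration.

```python
import math
import statistics


def pool_rubin(estimates, std_errors):
    """Combine one parameter's estimates over m imputed data sets.

    Returns (pooled estimate, pooled standard error), where the total
    variance is within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists from at least two data sets")
    pooled = statistics.fmean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = statistics.fmean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the m estimates.
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Made-up results for one regression coefficient over five plausible values.
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
std_errors = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_est, pooled_se = pool_rubin(estimates, std_errors)
print(pooled_est, pooled_se)
```

Because the between-imputation term is never negative, the pooled SE is at least as large as the root mean square of the single-data-set SEs. The additional amount reflects uncertainty about the plausible values that an analysis of a single PV ignores.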
we're scratching our heads over this output message we've been getting:
*** ERROR in DATA IMPUTATION command Unknown option: PLAUSIBLE
are there circumstances under which even proper use of the plausible command would produce this error? or perhaps somehow we are using it incorrectly?
a slightly shortened version of our input file is below, in case that helps. any insight much appreciated! we've tried a number of permutations of the below, but keep getting the same message. we're using mplus 7.
data: file is informant2.dat;
variable: names are [quite a few];
  IDVARIABLE = id;
  usevariables are avfmps12 avfmps16 avfmps19 avfmps24 avfmps30
    fmps12 fmps16 fmps19 fmps24 fmps30;
  missing are avfmps12 avfmps16 avfmps19 avfmps24 avfmps30
    fmps12 fmps16 fmps19 fmps24 fmps30 (-99);
  categorical are fmps12 fmps16 fmps19 fmps24 fmps30;
analysis: estimator = BAYES;
model: [is a complicated factor structure]
DATA IMPUTATION: PLAUSIBLE = hplau.dat; SAVE = hplau2.dat;
output: TECH1 TECH8;
I don't know what to recommend here. It sounds like the first data set has more subjects than the second, but the second has more variables - the latent variable's plausible values. Not sure what the final modeling would be.
Dan Cloney posted on Monday, October 06, 2014 - 2:52 pm
Thank you for your response.
You are right, the first data set contains more subjects than the second. e.g., u11, u12, u13...u33 and 3000 subjects long, with some missing data.
The second data set contains fewer variables than the first: 5 PVs for one latent variable, e.g., f1_pv1...f1_pv5, and it is 2000 subjects long.
The final model is intended to be a growth model, that includes f1 as a (time-invariant) covariate.
Then I think all you can do is analyze the n=2000 subjects in common to the two data sets, merging the PV data sets with the n=2000 subset of the other data set to get those observed variables. Then use Type=Imputation data input.
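The merge step suggested here could look roughly like the following Python sketch. It assumes both data sets share a subject identifier (called id here, which is an assumption) and that the plausible values are stored in wide form as f1_pv1 ... f1_pv5, as described above. The helper name merge_pv_with_observed and the toy values are hypothetical.

```python
def merge_pv_with_observed(observed, pv_wide, n_pv, id_key="id", factor="f1"):
    """Build one merged data set per plausible value (hypothetical helper).

    Only subjects present in both inputs are kept. Data set k carries the
    observed variables plus a single column `factor` holding the k-th PV.
    """
    obs_by_id = {row[id_key]: row for row in observed}
    data_sets = []
    for k in range(1, n_pv + 1):
        rows = []
        for pv in pv_wide:
            obs = obs_by_id.get(pv[id_key])
            if obs is None:
                continue  # subject has PVs but no observed record
            rows.append({**obs, factor: pv[f"{factor}_pv{k}"]})
        data_sets.append(rows)
    return data_sets


# Toy data: three subjects with observed items, PVs for only two of them.
observed = [
    {"id": 1, "u11": 0, "u12": 1},
    {"id": 2, "u11": 1, "u12": 1},
    {"id": 3, "u11": 1, "u12": 0},
]
pv_wide = [
    {"id": 1, "f1_pv1": -0.31, "f1_pv2": -0.12},
    {"id": 2, "f1_pv1": 0.84, "f1_pv2": 0.95},
]

merged = merge_pv_with_observed(observed, pv_wide, n_pv=2)
print(len(merged), [len(ds) for ds in merged])
```

Each merged data set would then be written to its own file and named in the list file used with Type=Imputation.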
Hi Bengt and Linda, we are running estimates (means) across 20 plausible values using TYPE=IMPUTATION for one variable in our data set, but are not getting standard errors for the mean score on the variable created through the combination of 20 p.v.'s. Can you help us know how to get the standard errors? We did not specify the variable created from the 20 p.v.'s as latent. Do we need to?
Or is there a specific output command that we need to utilize in order to get the standard errors?
We ran the analysis with gender as a grouping variable, and just receive this output (with no standard errors):
Hello, we specified our syntax as above, but are still not receiving SEs (see select input and output below). Do we need to specify any different options to receive SEs? We cannot send the data because it is restricted use.
DATA: FILE IS "G4reading2021215list.txt";
  type=imputation;
VARIABLE: NAMES ARE dsex origwt srwt01-srwt62 mathcomp;
  Usevariable is mathcomp;
  MISSING ARE .;
  WEIGHT IS origwt;
  REPWEIGHTS = srwt01-srwt62;
  GROUPING IS dsex (1=males, 2=females);
ANALYSIS: TYPE = COMPLEX BASIC;
  REPSE = JACKKNIFE2;
Dear Tihomir, I do not see the SE estimates. Our entire RESULTS section of the output is below.
Do we need to select an estimation type other than COMPLEX BASIC? Should we be using TYPE = COMPLEX instead (not specifying BASIC)? Is there an Output option that we need to select, in order to see the SEs? Thank you.
RESULTS FOR BASIC ANALYSIS
NOTE: These are average results over 20 data sets.
ESTIMATED SAMPLE STATISTICS FOR MALES
Means
              MATHCOMP
 1             218.521
Covariances
              MATHCOMP
 MATHCOMP     1413.018
Correlations
              MATHCOMP
 MATHCOMP        1.000
ESTIMATED SAMPLE STATISTICS FOR FEMALES
Means
              MATHCOMP
 1             225.236
Covariances
              MATHCOMP
 MATHCOMP     1283.463
Correlations
              MATHCOMP
 MATHCOMP        1.000
Mplus diagrams are currently not available for TYPE=BASIC. No diagram output was produced.
I have run imputation using Bayesian estimation for a set of data with missing values. The data is categorical (binary or ordinal) and I would like for the imputed values to remain between 0-7, and not include any decimal values.
I use the commands: VALUES = 0 1 2 3 4 5 6 7; ROUNDING = 0;
I am getting the error *** ERROR in DATA IMPUTATION command Missing values at the end of the ROUNDING option: 0
3) I am finding that standard errors for coefficients in the model are INCREASING rather than decreasing when twenty plausible values are used, relative to when I ran the model based on 1 pv.
Is this typical, and what is the reason for this? I know that pv's help derive *unbiased* standard errors...but what is the effect of pvs on precision?
Do the standard errors increase with 20 pv's because there is greater variation (for the estimated regression coefficient) across all 20 runs of the model than there was in the model based on a single p.v.?
Michelle Wu posted on Sunday, February 04, 2018 - 2:28 pm
Hi Linda and Bengt,
I'm new to Mplus. I'm trying to conduct a path analysis with PISA 2012 and, after reading this thread, I'm still confused about how to handle the plausible values for student performance that are provided in the PISA dataset.
Suppose the plausible variables are pv1 pv2 pv3 pv4 pv5, should I separate these values, along with the covariates into five files and then use DATA: FILE IS implist.dat; TYPE = IMPUTATION;?
Or is it correct to specify pv1-pv5 as imputed values just by using TYPE = IMPUTATION;?
Another thought of mine is to set these five pvs as the imputed values of a newly created variable pv0 in say STATA. Then read this file in Mplus. Is this doable at all?
Please guide me with some directions. Thank you so much.