Mplus Discussion >> IRT and item factor analysis of categorical items

Richard E. Zinbarg posted on Tuesday, August 15, 2006 - 7:16 pm

Hi Bengt,
On p. 259 of his 1999 book, Rod McDonald gives an equation relating the factor loading, lambda, to the IRT slope parameter b. In 12.17a, he states that lambda = b/(sqrt(1 + b^2)). Would the b in this equation correspond to the probit model slope that Mplus provides as factor loadings? Or should I think of the probit model slope that Mplus provides as the lambda in this equation?
Thanks very much!
Rick Zinbarg
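
For readers who want to try the numbers, the relation quoted above from McDonald's Eq. 12.17a can be evaluated and inverted in a few lines. This is only a minimal Python sketch: the function names are hypothetical, and b here is McDonald's slope, not an item difficulty. It does not settle which Mplus quantity plays which role, which is the question being asked.

```python
import math

def loading_from_slope(b):
    """Eq. 12.17a as quoted in the post: lambda = b / sqrt(1 + b^2)."""
    return b / math.sqrt(1.0 + b * b)

def slope_from_loading(lam):
    """Algebraic inverse: b = lambda / sqrt(1 - lambda^2), for |lambda| < 1."""
    if abs(lam) >= 1.0:
        raise ValueError("loading must lie strictly between -1 and 1")
    return lam / math.sqrt(1.0 - lam * lam)

b = 1.0                          # illustrative slope, not an estimate from any data
lam = loading_from_slope(b)      # 1/sqrt(2), about 0.707
print(lam, slope_from_loading(lam))
```

The round trip returns the starting slope, which is a quick check that the two functions are inverses of each other.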

Bengt O. Muthen posted on Tuesday, August 15, 2006 - 8:37 pm

Here is what we say in our teachings:

• The 2-parameter normal ogive IRT model uses
P(u = 1 | theta) = Phi[a (theta - b)]
where Phi is the standard normal distribution function,
a = discrimination,
b = difficulty.

• The 2-parameter logistic IRT model uses
P(u = 1 | theta) = 1/(1 + exp(-D a (theta - b)))

with D = 1.7 to make a and b close to those of the probit (normal ogive) model.
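
To see how close the two curves are with D = 1.7, here is a minimal Python sketch that evaluates both response functions on a grid and reports the largest gap. The grid range, the number of grid points, and the function names are assumptions made for illustration only.

```python
import math

D = 1.7  # scaling constant from the post above

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_normal_ogive(theta, a, b):
    """2-parameter normal ogive: Phi[a (theta - b)]."""
    return normal_cdf(a * (theta - b))

def p_logistic(theta, a, b):
    """2-parameter logistic with the D scaling: 1/(1 + exp(-D a (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def max_gap(a=1.0, b=0.0, lo=-6.0, hi=6.0, steps=2401):
    """Largest absolute difference between the two curves on an even grid."""
    width = (hi - lo) / (steps - 1)
    return max(
        abs(p_normal_ogive(lo + i * width, a, b) - p_logistic(lo + i * width, a, b))
        for i in range(steps)
    )

print(max_gap())  # stays below 0.01 in probability units
```

The same a and b therefore describe nearly the same curve under either link, which is the point of the constant.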

Richard E. Zinbarg posted on Tuesday, August 15, 2006 - 9:56 pm

Thanks for the very speedy reply! And in Mplus Web Notes #4, I know you give a different equation relating the IRT discrimination parameter to lambda from factor analysis. If I am understanding that Web Note correctly, the lambda in your equation 19 is the factor loading from an analysis of the tetrachoric correlations using a probit link rather than of the phi correlations (or observed covariances) among the observed variables using a linear regression model. Is that correct? If so, are you aware of any work that relates a factor loading from an analysis of tetrachorics using a probit link to a loading from an analysis of the phi correlations (or observed covariances) among the observed variables using a linear regression model?

It is clear to me that for the purposes of model testing and comparison, the analysis of tetrachorics using a probit link is the most appropriate, but in terms of estimating factor-analytically derived indices of reliability of composite scores, what quantities one should use in these reliability formulas is less clear. Rod McDonald's advice seems to be that, for the purpose of estimating factor-analytically derived reliability indices such as omega, one should just fit the linear model to the sample item covariance matrix. I am trying to figure out whether this is a strategy that is both reasonable and the only feasible one, and that should (presumably) satisfy reviewers.

Bengt O. Muthen posted on Thursday, August 17, 2006 - 11:14 am

For some reason, my earlier post included only half of what I intended, but as you say Web Note #4 has the formula in (19). There is not a simple relationship between the tetrachoric/probit-based loadings and loadings from linear modeling using phi's. I think Rod has written about such relations. I think one has to define what reliability should mean - if it refers to how well a factor in a probit/logit IRT model is captured by a sum of binary items, then I think one has to use a non-linear model, but that would not necessarily be the case if one has another definition.

Salma Ayis posted on Tuesday, September 05, 2006 - 5:17 am

Hi, I am a new user of IRT and still have a few questions, for which I would very much appreciate answers, advice, or references.
1- For a set of binary items, I would like to interpret my results in terms of logits for each item. Is it possible to get these logits as output without needing to compute them separately? If so, please let me know. If not, I can see in the output, in Model Results, that there is a formula stated as: IRT PARAMETERIZATION IN TWO-PARAMETER LOGISTIC METRIC WHERE THE LOGIT IS 1.7*DISCRIMINATION*(THETA - DIFFICULTY). What is theta exactly? I can see theta in my output, but I am unable to link it with the other parameters. Please advise!
2- If I use more than two categories, would I still have estimated difficulty and discrimination parameters for each item? Your advice is most appreciated!

Linda K. Muthen posted on Tuesday, September 05, 2006 - 10:26 am

The regression coefficients obtained using the CATEGORICAL option with the maximum likelihood estimator are logits. Theta in the formula is a factor score. The Theta in your output refers to a parameter in the model. If you use more than two categories, you will obtain difficulty and discrimination.
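
As a rough illustration of how the logit-scale estimates connect to the formula printed in the output, the sketch below equates the quoted logit, 1.7*DISCRIMINATION*(THETA - DIFFICULTY), with a logit written as -threshold + loading*factor. That second form, a single factor with mean 0 and variance 1, the function names, and the numeric values are all assumptions made for illustration, not output from any run.

```python
import math

D = 1.7  # constant shown in the quoted output line

def irt_from_logit_estimates(loading, threshold):
    """Match 1.7*a*(theta - b) to loading*theta - threshold (factor variance 1 assumed)."""
    a = loading / D            # discrimination
    b = threshold / loading    # difficulty
    return a, b

def prob_from_irt(theta, a, b):
    """Probability of u = 1 from the logit 1.7*a*(theta - b)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def prob_from_estimates(theta, loading, threshold):
    """Same probability from the assumed logit -threshold + loading*theta."""
    return 1.0 / (1.0 + math.exp(-(loading * theta - threshold)))

loading, threshold = 1.36, 0.68   # made-up values
a, b = irt_from_logit_estimates(loading, threshold)
for theta in (-1.0, 0.0, 1.0):
    print(theta, prob_from_irt(theta, a, b), prob_from_estimates(theta, loading, threshold))
```

Both columns agree, and the probability is one half when theta equals the difficulty.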

Salma Ayis posted on Friday, January 05, 2007 - 5:35 am

Dear Linda, further to your response on Tuesday, September 05, 2006, I am afraid I am still unsure where to find the logits. I am using Example 5.5 and have specified the CATEGORICAL option for my set of binary indicators. When you say the regression coefficients, what are these called in the output? Are they the estimates, or is another command needed to do these calculations? Many thanks for your anticipated response!

Linda K. Muthen posted on Sunday, January 07, 2007 - 9:06 am

The parameter estimates are shown under the column labeled Estimates. The output is described in the beginning of Chapter 17.

Lisa M. Yarnell posted on Wednesday, March 12, 2014 - 12:47 pm

Hello, I employed the D(1.7) output option for my IRT model with dichotomous indicators. We are using WLSMV estimation and hence a probit link. The output for the factor model and IRT parameterization sections do not show the same values for the loadings/discrimination levels and thresholds/difficulties.

I had thought that in employing the D(1.7) option, the factor model section of output would be translated to the IRT parameterization, and that the two sections of output would hence contain the same numbers.

Or is it true that the output continues to show the two different sets of values even when employing the D(1.7) output option?

Thank you and regards.

Lisa M. Yarnell posted on Wednesday, March 12, 2014 - 12:58 pm

Also, I have just found that with theta parameterization, the factor model and IRT parameterization output sections contain the same values, whether employing the D(1.7) option or not.

But when delta parameterization is used, I do not receive the same values in the factor model and IRT parameterization sections of output, whether I employ D(1.7) or not.

So is receiving the same solution across the two parameterizations more an issue of theta vs. delta, or using the translation constant?

Thank you.

Bengt O. Muthen posted on Friday, March 14, 2014 - 12:46 pm

The regular output and the IRT translation are not expected to show the same results since they are different parameterizations. The D=1.7 is another matter - it has to do with making probit and logit close.

Lisa M. Yarnell posted on Monday, March 17, 2014 - 7:41 pm

I was under the impression that when probit estimation is being done, employing D(1.7) makes the factor model and IRT parameterization outputs equal (not that it makes the probit and logit solutions close).

The default is probit for both the factor model and IRT output sections when WLSMV estimation is used, which happens with dichotomous items.

Can you comment on my statements here?

Bengt O. Muthen posted on Tuesday, March 18, 2014 - 2:02 pm

Your first paragraph is incorrect: 1.7 is to make the IRT logit close to the IRT probit. It has nothing to do with the factor model parameterization.

Your second paragraph is correct.
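
One way to picture the point about parameterizations is to start from a single item's discrimination and difficulty and write out what the loading and threshold would be under each scaling of the latent response variable. The sketch below does that in Python under stated assumptions: probit link, one group, one factor with mean 0 and variance 1, and scale factors of 1. Under theta the residual variance of the latent response is taken as 1. Under delta its total variance is taken as 1. The function names and numbers are hypothetical.

```python
import math

def theta_from_irt(a, b):
    """Theta parameterization (residual variance 1): loading = a, threshold = a*b."""
    return a, a * b

def delta_from_irt(a, b):
    """Delta parameterization (total variance 1): both terms shrink by sqrt(1 + a^2)."""
    scale = math.sqrt(1.0 + a * a)
    return a / scale, a * b / scale

def irt_from_theta(loading, threshold):
    """Back-translation under theta: a = loading, b = threshold / loading."""
    return loading, threshold / loading

def irt_from_delta(loading, threshold):
    """Back-translation under delta: a = loading / sqrt(1 - loading^2), b = threshold / loading."""
    return loading / math.sqrt(1.0 - loading ** 2), threshold / loading

a, b = 1.2, 0.5   # made-up item
print("theta:", theta_from_irt(a, b), "->", irt_from_theta(*theta_from_irt(a, b)))
print("delta:", delta_from_irt(a, b), "->", irt_from_delta(*delta_from_irt(a, b)))
```

Under these assumptions the loading coincides with the discrimination only in the theta parameterization. The D = 1.7 constant never enters, since everything here is on the probit scale.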

Keri Wong posted on Saturday, August 30, 2014 - 7:33 am

Dear Dr Muthen,
I'm trying to run IRT models on items with 3 response categories (no, sometimes, yes), and I'm most interested in the item characteristic curves for individuals responding 'yes'. However, the Mplus output doesn't produce the item discrimination values, presumably because the items are not binary. If so, is there another way to print the slopes/intercepts of the curves?

This is my input:
VARIABLE:
NAMES ARE
ID Gender agePriS T9 T9h T10 T10h
T11 T11h T8sr T8hr T3srr T3hrr
T5srr T5hrr;

USEVARIABLES ARE
T9 T9h T10 T10h T11 T11h T8sr T8hr
T3srr T3hrr T5srr T5hrr;

CATEGORICAL ARE ALL;
IDVARIABLE = ID;
MISSING ARE ALL (-99);

ANALYSIS:
ESTIMATOR = WLSMV;

MODEL: TrustH BY T10h T9h T11h T8hr;
TrustS BY T10 T9 T11 T8sr;
TrustG BY T5srr T3hrr T3srr T5hrr;

TRUSTH WITH TRUSTS;
TRUSTH WITH TRUSTG;
TRUSTS WITH TRUSTG;

!T11 WITH T11H;
!T5SRR WITH T3HRR;
!T8HR WITH T8SR;
!T8HR WITH T9H;
!T5SRR WITH T3SRR;

PLOT:
TYPE IS PLOT3;

OUTPUT:
SAMPSTAT TECH1 STANDARDIZED MODINDICES(3.84);

Thanks

Bengt O. Muthen posted on Saturday, August 30, 2014 - 10:02 am

With multiple factors, the Mplus parameterization is that used in IRT so no translation is needed. See our FAQ:

IRT parameterization using Mplus thresholds

You can view the ICCs in plots given by Mplus and focus on any category or sums of categories.

Keri Wong posted on Saturday, August 30, 2014 - 11:26 am

Just to clarify, do you mean that the thresholds in the outputs are essentially the "item discrimination" values? I have the plots but can't seem to produce any descriptives for them to get the slopes?

Thanks,
Keri

Bengt O. Muthen posted on Saturday, August 30, 2014 - 12:05 pm

No, that's not what I mean. Read pages 224-225 of the Cai et al. (2011) Psych Methods article.

The slopes are the loadings in this parameterization, that is, the slopes are found under BY in the output.
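
For anyone wanting numbers behind the plotted curve for the top category, a minimal Python sketch follows. It assumes an ordered probit model in which the probability of the highest category is Phi((loading*f - top threshold)/sqrt(residual variance)). The residual variance would be 1 under the theta parameterization and 1 - loading^2 * factor variance under delta with a scale factor of 1. Those conventions, the function names, and the values are assumptions for illustration, not Mplus output.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_top_category(f, loading, top_threshold, residual_var=1.0):
    """P(u = highest category | f) under the assumed ordered probit model."""
    return normal_cdf((loading * f - top_threshold) / math.sqrt(residual_var))

loading, top_threshold = 0.8, 0.4      # made-up estimates for one 3-category item
residual_var = 1.0 - loading ** 2      # delta-style residual variance, factor variance 1
for f in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f, round(p_top_category(f, loading, top_threshold, residual_var), 3))
```

The curve crosses 0.5 where f equals the top threshold divided by the loading, and it gets steeper as the loading grows.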

Mike Todd posted on Monday, October 06, 2014 - 2:43 pm

I am working with a simple example dataset and IRT model (single-group, unidimensional, 20 binary indicators named X1, X2, ..., X20) from a workshop given by Jonathan Templin. The model runs fine, and expected output and plot/graph files are generated.

I've had trouble obtaining ICC plots for some items using the mplus.R program. With the mplus.plot.irt.icc function I can get ICC plots for any combination of indicators X1, X2, and X10-X20, but if the plot specification includes any indicator from X3 to X9, I get the following error message:

Error in mplus.plot.irt.icc("/filepath/01 Fraction Subtraction Example/mplus example #1- 1PL model.gh5", :
The index for the indicator in uvar is out of range.

I have no problem opening and viewing this plot/graph file with a Windows version of Mplus. I've tried a couple of different variable naming conventions in the Mplus code (Xa,Xb,...,Xt and Vxa, Vxb,...,Vxt), but the results are the same--using the 3rd through 9th indicators throws an error.

If you'd like more information, please let me know, and I'll send the data, code, and output files to you and/or Thuy.

Thanks!

PS. In the example I was working with, the legend obscured the crucial part of the plot, so I tweaked the legend syntax in mplus.R a bit. I can pass along the modified R code too if you'd like.

Bengt O. Muthen posted on Monday, October 06, 2014 - 4:37 pm

Yes, please send files to support so we can take a look at it.

Mike Todd posted on Tuesday, October 07, 2014 - 4:39 pm

The revised R code Linda sent seems to have done the trick.

Thanks for the quick response!

jan mod posted on Wednesday, February 11, 2015 - 1:51 am

How are the theta scores calculated in Mplus in a 2P IRT model? I want to calculate them by hand.

Bengt O. Muthen posted on Wednesday, February 11, 2015 - 6:12 pm

Mplus uses the usual IRT formulas for the Expected A Posteriori (EAP) method - see IRT books. It takes a bit of programming.
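
To give a feel for the "bit of programming" involved, here is a minimal Python sketch of textbook EAP scoring for a 2-parameter logistic model. It assumes a standard normal prior and a plain equally spaced grid. The grid, the number of points, the function names, and the item values are illustrative assumptions, so results need not match Mplus factor scores exactly.

```python
import math

D = 1.7  # same scaling constant as in the 2PL formula earlier in the thread

def p_correct(theta, a, b):
    """2PL probability of u = 1."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def eap_score(responses, items, n_points=81, lo=-4.0, hi=4.0):
    """Posterior mean of theta given 0/1 responses (None = missing).

    items is a list of (a, b) pairs, one per response.
    """
    step = (hi - lo) / (n_points - 1)
    num = den = 0.0
    for i in range(n_points):
        theta = lo + i * step
        weight = math.exp(-0.5 * theta * theta)  # N(0,1) density up to a constant
        like = 1.0
        for u, (a, b) in zip(responses, items):
            if u is None:
                continue                          # skip missing responses
            p = p_correct(theta, a, b)
            like *= p if u == 1 else 1.0 - p
        num += theta * like * weight
        den += like * weight
    return num / den

items = [(1.0, -0.5), (0.8, 0.0), (1.2, 0.5)]   # made-up (a, b) values
print(eap_score([1, 1, 0], items))
```

The normalizing constants of the prior and of the grid spacing cancel in the ratio, which is why they are left out.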

Johannes Bauer posted on Saturday, June 27, 2015 - 8:35 am

Hi

I am trying to apply four different parameterizations that are described by Kamata & Bauer (2008, p. 139, Struct Eq Mod) to a CFA with binary items. I am using MLR because I want to relate the results to a 2PL IRT model. The four parameterizations are:

1) Conditional/reference indicator: lambda_1 = 1, tau_1 = 0, V(epsilon) = 1
2) Conditional/standardized factor: E(ksi) = 0, V(ksi) = 1, V(epsilon) = 1
3) Marginal/reference indicator: lambda_1 = 1, tau_1 = 0, V(y*) = 1
4) Marginal/standardized factor: E(ksi) = 0, V(ksi) = 1, V(y*) = 1

For the two marginal parameterizations Mplus gives the error message: "ALGORITHM=INTEGRATION does not support models with scale factors." So it seems to me that these parameterizations cannot be done with MLR. Is there a way to implement them?

Many thanks
Johannes

Bengt O. Muthen posted on Sunday, June 28, 2015 - 10:35 am

Regular IRT does not use scale factors - that is, {} statements using Mplus language.

If you like, send relevant output and license number to support.

Madison Silverstein posted on Thursday, September 14, 2017 - 10:00 am

Hello,

I have a follow up question after reading the FAQ IRT parameterization using Mplus thresholds. Should I be using the standardized or unstandardized values for difficulty and discrimination parameters?

I also have a question about plots. When I try to generate an ICC using the sum of categories 1-5, I get a horizontal line. However, when I generate an ICC using the sum of categories 2-5, the plot looks good. Since my response options are 0-4, I thought that having a 0 might be interfering with the plots, so I translated the scores from 1-5. However, the problem remained. Any advice would be much appreciated!

Thank you!

Madison

Bengt O. Muthen posted on Thursday, September 14, 2017 - 4:14 pm

Q1: Unstandardized.

Q2: If you have categories 1-5 and add them up, the probability is one for any x-axis value - so it should be a straight line.
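
The flat line can be checked numerically. The sketch below assumes a graded-response style model with a logit link, in which P(u >= k | f) = logistic(loading*f - threshold_k). The function names, loading, and thresholds are made up for illustration.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(f, loading, thresholds):
    """Probabilities of each ordered category given the factor value f."""
    # Cumulative probabilities P(u >= k), padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0] + [logistic(loading * f - t) for t in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

thresholds = [-1.5, -0.5, 0.5, 1.5]   # four ordered thresholds give five categories
for f in (-2.0, 0.0, 2.0):
    p = category_probs(f, 1.2, thresholds)
    # Sum over all five categories, then over categories 2-5 only.
    print(f, round(sum(p), 6), round(sum(p[1:]), 3))
```

Summing all five categories gives one at every factor value, hence the horizontal line. Dropping the lowest category leaves P(u >= 2), which rises with the factor.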

Mary Rose Mamey posted on Tuesday, September 04, 2018 - 6:24 am

Hello,
I'm using an IRT procedure as a step in developing a new measure. My current code does not produce the item discrimination and difficulty parameters in the output, though it does produce the ICC plots for all items (which require both the discrimination and difficulty parameters). I have used exactly the same code with a different dataset in the past, and it produced these values. I have compared the two inputs line by line to make sure there are no differences. Below is my input. Any suggestions for producing those discrimination and difficulty parameters?

Much appreciated!

Mary Rose

VARIABLE:
NAMES ARE
PID MSL_1-MSL_49;

USEVARIABLES ARE
MSL_1-MSL_49;

CATEGORICAL ARE
MSL_1-MSL_49;

MISSING ARE ALL (-88);

ANALYSIS:
ESTIMATOR = ML;

MODEL:
THETA BY MSL_1-MSL_49*;
THETA@1;
[THETA@0];

PLOT:
TYPE = PLOT1 PLOT2 PLOT3;

OUTPUT:
TECH1 TECH5 TECH8 TECH10;
STDyx;

Bengt O. Muthen posted on Tuesday, September 04, 2018 - 2:54 pm

Are your items binary?

For translations into IRT, see the paper on our IRT page:

http://www.statmodel.com/download/MplusIRT.pdf

Mary Rose Mamey posted on Tuesday, September 04, 2018 - 5:21 pm

Yes, they are. I ran the same code, also using an ML estimator, for another set of binary items and was able to produce the difficulty and discrimination parameters in the output. Is the suggestion from this paper to use a different estimator because those items are binary?

Thank you!

Bengt O. Muthen posted on Tuesday, September 04, 2018 - 5:58 pm

Are you saying that you get discrimination and difficulty only with ML and not with WLSMV? For binary items you should get discrimination and difficulty. For us to see what's going on, send your full output and data to Support along with your license number.

owis eilayyan posted on Friday, January 11, 2019 - 6:32 am

Hello,
I am working on different unidimensional IRT models and want to assess the unidimensionality assumption. What tests are available in Mplus to assess this assumption other than confirmatory factor analysis?

Thank you,
Owis

Bengt O. Muthen posted on Friday, January 11, 2019 - 7:25 am

Check the FAQ on our website: Estimator choices with categorical outcomes

Apart from overall tests of model fit provided by WLSMV and Bayes, there is also item pairwise testing using TECH10. Note that CFA is not needed - EFA also gives these tests.

owis eilayyan posted on Friday, January 11, 2019 - 9:01 am

Thank you Dr. Muthen
So if I have model fit statistics, does that mean my model meets the unidimensional assumption?

Regards,
Owis

Bengt O. Muthen posted on Saturday, January 12, 2019 - 11:52 am

It means that you have not rejected the model of unidimensionality. You never use a good test of fit result to mean that a model is "accepted"; just "not rejected".

owis eilayyan posted on Tuesday, January 15, 2019 - 4:46 am

Thank you,
Owis