Multilevel IRT Model in Mplus

June Zhou posted on Tuesday, January 29, 2013 - 8:57 pm

I am trying to run multilevel IRT Rasch models in Mplus and I have some questions, listed below:

1. In the 2-level 1PL IRT model, where items (level 1) are nested in students (level 2), can I use Ex. 7.27 and constrain all of the loadings to be 1? How do I add student-level predictors to the model?

2. In the 3-level 1PL IRT model, where items (level 1) are nested in students (level 2) and students are nested in schools (level 3), should I use Ex. 10.5 and constrain all of the loadings to be 1? How do I add school-level predictors to the model?

Thank you in advance.

Linda K. Muthen posted on Wednesday, January 30, 2013 - 10:55 am

1. This is not a multilevel model in Mplus. See Examples 5.2 and 5.5.

2. This is a two-level model in Mplus. See Example 9.7 which includes a between-level covariate.
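
As a rough illustration of answer 1 (a sketch, not taken from the User's Guide), a single-level Rasch model with a student-level predictor might be set up as below. The file name and variable names are hypothetical, and the items are assumed to be stored as separate columns with one row per student.

```mplus
TITLE:    Single-level Rasch model with a student-level predictor (sketch)
DATA:     FILE = items.dat;        ! hypothetical file, one row per student
VARIABLE: NAMES = u1-u10 x;        ! hypothetical names: 10 binary items, 1 predictor
          CATEGORICAL = u1-u10;
ANALYSIS: ESTIMATOR = MLR;         ! maximum likelihood, logit link by default
MODEL:    f BY u1-u10@1;           ! Rasch: every loading fixed at 1
          f;                       ! (residual) variance of f is estimated
          f ON x;                  ! student-level predictor of ability
```

For the school-level version in question 2, the same idea would carry over to a TYPE = TWOLEVEL input with a CLUSTER variable, along the lines of Example 9.7.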

June Zhou posted on Wednesday, January 30, 2013 - 12:03 pm

Thank you very much for your advice, Dr. Muthen. I really appreciate your help.

June Zhou posted on Monday, April 01, 2013 - 4:15 pm

I am trying to run an MCMC Rasch model in Mplus. I have two questions:

1. When I tried to constrain item discrimination parameters (factor loading) to be 1, I found I only got identical loadings but they were not 1. Why?

2. In MCEX5.5, is the code "[u1$1-u10$1*-.5 u11$1-u20$1*.5]" meant to fix item difficulty parameters of the first 10 items to be -.5 and the last 10 items .5? If yes, can I change the values to -2.5, -0.5, 0, 0.5, and 2.5 since the range of item difficulty parameter is (-3, 3)?

Thank you in advance!

Bengt O. Muthen posted on Monday, April 01, 2013 - 4:53 pm

1. If you hold them equal, one parameter will be estimated. That won't happen if you fix them @1.

2. Yes. The range of the difficulty parameters depends on the variance of the factor. A range of (-3, 3) implies a factor variance of one.
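
The distinction in answer 1 comes down to two specifications, sketched here for a hypothetical factor f measured by u1-u10. The two MODEL commands are alternatives, not parts of one input.

```mplus
! Alternative A: loadings held equal.
! One common loading is estimated, so it need not equal 1.
MODEL:  f BY u1-u10* (1);
        f@1;

! Alternative B: loadings fixed at 1 (Rasch).
! No loading is estimated; the factor variance is free instead.
MODEL:  f BY u1-u10@1;
        f;
```

In alternative B the estimated factor variance carries the scale, which is why the plausible range of the thresholds depends on it.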

June Zhou posted on Wednesday, April 03, 2013 - 3:36 pm

Thank you so much for your prompt reply, Dr. Muthen! I really appreciate your help.

This time I am going to do a Monte Carlo study for a 2-level Rasch model with within- and between-level variables in Mplus. My input is below. Comparing it with mcex10.5, I am not sure whether I defined my thresholds (item difficulties) correctly. Would you please give me some advice? Thank you in advance for your time.

montecarlo:
names = u1-u5 x1 x2;
generate = u1-u5(1);
categorical = u1-u5;
nobs = 200;
ncsizes = 2;
csizes = 6 (20) 2 (40);
seed = 48459;
nreps = 500;
repsave=all;
save=con1*.dat;
within = x1;
between = x2;

ANALYSIS: TYPE IS TWOLEVEL;

model population:

%Within%
[x1*0]; x1@1;
fw by u1@1 u2-u5*1;
fw*1;
fw ON x1*.25;

%Between%
[x2*0]; x2@1;
fb by u1@1 u2-u5*1;
fb*.1111;
fb ON x2*.25;
[u1$1*-2.5 u2$1*-1.5 u3$1*0 u4$1*1.5 u5$1*2.5];

Bengt O. Muthen posted on Wednesday, April 03, 2013 - 4:48 pm

Looks ok. But you want more clusters than 8 in a twolevel analysis. And you want to specify MODEL as well, not only MODEL POPULATION.
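
Both suggestions can be sketched against the input above. The cluster counts are illustrative assumptions, chosen only so that nobs equals the sum of the cluster sizes (the original design has 6 + 2 = 8 clusters). The MODEL command simply repeats the population specification as the analysis model.

```mplus
! Design lines to change inside MONTECARLO (values are assumptions)
MONTECARLO:
        nobs = 2000;
        ncsizes = 1;
        csizes = 100 (20);      ! 100 clusters of size 20

! Analysis model mirroring MODEL POPULATION
MODEL:
        %Within%
        fw by u1@1 u2-u5*1;
        fw*1;
        fw ON x1*.25;
        %Between%
        fb by u1@1 u2-u5*1;
        fb*.1111;
        fb ON x2*.25;
        [u1$1*-2.5 u2$1*-1.5 u3$1*0 u4$1*1.5 u5$1*2.5];
```

Note that u2-u5*1 frees those loadings with a value of 1; a strict Rasch specification would use @1 for all five items.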

June Zhou posted on Thursday, April 04, 2013 - 2:00 pm

Thank you very much for your advice, Dr. Muthen.

Could you please tell me why the thresholds of variables that are on both the between and within levels exist only on the between level? In other words, why can't I define thresholds on the within level?

Thanks!

Bengt O. Muthen posted on Thursday, April 04, 2013 - 2:53 pm

Each item has just one set of thresholds (one if binary, several if polytomous) and the convention is that they are reported on the Between level. They cannot also be identified on the within level. This is in line with means/intercepts in regular multilevel analysis.

June Zhou posted on Monday, April 08, 2013 - 8:00 am

I see. Thank you so much for your reply! Sorry, I have another question: is there a statistic that estimates the latent person ability parameter of an IRT model in Mplus? Thank you in advance!

Bengt O. Muthen posted on Monday, April 08, 2013 - 5:03 pm

You request factor scores using FSCORES.
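
A minimal sketch of that request; the output file name is hypothetical.

```mplus
SAVEDATA: FILE = fscores.dat;    ! hypothetical output file
          SAVE = FSCORES;        ! save estimated factor scores (ability estimates)
```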

zahra sharafi posted on Tuesday, September 23, 2014 - 10:15 pm

Dear Drs. Muthen,
I want to simulate ordinal multilevel data that has DIF. I only know that I should use a 2-parameter logistic IRT model. Please give me some information about this model and the steps of the simulation. I know little about this problem and do not know where to read about it.

Bengt O. Muthen posted on Wednesday, September 24, 2014 - 3:03 pm

This has been answered under a different thread.

zahra sharafi posted on Saturday, September 27, 2014 - 6:07 am

Thank you for your answer.
I know about the theory of multilevel IRT,
and I read and understood Samejima's graded response model that you mentioned in Example 5.5. I also read your homepage. But I don't know how to simulate multilevel IRT data.
Do you treat this model like a CFA?
Should I simulate from a multilevel IRT model or a CFA?

Linda K. Muthen posted on Saturday, September 27, 2014 - 9:20 am

Look at the Monte Carlo counterpart for Example 5.5. This was used to simulate the data for Example 5.5.

Şeyma posted on Wednesday, October 22, 2014 - 12:43 pm

Dear Drs. Muthen,
I want to do a multilevel mixture IRT analysis, but I am only interested in the differences across student-level classes when the data are multilevel (they include a school level). The difficulty parameters are the same across school-level classes. I generated data in another program and used Example 10.5 to analyze it, but I changed some things in the syntax. Is this correct?

VARIABLE:NAMES ARE u1-u8 dumb dum clus;
USEVARIABLES = u1-u8;
CATEGORICAL = u1-u8;
CLASSES = cb(1) c(2);
BETWEEN = cb;
CLUSTER = clus;
DATA: FILE = ex10.5.dat;
ANALYSIS: TYPE = TWOLEVEL MIXTURE;
ALGORITHM = INTEGRATION;
PROCESSORS = 2;
MODEL:
%WITHIN%
%OVERALL%
f BY u1-u8;
[f@0];
%BETWEEN%
%OVERALL%
%cb#1.c#1%
[u1$1-u8$1];
%cb#1.c#2%
[u1$1-u8$1];
OUTPUT: TECH1 TECH8;

Bengt O. Muthen posted on Thursday, October 23, 2014 - 11:41 am

It seems like you want only one latent class variable cb which has two classes and is declared as a between-level variable. Which means that you would say

%BETWEEN%
%OVERALL%
%cb#1%
[u1$1-u8$1];
%cb#2%
[u1$1-u8$1];

zahra sharafi posted on Friday, March 27, 2015 - 11:54 am

Dear Drs. Muthen,
I want to use multilevel ordinal logistic regression to detect DIF.
How can I do it with Mplus?
Can Mplus draw figures for uniform and non-uniform DIF? How can I do it?
Thank you.

Bengt O. Muthen posted on Friday, March 27, 2015 - 2:42 pm

See UG ex 9.7, where you add direct effects from the covariates to factor indicators to capture DIF with respect to difficulty parameters/thresholds. You can do the plot as a function of the covariate values using an Adjusted probability plot option in the plot menu.

reyhane rahmani posted on Saturday, September 12, 2015 - 1:20 am

Dear Drs. Muthen,
I followed your suggestion about detecting DIF with logistic regression.
I want to detect DIF with logistic regression; my responses are ordinal and multilevel.
How can I do it?
Example 9.7 detects DIF with CCFA,
but I want to do it with logistic regression.
Thanks

Bengt O. Muthen posted on Saturday, September 12, 2015 - 5:10 pm

You can do ex9.7 with logistic link using ML. The direct effects on within or between are indications of DIF.
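
A sketch of that setup, loosely following the structure of Example 9.7 but not copied from it. The file and variable names are hypothetical; u3 ON x is the direct effect whose estimate would indicate DIF in item u3 with respect to the covariate x.

```mplus
TITLE:    Two-level factor model with a DIF direct effect (sketch)
DATA:     FILE = dif.dat;            ! hypothetical file name
VARIABLE: NAMES = u1-u6 x w clus;    ! hypothetical variable names
          CATEGORICAL = u1-u6;       ! binary or ordinal items
          WITHIN = x;
          BETWEEN = w;
          CLUSTER = clus;
ANALYSIS: TYPE = TWOLEVEL;
          ESTIMATOR = MLR;
          LINK = LOGIT;
MODEL:
          %WITHIN%
          fw BY u1-u6;
          fw ON x;
          u3 ON x;                   ! direct effect: DIF in item u3
          %BETWEEN%
          fb BY u1-u6;
          fb ON w;
```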

reyhane rahmani posted on Monday, September 28, 2015 - 4:22 am

Dear Drs. Muthen,
Thanks for your answer. I wrote the following program. x1 (dependent variable) is people's answer to my first question (1, 2, 3, 4, 5), theta (independent variable) is ability, and wv (independent variable) is the group indicator (1 or 2).
Is this program correct for detecting DIF with ordinal logistic regression?
Also, when I run this program, are x1 and wv both treated as dependent variables?

TITLE: this is an example of a two-level ordinal logistic regression , a random intercept, and covariates
DATA: FILE = "F:/Mplus.dat";
VARIABLE:
NAMES = s clus per wv bv theta X1 X2 X3 X4 X5 s1 clus1 per1 wv1 bv1 theta1 y1 y2 y3 y4 y5;
usevariables = X1 wv theta;
CATEGORICAL = X1 wv ;
WITHIN =wv theta;
CLUSTER = clus;
MISSING = ALL (999);
ANALYSIS:TYPE = TWOLEVEL random;
Estimator is MLR;
Link=logit;
MODEL:
%WITHIN%
X1 on wv theta;
OUTPUT:Sampstat TECH1 TECH8;

reyhane rahmani posted on Tuesday, September 29, 2015 - 3:50 am

Dear Drs. Muthen,
I think I get it.
is this program true?

TITLE: this is an example of a two-level ordinal logistic regression , a random intercept, and covariates
DATA: FILE = "F:/Mplus.dat";
VARIABLE:
NAMES = s clus per wv bv theta X1 X2 X3 X4 X5 s1 clus1 per1 wv1 bv1 theta1 y1 y2 y3 y4 y5;
usevariables = X1 wv theta;
MISSING=.;
CATEGORICAL = X1 ;
WITHIN =wv theta;
CLUSTER = clus;
MISSING = ALL (999);
ANALYSIS: TYPE = TWOLEVEL random;
Estimator is MLR;
Link=logit;
MODEL:
%WITHIN%
X1 on wv theta;
%BETWEEN%
X1;
OUTPUT:Sampstat TECH1 TECH8;

regards

Bengt O. Muthen posted on Tuesday, September 29, 2015 - 6:26 am

This looks ok.

zahra sharafi posted on Tuesday, November 24, 2015 - 8:24 pm

Dear Drs. Muthen,
Thank you for your kind response.
I wrote the following program. x1-x20 are item responses (1, 2, 3, 4, 5), theta is ability, and wv is the group indicator (1 or 2).
I want to use multilevel ordinal logistic regression to detect DIF, but I simulated my data in R, so I use external Monte Carlo simulation. I have 20 items and I should fit an ordinal logistic regression for each item. How can I do that in my program?
I replicate each condition 1000 times and want to calculate power and the Type I error rate. How can I do it?
TITLE: this is an example of a two-level ordinal logistic regression , a random intercept, and covariates
DATA: FILE = "F:/rep/replist.dat";
TYPE = MONTECARLO;
VARIABLE:
NAMES = s clus per wv bv theta X1 X2 . . . . X20 s1 clus1 per1 wv1 bv1
theta1 y1 y2 y3 y4 y5;
usevariables = X1 wv theta;
MISSING=.;
CATEGORICAL = X1 ;
WITHIN =wv theta;
CLUSTER = clus;
MISSING = ALL (999);
ANALYSIS: TYPE = TWOLEVEL random;
Estimator is MLR;
Link=logit;
MODEL:
%WITHIN%
X1 on wv theta;
%BETWEEN%
X1;
OUTPUT:Sampstat TECH1 TECH8;

Bengt O. Muthen posted on Wednesday, November 25, 2015 - 4:01 pm

I don't know what you have in mind by putting only x1 in the USEV statement since you have 20 items. I think you should either start from UG ex9.7 using Type=Twolevel (without the covariates), or start from UG ex 9.26 using Type=Crossclassified.

zahra sharafi posted on Friday, November 27, 2015 - 9:59 pm

Thank you for your helpful answer.
I finally understand, but I still have a problem. I define theta (my latent variable) with x1-x20.
I need to run this model:

logit(p(xij<k))=a+b1*theta+b2*wv+b3(wv*theta)+uj+eij

How can I write my model?

TITLE: this is an example of a two-level ordinal logistic regression , a random intercept, and covariates
DATA: FILE = "F:/rep/replist.dat";
TYPE = MONTECARLO;
VARIABLE:
NAMES = clus wv x1-x20;
usevariables = x1-x20 wv;
CATEGORICAL = x1-x20 ;
WITHIN =wv;
CLUSTER = clus;
MISSING = ALL (999);
ANALYSIS: TYPE = TWOLEVEL random;
Estimator is MLR;
Link=logit;
MODEL:
%WITHIN%
theta BY x1-x20;

%BETWEEN%
OUTPUT:Sampstat TECH1 TECH8;
thank you

Bengt O. Muthen posted on Saturday, November 28, 2015 - 2:24 pm

You can add on %Within%

int | wv XWITH theta;

x1-x20 ON wv int;

And on %Between%

x1-x20;

The ON statement gives you b2 and b3 coefficients and the %Between% statement gives you uj.

But it isn't clear if your 3 b's vary over items. I assume the j subscript represents item.

zahra sharafi posted on Saturday, November 28, 2015 - 8:42 pm

Thank you very much.
I am trying to run a DIF detection model where, for example, students are clustered in schools, so i denotes the student and j denotes the school.
I am sorry to ask, but how can I find the Type I error rate and power for b2=0 (uniform DIF) and for b3=0 (non-uniform DIF)?
I also want to know whether these two ways of stating the hypotheses differ:
h0:logit(p(xij<k))=a+b1*theta+b2*wv+uj+eij
and h1:logit(p(xij<k))=a+b1*theta+b2*wv+b3(wv*theta)+uj+eij
or
h0:b3=0
and
h1:otherwise
best

zahra sharafi posted on Saturday, November 28, 2015 - 9:36 pm

I run this model:
TITLE: this is an example of a two-level ordinal logistic regression , a random intercept, and covariates
DATA: FILE = "F:/repetation/replist.dat";
TYPE = MONTECARLO;
VARIABLE:
NAMES = rep clus per wv bv theta x1-x20 rep1 clus1 per1 wv1 bv1
theta1 y1-y20;
usevariables x1-x20 wv;
CATEGORICAL = x1-x20 ;
WITHIN =wv;
CLUSTER = clus;
MISSING = ALL (999);
ANALYSIS: TYPE = TWOLEVEL random;
Estimator is MLR;
Link=logit;
MODEL:
%WITHIN%
theta BY x1-x20;
int | wv XWITH theta;
x1-x20 ON wv int;
%BETWEEN%
x1-x20;
OUTPUT:Sampstat TECH1 TECH8;

zahra sharafi posted on Saturday, November 28, 2015 - 9:37 pm

and I have this error:
FATAL ERROR
THERE IS NOT ENOUGH MEMORY SPACE TO RUN Mplus ON THE CURRENT
INPUT FILE. THE ANALYSIS REQUIRES 21 DIMENSIONS OF INTEGRATION RESULTING
IN A TOTAL OF 0.49879E+25 INTEGRATION POINTS. THIS MAY BE THE CAUSE
OF THE MEMORY SHORTAGE. YOU CAN TRY TO FREE UP SOME MEMORY BY CLOSING
OTHER APPLICATIONS THAT ARE CURRENTLY RUNNING. NOTE THAT THE MODEL MAY
REQUIRE MORE MEMORY THAN ALLOWED BY THE OPERATING SYSTEM.
REFER TO SYSTEM REQUIREMENTS AT www.statmodel.com FOR MORE
INFORMATION ABOUT THIS LIMIT.
rep1 was one of the smallest and simplest data sets I have!
In this simulated data set I have 20 items, 25 clusters, and 10 persons in each cluster, and the item responses are binary (0, 1), not ordinal (1, 2, 3, 4, 5).
Thank you very much
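
The count in the error message can be reproduced by hand, assuming 15 integration points per dimension (an assumption about the settings in effect, since the input does not set INTEGRATION). One reading of the 21 dimensions is one for theta plus one for each of the 20 between-level item variances requested by x1-x20 on %BETWEEN%:

```latex
15^{21} \approx 4.9879 \times 10^{24} = 0.49879\mathrm{E}{+}25
```

The total grows exponentially with the number of dimensions, so a smaller data set does not help; only a specification with fewer dimensions, or fewer points per dimension, reduces it.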

Bengt O. Muthen posted on Sunday, November 29, 2015 - 11:19 am

It is hard to guide you because it isn't clear which model you are interested in. For example, you need to have a subscript for item in order to describe the model.

You say

I want to know is it different between these two hypothesis?
h0:logit(p(xij<k))=a+b1*theta+b2*wv+uj+eij
and h1:logit(p(xij<k))=a+b1*theta+b2*wv+b3(wv*theta)+uj+eij

My answer is that to do this analysis you yourself need to understand what the difference is and know why you would be interested in one versus the other.

Note also that we ask that only one window is used for postings.

zahra sharafi posted on Monday, November 30, 2015 - 8:38 pm

Detecting DIF through HoLR is based on comparing three different models. The models are formulated as follows:
logit(p(xij<k))=a+b1*theta+b2*wv+b3(wv*theta)+uj+eij (model 1)

logit(p(xij<k))=a+b1*theta+b2*wv+uj+eij (model 2)

logit(p(xij<k))=a+b1*theta+uj+eij (model 3)

Comparing models 1 and 2 (b3=0) tests non-uniform DIF.
Comparing models 2 and 3 (b2=0) tests uniform DIF.

I have 20 items and should do these tests for each item, so I should have a subscript for item in my Mplus program.
I want the power and Type I error rate for these tests. How can I calculate them?
Thank you.

Bengt O. Muthen posted on Tuesday, December 01, 2015 - 5:24 pm

To define theta you have to say

theta BY x1-x20;

which means that all 20 items are part of the analysis. That's why I need to see the item subscripts in your formulas to guide you regarding how to specify the Mplus input. I assume that a and the b's vary across items, but does u_j?

zahra sharafi posted on Monday, December 07, 2015 - 11:42 pm

Thank you for your good question.
I do not want to allow the DIF parameters for each item to vary across schools.
so u_j should have the same variance across items
best

Bengt O. Muthen posted on Tuesday, December 08, 2015 - 5:59 pm

If u_j influences each item the same you say on between:

u BY x1-x20@1;

But, usually, a between component is part of theta so that u influences the items by the same b1 parameters as on within.
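
Put together with the earlier within-level statements, the first option might look like the sketch below. It assumes the same variable names as the poster's input and TYPE = TWOLEVEL RANDOM, as in that input, because of XWITH.

```mplus
MODEL:
        %WITHIN%
        theta BY x1-x20;
        int | wv XWITH theta;
        x1-x20 ON wv int;
        %BETWEEN%
        u BY x1-x20@1;      ! one between-level factor, all loadings fixed at 1
        u;                  ! its variance plays the role of the variance of u_j
```

With one between-level factor in place of 20 separate between-level variances, and assuming the between-level item residual variances stay at zero, the integration involves only theta and u, far fewer than the 21 dimensions reported in the error message earlier in the thread.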

zahra sharafi posted on Thursday, December 10, 2015 - 6:58 am

Thank you.
You have a lot of experience in multilevel DIF detection.
Do you think it is true that u_j influences each item the same?
I cannot really find a good article about it.
Thank you.

Bengt O. Muthen posted on Thursday, December 10, 2015 - 6:21 pm

No, I don't think that.

To represent u_j type variation, I would create a Between-level factor fb which has the same loadings as on Within using equality constraints.
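
A sketch of such an equality-constrained specification, assuming the item names x1-x20 used earlier in the thread. The first loading is left at its default of 1 on both levels, and the numbers in parentheses are equality labels, not values.

```mplus
MODEL:
        %WITHIN%
        theta BY x1
                 x2-x20 (1-19);    ! loadings carry equality labels 1-19
        %BETWEEN%
        fb BY x1
              x2-x20 (1-19);       ! same labels: equal to the within loadings
        fb;                        ! between-level factor variance
```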

I think you may want to talk to an IRT consultant on the modeling you are interested in. After that it is easy to do it in Mplus.

zahra sharafi posted on Friday, December 11, 2015 - 1:50 am

Thank you.
If u_j varies across items, should we just say on between:
x1-x20
Is this correct?
And how can I find the Type I error rate
and power?
Thank you.

Bengt O. Muthen posted on Saturday, December 12, 2015 - 11:50 am

Q1. No. Take a look at UG ex 9.7. You have only one factor. And you want to hold loadings equal across levels. The UG describes how to do this. You can also read about multilevel IRT in the paper on our website:

Muthén, B. & Asparouhov, T. (2013). Item response modeling in Mplus: A multi-dimensional, multi-level, and multi-timepoint example.

If this doesn't help, I suggest a psychometric or Mplus expert consultant. It is too much to try to teach this over Mplus Discussion.

Q2. You may want to ask this general question on SEMNET.

zahra sharafi posted on Sunday, December 20, 2015 - 10:22 pm

Thank you
I read your article and some other articles.
in my model:
logit(p(xij<k))=a+b1*theta+b2*wv+b3(wv*theta)+uj+eij
can I say
theta_ij is person level ability and u_j is school level ability?
best

Bengt O. Muthen posted on Monday, December 21, 2015 - 6:39 pm

You should ask these general modeling questions on SEMNET.

zahra sharafi posted on Monday, December 21, 2015 - 8:41 pm

Thank you.

Michael Strambler posted on Thursday, April 11, 2019 - 12:50 pm

My goal is to run an IRT model on a measure for teachers that accounts for teachers being clustered in schools. Since I'm not interested in the school-level model, can this be accomplished with TYPE=COMPLEX to adjust the SEs, or is it necessary to run a multilevel model? Thanks.

Michael Strambler posted on Thursday, April 11, 2019 - 1:51 pm

Sorry, I should have specified that I'm interested in running a Monte Carlo simulation for the above to determine sample size.

Bengt O. Muthen posted on Thursday, April 11, 2019 - 4:15 pm

You can generate the data in a first step using twolevel and analyze in a second step using complex. See UG ex 12.6 step 1 and step 2.
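
A sketch of what the second step might look like for a binary-item factor model. It assumes step 1 was a TYPE = TWOLEVEL Monte Carlo run with REPSAVE = ALL and SAVE = step1rep*.dat (a hypothetical name), so that a list file step1replist.dat exists, and that the saved files hold the items followed by the cluster variable.

```mplus
TITLE:    Step 2, external Monte Carlo analysis using TYPE = COMPLEX (sketch)
DATA:     FILE = step1replist.dat;   ! hypothetical list file written in step 1
          TYPE = MONTECARLO;
VARIABLE: NAMES = u1-u9 clus;        ! assumed column order of the saved files
          USEVARIABLES = u1-u9;
          CATEGORICAL = u1-u9;
          CLUSTER = clus;
ANALYSIS: TYPE = COMPLEX;
          ESTIMATOR = MLR;
MODEL:    f BY u1-u9*1;              ! values after * are assumed, not from the thread
          f@1;
          [u1$1-u9$1*0];
```

The values given after * in MODEL are the reference for bias and coverage in this step, so they would need to be the values implied for the single-level model, which are not necessarily the within-level values used to generate the data.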

Michael Strambler posted on Thursday, April 11, 2019 - 4:31 pm

Thanks. And for such a model with real data, would this approach adequately address concerns around nonindependence?

Bengt O. Muthen posted on Thursday, April 11, 2019 - 5:49 pm

Both approaches take care of non-independence of observations. The Complex approach estimates the same parameters as in a single-level model whereas the twolevel approach estimates more parameters (between-level parameters as well). In many cases but not all, the model is "aggregatable" so that the single-level model isn't misfitting when estimated on data generated by a twolevel model. This is discussed in a simulation study in the paper on our web site (under Papers, Complex survey data analysis):

Muthén, B. & Satorra, A. (1995). Complex sample data in structural equation modeling. Sociological Methodology, 25, 267-316.

Michael Strambler posted on Friday, April 12, 2019 - 12:23 pm

Thanks, that was helpful. I tried the UG 12.6 steps (1 & 2) by first creating a MLIRT with a 9-item (binary) factor modeled at level 1 and generated 100 datasets. I then used those datasets to run a MC on the same factor structure using TYPE=COMPLEX. All estimates of power are 1 but bias is higher (highest=20%) and coverage is lower (lowest=74%) than the results from MC on the MLIRT analysis (where I used 500 replications). Also, power was not all 1.

1. Does it make sense for power to be 1 for everything in the COMPLEX analysis?
2. Why would the COMPLEX analysis results have higher bias than the MLIRT analysis?

The model is f1 BY u1@1 u2-u9*;
f1@1;

Thanks!

Bengt O. Muthen posted on Friday, April 12, 2019 - 1:55 pm

1. Don't interpret the power column if you have biased estimates or bad coverage.

2. This could be because the model is not "aggregatable".

Michael Strambler posted on Friday, April 12, 2019 - 5:09 pm

If I understand correctly, non-aggregatable means that the factors on the two levels are not the same. In my situation, I generated data where there was only a factor on within and only variance on between. The model I'm estimating is also consistent with this. Does the aggregatable concept still apply?

Bengt O. Muthen posted on Saturday, April 13, 2019 - 4:23 pm

That depends on what your specification of between variance is.