Anonymous posted on Saturday, April 23, 2005  1:33 pm



I am using Mplus v. 3.12 to analyze a set of multilevel models with random slopes and a categorical (binary) outcome. I'm having some trouble interpreting the results. In my level 1 model, let's say that "A" is my outcome. "A" is regressed on "B" and "C", which are dummy variables. Let's say the output produces the following values: Threshold for "A" is 2.0. A on B is 1.0, SE = .10, t = 10.0. A on C is .5, SE = .10, t = 5.0. How do I calculate the probability of A=1 for a person in my reference group? How does being in groups B=1 or C=1 impact the probability of A=1? How is this calculated? Sorry for asking such a basic question! Any help would be appreciated.

Anonymous posted on Saturday, April 23, 2005  6:49 pm



Sorry everyone, I figured it out already by reading through some other threads here. 
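
For readers landing on this thread, the calculation asked about above can be sketched as follows. This is only an illustration, not Mplus output. It assumes that the reported threshold enters the linear predictor with a negative sign (threshold = -intercept), that the link is logit (commonly the default with maximum-likelihood estimation; probit is shown as the alternative), and that the random effects are held at their means. The function name prob_y1 is made up for this example.

```python
import math

def logistic_cdf(z):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_y1(threshold, slopes, x, link="logit"):
    """P(y = 1 | x) = F(-threshold + sum of slope * x). Illustrative helper only."""
    eta = -threshold + sum(b * xi for b, xi in zip(slopes, x))
    cdf = logistic_cdf if link == "logit" else normal_cdf
    return cdf(eta)

threshold = 2.0        # threshold for A, from the post
slopes = [1.0, 0.5]    # A on B, A on C, from the post

for label, x in [("reference", [0, 0]), ("B = 1", [1, 0]), ("C = 1", [0, 1])]:
    print(label,
          round(prob_y1(threshold, slopes, x, "logit"), 3),
          round(prob_y1(threshold, slopes, x, "probit"), 3))
```

Under the logit assumption this gives roughly .119 for the reference group, .269 for B = 1, and .182 for C = 1. The effect of a dummy on the probability is the difference between its row and the reference row, so it depends on where the reference group sits on the curve.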

Doug posted on Thursday, April 14, 2011  12:11 pm



For binary data are thresholds equal to intercepts for continuous data? If so, in a CFA multigroup factor analysis, given that factor loadings and thresholds must be held equal in tandem, is testing for invariance of the factor loadings across groups equivalent to testing for both metric invariance (factor loadings) and scalar invariance (thresholds/intercepts) at the same time? 


Yes. 

Doug posted on Saturday, April 16, 2011  4:55 pm



If you find noninvariance of the factor loadings/thresholds, how do you determine whether it is the factor loading or the threshold that is not equal across groups, that is, whether you have metric or scalar noninvariance?


You can look at modification indices to see where the noninvariance is.

Doug posted on Wednesday, May 18, 2011  10:09 am



I am assuming that I use the MI titled "Means/Intercepts/Thresholds". The output gives an MI for [item1] and an MI for [item1$1]. I am assuming the first is the MI for the factor and the other is the MI for the threshold, for the same item. The value for both the factor and threshold is the same. How would you determine if it is the factor or the threshold that is noninvariant across groups from this info (see my April 16 comment above)? Also, would the list of MIs include all factors/thresholds that are contributing to the noninvariance? For example, for one analysis the MI output listed two items under "Means/Intercepts/Thresholds", but when I tested for noninvariance for each individual item (DIF) I found five items that were noninvariant across groups. Why the discrepancy (i.e., two versus five)? I was wondering if I needed to do some type of correction to account for the number of tests of invariance I conducted, i.e., tested three factors and 35 items. That might reduce the number of significant findings.

Doug posted on Wednesday, May 18, 2011  10:15 am



One additional question ... I have one item in the MI output that has 15 or 16 suggested correlations ("ON" and "WITH") with other items and factors that would reduce chi-square, and I was wondering if so many suggested correlations and cross-loadings indicated that there was an issue with that particular item? What does that say about the item?


[item1] refers to the intercept (nu) which is fixed at zero for categorical items. [item1$1] refers to the threshold. Thresholds are the same as intercepts except with opposite signs. Different models have different MIs. Also, note that only MI's > 3.84 are printed as the default. Yes, an item with many MI correlations certainly flags a problem with that item. Perhaps the item content contains all those other items (a "global" item)? 
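
The 3.84 cutoff mentioned above is the .05 critical value of a chi-square distribution with one degree of freedom. On the earlier question about correcting for the number of tests, one conventional and conservative option is a Bonferroni adjustment, and the sketch below shows how far the cutoff would move. The count of 38 tests (3 factors plus 35 items) comes from the post above. Choosing Bonferroni is an assumption of this example, not a recommendation made in the thread.

```python
from statistics import NormalDist

def chi2_1df_critical(alpha):
    """Critical value of a 1-df chi-square, i.e. the squared two-sided normal critical value."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z

default_cutoff = chi2_1df_critical(0.05)   # about 3.84, the default printing cutoff for MIs

n_tests = 3 + 35                           # three factors and 35 items, from the post
bonferroni_cutoff = chi2_1df_critical(0.05 / n_tests)

print(round(default_cutoff, 2), round(bonferroni_cutoff, 2))
```

With a stricter cutoff like this, fewer modification indices would count as flagged, which is the reduction in significant findings the poster anticipated.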

Jason Bond posted on Friday, April 13, 2012  5:34 pm



Bengt/Linda, In doing an IRT of 7 variables (3 categories each), with the default settings (which I believe are a normal ogive model with WLS) when I look at each of the 2 estimated thresholds per variable in the output, the first is always smaller in magnitude than the second. I assume this is an artifact of the probit model for ordinal polytomous regression. However, when I look at the ICC curves I see that, for a majority of the variables, the value of the factor at which the probability curves for categories 1 and 2 cross (I assume this 'difficulty' value is the severity where a jump between adjacent categories is equally likely and computed as the corresponding threshold/loading) is larger than the severity value at which the probability curves for categories 2 and 3 cross. As higher response categories are intended to be indications of higher severity for all variables for which this occurs, this is puzzling. Is there an easy explanation for this? It is true that, for many of these, the prevalence of the 2nd category is less than that of the third. Thanks. 


I don't think you should focus on the crossings, but rather the factor value at the peak of each category's probability curve. Those peaks should be ordered according to the category ordering. 

Jason Bond posted on Monday, April 16, 2012  1:22 pm



The peaks of the ICCs for the categories are indeed ordered by the threshold values. Is there then information in these curves regarding when and how combining of categories is appropriate? For example, for any given item, the ICC curve for the first and last category intersect at a given factor value but the peaks of the probability curves for all the other categories in between never rise to the level of intersection of the probability of the first and last. Say an item had 5 categories and that the peaks for the 3 intermediate categories occurred at factor values that were on different sides of the factor value where the probability of the first and last category intersected. Could one then use this as a rule to decide which categories were combined? Thanks again, Jason


If the three categories in the middle have low and similar peaks, this suggests they can be collapsed. 

Jason Bond posted on Monday, May 07, 2012  11:28 am



Bengt, A follow-up question regarding your response on April 15th to my initial posting. I'm trying to reconcile the threshold estimates to physical characteristics of the ICC curves...my initial post to you was that I thought they corresponded to where the probability category curves for adjacent categories crossed...but your response and the fact that the crossings aren't ordered ruled that out. You mentioned that the peaks should be ordered by the threshold values, and indeed they are...but given there are k-2 peaks (as the first and last categories don't have them) and only k-1 thresholds, I've clearly got something wrong. Is there an actual probability curve characteristic that corresponds to the values of the estimated threshold parameters? Thanks much again in advance, Jason


There may be, but I don't focus on that in my work so I don't recall it. Why don't you take a look at for instance the IRT book by Baker and Kim or by Reckase. 
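
One standard reading, under a graded-response (cumulative probit) model, is that each threshold describes a cumulative curve rather than a single category curve: the probability of responding in category k or higher equals .5 at the factor value threshold/loading. The k-2 interior peaks fall midway between adjacent threshold/loading values. The sketch below illustrates this. It assumes a unit residual variance for the latent response variable, and the loading and thresholds are hypothetical numbers, not estimates from this thread.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function

def prob_at_or_above(f, loading, threshold):
    """P(y >= k | f) for the category whose lower boundary is `threshold`."""
    return PHI(loading * f - threshold)

def category_probs(f, loading, thresholds):
    """P(y = k | f) for every category, as differences of cumulative curves."""
    cum = [1.0] + [prob_at_or_above(f, loading, t) for t in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

loading = 1.2            # hypothetical
thresholds = [0.3, 0.6]  # hypothetical: 3 categories, 2 thresholds

for t in thresholds:
    f_half = t / loading  # the 'difficulty' value: threshold / loading
    print(round(f_half, 3), prob_at_or_above(f_half, loading, t))

# The middle category's curve peaks midway between the two difficulty values.
f_peak = (thresholds[0] + thresholds[1]) / (2.0 * loading)
print(round(f_peak, 3), round(category_probs(f_peak, loading, thresholds)[1], 3))
```

This reconciles the counts in the question: k-1 thresholds give k-1 half-way points on the cumulative curves, while only the k-2 interior categories have peaked curves.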

Jason Bond posted on Wednesday, May 09, 2012  11:31 am



Bengt, Thanks much for the reference. One more reference request, if you know of one. I have a number of groups (data from different countries), each of which has the seven AUDIT 5-category alcohol problems...do you know of anyone who has used the multilevel functionality of Mplus in the context of a polytomous IRT for a DIF application paper? I've looked but not found anything satisfactory. Thanks again, Jason


How many countries do you have? I have started to notice an increased interest in multilevel, multigroup analysis with categorical items using Mplus, but I can't recall seeing anything published yet on that. 

Jason Bond posted on Wednesday, May 09, 2012  3:05 pm



I have 17 countries...but it might make more sense to only look at DIF for meaningful subgroups of countries. Might you know of any references where anyone has looked at DIF when a moderate number of groups are used (I only recall having found analyses with G=2). Thanks, Jason 


Off hand I can't think of any such references, except my 1989 Psychometrika article talked about using several covariates in a MIMIC model to study DIF across many groups with binary items. Does anybody know of other such references for categorical items? 

Ping Kuo posted on Friday, September 04, 2015  7:12 am



Hello, I wonder whether my interpretation of changes in thresholds across two time points is correct. Thresholds of all items at Time 2 are higher than those at Time 1. Take item 1, for example: its three thresholds are .017, .92, 1.428 at Time 1, but .460, 1.52, 2.129 at Time 2. Could I interpret this as the observed scores of this measure being underestimated at Time 2 (because all items become more difficult at Time 2 and participants need more of the latent trait to endorse the same category of an item at Time 2)? Thanks a lot.


You should test for measurement invariance of the thresholds, whereby you can see if the change over time is due to a change in the mean/variance of the underlying factor. 

Ping Kuo posted on Friday, September 04, 2015  8:18 am



Hello, Thanks for the quick response. I tested longitudinal MI using WLSMV. However, threshold invariance was violated. In that situation, are the statements mentioned above correct? Thanks a lot.


No, you need a well-fitting model to be able to make those statements. You can't talk about "more latent traits" accounting for the change, nor can you talk about underestimation. All you can say is that the percentages of the observed variable have changed in a certain direction.
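
To see what "the percentages of the observed variable have changed" means for the item 1 thresholds quoted above, the thresholds can be turned into implied category proportions. This sketch assumes that each item's latent response variable is standard normal at both time points (as when each time point is scaled separately with no covariates), so that the cumulative proportion at or below a category is the normal distribution function evaluated at its threshold. That assumption may not hold in a constrained longitudinal model, and the function name is made up for this example.

```python
from statistics import NormalDist

def category_proportions(thresholds):
    """Implied category proportions when the latent response variable is N(0, 1)."""
    cdf = NormalDist().cdf
    cum = [0.0] + [cdf(t) for t in thresholds] + [1.0]
    return [cum[i + 1] - cum[i] for i in range(len(cum) - 1)]

time1 = category_proportions([0.017, 0.92, 1.428])  # item 1 thresholds at Time 1
time2 = category_proportions([0.460, 1.52, 2.129])  # item 1 thresholds at Time 2

print([round(p, 3) for p in time1])
print([round(p, 3) for p in time2])
```

Under that assumption the lowest category grows from about 51% to about 68% of responses. That is a descriptive shift only; without threshold invariance it cannot be attributed to the latent trait.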

Margarita posted on Tuesday, September 06, 2016  11:07 pm



Dear Dr. Muthen, Using the new Mplus shortcut code for measurement invariance I checked scalar vs. configural invariance for a 5-item domain with a 3-category Likert scale (2 thresholds) and I found noninvariance. To locate the source of the noninvariance I replicated the analysis manually, and in the modification indices I found the following to be problematic: [item1$ ]. In examples I found in the literature the MI usually indicates which threshold is problematic, and so I am confused as to what [item1$ ] means. When I free both thresholds of that item [item1$1* item1$2*] the fit improves, so would that be the right way of doing it? Thank you in advance


It may be that you use a long name for your item so that the last number gets cut off; shorten the name.

Margarita posted on Friday, September 09, 2016  12:18 pm



You were right, problem fixed! Thank you so much 


I would like to translate probit coefficients into probabilities using the formula: prob (y = 1) = f(-threshold + b1*x1 + b2*x2 …) You write that “Thresholds are the same as intercepts except with opposite signs.” So this formula is the same as prob (y = 1) = f(intercept + b1*x1 + b2*x2 …) (Because two minuses become a plus.) Is that correct?


This is correct. 
