Ali posted on Sunday, September 29, 2013 - 7:42 pm
Hi, I am working through Example 7.27, but some of the code confuses me:

%OVERALL%
f BY u1-u8;
[f@0];
%c#1%
f BY u1@1 u2-u8;
f;
[u1$1-u8$1];
%c#2%
f BY u1@1 u2-u8;
f;
[u1$1-u8$1];

1. Why do we set the factor mean to 0 in the overall model? Could I set another value?
2. In the models for class 1 and class 2, why do we set u1@1? Also, in a mixture IRT model, do we always allow different factor variances in the classes?
3. I am not familiar with mixture IRT models, so could you please suggest which chapters of the Mplus User's Guide are related to mixture IRT?
1. We fix the means of the factor to zero in all classes because the thresholds are not held equal across classes. This is required for model identification.
2. The loading of u1 is fixed at one to set the metric of the factor. In IRT, this is usually done instead by fixing the factor variance to one and freeing all factor loadings.
3. See the following paper which is available on the website:
Muthén, B. (2008). Latent variable hybrids: Overview of old and new models. In Hancock, G. R., & Samuelsen, K. M. (Eds.), Advances in latent variable mixture models, pp. 1-24. Charlotte, NC: Information Age Publishing, Inc.
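As a minimal sketch of the IRT-style identification mentioned in point 2 (an illustration, not the User's Guide input), the first loading can be freed with an asterisk and the factor variance fixed at one. The item names u1-u8 follow the question; the DATA, VARIABLE, and ANALYSIS commands are omitted, and the class-specific parts are left out for brevity.

```mplus
MODEL:
  %OVERALL%
  f BY u1* u2-u8;   ! the asterisk frees the u1 loading, which is otherwise fixed at 1
  [f@0];            ! factor mean fixed at zero
  f@1;              ! factor variance fixed at one sets the metric
```

With this choice the metric comes from the factor variance rather than from a reference item, so no single loading is singled out.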
Ali posted on Thursday, October 03, 2013 - 9:17 pm
Thank you for the explanation. I have some other questions.

1. If I change the start values, some values such as the AIC are slightly different. Does it matter? And how do I choose the number of starts?
2. I want to add another factor to the model, so I named it f2:

MODEL:
%OVERALL%
f1 BY u1-u8;
f2 BY u1-u8;
[f1@0];
[f2@0];
%c#1%
f1 BY u1@1 u2-u8;
f2 BY u1@1 u2-u8;
f1; f2;
[u1$1-u8$1];
%c#2%
f1 BY u1@1 u2-u8;
f2 BY u1@1 u2-u8;
f1; f2;
[u1$1-u8$1];

The AIC and BIC differ only slightly; both are slightly higher than in the original model. So which model is the better one? Thanks!
You should use enough random starts to replicate the best loglikelihood several times. The second number should be about 1/4 of the first, for example,
STARTS = 200 50;
A second factor should not have the same factor indicators as the first factor.
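A minimal sketch of where the STARTS option sits and of a second factor with its own indicators. The split of items into u1-u4 and u5-u8 is purely hypothetical; in practice it should come from theory. ALGORITHM = INTEGRATION is assumed here because the indicators are categorical and the factors are continuous.

```mplus
ANALYSIS:
  TYPE = MIXTURE;
  ALGORITHM = INTEGRATION;
  STARTS = 200 50;   ! 200 initial-stage random starts, 50 final-stage optimizations
MODEL:
  %OVERALL%
  f1 BY u1-u4;       ! hypothetical split: f1 measured by u1-u4
  f2 BY u5-u8;       ! f2 measured by the remaining items
  [f1@0 f2@0];       ! factor means fixed at zero
```

If the best loglikelihood is not replicated several times in the output, both STARTS numbers can be increased while keeping roughly the same 4:1 ratio.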
Ali posted on Thursday, October 10, 2013 - 8:02 pm
Hi, I want to fit a mixture Rasch model (based on ex7.27), so I ran the following code:

%OVERALL%
f BY u1-u8 (1);
[f@0];
%c#1%
f BY u1-u8 (1);
f;
[u1$1-u8$1];
%c#2%
f BY u1-u8 (1);
f;
[u1$1-u8$1];

1. Do I need to put (1) after f BY u1-u8 in every part of the model?
2. Since the sample size is really large, about 300000, is there any way to reduce the time needed to run the whole procedure?
3. With such a large sample size, is there any suggestion for setting STARTS? Or should I just follow the rule above, that the second number is about 1/4 of the first?
Hi, I am reading two articles: Muthén, B. (2008). Latent variable hybrids: Overview of old and new models. In Hancock, G. R., & Samuelsen, K. M. (Eds.), Advances in latent variable mixture models, pp. 1-24. Charlotte, NC: Information Age Publishing, Inc., and Item response mixture modeling: application to tobacco dependence criteria (Asparouhov, T. and Muthén, B., 2006). I am confused about the diagram (p. 3) in Muthén, B. (2008). Under the factor mixture model, what is the difference between the diagram with a line to the factor and the one without a line to the factor? In the tobacco article, the diagram (p. 1057) has a line from the latent class variable to the factor. Do the different diagrams affect the code for the mixture IRT model?
See UG ex 7.17 and 7.27 for scripts for the two types of models. An arrow from c to f implies that f has a mixture distribution with for instance different factor means in the different latent classes.
Hi, I ran the mixture Rasch model with 21 items. It gives me this warning message:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.304D-16. PROBLEM INVOLVING PARAMETER 4.

ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL VARIABLES IN THE MODEL. THE FOLLOWING PARAMETERS WERE FIXED: 17.

However, it also shows me "THE BEST LOGLIKELIHOOD VALUE HAS BEEN REPLICATED. RERUN WITH AT LEAST TWICE THE RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED". Can results such as the item difficulties be trusted? How can I fix this warning message?

I have another question. In the results under IRT PARAMETERIZATION IN TWO-PARAMETER LOGISTIC METRIC WHERE THE LOGIT IS DISCRIMINATION*(THETA - DIFFICULTY), there are four columns. Do they mean Estimate, S.E., Est./S.E., and Two-Tailed P-Value? Thanks!
Ali posted on Thursday, November 07, 2013 - 6:30 pm
Hi, I am running a mixture 2PL IRT model. I assumed that both groups have the same ability distribution, C1 ~ N(0,1) and C2 ~ N(0,1). Here is my code:

MODEL:
%OVERALL%
f BY M1-M13;
[f@0];
f@1;
%c#1%
f BY M1@1 M2-M13;
[M1$1-M13$1];
%c#2%
[M1$1-M13$1];
f BY M1@1 M2-M13;

I am wondering: after adding f@1, do I still need to put f BY M1@1 M2-M13 in each class-specific part of the model?
The default is that the loadings are equal across classes, so if you want them unequal you have to mention them in each class like you do.
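A minimal sketch of class-specific loadings, assuming the intent is a factor variance fixed at one in both classes with all loadings free (so the first loading is freed with an asterisk rather than fixed at one, which is a change from the posted input). The item names m1-m13 follow the question.

```mplus
MODEL:
  %OVERALL%
  f BY m1* m2-m13;     ! all loadings free; metric set by f@1 below
  [f@0];
  f@1;
  %c#1%
  f BY m1* m2-m13;     ! mentioning the loadings here relaxes the equality for class 1
  [m1$1-m13$1];
  %c#2%
  f BY m1* m2-m13;     ! and for class 2, so the loadings differ across classes
  [m1$1-m13$1];
```

Dropping the two class-specific BY statements would return to the default of loadings held equal across classes.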
Ali posted on Friday, November 08, 2013 - 12:22 pm
Thanks! If I set the variance equal to one across classes, the loadings have the same values as the item discriminations. However, if I don't set f@1, the loadings and item discriminations are different. Why are the loadings and item discriminations different in this case? Another question: can item parameters from the mixture 2PL model results be compared directly? What constraints does Mplus use to put the item parameters on the same scale?
Hi, I am using a mixture Rasch model to figure out which item is the anchor item. I first set all loadings equal to 1 across groups, so that the item discriminations will be 1 as in the Rasch model (a=1). I assumed u2-u8 are anchor items, so I constrained them to be equal across groups. Only u1 is freely estimated across groups. Also, I set the mean to 0 in C1, but the mean is freely estimated in C2. The variance is freely estimated in both groups. However, when I ran my code, the model results showed loadings of 1 in both groups, but in the IRT parameterization results the item discriminations are not 1. How can I get item discriminations of 1 across groups? Also, the results don't show the same item difficulties for u2-u8. Here is my code:

f BY u1-u8@1;
[u2$1 u3$1 u4$1 u5$1 u6$1 u7$1 u8$1];
%c#1%
[f@0];
f;
[u1$1];
%c#2%
[f];
f;
[u1$1];
The IRT results use the factor variance to get the discrimination in the scale of N(0,1), so when you let the factor variance change over classes, equal loadings do not translate to equal discriminations. See eqn (19) in http://www.statmodel.com/download/MplusIRT2.pdf
Without scaling to variance 1, equal loadings are the same as equal discriminations.
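For orientation, the conversion from factor-analytic parameters to 2PL parameters under the logit link is usually written as below. This is a sketch of the standard conversion, not a transcription of eqn (19) in the linked document, and the symbols are assumptions made here: \lambda_{ik} is the loading and \tau_{ik} the threshold of item i in class k, and \alpha_k and \psi_k are the factor mean and variance in class k.

```latex
a_{ik} = \lambda_{ik}\sqrt{\psi_k},
\qquad
b_{ik} = \frac{\tau_{ik} - \lambda_{ik}\,\alpha_k}{\lambda_{ik}\sqrt{\psi_k}}
```

With all loadings fixed at 1, this reduces to a discrimination of \sqrt{\psi_k}, which differs across classes whenever the factor variances differ, even though the loadings are identical.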
Ali posted on Tuesday, November 19, 2013 - 5:18 pm
Thanks! I looked at the formulas for item difficulties and item discriminations. Suppose I set c1: mean = 0, variance freely estimated, and all loadings equal to 1; c2: mean and variance freely estimated, and all loadings equal to 1.
For c1: a = (variance for c1)**0.5, b = threshold / (variance for c1)**0.5
For c2: a = (variance for c2)**0.5, b = (threshold - mean for c2) / (variance for c2)**0.5
So it seems hard to get item difficulties that are equal across groups but different within a group, together with item discriminations that are the same across groups. Is my understanding correct?
Please tell me succinctly how you want the parameters to be in the two groups using IRT terms.
Ali posted on Wednesday, November 20, 2013 - 3:45 pm
I want the item discriminations to be set to 1, or at least to be the same, in the two groups. The item difficulties should be the same across groups but different within groups. For example, suppose there are three items. Item 1 has an item difficulty of 1 in both group 1 and group 2, item 2 has an item difficulty of 0.5 in both group 1 and group 2, and item 3 is freely estimated across groups. I want to do this to find out which items are invariant across groups, that is, which item is the anchor item. So, first, I will let the item difficulty of one item be freely estimated while the remaining items have the same item difficulty across groups (but different item difficulties within groups), and then repeat this procedure. Also, the variances in both groups are freely estimated, and the mean for c1 is set to 0 while the mean for c2 is freely estimated. Thanks!
Thanks for providing the code. I have another question. I am a little confused by Dr. Muthén's last answer, from Wednesday, November 20, 2013 - 6:43 pm. Does it mean I can think of the IRT model as a factor model? When I run the mixture Rasch model:

f BY u1-u7@1;
%c#1%
[f@0];
f;
[u1$1];
[u2$1-u7$1] (1-6);
%c#2%
[f];
f;
[u1$1];
[u2$1-u7$1] (1-6);

The IRT results do not show the same item difficulties across groups or item discriminations of 1, because those values depend on each group's mean and variance. In that case, the model results are closer to what I want (loadings are 1 and thresholds are the same across groups except for item u1). So, could I look only at the model results and think of this model as a factor mixture model?
Yes, the IRT model is a factor model for categorical outcomes. They are one and the same. I think you will find it easier to work with the factor analysis parameterization.
But a general comment is in order given your series of questions. Although it is laudable to be ambitious, I don't think it is a good idea to try to learn basic facts about IRT and factor analysis by asking questions on Mplus Discussion because the answers are by necessity brief. You should study the literature long and hard. Especially since you are doing advanced modeling using mixtures.
The factor (employer's motivation for occupational health and safety) is central to my much-delayed research. My first work on it dates back to 2006, with a small data set and 6 categorical items (published in 2013). Now I have a bigger dataset, and I plan to finish my research in the next 6 months (if my disability lets me). In my new dataset (N=1711), I have 7 items of interest, measured on a 0 to 10 scale and highly skewed (the value 10 gets between 20 and 53% of responses, with smaller frequencies for the value 9 and frequencies in between for the value 8). My data are cross-sectional, and the value 10 isn't a hurdle but a legitimate value, probably affected by the sensitivity of the issue studied. I have conducted EFA with the data in three categories, and separately (on the reversed scale) for the binary part and the continuous part, for two sub-samples. In every instance, one factor falls short and two factors overfit the data. For the loadings of the one-factor solution, two loadings are equal (and it makes sense to fix both @1) and the other 5 loadings are also equal to each other (around 1.5, to be freely estimated). Finally, I have many interesting predictors for the factor, and up to six distal factors (to end with a full-fledged SEM model). Given my hesitations, I went back to the theory, and also reviewed the recent Mplus advances with latent class predictors and Bayes estimation.
Now, for my first question: if I use a two-part model, I will need to use mixtures to get a good fit and, given that 10 is a legitimate value, the factor loadings seem to be necessarily constant across classes, and probably equal for the binary and the continuous part. I was clear on the implications of fixing the factor loadings across classes for continuous models, but I was unsure whether the same could be said for the binary part, given the non-linear relationships. In a similar published work, Kim and Muthén do not fix the factor loadings across classes but estimate them freely; however, their 0 value is clearly a hurdle. So I have hesitations, Dr. Muthén, and some reluctance (trepidation?) about using such sophisticated models with two processes, and about how to use my measure afterwards with predictors and distals.
Perhaps you should keep it simple. Compare (1) a regular EFA treating the outcomes as continuous and using the MLR estimator with (2) a censored-normal EFA using WLSMV. If there are no important differences, go on with (1).
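The two runs suggested above might look like the following sketch. These are two separate input files shown together. The file name items.dat and the item names m1-m7 are hypothetical, and the (a) setting assumes the items are analyzed on the original 0 to 10 scale, where the pile-up is at the top value.

```mplus
! Run (1): outcomes treated as continuous, MLR estimator
DATA:     FILE = items.dat;        ! hypothetical file name
VARIABLE: NAMES = m1-m7;           ! hypothetical item names
ANALYSIS: TYPE = EFA 1 2;          ! extract 1-factor and 2-factor solutions
          ESTIMATOR = MLR;

! Run (2): censored-normal outcomes, WLSMV estimator (separate input file)
DATA:     FILE = items.dat;
VARIABLE: NAMES = m1-m7;
          CENSORED = m1-m7 (a);    ! (a) = censored from above
ANALYSIS: TYPE = EFA 1 2;
          ESTIMATOR = WLSMV;
```

Comparing the loading patterns from the two outputs side by side is then enough to judge whether the censoring matters in practice.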
The 1-factor EFA gives similar loadings for censored (MLR) and categorical (3 categories, WLSMV) items; the loadings for continuous items (MLR) are not so similar. The 2-factor EFA does not converge for censored items, and gives negative residual variances for categorical items. It only reaches a solution for continuous items, with 4 double loadings. In every instance, the 2 factors try to separate M1, legislation, from M4, costs (which has theoretical support: it can be costly to scrupulously abide by the law). For the 1-factor solution with continuous and categorical items, RMSEA is 0.104-0.126 and SRMR is 0.061-0.067. Many thanks, Fernando