| Message/Author |
|
|
|
|
| Suppose I administered the same measure to men and women at two points in time to test for sex differences in changes in the construct tapped by the measure. It seems important to demonstrate that the measure is invariant both across groups and over time before one can meaningfully compare group differences over time. Do you think it appropriate to test for both forms simultaneously in the same model? Or, if invariance across groups holds but invariance across time does not, might the misfit due to non-invariance across time be partially offset by the fact that the portion of the model having to do with cross-group invariance fits well? In other words, would a stronger approach be to test for invariance across groups at each time point separately and then test for invariance over time within each group separately? Thanks for any insight you can share! |
|
|
|
|
| I would do them separately because if you do them together it would be difficult to know where the problem is if measurement invariance does not hold. |
|
|
|
|
Thanks for the speedy reply, Linda! A follow-up question: if invariance does hold in the combined model, is it reasonable to conclude that both forms of invariance hold? Or might the overall model fit still be acceptable if one form didn't hold but the other did? Also, from reading other postings on the discussion board, I now understand that the Mplus approach to factor analysis with categorical items is equivalent to a 2-parameter IRT model. That is very exciting, I think, as it opens up lots of analytic possibilities that I did not think were available (i.e., multidimensional IRT), at least not with any software that I had. I do have a couple of questions about this, though. It seems to me that one of the major advantages of IRT is that it enables equating of scores across different tests by taking threshold and slope information into account from both measures to locate participants on the underlying latent trait. It stands to reason that if we can equate scores from subjects from the same population who have taken two alternative forms of some test, the same methods should allow us to equate scores from subjects from different populations administered the same test for whom the thresholds and/or factor loadings differ, as long as we are confident that the test is still measuring the same construct in the two populations, no? If so, then I would think the same would be true for equating scores from repeated administrations of the same test to the same sample even if the thresholds and/or factor loadings differ over time? If these intuitions are correct, perhaps it is not critical that the measure demonstrate invariance as long as the data are analyzed appropriately (i.e., by incorporating both thresholds and factor loadings in our measurement model)? 
My second question about factor analysis of categorical items in Mplus is that I don't understand the concept of a single threshold for a measure with a hierarchical structure in which each item loads on a general factor and a group factor. As these factors are orthogonal, each item is not measuring a single ability but rather two abilities. Thus, I am having a hard time at a conceptual level with the notion of a single threshold for such an item. At a purely mathematical level, I can understand that such a threshold corresponds to something like a particular vector length of the vectors corresponding to participants' locations in the plane defined by the two abilities measured by a given item. Conceptually, however, I am having a hard time understanding the meaning of such a threshold, as a subject could exceed it in many different ways (e.g., by being high on one ability but low on the other, or vice versa, or by having a moderate standing on both abilities). Any insight you could provide that might help one develop some intuition for the meaning of such thresholds would be greatly appreciated (or a pointer to a reference that might help along these lines). Thanks very much! |
|
|
|
|
I don't think it is necessarily true that if invariance holds in the combined model, it holds for both groups and time. I would test both. Regarding IRT, there is a new section on IRT in Mplus. You can find the link on the homepage. You need measurement invariance across time for it to make sense to study the development of the construct across time. The structural parameters (the means, variances, and covariances of the constructs) may vary across time, but the measurement parameters should not. It may be that you can have partial invariance. Regarding a single threshold when there is a general and a group factor influencing the same factor indicator, you may think of this as a threshold on a specific ability variable needed to solve the item correctly. The specific ability variable is the sum of the general and group factors. A person may exceed the threshold of this specific ability variable by different combinations of general and group factor values added together. |
|
|
|
|
| Thanks very much, Linda - your responses are very helpful! Re: the issue of the single threshold, at a conceptual level what you say makes perfect sense if I am thinking of the structure in higher-order terms, with each item having a loading on one and only one factor - its first-order factor (which then loads on a second-order factor and so on until the highest level in the structure is reached). If I am thinking of the structure in terms of a hierarchical model such as the bi-factor model, it seems to me that there are two ability variables needed to solve the item correctly - the general factor plus the group factor, which is orthogonal to the general factor. Given the orthogonality, it just seems conceptually messy to me to talk of a specific ability variable which is the sum of the general and group factors - it seems more accurate in this case to talk of the abilities (plural) needed to solve the item correctly. Now, I realize that mathematically many hierarchical models and higher-order models are just linear transformations of each other (using the Schmid-Leiman transformation and its inverse), but at a conceptual level, if one thinks the hierarchical model provides the representation closer to reality in a given domain, it seems strange to me to talk of a single threshold on the several independent abilities needed to solve the item. |
|
|
|
|
If the model is (1) y = g + s where y is either the logit or probit for the binary item, then it says that y needs to be large enough to solve the item, implying that a relatively low g (or s) value for a person can be compensated by his higher s (or g) value to give the same y. This says that one threshold for y is sufficient. If on the other hand solving the item requires that g exceeds a threshold and that s exceeds another threshold, then a different model than that of (1) is called for. I think the former model is that of "bi-factor" modeling that has been written about recently in Psychometrika by Gibbons and others. I don't have a reference for the latter model. |
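The contrast between the two models described above can be illustrated with a minimal Python sketch. It assumes a logit link, unit loadings, and thresholds of zero; the function names and the product form used for the second ("both must exceed") model are illustrative assumptions, not something given in the post or taken from Mplus.

```python
import math

def logistic(x):
    """Standard logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def p_compensatory(g, s, tau=0.0):
    """Model (1): y = g + s, with a single threshold tau on the sum."""
    return logistic(g + s - tau)

def p_conjunctive(g, s, tau_g=0.0, tau_s=0.0):
    """Hypothetical alternative: g and s must each exceed their own threshold.
    The product of two logistic terms is one assumed way to express this."""
    return logistic(g - tau_g) * logistic(s - tau_s)

# Three people with the same sum g + s but different profiles.
profiles = [(2.0, -2.0), (-2.0, 2.0), (0.0, 0.0)]
for g, s in profiles:
    print(f"g={g:+.1f} s={s:+.1f} "
          f"compensatory={p_compensatory(g, s):.3f} "
          f"conjunctive={p_conjunctive(g, s):.3f}")
```

Under the compensatory model all three profiles get the same probability because only the sum matters, which is why one threshold suffices. Under the assumed conjunctive form, the balanced profile does better than either lopsided one.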
|
|
|
|
| That helps - thanks, Bengt! I agree that a compensatory model suggests one threshold is sufficient, and that seems a more accurate, albeit more unwieldy, description of the conceptual meaning of the threshold in the case of a bi-factor (or other hierarchical) model with this compensatory relationship. |
|
|
| Jon Elhai posted on Tuesday, October 27, 2009 - 6:56 pm
|
|
|
I'm using the MLM estimator, and I conducted a difference test between two nested models (as per your "chisquare difference test" formula section of your website)... So now I have a corrected chisquare difference value and difference in degrees of freedom to look up in a chi-square table. Does it make sense to take that chi-square value difference, degrees of freedom difference, and sample size to manually calculate RMSEA? Would such a resulting RMSEA represent a difference in fit between the two models? |
|
|
|
|
| I have not seen RMSEA used for difference testing. |
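For readers following along, the corrected difference test mentioned in the question can be sketched in Python. This assumes the usual Satorra-Bentler scaled difference formula for MLM, where T0, c0, d0 are the scaled chi-square, scaling correction factor, and degrees of freedom of the nested (more restrictive) model, and T1, c1, d1 are those of the comparison model. The function name and the input values are hypothetical.

```python
def scaled_chisq_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference test for nested models.

    t0, c0, d0: scaled chi-square, scaling correction factor, and df
                of the nested (more restrictive) model.
    t1, c1, d1: the same quantities for the comparison model.
    Returns (TRd, df difference).
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Difference-test scaling correction.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive scaling correction; the test is not usable")
    # Multiplying by c recovers the unscaled chi-square values before rescaling.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1

# Hypothetical output values from two nested MLM runs.
trd, ddf = scaled_chisq_difference(100.0, 1.2, 30, 80.0, 1.1, 25)
print(f"TRd = {trd:.3f} on {ddf} df")
```

The resulting value is referred to a chi-square distribution with the returned degrees of freedom. The sketch stops there and does not compute an RMSEA from the difference.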
|
|
|
|
I am investigating measurement invariance in a study with two time points (i.e., baseline and 1 year). SF-36 is used as the measurement model. Is it possible to observe both strong non-invariance (indicator intercepts) and strict non-invariance (error variances) in the same scale (at 1 year)? Thank you for your time! |
|
|
|
|
| I'm not sure that I understand your question. You can test for intercept, factor loading, and residual variance invariance across the two time points. |
|
|
|
|
I ran the analysis and found both strong non-invariance (indicator intercepts) and strict non-invariance (error variances) for the same scale at one year. I wanted to confirm: is it possible for both to be identified in the same scale (at 1 year)? Thank you! |
|
|
|
|
| For continuous items, which I assume you have given that you refer to intercepts, yes. |
|
|