Temporal Invariance
 Jason Prenoveau posted on Friday, March 21, 2008 - 9:26 am
We have a hierarchical CFA model with 67 categorical indicators, 6 group factors, and a hierarchical general factor. This model fits very well in our Time 1 data. Unfortunately, because the combined model is so large (134 observed categorical variables), our sample size is too small to test invariance in the usual manner.

We have an alternative strategy and would like to know whether it seems valid. Instead of testing the two structures together in the same model, we would validate the Time 1 structure in the Time 2 data. First, we would examine configural invariance: how well does the Time 1 derived structure fit the Time 2 data when thresholds and factor loadings are freely estimated? Next, we would examine metric invariance by fitting the Time 2 data with factor loadings and thresholds fixed to the values obtained in the Time 1 data, examining model fit and running a chi-square difference test of the configural model against the metric model to see whether fit differs significantly. If there were areas of strain, we could release some of these constraints and test for partial metric invariance.

Does this sound like a reasonable strategy for testing temporal invariance of our factor structure? Thank you for your thoughts.
 Linda K. Muthen posted on Saturday, March 22, 2008 - 9:25 am
I don't think you can do better than what you suggest. You might also do it in the opposite direction.
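
To make the size problem from the opening post concrete, here is a rough sketch that counts the distinct pairwise correlations among the observed variables for one wave (67 items) and for both waves combined (134 items). It assumes the estimator works from a matrix of pairwise (polychoric) correlations, and the function name `pairwise_correlation_count` is made up for illustration. Thresholds are left out because the number of response categories is not given in the thread.

```python
def pairwise_correlation_count(n_items):
    """Number of distinct pairwise correlations among n_items variables."""
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    return n_items * (n_items - 1) // 2


# 67 indicators at one wave versus 134 when both waves are modeled together.
single_wave = pairwise_correlation_count(67)
combined = pairwise_correlation_count(134)

print(single_wave)  # 2211
print(combined)     # 8911
```

Doubling the number of items roughly quadruples the number of correlations that have to be estimated from the same sample, which is one way to see why the combined model strains a small sample.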
 Jason Prenoveau posted on Friday, March 28, 2008 - 8:02 pm
Linda,

Thank you for your reply. I conducted the analysis I had mentioned and was surprised to find that the metric invariant model (where I fixed all factor loadings and thresholds) actually had a LOWER chi-square value than the configural invariant model (where factor loadings and thresholds were free to vary). Chi-square dropped from 324 (with a CFI of 0.95 and RMSEA of 0.049) to 235 (CFI=0.96, RMSEA=0.057). It is difficult to understand how fixing these loadings/thresholds could result in a LOWER chi-square value.

Also, the df only dropped from 169 (configural invariant) to 107 (metric invariant) even though I fixed 149 factor loadings (as well as many thresholds). Is this a function of how df are calculated in WLSMV?

Thank you for your help,
Jason
 Linda K. Muthen posted on Saturday, March 29, 2008 - 6:15 am
You cannot compare the chi-square and degrees of freedom with WLSMV. Only the p-value should be interpreted. If you want to do difference testing, you need to use the DIFFTEST option.
 Jason Prenoveau posted on Saturday, March 29, 2008 - 8:50 am
Sorry - I should have mentioned that I did use DIFFTEST and the result was significant (value=108, df=43, p=0.000). Since the chi-square value was reduced from the less restrictive model (loadings and thresholds free to vary) to the more restrictive model (fixed loadings and thresholds), I assumed the DIFFTEST result indicated a significant IMPROVEMENT in fit when the pathways were fixed. However, this is counter to what I expected, and what is usually seen: fixing (or equating) pathways typically results in a decrement in model fit.

So, are you indicating that despite the apparent DROP in chi-square (of about 90), the DIFFTEST value is indicating an INCREASE in chi-square (of 108) and the metric invariant model represents a significant decrement in model fit, as we would expect? Or is the DIFFTEST indicating that the metric invariant model is a significant improvement (since visual inspection reveals chi-square dropping from the configural to the metric invariant model, and the size of this drop is similar in magnitude to the DIFFTEST value)? Your help in interpreting these unexpected results is much appreciated! Thank you!
 Linda K. Muthen posted on Sunday, March 30, 2008 - 10:37 am
With WLSMV you should not be making any interpretation based on the chi-square values and degrees of freedom including those given in the DIFFTEST results. It is only the p-values you should be looking at. If the p-value for the DIFFTEST results is less than .05, it means that the more restrictive model significantly worsens the fit of the model. If the p-value is greater than .05, it means that the more restrictive model does not significantly worsen the fit of the model.
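
The decision rule described in this reply can be written out as a small sketch. The function name `interpret_difftest` and the default cutoff argument `alpha` are made up for illustration; only the p-value is used, and the chi-square value and degrees of freedom are deliberately not inputs. The reply does not say what to do when the p-value equals .05 exactly, so treating that case as non-significant is an assumption.

```python
def interpret_difftest(p_value, alpha=0.05):
    """Interpret a DIFFTEST result from its p-value alone."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        return "more restrictive model significantly worsens fit"
    # Assumption: p exactly equal to alpha is treated as non-significant.
    return "more restrictive model does not significantly worsen fit"


# The result reported earlier in the thread was p = 0.000.
print(interpret_difftest(0.000))
```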
 Jason Prenoveau posted on Tuesday, April 08, 2008 - 9:31 am
Great - thank you again for your speedy reply. I have a question about reporting results from the DIFFTEST. In a manuscript, would DIFFTEST results be more accurately expressed as delta chi-square or simply chi-square?

Thank you again for all of your help,
Jason
 Linda K. Muthen posted on Tuesday, April 08, 2008 - 9:39 am
I would report only the p-value for the reasons mentioned above.
 Matthew Diemer posted on Wednesday, January 21, 2009 - 7:41 am
I have a different wrinkle on testing temporal invariance that I was hoping for advice on.

We have two waves of panel data, where the same participants were surveyed at each wave but different indicators are available at Time 1 and Time 2.

So, I think it might be technically possible to assess whether temporal invariance holds, using only those indicators that are repeated at Time 1 and Time 2.

However, I'm not sure whether this type of analysis tells you anything. Does temporal invariance based on a subset of indicators provide some evidence of temporal stability, just less than the same full set of indicators from the same participants at Time 1 and Time 2 would?

Or, is it simply not appropriate to assess temporal invariance for only a subset of indicators?

Thanks!

-Matt
 Linda K. Muthen posted on Wednesday, January 21, 2009 - 9:18 am
This type of invariance is used a lot by the testing companies like ETS where the majority of items are repeated over time. I think the believability is related to how many items are repeated.
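
As a rough illustration of "how many items are repeated", the sketch below finds the indicators common to both waves and reports what share of each wave they make up. The item names and the helper names `repeated_items` and `share_repeated` are hypothetical; the thread does not list the actual indicators.

```python
# Hypothetical indicator names for each wave.
time1_items = ["a1", "a2", "a3", "a4", "b1"]
time2_items = ["a1", "a2", "a3", "c1", "c2", "c3"]


def repeated_items(t1, t2):
    """Items that appear at both waves, in Time 1 order."""
    t2_set = set(t2)
    return [item for item in t1 if item in t2_set]


def share_repeated(t1, t2):
    """Fraction of each wave's items that are repeated at the other wave."""
    n_repeated = len(repeated_items(t1, t2))
    return n_repeated / len(t1), n_repeated / len(t2)


print(repeated_items(time1_items, time2_items))  # ['a1', 'a2', 'a3']
print(share_repeated(time1_items, time2_items))  # (0.6, 0.5)
```

Only the repeated items can carry equality constraints across time, so a low share at either wave means the invariance test rests on a thin slice of the measure.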
 Matthew Diemer posted on Tuesday, February 03, 2009 - 1:44 pm
Thanks, Linda. I will look at what ETS and others have done for examples.

I'm not sure on this point - will we need to restructure the dataset for analyses of temporal invariance?

We already have the data in the more common wide format, but are unsure if they need to be restructured to long format (as in LGM).

Thanks!

Matt
 Linda K. Muthen posted on Tuesday, February 03, 2009 - 2:54 pm
I would leave the data in the wide format. See the Topic 4 short course handout on the website. Multiple indicator growth is described starting on Slide 77. The first part of the example illustrates testing measurement invariance across time.
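
For readers unsure of the distinction, here is a small sketch of the two layouts. In wide format each participant is one row and each item has a separate column per wave; in long format each participant contributes one row per wave. The variable names (`y1_t1`, etc.) and the helper `wide_to_long` are made up for illustration, and the conversion is shown only for contrast, since the advice above is to stay in wide format.

```python
# Wide format: one row per participant, one column per item per wave.
wide_rows = [
    {"id": 1, "y1_t1": 0, "y2_t1": 1, "y1_t2": 1, "y2_t2": 1},
    {"id": 2, "y1_t1": 1, "y2_t1": 0, "y1_t2": 1, "y2_t2": 0},
]


def wide_to_long(rows, items, waves):
    """Stack wide rows into one row per participant per wave."""
    long_rows = []
    for row in rows:
        for wave in waves:
            record = {"id": row["id"], "wave": wave}
            for item in items:
                record[item] = row[f"{item}_{wave}"]
            long_rows.append(record)
    return long_rows


long_rows = wide_to_long(wide_rows, items=["y1", "y2"], waves=["t1", "t2"])
for record in long_rows:
    print(record)
```

In the wide layout the Time 1 and Time 2 versions of an item are separate variables in the same row, which is what allows their loadings and thresholds to be constrained equal within a single model.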