I listened to your Mplus talks on growth modeling, EFA and CFA in March and August 2008. I got the message that as you move from one step to another in estimation, you need to use a different random sample from the one used in the first step. For example: in the first step you do an EFA at each time point in growth modeling using random sample 1, then in the fourth step, when you do a CFA, you need another random sample 2. I am leaving out steps two and three, where one determines the shape of the growth curve.
I am trying to estimate a multiple indicator growth model. My data consist of educational attainment in OECD countries over the period 1960-1985, measured every 5 years. I have 20 countries and 6 time points. Do I have to select random samples from my data set as I move from one estimation step to another? How do I decide the size of the random sample to be selected for the next step? Can you recommend any further reading?
Those steps are mainly intended for developing new instruments, where one modifies, deletes, and adds items. But even if you have to live with a particular instrument as is, I think keeping exploration and confirmation on different data sets is useful if you have the sample size to afford it. Depending on the number of items in your instrument, you may need at least, say, 200-300 observations for a stable exploratory analysis.
Also, if your hypothesis only concerns the number of factors rather than a more specific structure, you might consider the new "ESEM" approach. In a growth context, however, this requires you to study up on it, and is not something you just learn by doing.
Furthermore, you have to consider whether or not a multilevel (exploratory) factor analysis is necessary (see the Mplus Short Course Topic 7 handout).
I assume by instruments you mean indicators. The indicators I work with come from existing educational attainment data sets built by different authors, which researchers have claimed are corrupted by measurement error. I do not intend to develop new instruments or modify the existing ones; I just want to get an estimate of the measurement error in the existing instruments.
As you probably noticed, I do not have the sample size to afford having different samples for EFA and CFA.
I listened to one of your multilevel factor analysis talks and read the suggested handout (Mplus Short Course Topic 7 handout). I do not think I need to consider a multilevel analysis in my case. My countries are all OECD countries, so I have only one cluster, and the only variation I will find is within the OECD cluster. If you see any obvious reason why I should consider multilevel analysis, please let me know.
What I was really worried about is that in one of the Mplus presentations, when talking about CFA on the Holzinger-Swineford data, Linda said that even though some of the modification indices were large, she chose not to use them and re-estimate the model, because that would require the use of another sample (data set). I hope that would be needed only in an ideal world. Or is it common practice to re-estimate the model using a different sample?
It is unfortunately not common to re-estimate the model using a different sample. I think what Linda was referring to was that when you use the same sample repeatedly in checking modification indices and modifying the model, you may be capitalizing on chance; that is, the parameters you free up may not be significant in another random sample from the same population. So you want to free only those parameters with large indices that are clearly interpretable.
Brianna H posted on Friday, September 27, 2013 - 4:09 pm
Hello-- I am trying to figure out how to randomly sample observations from my dataset, so that I can divide the sample in half and test model modifications on one half of the sample at a time. Can you please let me know the syntax for randomly sampling a certain number of observations from the data file used in the analysis? Thank you.
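One way to get a random split-half without relying on any particular Mplus option is to divide the data file before the analysis. The following is only a minimal sketch in Python, not Mplus syntax: it assumes a plain-text data file with one observation per line and no header row, and the function names, file names, and seed value are hypothetical placeholders.

```python
import random


def split_half(rows, seed=12345):
    """Randomly partition rows into two halves (sizes differ by at most one).

    The seed is an arbitrary, hypothetical value; fixing it makes the
    split reproducible across runs.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = len(indices) // 2
    # Sort each half's indices so rows keep their original file order.
    first = sorted(indices[:cut])
    second = sorted(indices[cut:])
    return [rows[i] for i in first], [rows[i] for i in second]


def split_file(in_path, out_a, out_b, seed=12345):
    """Write two random halves of a one-observation-per-line data file.

    Assumes no header row and skips blank lines. Returns the number of
    observations written to each output file.
    """
    with open(in_path) as f:
        rows = [line.rstrip("\n") for line in f if line.strip()]
    half_a, half_b = split_half(rows, seed)
    for path, half in ((out_a, half_a), (out_b, half_b)):
        with open(path, "w") as f:
            f.write("\n".join(half) + "\n")
    return len(half_a), len(half_b)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own.
    print(split_file("mydata.dat", "half_a.dat", "half_b.dat"))
```

Each output file can then be named in the DATA command of its own Mplus input, so that modifications explored on one half are checked on the other. Keeping the seed fixed means the same two halves are produced every time the script is run.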