

Correction for nonindependent observ... 



I'm analysing a questionnaire that measures altered states of consciousness. To have enough data, I pooled data from various experimental studies in which hallucinogenic drugs were given to healthy volunteers. In some studies the hallucinogenic drug was given just once, but in a few studies the drugs were administered in a repeated-measures design with three different doses on three separate days. The questionnaire was filled out on each experimental day. Some subjects also took part in more than one study. Therefore, some subjects in my dataset have up to 6 or 7 observations, although about half of my dataset consists of single, independent observations. Can I use type=complex with the subjects as clusters to correct for this partial nonindependence of observations, or do I have to take the repeated measures out of my dataset? The problem is that I can hardly estimate my model without the repeated measures, since my dataset is already on the small side. Since most of my clusters are rather small and the questionnaires were not filled out under exactly the same conditions, I suspect that the intraclass correlation is not too big and the bias in chi-square and standard errors would be rather small, wouldn't it? I also read somewhere that the MLM estimator corrects for nonindependence. Can you elaborate on this? Thanks a lot!


You could try Type = Complex to investigate the effect of the clustering. If the SEs are fairly similar to those of regular MLR estimation, you need not bother with the clustering.
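One way to make the suggested comparison concrete is to put the two sets of standard errors side by side and look at the percent change per parameter. The following is only a minimal sketch: the function name `se_percent_diff` and all standard error values are hypothetical and do not come from this thread; in practice the two lists would be copied from the MLR output and the MLR with type=complex output.

```python
def se_percent_diff(se_ref, se_alt):
    """Percent change of each SE in se_alt relative to the matching SE in se_ref."""
    if len(se_ref) != len(se_alt):
        raise ValueError("both lists must contain one SE per parameter")
    return [100.0 * (alt - ref) / ref for ref, alt in zip(se_ref, se_alt)]


# Hypothetical standard errors for four parameters (not from the thread).
se_mlr = [0.050, 0.080, 0.100, 0.040]       # regular MLR
se_complex = [0.055, 0.080, 0.110, 0.046]   # MLR with clustering taken into account

diffs = se_percent_diff(se_mlr, se_complex)
mean_abs = sum(abs(d) for d in diffs) / len(diffs)
max_abs = max(abs(d) for d in diffs)

print(f"mean absolute difference: {mean_abs:.1f}%")
print(f"largest absolute difference: {max_abs:.1f}%")
```

Summarizing with both the mean and the maximum is a design choice: a small average can hide a few parameters whose standard errors change a lot.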


Thank you so much! Compared to regular MLR estimation, chi-square and fit indices get slightly better when I take clustering into account. I have 485 observations, 197 free parameters and 264 clusters.
Fit indices with MLR: chi-square = 1771.56, df = 978, CFI = 0.916, TLI = 0.908, RMSEA = 0.041, SRMR = 0.055.
Fit indices with MLR and type=complex: chi-square = 1701.597, df = 978, CFI = 0.920, TLI = 0.911, RMSEA = 0.039, SRMR = 0.055.
I also calculated the difference in the standard error estimates. The largest difference was 19.4%; the average difference was 7.8%. Would you consider this difference worth bothering about? I'm asking because I have also tried to do some analyses with a subset of my dataset, and I got the error message that I have more free parameters than clusters. I guess type=complex is only appropriate with more clusters than free parameters. Is there a recommended ratio between clusters and free parameters?
If I take the regular MLM estimator, the CFI and TLI are still better than with MLR and type=complex. Here is what I get with the MLM estimator: chi-square = 1742.877, df = 978, CFI = 0.921, TLI = 0.912, RMSEA = 0.040, SRMR = 0.055.
The standard errors with the MLM estimator are on average about 20% smaller than with MLR and type=complex. I still don't understand the difference between MLM and MLR estimation and why they behave so differently in my case.
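For a rough sense of how much clustering could matter with the 485 observations and 264 clusters reported above, one can use the common design-effect approximation deff = 1 + (m - 1) * ICC, where m is the average cluster size. This is only a back-of-the-envelope sketch under loud assumptions: the approximation is derived for a mean with equal cluster sizes, not for SEM parameters with very unequal clusters, and the ICC values below are hypothetical because the thread does not report an intraclass correlation.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Approximate design effect: 1 + (average cluster size - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


n_obs = 485        # observations, as reported in the thread
n_clusters = 264   # clusters (subjects), as reported in the thread
mean_size = n_obs / n_clusters

# Hypothetical ICC values; the thread does not report the actual ICC.
for icc in (0.1, 0.3, 0.5):
    deff = design_effect(mean_size, icc)
    se_inflation = math.sqrt(deff)  # SEs scale roughly with sqrt(deff)
    print(f"ICC={icc:.1f}  deff={deff:.3f}  "
          f"SE inflation about {100 * (se_inflation - 1):.1f}%")
```

Because the average cluster size here is below 2, even a sizeable ICC yields a modest design effect under this approximation, but individual parameters can deviate from it.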


MLM and MLR are asymptotically equivalent although they use different formulas. In your case, I do not think a comparison is valid because the model does not fit the data well. Standard errors are underestimated when nonindependence of observations is not taken into account. 


Thanks! I really tried hard to improve the fit of my measurement model. This is about the best solution I can get, unless I free lots of parameters (cross-loadings and correlated errors) that don't make sense from a theoretical point of view, or unless I throw out more items and factors. I thought the model fit wasn't too bad, since RMSEA and SRMR are well below the recommended cutoff values and only CFI and TLI are too low. I suspected that this is because of my rather small sample size. Would you consider the fit of my measurement model insufficient for proceeding to MIMIC models? I understand that it's probably the best solution to take MLR with type=complex, but what would you suggest if I have more parameters than clusters?
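To see how the RMSEA values quoted in this thread relate to the reported chi-square, degrees of freedom and sample size, the textbook point estimate sqrt(max(chi2 - df, 0) / (df * (N - 1))) can be recomputed by hand. This is a sketch under assumptions: the helper `rmsea` and its `use_n_minus_1` switch are illustrative names, some programs divide by N instead of N - 1, and the exact computation a given program applies to scaled test statistics may differ in detail.

```python
import math


def rmsea(chi2, df, n, use_n_minus_1=True):
    """Textbook RMSEA point estimate; set use_n_minus_1=False to divide by N."""
    denom_n = n - 1 if use_n_minus_1 else n
    return math.sqrt(max(chi2 - df, 0.0) / (df * denom_n))


N = 485    # observations, as reported in the thread
DF = 978   # degrees of freedom, as reported in the thread

# (chi-square, RMSEA) pairs exactly as reported in the thread.
reported = {
    "MLR": (1771.56, 0.041),
    "MLR + type=complex": (1701.597, 0.039),
    "MLM": (1742.877, 0.040),
}

for label, (chi2, rmsea_reported) in reported.items():
    value = rmsea(chi2, DF, N)
    print(f"{label}: recomputed {value:.3f}, reported {rmsea_reported:.3f}")
```

The formula shows why RMSEA can look acceptable while CFI and TLI do not: RMSEA divides the misfit by the degrees of freedom, whereas CFI and TLI compare the model against a baseline model.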


I would use TYPE=COMPLEX; with MLR and start with an EFA. I do not think you can claim the model fits well. It is desirable to have more clusters than parameters. 


