|
Jon Heron posted on Friday, September 04, 2020 - 3:08 am
|
|
|
Apologies for yet another DSEM question! In standard multilevel models, fixed effects and random effects come in pairs. The fixed effect describes the population mean level whilst the random effect describes variation around that mean as well as enabling the researcher to consider covariates that might explain some of that variability. When it comes to crosslag effects in a DSEM we are encouraged to focus on the within-person standardized estimates rather than the fixed effects reported in the main output. Does this mean we can no longer describe variability in the crosslag relationship about a population mean and/or consider covariates as described above? many thanks, Jon |
|
|
If you look at output:stand(cluster); you can see the average standardized values and the values for each subject, which can then be used to compute the variability (the variability itself is not in the output). The relationship between a standardized effect and a covariate will not be linear in DSEM; it will be somewhat more complex. If you want a linear effect, I would suggest a two-step approach. In the first run, exclude the covariate and use output:residual(cluster) to get the estimated variances for each variable for each subject. Then use those estimated variances to standardize the observed values in each cluster separately. Follow this with a second DSEM run on the standardized data, in which you include the covariates. |
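A minimal sketch of the rescaling step between the two runs, assuming the per-subject variances from the first run have already been read into a dictionary. The function name, the data layout, and the choice to divide by the standard deviation without centering are illustrative assumptions, not something the thread or Mplus prescribes.

```python
import math

def standardize_within_cluster(rows, variances):
    """Rescale each observation by its own cluster's estimated SD.

    rows      : list of (cluster_id, value) pairs; value may be None (missing)
    variances : dict mapping cluster_id -> estimated within-cluster variance
                (hypothetical input, e.g. taken from the first run's output)
    """
    rescaled = []
    for cluster_id, value in rows:
        if value is None:
            # Keep missing observations missing.
            rescaled.append((cluster_id, None))
        else:
            sd = math.sqrt(variances[cluster_id])
            rescaled.append((cluster_id, value / sd))
    return rescaled
```

The rescaled values would then be written back out as the data file for the second run, one variable at a time.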
|
Jon Heron posted on Monday, September 07, 2020 - 12:36 am
|
|
|
Thanks Tihomir, that makes sense. Does a covariate effect on the non-standardized crosslag random effect tell us anything useful, or is this as potentially misleading as the non-standardized crosslag fixed effect? best, Jon |
|
Jon Heron posted on Tuesday, September 08, 2020 - 3:47 am
|
|
|
So I've obtained each person's variance information using residual(cluster) and have used this to rescale the data. If I then feed this back into Mplus and refit the model without a covariate, it feels like my non-standardized output using this standardised data (specifically the means of my crosslag random effects) should agree with the Within-Level Standardized Estimates obtained using non-standardized data. But they don't. Either I've done something wrong, or I'm incorrect in my thinking. Or perhaps both? best, Jon |
|
|
I wouldn't expect those to be the same. In the first run the means are on the between level and are standardized with respect to between variance. In the second run you are standardizing with respect to within-level variance. On the earlier question: it is best to use simulations to see through this, but I would expect that some sort of bias will occur if there is a non-zero correlation between the subject specific variance and the subject specific covariance effect. |
|
Jon Heron posted on Wednesday, September 09, 2020 - 12:35 am
|
|
|
many thanks Tihomir |
|
Jon Heron posted on Wednesday, October 21, 2020 - 8:38 am
|
|
|
I really hope this is my last Q about this! I've been exploring different ways in which one might examine the difference in the within-person standardized estimate for a crosslag term between two groups.
The estimate for the standardized crosslag effect from the model without the covariate was 0.065, 95% CI=(0.054, 0.075)
If I employ Tihomir's suggested solution and use resid(cluster) to output cluster-specific estimates of variance and use these to rescale the data, then I estimate the covariate effect on this crosslag to be 0.111 (0.041, 0.182)
If I use stand(cluster) to produce each person's own standardized crosslag term and regress these estimates on my covariate in Stata, then I get 0.020 [0.008, 0.032]
And if I simply estimate my original model twice, i.e. for X=0 and for X=1, then I get the following:
X=0 : standardized crosslag = 0.051 (SD=0.007)
X=1 : standardized crosslag = 0.091 (SD=0.009)
Bottom line = the pattern of conclusions across a range of covariates is the same irrespective of which of these 3 options I pick. But clearly the scales are different. Which results should I present? All of them? many thanks, Jon |
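For the second option (regressing each person's standardized crosslag on the covariate), it may help to recall that with a binary covariate the least-squares slope is just the difference between the two group means of the person-level estimates. A minimal sketch with made-up numbers; the function name and data are hypothetical and are not taken from the analysis above.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (single predictor, with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical person-level standardized crosslag estimates for two groups.
covariate = [0, 0, 1, 1]
crosslag = [0.04, 0.06, 0.05, 0.09]

slope = ols_slope(covariate, crosslag)
# Group means are 0.05 (X=0) and 0.07 (X=1), so the slope is their difference.
```

Note that this treats the person-level estimates as if they were observed data, so the uncertainty in each estimate is not carried into the regression.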
|
|
As far as I can see, method 3 is the best. The shortcomings of the first two are that they ignore the uncertainty in the cluster-specific estimates of variance and in the standardized crosslag. I would recommend doing simulations and also contacting Noémi Schuurman and Ellen Hamaker for an alternative opinion and to see if they have some updated information. Apart from that, I will point out that Methods 1 and 2 vs. Method 3 also differ in the SE. The SE for Method 3 will converge to zero when the cluster sample sizes converge to infinity. The SE for Methods 1 and 2 will converge to zero when the number of clusters converges to infinity.
Var(ar-parameter) = Var(E(ar_i)) + E(Var(ar_i))
Method 3 uses just the second term, so it is expected to always have a smaller SE. It gives you the estimate for this set of individuals rather than for a generalized population the way Methods 1 and 2 do. |
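The decomposition quoted above is the law of total variance: the variance across people of the person-specific means, plus the average of the person-specific variances. A small numeric sketch, assuming each person contributes the same number of draws (with unequal counts the pooled variance would weight people differently); the function name and the numbers are hypothetical.

```python
from statistics import mean, pvariance

def variance_decomposition(draws_by_person):
    """Return (between, within, total) population variances.

    draws_by_person : list of equal-length lists, one list of values
                      (e.g. draws of ar_i) per person.
    """
    person_means = [mean(d) for d in draws_by_person]
    person_vars = [pvariance(d) for d in draws_by_person]
    between = pvariance(person_means)   # Var(E(ar_i))
    within = mean(person_vars)          # E(Var(ar_i))
    pooled = [v for d in draws_by_person for v in d]
    total = pvariance(pooled)           # Var(ar-parameter)
    return between, within, total

between, within, total = variance_decomposition([[0.1, 0.3], [0.5, 0.7]])
# between + within equals total (up to floating-point rounding)
```

In this toy example most of the total comes from the first term, which is the part that only shrinks as more people are added.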
|
Jon Heron posted on Thursday, October 22, 2020 - 1:06 am
|
|
|
Thanks Tihomir, method 3 was also my favourite, although I was slightly concerned that, as I was unable to stratify using grouping/knownclass, every aspect of the model was permitted to vary across my groups; it felt a little like comparing apples and oranges. Do you think it a step too far to use the information provided by method 3 to construct a confidence interval for the group difference? That feels like an awkward marriage of Bayesian and frequentist thinking. all the best, Jon |
|
|
You can use "multivariate format" to do multiple group modeling and hold parameters equal. Each variable Y will have two versions, YG1 and YG2, and if cluster sizes are unequal you would fill in missing values. I don't see any problems with the marriage of Bayesian and frequentist thinking. You are making asymptotic inference anyway, and Bayes and ML are asymptotically the same. |
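One way the interval for the group difference could be put together from the two separate runs reported earlier in the thread (0.051 with SD 0.007 for X=0, 0.091 with SD 0.009 for X=1). This is a sketch under stated assumptions: the two runs are treated as independent, the posterior SDs are used as standard errors, and a normal approximation is applied. The function name is hypothetical.

```python
import math

def difference_interval(est0, sd0, est1, sd1, z=1.96):
    """Normal-approximation interval for est1 - est0.

    Assumes the two estimates are independent, so their variances add.
    """
    diff = est1 - est0
    se = math.sqrt(sd0 ** 2 + sd1 ** 2)
    return diff, se, (diff - z * se, diff + z * se)

diff, se, (lower, upper) = difference_interval(0.051, 0.007, 0.091, 0.009)
```

With these inputs the difference is 0.040 with a standard error of about 0.011, giving an interval of roughly (0.018, 0.062).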
|
Jon Heron posted on Friday, October 23, 2020 - 12:56 am
|
|
|
Multivariate?! My poor PC. cheers, Jon |
|
|
It shouldn't be that bad. If you have 2 variables in the original model, the multivariate version of multiple group would have 4 variables. It will be two parallel processes that are independent of each other. Ideally I think this should just double the estimation time and certainly it shouldn't be more than 4 times longer. |
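A rough sketch of how the data might be rearranged into this layout, under one reading of the suggestion: a person from group 1 is paired with a person from group 2, their series sit side by side as Y1G1, Y2G1, Y1G2, Y2G2, and the shorter series is padded with missing values. The pairing rule, the column order, and the use of `None` as the missing marker are all assumptions for illustration; an exported file would need whatever missing-value code the analysis declares.

```python
from itertools import zip_longest

MISSING = None  # placeholder; replace with a numeric missing code on export

def to_multivariate(group1, group2):
    """Pair up series from two groups into wide rows.

    group1, group2 : lists of per-person series; each series is a list of
                     (y1, y2) tuples ordered in time.
    Returns rows of (pair_id, time, y1g1, y2g1, y1g2, y2g2).
    """
    rows = []
    pairs = zip_longest(group1, group2, fillvalue=[])
    for pair_id, (series1, series2) in enumerate(pairs, start=1):
        padded = zip_longest(series1, series2, fillvalue=(MISSING, MISSING))
        for time, (obs1, obs2) in enumerate(padded, start=1):
            rows.append((pair_id, time, obs1[0], obs1[1], obs2[0], obs2[1]))
    return rows
```

For example, pairing a two-occasion series from group 1 with a one-occasion series from group 2 yields two rows, the second of which has missing values in the group 2 columns.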
|