

Between-group variance for the factor...


Jeroen Ooms posted on Friday, August 29, 2008  4:34 am



I have a (simulated) dataset in which 4 observed variables function as indicators for one latent factor. However, one of the factor loadings has some between-group variance: the population factor loading of y4 is .3 for half of the groups and .5 for the other half. The other factor loadings are homogeneous across the groups at 0.5. I want to evaluate whether the two-level model is capable of discovering this. I tried to fit the model with only within-level effects:

    MODEL:
    %within%
    y1-y4*.25;
    [y1-y4*0];
    f BY y1-y4*.5;
    f@1;
    [f@0];

When I fit this model, Mplus estimates f BY y1-y3 at 0.5 (correctly) and estimates f BY y4 at 0.4 for the entire group. However, the model seems to fit perfectly (chi-square < 1). I don't understand why it fits perfectly, because the estimate of 0.4 is incorrect for all groups: it is 0.3 for half of the groups and 0.5 for the other half. I think my question is: does Mplus by default also estimate a variance parameter for the factor loading? If so, how can I fix this variance at 0? And if not, why does this model fit perfectly, although the model parameter of 0.4 is actually incorrect for all groups?


I assume that by "group" you mean cluster, in the sense of students observed within schools (cluster = school). It sounds like you want a random loading for y4, which you should specify as

    %within%
    f by y1-y3;
    s | y4 on f;
    %between%
    [s]; s;

where [s] and s on between give the mean and variance of the 4th loading across the clusters. In some cases, model misspecification gets absorbed into model parameters and doesn't cause a model misfit. I have not penetrated your model enough to say why that seems to happen here.
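As a rough illustration of what [s] and s would be summarizing for the simulated data described in the question, here is a minimal sketch in Python (not Mplus). The number of clusters is a hypothetical choice made only for the example, and the two-point distribution of loadings is taken from the question; a random-slope model would typically treat the loading as normally distributed across clusters, so this is only the target mean and variance, not what any particular run would estimate.

```python
from statistics import mean, pvariance

# Hypothetical setup: 100 clusters, half with a y4 loading of 0.3 and
# half with 0.5, as described in the question. The cluster count is an
# assumption; only the 50/50 split matters for the result.
cluster_loadings = [0.3] * 50 + [0.5] * 50

# Mean of the loading across clusters (what [s] summarizes).
loading_mean = mean(cluster_loadings)

# Variance of the loading across clusters (what s on between summarizes).
loading_var = pvariance(cluster_loadings)

print(round(loading_mean, 4), round(loading_var, 4))
```

Under these assumptions the across-cluster mean of the loading is about 0.4 and its variance about 0.01, so a single fixed loading of 0.4 corresponds to the mean of the cluster-specific loadings.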

Jeroen Ooms posted on Friday, August 29, 2008  9:06 am



Thank you for your response. In your example f is measured by y1-y3 and y4 is regressed on f. This is something different than measuring f by y1-y4, right? I explicitly want to test if scale equivalence holds, which means equal factor loadings across clusters. From your response I assume that there is no such thing as a random factor loading. If that is the case, then I still don't understand what caused the perfect model fit. Here is some output from the MC:

    MODEL RESULTS

                            ESTIMATES             S. E.    M. S. E.   95%   % Sig
              Population   Average  Std. Dev.    Average             Cover  Coeff
     F BY
      Y1         0.500      0.5000    0.0034      0.0033    0.0000   0.935  1.000
      Y2         0.500      0.5001    0.0033      0.0032    0.0000   0.943  1.000
      Y3         0.500      0.4999    0.0034      0.0032    0.0000   0.932  1.000
      Y4         0.500      0.4000    0.0030      0.0185    0.0100   0.000  1.000

As you can see, the factor loading for y4 was on average estimated at 0.4, with an MSE of .01. This is correct, because factor loadings were 0.5 for half of the clusters and 0.3 for the other half. However, this model misspecification does not affect the chi-square: the average chi-square is below 2, indicating a 'perfect fit'. Do you have any idea what could have caused this?


On your 1st paragraph, f is still measured by y4. Note that a loading is a regression coefficient of a y on f. The only difference compared to regular factor analysis is that the loading is random as I specified it, which is what you wanted. So there is indeed such a thing as a random factor loading in Mplus. 
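On the perfect fit, a small numerical check may help. The sketch below is Python, not Mplus output, and rests on stated assumptions: two equally sized halves of clusters, zero means in every cluster, factor variance 1, and residual variances of .25 for all indicators (the values in the posted MODEL statement). The residual variance of 0.26 for y4 in the pooled model is derived in this sketch and is not reported anywhere in the thread.

```python
def implied_cov(loadings, resid_vars, factor_var=1.0):
    """Covariance matrix implied by a one-factor model."""
    p = len(loadings)
    return [
        [
            factor_var * loadings[i] * loadings[j]
            + (resid_vars[i] if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]

resid = [0.25, 0.25, 0.25, 0.25]

# Population covariance matrices for the two halves of the clusters.
cov_low = implied_cov([0.5, 0.5, 0.5, 0.3], resid)
cov_high = implied_cov([0.5, 0.5, 0.5, 0.5], resid)

# With zero means everywhere, the pooled covariance matrix is the
# equally weighted average of the two.
pooled = [
    [0.5 * a + 0.5 * b for a, b in zip(row_a, row_b)]
    for row_a, row_b in zip(cov_low, cov_high)
]

# A single one-factor model with loading 0.4 on y4 and a slightly
# larger y4 residual variance (0.26, derived here).
fitted = implied_cov([0.5, 0.5, 0.5, 0.4], [0.25, 0.25, 0.25, 0.26])

max_abs_diff = max(
    abs(p - f)
    for row_p, row_f in zip(pooled, fitted)
    for p, f in zip(row_p, row_f)
)
print(max_abs_diff)
```

Under these assumptions the pooled covariance matrix is reproduced by a single one-factor model: the covariances of y4 with the other indicators average to 0.5 x 0.4, and the extra variance of y4 is taken up by its residual variance. A fit statistic based on the pooled covariance matrix would then have little misfit to detect, which is one way the misspecification can be absorbed into model parameters.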

Boliang Guo posted on Thursday, September 25, 2008  2:49 am



That is really creative code, Prof. Muthen! Following your suggestion at http://www.statmodel.com/discussion/messages/12/698.html?1118367086 about the random loading with ex9.10 (fw by y1-y4), if I adjust the code as:

    %within%
    fw by y1@0; ! just name the fw
    s1 | y1 on fw;
    s2 | y2 on fw;
    s3 | y3 on fw;
    s4 | y4 on fw;
    %between%
    s1-s4;
    [s1-s4];

my question is: assuming all loadings are random, is the mean of each loading then the same as the loading for the level-2 factor fb? Sorry, I could not get the model to converge with the ex9.10 data at the moment.


Note that you can say: fw by; No, the means of random loadings are not the same as the loading of the between-level fb. Non-convergence for these data is most likely due to the fact that the data were not created with random slopes. This has the consequence that the variances of the random slopes go to zero, and that takes a long time (many iterations). The output that you get at the end should show those variances as zero or very small, which is a suggestion that you should treat them as fixed.

Boliang Guo posted on Wednesday, October 01, 2008  7:42 am



For the multilevel ACE model (ex5.18), I want to test the random loadings following your suggestion, with the following code (knownclass, twolevel random mixture):

    Model:
    %within%
    %overall%
    [y1-y2] (1); ! within= y1 y2
    y1-y2@0;
    a1 BY; a2 BY; c1 BY; c2 BY; e1 BY; e2 BY;
    s_a1 | y1 ON a1*.7 (2); !
    s_a2 | y2 ON a2*.7 (2); !
    s_c1 | y1 ON c1*.6 (3);
    s_c2 | y2 ON c2*.6 (3);
    s_e1 | y1 ON e1*.4 (4);
    s_e2 | y2 ON e2*.4 (4);
    a1-e2@1;
    [a1-e2@0];
    a1 WITH a2@1; ! value for MZ twins
    c1 WITH c2@1;
    a1 WITH c1-e2@0;
    a2 WITH c1-e2@0;
    c1 WITH e1-e2@0;
    c2 WITH e1-e2@0;
    e1 WITH e2@0;

Boliang Guo posted on Wednesday, October 01, 2008  7:43 am



    %C#2%
    a1 WITH a2@.5; ! Value for DZ twins
    %between%
    %overall%
    [s_a1-s_a2*.7] (10); !
    [s_c1-s_c2*.6] (11);
    [s_e1-s_e2*.4] (12);
    s_a1-s_a2*.1 (13); !
    s_c1-s_c2*.1 (14);
    s_e1-s_e2*.07 (15);
    %C#2%
    ! no modeling setting

I randomly simulated 22 datasets (mcex5.18 code with varying population values) and appended them into one big dataset; var(a), var(c), and var(e) are 0.12, 0.1, and 0.09 among the population values. If I run the above code, Mplus reports:

    THE ESTIMATED WITHIN COVARIANCE MATRIX IN CLASS 1 COULD NOT BE INVERTED.
    COMPUTATION COULD NOT BE COMPLETED IN ITERATION 1.
    CHANGE YOUR MODEL AND/OR STARTING VALUES.

    THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION.
    CHANGE YOUR MODEL AND/OR STARTING VALUES.

I use the mean of the population values as starting values. What is my problem, please? I tested the data with a single-group analysis using both the 5.18 code and the knownclass mixture code, and the results are exactly the same. Why can't I get the random loadings here with the above code? Thanks.


The way you specify the model, I think there is no e variance on the Within level, and you have already fixed the y residual variance at zero. This would cause the error message. You can either not have random slopes for e, or you can use the asterisk Mplus construction that allows both within- and between-level variance: s* | y on e;

Boliang Guo posted on Wednesday, October 01, 2008  8:52 am



thanks, Prof Muthen 


