
bmuthen posted on Friday, October 25, 2002  11:39 pm



In some applications it may be more natural to define a factor as being influenced by its indicators rather than influencing them. An example is SES. Mplus handles this with the following model statements: f by y*; f@0; f on x1@1 x2 x3; Note that f by y* (that is, freeing the factor loading for the single indicator y) is the same as y on f. f has to be defined by some y. If you instead use the dummy definition f by y@0; y on f; the estimate shows up in f by y.
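The statements above can be laid out as a MODEL command. This is only a sketch of the same three statements; the comments are added for illustration and the variable names are the placeholders used in the post.

```
MODEL:
  f BY y*;            ! f is defined by the single indicator y, loading freed
  f@0;                ! residual variance of f fixed at zero
  f ON x1@1 x2 x3;    ! f is formed from x1-x3; the x1 weight is fixed at 1
```

Fixing the x1 coefficient at 1 sets the metric of f, so the weights of x2 and x3 are read relative to x1.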

Anonymous posted on Wednesday, June 09, 2004  7:09 pm



Would you elaborate on the substantive meaning of this parameterization? If the latent variable (LV) "causes" the indicators, isn't the interpretation that the correlation between the indicators is rendered spurious in the presence of the "true cause" of the indicators, the LV? Examples of this type of model would be intelligence, conservatism, etc. I'm guessing the model you're suggesting implies that the LV is "caused" by the indicators, but that correlation between the indicators is still permitted. Is this correct? I fail to see the connection between this parameterization and (in your example) SES.

bmuthen posted on Thursday, June 10, 2004  5:44 pm



Yes, having the arrows point to a factor instead of from a factor can be thought of as the LV being "caused" by the indicators. Correlations between the indicators are not modeled, but are free. The interpretation in the context of SES might be that a person's education, income, and job status produce the person's SES (the LV). A regular factor analysis model would in contrast say that a person has a latent SES (an inborn need? a lifestyle requirement?) that influences the education and job he/she gets. The substantive theory would guide the choice. As a layman, I think the former view is a bit more compelling in the context of SES, although I can imagine a complex combination of the two formulations perhaps being closer to the truth.
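The contrast between the two formulations can be written out for three indicators. This is a sketch only; the symbols (gamma, lambda, zeta, epsilon) are notation assumed here, intercepts are omitted, and the reflective line assumes errors that are uncorrelated with each other and with f.

```latex
\begin{aligned}
\text{Formative:}\quad & f = x_1 + \gamma_2 x_2 + \gamma_3 x_3 + \zeta, \qquad \operatorname{Var}(\zeta) = 0,\\
& \operatorname{Cov}(x_j, x_k)\ \text{unrestricted};\\[4pt]
\text{Reflective:}\quad & x_j = \lambda_j f + \varepsilon_j, \qquad j = 1, 2, 3,\\
& \operatorname{Cov}(x_j, x_k) = \lambda_j \lambda_k \operatorname{Var}(f) \qquad (j \neq k).
\end{aligned}
```

In the reflective case the indicator correlations are explained entirely by f. In the formative case the model places no restriction on them.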


Hi Bengt, In this case, is it true that the "f@0;" from above is only required for identification if you have only one y variable in the by statement? Jarvis, MacKenzie, and Podsakoff talk about being able to uniquely estimate the variance of f if f is a predictor of two or more latent variables. Lee

bmuthen posted on Friday, November 05, 2004  1:19 pm



Hmmm, I didn't think the residual variance of the formative factor f (f predicted by x) would be identified even if f predicts a second factor, f2 say, with f2 having a residual variance and multiple indicators (y say). I am not familiar with the paper you mention. It seems there is no information for the f residual variance: the x covariances are already saturated by this model, the xy covariances don't involve that parameter, and the y covariances identify only the total variance of f2, which cannot be separated into the f2 and f residual variances. But maybe I am missing something. You can try it out and see what Mplus says.
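The argument above can be sketched in equations for the case where f emits a single path to f2. The notation is assumed here, not taken from the thread: psi_1 and psi_2 are the residual variances of f and f2, Sigma_xx is the covariance matrix of x, Theta is the residual covariance matrix of y, and intercepts are omitted.

```latex
\begin{aligned}
f &= \gamma' x + \zeta_1, \qquad f_2 = \beta f + \zeta_2, \qquad y = \Lambda f_2 + \varepsilon,\\[4pt]
\operatorname{Cov}(y, x) &= \Lambda\, \beta\, \gamma' \Sigma_{xx},\\
\operatorname{Cov}(y) &= \Lambda \left[ \beta^2 \gamma' \Sigma_{xx} \gamma + \beta^2 \psi_1 + \psi_2 \right] \Lambda' + \Theta .
\end{aligned}
```

The two residual variances enter only through the sum beta^2 psi_1 + psi_2, so in this single-path setup they cannot be separated. The sketch does not cover the case where f predicts two or more variables.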


What would be a suitable way of getting factor scores for the formative factor? Would it be fine to save the factor scores as you do with factors with reflective indicators? Any suggestions would be appreciated. Carlos 

bmuthen posted on Monday, November 15, 2004  6:58 pm



Yes, factor scores can be obtained for any latent variable in the Mplus model. 
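A minimal sketch of requesting factor scores, assuming the rest of the input is already set up. The file name fscores.dat is a hypothetical placeholder.

```
SAVEDATA:
  FILE = fscores.dat;   ! hypothetical output file name
  SAVE = FSCORES;       ! save estimated factor scores for the latent variables
```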


Thanks! C. 

Lee Van Horn posted on Thursday, December 16, 2004  1:50 am



In reply to the previous discussion: yes, you can estimate the variance of the emergent variable. The theory and math behind this are discussed fairly extensively in a 1993 Psych Bulletin paper by MacCallum and Browne, as well as in an as-yet-unpublished paper by Bollen and Davis, "Causal indicator models: identification, estimation, and testing." The Bollen and Davis paper is especially useful for showing the necessary conditions for the identification of these models. One of those conditions is that the emergent variable must emit two paths.

hai hong li posted on Thursday, September 28, 2006  8:17 pm



Hi, I am a new user of Mplus. I am trying to incorporate causal indicators (formative measures, not MIMIC) into an SEM model. It's not clear how to write the code for this kind of problem, although I've gone through all the related discussions on the discussion board. Suppose, for example, we have a model with six measured variables, x1-x6, where x1-x3 are causal (formative) indicators of F1, x4-x6 are reflective indicators of F2, and there is a path from F1 to F2. Could you help me with the coding of such a model? Thanks!


This is described in our short course "Day 1" handout (2006 version). You say f1 by; f1 on x1@1 x2-x3; f1@0; f2 by x4-x6; f2 on f1;
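The statements in this reply can be laid out as a MODEL command. This is a sketch of those same statements only; the comments are added for illustration.

```
MODEL:
  f1 BY;               ! define f1 without reflective indicators
  f1 ON x1@1 x2-x3;    ! x1-x3 are causal indicators; the x1 weight is fixed at 1
  f1@0;                ! residual variance of f1 fixed at zero
  f2 BY x4-x6;         ! x4-x6 are reflective indicators of f2
  f2 ON f1;            ! structural path from f1 to f2
```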

Ken posted on Wednesday, July 02, 2008  2:46 am



If I have the model f BY y1 y2; f ON y3; x ON f; then I can calculate the indirect path from y3 to x, but this model doesn't seem correct, as y3 should be a measurement of f. For the model f BY y1 y2 y3; x ON f; is there some way to express how x and y3 are related? I could make statements about how y3 is significantly related to f and x is significantly related to f, but it would be clearer with a single relationship. The people I am working with want to focus on one of the indicators and its relationship to the outcome, and a latent variable model seems most appropriate.


If you want the indirect effect of x on y3, your model should be: f BY y1 y2 y3; f ON x; The indirect effect of x on y3 via f is the product of the regression coefficient for f ON x and the loading for f BY y3.
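One way to obtain this product together with a standard error is to label the two parameters and define the product in MODEL CONSTRAINT. This is a sketch rather than the approach stated in the reply; the labels lam3, gam, and ind are hypothetical names chosen here.

```
MODEL:
  f BY y1 y2
       y3 (lam3);      ! label the loading of y3
  f ON x (gam);        ! label the regression of f on x
MODEL CONSTRAINT:
  NEW(ind);            ! new parameter to hold the product
  ind = gam*lam3;      ! indirect effect of x on y3 via f
```

The loading of y3 is placed on its own line so that the label applies to that parameter alone.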


I am trying to run a MIMIC model with 7 binary causal indicators, 7 binary effect indicators, and one factor. I have 3 questions: 1) Is this the correct way to specify the model (loading for x1 fixed at 1), and do I need anything else? fac by x1-x7; fac on x8-x14; 2) Is there any way to obtain the model-implied covariances for the causal variables? 3) The output includes "Model Estimated Slopes", where the rows are the effect variables and the columns are the causal variables. Could you please explain what this is? Thank you very much.


1. Yes. See the examples in Chapter 5 and the Topic 1 course handout. 2. The model-implied covariances for the binary indicators can be obtained by asking for RESIDUAL in the OUTPUT command. 3. I think these are part of Sample Statistics. This is because, when there are covariates in the model, the sample statistics used for model estimation with WLSMV are the thresholds, probit regression coefficients, and residual correlations.
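Putting the question and answer together, the relevant parts of the input might look like the fragment below. It is a sketch under the assumption that x1-x7 are the binary effect indicators and x8-x14 the binary causal indicators, as in the question; the DATA command and the NAMES list are omitted.

```
VARIABLE:
  CATEGORICAL = x1-x7;   ! only the dependent (effect) indicators are declared categorical
MODEL:
  fac BY x1-x7;          ! effect indicators; loading of x1 fixed at 1 by default
  fac ON x8-x14;         ! causal indicators enter as covariates
OUTPUT:
  RESIDUAL;              ! request model-estimated and residual statistics
```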

Erin P posted on Tuesday, January 24, 2012  12:45 am



I'm having a hard time specifying two models. I'm not sure that all of my variances are specified properly. Any suggestions would be appreciated. MODEL 1 (induced variable model): F1 ON y1-y3; F2 BY y4; F1 BY F2; y1 WITH y2 y3; y2 WITH y3; F1@0; y4@0; MODEL 2 (common factor model): F1 BY y1-y3; F2 BY y4@1; F2 ON F1; y4@0;


You should try the analyses and if you have problems, send the output and your license number to support@statmodel.com. 


I am running a formative factor model with three indicators pointing to the latent construct. I am using the CATEGORICAL option and the complex sampling options. I have missing data and am using the default, FIML. I have been trying to figure out what the default estimator is for this model. I have looked in the manual but cannot find it. Also, is this the appropriate estimation method for this kind of model? Thank you. Usevariables are alcohol drugs mhprob pds_at2r pds_ng2r poverty total; Categorical are alcohol drugs mhprob pds_at2r pds_ng2r; Weight = NANALWT; Stratification = Stratum; Subpopulation = cbclpop=1; missing are all .; ANALYSIS: TYPE=COMPLEX; iterations=50000; PARAMETERIZATION=Theta; MODEL: par_imp by; par_imp on mhprob@1 drugs alcohol; par_imp@0; alcohol drugs mhprob on poverty; total on pds_at2r par_imp pds_ng2r; pds_at2r pds_ng2r on par_imp; alcohol with drugs; mhprob with alcohol; mhprob with drugs;


I believe the default is WLSMV. You can confirm that by looking at the output: a summary of the analysis specifications is given after the input and before the results.
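If you would rather not rely on the default, the estimator can be named in the ANALYSIS command. A minimal sketch, reusing the other ANALYSIS settings from the question above:

```
ANALYSIS:
  TYPE = COMPLEX;
  ESTIMATOR = WLSMV;          ! stated explicitly rather than left to the default
  PARAMETERIZATION = THETA;
```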


Thank you, Dr. Muthen. I have one more follow-up question. With WLSMV estimation, I get a warning that the chi-square value cannot be used for chi-square difference testing in the regular way. My understanding of DIFFTEST is that it is used in testing multigroup models, which I am not conducting. I am, on the other hand, running a formative factor model (please see the message above for the code). Should the chi-square value not be interpreted?


This is a standard message warning that you must use the DIFFTEST option if you compare nested models. With WLSMV, only the p-value should be interpreted.
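For reference, a nested-model comparison with DIFFTEST takes two runs. The fragment below is a sketch; the file name deriv.dat is a hypothetical placeholder, and each run also needs its own DATA, VARIABLE, and MODEL commands.

```
! Run 1: the less restrictive (H1) model; save the derivatives
SAVEDATA:
  DIFFTEST = deriv.dat;

! Run 2: the nested, more restrictive (H0) model; read the derivatives back
ANALYSIS:
  TYPE = COMPLEX;
  ESTIMATOR = WLSMV;
  DIFFTEST = deriv.dat;
```

The chi-square difference test is then reported in the output of the second run.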
