Formative model with an observed dependent variable ("friends")

TITLE: Hodge-Treiman social status modeling
DATA: FILE = htmimicn1.dat;
  TYPE = COVARIANCE;
  NOBS = 530;
VARIABLE: NAMES = church member friends income occup educ;
  USEV = friends-educ;
MODEL: f BY friends*;  ! defining the factor; same
                       ! as regressing friends on
                       ! f
  f@0;
  f ON income@1 occup educ;
OUTPUT: TECH1 STANDARDIZED;

Formative model with a latent dependent variable ("fy")

TITLE: Hodge-Treiman social status modeling
DATA: FILE = htmimicn1.dat;
  TYPE = COVARIANCE;
  NOBS = 530;
VARIABLE: NAMES = church members friends income occup educ;
  USEV = church-educ;
MODEL: fy BY church-friends;
  f BY fy*;
  f@0;
  f ON income@1 occup educ;

1.
.3607 1.
.2104 .2655 1.
.1002 .2845 .1763 1.
.1563 .1924 .1363 .3046 1.
.1583 .3246 .2264 .3056 .3447 1.

Hi Linda, Just a quick and maybe very silly question for now: what is the purpose of the f@0 command in these programs? Thanks, Cam


Formative factors have no residual variance: it would not be identified, so f@0 fixes it at zero.

I followed the second code segment (for latent DV "fy") and drew the directions of the variable and indicators. Can I clarify that this is a 2nd-order model, with "fy" as 1st-order factor having income, occup and educ as formative indicators, and "fy" being a reflective latent variable (indicator) of the 2nd-order factor "f"? Thanks.


I would not call this a second-order factor.

Dear Linda, How do I test an interaction when I have
- 2 formative constructs (each with multiple indicators)?
- 1 formative and 1 reflective construct?
Thanks


You can try the XWITH command. I'm not sure if it will work. 


Dear Linda, Just a quick question. In the "Formative model with a latent dependent variable" model above you set the path between income and f to one. But if that path is set to one, we cannot conclude anything about the significance of that specific path. My question is: since we cannot say anything about the significance of the path between income and f, how can we conclude that income is a formative indicator of f? I saw the same model under the "Indicator Arrows Pointing to a Factor" discussion, but setting one of the paths between the formative indicators and the factor to one is not explained there either. Thank you...


One of the paths needs to be fixed to a nonzero number for model identification. It does not necessarily need to be income. 
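The identification point can be illustrated numerically. The Python sketch below is not Mplus code and not part of the original example: it takes the income, occup and educ entries of the covariance matrix in the first post, uses made-up weights and a made-up loading, and assumes the outcome residual is uncorrelated with the indicators. The helper names are hypothetical. It shows that dividing all weights by a constant and multiplying the loading by the same constant leaves the implied covariances unchanged, which is why one weight has to be fixed and why it does not matter which one.

```python
# Covariances among income, occup and educ, copied from the covariance
# matrix in the first post (variable order: income, occup, educ).
SIGMA_X = [
    [1.0, 0.3046, 0.3056],
    [0.3046, 1.0, 0.3447],
    [0.3056, 0.3447, 1.0],
]


def implied_cov_with_outcome(weights, loading):
    """cov(y, x_j) when f = sum_k w_k * x_k (no residual) and y = loading * f + e."""
    return [
        loading * sum(w * SIGMA_X[j][k] for k, w in enumerate(weights))
        for j in range(len(weights))
    ]


def rescale(weights, loading, anchor):
    """Re-express the same model so that weights[anchor] equals 1."""
    c = weights[anchor]
    return [w / c for w in weights], loading * c


# Hypothetical values with the income weight fixed at 1.
w_income, lam_income = [1.0, 0.4, 1.5], 0.10
# The same model with the occup weight fixed at 1 instead.
w_occup, lam_occup = rescale(w_income, lam_income, anchor=1)

print(implied_cov_with_outcome(w_income, lam_income))
print(implied_cov_with_outcome(w_occup, lam_occup))
```

Both printed lists agree up to rounding, so the data cannot tell the two parameterizations apart.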

In the example "Formative model with a latent dependent variable ("fy")", how would I include a covariate (e.g. age) that I hypothesize should be related to either or both of the formative and latent variables? For example, f ON age implies that age is part of the formative variable, which I don't want: f ON income@1 occup educ age; Any suggestions would be most welcome. 


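With a formative model, this cannot be statistically distinguished: a covariate that predicts f (f ON age) is specified in exactly the same way as a formative indicator of f.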

In the example above (Formative model with an observed dependent variable), what is the purpose of fixing the residual variance of f to zero (f@0)? 


The residual variance cannot be identified. The formative approach essentially is like forming a factor by a weighted sum of the indicators where the weights are estimated, but measurement error is not parsed out. 


In the example 'Formative model with a latent dependent variable' the results depend on which path is restricted to 1:
income@1: estimate for F BY FY: 0.108 (Est./S.E.: 3.825)
occup@1: estimate for F BY FY: 0.045 (Est./S.E.: 1.726)
educ@1: estimate for F BY FY: 0.156 (Est./S.E.: 4.945)
I would like to know why the results are ambiguous and how one can select the 'right' path to set to 1. Thanks in advance.


Please send the input, data, output including the STANDARDIZED option of the OUTPUT command, and your license number to support@statmodel.com. 


Dear Linda, I used the input and data from the example above. I only changed the path which is restricted to 1. The standardized estimates for F BY FY are the same in the three cases, but I cannot tell definitively whether the coefficient is significant.


Please send the files and your license number so I don't have to put them together. 
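One possible reading of the three F BY FY estimates reported above (0.108, 0.045, 0.156) is that the three runs are rescalings of a single model. Under that assumption, which the thread does not confirm, the loading in each run equals the effect on fy of whichever indicator has its weight fixed at 1, and the ratios of the loadings give the relative weights. The Python sketch below only does that arithmetic; the function name is hypothetical.

```python
# F BY FY estimates reported above, keyed by the indicator whose weight
# was fixed at 1 in that run.
reported = {"income": 0.108, "occup": 0.045, "educ": 0.156}


def implied_weights(estimates, anchor):
    """Weights relative to the anchor, assuming the runs are rescalings of one model."""
    return {name: est / estimates[anchor] for name, est in estimates.items()}


weights = implied_weights(reported, "income")
for name, w in weights.items():
    # loading (income run) * weight gives back the loading of the run
    # in which `name` was the anchored indicator
    print(name, round(w, 3), round(reported["income"] * w, 3))
```

On this reading the weights relative to income are roughly 0.42 for occup and 1.44 for educ, and the differing Est./S.E. values would reflect that each run tests a different quantity, while the standardized solution is the same.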


Can someone help me out with this, please? I'm trying to define Family's SES based on the family's poverty level (POV), the female's SES (FSES), and the male's SES (MSES). Can I define the following formative factor?
POV is nominal (richest - poorest)
education, occupation -> MSES
education, occupation, employment -> FSES
POV, MSES, FSES -> Family's SES
Where can I find literature and/or annotated syntax for formative factors? Thanks a lot


For the nominal variable, you need to create dummy variables. See the Topic 1 course handout for inputs for formative factors. 


Thank you very much. 


Sorry, I need to be sure of what you're saying. For my model POV, MSES, FSES -> FSES, not only POV is nominal. Education and occupation are nominal too. Employment is dichotomous. So, should I create dummy variables for all my nominal variables or just for POV? And what is the reason for doing that? Thank you so much for your help.


Any nominal variable with more than two categories must be turned into a set of dummy variables. Formative factors are specified using ON. Variables on the right-hand side of ON are covariates in a regression and must be binary (dummy) or continuous as in regular regression.
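As a generic illustration of the dummy coding described here, the Python sketch below turns a nominal variable with k categories into k-1 binary columns, leaving one category out as the reference, as in ordinary regression. The category labels, the function name and the choice of reference category are all hypothetical; this is data preparation outside Mplus, not Mplus syntax.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per non-reference category."""
    categories = sorted(set(values) - {reference})
    return {
        f"is_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }


# Hypothetical three-category poverty variable for five families.
pov = ["poor", "middle", "rich", "middle", "poor"]
dummies = dummy_code(pov, reference="poor")
print(dummies)
```

The resulting binary columns are what would appear on the right-hand side of ON in place of the original nominal variable.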


Understood. Thank you so much. 


I want to do a CFA with a latent variable (kks) and four formative indicators (x1 x2 x3 x4), so I wrote my program:
DATA: FILE IS 2.dat;
VARIABLE: NAMES ARE x1-x4;
MODEL: kks ON x1 x2 x3 x4;
But it doesn't work. The results show:
*** ERROR in MODEL command
Unknown variable(s) in an ON statement: KKS
I don't know how to correct my mistake. Please help me. Thank you so much.


Please see short course Topic 1 starting with Slide 238 for the proper specification of formative indicators. 


Thank you so much.... 


Hi, is it possible in Mplus to standardize the variance of the formatively measured latent by fixing its variance to unity (instead of fixing one of the slopes)? If yes... how? Thanks, Alex


I don't think this model is identified. 


Hi Linda, and thanks for your thoughts. Let's assume an identified model where the formatively measured latent has 3 cause indicators and is directly connected to 2+ endogenous variables. In order to set a scale for the formative latent one has the following options (e.g., Edwards, 2001, p. 161):
1. either fix the paths leading to or from the formative construct to 1 (like the examples above with SES, but not necessarily with zeta set to 0), or
2. fix the variance of the construct to unity, thereby standardizing the construct.
I would prefer the second option because I want to test the S.E. for all paths. Franke et al. (2008) show that the effects in the model change depending on the scaling method. (They used LISREL, and I have read somewhere that it is also possible with ramon, but I was curious about Mplus.)
Edwards (2001): http://orm.sagepub.com/cgi/content/abstract/4/2/144
Franke, Rigdon, Preacher (2008), in the Journal of Business Research special issue on formative constructs


You can constrain the variance of the construct to unity using Model Constraint, where you express the variance of the construct in terms of model parameters and the formative indicators' sample covariance matrix and set the construct variance at 1. Note that in Mplus you can test all paths even if you fix a slope at 1; this is because Mplus gives you SEs also for the standardized coefficients. Because different scaling settings lead to different results I think it might be more straightforward to set the construct residual variance at zero, acknowledging that we don't have information on it.
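The arithmetic behind such a constraint can be sketched as follows. With the residual variance fixed at zero, the construct is a weighted sum of its indicators, so its variance is w'Σw, where Σ is the indicators' covariance matrix. The Python below is only an illustration of that formula, not Mplus MODEL CONSTRAINT syntax: Σ is taken from the income, occup and educ entries of the matrix in the first post, the weights are made up, and the function names are hypothetical.

```python
import math

# Covariances among income, occup and educ from the first post.
SIGMA_X = [
    [1.0, 0.3046, 0.3056],
    [0.3046, 1.0, 0.3447],
    [0.3056, 0.3447, 1.0],
]


def composite_variance(weights, sigma=SIGMA_X):
    """Var(f) = w' * Sigma * w for f = sum_k w_k * x_k with zero residual variance."""
    n = len(weights)
    return sum(
        weights[j] * sigma[j][k] * weights[k]
        for j in range(n)
        for k in range(n)
    )


def standardize_weights(weights, sigma=SIGMA_X):
    """Rescale the weights so that the composite has variance 1."""
    sd = math.sqrt(composite_variance(weights, sigma))
    return [w / sd for w in weights]


weights = [1.0, 0.4, 1.5]  # hypothetical weights, first one fixed at 1
unit_weights = standardize_weights(weights)
print(composite_variance(weights), composite_variance(unit_weights))
```

Setting this expression equal to 1 is the constraint described above; fixing one weight at 1 and fixing the variance at 1 differ only by the common factor computed in standardize_weights.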


Many thanks! (I will try to model the variance using the model constraint. I was using version 4.1 where the stdxy; option is not implemented; it seems that the consequences of different scalings could be a good case for a new monte carlo study ) 

Dear Drs. Muthen, I'm building an SEM involving a formative indicator (CMAT1) predicting a latent outcome variable (LPOS). When I ran the model specifying one of the composite paths to 1 (CMAT1 on AGE @1 MARRY EDU;) and freely estimating the link between the outcome variable and the composite variable (LPOS on CMAT1;), as in METHOD 1 below, the model would not converge. However, when I slightly changed the model so that I freely estimated all the composite paths (CMAT1 on AGE MARRY EDU;) but set the link between the composite indicator and the outcome variable to 1 (LPOS ON CMAT1 @1;), as in METHOD 2 below, the model was identified. I see that METHOD 2 is slightly different from what you have advised here and in your handout. Am I doing anything wrong? (MARRY is a dichotomous variable, if that makes any difference.) Thank you, Lewina

***METHOD 1***
LPOS by LSAT MCS;
CMAT1 by;
CMAT1 on AGE @1 MARRY EDU;
CMAT1 @0;
LPOS on CMAT1;

***METHOD 2***
LPOS by LSAT MCS;
CMAT1 by;
CMAT1 on AGE MARRY EDU;
CMAT1 @0;
LPOS on CMAT1 @1;


I wonder if the problem is that lpos is not identified. Try playing with that. 

Thanks, Linda. Would you consider METHOD 1 & METHOD 2 above equivalent? If METHOD 2 allows the model to converge (despite the slight departure from your suggested approach in the handouts & on this forum), can I go along with the results? Thank you, Lewina 


No, the two methods are not the same. You need to fix one indicator to one.


Hello, To evaluate formative constructs, the p values for the weights are one criterion for the relevance of the composite indicators. How can I determine the weights (and p values) used to calculate each of the composite indicators? The manual demos setting weights, which is not what I need to do. I have modeled a reflective-formative construct as below:

Analysis: Estimator = mlr; !for possible skew or nonnormality in raw data
MODEL:
A by X1@1 X2* X3* ; ! create the reflective lv "A"
B by Y1@1 Y2*; ! create the reflective lv "B"
C by Z1@1 Z2* Z3*;
Q by ; !create the form. lv "Q"
!D is a single indicator construct for Q
Q ON A@.42 B@.34 C@.38 D@.35 ; ! specify the form. lv "Q" use pop values or make equal
A with B; ! check r between 1st order lvs
A with C; !etc
Q@0; ! set Q residual var to 0
[Q@0]; ! set Q mean at 0

thanks!


You fix one of the weights at 1 and free the rest. You fix the residual variance of Q at 0 as you have done. Note that [q@0]; doesn't fix the mean of Q at zero, but its intercept (since Q is a DV). 
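The distinction between the intercept and the mean of a dependent variable can be shown with simple arithmetic: if Q is regressed on covariates, its mean is the intercept plus the weighted sum of the covariate means. The Python sketch below uses made-up weights and covariate means and a hypothetical function name; it is not Mplus output.

```python
def factor_mean(intercept, weights, covariate_means):
    """E[Q] = intercept + sum_k weight_k * E[x_k] for Q regressed on x_1..x_K."""
    return intercept + sum(w * m for w, m in zip(weights, covariate_means))


# Hypothetical weights (first one fixed at 1) and covariate means.
weights = [1.0, 0.8, 0.9, 0.7]
means = [2.0, 3.0, 1.0, 4.0]

# Intercept fixed at zero, as [Q@0] does; the mean of Q is still not zero.
print(factor_mean(0.0, weights, means))
# With all covariate means at zero, the mean of Q equals the intercept.
print(factor_mean(0.0, weights, [0.0, 0.0, 0.0, 0.0]))
```

So a zero intercept implies a zero mean for Q only when the weighted covariate means sum to zero.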


Hello. I have a model where a latent variable predicts the 2 latent cause indicators of a formative model and the formative variable predicts 2 latent variables.
SL -> U
SL -> B
U & B -> I (formative model)
I -> C
I -> T
How can I tell if this model is identified? Mplus is able to estimate this model but I'm not sure it is identified.


Syntax is:
SL BY SL2 SL25;
UNIQ BY Unique4 Unique6;
PIS BY PIS1 PIS5;
UNIQ with PIS;
Inc by;
Inc@0;
Inc on Uniq@1 PIS@1;
CRT by Crt2 Crt4;
TC by TeamCit1 TeamCit3;
TC ON INC;
CRT ON INC;
UNIQ ON SL;
PIS ON SL;


I assume "PIS" plays the role of "B" in your first notation. The model is identified and you should free the coefficients for PIS in Inc on Uniq@1 PIS@1; Please post in one window only. 

Hello, I have a model where 2 latent variables predict 1 formative variable. This is the input setup:
A by; ! create formative variable
A@0;
A on y1@1 y2 y3 y4 y5;
B by x1 x2 x3 x4; ! create two latent variables
C by z1 z2 z3 z4;
A on B C;
My model is not identified. PROBLEM INVOLVING THE FOLLOWING PARAMETER: A on B. Is this setup correct? Thank you very much.


The setup is correct, but a formative factor needs to predict something observed for the model to be identified, that is, you need W ON A; 


Hello, I have tried to run a formative indicator model using the code from Topic 1 slide 238 and it would not run. I was using raw data; do I have to use a covariance matrix to run this code? Second, how do you interpret the output for a formative indicator? Can you recommend any articles? Thanks, Heather


No, you can use raw data. You can send your output and license number to Support for a diagnosis of the problem. Bollen & Bauldry (2011) in Psych Methods provides a discussion and references. 

I am curious about whether it is appropriate to use information measures like AIC/BIC to compete formative and reflective model specifications against one another (when the two models are based on the same items). Also assuming a sufficiently large sample to warrant the use of ML, is there any added benefit to using BSEM to estimate such a model and relying on DIC instead? 


The problem with AIC/BIC is that they are based on the likelihood which doesn't have the same set of DVs in the two models, so it is not in the same metric. For the formative model the "indicators" are covariates (influencing the factor) whereas for the reflective model they are DVs (influenced by the factor). BSEM would be of interest if there are say direct effects from some of the formative indicators to the distal outcome that the formative factor predicts. 

Thanks very much. 


Come to think of it, I think you can bring the formative indicators into the model by mentioning their variances. That changes their status from "x's" to "y's" in the Mplus thinking  and therefore you have the same set of DVs in both the formative and reflective model. 
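The point about likelihood metrics can be illustrated with a toy case of one x and one y under normality. The maximized log-likelihood for y given x (x treated as a covariate) differs from the maximized joint log-likelihood for (y, x) by exactly the marginal log-likelihood of x. The Python sketch below uses made-up data and hypothetical function names; it illustrates the decomposition only and is not how Mplus computes anything.

```python
import math


def normal_loglik_at_mle(variance, n):
    """Maximized log-likelihood of n observations from a normal with ML variance."""
    return -0.5 * n * (math.log(2 * math.pi * variance) + 1)


def loglik_pieces(x, y):
    """Return (conditional y|x, marginal x, joint (y, x)) maximized log-likelihoods."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    resid_var = syy - sxy ** 2 / sxx  # ML residual variance of y given x
    conditional = normal_loglik_at_mle(resid_var, n)  # x as a covariate
    marginal_x = normal_loglik_at_mle(sxx, n)  # x as a DV on its own
    det = sxx * syy - sxy ** 2  # determinant of the ML covariance matrix
    joint = -0.5 * n * (2 * math.log(2 * math.pi) + math.log(det) + 2)
    return conditional, marginal_x, joint


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3]
conditional, marginal_x, joint = loglik_pieces(x, y)
print(conditional, conditional + marginal_x, joint)
```

The first number is on a different scale from the last two, which agree; AIC/BIC built on the first and on the last are therefore not comparable unless the x's enter the likelihood in both models.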

That is a good thought. Thanks so much for your helpful replies. 


What are the implications in a formative model if the indicators are scored zero and one and the data are nonnormal? I ask because the paths in my formative model were nonsignificant and one was negative, and we are thinking it might be because of the data assumptions for formative modeling. Thanks, Heather


Binary formative indicators are fine. Because they are covariates (IVs) their distribution doesn't matter. So that's not the reason for your results. 

