

Formative model with an observed dependent variable ("friends")

TITLE:    Hodge-Treiman social status modeling
DATA:     FILE = htmimicn1.dat;
          TYPE = COVARIANCE;
          NOBS = 530;
VARIABLE: NAMES = church member friends income occup educ;
          USEV = friends-educ;
MODEL:    f BY friends*;   ! defining the factor; same
                           ! as regressing friends on f
          f@0;
          f ON income@1 occup educ;
OUTPUT:   TECH1 STANDARDIZED;

Formative model with a latent dependent variable ("fy")

TITLE:    Hodge-Treiman social status modeling
DATA:     FILE = htmimicn1.dat;
          TYPE = COVARIANCE;
          NOBS = 530;
VARIABLE: NAMES = church members friends income occup educ;
          USEV = church-educ;
MODEL:    fy BY church-friends;
          f BY fy*;
          f@0;
          f ON income@1 occup educ;

1.
.3607 1.
.2104 .2655 1.
.1002 .2845 .1763 1.
.1563 .1924 .1363 .3046 1.
.1583 .3246 .2264 .3056 .3447 1.

Cam McIntosh posted on Thursday, February 16, 2006 - 5:22 pm



Hi Linda, Just a quick and maybe very silly question for now: what is the purpose of the f@0 command in these programs? Thanks, Cam


Formative factors have no residual variance; it would not be identified, so it is fixed at zero with f@0.

Robin posted on Friday, July 21, 2006 - 8:22 am



I followed the second code segment (for latent DV "fy") and drew the directions of the variable and indicators. Can I clarify that this is a 2nd-order model, with "fy" as a 1st-order factor having income, occup and educ as formative indicators, and "fy" being a reflective latent variable (indicator) of the 2nd-order factor "f"? Thanks.


I would not call this a second-order factor.

Son K. Lam posted on Saturday, April 07, 2007 - 8:33 am



Dear Linda, How do I test an interaction when I have
- 2 formative constructs (each with multiple indicators)?
- 1 formative and 1 reflective construct?
Thanks


You can try the XWITH command. I'm not sure if it will work. 


Dear Linda, Just a quick question. In the "Formative model with a latent dependent variable" example above you set the path between income and f to one. But if that path is set to one, we cannot conclude anything about the significance of that specific path. My question is: since we cannot say anything about the significance of the path between income and f, how can we conclude that income is a formative indicator of f? I saw the same model under the "Indicator Arrows Pointing to a Factor" discussion, but setting one of the paths between the formative indicators and the factor to one is not explained there either. Thank you...


One of the paths needs to be fixed to a nonzero number for model identification. It does not necessarily need to be income. 
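As a sketch of what this means in practice, the MODEL command from the latent-DV example above could fix the occup path instead of the income path. This only illustrates the reply; it is not a recommendation of which indicator to use as the reference, and the rest of the input is assumed unchanged.

```mplus
MODEL:  fy BY church-friends;
        f BY fy*;
        f@0;
        ! reference indicator switched from income to occup;
        ! the income and educ slopes are now freely estimated
        f ON income occup@1 educ;
OUTPUT: STANDARDIZED;
```

As a later reply in this thread notes, the standardized solution also reports standard errors for the path that is fixed for scaling.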

Gareth posted on Tuesday, September 04, 2007 - 3:50 am



In the example "Formative model with a latent dependent variable ("fy")", how would I include a covariate (e.g. age) that I hypothesize should be related to either or both of the formative and latent variables? For example, f ON age implies that age is part of the formative variable, which I don't want: f ON income@1 occup educ age; Any suggestions would be most welcome. 


With a formative model, this cannot be statistically distinguished. 

Gareth posted on Monday, December 10, 2007 - 9:22 am



In the example above (Formative model with an observed dependent variable), what is the purpose of fixing the residual variance of f to zero (f@0)? 


The residual variance cannot be identified. The formative approach essentially is like forming a factor by a weighted sum of the indicators where the weights are estimated, but measurement error is not parsed out. 
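In equation form, using the three indicators from the examples at the top of the thread (the symbols \gamma and \zeta are notation introduced here, not taken from the original posts), the formative factor can be written roughly as:

```latex
f = 1 \cdot \text{income} + \gamma_{2}\,\text{occup} + \gamma_{3}\,\text{educ} + \zeta,
\qquad \operatorname{Var}(\zeta) = 0 .
```

The statement f@0 is what sets Var(\zeta) to zero, so f becomes an exact weighted sum of its indicators, with the first weight fixed at 1 to set the scale.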


In the example 'Formative model with a latent dependent variable' the results depend on which path is restricted to 1:
income@1: estimate for F BY FY: 0.108 (Est./S.E.: 3.825)
occup@1: estimate for F BY FY: 0.045 (Est./S.E.: 1.726)
educ@1: estimate for F BY FY: 0.156 (Est./S.E.: 4.945)
I would like to know why the results are ambiguous and how one can select the 'right' path to set to 1. Thanks in advance.


Please send the input, data, output including the STANDARDIZED option of the OUTPUT command, and your license number to support@statmodel.com. 


Dear Linda, I used the input and data from the example above. I only changed the path which is restricted to 1. The standardized estimates for F BY FY are the same in the three cases, but I cannot definitively assess whether the coefficient is significant.


Please send the files and your license number so I don't have to put them together. 


Can someone help me out with this, please? I'm trying to define the family's SES based on the family's poverty level (POV), the female's SES (FSES), and the male's SES (MSES). Can I define the following formative factors? POV is nominal (richest - poorest).
education occupation -> MSES
education occupation employment -> FSES
POV MSES FSES -> Family's SES
Where can I find literature and/or annotated syntax for formative factors? Thanks a lot


For the nominal variable, you need to create dummy variables. See the Topic 1 course handout for inputs for formative factors. 


Thank you very much. 


Sorry, I need to be sure of what you're saying. For my model
POV MSES FSES -> FSES
not only POV is nominal. Education and occupation are nominal too. Employment is dichotomous. So, should I create dummy variables for all my nominal variables or just for POV? And what is the reason to do that? Thank you so much for your help


Any nominal variable with more than two categories must be turned into a set of dummy variables. Formative factors are specified using ON. Variables on the right-hand side of ON are covariates in a regression and must be binary (dummy) or continuous as in regular regression.
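A minimal sketch of such dummy coding, assuming a hypothetical three-category pov coded 1, 2, 3 with category 1 as the reference. The variable names, the outcome y, and the treatment of mses and fses as observed scores are all illustrative assumptions, and missing values on pov would need separate handling.

```mplus
VARIABLE: NAMES = pov mses fses y;
          ! variables created in DEFINE go last on the list
          USEVARIABLES = mses fses y pov2 pov3;
DEFINE:   IF (pov EQ 2) THEN pov2 = 1;
          IF (pov NE 2) THEN pov2 = 0;
          IF (pov EQ 3) THEN pov3 = 1;
          IF (pov NE 3) THEN pov3 = 0;
MODEL:    ses BY;                        ! formative factor
          ses ON mses@1 fses pov2 pov3;  ! one weight fixed for scaling
          ses@0;
          y ON ses;                      ! outcome predicted by ses
```

A nominal variable with k categories yields k - 1 dummies, so the two dummies here stand in for the three-category pov.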


Understood. Thank you so much. 


I want to do a CFA with a latent variable (kks) and four formative indicators (x1 x2 x3 x4), so I wrote my program:
DATA: FILE IS 2.dat;
VARIABLE: NAMES ARE x1-x4;
MODEL: kks ON x1 x2 x3 x4;
But it doesn't work. The results show:
*** ERROR in MODEL command
Unknown variable(s) in an ON statement: KKS
I don't know how to correct my mistake. Please help me. Thank you so much.


Please see short course Topic 1 starting with Slide 238 for the proper specification of formative indicators. 


Thank you so much.... 


Hi, is it possible in Mplus to standardize the variance of the formatively measured latent by fixing its variance to unity (instead of fixing one of the slopes)? If yes... how? Thanks, Alex


I don't think this model is identified. 


Hi Linda, and thanks for your thoughts. Let's assume an identified model where the formatively measured latent has 3 cause indicators and is directly connected to 2+ endogenous variables. In order to set a scale for the formative latent one has the following options (e.g., Edwards, 2001, p. 161):
1. either fix the paths leading to or from the formative construct to 1 (like the examples above with SES, but not necessarily with zeta set to 0) or
2. fix the variance of the construct to unity, thereby standardizing the construct.
I would prefer the second option because I want to test the S.E. for all paths. Franke et al. (2008) show that the effects in the model change depending on the scaling method (they used LISREL, and I have read somewhere that it is also possible with ramon, but I was curious about Mplus).
Edwards (2001): http://orm.sagepub.com/cgi/content/abstract/4/2/144
Franke, Rigdon, Preacher (2008), Journal of Business Research, special issue on formative constructs


You can constrain the variance of the construct to unity using Model Constraint, where you express the variance of the construct in terms of model parameters and the formative indicators' sample covariance matrix and set the construct variance at 1. Note that in Mplus you can test all paths even if you fix a slope at 1; this is because Mplus gives you SEs also for the standardized coefficients. Because different scaling settings lead to different results I think it might be more straightforward to set the construct residual variance at zero, acknowledging that we don't have information on it.
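A rough sketch of the Model Constraint approach for the observed-DV example at the top of the thread. It assumes the lower-triangular matrix listed there is the content of htmimicn1.dat and is analyzed as given, so the three indicators have unit variances and covariances .3046 (income, occup), .3056 (income, educ), and .3447 (occup, educ); any rescaling Mplus applies to the input matrix is ignored. The labels g1-g3 and the starting values are introduced here for illustration. With the residual variance at zero, Var(f) = g'Sg, where S is the indicators' sample covariance matrix.

```mplus
MODEL:  f BY friends*;
        f@0;
        ! all three slopes free, labeled g1-g3;
        ! starting values of .5 are an arbitrary choice
        f ON income*.5 occup*.5 educ*.5 (g1-g3);
MODEL CONSTRAINT:
        ! Var(f) = g'Sg set to 1 (residual variance of f is 0)
        0 = g1**2 + g2**2 + g3**2
            + 2*(.3046*g1*g2 + .3056*g1*g3 + .3447*g2*g3)
            - 1;
```

With raw data, the sample variances and covariances of the indicators would have to be computed first and typed into the constraint in place of the numbers above.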


Many thanks! (I will try to model the variance using the model constraint. I was using version 4.1, where the STDYX option is not implemented; it seems that the consequences of different scalings could be a good case for a new Monte Carlo study.)

Lewina Lee posted on Monday, August 27, 2012 - 3:33 pm



Dear Drs. Muthen, I'm building an SEM involving a formative indicator (CMAT1) predicting a latent outcome variable (LPOS). When I ran the model fixing one of the composite paths to 1 (CMAT1 on AGE @1 MARRY EDU;) and freely estimating the link between the outcome variable and the composite variable (LPOS on CMAT1;), as in METHOD 1 below, the model would not converge. However, when I slightly changed the model so that I freely estimated all the composite paths (CMAT1 on AGE MARRY EDU;) but set the link between the composite indicator and the outcome variable to 1 (LPOS ON CMAT1 @1;), as in METHOD 2 below, the model was identified. I see that METHOD 2 is slightly different from what you have advised here and in your handout. Am I doing anything wrong? (MARRY is a dichotomous variable, if that makes any difference.) Thank you, Lewina

***METHOD 1***
LPOS by LSAT MCS;
CMAT1 by;
CMAT1 on AGE @1 MARRY EDU;
CMAT1 @0;
LPOS on CMAT1;

***METHOD 2***
LPOS by LSAT MCS;
CMAT1 by;
CMAT1 on AGE MARRY EDU;
CMAT1 @0;
LPOS on CMAT1 @1;


I wonder if the problem is that lpos is not identified. Try playing with that. 

Lewina Lee posted on Tuesday, August 28, 2012 - 8:35 am



Thanks, Linda. Would you consider METHOD 1 & METHOD 2 above equivalent? If METHOD 2 allows the model to converge (despite the slight departure from your suggested approach in the handouts & on this forum), can I go along with the results? Thank you, Lewina 


No, the two methods are not the same. You need to fix one indicator to one.


Hello, To evaluate formative constructs, the p-values for the weights are one criterion for the relevance of the composite indicators. How can I determine the weights (and p-values) used to calculate each of the composite indicators? The manual demos setting weights, which is not what I need to do. I have modeled a reflective-formative construct as below:

Analysis: Estimator = mlr; ! for possible skew or nonnormality in raw data
MODEL:
A by X1@1 X2* X3*; ! create the reflective lv "A"
B by Y1@1 Y2*; ! create the reflective lv "B"
C by Z1@1 Z2* Z3*;
Q by; ! create the form. lv "Q"
! D is a single indicator construct for Q
Q ON A@.42 B@.34 C@.38 D@.35; ! specify the form. lv "Q"; use pop values or make equal
A with B; ! check r between 1st order lvs
A with C; ! etc
Q@0; ! set Q residual var to 0
[Q@0]; ! set Q mean at 0

Thanks!


You fix one of the weights at 1 and free the rest. You fix the residual variance of Q at 0 as you have done. Note that [q@0]; doesn't fix the mean of Q at zero, but its intercept (since Q is a DV). 


Hello. I have a model where a latent variable predicts the 2 latent cause indicators of a formative model and the formative variable predicts 2 latent variables.
SL -> U
SL -> B
U & B -> I (Formative model)
I -> C
I -> T
How can I tell if this model is identified? Mplus is able to estimate this model but I'm not sure it is identified.


Syntax is:
SL BY SL2 SL25;
UNIQ BY Unique4 Unique6;
PIS BY PIS1 PIS5;
UNIQ with PIS;
Inc by;
Inc@0;
Inc on Uniq@1 PIS@1;
CRT by Crt2 Crt4;
TC by TeamCit1 TeamCit3;
TC ON INC;
CRT ON INC;
UNIQ ON SL;
PIS ON SL;


I assume "PIS" plays the role of "B" in your first notation. The model is identified and you should free the coefficients for PIS in Inc on Uniq@1 PIS@1; Please post in one window only. 

j guo posted on Wednesday, July 02, 2014 - 12:03 am



Hello, I have a model where 2 latent variables predict 1 formative variable. This is the input setup:
A by; ! create formative variable
A@0;
A on y1@1 y2 y3 y4 y5;
B by x1 x2 x3 x4; ! create two latent variables
C by z1 z2 z3 z4;
A on B C;
My model is not identified. PROBLEM INVOLVING THE FOLLOWING PARAMETER: A on B. Is this setup correct? Thank you very much.


The setup is correct, but a formative factor needs to predict something observed for the model to be identified, that is, you need W ON A; 
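Applied to the input above, the fix might look like the following sketch. W is a placeholder name taken from the reply; it stands for some observed outcome that would have to exist in the data set.

```mplus
MODEL:  A BY;                      ! create formative variable
        A@0;
        A ON y1@1 y2 y3 y4 y5;
        B BY x1 x2 x3 x4;          ! two reflective latent variables
        C BY z1 z2 z3 z4;
        A ON B C;
        W ON A;                    ! observed outcome predicted by A
```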


Hello, I have tried to run a formative indicator model using the code from Topic 1 slide 238 and it would not run. I was using raw data; do I have to use a covariance matrix to run this code? Second, how do you interpret the output for a formative indicator? Can you recommend any articles? Thanks, Heather


No, you can use raw data. You can send your output and license number to Support for a diagnosis of the problem. Bollen & Bauldry (2011) in Psych Methods provides a discussion and references. 

Ed Maguire posted on Thursday, March 12, 2015 - 2:10 pm



I am curious about whether it is appropriate to use information measures like AIC/BIC to compete formative and reflective model specifications against one another (when the two models are based on the same items). Also assuming a sufficiently large sample to warrant the use of ML, is there any added benefit to using BSEM to estimate such a model and relying on DIC instead? 


The problem with AIC/BIC is that they are based on the likelihood which doesn't have the same set of DVs in the two models, so it is not in the same metric. For the formative model the "indicators" are covariates (influencing the factor) whereas for the reflective model they are DVs (influenced by the factor). BSEM would be of interest if there are say direct effects from some of the formative indicators to the distal outcome that the formative factor predicts. 

Ed Maguire posted on Thursday, March 12, 2015 - 9:20 pm



Thanks very much. 


Come to think of it, I think you can bring the formative indicators into the model by mentioning their variances. That changes their status from "x's" to "y's" in the Mplus thinking, and therefore you have the same set of DVs in both the formative and reflective model.
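For the observed-DV example at the top of the thread, mentioning the variances might look like the sketch below. Writing out the covariances among the indicators is an addition made here for explicitness and may be redundant with the Mplus defaults.

```mplus
MODEL:  f BY friends*;
        f@0;
        f ON income@1 occup educ;
        ! mentioning the variances brings the formative
        ! indicators into the model as "y" variables
        income occup educ;
        ! covariances among the indicators, written out explicitly
        income WITH occup educ;
        occup WITH educ;
```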

Ed Maguire posted on Friday, March 13, 2015 - 9:57 am



That is a good thought. Thanks so much for your helpful replies. 


What are the implications in a formative model if the indicators are scored zero and one and the data are nonnormal? I ask because the paths in my formative model came out nonsignificant and one came out negative, and we are thinking it might be because of the data assumptions for formative modeling. Thanks, Heather


Binary formative indicators are fine. Because they are covariates (IVs) their distribution doesn't matter. So that's not the reason for your results. 

MSP posted on Wednesday, October 21, 2015 - 12:32 pm



Hello. I'm currently learning SEM through Mplus. I would like to ask the following:

1) Is it possible to have multiple formative factors with common causal indicators, with a binary (observed) DV then regressed on those formative factors? Like if my exogenous variables would be through an ESEM syntax but the arrows are reversed? Originally, I had:
f1-f3 by x1-x9 (*1);
out on f1-f3; ! out is binary
How do I reverse the arrows for f1-f3? Or do I have to specify causal indicators for each formative factor? If so, can f1 and f2 both have, say, x2 as an indicator?
f1 by; f1 on x1@1 x2 x3; f1@0;
f2 by; f2 on x4@1 x5 x6; f2@0;
f3 by; f3 on x7@1 x8 x9; f3@0;
out on f1-f3;
With this I either end up with negative dfs, or SEs could not be computed.

2) How do you interpret fit indices/parameters for a model with formative factors? How is it different from traditional SEM?

Any response is much appreciated.


1) Yes. How to do the Mplus input is shown in the handout for Topic 1 short course on our website. 2) The formative indicators fit perfectly among themselves since no structure is imposed on their correlations (so unlike FA). But you get ordinary fit indices because the model imposed restrictions on the correlations between the formative indicators and the outcome (your "out"). 

MSP posted on Thursday, October 22, 2015 - 10:49 am



Thanks for the swift response. I tried doing the syntax for a model with formative factors, but I either get a "negative df" error, or a "standard error could not be computed" error. After much tweaking, I got a normal termination and some parameter estimates, but I would like to consult on these (weird?) fit indices. Any suggestion on what important result output to look for in this formative model? Here's my model:
f1 by; f1 on x1@1 x2 x3 x4 x5; f1@0;
f2 by; f2 on x6@1 x7 x8; f2@0;
f3 by; f3 on x9@1; f3@0;
out on f1-f3; ! out is binary variable
Output:
# of free parameters: 10
Chi-Square Test of Model Fit: Value = 0, df = 0, p-value = 0
RMSEA: 0, prob = 0
CFI/TLI: 1.0
Chi-Square Test of Model Fit for the Baseline Model: 196.689, df = 9, p-value = 0
WRMR = 0.062
Any insight on this is much appreciated. Should I fix the model? Do I still stick with formative factors? Or just use reflective factors for my measurement model, and regress my binary DV (out) on them? Would that seem circular? Thanks.


I forgot that df=0 in this case so no fit measure available. It's because there are as many slopes of "out" regressed directly on the x covariates as there are free formative coefficients plus the coefficients for out regressed on the 3 f's. No restriction is imposed. 
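The parameter count behind this can be checked by hand for the model as posted, with 5, 3, and 1 formative indicators. The tally below is a reader's reconstruction; attributing the tenth free parameter in the posted output to the threshold of the binary outcome is an assumption.

```latex
\underbrace{5 + 3 + 1}_{\text{slopes of out on } x_1,\dots,x_9} = 9
\qquad\text{versus}\qquad
\underbrace{(5-1) + (3-1) + (1-1)}_{\text{free formative coefficients}}
+ \underbrace{3}_{\text{out on } f_1,\, f_2,\, f_3} = 9
```

Both counts are 9, so the formative specification places no restriction on the regression of out on the x's, which gives df = 0.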

MSP posted on Friday, October 23, 2015 - 7:04 am



I see. Thanks for the response and guidance. So just to clarify: is it fair to say that, for a model with formative factors like this one, one should interpret only the parameter estimates in the model results? Or is there a fit index to look at? How should the model be evaluated?


Q1. Yes. Q2. Not in your case. Q3. Not possible in your case; it is just a restatement of regression analysis.

Tyler Mason posted on Wednesday, November 11, 2015 - 11:02 am



Hello, I have a model with a formative construct (SES) comprised of income, education, and occupation. I would like to regress the formative construct on race (0 = white, 1 = black). Is this possible to do in mplus? 


I don't think that is an identified model because a formative construct needs to influence some other variable. 

Tyler Mason posted on Wednesday, November 11, 2015 - 6:32 pm



Sorry for leaving this out but the formative construct will be predicting several other variables in the model. 


That should work then. I think Bollen has written about this type of modeling. 


Hi Dr. Muthen, I am modeling formative measures where f1 is a formative measure with two indicators and f2 is another formative indicator with two measures and then f is a formative indicator with both f1 and f2. Here's my code:
f1 by;
f1 ON FPT1@1 FPT2;
f2 by;
f2 ON FPT3@1 FPT4 FPT5;
f by;
f ON f1@1 f2;
f@0;
fwc FPT_maj on f;
However, I get the following error:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 10, F2
Do you know why the model is not identified?


You want to add f1@0; f2@0; 
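With those two statements added, the posted MODEL command would read as follows (a sketch that changes nothing else in the input):

```mplus
MODEL:  f1 BY;
        f1 ON FPT1@1 FPT2;
        f1@0;                     ! added: residual variance of f1 fixed at 0
        f2 BY;
        f2 ON FPT3@1 FPT4 FPT5;
        f2@0;                     ! added: residual variance of f2 fixed at 0
        f BY;
        f ON f1@1 f2;
        f@0;
        fwc FPT_maj ON f;
```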


Hello Dr. Muthen, I am modeling a unidimensional formative construct with five formative indicators. I am applying a MIMIC modeling approach and the formative construct is being identified by two reflective indicators. All variables are continuous, however the variances of the variables do vary by more than 10 units. My question is regarding scaling the latent. What is the theoretical difference in the estimates when using the model constraint and the variance is fixed to 1 vs. the standardized estimates with the variance fixed to 0? Thank you! 


It depends on the model. Do you have
1) fref on fform; with reflective indicators of fref and formative indicators of fform, or
2) f by y1-y5; f on x1@1 x2-x5; f@0; ! or f@1
In case 1) the residual variance of fform adds to the fref variance and therefore scales the estimates. In case 2) I don't think it matters if you do f@0 or f@1. I may be wrong, however, since I haven't tried this out.


Hello Dr. Muthen, Below is my partial code for a measurement model that includes formative indicators. Is it correct code? I am unsure about linking the formative variables to another variable (e.g., y30 ON BRM; y17 ON ICT; etc.). It runs and the fit indices are very strong. When I add the code for the structural model (not shown here) very few variables are significant. Can you offer any help please?

MODEL:
BRM BY; ! define the formative factor
BRM ON y24@1 y25 y26 y27 y28;
BRM@0;
y30 ON BRM; ! linked to variable y30 to scale formative factor

ICT BY; ! define the formative factor
ICT ON y12@1 y13 y14 Y15;
ICT@0;
y17 ON ICT; ! linked to variable y17 to scale formative factor

Road BY; ! define the formative factor
Road ON y18@1 y19 y20 Y21;
Road@0;
y23 ON Road; ! linked to variable y23 to scale formative factor

PostV BY; ! define the formative factor
PostV ON y37@1 y38 y39;
PostV@0;
y41 ON PostV; ! linked to variable y41 to scale formative factor


The formative parts look fine. I can't say anything about nonsignificance without looking at the model and this may also be a question more suited for SEMNET. 

Ads posted on Monday, September 23, 2019 - 5:28 pm



Is there a way to look at the mean differences between 2 formative factors? More detail: I am looking at disability change resulting from an intervention, from time 1 vs. time 2. This is a formative construct (e.g., being disabled doesn't cause you to have a broken leg). I tried different code options but couldn't get any model to be identified. I have the same 5 indicators for disability at each time point. For a number of reasons, it would be better to derive the scoring weights from the data rather than just using a sum of equal weights for all items. I am OK if the factor is time variant (the scoring weights may differ at time 2 vs. time 1). Many thanks!


A formative factor needs to predict another variable to be identified. But in such situations, the Mplus TECH4 output will give you its mean.

Kerry Lee posted on Thursday, October 31, 2019 - 9:01 pm



Dear Prof Muthen, I have a discrepancy between the standardised and unstandardised results in a structural model involving a formative and a reflective latent (the standardised coefficient is significant, as is the R-squared; the unstandardised regression beta is not). With models containing only reflective latents, I think unstandardised results are recommended in cases of discrepancy. However, I noticed that standardised results are requested in a number of previous posts. Should one rely on the standardised output when formative latents are involved? If so, is it because they are composites involving measures on different scales? Thanks, Kerry.


See the FAQ on our website: "Standardized coefficient can have different significance than unstandardized". If you don't want to use Bayes, you can achieve the same non-symmetric CI using bootstrapping.
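A minimal sketch of the bootstrap route, to be added to an existing input. The number of draws is an arbitrary choice made here; BOOTSTRAP is not available with every estimator, and whether intervals are printed for standardized coefficients is assumed to depend on the Mplus version and the standardization options requested.

```mplus
ANALYSIS: BOOTSTRAP = 1000;          ! number of bootstrap draws
OUTPUT:   CINTERVAL(BCBOOTSTRAP);    ! bias-corrected bootstrap CIs
```

An interval that excludes zero can then be read alongside the Est./S.E. ratio when the standardized and unstandardized tests disagree.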

JPower posted on Monday, October 26, 2020 - 12:48 pm



I have created a latent variable with 6 formative indicators and 1 reflective indicator (all blood marker measures). I created this latent variable separately in males and females in my sample (all the same formative and reflective indicators). I have output the latent variable scores from the male and female subsets. Is it okay to now bring the male and female samples into 1 larger data set and use the latent variable score as a dependent variable in a linear regression (i.e., as a single dependent variable in the combined male-female sample)? I am not certain if the latent variable scores, derived from the same indicators but separately in male and female samples, are on the same scale (all the blood markers are on the same scale). I do recognize the actual value of the latent variable has no meaning. Thank you.


Seems like the factor score would refer to somewhat different factors unless there was measurement invariance imposed (equal slopes for the formative part). 
