Formative model with an observed dependent variable ("friends")
TITLE:    Hodge-Treiman social status modeling
DATA:     FILE = htmimicn1.dat;
          TYPE = COVARIANCE;
          NOBS = 530;
VARIABLE: NAMES = church member friends income occup educ;
          USEV = friends-educ;
MODEL:    f BY friends*;   ! defining the factor; same as
                           ! regressing friends on f
          f@0;
          f ON income@1 occup educ;
OUTPUT:   TECH1 STANDARDIZED;
Formative model with a latent dependent variable ("fy")
TITLE:    Hodge-Treiman social status modeling
DATA:     FILE = htmimicn1.dat;
          TYPE = COVARIANCE;
          NOBS = 530;
VARIABLE: NAMES = church members friends income occup educ;
          USEV = church-educ;
MODEL:    fy BY church-friends;
          f BY fy*;
          f@0;
          f ON income@1 occup educ;
I followed the second code segment (for latent DV "fy") and drew the directions of the variable and indicators.
Can I clarify that this is a 2nd-order model, with "fy" as 1st-order factor having income, occup and educ as formative indicators, and "fy" being a reflective latent variable (indicator) of the 2nd-order factor "f"?
Just a quick question. In the formative model with a latent dependent variable above, you set the path between income and f to one. But if that path is fixed to one, we cannot conclude anything about its significance. My question is: since we cannot say anything about the significance of the path between income and f, how can we conclude that income is a formative indicator of f?
I saw the same model in the "Indicator Arrows Pointing to a Factor" discussion, but fixing one of the paths between the formative indicators and the factor to one is not explained there either.
One of the paths needs to be fixed to a non-zero number for model identification. It does not necessarily need to be income.
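As a minimal sketch of that point, the first input above could fix the occup slope instead of the income slope. Only the ON statement differs from the original input; the choice of occup is purely illustrative and changes the metric of f, not the structure of the model.

```mplus
! Same data and variables as the first input; occup sets the metric of f
TITLE:    Hodge-Treiman social status modeling
DATA:     FILE = htmimicn1.dat;
          TYPE = COVARIANCE;
          NOBS = 530;
VARIABLE: NAMES = church member friends income occup educ;
          USEV = friends-educ;
MODEL:    f BY friends*;             ! f predicts friends; loading is free
          f@0;                       ! residual variance of f fixed at zero
          f ON income occup@1 educ;  ! occup slope fixed at 1, others free
OUTPUT:   TECH1 STANDARDIZED;
```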
Gareth posted on Tuesday, September 04, 2007 - 3:50 am
In the example "Formative model with a latent dependent variable ("fy")", how would I include a covariate (e.g. age) that I hypothesize should be related to either or both of the formative and latent variables?
For example, f ON age implies that age is part of the formative variable, which I don't want:
The residual variance cannot be identified. The formative approach essentially is like forming a factor by a weighted sum of the indicators where the weights are estimated, but measurement error is not parsed out.
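Written out for the three indicators of the example above, and using $\gamma_j$ for the ON slopes and $\zeta$ for the residual of f (notation introduced here for illustration only), the formative factor is:

```latex
\begin{align*}
f &= \gamma_1\,\text{income} + \gamma_2\,\text{occup} + \gamma_3\,\text{educ} + \zeta, \\
\gamma_1 &= 1 \ \ (\texttt{income@1}), \qquad \operatorname{Var}(\zeta) = 0 \ \ (\texttt{f@0}).
\end{align*}
```

With $\operatorname{Var}(\zeta) = 0$, f is an exact weighted sum of its indicators, so any measurement error in income, occup, and educ is carried into f.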
I used the input and data from the example above and changed only which path is fixed to 1. The standardized estimates for F BY FY are the same in all three cases, but I cannot tell for certain whether the coefficient is significant.
Any nominal variable with more than two categories must be turned into a set of dummy variables. Formative factors are specified using ON. Variables on the right-hand side of ON are covariates in a regression and must be binary (dummy) or continuous as in regular regression.
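A minimal sketch of the dummy coding, assuming raw data. The file name, the three-category variable region (coded 1, 2, 3), the continuous indicators x1 and x2, and the outcome y are all hypothetical names chosen for illustration; category 1 serves as the reference category.

```mplus
DATA:     FILE = mydata.dat;          ! hypothetical raw-data file
VARIABLE: NAMES = y x1 x2 region;     ! hypothetical variable names
          USEV = y x1 x2 reg2 reg3;   ! variables created in DEFINE go last
DEFINE:   reg2 = 0;
          reg3 = 0;
          IF (region EQ 2) THEN reg2 = 1;   ! dummy for category 2
          IF (region EQ 3) THEN reg3 = 1;   ! dummy for category 3
MODEL:    f BY y*;                    ! f predicts y; loading is free
          f@0;                        ! residual variance of f fixed at zero
          f ON x1@1 x2 reg2 reg3;     ! dummies enter like any other covariate
```

How missing values on region should be handled is not addressed in this sketch.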
Hi Linda, and thanks for your thoughts. Let's assume an identified model where the formatively measured latent variable has 3 cause indicators and is directly connected to 2+ endogenous variables. To set a scale for the formative latent variable, one has the following options (e.g., Edwards, 2001, p. 161): 1. fix one of the paths leading to or from the formative construct to 1 (like the examples above with SES, but not necessarily with zeta set to 0), or 2. fix the variance of the construct to unity, thereby standardizing the construct.
I would prefer the second option because I want to test the S.E. for all paths. Franke et al. (2008) show that the effects in the model change depending on the scaling method. (They used LISREL, and I have read somewhere that it is also possible with ramon, but I was curious about Mplus.)
You can constrain the variance of the construct to unity using Model Constraint, where you express the variance of the construct in terms of model parameters and the formative indicators' sample covariance matrix and set the construct variance at 1.
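For three formative indicators, the expression that such a constraint would set to 1 can be written as follows. The symbols are introduced here for illustration: $\gamma_j$ are the slopes of the construct on its indicators, $s_{jk}$ are the elements of the indicators' sample covariance matrix $S_{xx}$, and $\psi$ is the residual variance of the construct ($\psi = 0$ when the residual variance is fixed at zero).

```latex
\operatorname{Var}(f) \;=\; \gamma' S_{xx}\,\gamma + \psi
\;=\; \sum_{j=1}^{3}\sum_{k=1}^{3} \gamma_j\,\gamma_k\,s_{jk} + \psi \;=\; 1
```

In an input, the $\gamma_j$ would presumably be parameter labels and the $s_{jk}$ numeric constants copied from the sample statistics.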
Note that in Mplus you can test all paths even if you fix a slope at 1 - this is because Mplus gives you SEs also for the standardized coefficients.
Because different scaling settings lead to different results I think it might be more straightforward to set the construct residual variance at zero, acknowledging that we don't have information on it.
Many thanks! I will try to model the variance using Model Constraint. I was using version 4.1, where the STDYX option is not implemented. It seems that the consequences of different scalings could be a good case for a new Monte Carlo study.
Lewina Lee posted on Monday, August 27, 2012 - 3:33 pm
Dear Drs. Muthen,
I'm building an SEM involving a formative indicator (CMAT1) predicting a latent outcome variable (LPOS). When I ran the model specifying one of the composite paths to 1 (CMAT1 on AGE @1 MARRY EDU;) and freely estimating the link between the outcome variable and the composite variable (LPOS on CMAT1;), as in METHOD 1 below, the model would not converge.
However, when I slightly changed the model so that I freely estimated all the composite paths (CMAT1 on AGE MARRY EDU;) but set the link between the composite indicator and the outcome variable to 1 (LPOS ON CMAT1 @1;), as in METHOD 2 below, the model was identified.
I see that METHOD 2 is slightly different from what you have advised here and in your handout. Am I doing anything wrong?
(MARRY is a dichotomous variable, if that may make any difference).
Thank you, Lewina
***METHOD 1***
LPOS by LSAT MCS;
CMAT1 by;
CMAT1 on AGE @1 MARRY EDU;
CMAT1 @0;
LPOS on CMAT1;

***METHOD 2***
LPOS by LSAT MCS;
CMAT1 by;
CMAT1 on AGE MARRY EDU;
CMAT1 @0;
LPOS on CMAT1 @1;
I wonder if the problem is that lpos is not identified. Try playing with that.
Lewina Lee posted on Tuesday, August 28, 2012 - 8:35 am
Thanks, Linda. Would you consider METHOD 1 & METHOD 2 above equivalent? If METHOD 2 allows the model to converge (despite the slight departure from your suggested approach in the handouts & on this forum), can I go along with the results?
Hello. To evaluate formative constructs, the p-values for the weights are one criterion for the relevance of the composite indicators. How can I determine the weights (and p-values) used to calculate each of the composite indicators? The manual demonstrates setting weights, which is not what I need to do. I have modeled a reflective-formative construct as below:
Analysis: Estimator = mlr; !for possible skew or non-normality in raw data
MODEL: A by X1@1 X2* X3* ; ! create the reflective lv "A"
No, you can use raw data. You can send your output and license number to Support for a diagnosis of the problem.
Bollen & Bauldry (2011) in Psych Methods provides a discussion and references.
Ed Maguire posted on Thursday, March 12, 2015 - 2:10 pm
I am curious about whether it is appropriate to use information measures like AIC/BIC to compete formative and reflective model specifications against one another (when the two models are based on the same items). Also assuming a sufficiently large sample to warrant the use of ML, is there any added benefit to using BSEM to estimate such a model and relying on DIC instead?
The problem with AIC/BIC is that they are based on the likelihood which doesn't have the same set of DVs in the two models, so it is not in the same metric. For the formative model the "indicators" are covariates (influencing the factor) whereas for the reflective model they are DVs (influenced by the factor).
BSEM would be of interest if there are say direct effects from some of the formative indicators to the distal outcome that the formative factor predicts.
Ed Maguire posted on Thursday, March 12, 2015 - 9:20 pm
Come to think of it, I think you can bring the formative indicators into the model by mentioning their variances. That changes their status from "x's" to "y's" in the Mplus thinking - and therefore you have the same set of DVs in both the formative and reflective model.
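A sketch of what this could look like for the MODEL command of the first input above. Listing the indicator names refers to their variances; the WITH statements are an added assumption to make sure the covariances among the indicators stay free once they are treated as y's.

```mplus
MODEL:    f BY friends*;
          f@0;
          f ON income@1 occup educ;
          income occup educ;        ! mention variances: x's now treated as y's
          income WITH occup educ;   ! covariances among the indicators kept free
          occup WITH educ;
```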
Ed Maguire posted on Friday, March 13, 2015 - 9:57 am
That is a good thought. Thanks so much for your helpful replies.
Binary formative indicators are fine. Because they are covariates (IVs) their distribution doesn't matter. So that's not the reason for your results.
MSP posted on Wednesday, October 21, 2015 - 12:32 pm
Hello. I'm currently learning SEM through Mplus. I would like to ask the following:
1) Is it possible to have multiple formative factors with common causal indicators, then those formative factors regressing on a binary (observed) DV? Like if my exogenous variables would be through an ESEM syntax but the arrows are reversed?
Originally, I had: f1-f3 by x1-x9(*1); out on f1-f3; !out is binary
^ How to reverse the arrows for f1-f3?
Or do I have to specify causal indicators for each formative factor? If ever, can f1 and f2 both have, say, x2 as an indicator?
1) Yes. How to do the Mplus input is shown in the handout for Topic 1 short course on our website.
2) The formative indicators fit perfectly among themselves since no structure is imposed on their correlations (so unlike FA). But you get ordinary fit indices because the model imposed restrictions on the correlations between the formative indicators and the outcome (your "out").
MSP posted on Thursday, October 22, 2015 - 10:49 am
Thanks for the swift response.
I tried writing the syntax for a model with formative factors, but I get either a "negative df" error or a "standard error could not be computed" error. After much tweaking, I got a normal termination and some parameter estimates, but I would like to consult you about these (weird?) fit indices. Any suggestion on which results to look for in the output of this formative model?
Here's my model:
f1 by;
f1 on x1@1 x2 x3 x4 x5;
f1@0;
f2 by;
f2 on x6@1 x7 x8;
f2@0;
f3 by;
f3 on x9@1;
f3@0;
out on f1-f3;  !out is binary variable

Output:
# of free parameters: 10
Chi-Square Test of Model Fit: Value = 0, df = 0, p-value = 0
RMSEA: 0, prob = 0
CFI/TLI: 1.0
Chi-Square Test of Model Fit for the Baseline Model: 196.689, df = 9, p-value = 0
WRMR = 0.062

Any insight on this is much appreciated. Should I fix the model? Do I still stick with formative factors? Or just use reflective factors for my measurement model, and have them regress on my binary DV (out)? Would that seem circular?
I forgot that df=0 in this case so no fit measure available. It's because there are as many slopes of "out" regressed directly on the x covariates as there are free formative coefficients plus the coefficients for out regressed on the 3 f's. No restriction is imposed.
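The count for the model posted above works out as follows. The tally is a reading of that input, not part of the posted output.

```latex
\begin{align*}
\text{formative model: } & 4\ (\text{f1 ON x2--x5}) + 2\ (\text{f2 ON x7, x8}) + 0\ (\text{f3 ON x9@1}) + 3\ (\text{out ON f1--f3}) = 9 \\
\text{unrestricted regression: } & 9\ (\text{out ON x1--x9}) \\
\text{df} = {} & 9 - 9 = 0
\end{align*}
```

Assuming one threshold for the binary outcome, this also gives 9 + 1 = 10, the number of free parameters reported in the output.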
Just to clarify: is it fair to say that, for a model with formative factors like this one, I should interpret only the parameter estimates in the model results? Or is there a fit index to look at? How should I evaluate the model?
Q3. Not possible in your case - it is just a restatement of regression analysis.
Tyler Mason posted on Wednesday, November 11, 2015 - 11:02 am
I have a model with a formative construct (SES) composed of income, education, and occupation. I would like to regress the formative construct on race (0 = white, 1 = black). Is this possible to do in Mplus?
Hi Dr. Muthen, I am modeling formative measures where f1 is a formative measure with two indicators and f2 is another formative indicator with two measures and then f is a formative indicator with both f1 and f2. Here's my code:
f1 by;
f1 ON FPT1@1 FPT2;
f2 by;
f2 ON FPT3@1 FPT4 FPT5;
f by;
f ON f1@1 f2;
f@0;
fwc FPT_maj on f;
However, I get the following error:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED.
THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL.
PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 10, F2
I am modeling a unidimensional formative construct with five formative indicators. I am applying a MIMIC modeling approach and the formative construct is being identified by two reflective indicators. All variables are continuous, however the variances of the variables do vary by more than 10 units.
My question is regarding scaling the latent. What is the theoretical difference in the estimates when using the model constraint and the variance is fixed to 1 vs. the standardized estimates with the variance fixed to 0?
In case 1) the residual variance of fform adds to the fref variance and therefore scales the estimates. In case 2) I don't think it matters if you do f@0 or f@1. I may be wrong, however, since I haven't tried this out.
Below is my partial code for a measurement model that includes formative indicators. Is it correct code? I am unsure about linking the formative variables to another variable (e.g., y30 ON BRM; y17 ON ICT; etc.). It runs and the fit indices are very strong.
When I add the code for the structural model (not shown here) very few variables are significant. Can you offer any help please?
MODEL:
BRM BY;                       ! define the formative factor
BRM ON y24@1 y25 y26 y27 y28;
BRM@0;
y30 ON BRM;                   ! linked to variable y30 to scale formative factor
ICT BY;                       ! define the formative factor
ICT ON y12@1 y13 y14 Y15;
ICT@0;
y17 ON ICT;                   ! linked to variable y17 to scale formative factor
Road BY;                      ! define the formative factor
Road ON y18@1 y19 y20 Y21;
Road@0;
y23 ON Road;                  ! linked to variable y23 to scale formative factor
PostV BY;                     ! define the formative factor
PostV ON y37@1 y38 y39;
PostV@0;
y41 ON PostV;                 ! linked to variable y41 to scale formative factor
The formative parts look fine. I can't say anything about non-significance without looking at the model and this may also be a question more suited for SEMNET.
Ads posted on Monday, September 23, 2019 - 5:28 pm
Is there a way to look at the mean differences between 2 formative factors?
More detail: I am looking at disability change resulting from an intervention, from time 1 vs. time 2. This is a formative construct (e.g., being disabled doesn't cause you to have a broken leg).
I tried different code options but couldn't get any model to be identified. I have the same 5 indicators for disability at each time point - for a number of reasons, it would be better to derive the scoring weights from the data rather than just using a sum of equal weights for all items. I am OK if the factor is time variant (the scoring weights may differ at time 2 vs. time 1).
A formative factor needs to predict another variable to be identified. But in such situations, the Mplus TECH4 output will give you its mean.
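A minimal sketch of such a setup, assuming hypothetical indicators d1-d5 for the formative factor (named dis here) and a hypothetical outcome y that it predicts. TECH4 requests the model-estimated means and covariances of the latent variables.

```mplus
MODEL:    dis BY;               ! factor with no reflective indicators
          dis ON d1@1 d2-d5;    ! first weight fixed at 1 to set the metric
          dis@0;                ! residual variance fixed at zero
          y ON dis;             ! the formative factor predicts y
OUTPUT:   TECH4;                ! estimated latent variable means and covariances
```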
Kerry Lee posted on Thursday, October 31, 2019 - 9:01 pm
Dear Prof Muthen,
I have a discrepancy between the standardised and unstandardised results in a structural model involving a formative and a reflective latent variable (the standardised coefficient is significant, as is the R-squared, but the unstandardised regression coefficient is not). With models containing only reflective latent variables, I think unstandardised results are recommended in cases of discrepancy. However, I noticed that standardised results are requested in a number of previous posts. Should one rely on the standardised output when formative latent variables are involved? If so, is it because they are composites involving measures on different scales?
Standardized coefficients can have different significance than unstandardized ones.
If you don't want to use Bayes, you can achieve the same non-symmetric CI using bootstrapping.
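A sketch of the corresponding commands, assuming raw data are analyzed. The number of draws and the choice of ML as the estimator are illustrative assumptions, not recommendations from this thread.

```mplus
ANALYSIS: ESTIMATOR = ML;
          BOOTSTRAP = 1000;         ! number of bootstrap draws (illustrative)
OUTPUT:   CINTERVAL(BCBOOTSTRAP);   ! bias-corrected bootstrap confidence intervals
```

CINTERVAL(BOOTSTRAP) would request percentile bootstrap intervals instead.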
JPower posted on Monday, October 26, 2020 - 12:48 pm
I have created a latent variable with 6 formative indicators and 1 reflective indicator (all blood marker measures). I created this latent variable separately in males and females in my sample (all the same formative and reflective indicators).
I have output the latent variable scores from the male and female subsets.
Is it okay to now bring the male and female samples into one larger data set and use the latent variable score as a dependent variable in a linear regression (i.e., as a single dependent variable in the combined male-female sample)?
I am not certain if the latent variable scores, derived from the same indicators but separately in male and female samples, are on the same scale (all the blood markers are on the same scale). I do recognize the actual value of the latent variable has no meaning.