Finite Mixture Structural Equation Mo...
 Anonymous posted on Sunday, August 03, 2003 - 7:59 pm
I could not find an example for Finite Mixture Structural Equation Models in the book. For example, if I have some SEM model:

VARIABLE: NAMES ARE y1-y8;
CATEGORICAL ARE y1-y6;
MODEL: f1 by y1 y2;
f2 by y3 y4;
f3 by y5 y6;
f4 by y7 y8;
f3 on f1 f2;
f4 on f3 f1 f2;

...and if we suspect some unobserved heterogeneity in the population (a priori segmentation is not feasible), so that I want to simultaneously form segments (latent categorical variables) and obtain segment-specific estimates for the measurement and structural parameters of the model, how should I go about it? What if I want to keep factor loadings equal across segments but allow structural parameters to vary?

Your help would be much appreciated!
 bmuthen posted on Monday, August 04, 2003 - 9:42 am
With factor loadings equal across classes and structural regression slopes varying across classes, you would say (assuming 2 classes):

Model:
%overall%
f1 by y1 y2;
f2 by y3 y4;
f3 by y5 y6;
f4 by y7 y8;
f3 on f1 f2;
f4 on f3 f1 f2;
%c#1%
f3 on f1 f2;
f4 on f3 f1 f2;

which means that class 1 has class-specific structural slopes (so the 2 classes differ in this regard). Starting values, even if rough, for these class-specific parameters are typically needed to find the solution. Covariates x are helpful in finding the classes; to include one, add this line to the %overall% section:

c#1 on x;

For a recent paper on factor mixture models, see Lubke & Muthen on the Mplus home page.
 Anonymous posted on Monday, August 04, 2003 - 2:30 pm
Did you mean covariates y1-y8 in this example?
 Anonymous posted on Monday, August 04, 2003 - 2:41 pm
Here is what I'm running when I get warnings and no results. Your help would be appreciated.

VARIABLE:

NAMES ARE y1-y8; USEVARIABLES ARE y1-y8;
CATEGORICAL ARE y1-y6;
CLASSES = c(2);

ANALYSIS: TYPE = MIXTURE;
ESTIMATOR = MLR;

MODEL:
%OVERALL%

f1 by y1 y2;
f2 by y3 y4;
f3 by y5 y6;
f4 by y7 y8;
f3 on f1 f2;
f4 on f3 f1 f2;

%c#1%

f3 ON f1*0.1 f2*0.1
f4 on f3*0.1 f1*0.1 f2*0.1

OUTPUT: SAMPSTAT STANDARDIZED CINTERVAL TECH7 TECH8;

SAVEDATA: FILE IS C:\mplus_output.txt;
FILE (RESULTS) IS C:\mplus_results.txt;
SAVE = FSCORES;
SAVE = CPROBABILITIES;

*** WARNING in Model command
Unknown variable(s):
f1
*** WARNING in Model command
Unknown variable(s):
f2
*** WARNING in Model command
Unknown variable(s):
f3
*** WARNING in Model command
Unknown variable(s):
f4
 Linda K. Muthen posted on Monday, August 04, 2003 - 2:55 pm
Send your data and output to support@statmodel.com.
 bmuthen posted on Monday, August 04, 2003 - 2:55 pm

I was referring to covariates that predict class membership, not the indicators of your factors.
 Anonymous posted on Friday, September 10, 2004 - 2:49 pm
Can the current version of Mplus run multi-group LCA?
Thanks!
 Linda K. Muthen posted on Wednesday, September 29, 2004 - 3:49 pm
Yes. The KNOWNCLASS option is used for this.
 Girish Mallapragada posted on Tuesday, November 09, 2004 - 9:36 pm
Hi,

I am trying to estimate a Mixture SEM with latent variable interactions.

However, when I use TYPE=mixture and declare interaction variables, I get this error:
*** ERROR in Model command
To declare interaction variables, TYPE = RANDOM must be specified
in the ANALYSIS command.
...

When I do that, I cannot estimate a mixture model.

How can I estimate a mixture SEM with interactions?
 Girish Mallapragada posted on Tuesday, November 09, 2004 - 10:42 pm
Hi,

There are different approaches for estimating SEM with interactions (Kenny and Judd, Ping, etc.). Does Mplus use any of these, or is there something specific to Mplus?

Regards
 bmuthen posted on Sunday, November 14, 2004 - 11:33 am
Mixtures and latent variable interactions should be doable using Type = Random and algorithm = integration. If you have problems with this, please send input, data, and output to support@statmodel.com.

 Girish Mallapragada posted on Monday, November 15, 2004 - 10:30 pm
Thanks Dr. Muthen,

I figured out how to specify the type statement for this type of analysis.

Thanks for the reference.
 anonymous posted on Monday, March 06, 2006 - 8:35 am
I have two questions:

1. When estimating a factor mixture model, how should the class-varying factor means be interpreted? I understand that one can assess whether the factor mean for each class is significantly different from that of the last class, but can I also say something about the value of the factor mean itself?
For example, if class 1 has a factor mean of 0.8 and class 2 has a factor mean of -0.4, how do I interpret these values?

2. I have estimated a factor mixture model for which a 4-class solution fits best. I now wish to estimate an SEMM to examine the relationships between the continuous latent variables included in the factor mixture model. How do I let the regression (path) coefficients vary across the 4 classes?
Do I simply specify the regression statements between the latent variables (e.g. f3 ON f2) for every class except the last class, similar to Example 7.20 in the user's guide?
 bmuthen posted on Monday, March 06, 2006 - 9:14 am
1. A simple way to understand factor mean differences across groups/classes is in terms of the factor standard deviations: how many SDs apart are these factor means? A more complex aspect, namely what this implies for the item means, follows.

This situation is analogous to multiple-group analysis, except that the group membership is latent. So this is a big topic if you don't have that background. A brief summary follows. The factor means are understood in light of the factor indicators (the observed items): the indicator that has its loading fixed at 1 sets the metric of the factor (i.e., when the factor changes one unit, the indicator changes one unit). The means of the indicators change as a function of the factor means as expressed by E(y) = nu + Lambda*alpha, where E(y) is the mean vector of the indicators, nu is the measurement intercept vector, Lambda is the factor loading matrix, and alpha is the factor mean vector. With one factor, with class/group-invariant nu, and considering the item with loading fixed at 1, the class/group difference in that item mean is the difference in factor means.
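As a rough numerical illustration of E(y) = nu + Lambda*alpha, the Python sketch below uses the factor means 0.8 and -0.4 from the question. The intercepts, loadings, and factor variance are invented for the example (they are assumptions, not Mplus output), and the function name is hypothetical. It only shows the arithmetic of how a factor mean difference carries over to the item means.

```python
# Illustrative only: nu, lam, and the factor variance are invented values,
# not output from Mplus. One factor, three indicators; the first loading
# is fixed at 1 to set the metric of the factor.

def indicator_means(nu, lam, alpha):
    """Return E(y) = nu + lam * alpha for a single factor with mean alpha."""
    return [n + l * alpha for n, l in zip(nu, lam)]

nu = [2.0, 1.5, 3.0]       # measurement intercepts, assumed class-invariant
lam = [1.0, 0.8, 1.2]      # factor loadings, assumed class-invariant
alpha_class1 = 0.8         # factor mean in class 1 (value from the question)
alpha_class2 = -0.4        # factor mean in class 2 (value from the question)

means_class1 = indicator_means(nu, lam, alpha_class1)
means_class2 = indicator_means(nu, lam, alpha_class2)

# With invariant nu, each item mean differs by loading * (alpha1 - alpha2).
item_diffs = [m1 - m2 for m1, m2 in zip(means_class1, means_class2)]

# Distance between the factor means in factor SD units, assuming a
# class-invariant factor variance of 1.44 (hypothetical), i.e. SD = 1.2.
factor_sd = 1.44 ** 0.5
sd_units_apart = (alpha_class1 - alpha_class2) / factor_sd

print(item_diffs)
print(sd_units_apart)
```

For the item whose loading is fixed at 1, the difference in item means is the factor mean difference itself (about 1.2 here); for the other items it is scaled by their loadings. Under the assumed factor SD of 1.2, the two class means are about one SD apart.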

2. Yes.
 Katharina Schmid posted on Friday, March 10, 2006 - 9:11 am
I am trying to estimate a factor mixture model with four classes. Prior to running the factor mixture model, I estimated a separate LCA with and without covariates, and in both cases a four-class solution fit the data best, not only based on AIC/BIC/LRT indices but also in terms of substantive interpretation of the classes.
When estimating a factor mixture model including three continuous latent variables, the four-class solution yields the same classes and class probabilities.
I further tried to estimate a factor mixture model that includes one additional continuous latent variable. Here, similar class sizes and estimated posterior probabilities are obtained only when using the default random starts. The problem is that once I increase the number of starts (e.g. 50 10 or 100 10), the class sizes and conditional probabilities change. I have tried to specify class-specific starting values for the thresholds based on the last value of each of the classes in the previous run, yet this did not seem to work.
I am assuming this has something to do with the one additional latent variable I included. Is there any way I can control for this?
I would essentially like to keep the classes similar to those obtained in the original LCA model, as this makes the most sense in terms of interpretability of the classes.
Is there any way I can do this?
 Linda K. Muthen posted on Friday, March 10, 2006 - 9:50 am
Did you replicate the best loglikelihood in all of the analyses that you describe? When you use more random starts, you may be finding a better solution. We have added a description of how to tell whether you have a good solution for a mixture model to the new user's guide, which is available online. See pages 325-327.
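One way to read the "did the best loglikelihood replicate" check is sketched below in Python. The loglikelihood values are invented for illustration, and the helper name and tolerance are assumptions, not something Mplus provides. In practice the values would be read by hand from the final-stage random starts listing in the output.

```python
# Hypothetical helper: count how many final-stage random starts reached
# the best loglikelihood (within a small tolerance).

def best_loglik_replications(logliks, tol=1e-3):
    """Return (best value, number of starts within tol of the best)."""
    best = max(logliks)
    count = sum(1 for ll in logliks if best - ll <= tol)
    return best, count

# Invented final-stage loglikelihoods from 10 random starts.
run_a = [-2450.12, -2450.12, -2450.12, -2453.87, -2461.40,
         -2450.12, -2470.03, -2453.87, -2450.12, -2462.55]
# Invented second run in which the best value appears only once.
run_b = [-2448.75, -2450.12, -2450.12, -2453.87, -2461.40,
         -2450.12, -2470.03, -2453.87, -2450.12, -2462.55]

best_a, reps_a = best_loglik_replications(run_a)
best_b, reps_b = best_loglik_replications(run_b)

print(best_a, reps_a)  # best value reached by several starts
print(best_b, reps_b)  # best value reached by a single start
```

In the first invented run the best value is reached by several starts, which suggests a stable solution. In the second it is reached only once, which is the situation where more random starts are worth trying before interpreting the classes.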
 Ralf Wierich posted on Friday, November 17, 2006 - 8:00 am
I'm not sure which estimator to select for my model. I want to fit an FM-SEM. The independent variables 'per', 'mind' and 'rab' are binary (see below), and the rest of my data are highly non-normal (7-point scales).

Would MLR be my first/best choice?

MODEL:
%OVERALL%
pt BY pt1 pt2 pt4-pt6;
sr BY sr1 sr2 sr4 sr5 sr7 sr9 sr10 ;
joy BY joy1-joy7;
ang BY ang1-ang6;
kog BY kog1 kog3-kog6;
att BY att2 att4-att7;
loy BY loy2 loy3 loy4 loy6 loy7;
int BY int2 int4-int7;
sr ON per rab mind;
pt ON per mind rab;
joy ON sr pt;
ang ON sr pt;
kog ON sr pt;
att ON joy ang kog;
int ON att;
loy ON att int;
 Linda K. Muthen posted on Friday, November 17, 2006 - 8:36 am
Only maximum likelihood is available for your situation. I would use the default MLR estimator. Don't put the independent variables on the CATEGORICAL list. This is for dependent variables only.
 Andrea Vocino posted on Sunday, March 01, 2009 - 7:00 pm
I am trying to extract some segments in the population; in particular, I am looking for response-based segments. I am running a CFA with continuous variables and no missing data, and I am wondering, according to the following results, how many segments I need to consider: 2 or 1?

Classes    BIC        SSA BIC    Entropy  VLMR    LMR adj.  BLRT    H0 LValue
1 class    24027.659  23723.175  …        …       …         …
2 classes  24028.534  23673.303  0.925    0.2617  0.2703    0.0000  -11720.618
3 classes  24033.046  23655.613  0.839    0.4172  0.4257    0.0000  -11704.878
4 classes  24089.402  23689.767  0.831    0.3757  0.3784    0.3333  -11692.371
5 classes  24058.367  23636.530  0.820    0.5713  0.5799    0.0300  -11659.382
 Bengt O. Muthen posted on Monday, March 02, 2009 - 11:10 am
Tough to say in this case. If the sample is large enough, I like to start out by looking at BIC and, when it's unclear whether k or k-1 classes should be chosen, complement that with LMR and BLRT plus interpretability. With small samples BIC tends to choose too few classes; see

Nylund, K.L., Asparouhov, T., & Muthen, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.

So unless your sample is smallish, I would say there isn't a strong signal suggesting more than one class in the data.
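To make the BIC comparison concrete, the short Python sketch below takes the BIC and sample-size-adjusted BIC values from the table in the question and reports which number of classes minimizes each. The dictionary layout and the names are just one way to organize the numbers.

```python
# Values copied from the table in the question (number of classes -> value).
bic = {1: 24027.659, 2: 24028.534, 3: 24033.046, 4: 24089.402, 5: 24058.367}
ssa_bic = {1: 23723.175, 2: 23673.303, 3: 23655.613, 4: 23689.767, 5: 23636.530}

def best_by(criterion):
    """Return the number of classes with the smallest criterion value."""
    return min(criterion, key=criterion.get)

best_bic = best_by(bic)
best_ssa_bic = best_by(ssa_bic)

# How much larger each solution's BIC is than the lowest BIC.
bic_gap = {k: round(v - bic[best_bic], 3) for k, v in bic.items()}

print(best_bic, best_ssa_bic)
print(bic_gap)
```

BIC is lowest for one class, with the two-class solution less than one BIC point behind, while the sample-size-adjusted BIC is lowest for five classes. The criteria disagree, which is part of why this case is hard to call.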
 Andrea Vocino posted on Monday, March 02, 2009 - 4:48 pm
Many thanks, Bengt. I did read Nylund et al., which was quite useful, but as you say, in this instance the results do not seem to give a clear-cut solution. My sample size is 314, so I presume that if one wants to be conservative, the one-class solution could be adopted. If so, do these results imply that there is no unobserved heterogeneity in my population and, as a result, the scale is perceived homogeneously (e.g., no unobserved bias)? Thanks in advance.
 Bengt O. Muthen posted on Monday, March 02, 2009 - 6:12 pm
There could be other forms of heterogeneity using latent classes. I don't know how you formulated your model but for instance if you had measurement invariance over the classes and only let the structural parameters vary across the classes, then you may be missing important heterogeneity. Typically measurement intercepts vary across latent classes in my experience and this doesn't seem to have been taken into account in past mixture SEM writing if I am not mistaken.
 Andrea Vocino posted on Monday, March 02, 2009 - 6:30 pm
Thanks again Bengt. This is the syntax of the 4 class model. I would appreciate your comments:
DATA:

VARIABLE:

....

USEVARIABLES ARE
Q3_29 Q3_30 Q3_31 Q3_32 Q3_33 Q4_44 Q4_45 Q4_46
Q4_47 Q4_48 Q3B_39 Q3B_40 Q3B_41 Q3B_42 Q3B_43
Q4B_54 Q4B_58 Q3A_34 Q3A_35 Q3A_36 Q3A_37 Q3A_38
Q4A_49 Q4A_50 Q4A_51 Q4A_52 Q4A_53 Q4B_55 Q4B_56
Q4B_57;

CLASSES = c(4);

ANALYSIS:

PROCESS = 2;
TYPE = MIXTURE;
START = 0;
LRTSTARTS = 0 0 500 50;

MODEL:

%OVERALL%
PRO BY Q3_29* Q3_30 Q3_31 Q3_32 Q3_33;
VAL BY Q3A_34* Q3A_35 Q3A_36 Q3A_37 Q3A_38;
CAR BY Q3B_39* Q3B_40 Q3B_41 Q3B_42 Q3B_43;
SOC BY Q4_44* Q4_45 Q4_46 Q4_47 Q4_48;
UND BY Q4A_49* Q4A_50 Q4A_51 Q4A_52 Q4A_53;
ENH BY Q4B_54* Q4B_55 Q4B_56 Q4B_57 Q4B_58;

PRO@1 VAL@1 CAR@1 SOC@1 UND@1 ENH@1;

%c#1%
[PRO*1];
[VAL*1];
[CAR*1];
[SOC*1];
[UND*1];
[ENH*1];

OUTPUT:

STANDARDIZED;
TECH11 TECH14;
 Bengt O. Muthen posted on Tuesday, March 03, 2009 - 11:10 am
See the UG example 7.27. For general guidelines see my hybrid paper:

Muthén, B. (2008). Latent variable hybrids: Overview of old and new models. In Hancock, G. R., & Samuelsen, K. M. (Eds.), Advances in latent variable mixture models, pp. 1-24. Charlotte, NC: Information Age Publishing, Inc.

which is on our web site.

 Rob Angell posted on Wednesday, November 18, 2009 - 11:41 am
I have a fairly simple question fundamental to my study regarding setting up the syntax for a mixture SEM. The model I am using is:
CLASSES = c (2);
ANALYSIS: TYPE = MIXTURE;
Model:
%OVERALL%
f1 BY q15d q15i q15a q15j q15h q15g q15k;
f2 BY Q16a Q16c Q16d Q16e;
f3 BY q12d q12c q12a q12f q12e;
f4 BY q17a q17b q17c;
f5 BY q14h q14f q15l q13b q14g;
f6 BY q14d q14e q14c;
f7 BY q19a q19b q19c q19d;
f8 BY q20a q20b q20c;
f7 On f6 f5 f4 f3 f2 f1;
f8 on f7 f6 f5 f4 f3 f2 f1;

%c#2%
(?)Not sure

I wish to follow the same procedure as Jedidi et al. (1997), whereby the measurement model is assumed invariant but the factor means and structural parameters are freely estimated across two (or more) classes. However, being inexperienced with the syntax, I am not sure exactly what I should specify under the %c#2% statement to make this possible (Example 7.20 is very helpful, but I would really appreciate some extra help for clarification). Based on my model in the %overall% section, can someone please tell me exactly what my syntax would be for the second class? Thanks (in anticipation).
 Linda K. Muthen posted on Wednesday, November 18, 2009 - 1:45 pm
When in doubt, start with the Mplus defaults: the factor loadings and intercepts are held equal across classes and the factor means are zero in the last class, which represents measurement invariance, and the regression coefficients are held equal across classes.

In a second step you can allow the regressions of the factors to vary across classes as shown in Example 7.20. This would be specified as:

%c#2%
f7 On f6 f5 f4 f3 f2 f1;
f8 on f7 f6 f5 f4 f3 f2 f1;
 Rob Angell posted on Tuesday, November 24, 2009 - 12:41 pm
Hi, just to follow on from before: I have followed your advice above with success. However, when I move from estimating 3 classes to estimating 4 classes, I get the following error message:
*** FATAL ERROR
THERE IS NOT ENOUGH MEMORY SPACE TO RUN Mplus ON THE CURRENT
INPUT FILE. YOU CAN TRY TO FREE UP SOME MEMORY BY CLOSING OTHER
APPLICATIONS THAT ARE CURRENTLY RUNNING. NOTE THAT THE MODEL MAY
REQUIRE MORE MEMORY THAN ALLOWED BY THE OPERATING SYSTEM.
REFER TO SYSTEM REQUIREMENTS AT www.statmodel.com FOR MORE

FYI - I have tried on several different computers, but to no avail. The model does have quite a few latent variables (see above), but I would still expect it to work. I also have only 500 cases.

Also, just as a query: when I am letting structural factor loadings vary between groups, is there a way of restricting them to be 0 or above? In my substantive subject area it makes little sense for the latent regression loadings to be negative in some instances. Please advise 1) whether this is possible and 2) how I can do it.

Thanks (in anticipation)
 Linda K. Muthen posted on Wednesday, November 25, 2009 - 9:44 am
It sounds like the problem is too large for the computers available to you. I would need to see the full output and your license number at support@statmodel.com. If you are using numerical integration, reducing the number of integration points could help.

MODEL CONSTRAINT can be used for defining parameter constraints. See the user's guide for further information.
 Paulie posted on Friday, September 23, 2011 - 10:17 am
How are the degrees of freedom calculated in FMSEM? Are they calculated according to formula 110 on page 20 of Technical Appendix 4? Can I find the degrees of freedom in the output?
 Paulie posted on Friday, September 23, 2011 - 11:25 am
Sorry, some more information: I mean the df in the case of categorical (ordinal) indicator variables, with the default estimator (MLR) in Mplus 5.1.
 Linda K. Muthen posted on Friday, September 23, 2011 - 4:51 pm
For categorical outcomes, the degrees of freedom in the H1 model are the number of categories raised to the power of the number of variables, minus 1. So for three binary variables it is 2 to the power of 3, minus one, or 7.

For other types of outcomes, degrees of freedom are not relevant because means, variances, and covariances are not sufficient statistics for model estimation.
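The count for categorical outcomes can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function name is hypothetical, and allowing a different number of categories per variable (product of the category counts, minus one) is a generalization assumed here beyond the equal-categories case described above.

```python
def h1_degrees_of_freedom(categories_per_variable):
    """Number of cells in the full cross-classification, minus one."""
    cells = 1
    for n_categories in categories_per_variable:
        cells *= n_categories
    return cells - 1

three_binary = h1_degrees_of_freedom([2, 2, 2])   # 2**3 - 1
six_binary = h1_degrees_of_freedom([2] * 6)       # 2**6 - 1

print(three_binary)  # 7
print(six_binary)    # 63
```

The count grows very quickly with the number of items, so with many categorical indicators most cells of the cross-classification will be empty in a sample of typical size.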
 Paulie posted on Friday, September 23, 2011 - 7:19 pm
 xianhuazeng posted on Monday, February 27, 2012 - 4:18 am
In the finite mixture structural equation model, I want to get the GFI, RMR, CLC, NEC, and ICL-BIC.
Can you give me the command?
 Linda K. Muthen posted on Monday, February 27, 2012 - 10:32 am
All fit statistics that are available are given by default. With mixture, chi-square and related fit statistics like GFI are not available because means, variances, and covariances are not sufficient statistics for model estimation.
 Robert Angell posted on Wednesday, March 27, 2013 - 2:12 am
Dear Linda,

I am trying to estimate a very simple 2-class mixture model so that I can estimate structural parameter differences across classes (as in Jedidi et al., 1997). Unfortunately, I keep getting the message

Insufficient Number of E-Steps...

I have increased Miterations (seemingly exponentially) but this hasn't worked. I am sure that I am just making some small mistake in the setup - could you please advise?

usevariables are AT1 AT2 AT3 AT5 CON1 CON3 CON4
FAM1 FAM2 FAM3 H3 H4 H5 H6 SA1 SA2 SA3 SA4 WC1 WC2 WC3;

CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
MITERATIONS = 1000;
STSCALE = 0.1;
Model: %OVERALL%

F1 by H3 H4 H5 H6;
F2 by CON1 CON3 CON4;
F3 BY SA1 SA2 SA3 SA4;
F4 BY WC1 WC2 WC3;
F5 BY FAM1 FAM2 FAM3;
F7 BY AT1 AT2 AT3 AT5;

F7 ON F5 F4 F3 F2 F1;

%C#1%
[F1*1 F2*0.2 F3*0.8 F4*4.2 F5*3.2 F7*1.8];
F7 ON F5 F4 F3 F2 F1;

OUTPUT:
STDYX;
 Linda K. Muthen posted on Wednesday, March 27, 2013 - 6:43 am