Multigroup mixture modelling using Kn...
 Andy Ross posted on Thursday, October 18, 2007 - 7:12 am
Dear Prof Muthen

I wish to extend example 7.21 (Mixture modelling with known classes - multiple group analysis) to include predictors - whereby x predicts c, but also allow this prediction to vary by cg.

I would then also like to be able to test whether this variation is significant or not.

Please could you advise me

Andy
 Linda K. Muthen posted on Thursday, October 18, 2007 - 9:44 am
You would specify c ON x in the class-specific parts of the MODEL command for the KNOWNCLASS variable.
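To see what class-specific c ON x slopes imply, the following minimal Python sketch converts multinomial logit intercepts and slopes into latent class probabilities, separately for each known class. It assumes the usual parameterization in which the last latent class is the reference category with its logit fixed at zero. The function name and all coefficient values are hypothetical and are not taken from any output in this thread.

```python
import math

def class_probabilities(intercepts, slopes, x):
    """Return P(c = k | x) for a multinomial logit, last class as reference.

    intercepts and slopes hold one value per non-reference class.
    """
    logits = [a + b * x for a, b in zip(intercepts, slopes)]
    logits.append(0.0)  # reference class: logit fixed at zero
    m = max(logits)     # subtract the maximum for numerical stability
    expd = [math.exp(v - m) for v in logits]
    total = sum(expd)
    return [e / total for e in expd]

# Hypothetical estimates for a 3-class c in two known classes (cg = 1, 2).
# Only the slope of c#1 ON x differs between the two known classes here.
params_by_group = {
    1: {"intercepts": [0.5, -0.2], "slopes": [0.8, 0.1]},
    2: {"intercepts": [0.5, -0.2], "slopes": [0.2, 0.1]},
}

for g, p in params_by_group.items():
    probs = class_probabilities(p["intercepts"], p["slopes"], x=1.0)
    print("cg =", g, [round(v, 3) for v in probs])
```

A model in which the slopes are held equal across the known classes is nested in the one where they differ, so the two can be compared with a likelihood-based test to judge whether the variation is significant.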
 Andy Ross posted on Tuesday, October 23, 2007 - 10:06 am
Many thanks Linda

I have another query if I may...

I ran an analysis using the knownclass function holding the conditional probabilities equal across two samples.

In the Mplus output I am informed that I have achieved this aim. However, on saving the cprobs and using these to create a weight variable so that I can recreate the solution in SPSS, the solution for the two samples is only fairly equivalent. Would you know why this is? I have saved the cprobs to 16 decimal places, so I expected the SPSS solution to be a highly accurate representation of the Mplus one (it has been before).

Many thanks for your support

Andy
 Linda K. Muthen posted on Tuesday, October 23, 2007 - 1:54 pm
I'm not sure what you mean. What solution are you trying to reproduce in SPSS and how are you using the posterior probabilities to do this? If you cannot describe this briefly, please send the relevant information and your license number to support@statmodel.com.
 Thessa Wong posted on Tuesday, May 12, 2009 - 5:58 am
Hello,

I am also working on a multigroup mixture model with known classes. My question is about the decision regarding the best model. If I run two models that are nested, how can I decide which model is the best? I would say that it is not enough to look at the BIC only.

Thank you very much in advance!

Thessa
 Linda K. Muthen posted on Tuesday, May 12, 2009 - 8:54 am
You can use -2 times the loglikelihood difference which is distributed as chi-square to test nested models.
 Alexandre Morin posted on Tuesday, May 12, 2009 - 9:04 am
But don't forget to use the scaling correction factors, since by default Mplus uses MLR to estimate mixtures (unless you changed this default). See: http://www.statmodel.com/chidiff.shtml
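The scaled difference calculation can be sketched in a few lines of Python. This assumes the loglikelihood-based version of the Satorra-Bentler scaled difference test, where L0, c0, and p0 are the loglikelihood, scaling correction factor, and number of free parameters of the nested model, and L1, c1, and p1 are those of the comparison model. The function names and the input values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points for df = 2 (even) or df = 1 (odd)
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(k + 2) = Q(k) + half**(k/2) * exp(-half) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def scaled_lr_test(L0, c0, p0, L1, c1, p1):
    """Scaled loglikelihood difference test for MLR estimates."""
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)  # difference test scaling correction
    trd = -2.0 * (L0 - L1) / cd           # scaled test statistic
    df = p1 - p0
    return trd, df, chi2_sf(trd, df)

# Hypothetical values, for illustration only
trd, df, p_value = scaled_lr_test(L0=-2606.0, c0=1.450, p0=39,
                                  L1=-2583.0, c1=1.546, p1=47)
print(round(trd, 3), df, round(p_value, 4))
```

With ESTIMATOR = ML instead of MLR, the correction factors are not needed and the statistic reduces to the plain -2 times the loglikelihood difference.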
 Thessa Wong posted on Tuesday, July 28, 2009 - 1:56 am
Hello,
I have another question about my model. Again I am doing a multiple group analysis using mixture modelling with known classes (i.e. sex), since my dependent variable is Poisson distributed. When I tried to analyze a model with latent variables, the computation could not be completed (even when the means/regression coefficients were not allowed to differ across groups). The program advises giving starting values (see below). Can you tell me what is the best way to choose starting values for this model?

Thank you in advance!


Unperturbed starting value run did not converge.

1 perturbed starting value run(s) did not converge.


THE ESTIMATED COVARIANCE MATRIX IN CLASS 1 COULD NOT
BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 389.
CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE ESTIMATED COVARIANCE MATRIX IN CLASS 1 COULD NOT
BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 389.
CHANGE YOUR MODEL AND/OR STARTING VALUES.


WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE
NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION
TO AVOID LOCAL MAXIMA.



THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE
COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
 Linda K. Muthen posted on Tuesday, July 28, 2009 - 6:20 am
Please send your input, data, output, and license number to support@statmodel.com.
 Sabrina Oesterle posted on Tuesday, December 15, 2009 - 6:13 pm
I have a question about using the KNOWNCLASS option to do a multiple group analysis in a latent class model. I have covariates predicting my latent class variable and I want the relationship between the covariates and the latent classes to vary between my KNOWNCLASS groups. How do I incorporate that into the syntax?

C is my latent class variable with 3 latent classes.
G is my knownclass variable, which references 2 gender groups (0=male and 1=female).

If I do it the following way, I get one set of coefficients for the regression of C on the covariates that does not vary by G.

MODEL:
%Overall%
C on G;
C#1 on covariates;
C#2 on covariates;

And the following doesn't seem to give me the full set of regression coefficients for C on covariates.

MODEL:
%Overall%
C on G;
C#1 on covariates;
C#2 on covariates;

MODEL G:
%G#1%
C#1 on covariates;
C#2 on covariates;
%G#2%
C#1 on covariates;
C#2 on covariates;

Do I need the MODEL G: command or do I list the %G#1% and %G#2% under the first MODEL: section? I am confused about that.
 Linda K. Muthen posted on Wednesday, December 16, 2009 - 12:03 pm
Please send your input, data, output, and license number to support@statmodel.com.
 Peter Mulhall posted on Friday, December 03, 2010 - 8:21 am
Hi,
Is it possible to run a model like example 7.21 in the Mplus manual except using categorical indicators? If so, what would the input file look like? When I try to run such a model I get the following for each of my indicators:

ERROR in MODEL command
Variances for categorical outcomes can only be specified using
PARAMETERIZATION=THETA with estimators WLS, WLSM, or WLSMV.

Thanks
 Linda K. Muthen posted on Sunday, December 05, 2010 - 11:19 am
Yes, but variances are not estimated for categorical variables so you need to remove

y1-y4;
 Adrianne Alpern posted on Monday, September 19, 2011 - 11:15 am
Hi there,

I specified TYPE=MIXTURE and used the KNOWNCLASS feature to run a multigroup model that should have been run using Bayesian estimation due to a small n. I learned that Bayesian estimation is not possible for multiple groups models, but was able to run the model this way.

Can you tell me how parameters are estimated in LCA/mixture models (e.g., using KNOWNCLASS), and how this is different from default settings for multiple groups? Is parameter estimation more robust?

Best regards,
Adrianne
 Linda K. Muthen posted on Monday, September 19, 2011 - 12:39 pm
You will obtain the same results using the GROUPING option as using the KNOWNCLASS option if all else is the same.
 Adrianne Alpern posted on Friday, October 07, 2011 - 4:55 am
Hi again,

Thank you for your response. The same model using the GROUPING option gives an error message that reads, "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 35."

When I use the KNOWNCLASS option with LCA, some parameters are fixed automatically "TO AVOID SINGULARITY OF THE INFORMATION MATRIX." Can you see any problem with this? Are the results the same as they would be if I constrained the parameter myself?

Best regards,
Adrianne
 Linda K. Muthen posted on Friday, October 07, 2011 - 7:41 am
Please send your output and license number to support@statmodel.com. It may be that you are mentioning the first factor indicator in the groups and classes. If you do that, you relax the fact that it is fixed at one.
 Laura Baams posted on Saturday, November 26, 2011 - 12:13 pm
Hi,

I have a question about a parallel process model with Knownclass and Bayes.

I would like to know the regression of s1 ON i2, and s2 ON i1 estimated freely for both classes. The syntax below enables me to do that. However, I would also like to obtain the correlations s1 WITH s2 and i1 WITH i2 estimated freely for both classes.

In the multigroup MLR option I was able to do this. However with Bayes I get the following error message:


*** FATAL ERROR
VARIANCE COVARIANCE MATRIX IS NOT SUPPORTED WITH ESTIMATOR=BAYES.
PARTIAL EQUALITY BETWEEN TWO VARIANCE COVARIANCE BLOCKS.
IF TWO PARAMETERS FROM TWO DIFFERENT VARIANCE COVARIANCE BLOCKS ARE HELD EQUAL THEN ALL THE PARAMETERS HAVE TO BE EQUAL IN THE TWO BLOCKS.

Any help would be great! Thanks so much!

This is the Analysis and Model part of the syntax:

ANALYSIS:
TYPE = MIXTURE;
ESTIMATOR = BAYES;
STVALUES=ML;
STITERATIONS = 100;
CHAINS = 2;
PROCESSOR = 2;
ALGORITHM = GIBBS(RW);
MODEL:
%OVERALL%
i1 s1 | w1_A@0 w2_A@1 w3_A@2;
i2 s2 | w1_B@0 w2_B@1 w3_B@2;

%PR_2#1%
s1 on i2;
s2 on i1;
s1 with s2;
i1 with i2;

%PR_2#2%
s1 on i2;
s2 on i1;
s1 with s2;
s1 with i2;
 Bengt O. Muthen posted on Saturday, November 26, 2011 - 6:48 pm
Are you using version 6.12?
 Laura Baams posted on Sunday, November 27, 2011 - 12:59 am
No I have version 6.1.
 Linda K. Muthen posted on Sunday, November 27, 2011 - 11:18 am
Please send your input, data, output, and license number to support@statmodel.com.
 Ebrahim Hamedi posted on Thursday, March 01, 2012 - 1:48 pm
Hi
I would like to do a simple class analysis (measurement and not structural model) in three cultures. I failed to find anything on the website or in the book concerning multiple-group analysis and measurement invariance testing. Can you please let me know if I can define groups (countries) for LCA. Can you also give me some info about equality constraints in multiple group LCA?
This is an example of what I (and many other psychologists) need to do with LCA:

http://www.springerlink.com/content/l3u1l343u202gn25/

many thanks,
Ebi
 Linda K. Muthen posted on Thursday, March 01, 2012 - 3:48 pm
For LCA because the only parameters are thresholds, you can simply regress the latent class indicators on the dummy variables representing the groups. It is not necessary to do multiple group analysis. If the group dummy variables influence the latent class variable but not the latent class indicators directly, you have measurement invariance. If there are some direct effects, you have measurement non-invariance.

See Chapter 14 of the user's guide where there is a section on multiple group analysis. Everything in this section also applies to known classes. See the Topic 5 course handout on the website where multiple group analysis with mixture models is discussed.
 Ebrahim Hamedi posted on Thursday, March 01, 2012 - 5:00 pm
Thank you very much indeed for your reply. The method you are suggesting reminds me of a MIMIC model, and I am comfortable with it. I just made a hypothetical syntax (for three countries, 8 dichotomous items) based on your suggestion using Topic 5 course handout. I made dummies for two countries and not three.
This is totally hypothetical and I have not run it yet. Just want to double check with you if my syntax is going to work well for my purpose. Do you find any problem in this syntax? Or do I need to add any other thing for this multi-group analysis?


TITLE: LCA for three countries
DATA: FILE IS asb.dat;
VARIABLE: NAMES ARE B1-B8 US Thai;
USEVARIABLES ARE B1-B8 Us Thai;
CLASSES = c(3);
CATEGORICAL ARE B1-B8 Us Thai;
ANALYSIS: TYPE = MIXTURE;
MODEL:
%OVERALL%
c#1-c#3 ON US Thai;
OUTPUT: TECH1 TECH8;


many thanks,
Ebi
 Linda K. Muthen posted on Friday, March 02, 2012 - 11:44 am
This is the correct model to start. You should then add each b variable regressed on US and Thai one at a time.
 Joseph E. Glass posted on Friday, April 20, 2012 - 1:06 pm
I am using the KNOWNCLASS to conduct a multiple-group analysis because my dependent variable is nominal. This multinomial DV is regressed on an exogenous latent factor (CFA) composed of a mix of categorical and continuous items, some of which I know are not invariant across the known classes. When I specify my baseline model with configural invariance, I see that the residual variance for the continuous indicator is fixed across known classes. This seems to be different from the normal defaults. Is the behavior of the KNOWNCLASS statement documented somewhere, regarding how to use it to evaluate measurement invariance and adjust for non-invariance? Thank you.
 Linda K. Muthen posted on Friday, April 20, 2012 - 1:19 pm
The defaults differ in different tracks of the program. You can relax the equality by mentioning the parameter in the class-specific parts of the MODEL command. The steps to test for measurement invariance do not differ, just the defaults.
 Joseph E. Glass posted on Friday, April 20, 2012 - 2:23 pm
Great, I appreciate the clarification. Thanks for your speedy response as always.
 Orla McBride posted on Monday, June 11, 2012 - 8:02 am
I'm trying to conduct a MG LCA with 10 binary indicators using the knownclass command (for gender) but 2 of the indicators are answered by men only. This means that the data for women can only be missing on these variables.

My question is: can this MG analysis be conducted (perhaps using constraints?) if some of the variables are not common across men and women?
 Linda K. Muthen posted on Monday, June 11, 2012 - 5:02 pm
You can either analyze each group separately or not use the indicators with missing data for women.
 Lisa M. Yarnell posted on Thursday, March 14, 2013 - 7:38 pm
Hi Linda,

1. Can you explain (briefly) the purpose of the KNOWNCLASS option in mixture modeling? If classes are known, and mixture modeling accounts for group membership that is only probabilistic, what is the value of the KNOWNCLASS option? Why does Mplus software recommend the KNOWNCLASS option in certain scenarios over a multiple-group approach?

2. Can results of a mixture model estimated with the KNOWNCLASS option be compared with results from a model estimated outside of the mixture modeling framework? Or is estimation inherently different?

3. Is it possible to estimate a KNOWNCLASS model where there really is only one group?

Thank you.
 Linda K. Muthen posted on Friday, March 15, 2013 - 6:26 am
1. Sometimes in mixture modeling, people want to compare groups like males and females. The KNOWNCLASS option is a way to do this. Sometimes multiple group analysis is not available using the GROUPING option and must be done using the KNOWNCLASS option. This has no statistical implications.

2. KNOWNCLASS and GROUPING do the same thing and if you do the same analysis in both ways you will obtain the same results.

3. Yes but the results would be the same as not using the KNOWNCLASS option. I'm not sure if we give a message in this case.
 Lisa M. Yarnell posted on Tuesday, March 19, 2013 - 5:35 am
Hi Linda, can you explain the difference between the "c" and "cg" factors in Example 7.21 of the Version 7 User's Guide?

It seems that both "c" and "cg" represent class membership. I'm confused about that and the implications for drawing up my general and class-specific models, where I will ultimately want to test some different patterns of beta weight (regression) parameters across the classes.

Is it that cg will reflect class differences for variances/covariances/beta weights, and c will reflect class differences for means/intercepts?

Thank you!
 Linda K. Muthen posted on Tuesday, March 19, 2013 - 6:43 am
cg is based on an observed variable. The classes are therefore not estimated but are known. It is identical to a grouping variable. c is a categorical latent variable for which class membership is estimated. In both cases, parameters can vary across the classes.
 Lisa M. Yarnell posted on Tuesday, March 19, 2013 - 6:54 am
OK, so if the model I am working with is not a measurement model but just a structural path model, no "c" would be needed--only the "cg" factor, which will allow the model as a whole to differ in certain ways across the classes?

Is that right?
 Linda K. Muthen posted on Tuesday, March 19, 2013 - 7:51 am
There is no relationship between c and cg and which parameters can vary across classes. The only difference is that with c classes are estimated and with cg classes are not estimated.
 Lisa M. Yarnell posted on Tuesday, March 19, 2013 - 8:15 am
Why do I need a c factor if there is no latent portion to my model?

cg will reflect class membership, and that's all that is needed, right?

No c is needed when there is no latent portion to the model itself?
 Lisa M. Yarnell posted on Tuesday, March 19, 2013 - 9:01 am
Basically, Linda, this is my model (below). Since there are no latent variables in this model, can I just use "cg" but not "c"?

I have 5 classes. I am running this as a KNOWNCLASS model because aud_grp (the dependent variable) is a count variable, which works outside of the mixture modeling framework with 1 group, but not with 5 groups. Thank you.

MODEL:

aud_grp on
pardrink
cond_col
age_col (b_agecol);

pardrink with cond_col@0;
age_col with cond_col@0;
age_col with pardrink@0;

age_col on
pardrink (a_pard)
cond_col (a_cond);

MODEL CONSTRAINT:
NEW (ind_pard ind_cond);
ind_pard = a_pard*b_agecol;
ind_cond = a_cond*b_agecol;
 Linda K. Muthen posted on Tuesday, March 19, 2013 - 9:02 am
If all classes are known, you need only a KNOWNCLASS variable.
 Lisa M. Yarnell posted on Tuesday, March 19, 2013 - 9:10 am
Great! Thank you. So I only need either cg or c, but not both. (Correct me if I have misinterpreted.)

Thanks!
 Linda K. Muthen posted on Tuesday, March 19, 2013 - 9:23 am
You need one categorical latent variable. It can be called anything. It is also a KNOWNCLASS variable.
 Lisa M. Yarnell posted on Thursday, March 21, 2013 - 5:35 pm
Hi Linda,

Why is it that in some class models, one must state the MODEL command a second time (after it is stated for the overall model), and in other class models, you do not need a second model command?

I have a single class variable in my model (which I called "cg"). I did not re-state the MODEL command when I started writing code for class-specific parameters. The model ran and the results are great.

But then I noticed that I had not written the MODEL statement again for the class-specific estimates, like this:

----------------------------------------------------------
CLASSES = cg (5);
KNOWNCLASS = cg (eth_gene=1 eth_gene=2 eth_gene=3 eth_gene=4 eth_gene=5);

ANALYSIS: TYPE = MIXTURE;
ALGORITHM=INTEGRATION;
INTEGRATION = MONTECARLO;

MODEL:
%OVERALL%
(overall parameters here)

MODEL cg:
%cg#1%
(parameters that differ in class 1)
----------------------------------------------------------

The model runs great when I DO NOT include the "MODEL cg" line, but when I do include it, I get the message:
*** ERROR in MODEL command
Unknown class model name CG specified in C-specific MODEL command.

Why is that? I am surprised that my initial run went smoothly without the second statement of the MODEL command.

Thanks for helping me understand, Linda.
 Linda K. Muthen posted on Thursday, March 21, 2013 - 6:13 pm
Please send the output and your license number to support@statmodel.com.
 Piotr Bialowolski posted on Wednesday, July 31, 2013 - 1:53 pm
Dear All,

If I may kindly ask you for advice ...

I received a review of my paper, which applies multigroup latent class analysis. One of the points the reviewer makes is:

Please provide a brief discussion on estimation when there is a different number of observations in each group.

As my groups are time points and the number of responses at different time points ranges from 3000 to 10000 I would like to ask you:
Does the different size of groups affect the results in Mplus? Could you help me with some references to deal with the issue raised by the reviewer?

I already tried hard on my own to find the answer but was completely unsuccessful.
Thanks in advance
Piotr
 Linda K. Muthen posted on Thursday, August 01, 2013 - 10:43 am
Are the same people in your groups? Do you have repeated measures of the same people across time?
 Piotr Bialowolski posted on Friday, August 02, 2013 - 1:39 am
Dear Linda,

I have a panel, but I don't use this information. I just treat each period separately as a different group. It will be the next step of my research to use latent transition.
I think that the reviewer is just interested in technical issues of estimating multigroup LCA with different number of respondents in each group.

Thank you very much

Piotr
 Linda K. Muthen posted on Friday, August 02, 2013 - 8:32 am
Multiple group analysis requires the groups to consist of different people. You should be comparing across time in a single group analysis.
 Mike Todd posted on Wednesday, August 07, 2013 - 11:41 am
We have data from 2200 individuals sampled from 2 different cities. Our goal is to use 7 individual-level indicators to obtain meaningful latent profile solutions.

In exploring the possibility that the profile solutions differ between cities via the KNOWNCLASS command, we have obtained somewhat confusing results.

Allowing only the estimated item means to vary across cities (KNOWNCLASS categories) results in a large increase in the number of parameters (30 vs. 52) but *worse* fit as judged by absolute differences in -2LL, BIC, and AIC.

I estimated a series of 4 nested(?) models each with 3 derived/estimated classes and 2 observed/known classes. Model 1 ignores city altogether (no KNOWNCLASS command); Model 2 allows item means to vary across cities, Model 3 allows item means and class probabilities to vary across cities; and Model 4 allows item means, item variances, and class probabilities to vary across cities.

Model 4 fit better than Model 3, which fit better than Model 2, which makes sense to me. But only Model 4 fit better than Model 1, which confuses me.

I feel like I must be missing something fundamental about the nestedness (or non-nestedness) of my models. The results suggest that Models 2 and 3 are not actually less constrained versions of Model 1. Is this true?
 Bengt O. Muthen posted on Wednesday, August 07, 2013 - 2:34 pm
Model 1 is not on a loglikelihood metric comparable to the other models, which also means that BIC and AIC are not on a comparable metric. The reason is that Knownclass contributes to the likelihood (imagine an observed indicator, the probability of which is estimated).
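One way to picture the extra term: if the known-class variable is treated as an observed categorical outcome whose category probabilities are estimated, its maximum-likelihood contribution to the loglikelihood is the sum over groups of n_g * log(n_g / N). The Python sketch below computes that quantity under this simplifying assumption. The group sizes are hypothetical and the function name is illustrative only.

```python
import math

def knownclass_loglik_contribution(group_sizes):
    """Loglikelihood contribution of an observed grouping variable.

    Assumes the group probabilities are estimated freely, so the
    ML estimate for group g is n_g / N.
    """
    n_total = sum(group_sizes)
    return sum(n * math.log(n / n_total) for n in group_sizes if n > 0)

# Hypothetical split of a sample into two known classes
sizes = [1200, 1000]
print(round(knownclass_loglik_contribution(sizes), 3))
```

This term is never positive and is absent from a model without KNOWNCLASS, which is why the loglikelihood, AIC, and BIC of the two kinds of model cannot be compared directly.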
 Sointu Leikas posted on Friday, September 13, 2013 - 2:56 am
Hi, I have an LTA model with two time points and 4 classes at both time points. I need to test whether the transition probabilities differ by gender by testing 1) an LTA model in which the transition probabilities are constrained to equality across gender, and 2) a model where the probabilities are allowed to vary between genders.
I haven't been able to find out how to constrain trans. probabilities to equality. I tried like this:

CLASSES = csex (2) c1(4) c2(4);
KNOWNCLASS IS csex (SEX=1 SEX=2);
ANALYSIS: TYPE = mixture;
STARTS = 100 25;
MODEL:
%OVERALL%
c2#1 ON c1#1 csex#1 (p1);
c2#1 ON c1#2 csex#1 (p2);
c2#1 ON c1#3 csex#1 (p3);
c2#2 ON c1#1 csex#1 (p4);
c2#2 ON c1#2 csex#1 (p5);
c2#2 ON c1#3 csex#1 (p6);
c2#3 ON c1#1 csex#1 (p7);
c2#3 ON c1#2 csex#1 (p8);
c2#3 ON c1#3 csex#1 (p9);

And I can't add

c2#1 ON c1#1 csex#2 (p1);
c2#1 ON c1#2 csex#2 (p2);
etc. because I can't refer to the last class of csex on MODEL command.

Any help would be appreciated.
 Sointu Leikas posted on Friday, September 13, 2013 - 2:57 am
Sorry, I had to shorten my message and the meaning was lost. What I tried above does constrain the simple class probabilities as equal for men and women, but not the transition probabilities.
 Bengt O. Muthen posted on Friday, September 13, 2013 - 9:02 am
Take a look at Mplus Web Note 13, parameterization 2.
 Sointu Leikas posted on Monday, September 23, 2013 - 5:52 am
Thank you, I modified my code based on the Web Note 13, like this:
MODEL:
%OVERALL%
c1 ON csex;
c2 ON c1;
MODEL c1:
%c1#1%
c2#1 ON csex;
c2#2 ON csex;
c2#3 ON csex;
%c1#2%
c2#1 ON csex; etc.

but I get an error message "Invalid ON statement: C2#1 ON CSEX#1. The order of categorical latent variables does not allow for this regression." But, csex is mentioned first in the CLASSES command? So, I'm puzzled as to what to do.
 Linda K. Muthen posted on Monday, September 23, 2013 - 7:13 am
It should not be first. See the CLASSES option in the user's guide where the order is discussed.
 Sointu Leikas posted on Tuesday, September 24, 2013 - 2:52 am
Dear Linda,

The CLASSES option in the User's Guide tells me that the class on which other classes are regressed should be first. Because I try to regress c1 and c2 on csex, I figured csex should be first in the CLASSES command. But be that as it may, I get a similar type of error message regardless of which order I use in CLASSES. If csex is second or last, the error message says "Invalid ON statement: C1#1 ON CSEX#1. The order...", and if csex is first, I get the above error message "Invalid ON statement: C2#1 ON CSEX#1...".
 Linda K. Muthen posted on Tuesday, September 24, 2013 - 6:22 am
Please send the output and your license number to support@statmodel.com.
 Kathryn Modecki posted on Saturday, March 29, 2014 - 3:27 pm
Dear Prof Muthen, I have conducted two separate LPAs with two age groups (adolescents and young adults). I used the Lo-Mendell-Rubin likelihood ratio test and the bootstrapped likelihood ratio test to determine the best fitting and most parsimonious models. For adolescents this is clearly a 3 class solution and for young adults a 2 class solution. My understanding is that when it is clear, as in this case, that a 2 class solution won't fit for both groups, a multiple-group analysis is not feasible. Is this your view? If this is the case, is it possible to test indicator mean differences across models, as this is not a multiple group model? Thank you.
 Linda K. Muthen posted on Sunday, March 30, 2014 - 10:51 am
I agree that a multiple group analysis is not appropriate when the classes for the two groups are not the same. Indicator means across models cannot be tested.
 Shuai Chen posted on Thursday, October 16, 2014 - 7:25 pm
Hello,

I am fitting a multigroup mixture model with the known class gender and 3 latent classes according to example 7.21 but without parameter restrictions:
CLASSES = cg (2) c (3);
KNOWNCLASS = cg (male = 0 male = 1);

MODEL:
%OVERALL%
c ON cg;

and compare it with the model without multiple group:
CLASSES = c (3);

I expected to get larger loglikelihood for multigroup mixture model, but it is -5587.569, smaller than -4874.786 from the model without multiple group. I also fitted with 1-5 latent classes, and each time the multigroup mixture model has smaller loglikelihood. Any explanation?

Thank you very much!
 Bengt O. Muthen posted on Friday, October 17, 2014 - 11:48 am
The Knownclass loglikelihood is not on a scale comparable to that without Knownclass. This is because Knownclass essentially has an observed binary indicator as an extra DV. If you want to make this type of comparison I think you have to use gender as a covariate of c in your "multiple-group" model (in the model that takes gender into account):

c ON gender;

That way, you have the same DVs in the different models.
 Shuai Chen posted on Wednesday, October 22, 2014 - 7:03 am
Thanks for the suggestion. My DVs are categorical variables. However, I tried your suggested way with 3 classes and found that the thresholds for the male group are the same as the thresholds for the female group. Can I fit a model with different thresholds for the two gender groups and still get the loglikelihood I need?
 J.D. Smith posted on Friday, October 24, 2014 - 11:55 am
Hi, I receive this error when trying to run a multiple group model with the Bayesian estimator using the mixture model with KNOWNCLASS command:

*** FATAL ERROR
VARIANCE COVARIANCE MATRIX IS NOT SUPPORTED WITH ESTIMATOR=BAYES. PARTIAL EQUALITY BETWEEN TWO VARIANCE COVARIANCE BLOCKS. IF TWO PARAMETERS FROM TWO DIFFERENT VARIANCE COVARIANCE BLOCKS ARE HELD EQUAL THEN ALL THE PARAMETERS HAVE TO BE EQUAL IN THE TWO BLOCKS. USE ALGORITHM=MH TO RESOLVE THIS PROBLEM.

The following model runs with ESTIMATOR = ML

CLASSES = CG (3);
KNOWNCLASS = CG (Cond = 1 Cond = 2 Cond = 3);

Analysis:
estimator = BAYES;
TYPE = MIXTURE;

Model:
%OVERALL%
Con2 on Con1;!(a1);
Con2 on Age;!(b1);
Con2 on TCGen;!(c1);
Con2 on COACHS;!(d1);
Age Tcgen Con1 COACHS;

%CG#1%
Con2 on Con1;!(a1);
Con2 on Age;!(b1);
Con2 on TCGen;!(c1);
Con2 on COACHS;!(d1);
Age Tcgen Con1 COACHS;

%CG#2%
Con2 on Con1;!(a2);
Con2 on Age;!(b2);
Con2 on TCGen;!(c2);
Con2 on COACHS;!(d2);
Age Tcgen Con1 COACHS;

%CG#3% is same as above (message too long if included)

Version is 7.2. Any help is much appreciated.
 Tihomir Asparouhov posted on Friday, October 24, 2014 - 2:59 pm
You have two options both giving the same conditional distribution estimation
[Con2 | Age Tcgen Con1 COACHS]

1. Preferable since you don't estimate as many parameters: Remove all the lines
Age Tcgen Con1 COACHS;
With this option the distribution for
[Age Tcgen Con1 COACHS] is not estimated, i.e., they are treated as true covariates and no assumptions are made about their distribution.


2. In each class add
Age Tcgen Con1 COACHS with Age Tcgen Con1 COACHS;
So that the covariances are also class specific (not just the variances).

If you have missing data on these variables only option 2 will be possible.
 Julia Moeller posted on Thursday, September 17, 2015 - 9:38 am
Hi, I computed a multigroup latent profile analysis with two groups using the knownclass command.

Using the knownclass command, I don't get the adjusted Lo-Mendell-Rubin likelihood ratio test, nor the Vuong-Lo-Mendell-Rubin likelihood ratio test, nor the parametric bootstrapped likelihood ratio test.

My reviewers have requested these tests nevertheless.

What would be a good way to provide similar information with the knownclass command, or what would be a useful workaround for this problem?

In advance thank you very much!
 Bengt O. Muthen posted on Thursday, September 17, 2015 - 6:13 pm
Yes, those tests aren't available with more than one c.

Personally, I would simply use BIC.

I can only think of using observed covariates instead of the knownclass variable.
 Julia Moeller posted on Friday, September 18, 2015 - 3:32 pm
Thank you very much!
In addition, I got the suggestion to compute the Bayes Factor and Correct Model Probability based on the formulae of Masyn (2013).

I computed both indices, but wonder about the results.

Do you think these indices work with the knownclass multigroup approach? I don't see a reason why not, but just to double-check.
 Bengt O. Muthen posted on Friday, September 18, 2015 - 6:00 pm
I think so, but you should check with Masyn - she is now at GSU.
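For readers who want to reproduce these indices, here is a minimal Python sketch. It assumes the BIC-based approximations commonly attributed to Masyn (2013): SIC = -0.5 * BIC, an approximate Bayes factor BF(A, B) = exp(SIC_A - SIC_B), and a correct model probability for model A equal to exp(SIC_A) divided by the sum of exp(SIC_j) over all candidate models. The function names and BIC values are hypothetical, and the formulas should be checked against the original chapter before being relied on.

```python
import math

def approx_bayes_factor(bic_a, bic_b):
    """Approximate Bayes factor of model A over model B from their BICs."""
    return math.exp(-0.5 * (bic_a - bic_b))

def correct_model_probabilities(bics):
    """Approximate correct model probability for each candidate model."""
    sics = [-0.5 * b for b in bics]
    m = max(sics)  # subtract the maximum to avoid underflow in exp()
    weights = [math.exp(s - m) for s in sics]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical BIC values for three candidate models
bics = [10250.4, 10198.7, 10203.1]
print([round(p, 3) for p in correct_model_probabilities(bics)])
print(round(approx_bayes_factor(bics[1], bics[2]), 3))
```

Both indices depend only on BIC differences among the candidate models, so they are meaningful only when every model in the set is on the same loglikelihood metric, for example when all of them use the same KNOWNCLASS variable.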
 Martina Narayanan posted on Thursday, April 28, 2016 - 3:41 am
I am trying to use TYPE=MIXTURE RANDOM and the KNOWNCLASS option for multiple group analysis using XWITH (for the interaction between a latent continuous [nevro] and an observed continuous variable [sle]). My latent variable [nevro] has categorical indicator variables.

Classes = cg (2) c (1);
KNOWNCLASS = cg (Gender = 1 Gender = 2);

Model:

%OVERALL%
nevro BY sado011 sado007 sado008 sado027 sado019;
mod | nevro XWITH sle;
scl6 On sle nevro mod;

%cg#1.c#1%
scl6 On sle nevro mod;

However, I get the warning: "ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX." etc.
Unfortunately, the fixed parameters are the regression coefficients for my interaction variables [mod].

Am I doing something wrong in my syntax?

In advance, thank you for your help!
 Martina Narayanan posted on Thursday, April 28, 2016 - 3:51 am
A quick clarification: I do get an estimate for the regression coefficient for my interaction variable [mod] but the standard error is zero and accordingly I don't get a significance estimate (999.000).
 Linda K. Muthen posted on Thursday, April 28, 2016 - 6:46 am
Please send the output and your license number to support@statmodel.com.
 Lisa M. Yarnell posted on Wednesday, December 21, 2016 - 7:32 am
Hello, Drs. Muthen.

In general, does computational burden increase with the number of known classes that are modeled? I am interested in running two-level mixture models with seven cohorts each represented in a known class.

However, I am concerned about running a seven-class model, when the two-level model itself already has the WITHIN and BETWEEN parts.

Is there a number of known classes beyond which computation will generally become too burdensome? In that case, I might collapse some cohorts by age group, but I want to see if I can keep all age groups separate to discern the most about the process at each age, if possible.

Thank you.
 Lisa M. Yarnell posted on Wednesday, December 21, 2016 - 7:36 am
P.S. I know that computational burden also depends on the complexity of the model--such as the number of latent variables and random slopes that are estimated. However, I wonder if you have recommendations about sheer number of known classes as a separate factor. Does it depend on the number of observations in each class?
 Tihomir Asparouhov posted on Wednesday, December 21, 2016 - 8:11 pm
I don't think the number of classes is an issue at all because the classes are known. All between random effects will have to be integrated, however, so if you have more than two you should use Monte Carlo integration with 5000 points. Soon Mplus will have Bayesian estimation for these models and they will be easy to estimate.
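
A minimal sketch of the settings this suggests, assuming a two-level mixture in which seven cohorts form the known classes. The variable names (cohort, school, cg) and the cohort codes are hypothetical, and the DATA and MODEL commands are omitted.

```mplus
VARIABLE:
  CLASSES = cg (7);
  ! hypothetical cohort codes 1-7
  KNOWNCLASS = cg (cohort = 1 cohort = 2 cohort = 3 cohort = 4
                   cohort = 5 cohort = 6 cohort = 7);
  CLUSTER = school;                 ! hypothetical cluster identifier
ANALYSIS:
  TYPE = TWOLEVEL MIXTURE;
  ALGORITHM = INTEGRATION;
  INTEGRATION = MONTECARLO (5000);  ! Monte Carlo integration, 5000 points
```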
 Fan Xizhen posted on Thursday, February 16, 2017 - 1:56 am
Dear Prof. Muthen,
I'm working on a multigroup profile analysis with known classes, but I'm still confused about the syntax and need your help with the following problems:
1) Example 7.21 (Mixture modelling with known classes - multiple group analysis) in the Mplus User's Guide Version 7.0 is a fully unconstrained MLPA, right? So the means of y1, y2, y3, and y4 vary across the classes of c, while the variances of y1, y2, y3, and y4 vary across the classes of cg.
2) Example 7.21 does not mention whether the profile sizes vary freely across samples or are held equal. If I want to constrain the profile sizes across samples, which syntax should I use? I searched through the manual but could not find it.
3) How do I constrain the means or variances of the LPA indicators across samples? What is the syntax?

Thanks in advance, your reply would be highly appreciated!
 Jon Heron posted on Thursday, February 16, 2017 - 2:31 am
Hi Fan

I wonder whether you might get along better following the convention shown in example 8.8.

You have two grouping variables, CG and C; one is measured perfectly and the other is not. If each has two categories, then you are just talking about 4 groups defined by their combination.

By specifying the means and variances of Yi within each combination you can apply whatever constraints you are interested in.

%cg#1.c#1%

%cg#2.c#1%

%cg#1.c#2%

%cg#2.c#2%
 Fan Xizhen posted on Thursday, February 16, 2017 - 3:28 am
Hi Jon,
Thank you very much!
Actually, no, I'm struggling with this syntax. Let me check whether I understood what you said.
If I want the means and variances to be equal, I just need to write it like this:
MODEL:
%OVERALL%
c ON cg;
%cg#1.c#1%
[y1-y5] (p1-p5);
%cg#2.c#1%
[y1-y5] (p6-p10);
%cg#1.c#2%
[y1-y5] (p1-p5);
%cg#2.c#2%
[y1-y5] (p6-p10);
And I'm also confused about how to constrain the profile sizes.
 Jon Heron posted on Thursday, February 16, 2017 - 3:53 am
To be on the safe side I would specify the variance terms too. It's then a trivial step to relax them if you choose. Perhaps I am over-cautious, but I don't like getting estimates which I haven't explicitly asked for.

As for the profile sizes, I think removing the "c ON cg;" statement will do that for you.

MODEL:
%OVERALL%
c ON cg;   ! remove this line to hold the profile sizes equal across cg
%cg#1.c#1%
[y1-y5] (p1-p5);
y1-y5 (v1-v5);
%cg#2.c#1%
[y1-y5] (p6-p10);
y1-y5 (v1-v5);
%cg#1.c#2%
[y1-y5] (p1-p5);
y1-y5 (v1-v5);
%cg#2.c#2%
[y1-y5] (p6-p10);
y1-y5 (v1-v5);
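
Note that labels shared between %cg#1.c#1% and %cg#1.c#2% equate the means across the latent profiles within a sample, not across the samples. A sketch of across-sample equality follows, assuming two known groups, two profiles and indicators y1-y5. The label names m11-m25 are arbitrary. The same labels are repeated across cg within each class of c, the variances are left unlabelled, and "c ON cg;" is left out so that the profile sizes are also held equal.

```mplus
MODEL:
%OVERALL%
y1-y5;               ! variances; no "c ON cg;", so profile sizes are equal across cg
%cg#1.c#1%
[y1-y5] (m11-m15);   ! profile 1 means, sample 1
%cg#2.c#1%
[y1-y5] (m11-m15);   ! same labels: profile 1 means equal across samples
%cg#1.c#2%
[y1-y5] (m21-m25);   ! profile 2 means, sample 1
%cg#2.c#2%
[y1-y5] (m21-m25);   ! same labels: profile 2 means equal across samples
```

Giving the two cg classes different labels within a profile relaxes the equality for that profile again.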
 Fan Xizhen posted on Thursday, February 16, 2017 - 4:02 am
Great, I'm clear now! Thank you very much, Jon, you've really done me a big favor!
 Jon Heron posted on Thursday, February 16, 2017 - 4:05 am
No problem. best of luck :-)
 Fan Xizhen posted on Tuesday, February 21, 2017 - 4:23 am
Dear Prof. Muthen,
I'm working on a multigroup profile analysis with known classes, and I need your help with the following problems:
1. Can you explain the purpose of the KNOWNCLASS option in mixture modeling? Is it the same as the multiple-group approach, or is it just one of the ways to conduct a multiple-group analysis?
2. Suppose I run an MLPA, for example on two cross-national samples, and choose the best-fitting model. If I then want to conduct a 3-step LPA analysis for the two samples, which result should I use: the MLPA result or the results from the two separate LPAs? I noticed that they are different.

Thanks in advance, your reply would be highly appreciated!
 Bengt O. Muthen posted on Wednesday, February 22, 2017 - 1:07 pm
This is answered elsewhere.
 Fan Xizhen posted on Wednesday, February 22, 2017 - 5:30 pm
Thank you very much, Mr Muthen, I really appreciated it.
 Fan Xizhen posted on Tuesday, February 28, 2017 - 7:26 pm
Dear Prof. Muthen,
I have some questions about MLPA that I was wondering if you can help me out.
1) To get measurement invariance in MLPA, what should be constrained to be equal? Should both the means and the variances of the indicators be constrained to be equal, or just the means?
2) What is the difference between constraining both the means and variances and constraining only the means? If the model with only the means constrained fits better than the one with both means and variances constrained, does it make sense to compare the differences between different cohorts?
Thanks in advance, your reply would be highly appreciated!
 Bengt O. Muthen posted on Wednesday, March 01, 2017 - 7:54 am
I would constrain only the means. Constraining the variances as well would be analogous to holding measurement error variances invariant and that is a strong assumption.
 Fan Xizhen posted on Wednesday, March 01, 2017 - 4:34 pm
Thank you very much, Mr. Muthen.
Could you give me some references on this?
 Bengt O. Muthen posted on Wednesday, March 01, 2017 - 5:58 pm
I don't know about literature on LPA invariance. Check articles on our website under Papers, LTA.
 Fan Xizhen posted on Wednesday, March 01, 2017 - 6:11 pm
Ok, thank you again!
 Sheila Frankfurt posted on Wednesday, March 15, 2017 - 1:19 pm
Hello Drs. Muthen and the Mplus team,

I am running a multiple group LGMM and want to estimate a covariate separately for each class in each group. However, I can only get results for the first group:

knownclass = g(tx=0 tx=1);
classes= g(2) c(3);
Analysis: TYPE = MIXTURE;
Model: %Overall%
i s|bl_gsitot@0 f3_gsitot@3 f6_gsitot@6;
c ON g;
c ON x;
MODEL g:
%g#1%
c ON x;
%g#2%
c ON x;

I have also tried to specify each group:

knownclass = g(tx=0 tx=1);
classes= g(2) c(3);
Analysis:TYPE = MIXTURE;
Model: %Overall%
i s|bl_gsitot@0 f3_gsitot@3 f6_gsitot@6;
c ON g;
c ON x;
%g#1.c#1%
c ON x;
%g#1.c#2%
[repeated for each group]

However, I get this error: The following model statements are ignored:
*Statements in Class %g#1.c#1% of Model:
c#1 on X
[repeated for each g#.c#]
***ERROR
One or more MODEL statements were ignored. Note that ON statements must appear in the OVERALL class before they can be modified in the class-specific models.

I do specify c ON x in the %OVERALL% model so I am confused why I get this message.

Thank you very much for your help!!
 Bengt O. Muthen posted on Wednesday, March 15, 2017 - 6:18 pm
Please send the output to support along with your license number so we can see.
 Timothy Piehler posted on Wednesday, October 25, 2017 - 12:24 pm
Hello,
I am hoping to use the KNOWNCLASS feature to investigate how latent class membership may moderate the relationship between two different intervention options and an outcome. I'm hoping to determine if a specific latent class membership is related to a stronger intervention response to intervention option 1 versus option 2. I have three latent classes and two knownclass intervention options. This general framework is described in Lanza & Rhoades (2013) Prevention Science paper ("Latent Class Analysis: An Alternative Perspective on Subgroup Analysis in Prevention and Treatment"), but I'm struggling with generating the appropriate syntax in Mplus. Are you able to point me towards any reference or syntax example that may be helpful?

Thank you!
 Bengt O. Muthen posted on Wednesday, October 25, 2017 - 1:10 pm
Start from UG ex 8.8. It has one Known latent class variable and one unknown latent class variable.
 Tony Kwan posted on Saturday, November 25, 2017 - 12:56 am
Dear Prof Muthen,

I am doing a multigroup multilevel analysis. However, when I run the program, the following error occurs:

VARIABLE:
NAMES=IDCUNTRY UNI_sch SEX DHA_1-DHA_5 ALA_1-ALA_5 HER_Q1 HER_Q3 F1-F3 s16-s17;
USEVARIABLES = UNI_sch DHA_1 F1 F2 F3 s16 s17;
CATEGORICAL = DHA_1;
USEOBSERVATIONS are (HER_Q1 EQ 1);
CLASSES= CUNTRY (33);
KNOWNCLASS = CUNTRY (IDCUNTRY = 36
IDCUNTRY = 48
...
IDCUNTRY = 9133
);
CLUSTER = UNI_sch;
WITHIN = F1 F2 F3 s16 s17;
MISSING = ALL(9999);

*** WARNING
Data set does not contain any cases where variable IDCUNTRY has a
value of 36.0. This is used to determine one of the classes
for the KNOWNCLASS specification.

There are 33 warnings with 33 values specified in my program regarding KNOWNCLASS option. I have no idea why it cannot read the values in IDCUNTRY. Thank you for your help!

Tony
 Bengt O. Muthen posted on Sunday, November 26, 2017 - 4:32 pm
Send your output, data, and license number to Support.
 Dayuma Vargas posted on Tuesday, April 17, 2018 - 10:32 am
Hello,
I wonder if the issue discussed by Sointy on the Sept 23, 2013 post was resolved and, if so, if it would be possible to know the solution?

I have run into the same problem: I have a 2-times LTA and am using KNOWNCLASS to introduce a dichotomous covariate. When I try to use parameterization 2 from Mplus web note 13, I get the "order of categorical latent variables does not allow for this regression" warning regardless of the order in which I introduce the classes in the VARIABLES section.

Syntax excerpt:
USEVARIABLES ARE LTA_T1C4 LTA_T2C4 e1g;
CLASSES = Ce1g(2) C1(4) C2(4);
KNOWNCLASS = Ce1g (e1g = 0 e1g = 1);

MODEL:
%OVERALL%
C1 ON Ce1g;
C2 ON C1;

MODEL C1:
%C1#1%
C2#1 ON Ce1g;
C2#2 ON Ce1g;
C2#3 ON Ce1g;
...

Alternatively, is there a way to calculate the conditional transition probabilities using parameterization 1 in the Mplus web note 13? I am interested in testing for differences in transition probabilities between the classes of the KNOWNCLASS variable.

Thank you.
 Bengt O. Muthen posted on Tuesday, April 17, 2018 - 4:36 pm
See our website's V8 UG ex 8.13 with the Tech15 discussion on p. 250, and also the alternative Parameterization=Probability. See also UG pp. 560-561.
 J.D. Haltigan posted on Wednesday, April 18, 2018 - 1:24 am
Conceptual question about using the KNOWNCLASS option with a basic LPA. If I have a set of Y indicators, and I want to see if the classes they generate differ in parameter estimates (for the indicators) across the KNOWNCLASS variable (akin to ANOVA logic) with no further evaluation of the latent classes with respect to outcomes (as there would be MI), can I still interpret the class prevalences for the latent class x known class patterns without concern? In short, all I want to establish is that the KNOWNCLASS grouping variable has a different empirical meaning across the derived latent classes, as the resultant indicator means for the set of n...k classes will serve to test my hypotheses.
 Bengt O. Muthen posted on Wednesday, April 18, 2018 - 3:55 pm
Perhaps you wonder if the KNOWNCLASS variable influences class formation - if so, yes. If you don't want that you can instead use dummy covariates in R3STEP.
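
A minimal sketch of the R3STEP alternative, assuming a grouping variable with three categories that has been recoded into two dummy covariates. The names d1, d2 and y1-y4 and the number of classes are hypothetical, and the DATA command and NAMES list are omitted.

```mplus
VARIABLE:
  USEVARIABLES = y1-y4;         ! class indicators only
  CLASSES = c (3);
  AUXILIARY = (R3STEP) d1 d2;   ! dummy covariates, kept out of class formation
ANALYSIS:
  TYPE = MIXTURE;
```

With this setup there is a single categorical latent variable, and the dummies enter only in the third step as predictors of class membership.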
 J.D. Haltigan posted on Wednesday, April 18, 2018 - 6:19 pm
Perfect, thanks Bengt. Assuming that there is no effect of the grouping variable on class formation as indicated by R3STEP, I can then more confidently infer that differences in parameter estimates across KNOWNCLASSES for the indicator y's reflect substantive differences in population heterogeneity (i.e., more evidence that the mixture classes do indeed reflect true subpopulations)? I guess I say this as typically we are interested in externally validating MI classes with outcomes. In this case, what we have in a way is a test of class discriminant validity on a grouping variable.
 Bengt O. Muthen posted on Thursday, April 19, 2018 - 4:26 pm
Right.
 Boliang Guo posted on Wednesday, May 09, 2018 - 3:38 am
Hi All, I tried running an ESEM multiple group analysis using the KNOWNCLASS option. The output says:
*** ERROR in MODEL command
Invalid equality label: *T

The ESEM code works well if I use the GROUPING option instead of the KNOWNCLASS option. It looks as if the KNOWNCLASS option doesn't accept the ESEM model-setting symbol (*t). Any help, please?

!GROUPING IS arm (0=tau 1=TX);
CLASSES = c (2);
KNOWNCLASS=c(arm=0 arm=1);

ANALYSIS:
ROTATION = BI-GEOMIN(orth);
MODEL = CONFIGURAL SCALAR;
ESTIMATOR=MLR;
TYPE=MIXTURE;
MODEL:
%OVERALL%
g1df d1f1-d1f3 by bk1 - bk21 (*t);
 Bengt O. Muthen posted on Thursday, May 10, 2018 - 2:55 pm
See page 720 of the V8 UG - ESEM is available only for General and Complex, not Mixtures.
 Stefan posted on Wednesday, August 14, 2019 - 8:48 am
Dear Prof Muthen,

I have conducted a MG LCA using the known class option. I now want to check for potential covariates using the R3STEP command for each class separately. However, this did not work.

Mplus handed me the following error:
*** ERROR in VARIABLE command
Auxiliary variables with E, R, R3STEP, DU3STEP, DE3STEP, DCATEGORICAL, DCONTINUOUS, or BCH are not available with TYPE=MIXTURE with more than one categorical latent variable.

Is there a way to do this?

Thank you in advance, Stefan
 Tihomir Asparouhov posted on Thursday, August 15, 2019 - 9:14 am
Take a look at Section 4
http://www.statmodel.com/download/3stepOct28.pdf
and Sections 6-14
http://statmodel.com/download/AppendicesOct28.pdf
 DavidBoyda posted on Wednesday, April 01, 2020 - 5:29 am
Hi,

I wonder if you could advise me on the best way to proceed with this model. I have data but wish to model males and females separately because I hypothesize that their optimal class structures differ from each other.

What example in the manual should I lean on if I wish to:
(a) model the above
(b) test whether the classes are significantly different from each other
(c) do I need to do (b) if, say, males produce a different optimal class structure than females?

Is there a more optimal way to achieve the above?

thank you.
 Bengt O. Muthen posted on Wednesday, April 01, 2020 - 4:09 pm
UG ex 7.21 can be a starting point.
 Diane Putnick posted on Thursday, April 02, 2020 - 8:19 am
I am using KNOWNCLASS to compute a multiple group model (across child gender) to assess moderation of two continuous latent variables predicting a dichotomous outcome. I computed the scaling-corrected chi-square change for constrained and unconstrained models and the chi-square is large and significant.

The question is how to decide which parameters are different across the two groups. Traditionally I would use modification indices to choose the paths that should be released, but they are not available for this model. I could compare chi-squares for each individual constrained parameter, but that's pretty tedious. Any other suggestions?
 Bengt O. Muthen posted on Thursday, April 02, 2020 - 6:06 pm
You can label the parameters of interest in the Model command and then use Model Constraint to express group differences in those parameters. That gives you a test for each parameter.
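
A sketch of what this could look like, assuming two latent predictors f1 and f2, a dichotomous outcome u and a known-class variable cg built from gender. All variable names, indicator lists and labels are hypothetical.

```mplus
VARIABLE:
  CATEGORICAL = u;
  CLASSES = cg (2);
  KNOWNCLASS = cg (gender = 0 gender = 1);
ANALYSIS:
  TYPE = MIXTURE;
  ALGORITHM = INTEGRATION;
MODEL:
  %OVERALL%
  f1 BY y1-y3;
  f2 BY y4-y6;
  u ON f1 f2;
  %cg#1%
  u ON f1 (b1g1);          ! slopes in group 1
  u ON f2 (b2g1);
  %cg#2%
  u ON f1 (b1g2);          ! slopes in group 2
  u ON f2 (b2g2);
MODEL CONSTRAINT:
  NEW(diff1 diff2);
  diff1 = b1g1 - b1g2;     ! group difference in the f1 slope
  diff2 = b2g1 - b2g2;     ! group difference in the f2 slope
```

The output then lists diff1 and diff2 with an estimate, a standard error and a z-test each, which gives one test per parameter without refitting the model.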