LCA with Covariates, Missing Data, an...
 Adam Perzynski posted on Friday, April 15, 2005 - 10:11 am
I am running LCA with covariates and have a question that I hope you can help me to answer. I have simplified the syntax to state the problem more clearly.
u1-u7 are the latent class indicators. x1-x5 are covariates. There is missing data on x1, a measure of income. Using the following syntax, I have trouble with the plots. While histograms and scatterplots are available, the plots of "estimated probabilities for a categorical latent variable as a function of its covariates" are not available. If I run the syntax below without the portion of the MODEL statement "x1 on x2 x3 x4 x5", Mplus outputs the plots but does not estimate the missing values, as x1 is then treated as an x-variable in the analysis. This leaves me with one main question: is it possible to order the covariates, estimate missing values, and still obtain the probability plots? Or do I need to hand calculate the probabilities from the coefficients as suggested on page 345- of the User Guide? Thank you for your help with this.


VARIABLE:
NAMES ARE u1-u7 x1-x5;

USEVARIABLES ARE u1-u4 x1-x5 ;

CATEGORICAL ARE u1-u7;
MISSING ARE ALL (999);
CLASSES = c (3);

ANALYSIS:
TYPE IS MIXTURE MISSING;
INTEGRATION=NUMERICAL
ALGORITHM=MONTECARLO

MODEL: %OVERALL%
C#1 ON x1 x2 x3 x4 x5;
C#2 ON x1 x2 x3 x4 x5;
x1 ON x2 x3 x4 x5

OUTPUT: TECH1 TECH8;

PLOT: TYPE=PLOT3
 Adam Perzynski posted on Friday, April 15, 2005 - 10:26 am
My apologies the above syntax should read as follows.

VARIABLE:
NAMES ARE u1-u7 x1-x5;

USEVARIABLES ARE u1-u7 x1-x5 ;

CATEGORICAL ARE u1-u7;
MISSING ARE ALL (999);
CLASSES = c (3);

ANALYSIS:
TYPE IS MIXTURE MISSING;
ALGORITHM=INTEGRATION;
INTEGRATION=MONTECARLO;

MODEL: %OVERALL%
C#1 ON x1 x2 x3 x4 x5;
C#2 ON x1 x2 x3 x4 x5;
x1 ON x2 x3 x4 x5 ;

OUTPUT: TECH1 TECH8;

PLOT: TYPE=PLOT3 ;
 Linda K. Muthen posted on Saturday, April 16, 2005 - 4:29 am
I think you can achieve treating x1 as a y variable by simply mentioning its variance. So remove x1 ON x2 x3 x4 x5 ; and add x1;
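For readers following along, a minimal sketch of what this suggestion would look like in the MODEL command from the previous post. Only the last statement changes; the comment is an editorial annotation, not part of the original reply.

```
MODEL: %OVERALL%
C#1 ON x1 x2 x3 x4 x5;
C#2 ON x1 x2 x3 x4 x5;
x1;   ! mentioning the variance treats x1 as a y-variable
```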
 Adam Perzynski posted on Saturday, April 16, 2005 - 10:43 am
Thank you so much for your reply. These forums are a tremendous resource. I did as you suggested, and the model runs fine, but Mplus still does not produce the probability plots, only the histogram and scattergram.

I then realized another difference between the syntax in my message above and the syntax with which Mplus produces the probability plots. If I remove x1 ON x2 x3 x4 x5 ; Mplus does not produce the plots.

If I continue and remove
ALGORITHM=INTEGRATION;
INTEGRATION=MONTECARLO;
then Mplus produces the plots.

Is it the case that the probability plots are not available when using Monte Carlo integration?

Thank you again for your help.
 Thuy Nguyen posted on Monday, April 18, 2005 - 6:12 pm
If you are referring to the plot of "estimated probabilities for a categorical latent variable as a function of its covariates", then this plot is not available for models with numerical integration. Numerical integration would be necessary when covariates are selected and this cannot be done post-processing.
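When the plot is unavailable, the probabilities can still be computed by hand from the estimated multinomial logistic regression coefficients, as the original question anticipates. A sketch for the three-class model above, with class 3 as the reference class. The symbols $a_k$ (intercept of C#k) and $b_{kj}$ (slope of C#k on $x_j$) are notation introduced here for illustration, not Mplus output labels:

```latex
P(c = k \mid x_1,\dots,x_5)
  = \frac{\exp\!\left(a_k + \sum_{j=1}^{5} b_{kj}\, x_j\right)}
         {\sum_{m=1}^{3} \exp\!\left(a_m + \sum_{j=1}^{5} b_{mj}\, x_j\right)},
\qquad a_3 = 0,\quad b_{3j} = 0 \ \text{for all } j .
```

Evaluating this over a grid of values for one covariate, with the others held at chosen values such as their means, gives the kind of curve the plot would show.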
 Annie Desrosiers posted on Wednesday, October 11, 2006 - 6:07 am
Hi, I have a question about plots in an LCA.

When I use a model with i s | y1@0 y2@1 y3@2; everything works well!!
But when I try to use my tscores (age1-age3) in the model as below, I have a problem with my output: the plots are not there anymore.
Can you tell me what the problem is with this syntax? I tried all the possibilities for series.

Thanks.

variable: names are id age1 age2 age3 onset y1 y2 y3;
usevariables are age1-age3 y1-y3;
tscores = age1-age3;
classes = c(6);
missing = . ;

analysis: type = mixture random missing;
starts = 20 2;

model: %overall%
i s | y1-y3 at age1-age3;

plot: type = plot3;
series = y1-y3 (age1-age3);
 Linda K. Muthen posted on Wednesday, October 11, 2006 - 10:36 am
Growth plots are not available with TYPE=RANDOM;
 Rachel Foster posted on Saturday, March 22, 2008 - 12:35 pm
I am running an LCA with and without covariates. I’ve conducted LCAs before with the Analysis command: type=mixture missing (to obtain FIML) and ALWAYS received all the output needed for interpretation. Now with the covariate model, I’m not receiving the Results in Probability Scale output. Why might that be? Is it because of the mixture missing command? Or is it some other reason? Thanks for your help.
 Linda K. Muthen posted on Saturday, March 22, 2008 - 12:54 pm
I don't believe this has changed. If the categorical latent variable is regressed on a covariate (c ON x), you will obtain results in the probability scale. If a latent class indicator is regressed on a covariate (u ON x), results are not given in probability scale because they vary depending on the covariate value.
 Sharon Ghazarian posted on Thursday, May 28, 2009 - 9:29 am
I have a 4-class LCA with both categorical and continuous covariates. The model runs just fine with the categorical covariates, but I get the following error message when I include any continuous covariates:

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS.

Although I get this message, the output looks just fine and demonstrates no major problems. I have increased the starts and it makes no difference. Should I ignore the message, or am I missing something in my syntax maybe? Here is the syntax with just two categorical (drpyr2; cesd162) and one continuous (mage1) covariate.

thank you.

Usevariables are verbalmr2 verbalmp2 verbalsr2 verbalsp2 phymr2 phymp2 physr2 physp2 sexr2 sexp2 injmr2 injmp2 injsr2 injsp2 drpyr2 cesd162 mage1;

Missing = all (999);

Categorical are verbalmr2 verbalmp2 verbalsr2 verbalsp2 phymr2 phymp2 physr2 physp2 sexr2 sexp2 injmr2 injmp2 injsr2 injsp2;

classes = c (4);

ANALYSIS:
TYPE = MIXTURE;
STARTS = 300 50;

MODEL:
%OVERALL%
c on drpyr2 cesd162 mage1;

OUTPUT:
TECH11;
 Linda K. Muthen posted on Thursday, May 28, 2009 - 10:39 am
Without replicating the best loglikelihood, you may have hit a local solution. You should increase the random starts.

I would also check that the variances of the continuous variables are not large. If they are, I suggest rescaling them by dividing them by a constant in the DEFINE command so that the variances are between one and ten.
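As a sketch of the rescaling described above, applied to the continuous covariate mage1 from the posted input. The divisor 10 is only an assumption for illustration; the right constant depends on the variance of the variable in the data at hand.

```
DEFINE:
! illustrative constant; aim for a variance between one and ten
mage1 = mage1/10;
```

With this divisor, a coefficient for the rescaled covariate refers to a 10-unit change in the original variable.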

If you continue to have problems, you should send the input, data, output, and your license number to support@statmodel.com.
 Sharon Ghazarian posted on Thursday, June 04, 2009 - 11:42 am
Thank you so much! It turned out to be the variances after all. I appreciate the help and extra set of eyes.
 Roxann Roberson-Nay posted on Saturday, July 03, 2010 - 7:10 am
I am conducting a 3 class LCA with 13 binary outcomes and three covariates (n=2300). Class 3 serves as the reference group, so I get class 1 vs. class 3 and class 2 vs. class 3 in the output. The Alternative Parameterization is not provided so I do not get class 1 vs. class 2. I believe this is happening because I am using numerical integration (algorithm = integration). How do I specify my Mplus code to give me the class 1 vs. class 2 comparison?

Thanks,
Roxann
 Linda K. Muthen posted on Saturday, July 03, 2010 - 9:44 am
An LCA with categorical outcomes does not require numerical integration. You should remove ALGORITHM=INTEGRATION from the ANALYSIS command.
 Roxann Roberson-Nay posted on Tuesday, July 06, 2010 - 1:41 pm
When I try to run the LCA without the Algorithm=integration option, I get the following warning from Mplus: "This latent class regression requires numerical integration. Add algorithm=integration to the analysis command." I forgot to mention that my covariates are both categorical and continuous.

Thanks!
Roxann
 Linda K. Muthen posted on Tuesday, July 06, 2010 - 2:09 pm
Please send the full output and your license number to support@statmodel.com.
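When the alternative parameterization is not printed, one possible way to get a class 1 vs. class 2 comparison is to compute it from the reference-class coefficients, since the logit for class 1 vs. class 2 equals the class 1 vs. class 3 logit minus the class 2 vs. class 3 logit. A hedged sketch using MODEL CONSTRAINT, assuming a single covariate named x; the covariate name and the labels b1, b2, diff12, and or12 are hypothetical and not taken from the posts above.

```
MODEL:
%OVERALL%
c#1 ON x (b1);   ! log odds, class 1 vs. class 3
c#2 ON x (b2);   ! log odds, class 2 vs. class 3

MODEL CONSTRAINT:
NEW(diff12 or12);
diff12 = b1 - b2;      ! log odds, class 1 vs. class 2
or12 = exp(diff12);    ! corresponding odds ratio
```

Parameters declared with NEW are reported with estimates and standard errors, so diff12 comes with a test of the class 1 vs. class 2 difference.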
 Jerry Cochran posted on Tuesday, November 30, 2010 - 1:36 pm
It seems as though I might have an issue similar to the one described above by Roxann.

I am in the process of developing an LTA, following the Nylund 2007 dissertation. I am at the point where I am adding covariates and distal outcomes to my models at each time point (the covariates being dichotomous and the one distal being continuous).

While my model for the baseline runs well, when I add covariates to my 2nd and 3rd time points, I get the following error messages:

*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
C#1 ON GENDER2
C#1 ON ETHNICIT
C#1 ON INJTYPEV
C#2 ON GENDER2
C#2 ON ETHNICIT
C#2 ON INJTYPEV
*** ERROR
One or more MODEL statements were ignored. These statements may be
incorrect or are only supported by ALGORITHM=INTEGRATION.

The difference between the time points is that the baseline data are complete, while the other time points have missing data.

Is the missing data the problem, or am I doing something incorrectly?

Thank you for your help.
 Linda K. Muthen posted on Tuesday, November 30, 2010 - 2:50 pm
Try adding ALGORITHM=INTEGRATION to the ANALYSIS command.
 Jerry Cochran posted on Tuesday, November 30, 2010 - 7:21 pm
Thank you for your response. I added the ALGORITHM=INTEGRATION command, and the models converge well.

I do have another question, though, that I wanted to ask. In the probability plots for my 3 class solution, I have clear high, medium, and low endorsing groups. However, in the probability plots, the low endorsing group is labeled as class 2 and the medium group is 3. Is there a way I can switch the classifications of groups 2 and 3?

I ask because in the odds ratios reported in the output under model results, I have the ratios for classes 1 and 2 for my three covariates. That would be okay, but, I want to report the odds ratios for classes 1 and 2, with class 2 being the medium class and not the low class. Is there a way in Mplus that I can specify which classes are defined as 1, 2, or 3?

Thank you again for your help.
 Linda K. Muthen posted on Wednesday, December 01, 2010 - 6:40 am
You can use the ending values for class 2 as starting values for class 3 in a new analysis.
 Simone Schmidt posted on Friday, December 03, 2010 - 6:06 am
Hi, I think I've got nearly the same problem as Roxann (July 03, 2010). I am conducting a 3 class LCA with the following input:
usevariables are
klima2 mathe2 lit2
mig1 mig2_ HISEI AlterJ AlterK Anzahl
m2_ Buecher;
Cluster= code_num_3;
missing are all (-999);
classes = c(3);

analysis:
type=mixture complex;
algorithm=integration;
integration= montecarlo;
starts= 500 50;
stiterations = 50;

Model:
%OVERALL%
c on
mig1 mig2_ HISEI AlterJ AlterK
Anzahl m2_ Buecher;

HISEI AlterJ AlterK
Anzahl m2_ Buecher;

To use FIML, I list the variables in the MODEL command, and Mplus then asks me to use ALGORITHM=INTEGRATION (although the covariates to be imputed are not categorical). Imputation seems to work well, but for the regression model I only get class 1 and class 2 vs. class 3 in the output. The Alternative Parameterization is not provided, so I do not get class 1 and class 3 vs. class 2, etc. How do I specify my model to give me all comparisons?
Thank you very much!
 Linda K. Muthen posted on Friday, December 03, 2010 - 8:46 am
The reason you need numerical integration is that you mention the variances of the observed covariates in the MODEL command. If you remove these and the following statements you should get all parameterizations.

algorithm=integration;
integration= montecarlo;
 Simone Schmidt posted on Monday, December 06, 2010 - 7:33 am
Hi Linda, thank you for your reply. If I remove the covariates from the MODEL command, the imputation does not work and the classes identified differ from the original model without covariates. Is giving up FIML the only way to get all parameterizations? Thanks again!
 Linda K. Muthen posted on Monday, December 06, 2010 - 10:16 am
I cannot understand the problem. Please send the relevant outputs and your license number to support@statmodel.com.
 Tamika Gilreath posted on Friday, August 12, 2011 - 5:10 pm
Hi, I'm running into a problem similar to the one described by Simone. Could you please tell me what I'm doing that requires the integration and montecarlo settings? I need to see the alternative parameterizations for the model being specified.


USEVARIABLES ARE age Q2SEX Q41C Q41D Q41E
Q60A Q60B Q60C black hispanic asian other;

CLASSES = c (5);

Categorical = black hispanic asian other Q2SEX Q41C Q41D Q41E Q60A Q60B Q60C;

Missing is all (999);

ANALYSIS:
type = mixture ;
process=4;
ALGORITHM=INTEGRATION;
integration=montecarlo;

MODEL:

%overall%
c#1-c#4 on black hispanic asian other Q2SEX age;


OUTPUT: SAMP stand cint tech14 tech11;


PLOT:
TYPE IS PLOT3;
SERIES IS Q41C Q41D Q41E Q60A Q60B Q60C(*);


Thanks!
 Bengt O. Muthen posted on Saturday, August 13, 2011 - 1:50 pm
Please send your output to support.
 Tait Medina posted on Thursday, August 16, 2012 - 9:00 pm
Is there a quick way to get the estimated probabilities that are plotted in the plot "Estimated probabilities for a categorical latent variable as a function of its covariates"? Or, is this something we just need to compute by hand? Thank you!
 Linda K. Muthen posted on Friday, August 17, 2012 - 1:03 pm
They should be in the gph file.
 Alma Boutin-Martinez posted on Tuesday, December 04, 2012 - 12:58 pm
Hi, I am running an LCA model with continuous and categorical covariates. When using FIML, is it correct to interpret that the analysis includes data from participants who had data on at least one of the categorical indicators, unless they were missing data on one of the covariates? In other words, how does FIML treat covariates with missing data?


Thanks,
Alma
 Linda K. Muthen posted on Tuesday, December 04, 2012 - 2:03 pm
A person who has a missing value on one or more of the covariates is excluded from the analysis.
 Alma Boutin-Martinez posted on Tuesday, December 04, 2012 - 8:05 pm
Thank you!
 Ari Mäkiaho posted on Tuesday, August 27, 2013 - 1:46 am
Hi, I'm using Mplus Version 7.11. I'm running an LCA model.

CLASSES = c(2);
ANALYSIS:
type = mixture;

When I'm using the plot command

PLOT:
type = plot3;

I get this message in the output:

"Mplus diagrams are currently not available for Mixture analysis.
No diagram output was produced"

Is that really so? I'm new to the program and I have watched videos where you can plot the probability estimates (for example http://www.statmodel.com/videos/topic5_pt3_sm.shtml)

(sorry for my bad English)
Thanks,
Ari
 Ari Mäkiaho posted on Tuesday, August 27, 2013 - 6:13 am
Hi, I realized that I can use 'View plots' - 'Sample proportions and estimated probabilities' from the Plot menu.
 Linda K. Muthen posted on Tuesday, August 27, 2013 - 6:46 am
The diagram referred to is a diagram of the model. These are not available for mixture models. Plots of results using the PLOT command are available.
 Ari Mäkiaho posted on Wednesday, August 28, 2013 - 8:51 am
Thank you.
 Kristen Lee posted on Tuesday, September 17, 2013 - 2:35 pm
Hi, I am estimating a 3 class LCA model with covariates. I am interested in plotting the estimated probabilities of one of the indicators of my latent classes (opppcare), conditional on a set of covariates. However, regardless of what extreme values I set the covariates to, there is no variation in the predicted probabilities of the latent class indicator variable (by class). I've pasted part of my program below. Am I doing something wrong? Thank you.

Classes = c(3);
Analysis:
Type = mixture missing ;
starts=800 200;
LRTSTARTS = 2 1 50 15;
Model:

%Overall%
c#1-c#2 on ageb female married university citya nobro sibs chohorw
numdau numson wifparcar2 husparcar2;
Plot:
type is plot3;
series is q7wwhhx (1) Q7WWHPHH (2)
q7ffman (3) q7wwloff (4) Q7WWMNCKR (5) q7ffndmn (6) q7prman (7)
opismw (8) OP5SRWFE (9) opppcare (10) q7ffauth (11) q7ffhnr (12)
;
 Bengt O. Muthen posted on Wednesday, September 18, 2013 - 1:48 pm
The covariates in your model influence the latent class probabilities, which in turn influence the indicator probabilities. Your model has no direct effects from covariates to indicators. So when you get a probability for an indicator by class, then the covariate has no further influence on this probability within that class.
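To let a covariate shift an indicator probability within class, the model would need a direct effect from that covariate to the indicator. A minimal sketch based on the posted MODEL command, assuming a direct effect of female on opppcare is the one of interest; that choice of covariate is an illustration only, and any direct effect should be justified substantively.

```
Model:
%Overall%
c#1-c#2 on ageb female married university citya nobro sibs chohorw
numdau numson wifparcar2 husparcar2;
! direct effect: opppcare now depends on female within class
opppcare on female;
```

As noted earlier in this thread, once an indicator is regressed on a covariate, its results are no longer given in probability scale because they vary with the covariate value.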
 Hyunzee Jung posted on Monday, August 11, 2014 - 11:30 am
Hi,

I have two predictors (one measured variable and one latent continuous variable) for a latent categorical variable (LCA: 4 classes). I obtained odds ratios with class 4 as the reference. Earlier, when I entered just the one measured predictor, Mplus automatically provided the alternative parameterization, with odds ratios for all possible comparison pairs. However, once the additional latent predictor was added to my model, these alternative parameterization odds ratios were not generated. How can I obtain these?

Below is a part of my input. Thank you so much!


USEVARIABLES male pabpre3m pabprept3 eabpre3m
WEB VIC_PSYCHd VIC_PHYSd VIC_SEXd VIC_INJd
PERP_PSYCHd PERP_PHYSd PERP_SEXd PERP_INJd ;

CLASSES= C (4) ;

CATEGORICAL=WEB VIC_PSYCHd VIC_PHYSd VIC_SEXd VIC_INJd
PERP_PSYCHd PERP_PHYSd PERP_SEXd PERP_INJd;

MISSING ARE ALL (-999);

MODEL:
%overall%
f1 by pabpre3m pabprept3 eabpre3m ;
C on f1 male ;

ANALYSIS:
TYPE=MIXTURE;
STARTS=1000 10;
STITERATIONS=20;
MITERATIONS=1000;
LRTSTARTS=0 0 30 10;
ALGORITHM=INTEGRATION;
 Bengt O. Muthen posted on Monday, August 11, 2014 - 5:22 pm
In this case you would have to do it yourself using separate runs where the starting values for the latent class indicators are chosen to produce a certain last class.
 Hyunzee Jung posted on Monday, August 11, 2014 - 11:42 pm
Thank you so much, Bengt!

Could you please guide/direct me on where I can learn how to do what you said - choosing starting values so as to produce a certain last class?
 Linda K. Muthen posted on Tuesday, August 12, 2014 - 10:32 am
Use the SVALUES option of the OUTPUT command to get input with the ending values as starting values. Change the class numbers to what you want and delete the means of the categorical latent variables. Use STARTS=0;
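The steps above can be sketched as two runs. The indicator name u1 and the threshold values are placeholders invented for illustration; the real statements come from the SVALUES output of the first run, and other ANALYSIS options stay as in the first run.

```
! Run 1: add to the original input
OUTPUT: SVALUES;

! Run 2: paste the MODEL statements printed by SVALUES, then
!   - renumber the class labels into the order you want
!   - delete the [ c#1 ] ... lines from %OVERALL%
ANALYSIS:
TYPE = MIXTURE;
STARTS = 0;

MODEL:
%OVERALL%
! (remaining %OVERALL% statements from SVALUES go here)
%C#1%
[ u1$1*-1.2 ];   ! placeholder; printed under another class in run 1
%C#4%
[ u1$1*2.3 ];    ! placeholder; this class is now the reference
```

With STARTS = 0 the run begins from the supplied values only, so the class that is numbered last in the edited input should end up as the reference class.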
 Hyunzee Jung posted on Thursday, August 14, 2014 - 2:03 am
Thank you, Linda.
 Elisa Bolton posted on Thursday, August 04, 2016 - 9:04 am
Hello,
I am running an LCA model with a 3-step approach to estimate the effects of covariates on class membership. The covariates entered in the AUXILIARY statement have missing data, and Mplus is using listwise deletion. Is there a way to estimate the missing data for the covariates when using the 3-step approach? If not, what is the alternative? Would creating a multiply imputed dataset be an option?

Thank you
 Bengt O. Muthen posted on Thursday, August 04, 2016 - 6:48 pm
Multiple imputation is a possibility but many of the usual analysis options aren't available for MI.
 Janna Kook posted on Thursday, October 13, 2016 - 8:45 am
Hello,

I'm running an LCA model that includes some items that have conditional dependence (multiple parts to the same question). To model this, I've tried creating latent variables for the groups of items, or defining new summary variables. When I try to plot the latent classes (plot3, series option), the program does not recognize the latent variables or defined variables as part of the series.

Is it possible to plot these along with other dichotomous variables? Is there another way I should be doing this?

Thanks so much.
 Bengt O. Muthen posted on Thursday, October 13, 2016 - 1:35 pm
Have you looked at the paper on our website:

Asparouhov, T. & Muthen, B. (2015). Residual associations in latent class and latent transition analysis. Structural Equation Modeling: A Multidisciplinary Journal, 22:2, 169-177, DOI: 10.1080/10705511.2014.935844. Download Mplus files.
 Janna Kook posted on Thursday, October 20, 2016 - 10:33 am
Thanks so much for sending this citation. Is there anywhere on the website where I could find the code for these analyses?
 Bengt O. Muthen posted on Thursday, October 20, 2016 - 2:51 pm
Check the link "Download Mplus files"
 peter lekkas posted on Wednesday, August 23, 2017 - 2:42 am
Hi
Is it possible to generate something like plot3 within Mplus for a repeated measures LCA with multiple categorical indicators across time points?
E.g., where there are five latent class indicators measured at two time points:
NAMES = u11-u15 u21-u25;
CATEGORICAL = u11-u15 u21-u25;
USEVAR = u11-u15 u21-u25;
Kind thanks
 Bengt O. Muthen posted on Wednesday, August 23, 2017 - 4:45 pm
What do you want to plot - estimated probabilities? Just try it and see what you can get.
 Steven Lancaster posted on Tuesday, November 21, 2017 - 10:04 am
Hi,
I am running an LCA with categorical and continuous indicators. It runs fine, unless I ask for a plot. In that case, I get the following error message. Is it possible to plot LCAs of this type?

Thanks - Steven

Data: File is Ideals LCA.dat;
variable: names are ID y1-y10 x1-x2;
usevariables are y1-y10;
CATEGORICAL = y1-y5 y7;
classes = c(4);
missing is all (999);
auxiliary=id ;
Analysis: type = mixture;
STARTS = 200 50;

plot: type=plot3;
series is y1-y10(*);
savedata: file is IDEALS4.txt;
save is cprob;
format is free;
output: entropy tech11;

*** WARNING in MODEL command
All variables are uncorrelated with all other variables within class.
Check that this is what is intended.
*** ERROR in PLOT command
Time points for process 1 are not all continuous, all categorical, or
all latent as they should be.
 Bengt O. Muthen posted on Tuesday, November 21, 2017 - 2:47 pm
We don't want to mix probabilities for categorical outcomes with means for continuous outcomes. Try

Series = y1-y5(*) | y8-y10(*);
 Steven Lancaster posted on Wednesday, November 22, 2017 - 9:37 am
That seems to have worked - thanks!
 Katharine Buek posted on Wednesday, October 24, 2018 - 8:27 am
I am running an LCA (no covariates) in which some of the indicators have missing values. Is it correct that Mplus is using FIML to account for these in the model? I know this is the case with LGMM, just wondering about LCA.

Thanks!
 Bengt O. Muthen posted on Wednesday, October 24, 2018 - 5:33 pm
Yes, this is correct - FIML is used in all maximum-likelihood estimation in Mplus irrespective of the model.
 Marie McGregor posted on Wednesday, July 01, 2020 - 10:05 pm
Hi, I am running LPA with covariates and distal outcomes. I have extracted a 4-profile solution and wish to add auxiliaries using R3STEP. I have a few questions:
1. Step 1 generated a new data set with the probabilities to be used in the following steps. Does this only need to be done if doing R3STEP manually as opposed to automatically in Mplus?
2. Should data imputation be used when exploring auxiliaries in a model?
3. I got a warning that the standard errors of the model parameter estimates may not be trustworthy (non-positive definite, non-identification). What is the next step for this? I did not get this warning when extracting 4 profiles (without covariates). I checked the corresponding parameter and cannot see any problems.
4. The class counts have changed significantly from what was previously observed without covariates, and entropy is also down from .95 to .88.

Thank you
 Bengt O. Muthen posted on Thursday, July 02, 2020 - 4:49 pm
1. Yes

2. You can. Look at Section 7 in the new version of web note 21.

3. We need to see your full output to diagnose. Send to Support along with your license number.

4. That may indicate measurement non-invariance, that is, the need for some direct effects from covariates to latent class indicators.