I have a question about the item labeled "LATENT CLASS REGRESSION MODEL PART" in mixture model output. Suppose one has a latent profile analysis with four standardized continuous variables and the categorical latent variable has two levels. In this case, how does one interpret the LCRMP coefficient, and what does it mean if est/SE is greater than 2 in absolute value?
The latent class regression model part refers to the regression of the latent class variable on covariates, that is intercepts and slopes. If there are no covariates, which seems to be the case in your application, the coefficients given under this heading are just the intercepts. In this context, intercepts are logit coefficients determining the probabilities of the classes. In this case, the est/se ratio is not of much interest since the ratio tests against a zero logit, which translates to a probability of 0.5. So, these ratios can be ignored when there are no covariates.
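To make the "zero logit corresponds to a probability of 0.5" point concrete, here is a minimal Python sketch. It assumes the last class is the reference class with its logit fixed at zero; the intercept value 1.2 is made up for illustration and does not come from any output in this thread.

```python
import math

def class_probabilities(intercepts):
    """Convert K-1 logit intercepts into K class probabilities.

    Assumption: the last class is the reference, so its logit is 0.
    """
    logits = list(intercepts) + [0.0]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

# A zero intercept in a two-class model puts both classes at 0.5,
# which is the null value the est/SE ratio tests against.
print(class_probabilities([0.0]))

# A hypothetical intercept of 1.2 makes class 1 the larger class.
print(class_probabilities([1.2]))
```

The est/SE ratio therefore only asks whether the class split differs from 50/50, which is rarely a hypothesis of interest.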
I have 4 years of panel data and two variables of interest. The first variable, P, is an ordinal level variable with 3 categories. The second variable, A, is continuous. What I am trying to estimate is the change over the four years in the predicted probabilities for each category of P for individual i, given a change in A. Given that there are 9 possible paths to get from year 1 to year 4 (excluding intermediate points) for variable P, is there a way to estimate a path given a change in A?
In SAS, there is a procedure PROC TRAJ that handles this, and I was told that there may be a way to do it with MPLUS.
Sorry for the delay in answering this - I was out of town. You mention TRAJ which concerns latent classes of development, i.e. mixture modeling. The mixture modeling part of Mplus currently does not allow polytomous outcomes or trends in development over time in categorical outcomes. Your message, however, does not describe theories of latent classes of development. If instead the chief concern is describing longitudinal development of the polytomous outcome over time as a function of the continuous predictor variable, you can use the regular, non-mixture, part of Mplus for the analysis. You can either use models that are auto-regressive or growth models, the latter having growth factors, i.e. continuous latent variables, that influence the outcome. The continuous predictor variable can influence either the growth factors or the outcome directly. Hope this helps.
I have a model with three latent classes (C#1, C#2, C#3). Under the model statement, I included the command "C#1 on BLK" where BLK is a binary race variable. In the output under LATENT CLASS REGRESSION MODEL PART, there is the slope for C#1 ON BLK and then under intercepts there are two intercepts, C#1 and C#2. Why are there two intercepts? I believe I am modeling: Pr(C#1)=B0 + B1*BLK So there should only be one intercept. Thank you.
I'd like to identify latent segments differing with respect to the price elasticity for many commodities. The model, say, a mixture regression model allows for the fact that these price elasticities may differ for different segments, and consists of: 1. to explain the latent variable as a function of covariates 2. to predict a dependent variable as a function of predictors
The problem is that the dataset contains repeated observations (time-series) for each commodity (cross-sectional), and the model closely relates with these longitudinal data.
Suppose a regression model for estimating price elasticity, i.e., a double-log model of quantity on price. We estimate price elasticities for N commodities using T repeated observations for each, but within the framework of latent segments. This might be accomplished using the GROUPING option to identify the commodity. But GROUPING cannot be used with MIXTURE.
Could somebody please give me some suggestion how I would fit these models using Mplus?
Thanks in advance for explanations, comments and tips.
It sounds like your observations are commodities. I'm not sure why you want to use the GROUPING option. If you want to identify latent segments of the commodities, the TYPE=MIXTURE is appropriate. GROUPING is used when there are observed not latent groups or segments. A growth mixture model would probably be appropriate here. If you check the References and Examples posted here on the website, you may find something to help.
Hana Kim posted on Sunday, November 11, 2001 - 2:56 pm
Hello! My research involves conjoint marketing studies in which respondents (cases) were repeatedly asked to state their personal purchase intention under several different scenarios. Please note that there are several records per respondent. My interest is to classify cases into several homogeneous groups and to develop regression models for each segment. Since differing demographic profiles most likely drive these segments, I then want to show how to classify each respondent into the most appropriate segment. (A very similar example of my study can be seen at http://www.statisticalinnovations.com/tutorials/tut2.htm)
Does Mplus meet these requirements for cases involving repeated measurements?
bmuthen posted on Tuesday, November 13, 2001 - 8:23 am
I don't know conjoint analysis, but it sounds like you can use the Mplus mixture analysis to let the repeated measures for respondents be the latent class indicators (u variables in Mplus language) that measure the latent class variable (c in Mplus), and regress c on covariates (x). Here, the latent classes of c give you the different segments, and segment probabilities are expressed as a function of the x's by multinomial logistic regression. All of this is carried out by a single analysis using maximum-likelihood estimation. This analysis relates to what is referred to as Latent Class Growth Analysis in paper number 86 on the Mplus web site - we'd be happy to send this to you.
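As a rough illustration of how segment probabilities become a function of a covariate under multinomial logistic regression, here is a small Python sketch. It assumes three segments with the last one as the reference class (intercept and slope fixed at zero); all coefficient values are hypothetical.

```python
import math

def class_probs(intercepts, slopes, x):
    """P(c = k | x) for K classes under a multinomial logit.

    Assumption: the last class is the reference, so only K-1
    intercepts and K-1 slopes are free parameters.
    """
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

# Hypothetical coefficients for a three-segment model with one covariate x.
intercepts = [0.5, -0.3]   # one intercept each for c#1 and c#2
slopes = [0.8, 0.0]        # slope of c#1 on x; c#2 slope set to zero here

for x in (0, 1):
    print(x, [round(p, 3) for p in class_probs(intercepts, slopes, x)])
```

Note that a three-class model has two intercepts (one per non-reference class) even when only one class is regressed on the covariate.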
Anonymous posted on Wednesday, January 16, 2002 - 8:57 am
Hello, thank you for making this forum available.
I have fitted a three-class mixture, with class 1 being the least prevalent and class 3 the most.
One of the models I need to run is a logistic model of a binary outcome based on class.
The issue here is that the Probability of the outcome in class one and two is 1, or as near to certainty as you get. Therefore, there is no variance in the dependent variable for classes one and two.
However, a person in class three has only a hypothesized 60% chance of the event (as supported by empirical frequency results). What is interesting is to 1. Categorize subjects into class three, and 2. Calculate their probability of the event.
I think I can do this by first assigning kids to class, using one model then, second, running an additional logistic regression for the kids in class three in a second model (weighting by the probability of being in class three from the output data set), but this seems like an inelegant solution.
Is there a way in Mplus to code the regression model into the first mixture model, such that it runs only for class three?
Anonymous posted on Wednesday, January 16, 2002 - 2:02 pm
Sorry for 2 questions in one day, and thank you for answering them.
Is there a way to tell Mplus a dependent variable of interest and have it output predicted values for it based on the structural model?
I'm pretty sure I can do this by hand, but obviously, it would be easier if Mplus were kind enough to do it for me.
The way to handle the problem of zero variability of a dependent variable in two of the three classes is as follows, where u1-u5 are your binary latent class indicators and u6 is the binary outcome. Here u6 has probability of 1 in the first two classes and its probability is estimated in the third class. Note that u6 is seen simply as yet another latent class indicator. As usual, the probability of u6 given class 3 is in a logit scale. This is, of course, a partial input.
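To show what "in a logit scale" means for u6, here is a short Python sketch. It assumes the threshold parameterization P(u = 1 | class) = 1 / (1 + exp(threshold)); the value -15 used to represent "probability of 1" is an illustrative assumption, not a value taken from the partial input referred to above.

```python
import math

def prob_u_equals_1(threshold):
    """P(u = 1 | class) for a binary indicator with a logit threshold.

    Assumption: P(u = 1) = 1 / (1 + exp(threshold)).
    """
    return 1.0 / (1.0 + math.exp(threshold))

# A large negative threshold gives a probability that is 1 for practical purposes.
print(prob_u_equals_1(-15.0))

# The threshold that corresponds to a 60% chance of the event in class 3.
threshold_60 = math.log(0.4 / 0.6)
print(threshold_60, prob_u_equals_1(threshold_60))
```

Going the other way, an estimated class 3 threshold can be converted by hand into the probability of the event for that class.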
Regarding your second question regarding predicted values of an observed dependent variable, Mplus does not do this automatically.
Anonymous posted on Thursday, June 27, 2002 - 8:29 am
I have 5 years of data on a continuous outcome. Using MPLUS I was able to identify 4 trajectory classes with a quadratic model. A reviewer suggested that I revisit these findings using PROC TRAJ in SAS. I don't know what assumptions are made in MPLUS as compared to PROC TRAJ, or whether the classes could be different. Has anybody compared both approaches? Are they similar? Is there any literature available? Thanks
bmuthen posted on Thursday, June 27, 2002 - 12:28 pm
There are articles related to this topic authored by me and listed under References, Growth Mixture Modeling on this web site, e.g. papers 82, 85 and 86. PROC TRAJ assumes no within-class variability of the trajectories, which is a special case of Mplus, restricting the growth factor covariance matrix to zero (i fixed at 0, s fixed at 0, i with s fixed at zero). My experience with real-data analyses is that this specification often does not fit the data well.
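The difference between the two specifications can be seen in the within-class variance of the outcome. The Python sketch below assumes a linear growth model y_t = i + s*t + e within each class, for simplicity; all variance values are made up. Setting the growth factor variances and covariance to zero leaves only residual variance around the class mean trajectory.

```python
def implied_variance(t, var_i, var_s, cov_is, resid_var):
    """Within-class variance of y at time score t under y = i + s*t + e."""
    return var_i + 2.0 * t * cov_is + t * t * var_s + resid_var

time_scores = [0, 1, 2, 3, 4]

# Growth mixture model: individuals vary around the class mean trajectory.
gmm = [implied_variance(t, 1.0, 0.2, 0.1, 0.5) for t in time_scores]

# Zero growth factor covariance matrix: only residual variance remains.
lcga = [implied_variance(t, 0.0, 0.0, 0.0, 0.5) for t in time_scores]

print(gmm)
print(lcga)
```

When the data show within-class variance that grows over time, the restricted version has to absorb it some other way, typically with more classes.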
Anonymous posted on Thursday, June 27, 2002 - 2:01 pm
It appears that one can only include a single latent variable in an Mplus MIXTURE model. Is this due to methodological restrictions? Are you planning on expanding on this capability in later versions ? Thank you.
bmuthen posted on Thursday, June 27, 2002 - 3:25 pm
You can have as many continuous latent variables as you want in mixture modeling. As for categorical latent variables, the program is intended for a single variable, but can also be used with several variables. The multiple latent categorical variable approach is described briefly on page 11 of paper #86. New Mplus developments are in progress for more efficient handling of multiple latent categorical variables.
Anonymous posted on Friday, August 23, 2002 - 12:45 pm
Can mixture modeling in MPlus analyze a set of regressions, or simply one regression at a time? In other words, if I were interested in a set of regressions such as the following:
d = a + b + c + error
f = d + e + error
h = f + g + error
would I need to analyze each regression separately, or could I have the procedure analyze the set of regressions concurrently?
bmuthen posted on Friday, August 23, 2002 - 1:57 pm
The set of regressions can be analyzed in a single analysis.
I have a question about SEM with a categorical latent variable.
Outcome Y and mediators M1 and M2 are all continuous variables. But U1 and U2 are both binary variables, so the latent variable C is also categorical. X1-X4 are covariates, not shown in the graph.
The following is my code. My question is how to write the code for the MODEL section: should I say "Y on C M1-M2 X1-X4" or "Y on C#1 M1-M2 X1-X4"? The former produces an error. And how should these statements be ordered? What is shown below does not actually work; I just hope it provides some information. I would greatly appreciate your help!
VARIABLE:
NAMES ARE X1-X4 M1 M2 Y u1 u2;
USEVARIABLES ARE X1-X4 M1 M2 Y u1 u2;
CATEGORICAL = u1 u2;
CLASSES = c(4);
ANALYSIS:
TYPE IS Mixture;
ALGORITHM=INTEGRATION;
Model:
%overall%
Y on u1 u2 M1-M2 X1-X4;
c#1 on u1 u2;
c#2 on u1 u2;
c#3 on u1 u2;
The latent variable c for u1 and u2 does not have to be categorical because u1 and u2 are categorical. The factors in SEM are continuous not categorical. They can, however, have indicators that are continuous, categorical, or other scales. Do you want a traditional SEM model where the factors have categorical indicators or are you interested in a mixture model where the factors are categorical?
I am trying to use Mplus for mixture modeling. I am confused by the CLASS statement on p. 109 of the User's Guide. Looking at the data file for Ex. 7.1, I noticed that the classes of all the observations have been specified (1 or 2), not "latent". But the User's Guide on p. 109 says, " ... there is one categorical latent variable c that has two latent classes." I cannot understand this. Sorry for raising such a basic question.
The 2 is the number of latent classes. Perhaps I don't understand your question. It is not necessary to specify latent.
Lilian posted on Sunday, December 04, 2005 - 7:08 pm
Hello, I was wondering whether we can change the reference category in Mplus when running a latent class regression. I am running a 6-class model and regressing the latent outcome on a few covariates, and I would like the reference category to be the class with the lowest symptom probability. Is that possible? Thanks!
You can use the ending values of the class you want to be last as starting values for the last class in a subsequent run and that class will be last.
pete posted on Thursday, February 09, 2006 - 11:53 am
Hello, I am trying to fit a mixed logistic regression model with covariates on both the regression part and the latent class part at the individual level.
The model runs and no error messages appear. Since such models tend to be unidentifiable, does the lack of error messages indicate that the model is identified, or is there no guarantee of identification in Mplus?
bmuthen posted on Thursday, February 09, 2006 - 12:09 pm
That is a notoriously difficult model and I would be a bit suspicious. Look at your condition number - if it is close to 10^-10 I would be wary. You can also try a high STARTS value to investigate the trustworthiness of the solution. You can also do an Mplus Monte Carlo study using your parameter estimate values to see if the model can be recovered. - If the model still holds up, I'd like to use it as an example...
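Condition numbers in the output are printed in Fortran-style D-exponent notation (for example 0.297D-16), which is easy to misread. The Python sketch below is a hypothetical helper, not part of Mplus: it parses such a value and flags it when it falls at or below a cutoff. The cutoff of 1e-10 simply mirrors the rule of thumb above and is an assumption, not a formal test of identification.

```python
def parse_fortran_double(text):
    """Parse a number printed in Fortran D-exponent form, e.g. '0.297D-16'."""
    return float(text.strip().upper().replace("D", "E"))

def looks_suspicious(condition_number, cutoff=1e-10):
    """Flag condition numbers at or below the (assumed) rule-of-thumb cutoff."""
    return condition_number <= cutoff

for printed in ("0.297D-16", "0.412D-03"):
    value = parse_fortran_double(printed)
    print(printed, value, looks_suspicious(value))
```

A flagged value is a reason to follow up with more random starts or a Monte Carlo study, as suggested above, rather than a verdict by itself.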
Shane Allua posted on Saturday, February 03, 2007 - 5:25 am
Hello, I see that odds ratios for the regression of the latent class variable on covariates are new in V4.2. What syntax is required to get this information, and can the ORs be output to a dataset?
I have fit a series of LCGA models on 13 infant growth measures. I have decided on the number of classes that I favor and would now like to examine associations between class membership and a series of distal (continuous) outcomes. I have tried a number of variations in code and received error messages. My only success was with the 2-class model (which is not my favored model), where I added the following line at the end of my model statement:
bmia ON C;
1. What code is needed for a regression with a 4 class model?
2. If I have a series of outcomes I am interested in, can I put them all in one model statement?
I would like to fit a LCA model with categorical covariates. I am not sure how to specify that covariates are categorical. If I specify them as categorical, this is the error message I am getting:
*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
C#1 ON X5
*** ERROR
One or more MODEL statements were ignored. These statements may be incorrect or are only supported by ALGORITHM=INTEGRATION.
Please find my code with just one categorical covariate below:
variable:
names are X1-X6 u1-u7;
usevariables are u1-u7 x5;
categorical are u1-u7 x5;
classes = c (2);
analysis:
type=mixture;
model:
%overall%
c on x5;
Could you please help me to sort out this problem?
I am using 2 covariates in my LPA. These covariates are correlated because of shared method variance (same rater). I was wondering if the logistic regression is a standard (instead of backwise/stepwise) and if I can assume that the shared variance is not attributed to any of those two variables? I have one variable at an earlier time-point but this doesn't lead to the same results.
Basically, can I use those two variables (they do not predict the same class membership).
I forgot to say that these 2 variables are measuring two members of a dyad (mother-child). Maybe I should use the earlier measure of the mother because I would be partialling out variance attributable to the child as well as the observer if I take a measure that was taken during the same interaction.
How would one do a liability threshold model with the latent class variables as (latent) dichotomous dependent variables for analyses with twin data? It was mentioned in the Twin Research and Human Genetics 2006. I have found how to use it with "normal" variables but wonder how to do it with classes.
I chose to post it here since I want to use regressions in a multilevel model to get heritability estimates. This strategy was shown by McArdle and Prescott in the same issue of Twin Research and Human Genetics. Sorry if it didn't seem logical to post here at first sight.
The third chapter in my dissertation addresses how to use latent classes in a liability threshold model (Clark, S.L. (2010). Mixture modeling with behavioral data. Doctoral dissertation, University of California, Los Angeles.). The appendix for that chapter includes example Mplus code for this model. In order to do this model in Mplus you will need to be using version 6.
A copy of my dissertation can be found on the Mplus website under the factor mixture modeling tab of the papers section.
mpduser1 posted on Monday, October 04, 2010 - 11:45 am
Is it possible to use MODEL PRIORS in Mplus 6.0 to specify a small informative priors to aid in the identification of a latent class regression analyses when one of the latent classes is small (see, for example, the procedure mentioned by Collins & Lanza, 2010)?
You can provide informative priors for every model parameter in Mplus and yes it should be possible to use informative priors to help identify small classes.
Sarah Ryan posted on Tuesday, March 29, 2011 - 5:06 pm
I'm trying to figure out how best to go about analyzing a mediation model which can be described:
1) Secondary data set (N = approx. 9,000), using three waves of data
2) Several background covariates
3) 5 exogenous measures: 4 latent factors and 1 manifest (continuous) indicator (a student colleague has suggested using bifactor analysis to treat these as one "general factor", as each of the 5 measures could be considered submeasures, and suggests this may simplify interpretation of any mediation effect)
4) Latent mediator (arrived at through latent class analysis)
5) One manifest DV (6 ordinal levels, treated as continuous)
The more I read on these discussion boards, the less convinced I am that using a latent class variable as the mediator actually is doable, or that it is a match theoretically (aside from the fact that this may be an unnecessarily complicated model). It almost seems like what I'd end up with is more a moderation analysis (DV would actually be DV means as a function of class membership). Am I right in this thinking?
A latent class mediator makes for a more complex model, just like an observed nominal mediator would. What should mediation mean in this case? Perhaps the following. The latent class membership can be influenced by exogenous variables, including factors, and latent class membership can influence DVs (by changing their means if cont's as you said). One can also add the restriction of having no direct effects from IVs to DVs. That formulation seems reasonable and the modeling can be done in Mplus because you can have latent variables influencing class membership. But how an indirect effect should be quantified is not clear - it is not just a product of two slopes as with a cont's mediator.
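To see why the indirect effect is not a product of two slopes here, consider one possible summary under the formulation above with no direct effects: the model-implied mean of the DV at each covariate value is a probability-weighted average of the class-specific means. The Python sketch below is only an illustration of that arithmetic; it is not a quantification endorsed in this thread, and all parameter values are hypothetical.

```python
import math

def class_probs(intercepts, slopes, x):
    """P(c = k | x) under a multinomial logit with the last class as reference."""
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

def expected_outcome(intercepts, slopes, class_means, x):
    """E[Y | x] when x affects Y only through class membership."""
    probs = class_probs(intercepts, slopes, x)
    return sum(p * m for p, m in zip(probs, class_means))

# Hypothetical three-class model with a binary covariate.
intercepts = [0.2, -0.4]
slopes = [0.9, 0.3]
class_means = [12.0, 8.0, 5.0]

shift = (expected_outcome(intercepts, slopes, class_means, 1)
         - expected_outcome(intercepts, slopes, class_means, 0))
print(round(shift, 3))
```

The shift depends on all slopes and all class means jointly and changes nonlinearly with x, which is why no single a*b product summarizes it.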
Re 5), perhaps you are thinking about a second-order factor model where only that general factor is an IV. With the bifactor model the general and the specific factors all can be IVs.
Sarah Ryan posted on Wednesday, March 30, 2011 - 2:50 pm
Thanks for this response- it is very helpful. I also just read your 2009 paper with Clark, "Relating LCA Results..." and this gives me more food for thought.
It is theoretically conceivable, with the indicators I'm using and the construct I'm testing, that the latent CLASS mediator could function as latent FACTOR, making it continuous and reducing a bit of the complexity. My committee suggested considering the latent class mediator, but I'm not sure they realized that they were sending me off into relatively uncharted waters (as far as I can tell, though perhaps I'm wrong). I need to go back to my model and the literature to think more about which (factor or class) I believe is more likely.
I'll keep plugging away here, and VERY much appreciate this board and your advice.
Hi, I am trying to regress a continuous variable on a categorical latent variable (c = 3) and on a continuous latent variable. Here is my code:
TITLE: SEM WITH CATEGORICAL LATENT VARIABLE
DATA: FILE IS wpa4.dat;
VARIABLE:
NAMES ARE U1-U14;
USEVARIABLES U1-U13;
CATEGORICAL = U2-U13;
CLASSES = C (3);
ANALYSIS:
TYPE = MIXTURE;
ALGORITHM = INTEGRATION;
MODEL:
%OVERALL%
F BY U7-U13;
C ON F;
U1 ON F;
U1 ON C F;
%C#1%
[U2$1-U6$1];
%C#2%
[U2$1-U6$1];
%C#3%
[U2$1-U6$1];
OUTPUT: TECH1 TECH14;
The statement "U1 ON C F;" gives me the following error message:
** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
U1 ON C#1
U1 ON C#2
*** ERROR
One or more MODEL statements were ignored. These statements may be incorrect.
I have a single binary covariate predicting different trajectories for emotions over time (10 measures), which in turn are expected to predict differences in consumption. Bengt was kind enough to direct me to examples of mixture modeling with distal outcomes, and I have experimented with many variations, including keeping factor variances and residual variances as class-invariant. My question now is simple - how can I estimate the indirect effect of the covariate on the distal outcome? Unlike normal mediation, there is no a*b effect to be estimated. How can I assert that the effect of the covariate on the outcome is mediated by the differences in trajectories?
Right, that is what I am saying. So, the binary covariate influences the intercept, linear slope and quadratic trend, and these in turn are predicted to lead to differences in the distal outcome. I included the statement c#1 on x to estimate the effect of the covariate on latent class membership, and I know that the beta for the effect of c on the distal outcome is given by the difference in the class means. What I now need to know is whether there is an indirect effect of the covariate on the distal outcome.
This is what I have. I hope I am doing this right. My x variable is litdar (0-1 variable). My distal outcome is consum (range 0-50, treated as continuous). What I find is the following:
a) A 2-class model with class-varying psis and thetas fits the data well, better than a growth model. The two classes are high versus low guilt.
b) The effect of the covariate on class membership is not significant.
c) However, within the high guilt class, the covariate predicts differences in trajectories. One group (x = 0) has rising guilt while the other (x = 1) has declining guilt, and the linear and quadratic growth factors are significantly different for the two groups. Further, the distal outcome is significantly different for these two groups within the high guilt class.
d) There are no differences in trajectories for the low guilt class, and the distal outcome also does not differ for these two groups.
ac bc qc | ag1@0 ag2@1 ag3@2 ag4@3 ag5@4 ag6@5 ag7@6 ag8@7 ag9@8 ag10@9;
[ag1-ag10@0];
ac*5048.14; bc*607.81; qc*5.90;
ac WITH bc*-678.04; ac WITH qc*58.72; bc WITH qc*-57.52;
ag1*779.60 ag2*240.52 ag3*428.66 ag4*363.95 ag5*369.10 ag6*511.33 ag7*645.82 ag8*682.81 ag9*265.19 ag10*582.09;
ac ON litdar; bc ON litdar; qc ON litdar;
c#1 ON litdar;
consum ON litdar;

%c#1%
[ac*51.48 bc*5.39 qc*.71];
ac*5048.14; bc*607.81; qc*5.90;
ac WITH bc*-678.04; ac WITH qc*58.72; bc WITH qc*-57.52;
ag1*779.60 ag2*240.52 ag3*428.66 ag4*363.95 ag5*369.10 ag6*511.33 ag7*645.82 ag8*682.81 ag9*265.19 ag10*582.09;
ac ON litdar; bc ON litdar; qc ON litdar;
consum ON litdar;
So you have a direct effect of litdar (your binary X) on consum (your Y). But you don't have any indirect effect, because litdar is not significantly influencing class membership, and although litdar influences the growth factors within class, your model doesn't say that the growth factors influence consum.
Jamie Vaske posted on Tuesday, January 17, 2012 - 9:32 am
Hello, A colleague and I were recently looking over the Jung & Wickrama (2008) article on Latent Class Growth Analysis and Growth Mixture Modeling with MPLUS. In their article, they have a LCGA and they directly regress the slope factor on a covariate. Here is their syntax:
Our question pertains to how to interpret the effect of the covariate on the growth factors. The variation in the growth factors is set to 0, so the covariate is not explaining variation in the growth factors within a class. What does the effect of X on the growth factors represent when the variation in growth factors is constrained to zero?
With a conditional model, the residual variances are fixed at zero.
When i and s are regressed on x, it is a shift in means for each gender if x is, for example, gender.
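A small numeric sketch may help here. With the residual variances of the growth factors fixed at zero, everyone in a class who shares the same x value has exactly the same i and s, so the regression coefficients simply move the class mean trajectory. The Python code below assumes a linear growth model and uses hypothetical coefficient values.

```python
def class_trajectory(int_i, slope_i_on_x, int_s, slope_s_on_x, x, time_scores):
    """Mean trajectory in one class for covariate value x.

    With zero residual variances, i and s are deterministic functions of x.
    """
    i = int_i + slope_i_on_x * x
    s = int_s + slope_s_on_x * x
    return [i + s * t for t in time_scores]

time_scores = [0, 1, 2, 3]

# Hypothetical class with x = 0 for one gender and x = 1 for the other.
print(class_trajectory(10.0, 2.0, 1.0, -0.5, 0, time_scores))
print(class_trajectory(10.0, 2.0, 1.0, -0.5, 1, time_scores))
```

In this sketch each class therefore contains two mean curves, one per value of x, rather than a cloud of individual curves.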
Regan posted on Friday, February 03, 2012 - 11:32 pm
"This brings up two issues which may not always be well understood in mixture modeling. First, modeling the influence of a latent class variable c on a distal outcome y is not done by saying y ON c, but what is done gives information equivalent to having used ON..."
Dr. B. Muthen, the above was a response you gave to someone some years back. I am new to LCA and want to clarify some things with regard to this comment:
1) Do I interpret your comment correctly if I say that when adding a distal outcome y to see the effect of class membership on y, that instead of using a y on c command, that we should just add the outcome variable to the 'usevariable' statement--therefore it is technically a covariate, but interpreted as an outcome?
2) Similar, but regarding dependent variables: Is there a substantive difference in whether we add the dependent variable to the 'usevariable' statement vs. using the 'knownclass' statement and adding a regression statement of c on x? If there is a significant association between x and c (for instance if x is gender) should we move to a multiple group analysis?
1. It is still an outcome. In reality it is another latent class indicator.
2. Please send outputs that illustrate what you are asking to make it clear. Also send your license number to email@example.com.
Regan posted on Monday, February 06, 2012 - 9:46 am
Thank you for the clarification. I am just starting out with the analysis and trying to understand what I have learned from attending your sessions and putting them to practical use at the current time. Therefore I have not yet gotten any output yet but wanted to understand more about the different command statements in order to obtain the correct output.
Hello, my LVSEM model involves two steps: first, completing a latent profile analysis to derive a latent class variable of community adversity; and THEN, using that latent class variable as a 'predictor' in an LVSEM model.
Is there an example of how to do this somewhere?
The results of my LPA confirmed a 3-class solution. So I thought that in order to use my new latent class variable in my model, all I would have to do is include this in the VARIABLE command:
CLASS = C(3);
and then define my latent class variable in the %overall% part of the MODEL command with its continuous indicators, while using TYPE=MIXTURE in the ANALYSIS command.
Then I could regress my outcome of interest on my latent class variable C. However, an error message is saying that my latent class variable cannot also be defined as a continuous latent variable.
Any help that you can offer would be appreciated. ***Melissa
Hi Dr. Muthen, I cannot, it is government protected data where each output has to be vetted through security and you are only allowed to vet twice. So, I need to save it for when my model works. Any ideas would be helpful.
I'm guessing you have y ON c. This is not the correct specification. Remove that. What you want to look at is the varying of the means of y across classes.
Gail Smith posted on Friday, March 30, 2012 - 7:22 am
I am doing a LCA with 3 classes and want to change my reference class from class 3 to class 2. In earlier posts, you have suggested to use the ending values for the parameters in the class that you want to be last as starting values for the parameters in the last class in a subsequent analysis. My question is: where do I find these ending values?
They are your results in the analysis where class 3 is the reference class. You can use the SVALUES option of the OUTPUT command to generate input with starting values and then change the class labels. You also need to change the means of the categorical latent variables.
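The last point, changing the means of the categorical latent variable, amounts to re-expressing the class logits relative to the new reference class. The Python sketch below illustrates the arithmetic only, assuming logits defined relative to a reference class whose logit is zero; the values are hypothetical and the helper names are not Mplus features.

```python
import math

def make_reference(logits, k):
    """Move class k (0-based) to the last position and re-express all
    logits relative to it, so the new last class has logit 0."""
    order = [j for j in range(len(logits)) if j != k] + [k]
    return [logits[j] - logits[k] for j in order]

def probs(logits):
    """Class probabilities implied by a set of logits."""
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

# Hypothetical ending values with class 3 as the reference class.
old = [0.4, -0.7, 0.0]

# Make the old class 2 the reference; the new order is old classes 1, 3, 2.
new = make_reference(old, 1)
print(new)
print(probs(old), probs(new))
```

The class probabilities are unchanged apart from their order, which is a quick check that the relabeled starting values describe the same solution.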
It is clear that one cannot treat a latent class/profile as an independent variable by regressing X on C; instead you recommend including the X outcomes in the analysis to see how they may vary across the classes.
My model is a 6-profile solution (LPA of 7 continuous indicators), however I am not interested in how the 6 profiles are differentially related to a dependent variable. Rather, I want to compare the predictive abilities of certain profiles to other independent variables (e.g., controlling for a closely diagnoses, do profiles 1 and 2 predict impairment).
e.g., X on C#1 C#2 DX1 DX2;
Is there a way to run such an analysis in a single step in Mplus? I understand that it is not recommended to export the posterior probabilities and run the analysis in a second step.
Could you also explain why it is not possible to regress X on C in Mplus? Do you expect this to be possible in future versions?
In your handout on LCA, slide 126 shows that the predictor variable "black" is not significant in the regression equation for class 1; however, it is significant for classes 2 and 3. My question here is whether this is interpreted as "a significant predictor of class only for classes 2 and 3, whereas being black is not a significant predictor of class 1". Also, would this imply that a multiple group model should be run for black and non-black respondents?
My second question:
In using a distal outcome, I know I need to compare the means across groups and use the 'model test' command. However, if I have 3 groups, do I need to run the model three times to obtain 3 different Wald tests (p1=p2; p2=p3; p1=p3)?
When conducting the test of mean differences on a distal outcome, I am using the MODEL TEST command. I am running this several times because I have four groups. I am wondering whether a post-hoc Bonferroni correction needs to be applied in this context, and if so, how it is conducted in Mplus. Thank you.
Hello, I run regression of the latent class variable on covariates. In the Model Result part of the output, for some covariates the S.E. is 0 (and p-value 999). What is the problem and how can it be avoided? Many thanks for your help.
That means that the slope cannot be determined. This happens when a class has zero variance for a covariate - everyone in that class has the same covariate value. It is the same issue as in ordinary logistic regression. It is not really a problem in that it is useful to know that people in that class are homogeneous with respect to that covariate.
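One way to spot this situation before interpreting the output is to cross-tabulate each covariate against class membership. The Python sketch below uses hypothetical data and assumes you have exported a most likely class for each person, which only approximates the latent classes; it lists the classes in which a covariate takes a single value.

```python
from collections import defaultdict

def constant_covariate_classes(class_labels, covariate_values):
    """Return the classes in which the covariate takes only one value."""
    seen = defaultdict(set)
    for c, x in zip(class_labels, covariate_values):
        seen[c].add(x)
    return sorted(c for c, values in seen.items() if len(values) == 1)

# Hypothetical most likely class memberships and a binary covariate.
classes = [1, 1, 2, 2, 3, 3]
covariate = [0, 1, 1, 1, 0, 1]

print(constant_covariate_classes(classes, covariate))
```

Here everyone assigned to class 2 has the same covariate value, so a slope involving that class would not be estimable in this sketch's data.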
I have encountered the following error while running an LCA model without covariates on a dataset that contains both continuous and dichotomous outcomes (class indicators) with no missing values. I'm not sure what to do with this message since I don't have any covariates.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.297D-16. PROBLEM INVOLVING PARAMETER 62.
This parameter refers to:
STARTING VALUES FOR LATENT CLASS REGRESSION MODEL PART
Thanks for your past responses. In the following input I can regress an observed variable, YT2, on a categorical latent variable, CW. Model estimation terminates normally, although I do receive a message about a non-positive definite matrix.
Several users report receiving Mplus output errors when they try this regression. The advice in response is that an observed variable cannot be regressed on a categorical latent variable. Instead, include the observed variable in the USEVARIABLES statement and then examine the class-specific means.
What is the reason why I am able to run the regression? Is it because the observed variable YT2 is a dependent variable in the model?
For the input below: (1) YT2 ON CW is modeled on the between level. Does this mean that between-level clusters influence the association between within-level latent class membership (CW) and YT2?
(2) Is the first regression of YT2 on YT1 (under MODEL: %WITHIN%) expected to produce an intercept for YT2? There were slope estimates in the Within Level results but I did not see intercept estimates. YT2 intercept estimates are in the Between Level output, and I assume they are for the regression on CW.
(3) If I un-commented the regression lines under MODEL CW, would I be allowing the regression of YT2 ON YT1 to be different for each class? Thank you.
which should not be used because it regresses a continuous between-level random effect on a categorical latent class variable, which is not the Mplus design. The CW means can vary across CB classes without saying this.
(2) An observed variable has only one intercept and this is by default printed on Between.
(3) Yes, but you can't have it in the Overall part as well because that would lead to non-identification.
S Elaine posted on Wednesday, March 18, 2015 - 12:42 pm
Van Horn et al. (2009) stated: "In general, we believe that regression mixture models are best viewed as a large-sample technique, though further methodological research is needed before sample size guidelines are provided." Are there any recent guidelines about sample size for Latent Class Regression? I am unable to locate recent articles addressing this issue. I ask, because in exploring this technique our research team found three distinct groups based on differential effects of 3 risk factors on 3 mental health outcomes; however, we only have 291 children in our sample. I am wondering if it is reasonable to proceed with examining two predictors of group differences. Thank you for your help.
I am running a multi-group latent class model with medical conditions defining latent class (c) and HIV serostatus as the knownclass. I have a set of covariates on which I want to regress both the latent and known classes. I do not want to treat them as distal variables but do want to allow them to influence class membership and within class conditional probabilities.
The categorical covariates are fine. The issue is with the covariate age in years, which is measured on an interval level. To get the model to run and converge, I have to use these statements in the MODEL command:
Model: %Overall% c with age; c on newrace2; c on orient; c on k6cat; c on alcuse2; c on mjuse2; c on methuse2; c on eduse2; c on popuse2;
This produces means for age and ORs for the categorical predictors. If I flip the statement "c with age" to be "c on age" to get ORs for each increased year of age as I might in an ordinary LR, the model does not converge and/or runs forever. I suspect the distribution of age, which has few cases at the upper end, might contribute to this.
Is there a better way to get ORs with CIs for the age variable? If not, is there a way to get significance tests to compare mean ages across the different latent classes, or do I just have to use the CIs for age in the printout to figure that out myself?
Thanks for any help.
Jon Heron posted on Monday, October 19, 2015 - 9:17 am
I guess in a more general setting you could derive the directional association from the ratio of the covariance and the variance of the independent variable, however I am struggling to envisage this when C is latent nominal - what does a covariance even represent in this situation?
Also, when you say "tests to compare mean ages across the different latent classes" it sounds like you are now thinking of age as being dependent.
There are discussions in the technical appendices regarding continuous dependent variables causing problems when their distribution is non-normal; however, I'm not aware of this problem when the variable is a predictor (indeed, its use as a predictor was used as one solution to this (LTB)).
Just to add to Jon's answer, perhaps you want to scale down the age variable, e.g. centering it and/or dividing it by 10.
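To see what the rescaling does to the reported odds ratio, here is a small Python sketch. The slope value is hypothetical, and the helper name `odds_ratio` is just a label for this illustration. Dividing age by 10 multiplies the logit slope by 10, so the odds ratio then refers to a 10-year difference; centering changes only the intercept, not the slope.

```python
import math


def odds_ratio(slope, units=1.0):
    """Odds ratio for a change of `units` in the covariate, given a logit slope."""
    return math.exp(slope * units)


b_per_year = 0.03                # hypothetical logit slope for age in years
b_per_decade = b_per_year * 10   # slope implied when age is divided by 10

or_year = odds_ratio(b_per_year)
or_decade = odds_ratio(b_per_decade)

print(round(or_year, 4), round(or_decade, 4))
# The per-decade OR equals the per-year OR raised to the 10th power.
print(math.isclose(or_decade, or_year ** 10))
```

Confidence limits transform the same way: exponentiate the limits of the rescaled slope to get limits for the per-decade odds ratio.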
Chris Giebe posted on Tuesday, November 15, 2016 - 2:33 am
Hello, I'm trying to include a covariate into my two-level model, with class 1 as a reference class. I've been using example 10.6 of the user's guide and your ASB example of topic 5 part 3 video as a reference, to create the following model:
MODEL: %WITHIN% %OVERALL% c ON PGCASMIN; %c#1% [PLB0357$1-PLB0350$1*0]; %c#2% [PLB0357$1-PLB0350$1*1]; %c#3% [PLB0357$1-PLB0350$1*2]; %c#4% [PLB0357$1-PLB0350$1*3];
%BETWEEN% %OVERALL% f BY c#2-c#4; f ON w;
but am getting this error:
*** ERROR in MODEL command Unknown variable(s) in a BY statement: C#2-C#4
Why not use the total score as the observed distal outcome?
Running several outcomes gives the same result as running one at a time.
Chris Giebe posted on Saturday, April 01, 2017 - 7:06 am
Thanks for the quick response. I guess that makes sense to create a summary score beforehand, and then include that in the model.
I do have a follow-up question:
In the output, under MODEL RESULTS I am seeing the class specific Estimates, S.E., Est./S.E., and p-values columns. Am I correct in understanding that the intercepts are the class means of my outcome variable? I'm noticing under RESIDUAL OUTPUT (I'm assuming this is Tech4?) ESTIMATED MODEL AND RESIDUALS (......) that there are also Model Estimated Means for my covariate and outcome. These are vastly different than the intercepts under MODEL RESULTS. Which ones do I report? The Model Estimated Means under RESIDUAL OUTPUT or the intercepts under MODEL RESULTS?
The intercepts for the outcomes are not the means for the outcomes - just like in regular regression.
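A small numeric sketch of that point, in Python. It assumes a linear regression of the outcome on a single covariate with a slope held equal across classes; all numbers and the helper name `model_implied_mean` are hypothetical. The model-implied mean of the outcome is the intercept plus the slope times the covariate mean, so it coincides with the intercept only when the covariate mean is zero (for example, after centering).

```python
def model_implied_mean(intercept, slope, covariate_mean):
    """Mean of y implied by y = intercept + slope * x + residual."""
    return intercept + slope * covariate_mean


# Hypothetical class-specific intercepts, common slope, and covariate mean.
intercepts = {1: 2.0, 2: 3.5}
slope = 0.5
x_mean = 4.0

means = {c: model_implied_mean(a, slope, x_mean) for c, a in intercepts.items()}
print(means)

# With a centered covariate (mean 0) the intercept and the mean coincide.
centered = {c: model_implied_mean(a, slope, 0.0) for c, a in intercepts.items()}
print(centered == intercepts)
```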
Jenny Chang posted on Monday, November 13, 2017 - 4:32 am
I am trying to use the 3-step approach to compare a distal outcome, PND, across classes. My preliminary results from the automatic BCH method are consistent with those from the traditional 3-step approach. I then controlled for the effect of a covariate, AG, on PND; this effect was not assumed to vary across classes.
(1) The manual BCH results show a Classification Probabilities matrix with negative values and values above 1, and the results are very different from those of the automatic BCH. Does the method fail in this case? Web note 21 mentions that holding the variance of the distal outcome equal across classes may solve this problem, but I did not find sample code.
(2) I followed the syntax in Appendix E of Asparouhov & Muthén (2014) to run the manual 3-step approach (Vermunt, 2010), and compared the intercepts of PND with a Wald test (syntax below) in order to compare their means. Does this make sense?
(3) If step 2 makes sense, I found fewer significant pairwise comparisons than with the automatic BCH and the traditional 3-step approach. Is the result of this manual 3-step approach robust, given that the association between C and the distal outcome is even lower than with the traditional approach?
(4) Based on the current results, which approach is recommended?
Model: %overall% PND on AG; %C#1% PND (a1); %C#2% PND (a2); … MODEL TEST: 0 = a1 - a2;
I want to ask about MODEL RESULTS when there is a 2-class latent variable and binary observed variables: what do the Thresholds in each class refer to? Do they correspond directly to the coefficients of the logit function (alpha and beta)? logit(prob(y=1|z)) = alpha + beta(z)
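As a rough guide to reading those numbers: with the default logit link, a threshold for a binary item enters with a negative sign, so P(y=1 | class) = 1 / (1 + exp(threshold)); in the notation above, alpha in a given class corresponds to minus that class's threshold, and without a covariate z there is no beta term. The Python sketch below uses made-up thresholds, and the function name `prob_endorse` is only a label for this illustration.

```python
import math


def prob_endorse(threshold):
    """P(y = 1 | class) for a binary item with a logit threshold."""
    return 1.0 / (1.0 + math.exp(threshold))


# Hypothetical class-specific thresholds for one binary item.
thresholds = {1: -1.5, 2: 2.0}
probs = {c: prob_endorse(t) for c, t in thresholds.items()}

for c, p in probs.items():
    print(f"class {c}: threshold {thresholds[c]:+.1f} -> P(y=1) = {p:.3f}")
```

A large negative threshold therefore means a high probability of endorsing the item in that class, and a large positive threshold means a low probability.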
Hello, I’m trying to run a multi-group LCA. My groups are designated as known classes, g, and I want to estimate five classes. I want to regress the latent class variable on a simple dichotomous education covariate and analyse group-specific results. I think this is correct? Input:
classes= g(2) c(5); knownclass= g (group=0 group=1); Analysis: Type = mixture ; Starts= 25 25; Stseed=54321; Model: %overall% c on eduts; c on g; Model g: %g#1% c on eduts
Hi, I am trying to assess the influence of a latent class variable c on distal outcomes “AGG_PHYS” and “AGG_MENT”. I used the following syntax:
DATA: FILE IS C:\Users\Owis Eilayyan\Desktop\PhD\Scoring\Dec2017\LCA\DATA\LBP_Datav6A.dat;
VARIABLE: NAMES ARE ID red age gender marital children educ empl social Ethnicity hand AGG_PHYS AGG_MENT PainS PainInt ODI HADS_D HADS_A PHQ PF RP BP GH VT RE SF MH Effic FABQph FABQw KeelT KeelS;
usevariables are AGG_PHYS AGG_MENT PainS PainInt ODI HADS_D HADS_A Effic FABQph FABQw;
missing = .;
CLASSES = c (3);
AUXILIARY = AGG_PHYS (DU3STEP) AGG_MENT (DU3STEP);
ANALYSIS: TYPE = MIXTURE;
MODEL: %OVERALL% AGG_PHYS ON C; AGG_MENT ON C;
PLOT: TYPE = PLOT3;
OUTPUT: TECH1 TECH8 TECH10 TECH11 TECH14;
However, I got this error message: “Unknown variable(s) in an ON statement: AGG_PHYS”. How can I run a regression analysis with distal outcomes using DU3STEP command?
You don't say "...ON C" in Mplus, just like you don't regress anything on a nominal variable. For correct use of the 3-step approach with a distal outcome, see the 2 papers on our website:
Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21:3, 329-341. The posted version corrects several typos in the published version. An earlier version of this paper was posted as web note 15. (Download appendices with Mplus scripts).
Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Using the BCH method in Mplus to estimate a distal outcome model and an arbitrary second model. Web note 21.