Anonymous posted on Tuesday, February 15, 2005 - 12:57 pm
I am trying to fit an LCA model where I have 6 unordered, 3 category observed variables and 3 latent classes. I have the following questions:
1. The output gives me estimates for two of the three categories of each observed variable for each class. Are these the logits for the probability of cases within each class giving this response, relative to the reference category (where the reference category is the lowest numerical value of the observed nominal variable)?
2. Is it possible to regress a continuous (or binary) dependent variable on the latent class variable to address the question of whether class membership is associated with some distal outcome? When I try this I get the following error message:
*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
  FAVLIB ON C#1
*** ERROR
One or more MODEL statements were ignored. These statements may be incorrect or are only supported by ALGORITHM=INTEGRATION.
3. Where can I find information about how to use start values to change the reference category in multinomial logistic regression?
It sounds like you need to add ALGORITHM=INTEGRATION to the ANALYSIS command for that statement.
With observed variables, the reference category is the category with the highest number. I think you would need to renumber the categories using DEFINE if you want to change the reference category. Start values are used to change the last class.
Anonymous posted on Tuesday, February 15, 2005 - 7:05 pm
I can't seem to get the algorithm=integration part to work. Below is my output with error messages:
DEFINE:
  q1=3; if (euro2w1 eq -9) then q1=2; if (euro2w1 eq -1) then q1=1; if (euro2w1 eq 0) then q1=4;
  q2=3; if (WAGE2W1 eq -9) then q2=2; if (WAGE2W1 eq -1) then q2=1; if (WAGE2W1 eq 0) then q2=4;
  q3=3; if (TAX1W1 eq -9) then q3=2; if (TAX1W1 eq -1) then q3=1; if (TAX1W1 eq 0) then q3=4;
  q4=3; if (TAX2W1 eq -9) then q4=2; if (TAX2W1 eq -1) then q4=1; if (TAX2W1 eq 0) then q4=4;
ANALYSIS:
  TYPE = MIXTURE;
  starts=50 50;
  algorithm = integration;
FAVLIB ON C#1;
*** WARNING in Model command
All variables are uncorrelated with all other variables within class. Check that this is what is intended.
*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
  FAVLIB ON C#1
Also, when I have a logit of -15 for the probability of a response within a latent class, can I take this to indicate that no one gave this response in this class? Or does it indicate a problem with the model? Thanks again,
I'm sorry. I didn't really look at your model, just the error message, which wasn't accurate in this situation. You cannot regress an observed variable on a categorical latent variable. Letting the means of the observed variable vary across classes captures that relationship.
Anonymous posted on Wednesday, February 16, 2005 - 6:10 am
I suppose another way might be to save latent class membership variables and then use these as predictors in a subsequent model? Not as desirable as a single estimation but would get at the question directly. I'm not sure how to obtain the means of different observed variables across classes. If I specify continuous variables in the usevariables command, the output gives me means for these observed variables across classes but it also appears to change the parameters given for the nominal variables. Are the latter still logits for response given class membership? I will send output for the model in question.
Also, you didn't address my question regarding the meaning of logits of -15 and 15 with zero standard error in the LCA. Does this indicate a problem for the model? Thanks,
Saving the latent class membership would not do what you want. In addition, this approach introduces estimation errors and the standard errors will be too small.
The output that you sent does what you want. Think of the regression of y ON c where y is a continuous variable and c is a categorical latent variable. This regression results in the means of y changing over the classes of c. So you don't specify it using an ON statement. You just allow the means to vary over c. Your output has the means varying over c. So you have what you want.
Values are fixed when they become extreme. In your case, it means that in certain cells for certain classes, there is a very high or very low probability.
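To see why a logit fixed at -15 or +15 amounts to a probability at the boundary, here is a minimal arithmetic sketch in plain Python (not Mplus syntax). It uses the two-category inverse logit; the function name is illustrative. For a nominal item the logit is relative to the reference category, so the probability for a category with logit -15 can be no larger than the value computed here.

```python
import math

def inv_logit(x):
    """Convert a logit (log odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Logits at the extreme values discussed in the question above
p_low = inv_logit(-15.0)   # essentially 0
p_high = inv_logit(15.0)   # essentially 1

print(f"{p_low:.2e}", f"{p_high:.7f}")
```

In other words, a logit of -15 says the response is practically never given in that class, which is a boundary estimate rather than necessarily a model problem.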
Anonymous posted on Wednesday, February 16, 2005 - 4:40 pm
But the additional observed variable then seems to go into the formation of the latent classes (which I don't want). It also changes the parameter estimates for the categories of the observed nominal variables to be means (or is this just the label?). My interpretation of the parameters before adding in the 'dependent' observed variable was logits representing the probability of each response alternative in each class. Now I'm not sure what they represent. thanks again for your time and advice,
When you add another variable, it will change the results. It is not possible for this variable not to contribute to the formation of the classes because it is related to c. The values listed under means for continuous variables are regular means. The values listed under means for nominal variables are logits.
Elmar posted on Wednesday, March 23, 2005 - 3:00 pm
Dear Linda Muthen,
I am new to LCA. Following the discussion above, I understand that one cannot regress an observed variable on a categorical latent variable. My question: is it possible with Mplus to regress one latent categorical variable on another latent categorical variable, e.g. to analyse whether membership in one latent class predicts membership in another latent class (in another variable)? I think this can't be done using Latent GOLD. Thanks, Elmar
bmuthen posted on Wednesday, March 23, 2005 - 4:14 pm
Yes, you can regress one latent categorical variable on another latent categorical variable in Mplus. There are examples of that in the version 3 User's Guide, e.g. latent transition analysis.
Note that you can also achieve what amounts to regression of an observed variable on a categorical latent variable. You just don't use ON. Instead, the observed variable is influenced by the categorical latent variable by the observed variable mean (and/or variance) changing across the latent classes.
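The equivalence described above can be checked with simple arithmetic: under dummy coding with the last class as the reference, the "regression coefficient" for class k is just the difference between its mean and the reference-class mean. The sketch below is plain Python for illustration only, and the class means are made-up numbers, not results from this thread.

```python
# Hypothetical class-specific means of a continuous outcome y (illustration only)
class_means = {1: 4.2, 2: 3.1, 3: 2.5}

reference = max(class_means)          # treat the last class as the reference
intercept = class_means[reference]    # the reference-class mean plays the role of the intercept

# Dummy-coded "slopes": mean of class k minus mean of the reference class
slopes = {k: m - intercept for k, m in class_means.items() if k != reference}

for k, b in sorted(slopes.items()):
    print(f"class {k} vs class {reference}: {b:+.2f}")
```

So class-varying means carry the same information an ON statement would, which is why no ON statement is needed.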
Anonymous posted on Tuesday, March 29, 2005 - 3:54 pm
I have a question about interpreting LCA results with nominal indicators (career, with 5 categories) over multiple waves. The lowest category is govt job, followed by business, education, law and other. I built the model with c (2), i.e. a two-class LCA model.
The LCA outputs are:
                 Estimate     S.E.   Est./S.E.
Latent Class 1
Means
    CAR1#1          1.184    0.208       5.698
    CAR1#2          0.842    0.238       3.533
    CAR1#3         -1.193    0.347      -3.437
    CAR1#4         -0.762    0.299      -2.554
    CAR2#1          1.176    0.210       5.593
    CAR2#2          0.862    0.240       3.583
    CAR2#3         -1.274    0.359      -3.552
    CAR2#4         -0.763    0.299      -2.550
    ....... till the last wave.
Latent Class 2
Means
    CAR1#1          1.798    0.218       8.245
    CAR1#2         -0.355    0.290      -1.222
    CAR1#3          0.484    0.255       1.896
    CAR1#4          0.745    0.235       3.170
    CAR2#1          1.767    0.216       8.163
    ...... till the last wave.
Categorical Latent Variables
Means
    C#1            -0.413    0.092      -4.491
My questions are: Are the means logits? How should I interpret it in the context of LCA?
Thank you very much for your time. I will really appreciate your help.
Your means/intercepts are logits. See the second to last section in Chapter 13. The example with no covariates corresponds to a nominal variable LCA where only the intercepts are used.
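As a worked illustration of reading these logits, the sketch below converts the CAR1 estimates posted above into response probabilities, taking the last category ("other") as the reference with its logit fixed at 0. It is plain Python arithmetic, not Mplus syntax. The function name and the short category labels are illustrative, and the mapping of CAR1#1 to CAR1#4 onto the first four categories follows the ordering given in the question.

```python
import math

def nominal_probs(logits):
    """Multinomial-logit probabilities; the last category is the reference (logit 0)."""
    exps = [math.exp(v) for v in logits] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]

categories = ["govt", "business", "education", "law", "other"]

# CAR1#1..CAR1#4 estimates copied from the posted output
class1 = nominal_probs([1.184, 0.842, -1.193, -0.762])
class2 = nominal_probs([1.798, -0.355, 0.484, 0.745])

# C#1 = -0.413 is the logit of class 1 relative to class 2 (the last class)
p_class1 = math.exp(-0.413) / (1.0 + math.exp(-0.413))

for name, p1, p2 in zip(categories, class1, class2):
    print(f"{name:10s} {p1:.3f} {p2:.3f}")
print(f"P(class 1) = {p_class1:.3f}")
```

Each column sums to 1, so the two classes can be compared by their career profiles at wave 1, and the C#1 logit translates into the estimated share of cases in class 1.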
Jinseok Kim posted on Thursday, February 08, 2007 - 5:30 pm
I am trying to: (1) use seven binary indicators (DLOCA-DNBGRP) for a four-class LC (c); (2) use the LC (c) as one of the predictors of a 4-category nominal DV (PCARE); and (3) use a series of observed covariates (SEX - HINCOME) that influence both the LC (c) and the DV (PCARE). My questions:
1. How can I incorporate step 2 (PCARE regressed on c) of the model into Mplus if I cannot use "PCARE#1 PCARE#2 PCARE#3 on C#1 C#2 C#3 ..."?
2. What should I look for in the output to interpret it as in multinomial logistic regression?
3. The following syntax ran well. Can you tell me what model I estimated?
Thanks. Here's the syntax.
TITLE: LCA regression ;
DATA: FILE IS "choicefactor_pcare.txt";
VARIABLE:
  NAMES ARE BASMID DLOCA DCOST DRELY DLERN DCHIL DHROP DNBGRP PCARE SEX RESPSEX
    RESPAGE calcmonthage black hisp asia other npguadian momonly MOMGRADE
    dimomwork welf3yr HGOVCUR HINCOME;
  USEVARIABLES ARE DLOCA - HINCOME;
  CLASSES = c (4);
  CATEGORICAL = DLOCA - DNBGRP SEX RESPSEX black - momonly dimomwork welf3yr HGOVCUR;
  NOMINAL = PCARE;
ANALYSIS:
  TYPE = MIXTURE;
  ALGORITHM = INTEGRATION;
MODEL:
  %OVERALL%
  PCARE#1 PCARE#2 PCARE#3 on C#1 C#2 C#3 SEX - HINCOME;
  C#1 C#2 C#3 ON SEX - HINCOME;
Jon Elhai posted on Monday, July 28, 2008 - 8:46 pm
Linda, I ran a 6-class latent class analysis. I am trying to change the start values so that one particular class I'm interested in serves as the reference category for regressing the latent class on covariates. However, after trying to select various of the classes as class #6 (reference class), I can't seem to make my particular class the reference category - it shows up as a class other than class #6. Any suggestions?
I have a similar problem. I want to regress a binary outcome variable (educational level) on a latent categorical variable with 3 classes (social class) and one continuous latent variable (IQ).
You explain that when the outcome variable is continuous you don't include the LCA variable in the ON statement because Mplus already models class-specific means of the outcome variable. However, in my setup the outcome variable is binary and I want a logit/probit model. So, I don't want to model class-specific means of the outcome variable but rather class-specific log-odds/logits of y=1 vs. y=0 (with one of the three latent classes being the reference category). Is this possible in Mplus, and where in the output do I find the estimated log-odds ratios? My Mplus output has a section called "Categorical latent variable means" with estimates of c#1 and c#2 (I have three classes in my model, so I assume that c#3 is the reference group here). These estimates look like what I want, but I'm not sure. Are they logits? Also, while the sizes of the estimates look about right, they have the opposite sign of what I would expect. Does Mplus estimate y=0 vs. y=1 as the default?
I should say that I also ran my model with a continuous rather than a binary outcome variable (exam results) and this works fine.
Since LC3 is the "lowest" social class, the positive logit of 0.647 for the contrast with LC1 (the "highest" class) means that LC1 has a higher probability of y=1 vs. y=0 relative to LC3. This makes sense. How would I construct SEs for the estimate of 0.647?
The numbers you create are log odds ratios. You can exponentiate them and obtain odds ratios which is a good way to explain how the classes relate to the distal outcome. You can do this in MODEL CONSTRAINT and thereby obtain standard errors for the odds ratios or the log odds ratios.
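The arithmetic behind this reply can be sketched in plain Python (not Mplus syntax). As I understand the Mplus parameterization, the logit of y=1 for a binary item in a class is minus that class's threshold, which would also explain the sign reversal noticed earlier. The two thresholds below are hypothetical values chosen only so that their contrast reproduces the 0.647 quoted above, and the standard error is likewise made up. The delta-method line is a generic first-order approximation, not a claim about what the software reports.

```python
import math

# Hypothetical class-specific thresholds of the binary outcome (illustration only).
# Assumed parameterization: logit of y=1 in a class = minus that class's threshold.
tau_lc1 = -0.147
tau_lc3 = 0.500

log_or = tau_lc3 - tau_lc1      # log odds ratio, LC1 vs. LC3 (reference)
odds_ratio = math.exp(log_or)

# Hypothetical standard error of the log odds ratio (not from the thread)
se_log_or = 0.20
se_or = odds_ratio * se_log_or  # first-order delta-method approximation

# 95% interval built on the log scale, then exponentiated
ci = (math.exp(log_or - 1.96 * se_log_or), math.exp(log_or + 1.96 * se_log_or))

print(round(log_or, 3), round(odds_ratio, 2), round(se_or, 2))
```

A log odds ratio of 0.647 corresponds to an odds ratio of about 1.91, so the odds of y=1 in LC1 are roughly twice those in LC3.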
yawen posted on Wednesday, March 10, 2010 - 3:48 am
Dear Dr. Muthen,
I have similar questions, but I still don't know how to obtain standard errors.
"action" is a binary variable. I would like to see how the thresholds of action vary across three latent classes. My model is as follows. The question is how I should write the model constraints to get the three standard errors I need to compare the three thresholds.
You should use MODEL TEST or difference testing to see if the thresholds vary across classes. It is not necessary to compute standard errors.
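For readers who want to see what a joint test of equal thresholds involves, here is a minimal sketch of a Wald test in plain Python. All numbers (the three thresholds and their covariance matrix) are made up for illustration; real values would come from the estimated model. The sketch is meant to show the general technique, not to reproduce the software's output.

```python
import math

# Hypothetical thresholds of "action" in classes 1-3 and their covariance matrix
# (made-up numbers, illustration only).
t = [0.40, 0.10, -0.55]
V = [[0.040, 0.002, 0.001],
     [0.002, 0.050, 0.003],
     [0.001, 0.003, 0.060]]

# Null hypothesis t1 = t2 = t3, written as two contrasts: t1 - t2 = 0, t2 - t3 = 0
R = [[1, -1, 0], [0, 1, -1]]

# d = R t, S = R V R'
d = [sum(R[i][j] * t[j] for j in range(3)) for i in range(2)]
RV = [[sum(R[i][k] * V[k][j] for k in range(3)) for j in range(3)] for i in range(2)]
S = [[sum(RV[i][k] * R[j][k] for k in range(3)) for j in range(2)] for i in range(2)]

# Quadratic form d' S^-1 d using the closed-form inverse of a symmetric 2x2 matrix
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
wald = (S[1][1] * d[0] ** 2 - 2 * S[0][1] * d[0] * d[1] + S[0][0] * d[1] ** 2) / det

# With 2 degrees of freedom the chi-square upper tail is exactly exp(-x/2)
p_value = math.exp(-wald / 2.0)
print(round(wald, 3), round(p_value, 4))
```

A single joint test like this answers whether the thresholds differ across classes without requiring a separate standard error for each pairwise comparison.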
NR posted on Thursday, September 02, 2010 - 2:31 am
I'm trying to use class membership from LCA as a dependent variable in the subsequent regression analysis. I read from one of your articles posted here that this may produce incorrect estimates or standard errors, but I have to stick to this two-step approach for several reasons.
My question is, if I create multiple plausible values for latent classes and use them in the subsequent regression, does it help to produce correct estimates and standard errors? Or are there better ways to deal with this problem?
Clark, S. & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis.
Asparouhov, T. & Muthén, B. (2010). Plausible values for latent variables using Mplus.
The second concludes that to generate the plausible values which will not produce biases you need to include the covariates in the model that generates the PVs - which is what you say you don't want to do. If you don't, you need a high entropy, say > 0.8 (see the first paper).
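The entropy referred to here is usually computed from the posterior class probabilities as a relative entropy between 0 and 1. The sketch below, in plain Python, uses the standard formula 1 - sum(-p ln p) / (n ln K); I am assuming this matches the value reported in the output. The posterior probabilities are hypothetical.

```python
import math

def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; values near 1 indicate clear class separation."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:            # treat 0 * log(0) as 0
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical posterior class probabilities for four cases and three classes
posteriors = [
    [0.97, 0.02, 0.01],
    [0.05, 0.90, 0.05],
    [0.01, 0.04, 0.95],
    [0.60, 0.30, 0.10],
]
entropy = relative_entropy(posteriors)
print(round(entropy, 3))
```

Cases whose probabilities are spread across classes, like the last row, pull the value down; perfectly classified cases contribute nothing to the sum.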