Rebecca posted on Thursday, November 20, 2003 - 5:18 am
Hello. I have been able to develop an interesting 5-class latent profile model in the analyses I’ve been running. Now I am thinking of adding covariates to the model. To see if this might be worthwhile, I quickly ran a few chi-square tests to see whether there would be statistically significant differences at this level (e.g., race/ethnicity across the 5 classes). Unfortunately I did not find significant differences with these analyses. My question is this: might it still be worthwhile to run an LPA with covariates? Would this type of multivariate analysis find nuanced significant differences that I am not picking up in the straightforward chi-square? Thanks in advance for any guidance you can provide.
I think what you did is classify people into their most likely class and then run a chi-square test of class membership by each covariate. If so, and you found no significance, there probably is not any. A stepwise approach like yours yields standard errors that are too small, so if anything it errs toward falsely finding significance. You can always add covariates to the LPA model and see what happens.
Rebecca posted on Thursday, November 20, 2003 - 10:43 am
Yes. This is exactly what I did. Thank-you for your guidance. As you suggest, I think I will add a covariate to the LPA model and see what happens because I have substantive questions that this type of analysis may help answer.
Now I have one follow-up question based on your comment about the too-small standard errors and false significance:
In order to assess the LPA model’s interpretability and usefulness, I have been running follow-up analyses (e.g., MANOVAs) using substantively relevant auxiliary variables as outcomes to determine if there are significant differences among the classes on these variables. (I’ve run these after creating the LPA model and confirming that I do not have a local solution.) Is this appropriate? Or will I also have too-small standard errors for such analyses because I am using a stepwise procedure here as well? If it helps to know, I am using cross-sectional data for these analyses.
Again thank-you for your helpful guidance and speedy response!
Whenever you assign a person to the most likely class and treat class membership as a given, i.e., ignore sampling variability, you are going to have bias in your standard errors. They will be too small. It is always best to estimate the entire model at the same time. You can add your auxiliary variables to your analysis as covariates or distal outcomes whichever is most appropriate.
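To make the point about most-likely-class assignment concrete, here is a minimal Python sketch. The posterior probabilities are made up for illustration (they are not from any analysis in this thread) and the function names are hypothetical. It computes the modal class for each person and the relative entropy summary often reported for mixture models, where values near 1 indicate little classification uncertainty.

```python
import math

def modal_class(post):
    """Return the 1-based index of the most likely class for one person."""
    return max(range(len(post)), key=lambda k: post[k]) + 1

def relative_entropy(posteriors):
    """Entropy-based classification quality in [0, 1]; 1 means no uncertainty."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for post in posteriors:
        for p in post:
            if p > 0:  # p * log(p) is taken as 0 when p == 0
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical posterior probabilities: 4 people, 3 classes.
posteriors = [
    [0.98, 0.01, 0.01],
    [0.55, 0.40, 0.05],
    [0.10, 0.85, 0.05],
    [0.34, 0.33, 0.33],
]
assignments = [modal_class(p) for p in posteriors]
entropy = relative_entropy(posteriors)
print(assignments)
print(round(entropy, 3))
```

In this made-up example the second and fourth persons are both assigned to class 1 even though their membership is far from certain. A follow-up analysis on the assigned classes treats them exactly like the first person, which is the uncertainty that gets ignored.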
Rebecca posted on Thursday, December 04, 2003 - 2:14 pm
Thanks for your comments from a few weeks ago. I have continued to work on the latent profile analysis and now have a follow-up question.
I have three continuous background variables that I want to add to the LPA to determine if class membership varies as a function of these background variables. I have been using the latent class analysis with covariates example (example 25.10 in the Mplus manual) as my guide in this analysis, although my class indicators are interval, not binary (i.e., this is an LPA model, not an LCA). Is this acceptable? I have been able to obtain an identified model, but I want to make certain that I am on the right track. And if I am on the right track, can I interpret the output in the same way for the LPA as I would for an LCA with covariates? That is, is this still multinomial logistic regression?
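For readers wondering what a multinomial logistic regression of class membership on covariates implies, the following Python sketch converts intercepts and slopes into class probabilities. It assumes the last class is the reference class with its logit fixed at zero, which is the usual default parameterization. The coefficient values and the function name are hypothetical.

```python
import math

def class_probabilities(intercepts, slopes, x):
    """Multinomial logit with the last class as reference (logit fixed at 0).

    intercepts: K-1 intercepts, one per non-reference class
    slopes:     K-1 lists of slopes, one slope per covariate
    x:          covariate values for one person
    """
    logits = [a + sum(b * xi for b, xi in zip(bs, x))
              for a, bs in zip(intercepts, slopes)]
    logits.append(0.0)                      # reference class
    m = max(logits)                         # subtract the max for numerical stability
    expd = [math.exp(l - m) for l in logits]
    s = sum(expd)
    return [e / s for e in expd]

# Hypothetical 3-class model with one covariate.
intercepts = [0.5, -0.2]
slopes = [[1.0], [-0.5]]
low = class_probabilities(intercepts, slopes, [0.0])
high = class_probabilities(intercepts, slopes, [1.0])
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

Each slope is a change in the log odds of being in that class rather than the reference class per unit of the covariate. This is the same whether the class indicators are binary (LCA) or continuous (LPA), because the indicators do not enter this part of the model.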
Anonymous posted on Wednesday, August 18, 2004 - 1:02 pm
I have created latent classes using factor mixture modeling. When I add covariates (children's scores on mental health measures), the classes change. I want to continue to examine the scores as covariates (rather than including them as indicators), as this fits best with my theory. In other words, children's scores are not part of the latent construct I wish to model, but I am interested to know how scores vary according to latent class probability.
Am I going about this the correct way? Thank you.
bmuthen posted on Wednesday, August 18, 2004 - 1:06 pm
Covariates, not only indicators, can and should influence class formation. Think of it this way: any observed variable correlated with class membership carries information about class membership. In factor analysis with covariates you have the same situation, and in fact ETS uses an extensive list of covariates to produce their factor scores (called "proficiencies" and printed in your morning paper now and then). The issue of class membership changing due to covariates is discussed explicitly in Muthen (2004), which is available in pdf on the Mplus home page.
I've run a latent profile model and ended up with a 3-class solution. I’ve also included covariates to predict class membership, but am unsure how to interpret the covariate output. In the output below, my initial assumption was that the first column represents the parameter estimate, the second a standard error, and the third a test statistic. If this is correct, my second question concerns the interpretation of the test statistic. I was originally thinking it could be evaluated against a z distribution (e.g., absolute values greater than 1.96), but am now confused because this is a multinomial regression, right? For example, from this output, my interpretation was that both Class 2 and Class 3 had significantly lower scores on the “lastsex1” variable compared to Class 1 and that Class 3 had significantly lower scores than Class 2, but that none of the three classes differed on the “relation” variable. Could you please tell me if that is an accurate assessment, or if this should be interpreted differently?
Parameterization using Reference Class 1

 C#2 ON
    LASTSEX1    -1.526    0.570    -2.680
    RELATION    -0.211    0.464    -0.456

 C#3 ON
    LASTSEX1    -3.710    0.520    -7.141
    RELATION    -0.665    0.462    -1.439

Parameterization using Reference Class 2

 C#3 ON
    LASTSEX1    -2.184    0.579    -3.771
    RELATION    -0.454    0.501    -0.905
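As a rough check on how the three columns relate, the Python sketch below recomputes the third column as estimate divided by standard error and attaches a two-sided p-value from the standard normal distribution. It also confirms that the Class 3 versus Class 2 estimates are the differences between the two sets of Reference Class 1 estimates. The numbers are copied from the output above; the function and variable names are hypothetical, and the recomputed ratios can differ in the third decimal because the printed values are rounded.

```python
import math

def z_and_p(estimate, se):
    """Return the estimate/SE ratio and its two-sided normal p-value."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Estimates and standard errors with Class 1 as the reference class.
ref1 = {
    ("C#2", "LASTSEX1"): (-1.526, 0.570),
    ("C#2", "RELATION"): (-0.211, 0.464),
    ("C#3", "LASTSEX1"): (-3.710, 0.520),
    ("C#3", "RELATION"): (-0.665, 0.462),
}
for (cls, cov), (est, se) in ref1.items():
    z, p = z_and_p(est, se)
    print(f"{cls} ON {cov}: z = {z:.3f}, p = {p:.4f}")

# Class 3 vs. Class 2 is the difference of the two logits against Class 1.
diff_lastsex1 = ref1[("C#3", "LASTSEX1")][0] - ref1[("C#2", "LASTSEX1")][0]
diff_relation = ref1[("C#3", "RELATION")][0] - ref1[("C#2", "RELATION")][0]
print(round(diff_lastsex1, 3), round(diff_relation, 3))
```

The estimates for a new reference class can be obtained by subtraction, but their standard errors cannot, because they also depend on the covariance between the two estimates. That is why the output prints a separate parameterization for each reference class.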
Thanks for your reply - that makes perfect sense and thank you for clarifying that I'm still dealing with likelihoods. As a quick follow-up question, I was wondering if it would be appropriate to calculate the odds ratios and confidence limits from the parameter estimates and standard errors to report in the manuscript I am writing.
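A minimal sketch of the calculation asked about here, assuming the usual approach of exponentiating the logit estimate and the endpoints of its normal-theory confidence interval. The function name is hypothetical; the example numbers are the C#2 ON LASTSEX1 estimate and standard error from the output above.

```python
import math

def odds_ratio_ci(estimate, se, z_crit=1.96):
    """Odds ratio and confidence limits from a logit estimate and its SE.

    The interval is built on the logit scale and then exponentiated,
    so it is not symmetric around the odds ratio.
    """
    lower = math.exp(estimate - z_crit * se)
    upper = math.exp(estimate + z_crit * se)
    return math.exp(estimate), lower, upper

odds_ratio, lower, upper = odds_ratio_ci(-1.526, 0.570)
print(f"OR = {odds_ratio:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes 1 corresponds to a logit estimate that differs significantly from 0 at the matching alpha level.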
I am conducting an LPA with 4 classes and 2 continuous predictor variables. I would like to change the order of the classes so that a different class serves as the reference class and I can obtain the corresponding odds ratios. I have read numerous threads in the discussion, and I know that I need to use the ending values of the desired reference class as starting values for the last class. I also know that these values can be found in the output. I have two questions. 1) Which values do I use? and 2) What is the input syntax that I need to use? It seems that example 7.10 is the closest example of what I want to do. I have included my syntax below.
Thanks so much for the help! One point of clarification. Do I want to use the means from the baseline model (the 4-class LPA without covariates) or the means from the first run with a particular covariate?
I am running a multigroup LPA model using the KNOWNCLASS command. I've run the groups separately and in both cases a 3-profile solution was the best fit, based on the VLMR. The interpretation of profiles was the same across groups as well. These profiles were also the same for the total sample.
Is there a way to get the VLMR for a multigroup LPA model? I get the following Warning: TECH11 option is not available for TYPE=MIXTURE with the TRAINING option. Request for TECH11 is ignored.
Is there a way to confirm the number of profiles that are the best fit in a multigroup LPA model?
We are using MPlus to run a LPA to see if different profiles of family engagement exist and if there are relations between these profiles and child and parent demographic characteristics and child outcomes.
When we looked at the results, all but 2 of the auxiliary variables were not in the expected metric. When we looked at the class membership information that was saved, we also found that the variables did not seem to be in the order identified in the output.
Can you help us understand why this happened and how this can be resolved?
Are the variables in the NAMES statement in the order of the columns of the data set? This is the first thing I would check. Also, is the number of variable names in the NAMES statement the same as the number of columns in the data set? It sounds like you may be reading the data incorrectly. Use TYPE=BASIC with no MODEL command to investigate this.
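Before running TYPE=BASIC, the column count can also be checked outside Mplus. The Python sketch below assumes a free-format data file with fields separated by whitespace or commas. The variable names and data rows are invented for illustration, and the function name is hypothetical.

```python
import re

def count_mismatches(names, lines):
    """Return (line_number, n_fields) for each line whose field count
    differs from the number of variable names."""
    expected = len(names)
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) != expected:
            problems.append((number, len(fields)))
    return problems

# Hypothetical NAMES list and data rows; in practice, read the lines
# from the data file named in the DATA command.
names = ["id", "engage1", "engage2", "income"]
sample = [
    "1 3.2 4.1 52000",
    "2 2.8 3.9 61000 7",
    "3 3.0 4.4 48000",
]
problems = count_mismatches(names, sample)
print(problems)
```

A check like this only catches a wrong number of columns. Names listed in the wrong order still have to be found by comparing the TYPE=BASIC descriptive statistics against what is known about each variable.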
anonymous posted on Tuesday, March 19, 2013 - 12:10 pm
When including covariates in an LPA, is there ever a time when you would interpret the intercepts that are presented in the output below the covariate information? (For example):
Categorical Latent Variables

 C#1 ON
    GRADE    -0.174    0.231    -0.754    0.451
    SEX       0.287    0.502     0.572    0.567

 C#2 ON
    GRADE     0.347    0.355     0.978    0.328
    SEX       1.662    0.950     1.749    0.080

 C#3 ON
    GRADE    -0.054    0.249    -0.215    0.830
    SEX      -0.121    0.520    -0.233    0.816
I have recently completed an LPA with auxiliary variables and am trying to obtain more detailed information from my model results. My specific questions are:
1. Since my auxiliary variables are categorical, I know I can’t show means on these added variables, but how can I show frequency distributions on each variable by class membership produced from the LPA?
2. Do you have any recommendations on how to show or discuss the chi-square results? My output shows p-values between pairs of classes, but I can't infer any differences in representation that are greater than expected as you would say from standardized residuals in a chi-square analysis.
3. How do these chi-square tests differ methodologically from a standard chi-square test based on class membership? I only ask this because my chi-square values produced in trying to get answers using SPSS were much larger for one of my variables almost by a factor of 4.
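One descriptive way to tabulate a categorical variable by class without hard assignment is to weight each case by its posterior class probabilities. The Python sketch below illustrates that idea only; it is not the procedure Mplus uses for its auxiliary-variable tests. The posterior probabilities and categories are invented, and the function name is hypothetical.

```python
from collections import defaultdict

def weighted_proportions(posteriors, categories):
    """Class-specific category proportions, with each case contributing
    to every class in proportion to its posterior probability."""
    n_classes = len(posteriors[0])
    totals = [defaultdict(float) for _ in range(n_classes)]
    for post, cat in zip(posteriors, categories):
        for k, p in enumerate(post):
            totals[k][cat] += p
    result = []
    for k in range(n_classes):
        class_total = sum(totals[k].values())
        if class_total == 0:
            result.append({})  # no case has any weight in this class
        else:
            result.append({c: w / class_total for c, w in totals[k].items()})
    return result

# Hypothetical posterior probabilities (2 classes) and a categorical variable.
posteriors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
categories = ["a", "b", "a"]
table = weighted_proportions(posteriors, categories)
print(table)
```

A chi-square test computed in another package on most likely class membership treats every assignment as certain, which may be one reason its values come out larger than those from a method that accounts for classification uncertainty.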
Dear Professors Muthen, I have run a Latent Profile Analysis with 8 variables. The sample is composed of two subsamples (1 recruited online and 1 recruited offline). The two subsamples differ on 2 of the 8 variables (p<.001). Therefore, I have run a multiple-group LPA in order to account for the subsample differences (the observed classes correspond to the online/offline subsamples). Does this make sense to you, or would you suggest another solution? Thank you very much! Andrea
From what I understand, each of these is providing the statistics to compare each class against the reference class. So for example, when compared to class 4, the probability of being in class 1 decreases as stress increases but this is not significant. Is that correct?
Is there a way to interpret or report the significance of the covariate influence overall (e.g. overall did stress predict class membership)?
I've read through a lot of posts regarding when covariates should be included in the models, but I'm still confused.
Setup: Develop sexism profiles (LV) as measured by 4 sexism scales (continuous). Determine if sexism profiles are a greater predictor of attitudes towards father involvement than demos.
Original plan: Conduct an LPA using sexism scales as observed y's to determine the best-fitting model. Assign cases to classes based on post probs, examine differences in the demographics of the classes using cross-tabs, etc. Ultimately class assignment will be entered as the first block in a HMR, with demos in a second block, to examine relationship with father involvement attitudes.
Alternative plan: Conduct an LCA using sexism scores as y's and the demos as u's. After selecting the best-fitting model, examine the classes to determine if the demos differ between classes. Then continue with the HMR as planned.
This is where I start to get confused, because some posts say remove covariates (e.g. demos) one at a time and see if it changes the model solution while others say to add them in one at a time. Which is the best/accepted practice?
Also, I know that when doing an LPA, I have to run each class enumeration 4 times to account for each of the main within-class var/cov structures. Can I still do this with the y's if I'm including categorical u's?
Hello! I'm new to Mplus and getting the help of your homepage a lot. Thank you so much. However, there are still problems I'm dealing with and they are as follows.
My research model, a conditional LPA model, contains (1) 3 variables used for the LPA, which are continuous and correlated with each other, and (2) 8 covariates.
1. Is it possible to use 3 variables measured at different times for the LPA? For example, among the 3 variables used for the LPA in my study, one was measured in 2014 and the others were measured in 2004.
2. I think the data used in the analysis are truncated, because I selected only people who have jobs and applied listwise deletion to the others. I understand that this can create a selection-bias problem in OLS. In this case, do I need to use a Heckman model in my analysis?
3. Should all variables in the LPA meet the normality assumption?