
Roscoe posted on Tuesday, November 02, 2004  2:06 pm



I have a dataset with approximately 1900 observations made up of patients (probands) and family members (brother, sister, mother, father). There are approximately 600 families, each with 2-4 members. I want to perform a latent profile analysis with 5 continuous measures. How do I model the clustering within families?


Clustering for LPA can be handled in two ways in Mplus: TYPE = COMPLEX MIXTURE; or TYPE = TWOLEVEL MIXTURE; These two approaches are described in the introduction to Chapter 9 in the Mplus User's Guide. This introduction is also available at the Mplus website.
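As a minimal sketch of the first approach for data like those described in the question, the input might look as follows. The file name, the variable names (famid for the family identifier, y1-y5 for the five continuous measures), and the choice of three classes are all assumptions made for illustration; none of them come from the original post.

```
TITLE:     LPA with family clustering taken into account
DATA:      FILE = family.dat;        ! hypothetical file name
VARIABLE:  NAMES = famid y1-y5;      ! hypothetical variable names
           USEVARIABLES = y1-y5;     ! the five continuous measures
           CLUSTER = famid;          ! families are the clusters
           CLASSES = c(3);           ! number of classes is illustrative
ANALYSIS:  TYPE = COMPLEX MIXTURE;   ! adjusts standard errors for clustering
```

TYPE = TWOLEVEL MIXTURE uses the same CLUSTER variable but additionally models between-family variation; the User's Guide introduction cited above describes that specification.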

Anonymous posted on Wednesday, April 20, 2005  3:29 pm



I would like to run a model similar to Example 7.12 (LCA with covariates), except that I want it to be a profile model with a categorical covariate. I have tried to run the model with the covariate X declared as categorical and %overall% C#1 on x; but it doesn't run. It will run if I do not declare X as categorical. This is the error message:

*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
C#1 ON X
*** ERROR
One or more MODEL statements were ignored. These statements may be incorrect or are only supported by ALGORITHM=INTEGRATION.


Only dependent variables should be included on the CATEGORICAL list. So it is not necessary to put x on the CATEGORICAL list. 
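A sketch of what this could look like for a profile model, assuming hypothetical indicator names y1-y4, a covariate x coded 0/1, and two classes (all chosen here for illustration):

```
VARIABLE:  NAMES = y1-y4 x;          ! hypothetical names; x is coded 0/1
           USEVARIABLES = y1-y4 x;
           CLASSES = c(2);           ! number of classes is illustrative
           ! no CATEGORICAL statement: y1-y4 are continuous indicators
           ! and x is a covariate, not a dependent variable
ANALYSIS:  TYPE = MIXTURE;
MODEL:     %OVERALL%
           c#1 ON x;                 ! regress class membership on x
```

A nominal covariate with more than two categories would normally be entered as a set of dummy variables rather than as a single variable.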

Tess Yanisch posted on Wednesday, January 14, 2015  6:08 pm



Hello Doctors Muthen and Muthen, Similar in spirit: I'm running an LCA (LPA?) on students' attitudes about different minority groups. I have students nested in schools and covariates (which I'm using as Auxiliary (R)) at both the school and student levels. However, I don't think that school will much affect which class an individual is in; I just want to control for the non-independence of the school-level covariates. Is using CLUSTER a good way of doing this? My current model is like this:

USEVAR = IDSCHOOL TOTWGT {all individual-level and school-level covariates} {variables to be clustered on};
Categorical are {Likert-scale attitude items to be clustered on};
AUXILIARY = (R) {all covariates};
CLUSTER is IDSCHOOL;
WEIGHT IS TOTWGT;
Classes = patrn (7);
Missing are all (9999);
ANALYSIS: Type = mixture complex;
Starts = 100 20;

Second question: I want to see how covariates relate to class membership (whether gender affects how likely a person is to be in a given class, for example). I can't find a resource explaining how to interpret that part of my output, even on this message board, and the UG does not cover output. Do you know where I could find that information? Many thanks for your time.


Q1. Looks good. Q2. See our video and handout for Topic 5, starting with slide 120. This covers multinomial regression with a latent nominal DV. For a general discussion of multinomial regression with an observed nominal DV, see Topic 2. 
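For readers following the pointer to multinomial regression, the standard form of the model may help. This is a sketch of the usual multinomial logit with the last class taken as the reference, which is the usual Mplus convention; the symbols a_k, b_k, and K are notation introduced here and do not appear in the thread.

```latex
P(c = k \mid x) = \frac{\exp(a_k + b_k x)}{\sum_{j=1}^{K} \exp(a_j + b_j x)},
\qquad a_K = b_K = 0,
\qquad
\log \frac{P(c = k \mid x)}{P(c = K \mid x)} = a_k + b_k x .
```

Under this parameterization, a slope b_k is the change in the log odds of being in class k rather than the reference class K for a one-unit increase in x, and exp(b_k) is the corresponding odds ratio.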


Thank you very much for the speedy reply! I wish I had run across the tutorials earlier; big help there. I still have a question, though: the example in Topic 5 uses a MODEL: specification and the variables are not AUXILIARY. My output looks like this:

TESTS OF CATEGORICAL LATENT VARIABLE MULTINOMIAL LOGISTIC REGRESSIONS USING POSTERIOR PROBABILITY-BASED MULTIPLE IMPUTATIONS (PSEUDO-CLASS DRAWS)

PATRN#1 ON    Estimate    SE    Est/SE    2-tailed p-value
[list of covariates]

What is the "Estimate" here, and what does its significance mean? Is it the log odds relative to being in PATRN 7, the only class not listed? Also, am I correct in assuming that the "Latent Class [X] Thresholds" list of variables gives the log odds of that response being selected in Latent Class [X]? Apologies if these are foolish questions. This is my first major project and I unwittingly chose a method no one in my program is familiar with.


I think I need to see your output; please send to Support@statmodel.com along with your license number. 


Hi Prof Muthen, I am performing LCA/LPA where I want to determine latent classes based on physical activity data during work time and non-work time. I have 8 continuous variables, 4 for work and 4 for non-work. I want to see, for example, whether there is a group of people who are very sedentary at work but very active during non-work time. When I perform the analysis including all 8 indicators, I see only 2 classes even though I believe there are more than two. For example, there should be a class of people with high sedentary time at work but low sedentary time during non-work time, which I can't see in my analysis. Is it because the variables within work and within leisure are correlated? Should I add another level to the model to specify that variables within the work domain are more correlated with each other, and similarly for the non-work domain? My syntax is:

Data: File is "U:\lcawl.csv";
Variable: Names are id perc_s_work perc_st_work perc_wa_work perc_mv_work perc_s_leisure perc_st_leisure perc_wa_leisure perc_mv_leisure;
Usevariables are perc_s_work perc_st_work perc_wa_work perc_mv_work perc_s_leisure perc_st_leisure perc_wa_leisure perc_mv_leisure;
IDVARIABLE = id;
CLASSES = c(4);
missing = all(999);
Analysis: type = MIXTURE;
OUTPUT: tech11;


Also, I forgot to mention that the work-related variables sum to 100% of work time and the non-work variables sum to 100% of non-work time, because I normalized the variables to time at work and non-work time respectively.


You can take the approach of using two latent class variables as in UG ex 7.14: one for work outcomes and one for non-work outcomes.
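A rough sketch of how two latent class variables might be set up for continuous indicators is given below. It is not a copy of UG ex 7.14. The shortened variable names (sw, stw, waw, mvw for work; sl, stl, wal, mvl for leisure), the latent class variable names cw and cl, and the choice of two classes each are assumptions made for illustration.

```
VARIABLE:  NAMES = id sw stw waw mvw sl stl wal mvl;   ! hypothetical short names
           USEVARIABLES = sw-mvl;
           IDVARIABLE = id;
           MISSING = ALL (999);
           CLASSES = cw(2) cl(2);    ! work classes and leisure classes
ANALYSIS:  TYPE = MIXTURE;
           PARAMETERIZATION = LOGLINEAR;
MODEL:     %OVERALL%
           cw WITH cl;               ! association between the two class variables
MODEL cw:  %cw#1%
           [sw-mvw];                 ! work means vary across cw classes
           %cw#2%
           [sw-mvw];
MODEL cl:  %cl#1%
           [sl-mvl];                 ! leisure means vary across cl classes
           %cl#2%
           [sl-mvl];
```

With this setup the cross-classification of cw and cl would yield combinations such as "sedentary at work, active in leisure" as one cell of the joint class table.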

nidhi gupta posted on Wednesday, May 31, 2017  5:06 am



Prof Muthen, thank you for your reply. 1. The example in 7.14 is for confirmatory analysis. However, in my case, the research question is to combine work and non-work time. So, I want to find groups of people who are homogeneous with respect to their physical activity and sedentary behavior at work and during non-work time. To do that I should perform latent profile analysis, right? I did that using the model I described above. However, this gives me classes that are not meaningful. For example, I don't get any class where individuals are sedentary at work but active during leisure. What should I do? 2. My other question is: where can I read about the assumptions of latent profile analysis? 3. Is it problematic if my observed variables are correlated with each other, or if some of the variables are more strongly correlated than others?


1) 7.14 is latent class analysis but would be the same for latent profile analysis. I think this is the solution for your problem of not getting the classes you are interested in. 2) There is a book by Collins and Lanza. Or google articles on LPA. 3) The analysis would not be meaningful if your variables weren't correlated. 
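Point 3 can be made concrete with the usual LPA assumption of local independence, meaning that the indicators are uncorrelated within each class. The notation below (K classes, class proportions \pi_k, class-specific means \mu_{jk}, variances \sigma_j^2) is introduced here for illustration and is not taken from the thread.

```latex
f(y_1, \dots, y_p) = \sum_{k=1}^{K} \pi_k \prod_{j=1}^{p} N\!\left(y_j \mid \mu_{jk}, \sigma_j^2\right),
\qquad
\operatorname{Cov}(y_i, y_j) = \sum_{k=1}^{K} \pi_k \,(\mu_{ik} - \bar{\mu}_i)(\mu_{jk} - \bar{\mu}_j)
\quad (i \neq j),
\qquad
\bar{\mu}_i = \sum_{k=1}^{K} \pi_k \mu_{ik} .
```

Under this assumption, all of the correlation between two indicators comes from differences in class means, so indicators that were uncorrelated overall would give the model little to separate classes on.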

nidhi gupta posted on Thursday, June 01, 2017  2:58 pm



Dear Prof Muthen, thank you for your reply. 1. So if I use the model explained in 7.14, is my model correct?

VARIABLE: NAMES ARE id perc_s_work perc_st_work perc_wa_work perc_mv_work perc_s_leisure perc_st_leisure perc_wa_leisure perc_mv_leisure;
USEVARIABLES ARE perc_s_work perc_st_work perc_wa_work perc_mv_work perc_s_leisure perc_st_leisure perc_wa_leisure perc_mv_leisure;
CLASSES = c (3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
%OVERALL%
%c#1%
%c#2%
%c#3%
! need to write the code for the model in these classes. The conditions are:
! class 1. high sedentary work and high sedentary leisure
! class 2. high sedentary work and low sedentary leisure
! class 3. low sedentary work and low sedentary leisure

continued in next post.............

nidhi gupta posted on Thursday, June 01, 2017  2:58 pm



If I understand correctly, I should use this model if I have some hypothesis about what the groups will look like. 2. If I don't want to hypothesize about the number and types of classes, then I should still use the previous model that I mentioned before, right? If so, I ran that model and got 4 classes which were statistically different from each other (LMR P=0.001). However, the class means for a few variables did not make sense. For example, classes 1 and 2 have the highest non-work sedentary time as shown in the figure of the sample means. However, the descriptives show that classes 1 and 3 have the highest mean non-work sedentary time:

class 1. 53.612 %
class 2. 71.028 %
class 3. 51.399 %
class 4. 70.354 %

So basically, the classes shown in the sample mean figure and the descriptives of those classes do not match. Do you know why? Thank you so much for your help so far. I am sorry to bother you so much. Regards, Nidhi


We ask that postings be limited to one window. If you have longer questions they should be sent to Support with your license number. Clearly relate your questions to the outputs you send. 
