

We have longitudinal data at 2 timepoints, 1 year apart (N is about 930). At both timepoints we measured a variable called LEKA. We also have emotion recognition tasks at both timepoints. However, the task at T1 differs from the task at T2 (let's call these variables Emo1_T1 and Emo2_T2). The scores on the two tasks are correlated, but the tasks are very different, so we cannot treat them as one construct. We want to see whether LEKA is related to scores on these two tasks. Of course we could just run a separate correlational analysis at each timepoint, but we would rather examine this longitudinally. Basically, we want to see if we can identify latent classes for LEKA, for instance stable high, stable low, stable middle, increasing, and decreasing scores on this questionnaire. Then we want to see whether class membership relates to scores on Emo1_T1 and Emo2_T2. We looked into growth curve modeling, and it seems that you need 3 or more timepoints for that analysis. What type of analysis could we use?


You can do a latent class analysis on the 2 LEKA variables. You can then relate the Most Likely Class (which you get from the saved CPROBABILITIES) to the two emo variables. If your entropy is less than, say, 0.8, you may want to look at our Web Note 15.
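As a rough illustration of this setup, an input along the following lines could be used. This is only a sketch: the file names, the column order, the ID variable, and the choice of 5 classes are assumptions, and in practice you would compare solutions with different numbers of classes.

```
TITLE:    Latent profile analysis of LEKA at two timepoints (sketch);
DATA:     FILE IS leka.dat;             ! hypothetical file name
VARIABLE: NAMES = id leka_t1 leka_t2 emo1_t1 emo2_t2;   ! assumed column order
          USEVARIABLES = leka_t1 leka_t2;
          IDVARIABLE = id;
          AUXILIARY = emo1_t1 emo2_t2;  ! saved with the output, not in the model
          CLASSES = c(5);               ! 5 classes is an assumption
ANALYSIS: TYPE = MIXTURE;
! No MODEL command: by default the means differ across classes and
! the variances are held equal across classes.
SAVEDATA: FILE IS leka_cprob.dat;       ! hypothetical file name
          SAVE = CPROBABILITIES;        ! posterior probabilities and most likely class
```

The saved file then contains the class probabilities and the most likely class for each person, alongside the two emo variables, so the class variable can be related to them in a follow-up analysis.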

Gerine L posted on Thursday, January 08, 2015  3:04 am



Thank you for your response, this was very helpful. We now want to do some additional analyses. Instead of forming latent classes from Leka_T1 and Leka_T2 only, we also want to add another variable that we measured at these two timepoints: FR_T1 and FR_T2. Can we still use latent class analysis in this case? We are unsure because there are basically 2 forms of interdependency: [Leka_T1 and FR_T1] and [Leka_T2 and FR_T2] are each measured at the same timepoint, and [Leka_T1 and Leka_T2] and [FR_T1 and FR_T2] are each the same measure. Might this be a problem, and is this something we can control for somehow?


You can have 2 latent class variables that are related using WITH and PARAMETERIZATION = LOGLINEAR. Each latent class variable explains the correlation between a variable at its 2 timepoints, and the relation between the latent class variables explains the correlation across the set of variables.
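A minimal sketch of such a model, assuming one latent class variable for LEKA (here called cl) and one for FR (here called cf). The class variable names, the 3 classes each, the file name, and the column order are all assumptions for illustration:

```
TITLE:    Two related latent class variables, LEKA and FR (sketch);
DATA:     FILE IS leka_fr.dat;          ! hypothetical file name
VARIABLE: NAMES = leka_t1 leka_t2 fr_t1 fr_t2;   ! assumed column order
          CLASSES = cl(3) cf(3);        ! class counts are assumptions
ANALYSIS: TYPE = MIXTURE;
          PARAMETERIZATION = LOGLINEAR;
MODEL:
  %OVERALL%
  cl WITH cf;            ! association between the two class variables
MODEL cl:
  %cl#1%
  [leka_t1 leka_t2];     ! LEKA means depend only on the LEKA classes
  %cl#2%
  [leka_t1 leka_t2];
  %cl#3%
  [leka_t1 leka_t2];
MODEL cf:
  %cf#1%
  [fr_t1 fr_t2];         ! FR means depend only on the FR classes
  %cf#2%
  [fr_t1 fr_t2];
  %cf#3%
  [fr_t1 fr_t2];
```

Mentioning the LEKA means only under MODEL cl and the FR means only under MODEL cf is what makes each class variable account for one measure across time, with the WITH statement carrying the cross-measure dependence.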

Gerine L posted on Friday, January 09, 2015  1:26 am



If I understand correctly, you propose that I create a latent class variable (or latent profile, as I have 2 continuous variables) for each timepoint, i.e., one that creates profiles from Leka_T1 + FR_T1 and one that creates profiles from Leka_T2 + FR_T2, and that I then correlate these with each other? My goal is to get profiles of individuals, for example: average on FR at both timepoints with an increase from low to high on LEKA; high on LEKA and low on FR at both timepoints; etc. I then want to relate these profiles to scores on another task.


Reading this again, perhaps what you want is an LTA (latent transition analysis) for LEKA, where at each timepoint you have 1 indicator of a latent class variable and you can see transitions between classes over time, and a parallel LTA for FR. The latent class variables of the 2 LTAs can be correlated.

Gerine L posted on Monday, January 26, 2015  4:34 am



We went back to the original idea of using only the 2 LEKA variables (Leka_T1 and Leka_T2) to estimate latent profiles. Now we have run into a different issue, namely how to test for gender differences. We tried 2 things: (1) using the KNOWNCLASS option, and (2) running a profile analysis separately for boys and girls. The disadvantages: (1) KNOWNCLASS "forces" a set number of classes (e.g., 3) for both boys and girls; (2) we want to know whether running the models separately for boys and girls is significantly better than running one model for boys and girls combined. Usually with path models, we use the grouping function to first see whether there is an overall difference between the models by gender, and then use Wald tests to see where the differences are. Is there a similar possibility for a latent profile analysis for boys and girls? That is, is there a way to determine the best solution for boys, the best solution for girls, and whether a model with or without "grouping" on gender is better? A possible outcome might be: for boys there is a stable high, a stable low, and an increasing group; for girls there is a stable high, a stable low, an increasing, AND a decreasing group; and the model in which profiles were estimated separately for boys and girls was significantly better than the model in which both were estimated together.


Yes, you can do this gender invariance investigation. First, analyze each group separately to see whether they have the same number of classes. If not, you stop there. If yes, you can do a joint analysis using KNOWNCLASS. In addition to the overall model you also have Model male and Model female:

Model:
%overall%
etc
Model male:
%c#1%
etc
Model female:
%c#1%
etc

In the overall part you can have "c ON cg", which lets the genders differ with respect to class percentages; without this statement the percentages are the same. In the gender-specific Model parts you can impose any kind of gender invariance equalities on the indicator means.
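Spelled out as a full input, the joint KNOWNCLASS analysis might look roughly like the sketch below, here in the fully invariant version where the profile means are the same for both genders. The file name, the coding of gender, the column order, and the 3 profiles are assumptions:

```
TITLE:    LEKA profiles with gender as a known class (sketch);
DATA:     FILE IS leka_gender.dat;      ! hypothetical file name
VARIABLE: NAMES = gender leka_t1 leka_t2;   ! assumed column order
          CLASSES = cg(2) c(3);         ! 3 profiles is an assumption
          ! assumed coding: 0 = male, 1 = female
          KNOWNCLASS = cg (gender = 0 gender = 1);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c ON cg;               ! omit to hold class percentages equal across genders
MODEL c:
  %c#1%
  [leka_t1 leka_t2];     ! means differ across profiles, not across genders
  %c#2%
  [leka_t1 leka_t2];
  %c#3%
  [leka_t1 leka_t2];
```

Because the means are mentioned only under MODEL c, they are held equal across the two known classes. To relax that invariance, the means could instead be specified per gender-by-profile combination, for example under labels such as %cg#1.c#1%, and the fit of the two versions compared.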

Gerine L posted on Wednesday, January 28, 2015  4:08 am



Thank you again for your help! We have a few additional questions. First, you write "if not, you stop there". What does that mean? If we don't have the same number of classes for boys and girls, do we just do the analyses separately? Second, is there a way to determine statistically whether the model estimated separately for boys and girls is better than the model for all participants combined? Third, does cg refer to the gender class?


Q1. Yes, because you then cannot compare the genders with respect to this latent variable, unless some of the classes have the same interpretation, but that leads to a complex analysis.
Q2. I would not recommend phrasing the question that way. A more palatable question is whether a model with no invariance across gender is better than a model with some or full invariance across gender; that can be answered statistically.
Q3. Yes.

Gerine L posted on Monday, February 02, 2015  4:52 am



An additional question: when examining plots, it looks like there might be a few outliers in the data that could influence which classes are formed. As we only have 2 variables, it might be that, for instance, Cook's distance is a more suitable way of examining outliers than leverage or studentized residuals. What do you think? What procedure do you recommend to identify and adjust (or remove) these outliers?


I would look at loglikelihoods and Cook's distance. I think a Cook's distance of less than 1 indicates a value is not an outlier. 

Gerine L posted on Monday, February 02, 2015  10:13 am



Thanks for your quick response. What do you recommend doing after identifying the outliers, i.e., removing them or following a different procedure?


I would explore if the results differ in important ways after removing the outliers. In large samples, there is often not much change. Then I would report that exploration in the paper. 
