anonymous posted on Monday, October 20, 2014 - 8:35 pm
In reading about the new BCH method in web note 21 (October 7, 2014), I'm wondering if there is a variant of the method demonstrated in 3.2 that predicts the distal outcome(s) from the latent classes while controlling for the effect of the covariates on the distal outcomes.
anonymous posted on Tuesday, October 21, 2014 - 11:33 am
Perhaps I'm misunderstanding, but when I run that syntax the output shows the distal outcome regressed on my covariates and the classes regressed on my covariates, but I don't see anything in the syntax or output about the classes predicting the distal outcome.
You get class-specific intercepts/thresholds for the distal outcome. That is the same information as in a regression of the distal on the latent class variable. You can test for differences in these parameters when using Model Constraint.
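A minimal sketch of what such a second-step input could look like, assuming a three-class model, a single covariate, and a slope held equal across classes (Web Note 21 section 3.2 also lets the slopes differ by class). The file name, variable names (y, x, bchw1-bchw3), and parameter labels are hypothetical; the actual names and their order come from the end of the step-1 output.

```mplus
! Hypothetical step-2 input for the manual BCH approach
DATA:
  FILE = step1_bch.dat;
VARIABLE:
  NAMES = u1-u4 y x bchw1-bchw3;
  USEVARIABLES = y x bchw1-bchw3;
  MISSING = *;
  CLASSES = c(3);
  TRAINING = bchw1-bchw3 (BCH);
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 0;
  ESTIMATOR = MLR;
MODEL:
  %OVERALL%
  c ON x;        ! classes regressed on the covariate
  y ON x;        ! distal outcome regressed on the covariate
  %c#1%
  [y] (a1);      ! class-specific intercept of y
  %c#2%
  [y] (a2);
  %c#3%
  [y] (a3);
MODEL CONSTRAINT:
  NEW(d12 d13 d23);
  d12 = a1 - a2; ! intercept differences, reported with SEs
  d13 = a1 - a3;
  d23 = a2 - a3;
```

Each new parameter is listed in the output with an estimate, standard error, and z-test, so the class effect on the distal outcome, net of the covariate, can be read off directly.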
Anne Chan posted on Friday, January 09, 2015 - 1:20 pm
I noticed that missing cases (on the distal outcome) are handled by listwise deletion in the 3-step approach. I am wondering how missing data (on the distal outcome) are handled by the new BCH method?
Dear Dr. Muthén, I'm trying to fit a mixture model using BCH, but when I include a second predictor (Educ) of the latent class variable (SC), Mplus asks me to use ALGORITHM=INTEGRATION, which is not available for BCH (according to an error message). Could you help me with that, please? Here's my input:
Analysis:
  Type is mixture;
  Starts=0;
  Estimator=MLR;
Model:
  %overall%
  SC on ISEIH Educ;
  ISEIR on Educ ISEIH;
  Educ on ISEIH;
Thank you for your answer, Dr. Muthén, but I tried taking out missing values and got the same error message (that I need to use the integration algorithm). Could it be that when I regress Educ on ISEIH, the Educ variable becomes a distal outcome of the latent class, and that is why I cannot also regress the latent class (SC) on Educ? If so, is there a way not to treat Educ as a distal outcome (I'm only interested in ISEIR as a distal outcome), so that I can regress it on ISEIH? And another question: I get an entropy of 1.140 in the second BCH step. Is that because of the use of the weights, or am I doing something wrong? Thanks
I have the same issue as Valentina with regard to entropy > 1 when I run a regression auxiliary/latent class model along the lines of section 3.2 in webnote 21v2.
The input code is:
---
Variable:
  Names are ofcan ofecs ofcok ofgas truant schyear w1 w2 w3 ;
  Usevariables are truant schyear w1 w2 w3 ;
  Classes are c (3) ;
  Training are w1 w2 w3 (bch) ;
Analysis:
  Type = Mixture ;
  Starts = 0 ;
Model:
  %overall%
  [schyear] ;
  schyear on truant ;
  %c#1%
  [schyear] ;
  schyear on truant ;
  %c#2%
  [schyear] ;
  schyear on truant ;
---
The model fits three latent classes to explain school pupils' exposure to four drug exposure variables. The latent classes, and whether the pupil ever truants from school, are then used to predict school year.
The model estimates look fine, but the classification quality output looks wrong (entropy >1, classification probabilities > 1).
I wonder, is this something to do with the class membership uncertainty being known quantities represented already by the bch weights?
I am trying to use BCH for an LCA with a continuous distal outcome and have no problem with the first step. When I run the final step, I get this error message: Invalid symbol in data file: "*" at record #: 1, field #: 23. When I open the file in either the Mplus editor or TextEdit, I can see that there are several *'s scattered throughout the file. Any idea why this is and how it can be fixed? Is it related to missing data? I have specified missing = all (99).
When you read data that you have saved, you need to read it according to the information at the end of the output where it was saved. The * is a missing value flag that requires
MISSING = *;
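As a sketch of how this fits into a final-step input, assuming a three-class model and a continuous distal outcome: the file name and variable names below are hypothetical, and the NAMES list must be copied, in order, from the saved-data information at the end of the first-step output.

```mplus
! Hypothetical final-step input reading the file saved in step 1
DATA:
  FILE = step1_bch.dat;
VARIABLE:
  NAMES = u1-u4 y bchw1-bchw3;
  USEVARIABLES = y bchw1-bchw3;
  MISSING = *;   ! the saved file flags missing values with *
  CLASSES = c(3);
  TRAINING = bchw1-bchw3 (BCH);
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 0;
MODEL:
  %OVERALL%
  [y];
  %c#1%
  [y];           ! class-specific mean of the distal outcome
  %c#2%
  [y];
  %c#3%
  [y];
```

The original MISSING = ALL (99) setting applies to the raw data file only; the saved file uses its own flag.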
Dina Dajani posted on Saturday, December 05, 2015 - 11:28 am
Hi Dr. Muthen,
When doing the BCH method manually, in the first step I specify SAVE=bchweights; but the saved file does not contain the BCH weights, only my indicators and auxiliary variables. Do you know what the problem might be?
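For comparison, a first-step input along the lines of Web Note 21 might look like the sketch below. The file names and variable names are hypothetical, and this shows only the general structure; it does not diagnose the problem described in this post.

```mplus
! Hypothetical step-1 input: estimate the LCA and save BCH weights
DATA:
  FILE = mydata.dat;
VARIABLE:
  NAMES = u1-u4 y x;
  USEVARIABLES = u1-u4;
  CATEGORICAL = u1-u4;
  CLASSES = c(3);
  AUXILIARY = y x;   ! carried along into the saved file
ANALYSIS:
  TYPE = MIXTURE;
SAVEDATA:
  FILE = step1_bch.dat;
  SAVE = BCHWEIGHTS;
```

The end of the output lists the saved variables and their order, which is the place to check whether the weight columns were written.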
They show the influence of the latent class variable on Y beyond the influence of the covariates. Think of the latent class variable as a dummy covariate in linear regression - it changes the intercepts.
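The dummy-covariate analogy can be written out as follows, assuming for illustration a single covariate X, K classes, and a slope common to all classes (the symbols are introduced here, not taken from the web note):

```latex
% Assumption: one covariate X, slope \beta equal across the K classes
\begin{align*}
Y &= \alpha_c + \beta X + \varepsilon
   \quad \text{for a member of class } c, \\
Y &= \alpha_1 + \sum_{c=2}^{K} (\alpha_c - \alpha_1)\, D_c
   + \beta X + \varepsilon ,
\end{align*}
```

where D_c equals 1 for members of class c and 0 otherwise. The two lines describe the same model, so the difference between two class-specific intercepts plays the role of a dummy-variable coefficient: the class effect on Y at any fixed value of X.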
Dina Dajani posted on Saturday, April 23, 2016 - 11:15 am
If I am not mistaken, the class specific intercepts represent the class-specific means of Y
Ryan Grimm posted on Friday, May 06, 2016 - 2:43 pm
I've used the BCH with 3 distal outcomes, one of which is continuous and 2 are binary. I also used DCAT & DCON with the same distals. There is a lot of missing data on each of the three distals, and this was reported in the output for DCAT & DCON. But, with the BCH, I received the following messages:
PROBLEMS OCCURRED DURING THE ESTIMATION FOR THE DISTAL OUTCOME KMTHIRT.
PROBLEMS OCCURRED DURING THE ESTIMATION FOR THE DISTAL OUTCOME LAMAJOR2.
The results said 999 for all of the estimates for these 2 distals. However, the 3rd distal, which was binary, ran fine. Any ideas as to why this may have occurred?
Just to confirm, does the BCH assume MAR for the distal outcomes?
I've used the manual BCH with several continuous distal outcomes but I am not sure how to do the same for categorical outcomes.
I want to examine the effect of latent class membership on a categorical outcome (DCF_MALT) after controlling for three variables which are ces_t1, T1_FINAN, and HISPANIC. To this end, I wrote the following input code:
Usevar are DCF_MALT ces_t1 T1_FINAN HISPANIC BCHW1-BCHW4;
Categorical = DCF_MALT;
Classes = c(4);
Training = BCHW1-BCHW4(bch);
Analysis:
  Type = Mixture;
  Starts=0;
  Estimator=mlr;
Model:
  %overall%
  DCF_MALT on ces_t1;
  DCF_MALT on T1_FINAN;
  DCF_MALT on HISPANIC;
  c on ces_t1;
  c on T1_FINAN;
  c on HISPANIC;
When I added the same regressions into the class specific part of the model, I received an error notification that those statements were ignored:
The following MODEL statements are ignored:
* Statements in Class 1:
  C#1 ON CES_T1
  C#1 ON T1_FINAN
  C#1 ON HISPANIC
  C#2 ON CES_T1
  C#2 ON T1_FINAN
  C#2 ON HISPANIC
  C#3 ON CES_T1
  C#3 ON T1_FINAN
  C#3 ON HISPANIC
I wonder if my input until the class specific part of the model is correct and how should I revise the input re: the class specific part of the model.
Thanks for your advice re: using DCAT for categorical variables. However, I have a set of control variables which I'd used for continuous distal outcomes and I'd like to be able to control for the same variables while examining the categorical outcomes. It is my impression that you can't add controls into the model while using DCAT. Am I mistaken?
I tried the manual BCH for categorical outcomes but the code I wrote did not quite work out.
I want to examine the effect of latent class membership on a categorical outcome (DCF_MALT) after controlling for three variables which are ces_t1, T1_FINAN, and HISPANIC.
I'm attaching here the input I used in the second step of manual BCH. I'll greatly appreciate if you could tell me which part of the input was incorrect or needs to be revised.
Usevar are DCF_MALT ces_t1 T1_FINAN HISPANIC BCHW1-BCHW4;
Categorical = DCF_MALT;
Classes = c(4);
Training = BCHW1-BCHW4(bch);
Analysis:
  Type = Mixture;
  Starts=0;
  Estimator=mlr;
Model:
  %overall%
  DCF_MALT on ces_t1;
  DCF_MALT on T1_FINAN;
  DCF_MALT on HISPANIC;
  c on ces_t1;
  c on T1_FINAN;
  c on HISPANIC;
  %c#1%
  DCF_MALT on ces_t1;
  DCF_MALT on T1_FINAN;
  DCF_MALT on HISPANIC;
  c on ces_t1;
  c on T1_FINAN;
  c on HISPANIC;
  %c#2%
  DCF_MALT on ces_t1;
  DCF_MALT on T1_FINAN;
  DCF_MALT on HISPANIC;
  c on ces_t1;
  c on T1_FINAN;
  c on HISPANIC;
  %c#3%
  DCF_MALT on ces_t1;
  DCF_MALT on T1_FINAN;
  DCF_MALT on HISPANIC;
  c on ces_t1;
  c on T1_FINAN;
  c on HISPANIC;
  %c#4%
  DCF_MALT on ces_t1;
  DCF_MALT on T1_FINAN;
  DCF_MALT on HISPANIC;
  c on ces_t1;
  c on T1_FINAN;
  c on HISPANIC;
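The ignored-statements warning quoted earlier points at the "c on" lines inside the class-specific parts. One possible revision, offered as a sketch rather than a confirmed solution, keeps the regression of c on the covariates in %overall% only and leaves the distal-outcome regressions and a labeled threshold in each class. It assumes DCF_MALT is binary (one threshold); the labels t1-t4 are hypothetical, and the Variable and Analysis commands stay as posted.

```mplus
! Sketch of a revised MODEL command; labels t1-t4 are hypothetical
MODEL:
  %OVERALL%
  c ON ces_t1 T1_FINAN HISPANIC;
  DCF_MALT ON ces_t1 T1_FINAN HISPANIC;
  %c#1%
  [DCF_MALT$1] (t1);   ! class-specific threshold
  DCF_MALT ON ces_t1 T1_FINAN HISPANIC;
  %c#2%
  [DCF_MALT$1] (t2);
  DCF_MALT ON ces_t1 T1_FINAN HISPANIC;
  %c#3%
  [DCF_MALT$1] (t3);
  DCF_MALT ON ces_t1 T1_FINAN HISPANIC;
  %c#4%
  [DCF_MALT$1] (t4);
  DCF_MALT ON ces_t1 T1_FINAN HISPANIC;
```

Differences between the labeled thresholds would then carry the class effect on the outcome after adjusting for the three control variables.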
Daniel Lee posted on Tuesday, November 01, 2016 - 4:33 pm
Hi Dr. Muthen,
I used the Model Test command (e.g., intercept from class 1 vs. intercept from class 2) to conduct a Wald chi-square test (taking into account several covariates). I would essentially constrain the intercept of an outcome (alcohol) in class 1 to equal the intercept of that outcome in class 2 using the Model Test command. For example:
Model:
  %OVERALL%
  C ON SEX HSYE;
  ALC MARIJ NICT ON SEX HSYE;
  %C#1%
  [ALC](m1);
  [MARIJ](m2);
  [NICT](m3);
  %C#2%
  [ALC](m4);
  [MARIJ](m5);
  [NICT](m6);
  %C#3%
  [ALC](m7);
  [MARIJ](m8);
  [NICT](m9);
Output:
  SAMPSTAT TECH4 TECH7;
Model Test:
  m2 = m8;
Is this the Model Constraint method you were talking about to examine whether intercepts in classes are significantly different, when using BCH method? I tried following the Mplus manual on the Model Constraint command, and I had trouble understanding how I would actually implement this for the BCH method. But the Model Test command appears to constrain parameters to test whether the difference is significant (although this method may not be suitable).
If I am on the right track, please let me know. If I should change my analysis, and actually use the model constraint command, I would appreciate any guidance (e.g., video tutorial, article).
Please send output and saved file to Support along with your license number.
S Elaine posted on Wednesday, January 04, 2017 - 12:01 pm
I'm following web note 21 v. 2 (example 3.2) to use the BCH method to run a regression aux model combined with latent class regression. I've followed steps 1 and 2 and not had any issues; now, I'd like to test whether there are sig differences between the classes on the DV (ALCHPRB). I've found several examples of how to do this online, all seem to be different. Could you let me know if this is the proper way to use the model constraint option? If so, I continuously get the following message and don't know how to resolve this: WALD'S TEST COULD NOT BE COMPUTED BECAUSE OF A SINGULAR COVARIANCE MATRIX.
Model:
  %Overall%
  C on Age Gender minority Chldses PAP PRALCHL BSI_GSI;
  Alchlprb on Age Gender minority Chldses PAP PRALCHL BSI_GSI;
  %c#1%
  Alchlprb on Age Gender minority Chldses PAP PRALCHL BSI_GSI (p1);
  %c#2%
  Alchlprb on Age Gender minority Chldses PAP PRALCHL BSI_GSI (p2);
  %c#3%
  Alchlprb on Age Gender minority Chldses PAP PRALCHL BSI_GSI (p3);
  %c#4%
  Alchlprb on Age Gender minority Chldses PAP PRALCHL BSI_GSI (p4);
S Elaine posted on Thursday, January 05, 2017 - 8:43 am
Thank you for your response. This produced the following under model fit information:
Wald Test of Parameter Constraints
  Value 1.123
  Degrees of Freedom 3
  P-Value 0.7715
I see many studies reporting chi-square and/or Wald statistics for each class comparison on the DV (unique statistics when comparing class 4 vs. 1, 4 vs. 2, 4 vs. 3, 3 vs. 2, etc.). However, I've only been able to produce one value, and cannot find guidelines for how this information is obtained using the latest BCH recommendations. I saw your recommendation of 10/21/2014 here re: class-specific intercepts: http://www.statmodel.com/discussion/messages/13/20479.html?1465244548
I'm trying to locate recent papers that use the recommendations from web note 21 version 2 (3.2). I'm interested in reviewing examples that report findings using this approach, but have not been successful in locating any in the social sciences... any guidance, references, or suggestions would be appreciated!
For comparing e.g. 4 vs 1, you need to do several runs, each with only that comparison.
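A sketch of one such run, assuming four classes, a generic distal outcome y, and a covariate x (all names and the labels i1-i4 are hypothetical; the Data, Variable, and Analysis commands are the usual step-2 ones):

```mplus
! One run per pairwise comparison; this run tests class 4 vs. class 1
MODEL:
  %OVERALL%
  c ON x;
  y ON x;
  %c#1%
  [y] (i1);
  %c#2%
  [y] (i2);
  %c#3%
  [y] (i3);
  %c#4%
  [y] (i4);
MODEL TEST:
  0 = i4 - i1;   ! single restriction, so the Wald test has 1 df
```

For the next comparison, only the Model Test line changes (for example, 0 = i4 - i2;). Listing several restrictions in one Model Test produces a single joint Wald test, which is how a 3-df statistic arises.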
Regarding your last question, if you don't find it among our latent class papers under Papers, you may want to ask on SEMNET.
Jamie Taxer posted on Tuesday, January 24, 2017 - 4:31 pm
I am trying to use the manual BCH approach to examine if an auxiliary variable is a significant predictor of latent class membership. I used the syntax given in Web Note 21 section 3.2, step 2 and as suggested removed the y's. However, when I just include an auxiliary variable that is predicting latent class membership, I keep getting the message "One or more variables in the data set have no non-missing values. Check your data and format statement." According to the warnings there are 249 cases of missing on all variables other than x-variables. This is however, not the case. There should only be around 20 cases of missing values. I have tried the syntax below with two different data sets and get the same error message. Can you please tell me where I am going wrong?
Data:
  File = manBCHStep1_HS.dat;
Variable:
  Names = ZENTH3 ZENTH5 ZENTH6 ZENTH7 ZJOY1 ZJOY4 ZJOY5 ZJOY3
    SE ANG AX EMOEX MH PH JOBSAT MOT ACH DIS SOCDES B5EX
    BCHW1 BCHW2 BCHW3 BCHW4;
  Missing are *;
  Usevariables = socdes b5EX BCHW1-BCHW4;
  Classes = c (4);
  Training = BCHW1-BCHW4(bch);
Analysis:
  type = mixture;
  Starts = 0;
Model:
  %overall%
  c on b5EX socdes;
We know of no such problem - send your relevant files to Support along with your license number.
Daniel Lee posted on Monday, May 22, 2017 - 9:50 am
Hi Dr. Muthen,
I have submitted a paper using the BCH method, and the reviewer recommended that I use expectation-maximization (via SPSS) first and then re-run the BCH analysis. A large proportion of the sample (almost half) was dropped because of missing data on exogenous variables. Would expectation-maximization be a plausible solution for handling missing data when using the BCH method?
Sounds like the reviewer is suggesting a way of dealing with missing data. But it doesn't make sense to me, because the usual EM algorithm only produces a mean vector and covariance matrix, which is not sufficient information for mixture modeling. In addition, the wrong (non-mixture) model is used.
Missing on exogenous variables not in the first step is tricky to handle. Multiple imputation can be used but limits what can be done. Changing from 3-step to 1-step where the exogenous variables are included in the model is computationally demanding.
Perhaps the best that can be done is to check if the subsample with no missing is different in important ways and if not rely on the results from the subsample.
Daniel Lee posted on Wednesday, May 24, 2017 - 6:47 am
That is very helpful! As follow-up, in addition to mean vector and covariance matrix, what other information is used in mixture model that EM imputation does not produce?
Daniel Lee posted on Wednesday, May 24, 2017 - 12:43 pm
I apologize for asking a second string of questions in a separate post:
1) If I include a predictor (e.g., gender) as a within-class predictor in my BCH model (2 trajectory classes), and sex differences in Y were observed in one trajectory class (e.g., Class 1; with males higher than females on Y), but not in the other class (e.g., Class 2; males and females not significantly different on Y), would I be able to compare mean differences in the following way:
- Males in Class 1 vs. Members of Class 2
- Females in Class 1 vs. Members of Class 2
2) Although I'm not looking at an interaction effect specifically, as one typically would in a regression model, would I be able to say that sex modified the effect of the latent class variable on Y IF males in Class 1 had the highest value of Y compared to females in their own class, and males/females in Class 2?
Q1: By regular linear regression perhaps you mean regressing the outcome on a set of dummy variables representing most likely class membership. If so, you would be ignoring measurement error in the classification.
Q2: Not automatically from Mplus, but you can express it yourself.
Daniel Lee posted on Thursday, June 01, 2017 - 8:19 am
Hi Dr. Muthen, is there a way to compare a BCH model (last step) with gender included as a between class predictor vs. the same model except, this time, gender is a between & within-group predictor? The model has a distal outcome & there are three classes.
Daniel Lee posted on Thursday, June 01, 2017 - 8:21 am
Sorry, as a point of clarification, by compare I mean model fit comparison (e.g., BIC).
I am using the BCH method for my paper. What is the full name of the BCH method? I searched Asparouhov & Muthen (2014) webnote, but did not find it.
Wendy Rote posted on Monday, June 19, 2017 - 3:15 pm
Hi, I'd like to use the manual BCH method to estimate profile differences (from an LPA on Wave 1 variables) in Wave 2 outcomes, controlling for levels of those same outcomes at Wave 1. I've done this in the step 2 syntax by regressing each W2 outcome variable on its W1 equivalent, requesting the means of the W2 variables in each profile, and using Model Constraint to compare them. This seems to run fine, but I'm not sure whether I should model the autoregressive paths for each profile separately (as is done in webnote 21 when controlling for demographic predictors in an LPA modeling mean class differences) or just for the model overall (as I assume would more closely replicate a classic autoregressive path model with a categorical variable predicting autoregressive change over time). The results differ quite a lot depending on the choice. Any feedback would be really helpful. Thank you!
I would not be regressing indicator on indicator, rather regress latent class on latent class (otherwise you change the measurement model), i.e., if it gets too confusing, stick to the standard latent transition analysis.
Using BCH with 2 latent class variables is not very easy. I would use one at a time and possibly run a joint analysis (single step standard ML) with all measurement model parameters fixed to the BCH estimates.
I'm using the BCH method to assess potential differences in distal outcomes across four latent classes per syntax in section 3.2 of Web Note 21. I have a set of covariates (12+) that influence my distal Ys and the latent class variable. I'm using class-specific models for the distal Ys.
I'm wondering if there's a way to get the fitted means for my distal outcomes at the centroid? When I include my model for C on my included covariates, I get the class-specific intercepts/thresholds for my distal outcomes but am unable to label the covariate means that I would then use in Model constraint to get the class-specific centroid means for my distal outcomes. Mplus assumes that I’m trying to bring the covariates into the model and gives me the error message that I need to use ALGORITHM = INTEGRATION (FIML estimation which cannot be used with BCH weights).
Is there another way to work around this so that I can get the necessary model parameters and use Model constraint (or some other procedure?) to compute and test for differences for the centroid means by class for my distal outcomes? Thank you for your assistance!
You can instead use the actual sample means. That won't give you the exactly right SEs but they will be close.
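A sketch of that workaround, assuming two classes and two covariates for brevity: the covariate sample means are typed into Model Constraint as constants. The values 0.50 and 2.30, the variable names, and the labels are all hypothetical; the Data, Variable, and Analysis commands are the usual step-2 ones.

```mplus
! Class-specific fitted means of y at the covariate sample means
MODEL:
  %OVERALL%
  c ON x1 x2;
  y ON x1 x2;
  %c#1%
  [y] (a1);
  y ON x1 (b11);
  y ON x2 (b12);
  %c#2%
  [y] (a2);
  y ON x1 (b21);
  y ON x2 (b22);
MODEL CONSTRAINT:
  NEW(mu1 mu2 d12);
  ! 0.50 and 2.30 stand in for the sample means of x1 and x2
  mu1 = a1 + b11*0.50 + b12*2.30;
  mu2 = a2 + b21*0.50 + b22*2.30;
  d12 = mu1 - mu2;   ! difference in fitted means at the centroid
```

Because the means enter as fixed numbers, their sampling variability is not reflected in the standard errors of mu1, mu2, and d12, which is why the SEs are close but not exactly right.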
Jin Qu posted on Sunday, September 03, 2017 - 7:39 am
I am using the BCH method to include covariates for my latent profile analysis. Can I include more than one covariate at a time? For example, the outcome is "ZAGGR", while controlling for "crace" and "educati".
BCH is for a distal outcome, not covariates. If you want both, use the method described in Section 3.2 of our BCH paper.
RuoShui posted on Wednesday, September 06, 2017 - 5:32 pm
Dear Drs. Muthen,
I am using the manual BCH method to estimate how the latent classes are different on the outcome variable, controlling a range of covariates. From the above postings, I understand that when covariates are included, the intercepts of the outcome were estimated across latent classes. But I am still unsure how to compare the means across classes. In your response to "Sharon" above, you said, that actual sample means can be used. Can you please explain a little more what do you mean by that and how to do this within the BCH manual approach?
The model you are describing is focused on the conditional expectation E(Y|C,X) and not E(Y|C). Realize that it is somewhat of a moot question to ask if E(Y|C) is class invariant if E(Y|C,X) is not. It can happen for some strange/particular distribution of [X|C]; however, if E(Y|C,X) is not class-invariant, one generally should conclude that E(Y|C) is not class-invariant. If your question is regarding E(Y|C), you should remove the covariates from the BCH estimation and get a direct answer to that.
You can also write out E(Y|C) in Model Test as a function of the mean of X and the regression coefficient (you would need to include the distribution of X in the model), thereby directly testing the quantities of TECH4.
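The relation between the two expectations can be written out as follows, assuming for illustration a single covariate X with a class-specific linear regression (the symbols alpha_c and beta_c are introduced here):

```latex
% Assumption: linear regression of Y on one covariate X within class c
\begin{align*}
E(Y \mid C = c, X) &= \alpha_c + \beta_c X, \\
E(Y \mid C = c)    &= \alpha_c + \beta_c \, E(X \mid C = c).
\end{align*}
```

For two classes, E(Y|C) is class-invariant only if the intercept, slope, and covariate mean of class 1 combine to the same value as those of class 2, that is, alpha_1 + beta_1 E(X|C=1) = alpha_2 + beta_2 E(X|C=2). When the intercepts or slopes differ, this equality holds only for particular covariate means, which is the unusual case mentioned above.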
I'm using the manual BCH method to assess potential differences in a continuous distal outcome Y across four latent classes. I have a set of covariates that I would like to adjust for by regressing Y on the covariates under the %overall% command. However, I receive an error message stating that the sample variance of one of the covariates is negative in class 1 and that the standard errors of the model parameters cannot be computed. Is there a way to prevent this error message without making this covariate endogenous and fixing its variance to equality across latent classes?
Thank you for your previous response. I am now considering different approaches to address missing data when using the manual BCH method. My continuous distal outcome is the intercept from a latent growth curve model for alcohol use, therefore I am using FIML estimation to include those with at least 1 of 4 repeated alcohol measures. I would also like to use inverse probability weighting to address potential bias from missing data on the latent class exposure and covariates. However, when I include the inverse probability weight in the model, the latent class distribution changes from the unweighted model (including the final class counts based on their most likely latent class pattern). Is this because inverse probability weighting should not be combined with using the BCH weights?
From your description, it looks to me that the change in the latent class distribution has nothing to do with BCH. In principle you can use sampling weights with BCH, but using sampling weights to adjust for missing data in the covariates is not something we would recommend. I think we generally would recommend that you convert the variables with missing values to endogenous variables or impute the missing data from an imputation model.
It is not a proper technique to deal with missing data. Using multivariate full information modeling or multiple imputation are the two most well established missing data methods. Sampling weights should be used when the observations were not sampled at random, and they should not be used for other purposes. Using post-stratification is a common method for calibrating the sample covariate distribution to population totals. That is not however a way to deal with missing values. It is also irrelevant if you are estimating a model [Y|X] rather than [Y,X].
Jin Qu posted on Tuesday, February 20, 2018 - 1:50 pm
I am using the BCH method described in Section 3.2 in Webnote 21 to predict distal outcomes (internalizing) from latent profiles while controlling for covariates (racedi). Then, I calculated means based on the intercepts, slopes, predictor means that I obtained from the output. I wonder if I could obtain standard deviation for the means of the distal outcomes? See below for my model. Thanks!
model:
  %overall%
  ZINTER4 on RaceDi;
  c on RaceDi;
  %C#1%
  ZINTER4 on RaceDi;
  [ZINTER4] (m1);
  %C#2%
  ZINTER4 on RaceDi;
  [ZINTER4] (m2);
  %C#3%
  ZINTER4 on RaceDi;
  [ZINTER4] (m3);
You want to look at the difference between the class 1 and class 2 intercepts to see if class influences the DV controlling for the IVs. You do that using the Model Constraint command with parameter labels defined in the Model command, e.g.
Model Constraint: New(diff); diff = inter2 - inter1;
where inter1 and inter2 are intercept parameter labels.
I am using the bch approach as outlined in webnote 21. I identified 3 profiles (n's = 126, 545, and 124). I entered the covariates and DVs and saved the bch weights. When I ran analyses looking at the effect of the covariates on the DV, the profile n's shifted (n's = 59, 670, and 40). I thought the class n's were not supposed to shift using the bch approach. Can you please clarify?
I'm having the same issue as Jill above. I am trying to use the automatic BCH method with LPA. I keep getting the following error message:
*** ERROR in ANALYSIS command TYPE=MIXTURE is not available for multiple group analysis.
Here is the input:
variable:
  names=id common serious gender eth major femin genid value identity;
  usevariables = common serious;
  idvariable = id;
  classes=c(3);
  auxiliary = value(bch);
  missing = all(99);
data:
  file=a1.dat;
analysis:
  type=mixture;
I'm using the manual BCH method to assess potential differences in a continuous distal outcome Y across three latent classes (my indicators are four 3-level ordinal variables). I would like to incorporate residual association between 2-3 indicators using PARAMETERIZATION=RESCOVARIANCES command. I have a set of covariates that influence Y and/or latent class variables. However, I receive an error message stating SAVE=BCHWEIGHTS is not available with ALGORITHM=INTEGRATION.
Is there a way to incorporate the residual association among the latent class indicators in the 2-step BCH model?
Thank you very much for the very helpful answer. One additional question: if I would like to report the standardized mean difference, is it correct to take the following steps?
1. Convert the SE of the class mean into a SD by SE*(square root(n))
2. Calculate the SMD by dividing the raw mean difference between two classes by the pooled SD of the two classes
3. Get the SE of the SMD by SMD/z-value (square root of chi-square)
4. Compute the 95% CI as above
For that purpose I would recommend using the manual BCH approach. You can form the standardized mean differences in Model Constraint to obtain the proper SEs and confidence intervals. See Section 3 in http://statmodel.com/examples/webnotes/webnote21.pdf to see how the manual BCH is conducted. The use of Model Constraint is illustrated in User's Guide example 3.10. You provide labels for the model parameters and then use them to form whatever difference you want.
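A sketch of how the labels and Model Constraint could be combined for a standardized mean difference, assuming two classes, class-specific variances, and a pooled SD defined as the square root of the average of the two class variances (that equal-weight pooling is an assumption made here, not something prescribed above). The names y, m1, m2, v1, v2, and smd12 are hypothetical, and the Data, Variable, and Analysis commands are the usual step-2 ones.

```mplus
! Standardized mean difference between classes 1 and 2
MODEL:
  %OVERALL%
  y;
  %c#1%
  [y] (m1);      ! class-specific mean
  y (v1);        ! class-specific variance
  %c#2%
  [y] (m2);
  y (v2);
MODEL CONSTRAINT:
  NEW(smd12);
  smd12 = (m1 - m2) / SQRT((v1 + v2)/2);
```

The output then reports smd12 with its own standard error, and adding CINTERVAL to the Output command gives the confidence interval, so the hand conversion from SE to SD is not needed.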