Two questions:
1) With only two treatment conditions (i.e., TX vs. control), is it possible to have three levels of compliance (e.g., compliance, partial compliance, and non-compliance), so that the training variables would look like the following?
c1 c2 c3
1 1 1 (control group)
1 0 0 (compliance)
0 1 0 (partial compliance)
0 0 1 (non-compliance)
Accordingly, the latent class variable would be specified with three classes in the mixture intervention model.
2) Is it possible to run a CACE model with three treatment conditions (e.g., tx1, tx2, and control) and two levels of compliance (i.e., compliance vs. non-compliance)? It looks like we could include two TX dummy variables in the model, but how should we define the training variables and interpret the latent class variable?
Thank you very much for your help!
Booil Jo posted on Wednesday, January 23, 2002 - 2:27 pm
1) It is possible to arrange the training variables as you did to reflect three levels of compliance. However, this model is not identified, even if you impose the exclusion restriction on noncompliers. In this case, identifiability has to rest on structural assumptions beyond the exclusion restriction.
2) This is another underdeveloped area in CACE modeling. I would compare tx1 vs. control and tx2 vs. control using regular 2-class CACE models. However, it is not clear how to interpret the results when you compare two active treatments using CACE estimation, as you already pointed out (unless double-blinding is possible).
JeremyMiles posted on Wednesday, March 24, 2004 - 3:59 am
I wondered if anyone reading knew if this was an appropriate problem for CACE modelling. We have a problem related to, but not exactly the same as, non-compliance - differential non-recruitment.
We carried out a cluster randomised study of a new form of therapy versus standard care. The clusters were clinics, and there were 13 clinics allocated to standard, and 13 allocated to the new form of care.
The problem that we encountered was that the clinics allocated to the new form of care - intervention - thought this was very exciting, and recruited a large number of patients (~800), whereas the usual care - control - didn't try so hard, and recruited only ~400.
The control group differs - probably in initial severity, and maybe on other characteristics. We have a wide range
Is this an example of somewhere we could use a CACE model? We have two classes in the intervention arm, and one class in the control arm. It's sort of related to ITT issues, but only sort of.
As I understand it, once you have lost your randomization, the method would no longer apply.
Anonymous posted on Wednesday, November 10, 2004 - 6:42 am
Dear Bengt and Linda
I have some questions regarding CACE models in MPlus 3.11. First, I wonder whether my data are suited to this model. I have a study where respondents to a survey are randomly allocated to either treatment or control. In the treatment condition they are shown a film about genomic science, the control receive no information. I want to look at the effect of watching the film on subsequent attitude questions. However, approximately 20% of the treatment group report not having understood the film. I would like to treat this group as non-compliers and estimate the complier-average causal effect of viewing the film. Would this seem appropriate to you?
Second, I am not sure how to interpret the MPlus output when fitting the CACE model in example 7.24. Can you point me somewhere that I might be able to find some pointers on what I should be looking for? In particular, which parameter denotes the causal effect? Is it the regression of Y on x2 in latent class 2? I have sent my output separately to MPlus Product support. Thank you,
bmuthen posted on Sunday, November 14, 2004 - 12:04 pm
Yes on the question in the first paragraph.
Example 7.24 uses the x2 variable to represent the treatment-control dummy in line with ex 7.23. So, yes, the regression coefficient for y on x2 is the causal effect.
Anonymous posted on Monday, November 15, 2004 - 6:25 am
Thanks for your reply. I have a few further questions about this:
1. The output suggests to me that latent class 2 in my analysis contains the non-compliers (because of the relative size of the classes). However, the regression of Y on X2 is fixed to zero in class 1. Should class 1 be the non-compliers in the setup for this model?
2. what is the role of the x1 variable? Should I be including variables here that are predictive of complier/non-complier status? Can I include more than one variable for X1?
3. How do I deal with differential nonresponse in the CACE model? Can I simply specify a weight variable in the usual way?
bmuthen posted on Monday, November 15, 2004 - 7:17 am
1. There should be no doubt from the data about which class is the non-complier class - it is the class with no compliance for the treatment group (so an observable matter). The standard CACE model assumes no treatment effect for the non-compliance class since they do not receive treatment (see, however, modifications made in Booil Jo papers on our web site).
2. x1 is an example of a variable that strengthens the analysis much like covariates with ancova in randomized studies. You can have many such variables pointing to either the latent class variable or the outcome or both. See the Little & Yau article in Psych Methods on our web site.
3. By differential nonresponse, do you mean different across the 2 classes? If so, this is a topic studied by Frangakis and Rubin within the area of non-ignorable missingness - it can be handled by Mplus using the latent classes to predict missingness. But maybe I am misunderstanding your question.
We have a multiyear evaluation trial of a school-based program using student reports of outcomes, students nested within classrooms. In our initial CACE analysis we dichotomized the implementation measure to create complier and non-complier categories. Results were good, but our child-report covariates didn't predict compliance well. We have teacher-report covariates that might predict compliance better. Can Mplus do a CACE model that includes teacher-report covariates to predict compliance? Because students are nested within classroom, all students who had the same teacher have the same values for the teacher-report covariates. Is a multilevel CACE analysis needed here to correctly use the teacher-report covariates?
Is compliance a between level (classroom) variable with students being the within level? If so, the CACE latent class variable is a between variable. This feature is not in the current Mplus Version 4.1, but will be in the next version 4.2 which is due out in a couple of weeks. If compliance varies across students within classrooms, this is related to work by Booil Jo and you may want to contact her about it.
I am currently evaluating a multiyear trial of an elementary school primary prevention program. In the program, we have varying levels of exposure to the program (an unintended side effect of varying implementation of teachers).
I am interested in the CACE approach to model the effects of the program. Additionally, I’ve run propensity score analyses, with favorable results.
My question centers on locating a reference that may provide a substantive discussion on the key similarities and differences of CACE and propensity score methods.
Michael, one of the main differences between the CACE and propensity score methods is the underlying assumptions, so you will want to think about which is more reasonable for your setting.
In particular, the CACE models use the fact that the original treatment assignment was randomized (I'm not sure if it was in your example), and then make exclusion restrictions and the monotonicity assumption to estimate impacts. So they rely on having some "instrument" (the thing that was randomized) that affects the exposure level that someone gets.
Propensity score methods don't assume anything was randomized, but instead rely on an assumption of unconfounded treatment assignment: they assume that there are no hidden biases between the exposed and unexposed groups. In your case, this would imply that there are no unobserved differences between the high exposure and low exposure groups, i.e., that all differences are captured by your observed covariates. So they assume that only observed variables affect the exposure level that someone gets.
I hope this helps. In another post I will list some references that you could look at.
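The way the randomization, exclusion restriction, and monotonicity assumptions combine can be written out in a few lines. This is only a sketch in notation that does not appear in the thread: Z is the randomized assignment, D is the treatment actually received, and \pi_c is the proportion of compliers. It assumes one-sided noncompliance (controls cannot obtain the treatment), so monotonicity holds automatically.

```latex
\begin{aligned}
\mathrm{ITT} &= E[Y \mid Z=1] - E[Y \mid Z=0] \\
             &= \pi_c \,\mathrm{CACE} + (1-\pi_c)\cdot 0
                \quad \text{(exclusion restriction: no effect of assignment on noncompliers)} \\
\Rightarrow\quad \mathrm{CACE} &= \frac{\mathrm{ITT}}{\pi_c},
\qquad \pi_c = P(D=1 \mid Z=1).
\end{aligned}
```

Randomization is what allows the two arms to be compared in the first line and what makes the complier proportion the same in both arms.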
This is a followup to my previous post, with references for propensity score and CACE assumptions and comparisons.
For the CACE assumptions I like the original Angrist, Imbens, and Rubin paper: Angrist, J.D., Imbens, G.W., and Rubin, D.B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91, 444-455.
For propensity score analyses I like this paper by Rubin: Rubin, D.B. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2, 169-188.
And there are two references I know of that compare instrumental variables (CACE) models and propensity score models:
Posner, M.A., Ash, A.S., Freund, K.M., Moskowitz, M.A., and Shwartz, M. (2001). "Comparing standard regression, propensity score matching, and instrumental variables methods for determining the influence of mammography on stage of diagnosis." Health Services and Outcomes Research Methodology, 2, 279-290.
Landrum, M.B. and Ayanian, J.Z. (2001). "Causal effect of ambulatory specialty care on mortality following myocardial infarction: A comparison of propensity score and instrumental variable analyses." Health Services and Outcomes Research Methodology, 2, 221-245.
I have data from an experimental study in which I have both "noncompliers" in the experimental condition and "always-takers" in the control condition. I have run some CACE models using Booil's syntax for the two class CACE model (i.e., compliers and noncompliers) but extending the logic to a three class model (i.e., compliers, noncompliers, and always-takers). The results seem to make sense and match up reasonably closely with the estimate of the treatment effect I get with an instrumental variable approach or running an ANCOVA model that controls for covariates related to compliance. The standard errors for the treatment effect are somewhat larger using the CACE model compared to the ANCOVA model, which seems right. (I have not figured out how to get the standard errors or include covariates using the instrumental variable approach.)
I am wondering whether I am in uncharted waters with the three class CACE model. I looked through Booil's publications and have not found a similar example.
It is possible to do 3-class CACE modeling in Mplus. The model will be identified by imposing the exclusion restriction on both never-takers and always-takers. Monotonicity is also necessary. Under these conditions, Mplus should provide estimates close to those from the IV approach. However, since we are dealing with a parametric model, substantial deviation from normality may lead to erroneous solutions. Including covariates is not only possible but also helps prevent this from happening. If one or both of the exclusion restrictions are relaxed, identification of the model will depend more on covariate information and normality, and therefore more caution is needed. Cross-validating the results by using both parametric and semi- or non-parametric approaches might be a good idea in this case. I have not tried many examples of 3-class CACE modeling, and have not seen many published examples using parametric approaches (and none using Mplus). However, there are many published examples of multi-class CACE modeling using the Bayesian approach.
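For cross-checking against the IV approach mentioned above, here is a minimal Python sketch of the Wald (IV) estimator when there are always-takers in the control arm. The function names, the toy data, and the bootstrap standard error are illustrative assumptions, not something taken from the thread or from Mplus; the bootstrap is just one common way to get a standard error for this ratio.

```python
import random
from statistics import mean, stdev

def wald_cace(z, d, y):
    """Wald (IV) estimate: ITT effect on the outcome / ITT effect on uptake."""
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    # With always-takers, uptake in the control arm is nonzero, so the
    # complier share is the difference in uptake between the two arms.
    complier_share = mean(d1) - mean(d0)
    if complier_share <= 0:
        raise ValueError("uptake must be higher in the treatment arm")
    return (mean(y1) - mean(y0)) / complier_share

def bootstrap_se(z, d, y, reps=500, seed=0):
    """Nonparametric bootstrap SE (an assumption of this sketch)."""
    rng = random.Random(seed)
    n = len(z)
    estimates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(wald_cace([z[i] for i in idx],
                                       [d[i] for i in idx],
                                       [y[i] for i in idx]))
        except ValueError:
            # Degenerate resample (empty arm or no uptake difference): skip.
            continue
    return stdev(estimates)

# Made-up toy data: z = assignment, d = treatment received, y = outcome.
# Each arm holds 1 always-taker, 2 compliers, and 1 never-taker.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [5, 5, 5, 1, 5, 1, 1, 1]
print(wald_cace(z, d, y))  # 4.0
```

With real data the bootstrap would be run on a far larger sample than this toy example; covariates are not handled in this sketch.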
Anonymous posted on Wednesday, September 16, 2009 - 3:04 pm
Hello, I have a few questions regarding a CACE model I am attempting to run.
1) What is the difference between these two error messages?
*** WARNING Data set contains cases with missing on all variables. These cases were not included in the analysis. Number of cases with missing on all variables: 2
*** WARNING Data set contains cases with missing on x-variables. These cases were not included in the analysis. Number of cases with missing on x-variables: 84
2) How can I maximize the variables used without affecting the entropy?
The first message is about missing on all analysis variables. The second message is about observations with missing on one or more observed exogenous variables.
I don't understand your second question.
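The distinction between the two messages can be illustrated with a small Python sketch. The function and variable names are hypothetical, and this only mirrors the description given above; it is not how Mplus is implemented.

```python
def classify_case(case, x_names, missing=None):
    """Classify one case (a dict of analysis variables) by missingness.

    x_names lists the observed exogenous variables (covariates).
    """
    if all(value is missing for value in case.values()):
        return "missing on all variables"
    if any(case[name] is missing for name in x_names):
        return "missing on x-variables"
    return "kept"

# Hypothetical analysis variables: outcome y and covariates x1, x2.
cases = [
    {"y": None, "x1": None, "x2": None},  # nothing observed
    {"y": 2.5, "x1": None, "x2": 1},      # covariate x1 missing
    {"y": None, "x1": 0.3, "x2": 0},      # only the outcome missing
]
labels = [classify_case(c, x_names=["x1", "x2"]) for c in cases]
```

In this sketch the third case is kept, since it is missing only on the outcome and not on an exogenous variable.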
Anonymous posted on Wednesday, September 16, 2009 - 4:05 pm
Thanks for the answer and I apologize for being vague in my second question. Regarding the number of cases with missing observations on the exogenous variables, I have found on other posts using other types of analyses that mentioning the variances works to include cases with missing data on the covariates. Is this also true for CACE models (perhaps by saying x2; again in the %overall% model)? If so, will this change the entropy of the model (the model continues to run well without these cases)? I hope this is more clear, and thank you in advance.
Assuming that parametric estimation methods (e.g., 2-class mixture in Mplus) are used, around 100 or more subjects in each compliance class will result in good CACE estimates and standard errors. Around 50 subjects in each compliance class will still yield reasonably good estimates. However, if parametric assumptions and/or model identifying assumptions (e.g., exclusion restriction) are not met, the quality of estimates will deteriorate. Covariates may play important roles here. By including good covariates (i.e., predictors of compliance type) in the estimation model, one may obtain reasonably good CACE estimates with smaller samples. These covariates also tend to reduce sensitivity of CACE estimates to deviations from underlying model assumptions.
I have data from a randomized trial and am interested in the noncompliance Example 7.24 in the Mplus manual. In particular, can this analysis be done with missing data on the outcome variable? I do have baseline covariates that predict missingness, the outcome variable and compliance, and am planning to use these to identify the CACE. I have read the recent Jo, Ginexi & Ialongo (in press) paper posted at your web site which nicely describes missing outcomes and CACE in Mplus. But they identify the CACE by imposing restrictions on the relationship between outcome missingness and noncompliance. That would be useful as sensitivity analyses, but for the primary analysis I want to take advantage of the informative covariates, which were carefully chosen. Is it possible to use example Mplus 7.24 with multiple imputation? If so, which Mplus multiple imputation method would be best?
It is an interesting question what is best here. You could do imputation where you specify that tx is binary. But note that regular imputation does not acknowledge that you have a mixture of compliers and non-compliers. So the imputations would seem biased to some degree. Regular ML mixture modeling would seem more straightforward, with later follow-up using the latent ignorability NMAR approach. Note that ML under MAR estimates [y | x] from those with complete data on y. The subjects with data on only x would contribute only to the marginal [x]. You have the parameters you want already in [y | x]. I am saying that because there doesn't seem to be a reason to bring the x into the model in ML in this case. If your covariates that predict missingness are not part of your model, you could take the missing data correlates approach of UG ex 11.1.
Thank you for the suggestion to use the AUXILIARY command. I am interested in applying the GMM described in the 2003 Jo & Muthen chapter in the Reise & Duan book, and see that the AUXILIARY command provides a convenient way to deal with missing outcomes, although covariates that predict compliance and the outcome are still needed in the model proper to identify it. Is it possible to use the AUXILIARY command in a three-level model in Mplus, as in a longitudinal cluster-randomized trial (e.g., level 1 = time points, level 2 = person, level 3 = cluster)?
We will put it on our list for future developments. In the meanwhile note that this type of modeling can be set up by the user. The principle for the single-level approach is described in the web movie Missing Data Correlates using ML which you find at
The "Missing Data Correlates using ML" video was clear. I see how to add in the WITH statements rather than using the AUXILIARY command in that kind of model. I am interested in using this method in the multilevel growth mixture model format of UG Example 10.9. That would represent a couple of extensions from the video example.
First, consider the case of UG Example 10.9 with some missing data in the outcomes, y1-y4. If there were three Time-1 covariates that were not measures of y (as they are in the video) and that are associated with missingness in y1-y4, could they be used as auxiliary variables in the same manner as z1-z5 in the video? That is, would this be done by adding to the %WITHIN% section of the Example 10.9 model the 3 correlations among the 3 auxiliary variables, and the 12 correlations between the 3 auxiliaries and the 4 outcome observations? If this is incorrect, could you give me a hint on how to specify the auxiliary effects in TYPE = TWOLEVEL MIXTURE?
Second, would the same approach extend to data missing for whole clusters of individuals (such as schools) in some waves of the study? That is, assume there are 3 cluster-level covariates that are associated with whether whole clusters (data on students in a school) are missing from some waves of data. I can see how this might be done in the %BETWEEN% section of Example 10.9, analogous to the %WITHIN% section as described above.
Though the AUXILIARY approach to dealing with missingness in the outcomes (y1-y4) of multilevel Example 10.9 is premature, would it not still be possible to provide evidence for MAR if there were within-level covariates that were predictive of the outcome, predictive of missingness, and of some substantive interest in the model? Such covariates could influence the intercept or slope. Substantive interest could come from a need to adjust model estimates for something like gender. Granted, I am talking about very informative covariates. But if you found a couple of good ones, with moderately strong associations and no interactions, wouldn't that address missingness in multilevel models such as Example 10.9?
If you have covariates that are predictive of the outcome and missingness as well, I would not hesitate to include them in the model and thereby make MAR more plausible. The missing correlates situation is different because the correlates don't have a role in the model as predictors of the outcome, only missingness. Hence, they shouldn't be included as covariates because the influence of the substantive covariates becomes distorted.
I would first ignore the multilevel angle and see if missing data correlates have an effect or not.
Anonymous posted on Monday, June 28, 2010 - 5:25 pm
In attempting to run a CACE model I receive the following warning:
WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IN CLASS 1 IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES. CHECK THE RESULTS SECTION FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE W9XR.
No, this message should not be ignored. For the variable w9xr, you should look for a negative residual variance or a correlation greater than one. If you can't see the problem, please send your full output and license number to email@example.com.
ywang posted on Tuesday, October 12, 2010 - 11:54 am
I have two questions about CACE: 1. If y (outcome variable) is a categorical variable, can I do a CACE model with example 7.23 by only including a command of "categorical=y"? What else do I need to add to the input?
2. Is it possible to examine the interaction between treatment (not complier) and a covariate in the CACE model? If so, is it correct to simply generate an interaction term between treatment and the covariate, and include it as another covariate in the model?
Greetings, I have a question on how to interpret part of the output of a CACE model. Is the following section giving me the means of my predictors within class 1 (i.e., non-engagers in my model)?
. RESIDUAL OUTPUT
. ESTIMATED MODEL AND RESIDUALS (OBSERVED - ESTIMATED) FOR CLASS 1
. Model Estimated Means
And is the following section giving me the means of my predictors within class 2 (i.e., engagers in my model)?
. ESTIMATED MODEL AND RESIDUALS (OBSERVED - ESTIMATED) FOR CLASS 2
. Model Estimated Means
I would assume that these are the values I would get if I saved the estimated engager status for each participant and requested means for each group separately, but the subtitle saying "RESIDUAL OUTPUT" is confusing to me.
Hi, I'm working on a cluster principal stratification model. Given the error message below, I wonder whether I created the group dummies properly. Treatment was assigned at the school level, and compliance is also at the school level. The outcome is at the student level. Since compliance was not observed in the control group, I coded "0" for both c1 and c2. The treatment group has compliance (1) and non-compliance (0) coded in c1 and c2, respectively.
c1 c2
0 0 control (assumed to have zero treatment effect)
1 0 treatment (compliance)
0 1 treatment (non-compliance)
But then, I got an error message like the below. I greatly appreciate your help.
*** ERROR There is at least one observation in the data set where all training variables are zero. Please check your data and format statement.
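The error is consistent with the coding used at the top of this thread, where control cases carry a 1 on every training variable because their class membership is unknown and any class must remain possible. Below is a minimal Python sketch of that coding, plus a check for the all-zero rows the error complains about. The function names and the class ordering (class 1 = compliers) are assumptions made for illustration.

```python
def training_row(treated, complied=None):
    """Return training indicators (c1, c2); 1 means membership is allowed.

    Assumed ordering: c1 = complier class, c2 = non-complier class.
    """
    if not treated:
        # Compliance is unobserved for controls, so both classes stay open.
        return (1, 1)
    return (1, 0) if complied else (0, 1)

def all_zero_rows(rows):
    """Indices of rows where every training variable is zero."""
    return [i for i, row in enumerate(rows) if not any(row)]

rows = [
    training_row(treated=False),                 # control
    training_row(treated=True, complied=True),   # treatment, complier
    training_row(treated=True, complied=False),  # treatment, non-complier
]
```

Running all_zero_rows on the data file before handing it to Mplus would flag any case coded 0 0.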
I have two questions. Treatment and compliance occur at the school level; the outcome (score) is at the student level. But in the output, the latent class sizes look as if the compliance variables were created at the student level. Also, I get the error message below. Do you see any problems in the syntax that I can fix, or is there an example of a multilevel CACE model? I appreciate your help so much.
VARIABLE:
NAMES ARE school treat score c1 c2;
USEVARIABLES ARE treat score school c1 c2;
CLUSTER = school;
BETWEEN = treat;
CLASSES = c(2);
TRAINING = c1-c2;
Error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.808D-11. PROBLEM INVOLVING PARAMETER 6.
Thank you so much. I added "BETWEEN=c;" but the latent class counts are still in numbers of students. My compliance occurs at the school level, so shouldn't the latent class sizes be reported as numbers of schools?
And since my "non-complier class" received a low-quality dosage, it is not a zero-treatment-effect class. In this case, do I still have to fix the effect to zero? If I run the model after fixing it to zero, then the coefficient for class 1 is 0 too.
VARIABLE:
NAMES ARE id school treat score c1 c2;
USEVARIABLES ARE treat score school c1 c2;
MISSING IS .;
CLUSTER = school;
BETWEEN = treat c;
CLASSES = c(2);
TRAINING = c1-c2;
Check that your training data are between-level variables, that is, that all members of a cluster have the same value. If you continue to have problems, send your files and license number to firstname.lastname@example.org.
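The check suggested above (every member of a cluster has the same training values) can be done on the data file before running Mplus. A minimal Python sketch follows, assuming the data have been read into tuples of (cluster id, c1, c2); the function name and the toy records are hypothetical.

```python
from collections import defaultdict

def clusters_with_varying_training(records):
    """Return ids of clusters whose members disagree on (c1, c2).

    records: iterable of (cluster_id, c1, c2) tuples, one per student.
    """
    patterns = defaultdict(set)
    for cluster_id, c1, c2 in records:
        patterns[cluster_id].add((c1, c2))
    return sorted(cid for cid, seen in patterns.items() if len(seen) > 1)

# Toy records: school 2 mixes two training patterns, so it is flagged.
records = [
    (1, 1, 0), (1, 1, 0),
    (2, 1, 0), (2, 0, 1),
    (3, 1, 1), (3, 1, 1),
]
problem_schools = clusters_with_varying_training(records)
```

An empty result means the training variables are constant within every school and can be treated as between-level variables.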
QianLi Xue posted on Wednesday, August 29, 2012 - 6:47 am
The User's Guide gives two ways to implement the CACE model (i.e., Ex7.23 and Ex7.24). I noticed that the two approaches treat a missing outcome variable (i.e., Y in the examples) differently. Ex7.23 will delete "cases with missing on all variables except x-variables." So it will delete all cases with Y missing. However, in Ex7.24, only cases with both Y and u missing will be deleted, which means only cases in the control (not treatment) group with missing Y will be deleted. If this is correct, which model do you recommend when there is missing data in Y? Thanks in advance for your help!
What you say does not seem correct. Can you please send the two outputs and your license number to email@example.com so I can see what you mean.
Elina Dale posted on Monday, May 06, 2013 - 8:23 pm
Dear Dr. Muthen,
I am trying to estimate CACE as I have RCT with non-compliance (55% of those in treatment group did not get it). My outcome variable is a cont latent variable measured through cat factor indicators. My treatment variable is trx (1/0) and I have compliance indicator p4p (1/0), which shows whether ind i actually received the trx. I ran my model and I got the following warning message:
RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES
Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:
-12933.575 533738 11
-12933.575 407168 44
Unperturbed starting value run did not converge.
2 perturbed starting value run(s) did not converge.
THE BEST LOGLIKELIHOOD VALUE HAS BEEN REPLICATED. RERUN WITH AT LEAST TWICE THE RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED.
I reran then with STARTS = 100 ; which is twice the number of starts before where I had 50. However, I got the same warning message, i.e. RERUN WITH AT LEAST TWICE THE RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED.
I am not sure what perturbed and unperturbed starting values are, or what I should do as the next step to get the best loglikelihood and replicate it.
We give the message to rerun every time you run, because we have no way of knowing whether this is your first or second run. If you have doubled the original starts and replicated the best loglikelihood several times, you should be fine. Unperturbed starting values are the default starting values; the perturbed starting values are generated from them.
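The replication check described above can be applied by hand to the final stage loglikelihood values in the output. A minimal Python sketch, where the function name and the tolerance are assumptions of this example rather than anything Mplus defines:

```python
def best_loglik_replicated(logliks, tol=1e-3):
    """Return (best value, number of runs reaching it, replicated flag).

    logliks: final stage loglikelihood values copied from the output.
    tol: how close two values must be to count as the same maximum
         (an arbitrary choice for this sketch).
    """
    best = max(logliks)
    hits = sum(1 for ll in logliks if abs(ll - best) <= tol)
    return best, hits, hits >= 2

# The two final stage values reported in the output quoted above.
result = best_loglik_replicated([-12933.575, -12933.575])
```

Here both final stage runs reach the same value, so the best loglikelihood counts as replicated.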
Elina Dale posted on Friday, May 10, 2013 - 7:38 pm
Dear Dr. Muthen,
Thank you for all your help! It seems my model ran now and I got the CACE estimates that seem to make sense.
I am struggling now presenting and interpreting results, specifically assigning units. My outcome is a continuous latent variable measured through a set of observed indicators on a Likert scale. I have an assigned treatment variable (TRX=0/1) and I estimated CACE.
My F1 on TRX is -2.673. This is my CACE estimate. What are the units of measurement here?
For each of my factors 1st loading is fixed to 1. Will my results be easier to interpret if I fixed factor variances to 1?
I want to present it to policy-makers and would like to find a good way of interpreting the results. Also, if you have good publications you could refer me to that is similar to my case, I'd greatly appreciate it. Thank you!!!
I would interpret the standardized STD coefficient. Then the metric of f1 is mean zero and variance one.
Elina Dale posted on Saturday, May 11, 2013 - 8:24 pm
Dear Dr. Muthen,
I tried to get the STD coeff but it seems that to get them one has to specify ALGORITHM=INTEGRATION. With default setting, I got a message that there were 50625 integ points & I had to reduce their number or use MC Integration. So, I used MC with the following commands:
Analysis:
TYPE = COMPLEX MIXTURE ;
ALGORITHM=INTEGRATION ;
INTEGRATION = MONTECARLO ;
The Model took a few hours to run but eventually it terminated normally and I got the usual output without any warning messages. BUT, the estimates that I got using these specifications differ significantly from the estimates that I got previously, before I specified MC Integration, using these commands:
Analysis:
TYPE = COMPLEX MIXTURE ;
!3rd RERUN WITH MORE STARTS TO MAKE SURE
!THE BEST LOGLIKELIHOOD IS STILL OBTAINED
STARTS = 400 20 ;
STITERATIONS = 20 ;
My estimates in all 3 runs w/out integration were all the same and my loglikelihood was replicated. So, I trusted those estimates. Now it seems with MC I get different ones. Which one of the two should I trust?
I've read on the Board that default numerical integ algorithm is better / more stable than MC. I couldn't do default b/c I had 4 dimensions & high number of integ points. But I still need to obtain STD results as per your earlier suggestion. Thank you!
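The figure of 50625 integration points is what one gets from 15 points per dimension raised to the power of 4 dimensions. The short Python sketch below does that arithmetic; the 15 points per dimension is my assumption about the default for standard numerical integration, chosen because it reproduces the number in the message.

```python
def total_integration_points(points_per_dim, n_dims):
    """Grid size for product-rule numerical integration."""
    return points_per_dim ** n_dims

default_grid = total_integration_points(15, 4)  # 50625, as in the message
smaller_grid = total_integration_points(7, 4)   # 2401
```

Since the grid grows exponentially with the number of latent dimensions, lowering the points per dimension (for instance to 7) shrinks it sharply, at some cost in precision.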
Elina Dale posted on Thursday, August 15, 2013 - 1:44 am
Dear Dr. Muthen,
I am trying to estimate CACE as I have an RCT with non-compliance. Here is my specification of the original model A:
MODEL:
%overall%
f1 BY item1 item2 item3;
f2 BY item4 item5 item6;
f3 BY item7 item8 item9;
f4 BY item10 item11 item12;
I ran my model A 3 times and my best likelihood was replicated each time. All the factor loadings and beta coefficients stayed the same.
Following this, I've added 3 more predictors, in addition to trx, and ran model B. I didn't alter the measurement part, i.e. factor indicators stayed the same. All other parts of how I specified the model also stayed unaltered.
The part that changed in the Model B input:
f1 ON trx type1 type2 type3;
f2 ON trx type1 type2 type3;
f3 ON trx type1 type2 type3;
f4 ON trx type1 type2 type3;
Model B results show that the factor loadings have changed(!), as well as the beta coefficient of trx. I expected the latter, but I thought the factor loadings should not have been altered. I reran the model with more starts & iterations but got the same results.
Should factor loadings remain unaltered between these two models? If not, what can I do to find the reason for this error? Many thanks!
The factor loadings can change. There may be a need for direct effects between the type and item variables due to measurement non-invariance.
Elina Dale posted on Tuesday, November 12, 2013 - 6:03 pm
Dear Dr. Muthen,
I am writing to you to just confirm again that CACE estimation method can be used with observed treatment and latent outcome variable b/c the articles on CACE that I found all seem to use observed outcomes (such as PIRC Study).
My Y or rather Y's are 4 factors measured through 20 items. My treatment is financial incentives, but I have high % of noncompliers, so am using CACE.
If I can use MPlus to estimate CACE with a latent outcome, is the specification below correct?
MODEL:
%overall%
f1 BY item1 item2 item3;
f2 BY item4 item5 item6;
f3 BY item7 item8 item9;
f4 BY item10 item11 item12;
Yes, this is doable and your setup is fine. But you should be aware that there are several possibilities for how many measurement parameters for the DVs that should be invariant across the two classes (factor loadings, indicator intercepts, residual variances), and you could study the sensitivity of the results to that.
Elina Dale posted on Wednesday, November 13, 2013 - 8:51 pm
Yes, thank you, Dr. Muthen! I fixed the factor loadings & indicator intercepts to equal across two classes. I thought it would be reasonable to assume that, at least as a starting point.
1. I am wondering if there is a good paper (like Booil Jo, 2002 on CACE) on how to check model assumptions etc. with a CACE model where the mediating variable is a LATENT variable consisting of 4 correlated factors.
X--> M --> Y where M is a latent variable that consists of 4 factors.
2. I am getting very strange estimates. For example, in my CACE model with just X & M where M was my outcome, I consistently got negative coefficients for trx.
Now that I added Y and my M is acting as a mediating variable, coefficients of X on M are positive on 2 of the factors.
The positive coefficients go also against exploratory data analysis results. I am wondering if the results are trustworthy.
I have 805 subjects, I have a trx variable, a mediating variable (4 factors, 20 items), and an observed continuous outcome.
I wonder if the sample is too small for such a complicated model or if there is something else I am doing wrong. Thank you!!!
1. Not that I know. You may want to contact Booil Jo at Stanford.
2. I would break down the modeling into small parts to understand what is happening. For instance, first do X-->M without bringing in CACE. And look at each M factor separately (first making sure that the M factor analysis model fits well).
The sample size should be sufficient.
Elina Dale posted on Thursday, November 14, 2013 - 10:47 am
Thank you, Dr. Muthen! I did do (2). I first fit a CFA and checked the model fit. Then I fit a model with just my treatment and my mediator as the outcome (X-->M) without CACE. Then, I fit CACE for X-->M.
Then, since X and my final Y are observed variables, I fit regular regression to check X and Y association.
Now, I am trying to fit the whole model X-->M-->Y using CACE.
In this "big" model coefficients of X on M are getting reversed (what was neg before becoming positive, which doesn't make sense). Plus, I am getting a message that the model may not be identified. There is something wrong but unlike regular regressions, I do not know of diagnostic tools that we could use after fitting the model to check our assumptions etc.
Send the output to Support, including TECH1, TECH4, and TECH8.
Elina Dale posted on Friday, November 15, 2013 - 7:20 pm
Thank you!!! I am rerunning it now since I didn't specify TECH4 output. Will send it as soon as it finishes. Thank you!
Elina Dale posted on Sunday, January 26, 2014 - 11:15 pm
Dear Dr. Muthen,
I was listening to your presentation on categorical factor indicators. There you say that when we use MLR as the estimator, Mplus uses logistic regression, so the coefficients are interpreted as OR.
To estimate CACE, Mplus uses mixture modeling with the MLR estimator. If everything is set up as in Ex 7.24, except that the outcome is a latent variable measured on an ordinal scale, do the coefficients need to be exponentiated?
In May 2013, I wrote that my outcome was a continuous latent variable measured through a set of observed indicators on a Likert/ordinal scale. I would like to confirm that the coefficient for the treatment variable that I get in the output does not need to be exponentiated and that I haven't misunderstood your response in May.
Just go with what the DV is. The compliance status is binary. The factor is continuous - it doesn't matter that the factor indicators are categorical since they are DVs only for the factor predicting the indicators, not in the prediction of the factor.
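The rule "go with what the DV is" can be illustrated with a small Python sketch. The coefficient values are hypothetical and not taken from any output in this thread: a logit coefficient for the binary compliance class is exponentiated to give an odds ratio, whereas a coefficient for the continuous factor regressed on treatment is a linear slope and is read as is.

```python
import math

# Hypothetical estimates, for illustration only (not from any output in this thread).
logit_coef_class_on_x = 0.405       # binary DV: compliance class regressed on a covariate
linear_coef_factor_on_trt = -0.25   # continuous DV: latent factor regressed on treatment

# Binary DV (logistic regression): exponentiate to obtain an odds ratio.
odds_ratio = math.exp(logit_coef_class_on_x)

# Continuous DV (linear regression): interpret directly, in units of the factor.
factor_shift = linear_coef_factor_on_trt

print(f"odds ratio for class membership: {odds_ratio:.2f}")
print(f"shift in factor mean under treatment: {factor_shift:.2f}")
```

Whether the factor indicators are categorical does not enter this step, since they are DVs only in the measurement part of the model.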
Elina Dale posted on Monday, January 27, 2014 - 3:20 pm
Thank you! This makes sense now. Greatly appreciate it!
Huili Liu posted on Monday, May 12, 2014 - 5:41 pm
Thank you very much for reading this email.
I am a PhD student, and I am very interested in the estimation of intervention effects with noncompliance. I read your chapter with Dr. Jo, Modeling of Intervention Effects with Noncompliance: A Latent Variable Approach for Randomized Trials. I really enjoyed it!
But I have a question about the Mplus code provided. The outcome was regressed on the assignment variable Z and the compliance status C. However, the estimation did not seem to include the variable D, which indicates whether subjects actually took the treatment, given the treatment level they were assigned to. If D is not included in the model, does the estimation process lose some information? I tried to run the model with some data I simulated. I can estimate the correct effect for compliers with the traditional 2SLS method, but the result estimated with the EM algorithm was less accurate.
The following is the code:
Variable:
names are Y Z;
usev are Y Z;
classes=c(3);
Analysis:
type=mixture;
Model:
%overall%
Y on Z;
Y;
[Y];
See the Mplus User's Guide which shows two ways to do CACE modeling.
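For reference, the 2SLS comparison mentioned in the question reduces, with one binary instrument and no covariates, to the Wald ratio of two intention-to-treat differences. The following Python sketch shows where the received-treatment indicator D enters that estimator. The function name and the toy data are made up for illustration, and this is not the Mplus mixture estimator.

```python
from statistics import mean

def wald_cace(y, z, d):
    """IV (Wald) estimate: ITT effect on Y divided by ITT effect on D."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    itt_y = mean(y1) - mean(y0)   # effect of assignment on the outcome
    itt_d = mean(d1) - mean(d0)   # effect of assignment on treatment receipt
    if itt_d == 0:
        raise ValueError("assignment does not change treatment receipt")
    return itt_y / itt_d

# Toy data (hypothetical): 4 units assigned to treatment, 4 to control.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]   # one-sided noncompliance: half of the treated arm complies
y = [3.0, 3.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0]

estimate = wald_cace(y, z, d)
print(estimate)
```

In the mixture approach the same information is carried by the training data or the observed class indicator rather than by D as a separate model variable, which is why the input above cannot identify the classes from Y and Z alone.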
Huili Liu posted on Wednesday, May 14, 2014 - 10:00 am
Dear Dr. Muthen,
Thank you very much for your quick response!
However, I still have a question after reading the examples in the Mplus User's Guide. In your chapter with Jo (2001), Modeling of Intervention Effects with Noncompliance: A Latent Variable Approach for Randomized Trials, you specified that the non-compliers are just the never-takers, not including the defiers and the always-takers.
But in the User's Guide, I did not see such a clarification for the non-compliers. My understanding is that the examples also assume that there are no always-takers and defiers. Is that right?
If the examples assume there are no always-takers and defiers, is there a way to include always-takers in the estimation? Or could you suggest a paper to read?
I understand that in most research settings always-takers either have no access to the treatment or cannot be recorded as having taken it, which is why most researchers speak of "observed compliance statuses". But if the researcher does have records of some always-takers, how can Mplus handle this problem?
Dear Dr. Muthén, using five datasets from a multiple imputation, I am trying to fit a multigroup unconditional model (as defined in HLM). My grouping variable is gender. I am not sure if my Mplus code is correct. Additionally, I am getting an error:
Errors for replication with data file Stutomplus1.dat:
* FATAL ERROR CLASS-SPECIFIC BETWEEN VARIABLE PROBLEM.
My code is:
classes=g(2);
KNOWNCLASS = g (gender= 0 gender=1);
USEVARIABLES ARE zpvmath;
Hello, I'm grappling with how to best approach a CACE analysis using mixture modelling when I have reason to believe that there may be two competing dynamics at work such that assignment to treatment not only affects the DV positively via compliance, but also is having a negative effect on the DV via other processes. To be more concrete, we asked volunteer mentors to do X activities with their mentees -- so compliance can be taken as doing those activities. But it seems that what may also have occurred is that having been directed to do X activities induced mentors to *not* do other things, outside of those specific activities, with their mentees that would benefit the DV - so, it is as if mentors assigned to treatment had the approach to some extent of "doing what they were told to do" but also doing less (than those assigned to control) of other functionally-similar (i.e., also of benefit to the DV) activities, perhaps due to a sense of having already handled those through the assigned intervention activities. I'm gathering that modeling this might have to do with relaxing the exclusion restriction -- however, the only reference I can find on that is relaxing it for never-takers, whereas it seems like I might need to be relaxing for takers. I realize this is a pretty convoluted situation -- thanks very much for whatever suggestions or guidance you can offer!
Maybe this CACE paper on our website is somewhat related:
Sobel, M. & Muthén, B. (2012). Compliance mixture modelling with a zero effect complier class and missing data. Biometrics, 68, 1037-1045. DOI: 10.1111/j.1541-0420.2012.01791.x
Huili Liu posted on Tuesday, October 27, 2015 - 11:17 am
Hi Dr. Muthen,
I am trying to use Mplus to analyze my simulated data. I want to estimate the CACE (with compliers, always-takers, and never-takers) across 5 observed groups by constraining some parameters. But my Mplus version 6.12 told me that I cannot use "knownclass" together with training data. I tried other modeling methods, but it seems that when I have three compliance classes, only the training data approach is available for CACE.
Do you have any suggestions about this problem? Does the new mplus allow this type of modeling?
You don't need to take the training data approach. See the Version 7 UG on our website for the alternative CACE approach without training data. To this you can add Knownclass.
Huili Liu posted on Tuesday, October 27, 2015 - 8:05 pm
Hi Dr. Muthen,
Thank you very much for your quick response.
As far as I know, the example in the UG has only two latent classes: never-taker and complier. For the treatment group, compliance status is fully observed. Therefore, the observed class membership variable takes the values 1 and 0, plus missing data.
However, in my model I have three latent classes: always-taker, never-taker, and complier. I can only observe a unit's compliance status if it is an always-taker assigned to the control group or a never-taker assigned to the treatment group. For those who actually complied with their assignment, I only know that they are not never-takers (if assigned to treatment) or not always-takers (if assigned to control). Therefore, the observed class membership variable has records of "never taker", "always taker", and missing values. I don't have any "compliers".
When I tried to specify the thresholds for the three levels of the observed class membership variable, Mplus gave me errors like this:
*** ERROR in MODEL command
Unknown threshold for NOMINAL variable U: U#2
*** ERROR
The following MODEL statements are ignored:
* Statements in the OVERALL class:
[ U#2 ]
So is the method that creates ONE observed membership variable for the latent class variable correct? Or did you mean another method?
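The observability pattern described in this post can be written down as a mapping from assignment and received treatment to three training variables. The sketch below is in Python and assumes no defiers. The class order (complier, always-taker, never-taker), the function name, and the 0/1 coding of Z and D are assumptions for illustration. A 1 marks a class the unit may belong to and a 0 marks a class that is ruled out, following the training-variable layout shown at the top of this thread.

```python
def training_row(z, d):
    """Return allowed-class flags (c1, c2, c3) = (complier, always-taker, never-taker).

    z: 1 if assigned to treatment, 0 if assigned to control
    d: 1 if treatment was actually received, 0 otherwise
    Assumes there are no defiers.
    """
    if z == 1 and d == 0:
        return (0, 0, 1)   # refused when offered: known never-taker
    if z == 0 and d == 1:
        return (0, 1, 0)   # took treatment without being assigned: known always-taker
    if z == 1 and d == 1:
        return (1, 1, 0)   # complier or always-taker, cannot tell which
    if z == 0 and d == 0:
        return (1, 0, 1)   # complier or never-taker, cannot tell which
    raise ValueError("z and d must each be 0 or 1")

for z in (0, 1):
    for d in (0, 1):
        print(z, d, training_row(z, d))
```

Under this coding no unit is ever known to be a complier, which matches the observation in the post that the observed membership variable contains only "never taker", "always taker", and missing values.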
Let me ask you - for a single group, does the training data approach give good results?
Huili Liu posted on Wednesday, October 28, 2015 - 5:59 pm
Hi Dr. Muthen,
According to my simulation results without multiple groups, the training data method performed better than two-stage least squares. My simulation design was simple, with only one outcome variable Y and no covariates. The proportions of compliers, always-takers, and never-takers are (0.5, 0.3, 0.2). The treatment effect is 1 for always-takers and 0.8 for compliers. The pretreatment means for compliers, always-takers, and never-takers are 2, 3, and 0. Variances are all 1. Here is the result from 500 replications.
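As a check on what the IV (2SLS) estimator targets under this design, the population quantities can be worked out directly from the stated values. The Python sketch below assumes that always-takers receive treatment in both arms (so their outcome mean is 3 + 1 = 4 regardless of assignment) and that never-takers stay at 0 in both arms. Those are assumptions about the data-generating process, which the post does not spell out.

```python
# Design values quoted in the post.
props = {"complier": 0.5, "always": 0.3, "never": 0.2}
pre_mean = {"complier": 2.0, "always": 3.0, "never": 0.0}
effect = {"complier": 0.8, "always": 1.0}

# Assumed outcome means by class and arm (z=1 treatment arm, z=0 control arm).
mean_y = {
    ("complier", 1): pre_mean["complier"] + effect["complier"],
    ("complier", 0): pre_mean["complier"],
    ("always", 1): pre_mean["always"] + effect["always"],   # treated in both arms
    ("always", 0): pre_mean["always"] + effect["always"],
    ("never", 1): pre_mean["never"],                         # untreated in both arms
    ("never", 0): pre_mean["never"],
}

def arm_mean(z):
    """Population mean of Y in arm z, mixing over the three classes."""
    return sum(props[c] * mean_y[(c, z)] for c in props)

itt_y = arm_mean(1) - arm_mean(0)
# Share receiving treatment: compliers plus always-takers if assigned, always-takers otherwise.
itt_d = (props["complier"] + props["always"]) - props["always"]
iv_estimand = itt_y / itt_d
print(itt_y, itt_d, iv_estimand)
```

Under these assumptions the ratio equals the complier effect of 0.8 (up to floating-point rounding), so both 2SLS and the mixture approach aim at the same quantity and differ only in efficiency and in their reliance on distributional assumptions.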
With multiple groups we recommend that you use training data to capture known class membership to reflect the groups, that is, extend the number of classes.
Huili Liu posted on Thursday, October 29, 2015 - 11:39 am
Hi Dr. Muthen,
Thank you very much for your reply.
What do you mean by "use training data to capture known class membership to reflect the groups"? I have latent class membership that is captured by the training data. Now I added multiple group membership (which is totally observable) to the analysis. Mplus told me "External training variables are not supported when KNOWNCLASS option is used."
For your information, here is my code:
TITLE: mplus CACE use ML
DATA: FILE IS Mplus_training_data_with planned missing9999.dat;
VARIABLE:
NAMES ARE I X Gr c1 c2 c3 Y1 Y2 Y3 Y4 Y5;
USEVARIABLES = I c1 c2 c3 Y1 Y2 Y3 Y4 Y5;
CLASSES=G(5) C(3);
knownclass=G (Gr=1 Gr=2 Gr=3 Gr=4 Gr=5);
categorical=c1 c2 c3;
training= c1 c2 c3;
missing=ALL(99999);
Huili Liu posted on Thursday, October 29, 2015 - 11:44 am
MODEL:
%OVERALL%
Y by Y1 (l1);
Y by Y2 (l2);
Y by Y3 (l3);
Y by Y4 (l4);
Y by Y5 (l5);
y (VarY);
Y1 (E1);
Y2 (E2);
Y3 (E3);
Y4 (E4);
Y5 (E5);
Y ON I ;
Huili Liu posted on Thursday, October 29, 2015 - 11:44 am
%G#1.C#1%
!Y by Y1 (l1); Y by Y2 (l2); Y by Y3 (l3); Y by Y4 (l4); Y by Y5 (l5);
!Y1 (E1); Y2 (E2); Y3 (E3); Y4 (E4); Y5 (E5);
[Y] (mu1);
Y ON I (CACE);
I have RCT data in which participants were randomly assigned to treatment or control, but then those in treatment self-selected into different durations of treatment. I plan to run several CACE models without the exclusion restriction to figure out at what duration the treatment begins to have an effect. I figured I would do this by moving around the cutoff for what I consider 'low compliance' and running models à la Jo (2002), as well as models like the first in this thread (non-compliers, meaning those who received no treatment even though assigned or so little that there was no effect; low compliers; high/full compliers). I am prepared to include several covariates associated with both my outcomes and selection into different durations. (1) Does this seem like a feasible/appropriate approach? (2) Will I run into issues of non-identification?
(2) You can try dichotomization at different levels. I think Jo explored continuous compliance; check her papers. Note also the possibility of compliers varying in terms of responding - see the Sobel & Muthen paper on our Randomized Trials web page:
Dear Drs. Muthen - I have data appropriate for a one-sided noncompliance model with a DV that is a latent variable composed of eight categorical indicators. I further subdivided the analysis to three groups based on another covariate, resulting in 3*2 (compliers/non-compliers) = 6 groups. The groups are further clustered by another variable & covariates X are used in both class prediction & in adjusting for Y.
I run the models and recover a negative CACE, & two positive CACEs.
My question is really about reporting ITT_y effects. If I run an auxiliary model just to retrieve an estimate of the ITT_y (i.e., not taking compliance into account), all of my ITT estimates are negative. My understanding is that, in simplified form, CACE = ITT_y / ITT_w, where ITT_w is the proportion of compliers. So I am confused about why all the ITT_y estimates are negative when two of the CACEs are positive.
(1) Do you have any thoughts/recommendations regarding this potential issue?
(2) Is there a way in the CACE mixture model to retrieve an estimate of ITT_y? (Perhaps just take the estimated CACE and multiply it by the estimated class proportion?)
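The relation quoted in this post can be turned around to address question (2) in the simplest setting. Under one-sided noncompliance and the exclusion restriction, with no covariate adjustment, ITT_y = CACE × (proportion of compliers), so the two must share a sign whenever the complier proportion is positive. The Python sketch below uses hypothetical numbers and a hypothetical function name. Estimates from a covariate-adjusted, clustered mixture model and from a separately fitted ITT model need not satisfy the identity exactly.

```python
def itt_from_cace(cace, complier_prop):
    """ITT effect on Y implied by a CACE and a complier proportion (ITT_w)."""
    if not 0.0 < complier_prop <= 1.0:
        raise ValueError("complier proportion must be in (0, 1]")
    return cace * complier_prop

# Hypothetical (CACE, complier proportion) pairs for three groups; not real results.
groups = {
    "group 1": (-0.30, 0.40),
    "group 2": (0.25, 0.50),
    "group 3": (0.10, 0.60),
}

implied_itt = {name: itt_from_cace(c, p) for name, (c, p) in groups.items()}
for name, value in implied_itt.items():
    print(f"{name}: implied ITT_y = {value:+.3f}")
```

If a separately estimated ITT_y has the opposite sign from the implied value, that points to a difference between the two models (covariates, clustering, class-specific variances, or a local solution) rather than to the ratio formula itself.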
Dear Drs. Muthen (and others), I'm interested in learning if there are any examples available for estimating a multi-class CACE model in Mplus that take into account a situation in which there is both non-compliance and contamination. I'm inferring from what I have read that this would equate to a model with classes for compliers, non-compliers, and always-takers, but am not sure and even if I am correct, I am unsure how to estimate the model relative to the 2-class examples of which I am aware. Thank you!
refers to work on CACE. Booil Jo has done work on 3-class CACE. The Sobel-Muthen paper has a new type of 3-class model. I don't know what "contamination" refers to - except that it is used to describe outliers.
Thank you Dr. Muthen. Contamination refers to the exposure of members of a control group to the treatment condition ("process whereby an intervention intended for members of the trial (intervention or treatment) arm of a study is received by members of another (control) arm."). Methodological research on the topic has been concerned with considering the tradeoff between individual level vs. cluster level RCTs in the context of varying levels of contamination. I came across one interesting report on this topic that used CACE (https://researchonline.lshtm.ac.uk/6318/1/FullReport-hta11430.pdf). I have reached out directly to Dr. Jo for advice on how to estimate a CACE model in the context of both non-compliance and contamination and hopefully she can be of assistance. I also followed up with the author of one of the posts on this thread who asked a similar question several years ago and he provided me with the Mplus code he had used for the same situation, but was unsure if it was technically correct. If I hear from Dr. Jo, I'll share any pertinent information with this group.
I ran a CACE model. When I did not bring the means of the covariates into the model, the proportion of compliers was correct (~30%). However, when I brought the means of the covariates (including the intervention status) into the model, the proportion of compliers changed to ~60%. In addition, the estimates of the intervention effect among the compliers are implausibly large. What should I do in this situation?
I did find the following warning message in the results with the means of the covariate in the model. C2 is the complier class, X2 is the intervention status. I used 1000 integration, but the warning still exists.
THE BEST LOGLIKELIHOOD VALUE HAS BEEN REPLICATED. RERUN WITH AT LEAST TWICE THE RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.142D-16. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 28, %C#2%: [ X2 ]
Please send your output to Support along with your license number.
Margarita posted on Wednesday, March 14, 2018 - 8:10 am
Dear Dr. Muthen,
I am working on a CACE analysis and trying to decide on the best method to dichotomize my continuous compliance variable. Some use the 50th or 75th percentile as the cutoff, but it seems to me that the exploratory growth mixture analysis mentioned in Bengt's paper "Longitudinal studies with intervention and noncompliance: Estimation of causal effects in growth mixture modeling" is a more robust way. I was wondering:
1. whether, in your experience, exploratory GMA is a robust approach to CACE, and
2. how one decides which group (compliers vs. non-compliers) corresponds to each class. In my case, 90% of the control group formed class #1 and the remaining 10% formed class #2. Would I assign 90% or 10% of the intervention group to the compliance group?
I haven't found other papers using this method, so I would be very interested if you have any more papers/chapters on the subject.
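The percentile-based dichotomization mentioned in the question can be sketched as follows, for comparison with a mixture-based classification. This is Python with made-up session counts. The nearest-rank percentile rule and the choice to call values strictly above the cutoff "compliers" are assumptions, not a recommendation from this thread.

```python
import math

def percentile_cutoff(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def dichotomize(values, pct):
    """Code 1 (complier) for values strictly above the pct-th percentile, else 0."""
    cut = percentile_cutoff(values, pct)
    return [1 if v > cut else 0 for v in values]

# Hypothetical number of sessions attended in the intervention arm.
sessions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]

at_median = dichotomize(sessions, 50)
at_upper_quartile = dichotomize(sessions, 75)
print(sum(at_median), sum(at_upper_quartile))
```

Running the CACE model at several cutoffs shows how sensitive the estimate is to the definition of compliance, which is the same concern that motivates the mixture-based alternative.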