

Two questions:
1) With only two treatment conditions (i.e., TX vs. control), is it possible to have three levels of compliance (e.g., compliance, partial compliance, and noncompliance), so that the training variables would look like the following?
c1 c2 c3
1  1  1   (in control group)
1  0  0   (compliance)
0  1  0   (partial compliance)
0  0  1   (noncompliance)
Accordingly, the latent class variable would be specified with three classes in the mixture intervention model.
2) Is it possible to run a CACE model with three treatment conditions (e.g., tx1, tx2, and control) and two levels of compliance (i.e., compliance vs. noncompliance)? It looks like we could include two TX dummy variables in the model, but how would we define the training variables and interpret the latent class variable?
Thank you very much for your help!

Booil Jo posted on Wednesday, January 23, 2002  8:27 pm



1) It is possible to arrange the training variables as you did to reflect three levels of compliance. However, this model is not identified, even if you impose the exclusion restriction on noncompliers. In this case, identification has to rely on more structural assumptions than the exclusion restriction alone.
2) This is another underdeveloped area in CACE modeling. I would compare tx1 vs. control and tx2 vs. control using regular 2-class CACE models. However, it is not clear how to interpret the results when you compare two active treatments using CACE estimation, as you already pointed out (unless double-blinding is possible).

JeremyMiles posted on Wednesday, March 24, 2004  9:59 am



I wondered if anyone reading knew whether this is an appropriate problem for CACE modelling. We have a problem related to, but not exactly the same as, noncompliance: differential nonrecruitment. We carried out a cluster randomised study of a new form of therapy versus standard care. The clusters were clinics, and there were 13 clinics allocated to standard care and 13 allocated to the new form of care. The problem that we encountered was that the clinics allocated to the new form of care (intervention) thought this was very exciting and recruited a large number of patients (~800), whereas the usual care clinics (control) didn't try so hard and recruited only ~400. The control group differs, probably in initial severity, and maybe on other characteristics. We have a wide range
Is this an example of somewhere we could use a CACE model? We have two classes in the intervention arm, and one class in the control arm. It's sort of related to ITT issues, but only sort of.
Thanks, Jeremy


As I understand it, once you have lost your randomization, the method would no longer apply. 

Anonymous posted on Wednesday, November 10, 2004  12:42 pm



Dear Bengt and Linda, I have some questions regarding CACE models in Mplus 3.11.
First, I wonder whether my data are suited to this model. I have a study where respondents to a survey are randomly allocated to either treatment or control. In the treatment condition they are shown a film about genomic science; the control group receives no information. I want to look at the effect of watching the film on subsequent attitude questions. However, approximately 20% of the treatment group report not having understood the film. I would like to treat this group as noncompliers and estimate the complier-average causal effect of viewing the film. Would this seem appropriate to you?
Second, I am not sure how to interpret the Mplus output when fitting the CACE model in Example 7.24. Can you point me somewhere that I might be able to find some pointers on what I should be looking for? In particular, which parameter denotes the causal effect? Is it the regression of y on x2 in latent class 2? I have sent my output separately to Mplus Product Support.
Thank you, Patrick

bmuthen posted on Sunday, November 14, 2004  6:04 pm



Yes on the question in the first paragraph. Example 7.24 uses the x2 variable to represent the treatment-control dummy, in line with Example 7.23. So, yes, the regression coefficient for y on x2 is the causal effect.

Anonymous posted on Monday, November 15, 2004  12:25 pm



Dear Bengt, thanks for your reply. I have a few further questions about this:
1. The output suggests to me that latent class 2 in my analysis contains the noncompliers (this is because of the relative size of the classes). However, the regression of y on x2 is fixed to zero in class 1. Should class 1 be the noncompliers in the setup for this model?
2. What is the role of the x1 variable? Should I be including variables here that are predictive of complier/noncomplier status? Can I include more than one variable for x1?
3. How do I deal with differential nonresponse in the CACE model? Can I simply specify a weight variable in the usual way?
Thanks again, Patrick

bmuthen posted on Monday, November 15, 2004  1:17 pm



1. There should be no doubt from the data about which class is the noncomplier class: it is the class with no compliance in the treatment group (so an observable matter). The standard CACE model assumes no treatment effect for the noncomplier class since they do not receive treatment (see, however, the modifications made in Booil Jo's papers on our web site).
2. x1 is an example of a variable that strengthens the analysis, much like covariates in ANCOVA for randomized studies. You can have many such variables pointing to either the latent class variable or the outcome or both. See the Little & Yau article in Psychological Methods on our web site.
3. By differential nonresponse, do you mean different across the 2 classes? If so, this is a topic studied by Frangakis and Rubin within the area of nonignorable missingness. It can be handled in Mplus by using the latent classes to predict missingness. But maybe I am misunderstanding your question.


We have a multiyear evaluation trial of a school-based program using student reports of outcomes, with students nested within classrooms. In our initial CACE analysis we dichotomized the implementation measure to create complier and noncomplier categories. Results were good, but our child-report covariates didn't predict compliance well. We have teacher-report covariates that might predict compliance better. Can Mplus do a CACE model that includes teacher-report covariates to predict compliance? Because students are nested within classrooms, all students who had the same teacher have the same values for the teacher-report covariates. Is a multilevel CACE analysis needed here to correctly use the teacher-report covariates?


Is compliance a between level (classroom) variable with students being the within level? If so, the CACE latent class variable is a between variable. This feature is not in the current Mplus Version 4.1, but will be in the next version 4.2 which is due out in a couple of weeks. If compliance varies across students within classrooms, this is related to work by Booil Jo and you may want to contact her about it. 


If you send me an email, I have a paper and some communication with Booil Jo to share with you on this. 


I am currently evaluating a multiyear trial of an elementary school primary prevention program. In the program, we have varying levels of exposure to the program (an unintended side effect of varying implementation by teachers). I am interested in the CACE approach to model the effects of the program. Additionally, I’ve run propensity score analyses, with favorable results. My question centers on locating a reference that provides a substantive discussion of the key similarities and differences between CACE and propensity score methods. Would you be able to point me in a direction? Thank you.


Michael, one of the main differences between the CACE and propensity score methods is the underlying assumptions, so you will want to think about which is more reasonable for your setting. In particular, CACE models use the fact that the original treatment assignment was randomized (I'm not sure if it was in your example), and then impose exclusion restrictions and the monotonicity assumption to estimate impacts. So they rely on having some "instrument" (the thing that was randomized) that affects the exposure level that someone gets. Propensity score methods don't assume anything was randomized, but instead rely on an assumption of unconfounded treatment assignment: they assume that there are no hidden biases between the exposed and unexposed groups. In your case, this would imply that there are no unobserved differences between the high exposure and low exposure groups, that is, that all differences are captured by your observed covariates. So they assume that only observed variables affect the exposure level that someone gets. I hope this helps. In another post I will list some references that you could look at.
Sincerely, Liz Stuart


This is a follow-up to my previous post, with references for propensity score and CACE assumptions and comparisons.
For the CACE assumptions I like the original Angrist, Imbens, and Rubin paper:
Angrist, J.D., Imbens, G.W., and Rubin, D.B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91, 444-455.
For propensity score analyses I like this paper by Rubin:
Rubin, D.B. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2, 169-188.
And there are two references I know of that compare instrumental variables (CACE) models and propensity score models:
Posner, M.A., Ash, A.S., Freund, K.M., Moskowitz, M.A., and Shwartz, M. (2001). Comparing standard regression, propensity score matching, and instrumental variables methods for determining the influence of mammography on stage of diagnosis. Health Services and Outcomes Research Methodology, 2, 279-290.
Landrum, M.B. and Ayanian, J.Z. (2001). Causal effect of ambulatory specialty care on mortality following myocardial infarction: A comparison of propensity score and instrumental variable analyses. Health Services and Outcomes Research Methodology, 2, 221-245.
Sincerely, Liz Stuart


I really appreciate your assistance on this topic. I will delve into these references and if any new question should arise I will be sure to ask. Have a beautiful day. 

Scott Grey posted on Wednesday, February 21, 2007  4:37 pm



Guys, I'm not sure what this error output means. Can you help?
__________________________________________
ANALYSIS: TYPE = TWOLEVEL MIXTURE MISSING;
  ESTIMATOR = MLR;
MODEL:
  %WITHIN%
  %OVERALL%
  alc30_11 ON alc30_7;
  class#1 ON latino black am_ind asian oth_race male family age alc30_7 risk2 risk3 risk4;
  age WITH latino black am_ind asian oth_race male family alc30_7;
  %BETWEEN%
  %OVERALL%
  alc30_11 ON treatms;
  %CLASS#1%
  [complier$1@15];
  alc30_11 ON treatms@0;
  [alc30_11];
  %CLASS#2%
  [complier$1@15];
  [alc30_11];
OUTPUT:
SAVEDATA: FILE IS results;
  FORMAT IS FREE;
  RECORDLENGTH = 1000;
  SAVE = CPROBABILITIES;
INPUT READING TERMINATED NORMALLY
*** FATAL ERROR
CLASS-SPECIFIC BETWEEN VARIABLE PROBLEM.

Thuy Nguyen posted on Wednesday, February 21, 2007  5:17 pm



Please send your input, output and data to support@statmodel.com. Not enough information is available here to determine the cause of the error message. 


I have data from an experimental study in which I have both "noncompliers" in the experimental condition and "always-takers" in the control condition. I have run some CACE models using Booil's syntax for the two-class CACE model (i.e., compliers and noncompliers) but extending the logic to a three-class model (i.e., compliers, noncompliers, and always-takers). The results seem to make sense and match up reasonably closely with the estimate of the treatment effect I get with an instrumental variable approach or by running an ANCOVA model that controls for covariates related to compliance. The standard errors for the treatment effect are somewhat larger using the CACE model compared to the ANCOVA model, which seems right. (I have not figured out how to get the standard errors or include covariates using the instrumental variable approach.) I am wondering whether I am in uncharted waters with the three-class CACE model. I looked through Booil's publications and have not found a similar example.


Following is a response from Booil Jo: It is possible to do 3-class CACE modeling in Mplus. The model will be identified by imposing the exclusion restriction on both never-takers and always-takers. Monotonicity is also necessary. Under this condition, Mplus should provide estimates close to those from the IV approach. However, since we are dealing with a parametric model, substantial deviation from normality may lead to erroneous solutions. Including covariates is not only possible, but also helps prevent this from happening. If one or both of the exclusion restrictions are relaxed, identification of the model will depend more on covariate information and normality, and therefore more caution is needed. Cross-validating the results by using both parametric and semi- or nonparametric approaches might be a good idea in this case. I have not tried many examples of 3-class CACE modeling, and have not seen many published examples using parametric approaches (and none using Mplus). However, there are many published examples of multiclass CACE modeling using the Bayesian approach.
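As a rough cross-check on the mixture estimate, the IV approach mentioned above can be sketched in Python. This is a minimal illustration and not Mplus syntax: the function name wald_cace and the toy data are hypothetical, and the ratio can be read as the CACE only under randomization, monotonicity, and the exclusion restrictions on never-takers and always-takers.

```python
from statistics import mean


def wald_cace(z, d, y):
    """Wald/IV estimate: ITT effect on the outcome divided by the
    ITT effect on treatment receipt.

    z: assigned arm (1 = treatment, 0 = control)
    d: treatment actually received (1 = yes, 0 = no)
    y: outcome
    """
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    itt_y = mean(y1) - mean(y0)   # effect of assignment on outcome
    itt_d = mean(d1) - mean(d0)   # estimated share of compliers
    if itt_d == 0:
        raise ValueError("assignment does not change treatment receipt")
    return itt_y / itt_d


# Hypothetical toy data: one never-taker in the treatment arm,
# one always-taker in the control arm.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [5, 6, 4, 2, 4, 2, 3, 1]
cace = wald_cace(z, d, y)
```

With this toy data the ITT effect on the outcome is 1.75 and the difference in treatment receipt is 0.5, so the estimate is 3.5. Standard errors are not computed here; a bootstrap over subjects is one common way to obtain them.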

Anonymous posted on Wednesday, September 16, 2009  9:04 pm



Hello, I have a few questions regarding a CACE model I am attempting to run.
1) What is the difference between these two error messages?
*** WARNING Data set contains cases with missing on all variables. These cases were not included in the analysis. Number of cases with missing on all variables: 2
*** WARNING Data set contains cases with missing on x-variables. These cases were not included in the analysis. Number of cases with missing on x-variables: 84
2) How can I maximize the variables used without affecting the entropy?
Thanks!


The first message is about missing on all analysis variables. The second message is about observations with missing on one or more observed exogenous variables. I don't understand your second question. 

Anonymous posted on Wednesday, September 16, 2009  10:05 pm



Thanks for the answer and I apologize for being vague in my second question. Regarding the number of cases with missing observations on the exogenous variables, I have found on other posts using other types of analyses that mentioning the variances works to include cases with missing data on the covariates. Is this also true for CACE models (perhaps by saying x2; again in the %overall% model)? If so, will this change the entropy of the model (the model continues to run well without these cases)? I hope this is more clear, and thank you in advance. 


If you bring in the observations with missing on x's, you change the sample, and the entropy will most likely change as well.
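To see why the entropy depends on which cases are analyzed, here is a minimal Python sketch. It assumes the usual relative entropy definition, 1 - sum(-p ln p) / (n ln K), computed from posterior class probabilities; the function name and the probabilities are hypothetical, and the exact formula should be checked against the Mplus documentation.

```python
import math


def relative_entropy(post):
    """Relative entropy for a list of posterior class-probability rows.

    post: one row per subject, one probability per latent class.
    Returns a value between 0 (no class separation) and 1 (perfect).
    """
    n = len(post)
    k = len(post[0])
    total = 0.0
    for row in post:
        for p in row:
            if p > 0:  # p * ln(p) tends to 0 as p goes to 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical posteriors: two well-classified subjects...
original = [[0.95, 0.05], [0.10, 0.90]]
# ...plus one added case that the model classifies poorly.
extended = original + [[0.55, 0.45]]

e_original = relative_entropy(original)
e_extended = relative_entropy(extended)
```

Adding the poorly classified case lowers the value, which is the sense in which changing the sample changes the entropy.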

Anonymous posted on Monday, January 04, 2010  8:04 pm



How large of a sample size is needed to conduct a CACE model? 


Following is an answer from Booil Jo: Assuming that parametric estimation methods (e.g., 2-class mixture in Mplus) are used, around 100 or more subjects in each compliance class will result in good CACE estimates and standard errors. Around 50 subjects in each compliance class will still yield reasonably good estimates. However, if parametric assumptions and/or model identifying assumptions (e.g., exclusion restriction) are not met, the quality of estimates will deteriorate. Covariates may play important roles here. By including good covariates (i.e., predictors of compliance type) in the estimation model, one may obtain reasonably good CACE estimates with smaller samples. These covariates also tend to reduce sensitivity of CACE estimates to deviations from underlying model assumptions.


I have data from a randomized trial and am interested in the noncompliance Example 7.24 in the Mplus manual. In particular, can this analysis be done with missing data on the outcome variable? I do have baseline covariates that predict missingness, the outcome variable, and compliance, and am planning to use these to identify the CACE. I have read the recent Jo, Ginexi & Ialongo (in press) paper posted at your web site, which nicely describes missing outcomes and CACE in Mplus. But they identify the CACE by imposing restrictions on the relationship between outcome missingness and noncompliance. That would be useful as a sensitivity analysis, but for the primary analysis I want to take advantage of the informative covariates, which were carefully chosen. Is it possible to use Mplus Example 7.24 with multiple imputation? If so, which Mplus multiple imputation method would be best?


It is an interesting question what is best here. You could do imputation where you specify that tx is binary. But note that regular imputation does not acknowledge that you have a mixture of compliers and noncompliers. So the imputations would seem biased to some degree. Regular ML mixture modeling would seem more straightforward, with later follow-up using the latent ignorability NMAR approach. Note that ML under MAR estimates [y | x] from those with complete data on y. The subjects with data on only x would contribute only to the marginal [x]. You have the parameters you want already in [y | x]. I am saying that because there doesn't seem to be a reason to bring the x into the model in ML in this case. If your covariates that predict missingness are not part of your model, you could take the missing data correlates approach of UG ex 11.1.


Thank you for the suggestion to use the AUXILIARY command. I am interested in applying the GMM described in the 2003 Jo & Muthen chapter in the Reise & Duan book, and see that the AUXILIARY command provides a convenient way to deal with missing outcomes, although covariates that predict compliance and the outcome are still needed in the model proper to identify it. Is it possible to use the AUXILIARY command in a three-level model in Mplus, as in a longitudinal cluster-randomized trial (e.g., level 1 = time points, level 2 = person, level 3 = cluster)?


The AUXILIARY (m) option is not available for multilevel modeling.


Is there any chance that the AUXILIARY option for multilevel modeling will be included in future versions of Mplus? 


We will put it on our list for future developments. In the meantime, note that this type of modeling can be set up by the user. The principle for the single-level approach is described in the web movie "Missing Data Correlates using ML", which you find at http://www.statmodel.com/webtalks.shtml. That principle can be generalized to two-level models.


Thanks very much for this suggestion, I will check it out. 


The "Missing Data Correlates using ML" video was clear. I see how to add in the WITH statements rather than using the AUXILIARY command in that kind of model. I am interested in using this method in the multilevel growth mixture model format of UG Example 10.9. That would represent a couple of extensions from the video example.
First, consider the case of UG Example 10.9 with some missing data in the outcomes, y1-y4. If there were three Time 1 covariates that were not measures of y (as they are in the video) and that are associated with missingness in y1-y4, could they be used as auxiliary variables in the same manner as z1-z5 in the video? That is, would this be done by adding to the %WITHIN% section of the Example 10.9 model the 3 correlations among the 3 auxiliary variables, and the 12 correlations between the 3 auxiliaries and the 4 outcome observations? If this is incorrect, could you give me a hint on how to specify the auxiliary effects in TYPE = TWOLEVEL MIXTURE?
Second, would the same approach extend to data from whole clusters of individuals (such as schools) that were missing from some waves of the study? That is, assume there are 3 cluster-level covariates that are associated with whether whole clusters (data on students in a school) are missing from some waves of data. I can see how this might be done in the %BETWEEN% section of Example 10.9, analogous to the %WITHIN% section as described above.


These are good questions and they are research questions that have to be studied, including via simulations. I'm afraid I don't have the specific answers since I haven't done this myself yet. 


Though the AUXILIARY approach to dealing with missingness in the outcomes (y1-y4) of multilevel Example 10.9 is premature, would it not still be possible to provide evidence for MAR if there were within-level covariates that were predictive of the outcome, predictive of missingness, and of some substantive interest in the model? Such covariates could influence the intercept or slope. Substantive interest could come from a need to adjust model estimates for something like gender. Granted, I am talking about very informative covariates. But if you found a couple of good ones, with moderately strong associations and no interactions, wouldn't that address missingness in multilevel models such as Example 10.9?


If you have covariates that are predictive of the outcome and missingness as well, I would not hesitate to include them in the model and thereby make MAR more plausible. The missing correlates situation is different because the correlates don't have a role in the model as predictors of the outcome, only missingness. Hence, they shouldn't be included as covariates because the influence of the substantive covariates becomes distorted. I would first ignore the multilevel angle and see if missing data correlates have an effect or not. 

Anonymous posted on Monday, June 28, 2010  11:25 pm



In attempting to run a CACE model I receive the following warning:
WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IN CLASS 1 IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES. CHECK THE RESULTS SECTION FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE W9XR.
However, on the website I noticed that the same warning is shown in a GMM example: http://statmodel2.com/examples/penn/penn8.html
Can this warning be ignored? If not, how should one proceed in resolving it?


No, this message should not be ignored. For the variable w9xr, you should look for a negative residual variance or a correlation greater than one. If you can't see the problem, please send your full output and license number to support@statmodel.com. 

ywang posted on Tuesday, October 12, 2010  5:54 pm



Dear professors, I have two questions about CACE: 1. If y (outcome variable) is a categorical variable, can I do a CACE model with example 7.23 by only including a command of "categorical=y"? What else do I need to add to the input? 2. Is it possible to examine the interaction between treatment (not complier) and a covariate in the CACE model? If so, is it correct to simply generate an interaction term between treatment and the covariate, and include it as another covariate in the model? Thanks in advance! 


1. Yes. The other difference is that instead of a mean and variance for the dependent variable, you would have a threshold. 2. Yes. 

ywang posted on Tuesday, October 19, 2010  5:10 pm



For CACE model, is there any paper or example of input file that combines latent growth modeling and CACE? Thanks! 


Yes, look under Papers, Noncompliance, and you will find a Jo & Muthen paper related to that.


Greetings, I have a question on how to interpret part of the output of a CACE model. Is the following section giving me the means of my predictors within class 1 (i.e., nonengagers in my model)?
RESIDUAL OUTPUT
ESTIMATED MODEL AND RESIDUALS (OBSERVED - ESTIMATED) FOR CLASS 1
Model Estimated Means
And is the following section giving me the means of my predictors within class 2 (i.e., engagers in my model)?
ESTIMATED MODEL AND RESIDUALS (OBSERVED - ESTIMATED) FOR CLASS 2
Model Estimated Means
I would assume that these are the values I would get if I saved the estimated engager status for each participant and requested means for each group separately, but the subtitle saying "RESIDUAL OUTPUT" is confusing to me. Thank you for your help.


Yes, those are the model-estimated means you want.


Hi, I'm working on a cluster principal stratification model. Given the error message below, I wonder whether I created the group dummies properly. Treatment was assigned at the school level, and the level of compliance is at the school level as well. The outcome is at the student level. Since the compliance variable was not observed in the control group, I coded "0" in both c1 and c2. The treatment group has compliance (1) and noncompliance (0) in c1 and c2, respectively.
c1 c2
0  0   control (assumed to be zero treatment effect)
1  0   treatment (compliance)
0  1   treatment (noncompliance)
But then I got the error message below. I greatly appreciate your help.
*** ERROR There is at least one observation in the data set where all training variables are zero. Please check your data and format statement.


See the Topic 5 course handout starting at slide 44. Instead of 0 0 you should have 1 1. This means they can be in either class. 
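The recoding described above can be sketched in Python as a data-preparation step before the file is read by Mplus. This is only an illustration: the function name is hypothetical, and the class ordering (class 1 = noncompliers, class 2 = compliers) is an assumption that must match the MODEL specification.

```python
def training_row(treat, complied):
    """Return (c1, c2) training indicators for one unit.

    Assumed ordering: c1 = 1 allows membership in class 1 (noncompliers),
    c2 = 1 allows membership in class 2 (compliers).
    treat: 1 = treatment arm, 0 = control arm.
    complied: observed compliance; ignored for controls, where it is
    unobserved.
    """
    if treat == 0:
        return (1, 1)      # control: may belong to either class
    if complied:
        return (0, 1)      # treated complier: class 2 only
    return (1, 0)          # treated noncomplier: class 1 only


# Hypothetical units: (treat, complied)
units = [(0, None), (1, True), (1, False)]
rows = [training_row(t, c) for t, c in units]

# Guard against the error quoted above: no row may be all zeros.
assert all(sum(r) > 0 for r in rows)
```

A row of all zeros rules out every class, which is why control units are coded 1 1 rather than 0 0.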


Thank you so much! I just wonder whether I am still handling the cluster design correctly. Could you look at the syntax below, please?
  CLUSTER = school;
  BETWEEN = treat;
  CLASSES = c(2);
  TRAINING = c1-c2;
ANALYSIS: TYPE IS TWOLEVEL random MIXTURE;
MODEL:
  %WITHIN%
  %OVERALL%
  score;
  %c#1%
  score;
  %c#2%
  score;
  %BETWEEN%
  %OVERALL%
  score on treat;
  %c#1%
  score on treat@0;
  %c#2%
  score on treat;
OUTPUT: tech1 tech2;


You should ask yourself if
- the treatment is on the cluster level (the answer seems to be yes)
- the latent compliance classes are on the person or cluster level (the way you have done it, the compliance is on the person level)
You also want the mean of "score" to vary across the 2 compliance classes. Then you check if your training data agrees with class 1 being the noncompliers.


I have two questions. Treatment and compliance occur at the school level; the outcome (score) is at the student level. But in the output, the latent class sizes look as if the complier classes were formed at the student level. Also, I get the error message below. Do you see any problems in the syntax that I can fix, or is there an example of a multilevel CACE model? I appreciate your help so much.
VARIABLE: NAMES ARE school treat score c1 c2;
  USEVARIABLES ARE treat score school c1 c2;
  CLUSTER = school;
  BETWEEN = treat;
  CLASSES = c(2);
  TRAINING = c1-c2;
ANALYSIS: TYPE IS TWOLEVEL random MIXTURE;
MODEL:
  %WITHIN%
  %OVERALL%
  score;
  %BETWEEN%
  %OVERALL%
  score on treat;
  [score];
  %c#1%
  score on treat;
  [score];
OUTPUT: tech1 tech2;
Error message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.808D-11. PROBLEM INVOLVING PARAMETER 6.
Latent Classes
1   571   0.21354
2  2103   0.78646


If you want the classes to be between classes, add BETWEEN=c; to the VARIABLE command. You need to fix score on treat to zero in the noncomplier class. See Example 7.23. 


Thank you so much. I added "BETWEEN=c;" but the latent class counts are still numbers of students. Since my compliance occurs at the school level, shouldn't the latent class sizes be numbers of schools? Also, since my "noncomplier class" received a low-quality dosage, it does not have a zero treatment effect. In this case, do I still have to fix the effect to zero? If I run the model after fixing it to zero, then the coefficient in class 1 is 0 too.
VARIABLE: NAMES ARE id school treat score c1 c2;
  USEVARIABLES ARE treat score school c1 c2;
  MISSING IS .;
  CLUSTER = school;
  BETWEEN = treat c;
  CLASSES = c(2);
  TRAINING = c1-c2;
ANALYSIS: TYPE IS TWOLEVEL random MIXTURE;
MODEL:
  %WITHIN%
  %OVERALL%
  score;
  %BETWEEN%
  %OVERALL%
  score on treat;
  [score];
  %c#1%
  score;
  score on treat;
  [score];
  %c#2%
  [score];
  score;
OUTPUT: tech1 tech2;


No, you don't have to fix it at zero. Check that your training data are between-level variables, that is, that all members of a cluster have the same value. If you continue to have problems, send your files and license number to support@statmodel.com.
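The check suggested above, that every member of a cluster has the same training values, can be sketched in Python before the data are passed to Mplus. The function name, column names, and toy records are hypothetical.

```python
from collections import defaultdict


def clusters_with_varying_values(rows, cluster_key, var_keys):
    """Return the sorted cluster ids in which the given variables
    are not constant across members."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[cluster_key]].add(tuple(row[k] for k in var_keys))
    return sorted(c for c, values in seen.items() if len(values) > 1)


# Hypothetical student records: school 2 has inconsistent training values.
students = [
    {"school": 1, "c1": 1, "c2": 0},
    {"school": 1, "c1": 1, "c2": 0},
    {"school": 2, "c1": 0, "c2": 1},
    {"school": 2, "c1": 1, "c2": 0},
]
bad = clusters_with_varying_values(students, "school", ["c1", "c2"])
```

Here bad is [2], flagging the one school whose students carry different training values; an empty list would mean the training variables are proper between-level variables.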

QianLi Xue posted on Wednesday, August 29, 2012  12:47 pm



The User's Guide gives two ways to implement the CACE model (i.e., Ex 7.23 and Ex 7.24). I noticed that the two approaches treat a missing outcome variable (i.e., y in the examples) differently. Ex 7.23 will delete "cases with missing on all variables except x-variables." So it will delete all cases with y missing. However, in Ex 7.24, only cases with both y and u missing will be deleted, which means only cases in the control (not treatment) group with missing y will be deleted. If this is correct, which model do you recommend when there is missing data in y? Thanks in advance for your help!


What you say does not seem correct. Can you please send the two outputs and your license number to support@statmodel.com so I can see what you mean. 

Elina Dale posted on Tuesday, May 07, 2013  2:23 am



Dear Dr. Muthen, I am trying to estimate CACE as I have an RCT with noncompliance (55% of those in the treatment group did not get it). My outcome variable is a continuous latent variable measured through categorical factor indicators. My treatment variable is trx (1/0) and I have a compliance indicator p4p (1/0), which shows whether individual i actually received the treatment. I ran my model and I got the following warning message:
RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES
Final stage loglikelihood values at local maxima, seeds, and initial stage start numbers:
12933.575  533738  11
12933.575  407168  44
Unperturbed starting value run did not converge.
2 perturbed starting value run(s) did not converge.
THE BEST LOGLIKELIHOOD VALUE HAS BEEN REPLICATED. RERUN WITH AT LEAST TWICE THE RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED.
I then reran with STARTS = 100 ; which is twice the 50 starts I had before. However, I got the same warning message, i.e., RERUN WITH AT LEAST TWICE THE RANDOM STARTS TO CHECK THAT THE BEST LOGLIKELIHOOD IS STILL OBTAINED AND REPLICATED. I am not sure what perturbed and unperturbed starting values are and what I should do as the next step to get the best loglikelihood and replicate it. Thank you! Elina


We give the message to rerun every time you run. We have no way of knowing if this is your first or second run. If you have doubled the original starts and replicated the best loglikelihood several times, you should be fine. Unperturbed starting values are the default starting values which are used for the perturbed starting values. 
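The replication check described above can be sketched in Python for final-stage loglikelihood values copied from the output. This is only an illustration: the function name, the tolerance, and the values are hypothetical.

```python
def best_loglik_replicated(logliks, tol=1e-3):
    """True if the best (largest) loglikelihood is reached by at
    least two of the final-stage runs, within the tolerance."""
    best = max(logliks)
    hits = sum(1 for ll in logliks if abs(ll - best) <= tol)
    return hits >= 2


# Hypothetical final-stage values from two runs with different STARTS.
run_a = [-2410.125, -2410.125, -2415.900]
run_b = [-2410.125, -2412.300]

replicated_a = best_loglik_replicated(run_a)
replicated_b = best_loglik_replicated(run_b)
```

In this toy case the best value is replicated in run_a but appears only once in run_b, so run_b on its own would not be reassuring.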

Elina Dale posted on Saturday, May 11, 2013  1:38 am



Dear Dr. Muthen, Thank you for all your help! It seems my model ran now and I got the CACE estimates that seem to make sense. I am struggling now presenting and interpreting results, specifically assigning units. My outcome is a continuous latent variable measured through a set of observed indicators on a Likert scale. I have an assigned treatment variable (TRX=0/1) and I estimated CACE. My F1 on TRX is 2.673. This is my CACE estimate. What are the units of measurement here? For each of my factors 1st loading is fixed to 1. Will my results be easier to interpret if I fixed factor variances to 1? I want to present it to policymakers and would like to find a good way of interpreting the results. Also, if you have good publications you could refer me to that is similar to my case, I'd greatly appreciate it. Thank you!!! 


I would interpret the standardized STD coefficient. Then the metric of f1 is mean zero and variance one. 

Elina Dale posted on Sunday, May 12, 2013  2:24 am



Dear Dr. Muthen, I tried to get the STD coefficients, but it seems that to get them one has to specify ALGORITHM=INTEGRATION. With the default setting, I got a message that there were 50625 integration points and that I had to reduce their number or use Monte Carlo (MC) integration. So, I used MC with the following commands:
Analysis: TYPE = COMPLEX MIXTURE ;
  ALGORITHM = INTEGRATION ;
  INTEGRATION = MONTECARLO ;
The model took a few hours to run but eventually it terminated normally and I got the usual output without any warning messages. BUT the estimates that I got using these specifications differ significantly from the estimates that I got previously, before I specified MC integration, using these commands:
Analysis: TYPE = COMPLEX MIXTURE ;
  !3rd RERUN WITH MORE STARTS TO MAKE SURE
  !THE BEST LOGLIKELIHOOD IS STILL OBTAINED
  STARTS = 400 20 ;
  STITERATIONS = 20 ;
My estimates in all 3 runs without integration were the same and my loglikelihood was replicated. So, I trusted those estimates. Now it seems that with MC I get different ones. Which one of the two should I trust? I've read on the Board that the default numerical integration algorithm is better / more stable than MC. I couldn't use the default because I had 4 dimensions and a high number of integration points. But I still need to obtain STD results as per your earlier suggestion. Thank you!


Send the output without STD and the output with STD and your license number to support@statmodel.com. 
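
For readers following the integration issue: the 50625 points in the message are consistent with 15 points per dimension over 4 dimensions (15^4 = 50625). A middle route between that grid and Monte Carlo integration is to reduce the number of points per dimension. The sketch below assumes that INTEGRATION accepts a plain number of points per dimension, and the value 7 is only an illustration. Whether so few points are accurate enough for this model is not something the thread settles.

```mplus
ANALYSIS:
  TYPE = COMPLEX MIXTURE;
  ALGORITHM = INTEGRATION;
  INTEGRATION = 7;        ! illustrative value: 7 points per dimension, 7**4 = 2401 in total
  STARTS = 400 20;        ! same random starts as in the runs without integration
  STITERATIONS = 20;
```

Comparing the loglikelihood and estimates across a few settings (for example 7 and 10 points) would be one way to judge whether the approximation has stabilized.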

Elina Dale posted on Thursday, August 15, 2013  7:44 am



Dear Dr. Muthen, I am trying to estimate CACE as I have an RCT with noncompliance. Here is my specification of the original model A:

MODEL
%overall%
f1 BY item1 item2 item3 ;
f2 BY item4 item5 item6 ;
f3 BY item7 item8 item9 ;
f4 BY item10 item11 item12 ;
f1 ON trx ;
f2 ON trx ;
f3 ON trx ;
f4 ON trx ;
c ON X1 X2 X3 ;
%c#1%
[U$1@15] ;
f1 ON trx ;
f2 ON.... etc

I ran my model A 3 times and my best likelihood was replicated each time. All the factor loadings and beta coefficients stayed the same. Following this, I added 3 more predictors, in addition to trx, and ran model B. I didn't alter the measurement part, i.e. the factor indicators stayed the same. All other parts of the model specification also stayed unaltered. The part that changed in the Model B input:

f1 ON trx type1 type2 type3 ;
f2 ON trx type1 type2 type3 ;
f3 ON trx type1 type2 type3 ;
f4 ON trx type1 type2 type3 ;

Model B results show that the factor loadings have changed(!) as well as the beta coefficient of trx. I expected the latter, but I thought the factor loadings should not have been altered. I reran the model with more starts & iterations but got the same results. Should factor loadings remain unaltered between these two models? If not, what can I do to find the reason for this error? Many thanks! 


The factor loadings can change. There may be a need for direct effects between the type and item variables due to measurement noninvariance. 
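
A minimal sketch of what such a direct effect could look like, using the variable names from the question. Which item and which type variable to connect is a hypothetical choice made here purely for illustration; nothing in the thread identifies the noninvariant item.

```mplus
MODEL:
  %OVERALL%
  f1 BY item1 item2 item3;
  f1 ON trx type1 type2 type3;
  ! hypothetical direct effect: type1 shifts item2 beyond what f1 explains,
  ! i.e. item2 is not measurement invariant with respect to type1
  item2 ON type1;
```

Direct effects from one covariate to every indicator of a factor cannot all be estimated together with the covariate's effect on the factor, so such paths are usually added selectively.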

Elina Dale posted on Wednesday, November 13, 2013  12:03 am



Dear Dr. Muthen, I am writing to you just to confirm again that the CACE estimation method can be used with an observed treatment and a latent outcome variable, b/c the articles on CACE that I found all seem to use observed outcomes (such as the PIRC Study). My Y, or rather Y's, are 4 factors measured through 20 items. My treatment is financial incentives, but I have a high % of noncompliers, so I am using CACE. If I can use Mplus to estimate CACE with a latent outcome, is the specification below correct?

MODEL
%overall%
f1 BY item1 item2 item3 ;
f2 BY item4 item5 item6 ;
f3 BY item7 item8 item9 ;
f4 BY item10 item11 item12 ;
f1 ON trx ;
f2 ON trx ;
f3 ON trx ;
f4 ON trx ;
c ON X1 X2 X3 ;
%c#1%
[U$1@15] ;
f1 ON trx ;
f2 ON.... etc

Thank you! 


Yes, this is doable and your setup is fine. But you should be aware that there are several possibilities for which measurement parameters of the DVs (factor loadings, indicator intercepts, residual variances) should be held invariant across the two classes, and you could study the sensitivity of the results to that choice. 

Elina Dale posted on Thursday, November 14, 2013  2:51 am



Yes, thank you, Dr. Muthen! I fixed the factor loadings & indicator intercepts to be equal across the two classes. I thought it would be reasonable to assume that, at least as a starting point. 1. I am wondering if there is a good paper (like Booil Jo, 2002 on CACE) on how to check model assumptions etc. with a CACE model where the mediating variable is a LATENT variable consisting of 4 correlated factors: X -> M -> Y, where M is a latent variable that consists of 4 factors. 2. I am getting very strange estimates. For example, in my CACE model with just X & M, where M was my outcome, I consistently got negative coefficients for trx. Now that I have added Y and my M is acting as a mediating variable, the coefficients of X on M are positive for 2 of the factors. The positive coefficients also go against the exploratory data analysis results. I am wondering if the results are trustworthy. I have 805 subjects, a trx variable, a mediating variable (4 factors, 20 items), and an observed continuous outcome. I wonder if the sample is too small for such a complicated model or if there is something else I am doing wrong. Thank you!!! 


1. Not that I know of. You may want to contact Booil Jo at Stanford. 2. I would break down the modeling into small parts to understand what is happening. For instance, first do X -> M without bringing in CACE. And look at each M factor separately (first making sure that the M factor analysis model fits well). The sample size should be sufficient. 
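
The step-by-step breakdown suggested in point 2 might be sketched as follows for a single mediator factor, reusing the item and treatment names from the earlier posts. This is only a MODEL fragment under the assumption that the rest of the input (data, variable definitions, any clustering) stays as in the poster's own runs.

```mplus
MODEL:
  f1 BY item1 item2 item3;   ! step 1: check the fit of this measurement model alone
  f1 ON trx;                 ! step 2: add this line in a second run, still without CACE classes
```

Only once the sign and size of the trx slope are understood for each factor separately would the latent compliance classes and the outcome Y be brought back in.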

Elina Dale posted on Thursday, November 14, 2013  4:47 pm



Thank you, Dr. Muthen! I did do (2). I first fit a CFA and checked the model fit. Then I fit a model with just my treatment and my mediator as my outcome (X -> M) without CACE. Then I fit CACE for X -> M. Then, since X and my final Y are observed variables, I fit a regular regression to check the association between X and Y. Now I am trying to fit the whole model X -> M -> Y using CACE. In this "big" model the coefficients of X on M are reversed (what was negative before becomes positive, which doesn't make sense). Plus, I am getting a message that the model may not be identified. Something is wrong, but unlike with regular regressions, I do not know of diagnostic tools that we could use after fitting the model to check our assumptions etc. Could you please help? Thank you! 


Send the output to Support, including TECH1, TECH4, and TECH8. 

Elina Dale posted on Saturday, November 16, 2013  1:20 am



Thank you!!! I am rerunning it now since I didn't specify TECH4 output. Will send it as soon as it finishes. Thank you! 

Elina Dale posted on Monday, January 27, 2014  5:15 am



Dear Dr. Muthen, I was listening to your presentation on categorical factor indicators. There you say that when we use MLR as an estimator, MPlus uses logistic regression, so the coefficients are interpreted as OR. To estimate CACE, MPlus uses mixture modeling with MLR estimator. If everything is set up as in Ex 7.24, except the outcome is a latent variable measured on ordinal scale, do coefficients need to be exponentiated? In May 2013, I wrote that my outcome was a continuous latent variable measured through a set of observed indicators on a Likert/Ordinal scale. I would like to just confirm again that the coefficient for the treatment variable that I get in the output does not need to be exponentiated and I haven't misunderstood your response in May. Thank you! 


No exponentiation needed because your DV is a (latent) continuous variable. 

Elina Dale posted on Monday, January 27, 2014  5:15 pm



Thank you! This is very helpful! But does the point remain that we are fitting a logistic regression with the MLR estimator in mixture analysis? I know we have a logistic regression for predicting compliance status, and Mplus even gives ORs at the end. But I wonder about the part of the model where we predict the outcome based on compliance. Since it uses MLR and the factor indicators are categorical, is it a linear or a logistic regression? I promise this is my last question. Thank you! 


Just go with what the DV is. The compliance status is binary. The factor is continuous; it doesn't matter that the factor indicators are categorical, since they are DVs only for the factor predicting the indicators, not in the prediction of the factor. 
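
Written out, the three parts of the model each take the form dictated by their own dependent variable. The notation below is illustrative and not taken from the thread: x stands for the compliance predictors, c for the compliance class, f for one factor, and item_j for one ordinal indicator with thresholds τ and loading λ.

```latex
\begin{aligned}
\text{compliance (binary DV):}\quad
  & \operatorname{logit} P(c = 1 \mid x) = \gamma_0 + \gamma^{\top} x \\
\text{factor (continuous DV):}\quad
  & f = \alpha_c + \beta_c\,\mathit{trx} + \zeta \\
\text{indicator (ordinal DV):}\quad
  & \operatorname{logit} P(\mathit{item}_j \ge k \mid f) = -\tau_{jk} + \lambda_j f
\end{aligned}
```

Only the first and third lines are on a logit scale. The treatment slope β_c in the second line is an ordinary linear regression coefficient, which is why it is not exponentiated.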

Elina Dale posted on Monday, January 27, 2014  9:20 pm



Thank you! This makes sense now. Greatly appreciate it! 

Huili Liu posted on Monday, May 12, 2014  11:41 pm



Dear Dr. Muthen, Thank you very much for reading this email. I am a PhD student, and I am very interested in the estimation of intervention effects with noncompliance. I read your chapter with Dr. Jo, Modeling of Intervention Effects with Noncompliance: A Latent Variable Approach for Randomized Trials. I really enjoyed it! But I have a question about the Mplus code provided. The outcome was regressed on the assignment variable Z and the compliance status C. However, the estimation did not seem to include the variable D, which specifies whether subjects took the treatment given the treatment level they were assigned to. If D is not included in the model, does the estimation process lose some information? I tried to run the model with some data I simulated. I can estimate the right effect for compliers with the traditional 2SLS method, but the result estimated with the EM algorithm was less accurate. The following is the code:

Variable:
names are Y Z;
usev are Y Z;
classes=c(3);
Analysis:
type=mixture;
Model:
%overall%
Y on Z;
Y;
[Y];
! For compliers
%C#1%
[Y];
Y on Z;
! For always takers
%C#2%
[Y];
Y on Z@0;
! For never takers
%C#3%
[Y];
Y on Z@0;

Thank you very much. 


See the Mplus User's Guide which shows two ways to do CACE modeling. 
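
As a rough illustration of how observed treatment receipt can enter such a model, here is a two-class sketch in the style of the User's Guide CACE examples. It rests on several assumptions that are not in the post: a hypothetical data file sim.dat, a variable u that equals the receipt indicator D for subjects assigned to treatment and is coded missing (999) in the control arm, and no always-takers or defiers.

```mplus
TITLE:    CACE sketch with an observed compliance indicator (hypothetical data)
DATA:     FILE = sim.dat;          ! hypothetical file name
VARIABLE: NAMES = y z u;           ! u: 1 = took treatment, 0 = did not,
          USEVARIABLES = y z u;    !    999 = not observed (control arm)
          CATEGORICAL = u;
          MISSING = u (999);
          CLASSES = c (2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  y ON z;
  %c#1%                ! never-takers (noncompliers)
  [u$1@15];            ! P(u = 1) = 0
  [y];
  y ON z@0;            ! exclusion restriction: assignment has no effect
  %c#2%                ! compliers
  [u$1@-15];           ! P(u = 1) = 1
  [y];
  y ON z;              ! CACE
OUTPUT:   TECH1 TECH8;
```

In this setup the receipt information is what ties class membership to observed data in the treatment arm. Without it, as in the posted input, the classes are separated only by the distribution of Y.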

Huili Liu posted on Wednesday, May 14, 2014  4:00 pm



Dear Dr. Muthen, Thank you very much for your quick response! However, I still have a question after reading your examples in the Mplus User's Guide. In your chapter with Jo (2001), Modeling of Intervention Effects with Noncompliance: A Latent Variable Approach for Randomized Trials, you specified that the noncompliers are just the never-takers, not including the defiers and the always-takers. But in the User's Guide, I did not see a clarification of who the noncompliers are. My understanding is that the examples also assume that there are no always-takers and defiers. Is that right? If so, is there a way to include always-takers in the estimation? Or could you suggest a paper to read? I understand that in most research settings always-takers either have no access to the treatment or cannot be recorded as having taken it, which is why most researchers speak of "observed compliance status". But if the researcher has records of some always-takers, how can Mplus handle this? Thank you so much for your help. 


3-class CACE adding always-takers can be done in Mplus. It can work well with covariates but not too well without covariates. This is the experience of Booil Jo, who will share her Mplus inputs here. 
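
For comparison with the 2SLS result mentioned in the question, the moment-based (instrumental variable) form of the CACE shows where always-takers enter. With Z the randomized assignment, D the treatment actually received and Y the outcome, and assuming the exclusion restriction and no defiers:

```latex
\mathrm{CACE}
  = \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}
         {P(D = 1 \mid Z = 1) - P(D = 1 \mid Z = 0)}
```

The second term in the denominator is the proportion of always-takers. It is zero when control subjects cannot obtain the treatment, which is the two-class case. The mixture approach targets the same quantity but models the class-specific outcome distributions directly.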


Dear Dr. Muthén, Using five datasets from a multiple imputation, I am trying to fit a multigroup unconditional model (as defined in HLM). My grouping variable is gender. I am not sure if my Mplus coding is correct. Additionally, I am getting an error:

Errors for replication with data file Stutomplus1.dat:
* FATAL ERROR CLASS-SPECIFIC BETWEEN VARIABLE PROBLEM.

My coding is:

classes=g(2);
KNOWNCLASS = g (gender= 0 gender=1);
USEVARIABLES ARE zpvmath;
weight = w_fstuwt ;
wtscale = cluster ;
bweight = t_fstuwt ;
bwtscale = sample ;
CLUSTER = schoolid ;
Analysis:
Type = twolevel mixture;
MODEL:
%WITHIN%
%OVERALL%
zpvmath;
%g#1%
zpvmath (p1);
%g#2%
zpvmath (p2);
%BETWEEN%
%OVERALL%
zpvmath;
%g#1%
zpvmath (p3);
%g#2%
zpvmath (p4);


Never mind my previous post, I found the mistake in my coding. It is working now. Thank you, Fernando 

ywang posted on Tuesday, November 18, 2014  3:19 pm



Dear Drs. Muthen, I think it is possible to include a mediator in the CACE model, but do you have any related paper? Thank you! 


I can't think of one offhand. You may want to contact Booil Jo at Stanford's Psychiatry Dept. 


Dear Drs. Muthen, Is there an example available for implementing CACE mixture modeling with a zero effect class as described in Sobel and Muthen (2013) article on this topic in Biometrics? Thank you! 


Here is one key setup:

TITLE: mix12
Complier Average Causal Effect (CACE) estimation in a randomized trial.
Data from the JOBS II intervention trial, courtesy of Richard Price and Amiram Vinokur, University of Michigan.
Little, R.J. & Yau, L.H.Y. (1998). Statistical techniques for analyzing data from prevention trials: Treatment of no-shows using Rubin's causal model. Psychological Methods, 3, 147-159.
DATA:
FILE = jobs with u.dat;
VARIABLE:
NAMES ARE depress risk Tx depbase age motivate educ assert single econ nonwhite u;
categorical = tx u; !u=1 complier, u=0 noncomplier
missing = all(999);
classes = cg(2) c(3);
ANALYSIS:
TYPE = MIXTURE;
starts = 100 20;
processors = 4 (starts);
MODEL:
%OVERALL%
depress ON risk depbase;
c ON age educ motivate econ assert single nonwhite;
%cg#1.c#1% !tx group (tx=1)
[tx$1@-15]; ! P(tx=1)=1
! c#1 is tx group type A - never-takers (no effect on M or Y)
[u$1@15]; ! P(u = 1)=0
[depress] (1);
depress (10);
%cg#1.c#2%
[tx$1@-15];
! c#2 is tx group type B - compliers, but no effect on Y
[u$1@-15]; ! P(u = 1)=1
[depress] (2);
depress (11);
%cg#1.c#3%
[tx$1@-15];
! c#3 is tx group Type C - compliers and effect on Y
[u$1@-15]; ! P(u = 1)=1
[depress] (t3);
depress (12);
%cg#2.c#1% !ctrl group (tx=0)
[tx$1@15]; ! P(tx=1)=0
! c#1 is ctrl group type A - never-takers (no effect on M or Y)
[u$1@15]; ! P(u = 1)=0
[depress] (1);
depress (10);
%cg#2.c#2%
[tx$1@15];
! c#2 is ctrl group type B - compliers, but no effect on Y
[u$1@-15]; ! P(u = 1)=1
[depress] (2);
depress (11);
%cg#2.c#3%
[tx$1@15];
! c#3 is ctrl group Type C - compliers and effect on Y
[u$1@-15]; ! P(u = 1)=1
[depress] (c4);
depress (12);
Model constraint:
new(cace);
cace = t3-c4;
OUTPUT:
tech1 tech8;
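
The comments of the form P(u = 1)=0 and P(u = 1)=1 in this setup follow from how a fixed logit threshold maps to a probability. As a sketch, for a binary indicator u with threshold τ and no predictors:

```latex
P(u = 1) = \frac{1}{1 + e^{\tau}}, \qquad
\tau = 15 \;\Rightarrow\; P(u = 1) \approx 3.1 \times 10^{-7}, \qquad
\tau = -15 \;\Rightarrow\; P(u = 1) \approx 1 - 3.1 \times 10^{-7}
```

A threshold fixed at 15 therefore makes category 1 practically impossible in that class, and a threshold fixed at -15 makes it practically certain. The CACE defined under Model constraint is the difference between the depress means labeled t3 (treatment-arm compliers with an effect) and c4 (their control-arm counterparts).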


Hello, I'm grappling with how best to approach a CACE analysis using mixture modelling when I have reason to believe that two competing dynamics may be at work: assignment to treatment not only affects the DV positively via compliance, but also has a negative effect on the DV via other processes. To be more concrete, we asked volunteer mentors to do X activities with their mentees, so compliance can be taken as doing those activities. But it seems that being directed to do X activities may also have induced mentors to *not* do other things, outside of those specific activities, with their mentees that would benefit the DV. It is as if mentors assigned to treatment to some extent "did what they were told to do" but also did less (than those assigned to control) of other functionally similar (i.e., also of benefit to the DV) activities, perhaps due to a sense of having already handled those through the assigned intervention activities. I gather that modeling this might involve relaxing the exclusion restriction. However, the only reference I can find on that relaxes it for never-takers, whereas it seems I might need to relax it for takers. I realize this is a pretty convoluted situation. Thanks very much for whatever suggestions or guidance you can offer! 


Maybe this CACE paper on our website is somewhat related: Sobel, M. & Muthén, B. (2012). Compliance mixture modelling with a zero-effect complier class and missing data. Biometrics, 68, 1037-1045. DOI: 10.1111/j.1541-0420.2012.01791.x 
