

In a study I conducted, I had subjects rate multiple stimuli that represent different levels of within-subjects independent variables. I have a situation containing variance due to subjects, variance due to manipulations, and, I believe, variance due to the interaction of subjects and manipulations. In my SEM I want to account for these sources of variation. Can Mplus help me with this analysis? If so, how would I proceed?


It sounds like what you want is a variance component analysis, which can be done in Mplus using latent variables in a factor analytic framework. George Marcoulides has described how to do this in structural equation modeling in the SEM journal. I'm not sure of the exact citation.


I'm a little lost. I've got a 4 wave panel who've answered items for the Theory of planned behaviour on attitudes to speeding: a series of antispeeding advertising campaigns have intervened. I'd like to be able to see if the TPB (it's a SEM) works over the panel who've answered each of the 4 waves and which model parameters have changed. I can see how to do a group analysis but that's just for separate cross sections. Longitudinal analyses are appropriate for panels, but only seem to be for a few variables (I've got 5 latents at each time point and up to 10 indicators for each). The type=twolevel isn't appropriate as I don't have clusters within timepoints. 

lmuthen posted on Friday, August 23, 2002  8:09 am



It sounds like a joint analysis of all time points using a growth model would be appropriate. You have latent variables with multiple indicators, and after establishing a sufficient degree of measurement invariance over time you can formulate a growth model for the latents. This is regular type = meanstructure modeling. See the User's Guide, pages 218-219 and 366. If you send your fax number to support@statmodel.com I can fax you some pages from our short course on this topic.


I am fitting a longitudinal structural equation model, where I have the same binary variable measured at different waves. I am specifying this as both endogenous to previous events and exogenous to subsequent events. The binary variable is an indicator of becoming a parent for the first time. Because this can only ever happen once to any individual, the cross-classification of any two of these variables contains an empty cell. This, I believe, is why I am receiving the following message when I try to run my model: "THE WEIGHT MATRIX PART OF VARIABLE BECOMEPC IS NONINVERTIBLE. THIS MAY BE DUE TO ONE OR MORE CATEGORIES HAVING TOO FEW OBSERVATIONS. CHECK YOUR DATA AND/OR COLLAPSE THE CATEGORIES FOR THIS VARIABLE. PROBLEM INVOLVING THE REGRESSION OF BECOMEPC ON BECOMEPA. THE PROBLEM MAY BE CAUSED BY AN EMPTY CELL IN THE JOINT DISTRIBUTION." The binary variables in question are becomepa and becomepc. Is there a workaround for this problem, or is it not possible to model this as a single system of equations?


There is no workaround for this. I don't think a random effect growth model is the right model for this type of data. There really is no development. A person has zeroes until they become a parent for the first time and then they have ones. There is no ability to move in and out of being a first-time parent. It sounds more like a candidate for a survival model.

Anonymous posted on Thursday, October 20, 2005  5:49 am



Good morning Drs. Muthen and Muthen, I am a new Mplus user. I've managed to learn the ins and outs of the software fairly quickly thanks to the excellent examples provided in the Mplus version 3 manual. However I've encountered a problem. I am attempting to develop a longitudinal structural equation model that only has two panels of data for the outcome of interest. I would like to evaluate the ability of factors measured at wave I to predict variation in the outcome factor in the wave II data, while accounting for the autoregression that exists between the wave I and II outcome measures. It should also be noted that I would like to use the wave I outcome measure as a predictive factor. In reviewing the manual I have located examples for growth modeling and mixture modeling using longitudinal data but have not located an example that contains features somewhat similar to that described above. Could you perhaps identify an example that illustrates SEM with longitudinal data of the type that I've described? Any input into this scenario would be greatly appreciated. Thank you. 


I think what you are saying would be reflected in the following MODEL command. If not, it may be a starting point.

MODEL:
f1 BY y1
y2 (1)
y3 (2)
y4 (3);
f2 BY y5
y6 (1)
y7 (2)
y8 (3);
[y1 y5] (4);
[y2 y6] (5);
[y3 y7] (6);
[y4 y8] (7);
f2 ON f1;
y1-y4 PWITH y5-y8;


Dear all, I am trying to fit a path analysis with an individual random effect. It starts from a straightforward repeated measures situation. There are, say, SES and wealth (W) as independent variables and health (H) as the dependent variable, with 3 periods each separated by 2 years. The repeated measures model with an individual random effect is as follows: H = SES*beta1 + W*beta2 + mu_i + epsilon, where mu_i is the individual random effect. Now, I do believe that the model should be slightly more detailed, hence involving path analysis as follows. At any period: SES -> W -> H and SES -> H. My question is, how do I incorporate and specify the individual random effect, mu_i, in this path analytic model? For completeness, the individual random effect is needed here mainly to capture 'sorting' or 'selection' or any unobserved confounding effect which influences both wealth accumulation and health status at the same time. Many thanks for your help. Gindo


See Example 9.3. This is close to what you want. But you would want CLUSTER=ID; because your data are in the long format as shown in Example 9.16.


Dear Linda, Many thanks indeed for your answer. Gindo 


Dear Drs. Muthen, I am conducting an RCT with 2 groups (I-group and C-group). They were tested at T1 prior to randomization, at T2, and at T3. The results from the pre-post study were fine, with many significant differences and an 87% retention rate. Age, gender and dosage were covariates, and in a few cases there were also age and gender interactions. My problem is that it looks like all the well-functioning families in the I-group and the problematic families in the C-group have dropped out from T2 to T3, making the comparison "unfair" to the intervention. At T3, the retention is only 58% of the initial sample.
1. How do I model these data so as to keep the information about the families in the I-group who really improved and most likely continued to improve at T3?
2. What exactly is the CACE command line in Mplus and should I use that?
3. Should I look for effects from T1 to T3, skipping T2 data, as some argue that groups are no longer randomized from T2 on (the groups have become qualitatively different)? On the other hand, I suppose I have to include T2 in the analyses because this is the time at which we see that our I-group is doing really well.
4. Some ANCOVA tests do show differences between the groups at T3 as well, but because we have so few participants left (n = 60-65), the differences are not statistically significant. What do we do with that?
I thank you in advance.


You should use all your data from T1, T2, and T3. It sounds like T2 scores are predictive of dropout at T3. This helps make maximum-likelihood estimation under "MAR" missing data theory perform better, in that MAR allows missingness at T3 to be a function of the T2 value. If only T1 and T3 were used, your results would not be as trustworthy due to missingness. One way to analyze T1, T2, and T3 is to do growth modeling, where you center at T1 and let the intervention dummy covariate influence the slope. CACE has to do with some subjects not adhering to the treatment, so that wouldn't seem directly relevant here.

Mark Kline posted on Tuesday, September 01, 2009  8:34 am



I have negatively correlated residuals for the same variable across four time points. Does anyone know what this means or how to interpret it?


That seems unusual. I haven't encountered that, I think. Sounds like this is an unusual outcome variable or a misspecified model, but I may be wrong. Anyone else?


I am analyzing data from an ambulatory assessment study and I want to specify a latent first-order autoregressive model. Because the time points are randomly selected for each individual, the data have an unbalanced structure: the occasions are not equally spaced, and the time lags differ between individuals and between different occasions. I think that in order to specify such a model the autoregressive parameter has to have the individual time lag in the exponent: beta^timelag(individual). Could such a model be specified in Mplus or is there another way to solve this problem?


You could try drawing on the Constraint = option in UG ex 5.23 where you read in the individual-specific lags. Perhaps in combination with UG ex 6.17.
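As a rough numerical illustration of the beta^timelag idea raised in the question (this is not Mplus input, and the function names and numbers are purely hypothetical), the Python sketch below shows how a single autoregressive parameter translates into different effective coefficients when each interval has its own lag. It assumes a non-negative beta so that fractional lags are well defined.

```python
def lagged_coefficient(beta: float, lag: float) -> float:
    """Effective autoregressive weight for one interval: beta raised to the lag."""
    if beta < 0:
        raise ValueError("beta must be non-negative for fractional lags")
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return beta ** lag


def predict_series(first_value: float, beta: float, lags: list[float]) -> list[float]:
    """Noise-free expected trajectory for one person given that person's lags."""
    values = [first_value]
    for lag in lags:
        values.append(lagged_coefficient(beta, lag) * values[-1])
    return values


# Hypothetical example: one person, three unequally spaced intervals.
beta = 0.8
person_lags = [1.0, 2.5, 0.5]
for lag in person_lags:
    print(f"lag={lag}: effective coefficient={lagged_coefficient(beta, lag):.3f}")
```

One property worth noting is that two consecutive intervals of lengths a and b carry the same weight as one interval of length a + b, which is what makes the exponent form a natural choice for unequal spacing.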

Michael Eid posted on Tuesday, October 19, 2010  12:12 am



This worked fine, thanks a lot for this suggestion. Using the model constraint command in this way I do not get goodness of fit coefficients and standardized solutions. Is there any way to get this information? 


With the individually-varying time points, the model falls outside SEM covariance structure modeling and, as with HLM, there is no overall test of fit (see also the Raudenbush chapter in the Best Methods longitudinal book). Standardized coefficients would have to be computed, say via Model constraint, by expressing them in terms of labeled model parameters.

Kesinee posted on Monday, March 28, 2011  5:38 am



Dear all, I have a 5-year follow-up study. Three mediators (continuous variables) were measured 2 times, at times 1 and 3. The IV is a nominal variable (4 categories) measured at time 1, and the DV (having disease) is a dichotomous variable measured at time 5. Gender and age (measured 2 times) are also included. I would like to test whether the development of disease is mediated by the IV combined with a change in the mediators, controlling for age and gender. I am thinking of purchasing Mplus (student license). What is the best approach to do this with Mplus? I also found that one category of the IV did not develop the outcome (empty cell), but I cannot recategorize it. Does this situation affect the analysis? Any suggestion would be appreciated. Best regards,


I'm afraid we don't understand your question. You can try to restate it. Examples 3.11 through 3.17 in the user's guide on the website show mediation.

Kesinee posted on Monday, March 28, 2011  1:28 pm



Thank you for your prompt response. Sorry if it was not clear. I mean I want to test:
1)
X -> M1 (time 1) -> M1 (time 3) -> Y (time 5)
X -> M2 (time 1) -> M2 (time 3) -> Y (time 5)
X -> M3 (time 1) -> M3 (time 3) -> Y (time 5)
All variables are observed (X = nominal, M1-M3 = continuous, Y = binary). I would like to know whether I can use Mplus (student license) to test this.
2) In addition, when I cross-tabulate X and Y (4x2), I have one empty cell in the results. I cannot regroup X, so I do not know whether this situation may cause problems in the analysis.
Thank you again. Sincerely yours,


Mplus can handle this model. I am not sure if you'll have problems due to the empty cell, perhaps if you want to test a direct effect from X to Y. By the way, a nominal X is handled via a set of binary dummy variables.
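To make the dummy-variable remark concrete, here is a small Python sketch of how a 4-category nominal X could be recoded into 3 binary indicators before the data are passed to any modeling program. The category codes 1 to 4, the choice of category 4 as the reference, and the names d1 to d3 are assumptions made for illustration only.

```python
def dummy_code(category: int, levels: list[int], reference: int) -> dict[str, int]:
    """Return 0/1 indicators for every level except the reference level."""
    if category not in levels:
        raise ValueError(f"unknown category: {category}")
    if reference not in levels:
        raise ValueError(f"unknown reference level: {reference}")
    return {f"d{level}": int(category == level) for level in levels if level != reference}


# Hypothetical coding: X takes values 1-4 and category 4 is the reference.
levels = [1, 2, 3, 4]
for x in levels:
    print(x, dummy_code(x, levels, reference=4))
```

A person in the reference category gets zeros on all three indicators, so each regression coefficient on a dummy is a contrast with that category.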

Kesinee posted on Tuesday, March 29, 2011  6:02 am



It means I have to create dummy variables for X and then run the model for all Ms as below:

M1t1 on d1 d2 d3;
M1t3 on d1 d2 d3 M1t1;
Y on d1 d2 d3 M1t1 M1t3;

If I create dummy variables like this, can it handle the problem of the empty cell for the indirect effect of X on Y? Thank you for your kindness.


Try running Y on d1 d2 d3 M1t1 M1t3; in another program. If the empty cell is not a problem there, it won't be for Mplus. 

Kesinee posted on Wednesday, March 30, 2011  5:48 pm



I ran proc logistic in SAS. I got the warning below. WARNING: There is possibly a quasi-complete separation of data points. The maximum likelihood estimate may not exist. Does this mean the empty cell may also cause a problem for Mplus?


You would have this same problem in Mplus for your direct effect. 

Kesinee posted on Thursday, March 31, 2011  5:42 pm



Thank you. 


I have a path model (6 variables, 9 paths) that I would like to compare (both paths and variable means) across 4 different conditions. Traditional MG SEM requires that all four groups (i.e., participants in each of the four conditions) be independent of each other for testing between groups. However, I am trying to figure out if there is a way to do this with a within-subjects comparison (repeated measures design, with each participant providing data for all four conditions). Because I am comparing an entire model, and not just means, a growth model does not appear to be an appropriate solution. How would one go about testing such a model? Is there a way to use an approach similar to MG SEM, but in a way that correlates error terms across groups (accounting for shared variance, since it is the same participants across conditions)? Thank you for your time.


When you have a multivariate model where several variables are measured for each person, you can compare the means for the four conditions in a single group analysis. You do not need to worry about nonindependence of observations because the multivariate model takes that into account. 


Thank you for your reply. It is not the lack of independence of variables WITHIN the path model I am concerned about, but rather the lack of independence BETWEEN the path models in each condition. I want to be able to compare paths between groups, but the groups are not independent by design (i.e., each participant has provided responses to all variables in all conditions). 


If you have a single group analysis and have say condition1 on x and condition 2 on x, you can test the equality of the two regression parameters by difference testing or MODEL TEST. 


What I am interested in doing is comparing an entire model (= all paths) between groups, not just a single path. Is there a way to correlate terms across a MG path model to accommodate the nonindependence of individuals? If so, which terms would I correlate? Would this be a sufficient way to model this? 


By correlate terms, perhaps you mean correlate the residuals of the DVs? Comparing an entire model across groups can be done by analyzing with and without equality constraints to compute a chi-square difference test.


Thanks for your response, Dr. Muthen. I am interested in accounting for the nonindependence of individuals across group, which is done by design. My thought was to correlate residuals, but I'm not sure if this is an appropriate way to do this. Would I correlate all residuals? Just one? Any insight would be appreciated. 


When you say "nonindependence of individuals across groups", your earlier description made it sound like the groups were 4 conditions for the same group of individuals at different time points. That is, you have longitudinal data. If that is a correct impression, then a single-group analysis of all 4 conditions is the right way to go. That's the multivariate approach, where with p variables observed at each time point, your model concerns 4*p variables. The only issue is how you let the variables correlate over the conditions (the different time points). You didn't want to do a growth model, so you can instead let all the variables correlate freely by using WITH statements.

Jenny L. posted on Tuesday, April 23, 2013  1:38 pm



Dear Drs., I have a set of longitudinal data (2 time points). I'd like to see whether the associations among the variables would differ across time. Given that f1 and f2 are exogenous variables, f3 is a mediator, and f4 is an outcome, I wrote the following code:

model:
F4_T1 on F3_T1;
F3_T1 on F2_T1 F1_T1;
F4_T2 on F3_T2;
F3_T2 on F2_T2 F1_T2;
model indirect
F4_T1 ind F3_T1 F1_T1;
F4_T1 ind F3_T1 F2_T1;
F4_T2 ind F3_T2 F1_T2;
F4_T2 ind F3_T2 F2_T2;
f3_t1 with f3_t2;
f4_t1 with f4_t2;

Does it look reasonable to test two models (T1 and T2) this way? If not, could you please let me know which example model I could use in the user's guide? Thank you in advance for your advice.


It is up to you to determine which parameters you want to compare across time. For example, you can compare f4 ON f3 by labeling and using MODEL TEST to obtain a Wald test.

model:
F4_T1 on F3_T1 (p1);
F3_T1 on F2_T1 F1_T1;
F4_T2 on F3_T2 (p2);
F3_T2 on F2_T2 F1_T2;
MODEL TEST:
0 = p1 - p2;
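For readers who want to see the arithmetic behind a one-degree-of-freedom Wald test of equality between two estimated parameters, the Python sketch below computes the statistic from two estimates, their sampling variances, and their sampling covariance. It is only an illustration of the general formula, not a reproduction of Mplus internals, and the numbers in the example are made up.

```python
import math


def wald_equality_test(est1: float, est2: float,
                       var1: float, var2: float,
                       cov12: float = 0.0) -> tuple[float, float]:
    """Wald chi-square (1 df) and p-value for H0: est1 - est2 = 0.

    var1 and var2 are squared standard errors; cov12 is the sampling
    covariance between the two estimates (not zero in general when both
    come from the same model fitted to the same people).
    """
    diff = est1 - est2
    var_diff = var1 + var2 - 2.0 * cov12
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    wald = diff ** 2 / var_diff
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value


# Hypothetical estimates for the two labeled paths and their sampling (co)variances.
wald, p = wald_equality_test(est1=0.60, est2=0.20, var1=0.01, var2=0.01, cov12=0.002)
print(f"Wald = {wald:.3f}, p = {p:.4f}")
```

With repeated measures on the same individuals the covariance term matters, which is one reason to estimate both time points in a single model rather than comparing two separate runs.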

Gloria Koh posted on Sunday, October 06, 2013  2:39 am



Hello Linda & Bengt, I collected measurements from one cohort of participants at 2 time points. At Time 1, I conducted SEM using 2 categorical latent exogenous variables, 1 endogenous categorical latent variable, 5 endogenous measured categorical variables and 1 endogenous measured continuous variable (outcome). I repeated the SEM with Time 2 variables. Most of the Time 2 variables are repeated measures except for 2 latent variables that were measured differently. Repeating the SEM using the Time 2 variables will only give me a cross-sectional SEM. I intend to conduct a longitudinal analysis by including all the Time 1 and Time 2 variables in the SEM model, but due to repeated measures, clustering is a problem. I am unsure whether, with only two time points, a growth model is appropriate, given my understanding that growth modelling requires at least 4 time points in Mplus. Correct me if I am wrong. That leaves multilevel modelling as the alternative option, but I am unsure if this is appropriate for what I am intending to do. Another issue is sample size. Due to participants withdrawing from the study at Time 2, I have only just over 200 participants at Time 2. Will multilevel SEM modeling be a feasible option or should I just report a cross-sectional SEM for Time 2? Thanks in advance for your reply.


In general, not in Mplus, it is desirable to have four or more time points for a growth model. With fewer time points, it is difficult to discern a trend. Analyzing the data in long format rather than wide format does not change this. 

Gloria Koh posted on Sunday, October 06, 2013  10:01 pm



Thank you, Linda & Bengt for your prompt reply. 

JMC posted on Wednesday, October 16, 2013  8:45 am



Hello Drs. Muthen, I have been working through my analyses using the great examples in the manual and message board, but got a bit stuck! I had originally performed my analysis using repeated measures in SPSS since I only had three time points and would not be able to get quadratic effects using latent growth modeling. I have been reworking my data using a longitudinal latent SEM, but have some questions. How can I use this to compare means at the three time points? Can I look at linear and quadratic effects? Can I add in covariates? Thank you very much and I appreciate your time!


You can fit a linear model. You need a minimum of four time points for a quadratic growth model. You can include covariates. 

Jingwen posted on Thursday, March 19, 2015  12:35 pm



Dear Dr. Muthen, I have a question about a SEM model using Mplus. It seems very simple but I don't know how to best specify it. I tested a mediation model (without latent variables) as below:

M1 ON X (A1);
M2 ON X (A2);
M3 ON X (A3);
M4 ON X (A4);
M5 ON M1 (B1)
M2 (B2)
M3 (B3)
M4 (B4)
X (F1);
Y ON M5 (C)
X (R);
M1 WITH M2 M3 M4;
M2 WITH M3 M4;
M3 WITH M4;
MODEL CONSTRAINT:
NEW(INDEFF1);
INDEFF1=(A1*B1*C);
NEW(INDEFF2);
INDEFF2=(A2*B2*C);
NEW(INDEFF3);
INDEFF3=(A3*B3*C);
NEW(INDEFF4);
INDEFF4=(A4*B4*C);
NEW(INDEFF5);
INDEFF5=(F1*C);

My problem now is that our Y was measured at 2 time points, so I have a repeated measure of Y. How do I specify Y as a repeated measure (just 2 measures at 2 times) in this model? My interests are in the indirect effects. Any advice is much appreciated!!! Thank you so much! Best wishes, Jingwen


Is only Y measured twice and not the M's or X? 

Jingwen posted on Thursday, March 19, 2015  1:53 pm



Hi Dr. Muthen, Only Y was measured twice (6-month and 12-month follow-up). X was our intervention and the Ms were measured immediately post-intervention.


Then the simplest approach is to do 2 analyses, one for each Y. 

Jingwen posted on Thursday, March 19, 2015  2:18 pm



Thanks for the answer. But I'm interested in learning how to make an estimate for the effect (averaging over the 2 followups), like using GEE in SAS? 


You can use Type=Complex, where subject is level 2 and the repeated measures are level 1. Or Type=Twolevel, which is more complex. Or use a wide format, that is, 2 columns for y's 2 time points, and then let them correlate in some way. With 3 time points you can apply a growth model.
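Since the reply contrasts long format (one row per person per time point) with wide format (one row per person, one column per time point), a small Python sketch of the reshaping step may help. The field names id, time, and y are hypothetical, and this is plain data preparation that would be done before the data file is read by any SEM program.

```python
def long_to_wide(rows: list[dict], id_key: str, time_key: str, value_key: str) -> list[dict]:
    """Collapse repeated rows per person into one row with a column per time point."""
    wide: dict = {}
    for row in rows:
        # Create the person's wide record on first sight, then add y1, y2, ...
        person = wide.setdefault(row[id_key], {id_key: row[id_key]})
        person[f"{value_key}{row[time_key]}"] = row[value_key]
    return list(wide.values())


# Hypothetical long-format data: two people, outcome y at time points 1 and 2.
long_rows = [
    {"id": 1, "time": 1, "y": 3.0},
    {"id": 1, "time": 2, "y": 4.5},
    {"id": 2, "time": 1, "y": 2.0},
    {"id": 2, "time": 2, "y": 2.5},
]
for record in long_to_wide(long_rows, "id", "time", "y"):
    print(record)
```

A person who misses a wave simply lacks that column in this sketch; in practice the gap would need to be written out with whatever missing-value code the analysis program expects.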

Jingwen posted on Thursday, March 19, 2015  2:52 pm



Dr. Muthen, thank you very much! I will check these options. 

Jingwen posted on Thursday, March 19, 2015  9:01 pm



Hi Dr. Muthen, I want to follow up with one more question. You suggested "Or, use a wide format, that is, 2 columns for y's 2 timepoints and then let them correlate in some way." Could you let me know if the following is what you suggested?

M1 ON X (A1);
M2 ON X (A2);
M3 ON X (A3);
M4 ON X (A4);
M5 ON M1 (B1)
M2 (B2)
M3 (B3)
M4 (B4)
X (F1);
Y1 Y2 ON M5 (C)
X (R);
M1 WITH M2 M3 M4;
M2 WITH M3 M4;
M3 WITH M4;
Y1 WITH Y2;
MODEL CONSTRAINT:
NEW(INDEFF1);
INDEFF1=(A1*B1*C);
NEW(INDEFF2);
INDEFF2=(A2*B2*C);
NEW(INDEFF3);
INDEFF3=(A3*B3*C);
NEW(INDEFF4);
INDEFF4=(A4*B4*C);
NEW(INDEFF5);
INDEFF5=(F1*C);


That's an ok setup. But you can't talk about 2 slopes and only 1 label in your line

Y1 Y2 ON M5 (C)

Instead, you have to say

Y1 ON M5 (c1)
X (R1);
Y2 ON M5 (c2)
X (R2);

But this is just considering a bivariate Y outcome. To combine their effect, you can put a factor behind Y1, Y2 and regress that on the predictors:

f BY Y1 Y2;
f ON M5 (C)
X (R);
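The indirect effects discussed in this exchange are products of labeled path coefficients, so with separate labels for the two outcomes each indirect effect has to be formed once per outcome. The Python sketch below just carries out that multiplication for made-up coefficient values. The function name and all numbers are hypothetical, and it gives point estimates only, with no standard errors or confidence intervals.

```python
def indirect_effects(a: list[float], b: list[float], c: float, f1: float) -> dict[str, float]:
    """Product-of-coefficients indirect effects of X on one outcome.

    a[k]: X -> M(k+1),  b[k]: M(k+1) -> M5,  c: M5 -> outcome,  f1: X -> M5.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    effects = {f"INDEFF{k + 1}": a_k * b_k * c for k, (a_k, b_k) in enumerate(zip(a, b))}
    # The last effect is the shorter route X -> M5 -> outcome.
    effects[f"INDEFF{len(a) + 1}"] = f1 * c
    return effects


# Hypothetical estimates; c1 and c2 are the separate M5 slopes for Y1 and Y2.
a_paths = [0.30, 0.25, 0.10, 0.40]
b_paths = [0.50, 0.20, 0.35, 0.15]
c1, c2, f1 = 0.60, 0.45, 0.20
print("Y1:", indirect_effects(a_paths, b_paths, c1, f1))
print("Y2:", indirect_effects(a_paths, b_paths, c2, f1))
```

If a single factor is placed behind Y1 and Y2 instead, only one c is involved and the function is called once.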

Jingwen posted on Friday, March 20, 2015  11:46 am



This is very helpful. Thanks! Another question... I checked about bootstrap method for type=complex and type=twolevel. Is it still the case that Mplus could not apply bootstrap to such analyses? 


Type = complex bootstrapping is available using the REPSE option. See the tech note under V6: "Resampling methods in Mplus for complex survey data". Twolevel bootstrap is not available.

Jingwen posted on Friday, March 20, 2015  11:53 am



Great. Thanks so much!!! 

Jingwen posted on Friday, March 20, 2015  12:15 pm



Dr. Muthen, I used the code below:

ANALYSIS:
TYPE = COMPLEX;
REPSE=BOOTSTRAP;
BOOTSTRAP=500;
OUTPUT:
CINTERVAL (BOOTSTRAP);

and received this error message:

*** ERROR in ANALYSIS command
BOOTSTRAP is not allowed with TYPE=COMPLEX.

I am using Mplus 7.2. When I did not include the bootstrap, the model ran normally with type=complex. Could you advise me on this? Many thanks.


I don't see your full input, but you need a weight variable. If your sample didn't use weights you can set a weight variable to be constant = 1. 


Version 7.3 has a more complete error message. 

Jingwen posted on Friday, March 20, 2015  4:49 pm



It worked out. Thanks so much! No more error message. 

Alvin posted on Monday, March 23, 2015  11:13 pm



Hi Prof Muthen, I am working on a longitudinal (two time points) dyadic model. I wondered whether it would be possible to model change in anger (y, measured at both time points) across time as a function of x1 (measured at time 2) and x2 (measured at both time 1 and time 2). My dyads are husbands and wives. I assume that y will be associated with x1 and x2 across time for both husbands and wives (and between persons/couples). In particular, I am interested in how much variance of y (wife) at time 2 is explained by x1 (time 2) and x2 (time 1 and 2) (husband). Would you model this as a random intercept (x1, x2 across individuals) and random slope (varies across time) model? Many thanks


I would model this in a "doubly-wide" format. So the variables (columns) would include both the 2 time points and male and female (so, for example, you will have 4 y's). No need for random intercepts or slopes. In this format you can include any relation across time and across gender that you want in the model.


Dear all, I have longitudinal data across 3 times of measurement. The lags between each time of measurement differ between individuals as well as within individuals (i.e. between the first and the second lag). I want to fit a latent variable cross-lagged model. It seems to me that I would have to use some form of continuous time modeling as opposed to discrete time modeling. Is there any way to do this using Mplus? Any help would be greatly appreciated!


Look at the UG index entry "individually-varying times of observation".

Ejlis posted on Sunday, August 07, 2016  11:48 pm



Hi! I have a longitudinal model with three constructs repeated over three time points. At t1 I let them correlate. I wonder if the residuals within t2 and t3 need to be correlated between constructs? If I let them, my cross causal relations are not any longer present.... Thank you! 


Q1. Yes, I think so because there may be many left-out predictors that make them correlate.

Ejlis posted on Tuesday, August 09, 2016  6:42 am



Thank you! 

Chen posted on Tuesday, September 20, 2016  9:04 pm



Hi, Prof. Muthen, I am working on running a within-subject SEM using Mplus. Basically I have 2 IVs, and my study is a 2 x 2 x 3 (repetition) design, in which each participant viewed 12 stimuli in a random order (3 messages per condition, across 4 conditions). I just transformed all of my data into long format (each participant shows up 12 times, answering questions about 12 stimuli), and have been trying to run an SEM model using 2 IVs, 2 mediators and 4 DVs. I used code like “cluster is ID” and then "Analysis: Type = complex", but I kept getting errors saying “the number of observations is 0. check your data and format statement" and "invalid symbol in data file". Could you let me know where I went wrong or how I can run this better? Thank you very much!!!

Chen posted on Tuesday, September 20, 2016  9:19 pm



Also, I opened the data in the Mplus editor, and it seems just fine; no weird symbol appears in the first line.


Please send the output, data, and your license number to support@statmodel.com. 


Dear all, I've conducted a 2-condition within-participant repeated measures study, and I'm estimating a mediation model. I have been successful in using Montoya and Hayes' (2017) syntax, but I have trouble generalising their method to a latent factor model. Please see the shortened syntax:

DEFINE:
hum1D=humKM1-humAM1;
cs1D=csKM1-csAM1;
ftD=ftKM-ftAM;
blD1=blKM1-blAM1;
!Mean centring indicators
h1Dmc=0.5((humKM1+humAM1)-9.751);
cs1Dmc=0.5((csKM1+csAM1)-6.1528);
ftDmc=0.5((ftKM+ftAM)-14.8498);
MODEL:
Blatdiff BY blD1 blD2 blD3;
csdiff BY cs1Dmc cs2Dmc cs3Dmc;
humdiff BY hum1D hum2D hum3D;
Blatdiff ON humdiff ftD csdiff h1Dmc h2Dmc h3Dmc cs1Dmc cs2Dmc cs3Dmc ftDmc;
ftD ON humdiff h1Dmc h2Dmc h3Dmc;
csdiff ON humdiff h1Dmc h2Dmc h3Dmc;
[humdiff];
[ftD);
[CSdiff];
[Blatdiff];
[ftDmc@0];
[cs1Dmc@0];
[hum1D@0];

With this syntax I get a message about an internal error. My questions are thus:
1) What does this internal error message mean?
2) Is it possible to estimate latent factors on difference score indicators?


I see one error: the right parenthesis should be a bracket in [ftD); If this doesn't help, send your output and license number to Support.
