In a study I conducted, I had subjects rate multiple stimuli that represent different levels of within-subjects independent variables. I have a situation containing variance due to subjects, variance due to manipulations, and, I believe, variance due to the interaction of subjects and manipulations. In my SEM I want to account for these sources of variation. Can Mplus help me with this analysis? If so, how would I proceed?
It sounds like what you want is a variance component analysis, which can be done in Mplus using latent variables in a factor analytic framework. George Marcoulides has described how to do this in structural equation modeling in the SEM journal. I'm not sure of the exact citation.
I'm a little lost. I've got a 4 wave panel who've answered items for the Theory of planned behaviour on attitudes to speeding: a series of anti-speeding advertising campaigns have intervened. I'd like to be able to see if the TPB (it's a SEM) works over the panel who've answered each of the 4 waves and which model parameters have changed. I can see how to do a group analysis but that's just for separate cross sections. Longitudinal analyses are appropriate for panels, but only seem to be for a few variables (I've got 5 latents at each time point and up to 10 indicators for each). The type=twolevel isn't appropriate as I don't have clusters within time-points.
lmuthen posted on Friday, August 23, 2002 - 8:09 am
It sounds like a joint analysis of all time points using a growth model would be appropriate. You have latent variables with multiple indicators, and after establishing a sufficient degree of measurement invariance over time, you can formulate a growth model for the latents. This is regular type = meanstructure modeling. See the User's Guide, pages 218-219 and 366.
If you send your fax number to email@example.com I can fax you some pages from our short course on this topic.
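To make the suggestion above concrete, here is a minimal sketch of a growth model for one latent construct measured by three indicators at each of four waves. All variable, factor, and file names are hypothetical, the time scores assume equally spaced waves, and the equality constraints simply impose the loading and intercept invariance that would first have to be tested. It is an illustration, not the setup from the short course.

```mplus
TITLE:    Sketch of a multiple-indicator growth model (hypothetical names)
DATA:     FILE = panel.dat;          ! hypothetical file name
VARIABLE: NAMES = a11 a21 a31  a12 a22 a32
                  a13 a23 a33  a14 a24 a34;
          ! aJT = indicator J of the attitude factor at wave T
MODEL:
  ! Same label on a loading at every wave = loading invariance
  att1 BY a11
          a21 (L2)
          a31 (L3);
  att2 BY a12
          a22 (L2)
          a32 (L3);
  att3 BY a13
          a23 (L2)
          a33 (L3);
  att4 BY a14
          a24 (L2)
          a34 (L3);
  ! Indicator intercepts held equal over time
  [a11 a12 a13 a14] (T1);
  [a21 a22 a23 a24] (T2);
  [a31 a32 a33 a34] (T3);
  ! Linear growth model for the latent variable
  i s | att1@0 att2@1 att3@2 att4@3;
```

With five latent variables per wave, the same pattern would be repeated for each construct, and the structural paths among the constructs could then be specified among the growth factors or among the wave-specific factors.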
I am fitting a longitudinal structural equation model, where I have the same binary variable measured at different waves. I am specifying this as both endogenous to previous events and exogenous to subsequent events. The binary variable is an indicator of becoming a parent for the first time. Because this can only ever happen once to any individual, the cross-classification of any two of these variables contains an empty cell. This, I believe, is why I am receiving the following message when I try to run my model:
"THE WEIGHT MATRIX PART OF VARIABLE BECOMEPC IS NON-INVERTIBLE. THIS MAY BE DUE TO ONE OR MORE CATEGORIES HAVING TOO FEW OBSERVATIONS. CHECK YOUR DATA AND/OR COLLAPSE THE CATEGORIES FOR THIS VARIABLE. PROBLEM INVOLVING THE REGRESSION OF BECOMEPC ON BECOMEPA. THE PROBLEM MAY BE CAUSED BY AN EMPTY CELL IN THE JOINT DISTRIBUTION."
The binary variables in question are becomepa and becomepc. Is there a work around for this problem, or is it not possible to model this as a single system of equations?
I don't think a random effect growth model is the right model for this type of data. There really is no development. A person has zeroes until they become a parent for the first time and then they have ones. There is no ability to move in and out of being a first-time parent. It sounds more like a candidate for a survival model.
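As a rough illustration of the survival idea, a discrete-time survival model can be set up with one binary event indicator per wave. The sketch below assumes four waves, hypothetical variable names (u1-u4 for the event indicators, x for a covariate), and a recoding in which each indicator is 0 before the event, 1 in the wave where the event occurs, and missing afterward. It covers only the outcome side of the problem, not the use of parenthood as a predictor of later variables.

```mplus
TITLE:    Sketch of a discrete-time survival model (hypothetical names)
DATA:     FILE = parent.dat;         ! hypothetical file name
VARIABLE: NAMES = u1-u4 x;
          CATEGORICAL = u1-u4;       ! binary event indicators
          MISSING = ALL (999);       ! 999 = after the event or dropout
ANALYSIS: ESTIMATOR = MLR;
MODEL:
  ! A factor with loadings fixed at 1 and zero residual variance
  ! carries the covariate effect equally to every wave
  f BY u1-u4@1;
  f ON x;
  f@0;
```

Because the indicators are set to missing after the event, no pair of indicators requires a cell that cannot occur by design.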
Anonymous posted on Thursday, October 20, 2005 - 5:49 am
Good morning Drs. Muthen and Muthen,
I am a new Mplus user. I've managed to learn the ins and outs of the software fairly quickly thanks to the excellent examples provided in the Mplus version 3 manual. However I've encountered a problem. I am attempting to develop a longitudinal structural equation model that only has two panels of data for the outcome of interest. I would like to evaluate the ability of factors measured at wave I to predict variation in the outcome factor in the wave II data, while accounting for the autoregression that exists between the wave I and II outcome measures. It should also be noted that I would like to use the wave I outcome measure as a predictive factor. In reviewing the manual I have located examples for growth modeling and mixture modeling using longitudinal data but have not located an example that contains features somewhat similar to that described above. Could you perhaps identify an example that illustrates SEM with longitudinal data of the type that I've described? Any input into this scenario would be greatly appreciated. Thank you.
I am trying to fit path analysis with an individual random effect. It starts from a straightforward repeated measure situation. There are, say, SES, wealth (W) as independent vars and health (H) as dependent vars; 3 periods each separated by 2 years. Repeated measure with indiv. random effect will be as follows:
H = SES*beta1 + W*beta2 + mu_i + epsilon,
where mu_i is individual random effect.
Now, I do believe that the model should be slightly more detailed; hence, involving path analysis as follows. At any period:
SES --> W --> H
SES --------> H
My question is, how do I incorporate and specify individual random effect, mu_i, into this path analytic model?
For completeness, individual random effect is needed here mainly to capture 'sorting' or 'selection' or any unobserved confounding effect which influence both wealth accumulation and health status at the same time.
Dear Drs. Muthen, I am conducting an RCT with 2 groups (I-group and C-group). They were tested at T1 prior to randomization, at T2, and at T3. The results from the pre-post study were fine, with many significant differences and an 87% retention rate. Age, gender and dosage were covariates, and in a few cases there were also age and gender interactions. My problem is that it looks like all the well-functioning families in the I-group and the problematic families in the C-group have dropped out from T2 to T3, making the comparison "unfair" to the Intervention. At T3, the retention is only 58% of the initial sample.
1. How do I model these data so as to keep the information about the families in the I-group who really improved and most likely continued to improve at T3?
2. What exactly is the CACE command line in Mplus and should I use that?
3. Should I look for effects from T1 to T3, skipping T2 data, as some argue that groups are no longer randomized from T2 and on (the groups have become qualitatively different)? On the other hand, I suppose I have to include T2 in the analyses because this is the time at which we see that our I-group is doing really well.
4. Some ANCOVA tests do show differences between the groups at T3 as well, but because we have so few participants left (n = 60-65), the differences are not statistically significant. What do we do with that?
I thank you in advance.
You should use all your data from T1, T2, and T3. It sounds like T2 scores are predictive of dropout at T3. This helps make maximum-likelihood estimation under "MAR" missing data theory perform better in that MAR allows missingness at T3 to be a function of the T2 value. If only T1 and T3 were used, your results would not be as trustworthy due to missingness.
One way to analyze T1, T2, and T3 is to do growth modeling, where you center at T1 and let the intervention dummy covariate influence the slope.
CACE has to do with some subjects not adhering to the treatment, so that wouldn't seem directly relevant here.
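The growth modeling suggestion above might look roughly like the following sketch. The variable names (y1-y3 for the outcome at T1-T3, tx for the intervention dummy, age, female), the missing data code, and the equally spaced time scores are all assumptions made for illustration.

```mplus
TITLE:    Sketch of a growth model with an intervention effect on the slope
DATA:     FILE = rct.dat;            ! hypothetical file name
VARIABLE: NAMES = y1 y2 y3 tx age female;
          MISSING = ALL (-999);      ! hypothetical missing data code
ANALYSIS: ESTIMATOR = ML;            ! uses all available y1-y3 under MAR
MODEL:
  ! Time scores centered at T1, so i is the status at T1
  i s | y1@0 y2@1 y3@2;
  ! T1 precedes randomization, so tx predicts only the slope
  s ON tx age female;
  i ON age female;
```

Families observed at T1 and T2 but not at T3 still contribute to the estimates, which is how the T2 scores help with the dropout at T3.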
Mark Kline posted on Tuesday, September 01, 2009 - 8:34 am
I have negatively correlated residuals for the same variable across four time points. Does anyone know what this means or how to interpret it?
I am analyzing data from an ambulatory assessment study and I want to specify a latent first-order autoregressive model. Because the time points are randomly selected for each individual, the data has an unbalanced structure: The occasions are not equally spaced and the time lags differ between individuals and between different occasions.
I think that in order to specify such a model the autoregressive parameter has to have the individual time lag in the exponent: beta^time-lag(individual).
Could such a model be specified in Mplus or is there another way to solve this problem?
You could try drawing on the Constraint = option in UG ex 5.23 where you read in the individual-specific lags. Perhaps in combination with UG ex 6.17.
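A minimal sketch of this idea is given below. For brevity it uses three observed occasions instead of latent variables, and the names y1-y3, lag2, lag3, and the new parameter beta are hypothetical. The lag variables are read in through the CONSTRAINT option and then used in MODEL CONSTRAINT so that each person's autoregressive coefficient is beta raised to that person's time lag.

```mplus
TITLE:    Sketch of an autoregressive model with individual time lags
DATA:     FILE = ambul.dat;          ! hypothetical file name
VARIABLE: NAMES = y1 y2 y3 lag2 lag3;
          USEVARIABLES = y1 y2 y3;
          CONSTRAINT = lag2 lag3;    ! person-specific lags
MODEL:
  y2 ON y1 (b2);
  y3 ON y2 (b3);
MODEL CONSTRAINT:
  NEW(beta*0.5);                     ! autoregression per unit of time
  b2 = beta**lag2;
  b3 = beta**lag3;
```

The positive starting value for beta is a guess intended to keep the power function well defined for non-integer lags.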
Michael Eid posted on Tuesday, October 19, 2010 - 12:12 am
This worked fine, thanks a lot for this suggestion. Using the model constraint command in this way I do not get goodness of fit coefficients and standardized solutions. Is there any way to get this information?
With the individually-varying time points, the model falls outside SEM covariance structure modeling and, as with HLM, there is no overall test of fit (see also the Raudenbush chapter in the Best Methods longitudinal book).
Standardized coefficients would have to be computed, say via MODEL CONSTRAINT, by expressing the standardized coefficients in terms of labeled model parameters.
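As a generic illustration of computing a standardized coefficient from labeled parameters, consider a single regression of y on x. The labels b, vx, vres, and bstd are hypothetical, and the formula is the usual slope times SD(x) divided by SD(y), with the variance of y written out as explained plus residual variance. In the model with individual lags, the slope differs by lag, so a specific lag value would have to be chosen.

```mplus
MODEL:
  y ON x (b);
  x (vx);                ! variance of x
  y (vres);              ! residual variance of y
MODEL CONSTRAINT:
  NEW(bstd);
  bstd = b*SQRT(vx)/SQRT(b*b*vx + vres);
```

Labeling the variance of x brings x into the model, so it is no longer treated as a purely exogenous covariate.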
Kesinee posted on Monday, March 28, 2011 - 5:38 am
I have a 5-year follow-up study. Three mediators (continuous variables) were measured 2 times, at times 1 and 3. The IV is a nominal variable (4 categories) measured at time 1, and the DV (having disease) is a dichotomous variable measured at time 5. The data also include gender and age (measured 2 times). I would like to test whether development of the disease is mediated by the IV combined with a change in the mediators, controlling for age and gender. I am thinking of purchasing Mplus (student license). What is the best approach to do this with Mplus? I also found that one category of the IV did not develop the outcome (empty cell), but I cannot re-categorize it. Does this situation affect the analysis? Any suggestion would be appreciated.
I'm afraid we don't understand your question. You can try to restate it. Examples 3.11 through 3.17 in the user's guide on the website show mediation.
Kesinee posted on Monday, March 28, 2011 - 1:28 pm
Thank you for your prompt response. Sorry if it was not clear. I mean I want to test:
1)
X --> M1 (time 1) --> M1 (time 3) --> Y (time 5)
X --> M2 (time 1) --> M2 (time 3) --> Y (time 5)
X --> M3 (time 1) --> M3 (time 3) --> Y (time 5)
All variables are observed (X = nominal, M1-3 = continuous, Y = binary). I would like to know whether I can use Mplus (student license) to test this.
2) Also, when I crosstab X and Y (4x2), I have one empty cell in the results. I cannot regroup X, so I do not know whether this situation may cause problems in the analysis. Thank you again.
I have a path model (6 variables, 9 paths) that I would like to compare (both paths and variable means) across 4 different conditions. Traditional MG SEM requires that all four groups (i.e., participants in each of the four conditions) be independent of each other for testing between groups. However, I am trying to figure out if there is a way to do this with a within-subjects comparison (repeated measures design, with each participant providing data for all four conditions). Because I am comparing an entire model, and not just means, a growth model does not appear to be an appropriate solution. How would one go about testing such a model? Is there a way to use an approach similar to MG SEM, but in a way that correlates error terms across groups (accounting for shared variance, since it is the same participants across conditions)?
When you have a multivariate model where several variables are measured for each person, you can compare the means for the four conditions in a single group analysis. You do not need to worry about non-independence of observations because the multivariate model takes that into account.
Thank you for your reply. It is not the lack of independence of variables WITHIN the path model I am concerned about, but rather the lack of independence BETWEEN the path models in each condition. I want to be able to compare paths between groups, but the groups are not independent by design (i.e., each participant has provided responses to all variables in all conditions).
What I am interested in doing is comparing an entire model (= all paths) between groups, not just a single path. Is there a way to correlate terms across a MG path model to accommodate the non-independence of individuals? If so, which terms would I correlate? Would this be a sufficient way to model this?
Thanks for your response, Dr. Muthen. I am interested in accounting for the non-independence of individuals across groups, which is present by design. My thought was to correlate residuals, but I'm not sure if this is an appropriate way to do this. Would I correlate all residuals? Just one? Any insight would be appreciated.
When you say "non-independence of individuals across groups", your earlier description made it sound like the groups were 4 conditions for the same group of individuals at the different time points? That is, you have longitudinal data.
If that is a correct impression, then a single-group analysis of all 4 conditions is the right way to go. That's the multivariate approach, where with p variables observed at each time point, your model concerns 4*p variables. The only issue is how you let the variables correlate over the conditions (the different time points). You didn't want to do a growth model, so you can instead let all the variables correlate freely by using WITH statements.
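A simplified sketch of the single-group approach is shown below, using a three-variable path model (x, m, y) in place of the full six-variable model. The variable names, with a suffix 1-4 for the condition, and the parameter labels are hypothetical. Residuals of the same variable are allowed to correlate across conditions with WITH statements, and MODEL TEST gives a joint Wald test that the paths are equal across conditions.

```mplus
TITLE:    Sketch of a single-group analysis of 4 within-subject conditions
DATA:     FILE = within.dat;         ! hypothetical file name
VARIABLE: NAMES = x1-x4 m1-m4 y1-y4;
MODEL:
  ! Same path model in each condition, with labeled paths
  m1 ON x1 (a1);   y1 ON m1 (b1);
  m2 ON x2 (a2);   y2 ON m2 (b2);
  m3 ON x3 (a3);   y3 ON m3 (b3);
  m4 ON x4 (a4);   y4 ON m4 (b4);
  ! Residual covariances across conditions (same persons)
  m1 WITH m2 m3 m4;   m2 WITH m3 m4;   m3 WITH m4;
  y1 WITH y2 y3 y4;   y2 WITH y3 y4;   y3 WITH y4;
MODEL TEST:
  ! Joint Wald test of equal paths across the 4 conditions
  0 = a1 - a2;
  0 = a1 - a3;
  0 = a1 - a4;
  0 = b1 - b2;
  0 = b1 - b3;
  0 = b1 - b4;
```

Further cross-condition covariances, for example between the residual of m1 and the residual of y2, could be added in the same way if the variables are to correlate freely.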
Jenny L. posted on Tuesday, April 23, 2013 - 1:38 pm
I have a set of longitudinal data (2 time points). I'd like to see whether the associations among the variables would differ across time.
Given that f1 and f2 are exogenous variables, f3 is a mediator, and f4 is an outcome, I wrote the following codes:
model:
F4_T1 on F3_T1; F3_T1 on F2_T1 F1_T1;
F4_T2 on F3_T2; F3_T2 on F2_T2 F1_T2;
f3_t1 with f3_t2; f4_t1 with f4_t2;
model indirect:
F4_T1 ind F3_T1 F1_T1; F4_T1 ind F3_T1 F2_T1;
F4_T2 ind F3_T2 F1_T2; F4_T2 ind F3_T2 F2_T2;
Does it look reasonable to test two models (T1 and T2) this way? If not, could you please let me know which example model I could use in the users' guide?
It is up to you to determine which parameters you want to compare across time. For example, you can compare f4 ON f3 by labeling and using MODEL TEST to obtain a Wald test.
model: F4_T1 on F3_T1 (p1); F3_T1 on F2_T1 F1_T1;
F4_T2 on F3_T2 (p2); F3_T2 on F2_T2 F1_T2;
MODEL TEST: 0 = p1 - p2;
Gloria Koh posted on Sunday, October 06, 2013 - 2:39 am
Hello Linda & Bengt,
I collected measurements from one cohort of participants at 2 time points. At Time 1, I conducted SEM using 2 categorical latent exogenous variables, 1 endogenous categorical latent variable, 5 endogenous measured categorical variables and 1 endogenous measured continuous variable (outcome).
I repeated the SEM with Time 2 variables. Most of the Time 2 variables are repeated measures except for 2 latent variables that were measured differently.
Repetition of SEM using Time 2 variable will only give me cross sectional SEM. I intend to conduct a longitudinal analysis by including all the Time 1 and Time 2 variables into the SEM model, but due to repeated measures, clustering is a problem.
I am unsure whether a growth model is appropriate with only two time points, given my understanding that growth modelling requires at least 4 time points in Mplus. Correct me if I am wrong.
That leaves multilevel modelling as the alternate option but I am unsure if this is appropriate for what I am intending to do. Another issue is sample size. Due to participants withdrawing from the study at Time 2, I have only just over 200 participants at Time 2. Will multilevel SEM modeling be a feasible option or should I just report cross sectional SEM for Time 2?
In general, and not just in Mplus, it is desirable to have four or more time points for a growth model. With fewer time points, it is difficult to discern a trend. Analyzing the data in long format rather than wide format does not change this.
Gloria Koh posted on Sunday, October 06, 2013 - 10:01 pm
Thank you, Linda & Bengt for your prompt reply.
JMC posted on Wednesday, October 16, 2013 - 8:45 am
Hello Drs. Muthen,
I have been working through my analyses using the great examples in the manual and message board, but got a bit stuck! I had originally performed my analysis using repeated measures in SPSS since I only had three time points and would not be able to get quadratic effects using latent growth modeling. I have been reworking my data using a longitudinal latent SEM, but have some questions. How can I use this to compare means at the three time points? Can I look at linear and quadratic effects? Can I add in covariates?
My problem now is that our Y was measured at 2 time points so I have a repeated measure of Y. How do I specify Y as a repeated measure (just 2 measures at 2 times) in this model? My interests are in the indirect effects.
Any advice is much appreciated!!! Thank you so much!
You can use Type=Complex, where the subject is level 2 and the repeated measures are level 1. Or Type=twolevel, which is more complex. Or, use a wide format, that is, 2 columns for y's 2 timepoints and then let them correlate in some way. With 3 time points you can apply a growth model.
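For the first option, a long-format setup might look like the sketch below, with one row per subject per time point. The variable names (id, time, x, m, y) and the simple mediation structure are assumptions for illustration. TYPE = COMPLEX with CLUSTER = id adjusts the standard errors for the fact that the rows from the same subject are not independent.

```mplus
TITLE:    Sketch of a long-format analysis with Type=Complex
DATA:     FILE = long.dat;           ! hypothetical file name
VARIABLE: NAMES = id time x m y;
          USEVARIABLES = x m y;
          CLUSTER = id;              ! subject identifier
ANALYSIS: TYPE = COMPLEX;
MODEL:
  m ON x;
  y ON m x;
```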
Jingwen posted on Thursday, March 19, 2015 - 2:52 pm
Dr. Muthen, thank you very much! I will check these options.
Jingwen posted on Thursday, March 19, 2015 - 9:01 pm
Hi Dr. Muthen,
I want to follow up with one more question. You suggested "Or, use a wide format, that is, 2 columns for y's 2 timepoints and then let them correlate in some way."
Could you let me know if the following is what you suggested?
M1 ON X (A1); M2 ON X (A2); M3 ON X (A3); M4 ON X (A4);
M5 ON M1 (B1) M2 (B2) M3 (B3) M4 (B4) X (F1);
Y1 Y2 ON M5 (C) X (R);
M1 WITH M2 M3 M4; M2 WITH M3 M4; M3 WITH M4; Y1 WITH Y2;
I am working on a longitudinal (two time points) dyadic model. I wonder whether it would be possible to model change of anger (y, measured at both time points) across time as a function of x1 (measured at time 2) and x2 (measured at both time 1 and time 2). My dyads are husbands and wives. I assume that y will be associated with x1 and x2 across time for both husbands and wives (and between persons/couples). In particular, I am interested in how much variance of y (wife) at time 2 is explained by x1 (time 2) and x2 (time 1 and 2) (husband). Would you model this as a random intercept (x1, x2 across individuals) and random slope (varies across time) model? Many thanks
I would model this in a "doubly-wide" format. So the variables (columns) would include both the 2 time points and male and female (so, for example, you will have 4 y's). No need for random intercepts or slopes. In this format you can include any relation across time and across gender that you want in the model.
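A sketch of such a doubly-wide setup is given below, with one row per couple. The variable names are hypothetical: the first letter after the variable stem marks wife (w) or husband (h), and the final digit marks the time point. The particular paths are only one possible specification.

```mplus
TITLE:    Sketch of a doubly-wide dyadic model (one row per couple)
DATA:     FILE = dyads.dat;          ! hypothetical file name
VARIABLE: NAMES = yw1 yw2 yh1 yh2
                  x1w2 x1h2
                  x2w1 x2w2 x2h1 x2h2;
MODEL:
  ! Wife's time 2 anger on her time 1 anger and husband's x1, x2
  yw2 ON yw1 x1h2 x2h1 x2h2;
  ! Husband's time 2 anger on his time 1 anger and wife's x1, x2
  yh2 ON yh1 x1w2 x2w1 x2w2;
  ! Residual covariance within the couple
  yw2 WITH yh2;
OUTPUT:   STANDARDIZED;              ! includes R-square for yw2 and yh2
```

The R-square for yw2 in this sketch includes the contribution of yw1. To isolate the variance explained by the husband's variables, the model could be compared with one that omits those predictors.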