
Anonymous posted on Monday, March 19, 2001  2:27 pm



I am analyzing a questionnaire with regard to differences by a binary group variable. What would be the most appropriate approach? 1) Multiple group CFA analysis, or 2) a MIMIC model with a direct path leading from the covariate to the factors? Am I mistaken to assume that analysis #1 is concerned with structural differences and #2 with factor mean differences between the groups? Thanks for any input!


I don't see it as an either/or choice. Both models can be used to study measurement invariance and population heterogeneity. The MIMIC model is also a good way to determine which covariates have direct effects and therefore which grouping variables might be important in a multiple group analysis. The advantage of multiple group analysis is that there are more parameters that can be examined for invariance. The MIMIC model can look at differences in intercepts and factor means. The multiple group model can look at these parameters along with factor loadings, residual variances/covariances, factor variances, and factor covariances. Multiple group analysis does, however, require a sufficiently large sample size for each group and can be difficult to carry out with many groups. With smaller samples, the MIMIC model may be more practical.

Anonymous posted on Wednesday, August 27, 2003  6:50 pm



Is it possible to conduct multiple group CFAs to examine invariance of a single-factor scale across those groups while also including a direct path from a separate covariate to the factor, as one would in a MIMIC model? Any help would be greatly appreciated! Thanks!


Yes. See Chapter 18 of the Mplus User's Guide. You would need to add a direct effect to the examples which you would do by adding y1 ON x1, for example, to the MODEL command. 

Anonymous posted on Wednesday, August 27, 2003  8:29 pm



Thank you very much for the rapid response! I'm afraid that in my haste to write the initial message (listed above: Anonymous, Aug 27, 2003), I forgot to mention that the covariate was a dichotomous variable. Would that change your response? Also, to clarify, I would add a direct effect from x1 to the indicator (y1) or a direct effect from x1 to the factor (f1)? Thank you again! 


The answer is still the same. 

Anonymous posted on Thursday, August 28, 2003  9:42 am



Thanks again but I'm afraid I'm still a little unclear regarding the direct effects (Anonymous, August 28, 2003). I would add a direct effect from x1 to the indicator (y1 ON x1) as well as a direct effect from x1 to the factor (f1 ON x1)? Thanks for your time! 


Yes, this is how you would specify it in the MODEL command. 

Hervé CACI posted on Sunday, January 04, 2004  4:49 am



Hello, I have a 4-indicator LV with age as a covariate (F1 ON AGE) with a very good fit, if not perfect, in five samples, even when gender is taken into consideration (i.e., I ran 10 CFAs). The classical multigroup analysis (N=2515) shows that complete factorial invariance does not hold. How can I formulate a model to test for the effect of sample and gender on the items, and the noninvariance of the items across the five samples? Happy New Year 2004.


When you did the classical multigroup analysis, what were your groups and what steps did you carry out?

Hervé CACI posted on Monday, January 05, 2004  10:52 am



Linda, The groups were the language versions of the scale (English, French, Italian, Spanish and Thai). The default setting (MLM estimator) indicated a bad fit (significant chi-squared, CFI0.150), which became better after freeing the path between age and the LV in three groups, then the intercepts of the items, and finally an error covariance between two items in three other groups: chi-squared significant at p=0.01, CFI>0.980 and RMSEA<0.04. This latter modification was suggested by the modification indices under the ML estimator, and might be 'theoretically' justified.


It sounds like you have concluded that the factor is not the same for different language groups. Now it sounds like you want to do a multiple group analysis with gender as the grouping variable and age as a covariate for each language group separately to test the invariance across gender. 

Hervé CACI posted on Wednesday, January 07, 2004  1:29 am



The factor fits in both genders whatever the language, and age is a significant covariate in three language groups (i.e., gender is not taken into account there). 1. When I estimate the model in the entire sample with GENDER as the grouping variable, I get a bad fit: chi-squared(16)=57.682, CFI=0.978 and RMSEA=0.046. I therefore reject the total invariance hypothesis, although the fit can be improved by relaxing some nonsignificant parameters/constraints. 2. When I estimate the model in the entire sample with LANGUAGE as the grouping variable, I get a bad fit. I therefore reject the total invariance hypothesis, although the fit can again be improved by relaxing some nonsignificant parameters/constraints. Now I wish to investigate the reasons for these rejections. The factors may not be the same, or there may be only partial invariance. In the latter case, language and gender (and age?) may affect the items and not the factor. How can I test the effect of language, gender and maybe age on the observed variables to determine which items may be noninvariant? I thought a MIMIC model was the solution.


Ed Rigdon's answer on SEMNET describes what you should do to investigate measurement noninvariance.


Dear Linda & Bengt, what is the suggested procedure when differential item functioning is detected in a factor analysis using background variables? For a given scale (parental monitoring) I found that certain items show significant differences in factor loading when comparing AA to non-AA. I computed factor scores (FS) using the CFA without MIMIC (model 1) and using a CFA including MIMIC (model 2). When I compared the FS, I found significant differences by race for model 1, but not for model 2. Thanks. Best, Hanno

bmuthen posted on Tuesday, February 10, 2004  5:48 pm



It is not clear to me exactly what the models are. You say you found factor loading differences; that must have been using a 2-group analysis. Then you did a MIMIC; was that without a direct effect to the items with different loadings, in order to find differences also in intercepts? Adding a covariate that truly has a direct effect on some items and not including such direct effects in the analysis causes a misspecified model, and that can explain the factor score difference with respect to race. In such a case the CFA results would be more trustworthy.

Hanno Petras posted on Thursday, February 12, 2004  7:29 am



Dear Bengt, thank you for your response and sorry for not being clear enough. In order to explore whether racial differences are "real" or just a result of model misspecification, I ran MIMIC models, using a binary race indicator as a background variable influencing item loadings. I hypothesized that when there are no significant regression paths between the items and the race variable, race differences for that particular construct are real. In fact I found that certain items are influenced by race. I am not quite sure if I read your response correctly, but it sounds like I should also let race influence the factor intercept (f ON race) in addition to the item-specific path. My second question then is: How do you deal with this problem? Is it appropriate to use the factor scores from the MIMIC model to account for variation by race, or are there more appropriate procedures, such as IRT? I am looking forward to hearing your answer. Best, Hanno


In a MIMIC model, the significance of a direct effect, for example y1 ON race, indicates that the intercept of y1 differs across the categories of race. This does not say anything about the factor loadings being different. A multiple group analysis is needed to determine whether factor loadings are different for the different categories of race. You would need the coefficient in the regression of f ON race in order to fully interpret your direct effect. If you have significant direct effects, you should estimate factor scores with these direct effects in the model. Using IRT would not add anything to the analysis. You would obtain essentially the same results.

Anonymous posted on Sunday, May 23, 2004  10:39 pm



Hi there, I have a problem with how to improve model fit in MIMIC. My current RMSEA is a little higher than 0.08. I included demographic information and some disease dummy variables as exogenous variables. The model fit was not better when I included a direct effect of one of the exogenous variables. Does anyone have any ideas on that? Thanks a lot.


Did your model fit well before you added covariates? If not, I would go back to EFA to see where the problems are. 

Anonymous posted on Monday, May 24, 2004  9:32 am



Thank you for your reply, Linda. It is my first time setting up a MIMIC model. Perhaps I don't understand your answer completely. Do you mean I should look at the model fit before adding any x's (exogenous variables)? Then what should I include? Why go back to EFA? Is CFA not a part of MIMIC? To be specific, my scenario is that I am looking at the physical functioning domain of the SF-36 (10 items are my y's), and I have age, gender, race, income, education, marital status, working status, and chronic disease information. Except that age is continuous and the chronic disease variables are dummies, the others are all categorical. Thank you for your time.


A MIMIC model is a CFA model with covariates. You want to investigate your measurement model to be sure it is well-fitting before adding covariates. EFA is a good way to start looking at any factor model. You can see whether your factor indicators behave the way you think they should or whether you have unexpected cross-loadings. An EFA can be followed by an EFA in a CFA framework to investigate the significance of factor loadings. The Day 1 handout from our short courses goes through a series of steps from EFA to a final well-fitting simple structure CFA before turning to MIMIC and multiple group analysis. You might find this handout useful. See our website for details about obtaining course handouts.

Anonymous posted on Friday, July 02, 2004  1:52 pm



Hi there, I am trying to find the code in MPLUS for doing a confirmatory factor analysis of three continuous outcome variables. I would also like to have the goodness-of-fit statistic for each of the alternative models as compared to the null model, which assumes that the three variables are independent. Could anyone let me know whether the following code is correct?
    TITLE: Continuous scale variables (x1, x2, x3)
    DATA: FILE IS C:\DATA.dat ;
    VARIABLE: NAMES ARE X1 X2 X3 ;
      USEVARIABLES ARE X1-X3 ;
      MISSING ARE all (9) ;
    ANALYSIS: TYPE = general ESTIMATOR=mlm ;
    MODEL: X1 BY X2 X3; X2 BY X1 X3; X1 WITH X2 X3;
    OUTPUT: standardized sampstat ;
Note: The chi-square test is called the Satorra-Bentler SCALED chi-square test for GOF. Please let me know as soon as you can which commands I should remove or add in order to serve my purpose. Thank you very much and hope to be hearing from you soon, Sincerely yours, May

bmuthen posted on Friday, July 02, 2004  2:04 pm



This analysis is described in Chapter 5 of the current User's Guide. Here are some mistakes that I see: You should have the name of a factor on the left-hand side of BY (not the name of an observed variable). So a 1-factor model with factor "f" would be f BY x1 x2 x3; Also, you need to have a semicolon after type = general. You should include "missing" in your Type = statement, otherwise you get listwise deletion.

Anonymous posted on Saturday, September 18, 2004  5:05 pm



Thanks a lot for your help on this. Sorry it took so long to respond! Sincerely yours, May 

Anonymous posted on Wednesday, June 01, 2005  4:47 pm



Dear Dr. Muthen, I am confused about the MIMIC model in Mplus. If my covariates are categorical (binary, or with several levels coded 1, 2, 3, 4, say), how does Mplus know that, given that I cannot specify them in the "CATEGORICAL ARE" option, which is only for response variables? Thank you.

Anonymous posted on Wednesday, June 01, 2005  5:11 pm



Dear Dr. Muthen, Following up on my previous question above: I saw a similar post back in 2001, where Linda mentioned that it was not important to specify categorical independent variables. I wonder how Mplus knows which level is the reference level, both for the binary case and for categorical independent variables with more than 2 levels, so that we can interpret the coefficient. Do we actually need to create 0/1 dummy variables for categorical variables with more than 3 levels? For a continuous independent variable, does the coefficient reflect the 1-unit change? Any explanation will be greatly appreciated!

bmuthen posted on Wednesday, June 01, 2005  5:47 pm



Yes, with nominal x variables with C categories you need to create C-1 0/1 dummy variables (like in regression analysis). For continuous x's, you get the 1-unit change coefficient in the standardized columns (the last column standardized with respect to both x and y).

Anonymous posted on Wednesday, June 01, 2005  5:50 pm



That is quick! Thank you! 

Anonymous posted on Thursday, June 02, 2005  9:09 am



Dear Dr. Muthen, I have a further question regarding the MIMIC model. I added several background variables, and without the syntax "ANALYSIS: TYPE = general;ESTIMATOR = wlsmv;", with only the "Model" part, I got the error message "*** FATAL ERROR THE WEIGHT MATRIX PART OF VARIABLE A5 IS NONINVERTIBLE. THIS MAY BE DUE TO ONE OR MORE CATEGORIES HAVING TOO FEW OBSERVATIONS. CHECK YOUR DATA AND/OR COLLAPSE THE CATEGORIES FOR THIS VARIABLE. PROBLEM INVOLVING THE REGRESSION OF A5 ON OTHERR. THE PROBLEM MAY BE CAUSED BY AN EMPTY CELL IN THE JOINT DISTRIBUTION." It ran through when I deleted one background variable. So I guess the problem is sparse data. But when I added "ANALYSIS: TYPE = general;ESTIMATOR = wlsmv;", even with all the background variables, Mplus ran through, but there is no chi-square test of model fit in the output. May I ask why this is happening? I greatly appreciate your help!

bmuthen posted on Thursday, June 02, 2005  12:37 pm



This question is more suitable to send to support@statmodel.com, including your input, output, data, and license number. 

Sanjoy posted on Thursday, June 02, 2005  8:16 pm



Dear (concerned) Anonymous and Prof. Muthen, concerning your June 1st query (5:11 pm posting): should it not be a function of the nature of the covariates? (Professor, please correct me if I'm wrong.) 1. Say our covariate is Rural vs. Urban; yes, we have to use a dummy (1/0). 2. Say our covariate is people from different groups, like low (1), middle (2), high (3), rich (4); in this case we need not use dummies, we can simply keep (1, 2, 3, 4) as the values of the covariate. A "dummy" variable is a kind of latent variable, i.e., something we cannot see; we attribute something "unable to trace" to the dummy effect. Therefore, when we use a dummy for the first case, we inherently assume there is something in the city setup which is not present in the rural area and which we are unable to decipher. For the second case it's not like that: even if we assume that income is latent (in the sense of not being revealed properly), the mapping is simply monotonic from the SAME domain onto the SAME range. Now, coming to the marginal effects of DUMMY covariates (W.H. Greene, Econometric Analysis, page 668): it's appropriate to apply the following rather than the conventional way: Marginal Effect = Prob[Y=1 | means of all other variables, dummy=1] - Prob[Y=1 | means of all other variables, dummy=0]

bmuthen posted on Friday, June 03, 2005  6:08 pm



I think your answer for your case 2. refers to a discrete variable that can be treated as continuous, not a categorical (nominal or ordinal) variable. With a categorical variable that is nominal (unordered) you need to dummy code several 0/1 variables. You also need to do this for an ordinal variable if you don't believe in equidistance of your category values 1, 2, 3, 4. A dummy variable is not a latent variable but an observed variable created from another observed variable. The marginal effect statement makes sense assuming y is binary; I see this as a conventional representation (where you don't have to condition on the means of the other variables, but can use any value of interest to you). 
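The marginal effect of a dummy covariate discussed above can be sketched numerically. This is only an illustration under stated assumptions: a probit link for a binary y, and made-up coefficient values that do not come from any model in this thread. It shows that the difference in probabilities can be evaluated at any value of the other covariate, not only at its mean.

```python
from statistics import NormalDist


def prob_y1(intercept, slope_x, slope_dummy, x, dummy):
    """Probit probability P(y = 1 | x, dummy)."""
    return NormalDist().cdf(intercept + slope_x * x + slope_dummy * dummy)


def dummy_marginal_effect(intercept, slope_x, slope_dummy, x):
    """P(y = 1 | x, dummy = 1) minus P(y = 1 | x, dummy = 0)."""
    return (prob_y1(intercept, slope_x, slope_dummy, x, 1)
            - prob_y1(intercept, slope_x, slope_dummy, x, 0))


# Hypothetical coefficients, chosen only for illustration.
INTERCEPT, SLOPE_X, SLOPE_DUMMY = -0.5, 0.3, 0.8

# Evaluate the effect at two values of the other covariate x.
effect_at_zero = dummy_marginal_effect(INTERCEPT, SLOPE_X, SLOPE_DUMMY, x=0.0)
effect_at_two = dummy_marginal_effect(INTERCEPT, SLOPE_X, SLOPE_DUMMY, x=2.0)

print(round(effect_at_zero, 4), round(effect_at_two, 4))
```

Because the probit curve is nonlinear, the same dummy coefficient gives a different change in probability depending on where x is held.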

Sanjoy posted on Friday, June 03, 2005  10:31 pm



Thank you, Professor. A) Yes, under the assumption that we won't lose much information if we keep them as continuous. In fact, in earlier days (I mean prior to your 1983 work), people used to treat even ordinal DEPENDENT variables as continuous; their argument was that unless the ordinality causes too much skewness, and there are 9 to 11 divisions on the Likert scale, it's OK to treat them as continuous. Of course, now that we have WLSMV, we can treat dependent ordinal variables as categorical. B) Now, Professor, I have a request cum question. I have a total of 10 covariates in my SEM, and all of them are in fact ordinal (on a 5-point scale) except two (Male/Female, employed/unemployed = 0/1 dummies). Q1. In order to use dummies, I will then end up with 8*4=32 dummies. Won't that make things too messy? I mean, is it worth doing? Q2. Of course, as you have suggested, the "linearity"/equidistance criterion need not hold true in real life. I guess because of that people sometimes use X^2 or log(x), I mean some sort of "relevant" nonlinear transformation. What's your opinion, Professor? NB: The reason I said "a dummy variable is a kind of latent variable" is that I actually meant a dummy variable "represents" a latent/unobservable variable. Usually, in econometrics, when we have a DEPENDENT (dummy) variable, say a choice between BART/private car = (1/0), we assume there is an underlying utility (latent/unobservable) which is being mapped (technically being maximized; RUM framework, McFadden) onto the choice. Thanks and regards

bmuthen posted on Sunday, June 05, 2005  10:14 am



There's the principle and then there is practice. In practice, I would treat ordinal x's as if they were continuous (but nominal broken up into dummies). Yes, nominal dependent variables can certainly be thought of via continuous latent variables seen as "utilities". 

wader posted on Monday, June 06, 2005  9:51 am



Dear Dr. Muthen, I am running a MIMIC model to check for measurement noninvariance, i.e., direct relationships between the covariates and the outcome variables (indicators) that are not mediated by the factors. I am following the online lectures (#3) given by Dr. Bengt O. Muthen at UCLA, and using the handouts of "Mplus Short Courses - Traditional Latent Variable Modeling Using Mplus". I have three questions, all of which relate to the second step on page 74 of the handout ("Steps In CFA With Covariates"): 1. According to page 69 of the handout, we can check "Measurement Invariance" by comparing the number of factors, the equality of factor loadings, and the equality of intercepts between the CFA models with and without covariates. How, then, do we define "Equality"? In the lecture, Dr. B. Muthen mentioned that "The measurement structure hasn't changed very much at all" by comparing the "Model Results" between these two models. But, to be considered "Equal", how small do the differences between the factor loadings and intercepts need to be? Is there a "cutoff" value? If there is not, how can we deal with this "Test difficulty" (refer to p. 69 of the handout)? 2. For "Modification Indices (M.I.)", what is the cutoff value which indicates a direct effect of a covariate on an outcome variable (indicator)? In the lecture, Dr. B. Muthen chose the MIs that were "much higher than others" (refer to p. 85 of the handout). In the example, the M.I.s that Dr. B. Muthen chose were 26.616 and 31.730, and those with values of 12.715 or smaller were considered nonindicative of direct effects. However, in the model I am running, I got some M.I.s greater than 3.84 and less than but close to 10.0. How should I deal with this kind of situation? 3. What is the relationship between the two parts of this "second step"? I.e., is there any difference between the implications of comparing the "Model Results" and checking the M.I.s? To my understanding from following the lecture, while the former deals with the "Stability of the measurement structure", the latter checks for possible direct effects of the covariates on the indicators, that is, for possible "measurement noninvariance due to differential item functioning (DIF)" (refer to p. 73 of the handout). Can we interpret the meanings of these two parts as the same? Thank you very much. Sincerely, wader

bmuthen posted on Monday, June 06, 2005  5:15 pm



To understand these issues well, a person has to read an SEM book such as Bollen (1989). To get an introductory understanding, other more entry-level SEM books may be more suitable. My UCLA lectures do not cover introductory SEM, but merely give a quick review of it. I can, however, give brief answers to your 3 questions: 1. Equality is statistically tested by the series of tests reported on page 87 of the handout. (Note, however, that as usual you can have statistical significance without practical significance, and the practical significance is up to the researcher.) 2. MIs give information on how to modify a baseline model, in this case a model without direct effects. Since MIs are chi-square tests of the need to free up a parameter, they are 1-df chi-square test values and therefore the 5% critical value is 3.84. In practice, however, you want to free only parameters with much larger MIs. By choosing the 2 largest I am finding out how different the estimates will be for such a large MI, thereby addressing the practical significance matter. So in your case, free only the parameter with MI=10 and see how the parameter estimates come out. 3. Model relaxation based on MIs only refers to relaxing one parameter at a time, whereas the testing reported on page 87 refers to relaxing several parameters.
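The 3.84 cutoff and the MI values mentioned in this exchange can be checked with a few lines of arithmetic. The sketch below treats each MI as a 1-df chi-square statistic, as described above; the helper name `mi_p_value` is made up for illustration and is not part of Mplus. It relies on the identity that a 1-df chi-square variable is a squared standard normal.

```python
import math
from statistics import NormalDist


def mi_p_value(mi):
    """P-value of a modification index treated as a 1-df chi-square value."""
    # For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(mi / 2.0))


# The 5% critical value is the squared two-sided 5% normal quantile.
critical_5pct = NormalDist().inv_cdf(0.975) ** 2

# MI values quoted in the posts above.
for mi in (3.84, 10.0, 12.715, 26.616, 31.730):
    print(f"MI = {mi:7.3f}  p = {mi_p_value(mi):.2e}")

print(f"5% critical value: {critical_5pct:.2f}")
```

An MI near 10 is already far beyond the nominal 5% level, which is why the reply stresses practical significance (how much the estimates change) over the p-value alone.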

Anonymous posted on Tuesday, September 13, 2005  8:49 am



I am using MIMIC to examine possible reporting differences associated with a demographic variable X1 on a latent variable F1 assessed by indicators Y1-Y5. Can a MIMIC model include more than one direct effect simultaneously (e.g., X1 -> Y1 plus X1 -> Y2)?


Yes. It cannot include all direct effects because it would not be identified. But more than one direct effect can be included as long as the model is identified.
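Why a model with all direct effects is not identified can be illustrated numerically. The sketch below assumes a one-factor MIMIC model with a single covariate, f = gamma*x + zeta and y_j = lambda_j*f + beta_j*x + e_j, with made-up parameter values. Substituting f shows that the implied moments depend on gamma and beta_j only through the total effect lambda_j*gamma + beta_j, so two different parameter sets can imply exactly the same covariances.

```python
import math


def implied_moments(loadings, gamma, betas, psi, thetas, var_x):
    """Implied cov(y, x) and cov(y, y) for the one-factor MIMIC model above."""
    # Total effect of x on each indicator: via the factor plus the direct path.
    total = [lam * gamma + b for lam, b in zip(loadings, betas)]
    cov_yx = [t * var_x for t in total]
    p = len(loadings)
    cov_yy = [[loadings[j] * loadings[k] * psi          # factor residual part
               + total[j] * total[k] * var_x            # part explained by x
               + (thetas[j] if j == k else 0.0)         # indicator residuals
               for k in range(p)] for j in range(p)]
    return cov_yx, cov_yy


def same_moments(m1, m2):
    """True if two (cov_yx, cov_yy) pairs agree up to rounding error."""
    flat1 = m1[0] + [v for row in m1[1] for v in row]
    flat2 = m2[0] + [v for row in m2[1] for v in row]
    return all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(flat1, flat2))


# Hypothetical parameter values, for illustration only.
loadings, thetas = [1.0, 0.8, 0.6], [0.5, 0.4, 0.3]
psi, var_x = 1.0, 0.25
gamma_a, betas_a = 0.5, [0.2, 0.1, -0.1]

# Shift gamma and compensate in every direct effect.
shift = 0.3
gamma_b = gamma_a + shift
betas_b = [b - lam * shift for lam, b in zip(loadings, betas_a)]

moments_a = implied_moments(loadings, gamma_a, betas_a, psi, thetas, var_x)
moments_b = implied_moments(loadings, gamma_b, betas_b, psi, thetas, var_x)
print(same_moments(moments_a, moments_b))
```

The compensation only works when every direct effect is free to move. Fixing at least one direct effect at zero removes that freedom in this simple setup.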

Anonymous posted on Tuesday, September 13, 2005  12:12 pm



I looked at the direct effects of X1 on each indicator individually, and found all but one of these direct effects to be statistically significant. Would an appropriate next step be to include all of the significant direct effects simultaneously and refit the model iteratively, removing those that become nonsignificant when in the model with other direct effects (analogous to a backward elimination regression procedure)?

bmuthen posted on Tuesday, September 13, 2005  4:34 pm



That seems like a reasonable approach. 

rpaxton posted on Saturday, November 19, 2005  8:41 pm



If you are moving data from SPSS to Mplus and want to determine whether your latent variables are longitudinally invariant, is there a special way that you have to restructure your data to have a grouping variable? Or should you just model each time point as a separate latent variable, as depicted below? MODEL: time1 by a b c d; time2 by a b c d;


If you are looking at invariance over time, you do not need a grouping variable. You would just constrain parameters equal over time. It would be similar to Example 6.14 in the Mplus User's Guide but without the growth factors. 

joshua posted on Friday, November 25, 2005  1:08 pm



Hi, I ran a multigroup analysis and I have a negative residual variance for one of my variables in, say, group 2. How would I be able to rectify this? Thanks in advance for the help.

bmuthen posted on Saturday, November 26, 2005  6:35 am



You can fix it at zero: y@0; Or, respecify the model in other ways. 

joshua posted on Saturday, November 26, 2005  2:33 pm



Thanks Bengt. 1. Are there implications arising from fixing a residual variance to 0? Is fixing it to 0 the same as stating that the indicator is a perfect measure of the latent variable? If so, is this possible? 2. I'm a new convert from AMOS to Mplus and, if I remember correctly, AMOS does not allow for correlated endos in the context of a full latent variable model. I realised that in a full latent variable model where there are, say, 2 endos and 2 exos, the 2 endos are allowed to be correlated in Mplus (judging from the WITH statements in the output). Does this mean that in a full latent variable model, we need to fix all the "with" statements manually to 0? Thanks once again.


1. Fixing at zero does imply perfect reliability for that indicator. But a small sample can also give a perfect reliability estimate even when the true reliability is < 1. 2. You can identify correlated residuals for endogenous factors. If you don't want those residuals correlated, fix the covariance at zero using @0 in the WITH statement.
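The link between a zero residual variance and perfect reliability follows from the usual reliability formula for a single indicator, sketched below with made-up numbers. The function name and values are illustrative assumptions, not output from any model in this thread.

```python
def indicator_reliability(loading, factor_variance, residual_variance):
    """Share of an indicator's variance that is explained by the factor."""
    true_variance = loading ** 2 * factor_variance
    return true_variance / (true_variance + residual_variance)


# Hypothetical indicator with loading 0.8 on a unit-variance factor.
typical = indicator_reliability(0.8, 1.0, 0.36)

# The same indicator with its residual variance fixed at zero (y@0).
fixed_at_zero = indicator_reliability(0.8, 1.0, 0.0)

print(round(typical, 2), fixed_at_zero)
```

With the residual variance at zero, all of the indicator's variance is attributed to the factor, which is a strong claim to build into a model.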

joshua posted on Sunday, November 27, 2005  3:14 pm



Thanks Linda. 1. When I fix, say, M to 0, my StdYX of M ON CBE = 1.53 in one of the two groups of my multigroup analysis. However, fixing it to 0.1 fixes the problem. Is my action appropriate from a theoretical standpoint? 2. According to Example 5.11 of the Mplus User's Guide, the latent endogenous variables are not allowed to correlate by default. Why do I then have a series of WITH statements in my output between the latent endos? Does this default hold true only when we run a single-group analysis? Thanks once again.

joshua posted on Sunday, November 27, 2005  3:25 pm



Hi Linda, I have one more thing to add. With reference to point 2 above, when I fix the correlations between my latent endos to 0, Mplus gives me this statement: "Unknown variable(s) in a WITH statement: WITH". If this variable is not defined by default, why then do I have the results estimated in the output?


1. Standardized regression coefficients can be greater than one. I think that is what you are getting at. 2. Exogenous factors are correlated as the default. That is why you see them in your output. I would have to see your output to know why you are getting this error message. It sounds like you may not be spelling a variable name correctly or don't have a semicolon at the end of a WITH statement. 
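That a standardized regression coefficient can exceed one is easy to verify by hand when predictors are highly correlated. The sketch below assumes y = b1*x1 + b2*x2 + e with unit-variance predictors and made-up values; it is not a reconstruction of the model in this thread.

```python
import math


def stdyx_slopes(b1, b2, corr_x, resid_var):
    """StdYX slopes for y = b1*x1 + b2*x2 + e with unit-variance x1 and x2."""
    var_y = b1 ** 2 + b2 ** 2 + 2 * b1 * b2 * corr_x + resid_var
    sd_y = math.sqrt(var_y)
    # With sd(x) = 1, the standardized slope is simply b / sd(y).
    return b1 / sd_y, b2 / sd_y


# Hypothetical values: strongly correlated predictors with opposite-sign slopes.
std_b1, std_b2 = stdyx_slopes(b1=1.5, b2=-1.0, corr_x=0.9, resid_var=0.1)

print(round(std_b1, 3), round(std_b2, 3))
```

Here the residual variance is positive and R-square is below one, yet the first standardized slope is well above one, so a value above one is not by itself a sign of an inadmissible solution.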

joshua posted on Monday, November 28, 2005  5:35 pm



Referring to point 2, I know exos are correlated by default. It is the endos that I'm worried about. I didn't set the correlations but they appear in my output. Thanks. 


Residuals for endogenous variables may be correlated by default to reflect common model setups. If you don't want them correlated, just say e.g. y1 with y2@0; 

joshua posted on Tuesday, November 29, 2005  4:03 am



Thanks Linda, 1. I've correlated a pair of observed indicators, one from each latent endo. Is this the cause of the output showing correlated residuals for the latent endogenous variables? Btw, I've tried fixing the correlated residuals to 0. The model became undefined for one of the two groups in multigroup analysis. 2. You mentioned two posts ago that standardized regression coefficients can be greater than one. If this is the cause of the warning that is being triggered in the Mplus output, I shouldn't be worried about it, right? 


I think you need to send your input, data, output, license number and specific questions related to both 1 and 2 to support@statmodel.com. There is not enough information in your post to fully understand your question. 

R. Paxton posted on Tuesday, November 29, 2005  7:06 am



If I want to determine whether item uniqueness is invariant across groups, how do I specify it in the MODEL statement?


To hold a residual variance equal across groups state in the MODEL command: y1 (1); 


Is it possible to test for invariance of factor loadings in a MIMIC model? I am wondering if I could accomplish that by creating an interaction term between the latent factor and the observed covariate (e.g., gender) and regressing the indicators on the interaction term and gender. For example:
    ANALYSIS: type = missing H1 random;
    MODEL: ANTISOC5 by YR5DOL5r* YRRUIN5r YRTAKE5r;
      ANTISOC5@1;
      ANTISOC5 on female;
      int | female XWITH ANTISOC5;
      YRRUIN5r on antisoc5 female int;
I am able to estimate this model, but I am wondering if it is an appropriate way to test the invariance of loadings.


I think this works. You can test it out with a Monte Carlo simulation. It's an alternative to multiple-group modeling. I don't think you have to say YRRUIN5r on antisoc5 in the last line, because you already have that regression from your BY statement (but it probably doesn't hurt).


Thanks! Yes, I tried it both ways and I get the same results. Including the basically redundant YRRUIN5r on ANTISOC5 statement does not hurt. The model runs either way. 

Arpana posted on Tuesday, May 22, 2007  7:47 am



Dr. Muthen, Is there a way to get 95% CI for standardized thresholds and factor loadings (or for difficulty and discrimination) from a multiple group CFA? The CINT option appears to provide them for the raw estimates only. Thanks! 


Confidence intervals for standardized parameters are not currently available in Mplus. 

wai posted on Wednesday, January 30, 2008  10:19 am



Dear Linda and Bengt, Previous examples of MIMIC involve a two-level categorical covariate. I have a nominal covariate, and I have run into problems. I want to test the effect of sex, age and a nominal variable U (1/2/3/4 as four classes) on f1 and f2 in a MIMIC model. U: 1=control; and 2, 3, 4=three disorders. MODEL: f1 by y1-y4; f2 by y5-y8; f1 f2 ON sex age U; Mplus doesn't allow U to be defined as a 'categorical' or 'nominal' variable in this model. Mplus then treated U as a continuous variable with a 1-4 range in the OUTPUT. I tried splitting U into 4 dummy variables, but I have problems getting a stable comparison group. For example, when I recoded '3' as '1', then '0' consisted of '1', '2' and '4'. When I recoded '4' as '1', then '0' consisted of '1', '2' and '3'. Mplus doesn't recognize that '1' is the 'control' and the intercept, and will yield an output for the dummy variable of category 1, comparing it with 0 (consisting of 2, 3, 4). I tried to preserve only 'controls' as 0 and recode the other, redundant disorder groups as 'missing'; then Mplus cannot identify 'two-level' in this new dummy variable with missing values. Advice is much appreciated. With kind regards, Wai


A nominal variable with four categories is represented by three dummy variables. The category without its own dummy variable is the reference category.

yang posted on Friday, February 01, 2008  6:47 am



Drs. Muthen, I am studying the effects (both the main effects and the interaction effect) of two covariates (one continuous, one binary) on a second-order factor structure (SF-36, with 35 ordinal indicators, 8 first-order factors, and 2 second-order factors). I am interested not only in the population heterogeneity (the indirect effects of the 2 covariates and their interaction term on the 35 indicators, mediated by the first- and second-order factors), but also in the Differential Item Functioning (DIF) of the 2 covariates and their interaction term on the 35 indicators, i.e., their direct effects on the indicators, not mediated by the first- and second-order factors. I know how to do the iterative DIF tests using the DIFFTEST option in Mplus for a first-order factor structure, but I do not know how to do it for a second-order one. Would you please give me some instructions? Thank you very much. P.S. Here is the link for the measurement model of the SF-36, just FYI. http://www.sf36.org/tools/sf36.shtml#CONSTRUCT Sincerely, Yang


I don't believe there is any difference in direct effects for a second-order factor model versus a first-order factor model. The direct effects are the regressions of the factor indicators on the covariates.

yang posted on Friday, February 01, 2008  12:33 pm



Linda, Thank you for your nice instructions. In applying MIMIC to such a second-order measurement model, what bothers me is that there seem to be 3 levels/types of 'direct effect' (please correct me if I am wrong): 1. Covariates to indicators; 2. Second-order factors (e.g., PCS and MCS in SF-36) to indicators; 3. Covariates to first-order factors (e.g., the 8 scales in SF-36). Do you mean that we can still use the DIFFTEST option to check ALL of these 3 levels of direct effect? Or should/can we only do it at the first level (covariates to indicators) and ignore the remaining two? Thanks. Yang 


You can do any of the tests you mention. You do not need DIFFTEST if you have only one group. If you have a single-group MIMIC model, you can use the ratio of the parameter estimate (column three of the output) to the standard error of the parameter estimate to test for significance. 
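
As a rough illustration of that ratio test, the Python sketch below computes the estimate-to-standard-error ratio and a two-sided p-value. It assumes the ratio is referred to a standard normal distribution (a large-sample approximation); the function name wald_z_test and the example numbers are hypothetical and do not come from any Mplus output.

```python
import math


def wald_z_test(estimate, std_error):
    """Return (z, two-sided p) for estimate / std_error.

    Assumes the ratio is approximately standard normal in large samples.
    """
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical direct effect of a covariate on an indicator.
z, p = wald_z_test(estimate=0.5, std_error=0.25)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A ratio larger than about 1.96 in absolute value corresponds to p < .05 under this approximation.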

wai posted on Tuesday, February 05, 2008  1:25 pm



Dear Linda, Would you please advise how to generate the three dummy variables for a 4-class nominal covariate, as posted on Wednesday, January 30, 2008  10:19 am? Wai 


DEFINE:
  newvar1 = 0;
  newvar2 = 0;
  newvar3 = 0;
  IF (nominal EQ 1) THEN newvar1 = 1;
  IF (nominal EQ 2) THEN newvar2 = 1;
  IF (nominal EQ 3) THEN newvar3 = 1;

Alex Gamma posted on Tuesday, June 02, 2009  6:59 am



Drs. Muthen, I'm running a multigroup CFA (the groups are geographical regions) with binary items. I also have two covariates (sex, age). I want to test measurement invariance and population heterogeneity. Question: at which point to enter the covariates into the model? My solution has been to enter the covariates at the first step already, in which I test a model where all loadings are estimated freely across all groups. I first specify, for every group (geographical region) separately, the significant direct covariate effects on the items using the mod indices. Once I have these, I continue constraining factor loadings to equality across groups etc. Is that procedure correct? Thanks, Alex 


The MIMIC model can assess only intercept invariance, whereas the multiple group model can assess both intercept and factor loading invariance. In our experience, measurement non-invariance shows up in intercepts more often than in factor loadings. Because of this, I would start with a MIMIC model using the covariates of region, age, and sex and possibly their interactions. See the Topic 1 course handout where we do this. Then go on to a multiple group analysis, using as grouping variables the covariates where measurement non-invariance shows up. 


Hi, Is it possible to conduct a multigroup (two groups) CFA analysis over time (i.e., two time points)? I have a clinical trial dataset with patients assigned to two treatment strategies. I am using SF-36 as the survey instrument. In addition, I have a lot of other variable information, e.g., sociodemographic variables and comorbid conditions. I want to assess measurement invariance between the two treatment groups. Should I conduct a multigroup analysis at both baseline and follow-up, or should I just conduct it at follow-up? Do I need to regress on other covariates? Thank you for your help! 


Multiple group analysis assumes that the groups are independent. Instead you can assess measurement invariance across time by imposing constraints over time. See the Topic 4 course handout for multiple indicator growth. The beginning of this example shows how to test measurement invariance across time. 

Bruce Hanik posted on Wednesday, November 18, 2009  1:50 pm



Dr. Muthen, I am attempting a multigroup analysis with categorical data. I am following the steps covered in your short course at the Bloomberg School of Public Health. I fit the groups separately. Now I'm trying to fit the groups together, but I'm getting the following error:

Based on Group 0: Group 1 contains inconsistent categorical value for Q25: 2

I cannot figure out what this means. My syntax is as follows:

data: file is C:\Documents and Settings\bhanik\Desktop\SASODE\SASODE BFA.txt;
variable: names are group V11 Q14 Q20 Q22 Q23 Q24 Q25 Q27 Q30 Q31 Q35 Q36 Q37 Q38 Q41 Q44 Q45 Q46 Q47;
  categorical are all;
  missing are all (999);
  grouping is group (0=old 1=new);
  usevariables are V11-Q47;
  !useobservations are (group eq 0);
analysis: estimator= wlsmv;
model: f1 by V11 Q14 Q25 Q27 Q30 Q31 Q24;
  f2 by Q35 Q36 Q37 Q38 Q41 Q45 Q30;
  f3 by Q44 Q45 Q46 Q47;
  f4 by Q22 Q23 Q20 Q31 Q30 V11 Q27;
model new: f1 by Q14 Q25 Q27 Q30 Q31 Q24;
  f2 by Q36 Q37 Q38 Q41 Q45 Q30;
  f3 by Q45 Q46 Q47;
  f4 by Q23 Q20 Q31 Q30 V11 Q27;
output: modindices(5) tech4 ;

Any help is much appreciated. 


It means that you don't have the same values of the categorical variable Q25 in all groups. This is required. You may need to collapse categories. 
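
One way to find such variables before running the multiple group model is to compare the observed category sets by group. The Python sketch below is only an illustration under stated assumptions: the function name inconsistent_categories, the toy rows, and the use of 999 as the missing code (taken from the posted syntax) are illustrative choices, and this is not what Mplus does internally.

```python
def inconsistent_categories(rows, group_key, variables, missing=999):
    """Map each variable to {group: sorted observed values} when the
    groups do not share the same set of observed categories."""
    problems = {}
    for var in variables:
        seen = {}
        for row in rows:
            value = row[var]
            if value == missing:
                continue  # skip missing-data codes
            seen.setdefault(row[group_key], set()).add(value)
        # More than one distinct category set means the groups disagree.
        if len({frozenset(v) for v in seen.values()}) > 1:
            problems[var] = {g: sorted(v) for g, v in seen.items()}
    return problems


# Hypothetical data: category 2 of Q25 is observed only in group 1.
rows = [
    {"group": 0, "Q24": 1, "Q25": 0},
    {"group": 0, "Q24": 0, "Q25": 1},
    {"group": 1, "Q24": 0, "Q25": 2},
    {"group": 1, "Q24": 1, "Q25": 0},
    {"group": 1, "Q24": 1, "Q25": 1},
    {"group": 1, "Q24": 999, "Q25": 1},
]
report = inconsistent_categories(rows, "group", ["Q24", "Q25"])
print(report)
```

Any variable that appears in the report is a candidate for collapsing categories so that every group shows the same set of values.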


Dear Dr. Muthén, I am running a multigroup CFA analysis with 2 groups. The fit indices indicated good fit in both groups separately. However, when I performed the multigroup analysis, the fit indices indicated barely adequate fit. How is that possible? What can I do now? Thank you. Best regards, Robert 


For multiple group analysis, the default in Mplus is to hold the intercepts and factor loadings equal across groups representing measurement invariance. These equality constraints are what cause the fit to be poor. See the Topic 1 course handout under multiple group analysis to see how to relax these equality constraints and test for measurement invariance. 

Jiyeon So posted on Saturday, April 14, 2012  1:05 am



Hi Prof. Muthen, I have a small sample (N = 500) and I want to see if being exposed to different message features (two message conditions) makes a difference in the hypothesized SEM model. My model has a latent variable as a mediator, and all other variables are observed variables. If I split the sample into two condition groups, there are far fewer than 10 cases per parameter, so I'm reluctant to do so. Can I add the "message condition" variable as a covariate in the model instead? Would doing this make the model a MIMIC model? If this is a valid way to see if the condition made a difference in the patterns of association, what should I look for in the model to see if this variable in fact makes a difference? Thank you very much in advance!! Jiyeon 


You can use a MIMIC model instead of a multiple group model. To answer your question, you will need to interact condition with the observed exogenous covariate and the factor, for example, f ON x c xc; y ON f c xf; 

Jiyeon So posted on Sunday, April 15, 2012  7:40 pm



Thank you very much! I have a few more questions, and I would really appreciate it if you could provide some insights on these as well. 1. This strategy (of using a MIMIC model as opposed to multigroup) is applicable in an SEM context, right? (Not just for CFA, that is.) 2. In my case, I was thinking that the "condition" is the "covariate" that I should consider. Is that right? 3. If Question #2 is right, I'm confused about what you meant by "interact condition with the observed covariate." What if I am not considering adding a covariate besides the condition variable? Thank you very much in advance! Jiyeon 


1. Yes. 2. You said you have a latent mediator. To me this implies you have a covariate and a final outcome, like x -> f -> y. You said you also have condition that you want to use as a moderator. This implies an interaction between x and condition and between f and condition, as shown above. 


Dear Linda and Dr. Muthen, I'm a master's student in the Faculty of Psychology at the Islamic University of Syarif Hidayatullah, Jakarta, Indonesia. I am now studying how to detect differential item functioning (DIF) with multigroup analysis and IRT (2PL). Can you help me use Mplus for this analysis? Also, what formulas are used to calculate item bias? Thanks. 


See multiple group analysis in the Topic 2 course handout and video on the website for detecting DIF for binary items. See multiple group analysis in the Topic 1 course handout and video for a more general discussion of DIF. 


Thanks for your response, Linda. I'll study DIF with this handout and video. It was nice discussing this with you. 


Hello Dr. Muthén: I am running a multigroup CFA to test for factorial invariance, and I read somewhere that the WLSMV estimator does not allow for direct comparisons of the change in CFI, RMSEA, and WRMR. Is this still the case in version 7? If it is, is there any alternative for assessing the change in model fit besides using the change in chi-square with the DIFFTEST procedure? Thanks! 


No, you need to use DIFFTEST to test for factorial invariance. See the Version 7.1 Mplus Language Addendum on the website with the user's guide. We have added an automatic way to do this. See convenience features for multiple group analysis. 


Thank you Dr. Muthen. I will try that! 


Dear all, I'm currently using a MIMIC model to take into account the clustering of my dataset (the data come from different cohorts, in an individual patient data meta-analytical framework). My covariates are dummy variables for each study level. I did not use multilevel CFA because there are not many levels (n=8). First, I want to compare the fit of two factor solutions for a self-report questionnaire. One model was derived from EFA in my data and the other comes from the literature. Q1. Should I compare the goodness of fit within the MIMIC framework, or should I do it using a "standard" CFA? (I get slightly different results in terms of which one has the best fit.) Afterwards, I want to generate factor scores and include them as predictors in a multilevel Cox regression (frailty models). Q2. Is it defensible to use the factor scores generated in the MIMIC model, or should I stick to the standard CFA for generating them? I appreciate any sort of advice! Best regards, Ricardo 


I would use the MIMIC model in both cases since it takes into account the study level. 
