
Utkun Ozdil posted on Thursday, February 03, 2011  11:27 am



Hi, My question is about coding categorical covariates in the Mplus data file. In my model I want to test the effects of gender, socioeconomic status, and school type on mathematics achievement. Socioeconomic status has 4 categories and school type has 3 categories. Do you suggest dummy coding for these variables? For instance, should I create 3 dummies for SES and 2 dummies for school type and then include these dummy variables in the MODEL part, or should I leave them as they are in the data file (without creating any dummies)? I was also unsure whether a multigroup analysis would be more appropriate. Thanks... Utkun


Covariates must be binary or continuous. If you do not recode nominal variables to a set of dummy variables, they will be treated as continuous in model estimation. With ordinal variables, you can create a set of dummy variables or treat them as continuous.


Is there an efficient way to create the dummy variables in MPLUS? I have 55 groups in a nominal variable that I need to dummy code. I know this can be done using the DEFINE command, but will I need to manually list out all 55 dummy variables? 


No, there isn't an efficient way. But perhaps you want to treat group as random when you have that many? Or use a multiple-group approach.
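If the dummy-covariate route is chosen anyway, the 55 sets of statements do not have to be typed by hand: they can be generated outside Mplus and pasted into the input file. Below is a minimal sketch in Python, not an Mplus feature. The variable name `group`, the prefix `d`, the codes 1 to 55, and the choice of group 1 as the reference category are all assumptions for illustration. The generated statements follow the same IF ... THEN pattern used elsewhere in this thread. How cases with a missing value on `group` should be handled is not addressed here and should be checked separately.

```python
import textwrap


def dummy_define_lines(source, levels, reference, prefix="d"):
    """Build Mplus DEFINE statements that dummy code `source`.

    Returns (lines, names): the statements and the new variable names.
    The reference category gets no dummy, so k levels yield k - 1 dummies.
    """
    lines, names = [], []
    for level in levels:
        if level == reference:
            continue  # reference category: all dummies are 0
        name = f"{prefix}{level}"
        names.append(name)
        lines.append(f"IF ({source} EQ {level}) THEN {name} = 1;")
        lines.append(f"IF ({source} NE {level}) THEN {name} = 0;")
    return lines, names


# Hypothetical setup: a nominal variable "group" coded 1..55, group 1 as reference.
lines, names = dummy_define_lines("group", range(1, 56), reference=1)

print("DEFINE:")
for line in lines:
    print("  " + line)

# Variables created in DEFINE go at the end of the USEVARIABLES list.
# The names are wrapped so that no input line becomes too long.
print("! new variables to append to USEVARIABLES:")
print(textwrap.fill(" ".join(names), width=70,
                    initial_indent="!   ", subsequent_indent="!   "))
```

The same function covers the smaller cases from the first question, for example `dummy_define_lines("ses", range(1, 5), reference=1, prefix="ses")` for a four-category SES variable, which yields three dummies.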


Hi, I am using a latent growth curve modeling approach. How might I use these techniques for this statistical approach? I have been watching your short course topic 3 and following along in the slides, but not sure how to bring in such a large number of groups. Thanks! 


See the paper on our website: Muthén, B. & Asparouhov, T. (2016). Recent methods for the study of measurement invariance with many groups: Alignment and random effects. (Download scripts). Although that paper talks about a factor analysis model, you should note that such a model is essentially equivalent to a growth curve model so the modeling alternatives that are discussed apply. 

Sara Namazi posted on Wednesday, August 08, 2018  6:23 am



Hi Dr. Muthen, I have a question: In my analysis, I am interested in controlling for the effect of female. My demographic gender variable is coded as 1=female, 2=male. In Mplus, I recoded this variable as: !female=sex; if (Sex eq 2) then female=0; if (Sex eq 1) then female=1; Does this look correct? When I tried to control for the effect of male, I changed the coding: !male=sex; if (Sex eq 2) then male=1; if (Sex eq 1) then male=0; The results came out the same for both male and female in terms of fit statistics and beta coefficients. Am I coding this incorrectly? Thanks, Sara


You control for gender, not for female or male. That is, you control for a variable, not values of the variable. A simple way to define the variable is female = 2 - gender; Just use this female variable in your model.
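To illustrate why the two codings cannot give different fit, here is a small sketch in Python with made-up numbers. A simple one-predictor regression stands in for the actual model, which is not shown in this thread. With the 1=female, 2=male coding, `2 - sex` gives the female dummy and `sex - 1` gives the male dummy. Since male = 1 - female, the two regressions have the same residual sum of squares, and the gender slope only changes sign.

```python
def ols(x, y):
    """Least-squares regression of y on one predictor x.

    Returns (intercept, slope, residual sum of squares).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss


# Made-up illustration data: sex coded 1 = female, 2 = male.
sex = [1, 1, 1, 2, 2, 2]
score = [52.0, 55.0, 58.0, 47.0, 50.0, 53.0]

female = [2 - s for s in sex]  # 1 for female, 0 for male
male = [s - 1 for s in sex]    # 1 for male, 0 for female

b0_f, b_f, rss_f = ols(female, score)
b0_m, b_m, rss_m = ols(male, score)

print("slope on female:", b_f, " slope on male:", b_m)
print("residual SS:", rss_f, rss_m)
```

The intercept also changes, because it is the predicted score for whichever category is coded 0.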

Sara Namazi posted on Wednesday, August 08, 2018  6:46 pm



Hi Dr. Muthen, Thank you for your response and clarification. Could I recode my Sex variable (male=2 female=1) using this command under define: if (Sex eq 2) then male =1; if (Sex eq 1) then male =0; 


Sure. 


I had a chance to look at the 2016 paper on multiple groups. I also tried to run the analyses. It seems the Alignment method will not work with latent growth curve analyses because Mplus wants me to free all the loadings. But aren't fixed loadings required to run that analysis correctly? I tried to run the twolevel analysis as well, but do not have that version of Mplus. I think I finally figured out how to run my data (except for the groups part) using nonlinear analyses (short course 3 was helpful!), but I am feeling very stuck on trying to analyze with so many groups.


Alignment should work with growth curve modeling using fixed loadings. If you have a problem with this, please send your full output to Support along with your license number. 


I misspoke: the loadings and intercepts of the factor indicators need to be free in the alignment procedure. The growth model with its typically fixed loadings (fixed time scores) assumes measurement invariance in that the time scores are thought to be the same for all groups for the same linear/quadratic etc. development. So capture the 55 groups by using either a large set of dummy covariates or a multiple-group run. In the latter, you can test if the fixed time scores need to be relaxed in some groups due to deviations from the overall linear/quadratic etc. growth. Modindices are useful here.
