Jiyoung posted on Saturday, June 20, 2009 - 9:34 am
I want to run a structural equation model. The variables in my model were chosen based on theories supported by previous studies.
Along with the theory-based variables, I also want to include demographic variables (e.g., gender, race) to control for their effects on the endogenous variables. I wonder whether it is correct to simply add those demographic variables in the same way as the theory-based exogenous variables.
For instance, let's say that I want to predict intention to use the Internet (INTENTION).
My theory-based exogenous variables are 1) perceived ease of use (PEU) and 2) relative advantage (RA). Those are latent variables with multiple indicators. I also want to control for the effects of gender, race, income, etc.
First, I will run a CFA to examine the validity of PEU and RA. I will also use the following command: genderL by gender; raceAAL by raceAA; raceAsianL by raceAsian; raceWhiteL by raceWhite;
After I make sure the model fit of the CFA model is good, I will move onto the SEM model. I will also use the following command.
Intention on peu ra genderL raceAAL raceAsianL raceWhiteL;
I wonder if I am on the right track. If you could clarify my question, I would appreciate it. Thank you.
I want to include SES as a latent variable in a SEM model. The indicators are:
education (educ) - 7-level ordinal indicator
income (inc) - 6-level ordinal indicator
race/ethnicity (race) - nominal indicator with 5 categories (Black, White, Hispanic, Asian, and other)
marital status (mar) - nominal indicator with 5 categories (married, living with partner, divorced, widowed, single never married)
nativity (bornUS) - born in US = 1, not born in US = 0
1. I know I need to create dummy variables for the two nominal indicators - marital status and race/ethnicity. What I cannot figure out is how to code the dummy variables and then how to include them in the BY statement for the SES factor.
2. When SES is a latent factor, is it appropriate to have the measured indicators "cause" the latent factor? If so, how is the command for this written?
I tried ML and defining raceth and marital as nominal variables but get an error message that I do not have enough memory. WLSMV seems to work best with my model (except for my lack of understanding of how to write the command for dummies for my two nominal variables).
On pp. 449-450 of my user's manual (version 5, Nov 2007) there is a description of how to refer to the levels of a nominal dependent variable in the MODEL command, but no description of how to create dummy variables in the DEFINE command.
I tried this in the DEFINE command: white=raceth==1; black=raceth==2; hispanic=raceth==3;
It seems to work. Is this correct?
If not, please help me - I am just a dumb graduate student ready to pull my hair out... Thanks.
DEFINE:
white = 0; if (raceth eq 1) then white = 1;
black = 0; if (raceth eq 2) then black = 1;
etc.
You need k-1 dummies where k is the number of categories of the nominal variable. Whenever you create a variable in DEFINE, you should check to see that you get the results intended by saving the old and new variables and spot checking.
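For readers who want the whole input in one place, the advice above can be sketched as follows. This is only an illustration under assumptions: the file name mydata.dat, the variable list, the outcome y, and the choice of category 5 ("other") as the reference are all hypothetical and not taken from the posts. AUXILIARY is used here so that the original raceth is written to the saved file next to the new dummies for spot checking.

```mplus
TITLE:    Sketch: k-1 dummies for a 5-category nominal variable
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = id y raceth;     ! hypothetical variable list
          ! variables created in DEFINE go last in USEVARIABLES
          USEVARIABLES = y white black hispanic asian;
          ! keep the original variable for the saved file
          AUXILIARY = raceth;
DEFINE:   white = 0;    IF (raceth EQ 1) THEN white = 1;
          black = 0;    IF (raceth EQ 2) THEN black = 1;
          hispanic = 0; IF (raceth EQ 3) THEN hispanic = 1;
          asian = 0;    IF (raceth EQ 4) THEN asian = 1;
          ! raceth = 5 gets no dummy and acts as the reference
MODEL:    y ON white black hispanic asian;
SAVEDATA: FILE = check.dat;        ! compare raceth with the dummies
```

With 5 categories this gives 4 dummies. Each coefficient is then read as the difference between that category and the omitted one.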
Lucy Morgan posted on Tuesday, February 03, 2015 - 5:56 am
I need to control for the effects of two observed demographic variables in an otherwise fully latent SEM model using MLM (due to non-normal data). One is ordinal (education level) and the other is nominal (ethnicity) and both have a significant effect on the dependent variables. I have read through all the posts but I am still not clear what would be the correct approach. I need to run both measurement models and path models including these variables.
1) Can I include both the ordinal and the nominal variables as continuous variables as I do not need to interpret their effects, only control for them? (I will use SPSS to analyse for specific effects)
2) If no, my understanding is that I would need to create dummy variables - so would I state
CATEGORICAL = ethnic educa;
and then create dummy variables using DEFINE: white = 0; if (raceth eq 1) then white = 1; black = 0; if (raceth eq 2) then black = 1; etc.
as outlined in the post above? If yes, do I use the newly created variables black, white, etc. in WITH and ON statements, or can I simply use the ethnic and educa variables in the WITH and ON statements?
Many thanks for your help with this, I am very confused! Lucy
1. The scale of observed exogenous variables is not specified. The scale is specified only for endogenous variables.
2. You do not need to create dummy variables for an ordinal variable unless you prefer to. You must create dummy variables for a nominal variable. Your code looks correct. For a nominal variable, you should use the dummy variables not the nominal variable in the analysis.
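A minimal sketch of how the two points above might look in an input file. Everything except ESTIMATOR = MLM and the names ethnic and educa is an assumption made for illustration: the file name, the indicators y1-y6, the factors f1 and f2, and a three-category ethnic variable with category 1 as the reference.

```mplus
TITLE:    Sketch: observed covariates in a latent variable model
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = y1-y6 educa ethnic;   ! hypothetical names
          USEVARIABLES = y1-y6 educa eth2 eth3;
          ! no CATEGORICAL statement: educa and the dummies
          ! are exogenous, so their scale is not declared
DEFINE:   eth2 = 0; IF (ethnic EQ 2) THEN eth2 = 1;
          eth3 = 0; IF (ethnic EQ 3) THEN eth3 = 1;
ANALYSIS: ESTIMATOR = MLM;
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
          f1 ON educa eth2 eth3;
          f2 ON f1 educa eth2 eth3;
```

The nominal variable ethnic itself does not appear in USEVARIABLES or in the MODEL command. Only its dummies do.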
Student posted on Wednesday, April 29, 2015 - 2:02 am
I am experiencing issues trying to create dummy variables for a nominal independent variable (c) with 3 classes. In my dataset, this variable has values ranging from 1-3.
I am defining the classes as you suggested above: DEFINE: race2 = 0; if (c eq 2) then race2 = 1; race3 = 0; if (c eq 3) then race3 = 1;
I am adding the new variables to my USEVARIABLES list, and regressing Y (my outcome) on race and other covariates:
Y on race3 race2 cov1 cov2;
However, I get the warning: *** ERROR One or more variables have a variance of zero. Check your data and format statement.
Do you have any suggestions/recommendations regarding this error?
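One way to investigate a message like this is to look at the descriptive statistics of the created dummies before fitting the model. The sketch below assumes a hypothetical file name and variable list. It only requests sample statistics, so a dummy with a variance of zero (for example, because no case in the analysis sample has that category) would show up directly.

```mplus
TITLE:    Sketch: check the created dummies before modeling
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = y c cov1 cov2;   ! hypothetical variable list
          ! variables created in DEFINE go last in USEVARIABLES
          USEVARIABLES = y cov1 cov2 race2 race3;
DEFINE:   race2 = 0; IF (c EQ 2) THEN race2 = 1;
          race3 = 0; IF (c EQ 3) THEN race3 = 1;
ANALYSIS: TYPE = BASIC;            ! sample statistics only
```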
Student posted on Wednesday, April 29, 2015 - 2:18 am
Apologies- I got the model above to run. I do have another question-- the model above lets you see the effects of race2 and race3 in the output, but obviously not race1, since that is not defined as a dummy variable. Is it OK to just re-run the model with a different set of dummy variables to see what the effect of race1 on Y would be? If there is some statistical issue with that, is there another method you recommend?
Student posted on Wednesday, April 29, 2015 - 9:39 am
I checked my output and I only see effects of race2 and race3-- could you point me to where in the output I could find the effect of race1? I know it wouldn't be labelled as "race1" since we didn't define that. The only intercept information is for the indicators of the latent outcome variable. Is there a specific "OUTPUT" I need to request?
Hello! Thank you for the help. I am running into problems because we have a DV that is a latent variable with two observed vars. Thus, our OUTPUT gives us just the intercept for both observed vars. How do we go about getting the effect of, in this instance, race 2 vs race3, under these circumstances?
In other words, our output gives us the estimate and significance for race 2 vs race 1 and for race 3 vs race 1. However, we are uncertain about how to get the estimate and significance for race 2 vs race 3.
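One possible way to get the race 2 vs race 3 contrast without re-running the model is to label the two dummy coefficients and define their difference as a new parameter. This is a sketch under assumptions, not necessarily what was recommended in this thread: the file name, the indicator names y1 and y2, the factor name f, and the labels b2, b3, and d23 are all hypothetical.

```mplus
TITLE:    Sketch: contrast between two dummy coefficients
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = y1 y2 c cov1 cov2;    ! hypothetical names
          USEVARIABLES = y1 y2 cov1 cov2 race2 race3;
DEFINE:   race2 = 0; IF (c EQ 2) THEN race2 = 1;
          race3 = 0; IF (c EQ 3) THEN race3 = 1;
MODEL:    f BY y1 y2;
          f ON race2 (b2);         ! race 2 vs race 1
          f ON race3 (b3);         ! race 3 vs race 1
          f ON cov1 cov2;
MODEL CONSTRAINT:
          NEW(d23);
          d23 = b2 - b3;           ! race 2 vs race 3
```

The new parameter d23 is reported with its own estimate, standard error, and p-value. Because both dummies share race 1 as the reference, b2 - b3 is the difference between race 2 and race 3 with the other covariates held constant.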
Thank you for the response. I will try this. Originally, I had rerun the model with a new set of dummy variables (race 1 vs race 2; race 3 vs race 2) instead of the old set (race 2 vs race 1), (race 3 vs race 1). I was anticipating the race 2 vs race 1 contrast to be the same magnitude in both models because all that I changed was the selection of dummy variables. However, the results were slightly different between the two models. Do you know why this is?
I am trying to compare two models (rational and emotional models) in which I test the effect of two messages (an emotional message and a rational message) on emotions and risk. So I have two independent variables (emotional and rational messages), and the rest of my endogenous variables are continuous.
Initially I have one categorical exogenous variable that I named 'experiment', where 1 = emotional and 2 = rational.
Would you please let me know what is wrong with my syntax? It is my first time running a model comparison in Mplus.
here is the syntax:
USEV experiment em em1 em3 r1 r2 r3 ben1 ben2 ben3;
Model:
feb by em em1 em3;
rik by r1 r2 r3;
benft by ben1 ben2 ben3;
Send your output to Support along with your license number.
Wim Beyers posted on Friday, May 18, 2018 - 2:39 am
When coding a 3-level nominal variable as two dummies (D1 and D2, each with the same reference category) for regression, should we allow D1 and D2 to correlate in the regression model in order to get the truly unique effects of the dummies?
I ran a second-order LGCM and now want to regress the slope and level on covariates. I need to include a nominal variable with 9 categories, so I created 9-1=8 dummy-coded variables. Is there a limit on the number of dummy-coded variables? When I add the dummy variables one at a time, the estimation terminates normally and the model looks fine. However, as soon as I add the last dummy, I get the following message: THE MODEL ESTIMATION TERMINATED NORMALLY
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.756D-20. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 89, I ON DUBLIN
Could you help me with this problem? Thank you very much in advance, your help is highly appreciated. Anna