Jiyoung posted on Saturday, June 20, 2009 - 9:34 am
I want to run a structural equation model. The variables in my model were chosen based on theories that previous studies found.
Along with the theory-based variables, I also want to include demographic variables (e.g., gender, race) to control for their effect on the endogenous variables. Is it correct to simply add those demographic variables in the same way as the theory-based exogenous variables?
For instance, let's say that I want to predict intention to use the Internet (INTENTION).
My exogenous variables based on theories are 1) perceived ease of use (PEU) and 2) relative advantage (RA). Those are latent variables with multiple indicators. I also want to control the effect of gender, race, income, etc.
First, I will run a CFA to examine the validity of PEU and RA. I will also use the following command: genderL by gender; raceAAL by raceAA; raceAsianL by raceAsian; raceWhiteL by raceWhite;
After I make sure the fit of the CFA model is good, I will move on to the SEM model. I will also use the following command.
Intention on peu ra genderL raceAAL raceAsianL raceWhiteL;
I wonder if I am on the right track. If you could clarify my question, I would appreciate it. Thank you.
I want to include SES as a latent variable in a SEM model. The indicators are:
education (educ) - 7 level ordinal indicator
income (inc) - 6 level ordinal indicator
race/ethnicity (race) - nominal indicator with 5 categories (Black, White, Hispanic, Asian, and other)
marital status (mar) - nominal indicator with 5 categories (married, living with partner, divorced, widowed, single never married)
nativity (bornUS) - born in US=1, not born US=0
1. I know I need to create dummy variables for the two nominal indicators - marital status and race/ethnicity. What I cannot figure out is how to code the dummy variables and then how to include them in the BY statement for the SES factor.
2. When SES is a latent factor, is it appropriate to have the measured indicators "cause" the latent factor? If so, how is the command for this written?
I tried ML and defining raceth and marital as nominal variables but get an error message that I do not have enough memory. WLSMV seems to work best with my model (except for my lack of understanding of how to write the command for dummies for my two nominal variables).
On pp. 449-450 of my user's manual (version 5, Nov 2007) there is a description of how to refer to the levels of a nominal dependent variable in the MODEL command, but no description of how to create dummy variables in the DEFINE command.
I tried this in the DEFINE command: white=raceth==1; black=raceth==2; hispanic=raceth==3;
It seems to work. Is this correct?
If not, please help me - I am just a dumb graduate student ready to pull my hair out... Thanks.
DEFINE:
white = 0; if (raceth eq 1) then white = 1;
black = 0; if (raceth eq 2) then black = 1;
etc.
You need k-1 dummies where k is the number of categories of the nominal variable. Whenever you create a variable in DEFINE, you should check to see that you get the results intended by saving the old and new variables and spot checking.
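As a rough illustration of the advice above, here is a minimal sketch of a complete input that creates k-1 = 4 dummies for a 5-category raceth and saves the old and new variables for spot checking. The file names, the NAMES list, and the codes 4 = Asian and 5 = other are assumptions made for the example; only codes 1 to 3 appear in this thread.

```
TITLE:    Sketch: k-1 dummies for a 5-category nominal variable
DATA:     FILE = mydata.dat;          ! hypothetical file name
VARIABLE: NAMES = id educ inc raceth; ! hypothetical variable list
          USEVARIABLES = educ inc white black hispanic asian;
          ! variables created in DEFINE go last in USEVARIABLES
          AUXILIARY = raceth;         ! keep the original for checking
DEFINE:   white = 0;    IF (raceth EQ 1) THEN white = 1;
          black = 0;    IF (raceth EQ 2) THEN black = 1;
          hispanic = 0; IF (raceth EQ 3) THEN hispanic = 1;
          asian = 0;    IF (raceth EQ 4) THEN asian = 1;
          ! assumed code 5 ("other") gets no dummy: it is the reference
ANALYSIS: TYPE = BASIC;               ! no model, just data handling
SAVEDATA: FILE = check.dat;           ! hypothetical output file
```

Opening check.dat and comparing raceth against the four dummies for a handful of cases is the spot check. Because each dummy starts at 0, a case with a missing raceth may also end up as 0 on every dummy, which is worth looking for in the saved file.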
Lucy Morgan posted on Tuesday, February 03, 2015 - 5:56 am
I need to control for the effects of two observed demographic variables in an otherwise fully latent SEM model using MLM (due to non-normal data). One is ordinal (education level) and the other is nominal (ethnicity) and both have a significant effect on the dependent variables. I have read through all the posts but I am still not clear what would be the correct approach. I need to run both measurement models and path models including these variables.
1) Can I include both the ordinal and the nominal variables as continuous variables as I do not need to interpret their effects, only control for them? (I will use SPSS to analyse for specific effects)
2) If no, my understanding is that I would need to create dummy variables - so would I state
CATEGORICAL = ethnic educa;
and then create dummy variables using DEFINE: white = 0; if (raceth eq 1) then white = 1; black = 0; if (raceth eq 2) then black = 1; etc.
as outlined in the post above? If yes, do I use the newly created variables black, white, etc. in WITH and ON statements, or can I simply use the ethnic and educ variables in the WITH and ON statements?
Many thanks for your help with this, I am very confused! Lucy
1. The scale of observed exogenous variables is not specified. The scale is specified only for endogenous variables.
2. You do not need to create dummy variables for an ordinal variable unless you prefer to. You must create dummy variables for a nominal variable. Your code looks correct. For a nominal variable, you should use the dummy variables not the nominal variable in the analysis.
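A minimal sketch of how the two points could look in an input file, assuming a 3-category ethnic variable with category 1 as the reference, and hypothetical indicator names y1-y6 and x1-x3. Note that the covariates are not placed on the CATEGORICAL list, since they are exogenous.

```
VARIABLE: NAMES = y1-y6 x1-x3 educa ethnic;  ! hypothetical names
          USEVARIABLES = y1-y6 x1-x3 educa eth2 eth3;
          ! no CATEGORICAL statement for educa or the dummies
DEFINE:   eth2 = 0; IF (ethnic EQ 2) THEN eth2 = 1;
          eth3 = 0; IF (ethnic EQ 3) THEN eth3 = 1;
ANALYSIS: ESTIMATOR = MLM;
MODEL:    fx BY x1-x3;               ! latent predictor
          f1 BY y1-y3;               ! latent outcomes
          f2 BY y4-y6;
          f1 f2 ON fx educa eth2 eth3;
          ! educa enters as is; ethnic enters only via its dummies
```

The original ethnic variable is read in NAMES but left out of USEVARIABLES, so only eth2 and eth3 appear in the ON statement.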
Student posted on Wednesday, April 29, 2015 - 2:02 am
I am experiencing issues trying to create dummy variables for a nominal independent variable (c) with 3 classes. In my dataset, this variable has values ranging from 1-3.
I am defining the classes as you suggested above: DEFINE: race2 = 0; if (c eq 2) then race2 = 1; race3 = 0; if (c eq 3) then race3 = 1;
I am adding the new variables to my USEVARIABLES list, and regressing Y (my outcome) on race and other covariates:
Y on race3 race2 cov1 cov2;
However, I get the warning: *** ERROR One or more variables have a variance of zero. Check your data and format statement.
Do you have any suggestions/recommendations regarding this error?
Student posted on Wednesday, April 29, 2015 - 2:18 am
Apologies, I got the model above to run. I do have another question: the model above shows the effects of race2 and race3 in the output, but obviously not race1, since it is not defined as a dummy variable. Is it OK to simply re-run the model with a different set of dummy variables to see what the effect of race1 would be on Y? If there is some statistical issue with that, is there another method you recommend?
Student posted on Wednesday, April 29, 2015 - 9:39 am
I checked my output and I only see effects of race2 and race3. Could you point me to where in the output I could find the effect of race1? I know it wouldn't be labelled as "race1" since we didn't define that. The only intercept information is for the indicators of the latent outcome variable. Is there a specific "OUTPUT" I need to request?
Hello! Thank you for the help. I am running into problems because we have a DV that is a latent variable with two observed vars. Thus, our OUTPUT gives us just the intercept for both observed vars. How do we go about getting the effect of, in this instance, race 2 vs race3, under these circumstances?
In other words, our output gives us the estimate and sig for race 2 vs race 1 and for race 3 vs race 1. However, we are uncertain about how we get the estimate and sig for race 2 vs race 3.
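One way such a contrast could be obtained, sketched here under assumptions rather than taken from this thread, is to label the two dummy slopes and define their difference as a new parameter in MODEL CONSTRAINT. The factor name f, the indicators y1 and y2, and the labels b2, b3, and d23 are hypothetical.

```
MODEL:    f BY y1 y2;        ! latent outcome with two indicators
          f ON race2 (b2);   ! race 2 vs race 1 (reference)
          f ON race3 (b3);   ! race 3 vs race 1 (reference)
          f ON cov1 cov2;
MODEL CONSTRAINT:
          NEW(d23);
          d23 = b2 - b3;     ! race 2 vs race 3
```

The estimate, standard error, and p-value for d23 are then reported with the other new parameters in the output, so the model does not have to be re-run with a different reference category.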
Thank you for the response. I will try this. Originally, I had rerun the model with a new set of dummy variables (race 1 vs race 2; race 3 vs race 2) instead of the old set (race 2 vs race 1), (race 3 vs race 1). I was anticipating the race 2 vs race 1 contrast to be the same magnitude in both models because all that I changed was the selection of dummy variables. However, the results were slightly different between the two models. Do you know why this is?