Mplus Discussion >> Useobservations and define commands

Jon Elhai posted on Friday, June 20, 2008 - 2:13 pm

Linda,
Is it possible to define a new variable using the DEFINE command (such as, y = a*b), and also exclude some subjects based on that new variable using the USEOBSERVATIONS command (such as USEOBSERVATIONS are y ge 2)?

Linda K. Muthen posted on Friday, June 20, 2008 - 3:00 pm

Yes. Just put it at the end of the USEVARIABLES list.

Jon Elhai posted on Friday, June 20, 2008 - 8:54 pm

Dear Linda,
I did as you suggested, but I received an error message. That is, I defined my new variable (newvar), included it at the end of my USEVARIABLES list, and included a USEOBSERVATIONS command to include only those subjects with a response of "1" on the new variable. But I received the error:

*** ERROR
Undefined variable used in transformation:
newvar

Here's abbreviated syntax I used...
VARIABLE:
NAMES ARE id ptsdd4life ptsdd5life;

MISSING ARE ptsdd4life ptsdd5life (-999);

CATEGORICAL are ptsdd4life ptsdd5life newvar;

USEVARIABLES ARE ptsdd4life ptsdd5life newvar;
USEOBSERVATIONS are newvar== 1;

DEFINE: (I set up define commands below this to calculate newvar)

ANALYSIS:
TYPE = basic ;
estimator=wlsmv;

Linda K. Muthen posted on Saturday, June 21, 2008 - 6:50 am

Please send your input, data, output, and license number to support@statmodel.com.

Kristine Molina posted on Saturday, January 30, 2010 - 1:02 pm

Hi--

I am planning to run a multiple group analysis for males and females in a particular subethnic group. My ethnicity variable has 4 subgroups. I am only interested in one group. I am using complex survey data.

I was wondering if the way I should specify the commands would be as such:
GROUPING IS SEX (1= Male 2=Female);
USEOBSERVATIONS ARE (ETHNICITY EQ 1);

Or if I should be using:
GROUPING IS SEX (1= Male 2= Female);
SUBPOPULATION IS (ETHNICITY EQ 1);

Thanks!

Linda K. Muthen posted on Sunday, January 31, 2010 - 10:20 am

Theoretically you should use SUBPOPULATION but it is not available with multiple group analysis.

Kristine Molina posted on Sunday, January 31, 2010 - 5:50 pm

So is there any way that I can still just run a path analysis model for each separate subgroup even if I can't do the difference test for them, at least just to see how the estimates compare across the subgroups?

I tried the following command:
USEOBSERVATION ARE (ETHNICITY EQ 1 AND SEX EQ 1);

and it is telling me that "the input file does not contain valid commands"

I cannot find an example in the User's Guide that selects observations using two different variables (e.g., Sex and Ethnicity).

Thanks!

Linda K. Muthen posted on Monday, February 01, 2010 - 8:21 am

I would run two analyses for each group -- one using SUBPOPULATION and one using USEOBSERVATIONS and compare the results. If they are very similar, I would use USEOBSERVATIONS in the multiple group analysis.

Page 442 of the user's guide shows a USEOBSERVATIONS statement with two variables. I believe the error message you are getting means you have statements before the TITLE command that are not in the Mplus language. If you cannot see the problem, you need to send the output and your license number to support@statmodel.com.
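
A minimal sketch of a single-subgroup run with a two-variable selection, for illustration only. The file name and the variable names (mydata.dat, id, ethnic, sex, x1, x2, y) are hypothetical and not taken from the poster's data; the TITLE command is placed first, with nothing before it.

```
TITLE: Path model for one subgroup
DATA: FILE IS mydata.dat;            ! hypothetical file name
VARIABLE:
  NAMES ARE id ethnic sex x1 x2 y;   ! hypothetical variable names
  USEVARIABLES ARE x1 x2 y;
  ! Both conditions must hold for a case to be kept.
  ! ethnic and sex are on the NAMES list, so they can be used here
  ! even though they are not on the USEVARIABLES list.
  USEOBSERVATIONS ARE ethnic EQ 1 AND sex EQ 1;
MODEL:
  y ON x1 x2;
```

The sample size printed in the output should then match the number of cases in that subgroup.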

Marie-Helene Veronneau posted on Thursday, June 17, 2010 - 12:30 pm

Just like Jon Elhai, I also got the error message "Undefined variable used in transformation" when using the DEFINE command to create a new variable (newvar) that I wanted to use in the USEOBSERVATIONS statement (USEOBSERVATIONS = newvar GT 0).

Apparently, we can only use variables from the original dataset in the USEOBSERVATIONS statement. Any variable created with the DEFINE command CANNOT be used in the USEOBSERVATIONS statement.

If this information is true, I think it can be useful to leave it here on the discussion forum for other users who have this problem.

Linda K. Muthen posted on Thursday, June 17, 2010 - 12:50 pm

As stated in the user's guide, only variables from the NAMES statement can be used with the USEOBSERVATIONS option.
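
One possible workaround, sketched under assumptions: when the DEFINE variable is just a recode of a variable already on the NAMES list, the same condition can often be written directly on the original variable. The names below (id, age, y1, y2) and the cutoff of 18 are hypothetical. As far as the user's guide describes it, the USEOBSERVATIONS condition is built from logical operators (EQ, NE, GE, LE, GT, LT, AND, OR, NOT) rather than arithmetic, so a product such as a*b would still have to be computed outside Mplus.

```
VARIABLE:
  NAMES ARE id age y1 y2;            ! hypothetical variable names
  USEVARIABLES ARE y1 y2;
  ! Instead of:  DEFINE: IF (age GE 18) THEN adult = 1;
  !              USEOBSERVATIONS ARE adult EQ 1;   (adult is not on NAMES)
  ! state the condition on the original variable:
  USEOBSERVATIONS ARE age GE 18;
```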

yang posted on Wednesday, November 16, 2011 - 12:37 pm

Linda,

Could you please tell me how to select observations based on a character variable, rather than a numeric one, that is already in the data set being used? I could not find this information anywhere else. Thanks.

Linda K. Muthen posted on Wednesday, November 16, 2011 - 2:07 pm

Mplus does not read character data. All data must be numeric.

Summer McKnight posted on Friday, October 26, 2012 - 1:28 pm

I keep getting an error when I try to select only my baseline data. Any advice?

INPUT INSTRUCTIONS

TITLE: SOPS
DATA: FILE IS sops_cfa.csv;
VARIABLE: NAMES ARE RECID TIMEPT BASELINE CELL SOPSA1 SOPSA2 SOPSA3 SOPSA4
SOPSD1 SOPSD2 SOPSD3 SOPSD4;
USEV ARE SOPSA1 SOPSA2 SOPSA3 SOPSA4
SOPSB1 SOPSB2 SOPSB3 SOPSB4
SOPSC1 SOPSC2 SOPSC3 SOPSC4
SOPSD1 SOPSD2 SOPSD3 SOPSD4;
MISSING IS *;
!TIMEMEASURES = TIMEPT;
USEOBSERVATIONS ARE (TIMEPT EQ BASELINE);
ANALYSIS: TYPE = H1 MISSING;

OUTPUT: SAMPSTAT STANDARDIZED MODINDICES
*** WARNING in ANALYSIS command
Starting with Version 5, TYPE=H1 is the default for all analyses with
missing data. To turn off the estimation of the H1 model and the
computation of chi-square, use NOCHISQUARE in the OUTPUT command.
*** WARNING in ANALYSIS command
Starting with Version 5, TYPE=MISSING is the default for all analyses.
To obtain listwise deletion, use LISTWISE=ON in the DATA command.
*** ERROR
The number of observations is 0. Check your data and format statement.
Data file: sops_cfa.csv
*** ERROR
Invalid symbol in data file:
"YR1" at record #: 1, field #: 2

Linda K. Muthen posted on Friday, October 26, 2012 - 1:43 pm

It sounds like you have the variable names in the first record of the data set. Please remove this record.

Rebecca Wolf posted on Thursday, October 31, 2013 - 8:58 am

I'm trying to use the USEOBSERVATION command to run a multiple regression analysis for each school ID (SCHOOL=1302, for example), because I don't want to split my data file into hundreds of files, one for each school. However, I am getting drastically different results than in SPSS, and I don't have much missing data. So, I'm wondering if the USEOBSERVATION command is using the entire sample in the estimation instead of using only the cases in a particular school. Can you please comment on this? Thanks!

Linda K. Muthen posted on Thursday, October 31, 2013 - 10:19 am

You can see what is being used by looking at the sample size printed in the output. Perhaps you are specifying USEOBSERVATIONS incorrectly. If you can't figure this out, please send the output and your license number to support@statmodel.com.

Rebecca Wolf posted on Thursday, October 31, 2013 - 10:35 am

The sample size is correct, so that's not it.

Linda K. Muthen posted on Thursday, October 31, 2013 - 11:42 am

If you can't figure this out, please send the output and your license number to support@statmodel.com.

Calvin D. Croy posted on Friday, May 01, 2015 - 10:44 am

I wanted my CFA to only use observations for children 12 months old or older. I specified USEOBSERVATIONS = (KidAge1 >= 12).

I saved the data used in the analysis with the SAVEDATA command and then examined the values of KidAge1. I found that 3 of the observations had missing values for KidAge1. Could someone please explain why this happened?

Before my USEOBSERVATIONS statement I had said MISSING IS .;

Are missing values identified with non-numeric values considered to have higher values than numeric values (as, for example, Stata does)?

Linda K. Muthen posted on Friday, May 01, 2015 - 11:05 am

I would need to see the relevant files and your license number at support@statmodel.com.

Annie Robitaille posted on Monday, October 19, 2015 - 10:12 am

I need to select cases based on variables not included in the analysis. Because the data are in long format, I first need to transform the variable to wide format. However, when I include the new wide variable in the USEOBSERVATIONS option, I get the error message "Undefined variable used in transformation". Is that because USEOBSERVATIONS is applied before the data transformation?

Thank you.

Bengt O. Muthen posted on Monday, October 19, 2015 - 1:59 pm

We need to see the output to say what's going on. Please send to Support along with license number.

wang ying posted on Wednesday, December 02, 2015 - 3:57 am

Dear Muthen,

I'm running a data set with 380 observations without any missing data.

However, no matter what kind of analysis or data format, the output only shows 190 observations.

I would appreciate it if you could advise me on how to deal with this situation.
Thanks.

Linda K. Muthen posted on Wednesday, December 02, 2015 - 6:46 am

It sounds like you have more variable names on the NAMES list than you have columns in your data set, causing two records to be used for each observation.

wang ying posted on Wednesday, December 02, 2015 - 7:14 pm

Thanks!
The number of observations is 380 now.

Grant Jackson posted on Tuesday, January 31, 2017 - 10:34 am

Greetings,

Let's say I am doing an SEM and want to compare the results of women and men. I know of two ways I could do this:

1. Using the GROUPING option, which produces the results for women and men in a single output file.

2. Using the USEOBSERVATIONS option, i.e., looking at the results for women only, then men only, in two separate output files, and comparing.

I assume these are both valid approaches, but my results are not consistent across both approaches. For example, the results for women in the USEOBSERVATIONS output do not match the results for women in the GROUPING output. Same goes for men.

Shouldn't they be identical? Am I missing something?

Thanks!

Grant

Linda K. Muthen posted on Tuesday, January 31, 2017 - 12:10 pm

Please send the relevant outputs and your license number to support@statmodel.com.

Marieke Carpentier posted on Wednesday, September 06, 2017 - 1:06 am

Hi,

I want to remove some cases out of my dataset for analysis.
I tried doing it with USEOBSERVATIONS and defining only the ID numbers that I want to keep in the set

USEOBSERVATIONS (id_var EQ 1 2 3 4 ...);

was my input. However this does not work.
Can you help me? Thanks!

Linda K. Muthen posted on Wednesday, September 06, 2017 - 11:51 am

The format would be

id_var eq 1 or id_var eq 2 etc.

It may be easier to specify something like

id_var le 4
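
Written out as complete statements, the two forms above might look like the following sketch. Both lines belong inside the VARIABLE command, only one of them would be used in a given run, and the second assumes that no IDs below 1 exist in the data.

```
! Option 1: repeat the variable name in every comparison.
USEOBSERVATIONS ARE id_var EQ 1 OR id_var EQ 2 OR
                    id_var EQ 3 OR id_var EQ 4;

! Option 2: a single range condition covering the same IDs.
! USEOBSERVATIONS ARE id_var LE 4;
```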

Sarah Rosenbach posted on Thursday, January 16, 2020 - 11:34 am

Hello,

I am interested in creating variables that are weighted averages of other variables. For example:

DEFINE:
ImmDiAvg=(ImmDi1*2+5*ImmDi2)/7;

Is there a way to then drop the constituent variables (ImmDi1 and ImmDi2) from the model?

Thanks,
Sarah

Linda K. Muthen posted on Thursday, January 16, 2020 - 1:38 pm

Variables used in the DEFINE command that you don't want in the analysis should not be on the USEVARIABLES list. This list should be for variables used in the analysis only.
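
A minimal sketch of that arrangement, using the weighted average from the question. The file name and the variables id and y are hypothetical, and the regression is only a placeholder model. ImmDi1 and ImmDi2 stay on the NAMES list so that DEFINE can read them, but they are left off USEVARIABLES; the new variable goes at the end of the USEVARIABLES list.

```
DATA: FILE IS mydata.dat;              ! hypothetical file name
VARIABLE:
  NAMES ARE id ImmDi1 ImmDi2 y;        ! id and y are hypothetical
  ! Only analysis variables are listed; the DEFINE variable comes last.
  USEVARIABLES ARE y ImmDiAvg;
DEFINE:
  ImmDiAvg = (ImmDi1*2 + 5*ImmDi2)/7;
MODEL:
  y ON ImmDiAvg;                       ! placeholder model
```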

Sarah Rosenbach posted on Friday, January 17, 2020 - 7:41 am

Thanks, I assumed that I would need to have the constituent variables in the USEVARIABLES list (and also to be imputed) before being able to compute the weighted average.

Linda K. Muthen posted on Friday, January 17, 2020 - 11:47 am

Only variables on the IMPUTE list of DATA IMPUTATION are imputed. Variables on the USEVARIABLES list are used to impute the variables on the IMPUTE list. Example 11.5 goes over this. So I think you should take them off the USEVARIABLES list and put them on the IMPUTE list if you want them imputed. For further help on this, you should send the output and data along with your license number to support@statmodel.com so we can see the full picture and give you a complete answer.

Michael Sciffer posted on Monday, June 08, 2020 - 4:22 pm

My dataset consists of observations over a number of years. I wish to analyse a subset of the dataset by year (easy enough with USEOBSERVATIONS), but I am having trouble creating a dummy variable by recoding the selected years.

If I use:
USEOBSERVATIONS ARE (Year EQ 2006) OR (Year EQ 2009);
DEFINE: IF (Year EQ 2009) THEN Dummy = 0;
IF (Year EQ 2006) THEN Dummy = 1;

I get an error that all of my variables have missing data.

If I use:
USEOBSERVATIONS ARE (Year EQ 2006) OR (Year EQ 2009);
DEFINE: Dummy = 0;
IF (Year EQ 2006) THEN Dummy = 1;

I get an error saying that my dummy variable has no variance on half of the cases while the other half of the cases are missing on all variables.

I can manually recode the variable in Excel and the analysis works but I am looking to save time.

Linda K. Muthen posted on Monday, June 08, 2020 - 6:25 pm

Please send the output, data set, and your license number to support@statmodel.com.