
Jon Elhai posted on Friday, June 20, 2008  2:13 pm



Linda, Is it possible to define a new variable using the DEFINE command (such as, y = a*b), and also exclude some subjects based on that new variable using the USEOBSERVATIONS command (such as USEOBSERVATIONS are y ge 2)? 


Yes. Just put it at the end of the USEVARIABLES list. 

Jon Elhai posted on Friday, June 20, 2008  8:54 pm



Dear Linda, I did as you suggested, but I received an error message. That is, I defined my new variable (newvar), included it at the end of my USEVARIABLES list, and included a USEOBSERVATIONS command to include only those subjects with a response of "1" on the new variable. But I received the error:
*** ERROR Undefined variable used in transformation: newvar
Here's the abbreviated syntax I used:
VARIABLE: NAMES ARE id ptsdd4life ptsdd5life;
MISSING ARE ptsdd4life ptsdd5life (999);
CATEGORICAL are ptsdd4life ptsdd5life newvar;
USEVARIABLES ARE ptsdd4life ptsdd5life newvar;
USEOBSERVATIONS are newvar== 1;
DEFINE: (I set up define commands below this to calculate newvar)
ANALYSIS: TYPE = basic ; estimator=wlsmv;


Please send your input, data, output, and license number to support@statmodel.com. 


Hi, I am planning to run a multiple group analysis for males and females in a particular subethnic group. My ethnicity variable has 4 subgroups, and I am only interested in one of them. I am using complex survey data. I was wondering whether I should specify the commands like this:
GROUPING IS SEX (1= Male 2=Female);
USEOBSERVATIONS ARE (ETHNICITY EQ 1);
or like this:
GROUPING IS SEX (1= Male 2= Female);
SUBPOPULATION IS (ETHNICITY EQ 1);
Thanks!


Theoretically you should use SUBPOPULATION but it is not available with multiple group analysis. 


So is there any way that I can still run a path analysis model for each subgroup separately, even if I can't do the difference test for them, just to see how the estimates compare across the subgroups? I tried the following command:
USEOBSERVATION ARE (ETHNICITY EQ 1 AND SEX EQ 1);
and it is telling me that "the input file does not contain valid commands". I cannot find an example in the User's Guide that selects observations using two different variables (e.g., Sex and Ethnicity). Thanks!


I would run two analyses for each group, one using SUBPOPULATION and one using USEOBSERVATIONS, and compare the results. If they are very similar, I would use USEOBSERVATIONS in the multiple group analysis. Page 442 of the user's guide shows a USEOBSERVATIONS statement with two variables. I believe the error message you are getting means you have statements before the TITLE command that are not in the Mplus language. If you cannot see the problem, you need to send the output and your license number to support@statmodel.com.
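The comparison suggested above could be set up roughly as follows. This is only a sketch: the variable names (id, ethnic, sex, y1, y2) are placeholders, the sampling design options (such as weights, strata, and clusters) are left out, and the use of AND inside SUBPOPULATION is assumed to work the same way as in USEOBSERVATIONS.

```mplus
! Run A: select the subgroup with USEOBSERVATIONS
VARIABLE:
  NAMES ARE id ethnic sex y1 y2;      ! placeholder variable names
  USEVARIABLES ARE y1 y2;
  USEOBSERVATIONS ARE ethnic EQ 1 AND sex EQ 1;

! Run B (a separate input file): select the same subgroup with SUBPOPULATION
VARIABLE:
  NAMES ARE id ethnic sex y1 y2;
  USEVARIABLES ARE y1 y2;
  SUBPOPULATION IS ethnic EQ 1 AND sex EQ 1;
ANALYSIS:
  TYPE = COMPLEX;                     ! SUBPOPULATION is for complex survey data
```

If the estimates and standard errors from the two runs are close, that supports using USEOBSERVATIONS in the multiple group analysis.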


Just like Jon Elhai, I also got the error message "Undefined variable used in transformation" when using the DEFINE command to create a new variable (newvar) that I wanted to use in the USEOBSERVATIONS statement (USEOBSERVATIONS = newvar GT 0). Apparently, we can only use variables from the original dataset in the USEOBSERVATIONS statement. Any variable created with the DEFINE command CANNOT be used in the USEOBSERVATIONS statement. If this is correct, I think it would be useful to leave it here on the discussion forum for other users who run into this problem.


As stated in the user's guide, only variables from the NAMES statement can be used with the USEOBSERVATIONS option. 
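A small sketch of what this rule means in practice, using the y = a*b example from the first post. The variable names and the selection condition on a are hypothetical and only illustrate which kind of statement is accepted.

```mplus
VARIABLE:
  NAMES ARE id a b;                ! variables read from the data file
  USEVARIABLES ARE y;              ! y is created in DEFINE, so it is listed last
  USEOBSERVATIONS ARE a GE 1;      ! accepted: a is on the NAMES list
  ! USEOBSERVATIONS ARE y GE 2;    ! rejected: y is not on the NAMES list
DEFINE:
  y = a*b;
```

A selection that depends on a computed variable therefore has to be rewritten in terms of the original variables, or the computed variable has to be added to the data file before it is read into Mplus.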

yang posted on Wednesday, November 16, 2011  12:37 pm



Linda, Would you please teach me how to select observations based on a character variable, rather than a numeric one, that is already in the data set being used? I could not find this information anywhere else. Thanks.


Mplus does not read character data. All data must be numeric. 


I keep getting an error when I try to select only my baseline data. Any advice?
INPUT INSTRUCTIONS
TITLE: SOPS
DATA: FILE IS sops_cfa.csv;
VARIABLE: NAMES ARE RECID TIMEPT BASELINE CELL SOPSA1 SOPSA2 SOPSA3 SOPSA4 SOPSD1 SOPSD2 SOPSD3 SOPSD4;
USEV ARE SOPSA1 SOPSA2 SOPSA3 SOPSA4 SOPSB1 SOPSB2 SOPSB3 SOPSB4 SOPSC1 SOPSC2 SOPSC3 SOPSC4 SOPSD1 SOPSD2 SOPSD3 SOPSD4;
MISSING IS *;
!TIMEMEASURES = TIMEPT;
USEOBSERVATIONS ARE (TIMEPT EQ BASELINE);
ANALYSIS: TYPE = H1 MISSING;
OUTPUT: SAMPSTAT STANDARDIZED MODINDICES
*** WARNING in ANALYSIS command Starting with Version 5, TYPE=H1 is the default for all analyses with missing data. To turn off the estimation of the H1 model and the computation of chisquare, use NOCHISQUARE in the OUTPUT command.
*** WARNING in ANALYSIS command Starting with Version 5, TYPE=MISSING is the default for all analyses. To obtain listwise deletion, use LISTWISE=ON in the DATA command.
*** ERROR The number of observations is 0. Check your data and format statement. Data file: sops_cfa.csv
*** ERROR Invalid symbol in data file: "YR1" at record #: 1, field #: 2


It sounds like you have the variable names in the first record of the data set. Please remove this record. 


I'm trying to use the USEOBSERVATIONS option to run a multiple regression analysis for each school ID (SCHOOL=1302, for example), because I don't want to split my data file into hundreds of files, one for each school. However, I am getting drastically different results than in SPSS, and I don't have much missing data. So, I'm wondering if the USEOBSERVATIONS option is using the entire sample in the estimation instead of using only the cases in a particular school. Can you please comment on this? Thanks!


You can see what is being used by looking at the sample size printed in the output. Perhaps you are specifying USEOBSERVATIONS incorrectly. If you can't figure this out, please send the output and your license number to support@statmodel.com. 

Rebecca Wolf posted on Thursday, October 31, 2013  10:35 am



The sample size is correct, so that's not it. 


If you can't figure this out, please send the output and your license number to support@statmodel.com. 


I wanted my CFA to only use observations for children 12 months old or older. I specified USEOBSERVATIONS = (KidAge1 >= 12). I saved the data used in the analysis with the SAVEDATA command and then examined the values of KidAge1. I found that 3 of the observations had missing values for KidAge1. Could someone please explain why this happened? Before my USEOBSERVATIONS statement I had said MISSING IS .; Are missing values identified with nonnumeric values considered to have higher values than numeric values (as, for example, Stata does)? 


I would need to see the relevant files and your license number at support@statmodel.com. 


I need to select cases on variables not included in the analysis. Because the data are in long format, I first need to transform the variable into wide format. However, when I include the new wide variable in the USEOBSERVATIONS statement, I get the error message "Undefined variable used in transformation". Is that because the USEOBSERVATIONS selection happens before the data transformation command? Thank you.


We need to see the output to say what's going on. Please send to Support along with license number. 

wang ying posted on Wednesday, December 02, 2015  3:57 am



Dear Muthen, I'm analyzing a data set with 380 observations and no missing data. However, no matter what kind of analysis or data format I use, the output only shows 190 observations. I would appreciate it if you could advise me on how to deal with this situation. Thanks.


It sounds like you have more variable names on the NAMES list than you have columns in your data set, causing two records to be used for each observation.

wang ying posted on Wednesday, December 02, 2015  7:14 pm



Thanks! The number of observations is 380 now. 


Greetings, Let's say I am doing an SEM and want to compare the results of women and men. I know of two ways I could do this: 1. Using the GROUPING option, which produces the results for women and men in a single output file. 2. Using the USEOBSERVATIONS option, i.e., looking at the results for women only, then men only, in two separate output files, and comparing. I assume these are both valid approaches, but my results are not consistent across both approaches. For example, the results for women in the USEOBSERVATIONS output do not match the results for women in the GROUPING output. Same goes for men. Shouldn't they be identical? Am I missing something? Thanks! Grant 


Please send the relevant outputs and your license number to support@statmodel.com. 


Hi, I want to remove some cases from my dataset for analysis. I tried doing it with USEOBSERVATIONS, listing only the ID numbers that I want to keep. My input was:
USEOBSERVATIONS (id_var EQ 1 2 3 4 ...);
However, this does not work. Can you help me? Thanks!


The format would be id_var eq 1 or id_var eq 2, etc. It may be easier to specify something like id_var le 4.
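Written out in full, the two forms described above might look like this. It is a sketch only: the NAMES and USEVARIABLES lists are made-up placeholders, and the second form assumes the IDs to keep happen to be the consecutive values 1 through 4.

```mplus
VARIABLE:
  NAMES ARE id_var y1 y2;          ! placeholder variable list
  USEVARIABLES ARE y1 y2;
  ! One condition per ID, joined with OR:
  USEOBSERVATIONS ARE id_var EQ 1 OR id_var EQ 2 OR id_var EQ 3 OR id_var EQ 4;
  ! Shorter alternative when the IDs to keep form a range (use one or the other):
  ! USEOBSERVATIONS ARE id_var LE 4;
```

For a long, irregular list of IDs, it may be simpler to add a 0/1 keep indicator to the data file and select on that variable instead.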


Hello, I am interested in creating variables that are weighted averages of other variables. For example: DEFINE: ImmDiAvg=(ImmDi1*2+5*ImmDi2)/7; Is there a way to then drop the constituent variables (ImmDi1 and ImmDi2) from the model? Thanks, Sarah 


Variables used in the DEFINE command that you don't want in the analysis should not be on the USEVARIABLES list. This list should be for variables used in the analysis only. 
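For the weighted average in the question, that advice could look roughly like the following. The id and y variables are hypothetical placeholders for whatever else is in the data file and the model.

```mplus
VARIABLE:
  NAMES ARE id ImmDi1 ImmDi2 y;    ! ImmDi1 and ImmDi2 are read from the data
  USEVARIABLES ARE y ImmDiAvg;     ! only analysis variables; the new variable goes last
DEFINE:
  ImmDiAvg = (ImmDi1*2 + 5*ImmDi2)/7;
```

ImmDi1 and ImmDi2 remain available to DEFINE because they are on the NAMES list, but they stay out of the model because they are not on the USEVARIABLES list.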


Thanks, I assumed that I would need to have the constituent variables in the USEVARIABLES list (and also to be imputed) before being able to compute the weighted average. 


Only variables on the IMPUTE list of DATA IMPUTATION are imputed. Variables on the USEVARIABLES list are used to impute the variables on the IMPUTE list. Example 11.5 goes over this. So I think you should take them off the USEVARIABLES list and put them on the IMPUTE list if you want them imputed. For further help on this, you should send the output and data along with your license number to support@statmodel.com so we can see the full picture and give you a complete answer. 


My dataset consists of observations over a number of years. I wish to analyse a subset of the dataset according to year (easy enough with USEOBSERVATIONS), but I am having trouble creating a dummy variable by recoding the selected years. If I use:
USEOBSERVATIONS ARE (Year EQ 2006) OR (Year EQ 2009);
DEFINE: IF (Year EQ 2009) THEN Dummy = 0;
IF (Year EQ 2006) THEN Dummy = 1;
I get an error that all of my variables have missing data. If I use:
USEOBSERVATIONS ARE (Year EQ 2006) OR (Year EQ 2009);
DEFINE: Dummy = 0;
IF (Year EQ 2006) THEN Dummy = 1;
I get an error saying that my dummy variable has no variance on half of the cases while the other half of the cases are missing on all variables. I can manually recode the variable in Excel and the analysis works, but I am looking to save time.


Please send the output, data set, and your license number to support@statmodel.com. 
