Michael posted on Thursday, April 21, 2005  2:43 pm



I’ve been trying to run a CFA with binary outcomes. I keep getting an error message which reads, “Categorical variable S1 contains less than 2 categories.” I’ve double checked the data file and there are definitely observations that reflect both possible responses (true and false) along with some missing data. The proportion of responses is approximately 90% true and 10% false. Do you know what could account for this error message?

TITLE: CFA multi item and scale test
DATA: FILE IS G:\RCS Items (WOMEN)(04.15.05).dat;
  FORMAT IS free;
  TYPE IS individual;
VARIABLE: NAMES ARE D1-D2 S1-S10 L1-L10 C1-C10 P1-P10 DN1-DN10 A1-A10 H1-H10;
  USEVARIABLES ARE S1-S10 L1-L10 C1-C10 P1-P10 DN1-DN10 A1-A10 H1-H10;
  CATEGORICAL ARE S1-S10 L1-L10 C1-C10 P1-P10 DN1-DN10 A1-A10 H1-H10;
  MISSING ARE all (99);
ANALYSIS: TYPE IS general;
  ESTIMATOR = WLSMV;
  MATRIX = covariance;
MODEL: S BY S1@1 S2-S10;
  L BY L1@1 L2-L10;
  C BY C1@1 C2-C10;
  P BY P1@1 P2-P10;
  DN BY DN1@1 DN2-DN10;
  A BY A1@1 A2-A10;
  H BY H1@1 H2-H10;
OUTPUT: standardized sampstat modindices (0) residual;


If you are reading your data correctly, then it is most likely the case that after listwise deletion, some variables have only one category. Any observation with a missing value on one or more analysis variables is eliminated by the default of listwise deletion. If you cannot resolve this, please send the input/output, your data, and your license number to support@statmodel.com. Note also that with categorical factor indicators, MATRIX = COVARIANCE is not allowed. 
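
The listwise-deletion explanation above can be checked outside Mplus. The following is a minimal Python sketch, not part of the original thread: it assumes the data are already loaded as rows of integers and that 99 is the missing-value code, as in the input above. The function name and the toy rows are hypothetical.

```python
from collections import Counter

def categories_after_listwise(rows, missing_code=99):
    """Return, per column, the value counts among complete rows only."""
    # Listwise deletion: drop any row with a missing value in any column.
    complete = [r for r in rows if missing_code not in r]
    n_cols = len(rows[0]) if rows else 0
    return [Counter(r[j] for r in complete) for j in range(n_cols)]

# Toy data (made up): column 0 contains both 0 and 1 overall, but every
# row with a 0 is missing something elsewhere, so only 1s survive.
rows = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 99, 1],
    [0, 1, 99],
]
counts = categories_after_listwise(rows)
for j, c in enumerate(counts):
    flag = "  <-- fewer than 2 categories" if len(c) < 2 else ""
    print(f"column {j}: {dict(c)}{flag}")
```

With a 90%/10% split and many indicators, it is plausible that the rarer response disappears for some variable once incomplete rows are dropped, which is the situation this check is meant to expose.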


While trying to run a multigroup LCA model, I am receiving the following error.

*** ERROR Categorical variable SMOKE contains less than 2 categories.

I have double checked the data and my smoke category contains more than two categories. I have also tried deleting the smoke variable and the error repeats with the next variable in the model. I would like to know if I am specifying the model correctly. My input is as follows:

VARIABLE: NAMES ARE WEIGHT PSU STRATUM BMIPCT PSU2 blckwht smoke alcohol1 alcohol2 pot coke inhale hard;
  USEVARIABLES ARE smoke alcohol1 alcohol2 pot coke inhale hard;
  Categorical ARE smoke alcohol1 alcohol2 pot coke inhale hard;
  CLASSES = cg (2) c(4);
  KNOWNCLASS = cg (blckwht = 0 blckwht=1);
  WEIGHT IS WEIGHT;
  CLUSTER IS PSU2;
  Missing ARE all (999);
ANALYSIS: type = mixture complex;
  starts = 500 10;
  iterations = 1000;
Model: %overall%
  c#1-c#3 ON cg#1;
OUTPUT: SAMP stand cint tech11;
Plot: TYPE IS PLOT3;
  SERIES IS smoke alcohol1 alcohol2 pot coke inhale hard(*);


Please send your input, data, output, and license number to support@statmodel.com. 

Martin H. posted on Thursday, September 02, 2010  8:31 am



My problem is quite similar. I've been trying to run a CFA with binary outcome (Rasch model) with multimatrix data. The data looks like this (example data):

1330010415 1 1 1 1 1 . .
1330030125 1 1 1 0 1 . .
1330050102 1 1 0 1 0 . .
1330060304 1 0 0 0 1 . .
1340020211 1 0 0 0 1 . .
1330010417 3 . . 0 1 1 1
1330030127 3 . . 1 0 1 0
1330050104 3 . . 0 1 1 1
1330060306 3 . . 0 0 1 1
1340020213 3 . . 0 1 0 0

TITLE: muma_rasch_test;
DATA: file = "muma_rasch_test1.dat";
  format = free;
VARIABLE: names = idperson booklet Item1 Item2 Item3 Item4 Item5 Item6;
  usevar = booklet Item1 Item2 Item3 Item4 Item5 Item6;
  categorical = Item1 Item2 Item3 Item4 Item5 Item6;
  missing = .;
  grouping = booklet(1=booklet1, 3=booklet3);
MODEL: Latent by Item1 Item2 Item3 Item4 Item5 Item6 (1);
  Latent@1;
MODEL booklet1: Latent by Item1 Item2 Item3 Item4 (1);
  [Item1] (2);
  [Item2] (3);
  [Item3] (4);
  [Item4] (5);
MODEL booklet3: Latent by Item3 Item4 Item5 Item6 (1);
  [Item3] (4);
  [Item4] (5);
  [Item5] (6);
  [Item6] (7);

*** ERROR Categorical variable ITEM5 contains less than 2 categories.


Please send your input, data, output, and license number to support@statmodel.com. You may be reading the data incorrectly or subsetting in a way that one group has only one value for item5.
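
One way to see whether a group ends up with fewer than two observed values for an item is to tabulate the categories separately per group. The sketch below is an illustration only: it uses the ten example rows posted above (the real data file may differ), treats "." as missing, and the function name is hypothetical.

```python
def categories_by_group(rows, group_col, missing="."):
    """Map each group value to a list of sets of observed values per column."""
    out = {}
    for r in rows:
        sets = out.setdefault(r[group_col], [set() for _ in r])
        for j, v in enumerate(r):
            if j != group_col and v != missing:
                sets[j].add(v)
    return out

# The example rows from the post: idperson, booklet, Item1..Item6.
EXAMPLE = """\
1330010415 1 1 1 1 1 . .
1330030125 1 1 1 0 1 . .
1330050102 1 1 0 1 0 . .
1330060304 1 0 0 0 1 . .
1340020211 1 0 0 0 1 . .
1330010417 3 . . 0 1 1 1
1330030127 3 . . 1 0 1 0
1330050104 3 . . 0 1 1 1
1330060306 3 . . 0 0 1 1
1340020213 3 . . 0 1 0 0
"""
names = ["idperson", "booklet", "Item1", "Item2", "Item3", "Item4", "Item5", "Item6"]
rows = [line.split() for line in EXAMPLE.splitlines()]
cats = categories_by_group(rows, group_col=1)

for group, sets in sorted(cats.items()):
    for j in range(2, len(names)):
        if len(sets[j]) < 2:
            print(f"booklet {group}: {names[j]} has {len(sets[j])} observed categories")
```

On the posted example rows, this reports that Item5 and Item6 have no observed values in booklet 1, and that Item1 and Item2 have none in booklet 3. Whether that is what triggers the error in the actual run cannot be confirmed from the thread.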


Hi; While trying to run a CFA model, I am receiving the following error. *** ERROR Categorical variable V2 contains less than 2 categories. Thank you for your help 


You may be subsetting the data such that v2 has the same value for everyone. If you can't figure it out, please send your input, data, output, and license number to support@statmodel.com. 

IYH Boon posted on Friday, June 01, 2012  1:54 pm



I'm trying to estimate a LCGA using some data that I simulated. The outcome is binary, but in some time points there is no variation across observations (i.e., everyone has a 0 or everyone has a 1). As a result, I'm getting the "Categorical variable contains less than 2 categories error." Is there an easy workaround for this problem? 


Try adding VARIANCES=NOCHECK to the DATA command. 

IYH Boon posted on Monday, June 04, 2012  7:55 am



Thanks, Linda, but unfortunately that didn't do the trick. After adding VARIANCES=NOCHECK to the DATA command, I still get the same error: "Categorical variable V2 contains less than 2 categories." 


You can try the CATEGORICAL * option with ML but I don't think that will help. You may have to use only time points where you have variability. 

Sarah Moens posted on Thursday, December 06, 2012  2:17 am



I also had the same error message stating that the categorical variable contained less than 2 values. However, I figured out that in my case this had to do with the way Mac saves .txt, .csv, .dat and similar files. Mac uses different line breaks than, e.g., Windows: \r vs. \r\n. Although this is not visible in the file itself, this is apparently how it is stored. When you use a general text editor like TextWrangler, you can usually specify what type of line break to use: Mac (CR), Unix (LF), or Windows (CRLF). Choosing Windows (CRLF) fixed the error for me and allowed the file to be read properly.
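
For anyone who prefers a script to a text editor, the same line-break conversion can be sketched in Python. This is an illustrative sketch, not something from the original posts; the function names are hypothetical, and the file name in the commented usage is a placeholder.

```python
def detect_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF line breaks in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }

def to_crlf(data: bytes) -> bytes:
    """Normalize CR, LF, or CRLF line breaks to Windows-style CRLF."""
    # Collapse CRLF and lone CR to LF first, then expand every LF to CRLF.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")

# Old-style Mac line breaks (CR only) in a tiny made-up data file.
sample = b"1 0 1\r0 1 1\r"
print(detect_line_endings(sample))
print(to_crlf(sample))

# Usage on a real file (placeholder name), reading and writing in binary mode:
# with open("mydata.dat", "rb") as f:
#     fixed = to_crlf(f.read())
# with open("mydata_crlf.dat", "wb") as f:
#     f.write(fixed)
```

Working on bytes rather than text avoids Python's own newline translation, so the counts reflect what is actually stored in the file.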

Susu Zhang posted on Friday, February 14, 2014  10:50 am



This happened to me, too, when I tried running LCA. And here is the output with the error message:

INPUT INSTRUCTIONS

TITLE: LCA for Puerto Rico
DATA: FILE IS PuertoRicoLCA.dat;
  FORMAT IS (1F4,120F1);
VARIABLE: NAMES ARE id cb1-cb55 cb56a cb56b cb56c cb56d cb56e cb56f cb56g cb57-cb112;
  USEVARIABLES ARE cb14 cb29 cb30 cb31 cb32 cb33 cb35 cb45 cb50 cb52 cb71 cb91 cb112 cb1 cb8 cb10 cb13 cb17 cb41 cb61 cb80 cb3 cb16 cb19 cb20 cb21 cb22 cb23 cb37 cb57 cb68 cb86 cb87 cb88 cb89 cb94 cb95 cb97 cb104;
  CATEGORICAL ARE cb14 cb29 cb30 cb31 cb32 cb33 cb35 cb45 cb50 cb52 cb71 cb91 cb112 cb1 cb8 cb10 cb13 cb17 cb41 cb61 cb80 cb3 cb16 cb19 cb20 cb21 cb22 cb23 cb37 cb57 cb68 cb86 cb87 cb88 cb89 cb94 cb95 cb97 cb104;
  CLASSES = c (3);
... ... (a few more lines)

*** ERROR Categorical variable CB29 contains less than 2 categories.

I went back to the data set and checked variable CB29, and it does contain 2 categories. Is there anything I can do to fix this?


You should look at the data set that Mplus reads not the original data set. You may have blanks in the data set Mplus reads causing the data to be read incorrectly. If you can't see the problem, send the input, data, output, and your license number to support@statmodel.com. 
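
With a fixed format such as (1F4,120F1), every record is expected to be 4 + 120 = 124 characters wide, so a short record or a blank inside a one-character field is easy to miss by eye. The sketch below is a hedged illustration of how one might scan the file that Mplus actually reads; the function name and the toy records are hypothetical, and it makes no claim about how Mplus itself treats blanks.

```python
def check_fixed_width(lines, widths):
    """Return (line_number, problem) pairs for records that do not fit the layout."""
    expected = sum(widths)
    problems = []
    for n, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        if len(record) != expected:
            problems.append((n, f"length {len(record)}, expected {expected}"))
            continue
        pos = 0
        for k, w in enumerate(widths):
            # A field made up only of spaces is reported as blank.
            if record[pos:pos + w].strip() == "":
                problems.append((n, f"field {k + 1} is blank"))
            pos += w
    return problems

# Toy layout: one 4-character id followed by three 1-character items.
toy_widths = [4, 1, 1, 1]
toy_lines = ["   1101\n", "   2 10\n", "   301\n"]
for line_no, problem in check_fixed_width(toy_lines, toy_widths):
    print(f"line {line_no}: {problem}")

# For the format in the post, the widths would be [4] + [1] * 120.
```

Any line reported here is a candidate for the kind of misreading described above, since a shifted or blank column can leave a variable with values other than the ones seen in the original data set.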

JinHee Hur posted on Saturday, July 22, 2017  5:52 pm



Hello, I ran into a similar problem. I first tried the WLSMV estimator and got the error saying "Categorical variable ITEM1 contains less than 2 categories." I saw in other threads that ML or MLR should be used to retain as much information as possible when there are missing values. However, it resulted in the same error. Part of the code I've used is:

CATEGORICAL ARE item1 item2 item3 item4 item5 item6 item7 item8 item9 item10 item11 item12 item13 item14 item15 item16 item17 item18 item19 item20 item21 item22 item23 item24 item25 item26 item27 item28 item29 item30 item31 item32 item33 item34 item35 item36 item37;
  MISSING = ALL (9);
ANALYSIS: TYPE IS GENERAL;
  ESTIMATOR IS ML;
  ITERATIONS = 1000;
  CONVERGENCE = .00005;
MODEL: F1 by item1 item2 item3 item4 item5 item6 item7 item8 item9 item10 item11 item12 item13 item14 item15 item16 item17 item18 item19 item20 item21 item22 item23 item24 item25 item26 item27 item28 item29 item30 item31 item32 item33 item34 item35 item36 item37;
  F1@1;
OUTPUT:

*** ERROR Categorical variable ITEM1 contains less than 2 categories.

Does this mean I should apply multiple imputation, or do you suggest using other estimators? Thank you


If Item1 has less than 2 categories it has only 1 category and is therefore not a variable but a constant. So you should delete the item from the USEV list. 

JinHee Hur posted on Sunday, July 23, 2017  6:57 pm



Thank you for the response, Dr. Muthen. All of the items in the dataset have at least 4 categories. Based on the responses to the previous questions, it seems to be saying that there are fewer than 2 categories after listwise deletion. Is there a way I can run it with pairwise deletion? I'm afraid deleting the items in the model would change the substantive meaning of the data (survey). Could you suggest an alternative method to resolve the issue? Thank you


ML does not use listwise deletion; it does so only if you request it. Check your data by using Type=Basic and not declaring the variables as categorical.

mboer posted on Saturday, February 23, 2019  4:48 am



Dear Dr. Muthen, I have a two-level model with a dichotomous dependent variable, using imputed data. To create my dependent variable, I recoded a variable that has imputed values for missing data using the DEFINE command. When I execute the analysis, I get the error that the dependent variable has less than two categories. I checked the datasets, and the variable really has five different values. I also checked the sample statistics, and this confirmed that the variable has multiple values. The problem also appears when I use other recoded dependent variables. I also conducted the analysis using the variable without recoding (treating it as a continuous variable), and then the analysis seems to work fine. Do you have any idea what the problem could be? This is how I defined the variable: DEFINE: dbullied = BULLIED>2; whereby BULLIED is a continuous variable with imputed values. I also tried replacing '2' with other values, but the error keeps returning. Thank you in advance.


It sounds like you are not reading the imputed data sets correctly. Are you using the information from the end of the output where you imputed the data sets to read them? If you can't see the problem, send the files and your license number to support@statmodel.com. 
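
A quick way to test the "not reading the imputed data sets correctly" explanation is to apply the same cut (BULLIED > 2) to the column one believes is BULLIED in each imputed data set and count both outcomes. This Python sketch is illustrative only: the function name, the toy values, and the idea that a wrong column (for example an ID) is being read are assumptions, not facts from the thread.

```python
def dichotomized_counts(values, cut=2):
    """Count values at or below the cut (coded 0) and above it (coded 1)."""
    counts = {0: 0, 1: 0}
    for v in values:
        counts[int(v > cut)] += 1
    return counts

# Made-up columns standing in for BULLIED in two imputed data sets.
imputed_sets = {
    "imp1": [1.0, 2.0, 2.4, 3.7],   # plausible imputed values
    "imp2": [1001, 1002, 1003],     # what a misread column (e.g. an ID) might look like
}
for name, values in imputed_sets.items():
    counts = dichotomized_counts(values)
    note = "  <-- only one category after recoding" if 0 in counts.values() else ""
    print(f"{name}: {counts}{note}")
```

If every data set shows both categories for the intended column, the column positions Mplus uses when reading the files would be the next thing to compare against the layout reported at the end of the imputation output.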
