Michael posted on Thursday, April 21, 2005 - 2:43 pm
I’ve been trying to run a CFA with binary outcomes. I keep getting an error message which reads, “Categorical variable S1 contains less than 2 categories.” I’ve double checked the data file and there are definitely observations that reflect both possible responses (true and false) along with some missing data. The proportion of responses is approximately 90% true and 10% false. Do you know what could account for this error message?
TITLE: CFA multi item and scale test
DATA: FILE IS G:\RCS Items (WOMEN)(04.15.05).dat;
  FORMAT IS free;
  TYPE IS individual;
VARIABLE: NAMES ARE D1-D2 S1-S10 L1-L10 C1-C10 P1-P10 DN1-DN10 A1-A10 H1-H10;
  USEVARIABLES ARE S1-S10 L1-L10 C1-C10 P1-P10 DN1-DN10 A1-A10 H1-H10;
  CATEGORICAL ARE S1-S10 L1-L10 C1-C10 P1-P10 DN1-DN10 A1-A10 H1-H10;
  MISSING ARE all (99);
ANALYSIS: TYPE IS general;
  ESTIMATOR = WLSMV;
  MATRIX = covariance;
MODEL: S BY S1@1 S2-S10;
  L BY L1@1 L2-L10;
  C BY C1@1 C2-C10;
  P BY P1@1 P2-P10;
  DN BY DN1@1 DN2-DN10;
  A BY A1@1 A2-A10;
  H BY H1@1 H2-H10;
OUTPUT: standardized sampstat modindices (0) residual;
If you are reading your data correctly, then it is most likely the case that after listwise deletion, some variables have only one category. Any observation with a missing value on one or more analysis variables is eliminated by the default of listwise deletion. If you cannot resolve this, please send the input/output, your data, and your license number to firstname.lastname@example.org.
Note also that with categorical factor indicators, MATRIX = COVARIANCE is not allowed.
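One way to check the listwise-deletion explanation outside Mplus is to mimic the deletion on the raw file and count the distinct values left in each column. The following is a minimal Python sketch, not an Mplus feature. It assumes a whitespace-delimited (free format) file with no header row and a single missing-value code of 99, as in the input above. The function names and the sample rows are made up for illustration.

```python
def categories_after_listwise(rows, missing=99.0):
    """Return, per column, the set of values left after listwise deletion.

    rows    -- list of equal-length lists of numbers (one list per case)
    missing -- the missing-value code (99 in the input above)
    """
    # Listwise deletion: keep only cases with no missing value at all.
    complete = [r for r in rows if missing not in r]
    n_cols = len(rows[0]) if rows else 0
    return [{r[j] for r in complete} for j in range(n_cols)]


def read_free_format(path):
    """Read a whitespace-delimited numeric file into a list of rows."""
    with open(path) as fh:
        return [[float(x) for x in line.split()] for line in fh if line.strip()]


# Made-up example: column 0 has both 0 and 1 in the raw data, but every
# case with a 0 is dropped because it is missing on another column.
example = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 99, 1],
    [0, 1, 99],
]
remaining = categories_after_listwise(example)
constant_columns = [j for j, values in enumerate(remaining) if len(values) < 2]
print(constant_columns)
```

For a real file, the rows would come from read_free_format and should be restricted to the columns on the USEVARIABLES list, since only analysis variables take part in the deletion.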
While trying to run a multi-group LCA model, I am receiving the following error.
*** ERROR Categorical variable SMOKE contains less than 2 categories.
I have double checked the data and my smoke variable contains more than two categories. I have also tried deleting the smoke variable, and the error repeats with the next variable in the model. I would like to know if I am specifying the model correctly.
You may be subsetting the data such that v2 has the same value for everyone. If you can't figure it out, please send your input, data, output, and license number to email@example.com.
IYH Boon posted on Friday, June 01, 2012 - 1:54 pm
I'm trying to estimate an LCGA using some data that I simulated. The outcome is binary, but at some time points there is no variation across observations (i.e., everyone has a 0 or everyone has a 1). As a result, I'm getting the "Categorical variable contains less than 2 categories" error. Is there an easy workaround for this problem?
You can try the CATEGORICAL * option with ML but I don't think that will help. You may have to use only time points where you have variability.
Sarah Moens posted on Thursday, December 06, 2012 - 2:17 am
I also had the same error message stating that the categorical variable contained less than 2 values.
However, I figured out that in my case this had to do with the way Mac saves .txt, .csv, and .dat files. Mac uses different line breaks than, e.g., Windows: \r vs. \r\n. Although this is not visible in the file itself, this is apparently how it is stored.
When you use a general text editor like TextWrangler, you can usually specify what type of line break to use: Mac (CR), Unix (LF), or Windows (CRLF). Choosing Windows (CRLF) fixed the error for me and allowed the file to be read properly.
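The same conversion can be scripted instead of done in an editor. This is a minimal Python sketch that works on the raw bytes of a file; the function names and the sample bytes are made up for illustration.

```python
def to_crlf(data: bytes) -> bytes:
    """Normalize CR, LF, or CRLF line endings to CRLF."""
    # First collapse everything to LF, then expand LF to CRLF.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def describe_line_endings(data: bytes) -> dict:
    """Count each kind of line ending in the raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,  # bare carriage returns
        "LF": data.count(b"\n") - crlf,  # bare line feeds
    }


# Made-up two-line data file saved with bare CR line breaks.
cr_only = b"1 0 1\r0 1 1\r"
print(describe_line_endings(cr_only))
print(to_crlf(cr_only))
```

To apply it to a real file, read the file with open(path, "rb"), pass the bytes through to_crlf, and write the result to a new file opened with "wb" so the original stays untouched.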
Susu Zhang posted on Friday, February 14, 2014 - 10:50 am
This happened to me, too, when I tried running LCA. And here is the output with the error message: INPUT INSTRUCTIONS
TITLE: LCA for Puerto Rico
DATA: FILE IS PuertoRicoLCA.dat;
  FORMAT IS (1F4,120F1);
VARIABLE: NAMES ARE id cb1-cb55 cb56a cb56b cb56c cb56d cb56e cb56f cb56g cb57-cb112;
  USEVARIABLES ARE cb14 cb29 cb30 cb31 cb32 cb33 cb35 cb45 cb50 cb52 cb71 cb91 cb112
    cb1 cb8 cb10 cb13 cb17 cb41 cb61 cb80
    cb3 cb16 cb19 cb20 cb21 cb22 cb23 cb37 cb57 cb68 cb86 cb87 cb88 cb89 cb94 cb95 cb97 cb104;
  CATEGORICAL ARE cb14 cb29 cb30 cb31 cb32 cb33 cb35 cb45 cb50 cb52 cb71 cb91 cb112
    cb1 cb8 cb10 cb13 cb17 cb41 cb61 cb80
    cb3 cb16 cb19 cb20 cb21 cb22 cb23 cb37 cb57 cb68 cb86 cb87 cb88 cb89 cb94 cb95 cb97 cb104;
  CLASSES = c (3);
... ... (a few more lines)
*** ERROR Categorical variable CB29 contains less than 2 categories.
I went back to the data set and checked variable CB29, and it does contain 2 categories. Is there anything I can do to fix this?
You should look at the data set that Mplus reads not the original data set. You may have blanks in the data set Mplus reads causing the data to be read incorrectly. If you can't see the problem, send the input, data, output, and your license number to firstname.lastname@example.org.
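A quick way to look for such blanks is to scan the file that Mplus actually reads, field by field. The following is a minimal Python sketch, assuming a fixed-format file whose field widths are known. For the FORMAT statement in the post above, the widths would be [4] + [1] * 120. The function name and the sample lines are made up for illustration.

```python
def find_blank_fields(lines, widths):
    """Return (line_number, field_index) pairs for blank or truncated fields.

    lines  -- iterable of text lines from the data file
    widths -- field widths in column order, e.g. [4] + [1] * 120
    """
    problems = []
    for line_no, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        start = 0
        for field_index, width in enumerate(widths):
            field = record[start:start + width]
            # A field is suspect if it is all blanks or the line ended early.
            if len(field) < width or field.strip() == "":
                problems.append((line_no, field_index))
            start += width
    return problems


# Made-up three-field example (widths 4, 1, 1): the second line has a blank
# in field 1 and the third line is one character short.
sample = ["000110\n", "0002 1\n", "00031\n"]
print(find_blank_fields(sample, [4, 1, 1]))
```

Line numbers are 1-based and field indices are 0-based, so field 0 is the id column. A field with leading blanks but a value in it, such as "  12", is not flagged.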
JinHee Hur posted on Saturday, July 22, 2017 - 5:52 pm
Hello, I ran into a similar problem. I first tried the WLSMV estimator and got the error "Categorical variable ITEM1 contains less than 2 categories." I saw in other threads that ML or MLR should be used to retain as much information as possible from cases with missing values. However, it resulted in the same error.
If Item1 has less than 2 categories it has only 1 category and is therefore not a variable but a constant. So you should delete the item from the USEV list.
JinHee Hur posted on Sunday, July 23, 2017 - 6:57 pm
Thank you for the response Dr. Muthen.
All of the items in the dataset have at least 4 categories. Based on the responses to the previous questions, it seems to be saying that there are fewer than 2 categories after listwise deletion. Is there a way I can run it with pairwise deletion? I'm afraid deleting the items from the model would change the substantive meaning of the data (survey). Could you suggest an alternative method to resolve the issue?
ML does not use listwise deletion unless you request it.
Check your data by using Type=Basic and not declaring the variables as categorical.
mboer posted on Saturday, February 23, 2019 - 4:48 am
Dear Dr. Muthen,
I have a two-level model with a dichotomous dependent variable, using imputed data. To create my dependent variable, I recoded a variable that has imputed values for missing data using the DEFINE command. When I execute the analysis, I get the error that the dependent variable has less than two categories. I checked the datasets, and the variable really has five different values. I also checked the sample statistics, and this confirmed that the variable has multiple values. The problem also appears when I use other recoded dependent variables. I also conducted the analysis using the variable without recoding (treating it as a continuous variable), and then the analysis seems to work fine. Do you have any idea what the problem could be? This is how I defined the variable:
DEFINE: dbullied = BULLIED>2;
whereby BULLIED is a continuous variable with imputed values. I also tried replacing '2' with other values, but the error keeps returning.
It sounds like you are not reading the imputed data sets correctly. Are you using the information from the end of the output where you imputed the data sets to read them? If you can't see the problem, send the files and your license number to email@example.com.
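As a rough check outside Mplus, the cutoff from the DEFINE statement above can be applied to the BULLIED column of each imputed data set to see whether both categories appear. This is a minimal Python sketch only. The function name, the data set labels, and the values are made up, and pulling the right column out of each imputed file is left to the reader.

```python
from collections import Counter


def dichotomize_counts(values, cutoff=2):
    """Count how many values fall at or below (0) and above (1) the cutoff."""
    return Counter(1 if v > cutoff else 0 for v in values)


# Made-up columns standing in for BULLIED as read from two imputed data sets.
imputed_sets = {
    "imp1": [1.0, 2.4, 3.1, 1.7, 4.0],
    "imp2": [5.0, 5.0, 5.0, 5.0, 5.0],  # what a misread column might look like
}
for name, bullied in imputed_sets.items():
    counts = dichotomize_counts(bullied)
    status = "ok" if len(counts) == 2 else "only one category"
    print(name, dict(counts), status)
```

A data set that shows only one category after the cutoff is a candidate for the reading problem described above, for example because a different column is being picked up as BULLIED.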