Cluster Issues

Mplus Discussion > Multilevel Data/Complex Sample >

Anonymous posted on Saturday, October 08, 2005 - 9:19 am

I am trying to develop a twolevel model using individuals as the clusters. I hope that this is possible. At the moment, my input results in the following error:

*** ERROR
One or more between-level variables have variation within a cluster for
one or more clusters. Check your data and format statement.

Between      Cluster ID with variation in this variable
Variable     (only one cluster ID will be listed)

FAMTY        ******

What does this mean? How do I repair this problem with the data?

Thank you in advance for your response.

Linda K. Muthen posted on Saturday, October 08, 2005 - 1:56 pm

Yes, it is possible to have individual as the cluster variable.

A between-level variable must have the same value for every member of a cluster. You must be violating that. If this does not help, send your input, data, output, and license number to support@statmodel.com.

Anonymous posted on Saturday, October 08, 2005 - 7:41 pm

If I'm understanding you correctly, every individual has to have the same response value? For instance, if I want to examine family structure at level-2, then every individual should have the same score. If that is the case, should I delete the individuals that deviate? If I delete the individuals that deviate would you consider this a comparison group?

Linda K. Muthen posted on Sunday, October 09, 2005 - 8:02 am

Please send your input, data, output, and license number to support@statmodel.com so we can see exactly what you are doing.

cvanhull posted on Thursday, January 05, 2006 - 2:04 pm

Dear Dr. Muthen,

I am attempting a CFA using twins. What I would like to do is account for not only clustering within families but the different degree of similarity in identical and fraternal twins. Is there a way to do this in Mplus?

Thank you,

Carol Van Hulle

bmuthen posted on Friday, January 06, 2006 - 9:33 am

Yes, you do a 2-group analysis of MZ and DZ twins, where you specify a group difference in the correlation. See also Carol Prescott's article on our web site which deals with categorical outcomes. Many more twin and family models will soon be posted on our web site showing new features in Mplus Version 4.

cvanhull posted on Tuesday, January 10, 2006 - 8:22 am

Dr. Muthen,

Thank you for your help. If I understand correctly, a 2 group analysis would give me the factor structure for MZ and DZ twins separately. However, I would like to use all my data in a single group.

Say, for example, that I have 5 items which I expect to load on a single underlying factor or trait. If I understand the issue of clustering, by measuring these items on related individuals some of the responses (i.e. responses from related individuals) will be correlated and this correlation must be accounted for. However, if I measure the 5 items on identical and fraternal twins, then responses from identical twins will be more highly correlated than responses from fraternal twins.

Is there a way to account for this differential similarity but still use all of the data in a single analysis?

Thanks,
Carol Van Hulle

bmuthen posted on Tuesday, January 10, 2006 - 8:43 am

The 2-group analysis is the way to go. This is a single analysis of all the data for both MZ and DZ twins. The SEM term "2-group" simply means that you allow different parameters for the 2 groups in this single run of all the data. See for example the Neale & Cardon (1992) book. With a single item, the analysis has 2 outcomes (one for each twin). The clustering is taken into account in the model by letting the 2 outcomes correlate as a function of the "ACE" factors. Identical twins have a higher correlation between the A factors (1.0) than fraternal twins (0.5). With 5 items, you would analyze 10 outcomes. The ACE correlation structure would be imposed on the 2 factors (one for each twin). Hope this helps.
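To make the MZ/DZ difference concrete, here is a minimal sketch of the twin correlation implied by an ACE decomposition for a single outcome. It assumes the standard textbook setup: the A factors correlate 1.0 for MZ and 0.5 for DZ pairs (as stated in the post), the C factors are fully shared in both groups, and the E factors are uncorrelated. The function name and the variance values are hypothetical and chosen only for illustration; this is not Mplus input.

```python
def implied_twin_correlation(a2, c2, e2, r_a):
    """Model-implied correlation between the two twins' outcomes.

    a2, c2, e2 : variance due to the A, C and E factors (illustrative values)
    r_a        : correlation between the twins' A factors (1.0 MZ, 0.5 DZ)
    """
    total = a2 + c2 + e2        # variance of each twin's outcome
    covariance = r_a * a2 + c2  # A shared at r_a, C fully shared, E not shared
    return covariance / total


# Hypothetical variance components, not estimates from any data set.
a2, c2, e2 = 0.5, 0.2, 0.3
mz = implied_twin_correlation(a2, c2, e2, r_a=1.0)
dz = implied_twin_correlation(a2, c2, e2, r_a=0.5)
print(f"MZ: {mz:.2f}, DZ: {dz:.2f}")
```

With these made-up values the MZ correlation comes out higher than the DZ correlation, which is the group difference the 2-group run estimates while still using all of the data.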

MKS posted on Tuesday, July 26, 2011 - 8:50 am

I am trying to develop a twolevel model using brands as the cluster variable. My input result is:

*** ERROR
One or more between-level variables have variation within a cluster for
one or more clusters. Check your data and format statement.

Between      Cluster ID with variation in this variable
Variable     (only one cluster ID will be listed)

When I change the coding of the cluster variable, the error message no longer appears. Is this possible, and is it the real reason for the problem?
Thanks for your help.

Linda K. Muthen posted on Tuesday, July 26, 2011 - 12:42 pm

This message means that a variable you have placed on the BETWEEN list varies within a cluster. If you have done something to make the values the same, the message will not appear. I am not clear on what you are saying. Please send the output and your license number to support for further information.

unanimous posted on Thursday, March 28, 2013 - 10:41 am

I got the same error message:
*** ERROR
One or more between-level variables have variation within a cluster for
one or more clusters. Check your data and format statement.

Between      Cluster ID with variation in this variable
Variable     (only one cluster ID will be listed)

SO1          3
DEC1         3

So I checked my data, and within clusters the values are the same. But if, for example, there are 50 clusters, should there then be 50 unique identifiers for each variable? Is it also relevant that SO1 and DEC1 are binary variables?

Linda K. Muthen posted on Thursday, March 28, 2013 - 10:52 am

If this message is not correct, you are reading your data incorrectly. If you can't see the problem, send the data, your output and your license number to support@statmodel.com.
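One generic way a data file ends up being read incorrectly is that some records have more or fewer fields than the variable list expects, so values land under the wrong variable names. The sketch below checks for that outside Mplus. It assumes a whitespace-delimited file with one record per line; the function name, the variable list and the sample records are all hypothetical.

```python
def rows_with_wrong_field_count(lines, n_names):
    """Return (line_number, field_count) for records that do not have n_names fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # assumption: fields are separated by whitespace
        if not fields:
            continue           # skip blank lines
        if len(fields) != n_names:
            problems.append((number, len(fields)))
    return problems


# Hypothetical variable list and records; line 3 is missing one value.
names = ["id", "cluster", "so1", "dec1", "y"]
sample_lines = [
    "1 3 0 1 2.5",
    "2 3 0 1 3.1",
    "3 3 0 2.9",
    "4 4 1 0 1.7",
]
bad_rows = rows_with_wrong_field_count(sample_lines, len(names))
print(bad_rows)
```

Any record reported here is worth inspecting by hand before rerunning the analysis.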

Daniel Lee posted on Friday, February 24, 2017 - 2:34 pm

Hi Dr. Muthen,

I'd like to follow up on this thread as I am receiving this message as well (running a MSEM):

One or more between-level variables have variation within a cluster for
one or more clusters. Check your data and format statement.

I was wondering if there is a way to identify the clusters contributing to this problem. I have a really big data set w/ a lot of clusters (census block groups), and several block-group-level variables, and doing it manually would take a long time.

Thank you!

Linda K. Muthen posted on Friday, February 24, 2017 - 4:08 pm

The message should give the cluster number. If not, send the output and your license number to support@statmodel.com.
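Because the message lists only one cluster ID per variable, a large data set may be easier to screen outside Mplus first. The following is a minimal sketch, assuming the data can be exported to CSV with a header row; the column names (blockgrp, medinc, pctpov) and the sample values are hypothetical. Values are compared as text, so "0.2" and "0.20" would count as different; convert to numbers first if the file mixes formats.

```python
import csv
import io
from collections import defaultdict


def clusters_with_variation(rows, cluster_col, between_cols):
    """Return {variable: sorted cluster IDs whose members have differing values}."""
    # seen[var][cluster] collects the distinct values observed in that cluster
    seen = {var: defaultdict(set) for var in between_cols}
    for row in rows:
        cluster = row[cluster_col]
        for var in between_cols:
            seen[var][cluster].add(row[var])
    return {
        var: sorted(c for c, values in by_cluster.items() if len(values) > 1)
        for var, by_cluster in seen.items()
    }


# Hypothetical data: block group 102 has two different medinc values.
SAMPLE = """blockgrp,medinc,pctpov,y
101,52000,0.12,3.1
101,52000,0.12,2.7
102,48000,0.20,4.0
102,48500,0.20,3.6
103,61000,0.08,2.2
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = clusters_with_variation(rows, "blockgrp", ["medinc", "pctpov"])
print(report)
```

For a real file, replace the `io.StringIO(SAMPLE)` part with an opened file handle. Every cluster listed for a variable has to be corrected, or the variable taken off the BETWEEN list, before the error goes away.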

FRANS posted on Monday, July 31, 2017 - 10:48 pm

Dear Dr. Muthen,
Normally, I would not want to have clusters without within-cluster variation in the analysis, and usually it is really just a minor problem (if it is an issue at all). However, now I have an outcome variable that has no variation in about as many clusters (relatively speaking) as in User's Guide Example 9.3. In fact, I have noticed that several different multilevel examples in the User's Guide yield the warning message that one or more individual-level variables have no within-cluster variation. Similar to Example 9.3, my outcome variable is not continuous. Now I was wondering if there is any recommendation whether or not to include clusters without variation in one variable, and whether this depends on the respective variable being a predictor or an outcome (esp. in fixed effects models)?
Thank you!

Bengt O. Muthen posted on Tuesday, August 01, 2017 - 6:03 pm

UG ex 9.3 has data generated by the Monte Carlo setup:

csizes = 40 (5) 50 (10) 20 (15);

That is, many clusters have rather few cluster members (5, 10, 15). Combining that with a binary u variable, it is to be expected that some clusters have only one of the 2 u outcomes. I don't think it is known yet to what extent this impacts the quality of the results - that could be a research study. I don't think it is too harmful unless it shows up in most clusters, but that's just a guess.

We added this printout mostly due to the multilevel time series analysis of Version 8, where level-2 is subject and, because you study processes over time, you want to know which subjects have constant outcome values over time.
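A rough back-of-the-envelope calculation shows why small clusters with a binary outcome are expected to produce some clusters with no variation. The sketch below uses the cluster sizes quoted in the post (40 clusters of size 5, 50 of size 10, 20 of size 15) and makes a simplifying assumption that is not part of the example itself: every observation is an independent draw with the same probability p of u = 1. Positive within-cluster correlation would tend to make constant clusters more likely, so the result is best read as a rough lower bound. The function names and the values of p are hypothetical.

```python
def prob_constant(n, p):
    """P(all n binary outcomes in a cluster are equal), assuming independent draws."""
    return p ** n + (1 - p) ** n


def expected_constant_clusters(csizes, p):
    """Expected number of clusters with no variation; csizes is [(count, size), ...]."""
    return sum(count * prob_constant(size, p) for count, size in csizes)


# Cluster sizes from the post: 40 clusters of 5, 50 of 10, 20 of 15.
CSIZES = [(40, 5), (50, 10), (20, 15)]

for p in (0.5, 0.8):  # illustrative outcome probabilities
    expected = expected_constant_clusters(CSIZES, p)
    print(f"p = {p}: about {expected:.1f} clusters expected to be constant")
```

Even at p = 0.5 this gives roughly 2.6 constant clusters, nearly all of them among the clusters of size 5, and the count grows as p moves away from 0.5.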