Mplus Discussion >> Warning message regarding number of clusters



Xu, Man posted on Sunday, March 29, 2009 - 8:48 am

Dear Sir or Madam,

I tested a two-group SEM model using TYPE=COMPLEX and TYPE=IMPUTATION with 10 datasets. There are 47 clusters in total, 32 in one group and 15 in the other. The model has 104 free parameters and 76 degrees of freedom.
The model produced good fit and parameter estimates, but I received warnings involving different parameters for different datasets, like this:

Errors for replication with data file feb 12 ie_fixed_ascii_4.dat:
THE MODEL ESTIMATION TERMINATED NORMALLY
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.487D-16. PROBLEM INVOLVING PARAMETER 33.
THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN
THE NUMBER OF CLUSTERS.

Errors for replication with data file feb 12 ie_fixed_ascii_5.dat:
....same as above
CONDITION NUMBER IS -0.902D-16. PROBLEM INVOLVING PARAMETER 32.
....same as above

Given that the warning involved different parameter numbers, do you think there is something wrong?

Linda K. Muthen posted on Sunday, March 29, 2009 - 11:56 am

We give a warning when there are more parameters than there are clusters. If there are more between-level parameters than clusters, this is a problem because the number of clusters is effectively the sample size on the between level. The effect of having more within and between level parameters than clusters has not been studied.

I would be concerned about the group with only 15 clusters. It is recommended that a minimum of 30-50 clusters be used.
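
To illustrate why the warning appears once parameters outnumber clusters, here is a minimal Python sketch. It assumes (this is not stated in the thread) that the first-order derivative product matrix is a sum of outer products of one score vector per cluster. Under that assumption its rank cannot exceed the number of clusters, so with more parameters than clusters it is singular. The sizes and the random scores are made up for illustration.

```python
from fractions import Fraction
import random


def outer_product_sum(scores):
    """Sum of outer products s s^T over the cluster-level score vectors."""
    p = len(scores[0])
    total = [[Fraction(0)] * p for _ in range(p)]
    for s in scores:
        for i in range(p):
            for j in range(p):
                total[i][j] += Fraction(s[i] * s[j])
    return total


def matrix_rank(matrix):
    """Exact rank via Gaussian elimination on Fractions (no rounding error)."""
    rows = [row[:] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Hypothetical sizes: 6 parameters but only 4 clusters.
rng = random.Random(0)
n_params, n_clusters = 6, 4
scores = [[rng.randint(-5, 5) for _ in range(n_params)] for _ in range(n_clusters)]

product_matrix = outer_product_sum(scores)
rank = matrix_rank(product_matrix)
print("rank:", rank, "clusters:", n_clusters, "parameters:", n_params)
```

A singular matrix has a smallest eigenvalue of exactly zero, which in floating point shows up as a tiny value of either sign, like the condition numbers around 1e-16 quoted above. Which parameter gets flagged in that situation may then depend on rounding, which would be one plausible reason the parameter number differs between imputed datasets.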

Xu, Man posted on Sunday, March 29, 2009 - 12:07 pm

Thank you! In my model I don't have between-level variables. But it seems unusual that the same model received warnings that are not about the same parameter: for one dataset it is parameter 33, but for another it is parameter 32. I wonder what the reason for this is.

Linda K. Muthen posted on Sunday, March 29, 2009 - 12:52 pm

The same principle holds for TYPE=COMPLEX as TYPE=TWOLEVEL. I would need to see the two outputs and your license number at support@statmodel.com to answer this.

Xu, Man posted on Sunday, March 29, 2009 - 4:39 pm

Dear Linda,

Thank you. I think you are right that it is not a good idea to use a complex design in my case. I just found that, when I use TYPE=COMPLEX, the chi-square even went down after I constrained more parameters to be equal across the two groups. When I didn't specify TYPE=COMPLEX, the chi-square went up as expected after I constrained more parameters. In this case, would you still need to see the syntax and data file?

Thanks!

Xu Man

Linda K. Muthen posted on Monday, March 30, 2009 - 9:57 am

If you want me to see why you are getting unexpected results, I would need to see the two outputs and your license number.
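
On the chi-square that went down after adding constraints: the thread never diagnoses this, but one possible contributor is that robust (scaled) chi-square values, such as those from the MLR estimator commonly used with TYPE=COMPLEX, cannot be compared by simple subtraction. The sketch below applies the Satorra-Bentler scaled difference formula to the scaled chi-squares, scaling correction factors, and degrees of freedom of a nested and a comparison model. All input values are hypothetical, and whether the formula is appropriate for chi-squares averaged over imputed datasets is not something this sketch addresses.

```python
def scaled_chisq_difference(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled chi-square difference test statistic.

    t0, c0, d0: scaled chi-square, scaling correction factor, and df
                of the nested (more constrained) model.
    t1, c1, d1: the same quantities for the comparison (less constrained) model.
    Returns (scaled difference statistic, difference in df).
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction")
    # Convert each scaled chi-square back to its unscaled value, then rescale.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical values, not taken from any output in this thread.
statistic, df_diff = scaled_chisq_difference(
    t0=120.0, c0=1.30, d0=80,
    t1=100.0, c1=1.25, d1=76,
)
print(round(statistic, 3), df_diff)
```

The point of the design is that the raw difference of two scaled chi-squares is not itself chi-square distributed, so a scaled value that moves in the "wrong" direction after adding constraints is not by itself evidence of an error.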

Mi-young Webb posted on Friday, February 05, 2010 - 8:53 am

I am trying to run multilevel CFA for my manuscript under revision. I have 28 items measuring 3 latent variables and 38 clusters (total number of observations is 418).

When I ran multilevel CFA, I got the message: "THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES."

So I ran the model with TYPE = COMPLEX together with the CLUSTER option. I got the same message as Xu.

THE STANDARD ERRORS OF THE MODEL
PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE
TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT
MATRIX. THIS MAY BE DUE TO THE
STARTING VALUES BUT MAY ALSO BE AN
INDICATION OF MODEL
NONIDENTIFICATION. THE CONDITION
NUMBER IS -0.333D-16. PROBLEM
INVOLVING PARAMETER 38.

THIS IS MOST LIKELY DUE TO HAVING
MORE PARAMETERS THAN THE NUMBER OF
CLUSTERS.

The warning message only mentions the standard errors of the estimates. What about the parameter estimates? Are those trustworthy? Is it possible for me to use them to make the model more parsimonious (remove the items with low factor loadings and R-square values, for example)?

Sorry for the long message and thank you very much for your help.

Linda K. Muthen posted on Friday, February 05, 2010 - 9:46 am

The parameter estimates are fine. If you want to create a more parsimonious model, I would do an EFA as a first step. I would not start altering a CFA.

Mi-young Webb posted on Friday, February 05, 2010 - 10:56 am

Thank you for the prompt response. I agree that EFA is a better tool for data reduction. But I am hesitant, given that researchers have criticized conducting EFA and CFA on the same sample, and my sample is not large enough to split randomly so that EFA and CFA can be conducted separately. Has either Dr. O. Muthen or Dr. L. Muthen written paper(s) on this topic that I can use as justification for such practice?

Linda K. Muthen posted on Friday, February 05, 2010 - 3:07 pm

When you have only one small sample, I think it may be less bad to do both an EFA and CFA on the same sample than to start modifying a simple structure CFA. I have no references for this opinion. Your hypothesized model is the CFA you estimated. So by fixing parameters you are already using it more than once.

In your case, I don't know if your model fits or not. You are dealing with a model with more parameters than clusters. This is definitely a problem if you have more between parameters than clusters. Whether it is a problem if you have more total parameters than clusters has not been studied. It is recommended to have no fewer than 30-50 clusters.

Elisabeth Schüller posted on Wednesday, October 16, 2013 - 1:03 am

Dear Linda,

I had the same problem as Xu and Mi-young:

THIS IS MOST LIKELY DUE TO HAVING
MORE PARAMETERS THAN THE NUMBER OF
CLUSTERS.

However, going by the recommendation in your post of Sunday, March 29, 2009 - 11:56 am, I have 67 clusters, so it should be okay.

Do you know of a citable study that I can use as a source for justification?

Thanks for your help!

Linda K. Muthen posted on Wednesday, October 16, 2013 - 4:01 pm

If you get the message above, you have more parameters than clusters. We only print the message in that case.

I know of no study about this.

Aurelie Lange posted on Monday, July 01, 2019 - 6:00 am

Dear Dr Muthen,

I also have a message regarding more parameters than clusters.
Does the message affect the model fit indices and the chi-square test of model fit? In other words, if the standard errors of the parameters are not trustworthy, can I trust the model fit results and use them for model comparison?

Thank you for your reply.

Sincerely,
Aurelie

Bengt O. Muthen posted on Monday, July 01, 2019 - 5:19 pm

Use Tech1 to check how many cluster-level parameters you have. If there are fewer of them than clusters, you are probably ok.

Aurelie Lange posted on Tuesday, July 02, 2019 - 3:58 am

Dear Dr Muthen,

Thank you for your quick reply. I am using type=complex. So, in my understanding, I don't have cluster-level parameters, as I haven't specified different levels in my model (I merely accounted for clustering through type=complex).
Can I compare model fit indices when I get a warning regarding more parameters than clusters in this situation? Or is it advisable not to?

Thank you!

Sincerely,
Aurelie

Bengt O. Muthen posted on Tuesday, July 02, 2019 - 5:35 pm

How many parameters do you have and how many clusters?

The only way to explore this is a simulation study.

Aurelie Lange posted on Wednesday, July 03, 2019 - 10:59 pm

I have 35 clusters in group A and 29 clusters in group B. I want to test a moderator. The fully restricted model has 57 parameters (both groups are restricted to be equal). The free/unconditional model (both groups can have their own estimates) has 70 parameters.
I want to compare the fit of these two models to know whether I should run separate analyses for the two groups or whether I can test the model on the sample as a whole (which goes fine, as I then have 36 clusters and 35 parameters).

But perhaps the conclusion is that this moderator analysis is too complex for my data?

Thank you for your answer!
Aurelie

Bengt O. Muthen posted on Friday, July 05, 2019 - 3:25 pm

It is hard to know if the testing against the unrestricted 2-group model is ok without doing a Monte Carlo study. Perhaps you can report it with a caveat and also present the BIC values for the restricted and unrestricted models.
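
For the BIC comparison suggested above, a minimal Python sketch of the standard formula, BIC = -2 logL + p ln(n), may help. The parameter counts 57 and 70 come from the post above; the loglikelihood values and the sample size are hypothetical placeholders, and using the number of observations for n (rather than the number of clusters) is an assumption here.

```python
import math


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2 logL + p * ln(n). Lower is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Hypothetical loglikelihoods and sample size; only the parameter
# counts (57 restricted, 70 unrestricted) are taken from the thread.
n_obs = 600
bic_restricted = bic(loglik=-8450.0, n_params=57, n_obs=n_obs)
bic_unrestricted = bic(loglik=-8430.0, n_params=70, n_obs=n_obs)

preferred = "restricted" if bic_restricted < bic_unrestricted else "unrestricted"
print(round(bic_restricted, 1), round(bic_unrestricted, 1), preferred)
```

Because the penalty grows with the number of parameters, the unrestricted model has to improve the loglikelihood enough to offset its 13 extra parameters before BIC prefers it.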