Xu, Man posted on Sunday, March 29, 2009 - 8:48 am
Dear Sir or Madam,
I tested a two-group SEM model using TYPE=COMPLEX and TYPE=IMPUTATION for 10 datasets. There are 47 clusters in total, 32 in one group and 15 in the other. The model has 104 free parameters and 76 degrees of freedom. It produced good model fit and parameter estimates, but I received warnings about different parameters for different datasets, like this:
Errors for replication with data file feb 12 ie_fixed_ascii_4.dat: THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.487D-16. PROBLEM INVOLVING PARAMETER 33. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS.
Errors for replication with data file feb 12 ie_fixed_ascii_5.dat: ...same as above... CONDITION NUMBER IS -0.902D-16. PROBLEM INVOLVING PARAMETER 32. ...same as above...
Given that the warning involved different parameter numbers, do you think there is something wrong?
We give a warning when there are more parameters than there are clusters. If there are more between-level parameters than clusters, this is a problem because the number of clusters is effectively the sample size on the between level. The effect of having more within and between level parameters than clusters has not been studied.
I would be concerned about the group with only 15 clusters. It is recommended that a minimum of 30-50 clusters be used.
Xu, Man posted on Sunday, March 29, 2009 - 12:07 pm
Thank you! In my model I don't have between-level variables. But it seems unusual that the same model received warnings that are not about the same parameter: for one dataset it is parameter 33, but for another it is parameter 32. I wonder what the reason for this is.
The same principle holds for TYPE=COMPLEX as for TYPE=TWOLEVEL. I would need to see the two outputs and your license number at email@example.com to answer this.
Xu, Man posted on Sunday, March 29, 2009 - 4:39 pm
Thank you. I think you are right that it is not a good idea to use a complex design in my case. I just found that, when I use TYPE=COMPLEX, the chi-square actually went down after I constrained more parameters to be equal across the two groups. When I didn't specify TYPE=COMPLEX, the chi-square went up as expected after I constrained more parameters. In this case, would you still need to see the syntax and data file?
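One note on the decreasing chi-square: with TYPE=COMPLEX the reported chi-square is a scaled (MLR-type) statistic, and the raw difference between two scaled chi-squares is not itself chi-square distributed — it can go down, or even negative, when constraints are added. The Satorra-Bentler scaled difference test corrects for this. A minimal sketch in Python (the function name and the numbers in the example are my own, purely illustrative):

```python
def sb_scaled_diff(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled chi-square difference test.

    t0, c0, d0: scaled chi-square, scaling correction factor, and df
                of the nested (more restricted) model.
    t1, c1, d1: the same quantities for the comparison (less
                restricted) model.
    Returns the scaled difference statistic and its df.
    """
    # Difference-test scaling correction.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    # t * c recovers the unscaled ML chi-square for each model.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1

# Hypothetical values for a restricted vs. free two-group model.
trd, ddf = sb_scaled_diff(t0=100.0, c0=1.2, d0=50,
                          t1=80.0, c1=1.1, d1=40)
print(trd, ddf)  # refer trd to a chi-square distribution with ddf df
```

The scaled difference requires both models' scaling correction factors from the output; comparing the printed chi-squares directly, as in the post above, is what produces the counterintuitive decrease.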
I am trying to run a multilevel CFA for my manuscript under revision. I have 28 items measuring 3 latent variables and 38 clusters (418 observations in total).
When I ran multilevel CFA, I got the message: "THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES."
So I ran the model with TYPE=COMPLEX in conjunction with the CLUSTER option. I got the same message as Xu.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.333D-16. PROBLEM INVOLVING PARAMETER 38.
THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS.
The warning message only mentions the standard errors of the estimates. What about the parameter estimates themselves? Are they trustworthy? Could I use them to make the model more parsimonious (for example, by removing items with low factor loadings and R-square values)?
Sorry for the long message, and thank you very much for your help.
Thank you for your prompt response. I agree that EFA is a better tool for data reduction. But I am hesitant given that researchers have criticized conducting EFA and CFA on the same sample. And I do not have a large enough sample to randomly split it and conduct EFA and CFA separately. Has either Dr. O. Muthen or Dr. L. Muthen written a paper on this topic that I can use as justification for such a practice?
When you have only one small sample, I think it may be less bad to do both an EFA and a CFA on the same sample than to start modifying a simple-structure CFA. I have no references for this opinion. Your hypothesized model is the CFA you estimated, so by fixing parameters you are already using the sample more than once.
In your case, I don't know whether your model fits or not. You are dealing with a model that has more parameters than clusters. This is definitely a problem if you have more between-level parameters than clusters. Whether it is a problem to have more total parameters than clusters has not been studied. It is recommended to have no fewer than 30-50 clusters.
I also received the message regarding more parameters than clusters. Does this affect the model fit indices and the chi-square test of model fit? In other words, if the standard errors of the parameters are not trustworthy, can I trust the model fit results and use them for model comparison?
Thank you for your quick reply. I am using TYPE=COMPLEX, so, in my understanding, I don't have cluster-level parameters, as I haven't specified separate levels in my model (I merely accounted for clustering through TYPE=COMPLEX). Can I compare model fit indices when I get a warning regarding more parameters than clusters in this situation? Or is it advisable not to?
I have 35 clusters in group A and 29 clusters in group B. I want to test a moderator. The fully restricted model has 57 parameters (both groups are constrained to be equal). The free/unconditional model (each group has its own estimates) has 70 parameters. I want to compare the fit of these two models to know whether I should run separate analyses for the two groups or whether I can test the model on the sample as a whole (which goes fine, as I then have 64 clusters and 35 parameters).
But perhaps the conclusion is that this moderator analysis is too complex for my data?
It is hard to know whether testing against the unrestricted two-group model is OK without doing a Monte Carlo study. Perhaps you can report it with a caveat and also present the BIC values for the restricted and unrestricted models.
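The BIC comparison suggested above is easy to do by hand from each model's loglikelihood, number of free parameters, and sample size. A minimal sketch (the loglikelihood values and the sample size n = 500 below are hypothetical placeholders, not taken from the poster's models):

```python
import math

def bic(loglik, n_params, n):
    """BIC = -2*logL + p*ln(n); the model with the smaller BIC is preferred."""
    return -2.0 * loglik + n_params * math.log(n)

# Hypothetical loglikelihoods for the 57-parameter restricted model
# and the 70-parameter free model, at a hypothetical sample size of 500.
bic_restricted = bic(-1000.0, 57, 500)
bic_free = bic(-990.0, 70, 500)
print(bic_restricted < bic_free)  # True here: the restricted model is preferred
```

Because BIC penalizes each extra parameter by ln(n), the 13 additional parameters of the free model must buy a substantial loglikelihood improvement to be preferred, which is one reason BIC is a reasonable fallback when the chi-square difference test is in doubt.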