davood posted on Friday, October 01, 2004 - 7:32 pm
Dear Dr. Muthen, I am using a large database with cluster and weight variables. My purpose is to perform a multiple group comparison between 4 groups. It's a fairly complex model in which latent variables are regressed on some background variables. My data are ordinal; I treat them as continuous. I also have missing values. I used TYPE=COMPLEX. I have several questions for you: 1- It takes Mplus 23 minutes to run the analysis. Would you please let me know why it takes that long? (Since the data are confidential, I would only be able to provide the input syntax and partial output results.) 2- I have heard you say that as long as the observed dependent variables are not very badly skewed, the analysis should be okay. Could you please elaborate on this issue? 3- Mplus uses the MLR estimator as the default for this TYPE=COMPLEX analysis. I am interested in the chi-square difference test. I have already consulted the documents on the website regarding the chi-square difference test; but, just to clarify, how would I obtain the "regular" chi-square values needed to perform it? 4- In case I should treat my data as categorical, which estimator would you recommend for chi-square difference testing with missing values?
bmuthen posted on Friday, October 01, 2004 - 8:37 pm
1) Perhaps this example has many outcomes which in addition to having a large sample and 4 groups could be time-consuming. Do you have Type = Missing as well? Do you ask for Modification Indices? Is a run without Type=Complex equally time-consuming? And, hopefully you are using version 3.11.
2) With very skewed items, the regular linear models used in standard continuous-variable analysis aren't realistic or dependable. For instance, a Pearson correlation is attenuated.
davood posted on Tuesday, October 05, 2004 - 5:05 pm
Dear Dr. Muthen, Thanks for your time and your valuable comments. Just to let you know what I used in my analysis: 1) I do not use TYPE=MISSING. When I don't ask for modification indices, the run time decreases by just a few minutes. And I upgraded Mplus to version 3.11; the running time decreased by 1 or 2 minutes.
Another question: I re-ran the same model (for simplicity, this time just for 2 groups), specifying the observed dependent variables as categorical and using the WLSMV estimator (TYPE=COMPLEX, with cluster and weight variables). This time, however, I got this error message:
*** ERROR Cluster ID cannot appear in more than one group. Problem with cluster ID: 50
I did not have this error when I treated my observed dependent variables as continuous, however.
Sincerely, Davood p.s. If you need the syntax, please let me know.
You should send the output where you treat the variables as continuous and also the output where you treat the variables as categorical.
davood posted on Thursday, December 09, 2004 - 3:27 pm
Dear Dr. Muthen, I'm doing a multiple group CFA with one factor and categorical indicators, and Mplus generated this warning:
SERIOUS COMPUTATIONAL PROBLEMS OCCURRED IN THE BIVARIATE ESTIMATION OF THE CORRELATION FOR VARIABLES MOMRELS1 AND MOMCOMM1. CHECK YOUR DATA. IF THE PROGRAM RECOVERS FOR THIS PAIR OF VARIABLES (SEE TECHNICAL 6 OUTPUT), THE ESTIMATES ARE VALID. THE PROBLEM OCCURRED FOR THE FOLLOWING OBSERVATION(S): OBSERVATION(S) WITH MOMRELS1=4 AND MOMCOMM1=4.
I checked the TECH6 output. The correlation between MOMCOMM1 and MOMRELS1 took many iterations to estimate. At the end, however, the message was NORMAL TERMINATION FOR ESTIMATING THE CORRELATION FOR MOMRELS1 AND MOMCOMM1.
I did not have this problem when I ran my model for each group separately. Are my results valid then? Thank you, Davood
Each group must have the same values on the categorical observed variables. So if males have values of 1, 2, and 3 on a variable, females cannot have only 1 and 2. This sometimes comes about with listwise deletion. You need to collapse categories until you have the same values in each group.
Dear Dr. Muthen, I'm doing a multiple group analysis with categorical indicators, cluster and weight variables, using the WLSM estimator. I have encountered partial measurement invariance. To investigate it, I free each factor loading and its respective thresholds, while fixing the scale factor to one for one indicator at a time in the nested model. I use the scaled chi-square difference test described on the statmodel homepage to compare the nested model with the less restricted baseline model. The problem is that, after freeing the parameters of three indicators, I obtained a negative scaled chi-square difference value. I appreciate your comments. Regards, Davood
BMuthen posted on Friday, January 21, 2005 - 2:11 am
This can happen. There have been writings by Satorra and Bentler on this for continuous outcomes. The difference testing is an asymptotic procedure and can break down like this.
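For readers following along, the scaled difference test mentioned above can be sketched in a few lines. The chi-square values, scaling correction factors, and dfs below are hypothetical numbers chosen to show how the statistic can come out negative:

```python
def scaled_chisq_diff(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled chi-square difference test.

    t0, c0, d0: chi-square value, scaling correction factor, and df
                of the nested (more restricted) model
    t1, c1, d1: the same quantities for the less restricted model
    """
    # Scaling correction for the difference test
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    # Scaled difference statistic, referred to a chi-square with d0 - d1 df
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, cd

# Hypothetical values where the scaled difference comes out negative,
# as described in the post above
trd, cd = scaled_chisq_diff(t0=126.0, c0=1.00, d0=55,
                            t1=125.0, c1=1.05, d1=50)
print(trd)  # -10.5: the asymptotic procedure has broken down here
```

A negative value arises when the numerator and the scaling correction cd have opposite signs, which the asymptotic theory does not rule out in finite samples.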
I am doing a two-group SEM with continuous indicators using TYPE=COMPLEX (with weight and cluster variables). I use MLM as the estimator. The model is a hybrid SEM: 4 observed background variables predict two latent variables; these, in turn, are used to predict depression, which is a second-order latent variable (20 indicators are used to form 4 latent variables, which in turn form the depression variable). In addition, direct paths from two control variables and from the 4 background variables to the depression variable are included.
Here is the summary of the model that Mplus prints: Number of dependent variables: 32; Number of independent variables: 6; Number of continuous latent variables: 7; Degrees of freedom: 1280; Number of free parameters: 268.
To calculate degrees of freedom, I used the formula from Bollen (1989, p. 361): ½*G*(p+q)(p+q+1) - t, where p: number of dependent variables (32); q: number of independent variables (6); G: number of groups (2); t: number of free (estimated) parameters (268).
After plugging the numbers into Bollen's formula, I get a different number from the Mplus output: df = ½*2*38*39 - 268 = 1214. So the calculated df is less than the df in the Mplus printout. Would you please help us resolve this discrepancy?
Please send your complete output to firstname.lastname@example.org and I will look at it when I return to LA. The formula that you are using does not take into account that df's for covariates are treated differently than for outcomes.
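Bollen's formula itself can be checked in a few lines; this reproduces only the formula from the post above, not the different treatment of covariates that Mplus applies:

```python
def bollen_df(p, q, G, t):
    """Degrees of freedom from Bollen (1989, p. 361):
    (1/2) * G * (p + q) * (p + q + 1) - t
    p: dependent variables, q: independent variables,
    G: groups, t: free parameters."""
    return G * (p + q) * (p + q + 1) // 2 - t

# Values from the post above
print(bollen_df(p=32, q=6, G=2, t=268))  # 1214, versus 1280 reported by Mplus
```

The gap between 1214 and 1280 is exactly what the reply attributes to covariates being handled differently than outcomes in the Mplus df count.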
Sadhana posted on Monday, February 21, 2005 - 8:56 am
For hierarchical cluster analysis of continuous data, which distance and linkage method is appropriate, given the variety of methods available in the literature?
bmuthen posted on Saturday, February 26, 2005 - 11:05 pm
For cluster analysis related to Mplus, please see the mixture modeling (categorical latent variables) references on our web site. For instance, the Applied Latent Class Analysis book shown there has a chapter by Vermunt and Magidson that touches on relationships between classic "k-means" clustering and latent class analysis of continuous outcomes. I would say that latent class analysis and related techniques such as factor mixture modeling could be preferable to traditional clustering techniques.
I have some questions regarding using both categorical and continuous indicators in an SEM model with both cluster and weight variables. 1) Can we use both categorical and continuous indicators in the above analysis? 2) How does Mplus treat continuous indicators when some others are categorical? 3) In this case, what type of estimator is recommended? 4) Are there any complications involved when one uses both categorical and continuous indicators? 5) The ultimate goal is to test for factorial invariance between two groups. Is there any other issue that we should be careful about when using both categorical and continuous indicators to test for measurement invariance?
1) Can we use both categorical and continuous indicators in the above analysis?
2) How does Mplus treat continuous indicators when some others are categorical? 3) In this case, what type of estimator is recommended?
Two estimators are available in this situation -- weighted least squares or maximum likelihood. The continuous indicators are treated as continuous variables, if that is what you mean.
4) Are there any complications involved when one uses both categorical and continuous indicators?
5) The ultimate goal is to test for factorial invariance between two groups. Is there any other issue that we should be careful about when using both categorical and continuous indicators to test for measurement invariance?
For categorical indicators, thresholds and factor loadings must be freed together. You cannot free just a threshold or just a factor loading. And when they are freed, factor means need to be fixed at zero in all groups, and scale factors or residual variances need to be fixed to one in all groups, depending on whether the Delta or Theta parameterization is used. See Examples 5.16 and 5.17 in the Mplus User's Guide and Web Note 4.
1- In TYPE=COMPLEX MISSING H1 with categorical observed variables, how are the missing values treated? 2- What is the best estimator for this analysis? 3- In the analysis, I got WRMR > 2, CFI > .95, and RMSEA < .05. Does this indicate any serious misfit? 4- In the case of categorical observed variables, what is the interpretation of factor loadings? Is it the same as for continuous observed variables?
1. With categorical outcomes, WLSMV is the default, for which a pairwise-present approach is taken.
2. WLSMV is good, unless the MCAR flavor of the pairwise approach needs to be replaced by MAR in which case ML should be used if possible from a computational point of view.
3. CFI is rather reliable and sounds like you are in the ballpark.
4. The loadings are probit slopes when WLSMV is used (logit with ML). But if you only check the significance, sign, and size of standardized loadings, then they can be viewed analogously to continuous indicators.
Frank Davis posted on Sunday, November 06, 2005 - 6:09 am
I plan to use either cluster analysis or LCA for my dissertation. I will have a relatively large data set (N=3,000) with approximately 8 continuous variables and 4 categorical variables. Do you foresee any complications? I guess this answer depends on what kind of data I have. Based upon my limited knowledge, cluster analysis doesn't handle categorical variables; however, LCA handles both categorical and continuous variables. Could you provide me a few good texts or references that would point a novice in the right direction? Thanks so much.
Dr. Muthen, I am looking into using Mplus for testing measurement invariance across groups (e.g. gender, age, race etc.). I have a complex survey design and would like to use a split-sample design for my analyses. Based on the user manual I noticed that the “subpopulation” option was not available for multiple group comparison, meaning that it is not possible to use this option to select half of my sample (based on a dummy differentiating between the two randomly selected sub-samples). Is there any way I can solve this problem given (a) the nature of the sample design, (b) wanting to do a cross-validation and (c) wanting to test for group differences in measurement invariance? Thank you!
I can think of 2 approaches. I think you could do the multiple-group analysis using mixtures with the "knownclass" option in combination with the subpopulation option. But note also that the subpopulation option often makes very little difference.
Dr. Muthen, thank you for your quick reply. Just a short follow-up: 1. When you say that "the subpopulation option often makes very little difference," do you mean that it should not be problematic to use the "useobservations" option instead, even if the analysis involves a complex survey design? I am just concerned that this would be viewed as a source of bias in my analysis. 2. Based on your suggestion, I assume that Mplus will not "complain" that I am doing multiple group analysis with the subpopulation command if groups are defined by the "knownclass" option rather than the "grouping" option in a "regular" CFA, correct? 3. Last (naive) question: what would be the equivalent of comparing factor means between groups in CFA when switching to LCA? Thanks again for all your help!
I am analyzing data from a national data set. The full data set contains information from youth between the ages of 12 and 20. I wish to analyze data from youth between the ages of 15-18 only. Thus, I believe that I use the subpopulation (selecting data from those between the ages of 15 and 18) command to make sure the results are representative. However, I wish to analyze my data as a multigroup by age. The manual indicates that the subpopulation command cannot be used in conjunction with the multigroup. I thought I could make a dataset which removes data from individuals younger than 15 and older than 18. I would then NOT use the subpopulation command, but still include the weight, stratification, and cluster commands. 1) Would the weighting be wrong if I used this technique? I am asking because I read somewhere that subsetting concerns regarding the variance computation are only relevant to survey software using the Taylor Series. I believe that Mplus uses robust variance estimation. If what I read is true I am wondering if it is even necessary to use a subpopulation command when analyzing complex data using Mplus (regardless if doing a multi-group)? 2) If the weighting would be incorrect using the strategy described above (see above #1) then how would I conduct a multigroup only using a subsample of the full data?
I would do two single group analyses, one using the SUBPOPULATION option and one not using it to see if there is a big difference in the results. If not, I would just do the multiple group analysis without the SUBPOPULATION option.
I am using clustered data (twins) and I would like to calculate means for 3 groups of twins and determine whether these means are significantly different from each other. My questions are:
1) As a first step (saturated model) can I just run the data and look at the sample statistics for the means? Or do I need to look at the model results for the means? These means are not the same.
2) Is it okay not to specify any model when I am only interested in mean scores, or is it necessary to specify the correlations like this: y1 with y2 y3 with y4 I tried both approaches, and I get different mean scores in the model results.
3) To compare the group means, can I just constrain the means to equality like this: [y1] (1) in the separate group models, and then compare the BICs of these models with the BIC of the saturated model? Can I use BIC for these purposes?
4) When I want to report group means of y1 that are corrected for covariate x (y1 ON x), should I report the means (i.e. intercepts, using centering) in the model results?
I hope my questions are clear. Thank you very much in advance for your time!
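Regarding question 3, a BIC comparison of the constrained and saturated models works as sketched below. The log-likelihoods, parameter counts, and sample size are made-up illustrative numbers, not from any real run:

```python
import math

def bic(log_likelihood, n_free_params, n):
    """BIC = -2*logL + k*ln(n); the model with the lower value is preferred."""
    return -2.0 * log_likelihood + n_free_params * math.log(n)

# Hypothetical log-likelihoods: saturated model vs. a model with
# means constrained equal across the three twin groups
bic_saturated   = bic(log_likelihood=-1520.4, n_free_params=27, n=300)
bic_constrained = bic(log_likelihood=-1524.1, n_free_params=21, n=300)

# The constrained model loses a little log-likelihood but saves
# 6 parameters; here BIC favors the constrained (equal-means) model
print(bic_constrained < bic_saturated)  # True for these numbers
```

The design choice BIC encodes is a penalty of ln(n) per extra parameter, so equality constraints are favored unless they cost substantial log-likelihood.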
I am using complex survey data to examine a fairly straightforward path model. All variables are dichotomous.
What I'd like to do is an "overall" test of men vs. women to (hopefully) find that the path models are significantly different; that is, that at least some of the path coefficients are significantly different between the two groups. I envision first doing an "overall" test of the men's model vs. the women's model and then, having established that they are different, comparing individual path coefficients to find which are different.
My question is this: much of what I've been reading says that you must have measurement invariance. But, as I understand it, this would only apply to a model with a latent variable, right?
Would you be able to point me in the right direction for some assistance as to how to do these 2 comparisons? I'm stuck as to what commands to use....
Further, if we did (in the future) add a latent variable to the model, how would I need to change the procedure? I anticipate that, at this point, we would test for measurement invariance across gender first....
You can do an analysis with all paths held equal across the groups using the parameter equality constraint feature described in the UG. Then you can see which paths are not invariant by looking at the modification indices. Yes, measurement invariance can typically be studied only with multiple indicators, in which case the measurement part should be tested for invariance first.
I do chi-square difference testing while fixing the means of one variable to equality across the two groups each time. I then get an MLR chi-square with 1 df, which I divide by the scaling correction factor to get the right chi-square. If this chi-square is significantly different from zero, I conclude that the means were different. FYI, the variables are not normally distributed. My first question is: is this approach correct?
When I use square-root-transformed data in order to obtain normality, I get somewhat different results. How is that possible? I thought normality was not needed when using MLR. Which results should I trust?
Yes, you can test mean differences in the way you describe.
When you transform a variable by taking the square root of the variable, you change its relationships with other variables. This is why you get different results. For example, if linearity was approximately correct for your regression, it will not be after transformation. It is not the same as dividing a variable by a constant where all relationships remain the same.
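The point about transformations can be demonstrated numerically. The toy data below are hypothetical: y is an exact linear function of x, so dividing y by a constant leaves the correlation untouched, while taking the square root bends the relationship and attenuates it:

```python
def pearson(xs, ys):
    """Plain Pearson correlation, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Toy data with an exactly linear relationship: y = 2 + 3x
xs = [i / 10 for i in range(1, 101)]
ys = [2 + 3 * x for x in xs]

r_raw    = pearson(xs, ys)                      # ~1: perfectly linear
r_scaled = pearson(xs, [y / 5 for y in ys])     # unchanged: dividing by a
                                                # constant preserves relations
r_sqrt   = pearson(xs, [y ** 0.5 for y in ys])  # strictly smaller: the square
                                                # root makes the relation concave
print(r_raw, r_scaled, r_sqrt)
```

This is exactly the distinction in the reply above: a linear rescaling is harmless, a nonlinear transformation changes the relationships being modeled.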
I am sorry not to put my concerns into one message, but I have a follow-up question related to my first post yesterday at 6.22am: since I perform multiple comparisons (24 in total), do I have to control for Type I error? If yes, how can I do that? Thanks, Sylvana
Dr Muthen, I have a multiple group GMM (cg=2, c=3). Can I use LL difference testing to test if my 3 intercepts are equal between the 2 cg's? (test with 3 df) I suppose in that case I need to use the scaling correction factor?
Or do I need to compare the nested models with BIC?
I am trying to run a two-level multiple group path model (similar to Example 9.11 in the Mplus User's Guide, but a path model rather than a CFA). The cluster is countries, and the grouping variable is religious affiliation. I'd like to test whether the hypothesized direct and indirect effects apply to different religious groups when taking into account the variation at the country level. I keep getting the error message: Cluster ID cannot appear in more than one group. Problem with cluster ID: 2504. What do you think could be the problem here?
Multiple group analysis assumes each group contains independent observations. In multilevel modeling, this means the grouping variable should be a between-level variable. If a within-level variable is used as a grouping variable, observations from the same cluster may be in both groups, violating this assumption.
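Before fitting such a model, the offending clusters can be located with a short check. The function and records below are illustrative, not Mplus output; the group labels are hypothetical:

```python
from collections import defaultdict

def clusters_in_multiple_groups(cluster_ids, group_ids):
    """Return cluster IDs that appear in more than one group --
    exactly the condition behind the Mplus error above."""
    groups_per_cluster = defaultdict(set)
    for cid, gid in zip(cluster_ids, group_ids):
        groups_per_cluster[cid].add(gid)
    return sorted(c for c, g in groups_per_cluster.items() if len(g) > 1)

# Hypothetical records: cluster (country) 2504 contains respondents
# from two different religious groups, so it would trigger the error
clusters = [2504, 2504, 2504, 3101, 3101]
religion = ["a",  "b",  "a",  "a",  "a"]
print(clusters_in_multiple_groups(clusters, religion))  # [2504]
```

Any cluster ID returned by this check is one whose members span groups, i.e. the grouping variable varies within the cluster.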
I'm testing a relatively simple model for 2 groups: males and females. So I use gender as a grouping variable. I use TYPE=COMPLEX to specify that they're clustered in schools. The dependent variable is continuous.
The hypothesis is that there's no difference in coefficients between males and females. So I compared a model with no constraints and a model with only a different intercept for each group, everything else constrained. 1. Am I correct to think that the chi-square test is a formal test for all constraints at once? 2. Can I test every constraint separately? I would like to know, for every coefficient, whether it's significantly different between boys and girls. I could do this by comparing the full (no constraints) model with all possible models with one constraint. But could I also do this with a single command in Mplus?
1. Yes. 2. You can do one at a time. There is no automatic way to do this. You can ask for modification indices when you test all at the same time to see which coefficients have the largest modification indices.
sam posted on Wednesday, September 07, 2011 - 9:32 pm
I have two models from two different scenarios (i.e., one scenario induces good intention and another scenario induces negative intention). Each model has five variables. The variables are the same, except for one variable. That is, the type of consequences is different for each of the two models. The paths of these two models are exactly the same. My questions are:
(1) Can I compare these models? I am interested in testing the effect of intention on the responses.
(2) Do I need to transform one of the variables in one model that has problems with skewness and kurtosis? If I transform this variable, will I still be able to compare the models?
1. No, not statistically unless you have exactly the same set of variables, but perhaps substantively.
2. No, use MLR which is robust to non-normality.
sam posted on Thursday, September 08, 2011 - 6:58 pm
Thank you very much for the quick response. For the second question, all of the variables have skewness between 0.1 and 1.5 and kurtosis between 0.5 and 1.5, except for one latent variable whose skewness is above 1.5 and kurtosis is about 2. When I ran the model, the fit indices were inconclusive. SRMR is 0.058 and RMSEA is 0.051, which suggest that the model has a good fit. However, CFI is only 0.923, below the conventional cutoff value of 0.95. Can I conclude that the mixed results are because one variable is too skewed and too high in kurtosis? Is it appropriate to transform that one variable? Thank you very much.
Dear Dr. Muthén, I am running a multiple group analysis (grouping is male vs. female) and was using the TYPE=COMPLEX command to take into account that we have MZ and DZ twin pairs in our data. I got an error message (see below), and I have two thoughts about it. We also have single twins (only one member of a pair) in our dataset, so there are some clusters with n=1 observations. Is that a problem? And second: is there a problem for the correction algorithm when there are a lot of clusters (177) with only a few (in our case 2) observations?
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.422D-17. PROBLEM INVOLVING PARAMETER 107.
THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.
Having one or two members of a cluster is not a problem.
The message says you have more parameters than independent pieces of information in your data. That is, you have more parameters than the number of clusters minus the number of strata with more than one cluster. The impact of this on your results is not known. You would need to do a Monte Carlo study for your situation to see this.
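As a rough pre-check, the condition stated in the warning can be sketched as follows. The cluster, strata, and parameter counts below are hypothetical, chosen to mirror the twin-data situation in the question above:

```python
def enough_clusters(n_params, n_clusters, n_strata_with_multiple_clusters):
    """Check the condition cited in the Mplus warning: the number of free
    parameters should not exceed the number of clusters minus the number
    of strata containing more than one cluster."""
    independent_pieces = n_clusters - n_strata_with_multiple_clusters
    return n_params <= independent_pieces

# Hypothetical design: 177 twin-pair clusters, no stratification,
# and a multiple-group model with 220 free parameters
print(enough_clusters(n_params=220, n_clusters=177,
                      n_strata_with_multiple_clusters=0))  # False: expect the warning
```

When this check fails, the warning above is expected; as the reply notes, its practical impact on the estimates would need to be assessed with a Monte Carlo study.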
I'm running a measurement invariance analysis with categorical data obtained from players clustered within teams. I want to test invariance across two age groups (younger/older than 12), and I'm not sure how to handle the fact that some teams have only younger players, other teams have only older players, and other teams have both. Could you suggest a proper way to deal with these data? Or maybe recommend something similar to your Web Note 16 addressing suitable models for measurement invariance in this case?
Thanks for your answer. When I run the models described in Web Note 16, I'm not able to see how to conclude whether item factor loadings/thresholds are invariant between age groups. Could you elaborate a bit more on this point or suggest a more detailed reading, please?
I have tested measurement invariance between groups (5 ability groups). The model is a second-order CFA. Because students are nested in classes, I controlled for the cluster information (class) with TYPE=COMPLEX. I got the following error message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.303D-15. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 49, Group 1: K WITH M
THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.
The problem is that I couldn't discover any problem with parameter 49. I did the same analysis with TYPE=GENERAL and no error message appeared. Can I trust the results with TYPE=COMPLEX?
With clustered data, independence of observations is at the cluster level. You must have 49 clusters. The message tells you that your model has more parameters than it has clusters. The effect of this on model results has not been well-studied. This is a warning.