When I subset my data (say include only Caucasian subjects), I get clusters with only one observation in them. How does this affect the analysis and should I omit clusters with fewer than a certain number of cases? 

bmuthen posted on Wednesday, February 20, 2002 - 10:01 am



You should keep all clusters, even those with only 1 member. Clusters with one member contribute to estimation of between-level parameters. They don't contribute to within-level parameters, resulting in less within-level power.
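
The point about one-member clusters can be sketched numerically. The Python snippet below is only an illustration with made-up data and a hypothetical helper name (it is not Mplus code): a cluster with a single member adds nothing to the pooled within-cluster sum of squares or its degrees of freedom, yet it still supplies a cluster mean for the between level.

```python
from collections import defaultdict


def within_between_summary(values, clusters):
    """Return pooled within-cluster SS, its degrees of freedom, and the cluster means."""
    groups = defaultdict(list)
    for y, c in zip(values, clusters):
        groups[c].append(y)

    within_ss = 0.0
    within_df = 0
    means = {}
    for c, ys in groups.items():
        m = sum(ys) / len(ys)
        means[c] = m
        # A cluster of size 1 has zero deviation from its own mean and zero df.
        within_ss += sum((y - m) ** 2 for y in ys)
        within_df += len(ys) - 1
    return within_ss, within_df, means


# Made-up data: cluster "a" has three members, "b" and "c" have one each.
y = [4.0, 6.0, 5.0, 9.0, 2.0]
cluster = ["a", "a", "a", "b", "c"]
within_ss, within_df, means = within_between_summary(y, cluster)
print(within_ss, within_df, means)
```

Only cluster "a" contributes to the within part here, while all three clusters contribute a mean to the between part.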

Anonymous posted on Thursday, June 12, 2003 - 12:16 pm



I’m a beginner with Mplus and have questions about the cluster option and the necessary number of clusters for a two-level analysis. (1) Am I right in assuming that the cluster option, compared to an “ordinary” analysis (SEM, REGRESSION, …), simply corrects the SEs for the non-independence of observations? (2) Is there a lower limit for the number of clusters when doing a two-level analysis? I think I read something about that in a paper by Hox, but I can’t remember where. Thanks


1. TYPE=COMPLEX adjusts standard errors and chi-square for non-independence. 2. I think a lower limit would be 30-50. This is the sample size for the between part of the model.
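
To make the idea of adjusting standard errors for non-independence concrete, here is a minimal Python sketch for the simplest possible case, the standard error of a mean. It assumes a basic cluster-robust (sandwich-type) formula without any finite-sample correction; the function names and data are hypothetical, and this is not a description of the exact computations Mplus performs.

```python
import math
from collections import defaultdict


def naive_se_of_mean(y):
    """Standard error of the mean that treats all observations as independent."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    return math.sqrt(var / n)


def cluster_robust_se_of_mean(y, cluster):
    """Sandwich-type standard error: residuals are summed within clusters first."""
    n = len(y)
    mean = sum(y) / n
    totals = defaultdict(float)
    for v, c in zip(y, cluster):
        totals[c] += v - mean
    return math.sqrt(sum(t ** 2 for t in totals.values())) / n


# Made-up data: three clusters whose members resemble each other strongly.
y = [1.0, 1.2, 0.8, 3.0, 3.1, 2.9, 5.0, 5.2, 4.8]
cluster = [1, 1, 1, 2, 2, 2, 3, 3, 3]
naive = naive_se_of_mean(y)
robust = cluster_robust_se_of_mean(y, cluster)
print(round(naive, 4), round(robust, 4))
```

With strongly clustered data like these, the cluster-robust value is noticeably larger than the naive one, because the effective number of independent pieces of information is closer to the number of clusters than to the number of observations.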

Anonymous posted on Thursday, January 29, 2004 - 4:30 pm



I have a question regarding Muthen's 2/20/2002 response to Allison Tracy (above). I'm using Mplus to construct a multilevel SEM with an "intercepts as outcome" parameterization. I find that a notable proportion (40%) of my Level-2 units have sample sizes of j=1. Should I infer from Muthen's response of 2/20/2002 that the corresponding cases (roughly 15% of the total sample) are effectively "ignored" by Mplus in estimating the Level-1 parameters (i.e., the non-randomly varying slopes; I assume these cases are also not included when the CENTERING option is applied)? I'm puzzled because I haven't read of any other HLM package that handles data this way. Would you provide a reference so that I could better understand the nature / implications / logic of the "loss of Level-1 sample size" incurred in Mplus in these situations? Thanks very much (in advance).

bmuthen posted on Thursday, January 29, 2004 - 5:29 pm



Mplus handles level-2 units of size 1 the same way as all other multilevel programs. No cases are excluded from the analysis. What I tried to convey was that such units carry no information on level-1 variation since such units have no level-1 sample variation. Such units do however contribute to fixed effects estimation.

David DeWit posted on Friday, March 12, 2004 - 11:24 am



I have a three-wave longitudinal data set with roughly 1,400 individuals spread across 22 schools (widely varied cluster sizes). I attempted to estimate a single-process linear growth model for self-esteem, adjusting for clustering of students within schools. The model estimation terminated normally but I'm getting a message that reads, "standard errors and chi-square may not be trustworthy due to cluster structure. Change your estimator". In another model with frequency of illicit drug use as the outcome, I get a message that reads, "sample weight matrix for the robust estimator could not be computed because each cluster has a different size". Please advise on what these messages mean and steps to correct the problems. Thank you.

bmuthen posted on Friday, March 12, 2004 - 3:25 pm



I think you are doing a Type = Complex analysis. The problem occurs in the unusual situation where a given cluster size is represented by only one cluster. In the soon to be released Version 3, a more flexible estimation approach is used that does not run into this issue. 
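
The situation described in this reply (a cluster size that occurs for only one cluster) can be checked before running the analysis. The Python sketch below uses a hypothetical helper and made-up school IDs; it simply tabulates how many clusters share each cluster size.

```python
from collections import Counter


def sizes_with_single_cluster(cluster_ids):
    """Return the cluster sizes that are represented by exactly one cluster."""
    size_of_cluster = Counter(cluster_ids)                  # cluster -> number of observations
    clusters_per_size = Counter(size_of_cluster.values())   # size -> number of clusters of that size
    return sorted(size for size, k in clusters_per_size.items() if k == 1)


# Made-up data: schools s1 and s2 both have 3 students, s3 has 5, s4 has 7.
ids = ["s1"] * 3 + ["s2"] * 3 + ["s3"] * 5 + ["s4"] * 7
unique_sizes = sizes_with_single_cluster(ids)
print(unique_sizes)
```

With widely varied cluster sizes, as in the 22-school example above, many sizes are likely to appear only once.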

Maggie posted on Monday, September 13, 2004 - 2:27 am



I did a two-level SEM, and I got reasonable factor loading and beta estimates at both levels, and the overall CFI=0.989. But in the output, there always appears an error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.267D-17. PROBLEM INVOLVING PARAMETER 35. I think that it's a problem with the clustering, since I only have 34 clusters while the number of parameters estimated is > 34. I tried to reduce the number of parameters but it seems I cannot get them below 34, so my questions are: 1. Are the model results trustworthy (factor loadings, EST./S.E.)? 2. I use the default estimator MLR; is it correct for unbalanced, non-normal data? Thank you very much for the insights


Yes and yes. 

Anonymous posted on Friday, December 17, 2004 - 7:13 am



I'm conducting two-level modelling (version 3.00) to examine between- and within-individual variation in children's social goal scores (assessed in four different situations). My question is this: I use the participant's ID number as the cluster variable (i.e., I have formed a variable which is equivalent to the participants' N in the data set = 310). However, the "number of clusters" reported in the output is always 309 instead of 310. I have rechecked the data set many times, so I know that it contains 310 participants (x four situations). Is the formula for calculating the number of clusters N-1, or am I missing something? Also, the data set contains some missing values (treated as advised in the manual), but as I understand it, this should not be related to the number of clusters? Thank you so much in advance!


I would have to see the data and output at support@statmodel.com to answer this. 


I have a problem similar to Maggie's Sept. 13, 2004, issue. I am doing a two-level CFA to examine teacher ratings of social behavior using type=complex. I have more parameters than clusters (43 clusters (teachers); n = 210). My model fit reasonably after including some conceptually acceptable cross-loadings. I got the same error message as Maggie (NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX...MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS). The model made sense. Fits were adequate. One Std loading value was slightly > 1. There were no negative residuals. Based on Linda's response to Maggie's posting, I think I can trust these results. I now want to test measurement invariance for boys v. girls using a multigroup approach, testing for invariance of loadings, then intercepts. This increases the number of parameters to be estimated. I continue to get error messages like Maggie's, with occasional Std and StdYX loadings > 1 but no negative residuals. 1. Can I trust the chi-square and loading values in these models? 2. Are there any problems comparing nested models to look at measurement invariance in this circumstance? 3. Other than negative residuals or a message that standard errors cannot be estimated, what might indicate that I should not ignore the error message? Thanks for any comments.


You are never in a desirable or defensible situation when you have more parameters than clusters. The only way to know the impact on your results would be to do a simulation study.

Sharon posted on Friday, January 05, 2007 - 4:29 pm



Hi, Linda - I am trying to use the Monte Carlo option to follow your suggestion. I am using ex. 11.7 steps 1 and 2 in the Users' Guide. I have three questions: 1. In the model in ex. 11.7, step 1, you set start values. Is this necessary? If so, why? Where did you get the actual numbers in the start values? 2. How does this sort of Monte Carlo study differ from bootstrapping? 3. Would simply outputting the within matrix and conducting CFAs with this be a viable alternative way to manage the "too few clusters" problem? I don't care about the between structure; it is just a statistical nuisance. Thanks, Sharon


1. You do not need starting values in the MODEL command. 2. In bootstrapping, random samples are drawn from the sample. In Monte Carlo, random samples are drawn from a population. 3. Yes. 
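
The distinction in point 2 can be sketched in a few lines of Python. The helper names, the observed values, and the normal population model are all assumptions made for illustration; the only purpose is to show where the random draws come from in each approach.

```python
import random


def bootstrap_sample(sample, rng):
    """Bootstrap: resample with replacement from the observed sample itself."""
    return [rng.choice(sample) for _ in sample]


def monte_carlo_sample(n, mean, sd, rng):
    """Monte Carlo: draw new observations from an assumed population model."""
    return [rng.gauss(mean, sd) for _ in range(n)]


rng = random.Random(2007)
observed = [2.1, 3.4, 1.9, 4.2, 2.8]  # made-up sample

boot = bootstrap_sample(observed, rng)
mc = monte_carlo_sample(len(observed), mean=3.0, sd=1.0, rng=rng)
print(boot)
print(mc)
```

Every bootstrap value is one of the observed values, whereas the Monte Carlo values are new draws whose distribution is determined entirely by the population parameters that the analyst specifies.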


Hello, Could you recommend a reference that explains the use of the sandwich estimator with clustered sampling designs in Mplus? Thanks, Sarah Dauber 


See the following reference which is available on the website: Asparouhov, T. (2005). Sampling weights in latent variable modeling. Structural Equation Modeling, 12, 411-434. See also the Skinner reference listed in that paper.


I have a follow-up question for Bengt Muthen regarding the following: “When I subset my data (say include only Caucasian subjects), I get clusters with only one observation in them. How does this affect the analysis and should I omit clusters with fewer than a certain number of cases?” “You should keep all clusters, even those with only 1 member. Clusters with one member contribute to estimation of between-level parameters. They don't contribute to within-level parameters, resulting in less within-level power.” Do you have any references for your argument about allowing the use of clusters with only one observation?


I don't know of any such reference offhand. You might see what Joop Hox has to say. 

Alex posted on Wednesday, June 06, 2007 - 5:21 am



Hello, I am trying to take into account the non-independence of observations in an otherwise "standard" SEM (i.e. supervisors evaluating multiple employees). So I use "type=complex" with a "cluster = x" variable. I have three questions. (1) Can I use multiple clustering variables in the same analysis (say two: supervisors and organization)? If so, is there a specific way to indicate it? (2) Is there a lower limit to the number of clusters I can use (e.g. 3 organizations)? (3) Is there a way to indicate that the non-independence of observations only affects a subset of my variables (those evaluated by the supervisors)? Thank you very much


1. See the discussion of complex survey data features on pages 400-403 of the user's guide. 2. I believe it is recommended to have no fewer than 30-50 clusters. 3. No.


Can you recommend an article/source that specifies how to estimate the number of parameters for a multilevel SEM during the design/diagramming phase? I am trying to determine for sure whether I will have a problem with model fit due to a small number of clusters relative to parameters, and I want to make sure that I am counting my between + within + cross-level parameters accurately. Thank you!


I'm not clear on your question. Are you asking how to determine the number of parameters in a model? 

Student 09 posted on Thursday, April 08, 2010 - 6:23 am



Hi, I just noted that Mplus 6 will include MCMC estimation procedures. Will that allow for estimating cross-classified multilevel models in Mplus?


Cross-classified multilevel models will not be part of Version 6.


Sorry for the delay. Yes, I am trying to determine the number of parameters in a multilevel SEM model the same way the Mplus program does, to determine whether there are more parameters than clusters. I know how to determine the number of parameters for a path model, but I am not sure how to diagram a multilevel SEM model properly to get the correct result. I apologize for any confusion. Any references would be helpful. Thanks.


See the examples in Chapter 9, their path diagrams, and their outputs. 


I am running NB regression testing a continuous variable moderated by age group [Contrast coded C1: adult (2/3), young adult (1/3), adolescent (1/3) and c2: adult (1/2), young adult (1/2), and adolescent (1/2)]. These 3 age groups are named in a "group" variable. The adult and adolescent samples are non-independent. When I run the analysis with sandwich estimation (type = complex, cluster = group) I get much larger parameter estimates than without the sandwich estimation. Is this because sandwich estimators are unstable with NB regression? Or have I "double" accounted for cluster? Thanks so much for your help.


Sorry, Drs. Muthen, I realize I had the wrong grouping variable. When I use "id" as my cluster variable the results look more similar to my non-clustered analysis. Although again I find that many of my effects are stronger with sandwich estimation. I had thought that sandwich estimators decreased type 1 error and that standard errors would generally increase. Is this incorrect? Thanks.


It is true that theoretically the standard errors should increase. This does not always happen in practice because model fit may not be perfect.


Dear Linda, I got this error message from the analysis using type = complex (see below). I have checked the potentially problematic parameter but it seems fine. I suspect that this could be due to the fact that I have more clusters (82) than parameters estimated (66). The model fit is great and the established relationships are consistent with the theories. I would like to know if I can trust the result. Also, could you please suggest how to deal with the issue. Thanks. Pat Error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.136D-15. PROBLEM INVOLVING PARAMETER 62. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.


Dear Linda, Referring to the message that I posted earlier, I don't think I have a problem with the number of parameters exceeding the number of clusters. I have 82 clusters and 66 parameters estimated. Is that correct? My questions are: given the error message that I posted, can I trust the results, and how can I remove this error in the analysis? Thanks. pat


The message refers to more parameters than THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER, not just the number of clusters. This is the number of independent observations in your data. It is not known how this affects standard errors. You would need to do a simulation study based on your data to see.
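
The quantity named in the message can be computed directly from the design. The Python sketch below assumes a hypothetical mapping from cluster to stratum and a hypothetical parameter count; it only reproduces the arithmetic "number of clusters minus number of strata with more than one cluster" and compares it with the number of parameters.

```python
def independent_units(cluster_to_stratum):
    """Number of clusters minus the number of strata holding more than one cluster."""
    clusters_per_stratum = {}
    for cluster, stratum in cluster_to_stratum.items():
        clusters_per_stratum[stratum] = clusters_per_stratum.get(stratum, 0) + 1
    n_clusters = len(cluster_to_stratum)
    n_multi_cluster_strata = sum(1 for k in clusters_per_stratum.values() if k > 1)
    return n_clusters - n_multi_cluster_strata


# Hypothetical design: 6 clusters in 3 strata; stratum "C" holds a single cluster.
design = {"c1": "A", "c2": "A", "c3": "A", "c4": "B", "c5": "B", "c6": "C"}
units = independent_units(design)

n_parameters = 5  # hypothetical number of free parameters
more_parameters_than_units = n_parameters > units
print(units, more_parameters_than_units)
```

In this made-up design the model has fewer parameters than clusters (5 versus 6) but more parameters than independent units (5 versus 4), which is the kind of situation the message describes.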


Thanks Linda. Please help me understand this more clearly. Does THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER equal the number of observations (i.e. number of respondents)? I used the cluster command to control for the clustering in the data in the analysis. The other point is that there are 2 cases regarding the error message. I got the two-paragraph message when I used the subpopulation command. However, I got only the first paragraph of the error message when I used the whole population. In that case, I have checked the parameter in question and it is fine. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.136D-15. PROBLEM INVOLVING PARAMETER 62. For the latter case, I would like to know if I can trust the results. Many thanks. pat


When you have clustered data, the individual observations are not independent. This is what TYPE=COMPLEX takes into account. Independence of observations is at the cluster level and, with both clustering and stratification, the number of independent observations is the number of clusters minus the number of strata with more than one cluster. Regarding the message, please send the output and your license number to support@statmodel.com.


Thanks so much for your clarification. I am in the process of refining the model. I will send the output and license number once I get the final model results. pat

MT posted on Monday, February 20, 2012 - 6:19 am



Dear Muthens, In my data, I have 53 teams; however, when I read the output of Mplus, it says that the number of clusters is 15. How could this be? The input is:

CLUSTER IS Team;
USEVARIABLES ARE Struc_T Bevl_T StrucJR Bevl;
ANALYSIS: TYPE = TWOLEVEL;
ESTIMATOR = ML;
MODEL:
%WITHIN%
Bevl ON StrucJR;
%BETWEEN%
Bevl_T ON Struc_T;

Thanks so much for your help! Maria


It would seem you are reading your data incorrectly. Please send your input, data, output and license number to support@statmodel.com.

MT posted on Tuesday, February 21, 2012 - 12:03 am



Hi Linda, Now that I am preparing the data to send to you and have run it one more time, the correct number of teams appears in the output!! I guess the cluster variable should be at the beginning of the data and not somewhere at the end for this to work? Your offer of help is greatly appreciated!


The cluster variable should be in the same place on the NAMES list as it is in the data file. 
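
Why the position on the NAMES list matters can be mimicked outside Mplus. The Python sketch below uses a hypothetical reader and made-up data; it assigns names to columns purely by position, so listing the cluster variable in the wrong place makes a different column act as the cluster ID and changes the number of clusters found.

```python
def read_rows(lines, names):
    """Assign variable names to whitespace-separated columns purely by position."""
    return [dict(zip(names, line.split())) for line in lines]


# Made-up data file whose columns really are: team, y, x.
data = ["1 3.2 10", "1 2.9 10", "2 4.1 20", "2 3.8 20", "3 5.0 10"]

correct = read_rows(data, ["team", "y", "x"])
wrong = read_rows(data, ["y", "x", "team"])  # cluster variable listed last by mistake

n_clusters_correct = len({row["team"] for row in correct})
n_clusters_wrong = len({row["team"] for row in wrong})
print(n_clusters_correct, n_clusters_wrong)
```

The data are identical in both reads; only the name-to-column mapping differs, which is enough to produce a wrong cluster count.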


I am having some problems with including a cluster correction in my SEM models. The code runs fine with no errors and the model converges beautifully. But the standardized results do not have standard errors (the unstandardized ones do have SEs). The same code but with type=general instead of type=complex gives me standardized results with SEs. Can you help? 


Please send the output and your license number to support@statmodel.com. 


I am running a model in which I have students nested in five schools. To account for any school-level sources of variation in the outcomes of interest I am planning on using four dummy indicators for the schools (leaving one school as the referent). Is there anything else that I should be doing to account for the clustering? Is there a standard error adjustment that I am missing?


This is all you need to do. 
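
The dummy-indicator approach from the question can be sketched as follows in Python. The helper name and the school codes are hypothetical; the sketch only shows the coding of five schools into four 0/1 indicators with one school left out as the referent.

```python
def school_dummies(school_ids, reference):
    """Build 0/1 indicators for every school except the reference school."""
    levels = sorted(set(school_ids))
    kept = [s for s in levels if s != reference]
    names = [f"school_{s}" for s in kept]
    rows = [[1 if sid == s else 0 for s in kept] for sid in school_ids]
    return names, rows


# Made-up school codes for six students; school 5 is the referent.
names, rows = school_dummies([1, 2, 3, 4, 5, 2], reference=5)
print(names)
for row in rows:
    print(row)
```

Students in the reference school have zeros on all four indicators, so the coefficients on the dummies are differences from that school.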


Thanks for the response, but I realize that I have a couple of follow-up questions that will hopefully help me understand how Mplus treats these dummy variables: 1. How does using dummies for school differ from simply using a single categorical indicator, in terms of the coefficients and SEs for my predictors of interest? 2. Is there any difference in how Mplus handles cluster dummies vs. something like race dummies? Is there something that I would need to do to let Mplus know that the school dummies are "different" from the race dummies?


1. By a single categorical indicator I assume you mean declaring school as categorical with 5 categories (so an ordinal variable) or as nominal with 5 categories. You don't want to do that because you are talking about schools as covariates, not DVs. 2. No. 

X. Portilla posted on Wednesday, June 12, 2013 - 8:57 am



I am running a path analysis across a kindergarten school year (fall and spring) and have children clustered within 29 classrooms. I want to account for the shared variance between classrooms. My understanding is that I need 30-50 clusters to use TYPE=COMPLEX in my model. Therefore I have two sets of questions: 1) Do you think I can account for clustering with 29 classrooms using TYPE=COMPLEX? If so, should my clustering variable "class" be coded as 1-29? Is there anything else I need to designate in the model to account for clustering? 2) Alternatively, I think I can use dummy variables as covariates to represent each classroom (coded 0/1), leaving one group out as the reference group. If so, are these covariates only applied at time 1 (fall K) or at both time 1 & 2 (fall & spring)? Would I still use TYPE=COMPLEX and designate the clustering variable in addition to adding dummy covariates? Is there anything else I need to designate in the model to account for clustering? Thank you so much in advance!


1. Yes, I think Type=Complex will work ok for 29 clusters. You don't have to recode the cluster values as long as they are distinct. 2. Don't use dummies. 


Thank you, Bengt. I proceeded with using Type=Complex on the 29 clusters which are uniquely identified by my clustering variable. In comparing the clustered output to the unclustered output, the results are very similar, as are the goodness of fit indices (CFI= .974). However, the output has an error which I'm not sure how to interpret: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.142D-15. PROBLEM INVOLVING PARAMETER 29. THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER. I checked parameter 29 and did not identify anything strange with it. Can you advise me on how to proceed or whether I can trust the results? Thanks so much.


The results are most likely ok. This is just a warning that you have fewer clusters than parameters. Our simulations suggest that this is often ignorable. 


Thank you for your input. 


Dear Linda or Bengt, I also have a number of clusters equal to the number of parameters, and I would like to estimate the model using type=complex. Could you please post a reference in which something such as "simulation studies show that having fewer clusters than parameters can often be ignored" is published? It would be very helpful for justifying the use of such models, for me and for other researchers who use this kind of analysis.


I don't know of any reference related to this. You can do a simulation study based on the attributes of your data to see the effect on the results. 


Thank you. I hope some researchers in statistical methods will investigate this open problem and publish their results soon. Linda, do you have a general description of such a simulation study? 


See Example 12.6. In the first step, clustered data are generated. In the second step the data are analyzed using TYPE=COMPLEX. 
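
The two-step logic described here (generate clustered data, then analyze each replication) can be sketched outside Mplus. The Python code below is not Example 12.6; it is a toy simulation under assumed values (20 clusters of size 10, a random cluster intercept, and the mean as the only parameter) that compares a naive standard error and a simple cluster-robust one with the empirical variability of the estimate across replications.

```python
import math
import random
import statistics


def simulate_once(rng, n_clusters, cluster_size, between_sd, within_sd):
    """Step 1: generate one clustered data set with a random cluster intercept."""
    y, cl = [], []
    for c in range(n_clusters):
        u = rng.gauss(0.0, between_sd)
        for _ in range(cluster_size):
            y.append(u + rng.gauss(0.0, within_sd))
            cl.append(c)
    return y, cl


def analyze(y, cl):
    """Step 2: estimate the mean with a naive SE and a cluster-robust SE."""
    n = len(y)
    mean = sum(y) / n
    naive = math.sqrt(sum((v - mean) ** 2 for v in y) / (n - 1) / n)
    totals = {}
    for v, c in zip(y, cl):
        totals[c] = totals.get(c, 0.0) + (v - mean)
    robust = math.sqrt(sum(t * t for t in totals.values())) / n
    return mean, naive, robust


def run_study(n_reps=500, n_clusters=20, cluster_size=10, seed=12):
    rng = random.Random(seed)
    results = [
        analyze(*simulate_once(rng, n_clusters, cluster_size, 0.5, 1.0))
        for _ in range(n_reps)
    ]
    means, naive, robust = zip(*results)
    return {
        "empirical_sd": statistics.stdev(means),      # actual variability of the estimate
        "avg_naive_se": statistics.mean(naive),
        "avg_robust_se": statistics.mean(robust),
    }


summary = run_study()
print(summary)
```

A standard error is doing its job when its average across replications is close to the empirical standard deviation of the estimates. To study a specific concern such as more parameters than clusters, the generating values would need to mirror the attributes of the real data, as suggested in the replies above.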

Lee Allison posted on Tuesday, April 08, 2014 - 6:13 pm



I am also new to Mplus and the discussions. I have 21 clusters in my data. The average cluster size is 6.8. Some clusters have only one member. I ran the ICC for each of the constructs in my CFA, and the ICC values ranged from 4.7% to 13.2%. When these values are used to calculate the design effects, all design effects are less than 2. I read your post that indicated design effects less than 2 can be ignored, citing tongue-in-cheek conversations with your husband. =D Then with Mplus 6.12 I ran an SEM model using Type = Complex Random, with the variable command option of cluster and algorithm=integration, which is the Mplus option for maximum likelihood estimation with robust standard errors. As I understand it, this is recommended for clustered complex survey data (Muthén and Satorra 1995; Muthen 1995). My concern is that I do not understand the interpretation. Did I improve my model in any way by running type=complex, since the ICC values were small enough to result in design effects less than 2 anyway? Is type=complex still an appropriate analytical approach? Or would my ICCs need to present a greater problem before type=complex is beneficial to the analysis? I have sought many sources for an explanation or advice on this matter. I am left without counsel, so your kind help is greatly appreciated. Best regards.


Twenty-one clusters is the bare minimum for using TYPE=COMPLEX or TYPE=TWOLEVEL. Many recommend using at least 30-50 clusters. A practical way to see if you need to take clustering into account is to run the analysis with and without TYPE=COMPLEX and see how different the standard errors are.
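
The design-effect calculation mentioned in the question can be reproduced with the numbers given there. The Python sketch below assumes the commonly used approximation, design effect = 1 + (average cluster size - 1) x ICC; the function name is hypothetical, and the inputs are the average cluster size of 6.8 and the ICC range of 4.7% to 13.2% from the post.

```python
def design_effect(avg_cluster_size, icc):
    """Approximate design effect: 1 + (average cluster size - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


low = design_effect(6.8, 0.047)   # smallest reported ICC
high = design_effect(6.8, 0.132)  # largest reported ICC
print(round(low, 4), round(high, 4))
```

Both values fall below 2, which agrees with the poster's statement that all design effects are less than 2.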

LAlli posted on Wednesday, April 09, 2014 - 11:27 am



Thank you for being so kind and awesome. I will try this. Best, Lee 

Andrea posted on Thursday, August 28, 2014 - 9:23 pm



Hello! Regarding Bengt's post on Friday, June 14, 2013 - 11:45 am (above: "The results are most likely ok. This is just a warning that you have fewer clusters than parameters. Our simulations suggest that this is often ignorable."): was this a published simulation? Do you have any additional support for this issue?


No, this was not published. No, I have no additional support. 

Shiny7 posted on Monday, December 08, 2014 - 11:33 am



Dear Drs. Muthen, I'd like to run a multilevel analysis using the MLR estimator; I have only 21 clusters (average group size 100). Mplus gives me the well-known message: THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS. Today a friend told me that the warning only refers to the level-2 standard errors, not the level-1 estimates, because I have enough cases at level 1 (N=2000). Is that correct? I thought it refers to the parameter estimates in general. To which estimates does the message refer? Thank you very much in advance!


It is a general warning and until someone does a thorough simulation study, it is not fully known which parameters are affected and by how much. It is important to have many clusters relative to the number of cluster-level (level-2) parameters, particularly for variance parameters; the estimates for level-1 parameters are most likely less affected.

Shiny7 posted on Monday, December 08, 2014 - 11:58 pm



Dear Dr. Muthen, thank you very much for that helpful and immediate reply. Shiny 


I ran a two-level, observed-variables-only model and obtained estimates that are consistent with my theory and Stata estimates. The one wrinkle, though, is that I received this familiar error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. ... THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS. Based on the discussion here, I am inclined to proceed with the model. However, I would like to verify that nothing has changed since the last post (about 6 months ago) concerning this issue. And I actually don't have level-2 predictors. I have also had difficulty getting Mplus to accept my desire to use the MLM estimator, even when I added LISTWISE=ON to the data command. It defaults to MLR, reporting that "Estimator MLM is not allowed with TYPE=TWOLEVEL." I may have misinterpreted it, but I thought the manual suggests this is possible. I want to switch to MLM so I can obtain chi-square and RMSEA test statistics.


Permit me to ask one more question. I also badly want to obtain total effects, but the error message I keep getting is that the INDIRECT subcommand is not allowed in TWOLEVEL models. Will switching to MLM from MLR help fix this problem? And by the way, as I mentioned in my earlier post, Mplus will not allow me to switch from MLR to MLM. I am a new user of Mplus.


MLM is not available with TWOLEVEL. This is shown on page 601 of the user's guide. If you do not get chi-square and related fit statistics with MLR, you would not get them with MLM. They are available when means, variances, and covariances are sufficient statistics for model estimation. I believe you should get MODEL INDIRECT with TWOLEVEL.


Dear Linda: Thank you for your quick response. I am not getting them, so there must be something wrong with my model. What I got are the loglikelihood for H0, along with the H0 scaling factor, and the information criteria statistics (AIC and BIC). I am wondering whether this issue would be corrected if I bring my parameters down to 45 (one below the number of clusters I have). If I can use the INDIRECT subcommand to get total effects, I will be happy doing away with some of the paths to achieve this. I have 7 y-variables, but my primary interest is y7.


As a follow-up to the post above, I reran my two-level model with the MODEL INDIRECT subcommand. Although I specified ESTIMATOR = MLR under the analysis command, Mplus aborted prematurely and reported that "MODEL INDIRECT is not available for TYPE=TWOLEVEL with ALGORITHM=INTEGRATION." This is the input:

TITLE: ZeroSum Game Paper Model 1
Data: FILE IS C:\Users\adual\Desktop\SLGR Local\Data and Analysis\MplusM1.csv;
VARIABLE: NAMES ARE y1 y2 y3 y4 y5 y6 y7 x1 x2 x3 x4 x5 x6 x7 x8 x9 z1 z2 z3 z4 st;
MISSING ARE .;
CATEGORICAL = y6;
WITHIN = y1 y2 y3 y4 y5 y6 y7 x1 x2 x3 x4 x5 x6 x7 x8 x9 z1 z2 z3 z4;
BETWEEN = ;
CLUSTER is st;
ANALYSIS: TYPE = TWOLEVEL;
ESTIMATOR = MLR;
MODEL:
%WITHIN%
y1 on x2 x3 x6 x8 x9 z1 z4;
y2 on x2 x3 x6 x7 x8 z1 z4;
y3 on x1 x2 x5 z1 z4;
y4 on x1 y3 x1 x2 x6 z1 z2 z4;
y5 on y2 y3 x2 x3 x4 x8 z1 z4;
y6 on y1 y2 x2 x3 x6 x8 z1 z3 z4;
y7 on y1 y2 y4 y5 y6 x1 x2 x3 x6 x8 z1 z4;
MODEL INDIRECT: y1 IND x2


MODEL INDIRECT is not available with numerical integration. This is the issue. Numerical integration is required because you have a categorical dependent variable. You can specify the indirect effect using MODEL CONSTRAINT.
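
Defining an indirect effect by hand amounts to forming a product of path coefficients. The Python sketch below shows that arithmetic for a single two-path chain, together with a first-order delta-method standard error. All numbers and the function name are hypothetical, the covariance between the two estimates is assumed to be zero unless supplied, and this is not a statement of how Mplus computes the standard error of a new parameter. How to interpret such a product when a categorical variable lies on the path is a separate question that this sketch does not address.

```python
import math


def indirect_effect(a, b, se_a, se_b, cov_ab=0.0):
    """Product-of-coefficients indirect effect with a first-order delta-method SE."""
    estimate = a * b
    variance = b * b * se_a ** 2 + a * a * se_b ** 2 + 2.0 * a * b * cov_ab
    return estimate, math.sqrt(variance)


# Hypothetical path estimates: x -> m (a) and m -> y (b), with their standard errors.
est, se = indirect_effect(a=0.40, b=0.25, se_a=0.10, se_b=0.05)
print(round(est, 4), round(se, 4))
```

A total effect would add the direct path coefficient to this product.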

Luo Wenshu posted on Friday, October 30, 2015 - 2:19 am



Dear Dr. Muthen, I am running multiple-group (gender groups) analyses. Because there are about 100 classes (the number of classes is not equal for male and female students), I try to control for non-independence in the data by using Type=Complex. I first tested a measurement model with the same factor pattern between gender groups, and then a more restricted model with the same factor pattern and the same factor loadings. Both models have good fit. However, for the second, more restricted model (and also the subsequent more restricted models) with fewer free parameters, I got the warning message below. I know that in all the models, the number of free parameters is much larger than the number of classes. What's the reason for getting this message for the more restricted models, but not for the first model with more free parameters? Can I trust the results? THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.621D-16. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 102, Group MALE: HDST2 THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.


We need to see your outputs to say. Please send them to Support along with your license number.

SABA posted on Tuesday, December 15, 2015 - 7:36 am



Hi, is there a lower limit on the number of observations in each cluster? Could you please suggest a reference on that? Thank you


The lowest limit is 1. Look for papers/books by Joop Hox. Or, post on Multilevelnet. 


Dear Dr. Muthen, Could you recommend a reference for this statement: "it is recommended to have no fewer than 30-50 clusters (to run two-level analyses)"? Thanks (in advance)!


Joop Hox has done a lot of work in this area. I would start looking at his work. 


Dear Prof. Muthen, the participants in my study take part in 6 university courses. The students in three groups get a treatment, the others are the control group, and I want to see if the interest of the students in the treatment group changes. Because the ICC is in some cases over .10, I thought about using type = complex to account for the influence of belonging to the university courses. I know you wrote in an earlier answer that you need at least 20 groups to use type=complex. But in my case the data are not really hierarchical because I don't have other variables to consider. The only thing I want to check is whether the interest of the treatment group changes, and therefore I want to take into account membership in the different courses. So is it possible to use type=complex with fewer than 20 groups when the data are not really hierarchical? My input looks like this: usevar = VT_SeUSu ; cluster = Semi; analysis: type = twolevel basic; Thank you for your help!


All you can do is to let university course be represented by dummy variables. 


Hello, I am running basic stats for a 3-level multilevel model (level 1 = time (3 time points), level 2 = childid, level 3 = famid). I have a question about the estimated cluster sizes. Specifically, the CHILDID level seems correct. However, the FAMID level should have an average cluster size of around 2-3, but it is much higher (i.e., it is including each child and each time point in the estimated cluster size). Could you advise? Thank you. The estimated cluster sizes are below:

Average cluster size for CHILDID level 2.348
Estimated Intraclass Correlations for the Y Variables for CHILDID level
Variable   Intraclass Correlation
FEARFUL_   0.198

Average cluster size for FAMID level 5.394
Estimated Intraclass Correlations for the Y Variables for FAMID level
Variable   Intraclass Correlation
FEARFUL_   0.151


Please send the output, the data set, and your license number to support@statmodel.com. 
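The question above turns on what is being counted as a "cluster size" at the family level. As a rough illustration only (hypothetical toy data, written in Python rather than Mplus, and not a statement about how Mplus computes its output), the sketch below contrasts counting rows per family with counting distinct children per family in long-format data.

```python
from collections import defaultdict

# Hypothetical long-format rows: (famid, childid, time). Not the poster's data.
rows = [
    (1, 11, 1), (1, 11, 2), (1, 11, 3),
    (1, 12, 1), (1, 12, 2),
    (2, 21, 1), (2, 21, 2), (2, 21, 3),
    (2, 22, 1), (2, 22, 2), (2, 22, 3),
    (2, 23, 1), (2, 23, 2),
]

rows_per_family = defaultdict(int)     # every child-by-time row counts
children_per_family = defaultdict(set)  # only distinct children count
for famid, childid, _time in rows:
    rows_per_family[famid] += 1
    children_per_family[famid].add(childid)

n_families = len(rows_per_family)
avg_rows = sum(rows_per_family.values()) / n_families
avg_children = sum(len(c) for c in children_per_family.values()) / n_families

print(avg_rows)      # 6.5
print(avg_children)  # 2.5
```

In the posted output, 5.394 / 2.348 is about 2.3, which would be consistent with the family-level figure counting observations rather than children, but the thread does not confirm this.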


Dear Dr. Muthen, we are running an SEM analysis with TYPE=COMPLEX and keep getting the same warning:

THE MODEL ESTIMATION TERMINATED NORMALLY

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.178D-15. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 44, [ SRLTB_3 ]

followed by:

THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.

We wonder whether the results are still trustworthy given this warning. We have 44 clusters (schools), 331 observations (teachers), 72 dependent variables and 14 continuous latent variables. Hope you can help, thanks in advance.


I don't think a simulation study has been done on this, so we don't know for sure. It would seem that the results are OK if there are at least fewer cluster-level parameters than clusters.
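The last line of the warning compares the number of free parameters with the number of clusters minus the number of strata that have more than one cluster. A minimal Python sketch of that comparison follows. The function name is hypothetical, and treating an unstratified sample as a single stratum is an assumption, not something stated in this thread. Only the 44 clusters come from the post; the parameter counts in the example are made up, since the post does not give the total.

```python
def exceeds_cluster_limit(n_free_params: int, n_clusters: int,
                          n_strata_multi: int = 1) -> bool:
    """Return True if there are more free parameters than clusters minus
    the number of strata that contain more than one cluster.

    n_strata_multi defaults to 1 on the ASSUMPTION that an unstratified
    sample counts as a single stratum.
    """
    return n_free_params > n_clusters - n_strata_multi


# 44 clusters are from the post; the parameter counts are made up.
print(exceeds_cluster_limit(120, 44))  # True
print(exceeds_cluster_limit(30, 44))   # False
```

This only reproduces the comparison named in the message; it does not say whether the standard errors are usable, which the reply above leaves open.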


Dear Professor Muthén, I'm a new user of Mplus. I'm trying to perform a multilevel analysis (two-level model) and I'm concerned about a few points. I have 1240 individuals and 20 clusters (clinical groups). In the model tested I considered thought suppression and negative affect as level 1 predictors; mindfulness, self-compassion and acceptance as level 2 predictors; and depression as the outcome. Is it possible to perform this analysis with just 20 clusters? The ICC in the unconditional model was only 3.9%, but the design effect was 3.38485. Another, possibly naive, question: if I split my clusters into male/female, I'll have 40 clusters. Do you think this would have any benefit? Sincerely, Joana Costa


20 clusters is quite low for two-level analysis. You will probably find that level 2 relations are insignificant (although the SEs may not be reliable due to the few clusters). At least 50 is typically recommended. Splitting into males/females doesn't help.


If you are familiar with Bayes, that is an alternative (see my 2010 Bayes paper on our website). 
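For reference, the design effect quoted in the question can be reproduced with the usual formula, design effect = 1 + (average cluster size - 1) x ICC. The sketch below is in Python rather than Mplus, and the function name is made up for illustration. The inputs (1240 individuals, 20 clusters, ICC of about 3.9%) are the figures from the post, and the formula assumes roughly equal cluster sizes.

```python
def design_effect(n_individuals: int, n_clusters: int, icc: float) -> float:
    """Approximate design effect: 1 + (average cluster size - 1) * ICC."""
    avg_cluster_size = n_individuals / n_clusters
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Figures from the post: 1240 individuals, 20 clusters, ICC of about 0.039.
deff = design_effect(1240, 20, 0.039)
print(round(deff, 3))  # 3.379
```

The result is close to the 3.38485 reported in the post; the small gap presumably comes from the ICC being rounded to 3.9%. The design effect is large here mainly because the average cluster size is 62, which does not change the fact that only 20 clusters are available for the level 2 part of the model.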


Thanks for your help. Another question related to these 50 clusters: how many individuals should I have in each cluster for a two-level analysis? Sincerely, Joana

shonnslc posted on Saturday, May 18, 2019  8:22 pm



Hi, I am doing a path analysis with a nested data structure (cluster = 6). I know that type = complex is not suitable for such a small number of clusters. However, I am wondering what the consequences are of using type = complex with this few clusters. Thanks.


Dear Drs. Muthen, I'm trying to build a moderated mediation SEM to evaluate the effect of classroom environment quality (n=119 classrooms) on measures of performance at the child level. Classroom environment is measured by an instrument purportedly measuring 3 latent aspects of classroom quality at the classroom level. The DV, mediator, and moderator are all measured at the child level (ordinal, n=1300). Is a multilevel model indicated when the IV is already measured at level 2?

For the initial stage of evaluating the validity of my classroom quality measure, is TYPE=COMPLEX the best way to evaluate the measurement model for the level 2 predictor with n=119? Thank you kindly in advance for your guidance. Melissa


For the question in your first paragraph, I recommend you study Preacher et al.'s "2-1-1" approach as described on our Mediation page: http://www.statmodel.com/Mediation.shtml

For the question in your second paragraph, no. For the sample of N=119 you don't use Type=Complex, just regular analysis.


Dear Professor Muthén, I ran a 3-level model and was given this output: THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF LEVEL 3 CLUSTERS. REDUCE THE NUMBER OF PARAMETERS. I have 14 clusters at level 3 and 10 free parameters at this level. So does this recommendation (to reduce the parameters) not refer to the parameters of one particular level (level 3)? Does that mean I can also reduce parameters at level 2 to overcome the nonidentification? Sincerely, Maren


If you are certain that the model is identified you can ignore this message. You might find this helpful http://statmodel.com/download/ConditionNumber.pdf (Section 2.4) 
