
Anonymous posted on Thursday, June 10, 2004  6:23 am



Hi Linda, I have a second-order factor within a larger SEM model. The second-order factor is measured by 2 first-order factors and one measured variable. Which of the following methods should I use for this part of my model? Assuming I have 7 measured variables (x1-x7) for this part of the model: Method 1: model: f1 by x1-x3; f2 by x4-x6; f3 by x7@1; f4 by f1-f3; Method 2: model: f1 by x1-x3; f2 by x4-x6; f4 by f1 f2 x7; Thanks


I believe they would be identical if you add x7@0; to the MODEL command of Method 1. I would use Method 2 because there is less room for error in setting up the model. 
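
Written out with the variable ranges restored, the two setups compared above could look like the following sketch. The x7@0 line in Method 1 is the addition suggested in the reply; everything else is taken from the question, and the two MODEL commands are alternatives, not parts of one input.

```mplus
! Method 1: a single-indicator factor f3 stands in for x7
MODEL:
  f1 BY x1-x3;
  f2 BY x4-x6;
  f3 BY x7@1;      ! loading of x7 on f3 fixed at 1
  x7@0;            ! residual variance of x7 fixed at 0
  f4 BY f1-f3;

! Method 2: x7 is a direct indicator of the second-order factor
MODEL:
  f1 BY x1-x3;
  f2 BY x4-x6;
  f4 BY f1 f2 x7;
```

Method 2 has fewer statements to get wrong, which is the reason given in the reply for preferring it.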

ldumenci posted on Tuesday, June 29, 2004  6:14 am



Hi Linda, Do you see any identification problems below? f1 by x1-x5; f2 by x6-x10; f3 by x11-x15; f by f1 f2 f3; f f1 f2 f3 on y1 y2; Thanks


If you are receiving an error about identification, send the output including TECH1 to support@statmodel.com and I will be happy to take a look at it. 

anonymous posted on Wednesday, October 26, 2005  12:10 pm



I am trying to test a model where there are 4 first-order factors subsumed by a single higher-order factor, and there is also a fifth first-order factor that covaries with the higher-order factor. Is this model meaningfully different than a model that has all 5 first-order factors subsumed by the higher-order factor?


To answer your question, I used Example 5.6 in the Mplus User's Guide and ran your two models. The model fit and the number of parameters are the same. The difference is that in one you estimate a factor loading and a residual variance and in the other you estimate a variance and a covariance.

Anonymous posted on Thursday, October 27, 2005  10:46 am



Thank you. So does this difference (factor loading and residual variance versus variance and covariance) make the models meaningfully distinct, given that the number of parameters and the fit statistics are the same? That is, is there any reason to test both these models?


These models are not statistically distinguishable.

anonymous posted on Thursday, January 19, 2006  9:19 am



I am trying to create a CFA model as a basis for nested models I want to compare. However, with my input, my output does not show any numbers for CFI, TLI and RMSEA. How can I get the numbers for them? INPUT INSTRUCTIONS TITLE: Fieke DATA: FILE IS "D:\dataspss.dat"; FORMAT is 12f1; VARIABLE: NAMES ARE country inno1 inno2 inno3 inno4 quality inno5 custsat custloy emplsat emplret emplloc; USEVARIABLES ARE country inno1 inno2 inno3 inno4 quality inno5 custsat custloy emplsat emplret emplloc; GROUPING IS country (1=UnitedKingdom 2=Austria 3=Ireland 5=NewZealand 6=Australia); ANALYSIS: TYPE IS MEANSTRUCTURE; ESTIMATOR IS ML; ITERATIONS = 100000; CONVERGENCE = 0.00001; MODEL: f1 BY inno1 inno2 inno3 inno4 inno5; f2 BY quality; f3 BY custsat custloy; f4 BY emplsat emplret emplloc; f1 ON f2 f3 f4; f1 BY inno1*; f2 BY quality*; f3 BY custsat*; f4 BY emplsat*; MODEL UnitedKingdom: f1@0.0; f2@0.0; f3@0.0; f4@0.0; INPUT READING TERMINATED NORMALLY Fieke SUMMARY OF ANALYSIS Number of groups 5 Number of observations Group UNITEDKINGDOM 487 Group AUSTRIA 657 Group IRELAND 249 Group NEWZEALAND 472 Group AUSTRALIA 250 Number of y-variables 11 Number of x-variables 0 Number of continuous latent variables 4 Observed variables in the analysis INNO1 INNO2 INNO3 INNO4 QUALITY INNO5 CUSTSAT CUSTLOY EMPLSAT EMPLRET EMPLLOC Grouping variable COUNTRY Continuous latent variables in the analysis F1 F2 F3 F4 Estimator ML Information matrix EXPECTED Maximum number of iterations 100000 Convergence criterion 0.100D-04 Maximum number of steepest descent iterations 20 Input data file(s) D:\dataspss.dat Input data format (12F1) THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 59.
MODEL RESULTS Estimates Group UNITEDKINGDOM F1 BY INNO1 0.908 INNO2 0.940 INNO3 0.908 INNO4 0.695 INNO5 0.459 F2 BY QUALITY 0.338 F3 BY CUSTSAT 0.737 CUSTLOY 0.816 F4 BY EMPLSAT 0.725 EMPLRET 0.811 EMPLLOC 0.439 F1 ON F2 0.016 F3 1.845 F4 1.883 F3 WITH F2 0.228 F4 WITH F2 0.078 F3 0.118 Means F2 0.000 F3 0.000 F4 0.000 Intercepts INNO1 3.468 INNO2 3.389 INNO3 3.420 INNO4 3.310 QUALITY 3.700 INNO5 3.448 CUSTSAT 3.715 CUSTLOY 3.639 EMPLSAT 3.313 EMPLRET 3.346 EMPLLOC 3.420 F1 0.000 Variances F2 0.000 F3 0.000 F4 0.000 Residual Variances INNO1 0.265 INNO2 0.211 INNO3 0.164 INNO4 0.445 QUALITY 0.505 INNO5 0.526 CUSTSAT 0.603 CUSTLOY 0.643 EMPLSAT 0.724 EMPLRET 0.964 EMPLLOC 0.694 F1 0.000 Group AUSTRIA F1 BY INNO1 0.908 INNO2 0.940 INNO3 0.908 INNO4 0.695 INNO5 0.459 F2 BY QUALITY 0.338 F3 BY CUSTSAT 0.737 CUSTLOY 0.816 F4 BY EMPLSAT 0.725 EMPLRET 0.811 EMPLLOC 0.439 F1 ON F2 1.127 F3 0.268 F4 0.283 F3 WITH F2 0.349 F4 WITH F2 0.466 F3 0.368 Means F2 0.207 F3 0.013 F4 0.192 Intercepts INNO1 3.468 INNO2 3.389 INNO3 3.420 INNO4 3.310 QUALITY 3.700 INNO5 3.448 CUSTSAT 3.715 CUSTLOY 3.639 EMPLSAT 3.313 EMPLRET 3.346 EMPLLOC 3.420 F1 0.475 Variances F2 0.671 F3 0.481 F4 0.747 Residual Variances INNO1 0.304 INNO2 0.230 INNO3 0.174 INNO4 0.415 QUALITY 0.455 INNO5 0.459 CUSTSAT 0.235 CUSTLOY 0.196 EMPLSAT 0.252 EMPLRET 0.227 EMPLLOC 0.401 F1 0.299 Group IRELAND F1 BY INNO1 0.908 INNO2 0.940 INNO3 0.908 INNO4 0.695 INNO5 0.459 F2 BY QUALITY 0.338 F3 BY CUSTSAT 0.737 CUSTLOY 0.816 F4 BY EMPLSAT 0.725 EMPLRET 0.811 EMPLLOC 0.439 F1 ON F2 0.676 F3 0.291 F4 0.505 F3 WITH F2 0.360 F4 WITH F2 0.119 F3 0.397 Means F2 0.138 F3 0.015 F4 0.173 Intercepts INNO1 3.468 INNO2 3.389 INNO3 3.420 INNO4 3.310 QUALITY 3.700 INNO5 3.448 CUSTSAT 3.715 CUSTLOY 3.639 EMPLSAT 3.313 EMPLRET 3.346 EMPLLOC 3.420 F1 0.011 Variances F2 0.588 F3 0.531 F4 0.649 Residual Variances INNO1 0.387 INNO2 0.259 INNO3 0.174 INNO4 0.395 QUALITY 0.363 INNO5 0.468 CUSTSAT 0.213 CUSTLOY 0.175 EMPLSAT 0.397 EMPLRET 0.502 
EMPLLOC 0.560 F1 0.429 Group NEWZEALAND F1 BY INNO1 0.908 INNO2 0.940 INNO3 0.908 INNO4 0.695 INNO5 0.459 F2 BY QUALITY 0.338 F3 BY CUSTSAT 0.737 CUSTLOY 0.816 F4 BY EMPLSAT 0.725 EMPLRET 0.811 EMPLLOC 0.439 F1 ON F2 1.012 F3 0.640 F4 0.105 F3 WITH F2 0.430 F4 WITH F2 0.243 F3 0.289 Means F2 0.160 F3 0.149 F4 0.403 Intercepts INNO1 3.468 INNO2 3.389 INNO3 3.420 INNO4 3.310 QUALITY 3.700 INNO5 3.448 CUSTSAT 3.715 CUSTLOY 3.639 EMPLSAT 3.313 EMPLRET 3.346 EMPLLOC 3.420 F1 0.360 Variances F2 0.707 F3 0.486 F4 0.718 Residual Variances INNO1 0.219 INNO2 0.237 INNO3 0.167 INNO4 0.488 QUALITY 0.422 INNO5 0.498 CUSTSAT 0.233 CUSTLOY 0.213 EMPLSAT 0.302 EMPLRET 0.289 EMPLLOC 0.560 F1 0.230 Group AUSTRALIA F1 BY INNO1 0.908 INNO2 0.940 INNO3 0.908 INNO4 0.695 INNO5 0.459 F2 BY QUALITY 0.338 F3 BY CUSTSAT 0.737 CUSTLOY 0.816 F4 BY EMPLSAT 0.725 EMPLRET 0.811 EMPLLOC 0.439 F1 ON F2 0.607 F3 0.015 F4 0.029 F3 WITH F2 0.405 F4 WITH F2 0.226 F3 0.388 Means F2 0.177 F3 0.101 F4 0.300 Intercepts INNO1 3.468 INNO2 3.389 INNO3 3.420 INNO4 3.310 QUALITY 3.700 INNO5 3.448 CUSTSAT 3.715 CUSTLOY 3.639 EMPLSAT 3.313 EMPLRET 3.346 EMPLLOC 3.420 F1 0.134 Variances F2 0.847 F3 0.623 F4 1.061 Residual Variances INNO1 0.199 INNO2 0.248 INNO3 0.121 INNO4 0.321 QUALITY 0.469 INNO5 0.568 CUSTSAT 0.383 CUSTLOY 0.332 EMPLSAT 0.408 EMPLRET 0.262 EMPLLOC 0.791 F1 0.45 


Please do not post outputs on Mplus Discussion. It takes too much room. Please send your input, data, output, and license number to support@statmodel.com and we will look into your problem. 


I'm testing a measurement model with 2 second-order factors, each measured by five first-order factors. If the hypothesized model does not fit the data, is it appropriate to add correlations between the first-order factors, or are the second-order factors supposed to represent those correlations?


Yes, this would be similar to adding residual covariances to the first-order factors.

john posted on Wednesday, August 09, 2006  9:28 am



Hi, sorry, I have two questions. 1. Do you know how to carry out the Fornell-Larcker test of discriminant validity? I am stuck with this as I do not know what the average variance extracted is. Any help will be a massive relief. 2. If my total variance extracted is 30%, how big a problem is this for me? Thank you, John

john posted on Wednesday, August 09, 2006  9:56 am



Hi, or what is the best way to calculate discriminant validity using SPSS EFA? John


I have not heard of the Fornell-Larcker test. Maximizing the percent variance extracted is not the goal of factor analysis. Since you seem to be using SPSS, they might be able to guide you further.


Dear Prof. Muthen, I am fitting categorical LCA models for four categorical manifest variables with two levels each. They are diagnostic tests. I also have one covariate, the age of the patient. Previously my model did not fit well. Through diagnostics for local dependence using the CONDEP program, I detected that two tests are locally dependent. When I fit the model with an adjustment for local dependence in the LEM software, the model was OK. I really want to fit the same model in Mplus, now accounting for local dependence. I tried, but I don't seem to get it right. I tried Example 7.16 for the Qu model, which is somewhat close to my situation, but it does not seem to work well in my situation either. I guess maybe I don't have the coding right. Would you kindly advise me? Here is my Mplus syntax for the model (the local dependence is between the A and D variables): DATA: file is D:\sum\zlon.txt; variable: names are id age A D B H cou; usevar age A D B H; categorical are A D B H; classes=cl(2); analysis: type = mixture; starts = 0; model: %overall% cl#1 on age; [A$1*10 D$1*10 B$1*10 H$1*10]; %cl#1% [A$1*10 D$1*10 B$1*10 H$1*10]; output: tech10 tech11;


The conditional independence in Example 7.16 is modeled by the f BY statement. I don't see this in your input. Please look at the example again. 


Dear Linda, I am afraid I was not able to communicate my problem well. Here is the original code I used for adjusting for local dependence, but it did not seem to work well. That is why I gave you the naive code, i.e., the one without local dependence, to see how we could adjust it. I would appreciate your help once again. DATA: file is D:\sum\zlon.txt; variable: names are id age u1-u4 cou; USEVARIABLES ARE age u1-u4; CATEGORICAL = u1-u4; CLASSES = c(2); ANALYSIS: TYPE = MIXTURE; ALGORITHM = INTEGRATION; MODEL: %OVERALL% c#1 on age; f by u1-u2@0; f@1; [f@0]; %c#1% [u1$1-u4$1*1]; f by u1@1 u2; OUTPUT: TECH1 TECH8 tech10;


I don't know what you mean by it did not seem to work well. Perhaps you should send your input, data, output, license number, and explanation of what you mean to support@statmodel.com. 

Lars Penke posted on Wednesday, September 19, 2007  2:19 pm



Hello, From prior analyses, I have a well-fitting Mplus solution to a hierarchical CFA with 3 latent factors and a higher-order factor. Now I fixed all loadings to the solution I found and then wanted to estimate the correlations between all 4 factors and an external variable: VARIABLES ARE rv1r rv2r rv5r re_1 re_2 re_4 ra_1 ra_2 ra_5 extvar; ANALYSIS: TYPE = general; MODEL: rv BY rv1r@1, rv2r@0.975, rv5r@0.898; re BY re_1@1, re_2@2.063, re_4@2.581; ra BY ra_1@1, ra_2@0.967, ra_5@0.819; rho BY rv@1 re@0.653 ra@4.315; rv WITH extvar; re WITH extvar; ra WITH extvar; rho WITH extvar; However, I receive the following error message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 27. TECH1 output: Parameter 27 is "rho WITH extvar". I don't see why this model is not identified. And how could I fix it?


You cannot identify all of the direct and indirect effects in the model as you are trying to do. 


Hi, I'm a new user of Mplus, so I apologize if this is an obvious question. I tried to create a second-level factor based on three latent factors and one observed variable. This was my model syntax: alc by C81AL71 P7AlcD P7AlcA; pot by C81CL71 P7CanD P7CanA; tob by C81TL71 P7TobD P7TobW; anti by alc pot tob C5F7ASR; While the model fit the data well, I was surprised to see that the output provided two sets of estimates for my second-level factor (anti): one for just my observed variable (C5F7ASR) and another for my three latent factors. Additionally, since I had only one observed variable measuring my second-level factor, it looks like Mplus set this variable's factor loading at 1. However, it also did this to one of my latent factors. Is there something wrong with my syntax, or can I not define a second-level factor that is measured by a combination of latent factors and observed variables? If it is possible, can you explain how I interpret the output for this second-level factor? Thank you for your help!


Please send the output and your license number to support@statmodel.com. I am having trouble understanding your description of the results. 

Carla Bann posted on Friday, August 15, 2008  7:36 am



I am fitting the following second-order factor model with two groups of respondents. F1 by var1-var4; F2 by var5-var8; F3 by var9-var12; F4 by F1-F3; The model runs fine when I analyze the two groups separately; however, I get an error about identification when I try to include both groups and use the grouping statement. The identification error does not occur when I remove the second-order factor (i.e., just have the 3 first-order factors). I'd really appreciate any suggestions. Thank you!


You need to fix the intercepts of the first-order factors to zero in all groups.
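
A minimal sketch of that fix, using the factor and variable names from the question. It assumes the constraint is placed in the overall MODEL command so that it applies to every group; the rest of the input (including the GROUPING option) is not shown in the thread and is left out here.

```mplus
MODEL:
  F1 BY var1-var4;
  F2 BY var5-var8;
  F3 BY var9-var12;
  F4 BY F1-F3;
  [F1-F3@0];       ! intercepts of the first-order factors fixed at zero
```

Square brackets refer to means and intercepts, so [F1-F3@0] constrains the intercepts only and leaves the residual variances of F1-F3 free.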

Carla Bann posted on Monday, August 18, 2008  9:38 am



This may be a dumb question. However, I am running two models, one with 3 factors and the other with the 3 factors loading on one second order factor. The fit indices for the two models are exactly the same. Is that to be expected? 


The second-order factor is just-identified. This is why it does not change model fit.
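
A rough parameter count shows why, assuming a single second-order factor whose only indicators are p first-order factors and nothing else in the model is related to it. The p correlated first-order factors have p(p+1)/2 variances and covariances. The second-order structure replaces them with p-1 free loadings (one is fixed at 1), one second-order factor variance, and p residual variances:

```latex
\[
df = \frac{p(p+1)}{2} - \bigl[(p-1) + 1 + p\bigr] = \frac{p(p-3)}{2}
\]
\[
p = 2:\ df = -1 \ (\text{not identified}), \qquad
p = 3:\ df = 0 \ (\text{just-identified}), \qquad
p = 4:\ df = 2 \ (\text{overidentified})
\]
```

With three first-order factors the second-order part uses exactly as many parameters as it replaces, so the fit cannot change.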


How can I set the residual variances of first-order factors to 1? model: f1 BY y1 y2 y3; f2 BY y4 y5 y6; f3 BY f1 f2; With f1@1 I would fix the total variance of this first-order factor to 1, wouldn't I? Thank you!


In a situation where f1 is a dependent variable, as in your example where it is a factor indicator, f1@1 fixes the residual variance at one.


Hi, all, I have a quick question about a second-order factor in a measurement model. Specifically, I am trying to run a measurement model with a second-order factor built upon 2 first-order underlying variables. See below. MODEL: F1 by burdn9 burdn2 burdn6 burdn10; F2 by burdn3 burdn4 burdn5 burdn7; F3 by F1 F2; F1 @ 0; F1 with F2; But Mplus did not like it. Any suggestions? Thanks a lot and happy new year!


A second-order factor with two first-order factor indicators is not identified.

JPower posted on Wednesday, March 18, 2009  11:13 am



Hello, Can you provide some guidance as to how to choose between a model with correlated factors and a model with those factors loading on a second-order factor? As the second-order factor is just-identified, it does not change model fit. What else should be considered? R-square for the first-order latent factors? Anything else? Thanks.


Statistically, you can choose between those models only if you have more than 3 first-order factors.


Hi, I want to test a model with 8 first-order factors and 3 second-order factors. One of the second-order factors is indicated by only two first-order factors. Are there any 'tricks' (like setting parameters to 1 or 0) to get this model identified? Thanks for your help.


I think this model is identified because the factor with two factor indicators will borrow information from the other factors. If you run this and have a problem, please send the full output and your license number to support@statmodel.com. 


The overall model will be identified. What you have there is a factor that will be locally nonidentified. This should not pose a problem in the estimation process. However, if you do want to have it locally identified, you may either want to fix both loadings to 1 or to fix them to equality and to fix the higher-order variance to 1. Little et al. discuss this in the context of first-order models. Little, T.D., Lindenberger, U., & Nesselroade, J.R. (1999). On selecting indicators for multivariate measurement and modeling with latent variables: When “good” indicators are bad and “bad” indicators are good. Psychological Methods, 4, 192-211.
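
The two options just described might be written as in the sketch below. The names h, fa, and fb are hypothetical placeholders for the higher-order factor and its two first-order indicators; they do not come from the original question. The two variants are alternatives, and only the higher-order statements are shown.

```mplus
! Option A: both higher-order loadings fixed at 1, variance of h left free
h BY fa@1 fb@1;

! Option B: both loadings free but held equal, variance of h fixed at 1
h BY fa* (1)
     fb  (1);
h@1;
```

In Option B the shared label (1) imposes the equality, and the asterisk frees the loading that Mplus would otherwise fix at 1 by default.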


Hi Drs. Muthen, A professor told me that when you have latent variables loading onto a higher-order latent variable, it is not possible to estimate means for the higher-order factor and ALL of the lower-order factors. He said that you'd either need to NOT estimate the mean for the higher-order factor, or NOT estimate the mean for one of the lower-order factors; otherwise your model would not be identified. If this is true, how do I choose to NOT estimate either the mean of the higher-order factor or the mean of one of the lower-order factors? Do I set the mean to zero using "@0" for one of these? Or do I just comment out the portion of my code where I had previously been asking for a mean? I apologize for my ignorance on how to do this; I tried several things in my Mplus code just now, but none of them worked, and so I could use some advice. Thanks for your help! Lisa


In a cross-sectional study, factor means cannot be identified at all. It is only with multiple groups or multiple time points that factor means can be estimated in all but one group and at all but one time point.


Thanks, Dr. Muthen. Sometimes these theoretical points about SEM are challenging to translate into Mplus code. Thanks again! 


This is the default in Mplus so no code is necessary. 


Dear Dr. Muthén, I am testing a second-order CFA of a measure with 28 items, using 5 first-order and 2 second-order latent variables. All items use a 5-point Likert scale – however some items show skewness of > +1.5 and the skewness varies from item to item (some are positive, some negative, and quite a few are skewed). Here is the second-order LV portion of my code: F6 BY F1* F2; F7 BY F3* F4 F5; F6@1; F7@1; I first used an ML algorithm and got satisfactory fit (CFI = 0.907, SRMR = 0.060). However, I tried rerunning the model using WLSMV (and specifying all items as categorical), and received the following warning: WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE…. I checked to see whether any latent variables had negative residual variances, and one of the first-order residual variances was -1.337, which seems too large to fix the variance to zero. I am not sure what else I could try… Thanks in advance for your help.


I would suggest doing an EFA on the first-order factor indicators and using the correlation matrix of the five factors as data in an EFA of the five factors. It may be that the second-order factor structure is not correct.


Hi Linda and Bengt, Normally, a factor model with three indicators is just-identified; a factor model with two indicators is underidentified; and a factor model with 4+ indicators is overidentified. However, in a higher-order model with two correlated higher-order factors, must each higher-order factor have at least 3 first-order factors for the model to be identified? If one of the two higher-order factors has only two first-order factors, will this present estimation problems? To provide specifics, I have Aggressiveness and Rule-Breaking first-order factors loading onto an Externalizing second-order factor; and I have Somatic Complaints, Withdrawal, and Anxious/Depressed first-order factors loading onto an Internalizing second-order factor. Is the Externalizing portion of the model OK, or will this setup present estimation problems?


A second-order factor must have at least three first-order factor indicators to be identified.


Thank you so much! 


I need some very basic information. I'm not sure I'm reading the output of the second-order factor analysis correctly. Where can I see the correlations between the second-order and first-order factors? Are they these?
F5 BY
F1 0,734 0,068 10,813 0,000
F2 0,961 0,035 27,219 0,000
F3 1,053 0,053 19,840 0,000
F4 0,980 0,044 22,270 0,000
And the errors of the four factors? Are they these?
Residual variances
F1 0,462 0,100 4,636 0,000
F2 0,076 0,068 1,118 0,264
F3 999,000 999,000 999,000 999,000
F4 0,039 0,086 0,453 0,651
Thank you!


These parameters are not identified in a second-order factor model.

Heidi Kjogx posted on Thursday, March 29, 2012  1:16 am



Hi. I can see that my problem has been seen before, but I still do not understand how to fix it. I am running a model with 3 factors loading on one second-order factor. The fit indices for the two models are exactly the same. You have previously written that this is because the model is "just-identified". I am still not sure what this means and what can be done to fix it. What can I do to make it work? Here is my model: ANALYSIS: type=general; estimator=mlm; MODEL: f1 BY PCS1 PCS2 PCS3 PCS4 PCS5 PCS12; f2 BY PCS8 PCS9 PCS10 PCS11; f3 BY PCS6 PCS7 PCS13; f4 BY f1-f3; OUTPUT: standardized;


There is nothing wrong with the model being just-identified. It means that there are zero degrees of freedom and the fit of the model cannot be tested. You would need four or more first-order factors to change this.


Hi Linda and Bengt, I have run a first-order factor model (5 factors and 1 observed variable) which runs OK and fits OK. I have now run a second-order model where one of the first-order factors (SCSOC) has a negative residual variance. I have quite a small sample size (approx. 230) and realise that I can set the error for SCSOC to 0 or drop the factor (I think this would make my second-order model unidentified), but neither of these is ideal. Is there anything else I can do to remedy this problem? Many thanks for your help. Here is my model input: MODEL: ATTAIN BY english2 maths2 science2; attndr; LIKING BY sl2_1r sl2_3r sl2_4r sl2_6r sl2_7r; BEHAV BY disrup2r cooper2r; SCSOC BY SC2frnds SC2older SC2bulld; SCSCH BY SC2class SC2ntchr SC2trvl SC2ltppl SC2size SC2break SC2dnnr; attndr WITH LIKING; SL2_6R WITH SL2_3R; SCFUNC BY ATTAIN BEHAV attndr; SCAFF BY SCSCH SCSOC;


This points to the model being misspecified. You might want to look at modification indices to see if cross-loadings or residual covariances are needed. Or you might want to go back to an EFA to see if the first-order factors are well specified.


Dear Prof. Muthen, My measurement model consists of three first-order latent constructs and a second-order latent construct with five first-order factors. One of those five first-order factors (i.e., general mood) has 7 items, of which 3 are negatively worded. Adding a negative wording factor (method factor) to the second-order factor solution improved the loadings of these three negatively worded items. However, when we estimate the structural model between those four latent factors, the loadings of the three negatively worded items on the first-order construct (i.e., general mood) drop far below the 0.40 cutoff. Can you help me to clarify this problem?


I would ask for MODINDICES (ALL) in the OUTPUT command. The need for cross-loadings and residual covariances may be the cause of the misfit.
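
As a minimal sketch, the request goes in the OUTPUT command; the rest of the input is assumed to stay as it was.

```mplus
OUTPUT:
  MODINDICES (ALL);   ! request modification indices, as suggested above
```

Large values for a cross-loading or a residual covariance point to the parts of the model that disagree most with the data.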


Dear Prof. Muthen, Thank you very much for your quick answer. Unfortunately the suggested modification indices are not possible with respect to content. Are these suggestions of the modification indices the only solution for this problem? Or could there be other underlying problems which can be addressed? Thank you very much in advance. 


If the suggested modification indices do not make sense, you may need to rethink your model. The modification indices show where the model and the data do not agree. Perhaps you should try testing parts of your model to see where the misfit is. 


Hi Linda, I understand that a 2nd-order factor needs 3 1st-order factors to be just-identified. However, Mplus seems happy to run my model below, where I have only 2 1st-order factors and 1 observed variable. Is it OK to do this? MODEL: ATTAIN BY english2 maths2 science2; BEHAV BY disrup2r cooper2r; attndr; SCFUNC BY ATTAIN BEHAV attndr; Thanks, Terry


You need three factor indicators for the second-order factor. They can be first-order factors or observed variables.

Marcus Crede posted on Wednesday, November 14, 2012  11:19 am



If I want to examine whether data (16 manifest variables, 4 theoretical first-order factors) are characterized by a single second-order factor onto which all four first-order factors load, I believe that I need to examine the fit of such a model and compare it to the fit of an alternative model with four correlated first-order factors. My first question is this: do I base my decision on a comparison of chi-square values or on a comparison of other fit statistics (or do both)? My concern about using chi-square values is that I do not see how the more parsimonious higher-order model is nested within the model with correlated first-order factors. Hence my second question: Are these two models nested and if so, how are they nested? Thank you for any guidance that you can offer. Marcus


Yes, these models are nested. You can think of the model with correlated first-order factors as an unrestricted H1 model and the second-order factor model as a restrictive H0 model. You can compare them using a chi-square difference test.


Dear Dr. Muthen, I ran a CFA with three first-order factors (the first has 5 variables, the second 6, the third 5), and then I conducted a second-order analysis with 1 second-order factor measured by the same 3 first-order factors I just mentioned. The second-order analysis gives me exactly the same results as the first-order analysis. Am I doing something wrong? I did not obtain any warning in the output. My syntax was TITLE: CPSSinvestigation DATA: FILE IS "C:\Users\CP\Documents\AnalisisMPLUS\CPSS.dat"; FORMAT IS FREE; VARIABLE: CATEGORICAL ARE CPSS1 CPSS2 CPSS3 CPSS4 CPSS5 CPSS6 CPSS7 CPSS9 CPSS10 CPSS11 CPSS12 CPSS13 CPSS14 CPSS15 CPSS16 CPSS17; NAMES ARE ID CPSS1 CPSS2 CPSS3 CPSS4 CPSS5 CPSS6 CPSS7 CPSS9 CPSS10 CPSS11 CPSS12 CPSS13 CPSS14 CPSS15 CPSS16 CPSS17; USEVARIABLES ARE CPSS1 CPSS2 CPSS3 CPSS4 CPSS5 CPSS6 CPSS7 CPSS9 CPSS10 CPSS11 CPSS12 CPSS13 CPSS14 CPSS15 CPSS16 CPSS17; ANALYSIS: TYPE IS GENERAL; ESTIMATOR IS WLSMV; model: f1 by CPSS1-CPSS5; f2 by CPSS6-CPSS12; f3 by CPSS13-CPSS17; f4 by f1-f3; OUTPUT: SAMPSTAT RESIDUAL STANDARDIZED; SAVEDATA: RESULTS IS second_orderCPSS;


A second-order factor with three indicators is just-identified. It has zero degrees of freedom. It does not contribute to model fit. This is why the fit statistics are the same.


Dear Dr. Muthen, Thank you very much for your quick answer. If my model is just-identified, I could fix some parameters to gain some degrees of freedom, right? Could I fix some coefficients at the estimates whose magnitude was revealed in the first-order analysis to make the model work? How can I do that? If not, what options do I have? Sorry for asking you these basic questions. Thanks in advance


No, it is incorrect to fix parameters to gain degrees of freedom. Adding the second-order factor does not impose any further restrictions. You should just accept that and report the estimates.


Dear Dr. Muthen: So, if adding the second-order factor does not impose any further restriction on the model, does this mean that both models are equivalent (the second-order model has the same fit as the first-order model), or does it just mean that the second-order analysis could not be calculated? Could the addition of a second-order factor in an overidentified model make the fit of a model worse? Thanks!


It means that the models are equivalent. There is nothing wrong with your second-order model; the second-order part is just not testable. With more than 3 indicators (3 first-order factors), a second-order model is overidentified and can make model fit worse.


Dear Dr. Muthen, I am unsure about how to interpret the R-square for a second-order factor. I have 3 first-order factors which all correlate quite highly (>.7) on a fourth higher-order factor. The fit of this model is good and almost identical to the fit of a model with just 3 factors and no higher-order factor. However, for theoretical reasons, I have chosen the model with the second-order factor. The R-square for the three first-order factors ranges from .4 to .6 but the R-square for the second-order factor is 0.008. Would I interpret this as indicating that the second-order factor is not explaining much variance over and above that explained by the 3 first-order factors? Does this argue against including the second-order factor in my model? Many thanks, Louise


I don't understand what you mean when you say that the R-square for the second-order factor is 0.008. There should be one R-square for each dependent variable. Also, note that the fit is exactly the same with or without a second-order factor when there are only 3 first-order factors.


Thanks for getting back to me so quickly. My apologies, I wasn't clear. I am running a second-order factor model with covariates. After running an initial model and checking modindices I included specific paths in the model as follows: f1 by GA1 GA2 GA3 GA4 GA5 GA6 GA7 GA8 GA9; f2 by GD1 GD2 GD3 GD4 GD5 GD6 GD7 GD8 GD10 GD11 GD12 GD13 GD14 GD15; f3 by NEU1 NEU2 NEU3 NEU4 NEU5 NEU6 NEU7 NEU8 NEU9 NEU10 NEU11 NEU12; f4 by f1-f3; F4 ON AGE SEX; f1 on age; f2 on sex age; f3 on sex; I guess I'm confused about how you interpret the R-square for the second-order factor here? Thanks, Louise


The R-square you get for the second-order factor is the variance explained in the second-order factor by the covariates.


Dear Bengt & Linda, I have a second order factor model: f12 BY f1 f2; f13 BY f3 f4 f5 f6; f14 by f7 f8 f9 f10 f11; The three factors f12, f13 and f14 are correlated. I have problems with the factor loading for f1: even if I fix the variance of f12 (f12@1) and allow the factor loadings to be freely estimated (f12 BY f1* f2), Mplus seems to fix the factor loading: in the output for STDYX, the factor loading is 1. Moreover, in Tech9 I get the information that there is a problem with this factor loading for all datasets (I use 10 imputed datasets) due to a nonpositive definite firstorder derivative product matrix. Am I specifying the model in a wrong way? Thanks for your help! 


Please send the output and your license number to support@statmodel.com. 

H Steen posted on Thursday, May 02, 2013  1:51 am



Dear Dr. Muthen, I am doing a CFA with a model that is partly first-order and partly second-order (it is done with WLSMV as all the data are categorical): f1 by x1 x2; f2 by x3 x4; f3 by f1 f2 x5 x6 x7 x8; It shows an RMSEA of 0.044 and CFI and TLI above 0.95, so it seems to perform fine. I have two questions: 1) Is it indeed OK to do a CFA that is partly second-order? 2) The output gives me loadings on f3 for the latents f1 and f2 separately from x5 to x8, meaning both f1 and x5 have loadings on f3 fixed at 1. How should I interpret this? Do all items and factors contribute in the same way to f3?


1. Yes. 2. You should free one of the fixed loadings. All factor indicators of f3 are treated in the same way. 
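
One way to follow that advice, sketched with the names from the question. It assumes, as the question reports, that the loadings of both f1 and x5 on f3 were fixed at 1; the asterisk frees the x5 loading so that f1 alone sets the metric of f3.

```mplus
MODEL:
  f1 BY x1 x2;
  f2 BY x3 x4;
  f3 BY f1 f2 x5* x6 x7 x8;   ! x5 loading freed; f1 loading stays fixed at 1
```

Freeing the f1 loading and keeping x5 fixed would be the mirror-image choice; only one of the two needs to stay fixed to scale f3.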

H Steen posted on Wednesday, May 15, 2013  4:37 am



Thank you very much! Just to be sure, the output in Mplus is: F3 by x5 x6 x7 x8 F3 by F1 F2, all followed by loadings. Can the standardized loadings of, say, f2 and x8 be compared with each other? Thank you so much in advance.


This is the same as if the results were presented as: F3 by x5 x6 x7 x8 F1 F2. The observed and latent factor indicators are simply presented separately. They are estimated together.

H Steen posted on Wednesday, May 15, 2013  7:41 am



Thank you very much! 

H Steen posted on Wednesday, November 13, 2013  8:53 am



I have a follow-up question about the distribution of the resulting factor scores of this partly second-order WLSMV CFA. The distribution is skewed, but this is understandable as the items are generally easy. The distribution has a range of 1,3 to 0.71, mean 0,02. What I find strange is that between 0,55 and 0,7 there are no scores, and then a peak again at 0,71 of 5,3% of the respondents. This seems to be a ceiling effect, but I wonder why there is this gap. Have you come across such a finding before, and do you know what can cause this? The model is working fine in other respects.


This can happen depending on the location of the item thresholds and the subjects' responses. 

H Steen posted on Thursday, November 14, 2013  3:08 am



Thank you for your prompt answer. Could you recommend literature on this subject? 


I don't know of any. This is based on experience. You might want to contact Steve Reise at UCLA. 


Hello, We are attempting to fit a second-order CFA that we will ultimately examine in a parallel process growth model. We are wondering whether the intercepts of the marker variables should be set to 0, whether the metric of the first-order factors should be set to 1.0 (is this automatic in Mplus?), and how we should set the metric at the second-order level. Also, should the intercepts of the first-order factors be set to zero, and should the factor loadings of the second-order factor (on the five primary factors) be constrained to be equal across time if the first-order factor loadings are constrained across time? Any advice would be greatly appreciated. Thank you, Loryana 


See Example 5.6. The default in Mplus is to fix the first factor loading to one to set the metric of the factor. You can free that loading and fix the factor variance to one instead if you wish, for example, f BY y1* y2 y3; f@1; 


Thank you. We're having trouble running a parallel process growth model reflecting growth in a 2nd-order factor and growth in a 1st-order factor. (LV cov. matrix not positive definite). Any advice? !Construct 1 (C1), 1st-order, T1(a) F1a BY x1a x2a-x3a (1-2); F2a BY x4a x5a-x6a (3-4); F3a BY x7a x8a-x9a (5-6); F4a BY x10a x11a-x12a (7-8); F5a BY x13a x14a-x15a (9-10); ... !C1, T4(d) F1d BY x1d x2d-x3d (1-2); ... F5d ... !C1, 2nd-order, T1-4 F6a BY F1a F2a (11) F3a (12) F4a (13) F5a (14); ... F5d (14); !C2, 1st-order, T1-4 F7a BY x16a x17a-x18a (15-16); ... F7d BY x16d x17d-x18d (15-16); !Growth, 2nd- and 1st-order factors i1 s1 | F6a@0 F6b@1 F6c@2 F6d@3; i2 s2 | F7a@0 F7b@1 F7c@2 F7d@3; [i1@0 i2@0]; [F1a-F5d@0]; [x1a-x1d] (18); ... [x18a-x18d] (35); 


Please send the relevant outputs and your license number to support@statmodel.com. 


Is a parameter, like correlated slopes, that is out of bounds (greater than 1) an indication of model identification problems or the model not being able to find a single estimable solution? Might this be caused by collinearity between the intercept terms? Thank you. 


When slopes correlate greater than one, this is not an indication of model identification problems or that the model cannot find a single solution. It means that the model is inadmissible and needs to be changed in some way. 
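
As a small illustration of what inadmissible means here (this is not Mplus output): for two slope factors, an implied correlation above one goes together with a negative determinant of their 2 x 2 covariance matrix, so the matrix is not positive definite. The variances and covariance below are made-up numbers, and `implied_correlation` and `is_positive_definite_2x2` are hypothetical helper names.

```python
import math


def implied_correlation(var1, var2, cov):
    """Correlation implied by two variances and their covariance."""
    return cov / math.sqrt(var1 * var2)


def is_positive_definite_2x2(var1, var2, cov):
    """A symmetric 2 x 2 matrix is positive definite when both
    variances are positive and the determinant is positive."""
    return var1 > 0 and var2 > 0 and var1 * var2 - cov ** 2 > 0


# Made-up estimates for two slope factors (illustration only).
var_s1, var_s2, cov_s1_s2 = 0.04, 0.09, 0.066

r = implied_correlation(var_s1, var_s2, cov_s1_s2)
admissible = is_positive_definite_2x2(var_s1, var_s2, cov_s1_s2)
print(f"correlation = {r:.2f}, positive definite = {admissible}")
```

The estimates can all look individually reasonable (both variances are positive) while the solution as a whole is still inadmissible.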

Sarah posted on Friday, March 21, 2014  4:06 am



Hello, I am attempting to construct a measurement model of child wellbeing. I have constructed a four factor model with each factor representing a domain of child wellbeing. This model works well. However when I attempt to add in a second order factor of 'child wellbeing' I run into problems. I receive warnings stating that the latent variable covariance matrix is not positive definite. There are no negative variances or residual variances, and no correlations greater than or equal to one between latent variables. However, the residual variance (unstandardized) for one of the first order factors is huge at 28552.928. The warning states that the problem is with the second order variable of child wellbeing and many of the estimates involving this factor are not calculated. I also receive an error that the model may not be identified. However, my four factor first order model was identified and my second order factor includes four first order factors so I thought it was identified? Do these difficulties probably mean that a second order factor is simply inappropriate for the model? Many thanks for your help. 


Please send the output and your license number to support@statmodel.com. 


Hi Dr. Muthen, I would like to compare a second-order factor model (one second-order factor and two first-order factors) and a first-order factor model (two first-order factors). I know that a second-order factor must have at least three first-order factor indicators to be identified. However, my data do not allow me to do so. I am wondering whether there are any other ways I could compare them? 


The estimates of a nonidentified model are not meaningful. 


Hello, Why is Mplus not recognizing the lower-order factors in the following higher-order mixture model, where I have known classes? I get the message: *** ERROR in MODEL command Unknown variable(s) in a BY statement: SUBST1 My code is below. ANALYSIS: TYPE = MIXTURE; MITERATIONS = 1000; PROCESSORS = 4; !8 INTEGRATION=MONTECARLO (500); MCONVERGENCE = 0.015; MODEL: %OVERALL% SUBST1 BY alc30d1 !(1) mar30d1 (2) hdr30d1 (3); SUBST3 BY alc30d3 !(1) mar30d3 (2) hdr30d3 (3); SUBST4 BY alc30d4 !(1) mar30d4 (2) hdr30d4 (3); !* higher-order Intercept and Slope factors * INT_HORD BY SUBST1@1 SUBST3@1 SUBST4@1; SLP_HORD BY SUBST1@1.5 SUBST3@0 SUBST4@1.5; SUBST1 WITH SUBST3@0 SUBST4@0; SUBST3 WITH SUBST4@0; INT_HORD with SLP_HORD; INT_HORD; SLP_HORD; %VIOL#2% INT_HORD with SLP_HORD; INT_HORD; SLP_HORD; 


Please send the output, the data, and your license number to support@statmodel.com. 


Dear Dr. Muthen, Is a model composed of two second-order latent factors, each measured by two first-order latent factors, identifiable? Running it in Mplus did not produce any improper results. !! examples variable: names are a1-a4 b1-b4 c1-c4 d1-d5; analysis: estimator = mlr; model: a by a1-a4; b by b1-b4; c by c1-c4; d by d1-d5; fab by a b; fcd by c d; If yes, this should also apply to a multilevel CFA model in which both the within and between models have the same second-order measurement structure as in the above example, right? Thank you very much! John 


A second-order factor with two indicators is not identified. When a model has two second-order factors with two indicators each, the model is identified because information from other parts of the model is used for identification. This is not an ideal situation. 
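
One way to see where that extra information can come from, sketched under assumptions that are not stated in the thread: the loadings of a on fab and of c on fcd are fixed at 1 (the Mplus default for the first indicator), fab and fcd covary, and the residuals of the first-order factors are uncorrelated. Under those assumptions the covariances between the two pairs of first-order factors determine the free second-order loadings. The function names and parameter values are hypothetical.

```python
def cross_covariances(lam_b, lam_d, psi):
    """Model-implied covariances between the first-order factors of
    fab (a, b) and fcd (c, d). Loadings of a and c are fixed at 1;
    psi is the covariance between fab and fcd."""
    return {
        ("a", "c"): psi,
        ("a", "d"): lam_d * psi,
        ("b", "c"): lam_b * psi,
        ("b", "d"): lam_b * lam_d * psi,
    }


def recover_parameters(cov):
    """Solve for (lam_b, lam_d, psi); needs a nonzero cov(a, c)."""
    psi = cov[("a", "c")]
    if psi == 0:
        raise ValueError("fab and fcd do not covary: loadings not recoverable")
    return cov[("b", "c")] / psi, cov[("a", "d")] / psi, psi


# Made-up parameter values (illustration only).
cov = cross_covariances(lam_b=0.5, lam_d=2.0, psi=0.25)
lam_b, lam_d, psi = recover_parameters(cov)
print(lam_b, lam_d, psi)
```

In this sketch the recovery divides by the covariance between fab and fcd, so it breaks down when the two second-order factors are uncorrelated, which is consistent with the situation being less than ideal.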

Guillermo posted on Thursday, November 27, 2014  7:11 am



Dear Dr. Muthen, I have a question similar to the one that opens this thread and to that posted by H Steen. I have a second-order CFA in which the second-order latent variable is derived from 4 first-order latent variables, two of them measured with only one manifest variable. f1 by x1-x5; f2 by x6-x10; f3 by x11@1; x11@0; f4 by x12@1; x12@0; f5 by f1-f4; When I try to fit this model, the output displays the message: "NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED". I have also tried this other way: f1 by x1-x5; f2 by x6-x10; f5 by f1 f2 x11 x12; In this case the output displays the message: "The metric for the following factor cannot be determined because the factor has both observed and latent indicators". Am I doing something wrong? Thank you so much. 


Try f1 by x1-x5; f2 by x6-x10; f5 by f1@1 f2 x11 x12; 

Guillermo posted on Thursday, November 27, 2014  1:47 pm



It does not work. When I try it, the output displays the message: WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE F1. However, when I free the loading of f1 and fix the loading of the first manifest variable to 1, it works perfectly. f1 by x1-x5; f2 by x6-x10; f5 by f1* f2 x11@1 x12; Luckily, this is exactly the loading I needed to fix to 1, so my problem is completely solved. Nevertheless, I do not understand why the other options do not work. When I fix the loading of f2 to 1 and free that of f1, the output also displays the previous warning message, and when I fix x12 to 1 instead of x11, the output displays the message "NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED". Thank you so much for your help. 


I would need to see the outputs to explain what is happening. Send them and your license number to support@statmodel.com. 


Hi. I am running a second-order factor analysis. Can it be run with categorical data? 


Yes. 


When we added "Categorical" as shown here, we couldn't run the analysis. Is the syntax written correctly? TITLE: second-order satisfacción factor analysis DATA: FILE IS Satisfaccion.dat; VARIABLE: NAMES ARE y1-y10; CATEGORICAL ARE y1-y10; MODEL: f1 BY y1 y4 y5 y7 y10; f2 BY y2 y3 y6 y8 y9; f3 BY f1-f2; 


A second-order factor with two indicators is not identified. Is that the message you get? 


Hi, Mplus team, Could you please help me shed some light on the measurement invariance issue for second-order CFA models? Specifically, I wonder whether partial intercept/mean invariance applies to these models in the same way it would to first-order CFAs. I am using the approach outlined by Wang and Wang (2012; pp. 245-267), in which they propose testing the invariance of the first-order intercepts and second-order means in two stages. We did not observe strict invariance of the first-order intercepts because of a handful of noninvariant indicators (based on the modification indices), which I then set free across the groups. The resulting model, which I would call a partially invariant scalar model (or first-order intercept model), in which at least two item intercepts remained constrained across the groups, provided an adequate fit to the data. This partially invariant model was then used to test the invariance of the second-order factor mean and resulted in a good fit as well. I would then call it a partially invariant second-order mean model, unless a better term has already been coined. Does this approach sound right, and can it be applied to second-order CFA models? I can’t see why not, but I was not able to find any reference to partial measurement invariance in the context of second-order CFA models. Thank you, Dmitriy Wang, J., & Wang, X. (2012). Structural Equation Modeling. Wiley. 


If you are referring to the intercepts in the regressions of first-order factors on second-order factors, I think you are right. I would start with these being fixed at zero in all groups and then free those that need it. I haven't seen a name for it. 

Cheng posted on Saturday, April 09, 2016  3:41 pm



I am testing a second-order CFA. I have 2 second-order latent variables. Each of these latent variables has 5 first-order latent variables (the subscales), and each subscale has 2 items. In the Mplus output, I found that 2-3 of the subscales have factor loadings greater than 1 on their second-order latent variables. Does this indicate high correlation among the subscales? Is high multicollinearity among the subscales possible? I removed the subscales and had all the items load directly on the 2 latent variables. I found that this model is slightly better in terms of fit indices. If it is theoretically supported, should I collapse the subscales and keep only the 2 latent variables in the model? 


You may want to direct this general analysis question to SEMNET. 


Thank you for your response, Dr. Muthén. As a follow-up, may I please ask for your advice regarding the following modeling issue? I am investigating the longitudinal invariance of a second-order model (37 items, 7 first-order factors and one second-order factor). The same respondents were reinterviewed over three timepoints. I am not necessarily interested in the within-subject variation over time. My goal is to show that the instrument’s measurement properties remain the same across time. I looked at your example 9.27, which could be an answer to my problem, but for various reasons I would not want to use the Bayesian framework in this particular case. I am trying to come up with a suitable modeling strategy and wanted to discuss a few options: 1. Could I proceed with establishing longitudinal invariance similar to how one would model invariance across independent groups (i.e. without letting residuals correlate across time points)? Can you think of doing that? What are the caveats of not letting the residual errors correlate across time? The reason I ask is that I am concerned about switching to the wide format given the increase in the number of parameters and a relatively modest sample size. 2. I am concerned that growth curve or multilevel models may not be a viable option due to the relative complexity of the model. Am I wrong here? 3. Any suggestions or examples/code I could use for this analysis? 


1. The wide approach should work well here with only 3 timepoints. UG ex9.27 is intended for many timepoints (I think 100 here). You should use the same invariance modeling principles as with independent groups. 2. Growth modeling is not needed. Just have correlated factors that are allowed to have different means and variances across time. 3. See the invariance setup for invariance across time for multiple-indicator growth in the UG Chapter 17, pp. 687. 


Thank you so much for your help, Dr. Muthen. This is extremely helpful! 


Dear Drs Muthen, I wanted to ask you a question about a measurement model. My measurement model has 3 first order latent factors and 1 second order latent factor defined by its respective 4 first order factors. So the measurement model is: Model: D BY D1 D2 D3 D4; B BY B1 B2 B3 B4; N BY N1 N2 N3 N4; IMP BY Pimp1 Pimp2 Pimp3 Pimp4; CP BY Pcondpr1 Pcondpr2 Pcondpr3; MANIP BY Pmanip1 Pmanip2 Pmanip3; CAL BY Pcal1 Pcal2 Pcal3 Pcal14; P BY IMP CP MANIP CAL; Here, I basically allow all factors in the model to be correlated, but at the same time, the 4 first order factors (belonging to their second order factor) are also correlated with other first order factors in the model. Is this technically correct or should the 4 first order factors (belonging to the second order factor) only correlate with one another and their respective second order factor and NOT with any other first order factors within the measurement model? Thank you 


Because this is a general analysis strategy question, you may want to post this on SEMNET. 


Hi, From reading the thread above, I understand that it is not possible to test model fit for a second-order factor. However, how is it possible to determine whether a second-order factor model fits the data better than a first-order factor model? To explain: I put 9 observed variables into a CFA, which produced three factors. Model fit statistics show this to be a good fit to the data. I now want to add a second-order factor, so that I have just one 'superfactor'. How can I work out whether this model is 'better' than using just the three first-order factors? Thank you 


A second-order factor can impose a structure on the covariance matrix for the first-order factors. It won't fit better; the question is instead whether it fits significantly worse, or in BIC terms, whether the worsening is worth the reduction in the number of parameters. With 3 first-order factors and one second-order factor, you have as many parameters as in the model with 3 factors only, so there is no structure imposed on the factor covariance matrix. In other words, there is no change in model fit and no statistical reason to choose one or the other. 


Thanks Bengt. So, just to check: is it possible to determine whether the second order factor model fits significantly worse than the first order factor model? If so, how? Or, is this just not possible (given there is no change in model fit and no statistical reason to choose one or the other)? Sorry for these probably very simple questions (I am on a steep learning curve!) Thanks 


You need at least four first-order factors to be able to test the fit of the second-order model. 
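
The parameter counting behind these answers can be sketched as follows. This is a rough count, not an Mplus computation, and it assumes the Mplus default of fixing the first second-order loading at 1, a free second-order factor variance, and uncorrelated residuals for the first-order factors. `second_order_df` is a hypothetical helper name.

```python
def second_order_df(k):
    """Degrees of freedom gained by replacing the free covariance
    matrix of k first-order factors with one second-order factor."""
    # Variances and covariances of the k first-order factors.
    free_moments = k * (k + 1) // 2
    # (k - 1) free loadings, 1 second-order factor variance,
    # and k residual variances of the first-order factors.
    second_order_params = (k - 1) + 1 + k
    return free_moments - second_order_params


for k in (2, 3, 4, 5):
    print(k, second_order_df(k))
```

A negative value means the second-order part is not identified on its own, zero means it reproduces the first-order factor covariance matrix exactly so fit cannot change, and a positive value means the second-order structure imposes testable restrictions.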
