Confirmatory factor analysis

Melady Preece posted on Saturday, February 26, 2000 - 8:28 am

I need to do a confirmatory factor analysis on nested data (days within individuals). Can Mplus do this for me? Is it possible to do such a factor analysis with different numbers of data points from different individuals?

Bengt O. Muthen posted on Saturday, February 26, 2000 - 9:04 am

Yes, that can be done in several ways in Mplus if your outcomes can be viewed as continuous. First, you can do it as multilevel factor analysis, see

Muthén, B. (1994). Multilevel covariance structure analysis. In J. Hox & I. Kreft (eds.), Multilevel Modeling, a special issue of Sociological Methods & Research, 22, 376-398.

Second, you can treat the days within individuals as multivariate outcomes in a single-level model, just as is done in growth modeling.

The issue of different number of days within individuals is not a problem in either approach. In the first approach, this means different cluster sizes. In the second approach this means missing data.
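
To illustrate the second approach, here is a minimal Python sketch of how unequal numbers of days turn into missing data once each day becomes its own column. The records and the `long_to_wide` helper are hypothetical and use a single outcome for brevity; with several indicators per day there would be one column per day-by-indicator combination.

```python
# Hypothetical long-format diary records: (person id, day, score).
long_rows = [
    ("p1", 1, 3.0), ("p1", 2, 4.0), ("p1", 3, 2.5),
    ("p2", 1, 5.0), ("p2", 3, 4.5),   # p2 skipped day 2
    ("p3", 2, 1.0),                   # p3 reported only on day 2
]

def long_to_wide(rows, missing=None):
    """Return one record per person with one slot per day, padded with `missing`."""
    days = sorted({day for _, day, _ in rows})
    wide = {}
    for person, day, score in rows:
        wide.setdefault(person, {d: missing for d in days})[day] = score
    return days, {p: [vals[d] for d in days] for p, vals in wide.items()}

days, wide = long_to_wide(long_rows)

# In the multilevel view the same imbalance shows up as different cluster sizes.
cluster_sizes = {p: sum(v is not None for v in vals) for p, vals in wide.items()}
```

Here `wide["p2"]` has a hole for day 2, while `cluster_sizes` records 3, 2, and 1 observations for the three people.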

Anonymous posted on Wednesday, May 31, 2000 - 11:37 am

You mention on the website that with Mplus a MIMIC model can include indirect effects, the effect of a background variable on a factor indicator via the factor. Is this illustrated in the Mplus manual? If not, is it a straightforward procedure? Many thanks.

Linda K. Muthen posted on Thursday, June 01, 2000 - 7:46 am

There are examples of MIMIC models in Chapter 21 of the Mplus User's Guide. Indirect effect parameters are not automatically calculated in Mplus. You would have to multiply the factor loading by the regression coefficient to obtain the parameter estimate. The standard error of the estimate would have to be obtained using the Delta method.
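
The hand calculation described here can be sketched in Python as follows. This is only an illustration of the first-order Delta method for a product of two estimates; the function name and all numbers are hypothetical, and the covariance between the two estimates would have to be taken from the estimated covariance matrix of the parameter estimates.

```python
import math

def indirect_effect(loading, slope, var_loading, var_slope, cov=0.0):
    """Product of two estimates and its first-order Delta-method standard error."""
    estimate = loading * slope
    # Var(a*b) ~ b^2 Var(a) + a^2 Var(b) + 2ab Cov(a, b)
    variance = (slope ** 2) * var_loading + (loading ** 2) * var_slope \
        + 2.0 * loading * slope * cov
    return estimate, math.sqrt(variance)

# Hypothetical values: loading 0.8 (SE 0.10), slope 0.5 (SE 0.05), zero covariance.
est, se = indirect_effect(0.8, 0.5, 0.10 ** 2, 0.05 ** 2)
```

With these made-up inputs the indirect effect is 0.4 and its approximate standard error is the square root of 0.0041.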

melady posted on Sunday, August 13, 2000 - 9:01 am

I have done a factor analysis with disaggregated data using different cluster sizes. What I would like to do now is see whether the factors hold across clusters. I am thinking I could compare the goodness of fit for a confirmatory analysis using all the data without regard for clustering to the goodness of fit for a confirmatory factor analysis using the cluster as a part of the model. Is this reasonable?

Anonymous posted on Wednesday, February 21, 2001 - 7:59 am

I am doing CFA on a model with three latent variables with four or five parameters each. Nevertheless, my model is not identified. I have fixed the starting value of each parameter to 1 and correlated the three latent variables. There is prior research using this scale which suggests that at least two of the factors are highly correlated. Would fixing the correlation between factors help identify the model? Are there other reasonable options for overcoming the model identification problem? Thanks for your time.

Linda K. Muthen posted on Wednesday, February 21, 2001 - 8:43 am

If you have 12 to 15 factor indicators which I think is what you are saying, a three factor model should be identified. Fixing all of the starting values to one is not a solution. Have you perhaps freed the factor means? They must be fixed at zero (the Mplus default) for the model to be identified. Otherwise, I would need to see your output to fully understand your problem.
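
The counting behind this reply can be sketched in Python. This is a rough illustration, not Mplus output, and it assumes a simple-structure model without a mean structure: each indicator loads on one factor, the first loading of each factor is fixed at 1, and residuals are uncorrelated. The function name is hypothetical. Positive degrees of freedom are a necessary condition for identification, not a sufficient one.

```python
def cfa_degrees_of_freedom(n_indicators, n_factors, fixed_factor_covs=0):
    """Counting rule for a simple-structure CFA without a mean structure."""
    moments = n_indicators * (n_indicators + 1) // 2     # variances + covariances
    loadings = n_indicators - n_factors                  # one loading fixed per factor
    residual_variances = n_indicators
    factor_variances = n_factors
    factor_covs = n_factors * (n_factors - 1) // 2 - fixed_factor_covs
    free = loadings + residual_variances + factor_variances + factor_covs
    return moments - free

df_12 = cfa_degrees_of_freedom(12, 3)   # 78 moments - 27 free parameters
df_15 = cfa_degrees_of_freedom(15, 3)   # 120 moments - 33 free parameters
```

Under these assumptions both the 12-indicator and the 15-indicator model leave plenty of degrees of freedom, so fixing a factor correlation is not needed to pass the counting rule.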

melady posted on Tuesday, March 06, 2001 - 2:34 pm

I wish to conduct a factor analysis with data on specific individuals provided by multiple informants. How can Mplus help me to do this?
I have the program, but I don't fully understand what I'm doing with it!

Melady Preece

Linda K. Muthen posted on Wednesday, March 07, 2001 - 8:53 am

As I understand it, you have data where a trait is measured for individuals by several informants, for example, teacher, peers, and parents. You could do a factor analysis in the following way:

factor BY teacher peers parents;

Frank Lawrence posted on Friday, May 04, 2001 - 11:13 am

I would like to use Mplus to estimate a non-linear relationship among latent variables [interaction]. Joreskog and Yang (1996) demonstrated such a model can be estimated using SEM if an observed product variable is used as an indicator of the latent product variable. Bollen (1995) used a two-stage least squares with instrumental variables to estimate the interaction. How can I use Mplus to estimate an interaction model?

Linda K. Muthen posted on Thursday, May 10, 2001 - 10:07 am

Mplus cannot do what Joreskog and Yang demonstrated because it requires non-linear constraints. The Bollen approach can be done in Mplus but it is not directly implemented. It would have to be done in a series of steps.

Anonymous posted on Thursday, May 10, 2001 - 3:57 pm

I read in the Mplus 2.0 manual (page 346, bottom) that: "When x variables are present, the conditional normality assumption allows non-normality for y* as a function of non-normal x variables".

I also note that in his chapter "Some Uses of Structural Equation Modeling in Validity Studies: Extending IRT to External Variables" Bengt writes (pg 218, middle): "We will add the multivariate probit assumption that y*|x is multivariate normal. Note that this does not mean that we assume normality for the y*'s or for the [Eta], but normality is merely required for the [residual and error terms]. The distribution of [Eta] and the y*'s is actually to some extent generated by the x's".

I.e., in a CFA "in isolation" with ordered categorical indicators but no background (x) variables, one assumes that the latent y*'s are normal (but makes no assumptions about y* variances). However, introducing covariates relaxes or makes less crucial the assumption of the y*'s normality ?

How is this so ?

Does "non-normality of y*'s when x's are introduced" hold for all Mplus estimators (i.e., WLS as well as robust WLSMV, etc) ?

Linda K. Muthen posted on Friday, May 11, 2001 - 9:58 am

The reason that conditional normality can be assumed is that the sample statistics used for estimation are not correlations but rather probit regression coefficients. The following article explains this:

Muthén, B. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22, 48-65. (#9)

This conditional normality holds for WLS, WLSM, and WLSMV.

Anonymous posted on Monday, May 14, 2001 - 11:28 am

Responding to your May 11 reply above: Is this why I find my indicator item r-squares increase by 10-15 percentage points when external variables are introduced into the model ?

Linda K. Muthen posted on Tuesday, May 15, 2001 - 9:47 am

No. The reason your r-square values increase is that the variables that you are adding explain the variability of the indicators. The r-square values are not dependent on which estimation method you use but on the model.

Anonymous posted on Wednesday, June 13, 2001 - 2:48 pm

I have a question concerning the r-squares that are in my output. I have dichotomous dependent variables, so are the r-square values equivalent to adjusted r-squares? thanks.

Linda K. Muthen posted on Thursday, June 14, 2001 - 2:57 pm

The r-squares for categorical dependent variables are defined for the y* variables. See Appendix 1 of the Mplus User's Guide. It is not an adjusted r-square.

Anonymous posted on Monday, August 27, 2001 - 1:40 pm

Provided that one isn't performing nested Chi-Square tests of model fit, are there any circumstances when you would recommend one *not* use the robust (WLSMV) estimator ?

Why does Mplus include the WLSM estimator if the WLSMV estimator is available (and, I'm guessing, technically superior) ?

Linda K. Muthen posted on Tuesday, August 28, 2001 - 9:33 am

We actually recommend WLSMV in all situations based on current knowledge of WLSMV and the other alternatives.

We include WLS and WLSM so that they can be studied further. It may be in certain situations one of them will behave better.

Anonymous posted on Tuesday, August 28, 2001 - 12:11 pm

As a follow up to the above response, would you provide a reference / citation discussing nested Chi-Square tests of fit for the robust estimator.

Linda K. Muthen posted on Wednesday, August 29, 2001 - 8:38 am

I don't know of any for categorical outcomes.

David Klein posted on Friday, October 05, 2001 - 3:15 pm

I saved the factor scores from a CFA (using the "SAVE = F_SCORES" option) and read them into another software program. When I calculate simple statistics on the saved data, the means & variances of my latent variables do not match the means & variances shown in the M-plus "Tech4" output. Not even close, really.
Why not? My goal, if it matters for your answer, is to create some kind of residual that represents the unexplained variance of an indicator (that is, the variance not explained by the latent variable on to which the indicator was loaded)

Thanks!!!

Linda K. Muthen posted on Friday, October 05, 2001 - 3:48 pm

Be sure to include TYPE=MEANSTRUCTURE if you are using Version 2.0. Means were not turned on automatically in that version. Subsequent updates have corrected this problem. If this does not help, please send data and output and I will take a look at it.

Anonymous posted on Wednesday, November 14, 2001 - 1:21 pm

I'm trying to save factor scores in CFA. The output file has all the variables on the usevariables list and the factor scores appended at the end. I'd like to merge the factor scores with a larger data set, but am not sure how to save an identifying variable with the factor scores. When I include an identifier on the usevariables list, the model doesn't converge.

Linda K. Muthen posted on Wednesday, November 14, 2001 - 2:07 pm

You cannot do this in Version 1. In Version 2, there is an IDVARIABLE = statement in the VARIABLE command.

Jef Kahn posted on Friday, November 16, 2001 - 9:29 am

I am estimating a CFA with 4 factors and 49 items using ML. The fit of the model is not very good. However, when looking at the modification indices I notice that dozens and dozens of the fixed factor loadings have MI values of 999 with a StdYX S.P.C. of 0. What does this mean?

Linda K. Muthen posted on Friday, November 16, 2001 - 10:13 am

We print 999 when the denominator in the formula for modification indices becomes zero or very close to zero. You can find the formula on page 373 of the following article:

Sorbom, D. (1989). Model modification. Psychometrika, 54, 371-384.

Jef Kahn posted on Friday, November 16, 2001 - 10:59 am

Following up from the last post, does a M.I. value of 999 indicate that the M.I. value would likely be very large or very small? Or is that simply indeterminable in this case?

bmuthen posted on Friday, November 16, 2001 - 4:04 pm

Simply indeterminable.

Hervé CACI posted on Saturday, November 24, 2001 - 2:14 am

I'm conducting a set of CFAs. How can I check my datasets of 0-3 Likert-type variables for outliers ? Can I use the usual methods (i.e. those for continuous variables...) ?

Linda K. Muthen posted on Tuesday, November 27, 2001 - 7:34 am

I would suggest using the usual methods for continuous variables.

Anonymous posted on Thursday, December 06, 2001 - 5:53 am

1: I have run a multiple group confirmatory factor analysis on categorical data (3 factors, 3 groups) using meanstructure and save fscores in vers. 2. Can you explain in words how the scores are estimated? (I have trouble understanding the formulas in the manual pp 385-6.)
2: Is it reasonable to use multiple group analysis (default settings) to compute scores for the same individuals on 3 points in time (= 3'groups') to study changes in scores?

bmuthen posted on Thursday, December 06, 2001 - 5:51 pm

A factor score is an estimate of an individual's most likely value on the factor given both the estimated model and the individual's observed variable values. For example, in measuring an ability dimension with multiple-choice items, an individual who got many items right gets a high estimated score, but the estimated score is also affected by group membership so that the estimated score is lower for a group that has a lower estimated factor mean. This is in line with Bayesian estimation where the estimated model is the "prior" to which information on the individual's data is added to get a "posterior". The estimated score is the maximum of the posterior distribution.

Multiple-group analysis is concerned with independent data from the different groups. Longitudinal data do not give independent data from the different time points. You can, however, use (longitudinal) factor analysis with across-time measurement invariance to estimate and compare scores over time. To study change in score over time, however, it would seem advantageous to use growth modeling.

Anonymous posted on Tuesday, December 11, 2001 - 11:59 pm

I am trying CFA on ordinal data using the WLS estimator. I have constructed a second-order factor model, consisting of five first-order factors and twelve observed variables. Of these five first-order factors, two had only one observed variable.
For model identification, we fixed the loadings from these first-order factors to their observed variables at 1. In addition, we fixed the error variances of these observed variables at 0. However, this model could not be identified. Can I identify this type of model in M-plus?

Linda K. Muthen posted on Wednesday, December 12, 2001 - 7:39 am

If the factors have only one indicator, don't create a factor. With categorical outcomes, the residual variances are not parameters in the model, so this changes things from the continuous case. Just use the indicator as an indicator of the second-order factor. Let Mplus do what is necessary. If you ask for TECH1, you will be able to see which parameter is causing the identification problem. It is most likely the residual covariances among the first-order factors which need to be fixed to zero for identifiability. If you cannot solve your problem this way, send me the output including TECH1 and the data and I will take a look at it.

JND posted on Tuesday, December 18, 2001 - 9:55 am

I hate to express my ignorance publicly on this . . . .

I ran the CFA with three latent variables and 28 observed variables (per theory). Only five of the estimates were less than 1.0. (None were negative.)

How do I attack this?

bmuthen posted on Tuesday, December 18, 2001 - 10:29 am

The unstandardized loading estimates do not have to be less than one; this depends on the scale of the y. The Stdyx standardized values can still be less than one. If you fix to one the loading that has the largest unstandardized estimate in your run, you will most likely get unstandardized loadings less than one.
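
A small Python sketch of this point, using made-up estimates for a single factor (the function names and numbers are hypothetical). Moving the fixed loading to the indicator with the largest estimate rescales the factor, so the unstandardized loadings drop to one or below while the standardized values stay the same.

```python
def rescale_to_largest(loadings, factor_variance):
    """Re-express a one-factor solution with the largest loading set to 1."""
    scale = max(loadings)
    new_loadings = [l / scale for l in loadings]
    new_variance = factor_variance * scale ** 2   # keeps loading^2 * variance unchanged
    return new_loadings, new_variance

def standardized_loading(loading, factor_variance, residual_variance):
    """StdYX-type value: loading * SD(factor) / SD(indicator)."""
    y_variance = loading ** 2 * factor_variance + residual_variance
    return loading * (factor_variance / y_variance) ** 0.5

# Hypothetical estimates: first loading fixed at 1, the others estimated above 1.
loadings, psi, theta = [1.0, 1.8, 2.5], 0.2, [0.8, 0.5, 0.6]
new_loadings, new_psi = rescale_to_largest(loadings, psi)
std_before = [standardized_loading(l, psi, t) for l, t in zip(loadings, theta)]
std_after = [standardized_loading(l, new_psi, t) for l, t in zip(new_loadings, theta)]
```

In this example the rescaled loadings are 0.4, 0.72, and 1, and `std_before` and `std_after` agree, all below one.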

Anonymous posted on Monday, March 18, 2002 - 10:11 am

Is there a way to check the multivariate normality assumption in ML estimation in Mplus?

Anonymous posted on Monday, March 18, 2002 - 12:20 pm

I am estimating a three-factor model with ML, MLM, and WLS. My 18 observed variables are Likert-type and multivariate non-normal (Mardia's coefficient normalized estimate = 160). As expected, the MLM chi-square estimate of 2272.9 (df=125) is lower than the ML estimate of 2750.966; the other fit indices behave similarly, since they are corrected for non-normality (scaling correction factor = 1.21).
But when I ran the same model with the WLS estimation method, the results were considerably different.
Chi-Square Value 1326.607 (df=125)
Both CFI and TLI are considerably lower than their previous estimates (.94; .92):
CFI: 0.718
TLI: 0.655
although RMSEA is lower with WLS than with MLM (.061):
WLS RMSEA: 0.046
Which results should I trust in this case? Is there a better method than the ones I used? Thank you very much for your help.
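
As a rough check on the numbers quoted in this post, the mean-adjusted chi-square is, under the usual Satorra-Bentler-type scaling, the ML chi-square divided by the scaling correction factor. The Python sketch below uses a hypothetical function name and the rounded factor of 1.21, so it only approximately reproduces the reported MLM value.

```python
def scaled_chi_square(ml_chi_square, scaling_correction):
    """Scaled statistic: ML chi-square divided by the scaling correction factor."""
    return ml_chi_square / scaling_correction

# Values quoted in the post; 1.21 is the rounded correction factor.
approx_mlm = scaled_chi_square(2750.966, 1.21)
```

The result is within one unit of the reported 2272.9; the small gap is consistent with rounding of the correction factor.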

Linda K. Muthen posted on Monday, March 18, 2002 - 12:44 pm

There is no way to do this in the current version. We are currently exploring this topic.

Linda K. Muthen posted on Monday, March 18, 2002 - 2:14 pm

Regarding the three-factor model, can you send the three outputs and data if possible to support@statmodel.com? I will take a look at them.

Linda K. Muthen posted on Tuesday, March 19, 2002 - 9:24 am

Thank you for sending the outputs. Getting such different results is unusual. There are a couple of things that may be going on.

1. WLS is not suited for many observed variables like the 18 you have. See Muthen and Kaplan, 1982.

2. You may have hit a local solution with ML and MLM. You might want to try the WLS estimates as starting values and rerun these analyses. It may be that a local solution is also the problem in the baseline model. Note that for ML and MLM, the chi-square for the baseline models is about ten times as large as for the target models, while for WLS, the chi-square for the baseline model is about four times as large as for the target model.

3. You might consider doing a simulation study to compare the three estimators for a problem similar to yours and see which estimator behaves best in practice. This is the only way to really know which outcome to trust.
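
The remark in point 2 about the baseline model matters because CFI compares the target model's misfit with the baseline model's misfit. The Python sketch below uses the standard CFI formula. The baseline chi-squares in it are NOT taken from the outputs: they are rough stand-ins built from the "about ten times" and "about four times" ratios, so the results only show the direction of the effect and do not reproduce the reported CFI values.

```python
def cfi(chi_target, df_target, chi_baseline, df_baseline):
    """Comparative fit index from target and baseline chi-square statistics."""
    d_target = max(chi_target - df_target, 0.0)
    d_baseline = max(chi_baseline - df_baseline, d_target)
    return 1.0 if d_baseline == 0 else 1.0 - d_target / d_baseline

df_target = 125
df_baseline = 18 * 17 // 2          # 153 covariances fixed at zero for 18 variables

# Hypothetical baseline values built from the approximate ratios in the reply.
cfi_mlm_like = cfi(2272.9, df_target, 10 * 2272.9, df_baseline)
cfi_wls_like = cfi(1326.607, df_target, 4 * 1326.607, df_baseline)
```

A baseline chi-square that is only about four times the target chi-square leaves much less room for improvement, which pulls CFI down even though the WLS target chi-square itself is smaller.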

Anonymous posted on Tuesday, March 19, 2002 - 11:43 am

Thank you very much for your answers to my post dated March 18. If I may, I would like to follow up on your answers.
I tried your second suggestion and ML or MLM estimates and fit indices did not change. But maybe the problem lies in the baseline model as you suggested. Is there a way to fix the local solution problem for the baseline model for ML or MLM? Is the local solution problem the case when the iterations converge when they hit a plateau even though it is not the best one?

With regard to your first answer, I tried to look up the reference but I wasn't able to locate the exact one. I was able to locate the following two references but their years are different. I am guessing you meant one of these:
1. Muthén, B., & Kaplan D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.
2. Muthén, B., & Kaplan D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical and Statistical Psychology, 45, 19-30.

I don't have a lot of knowledge on doing simulations. Are there any sources that you can recommend me to read on this topic?

thank you very much for your help.

Linda K. Muthen posted on Tuesday, March 19, 2002 - 3:45 pm

You could run the baseline model yourself. It consists of means and variances only, no covariances. I am not sure that a local solution is the problem, however. And yes, your understanding of what a local solution is is correct.
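
For continuous outcomes under ML, the baseline chi-square can also be sketched by hand in Python. When the fitted covariance matrix is just the diagonal of the sample covariance matrix S, the ML fit function reduces to -log|R|, where R is the sample correlation matrix. The function names are hypothetical, and the multiplier n_obs - 1 is an assumption (some programs use n_obs instead).

```python
import math

def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))

def baseline_chi_square(corr, n_obs):
    """ML chi-square and df for the variances-only (no covariances) model."""
    p = len(corr)
    chi_square = -(n_obs - 1) * log_det(corr)   # assumed multiplier: n_obs - 1
    return chi_square, p * (p - 1) // 2

# Hypothetical correlation matrix: three variables, all correlations 0.5.
R = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
chi_square, df = baseline_chi_square(R, n_obs=101)
```

For this made-up matrix the determinant is 0.5, so the chi-square is 100 times log(2) with 3 degrees of freedom.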

I apologize about the reference. It is the 1992 paper.

Our website will have a new feature in the next couple of days. It will be called Mplus Web Notes and will show how to use simulations to answer research questions. This may help you.

Anonymous posted on Monday, June 03, 2002 - 11:37 am

I have conducted a confirmatory factor analysis using dichotomous indicators on a very large sample (n=8008). I would like to use the results of this model to approximate factor scores for individuals not in the analysis dataset. I understand that strictly proper factor scores cannot be estimated through a factor score coefficient matrix for models with categorical indicators - they must be iteratively obtained. I am not a mathematician and could not create programming language to do this operation for individual cases as they come up. Is there a way I can fudge a "good enough" estimate of factor scores using the Mplus output for this particular application?

bmuthen posted on Tuesday, June 04, 2002 - 6:35 am

It should be possible to use your estimated model from the n=8008 as fixed parameters in a new analysis where you enter the new individuals and get factor scores the correct way. In this new analysis you don't estimate any parameters but only estimate factor scores.

Anonymous posted on Tuesday, June 04, 2002 - 7:06 am

Is there any way to use the results from my estimated model to construct factor scores on a case-by-case basis? I need a way to convert responses to factor scores as an assessment tool for nonscientists to use with individuals.

Anonymous posted on Tuesday, June 04, 2002 - 7:08 am

BTW, the users of the new instrument will not necessarily have access to Mplus, so I wanted to come up with a solution that can be used outside the program.

bmuthen posted on Tuesday, June 04, 2002 - 9:30 am

You can do the factor score estimation on a case-by-case basis by the approach I described; so using an n=1 analysis with fixed parameters. Regarding your last question, the factor score estimation with categorical outcomes is an iterative process, so there is no explicit formula such as a factor score coefficient matrix that can be used to easily obtain factor scores. The only approximation would be to ignore that the outcomes are categorical and treat them as continuous, but that would seem to forfeit the purpose of your analysis.

Rich Jones posted on Wednesday, June 05, 2002 - 9:26 am

Couldn't Anonymous generate a data set where each record was one of all 2^p combinations of the items in the instrument (where p is the number of items), and then estimate factor scores for each record with fixed parameters as in the n=1 case. The result would be a table of response patterns and factor scores conditional on the parameter estimates returned from the n=8,008 model. Anonymous could then distribute this table to other investigators interested in using the new instrument or prepare a simple program to associate response patterns with the appropriate factor score.
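
A minimal Python sketch of such a lookup table for a one-factor model with dichotomous items follows. It is an illustration under stated assumptions, not the Mplus algorithm: the factor is standard normal, each y* has unit variance so the residual variance is 1 - loading^2, the score is the maximum of the posterior, and the loadings and thresholds are made-up stand-ins for the fixed estimates.

```python
import itertools
import math

def log_phi(z):
    """Log of the standard normal CDF; erfc keeps the tails accurate."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))

def log_posterior(eta, pattern, loadings, thresholds):
    """Log of the N(0,1) prior times the probit likelihood of one pattern."""
    total = -0.5 * eta * eta
    for u, lam, tau in zip(pattern, loadings, thresholds):
        z = (lam * eta - tau) / math.sqrt(1.0 - lam * lam)
        total += log_phi(z) if u == 1 else log_phi(-z)
    return total

def map_score(pattern, loadings, thresholds, lo=-6.0, hi=6.0, iters=100):
    """Maximise the log-concave posterior by ternary search on [lo, hi]."""
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if log_posterior(a, pattern, loadings, thresholds) < \
                log_posterior(b, pattern, loadings, thresholds):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

# Hypothetical fixed parameters for p = 4 dichotomous items.
loadings = [0.7, 0.6, 0.8, 0.5]
thresholds = [0.0, 0.3, -0.2, 0.5]
table = {pat: map_score(pat, loadings, thresholds)
         for pat in itertools.product((0, 1), repeat=len(loadings))}
```

The dictionary `table` then maps each of the 2^4 = 16 response patterns to a score. The table doubles in size with every added item, so this is only practical for short instruments.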

Linda K. Muthen posted on Thursday, June 06, 2002 - 6:52 am

Yes, this could be done. It's a good idea. Thanks for the suggestion.

Anonymous posted on Wednesday, June 12, 2002 - 6:48 am

Thanks, Rich! Nice problem solving! I'll try that approach.

Anonymous posted on Thursday, June 27, 2002 - 1:40 pm

I'm doing a CFA with ordered, categorical indicators. The CFA model by itself fits quite well, but when I go to include it in a fuller SEM, some of the scale factors become very small and insignificant. If the scale factors are related to error, a scale factor of zero doesn't seem to make sense. Should insignificant scale factors be cause for alarm ?

Anonymous posted on Thursday, June 27, 2002 - 1:49 pm

Hi

I have a latent variable called participation measured by 6 items asking individuals if they have participated in 6 different activities within the last two years. Activities are not necessarily correlated with each other. These six items are considered to be "causal" indicators (formative model, or spurious model) rather than "effect" indicators (reflective model) for the dependent variable (Bollen & Lennox, 1991; Edwards & Bagozzi, 2000). This implies I need to have an unobserved latent variable (Blalock, 1971) in my model. Can Mplus handle models with such unobserved latent variables?

Thank you very much.
Blalock (1971). Causal models involving unobserved variables in stimulus-response situations. In H. M. Blalock (ed.) Causal models in the social sciences (pp.335-347). Chicago: Aldine.
Bollen & Lennox (1991). Conventional Wisdom on measurement: A structural equation perspective. Psych Bull, 110 (2), 305-314.
Edwards & Bagozzi (2000). On the nature and direction of relationships between constructs and measures. Psych Methods, 5 (2), 155-174.

bmuthen posted on Thursday, June 27, 2002 - 3:43 pm

Scale factors close to zero correspond to latent response variable (y*) variances that are very large. This can certainly be a sign of model misspecification.
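
To make the arithmetic concrete: in the Delta parameterization the scale factor is the inverse of the standard deviation of y*, so the implied variance is one over the squared scale factor. A tiny Python sketch with hypothetical scale factor values:

```python
def ystar_variance(scale_factor):
    """Variance of y* implied by a scale factor defined as 1 / SD(y*)."""
    return 1.0 / scale_factor ** 2

variances = {delta: ystar_variance(delta) for delta in (1.0, 0.5, 0.1)}
```

A scale factor of 0.1 already implies a y* variance of about 100.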

bmuthen posted on Thursday, June 27, 2002 - 3:49 pm

I think by "causal" indicators you are referring to a situation where indicators are influencing rather than being influenced by a latent variable. Yes, Mplus can handle this situation. Although you don't have a factor indicator in the usual sense, you can always say

factor BY anyvble@0;

factor ON x1-x5;

where x1-x5 are your "causal" indicators and anyvble is any observed dependent variable in the model. Don't forget to include the identifying restrictions that this type of model requires. If I remember correctly off hand, this involves one of the slopes on x fixed at 1 and fixing the factor residual variance at zero, but you'd better check this.

Anonymous posted on Wednesday, September 11, 2002 - 7:15 am

You write that a CFA requires at least m*m restrictions (m is the number of factors) to be identifiable. Is it possible to give a more precise definition of when a CFA is identifiable or when it is not? Is there any general rules for when a CFA is identifiable / non-identifiable?

bmuthen posted on Wednesday, September 11, 2002 - 8:50 am

This is a big topic. A good treatment is given in the following reference from the reference list on the Mplus web site:

Joreskog, K.G. (1979). Author's addendum. In Advances in Factor Analysis and Structural Equation Models, J. Magidson (Ed.). Cambridge, Massachusetts: Abt Books, pp. 40-43.

Apart from that, rules of thumb are generally helpful. For instance, having 3 indicators per factor. Or, 2 indicators per factor if there is more than one factor.

Anonymous posted on Friday, October 11, 2002 - 12:16 am

I am estimating a CFA model with about 10 3-level indicators with 2 latent variables on about 1000 subjects, with 5 indicators per latent variable. I fix the latent variables to be uncorrelated.
However, when I compute factor scores, I get a decidedly non-zero correlation between the factor scores (about 0.5), and the scores do not have mean zero and variance 1. I understand that the factor score empirical distribution is necessarily discrete according to the pattern of indicator values, but shouldn't the scores be uncorrelated? If not, what is the meaning of correlated factor scores from a model with uncorrelated factors?

bmuthen posted on Sunday, October 13, 2002 - 10:44 am

The distribution of estimated factor scores does not have the same means, variances, and covariances as the factors themselves. This is shown for example in the Lawley-Maxwell factor analysis book (look under the "regression method"); for ref. see the Mplus web site. With many good indicators (high loadings), however, the estimated factor scores tend to behave more and more like the true scores. This is true whether your indicators are continuous or categorical. Your correlation of 0.5 between estimated factor scores seems high however, unless your indicators are weak. I think you are saying that you have 3-category indicators. Perhaps their loadings are quite small. If you like, you can send your Mplus input, output, and data to support@statmodel.com
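
The variance part of this statement can be illustrated with a short Python sketch for the regression method with continuous indicators and a single standardized factor. These are simplifying assumptions for illustration only; the function name and loadings are hypothetical, and the sketch does not address the correlation between scores for two factors.

```python
def regression_score_variance(loadings):
    """Variance of regression-method scores for one standardized factor.

    Assumes factor variance 1 and residual variances 1 - loading**2.
    With s = sum(loading^2 / residual variance), the score variance is
    s / (1 + s), which is always below the factor variance of 1.
    """
    s = sum(l * l / (1.0 - l * l) for l in loadings)
    return s / (1.0 + s)

weak = regression_score_variance([0.5] * 5)     # five weak indicators
strong = regression_score_variance([0.9] * 5)   # five strong indicators
```

With five loadings of 0.5 the score variance is 0.625, well short of 1; with five loadings of 0.9 it is about 0.955, in line with the remark that scores from many good indicators behave more like the factors themselves.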

Deborah Bandalos posted on Monday, November 25, 2002 - 10:44 am

I am introducing MPlus to my CFA class and have not been able to find any studies in which the WLSMV, WLSM, and WLS estimators have been compared in terms of bias in standard errors and chi-square values and/or efficiency. Are any such studies available?

bmuthen posted on Monday, November 25, 2002 - 10:59 am

These two references cover this (I'd be happy to send them):

Muthén, B., du Toit, S.H.C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Accepted for publication in Psychometrika. (#75)

Muthén, B. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen, & J. S. Long (Eds.), Testing Structural Equation Models (pp. 205-243). Newbury Park, CA: Sage. (#45)

Deborah Bandalos posted on Tuesday, November 26, 2002 - 6:43 am

Thanks Bengt. Would you please send the Muthen, du Toit & Spisic? I have the other article. I am at:

325V Aderhold
University of Georgia
Athens, GA 30602

or email at dbandalo@coe.uga.edu.

Thanks,

Debbi

bmuthen posted on Tuesday, November 26, 2002 - 6:46 am

Will send it.

Anonymous posted on Thursday, August 14, 2003 - 7:58 am

Hi,

I am calculating factor scores for dichotomous items using confirmatory factor analysis in M-plus. In general, what is the range of these factor scores?

Linda K. Muthen posted on Saturday, August 16, 2003 - 4:24 pm

The range depends on the estimated factor means and factor variances.

Anonymous posted on Tuesday, September 09, 2003 - 12:20 pm

I'm trying to locate a piece Muthen did with Kaplan a while back that evaluated the performance of various estimators when doing CFA with categorical indicators and I'm having some difficulties.

I believe the piece was Muthen and Kaplan, 1985 (British Journal of Mathematical and Statistical Psychology). Does this sound right, or did M+K do another piece which made similar comparisons ?

Linda K. Muthen posted on Tuesday, September 09, 2003 - 1:57 pm

Are these the references you want?

Muthén, B., & Kaplan D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.

Muthén, B., & Kaplan D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical and Statistical Psychology, 45, 19-30.

Mpduser1 posted on Tuesday, September 09, 2003 - 3:02 pm

Those might be the ones; I'll give them a try.

I've actually read the piece I'm attempting to refer to -- M+K do a series of simulations and find that the Muthen estimator (whatever it was called at the time, but I assume it's the WLSMV estimator now) provides superior results over WLS, GLS.

Thanks.

Michael Conley posted on Tuesday, October 07, 2003 - 10:47 am

I am planning on fitting a CFA on a large sample and obtaining parameter estimates. I then plan to use/fix those parameter estimates in a CFA on another small sample and obtain factor score estimates based on the original large sample parameter estimates. I haven't seen this done, so I am wondering if there is a logical/technical problem with doing this?

Linda K. Muthen posted on Saturday, October 11, 2003 - 9:31 am

If the small sample is not from the same population as the large sample, this would not make sense. If it is, it would seem reasonable. I don't know of any literature on this.

Michael Conley posted on Sunday, October 12, 2003 - 8:16 am

Thanks. Yes the small sample is very much like the large sample (actually a seemingly representative subset of the large sample) but the small sample has an interesting criterion variable available. I want to know the relationship between the factor scores and the criterion variable. The large sample is big enough to let me estimate the CFA parameters, while the small sample is not.

Linda K. Muthen posted on Sunday, October 12, 2003 - 9:08 am

Both samples should be randomly sampled from the same population for your logic to hold. I think you will run into critique if you can't claim that.

Anonymous posted on Thursday, November 13, 2003 - 5:57 am

I am estimating several CFA-models for a questionnaire with 112 dichotomous items. The sample size is 685. What estimator should I use? (WLS, WLSM, WLSMV)? (item difficulties range from .2 - .8)

Linda K. Muthen posted on Thursday, November 13, 2003 - 7:18 am

We recommend WLSMV. It is the Mplus default.

Dustin posted on Monday, December 22, 2003 - 11:27 am

Is there any way to turn off the default estimation of the covariances between continuous latent variables?

bmuthen posted on Monday, December 22, 2003 - 12:08 pm

If you want factors f1 and f2 to be uncorrelated, you simply say

f1 with f2@0;

dustin posted on Monday, December 22, 2003 - 12:39 pm

Is fixing the covariance to be equal to zero the same as not estimating the parameter to begin with, in terms of the df in the model?

bmuthen posted on Monday, December 22, 2003 - 12:53 pm

Yes.
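
The counting behind this answer can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical two-factor model with three continuous indicators per factor and a covariance structure only (no means); the parameter counts are made up for the example and do not come from the thread.

```python
def n_sample_moments(n_indicators):
    """Number of distinct variances and covariances among the indicators."""
    return n_indicators * (n_indicators + 1) // 2


def degrees_of_freedom(n_indicators, n_free_params):
    """df for a covariance-structure model: sample moments minus free parameters."""
    return n_sample_moments(n_indicators) - n_free_params


# Hypothetical model: 2 factors, 3 continuous indicators each,
# first loading of each factor fixed at 1 to set the scale.
free_loadings = 4
residual_variances = 6
factor_variances = 2

# "f1 with f2@0;" fixes the covariance, so it adds no free parameter.
params_fixed_at_zero = free_loadings + residual_variances + factor_variances + 0
# A covariance that is simply not part of the model adds none either.
params_not_in_model = free_loadings + residual_variances + factor_variances
# Freely estimating the covariance costs one parameter.
params_free_covariance = params_fixed_at_zero + 1

df_fixed = degrees_of_freedom(6, params_fixed_at_zero)     # 21 - 12 = 9
df_omitted = degrees_of_freedom(6, params_not_in_model)    # 21 - 12 = 9
df_free = degrees_of_freedom(6, params_free_covariance)    # 21 - 13 = 8
```

Either way of removing the covariance leaves the same number of free parameters, so the df are identical; only freeing the covariance changes the count.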

Scott Weaver posted on Saturday, May 22, 2004 - 9:35 am

Drs. Muthen & Muthen,
I am conducting a CFA with categorical indicators (using WLSMV and Type=missing). I am sometimes obtaining modification indices = 999.000. I've read on the discussion board that this is because the modification index is indeterminate (zero or near zero denominator). My question is: Does this mean that the parameter estimates for my model are suspect? (The fit of my model is good - chi-square is non-significant.) Or should I just ignore these modification indices and retain the model since the model results make sense to me?
Thanks, Scott

bmuthen posted on Saturday, May 22, 2004 - 10:03 am

No, this most likely has no implication for quality of estimates or the model - just ignore.

Lieven posted on Thursday, October 14, 2004 - 3:06 am

I have 1056 variables (equity portfolios) with the number of observations ranging from 10 to 95 (monthly dollar returns). For some variables the covariance is missing because the observations may not overlap in time. These missing covariances can be set to zero. There are 74 factors. 1 world-factor: each variable can have an unrestricted loading on it; 39 country-factors: each portfolio belongs to a specific country and can have an unrestricted loading on only one country-factor, the rest zero; and 34 industry-factors: each portfolio (variable) belongs to a specific industry and can have an unrestricted loading on only one industry-factor, the rest zero. I have initial values, coming from a two-step regression method, for the factors and the loadings. Our aim is twofold: (1) getting estimates for the loadings and the factors in one shot; (2) testing whether the 3 loadings per variable are equal. A reliable estimate of the fit of the model is less important. Is it possible to perform such an analysis?

Bengt O. Muthen posted on Thursday, October 14, 2004 - 10:28 am

Mplus has a limit of 500 variables. This is an arbitrary limit but it nevertheless is there in the current version. An analysis with considerably more variables than observations is difficult to carry out, particularly in terms of getting good inference, in your case, testing of equalities. The situation is similar to that of recent factor analyses of microarray data; see, for example, the work of Geoff McLachlan for some recent developments in this area.
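
To put rough numbers on "considerably more variables than observations", here is a small Python sketch using the figures quoted in the question and the reply. It relies on the general fact that a sample covariance matrix computed from n complete observations has rank at most n - 1; the variable names are only illustrative.

```python
n_variables = 1056      # portfolios in the question
max_observations = 95   # longest monthly return series in the question
mplus_limit = 500       # variable limit quoted in the reply

# Distinct variances and covariances in a covariance matrix of this size.
n_moments = n_variables * (n_variables + 1) // 2

# Rank bound for a sample covariance matrix from n complete observations.
max_rank = max_observations - 1
is_singular = max_rank < n_variables

# One world, one country and one industry loading per variable.
n_loadings = 3 * n_variables

print(n_moments, max_rank, is_singular, n_loadings)
```

Even in the best case of 95 overlapping months, the covariance matrix is far from full rank, which is one way to see why inference on the loadings is hard here.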

dana posted on Thursday, November 18, 2004 - 7:00 am

Hi,

I would like more information about the fact that 'with categorical outcome, the residual variances are not parameters in the model'. So that is different when we have continuous outcomes, right?
I wonder why we can't estimate the residual variances?

I read this article:
B. Muthén (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, vol 43, no 4, 551-560.

And on page 552, it is said that there is one 'necessary restriction to make since there is no possibility to identify the diagonal elements of sigma, only observing dichotomous variables'.
I still don't understand why it is not possible to identify the diagonal elements, ...
Can you please explain this to me?

So if we need to make this necessary restriction (which is: diag(sigma) = I ), does this mean that to know the psi covariance matrix of the errors, I need to do this calculation:
psi = I - diag((lambda)*(phi)*(t(lambda)))

where
psi = covariance matrix of the errors
I = identity matrix
lambda = matrix of factor loadings
phi = covariance matrix of the factors
t(lambda) = transpose of the matrix lambda

if yes, does this mean that we can't fix or let free any elements of the psi matrix?

Thanks for your help!

dana

bmuthen posted on Thursday, November 18, 2004 - 1:50 pm

Look at technical appendix 1 for details about this from the perspective of probit regression. The issue arises from a binary variable having mean p (the probability) and variance p (1-p), which means that there is not separate information about the variance beyond information about the mean. This is different than for continuous variables.

Yes, you compute the residual variance as in the formula you give.

You can only estimate a residual variance if you work with longitudinal or multigroup data, where you have equality of measurement parameters - see Mplus web note #4.
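
The two points in this reply can be worked through numerically. The following is a minimal Python sketch, assuming the restriction diag(sigma) = I discussed above, no covariates, and hypothetical loadings and factor covariances that are not taken from any real output. It computes the residual variances as 1 minus the diagonal of lambda*phi*t(lambda), and shows that the variance of a binary variable is fully determined by its mean.

```python
def implied_common_variance(loadings, phi):
    """Diagonal of lambda * phi * t(lambda): variance in each y* explained by the factors."""
    n_factors = len(phi)
    diagonal = []
    for row in loadings:
        total = 0.0
        for j in range(n_factors):
            for k in range(n_factors):
                total += row[j] * phi[j][k] * row[k]
        diagonal.append(total)
    return diagonal


def residual_variances(loadings, phi):
    """Residual variance of each y* under the restriction diag(sigma) = I."""
    return [1.0 - v for v in implied_common_variance(loadings, phi)]


def bernoulli_variance(p):
    """Variance of a 0/1 variable with mean p; no information beyond the mean."""
    return p * (1.0 - p)


# Hypothetical estimates: 4 binary items, 2 correlated factors.
loadings = [
    [0.8, 0.0],
    [0.5, 0.4],   # item with a cross-loading
    [0.0, 0.6],
    [0.0, 0.5],
]
phi = [
    [1.0, 0.3],
    [0.3, 1.0],
]

psi_diagonal = residual_variances(loadings, phi)
print(psi_diagonal)              # approximately [0.36, 0.47, 0.64, 0.75]
print(bernoulli_variance(0.2))   # approximately 0.16
```

Because the residual variances follow from the loadings and phi, they are computed rather than estimated, which matches the statement that they are not free parameters in this setting.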

dana posted on Friday, November 19, 2004 - 7:46 am

Thank you very much!
I'll read the paper!
Many thanks!

dana

Anonymous posted on Tuesday, November 23, 2004 - 3:41 am

What is the difference between using WLS and MLR when indicators in a CFA are categorical?

Linda K. Muthen posted on Tuesday, November 23, 2004 - 6:38 am

With WLS, probit regressions are estimated. With MLR, logistic regressions are estimated.
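
For readers who want to see what the two links look like, here is a minimal Python sketch. It assumes a single binary item, a unit residual variance for the probit form, and hypothetical threshold and loading values. It also uses the common rule of thumb that logistic coefficients are roughly 1.7 times the corresponding probit coefficients, which is an approximation and not something Mplus reports.

```python
import math


def probit_probability(threshold, loading, factor_value):
    """P(u = 1 | factor) under a probit link with unit residual variance."""
    z = loading * factor_value - threshold
    return 0.5 * math.erfc(-z / math.sqrt(2.0))   # standard normal CDF at z


def logit_probability(threshold, loading, factor_value):
    """P(u = 1 | factor) under a logistic link."""
    z = loading * factor_value - threshold
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical item: probit threshold 0.5 and loading 1.0.
# Scaling both by about 1.7 gives a logistic curve that nearly coincides.
grid = [x / 10.0 for x in range(-30, 31)]
max_gap = max(
    abs(probit_probability(0.5, 1.0, f) - logit_probability(0.85, 1.7, f))
    for f in grid
)
print(max_gap)
```

The two curves describe nearly the same item; the estimates differ mainly in scale, so loadings from the two estimators should not be compared directly.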

Anonymous posted on Wednesday, November 24, 2004 - 10:36 am

Hi,

when doing a CFA with binary indicators and with missing value (so TYPE = MISSING), is it possible to have CFI, TLI RMSEA and SRMR like we can get when we do not use type = missing?

thanks

Linda K. Muthen posted on Wednesday, November 24, 2004 - 4:14 pm

Add H1 to TYPE = MISSING and I think you will get what you want.

Anonymous posted on Wednesday, November 24, 2004 - 5:48 pm

thanks!

I have another question.
What is the meaning of this warning:

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE.
PROBLEM INVOLVING VARIABLE F2.

thanks a lot

dana posted on Friday, November 26, 2004 - 12:10 pm

Hi

I would like to add another question about the psi matrix (message from 18 November 2004).
Following the formula:
psi = I - diag((lambda)*(phi)*(t(lambda)))

where
psi = covariance matrix of the errors
I = identity matrix
lambda = matrix of factor loadings
phi = covariance matrix of the factors
t(lambda) = transpose of the matrix lambda

So that means that the errors are not correlated! Because as I understand the equation, the identity matrix is for example like:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1

If yes, I thought the CFA can allow correlated errors, ...!!

thanks

dana

bmuthen posted on Friday, November 26, 2004 - 12:24 pm

Correlated errors are allowed. Your computations above are only for the diagonal of the matrix you call Psi, not the off-diagonal elements. See the Mplus Technical Appendix 2, formulas 42 and on (where the residual cov matrix is called Theta) for a more precise formulation.

dana posted on Friday, November 26, 2004 - 12:37 pm

thanks

just to know: can Mplus give me the covariance matrix for the errors in the output? Because it seems a little bit complicated to calculate it, ...

thanks

bmuthen posted on Friday, November 26, 2004 - 2:15 pm

The residual covariances are parameters that can be identified and estimated in the model. Just say e.g. u1 with u2.

dana posted on Monday, November 29, 2004 - 5:03 am

ok!

So if I understand well:
1. In a CFA with binary indicators, I can't estimate the error variances but I can calculate them with the formula above.

2. But I can estimate the error covariances by writing "u1 with u2" in the program.

3. So if I don't write "u1 with u2", that will mean that the errors of u1 and u2 are uncorrelated?

Is that right?

Thanks!

dana

Linda K. Muthen posted on Monday, November 29, 2004 - 8:04 am

1. Yes.
2. Yes.
3. Yes.

Scott Engel posted on Thursday, December 09, 2004 - 12:01 pm

I ran a CFA with categorical factor indicators on 25 items with four scales. I got an excellent fit for this model. I followed this up with a second order CFA to determine if these four factors are well represented by a single latent construct. This model also fit well and it appears that a single latent construct does, in fact, represent the four scales well. My question is this- given that I’ve run nonlinear factor analysis, can I use a simple sum score of the scales as a total score? Or, do I need to consider weighting of the scaled scores based upon how they loaded on the second order latent construct? For practical reasons I’d prefer to use a simple unweighted score. If the weighted scores are ideal, is using the unweighted scores defensible and an adequate rough approximation of the latent construct?

bmuthen posted on Thursday, December 09, 2004 - 5:10 pm

The optimal approach would be to get the estimated factor scores for the second-order construct. You get this by using the FSCORES option of the SAVEDATA command.

A sum of the 25 items is probably a decent rough approximation of the factor score. But note that you then revert to assuming interval scale for the four categories and also don't take into account the differential loadings on each of the two levels.

phillipkwood posted on Tuesday, January 11, 2005 - 8:26 am

A student is interested in doing a confirmatory factor analysis with categorical variables and has specified several items with multiple thresholds in a large sample. She wants to accomplish a change to her confirmatory model in which error variances are covaried. How can one do this, given the variables are categorical? A search of the manual revealed that it's possible to specify that thresholds are correlated. Does this accomplish the desired correlation between error terms? I'm somewhat unsure of what correlating thresholds means and where I'd read more about it. Given that the data are polytomous, what would it mean, by the way, if one specified a covariance between one of the thresholds and not the other between two manifests?
As an interim workaround, I told the student to model the desired error covariance as a factor with two indicators, but of course, it would be good to know what the best answer is.
thanks!

BMuthen posted on Wednesday, January 12, 2005 - 4:33 pm

Residuals can be correlated using the default WLSMV estimator. These are the residuals for the underlying y* variables. The thresholds are not correlated. They are not random variables.

Scott C. Roesch posted on Friday, March 04, 2005 - 10:33 am

Is it possible to get Mardia's coefficient in MPlus version 3?

Tihomir Asparouhov posted on Friday, March 04, 2005 - 11:05 am

Technical output 13 provides univariate, bivariate and multivariate sample skew and kurtosis. This output is available for mixture models only. Non-mixture models can be estimated as a single-class mixture to get that output.

MichaelCheng posted on Saturday, March 05, 2005 - 10:38 am

Earlier I mentioned that I will be doing a CFA with 24 binary outcomes and 1 latent variable. It was suggested that I use WLSMV instead of WLS and that nested models should be compared using chi-square difference. On the Mplus output, however, a warning message says that "The chi-square value for MLM, MLMV, MLR, WLSM and WLSMV cannot be used for chi-square difference tests." It further says that MLM, MLR and WLSM chi-square difference testing is described in the Mplus Technical Appendices. How about WLSMV? Thank you!

Linda K. Muthen posted on Saturday, March 05, 2005 - 10:48 am

You need to use the DIFFTEST option to do chi-square difference testing using WLSMV. See the Mplus User's Guide for a description of the DIFFTEST option. See Example 12.12 for an illustration of this option.

Anonymous posted on Thursday, March 17, 2005 - 10:39 pm

I am trying to reconcile my understanding of web note 4 with page 480 of the manual. My model is
MODEL:
f1 BY y1-y7;
f2 BY y8-y15;
f1 ON x1-x10;
f2 ON f1 x1-x15;

Where y1-y15 are ordinal, f1 and f2 are continuous, and x1-x15 are a mix of continuous and categorical.

My question is: are the output coefficients for the “ON” part of the model simple linear regression coefficients or probit (WLS) / logistic (MLE) regression coefficients? From web note 4 and a technical appendix, I started thinking the “ON” portion is a “latent response variable formulation” and could be viewed as a simple linear regression with continuous outcome since the latent variable is continuous. From p 480 of the manual, I’m not sure this is the case.

Either way, can I assume the parameter estimates are normal and use the estimates with their std errors to get p-values?

Another question- should I expect to see a coefficient on the output for f1?
Thank you.

Linda K. Muthen posted on Friday, March 18, 2005 - 7:52 am

Because f1 and f2 are continuous latent variables, the coefficients are simple linear regression coefficients. The factor loadings are probit regression coefficients.

The ratio of the parameter estimate to its standard error can be used to assess significance.

You should expect to see coefficients for x1-x10 for f1 and x1-x15 for f2.
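
The ratio mentioned above is the Est./S.E. column of the output. As a minimal sketch of how it turns into a p-value, assuming the ratio is referred to a standard normal distribution and using made-up numbers:

```python
import math


def two_sided_p_value(estimate, standard_error):
    """Two-sided p-value for estimate / standard_error under a standard normal reference."""
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical coefficient and standard error, giving a ratio of 1.96.
p = two_sided_p_value(0.392, 0.2)
print(round(p, 3))
```

A ratio of about 1.96 in absolute value corresponds to a two-sided p-value of about 0.05.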

Anonymous posted on Monday, March 21, 2005 - 3:24 pm

This is a follow-up question for the model in my March 17, 2005 post above. I have been reading technical appendix for version 2 (although I’m using 3.12) but still unclear on how my model is being estimated. I’m not specifying an analysis type so the default WLSMV is being used.

More specifically, I’m confused by the statement that the coefficients for the “ON” part of the model are simple linear regression coefficients, which would imply that maximum likelihood is being applied. Is the estimation applied to the “BY” part of the model different than that applied to the “ON” part of the model? Is WLSMV applied only to the “BY” part of the model?

I also see in Tech 5 that gradient calculations and quasi-newton are being applied…are these steps required for both ML and WLSMV?

On a separate topic, I need some help understanding the model fit: Chi-Square, Chi-Square for baseline (what is baseline?), CFI, TLI, RMSEA, WRMR…can you recommend a reference?
Thank you so much!

bmuthen posted on Monday, March 21, 2005 - 5:36 pm

Say that a model has both (1) factor indicators and (2) regressions of factors on covariates. With categorical indicators, part (1) is a non-linear regression (probit or logit), while part (2) is a linear regression. The choice of regression is done as a function of the dependent variable being categorical in (1) and continuous in (2). The fact that linear regression is used in (2) does not imply that ML is used. With the WLSMV estimator you get probit in (1) and linear regression in (2), and all of this is done in a single WLSMV estimation step encompassing both (1) and (2).

Yes, both ML and WLSMV require numerical optimization using gradient and QN steps.

See the Yu dissertation on the Mplus web site and references therein for these fit index definitions.

Anonymous posted on Thursday, March 24, 2005 - 7:44 pm

Hi,
I am using CFA to evaluate a model of 5 correlated first-order factors. My understanding is that it is appropriate to constrain error covariances to zero. It has been suggested to me that, instead, I estimate the error covariances among the items specified as indicators of a given factor, but not estimate error covariances for items across different factors.
The argument provided to me for doing this is that this is okay because theoretically the items serving as indicators of a given factor are measuring something distinct from the other factors. To me, this is represented by the fact that such indicators are specified to load on one and only one of the factors. Is there ever a time to allow these errors to covary?
The only situation that comes to my mind is if there are method effects, maybe reverse-scored items, but even in this situation I wouldn't be that comfortable doing it.
Any thoughts about this situation or any other? How justifiable is it to estimate these parameters?

Linda K. Muthen posted on Friday, March 25, 2005 - 7:37 am

It is justifiable to estimate residual covariances of factor indicators if there is a reason that this parameter makes sense in the model, for example, method effects or minor factors. I do not believe that this would apply to reverse-scored items.

Calvin Croy posted on Monday, April 11, 2005 - 4:49 pm

I am running a multiple group CFA where I want to compare the loadings on the same 6 factors in 4 different age groups. Each factor is defined using the same 2 observed variables in each of the 4 groups (factor1 by var1 var2 in group1, group2, group3, and group4). There are no covariates.

Following the documentation in the Mplus 3.0 User's guide I have first specified an overall model, followed by group specific models for each of the age groups, listing only the second observed variable for each factor on the group-specific models (since the first variable will always have a loading of 1.0 to establish scale).

My questions:

1. When I specify the group-specific models, do I need to leave out one of the age categories, as I would need to do in multiple regression to prevent multicollinearity? Curiously, I get the same results whether I do or don't omit one of the age categories.

2. The User's Guide says that, if I do not request a model for one of the groups that I have defined with the GROUPING command, the omitted group will have the overall model fitted. Could you explain why I get the same output (identical stats) if I don't include a group-specific model command for the age 15-24 group as when I do? I would have expected to get different loadings for the second observed variable on my 6 factors when I allow them to be freely estimated in the age 15-24 group compared to when I force that group to have the overall model by not including a model statement for it.

3. I have observed that I get exactly the same output (all stats the same) regardless of which age category group I omit (i.e. don't include a specific model for) or whether I specify that I want the loadings to be freely estimated separately for all 4 age groups. When I fit the same models (6 factors, 2 observed vars per factor, first var loading fixed at 1.0) across 3 education levels, I of course get different loadings than when I model by age groups, but again the output is the same across the education groups.

To summarize:
Can you explain this consistency regardless of how I specify that I want models built for each group? Whether I leave a group out of my model statements or don't, and regardless of which group I omit, I get the same results.

Thank you so very much for whatever clarification you can provide.

Thuy Nguyen posted on Tuesday, April 12, 2005 - 10:30 am

If possible, please send your input, output and data to support@statmodel.com. I think these questions will be better answered if we are able to see the results that you are seeing and how the model is set up.

Anonymous posted on Monday, August 15, 2005 - 2:06 pm

First, let me say thank you for maintaining this discussion site. It is invaluable. I have found that your responses frequently answer questions I have had but did not post.

I am running a CFA with two factors. I get the warning message that the psi matrix is not positive definite. The Tech4 option shows that the estimated correlation between my two factors is 1.098, which is the cause of the warning, but the output under Model Results shows under "factor1 with factor2" a correlation of just 0.118.

Computationally, what's the difference between the estimated correlation between factors produced by the Tech4 option and the one that appears under the model results?

bmuthen posted on Monday, August 15, 2005 - 3:47 pm

Thanks for the encouragement regarding our site. Tech4 gives the correlation which is the same as what is given in the regular output if the factors are exogenous in the model, but if the factors are dependent variables the regular output concerns the residual correlation.
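
The distinction can be illustrated with a small Python sketch. It assumes a simple hypothetical setup in which both factors are regressed on a single covariate x, so that the total covariance is gamma1*gamma2*var(x) plus the residual covariance; all numbers are invented and are not the values from the post above.

```python
import math


def total_and_residual_correlation(gamma1, gamma2, var_x, psi11, psi22, psi12):
    """Correlations between two factors that are both regressed on one covariate x.

    Returns (total correlation, residual correlation).
    """
    total_cov = gamma1 * gamma2 * var_x + psi12
    total_var1 = gamma1 ** 2 * var_x + psi11
    total_var2 = gamma2 ** 2 * var_x + psi22
    total_corr = total_cov / math.sqrt(total_var1 * total_var2)
    residual_corr = psi12 / math.sqrt(psi11 * psi22)
    return total_corr, residual_corr


# Hypothetical estimates with strong covariate effects.
total_corr, residual_corr = total_and_residual_correlation(
    gamma1=0.9, gamma2=0.8, var_x=1.0, psi11=0.2, psi22=0.3, psi12=0.03
)
print(round(total_corr, 3), round(residual_corr, 3))

# With no covariate effects the factors are exogenous and the two coincide.
exo_total, exo_residual = total_and_residual_correlation(
    gamma1=0.0, gamma2=0.0, var_x=1.0, psi11=0.2, psi22=0.3, psi12=0.03
)
```

In this example the total correlation is large while the residual correlation is small, which is the same pattern as a large Tech4 value next to a small "with" estimate for dependent factors.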

Anonymous posted on Wednesday, August 17, 2005 - 9:30 am

Could you please explain how to interpret the Residual Variances and R-SQUARE values listed in the output for a MIMIC model?

1. Are the residual variances the differences between the variances estimated for the observed variables (factor indicators) from the model and the actual variances for these variables calculated from the data? -- such as what is measured in the Chi-square goodness of fit measure?

2. Or is the residual variance of a factor indicator variable the variation that's unaccounted for by its loadings of the factors? If this is the case one would expect that the R-Square value would be equal to (1 - the residual variance), but this does not seem to be the case judging from the values in my output. Is R-square the same as the communality in EFA?

3. I know that the R-Square value is equal to 1 - the StdYX of the Residual variance. But how do the R-square values and the raw residual variances relate to one another?

Thank you for the information!

BMuthen posted on Wednesday, August 17, 2005 - 2:18 pm

1. No.

2. That's right. In the MIMIC model, the R-square is not 1 - the residual variance but it is the ratio of variance in the factor explained by the covariates over the total variance of the factor, where the total variance includes both the explained variance and the residual variance.

3. See number 2.
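
A worked example of the ratio described in point 2, as a minimal Python sketch. The regression coefficients, covariate covariance matrix and residual variance are hypothetical, and the explained variance is computed as gamma' * Cov(x) * gamma.

```python
def explained_variance(gammas, cov_x):
    """gamma' * Cov(x) * gamma: factor variance explained by the covariates."""
    total = 0.0
    for j, gj in enumerate(gammas):
        for k, gk in enumerate(gammas):
            total += gj * cov_x[j][k] * gk
    return total


def factor_r_square(gammas, cov_x, residual_variance):
    """Explained variance over total (explained plus residual) variance."""
    explained = explained_variance(gammas, cov_x)
    return explained / (explained + residual_variance)


# Hypothetical MIMIC estimates: one factor regressed on two covariates.
gammas = [0.5, 0.3]
cov_x = [[1.0, 0.2], [0.2, 1.0]]
residual_variance = 0.8

r_square = factor_r_square(gammas, cov_x, residual_variance)
standardized_residual = residual_variance / (
    explained_variance(gammas, cov_x) + residual_variance
)
print(round(r_square, 3), round(standardized_residual, 3))
```

Here the explained variance is 0.40 and the total variance is 1.20, so R-square is about 0.333. That is not 1 minus the raw residual variance (0.2), but it is 1 minus the standardized residual variance (about 0.667), in line with point 3 of the question.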

Anonymous1 posted on Monday, October 03, 2005 - 6:02 am

Good morning,

I am a relatively new user of your software. I am currently attempting to conduct a CFA using the latest version of MPlus but keep encountering two series of error messages:

1. COMPUTATIONAL PROBLEMS ESTIMATING THE CORRELATION FOR x2 AND x4.
INCREASING THE ITERATION OR CONVERGENCE OPTIONS MAY RESOLVE THIS PROBLEM.

When I follow the suggested advice I then encounter the second message which pertains to a completely different variable:

2. SERIOUS COMPUTATIONAL PROBLEMS OCCURRED IN THE UNIVARIATE ESTIMATION OF THE THRESHOLDS/MEANS, VARIANCES AND/OR SLOPES FOR VARIABLE x10.

What might these messages and their associated problems be attributable to?

bmuthen posted on Monday, October 03, 2005 - 10:34 am

It sounds like your variables are categorical so you may want to look at their univariate and bivariate distributions (frequency tables) to see if you see anything unexpected such as very skewed distributions. If this doesn't help, send your input, data, and output to support@statmodel.com
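
The frequency-table check suggested here can be done in any package before running Mplus. The following is a minimal Python sketch with made-up binary data; the variable names x2 and x4 simply echo the error message above, and the functions are illustrative helpers, not part of Mplus.

```python
from collections import Counter


def univariate_table(values):
    """Proportion of observations in each category of one item."""
    counts = Counter(values)
    n = len(values)
    return {category: counts[category] / n for category in sorted(counts)}


def bivariate_table(values_a, values_b):
    """Cross-tabulated counts for two items, including empty cells."""
    counts = Counter(zip(values_a, values_b))
    categories_a = sorted(set(values_a))
    categories_b = sorted(set(values_b))
    return {(a, b): counts.get((a, b), 0) for a in categories_a for b in categories_b}


def empty_cells(table):
    """Cells with no observations in a cross-tabulation."""
    return [cell for cell, count in table.items() if count == 0]


# Hypothetical data: x4 is very skewed and is never 1 when x2 is 0.
x2 = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
x4 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(univariate_table(x4))
print(empty_cells(bivariate_table(x2, x4)))
```

Very skewed items and empty cells in the bivariate tables are the kind of unexpected pattern worth looking for, since they can make the correlation for a pair of categorical items hard to estimate.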

Anonymous1 posted on Tuesday, October 04, 2005 - 1:37 pm

Thanks Dr. Muthen. I followed your advice and reexamined the variable distributions. Several variables had negative or positive skews. Transformations solved the problems that I'd been encountering.

JI posted on Tuesday, October 04, 2005 - 2:03 pm

Hi,
I'm a new MPlus user. I just ran a CFA with my .inp as stated below. However, I've been getting a series of error messages on "ERROR in Variable command Duplicate variable on NAMES list". I truly hope I can get some help with this issue. Thanks.

TITLE: CFA for Mach and Religiosity
DATA: FILE IS Mach_Rel_BE_EPbeg_EPnow.dat;
VARIABLE: NAMES ARE T1-T9 V1-V9 M1-M2 R1-R4 Be1-Be16 Epb1-Epb10 Epn1-Epn10;
USEVARIABLES ARE T1 T3 T5 T7 V1 V2 T2 T4 T6 T8 T9 M1 V3 V4 V6 V8 V9 R1-R4;
CATEGORICAL ARE T1 T3 T5 T7 V1 V2 T2 T4 T6 T8 T9 M1 V3 V4 V6 V8 V9 R1-R4;
MODEL:
M BY T1 T3 T5 T7 V1 V2;
Mi BY T2 T4 T6 T8 T9 M1;
Fio BY V3 V4 V6 V8 V9;
Rel BY R1-R4;

Thuy Nguyen posted on Tuesday, October 04, 2005 - 5:30 pm

JI,

From your input, I don't see what might be causing a problem. If you send your input, output and data to support@statmodel.com, I can check into this for you.
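
A quick way to double-check a NAMES list for duplicates outside Mplus is sketched below in Python. It assumes that hyphenated entries such as T1-T9 expand to consecutive numbered names and that names are compared without regard to case; both are assumptions of this sketch, and the helper functions are hypothetical. For the list as posted it finds no duplicates, which agrees with the reply above.

```python
import re


def expand_names(names_statement):
    """Expand list entries such as 'T1-T9' into T1, T2, ..., T9."""
    expanded = []
    for token in names_statement.split():
        match = re.fullmatch(r"([A-Za-z_]+)(\d+)-([A-Za-z_]+)(\d+)", token)
        if match and match.group(1).lower() == match.group(3).lower():
            prefix = match.group(1)
            first, last = int(match.group(2)), int(match.group(4))
            expanded.extend(f"{prefix}{i}" for i in range(first, last + 1))
        else:
            expanded.append(token)
    return expanded


def duplicate_names(names):
    """Names that occur more than once, compared without regard to case."""
    seen = set()
    duplicates = []
    for name in names:
        key = name.lower()
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


names = expand_names("T1-T9 V1-V9 M1-M2 R1-R4 Be1-Be16 Epb1-Epb10 Epn1-Epn10")
print(len(names), duplicate_names(names))
```

A check like this only covers the list as typed; it cannot tell whether the number of names matches the number of columns in the data file.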

Pancho Aguirre posted on Friday, October 28, 2005 - 9:33 am

Hi,

I'm trying to estimate the effects of a national level observed variable on the covariance between two individual-level latent constructs.

Is this possible to do using Mplus?

I would like to be referred to some examples/papers that have done this type of analysis in the past.

Thanks in advance for your help,

Pancho

Linda K. Muthen posted on Friday, October 28, 2005 - 1:10 pm

Can you describe this in a little more detail? For example, do you want to explain the covariance between an individual's income and ses by the nation's GNP? I don't think I understand what you are asking.

Pancho Aguirre posted on Saturday, October 29, 2005 - 12:42 pm

Hi Linda,

Yes your example is correct. Can I do that using Mplus?

Thanks a lot!

Pancho

Linda K. Muthen posted on Saturday, October 29, 2005 - 12:52 pm

Explain to me how the nation variable has any variability. It seems it would be the same for each person. Or are you looking at several nations?

Pancho Aguirre posted on Saturday, October 29, 2005 - 1:34 pm

Linda,

Yes, I have a total of 25 nations, and their GNPs vary. So you are correct: people are nested within nations.

Thanks so much,

Pancho

Linda K. Muthen posted on Sunday, October 30, 2005 - 7:25 am

This will be available in Version 4 of Mplus.

Ed Wu posted on Tuesday, November 15, 2005 - 5:01 pm

"In the MIMIC model, the R-square is... the ratio of variance in the factor explained by the covariates over the total variance of the factor, where the total variance includes both the explained variance and the residual variance."

What is the relationship between R-square and communalities in models without covariates?

bmuthen posted on Tuesday, November 15, 2005 - 5:47 pm

Communalities refer to explained variance in an item as a function of the factors influencing that item. So it is the R-square for the item instead of for the factor.

Blaze Aylmer posted on Saturday, November 19, 2005 - 4:33 am

Linda

Can you recommend any sources that can help with the interpretation of CFA output in MPLUS?

Blaze

Linda K. Muthen posted on Saturday, November 19, 2005 - 6:26 am

Chapter 17 of the Mplus User's Guide has a description of the Mplus output. The scale of the factor indicators tells you what type of regression coefficient the factor loading is. As far as how to understand other things about CFA, see a book like the Bollen book or other references on the Mplus website.

fati posted on Tuesday, January 17, 2006 - 6:55 am

I am estimating a CFA with 7 factors and 38 items using WLSMV. My output shows RMSEA=0.095 and CFI=0.94. I know that is a poor model, and I have some questions; this is the first time that I have done this analysis.
1-What statistics can help me get a good model? My program is:
TITLE: CFA FOR PCAS (Categorical factor indicators)
DATA: FILE IS PCAS.dat;
VARIABLE: NAMES ARE q1b q2 q3b q4b
q6a q6b q9a q9b q9c q9d q9e q10
q11a q11b q11c q11d q11e
q12b q12d q12g q14a q14b
q14c q14d q15 q19a q19b
q19c q19d q19e q19f q7r q8r
q12ar q12cr q12er q12fr q13REC;

CATEGORICAL ARE ALL;
missing = ;
MODEL: ACCES BY q1b q2 q3b q4b q6a q6b;
CR_VISIT BY q7r q8r;
CR_KNOW BY q14a q14b q14c q14d q15;
IC_COMM BY q9a q9b q9c q9d q9e q10;
IC_TRUST BY q12ar q12b q12cr q12d q12er q12fr q12g q13rec;
MC_INTEG BY q19a q19b q19c q19d q19e q19f;
RESP BY q11a q11b q11c q11d q11e;

OUTPUT:
tech2 modindices;
2-Can I use the modification indices for this? My modification indices output is:
MODEL MODIFICATION INDICES

Minimum M.I. value for printing the modification index    10.000

                     M.I.    E.P.C.   Std E.P.C.   StdYX E.P.C.

BY Statements

ACCES    BY Q10    17.281     0.423        0.237          0.237
CR_KNOW  BY Q6B    17.053     0.254        0.226          0.226
IC_COMM  BY Q3B    10.165    -0.155       -0.146         -0.146
IC_COMM  BY Q6B    26.019     0.268        0.253          0.253
IC_COMM  BY Q15    12.545     0.317        0.299          0.299
IC_TRUST BY Q6B    31.992     0.361        0.278          0.278
IC_TRUST BY Q15    13.594     0.405        0.312          0.312
MC_INTEG BY Q6B    18.600     0.312        0.256          0.256
RESP     BY Q6B    24.439     0.270        0.249          0.249
RESP     BY Q15    12.753     0.348        0.321          0.321

How can I apply this change in my model? For example, I understand that q10 would fit better with the factor ACCES. How can I do this? Do I move item q10 to the factor ACCES?
3-How can I use the estimates in order to get a better model? What does it mean if I have:
MODEL RESULTS

                 Estimates     S.E.  Est./S.E.

ACCES    BY
    Q1B              1.000    0.000      0.000
    Q2               1.235    0.101     12.262
    Q3B              1.329    0.118     11.280
    Q4B              1.200    0.117     10.230
    Q6A              1.317    0.117     11.227
    Q6B              1.534    0.147     10.406

For q1b, can I say that q1b is not significant? What can I do to change this?
Thank you very much in advance for your response.

Linda K. Muthen posted on Tuesday, January 17, 2006 - 8:10 am

Given that the modification indices suggest a lot of possible cross-loadings, I would suggest starting with an EFA. It may be that the data you have does not support the theory that you are testing. EFA will give you a better idea of whether the variables are behaving as you expect.

fati posted on Tuesday, January 17, 2006 - 9:19 am

Thank you for the response.
I have done an EFA, using promax-rotated loadings (>0.4) to compare the items in each factor. There is a little difference, but when I use a CFA for the changed model, my RMSEA is always >0.05 (RMSEA=0.10, CFI=0.936). What does that mean? My modification indices are:
MODEL MODIFICATION INDICES

Minimum M.I. value for printing the modification index    10.000

                        M.I.    E.P.C.   Std E.P.C.   StdYX E.P.C.

BY Statements

ACCES    BY Q10       16.162     0.223        0.223          0.223
CR_KNOW  BY Q6B       17.360     0.226        0.226          0.226
IC_RESP  BY Q6B       25.570     0.246        0.246          0.246
IC_RESP  BY Q14A      10.447     0.252        0.252          0.252
IC_RESP  BY Q15       17.956     0.352        0.352          0.352
IC_TRUST BY Q6B       31.391     0.277        0.277          0.277
MC_INTEG BY Q6B       18.837     0.257        0.257          0.257
OTHER    BY Q3B       10.089    -0.163       -0.163         -0.163
OTHER    BY Q6B       31.372     0.307        0.307          0.307
OTHER    BY Q14A      10.620     0.247        0.247          0.247
OTHER    BY Q15       31.692     0.427        0.427          0.427

ON/BY Statements

IC_RESP  ON CR_KNOW  /
CR_KNOW  BY IC_RESP  999.000     0.000        0.000          0.000
IC_RESP  ON IC_RESP  /
IC_RESP  BY IC_RESP  999.000     0.000        0.000          0.000
IC_RESP  ON IC_TRUST /
IC_TRUST BY IC_RESP  999.000     0.000        0.000          0.000
IC_RESP  ON MC_INTEG /
MC_INTEG BY IC_RESP  999.000     0.000        0.000          0.000
IC_RESP  ON OTHER    /
OTHER    BY IC_RESP  999.000     0.000        0.000          0.000

What can I do to get a good model?

thanks

Linda K. Muthen posted on Tuesday, January 17, 2006 - 1:53 pm

I'm afraid that I cannot tell you how to get your model to fit. I don't think you would have so many factor loading modification indices if your EFA clearly pointed to the factors in your CFA. I would revisit the EFA.

jad posted on Wednesday, January 18, 2006 - 12:16 pm

I am conducting a CFA with 40 categorical factor indicators (7 constructs) using the default estimator (WLSMV). I understand from the article by David B. Flora and Patrick J. Curran (2004) that robust WLS is robust to modest violations of underlying normality. I want to know how I can determine what counts as modest nonnormality. Do I use skewness and kurtosis? If yes, how can I do this in Mplus? Do I check normality for each item used in my analysis? I have 40 items with 350 observations.
thanks

Linda K. Muthen posted on Thursday, January 19, 2006 - 8:13 am

The normality assumption is for the u* variables underlying the observed u variables. It is difficult to test their normality. From a practical point of view, if you did test the normality of the u* variables and found that they were extremely non-normal, what would you do? You would probably just use a robust weighted least squares estimator. To read more about these issues, see the following paper:

Muthén, B. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen & J. S. Long (Eds.), Testing Structural Equation Models (pp. 205-243). Newbury Park, CA: Sage. (#45)

If you don't have access to the paper, you can request paper 45 from bmuthen@ucla.edu.

JAD posted on Monday, January 23, 2006 - 7:54 am

THANK YOU

Melinda Taylor posted on Monday, February 06, 2006 - 4:24 pm

I'm running a 5 factor CFA with 44 dichotomous items. Here is my input syntax:

VARIABLE:
NAMES ARE q1 - q44;
CATEGORICAL ARE q1 - q44;

ANALYSIS:
ESTIMATOR=WLSMV;

MODEL: f1 by q5* q8* q12* q21* q24* q43*;
f2 by q2* q6* q10* q11* q13* q15*;
f3 by q3* q4* q14* q16* q18* q19* q22* q31* q33* q35* q37* q40* q42*;
f4 by q9* q17* q20* q23* q25* q26* q27* q32* q34* q36* q39* q44*;
f5 by q1* q7* q28* q29* q30* q38* q41*;
f1 @ 1;
f2 @ 1;
f3 @ 1;
f4 @ 1;
f5 @ 1;

I get the following warning:
WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE.
PROBLEM INVOLVING VARIABLE F2.

I'm confused as to what this means.
1) Is this the covariance matrix of the item residuals? I don't think so but I need to clarify & wouldn't it be called theta?
2) Is this the covariance matrix of the 5 latent factors? If so, why is it referring to residuals when my factors are just freely correlated (i.e., nothing is predicting them so there should be no residual) & wouldn't this be called phi?

thanks a lot

Linda K. Muthen posted on Monday, February 06, 2006 - 5:28 pm

It is referring to the covariance matrix of the factors. Most likely f2 has a negative or zero variance. I would have to see the entire output to understand why residual is being printed. You can send it along with your license number to support@statmodel.com.
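
A minimal sketch of how the factor covariance matrix could be inspected, assuming the TECH4 option is added to the OUTPUT command of the input above. TECH4 prints the model-estimated means and covariances for the latent variables (a later post in this thread also reads the factor correlations from it), so a zero or negative factor variance should be visible there.

```mplus
OUTPUT:
  TECH4;   ! model-estimated means and covariances of the latent variables f1-f5
```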

Kim-Chi Trinh posted on Friday, March 24, 2006 - 1:45 pm

I ran two programs of CFA with the exact same MODEL command but with different USEVARIABLE command and generated two very different outputs.

The first statement consisted of three variables (USEVARIABLE) which generated an RMSEA score of 0.000. (Extremely perfect!!!)

The second statement consisted of four variables (USEVARIABLE) which generated an RMSEA score of 1.581. (Extremely bad!!!)

Below are my commands for this simple CFA:

DATA:
FILE = "y:\carmen\carmen2.txt";
FORMAT = free;

VARIABLE:
NAMES ARE
id fib211 fib212 fib213 fib214;

USEVARIABLES =
! Statement 1:
! fib211 fib212 fib214;
! Statement 1 produces an RMSEA score of 0.000 ;

! Statement 2:
fib211 fib212 fib213 fib214;
! Statement 2 produces an RMSEA score of 1.581;

CATEGORICAL =
! Statement 1:
! fib211 fib212 fib214;

! Statement 2:
fib211 fib212 fib213 fib214;

MISSING ARE ALL(99);

ANALYSIS:
TYPE = GENERAL MISSING H1 ;
ESTIMATOR = WLSMV;

MODEL:
f1 by fib211 fib212 fib214;

OUTPUT:
SAMPSTAT STANDARDIZED;

*********************

Will someone please tell me what is wrong here?

Thank you very much for your help.

Linda K. Muthen posted on Sunday, March 26, 2006 - 3:22 pm

When you have four variables on the USEVARIABLES list and only three variables in the MODEL command, all four variables are used in the analysis. The variable not mentioned in the model command is not correlated with any of the other variables. This could make the model not fit. You will find a message to this effect in the output. If you have further questions of this type, send them along with the input, data, output, and your license number to support@statmodel.com.
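
A minimal sketch of the corresponding fix, reusing the variable names from the input above: the USEVARIABLES and CATEGORICAL lists name only the variables that appear in the MODEL command, so no unmodeled variable enters the analysis.

```mplus
VARIABLE:
  NAMES ARE id fib211 fib212 fib213 fib214;
  ! List only the variables that appear in the MODEL command.
  USEVARIABLES ARE fib211 fib212 fib214;
  CATEGORICAL ARE fib211 fib212 fib214;
  MISSING ARE ALL (99);

MODEL:
  f1 BY fib211 fib212 fib214;
```

If fib213 is meant to be an indicator, the alternative is to keep it on both lists and add it to the BY statement. Note also that a one-factor model with three indicators is typically just-identified, so an RMSEA of 0.000 in that case reflects zero degrees of freedom rather than evidence of good fit.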

Thomas Rodebaugh posted on Thursday, April 13, 2006 - 7:07 am

i have a well-fitting model for an instrument that mixes categorical and continuous items. in the model, items load on three lower-level factors (two of which include categorical items), which then load on a single factor. the categorical items are all "yes/no."

a somewhat arbitrary scoring system exists for this measure. based on the above discussion, i get the sense that, at least for the factors that include categorical items, an iterative process must be gone through in order to create the factor scores. i want to be able to describe a scoring system that does not require use of mplus, so i have several questions:

1. there is some discussion above about creating a table that includes all possible answers and the resulting score on the factor. however, it's not particularly clear to me how this would be accomplished. can someone provide a little more detail?

2. am i correct that the iterative process is necessary because of the categorical items? that is, for the one lower-order factor that is defined by continuous variables, can i instead use the loadings on the factor to estimate factor scores?

3. when categorical and continuous variables are both included, are the continuous variables part of the iterative process, or is their contribution to the factor score more direct?

i went into this thinking that it should be easy to come up with something better than the arbitrary scoring method that now exists, but reading the posts above i'm realizing i don't really know much about the issues here!

thanks in advance for any thoughts on this question.

Bengt O. Muthen posted on Thursday, April 13, 2006 - 11:26 am

Whenever some observed factor indicators are categorical, the factor score estimates have to be obtained using an iterative optimization procedure. There is no simple approximation (short of just summing the items). The procedure is described in the technical appendix for Version 2 posted on the web site. So the estimation has to be done in Mplus.

A tabulation approach is not feasible.

Thomas Rodebaugh posted on Thursday, April 13, 2006 - 1:19 pm

thanks for the answer. just to clarify:

an above answer to a similar question seemed to suggest that, for example, if a factor is defined by 5 yes/no items, then a table could be constructed that would look something like (spacing isn't coming out exactly as i meant to, but i hope the idea is clear enough):

1 2 3 4 5 score
n n n n n 0
y n n n n 2
y y n n n 3
y n y n n 3.5 etc...

where all possible combinations are expressed and the corresponding factor score is indicated. i am not sure this ends up being at all feasible in my data set (one of the factors has 7 yes/no items and one item with responses from 1-6--so that would be a *lot* of rows in a table), but is it *theoretically* possible to create such a table? (if not, i'm giving up; if so, it would be nice to have a pointer or two.)

thanks!

tom

Bengt O. Muthen posted on Thursday, April 13, 2006 - 2:02 pm

Each distinct response pattern (row in your table) does give rise to a distinct estimated factor score value. To fill in the value for each row, you would need estimates for the model parameters and then compute the estimated factor scores using the iterative optimization technique. If this has been computed, the table can be created and then applied to any other individuals with these values.
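
A hedged sketch of how the scores for such a table might be obtained, assuming the model has already been estimated in Mplus. The SAVEDATA command below writes each individual's estimated factor scores to a file; the file name is hypothetical.

```mplus
SAVEDATA:
  FILE IS fscores.dat;   ! hypothetical output file name
  SAVE = FSCORES;        ! saves the estimated factor scores with the data
```

The saved file could then be collapsed to one row per distinct response pattern outside Mplus. For the factor described above (7 yes/no items and one item with 6 response categories) there are 2^7 x 6 = 768 possible patterns, and only patterns that actually occur in the sample would appear in the saved file.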

Thomas Rodebaugh posted on Friday, April 14, 2006 - 10:06 am

thanks for your answer, that clarifies the task.

i'll go bang my head against that particular wall for a while, but i may be back with further questions.

thanks again!

yang posted on Friday, April 21, 2006 - 6:58 am

Is it permitted for a factor to have continuous and categorical (binary) indicators at the same time?

Linda K. Muthen posted on Friday, April 21, 2006 - 7:47 am

A factor may have a combination of continuous and categorical factor indicators.
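
A minimal sketch of such a specification, assuming hypothetical variable names (y1-y3 continuous, u1-u3 binary). Only the categorical indicators are named on the CATEGORICAL list; variables not listed there are treated as continuous.

```mplus
VARIABLE:
  NAMES ARE y1-y3 u1-u3;    ! hypothetical variable names
  CATEGORICAL ARE u1-u3;    ! binary indicators; y1-y3 stay continuous
MODEL:
  f BY y1-y3 u1-u3;         ! one factor measured by both kinds of indicator
```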

Zhongmiao Wang posted on Thursday, May 11, 2006 - 1:28 pm

Am I understanding correctly? (1) The scale factor in categorical data CFA is the latent response variable associated with each observed categorical factor indicator? (2) If we want to compare two models, one nested in the other, we should use the uncorrected chi-square that results from a nonrobust estimation method?
Thanks Linda and Bengt!

Linda K. Muthen posted on Thursday, May 11, 2006 - 1:43 pm

The scale factor is one divided by the standard deviation of the latent response variable.

For WLS, you can use a standard chi-square difference. For WLSM, you need to use the scaling correction factor provided in the output. For WLSMV, you need to use the DIFFTEST option.
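
For the WLSMV case, the DIFFTEST option is used in two separate runs. The sketch below assumes a hypothetical file name (deriv.dat) and shows only the commands that differ between the two input files; the DATA, VARIABLE, and MODEL commands are omitted.

```mplus
! Run 1: the less restrictive (H1) model. Save the derivatives.
ANALYSIS:
  ESTIMATOR = WLSMV;
SAVEDATA:
  DIFFTEST IS deriv.dat;   ! hypothetical file name

! Run 2 (a separate input file): the nested, more restrictive (H0) model.
! The ANALYSIS command points at the file saved in run 1.
ANALYSIS:
  ESTIMATOR = WLSMV;
  DIFFTEST IS deriv.dat;
```

The chi-square difference test is then reported in the output of the second run.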

Silvia Sörensen posted on Friday, May 12, 2006 - 11:25 am

Hi Linda,
I'm running a CFA with 5 factors and 29 items. As have others above, I received the message:
THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE.
PROBLEM INVOLVING VARIABLE avoid.

I checked TECH4, and the correlation with another factor is larger than 1 (same in the model results).

What is the fix for this? Theoretically I would expect these factors to be negatively correlated, so I hesitate to fix it to 0.
Thanks!

Linda K. Muthen posted on Friday, May 12, 2006 - 11:41 am

This means that the two factors with a correlation greater than one are statistically indistinguishable from each other. You either need to rethink them or use only one of them. Did you ever do an EFA with the set of items to see if they are loading as expected? It may be that your items are not valid measures of what they are meant to measure.

yang posted on Monday, May 15, 2006 - 12:45 pm

Will confirmatory factor analysis allow one indicator to be loaded on two factors? Thanks.

Linda K. Muthen posted on Monday, May 15, 2006 - 2:11 pm

Yes, cross loadings are possible in CFA.
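
A minimal sketch of a cross loading, assuming hypothetical indicator names y1-y6. The indicator is simply listed in more than one BY statement.

```mplus
VARIABLE:
  NAMES ARE y1-y6;       ! hypothetical indicator names
MODEL:
  f1 BY y1 y2 y3 y4;     ! y4 loads on f1
  f2 BY y4 y5 y6;        ! and also on f2 (the cross loading)
```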

Silvia Sörensen posted on Wednesday, May 17, 2006 - 8:21 am

Response to your May 12 posting:
That's a good thought, but it doesn't seem to fit.
I did do an EFA (both 4 and 5 factor solution) and these two factors are the ones that have the loadings most consistent with the theory (the others have some problems, but these two are very robust and they are generally not correlated more than -.30).

The other 3 factors, which don't have the correlation greater than 1, are not as distinct as I would like. Could this be affecting the parameters of the more robust factors?

Thanks

Linda K. Muthen posted on Wednesday, May 17, 2006 - 9:14 am

I don't think so. I would do an EFA to see where the items load. If they don't load according to theory, I would think about the validity of the items.

Silvia Sörensen posted on Wednesday, May 31, 2006 - 8:48 am

I am puzzled by a discrepancy between EFA and CFA
In the EFA the factor intercorrelations are all within a reasonable range, whereas in the CFA Factors 1 and 5 end up with a correlation >1.

I know the solution below is not very good, but this ALSO happens when I eliminate the items that are cross-correlated or don't load according to the theory and when I assign items in the CFA to the factors they load on in the EFA.

Any other thoughts on this?

EFA result

EXPLORATORY ANALYSIS WITH 5 FACTOR(S) :

CHI-SQUARE VALUE 522.915
DEGREES OF FREEDOM 271
PROBABILITY VALUE 0.0000

RMSEA (ROOT MEAN SQUARE ERROR OF APPROXIMATION) :
ESTIMATE IS 0.052

ROOT MEAN SQUARE RESIDUAL IS 0.0373

PROMAX ROTATED LOADINGS
        1      2      3      4      5
AWARE1 0.770 -0.233 -0.056 0.140 -0.006
AWARE2 0.610 0.077 -0.070 0.057 -0.031
AWARE3 0.662 0.059 -0.019 0.098 -0.013
AWARE4 0.513 0.400 0.121 -0.197 0.074
AWARE5 0.584 0.280 0.117 -0.248 0.015
AWARE6 -0.021 -0.262 -0.074 0.111 0.409
AWARE7 -0.030 0.021 0.009 0.057 0.806
AWARE8 0.055 0.145 0.032 -0.125 0.860
AWARE9 0.277 0.197 -0.022 0.284 -0.007
AWARE10 -0.005 -0.100 0.012 0.018 0.578
GATHER1 0.062 0.617 -0.129 0.244 -0.052
GATHER2 0.105 0.587 0.012 0.176 0.026
GATHER3 0.000 0.530 -0.068 0.295 -0.044
GATHER4 0.141 0.107 0.054 0.376 0.010
GATHER5 -0.002 0.231 0.230 0.139 -0.075
GATHER6 0.116 0.678 0.133 -0.036 -0.009
GATHER7 0.050 0.605 0.071 0.264 0.065
PREFER1 -0.081 0.614 0.016 0.441 0.088
PREFER2 0.054 0.177 0.138 0.495 0.060
PREFER3 -0.048 -0.019 0.599 0.395 -0.001
PREFER4 0.093 0.113 0.586 0.076 0.048
PREFER5 -0.155 0.371 0.284 0.387 0.004
PREFER6 0.037 -0.293 0.394 0.348 -0.029
PLANS1 -0.027 0.202 0.056 0.674 -0.063
PLANS2 -0.035 0.113 -0.047 0.597 0.009
PLANS3 -0.059 0.498 -0.158 0.174 0.050
PLANS4 0.022 0.133 -0.010 0.263 -0.050
PLANS5 0.021 -0.038 0.060 0.512 -0.059
PLANS6 -0.011 0.081 0.061 0.290 -0.028

PROMAX FACTOR CORRELATIONS
    1      2      3      4      5
1 1.000
2 0.424 1.000
3 0.249 0.337 1.000
4 0.234 0.295 0.258 1.000
5 0.036 -0.268 -0.036 -0.088 1.000

CFA result

Chi-Square Test of Model Fit

Value 4942.984
Degrees of Freedom 367
P-Value 0.0000

Chi-Square Test of Model Fit for the Baseline Model

Value 22500.021
Degrees of Freedom 406
P-Value 0.0000

CFI/TLI
CFI 0.793
TLI 0.771

Loglikelihood

H0 Value -41460.522
H1 Value -38989.030

Information Criteria

Number of Free Parameters 68
Akaike (AIC) 83057.044
Bayesian (BIC) 83329.977
Sample-Size Adjusted BIC 83114.201
(n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)

Estimate 0.175
90 Percent C.I. 0.170 0.179
Probability RMSEA <= .05 0.000

SRMR (Standardized Root Mean Square Residual)

Value 0.114

MODEL RESULTS

Estimates S.E. Est./S.E. Std StdYX

AWARE BY
AWARE1 1.000 0.000 0.000 9.144 0.727
AWARE2 1.169 0.082 14.332 10.691 0.686
AWARE3 1.343 0.069 19.605 12.280 0.918
AWARE4 1.485 0.080 18.542 13.575 0.872
AWARE5 1.463 0.091 16.073 13.381 0.764
AWARE9 1.611 0.076 21.202 14.734 0.985

AVOID BY
AWARE6 1.000 0.000 0.000 11.760 0.668
AWARE7 1.128 0.072 15.678 13.271 0.818
AWARE8 1.059 0.063 16.749 12.459 0.884
AWARE10 1.241 0.068 18.204 14.596 0.979

GATHER BY
GATHER1 1.000 0.000 0.000 10.569 0.625
GATHER2 1.605 0.104 15.380 16.963 0.934
GATHER3 1.495 0.102 14.631 15.801 0.869
GATHER4 1.643 0.121 13.534 17.364 0.780
GATHER5 1.622 0.104 15.566 17.141 0.951
GATHER6 1.626 0.107 15.234 17.182 0.921
GATHER7 1.631 0.105 15.552 17.242 0.950

DECIDE BY
PREFER1 1.000 0.000 0.000 18.141 0.999
PREFER2 1.004 0.003 303.259 18.210 0.999
PREFER3 0.926 0.013 72.939 16.802 0.965
PREFER4 0.989 0.018 53.961 17.934 0.937
PREFER5 0.928 0.018 50.476 16.834 0.929
PREFER6 0.452 0.034 13.314 8.198 0.550

CONCRETE BY
PLANS1 1.000 0.000 0.000 14.628 0.866
PLANS2 1.013 0.050 20.421 14.813 0.788
PLANS3 0.980 0.039 25.265 14.338 0.883
PLANS4 0.952 0.042 22.604 13.921 0.834
PLANS5 0.859 0.047 18.399 12.571 0.739
PLANS6 0.934 0.030 30.998 13.667 0.968

AVOID WITH
AWARE 109.749 10.896 10.072 1.021 1.021

GATHER WITH
AWARE 61.594 7.458 8.258 0.637 0.637
AVOID 81.230 10.001 8.122 0.654 0.654

DECIDE WITH
AWARE 106.331 10.965 9.697 0.641 0.641
AVOID 141.003 14.838 9.503 0.661 0.661
GATHER 188.091 17.704 10.624 0.981 0.981

CONCRETE WITH
AWARE 115.983 10.850 10.690 0.867 0.867
AVOID 149.943 14.608 10.264 0.872 0.872
GATHER 104.788 11.798 8.882 0.678 0.678
DECIDE 167.761 16.404 10.227 0.632 0.632

Variances
AWARE 83.616 9.716 8.606 1.000 1.000
AVOID 138.302 17.817 7.762 1.000 1.000
GATHER 111.711 15.937 7.009 1.000 1.000
DECIDE 329.089 23.066 14.267 1.000 1.000
CONCRETE 213.979 19.474 10.988 1.000 1.000

Residual Variances
AWARE1 74.442 5.142 14.477 74.442 0.471
AWARE2 128.728 8.912 14.445 128.728 0.530
AWARE3 28.213 1.924 14.667 28.213 0.158
AWARE4 57.915 3.952 14.654 57.915 0.239
AWARE5 127.639 8.796 14.512 127.639 0.416
AWARE6 171.665 11.655 14.729 171.665 0.554
AWARE7 87.099 5.784 15.058 87.099 0.331
AWARE8 43.288 2.868 15.092 43.288 0.218
AWARE9 6.641 0.581 11.431 6.641 0.030
AWARE10 9.108 0.759 11.996 9.108 0.041
GATHER1 174.124 12.288 14.170 174.124 0.609
GATHER2 41.799 3.255 12.842 41.799 0.127
GATHER3 81.028 5.930 13.665 81.028 0.245
GATHER4 193.663 13.851 13.982 193.663 0.391
GATHER5 30.818 2.518 12.241 30.818 0.095
GATHER6 52.600 4.008 13.123 52.600 0.151
GATHER7 32.027 2.604 12.299 32.027 0.097
PREFER1 0.759 0.172 4.404 0.759 0.002
PREFER2 0.677 0.172 3.940 0.677 0.002
PREFER3 21.023 1.491 14.100 21.023 0.069
PREFER4 44.379 3.127 14.193 44.379 0.121
PREFER5 44.784 3.152 14.207 44.784 0.136
PREFER6 154.765 10.827 14.294 154.765 0.697
PLANS1 71.324 5.564 12.819 71.324 0.250
PLANS2 134.137 9.927 13.512 134.137 0.379
PLANS3 57.917 4.622 12.530 57.917 0.220
PLANS4 84.616 6.418 13.184 84.616 0.304
PLANS5 131.413 9.575 13.725 131.413 0.454
PLANS6 12.449 1.765 7.053 12.449 0.062

Linda K. Muthen posted on Wednesday, May 31, 2006 - 11:35 am

You are comparing an EFA to a simple structure CFA. These are two different models and therefore would not necessarily end up with the same results. Also, please don't paste large portions of output on the discussion board as it takes too much room. If it is necessary to show this much output, please send the question to support@statmodel.com along with your license number.

Lois Downey posted on Thursday, June 29, 2006 - 10:24 am

The following statement appears in B. Muthen's post of 10/13/02, 10:44 AM: "With many good indicators (high loadings), ... the estimated factor scores tend to behave more and more like the true scores."

I computed factor scores for a single-factor complex missing model with 10 dichotomous indicators. 801 patients were clustered under 92 physicians. Standardized loadings ranged from 0.934 to 0.858. Am I correct in interpreting this to constitute "many good indicators"?

Estimated mean and variance for the latent variable were 0.00 and 0.79, respectively, whereas the mean and variance for the factor scores were -0.07 and 0.44, respectively. Would one expect greater correspondence between the two distributions than this, given the number of indicators and the size of the loadings?

Bengt O. Muthen posted on Sunday, July 02, 2006 - 5:10 am

With binary indicators you need closer to 20 indicators for the factor scores to behave well.

Nyankomo Wambura Marwa posted on Tuesday, July 18, 2006 - 12:58 pm

Dear prof.
I am trying to fit a latent class analysis model to four manifest variables, each with two levels, and one covariate (age). Intuitively, since I am working on diagnostic tests, I rarely expect the number of classes to go beyond 3; two would be better. Unfortunately, the model with 3 classes seems to fit better than the one with two, and perhaps the one with four would still be significant. One problem I noticed is a high correlation between the manifest variables, about 0.95.

Could you please advise me on how to handle the effects of correlation in latent class modeling?

Second, I saw a correlation of about 0.95. Can you kindly give me an idea of how it is obtained? To my knowledge, correlation is for continuous variables, and we commonly use the odds ratio as a measure of association for categorical variables.

Find the attached Mplus codes for my analysis.

Title:
Summer Latent Class Analysis.
Data:
File is C:\Documents and Settings\Maruwa1\Desktop\mps2.txt;
Variable:
names = id age viap paplsilp colpohgrp hpv;
usevariables = id age viap paplsilp colpohgrp hpv ;
categorical = viap paplsilp colpohgrp hpv;
classes = c(3);
missing=all(9999);
Analysis:
Type=missing mixture ;
MODEL:
%OVERALL%
C#1 ON AGE;
Plot:
type is plot3;
series is viap(1) paplsilp(2) colpohgrp(3) hpv(4);
Savedata:
file is mps_save.txt ;
save is cprob;
format is free;
Output:
tech11 tech14;

Linda K. Muthen posted on Tuesday, July 18, 2006 - 4:04 pm

When I look at your input, I wonder if you want to include the ID variable in the analysis. By including it on the USEVARIABLES list, it will be used as a latent class indicator along with the four categorical variables. You would need to send your output and license number to support@statmodel.com for me to comment on the .95 correlation. I can't see where that would come from.
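
A hedged sketch of the VARIABLE command with the ID variable taken off the USEVARIABLES list, reusing the names from the input above. The IDVARIABLE line is an optional addition not in the original input; it is assumed here only so that the ID is carried into the saved class-probability file.

```mplus
VARIABLE:
  NAMES = id age viap paplsilp colpohgrp hpv;
  ! id is no longer listed, so it is not used as a latent class indicator.
  USEVARIABLES = age viap paplsilp colpohgrp hpv;
  CATEGORICAL = viap paplsilp colpohgrp hpv;
  CLASSES = c(3);
  MISSING = ALL (9999);
  IDVARIABLE = id;   ! optional assumption: keeps id in the SAVEDATA file
```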

SC posted on Tuesday, November 21, 2006 - 3:11 pm

TECH4 shows model estimated means. For example, my latent factor "F1" has three indicator variables X1, X2, and X3, all on a 7-point Likert scale. The sample statistics tell me that the means for the indicators X1, X2, and X3 are 5.105, 4.643, and 4.827. However, the mean for the latent factor F1 is only "0.123".

===========================================================
Estimates S.E. Est./S.E. Std StdYX

F1 BY
X1 1.000 0.000 0.000 1.155 0.760
X2 1.274 0.091 13.949 1.472 0.917
X3 1.215 0.102 11.893 1.403 0.888

SAMPLE STATISTICS
X1 X2 X3
_______ ________ ________
5.105 4.643 4.827

TECHNICAL 4 OUTPUT

ESTIMATED MEANS FOR THE LATENT VARIABLES
F1
________
0.123

ESTIMATED COVARIANCE MATRIX FOR THE LATENT VARIABLES
F1
______
F1 1.333
===========================================================

How is the mean "0.123" for the latent factor calculated? Why is it so small when compared to the means of the respective indicator variables? In journals, we need to report the means and std.devs, so is this the value we report?

Bengt O. Muthen posted on Tuesday, November 21, 2006 - 4:51 pm

Remember that the mean of an indicator y is:

Mean(y) = intercept + loading*Mean(factor)
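
As a worked illustration of this relation, take the numbers from the post above and assume (this is an assumption, not something shown in the pasted output) that the model reproduces the sample means of the indicators exactly. Writing the intercept as \nu and the loading as \lambda (notation chosen here for illustration), the implied intercepts for X1 and X2 would be:

```latex
\begin{align*}
\text{Mean}(y) &= \nu + \lambda \,\text{Mean}(f) \\
\nu_{X1} &= 5.105 - 1.000 \times 0.123 = 4.982 \\
\nu_{X2} &= 4.643 - 1.274 \times 0.123 \approx 4.486
\end{align*}
```

Under that assumption, most of each indicator mean is carried by its intercept, which is why the factor mean can be small while the indicator means are around 5.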

A. Dyrlund posted on Wednesday, January 31, 2007 - 9:21 am

If I have a simple CFA model where:
VARIABLE: NAMES ARE psq1-psq15;
MODEL: reward BY psq1, psq6, psq11;
coercive BY psq2, psq7, psq12;
referent BY psq3, psq8, psq13;
legit BY psq4, psq9, psq14;
expert BY psq5, psq10, psq15;

What is the syntax for placing this CFA model in EFA in order to obtain the promax rotation results and eigenvalues?

Linda K. Muthen posted on Wednesday, January 31, 2007 - 11:12 am

If EFA, all you can specify is the number of factors not which variables load on which factors. All variables load on all factors. You would say TYPE= EFA 5 5; to obtain the five-factor solution.
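
A minimal sketch of the corresponding EFA input, reusing the variable names from the question. The data file name is hypothetical, and no MODEL command is given because an EFA does not assign items to factors.

```mplus
DATA:
  FILE IS psq.dat;        ! hypothetical data file name
VARIABLE:
  NAMES ARE psq1-psq15;
ANALYSIS:
  TYPE = EFA 5 5;         ! minimum and maximum number of factors: 5 to 5
```

Which rotations are reported, and how a rotation is requested, depends on the Mplus version, so the rotated output should be checked against the user's guide for the installed version.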

A. Dyrlund posted on Wednesday, January 31, 2007 - 11:20 am

Then is there any way to obtain eigenvalues and an oblique rotation while specifying specific items loading on specific factors?

Bengt O. Muthen posted on Wednesday, January 31, 2007 - 11:33 am

You can get an oblique rotation if you do "EFA within a CFA" - see our course material, "Day 1". But Mplus does not give eigenvalues except for EFAs.

A. Dyrlund posted on Wednesday, January 31, 2007 - 12:25 pm

I have purchased the manual already. Is the course material separate, and does it also have to be purchased? I can't find a link on your site to the Day 1 course material.

Linda K. Muthen posted on Wednesday, January 31, 2007 - 2:30 pm

Note that the user's guide is available in pdf form on the website. See online ordering for the course handouts. What are you trying to do?

jenny yu posted on Saturday, February 10, 2007 - 8:26 pm

Dear Drs. Muthen,

I have some questions when I built my MIMIC model with DIF effects. I would appreciate if you can give me some clues.

1) I am implementing an iterative model building process by dropping a variable each time and using DIFFTEST to test the significance of nested models. Mplus Version 3 requires WLSMV as the estimator for DIFFTEST; however, my model fit is better when I use WLS. I read the earlier discussion here and noticed that DIFFTEST could be run with WLS in an earlier version. So I am wondering whether there is any way to run DIFFTEST with the WLS estimator.

2) In the earlier discussion, I also noticed a strategy of doing DIFFTEST with WLS and using WLSMV for the final model. Similarly, can I do DIFFTEST with WLSMV to arrive at the final model and then use WLS to run the final one? In my case, the model fit (CFI and RMSEA) is better with WLS.

3) With RESIDUAL in the OUTPUT statement, I get the covariance/residual correlation/correlation matrix. Is there any way to output p-values related to this matrix?

Thank you very much for your time and help in advance.

Linda K. Muthen posted on Sunday, February 11, 2007 - 1:48 pm

I am not sure choosing an estimator because it gives better fit is justifiable but I will let you make that decision.

DIFFTEST is used only with WLSMV. With WLS, the difference in chi-square and degrees of freedom for the two nested models is used.

I am confused by two things that you say. One is that you are using difference testing for a MIMIC model. In a MIMIC model, DIF of intercepts is looked at by regressing the items on one or more covariates. I am also confused about what you mean by dropping a variable. Nested models should have the same set of observed variables.

Mplus does not give p-values for residuals.
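
For the WLS case mentioned above, the difference test can be written out as follows. This is the standard chi-square difference for nested models, with H_0 denoting the more restricted model and H_1 the less restricted one (notation chosen here for illustration):

```latex
\Delta\chi^2 = \chi^2_{H_0} - \chi^2_{H_1}, \qquad
\Delta df = df_{H_0} - df_{H_1}
```

The value of \Delta\chi^2 is referred to a chi-square distribution with \Delta df degrees of freedom. This simple subtraction applies to WLS only; for WLSMV the DIFFTEST option is needed instead.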

jenny yu posted on Monday, February 12, 2007 - 7:41 am

I apologize for the confusion.

I am trying a model building process, which is why I was dropping variables. I probably misunderstood the definition of 'nested model'. Isn't it a full model vs. a restricted model (with fewer variables than the full model)?

I think my question is this: given a bunch of variables, the DIF effect of which variable should be added to the model so as to achieve a parsimonious model (resulting in better model fit), instead of looking at DIF with all variables?

When we look at the significance of the coefficients of a variable for all items (indicators), DIF effects on some items were significant and some were not. How can we decide whether this variable should be kept in the model? What is a valid and doable strategy for selecting variables?

Also, is there any function in Mplus similar to a macro in SAS that can be used when we run something iterative?

Linda K. Muthen posted on Monday, February 12, 2007 - 9:18 am

Nesting in reference to chi-square difference testing refers to models using the same set of observed variables where restrictions are placed on a more general model. Following are some papers you might find useful:

Gallo, J.J., Anthony, J. & Muthén, B. (1994). Age differences in the symptoms of depression: A latent trait analysis. Journals of Gerontology: Psychological Sciences, 49, 251-264. (#52)

Muthén, B. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54, 557-585. (#24)

Muthén, B., Tam, T., Muthén, L., Stolzenberg, R. M. & Hollis, M. (1993). Latent variable modeling in the LISCOMP framework: Measurement of attitudes toward career choice. In D. Krebs & P. Schmidt (Eds.), New Directions in Attitude Measurement, Festschrift for Karl Schuessler (pp. 277-290). Berlin: Walter de Gruyter. (#46)

If by the macro in SAS you are asking about running several analyses at the same time, you can use a DOS batch file for this purpose.

jenny yu posted on Monday, February 12, 2007 - 10:29 am

Thank you very much for your answers and the references. They are helpful. In addition, could you give me some instructions on my questions about building parsimonious model of DIF effect within MIMIC (the 2nd and 3rd paragraph in my previous post)?

Linda K. Muthen posted on Monday, February 12, 2007 - 11:16 am

This is discussed in the papers I suggested. Models to study measurement invariance are also described in Chapter 13 at the end of the discussion of multiple group analysis.

jenny yu posted on Sunday, February 18, 2007 - 3:31 pm

Thank you for the explanation. I would still like to ask --

Can I do DIFFTEST with WLSMV estimator to achieve the final model and then use WLS estimator to run the final model to get model coefficients and other statistics?

Thanks.

Linda K. Muthen posted on Saturday, February 24, 2007 - 6:57 am

You should stay with the same estimator. You should either use WLSMV for the entire analysis or use WLS for the entire analysis.

Fadia Alkhalil posted on Sunday, March 04, 2007 - 5:51 am

I am looking for statistical help in doing SEM analyses using LISREL (or any other software for SEM analyses).
I already did the EFA (using SPSS) and I only need to run the CFA. I have all the conditions (correlations...) that need to be entered in order to get the right model fit; however, I don't know how to operate the LISREL software very well.
I am confused by the LISREL language!!!
Thanks

Linda K. Muthen posted on Sunday, March 04, 2007 - 9:43 am

If you find the LISREL language too difficult, try Mplus. If you want to use LISREL, post your request for help on SEMNET.

Fadia Alkhalil posted on Sunday, March 04, 2007 - 12:19 pm

How can I post on SEMNET?
Thank you.

Linda K. Muthen posted on Monday, March 05, 2007 - 8:46 am

You can sign on to SEMNET by going to the following link:

http://bama.ua.edu/archives/semnet.html

Fadia Alkhalil posted on Friday, March 09, 2007 - 6:12 am

I tried to subscribe to SEMNET at the link you sent me, but it keeps telling me that there is a cookie problem and I can't get rid of it. It has been taking me in circles for the last three days.
HELP!!

I need to know whether Cronbach's alpha can be computed by LISREL or whether it needs to be calculated separately. If so, how?
Thanks

Linda K. Muthen posted on Friday, March 09, 2007 - 6:33 am

I don't know how to help you with the SEMNET problem.

You should contact LISREL support to ask your question if you are using LISREL.

Sophie van der SLuis posted on Tuesday, May 22, 2007 - 8:59 am

I try to fit the following model (CFA):

DATA:
FILE=girls.dat;
Format=free;
TYPE=individual;

VARIABLE:
NAMES= famid y1 y2 y3 y4 Y5 y6 y7
y8 y9 y10 y11 y12;
USEVARIABLES= y1 y2 y3 y4 Y5 y6 y7
y8 y9 y10 y11 y12;
MISSING= ALL (-999999);
CLUSTER = famid ;

ANALYSIS:
TYPE= MEANSTRUCTURE COMPLEX MISSING h1;

MODEL:
f1 by y1* y2 y3 y4 y5 y6;
f2 BY y7* y8 y9 y4;
f3 by y10* y2 y4 y6 y11 y12;

F1@1 F2@1 F3@1;
[F1@0 F2@0 F3@0];
![y1 y2 y3 y4 Y5 y6 y7 y8 y9 y10 y11y 12];

OUTPUT: TECH1 STANDARDIZED RESIDUAL MODINDICES(0);

however, when I try to estimate intercepts, or when I start y12 at a certain value, the program gives a 'parsing error':

*** ERROR
Error in parsing line:
"[OA VOC SIM AR COM PC PA BD SS CO DIG INF]"

or

*** ERROR
Error in parsing line:
"VOC* SIM AR COM dig INF*4"

This error seems to be linked to variable y12. I checked the data; y12 is not strangely distributed, no uncoded missing values, not correlated 1 with other variables..
any suggestions?

best
sophie

Linda K. Muthen posted on Tuesday, May 22, 2007 - 9:51 am

You would need to send your input, data, output, and license number to support@statmodel.com. I cannot see what the problem is from what you have included.

Alex posted on Tuesday, June 26, 2007 - 8:28 am

Greetings,

Todd Little (Little et al., 1999, SEM) recommends putting equality constraints on factor loadings when estimating a model with only two indicators per latent variables (fixing both loadings to be equal). In this case, I use continuous indicators.

(1) How do we do that in MPlus ?
(2) I tried doing it by fixing both loadings to 1. In this case, both estimates are fixed to 1 with no S.E. (0), the Std estimates are equal to one another, but the StdYX estimates differ. If I wish to report standardized loadings, what should I do (if that is the right way to do it; see question 1)?

Linda K. Muthen posted on Tuesday, June 26, 2007 - 8:36 am

The metric of the factor must be set by fixing the factor variance to one, with both factor loadings free and held equal, as follows:

f1 BY y1* y2 (1);
f1@1;

Nina Zuna posted on Wednesday, July 11, 2007 - 2:05 pm

Dear Linda,

I have a very elementary question. I would like to set a correlation to 1 between two factors. When I used the WITH command and @1, I received this error: NO CONVERGENCE. SERIOUS PROBLEMS IN ITERATIONS. ESTIMATED COVARIANCE MATRIX NON-INVERTIBLE. CHECK YOUR STARTING VALUES.
My goal is to do a chi sq diff test between a model in which the correlation between 2 factors is set to 1 vs a model in which the correlation is freely estimated.

Below is the model:
ChildFoc by q2_1 q2_2 q2_3 q2_4 q2_5 q2_6 q2_7 q2_8 q2_9;
FamFoc by q2_10 q2_11 q2_12 q2_13 q2_14 q2_15 q2_16 q2_17 q2_18;
ChildFoc WITH FamFoc@1;

Thank you for your kind assistance,

Nina

Linda K. Muthen posted on Wednesday, July 11, 2007 - 5:05 pm

I think a better way to do this is to use MODEL TEST. See the user's guide under MODEL CONSTRAINT to see how to label the parameters. And then see MODEL TEST which performs a Wald test.

Nina Zuna posted on Wednesday, July 11, 2007 - 7:27 pm

Thank you, Linda; your speedy response was very much appreciated. I reviewed pp. 484-488 in the UG as advised. My new syntax is:
ChildFoc by q2_1* q2_2 q2_3 q2_4 q2_5 q2_6 q2_7 q2_8 q2_9;
FamFoc by q2_10* q2_11 q2_12 q2_13 q2_14 q2_15 q2_16 q2_17 q2_18;
ChildFoc@1; FamFoc@1;
ChildFoc WITH FamFoc (p1);
MODEL CONSTRAINT:
p1=1;
MODEL TEST:
p1=.93;
The correlation on the output is now 1.0; however, I want to make sure I interpreted your suggestion and the UG correctly: it appears that I cannot conduct a model test (Wald test) for p1 since I constrained p1, right? If I am constraining it to 1, then I probably can't test it for a different value?
Incidentally, I noticed that the constrained model is the same as having all 18 indicators load on 1 factor (which makes sense, since I am saying the two factors correlate perfectly). However, I am back to square one: given these two scenarios, what is the best way to test between the 2 models, i.e., Model 1 that allows the two factors to correlate freely vs. a model in which the correlation is fixed to 1, or a model with 1 factor identified by 18 observed indicators? Is my only option to run the two models separately and conduct the chi-square difference test by hand, since I got an error message for the Wald test (WALD'S TEST COULD NOT BE COMPUTED BECAUSE OF A SINGULAR COVARIANCE MATRIX)?

Linda K. Muthen posted on Thursday, July 12, 2007 - 8:54 am

Don't use MODEL CONSTRAINT. Instead use:

MODEL TEST:
p1 = 1;

Nina Zuna posted on Thursday, July 12, 2007 - 9:20 am

Dear Linda,

Great...I have the Wald's test at the end of my output!! If I may bother you to ask one final question---I want to ensure I am interpreting it correctly. It is significant.
I removed the constraint so now the correlation between my two factors is freely estimated on my output (completely stdzd r=.93). I added the model test only as advised. Am I correct to assume the Wald test (p1=1) tested the significant difference between a correlation of .93 and 1.0?

Thank you in advance for your final thoughts,
Nina

Linda K. Muthen posted on Thursday, July 12, 2007 - 9:50 am

Yes, it tested the difference between .93 and 1.0. It is the same as the square of the z test:

(.93 - 1) / std. error of .93
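
The relationship described above (a one-parameter Wald statistic equals the squared z statistic) can be sketched in a few lines of Python. This is only an illustration: the estimate of .93 comes from the thread, but the standard error of 0.02 is a hypothetical value, and the function name is made up for this example.

```python
import math

def wald_test_single(estimate, hypothesized, std_error):
    """Return (z, wald, p_value) for H0: parameter == hypothesized."""
    z = (estimate - hypothesized) / std_error
    wald = z ** 2  # chi-square distributed with 1 degree of freedom under H0
    # With 1 df, P(chi2 > wald) equals the two-sided normal tail area of |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, wald, p_value

# 0.93 is the estimate quoted in the thread; 0.02 is a HYPOTHETICAL standard error.
z, wald, p = wald_test_single(0.93, 1.0, 0.02)
print(f"z = {z:.2f}, Wald = {wald:.2f}, p = {p:.5f}")
```

In practice the standard error would be read from the Mplus output for the labeled parameter rather than assumed.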

Nina Zuna posted on Thursday, July 12, 2007 - 12:56 pm

Thank you so much!! The assistance you provide and the speed with which you respond to queries is phenomenal.

With much gratitude,

Nina

Masha Ivanova posted on Monday, August 06, 2007 - 8:34 am

Greetings, Drs. Muthen,

Pardon such a basic question, but sometimes it is helpful to check one’s understanding of the basics.

Would you be so kind as to clarify whether the interpretation of the Covariances and Variances sections of the CFA output would differ if factor variances WERE vs. WERE NOT set @1? Let’s say we have a simple 3-factor model, with no freed parameters (i.e., no freed factor loadings or covariances). Let’s also say that for Model 1, f1@1 f2@1 f3@1 and for Model 2 there are no such specifications. Obviously, the StdYX values would = 1.0 in the Variances section for Model 1. If your time permits, can you please clarify everything else (i.e., Estimates, S.E., Est./S.E., Std, StdYX)?

Thank you,
Masha.

Brandi Jones posted on Monday, August 06, 2007 - 8:11 pm

Greetings~
I apologize in advance, but I am relatively new to CFA and Mplus, so I had some rather basic questions.

1) I wanted to check my understanding of the use and computation of factor scores. I'm gathering that factor scores are individuals' predicted scores on a factor created by multiplying their score on each predictor by that predictor's factor loading and then summing these values. Is this accurate? And then, is it comparable to a composite of those indicators? Do factor scores mean the same thing for categorical variables?

2) I was trying to find the matrix to use to calculate the composite reliability based on an equation I was provided in a SEM class, and I believe I can request the matrix for categorical indicators using the Tech 4 output request. But, when I was looking into this on this discussion forum, if I did not completely misunderstand what was meant, it was suggested that with binary variables, the reliability is better considered using IRT. Additionally, not all of my indicators are binary; most are, but the others have 3-4 unordered categories. So, can I calculate the reliability in the same way with my categorical variables, and if so, will Tech 4 provide the matrix I need to use? If not, any basic references on IRT?

Matthew Cole posted on Tuesday, August 07, 2007 - 4:07 pm

Hi Brandi,

Linda posted the following reply a few years ago that I think addresses your first question:

"A factor loading is a regression coefficient. If factor indicators are continuous, the loadings are simple linear regression coefficients and are interpreted as such. They can be greater than one. There is a discussion of this on the LISREL website under Karl's Corner.

If the factor indicators are categorical, then the factor loadings are probit or logistic regression coefficients depending on the estimator used in Mplus. "

If you haven't already done so, use the search function in this discussion forum and I bet you'll be able to find posts that provide even more information for your first question, and then posts that address your second question.

If not, you'll need to wait a few more days since the Muthens are on vacation.

Linda K. Muthen posted on Tuesday, August 14, 2007 - 6:29 pm

Masha: When factor variances are fixed to one, correlations are estimated. When factor variances are free, covariances are estimated.

See Chapter 17 of the Mplus User's Guide for a description of the output.

Linda K. Muthen posted on Tuesday, August 14, 2007 - 6:32 pm

Brandi:

1. For continuous outcomes, this is approximately correct. See Technical Appendix 11 for a description of how factor scores are estimated in Mplus.

2. The information functions are available in the PLOT command. See the IRT section on the website for more information.

Masha Ivanova posted on Wednesday, September 05, 2007 - 10:17 am

Thank you, Dr. Muthen,

This forum is an unbelievable tool! Your responsiveness and patience are amazing.

Stephan Golla posted on Wednesday, September 26, 2007 - 2:19 am

Hello, I'm referring to a post from May 04, 2001:
"I would like to use Mplus to estimate a non-linear relationship among latent variables [interaction]. Joreskog and Yang (1996) demonstrated such a model can be estimated using SEM if an observed product variable is used as an indicator of the latent product variable. Bollen (1995) used a two-stage least squares with instrumental variables to estimate the interaction. How can I use Mplus to estimate an interaction model?

Linda K. Muthen posted on Thursday, May 10, 2001 - 10:07 am
Mplus cannot do what Joreskog and Yang demonstrated because it requires non-linear constraints. The Bollen approach can be done in Mplus but it is not directly implemented. It would have to be done in a series of steps."

Is this still the case or does Mplus deal with the 2SLS approach? What would be the mentioned steps?
Thanks for your help.-Stephan

Linda K. Muthen posted on Wednesday, September 26, 2007 - 8:17 am

Since 2001, Mplus has added the XWITH option for latent variable interactions and MODEL CONSTRAINT for linear and non-linear constraints. Latent variable interactions are estimated using maximum likelihood according to the principles described in the following paper:

Klein, A. & Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika, 65, 457-474.

Stephan Golla posted on Wednesday, September 26, 2007 - 9:46 pm

Dear Linda,
thanks for your fast response and the literature.
Best, Stephan

Linda posted on Thursday, September 27, 2007 - 10:23 am

Hello,
I ran a CFA with three continuous indicator variables, and I get a chi-square value of 0.0000, degrees of freedom = 0, CFI=1.0, TLI=1.0, RMSEA= 0.0000. It's a just identified model. In this situation, do I report the fit indices?

Thanks in advance!
Linda

Linda K. Muthen posted on Thursday, September 27, 2007 - 11:20 am

No. You can't test model fit in this situation.
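
To see why this model has zero degrees of freedom, the counting can be sketched in Python. The sketch assumes a one-factor CFA with continuous indicators, no correlated residuals, and the first loading fixed to 1 to set the metric; the function name is hypothetical.

```python
def one_factor_cfa_df(n_indicators):
    """Degrees of freedom for a one-factor CFA with continuous indicators.

    Assumes no correlated residuals and the first loading fixed to 1.
    Means and intercepts are left out because each indicator contributes
    one sample mean and one free intercept, so they cancel.
    """
    p = n_indicators
    sample_moments = p * (p + 1) // 2        # variances and covariances
    free_loadings = p - 1                    # first loading is fixed
    factor_variance = 1
    residual_variances = p
    free_parameters = free_loadings + factor_variance + residual_variances
    return sample_moments - free_parameters

for p in (3, 4, 5):
    print(p, "indicators -> df =", one_factor_cfa_df(p))
```

With three indicators the six sample variances and covariances are matched by six free parameters, so the model reproduces the data exactly and the fit statistics carry no information.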

Linda posted on Friday, September 28, 2007 - 7:17 pm

Thank you for your prompt response. Then, what do I report? The loadings and the R-square of the continuous variables?

Linda

Linda K. Muthen posted on Saturday, September 29, 2007 - 8:03 am

Also, you would report the standard errors of the estimates.

Jessica Schumacher posted on Thursday, November 01, 2007 - 1:59 pm

I'm trying to save factor scores in CFA using the "idvariable is" statement in the variable command. The problem is that I need to merge the "factor score" dataset with a larger data set using 2 different identifiers (a family ID as well as an individual ID). Is there a way to save two identifying variables with the factor scores? I wasn't able to get the code to run with two variables listed in the idvariable statement. Thank you!

Linda K. Muthen posted on Thursday, November 01, 2007 - 6:06 pm

The only thing I can think of is to create one variable from the two variables and then do the same thing in the larger data set when you do the merge. The length of the id variable is increasing to 16 in Version 5 so this might help.
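
One way to build such a combined identifier before merging is sketched below in Python. It assumes both identifiers are non-negative integers; the function names, the example IDs, and the digit width are hypothetical, and the 16-character limit is the Version 5 length mentioned above.

```python
MAX_ID_LENGTH = 16  # id variable length mentioned in the thread for Version 5

def combine_ids(family_id, individual_id, individual_width):
    """Pack two non-negative integer IDs into a single integer key."""
    if family_id < 0 or individual_id < 0:
        raise ValueError("IDs must be non-negative")
    if individual_id >= 10 ** individual_width:
        raise ValueError("individual_id does not fit in individual_width digits")
    combined = family_id * 10 ** individual_width + individual_id
    if len(str(combined)) > MAX_ID_LENGTH:
        raise ValueError("combined ID is longer than the allowed length")
    return combined

def split_id(combined_id, individual_width):
    """Recover (family_id, individual_id) from a combined key."""
    return divmod(combined_id, 10 ** individual_width)

# Hypothetical example: family 1024, person 7, three digits reserved per person.
key = combine_ids(1024, 7, 3)
print(key, split_id(key, 3))
```

The same packing has to be applied to the larger data set so that the merge keys match exactly.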

Thierno Diallo posted on Monday, November 05, 2007 - 10:46 am

I am running a CFA with Mplus for the first time. I would like to obtain fit indices such as GFI, AGFI, Gamma 1, Gamma 2, TFI, NFI, and NNFI. I don't know which option to specify.

Thank you for your help.

Linda K. Muthen posted on Monday, November 05, 2007 - 11:08 am

These fit statistics are not available in Mplus. All available fit statistics are printed as the default.

Thierno Diallo posted on Monday, November 05, 2007 - 11:57 am

Thank you for your fast response. Can I submit an article on CFA without these fit statistics?

Linda K. Muthen posted on Monday, November 05, 2007 - 1:18 pm

I think the fit statistics we provide should be sufficient.

Paul Silvia posted on Monday, November 05, 2007 - 1:38 pm

You might consult a recent paper by Bentler in Personality and Individual Differences, in which he suggests "best practices" for reporting fit statistics. (Less is more, I think: a few well-chosen ones are better than a laundry list of every statistic that a program will compute.)

Matthew Cole posted on Wednesday, November 07, 2007 - 4:06 am

Bentler, P.M. (2007). On tests and indices for evaluating structural models. Personality and Individual Differences, 42, 825–829.

Thierno Diallo posted on Tuesday, December 11, 2007 - 7:06 am

Hi,
I am running a CFA and I want to run an LM (Lagrange multiplier) test to identify which fixed parameters, if set free, would lead to a significantly better fit. Which option should I use for that in Mplus?
My second question is about modeling. Can I correlate a second-order factor with a first-order factor? For example, is it correct to write:

Model:

f1 by v1-v3;
f2 by v4-v8;
f3 by v9-v10;
f4 by v11-v14;
f5 by v15-v18;
f6 by f1 f4 f5;
Y on f6;

f2 with f3;
f6 with f2 f3;

Thank you in advance

Linda K. Muthen posted on Tuesday, December 11, 2007 - 7:22 am

You can obtain modification indices using the MODINDICES option of the OUTPUT command.

I believe that you can correlate f6 with f2 and f3 because they are not part of the second-order factor.

Thierno Diallo posted on Tuesday, December 11, 2007 - 9:58 am

Thank you Dr. Muthen. It was very helpful.

Lisa Melander posted on Wednesday, February 06, 2008 - 8:52 am

Drs. Muthen,

I am new to Mplus and have a question regarding a CFA that includes both categorical and continuous variables. I ran the following input and received a message that says "MODINDICES option is not available for ALGORITHM=INTEGRATION. Request for MODINDICES is ignored. 1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS."

Here's my syntax:

Title: Family Characteristics CFA
Data: file is H:\sociology\lgriepenstroh\runfeb4.dat;
format is F8.2(41);

Variable: NAMES ARE female grade hetero bioHH momed phyab sexab
neglect dadwarm parcon depress depres1 depres2 depres3 depres4
depres5 depres6 depres7 depres8 depres9 depres10 depres11 depres12
depres13 depres14 depres15 depres16 depres17 depres18 depres19 run1
run2 binge bingeD marij drug delinq scheng pvict dpeer run3;
MISSING ARE ALL (999);
CATEGORICAL ARE sexab;
USEVAR = phyab sexab neglect dadwarm parcon;

Analysis:
ESTIMATOR = ML;

Model: FAMCHAR BY phyab sexab neglect dadwarm parcon;

Output: Standardized;
Modindices;

Am I missing something that will enable me to run the model with both categorical and continuous indicators? Thanks so much for your help!

Linda K. Muthen posted on Wednesday, February 06, 2008 - 10:30 am

When you have one or more categorical indicators and use maximum likelihood estimation, numerical integration is required. The error message is about modification indices, not about having categorical and continuous indicators. You can use ESTIMATOR=WLSMV as an alternative. If this does not help, please send your input, data, output, and license number to support@statmodel.com.

Lisa Melander posted on Monday, February 11, 2008 - 5:11 am

Thank you for your help.

Franchesca Madison posted on Wednesday, February 13, 2008 - 3:49 pm

I am running two CFA models, a three-factor model and a second-order model. However, I am getting the same result for both models (model fit and estimates). Below is the syntax used. Am I missing something?

Thanks.

Model 1: Three factor Model

VARIABLE:
NAMES ARE v1-v62;
USEVARIABLES ARE v1-v62;
CATEGORICAL ARE v1-v62;
ANALYSIS:
ESTIMATOR= wlsmv;
MODEL:
f1 BY v1-v20;
f2 BY v21-v38 v61-v62;
f3 BY v39-v60;
OUTPUT:
sampstat tech4;

Model 2: Second Order

VARIABLE:
NAMES ARE v1-v62;
USEVARIABLES ARE v1-v62;
CATEGORICAL ARE v1-v62;
ANALYSIS:
ESTIMATOR= wlsmv;
MODEL:
f1 BY v1-v20;
f2 BY v21-v38 v61-v62;
f3 BY v39-v60;
f4 BY f1 f2 f3;
OUTPUT:
sampstat tech4;

Linda K. Muthen posted on Wednesday, February 13, 2008 - 5:49 pm

The second-order factor that you add is just-identified. This is why it makes no difference.
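
The just-identification can be checked by counting parameters at the factor level. The Python sketch below assumes the first second-order loading is fixed to 1 and that the first-order factors have free variances in the correlated-factors model; the function name is hypothetical.

```python
def second_order_df_gain(n_first_order):
    """Degrees of freedom gained by replacing freely correlated first-order
    factors with a single second-order factor.

    Assumes the first second-order loading is fixed to 1.
    """
    k = n_first_order
    # Correlated-factors model: k variances plus k*(k-1)/2 covariances.
    correlated_factors = k * (k + 1) // 2
    # Second-order model: free loadings, second-order factor variance,
    # and one residual variance (disturbance) per first-order factor.
    second_order = (k - 1) + 1 + k
    return correlated_factors - second_order

for k in (3, 4, 5):
    print(k, "first-order factors -> df gained =", second_order_df_gain(k))
```

With three first-order factors both structures use six parameters, so the second-order part imposes no testable restriction; the restriction only appears with four or more first-order factors.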

Zsuzsa Londe posted on Thursday, February 28, 2008 - 5:47 pm

Hi,
You say that Mardia's coefficient can be generated using TECH13, which has to be used with a MIXTURE model, which has to be of mixed classes, and that "Non-mixture models can be estimated as a single class mixture to get that output." Could you please help me figure out how to make all continuous variables to be a "single class mixture" in order to get Mardia's? Thank you, Zsuzsa

Linda K. Muthen posted on Thursday, February 28, 2008 - 5:51 pm

You need to use TYPE=MIXTURE with the CLASSES option. You specify the CLASSES option as CLASSES = c (1); One class is the same as a one group analysis.

Zsuzsa Londe posted on Thursday, February 28, 2008 - 6:41 pm

Thank you very much for the incredibly speedy answer. I have been trying your suggestion but am not succeeding. I'm a new user and it is possible that I'm doing something very wrong but keep getting an error message "This analysis is only available with the Mixture or Combination Add-On."
This is my input;

Variable:
Names are
id digs_hu reads_hu words_hu nonwd_hu liss_hu corsi
digs_en reads_en words_en nonwd_en liss_en comp_esl;
Usevariables are
digs_hu reads_hu words_hu nonwd_hu liss_hu digs_en
reads_en words_en nonwd_en liss_en;
Missing are all (-9999) ;
CLASSES = c(1);

Data: LISTWISE=ON;
MODEL:
STM BY digs_hu words_hu nonwd_hu digs_en words_en nonwd_en;
WM by reads_hu liss_hu reads_en liss_en;

ANALYSIS:
TYPE=MIXTURE;

OUTPUT: tech13;

Linda K. Muthen posted on Friday, February 29, 2008 - 6:38 am

You can obtain TECH13 only if you have the mixture or combination add-on. It sounds like you don't.

Zsuzsa Londe posted on Friday, February 29, 2008 - 9:15 am

Hi Linda,
Would you have any suggestions how to check multivariate normality with Mplus? I do have univariate non-normality and my committee members would also like me to provide multivariate comparisons. Thank you, Zsuzsa

Linda K. Muthen posted on Friday, February 29, 2008 - 10:01 am

This is the only way to do the check in Mplus. If you send your input, data, and license number to support@statmodel.com, I can do the run for you as a one time favor.

Mick Cunningham posted on Thursday, May 01, 2008 - 1:57 pm

I am attempting to use DIFFTEST in a CFA with categorical indicators to test a 1-factor structure compared to a 2-factor structure with 5 (binary) observed indicators. When I follow the instructions in Example 12.12, I receive a message that " THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL IS NOT NESTED IN THE H1 MODEL."

Key parts of my setup follows:
File 1:
Snip>

Usevariables are
cohab divorce samesex DIsinbir DIpmsex ;
Categorical are
cohab divorce samesex DIsinbir
DIpmsex ;
Analysis:
Type = general;
ESTIMATOR = wlsmv ;

Model:
f1 BY cohab@1 samesex DIpmsex divorce DIsinbir;

Savedata:
DIFFTEST IS deriv.dat;

File 2:
Snip>
Usevariables are
cohab divorce samesex DIsinbir DIpmsex ;
Categorical are
cohab divorce samesex DIsinbir

Analysis:
Type = general;
ESTIMATOR = wlsmv ;
DIFFTEST IS deriv.dat;

Model:
f1 BY cohab@1 samesex DIpmsex;
f2 BY divorce@1 DIsinbir;
f1 with f2;

Linda K. Muthen posted on Thursday, May 01, 2008 - 6:15 pm

The one factor model is nested in the two factor model so you should do the two factor model first. Note that you have different variables on the two CATEGORICAL lists.

Mick Cunningham posted on Tuesday, May 06, 2008 - 3:40 pm

Linda,
Thanks for your quick response.
The different variables listed were just a result of my cut and paste to get down to the maximum character number for the message.
I had initially run it the other way (2 factor first, 1 factor second) without success. When I tried again today it worked. I'm running on a remote terminal server and it is almost as if it wasn't having time to register or something.
Anyway, I've got my difftest results now! Thanks!
Mick

Vivian Towe posted on Wednesday, May 14, 2008 - 2:08 pm

Linda,

I ran a very simple CFA with categorical data (Likert scale responses). My model fit was very poor (chi-square, CFI/TLI, RMSEA) according to Yu's dissertation.

I was thinking that one way to improve fit would be to see if the item residuals are correlated, but I don't know how to model that? Can you point me to some Mplus examples of CFA that specify correlated item errors? Or is there another way to deal with fit problems?

Thank you.

Linda K. Muthen posted on Wednesday, May 14, 2008 - 4:02 pm

I don't know your situation, but if you have more than two factors, I would start with an EFA. The WITH option is used to correlate residuals, for example, u1 WITH u2; You can look at modification indices to see other possible model misfit.

Vivian Towe posted on Wednesday, May 21, 2008 - 10:53 am

I am running a CFA multiple group analysis. I am following your handouts for multiple group analysis. The 2 groups are male and female.

When I ran a single group analysis restricting to female using the useobservations (gender eq 2) statement, the analysis ran.

However, when I ran the analysis for males and females simultaneously using the statement grouping is gender (male=1 female = 2), I received the following error message:

*** ERROR
Group 2 does not contain all values of categorical variable: ESTEEM
*** ERROR
Group 2 does not contain all values of categorical variable: SEXREG
*** WARNING

This is true: for females, no one answered with the value '1' for either of these variables, but I am not sure how to fix the problem.

Any ideas on why the single analysis works but not the multi group?

Linda K. Muthen posted on Wednesday, May 21, 2008 - 11:13 am

The groups are expected to have the same categories with weighted least squares estimation. You can collapse categories or use maximum likelihood estimation with the * setting of the CATEGORICAL option. See the user's guide for more information.

Gloria Chou posted on Friday, June 06, 2008 - 2:47 am

My first question is whether there is a difference between "confirmatory factor analysis (CFA)" and "CFA conducted within an SEM framework or with SEM software"?

My second question is whether it is better to conduct a CFA with dedicated CFA software or with SEM software?

Thanks for your feedback!

Linda K. Muthen posted on Friday, June 06, 2008 - 8:46 am

If you have the same model, the same data, and the same estimator, all programs should give the same results. I see no distinction between CFA and SEM software.

Eduardo Bernabe posted on Sunday, June 15, 2008 - 2:58 am

Hi there, I hope you can help me with this:

I'm testing the factor structure of the SOC scale (12 items collected using 7-point ordinal scales) with two different hypothesised structures (one-factor model vs. second-order factor model with three latent factors of four items each which in turn load on the higher-order factor) in a national study (using survey commands). The fit of the second-order factor model is relatively better (CFI 0.98 RMSEA 0.083 AIC 231201.21) than that of the one-factor model (CFI 0.98 RMSEA 0.092 AIC 231559.79); however, some correlations between latent factors are higher than one (Heywood cases?).

These are my questions:

1) any idea why that is happening?

2) should I dismiss the second-order factor model (which was my personal bet in this research) because of this inadmissible solution?

3) is there any way of constraining correlations to avoid values higher than one?

Thanks in advance for your help,

E

Linda K. Muthen posted on Monday, June 16, 2008 - 9:27 am

Factors that correlate one are not statistically distinguishable. A second-order factor model with three first-order factors is the same as a model with three correlated first-order factors. I suggest doing an EFA for 1-4 factors as a first step.

Derek Kosty posted on Wednesday, June 18, 2008 - 1:23 pm

Hi,

This question is regarding CFA. I am trying to comparatively evaluate 5 models, some nested and some non-nested, by simultaneously taking into account the goodness of fit, sample size, and the number of parameters estimated.

I know that information criteria are good for this purpose (e.g., BIC and AIC). However, the problem is that my observed variables (lifetime diagnosis of different mental disorders) are dichotomous and have low base rates.

I am aware that BIC/AIC indices are not appropriate when using the WLS estimator but I am unsure of the appropriateness of using ML and specifying my observed variables as continuous (due to low base rates which cause a highly skewed distribution).

I have seen multiple papers reporting BIC statistics while claiming that parameters were estimated using weighted least squares. This does not make sense to me.

How would you recommend that I compare my models in this context?

Thanks for your support!

Bengt O. Muthen posted on Wednesday, June 18, 2008 - 1:51 pm

If you have only a small number of factors you can use ML. Using ML does not mean that you have to assume that the variables are continuous which it sounds like you are implying. If you specify the variables as categorical and the estimator as ML (or MLR), then the appropriate logit (or probit) model parts will be used in Mplus.

I don't see how BIC can be computed with weighted least squares since it builds on the likelihood.
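
The dependence of these criteria on the likelihood is visible in their usual definitions, sketched below in Python. The log-likelihoods, parameter counts, and sample size are hypothetical values chosen only to show that the two criteria can rank models differently; they are not results from this thread.

```python
import math

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: -2 logL + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_parameters

def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: -2 logL + k ln(n)."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)

# HYPOTHETICAL maximum-likelihood results for two competing models.
n = 500
models = {"A": (-5120.4, 18), "B": (-5105.9, 27)}
for name, (loglik, k) in models.items():
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, n), 2))
```

Both formulas need a maximized log-likelihood as input, which a weighted least squares fit does not supply.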

Derek Kosty posted on Wednesday, June 18, 2008 - 5:27 pm

Dr. Muthen,

Thanks for the quick reply. The model I am currently trying to run has six factors, each having between 3 and 6 observed variables loading on them. My sample size is 816. I specified the variables as categorical, requested the MLR estimator, and reduced the number of integration points to 5. The compiler has been working for about 3 hours now, is this normal? If so, what should I be looking for in the DOS window that could give a clue to how much longer the process will take?

Maybe the slow pace is a result of running version 4.2?

D

Bengt O. Muthen posted on Wednesday, June 18, 2008 - 5:47 pm

As indicated in the User's Guide, six factors gives very time-consuming computations. Particularly if you are not using a computer with at least 2 and preferably 4 or even 8 processors. Mplus takes advantage of multiple processors using parallelized code which gives considerable time savings (using the PROCESS= option in the ANALYSIS command). You should also use version 5.1. The DOS window shows you the iteration history and the time each iteration takes - usually you can get an idea from this of how long it will take to converge. But with this many factors I would recommend using WLSMV and instead of BIC compare models via fit measures such as SRMR.

Derek Kosty posted on Thursday, June 19, 2008 - 10:15 am

Yu (2002) suggests that SRMR is not good when dealing with binary outcomes. It is unclear to me if his critique of the SRMR is with respect to the cutoff recommendation, or with the statistic in general. For example, can the values between models still be compared (e.g. which one is lower) and it be meaningful?

Bengt O. Muthen posted on Thursday, June 19, 2008 - 7:00 pm

It is not clear-cut what to do here, but I think maybe measures such as CFI, which Yu (2002) found useful, may not be able to discriminate between neighboring models that are not far apart in terms of fit. Perhaps SRMR is more useful for this.

Derek Kosty posted on Monday, June 23, 2008 - 10:11 am

In Mplus version 4.2 the SRMR is included in the output for a model in which the outcomes are all categorical, the latent variables are continuous, and WLSMV is the estimator. However, SRMR is not included in the output for version 5.1. What is the reason behind this and can I request the SRMR to be computed in version 5.1?

Here is my model:
MODEL:
distr by LMDD4 LDYS4 LDPD4 LGOA4 LPTS4;
fear by LSPE4 LSOC4 LPAN4 LOBC4;
intern by fear distr;! LBIP4;
fear@0;

Thanks again.

Linda K. Muthen posted on Monday, June 23, 2008 - 11:37 am

SRMR is not available when thresholds are in the model which is the default starting with Version 5. Add MODEL = NOMEANSTRUCTURE; to the ANALYSIS command.

Derek Kosty posted on Monday, June 23, 2008 - 3:54 pm

If I run a model using MLR as the estimator with categorical outcomes, I notice that fit indices such as CFI, TLI, RMSEA, SRMR, and WRMR do not appear in the output. Can they be requested as well?

Derek Kosty posted on Monday, June 23, 2008 - 4:04 pm

Never mind, I discovered (from another thread) that with maximum likelihood and categorical outcomes, these fit statistics are not available because sample statistics are not sufficient statistics for model estimation.

Derek Kosty posted on Tuesday, June 24, 2008 - 1:51 pm

I am conducting a CFA with dichotomous observed variables with low base rates and an n=816. What method of estimation is most appropriate (MLSMV or MLR) and do you know of any articles that discuss this issue?

In trying to resolve the question, I stumbled across an article in which Beauducel and Herzberg (2006) compare MLSMV with ML (not MLR). They use categorical data with Mplus version 3.11 and somehow are reporting CFI, TLI, RMSEA and SRMR for both methods of estimation. This contradicts my earlier discovery that "with maximum likelihood and categorical outcomes, these fit statistics are not available because sample statistics are not sufficient statistics for model estimation". Is this due to a difference in Mplus itself across versions, or do you think that the authors did not actually specify their variables as categorical within the model?

Sorry about so many questions. I really appreciate all of your support!

Linda K. Muthen posted on Tuesday, June 24, 2008 - 2:04 pm

What is MLSMV? Do you mean WLSMV?

Derek Kosty posted on Tuesday, June 24, 2008 - 2:10 pm

Sorry, I did mean WLSMV.

Linda K. Muthen posted on Tuesday, June 24, 2008 - 2:19 pm

WLSMV uses weighted least squares estimation. Chi-square and related fit statistics are available with this estimator. MLR uses maximum likelihood estimation. With categorical outcomes, chi-square and related fit statistics are not available. With maximum likelihood and categorical outcomes, each factor requires one dimension of integration which can be computationally demanding. More than 3 or 4 factors is not feasible. Weighted least squares is a better option when you have many factors.

Derek Kosty posted on Tuesday, June 24, 2008 - 2:29 pm

Two more questions:

1.) Does your previous answer imply that Beauducel and Herzberg (2006) had to be using continuous data in order to get CFI, TLI, RMSEA and SRMR for both methods of estimation?

2.) Do you know of any articles that discuss using SRMR to compare across models with binary outcomes?

Linda K. Muthen posted on Tuesday, June 24, 2008 - 2:53 pm

1. I would think if the data were categorical, they were not using the CATEGORICAL option so it was being treated as continuous.

2. No.

Pajarita Charles posted on Sunday, September 28, 2008 - 4:37 pm

I have done a CFA using the default analysis TYPE=GENERAL. What is the rotation method used for TYPE=GENERAL? I thought the method and type of rotation was principal axis factoring with promax (oblique) rotation. Is this correct? Thank you.

Bengt O. Muthen posted on Monday, September 29, 2008 - 7:29 am

There is no rotation involved with CFA, only with EFA measurement structures. Are you referring to the new "exploratory SEM" approach? For EFA, Mplus does not use PAF, but estimators such as ML and ULS. A multitude of rotations are available - see the User's Guide.

Andrea Hildebrandt posted on Wednesday, October 29, 2008 - 2:21 am

Thank you very much for your help yesterday!
I have another question:
Are these three syntaxes equivalent when I use Mplus 5?

1.
model:
WMS BY mnamb4 mnam mvis mvisb4 mver mverb4;

GS BY mifa miss minc;

OBJ BY mchsmfu mchsmfi mchpwp mchpww;

GenCog BY Zcrsa Zmcmu raven WMS GS OBJ;

________________________________________________________________________
2.
model:
WMS BY mnamb4@1 mnam mvis mvisb4 mver mverb4;

GS BY mifa@1 miss minc;

OBJ BY mchsmfu@1 mchsmfi mchpwp mchpww;

GenCog BY Zcrsa@1 Zmcmu raven WMS GS OBJ;

_________________________________________________________________________
3.
model:
WMS BY mnamb4* mnam mvis mvisb4 mver mverb4;
WMS@1;

GS BY mifa* miss minc;
GS@1;

OBJ BY mchsmfu* mchsmfi mchpwp mchpww;
OBJ@1;

GenCog BY Zcrsa* Zmcmu raven WMS GS OBJ;
GenCog@1;

With 3, I get different results than with 1 or 2, but I thought these were equivalent specifications.
Thank you!

Linda K. Muthen posted on Wednesday, October 29, 2008 - 6:18 am

Numbers one and two are equivalent because you choose the same factor indicator to set the metric of the factor. In Number three you set the metric of the factor using a different factor indicator. Model fit will be the same but not parameter estimates.

Andrea Hildebrandt posted on Wednesday, October 29, 2008 - 11:06 am

Thank you very much for your answer! Unfortunately there is something wrong. Model fit is not the same between those models. Model 3 fits much better.

Which indicator will be used to set the metric in Model three, where the first indicator has a *? The second in the list?

thank you for your support!

Linda K. Muthen posted on Wednesday, October 29, 2008 - 11:38 am

When you free the first indicator and fix the factor variance to one as in model 3, no factor indicator is fixed to one. You would need to send the output from model 3, the output from either model 1 or 2, and your license number to support@statmodel.com so I can see exactly what you are doing.

RDU posted on Thursday, December 11, 2008 - 8:18 am

Hello. I am trying to run a MIMIC model for eight ordinal indicators. My sample size is around 500.

My data are nested (students clustered within schools), and due to my research interests I chose to use the aggregated or design-based approach (i.e., Type=complex in conjunction with cluster=school) to model my data.

I have read several articles about MIMIC models, but have yet to see one where school-level effects are used as covariates. I have given some thought to looking at the direct effect of several school-level covariates on my latent factor. Though, perhaps this isn't substantively meaningful or correct in terms of modeling population heterogeneity.

Questions:

1.) I was wondering what your thoughts are on using a combination of student and school-level covariates as predictors for the latent factor(s) in an aggregated MIMIC model?
2.) If it is true that regressing the latent factor(s) on a combination of student and school-level covariates is feasible (i.e., substantively interpretable) then how would one interpret the school level covariates? This is a bit confusing to me since the aggregated approach doesn't disentangle the between school and within school effects.

Thank you so much for your time.

Best,

RDU

Linda K. Muthen posted on Thursday, December 11, 2008 - 9:51 am

You can do what you want but I would recommend using multilevel analysis because I do not believe that a MIMIC model is aggregatable.

Anonymous posted on Tuesday, December 30, 2008 - 9:26 am

Hi:

I am running a CFA with dichotomous response items and 8 latent factors. I want to run a logistic SEM model with the ML estimator. I keep getting this error. I have tried to run this model on a couple of computers but that does not solve the problem. Where could I be going wrong?

*** FATAL ERROR
THERE IS NOT ENOUGH MEMORY SPACE TO RUN Mplus ON THE CURRENT
INPUT FILE. THE ANALYSIS REQUIRES 8 DIMENSIONS OF INTEGRATION RESULTING
IN A TOTAL OF 0.25629E+10 INTEGRATION POINTS. THIS MAY BE THE CAUSE
OF THE MEMORY SHORTAGE. YOU CAN TRY TO FREE UP SOME MEMORY BY CLOSING
OTHER APPLICATIONS THAT ARE CURRENTLY RUNNING. NOTE THAT THE MODEL MAY
REQUIRE MORE MEMORY THAN ALLOWED BY THE OPERATING SYSTEM.
REFER TO SYSTEM REQUIREMENTS AT www.statmodel.com FOR MORE
INFORMATION ABOUT THIS LIMIT.

Linda K. Muthen posted on Tuesday, December 30, 2008 - 9:30 am

With maximum likelihood and categorical outcomes, each factor is one dimension of integration. We recommend no more than four dimensions of integration. I suggest using weighted least squares estimation when you have several factors.
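
The count in the error message can be reproduced with simple arithmetic, assuming 15 integration points per dimension (an assumption here, but one that matches the reported total exactly):

$$
15^{8} = 2{,}562{,}890{,}625 \approx 0.25629 \times 10^{10}, \qquad 15^{4} = 50{,}625 .
$$

Because the total grows as a power of the number of factors, going from four to eight dimensions multiplies the work by a factor of about 50,000, which is why a small number of dimensions is recommended.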

Anonymous posted on Wednesday, December 31, 2008 - 12:31 pm

My main concern with weighted least squares estimation is that it does not allow one to run a logit model. This is true, right? Because my outcome is dichotomous, I would like to get a logit estimate. Can you please advise? Thanks.

Linda K. Muthen posted on Wednesday, December 31, 2008 - 1:43 pm

In Mplus weighted least squares estimates a probit model. If you want logistic regression, you need to use maximum likelihood estimation in Mplus.

Anonymous posted on Wednesday, December 31, 2008 - 1:57 pm

So if I have more than 4 factors to estimate, Mplus will not allow me to run a logistic regression?

Linda K. Muthen posted on Wednesday, December 31, 2008 - 2:28 pm

Numerical integration is required with CFA and categorical factor indicators. A model with more than four factors is not feasible in this case. Your only option is using weighted least squares and probit regression. I don't see this as a problem.

Anonymous posted on Wednesday, December 31, 2008 - 2:40 pm

Thank you so much for your quick responses. I apologize if I am belaboring this point. I am having a problem interpreting the probit estimates in my context because of the difference in the distribution used. I am thinking of taking the latent variable scores from Mplus and using them in a logit regression in STATA. Do you think that is a feasible approach?

Bengt O. Muthen posted on Wednesday, December 31, 2008 - 3:15 pm

It sounds like you mean that you would first do WLS probit factor analysis, then estimate factor scores, and then regress each item on those scores to get logit results. If I am interpreting that correctly, that would seem to be less precise than using the probit factor analysis results and doing the usual approximate translation to logit by the sqrt(pi^2/3) factor. But I don't see enough of a difference between probit and logit in practice to warrant any interpretational concerns. Probit modeling certainly seems accepted in IRT.
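
The approximate translation mentioned above can be written out as follows. The scaling constant comes from matching the variance of the standard logistic distribution, $\pi^2/3$, to the unit variance of the standard normal; the value 0.50 is a hypothetical probit coefficient used only for illustration.

$$
\beta_{\text{logit}} \approx \sqrt{\frac{\pi^{2}}{3}}\;\beta_{\text{probit}} \approx 1.814\,\beta_{\text{probit}},
\qquad \text{e.g. } 1.814 \times 0.50 \approx 0.91 .
$$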

Anonymous posted on Wednesday, December 31, 2008 - 3:29 pm

Thank you for your quick response. I think your suggestion is a reasonable way forward.

Anonymous posted on Friday, January 02, 2009 - 10:12 am

I did manage to run my models with WLS. Thank you. I now want to run an interaction model. I am running a CFA with dichotomous response items and 8 latent factors, and I want to create an interaction between 2 of these 8 latent factors. Is it possible to do this? I read "5.13: SEM with continuous factor indicators and an interaction between two factors" in the Mplus user guide, but I don't think the command listed there is helping me. Is there any modification you can suggest? Thank you.

Linda K. Muthen posted on Friday, January 02, 2009 - 10:19 am

The XWITH option is available only for TYPE=RANDOM which is available only with maximum likelihood estimation.
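
For reference, a latent interaction set up along the lines of the User's Guide example the poster cites might be sketched as below. The factor and indicator names are hypothetical, and the sketch uses continuous indicators and only three factors; it does not get around the integration burden of eight factors with categorical indicators.

```mplus
! Hypothetical sketch of a latent interaction; requires ML, not WLS
ANALYSIS:  TYPE = RANDOM;
           ALGORITHM = INTEGRATION;
MODEL:     f1 BY y1-y3;
           f2 BY y4-y6;
           f3 BY y7-y9;
           f1xf2 | f1 XWITH f2;    ! define the interaction term
           f3 ON f1 f2 f1xf2;      ! main effects plus interaction
```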

Thierno Diallo posted on Monday, March 09, 2009 - 10:11 am

Hi,
Mplus computes standard errors of factor loadings in a CFA model. So I was wondering whether we can use a pooled standard error to compare 2 factor loadings across groups instead of using chi-square difference testing. If yes,
1) what are the advantages of using one test instead of the other?
2) Is it possible for the 2 tests to give different results?
Thank you in advance.

Linda K. Muthen posted on Monday, March 09, 2009 - 6:51 pm

I do not know how one would calculate pooled-standard errors in the maximum likelihood framework. I would use either a difference test or a Wald test via MODEL TEST.
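
A Wald test of equal loadings across two groups could be sketched with MODEL TEST as below. The grouping variable g, the group labels, the indicator names, and the parameter labels p1 and p2 are all hypothetical.

```mplus
! Hypothetical sketch: Wald test that the y2 loading is equal across groups
VARIABLE:    NAMES = g y1-y4;
             GROUPING = g (1 = grp1 2 = grp2);
MODEL:       f BY y1-y4;
MODEL grp1:  f BY y2 (p1);     ! label the loading in group 1
MODEL grp2:  f BY y2 (p2);     ! a different label frees it in group 2
MODEL TEST:  0 = p1 - p2;      ! null hypothesis: equal loadings
```

The difference test would instead compare this model with one in which the y2 loading is held equal across groups.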

Derek Kosty posted on Wednesday, March 11, 2009 - 3:49 pm

Dear Mplus team,

I am running a series of CFA models each including a different outcome measure regressed on the latent factors. Looking at the R-squared values associated with the outcomes across the different models suggests that the significance of the R-squared value is not based on magnitude alone. In other words, some outcomes have significant R-squared values that are less than the non-significant R-squared values of another outcome. I am sure that the answer lies in how the standard error is computed for R-squared. Can you provide any input on this issue?

Thanks for your support!

Linda K. Muthen posted on Wednesday, March 11, 2009 - 4:18 pm

The two tests are not the same. One tests whether the regression coefficient is significantly different from zero. The other tests whether the variance explained in the dependent variable is significantly different from zero.

Derek Kosty posted on Wednesday, March 11, 2009 - 4:31 pm

Sorry, I don't think my question was clear enough. I am only looking at the test of variance explained in the dependent variable. Here is an example:

16.1% of the variance in years of school completed is explained by disruptive behavior and substance use factors. This R-squared is statistically significant.

19.6% of the variance in lifetime prevalence of depression is explained by disruptive behavior and substance use factors. This R-squared is not statistically significant.

I am wondering how the first R-squared can be significant while the latter is not, given the relative magnitudes.

Linda K. Muthen posted on Wednesday, March 11, 2009 - 6:52 pm

Each R-square has its own standard error based on information related to that dependent variable. So one may have a larger standard error resulting in non-significance even though the absolute value is larger.

Derek Kosty posted on Thursday, March 12, 2009 - 9:01 am

Can you provide further information on what goes into the computation of the standard error?

Linda K. Muthen posted on Thursday, March 12, 2009 - 10:20 am

The Delta method of computing standard errors is used. Google this for further information.
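
In general form, the Delta method approximates the variance of a function $g$ of the parameter estimates $\hat\theta$ by a first-order Taylor expansion. The notation below is generic and is not taken from the Mplus documentation:

$$
\widehat{\operatorname{Var}}\!\big(g(\hat\theta)\big) \approx
\nabla g(\hat\theta)^{\top}\,\widehat{\Sigma}_{\hat\theta}\,\nabla g(\hat\theta),
\qquad
\operatorname{SE}\!\big(g(\hat\theta)\big) = \sqrt{\widehat{\operatorname{Var}}\!\big(g(\hat\theta)\big)} ,
$$

where $\widehat{\Sigma}_{\hat\theta}$ is the estimated covariance matrix of the parameter estimates. Since R-square is a function of loadings, regression slopes, and variances, its standard error depends on how precisely those parameters are estimated for that outcome, not on the size of R-square itself.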

Maša Vidmar posted on Wednesday, April 08, 2009 - 8:20 pm

Hi,

I am running a CFA with 1 construct and three indicators. It was my belief that this should result in a just-identified model. But I get the following error message:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 5.
Parameter 5 is loading for one of the indicators.
What is causing this? And how to avoid it?

Another (unrelated) question: in a different model I have 2 constructs at two time points (so actually 4 latent variables) with 3-8 indicators each. Fit indices and all estimates are exactly the same in the CFA (no order imposed) and the SEM (with paths). Is this expected?

Thank you!

Linda K. Muthen posted on Thursday, April 09, 2009 - 9:25 am

A factor model with three indicators is just identified. You may have all factor loadings and the factor variance free. If this is not the case, you need to send the full output and your license number to support@statmodel.com.

It sounds like you have a model where the four factors have an unrestricted covariance matrix such that covarying all factors gives the same number of parameters as regressing two of them on the other two. If so, the models are statistically equivalent and the fit should be the same. If not, I would need to see your full output and license number at support@statmodel.com to answer your question.

Javarro Russell posted on Sunday, April 12, 2009 - 8:49 am

Hi,
I have attempted a CFA with 59 dichotomous items and 1000 observations. My fit statistics were less than ideal. I am attempting to analyze the misfit and report my findings to a non-technical audience. There are some issues I could use your help on:

Is the input matrix tetrachoric? If so, is there a way to obtain a print out of the input matrix?

I have several positive residuals (as high as .283) Does this indicate that my model implied correlations are smaller than my observed correlations?

Linda K. Muthen posted on Monday, April 13, 2009 - 1:45 pm

Yes, the sample statistics used for model estimation are tetrachoric correlations with WLSMV and binary outcomes. These are printed if you ask for SAMPSTAT in the OUTPUT command. They can be saved using the SAMPLE option of the SAVEDATA command.

A positive residual means that the observed value is larger than the model estimated value.
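
The two options mentioned in this reply could be requested as sketched below; the file name tetra.dat is a hypothetical placeholder.

```mplus
! Hypothetical sketch: print and save the sample statistics
OUTPUT:    SAMPSTAT RESIDUAL;      ! sample correlations and residuals in the output
SAVEDATA:  SAMPLE IS tetra.dat;    ! write the sample statistics to a file
```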

Greg posted on Wednesday, April 22, 2009 - 9:09 am

Hi,

I'm new to the forum, so excuse me if my question may seem straightforward.

I'm running a CFA on 29 reflective indicators (4 factors) and 2 observed continuous dependent variables (scores to a test).
The output says:
*** WARNING in MODEL command
Variable is uncorrelated with all other variables: scorea
*** WARNING in MODEL command
Variable is uncorrelated with all other variables: scoreb
*** WARNING in MODEL command
At least one variable is uncorrelated with all other variables in the model.
Check that this is what is intended.
3 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

The output gives relationships between the latent variables (e.g. f1 WITH f2) but not between the latent variables and the scores (e.g. f1 WITH scorea).

Do I need to specify the hypothesized relationships between the latent var and the scores in the CFA already? Then, isn't it already a structural model?

Thanks for the answer!

Bengt O. Muthen posted on Wednesday, April 22, 2009 - 6:01 pm

Yes and yes.
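
A sketch of what specifying those relationships might look like is given below. The indicator names and their split across the four factors are hypothetical; only scorea and scoreb come from the post.

```mplus
! Hypothetical sketch: relate the observed scores to the factors explicitly
MODEL:  f1 BY a1-a8;
        f2 BY b1-b7;
        f3 BY c1-c7;
        f4 BY d1-d7;
        scorea scoreb ON f1-f4;    ! regress the scores on the factors
        ! alternatively, covariances only:  f1-f4 WITH scorea scoreb;
```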

Ben Saville posted on Wednesday, May 06, 2009 - 9:33 am

Dr. Muthen,

I have a 2-level one-factor CFA that I'm trying to fit in Mplus. I found a very similar model in the article

Muthén, B. (1994). Multilevel covariance structure analysis. In J. Hox & I. Kreft (eds.), Multilevel Modeling, a special issue of Sociological Methods & Research, 22, 376-398.

Can you tell me how to code the one factor model with 8 indicators, in which students are nested within schools (Figure 1)? My best guess is the following:

Variable: Names are y1-y8;
cluster = school;

ANALYSIS: TYPE= TWOLEVEL;

Model: %within%
fw by y1-y8;

%between%
fb by y1-y8;

Thanks in advance.

Linda K. Muthen posted on Thursday, May 07, 2009 - 10:25 am

This looks fine. Note, however, that between-level residual variances are often very small and may need to be fixed at zero. See the Topic 7 course handout for more information.
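
Building on the poster's input, fixing the between-level residual variances at zero might be sketched as follows. Adding school to the NAMES list is an assumption here, since the cluster variable has to be read from the data.

```mplus
! Sketch based on the input above; between-level residual variances fixed at zero
VARIABLE:  NAMES = school y1-y8;   ! school assumed to be in the data file
           CLUSTER = school;
ANALYSIS:  TYPE = TWOLEVEL;
MODEL:     %WITHIN%
           fw BY y1-y8;
           %BETWEEN%
           fb BY y1-y8;
           y1-y8@0;                ! residual variances fixed at zero
```

Whether all eight should be fixed, or only those that come out near zero or negative, is a judgment call that depends on the estimates.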

Joykrishna Sarkar posted on Friday, May 08, 2009 - 6:32 pm

I am using complex sample data with cluster, weight, and groups (2 groups). Could you please advise me on how I can compute maximum likelihood (ML), pseudo maximum likelihood (PML) and pseudo-maximum log-likelihood (PLL) estimators in MPlus Version 5.2? I am also interested in the corresponding CFI values. I didn't find the specific commands for these estimators in the MPlus manual.
Thanks in advance.

Linda K. Muthen posted on Saturday, May 09, 2009 - 11:09 am

With TYPE=COMPLEX, ESTIMATOR=ML gives ML and ESTIMATOR=MLR gives PML. I don't know what you mean by PLL. In all cases, if a chi-square test statistic is given, CFI is given.
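
A complex-sample setup with clustering, weights, and two groups might be sketched as below. The variable names psu, wt, grp and the group labels are hypothetical, and only the MLR variant is shown.

```mplus
! Hypothetical sketch of a weighted, clustered, two-group CFA
VARIABLE:  NAMES = psu wt grp y1-y6;
           CLUSTER = psu;
           WEIGHT = wt;
           GROUPING = grp (1 = g1 2 = g2);
ANALYSIS:  TYPE = COMPLEX;
           ESTIMATOR = MLR;
MODEL:     f1 BY y1-y3;
           f2 BY y4-y6;
```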

Joykrishna Sarkar posted on Saturday, May 09, 2009 - 2:19 pm

Thanks, Dr. Linda, for your response. But when I ran the model with TYPE=COMPLEX and ESTIMATOR=ML, I got the following error:
*** ERROR in ANALYSIS command
Estimator ML is not allowed with TYPE = COMPLEX.
Default will be used.
1 ERROR(S) FOUND IN THE INPUT INSTRUCTIONS.
So, I can't use Type=Complex and Estimator=ML. What is the alternative way to obtain the ML estimator?

I found the Pseudo log-likelihood (PLL) estimator in the article:
Asparouhov, T. & Muthen, B. (2006). Comparison of estimation methods for complex survey data analysis.

In that article, the estimator is described by equation 10. I am calling it PLL. Do you think PML and PLL (in equation 10 of the above article) are two different estimators or the same?
If these two (PML and PLL) are different, how can I estimate them in Mplus 5.2? In the literature I found that both are available in Mplus.

William Welch posted on Friday, May 15, 2009 - 10:41 am

I want to compare a "full" model with a "basic" model and include the same variables in both models. However, in the full model, some variables are treated as 1-item measures. For example, the basic model has two factors - the first factor has two variables - and the full model is one where these same two variables are treated as separate one-item measures (i.e., are not loaded on a factor).

I am new to Mplus, and I am not sure what is the correct procedure for specifying these two one-item measures in the full model. Should I fix their variance at 1 or leave them out of the model statement but in the USEVARIABLES statement? What different assumptions would I be making with this change?

Ex:
For the models below:
All variables are continuous
ANALYSIS: estimator=ml;
OUTPUT: standardized mod(3.84) tech4;

(Basic model)
MODEL:
f1 by v1 v2;
f2 by v3 v4 v5;

compared to

(Full model #1)
MODEL:
v1@1;
v2@2;
f2 by v3 v4 v5;

or compared to

(Full model #2)
MODEL:
f2 by v3 v4 v5;

>>in Full model #2, v1 and v2 are included in the "USEVARIABLES" statement but not specified in the MODEL statement<<

Thank you in advance.

Linda K. Muthen posted on Saturday, May 16, 2009 - 8:24 am

I think the following is the way to go:

MODEL:
f2 by v3 v4 v5;

Check to be sure v1 and v2 are correlated with f2 as the default. If not, add a WITH statement.
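
If the default covariances turn out to be missing, the WITH statements might be added as in this sketch, which uses the poster's variable names:

```mplus
! Sketch: make the covariances with the single-item measures explicit
MODEL:  f2 BY v3 v4 v5;
        f2 WITH v1 v2;     ! factor covaries with each single-item measure
        v1 WITH v2;        ! the two single-item measures covary
```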

Arina Gertseva posted on Monday, May 18, 2009 - 4:52 pm

Dr. Muthen,
I have a few outliers in my data and I do not want to eliminate them. Is there a way to run a CFA model with certain cases identified as outliers?
Or should I just run two models (one with outliers and one without) and compare them.

Arina.

Linda K. Muthen posted on Monday, May 18, 2009 - 5:20 pm

You would have to do the analysis twice as you suggest. You could use the USEOBSERVATIONS option to select cases that are not outliers.
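
One way to set up the second run is to flag the outlying cases in the data and select on that flag. The variable name outlier and its 0/1 coding are hypothetical.

```mplus
! Hypothetical sketch: rerun the CFA without the flagged cases
VARIABLE:  NAMES = id outlier y1-y6;
           USEVARIABLES = y1-y6;
           USEOBSERVATIONS = outlier EQ 0;   ! keep only non-outliers
MODEL:     f1 BY y1-y3;
           f2 BY y4-y6;
```

Deleting the USEOBSERVATIONS line gives the run with all cases, so the two sets of estimates can be compared.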

Joykrishna Sarkar posted on Wednesday, June 03, 2009 - 1:37 pm

Is it possible to use weights in conventional SEM analysis? In particular, how can I use weights to estimate CFA model parameters when the ML estimator is used in Mplus?
Another question: I found that the correlation between two factors is more than 1 in my CFA, which gives me an error. Is there any way to solve this problem?

Linda K. Muthen posted on Thursday, June 04, 2009 - 9:28 am

Yes, it is possible to use weights in conventional SEM analysis using any estimator. See the following paper which is available on the website:

Asparouhov, T. (2005). Sampling weights in latent variable modeling. Structural Equation Modeling, 12, 411-434.

See also the WEIGHTS option in the user's guide and the topic Complex Survey Data in the user's guide for information about all of the complex survey data features in Mplus.

When two factors correlate more than 1, they are not statistically distinguishable. Only one can be used in the model. You may want to go back to an EFA.

Christopher McKinley posted on Wednesday, September 02, 2009 - 5:15 am

I ran a CFA and the chi-square degrees of freedom was greater than the sample size. Am I violating any assumptions here?

Linda K. Muthen posted on Wednesday, September 02, 2009 - 8:39 am

There would be a problem if there were more free parameters than number of observations. You are fine.

Jacqueline Cheng posted on Saturday, September 12, 2009 - 1:55 am

Hi,

I'm trying to do a 3-level (students in classes in schools) multilevel CFA in Mplus. Is it possible to incorporate the 3rd level by using the COMPLEX function in the commands?

Thanks!

Bengt O. Muthen posted on Saturday, September 12, 2009 - 12:19 pm

Yes. You don't get 3-level modeling, so no estimated 3rd-level random effects, but you do correct the 2-level model SEs for the level 3 nesting.
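
A sketch of this combination is given below. The variable names are hypothetical, and the ordering of the two cluster variables (highest level first) is an assumption that should be checked against the User's Guide for the version in use.

```mplus
! Hypothetical sketch: two-level CFA with SEs corrected for school nesting
VARIABLE:  NAMES = school class y1-y4;
           CLUSTER = school class;   ! assumed order: COMPLEX level, then TWOLEVEL level
ANALYSIS:  TYPE = COMPLEX TWOLEVEL;
MODEL:     %WITHIN%
           fw BY y1-y4;              ! student level
           %BETWEEN%
           fb BY y1-y4;              ! class level
```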

kanreutai klangphahol posted on Tuesday, September 15, 2009 - 5:22 am

Dear Muthen,
I ran two confirmatory factor analyses with Mplus 5.12, but some results could not be estimated.

STDYX Standardization

                                               Two-Tailed
               Estimate     S.E.  Est./S.E.      P-Value

F BY
    X1            0.279    0.000    999.000      999.000   ****** (unestimated)
    X2            0.637    0.017     36.761        0.000

X2 WITH
    X1            0.230    0.030      7.648        0.000

Intercepts
    X1            0.000    0.026      0.000        1.000
    X2            0.000    0.026      0.000        1.000

Variances
    F             1.000    0.000    999.000      999.000   ****** (unestimated)

Residual Variances
    X1            0.922    0.000    999.000      999.000   ****** (unestimated)
    X2            0.594    0.022     26.894        0.000
thank you,
tam

Bengt O. Muthen posted on Tuesday, September 15, 2009 - 9:01 am

These parameters are fixed by you and are therefore not estimated. "999" says that something should not or could not be computed, which clearly is the case here.

Anne DeField posted on Tuesday, September 15, 2009 - 2:12 pm

I am running a CFA with both metric and dichotomous items. I assume the following structure: a latent variable is represented by four observed variables, three of which consist of metric items and one of which consists of dichotomous items (yes/no). How do I handle these items in this factor analysis?

Linda K. Muthen posted on Tuesday, September 15, 2009 - 5:05 pm

Put the dichotomous one on the CATEGORICAL list in the VARIABLE command. The default estimator for this situation is WLSMV. You can ask for maximum likelihood using the ESTIMATOR option of the ANALYSIS command.
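
For a factor with three continuous indicators and one binary indicator, this might be sketched as below; the variable names are hypothetical.

```mplus
! Hypothetical sketch: one factor with mixed continuous and binary indicators
VARIABLE:  NAMES = y1-y3 u4;
           CATEGORICAL = u4;       ! the yes/no indicator
ANALYSIS:  ESTIMATOR = WLSMV;      ! the default here; ESTIMATOR = ML; is the alternative
MODEL:     f BY y1-y3 u4;
```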

Anne DeField posted on Thursday, September 17, 2009 - 7:38 am

Thanks!

Da C posted on Tuesday, November 03, 2009 - 7:41 pm

Hi,

I ran a 3-factor CFA model with categorical/ordinal indicators. The means of the 3 factors were set to 0 and their variances set to 1. The loading of the first indicator within each factor was set free. This analysis was weighted and I used the WLSMV estimator.

The model results and STDYX results were identical.

Are these results the structure or pattern loadings? Whichever it is, how would I estimate one from the other?

Thank you!

Linda K. Muthen posted on Wednesday, November 04, 2009 - 9:12 am

The raw coefficients and STDYX are the same because factor variances are one and variances of the latent response variables underlying the categorical variables are one.

The coefficients are factor pattern coefficients. The matrix for these coefficients is lambda. The matrix for the factor variances and covariances is psi. The product of the two matrices gives the factor structure coefficients.
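
Written out, with $\Lambda$ the pattern matrix and $\Psi$ the factor covariance matrix (a correlation matrix here, since the factor variances are one), the structure coefficients are

$$
S = \Lambda\,\Psi .
$$

As a purely hypothetical illustration, an item that loads 0.70 on the first factor and zero on the other two, with factor correlations $\psi_{12} = 0.50$ and $\psi_{13} = 0.40$, has structure coefficients

$$
\begin{pmatrix} 0.70 & 0 & 0 \end{pmatrix}
\begin{pmatrix} 1 & 0.50 & 0.40 \\ 0.50 & 1 & \psi_{23} \\ 0.40 & \psi_{23} & 1 \end{pmatrix}
= \begin{pmatrix} 0.70 & 0.35 & 0.28 \end{pmatrix},
$$

so the item correlates with the second and third factors even though its pattern loadings on them are zero.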

Da C posted on Wednesday, November 04, 2009 - 1:46 pm

Thank you very much for your quick reply!

I have yet another question concerning my analyses.

I first ran an EFA with categorical/ordinal indicators. This analysis was weighted and I used the WLSMV estimator. Based on the interpretability of this EFA and the fact that there were 3 eigenvalues > 1, I chose the 3-factor solution.

Based on the 3-factor simple structure I identified in the EFA, I ran a CFA model.
As I described in the previous post, I ran a 3-factor CFA model with categorical/ordinal indicators. The means of the 3 factors were set to 0 and their variances set to 1. The loading of the first indicator within each factor was set free. This analysis was weighted and I used the WLSMV estimator.

My question concerns the latent correlations from the EFA 3-factor solution and the CFA confirming the 3-factor model. The CFA correlations are much greater than the EFA correlations:

1st & 2nd factor: 0.47 vs. 0.79
1st & 3rd factor: 0.57 vs. 0.92
2nd & 3rd factor: 0.46 vs. 0.73

Why would this occur? Does this indicate an issue with the model or latent correlation estimation in EFA or CFA? Which one is the correct one?

Thank you!

Linda K. Muthen posted on Wednesday, November 04, 2009 - 5:43 pm

When you create the simple structure CFA and fix factor loadings to zero, this influences the correlations between the factors by forcing the relationship to go through the factors. See the Asparouhov and Muthen and Marsh papers on the website under ESEM. This issue is discussed.

Daiwon Lee posted on Monday, March 08, 2010 - 2:32 pm

Hi,

I am trying to confirm my measurement model by applying CFA to each latent construct.
However, when I run a CFA using three items, perfect fit seems to occur automatically. Every time I use CFA with three-item constructs I get .000 for RMSEA and SRMR and 1.000 for CFI/TLI.
Please see the syntax below.

TITLE: CFA for material strain w1 base model;

DATA:
File is .dat ;

VARIABLE:
NAMES ARE
.....

usevariables are
clothes1 allow1 goods1;
Missing are all (-9999) ;

MODEL: matstrw1 BY clothes1 allow1 goods1;

OUTPUT: sampstat modindices(all) residual patterns FSDETERMINACY;

Please help me!
Thank you.

Linda K. Muthen posted on Monday, March 08, 2010 - 5:06 pm

The reason that you get this is because a factor model with three-indicators is just-identified. Fit cannot be assessed. There are no degrees of freedom.
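
The degrees of freedom can be counted directly. With $p$ indicators, the covariance matrix supplies $p(p+1)/2$ distinct elements, and a one-factor model with the first loading fixed at one has $(p-1)$ loadings, one factor variance, and $p$ residual variances to estimate (the means and intercepts cancel out):

$$
df = \frac{p(p+1)}{2} - \big[(p-1) + 1 + p\big] = \frac{p(p+1)}{2} - 2p .
$$

For $p = 3$ this gives $6 - 6 = 0$, so the model reproduces the data perfectly whatever the data are. For $p = 4$ it gives $10 - 8 = 2$, which is the smallest case in which fit can be tested.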

Daiwon Lee posted on Monday, March 08, 2010 - 9:26 pm

Thank you for the note.
Then is there any way to tell whether a just-identified model fits well or not?
In other words, how can we tell whether a model with a three-item underlying construct fits well or not?
Thank you!

Linda K. Muthen posted on Tuesday, March 09, 2010 - 6:53 am

No. That is why it is a good idea to have no fewer than four factor indicators.

Brian Hall posted on Thursday, March 18, 2010 - 3:24 pm

Hi,
I am working on a couple of CFA models.
Here are the model statements:
MODEL: F1 BY x1 x2 x3 x4 x5;
F2 BY y1 y2 y3 y4 y5 y6 y7;
F3 BY z1 z2 z3 z4 z5;
F4 BY F1 F2 F3;

MODEL: F1 BY x1 x2 x3 x4 x5;
F2 BY y1 y2;
F3 BY z3 z4 z5 z6 z7 z1 z2 z3;
F4 BY zz4 zz5;
F5 BY F1 F2 F3 F4;

I am getting the following error:
WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE.
Model1 PROBLEM INVOLVING VARIABLE F2.
Model2 PROBLEM INVOLVING VARIABLE F4.

I do have in both cases correlations with the latent variable F2 or F4 that exceed 1, and errors that are negative.

My question is how best to deal with this issue.

I have tried fixing the variance of the offending variances to 0 using this command: F2@0 or F4@0. When I do this, my models become unidentified.

I have tried using different start values, even high values using this command: F2*.7 or F2*2, and this did not change the PSI error.

I have tried fixing the variance of those latent variables to 1 using this command: F2@1 or F4@1. This allowed the models to run (Rindskopf; Psychometrika, 1983). Is this a correct thing to do?
I appreciate your help.
Brian

Linda K. Muthen posted on Thursday, March 18, 2010 - 4:14 pm

When you fix the factor variances to one, you should free the first factor loading, for example,

F1 BY x1* x2 x3 x4 x5; f1@1;

This might allow you to see if the problem is that the first factor loading is not close to one so fixing it to one is a problem. You should not both fix one factor loading to one and fix the factor variance to one.

finnigan posted on Tuesday, April 20, 2010 - 6:41 am

Hi There

I have a 56-item scale answered on a 5-point Likert scale. I am planning to do a two-level CFA.

Unfortunately the scale items are not normally distributed despite attempts to use log transformations.

Am I correct in saying that a factor analysis can be conducted in MPLUS using MLR, which adjusts chi-square for non-normality, while using ML under non-normality should give larger standard errors? It might be useful to compare the two outputs.

Can MPLUS provide corrected correlations as part of the output?

Thanks

Linda K. Muthen posted on Tuesday, April 20, 2010 - 6:57 am

If the items have floor or ceiling effects, you should not transform them. You should treat them as categorical variables. You can then use the default of weighted least squares or use maximum likelihood. The categorical data methodology using either estimator deals with the floor and ceiling effects.

If you do not have floor or ceiling effects, you should not transform them but instead use MLR.

finnigan posted on Tuesday, April 20, 2010 - 8:17 am

Hi Linda

How are floor and ceiling effects detected in data?

Linda K. Muthen posted on Tuesday, April 20, 2010 - 8:55 am

By looking at the univariate frequencies. If the lower or upper categories have a piling up of frequencies, this indicates a floor or ceiling effect. In this case, the variables should be treated as categorical.
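
The two routes described in this exchange might be sketched as alternative fragments, as below. The item names item1-item56 are hypothetical, and only one of the two fragments would be used in a given input.

```mplus
! Route 1 (floor or ceiling effects present): treat the items as categorical
VARIABLE:  CATEGORICAL = item1-item56;

! Route 2 (no floor or ceiling effects): keep the items continuous, use MLR
ANALYSIS:  ESTIMATOR = MLR;
```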

Michael Spaeth posted on Tuesday, April 20, 2010 - 9:00 am

I would not log transform, i.e. produce something that you don't have (a normal distribution) and make the interpretation of your results more difficult. I would use MLR instead. In the case of nonnormal distributions, I guess MLR SEs are always more trustworthy than ML SEs, so a comparison may make no sense. Floor and ceiling effects should be detected by inspecting a graphical display of the distribution, like a histogram. Many values at the higher end of your distribution indicate a ceiling effect, many values at the lower end a floor effect.

finnigan posted on Wednesday, April 21, 2010 - 9:01 am

Is there any recommended cut-off criterion, such as 20% of observations located in the lowest or highest response category?

Thanks again for the help. If so, are there any references to support the cut-off?

Linda K. Muthen posted on Wednesday, April 21, 2010 - 9:08 am

It is the bivariate tables that are most important. They should not contain zero cells.

Enrique posted on Thursday, April 22, 2010 - 10:07 am

I used a questionnaire with 25 Likert-type items (5 levels), sample n = 129. The variables are ordinal and the distribution is multivariate non-normal. I ran an EFA with promax rotation, which showed that the 25 items group into 4 factors: all variables load > 0.55 except 1, and 4 variables have cross-loadings > 0.25. When I ran a CFA on these factors, the fit was very poor (RMSEA = 0.20; CFI = 0.84; SRMR = 0.17). The same happens when I change paths according to the suggested modification indices, choose 1, 2 or 3 factors, or eliminate the items with cross-loadings. I need some help finding the source of the misfit.

thanks

Bengt O. Muthen posted on Thursday, April 22, 2010 - 6:05 pm

Why don't you use the Mplus default EFA rotation method which will give you SEs for all factor loadings so you can see which ones should not be fixed at zero in the CFA. It also gives you Modification Indices for residual correlations.

Or, don't move to CFA but stay with ESEM - see our web site.

Enrique posted on Friday, April 23, 2010 - 2:51 am

Thanks, Sorry but I'm not familiar with that acronym, what is SEs?

Linda K. Muthen posted on Friday, April 23, 2010 - 8:47 am

These are standard errors of the parameter estimates. The ratio of the parameter estimate to its standard error is a z-test which assesses significance.

finnigan posted on Saturday, April 24, 2010 - 12:16 am

Linda

If I have data that appear to be MCAR but demonstrate significant non-normality, can MLR still be used if there are no ceiling effects? Or should FIML be used on account of the missing data, and if so, what deals with the non-normality of the data?

Thanks

Linda K. Muthen posted on Saturday, April 24, 2010 - 11:50 am

MLR is the best choice you have in this case. It is full-information maximum likelihood.

finnigan posted on Tuesday, April 27, 2010 - 8:16 am

Do MLM and MLR estimation with non-normal data adjust the correlations among observed and latent variables for non-normality?

Linda K. Muthen posted on Tuesday, April 27, 2010 - 9:49 am

MLM and MLR are robust to non-normality. This does not affect the correlations among observed or latent variables. It affects the standard errors.

Mario Mueller posted on Tuesday, July 06, 2010 - 2:52 am

Hello

I'm running a longitudinal CFA with 6 time points and 4 indicators per time point. I specified my model as in example 5.1, as the initial model in a sequence testing measurement invariance.
Although my sample is only N=380, I got a large, significant chi-square. Could there be a reason for this other than poor model fit?

Thanks
M.

Linda K. Muthen posted on Tuesday, July 06, 2010 - 8:08 am

Perhaps the four indicators are not unidimensional. Or they may be non-normal.

ann bell posted on Tuesday, July 13, 2010 - 11:05 am

What does it mean when the output for MPLUS reads this:

NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.

as opposed to giving me fit scores.

Linda K. Muthen posted on Tuesday, July 13, 2010 - 11:28 am

It means that the model was not able to converge in the default number of iterations. See pages 415-417 of the Version 6 User's Guide for suggestions. If these do not help, send the output and your license number to support@statmodel.com.

margot peeters posted on Friday, July 30, 2010 - 3:49 am

I am having some trouble with my analysis. I am running a CFA for my path analysis. The model fit of the latent factors is good (CFI = 1, RMSEA = 0); however, the two-tailed p-values of the unstandardized results are non-significant for some indicators.
I already rescaled (log-transformed) my variables since the variance was very large, and I constrained the variance of the factors (@1).

Furthermore, I receive the warning:
THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.
The problem involves my indicators.
My data includes negative values > could this be the problem? And could it also affect the significance of the indicators?

Thank you for your time!

Linda K. Muthen posted on Friday, July 30, 2010 - 6:43 am

Please send the full output and your license number to support@statmodel.com.

Robert Urban posted on Sunday, August 01, 2010 - 12:04 am

Dear Dr. Muthén,

I have performed a simple two-factor CFA and obtained a positive correlation between the factors (.50), which is contrary to my expectation (I expected a negative correlation).
Then I calculated the scale scores by hand by summing the appropriate items, and the correlation between them was negative (-.15), as I would expect.
Do you have any idea about this discrepancy? Which correlation should I rely on and which should be reported?

Best regards,

Robert

Linda K. Muthen posted on Sunday, August 01, 2010 - 10:13 am

It sounds like you may be reading the data incorrectly or that the factor loadings are all negative resulting in a sign change for the covariance. I would need to see the full output, your calculations, and your license number at support@statmodel.com to say for sure.

nanda mooij posted on Wednesday, August 04, 2010 - 11:17 am

Hi, I have a question about my model, it doesn't fit (not positive definite). This is what I'm trying to fit:
f1 BY v1-v16;
f2 BY v17-v32;
f3 BY v33-v48;
h1 BY f1 f2 f3;
In the output I saw that the correlation between f1 and h1 is above 1. I've tried to fix f1 at zero, like this: f1@0, but that doesn't help. I also tried to add a correlation to the model: h1 WITH f1, and tried to fix the correlation with the factor loadings freed, as I read in the discussions, but that also doesn't make the model fit. So what more can I do to fit the model when there is a high correlation between two factors?

Thanks very much!

Linda K. Muthen posted on Wednesday, August 04, 2010 - 11:34 am

How does the model look without the second-order factor, for example,

f1 BY v1-v16;
f2 BY v17-v32;
f3 BY v33-v48;

I suspect you have problems already then.

nanda mooij posted on Wednesday, August 04, 2010 - 12:07 pm

I tried this, and it fits, no problems. What does this mean? I want to present the results of the original model, so is there a way to fit this model with a second-order factor?

Thanks, Nanda

Linda K. Muthen posted on Wednesday, August 04, 2010 - 2:44 pm

When you add the second-order factors, I assume that the residual variance of f1 is negative and this is why you fix it to zero. Is this the case?

nanda mooij posted on Wednesday, August 04, 2010 - 3:20 pm

Yes, Linda, this is the case. Also, the output says the R-square of f1 is undefined. When I fix the res. var. of f1 to zero, Mplus says the covariance matrix is not positive definite, while I don't see anything strange about it, no negative values in that matrix.

nanda mooij posted on Thursday, August 05, 2010 - 6:37 am

How can I solve this problem? Do I have to leave the second order factor out of the model or can I still fit this model in another way?

Thanks for all your help!

Linda K. Muthen posted on Thursday, August 05, 2010 - 9:25 am

If the negative residual variance is small and not significant, you can fix the residual variance at zero and ignore the not positive definite message. If the residual variance is zero that means that the 1st-order factor represents the 2nd-order factor perfectly.
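
The link between the residual variance of f1 and its correlation with h1 can be sketched numerically. The following is a minimal illustration, assuming the second-order relation f1 = loading * h1 + residual; the function name and all numbers are hypothetical and are not taken from any output in this thread.

```python
import math

def implied_corr(loading, var_h, resid_var):
    """Model-implied correlation between a first-order factor f1 and the
    second-order factor h1, assuming f1 = loading * h1 + residual."""
    var_f1 = loading ** 2 * var_h + resid_var   # implied variance of f1
    cov = loading * var_h                       # implied cov(f1, h1)
    return cov / math.sqrt(var_f1 * var_h)

# Hypothetical values chosen only for illustration.
print(implied_corr(0.8, 1.0, 0.36))   # positive residual variance: below 1
print(implied_corr(0.8, 1.0, 0.0))    # residual variance of 0: 1 up to rounding
print(implied_corr(0.8, 1.0, -0.05))  # negative residual variance: above 1
```

Under these assumptions, a residual variance of zero gives a correlation of 1, which is the sense in which the first-order factor represents the second-order factor perfectly, and a negative residual variance pushes the correlation above 1.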

nanda mooij posted on Thursday, August 05, 2010 - 2:01 pm

Thanks for the answer. If I want to get the factor scores of f1, f2, f3 and h1, can I fit the model without h1, so that the factor scores will be calculated for f1, f2 and f3. Can I then assume that the factor scores of f1 are the same as the factor scores of h1?

Thanks

Linda K. Muthen posted on Thursday, August 05, 2010 - 4:25 pm

I don't think this will work. Try fixing the f1 variance at .001.

Richard Hermida posted on Wednesday, October 06, 2010 - 2:54 pm

Hello,

I am conducting a CFA with ordered categorical indicators.

I conducted the CFA with the categorical option specified, and was wondering how best to interpret the fit indices associated when the WLSMV estimator is used.

I ask because it seems highly unlikely to me that the fit indices under WLSMV are directly and perfectly comparable to fit index values that are obtained in the "usual" CFAs conducted in Psychology with Max.Likelihood and indicators that are treated as continuous.

I realize that "rules of thumb" are usually overly simplistic, but are there any journal articles that address how to interpret fit indices under WLSMV?

Thank you for your time.

Linda K. Muthen posted on Thursday, October 07, 2010 - 9:58 am

See the Yu dissertation on our website where cutoffs are looked at for binary and continuous outcomes. She finds similar cutoffs in both cases.

Yu, C.Y. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Doctoral dissertation, University of California, Los Angeles.

Dallas posted on Saturday, October 30, 2010 - 4:28 am

I have a question. I'd like to conduct a full-information factor analysis of categorical data. I have 19 indicators of a single trait. I understand that I simply have to specify the estimator to get this. I've done that. However, I would like to allow some of the uniquenesses to correlate in the model. When I do this with ML, I get an error saying "Covariances for categorical, censored, count or nominal variables with other observed variables are not defined". How can I go about allowing correlated errors in a factor model with categorical data and maximum likelihood estimation?

Linda K. Muthen posted on Saturday, October 30, 2010 - 7:40 am

You can do this using the BY option as illustrated in Example 7.16. Note that each residual covariance adds one dimension of integration to the analysis.

Resmi Gupta posted on Tuesday, November 02, 2010 - 7:42 pm

I am conducting a CFA for polytomous items using WLSMV. Is there a way to get factor scores? Here is my code:

y1 by a1*, a2, a3, a4, a5;
y2 by a6*, a7, a8, a9;
y1@1 , y2@1;
y1 with y2;

Thanks

Linda K. Muthen posted on Wednesday, November 03, 2010 - 5:30 am

Use the FSCORES setting of the SAVE option in the SAVEDATA command. See the user's guide for further information.

Resmi Gupta posted on Wednesday, November 10, 2010 - 7:23 pm

The distribution of my factor scores is not normal. Is that because I have highly skewed items? Would a mixture distribution be a better choice here? Thanks.

Jana Buchmann posted on Thursday, November 11, 2010 - 6:46 am

Dear Dr. Muthen,

I need your advice on a simple CFA with one factor. I get an error message I don't recognize. Even though the variable names are correct and missing values are defined as 99, I get this error: "unable to expand" the factor. Please help me. Thanks.

The following syntax was used:
VARIABLE:
Names are tsk1_a tsk2_a tsk3_a tsk5_a tsk6_a tsk7_a
tsk9_a tsk10_a tsk11_a tsk13_a tsk14_a tsk15_a tsk17_a;
Missing is ALL (99);
Model:
TSK-13 by tsk1_a tsk2_a tsk3_a tsk5_a tsk6_a tsk7_a
tsk9_a tsk10_a tsk11_a tsk13_a tsk14_a tsk15_a tsk17_a;
Analysis:
estimator=mlr;
Output: Standardized;
modindices;
residual;

*** ERROR
Unable to expand: TSK-13

Linda K. Muthen posted on Thursday, November 11, 2010 - 6:59 am

Jana: Variable names cannot contain a dash. The dash is used to specify a list. Rename the factor tsk13 or something else without a dash.

Bengt O. Muthen posted on Thursday, November 11, 2010 - 7:57 am

Resmi:

Skewed items can cause a non-normal factor score distribution. This is not necessarily an indication that the normality assumption for the factor is wrong, but rather that the items are not optimal - they don't discriminate well between people with high or low factor scores.

ywang posted on Friday, November 12, 2010 - 11:43 am

Dear Drs. Muthen,

We would like to report the t-test results of a latent factor. Is it possible to conduct a T-test for a latent factor underlying three indicator variables with Mplus?

Thanks!

Linda K. Muthen posted on Friday, November 12, 2010 - 3:49 pm

What would the t-test test?

ywang posted on Friday, November 12, 2010 - 7:17 pm

For example, the t-test will test whether the factor score is equal between males and females.

Thanks!

Bengt O. Muthen posted on Saturday, November 13, 2010 - 3:51 pm

For this you do a 2-group analysis with measurement invariance and estimate the factor mean in one group with the mean fixed at zero in the other group. The z-test for that mean is then what you want. See Topic 1. You don't do it via estimated factor scores.
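
The z-test mentioned above is the estimated factor mean divided by its standard error. A minimal sketch follows, assuming a hypothetical estimate and standard error read off the output for the group whose factor mean is free; the function name and numbers are illustrative only.

```python
import math

def z_test(estimate, std_error):
    """Return the z statistic and two-sided p-value for H0: parameter = 0."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided standard normal tail area
    return z, p

# Hypothetical factor mean in one group, with the mean fixed at zero in the other.
z, p = z_test(estimate=0.30, std_error=0.12)
print(f"z = {z:.2f}, p = {p:.4f}")
```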

Dr George Chryssochoidis posted on Monday, November 15, 2010 - 3:52 pm

Dear all,
I have 2000 respondents who have each rated two out of a set of 10 (randomly rotated) stimuli. The respondents answered 20 questions for each stimulus, so there are 400 responses per stimulus x 20 variables. The full dataset contains 4000 judgments [2000 respondents x 2 stimuli each] x 20 variables.
I want to perform EFA and CFA for each stimulus, but I am still unclear about how to treat the 'repeated measures' nature of the exercise given the random rotation of the stimuli shown to respondents. Any comments?

Furthermore, a separate issue.. Some of the stimuli are nested within others in terms of attributes. So, this also is a 'nested' EFA / CFA problem. Any comments on that front too?

Regards

Bengt O. Muthen posted on Tuesday, November 16, 2010 - 10:39 am

I should preface this with saying that I am not an expert on this type of design, but here is my quick take.

The random rotation of stimuli assures that the subject groups corresponding to different pairs of stimuli are randomly equivalent so that you can say that all stimuli responses draw from the same population.

I would just do one analysis per stimulus pair, using 400 subjects for each analysis, where the number of variables is 2x20=40.

If you feel there are important differences between stimulus pairs, you could do a multi-group analysis for the 5 groups to compare factor solutions.

Regarding your last question, I don't see a special modeling for this, but would merely keep this nesting in mind when interpreting the findings.

ywang posted on Monday, November 29, 2010 - 12:23 pm

Dear Drs. Muthen:
For a CFA model using MLR, the scaling correction factor for the loglikelihood for H0 and H1 is 1.168, but the scaling correction factor for the chi-square is 1.0. For this model, should we use MLR or ML? Is there a cutoff for the scaling correction factor that indicates whether we should use MLR or ML? Also, are the chi-square and loglikelihood scaling correction factors always different?

Thanks!

Linda K. Muthen posted on Tuesday, November 30, 2010 - 9:29 am

The scaling correction factors differ for chi-square and the loglikelihood due to the fact they are in different metrics. Generally speaking you should use MLR if the scaling correction factor is different from one.

Catherine posted on Friday, March 04, 2011 - 4:47 am

Dear Dr Muthen,

I want to test a 3 factor model with categorical variables.
Now the 3 factors only contain 19 of the total 26 variables.
What should I do with the other 7 variables?

Should they still be in the USEVARIABLES option or not?

And how will I know whether these variables load on one of the three factors?

Hope you can help me,

Cat

Linda K. Muthen posted on Friday, March 04, 2011 - 7:05 am

All variables on the USEVARIABLES list are used in the analysis. You should leave the variables off if you don't want them included. If you want to see how they load on the factors, you need to include them. It sounds like you should start with an EFA of the total set of variables.

Resmi Gupta posted on Friday, March 18, 2011 - 6:24 pm

For a one-factor CFA solution, I am getting the error message "chi-square value could not be computed because of too many categories". Can you please help me understand why I am getting this message? I have 11 items in total, which are ordinal (1-5 scale).
Thanks
Reshmi Gupta

Linda K. Muthen posted on Sunday, March 20, 2011 - 9:46 am

I have not seen this message. Please send the full output and your license number to support@statmodel.com.

Mohamed Abou-Shouk posted on Thursday, March 31, 2011 - 10:27 am

Hi,
I have 3 latent variables, each with 3 indicators. The indicators are all 5-point Likert scales.

My question is:
Is a Likert-scale indicator treated as a continuous or a categorical variable? And, accordingly, can I say that my latent variable is categorical (expressed by c) or continuous (expressed by f) when I run the CFA model?
Thanks

Linda K. Muthen posted on Thursday, March 31, 2011 - 11:18 am

If you put the Likert variables on the CATEGORICAL list, they are treated as categorical. If you do not, they are treated as continuous. In both cases, the latent variables specified using the BY option are continuous.

robert klein posted on Thursday, March 31, 2011 - 3:09 pm

I ran a CFA and the residuals and the fit indices sound too good to be true (RMSEA = .00, CFI= 1.00). The factor coefficients are between 0.6 and 0.91, the alphas for the factors are .89-.95, and the correlations between items belonging to the same factor range between 0.7-0.98.

1. I am guessing there is a lot of multicollinearity in this set up. Hence, the high fit. Correct me if I am wrong. ---Is there a way to deal with this problem? I read somewhere that centering might be a technique that people use. Any suggestions?

2. A factor requires at least 4 items. He is left with two items in a factor (in another model). But these are important items. How do people normally deal with this situation (you want to include the items, but do not want to construct a factor with 2 items)?

Thanks
Rob

Linda K. Muthen posted on Friday, April 01, 2011 - 9:39 am

Regarding the fit, please send the output and your license number to support@statmodel.com.

Regarding the two items, if that is all you have you have no choice but to use them knowing the pitfalls. The model is not identified unless information is borrowed from other parts of the model and model fit for that factor cannot be assessed.

Mohamed Abou-Shouk posted on Monday, April 11, 2011 - 12:10 pm

Hi,
I have two latent variables (the two main constructs of my model). I ran a CFA for the first latent variable and then a CFA for the second, and I got good fit indices for each construct.
Now I want to check the covariance between these two latent variables, but I do not know how to do that. Could you please give me some advice?
Thanks.

Linda K. Muthen posted on Monday, April 11, 2011 - 2:14 pm

If you include both factors in the MODEL command, the covariance will be estimated as the default.

Richard Hermida posted on Tuesday, April 26, 2011 - 4:42 pm

Hello,

I am interested in testing measurement equivalence of a model across levels of several ordered categorical variables.

Can you recommend any citations related to best practices in this type of situation?

Also, can MPLUS handle this type of procedure? If so, can you point me in the (very) general direction of how to conduct such an analysis?

Thank you very much.
Richard Hermida

Linda K. Muthen posted on Wednesday, April 27, 2011 - 9:12 am

I'm sending you a paper you may find helpful.

You can start from the inputs found in the Topic 2 course handout on the website under multiple group analysis. This looks at binary items. You can extend it using the inputs shown in the Topic 4 course handout under multiple indicator growth. This uses ordinal items.

Richard Hermida posted on Wednesday, April 27, 2011 - 6:21 pm

Dr. Muthen,

Thank you very much for the response and for the sent article. I will read the article (probably multiple times) with great interest.

I think my last post (4/26) might have been a bit unclear.

I am interested in assessing if a measure differs not between different groups, but across the continuum of a continuous variable (technically measured by ordered categorical items).

For example, if I were to hypothesize a particular 4-factor model for my data, could I test if this model holds as values for a continuous variable increased or decreased across a continuum of values?

Is that possible? Or would I need to split the continuous variable into multiple groups to conduct any test of measurement equivalence?

Thanks again. My apologies if this is indeed covered in the materials you indicated earlier.

Richard Hermida

Linda K. Muthen posted on Thursday, April 28, 2011 - 10:33 am

There are two things you can do:

1. Categorize the continuous variable and use multiple group analysis.
2. Use the XWITH option to create an interaction between the factor and the continuous variable and regress the factor indicators on the interaction.

Eser Sekercioglu posted on Sunday, May 22, 2011 - 5:18 am

Hi,

I have an annoying problem. I am running a multigroup CFA. When I try to save factor scores using SAVEDATA, the fit statistics and factor loadings I get are different from those of the model without the SAVEDATA command. Otherwise everything is identical between the two specifications.

Any ideas?

Linda K. Muthen posted on Sunday, May 22, 2011 - 5:40 am

Please send the files and your license number to support@statmodel.com.

Lela Williams posted on Monday, May 23, 2011 - 8:09 am

Hi,

Do you know of any rules-of-thumb for the maximum number of indicator variables you should have on a latent factor? I read some ideas about parceling variables but it seems controversial….so I’m wondering at what point you start worrying about the number of items you have for a scale.

Thanks.

Linda K. Muthen posted on Monday, May 23, 2011 - 9:09 am

I would recommend no less than four indicators for the reason that you can't test the fit of a factor model with less than four indicators.
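
The four-indicator recommendation can be seen from a parameter count. This is a minimal sketch, assuming a single factor with continuous indicators, no correlated residuals, a covariance structure only, and the factor metric set by one constraint; the function name is hypothetical.

```python
def one_factor_df(n_indicators):
    """Degrees of freedom of a one-factor CFA fitted to a covariance matrix."""
    p = n_indicators
    sample_moments = p * (p + 1) // 2   # observed variances and covariances
    # p loadings + p residual variances + 1 factor variance - 1 scaling constraint
    free_params = 2 * p
    return sample_moments - free_params

for p in (2, 3, 4, 5):
    print(p, one_factor_df(p))
```

With two indicators the count is negative (not identified on its own), with three it is zero (just-identified, so fit cannot be tested), and from four indicators onward there are positive degrees of freedom.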

Lela Williams posted on Monday, May 23, 2011 - 1:18 pm

Hi,

Thanks but I was concerned about the maximum of indicators. For example, is 12 items, 20 items, etc. too many items for one latent factor? Thank you.

Linda K. Muthen posted on Tuesday, May 24, 2011 - 5:43 am

I don't think there is an upper limit on the number of factor indicators. If you have 15 unidimensional items, that should be sufficient for creating a sum score. So between 4 and 15 should be optimal.

Elisabet Solheim posted on Thursday, May 26, 2011 - 5:09 am

Drs Muthens,

I have run a CFA with 28 categorical items and three factors. I have a fairly large, weighted sample (n = 875).

The fit of the model is not acceptable. However, if I delete 2 items which have low R2 and non-significant factor loadings, I get a better fit.

I wonder however if there is a way to test whether the fit of this modified 26 item model is significantly better than the original 28 item model?

Thank you,
Elisabet

Linda K. Muthen posted on Thursday, May 26, 2011 - 10:53 am

I don't know of such a test.

Oana Lup posted on Tuesday, May 31, 2011 - 5:44 am

Could you please help?

I am testing the significance of the difference between the intercepts for East European and West European countries.

USEVARIABLES ARE poldisc polint media pid female age agesq educlev
emp mar urban swi rus por den wger eger nl slo nor ro mold sp;
CENTERING = GRANDMEAN (poldisc, polint, media, age, agesq, educlev);
Missing are all (-999);

MODEL:

poldisc ON polint media pid female age agesq educlev emp mar urban
swi(mswi)
rus(mrus)
por(mpor)
den(mden)
wger(mwger)
eger(meger)
nl(mnl)
slo(mslo)
nor(mnor)
ro(mro)
mold(mmold)
sp(msp)
!swe (mswe)
;

[poldisc] (mint);

MODEL CONSTRAINT:
NEW = (hm);

hm =(((5*mint)+mrus+meger+mslo+mro+mmold)/5)-
(((8*mint)+mswi+mpor+mwger+mnl+mnor+msp+mswe)/8);

OUTPUT: TECH1;

and I get these warnings :
*** WARNING
Warning: No specification of mean structure analysis in 'ANALYSIS' paragraph.
*** ERROR in Model Constraint command
Unknown parameter label in MODEL CONSTRAINT: NEW
in assignment: NEW = (HM)

thanks very very much,
Oana

Linda K. Muthen posted on Tuesday, May 31, 2011 - 7:10 am

It sounds like you may be using an old version of the program where you must specify TYPE=MEANSTRUCTURE; to include means in the model.

NEW should not be followed by an equal sign. It should be

NEW (HM);

Oana Lup posted on Tuesday, May 31, 2011 - 2:23 pm

Thanks very much! I added TYPE=MEANSTRUCTURE and this indeed sorted out the problem.
I also removed the equal sign and ran it again, but I am still getting this error message.

ERROR in Model Constraint command
Unknown parameter label in MODEL CONSTRAINT: NEW(HM)
in assignment: NEW(HM) =

really hope you can find out what this is.

many many thanks!!
Oana

Linda K. Muthen posted on Tuesday, May 31, 2011 - 2:25 pm

There should not be an = sign. Please see the NEW option in the user's guide for the correct specification.

Oana Lup posted on Wednesday, June 01, 2011 - 4:25 am

Yes I did. now my script looks like

MODEL CONSTRAINT:
NEW(hm);

hm =(((5*mint)+mrus+meger+mslo+mro+mmold)/5)-
(((7*mint)+mswi+mpor+mwger+mnl+mnor+msp)/7);

but I am still getting the error message:
*** ERROR in Model Constraint command
Unknown parameter label in MODEL CONSTRAINT: NEW(HM)
in assignment: NEW(HM) =

am really not understanding where the problem is :-(

thanks,
Oana

Linda K. Muthen posted on Wednesday, June 01, 2011 - 5:21 am

Please send the full output and your license number to support@statmodel.com.

Oana Lup posted on Thursday, June 02, 2011 - 5:33 am

thanks very much. there was a problem with our school program i think. now it works. many thanks,

Oana
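
The quantity hm defined in the MODEL CONSTRAINT expression above can be checked by hand. The sketch below mirrors that expression, assuming that the commented-out country (swe) is the reference category with a coefficient of zero; the coefficient values are hypothetical and only the labels come from the input.

```python
def hm(mint, east, west):
    """Average intercept of the eastern group minus that of the western group,
    written the same way as the MODEL CONSTRAINT expression."""
    east_avg = (len(east) * mint + sum(east)) / len(east)
    west_avg = (len(west) * mint + sum(west)) / len(west)
    return east_avg - west_avg

# Hypothetical dummy-variable coefficients keyed by the labels used in the input.
east = {"mrus": 0.10, "meger": -0.05, "mslo": 0.20, "mro": 0.15, "mmold": 0.00}
west = {"mswi": 0.30, "mpor": -0.10, "mwger": 0.25, "mnl": 0.40,
        "mnor": 0.35, "msp": 0.05,
        "swe_reference": 0.0}  # assumption: reference country, coefficient 0

e, w = list(east.values()), list(west.values())
print(hm(1.5, e, w))
print(hm(0.0, e, w))  # same value up to rounding: mint cancels out
```

Because mint enters both averages with weight one, it drops out of the difference, so hm reduces to the difference between the two groups' mean dummy coefficients.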

Elisabet Solheim posted on Thursday, June 02, 2011 - 6:09 am

Drs Muthens,

I am running a CFA with 28 categorical
indicators and three latent factors. I am trying to test for gender invariance:
using the command:

Grouping is GENDER (1 = female 2 = male);

However I get the following error message:

*** ERROR
Based on Group 2: Group 1 contains
inconsistent categorical value for STRS26_1: 5

What does this mean? Is there a way to work around it?

Thank you,
Elisabet

Linda K. Muthen posted on Thursday, June 02, 2011 - 9:40 am

With the default WLSMV estimator, categorical variables must have the same categories in each class. You can collapse categories to achieve this.
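
Collapsing can be done before the data are passed to Mplus. The following is a minimal sketch with hypothetical data and helper names, in which category 5 is observed in only one group and is merged into category 4.

```python
def categories_by_group(values, groups):
    """Map each group label to the set of categories observed in that group."""
    seen = {}
    for value, group in zip(values, groups):
        seen.setdefault(group, set()).add(value)
    return seen

def collapse(values, mapping):
    """Recode categories according to mapping; unmapped values are unchanged."""
    return [mapping.get(v, v) for v in values]

# Hypothetical responses: category 5 appears only in group 1.
item   = [1, 2, 3, 4, 5, 1, 2, 3, 4, 4]
gender = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

print(categories_by_group(item, gender))
recoded = collapse(item, {5: 4})   # merge category 5 into category 4
print(categories_by_group(recoded, gender))
```

Which categories to merge is a substantive choice; merging the sparse top category into its neighbor is only one option.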

chuma owums posted on Tuesday, June 21, 2011 - 12:08 pm

Drs Muthens,

I am new to Mplus and was wondering how to interpret the confidence intervals of a bootstrapped indirect effect. In particular, the output from my analysis shows that zero was not within the lower and upper 2.5% limits of the bootstrapped CI, but it was within the .5% limits. Do these percentages correspond to 97.5% and 99.5% confidence levels? Any help with this would be greatly appreciated.

Linda K. Muthen posted on Wednesday, June 22, 2011 - 9:31 am

The confidence intervals are 95 and 99 and are interpreted in the regular way.
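
The mapping from the tail percentages in the output to confidence levels is simple arithmetic: each limit cuts off the stated percentage in one tail, so the interval covers 100 minus twice that percentage. A minimal sketch with hypothetical function names and interval limits:

```python
def confidence_level(tail_percent):
    """Confidence level (in percent) of a two-sided interval whose lower and
    upper limits each cut off tail_percent of the distribution."""
    return 100.0 - 2.0 * tail_percent

def excludes_zero(lower, upper):
    """True when zero lies outside the interval [lower, upper]."""
    return lower > 0.0 or upper < 0.0

print(confidence_level(2.5))  # 95.0
print(confidence_level(0.5))  # 99.0

# Hypothetical limits for an indirect effect.
print(excludes_zero(0.02, 0.31))    # True
print(excludes_zero(-0.03, 0.36))   # False
```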

Amang Sukasih posted on Friday, June 24, 2011 - 10:03 am

Dear Linda and Bengt,

In the situation where the CFA produces the following message:

MINIMIZATION FAILED WHILE COMPUTING FACTOR SCORES FOR THE FOLLOWING
OBSERVATION(S) :
457 FOR VARIABLE V32011
484 FOR VARIABLE V32001
860 FOR VARIABLE V32001

how does Mplus identify such cases (in this example, cases 457, 484, and 860)? Can you please point me to a reference that describes the methodology?

Thanks,
Amang

Bengt O. Muthen posted on Friday, June 24, 2011 - 5:35 pm

Perhaps you have categorical outcomes in which case this type of iterative optimization takes place for each subject in line with our appendix 11 of

http://www.statmodel.com/download/techappen.pdf

The message means that for these subjects the optimum could not be found by the iterative technique, perhaps due to an unusual response vector. Also, be sure that you use the latest 6.11 version of Mplus.

Marissa Ericson posted on Tuesday, July 05, 2011 - 2:13 pm

I am having a very strange problem. In SPSS one of my variables has a mean of 0 but then in mplus it says that the mean is 11. I have tried everything, including combing through the spss and text files by hand to see if there are large values or if something got read in incorrectly. I have no idea what could be going on. Thank you!

Linda K. Muthen posted on Tuesday, July 05, 2011 - 3:51 pm

It sounds like you are reading the data incorrectly. You may have blanks in your data set. Free format data cannot contain blanks. If you can't see the problem, please send your output, data, and license number to support@statmodel.com.

Lena Herich posted on Friday, August 19, 2011 - 11:16 am

Hello !

I was reading your article “ Applications of continuous-time survival in latent variable Models for the analysis of oncology randomized clinical trial data using mplus.”.
In chapter five, several different latent variable models are fit to the data.
My question is why exploratory factor analysis, rather than confirmatory factor analysis, was used for the 1f, 2f and 3f models. Would it also have been possible to fit confirmatory models, and what would the differences be?

Bengt O. Muthen posted on Friday, August 19, 2011 - 6:16 pm

That's certainly possible, but there was not really any well-formed substantive theory behind the measurement instrument that called for specific CFAs.

But note that M4 is a CFA.

Eric Teman posted on Friday, August 26, 2011 - 8:31 pm

I have a 3 factor CFA (with 4 indicators per factor). When I output the results ASCII file, factor loading estimates only appear for 3 indicators per latent variable. Is there a reason that all the factor loading estimates do not appear in the ASCII file?

Linda K. Muthen posted on Saturday, August 27, 2011 - 10:26 am

The first indicator is fixed to set the metric of the factor.

Eric Teman posted on Saturday, August 27, 2011 - 3:30 pm

Yes, but shouldn't the standardized output include all factor loading estimates including the one that is fixed to set the metric?

Linda K. Muthen posted on Saturday, August 27, 2011 - 4:43 pm

No, only free parameters are saved.

Nidhi Kohli posted on Wednesday, September 14, 2011 - 5:03 pm

I am running a CFA with 61 continuous indicators and 4 latent factors. However I get the following error message:

*** ERROR
Mismatched parentheses:
WAI2 WITH WAI4(

I have checked the parentheses and they are correctly specified. Why, then, am I getting this error message? Is there a way to work around it?

Thank you

Linda K. Muthen posted on Wednesday, September 14, 2011 - 5:36 pm

Please send the full output and your license number to support@statmodel.com.

Ramya Pratiwadi posted on Tuesday, September 20, 2011 - 10:13 am

Hello,

I am running a two-factor CFA on a dataset containing 107 variables. We keep getting the following warning when trying to run the code:

Warning: The estimation of a model with 107 variables with the WLSMV estimator may be slow. Using the ULSMV estimator will produce more timely results. If analysis with the WLSMV estimator is desired, try specifying NOSERROR and NOCHISQUARE in the output command to reduce computation time.

Here is my input. Do you see any error that could be causing this warning?

ANALYSIS:
ESTIMATOR=WLSMV;

MODEL:
PsyCog by P_088 P_023 P_085 P_101 P_022 P_007
P_019 P_059 P_107 P_054 P_020 P_061 P_004 P_057 P_016
P_097 P_012 P_052 P_011 P_064 P_056 P_006 P_018 P_051
P_065 P_105 P_089 P_047 P_008 P_021 P_111 P_090 P_055
P_118 P_108 P_099 P_106 P_046 P_122 P_049 P_084 P_045
P_048 P_095 P_067 P_014 P_015 P_017 P_109 P_058 P_050
P_102 P_063 P_091 P_112 P_113 P_119 P_041 P_003 P_120
P_002 P_060 P_062 P_121;
Som by S_070 S_029 S_059 S_031 S_017
S_044 S_040 S_063 S_055 S_039 S_048 S_069 S_030 S_019
S_026 S_053 S_062 S_024 S_032 S_072 S_013 S_037 S_042
S_015 S_052 S_058 S_064 S_045 S_066 S_046 S_034 S_054
S_067 S_065 S_033 S_043 S_060 S_049 S_068 S_035 S_011
S_038 S_021;
OUTPUT: standardized res;

Linda K. Muthen posted on Tuesday, September 20, 2011 - 10:26 am

The warning is not due to an error. It is just telling you that with 107 categorical variables, the estimation may be slow.

Paraskevas Petrou posted on Wednesday, October 26, 2011 - 2:42 am

Dear Linda,

Regarding an older post of yours from 2010,
in my multilevel CFA I also get a warning for a negative residual variance of an item at the between level. You said that if it is very small and non-significant, it can be fixed to 0 or 0.001. It's actually -.001 and non-significant in my case. If I fix it, do you think I have to report that in my paper? Is there a reference for that?

Thank you!

Linda K. Muthen posted on Wednesday, October 26, 2011 - 10:08 am

This will not change your solution so I would simply mention it in a footnote. See the following paper which is available on the website:

Muthén, B. & Asparouhov, T. (2011). Beyond multilevel regression modeling: Multilevel analysis in a general latent variable framework. In J. Hox & J.K. Roberts (eds), Handbook of Advanced Multilevel Analysis, pp. 15-40. New York: Taylor and Francis.

Paraskevas Petrou posted on Thursday, October 27, 2011 - 1:30 pm

Thank you very much Linda!

yezi posted on Thursday, November 03, 2011 - 5:52 am

Hi,
Do group factors correlate in bifactor model? Thank you very much!

Kind regards

Yours sincerely

Ye

Linda K. Muthen posted on Thursday, November 03, 2011 - 11:17 am

No. See the input on Slide 159 of the Topic 1 course handout on the website.

Stefanie Hinkle posted on Monday, November 07, 2011 - 7:25 am

Is there a way to output standardized factor scores when using the following command:

SAVEDATA:
SAVE IS FSCORES;

Thanks.

Linda K. Muthen posted on Monday, November 07, 2011 - 1:33 pm

No, this is not possible.

yezi posted on Tuesday, November 08, 2011 - 10:41 pm

Hi,
When we do ESEM, is the construct of the test ECFA (that is, ESEM)? That is to say, does each item load on each factor?
Thank you very much!

Kind regards

Yours sincerely

Ye

Roger E. Millsap posted on Wednesday, November 09, 2011 - 12:11 pm

Hi,
We have a factor analysis model fit in two independent groups with invariance constraints. One unique variance estimate is negative. There are no identification problems. As an experiment, we imposed a positivity constraint on the unique variance in both groups using the model constraint command. The result of that is a model that converges, but we get a message about the standard errors not being computed, and a possible identification problem with parameter 8. Yet there are only 7 free parameters; no parameter 8 exists in the parameter count. Can you tell us what is going on? Why would the model be identified without the constraint, but not identified with the constraint?

Tihomir Asparouhov posted on Wednesday, November 09, 2011 - 1:29 pm

Roger

When a model is estimated with inequality constraints, the constrained parameter is replaced by a so-called slack parameter. In your case, Variance = slack*slack, and this keeps the variance non-negative. The actual parameter in the model is the slack parameter, and that is parameter 8. Typically, in a situation like the one you describe, this parameter will be estimated at a value near 0, and thus the variance itself will be estimated at 0, which is a parameter value on the border of the admissible solutions. In that case the standard methodology for computing standard errors is not valid, and currently Mplus simply reports that as a problem. The bottom line is that in this case the standard error for the variance parameter, if estimated at the boundary value of 0, is not reliable. All other results are fine.

Without looking at the exact results I am not 100% sure of the above answer so if this doesn't make sense for your situation send the example to support@statmodel.com

Tihomir
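
The slack reparameterization described above can be illustrated in a few lines. This is a sketch only, with hypothetical function names, and it is not a description of Mplus internals: it shows that the variance stays non-negative for any slack value and that the derivative of the variance with respect to the slack is zero at the boundary.

```python
def variance_from_slack(slack):
    """Variance expressed through a slack parameter; never negative."""
    return slack * slack

def d_variance_d_slack(slack):
    """Derivative of the variance with respect to the slack parameter."""
    return 2.0 * slack

for slack in (-0.5, 0.0, 0.3):
    print(slack, variance_from_slack(slack), d_variance_d_slack(slack))
```

One way to read the zero derivative at slack = 0: a first-order delta-method standard error for the variance would collapse to zero there, which is consistent with the advice not to rely on the reported standard error at the boundary.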

yezi posted on Wednesday, November 09, 2011 - 4:04 pm

Hi,
When we do ESEM, is the construct of the test ECFA (that is, ESEM)? That is to say, does each item load on each factor?
Thank you very much!

Kind regards

Yours sincerely

Ye

Linda K. Muthen posted on Wednesday, November 09, 2011 - 4:21 pm

The ESEM measurement model is EFA. Each item loads on all factors after rotation.

yezi posted on Thursday, November 10, 2011 - 8:09 am

Hi,
Thank you very much!
How do I estimate the reliability of a test whose measurement model is an ESEM measurement model? Thank you.

Kind regards
Ye

Linda K. Muthen posted on Thursday, November 10, 2011 - 11:53 am

For ESEM, look at the StdY estimates for each item's residual variance. Reliability would be 1 minus the StdY residual variance.
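
The computation described above is one subtraction per item. A minimal sketch, assuming the StdY residual variances have been copied from the output by hand; the item names and values are hypothetical.

```python
def item_reliability(stdy_residual_variance):
    """Reliability of an item as 1 minus its StdY residual variance."""
    if not 0.0 <= stdy_residual_variance <= 1.0:
        raise ValueError("standardized residual variance should lie in [0, 1]")
    return 1.0 - stdy_residual_variance

# Hypothetical StdY residual variances keyed by item name.
residuals = {"item1": 0.36, "item2": 0.51, "item3": 0.19}
for name, resid in residuals.items():
    print(name, round(item_reliability(resid), 2))
```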

yezi posted on Thursday, November 10, 2011 - 4:30 pm

Hi,
Thank you very much!
For ESEM, why not look at the StdYX or Std estimates for each item's residual variance? Is reliability 1 minus the StdYX or Std residual variance? Thanks!

Kind regards
Ye

Linda K. Muthen posted on Friday, November 11, 2011 - 11:15 am

You would not use StdYX because ESEM as EFA has no independent variables. You would not use Std because ESEM analyzes a covariance matrix.

yezi posted on Friday, November 11, 2011 - 6:33 pm

Hi,
Thank you very much!

Kind regards
Ye

Samuel Greiff posted on Sunday, November 27, 2011 - 4:53 am

Dear Linda and Bengt,

we used WLSMV estimator for a CFA with 3 factors. There are 11 observed variables on each factor. For 2 of the 3 factors, indicators are binary while indicators for the 3rd factor have 3 categories.

A reviewer now asks how binary-based correlations were corrected. He/she claims this to be necessary whenever limited information is used in MPlus.

However, we did not use limited information but raw data of item responses. Is, in this case, a correction necessary and if so, is this done automatically? Can I find any information about this in the manual or the technical reports?

Thank you very much,

Samuel

Linda K. Muthen posted on Sunday, November 27, 2011 - 11:16 am

I'm not sure what the reviewer is asking about. The sample statistics for model estimation for WLSMV for a model with no covariates are tetrachoric and polychoric correlations.

Wen-Hsu Lin posted on Thursday, March 08, 2012 - 1:41 am

I try to fit a simple CFA model. I have 4 observable variables, which are all categorical variables, and want to see if they load on one latent variable. However, Mplus kept giving me the following error message. What is wrong?
INPUT INSTRUCTIONS
data: file is c:\crime1.dat;
type is individual;
format is 4f1.0;
variable: names are w1c1-w1c4;
usevariable are w1c1-w1c4;
categorical are w1c1 w1c2 w1c3 w1c4;
missing is blank;
model: dev1 by w1c4
w1c1
w1c2
w1c3;
output: sampstat
stand
mod (4);
*** ERROR
The number of observations is 0. Check your data and format statement.
Data file: c:\crime1.dat
*** ERROR
Invalid symbol in data file:
"ï»¿0000" at record #: 1, field #: 1

Linda K. Muthen posted on Thursday, March 08, 2012 - 6:50 am

You seem to have a problem in the data file. Please send your output, data, and license number to support@statmodel.com.

Wen-Hsu Lin posted on Thursday, March 08, 2012 - 5:38 pm

I am using a public computer to analyze it. I know it looks like I have a problem in the data file. However, I checked it and did not see a problem, nor does SPSS report any strange numbers or anything like that.

Linda K. Muthen posted on Friday, March 09, 2012 - 12:13 pm

It sounds like the dataset may be saved in an incorrect format. Try opening your dataset in Excel and resaving it as a txt file.

seefeh posted on Thursday, March 29, 2012 - 3:31 am

Hello, I would like to run a multiple group (males vs females) CFA using indicator values that are non-normally distributed. I have tried a number of transformations (log, square root and reciprocal), but indicator skew is not reduced to <2. I have tried running a CFA with the indicators as count data, but this doesn't seem to work when running a multiple group analysis. Is there a way around this issue?

Kind regards

Linda K. Muthen posted on Thursday, March 29, 2012 - 5:40 am

If your variables are continuous and do not have a piling up at either end, using a non-normality robust estimator like MLR should be sufficient. If they have a piling up at either end, you can consider treating them as censored.

Hallie Bregman posted on Thursday, April 19, 2012 - 1:16 pm

I conducted a CFA on a two-factor model, where each factor had 2 indicators. The residual variances of all 4 indicators were constrained equal. I wanted to test whether a 2-factor or a 1 factor solution was best- thus, I ran the model once where the two factors were allowed to freely correlate, and again where I constrained the correlation between the two-factors to 1. I determined that the 1 factor solution was better than the 2 factor solution using a chi-square difference test of the nested models.

Subsequently, I ran a 1-factor model, with the same 4 indicators as before. The residual variances of all 4 indicators were still constrained equal. However, I found that when run as a 1-factor model, the model fit indices changed significantly, as did the residual variances. Could you explain why these parameters changed when I changed the model from a 2-factor model (with the factors correlated @1) to a 1-factor model?

Below are my syntax for reference.

2-factor model:
DISORG by ydiseng ychaotic;
CONTROL by yrigid yenmesh;
ydiseng(1);
ychaotic(1);
yrigid(1);
yenmesh(1);

1-factor model:
UNBAL by ydiseng ychaotic yrigid yenmesh;
ydiseng(1);
ychaotic(1);
yrigid(1);
yenmesh(1);

Bengt O. Muthen posted on Thursday, April 19, 2012 - 9:01 pm

You don't show the setup where you constrain the 2 factors to correlate 1.

Hallie Bregman posted on Friday, April 20, 2012 - 7:57 am

Hi, I apologize for the omission. Thanks for pointing it out. When I constrain the 2 factors to correlate 1, I used the syntax below. Thanks!

DISORG by ydiseng ychaotic;
CONTROL by yrigid yenmesh;
DISORG WITH CONTROL@1;
ydiseng(1);
ychaotic(1);
yrigid(1);
yenmesh(1);

Bengt O. Muthen posted on Friday, April 20, 2012 - 1:20 pm

This setup does not constrain the factor correlation to 1, but the factor covariance. This is because your factor variances are not one but freely estimated.

You can instead free all factor loadings by using * and fix the factor variances to 1. And then fix the covariance which is then a correlation. But note that the chi-square difference test is suspect here because you are on the border of the admissible parameter space, namely a correlation of 1.
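The difference between fixing a covariance and fixing a correlation can be checked with a little arithmetic. The following is a minimal Python sketch; the factor variances are made-up numbers for illustration, not estimates from this model.

```python
import math

def factor_correlation(cov, var1, var2):
    """Correlation implied by a covariance and the two factor variances."""
    return cov / math.sqrt(var1 * var2)

# Hypothetical freely estimated factor variances (illustration only).
# Fixing the covariance at 1 then implies a correlation different from 1.
r_free = factor_correlation(1.0, 2.0, 0.8)

# With both factor variances fixed at 1, the covariance is the correlation.
r_fixed = factor_correlation(1.0, 1.0, 1.0)

print(round(r_free, 3), r_fixed)
```

With free factor variances the implied correlation depends on whatever the variances turn out to be, which is why the constraint behaves differently from a one-factor model.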

Hallie Bregman posted on Monday, April 23, 2012 - 8:07 am

Thank you for your response. Would you recommend using a Wald test instead of the Chi-Square Difference Test, in this case?

Linda K. Muthen posted on Monday, April 23, 2012 - 12:30 pm

The same issue holds for the Wald test. You have a parameter on the border of the admissible parameter space.

seefeh posted on Tuesday, May 08, 2012 - 7:15 am

Hello,

I have a question regarding the Scaling Correction Factor.

When running CFAs I get Scaling Correction Factors for MLR of >3 even though there is high Goodness of Fit (e.g., CFI = 0.962; TLI = 0.958; and RMSEA = 0.043). What does the high Scaling Correction Factor tell me about the model?

Also - is there a rule of thumb regarding what should be classified as a satisfactory SCF value?

Many thanks,

Seefeh

Linda K. Muthen posted on Wednesday, May 09, 2012 - 9:50 am

The scaling correction factor tells how non-normal the data are. The larger the scaling correction factor, the more non-normal the data. It is not a fit statistic.

Chelsea Garneau posted on Sunday, May 13, 2012 - 12:47 pm

I am trying to run a multilevel CFA where I have children nested in families - less than half of the families have more than 1 child. The purpose of the multilevel modeling is simply to account for dependency in the data, not to make strong conclusions about factors which vary within and between.

I have run the following using raw data:

VARIABLE:
NAMES ARE famid relrf1-relrf8 fmon1-fmon4 psinv1-psinv2;
CLUSTER = famid
ANALYSIS:
TYPE=TWOLEVEL;
ESTIMATOR=ML
MODEL:
%WITHIN%
relatew by relrf1-relrf8;
monitorw by fmon1-fmon4;
involvew by psinv1-psinv2;
%BETWEEN%
relateb by relrf1-relrf8;
monitorb by fmon1-fmon4;
involveb by psinv1-psinv2;
OUTPUT:
STDYX modindices residual;
TECH4;

First, Mplus doesn't seem to be recognizing my families by famid because it says "Number of groups 1". Next, I'm getting the error that the correlations among many of my items are either 1.00 or 0.994. However, when I examine a correlation matrix, none of the items appear to be correlated greater than r=.49.

I've attempted examining these data as a single level model, in case the small number of clusters with more than 1 case was causing a problem, and I'm still getting the same correlation messages.

Do you have any idea what might be causing this?

Linda K. Muthen posted on Sunday, May 13, 2012 - 1:06 pm

You need a semicolon after the CLUSTER option for it to be recognized.

For the other problem, please send the relevant files and your license number to support@statmodel.com.

Alexandru Cernat posted on Tuesday, July 17, 2012 - 1:05 pm

Hello,

I am trying to include in an SEM model an acquiescence style factor like the one proposed by Billiet and McClendon (2000). I can't figure out how to impose the constraints to measure the style factor. This is also complicated by the fact that the items use different scales and are coded in different directions (sometimes a large code represents agreement, sometimes disagreement).

Thank you,

Alex

Linda K. Muthen posted on Tuesday, July 17, 2012 - 5:56 pm

We are not familiar with the Billiet and McClendon article. If you can briefly describe the model, we can try to help you impose the constraints using MODEL CONSTRAINT.

Alexandru Cernat posted on Wednesday, July 18, 2012 - 1:34 am

The basic idea is to model acquiescence (i.e., responding positively regardless of the question content) using a latent variable. For this at least two balanced sets of items are needed (i.e., for some questions answering positively represents positive attitudes while for others negative attitudes).
In the articles the authors used LISREL8 and the figure presented shows only a "+1" (for all the questions a higher score represented agreement and they assumed the effect to be equal for all the questions) for the relationships from the items to the acquiescence factor.
Considering that I have different scales (5, 7, and 10 categories) that are ordered in different directions (agreement represented by the smallest code or by the highest), I was wondering if I can impose something like positive or negative relationships (eventually with the possibility of freeing the equality constraint) so I won't need to recode all the questions in the same direction.

Thank you again,

Alex

Linda K. Muthen posted on Wednesday, July 18, 2012 - 11:20 am

You might want to take a look at Confirmatory Factor Analysis for Applied Research by Timothy A. Brown. He uses Mplus for Multi-Trait Multi-Methods models. You might get some ideas from his Mplus syntax that you can apply to your situation.

Sam Hawes posted on Wednesday, August 08, 2012 - 8:42 am

Hello,

I've run a model attempting to identify a latent trait. The fit indices are good (CFI = 1.00, TLI = 1.00, RMSEA = 0.00), but I wanted to check and see if anyone sees any problems in attempting to identify a latent trait with the following model setup. Thank you for your help.

im by im1* im2 (1);
el by el1* el2 (2);
ca by ca1* ca2 (3);

im-ca@1;

psy by im el ca;

im1 with el1 (4);
im1 with ca1 (5);
el1 with ca1 (6);

im2 with el2 (4);
im2 with ca2 (5);
el2 with ca2 (6);

Bengt O. Muthen posted on Wednesday, August 08, 2012 - 6:57 pm

If Mplus doesn't complain about the model not being identified, it most likely is. It seems that it could be.

Sam Hawes posted on Thursday, August 09, 2012 - 7:01 am

Thank you for your quick response. Would it be accurate to say that the latent variables in the model represent the trait-like stable aspects of the three constructs across the two timepoints? Thank you again.

Linda K. Muthen posted on Friday, August 10, 2012 - 10:22 am

Questions not specific to Mplus are best posted on a general discussion forum like SEMNET.

Lenka Drazanova posted on Wednesday, November 07, 2012 - 2:24 pm

I have a categorical (5-point Likert scale) single-item latent variable. I know I can treat the single-item latent variable directly as observed in Mplus, but in the CFA framework this would not allow me to assess the model fit of the one factor vs. two factors solution, which is what I am after.
Therefore, my syntax would look like:
F1 BY y1* y2 y3;
F2 BY y4;
[F1@0];
F1@1;
F2@0;

My questions are:
1.) I know the F2 as a single item latent variable should be fixed to
F2 BY y4@1;
y4@a;
where a = (1 - reliability)* sample variance.
However, is this valid also for categorical (Likert scale) variables?!
2.) If it is, I apologize for my lack of knowledge, but is there a way in Mplus to actually obtain the reliability and sample variance? If there is, could you please provide the relevant syntax?
3.) the initial syntax works when I do not specify y4 as a categorical variable. When I do, the THETA parametrization is necessary. However, when I use the THETA parametrization, the model does not work. Is there a solution to this problem?

Thank you very much in advance for your answer!

Linda K. Muthen posted on Wednesday, November 07, 2012 - 3:11 pm

You cannot correct for reliability with categorical indicators.

To put a factor behind a categorical variable say

f BY u@1;

The variance of a categorical variable is not an estimated parameter in a cross-sectional study so you can't fix it at zero.

Cindy Masaro posted on Wednesday, November 07, 2012 - 3:44 pm

Hi Linda,
I am hoping you can help me. I am running 5 separate CFAs. Each CFA has one latent factor measured by three indicators. All indicators (in each CFAs) has been measured on a 7 point scale. I have specified these indicators as categorical and used WLSMV as the estimator. For each of these CFAs I get an RMSEA=0.000, CFI=1.000, TLI=1.000. Standardized factor loadings are high with small standard errors. Residuals for covariances/correlations/residual correlations are all 0.000. The modification indices (ON statements for all indicators) all show an MI of 999.000, and EPC 0.000. I'm suspecting something isn't quite right so my question is, why am I getting 999.000 for the MI and can I put any faith in the parameter estimates and fit indices etc.?

Linda K. Muthen posted on Wednesday, November 07, 2012 - 4:33 pm

A factor with three indicators is just-identified. The model has no degrees of freedom so there are no modification indices. There can be no modifications to the model.
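The zero degrees of freedom can be seen by counting. The sketch below does the count for the simpler case of continuous indicators with a covariance structure only (an assumption made for illustration; the indicators in this thread are categorical, where the sample statistics are thresholds and polychoric correlations, but the three-indicator model is just-identified there as well).

```python
def one_factor_df(p):
    """Degrees of freedom of a one-factor CFA with p continuous indicators.

    Assumes a covariance structure only, with the factor scale set by
    fixing the first loading at 1.
    """
    sample_moments = p * (p + 1) // 2   # variances and covariances
    free_params = (p - 1) + 1 + p       # loadings + factor variance + residual variances
    return sample_moments - free_params

for p in (3, 4, 5):
    print(p, one_factor_df(p))
```

With three indicators the six variances and covariances are exactly used up by six parameters, so fit cannot be tested; a fourth indicator gives two degrees of freedom.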

Lenka Drazanova posted on Thursday, November 08, 2012 - 4:25 am

Dear Linda,
thank you very much for your prompt answer!
However, I still have a few questions. If I understood you correctly, you are saying to run the syntax with y1-y4 categorical:
F1 BY y1* y2 y3;
F2 BY y4@1;
[F1@0];
F1@1;

When I do, I get the following warning:

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE F2.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 51.

Therefore, it does not work!
1.) Is there any solution to this?
2.) If I decide to treat the 5-point Likert scale variable y4 as continuous and therefore F2 will be continuous, can I say
F2 BY y4;
y4@0;

or do I have to set y4 at some other number than 0 (given that it is a 5-point scale Likert variable)?
3.) And if I do have to set it to (1 - reliability)* sample variance, how do I actually find out "reliability" and "sample variance".

Thank you very much in advance for your answer!
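For question 3, the arithmetic of the formula quoted above can be sketched in Python. This is only an illustration for the case where y4 is treated as continuous (Linda notes above that the reliability correction is not available for categorical indicators). The scores and the reliability of 0.80 are hypothetical; a reliability value has to come from outside the model, for example from an earlier study.

```python
import statistics

def single_indicator_residual_variance(scores, reliability):
    """Value a = (1 - reliability) * sample variance for fixing y4@a."""
    return (1.0 - reliability) * statistics.variance(scores)

# Hypothetical 5-point Likert responses and an assumed reliability of 0.80.
y4 = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
a = single_indicator_residual_variance(y4, 0.80)

print(round(a, 4))
```

The resulting number would replace a in the statement y4@a; with a reliability of 1 the value is 0, which corresponds to the y4@0 setup in question 2.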

Linda K. Muthen posted on Thursday, November 08, 2012 - 6:06 am

Please send the full output and your license number to support@statmodel.com.

Alexandru Cernat posted on Thursday, December 06, 2012 - 9:37 am

Hello,

Is there any way to save the standardized coefficients for the fixed parameters (e.g., the loading fixed to set the scale of a latent variable) and other linked quantities (residuals, R2)?

Thank you,

Alex

Linda K. Muthen posted on Thursday, December 06, 2012 - 9:41 am

No, we save them only for free parameters.

mustafa YILDIZ posted on Friday, January 25, 2013 - 7:23 pm

Hello,

I am hoping that the place I am posting is correct. I want to ask a question about a CFA study that has a sample of 3276 individuals, 191 items, 8 factors, and the WLSMV estimation method. When I run it, I get an error message that says the physical memory of the computer is not enough (i5 processor, 8GB RAM, 64 bit). Then I reduced the items to 131 and got the same error again. I did not try much, but the program worked when I had 77 items. I have two questions now:
1) What is the highest number of items I could analyze at a time? How can I know it?
2) The criterion to reduce the items was a previously conducted EFA study. I simply referred to the size of the loadings. I excluded the items if they had a loading of .40 or lower. Is there a better way to make this decision? I mean statistically, without considering the content background of the instrument.

I never see the whole picture since I cannot run everything at the same time. So, forgive me I am a naive user of Mplus and a learner of CFA. Thanks a lot in advance
regards
Mustafa Yildiz

Linda K. Muthen posted on Monday, January 28, 2013 - 11:36 am

Please send your input, data, and license number to support@statmodel.com and we will give it a try.

Lucy Hebert posted on Wednesday, February 13, 2013 - 1:57 pm

I have a 2 factor model with categorical indicators and solid sample size. I have 6 very distinct groups by sex and city and I am wanting to simply compare the loadings for my hypothesized CFA model between these 6 groups. Is it most appropriate to compare the unstandardized loadings or the standardized in this case? (All indicators have the same response options). Thank you in advance.

Lucy Hebert posted on Wednesday, February 13, 2013 - 1:59 pm

And to clarify-- I am not doing a multi-group comparison analysis-- simply comparing between different strata in unique analyses. Thanks.

Bengt O. Muthen posted on Wednesday, February 13, 2013 - 2:36 pm

It is hard to make that sort of comparison because it is confounded by group variation in factor means and factor variances. So I wouldn't use either approach. The advantage of the multiple-group analysis is that you put the factor on a common scale.

ywang posted on Monday, March 04, 2013 - 10:02 am

Hello,
We conducted a confirmatory factor analysis (one factor with three indicator variables) and linked the factor to another variable using the "with" command. The sample size is 54. The unstandardized correlation coefficient is significant, but the standardized correlation coefficient is not significant. Is this inconsistency due to the small sample size? My other question is what correlation coefficient to report in this condition? Should we report the standardized or the unstandardized correlation coefficient?

Thanks!

Linda K. Muthen posted on Monday, March 04, 2013 - 10:11 am

Raw and standardized significance can be different because the sampling distributions of the two coefficients are different. It would be your decision which coefficient to report.

Lucy Hebert posted on Wednesday, March 06, 2013 - 1:40 pm

Unless I am mistaken, Tech10 provides estimates of standardized residuals,
but if every indicator I am using has the exact same scale, then wouldn't
the fitted residuals (unstandardized) be indicative of strain, and if so, what
is an acceptable cut-off value for that?
Also, if Tech10 is only used for mixture models, how does one get standardized residuals with a non-mixture model?
Thanks in advance for your help!

Bengt O. Muthen posted on Wednesday, March 06, 2013 - 3:04 pm

For unstandardized residuals you don't get a statistical test so the cutoff is arbitrary.

You can do a single-class mixture analysis to get Tech10.

John Nelson posted on Monday, March 18, 2013 - 11:39 am

I ran a CFA using MPlus 6.12. My sample had 186 parameters. I used a randomized sample of 2,500, drawn from a sample of 5,000. My model fit very well. I now want to test this model in a sample from another country but I have only 82 in my sample. I would like to run a partial model, even just the factor loadings of the 51 items, from 10 different factors. I have read through the discussion board and searched on the web for MPlus solutions but cannot find. Could you please tell me if this is possible using my version of MPlus? Thank you!

Bengt O. Muthen posted on Monday, March 18, 2013 - 2:29 pm

You say that you have 10 factors. It isn't clear to me if you have 51 parameters (loadings) in the run with n=82. You don't say if your items are categorical or continuous. A sample size of n=82 is rather small unless your items are continuous or uni-dimensional.

Linda K. Muthen posted on Monday, March 18, 2013 - 2:29 pm

I'm not sure how you could do this. With a sample size of 82, you should have less than 82 parameters.

John Nelson posted on Tuesday, March 19, 2013 - 7:19 pm

Thank you for this response! I do agree I should have less than 82 parameters. Thus, I am seeking to minimize my parameters. My plan was to set all error variances to 1, the command, as I understand, is f1@1; f2@1; etc.
That should remove 50 parameters.

I then plan to remove all the correlations/covariances between factors as I am interested in confirming the CFA from a previous sample from the USA and not how correlated the factors are.

My plan was to
1. Set factor variances to 1: f1@1; f2@1; etc

2. Set residual variances to 0: Q1MD - Q58EL@0;

3. Eliminate certain correlations between factors: f1 with f5 - f9@0; this allows the correlation between f1 and f2, f3, f4 but eliminates the rest.

All of the items in my measures/subscales use 7-point Likert scales.

I would be grateful for your reflection that my plan is accurate or flawed. If I am accurate, I would like to know where I can find in the MPlus documents what the command is for me to use in the syntax so I can execute this analysis. I can also not find direction in how to minimize the number of parameters in a small sample.

Linda K. Muthen posted on Wednesday, March 20, 2013 - 10:22 am

I don't think fixing parameters is a good idea. You need a different model for the small group.

John Nelson posted on Saturday, March 23, 2013 - 1:15 pm

I have done extensive research on this model in 10 different facilities using large samples and the model held up well using CFA. Those studies were in the USA. This study was conducted in the Caribbean and so I would like to stick with this model to see if it applies in other countries. I did try setting the variances to 1 and so on as I stated above, but it did not help.

I see that parceling my factors makes sense. I did define how the indicators should be parceled. However, after I defined how the indicators should be parceled and ran the model, I received the following error: *** ERROR in MODEL command
Unknown variable(s) in a BY statement: S1

I have a 2-factor model with 4 parcel scores in first factor and 4 parcel scores in the second factor. I have checked and rechecked but cannot see what is wrong. Any ideas?

Thanks!

Linda K. Muthen posted on Saturday, March 23, 2013 - 1:25 pm

Please send the output and your license number to support@statmodel.com.

Maren Formazin posted on Thursday, May 02, 2013 - 10:33 am

Hi,

I'm trying to confirm the structure of a questionnaire. This questionnaire has been used in four samples with n > 2000 in each sample.

For most scales, a four-point answering scheme has been used for the single items. I use scale scores as indicators.

One sample differs from the others in that for two scales, a five-point answering scheme and for one scale, a seven-point answering scheme have been used.

I've been wondering whether this might lead to problems when Mplus estimates the models - does Mplus use correlations when estimating the models?

Thanks for your help!

Linda K. Muthen posted on Thursday, May 02, 2013 - 11:57 am

Are you treating the factor indicators as continuous or categorical?

Maren Formazin posted on Thursday, May 02, 2013 - 11:46 pm

Dear Linda,

the factor indicators are mean scores over up to 8 items. We have therefore decided to treat them as continuous.

Thanks for your help.

Linda K. Muthen posted on Friday, May 03, 2013 - 6:26 am

Then there should be no problem in estimating the model. I would not compare the items across groups that did not have the same answering scheme.

Maren Formazin posted on Friday, May 03, 2013 - 8:12 am

Dear Linda,

thanks for your reply - that's reassuring! Does Mplus use correlations when the factor indicators are treated as continuous?

Thanks again!

Linda K. Muthen posted on Friday, May 03, 2013 - 8:44 am

The sample statistics used for model estimation for continuous variables and CFA are means, variances, and covariances.

Nathan Alkemade posted on Tuesday, May 07, 2013 - 7:02 pm

hi

I am getting the error message below when I try to run a CFA

*** ERROR
Invalid symbol in data file:
"ï»¿5" at record #: 1, field #: 1

I have tried saving the file as a txt file, manually checking the data file, and altering the export from the SPSS file to specify the field size rather than use tab delimited, and I continue to get variants of this error.

Not sure what to try next.

Linda K. Muthen posted on Tuesday, May 07, 2013 - 8:03 pm

Open the file in the Mplus Editor. The symbol is the first entry in the data file. Delete it and save the file. This seems to be related to a new version of SPSS.
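The characters "ï»¿" are what the three bytes of a UTF-8 byte-order mark (EF BB BF) look like when they are displayed in a Windows-1252 code page. Deleting them by hand as described above works; the same cleanup can also be scripted. The following Python sketch is one way to do it, using an in-memory example rather than a real data file.

```python
import codecs

def strip_utf8_bom(raw: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 byte-order mark, if present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw

# Hypothetical first record of a data file that was saved with a BOM.
sample = codecs.BOM_UTF8 + b"5 3 1 2\n"
cleaned = strip_utf8_bom(sample)

# The BOM bytes shown in Windows-1252 are the symbols from the error message.
bom_as_shown = codecs.BOM_UTF8.decode("cp1252")

print(cleaned, bom_as_shown)
```

For a real file, the bytes would be read with open(path, "rb"), passed through the function, and written back with open(path, "wb"); the path is whatever the DATA command points to.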

Nathan Alkemade posted on Wednesday, May 08, 2013 - 3:42 pm

This worked, Thanks Linda.

Eric Deemer posted on Thursday, May 30, 2013 - 2:25 pm

I just read the version 7.1 addendum to the manual and the invariance testing setups sound so much simpler than in version 7. Thank you!

Markus Martini posted on Thursday, August 22, 2013 - 11:30 pm

Hi, I ran a CFA with the MLM estimator and I don't know the command to get the p values for the correlation matrix.
Thank you for your help.

Linda K. Muthen posted on Friday, August 23, 2013 - 6:01 am

Which correlation matrix do you mean? From SAMPSTAT or TECH4?

Markus Martini posted on Saturday, August 24, 2013 - 1:14 am

From SAMPSTAT. But I can't find them in TECH4, either. Thank you!

Linda K. Muthen posted on Saturday, August 24, 2013 - 6:08 am

We don't provide them with SAMPSTAT. With TYPE=BASIC we give standard errors for correlations for categorical variables. We also give standard errors for TECH4 in most cases. You would need to compute the ratio of the estimate to the standard error to get a z-value and get the p-value from that.
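The last step can be done by hand or with a few lines of code. Here is a minimal Python sketch that assumes a two-sided test against the standard normal distribution; the estimate and standard error are hypothetical values, not Mplus output.

```python
import math

def two_sided_p(estimate, se):
    """z-value and two-sided normal p-value for estimate / standard error."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical correlation estimate and standard error (illustration only).
z, p = two_sided_p(0.30, 0.10)

print(round(z, 2), round(p, 4))
```

The function uses only the standard library, so it can be applied to any estimate and standard error copied from the output.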

Scott Smith posted on Thursday, October 10, 2013 - 10:54 am

I am running a two level CFA with three composites. Two of the composites have three items each. One composite only has two items. After the initial run, one of the items in the two-item composite had an STDYX estimate above 1 at the within level. I set it to @.001 and reran the model. Now the other item in the two item composite has an STDYX estimate above 1. Will I get valid results if I set both items in a two-item composite to @.001, in conjunction with two three-item composites?

Linda K. Muthen posted on Thursday, October 10, 2013 - 1:56 pm

Standardized factor loadings can be greater than one. There is a FAQ about this on the website.

maurice topper posted on Monday, October 21, 2013 - 4:39 am

I am running a bifactor model CFA, and I am interested in the estimated common variance accounted for by the general factor. Mplus does not give this value by default. How can I calculate/obtain this value?

Bengt O. Muthen posted on Monday, October 21, 2013 - 5:04 pm

Perhaps you mean the amount of variance in all the indicators explained by the general factor. If the general factor variance is set at 1, you sum the squared factor loadings and divide them by the sum of the indicator variances.
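As a minimal Python sketch of that calculation, assuming the general factor variance is fixed at 1: the loadings and indicator variances below are hypothetical numbers, to be replaced by the unstandardized general-factor loadings and the indicator variances from the actual analysis.

```python
def general_factor_share(loadings, indicator_variances):
    """Sum of squared general-factor loadings over the sum of indicator variances."""
    return sum(l * l for l in loadings) / sum(indicator_variances)

# Hypothetical general-factor loadings and indicator variances.
loadings = [0.7, 0.6, 0.8, 0.5]
variances = [1.0, 1.2, 1.1, 0.9]
share = general_factor_share(loadings, variances)

print(round(share, 3))
```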

Kelly M Allred posted on Friday, November 08, 2013 - 8:33 am

I am conducting a CFA to test for measurement invariance for whites and blacks on a scale. Here are my models:

Constrained model:

VARIABLE:
NAMES ARE RACE ACS1-ACS10;
USE VARAIBLES ARE RACE ACS1-ACS10;
CATEGORICAL ARE ACS1-ACS10;
MISSING ARE ALL (999);
GROUPING is RACE (1=black 0=white);
MODEL:
pos BY ACS1 ACS2 ACS3 ACS4 ACS5;
neg BY ACS6 ACS7 ACS8 ACS9 ACS10;
OUTPUT: STDYX MODINDICES;

Unconstrained model:

VARIABLE:
NAMES ARE RACE ACS1-ACS10;
USE VARAIBLES ARE RACE ACS1-ACS10;
CATEGORICAL ARE ACS1-ACS10;
MISSING ARE ALL (999);
GROUPING is RACE (1=black 0=white);
MODEL:
pos BY ACS1* ACS2 ACS3 ACS4 ACS5;
neg BY ACS6* ACS7 ACS8 ACS9 ACS10;
pos@1 neg@1;
MODEL white:
pos BY ACS1* ACS2 ACS3 ACS4 ACS5;
neg BY ACS6* ACS7 ACS8 ACS9 ACS10;
OUTPUT: STDYX MODINDICES;

I am unsure what Mplus is constraining to be equal across groups (e.g., factor loadings, intercepts, error variances) as a default in the first model. Also, is there a way to individually constrain factor loadings, intercepts, and error variances to be equal across race so that I can conduct tests for weak, strong, and strict invariance?

Linda K. Muthen posted on Friday, November 08, 2013 - 8:56 am

See the discussion in Chapter 14 on multiple group analysis. This should answer your questions. For the models to test for measurement invariance, see the Topic 1 course handout under multiple group analysis. See also the Version 7.1 Language Addendum on the website with the user's guide where a new feature that automatically tests for measurement invariance is described.

Andriana Rapti posted on Wednesday, December 04, 2013 - 1:41 pm

I have 25 different variables and only 2 of them give a good fit. In all the rest, the chi-square is significant and the RMSEA high. However, for many variables the CFI and TLI are close to 1. Can I assume that the model fits the data because of the CFI and TLI values and ignore the chi-square?

Linda K. Muthen posted on Wednesday, December 04, 2013 - 2:39 pm

No.

Sarah Hafidz posted on Wednesday, January 22, 2014 - 7:28 pm

Hi

I ran a CFA model for 75 indicators into 12 latent variables. The fit indices showed that the model fit is good. However, it also came up with a warning as follows:

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE FACTOR4.

Does this essentially mean that I can't accept the model and use it for other analyses?

Thanks

Linda K. Muthen posted on Thursday, January 23, 2014 - 10:17 am

This message cannot be ignored. It means you should change your model. Perhaps looking at an EFA will reveal the problem.

SY Khan posted on Wednesday, March 12, 2014 - 5:40 am

Hi Dr. Muthen,

Is there a way in Mplus to save the actual covariance matrix that the program uses rather than an abbreviated/estimated version saved through tech3 command?

Is there some way that the actual covariance matrix used in a negative residual variance case in CFA can be saved? if yes, can you please guide me to the syntax to get that?

Many thanks for your guidance.

Linda K. Muthen posted on Wednesday, March 12, 2014 - 11:33 am

TECH3 is the matrix of variances and covariances among the parameters. See the SAMPLE option of the SAVEDATA command to save the observed variable covariance matrix. See the RESIDUAL option for the model estimated covariance matrix.

SY Khan posted on Thursday, March 13, 2014 - 3:30 am

Thanks for your guidance on the above Dr. Muthen.

I have been able to get the RESIDUAL option working. However, since my data are categorical the SAMPLE SAVEDATA command is giving the correlation matrix as default.

I have tried using the TYPE option along with the SAMPLE option, but it is still not giving the covariance matrix.

SAVEDATA:

TYPE=COVARIANCE;

SAMPLE=COVSAMPLE.dat;

Is it possible to obtain a covariance matrix for categorical dependent variables at all?

Many thanks for your time and guidance in advance.

Linda K. Muthen posted on Thursday, March 13, 2014 - 7:14 am

The covariance matrix is not the matrix analyzed for categorical variables. If you want a covariance matrix for these variables, remove the CATEGORICAL option.

Tom Bailey posted on Sunday, April 06, 2014 - 8:56 am

Dear Linda

I was hoping you (or someone else on the board) may be so kind as to answer an issue related to items on one of the latent factors in a CFA model using the WLSMV estimator in Mplus.

When I run an EFA in SPSS or Mplus or a CFA in AMOS all the item loadings on my latent variable are positive (with the exception of one). However, when I run the CFA model in Mplus, loadings on this variable are now all negative (again with the exception of 1). Is there a rational explanation for this, or is it something that perhaps I am doing wrong when specifying the model?

Regards

Tom

Linda K. Muthen posted on Sunday, April 06, 2014 - 5:47 pm

In factor analysis, this reversal can happen. You can just reverse all signs if you like to interpret the factor that way.
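The reason the reversal is harmless is that the model-implied covariances depend on products of loadings, so flipping every sign on a factor leaves them unchanged. A small Python sketch for a one-factor model illustrates this; the loadings and residual variances are hypothetical.

```python
def implied_cov(loadings, factor_var, residual_vars):
    """Model-implied covariance matrix of a one-factor model."""
    p = len(loadings)
    return [
        [
            loadings[i] * factor_var * loadings[j]
            + (residual_vars[i] if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]

# Hypothetical loadings (one with the opposite sign) and residual variances.
lam = [0.8, 0.7, -0.4]
theta = [0.36, 0.51, 0.84]
flipped = [-l for l in lam]

# Both sign patterns reproduce exactly the same covariance matrix.
same = implied_cov(lam, 1.0, theta) == implied_cov(flipped, 1.0, theta)
print(same)
```

Because both solutions fit the data identically, which one is reported depends on starting values and on which loading sets the scale, not on the data.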

Tom Bailey posted on Monday, April 07, 2014 - 5:04 pm

Thanks Linda

Actually, once I make modifications to my measurement model, the factor loadings then revert back to the direction they were in the EFA in Mplus and the ML CFA in AMOS; so it is no longer an issue.

Tom

Richard Dembo posted on Thursday, April 24, 2014 - 10:12 am

Hi Linda:

I just completed two level CFAs involving ordinal variables using Bayesian estimation. I get PRS values, but not PPP values. Can you advise how I can obtain PPP values for these runs?

Thank you.

Richard

Linda K. Muthen posted on Friday, April 25, 2014 - 8:31 am

PPP has not been developed for multilevel models. It is not available.

Nara Jang posted on Monday, April 28, 2014 - 12:30 pm

Dear Dr. Muthen,

I got the following warning after running a CFA with one latent variable. Would you tell me how I can solve this problem? Thank you very much for your great help!

WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.
THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED
VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES.
CHECK THE RESULTS SECTION FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE I16CAT.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE
COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL.
PROBLEM INVOLVING PARAMETER 7.

Nara Jang posted on Monday, April 28, 2014 - 1:27 pm

Dear Dr. Muthen,

This is a follow-up question. I found that in the parameter specification the theta entry for the numhsl variable showed "8" while the other variables had "0". So I removed "numhsl". The model fit indices are as follows:

Chi-Square Test of Model Fit
Value 0.000*
Degrees of Freedom 0
P-Value 0.0000

RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.000
90 Percent C.I. 0.000 0.000
Probability RMSEA <= .05 0.000

CFI/TLI
CFI 1.000
TLI 1.000

WRMR (Weighted Root Mean Square Residual)
Value 0.000

Would you tell me if it is correct to remove the numhsl variable? And is it OK to conclude that the model is good based on the CFI/TLI values?

I tried removing i16cat, but the result then indicated a negative variance/residual variance for the numhsl variable. So I kept "i16cat" and deleted the "numhsl" variable. The result was then as shown above.

Thank you so much for your expert explanation in advance!

Linda K. Muthen posted on Monday, April 28, 2014 - 4:47 pm

Regarding your first question, I would need to see the output and your license number.

The fit of a model with zero degrees of freedom cannot be assessed.

Nara Jang posted on Saturday, May 03, 2014 - 10:15 pm

Dear Muthen,

I conducted a CFA with a random half sample drawn from my dataset. I would like to use weights because women and older people were oversampled, so I downloaded data for the 5 zip code areas (the surveyed areas) from the U.S. Census website. Should I apply the weights only to the CFA half, or to both halves of the sample? The first random half sample was used for EFA.

Thank you very much!

Nara Jang posted on Sunday, May 04, 2014 - 5:48 am

Dear Dr. Muthen,

The weighting method I posted earlier is for regression/logistic regression. Would you mind explaining weighting methods for CFA, or recommending a reference on weighting methods for CFA?

Thank you very much for your expert explanation in advance!

Have a great day!

Linda K. Muthen posted on Sunday, May 04, 2014 - 10:22 am

You should use the weight variable that comes with the data set.

See the Special Topic on complex survey data on the website.

Nara Jang posted on Sunday, May 04, 2014 - 12:13 pm

Dear Dr. Muthen,

Thank you so much for your information!

f f posted on Tuesday, May 06, 2014 - 7:15 pm

Dear Prof.Muthen,

I'd like to know whether I can scratch variables with absolute value less than .30 when I am using Mplus for CFA.

In SPSS, there is this option.

Linda K. Muthen posted on Wednesday, May 07, 2014 - 7:29 am

Mplus has no such option.

Olivier Colins posted on Tuesday, June 24, 2014 - 5:57 am

Dear
To test the factor structure of a screening tool for mental health problems (yes or no items; 7 scales, but no total score; some items appear on two scales), I ran a CFA (WLSMV):

ADU by M10 M19 M23 M24 M33 M37 M40 M45;
AI by M2 M6 M7 M8 M13 M35 M39 M42 M44;
DA by M3 M14 M17 M21 M34 M35 M41 M47 M51;
SC by M27 M28 M29 M30 M31 M43;
SI by M11 M16 M18 M22 M47;
TD by M9 M20 M25 M26 M32;
TE by M46 M48 M49 M51 M52;

I then checked whether the MODINDICES suggest that correlations between factors are needed.
The model estimation terminated normally but with warnings (non-positive definite). Removing the items that caused trouble did not solve the problem. The model may be too complex to test in 1600 boys.

Yet, I wonder if it makes sense to just perform seven separate CFAs (e.g., one input file testing: ADU by M10 M19 M23 M24 M33 M37 M40 M45; then a next input file: AI by M2 M6 M7 M8 M13 M35 M39 M42 M44; etcetera). After all, and conceptually, this tool was designed to assess 7 separate constructs that are not supposed to load on a higher-order factor. This would avoid having scales in one and the same model that include the same items, and having scales that are related to each other.

So, do you think this is a strategy that makes sense? I would really appreciate your input, as I have had difficulty finding discussions or examples related to my question.
Cheers
Olivier

Linda K. Muthen posted on Tuesday, June 24, 2014 - 8:01 am

This general modeling question is more appropriate for a general discussion forum like SEMNET.

Hyojeong Seo posted on Friday, September 12, 2014 - 10:27 am

Hello Dr.Muthen,

I am wondering how I can obtain the chi-square value. I ran a simple CFA and the chi-square is indicated by ***********. What do I need to do to see actual numbers on the output?

Thank you for your help in advance!

Bengt O. Muthen posted on Friday, September 12, 2014 - 6:02 pm

Sounds like you either have a huge sample or a very ill-fitting model, or both - the value is too big to print. Your RMSEA and CFI results are probably also poor.

Hyojeong Seo posted on Sunday, September 14, 2014 - 4:08 pm

Hello Dr.Muthen,

Thank you for your response.

Yes, I do have a large sample (n = 134,984). Is there anything I can do in the syntax to see the chi-square?

Thank you.

Bengt O. Muthen posted on Sunday, September 14, 2014 - 5:12 pm

Please send the output and your data to support with your license number.

Maria Carrasco posted on Wednesday, October 15, 2014 - 10:47 am

Hi, I calculated factor scores using CFA and I need to export the data into STATA for further analysis. Can you please indicate how to do that? Below is part of my input file.

Thank you!

Maria

Data:
File is C:\Users\Mcarrasc\Documents\Maria\Latent.dat ;
MODEL:
cohesion BY CSPC_01* CSPC_02 CSPC_03
CSPC_05 CSPC_06 CSPC_07 CSPC_08
CSPC_10 CSPC_10A;
cohesion@1;
CSPC_03 WITH CSPC_02;
CSPC_02 WITH CSPC_01;
CSPC_06 WITH CSPC_07;
OUTPUT:
sampstat tech1 stdyx modindices (all);

SAVEDATA:
File is C:\Users\Mcarrasc\Documents\Maria\LCvul.dat;
Save = FSCORES;

Linda K. Muthen posted on Wednesday, October 15, 2014 - 12:11 pm

The factor scores will be in the file lcvul.dat. This is an ASCII file. You will need to read that using STATA. See the STATA user's guide to see how to do this.

Joanna Jones posted on Tuesday, January 06, 2015 - 2:51 am

I have a second-order CFA model, and I would like to get a histogram of the distribution of estimated factor scores. I succeeded in getting that with the plot3 command. However, in terms of layout I would prefer a frequency table over the histogram, so that I can recreate it in Excel in the format of the journal. Is there a command that gives me the estimated factor scores in a table?

Linda K. Muthen posted on Tuesday, January 06, 2015 - 8:26 am

No, there is no such option. You can save the factor scores and create the table using another software.

Djangou C posted on Friday, January 09, 2015 - 7:24 pm

Hi
I am doing a simulation study with ESTIMATOR=BAYES. And I am interested in the median, mean and the mode for point estimate. The default in Mplus is the median. Is there a way to get the same stat for mean and mode in simulation studies?
Thank you.

Bengt O. Muthen posted on Saturday, January 10, 2015 - 2:05 pm

Use the POINT= option in the Analysis command.
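
A minimal sketch of what this might look like, assuming the POINT option takes the settings MEDIAN (the default), MEAN, and MODE as described in the user's guide. The rest of the simulation setup is omitted, and a separate run per setting is assumed:

```mplus
ANALYSIS:
  ESTIMATOR = BAYES;
  POINT = MEAN;      ! or MODE; MEDIAN is the default
```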

Djangou C posted on Sunday, January 11, 2015 - 1:08 am

Thank you.

Fatih Koca posted on Monday, January 26, 2015 - 10:14 am

Hi
I need help on that. Here is my question
How can I modify this code to use the effects coding method of identification and introduce phantom constructs for each of the lower-order constructs, so as to convert the variances and covariances into standard deviations and correlations?

Grouping is SchLev(1=ELEM, 2=MIDDLE, 3=HIGH)
;
idvariable=ID;
Missing are all (99, 777);

Auxiliary = (m) auxvar1 auxvar2 auxvar3 auxvar4 auxvar5 auxvar6 auxvar7 auxvar8 auxvar9
auxvar10 auxvar11 auxvar12 auxvar13 auxvar14 auxvar15 auxvar16 auxvar17 auxvar18 auxvar19
Auxvar20 Auxvar21
;

model: f1 by IASTH1* IASTH2 IASTH3;
f1@1 ;
F2 BY TSGR1* TSGR2 TSGR3;
F2@1
;
Model Constraint:
f1=2-f2;
output: standardized;

Bengt O. Muthen posted on Monday, January 26, 2015 - 3:58 pm

I don't understand what you want to do. Is there a reference that you are going by?

Fatih Koca posted on Tuesday, January 27, 2015 - 8:01 am

Dr. Muthen,
The script is above. What I want is to use the effects coding method of identification and introduce phantom constructs for each of the lower-order constructs, so as to convert the variances and covariances into standard deviations and correlations. However, I really could not figure out how to do this.

Bengt O. Muthen posted on Tuesday, January 27, 2015 - 9:00 am

I don't know what you mean by lower-order constructs in this context. The relationship between f1 and f2 is a correlation. Perhaps you want to ask this question on a general discussion list like SEMNET.

lee posted on Thursday, January 29, 2015 - 8:47 am

Hi,

I am handling a single item measure. I have checked the slide 44 in topic 1 but still not sure how to calculate the reliability. How can I get sample variance, psi and reliability?

Thank you

Bengt O. Muthen posted on Thursday, January 29, 2015 - 8:52 am

The sample variance is obtained using Type=Basic. The reliability you have to provide - using prior information of some sort. Psi is estimated.
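
As a sketch of the usual single-indicator setup, the residual variance of the item is fixed at (1 - reliability) times its sample variance, and the factor variance (psi) is then estimated. The numbers are hypothetical: a sample variance of 2.0 and an assumed reliability of 0.8 give 0.2 * 2.0 = 0.4. The names f and y are placeholders.

```mplus
MODEL:
  f BY y@1;    ! single indicator, loading fixed at 1
  y@0.4;       ! residual variance fixed at (1 - 0.8) * 2.0
```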

Timothy Fung posted on Friday, February 27, 2015 - 1:36 am

Dear Prof. Muthen,

I am running CFA for my path analysis. However, I received an error message as below:

"WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.
THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED
VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES.
CHECK THE RESULTS SECTION FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE ATT1."

I checked the ATT1 variable. It has negative residual variance. It does not look How can I fix this problem? Thanks

Bengt O. Muthen posted on Friday, February 27, 2015 - 10:51 am

If the negative value isn't large, I would fix it to zero. If it is large, it may indicate an important model mis-specification.
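
For the small-negative case, fixing the residual variance at zero could look like the line below, added to the existing MODEL command (the rest of the model is not shown here):

```mplus
MODEL:
  att1@0;      ! fix the residual variance of ATT1 at zero
```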

Cecily Na posted on Monday, May 18, 2015 - 10:37 am

Hi I want to do a weighted CFA and weighted SEM. Do I just add a line for weight variable "weight= "?
Thanks.

Cecily Na posted on Monday, May 18, 2015 - 2:15 pm

I have a followup question. When there are missing data and I need to normalize the weights, how can I do it in Mplus?

Tihomir Asparouhov posted on Monday, May 18, 2015 - 2:58 pm

Yes you would add a line for the weight variable "weight= ".

The normalization is done automatically for you.
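
A minimal sketch, with hypothetical variable names (y1-y6 for the indicators, wt for the sampling weight). The weight variable is named in WEIGHT and left off the USEVARIABLES list:

```mplus
VARIABLE:
  NAMES = y1-y6 wt;
  USEVARIABLES = y1-y6;
  WEIGHT = wt;
MODEL:
  f BY y1-y6;
```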

Jian-Bin Li posted on Wednesday, May 27, 2015 - 7:48 am

Hi, I am using Mplus 7.0 to construct a CFA model. The data are 25 observed ordinal variables. The model consists of 5 first-order factors and an additional factor that several items are cross-loaded on. Following is my input:

variable:
NAMES ARE nation gender period sdq1-sdq25
sdqm1-sdqm25 sdqp1-sdqp25;
USEV ARE sdqm1-sdqm25;
categorical are sdqm1-sdqm25;

model:
emo by sdqm3 sdqm8 sdqm13 sdqm16 sdqm24;
con by sdqm5 sdqm7 sdqm12 sdqm18 sdqm22;
hyp by sdqm2 sdqm10 sdqm15 sdqm21 sdqm25;
peer by sdqm6 sdqm11 sdqm14 sdqm19 sdqm23;
pro by sdqm1 sdqm4 sdqm9 sdqm17 sdqm20;
method by sdqm1 sdqm4 sdqm9 sdqm17 sdqm20 sdqm21 sdqm25
sdqm7 sdqm11 sdqm14;

!method is not correlated with other latent factors
method with emo@0;
method with con@0;
method with hyp@0;
method with peer@0;
method with pro@0;
method@1;

output:
standardized modindices(3.84) sampstat tech4

Question:
I cannot get the standardized solution of the "pro" factor. In the output, loadings of its five items are zero. How to fix this problem?

Thank you in advance.

Bengt O. Muthen posted on Wednesday, May 27, 2015 - 1:26 pm

This can be answered only by looking at your full output - please send to support along with your license number.

Noud Frielink posted on Friday, May 29, 2015 - 6:06 am

Dear Prof. Muthen,

I am using Mplus 6.1 to construct a CFA model. The data are 12 observed ordinal variables. The model consists of 3 first-order factors. Because the variables are ordinal, I am not sure whether I should use Maximum Likelihood (ML) estimation or, as proposed by DiStefano and Morgan (2014), Weighted Least Squares Mean and Variance adjusted (WLSMV) estimation. Which one do you prefer, and do you know a reference to support this decision?

Thank you very much.

Bengt O. Muthen posted on Friday, May 29, 2015 - 8:12 am

For the choice of estimators, see our FAQ:

Estimator choices with categorical outcomes

You should update to version 7.31.

Noud Frielink posted on Friday, May 29, 2015 - 11:52 am

Dear Prof. Muthen,

Thank you very much for your prompt reply. ML seems to be the best choice. After running a CFA and looking at the Modification Indices, I want to add an additional parameter between two items within one factor, in order to improve the model fit. In the user's guide I cannot find the proper command to do so. Could you please help me with this?

Thank you very much.

Bengt O. Muthen posted on Friday, May 29, 2015 - 2:01 pm

y1 WITH y2;

But if you use ML with categorical items this won't work unless you say

Parameterization = Rescov;

I think you have to use Type = mixture for this and you would say classes = c(1), etc. See the Mplus Version 7.2 Language Addendum.
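
A rough sketch of the setup suggested above, under the assumption that the single-class mixture route is indeed required. The item names and the factor structure are hypothetical, and the Addendum should be checked for the exact requirements:

```mplus
VARIABLE:
  CATEGORICAL = y1-y12;
  CLASSES = c(1);              ! one class, so no actual mixture
ANALYSIS:
  TYPE = MIXTURE;
  ESTIMATOR = ML;
  PARAMETERIZATION = RESCOV;
MODEL:
  %OVERALL%
  f1 BY y1-y4;
  f2 BY y5-y8;
  f3 BY y9-y12;
  y1 WITH y2;                  ! residual association between two items
```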

Roukaya Ibrahim posted on Wednesday, June 03, 2015 - 11:38 am

What estimator should be used in the case of a second order factor analysis model where all the observed variables are binary?

Also, is it possible to weight the data (which is household survey data), or should it be weighted prior to inputting it to Mplus? When trying to weight the data, I receive the error message that "Categorical variable SL6 contains non-integer values". SL6 is the last variable listed just before the weight variable.

Bengt O. Muthen posted on Wednesday, June 03, 2015 - 12:52 pm

Q1. WLSMV, ML, or Bayes.

Q2. Yes.

Send problematic output to Support.

Ejlis posted on Saturday, June 13, 2015 - 1:06 pm

Hi!
I have run a CFA with three correlated factors (WLSMV estimation). Why is it that when I run the same model as a hierarchical model, it produces the very same fit results as the correlated model?
How, then, to choose between the two?

Bengt O. Muthen posted on Saturday, June 13, 2015 - 1:58 pm

You can't choose on statistical grounds since they produce the same correlation matrix. It is just two ways of looking at the same thing. We have this situation often - for instance, with EFA and correlated vs uncorrelated factors. Go with whatever alternative is most useful to you.

Tais S. Barreto posted on Thursday, June 25, 2015 - 9:56 am

Hi,
Is it possible to set minimum and maximum item loadings in CFA using Mplus?
For instance, I would want one item to load with a .7 or higher on a latent factor and another to load with a .5 or lower. I am trying to simulate several models for fit comparison.

Thank you very much.

Linda K. Muthen posted on Thursday, June 25, 2015 - 10:34 am

You can do this using MODEL CONSTRAINT. See the user's guide for further information.
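
A minimal sketch of inequality constraints on two loadings, using hypothetical item names and labels (l1, l2). Note that the bounds apply to the unstandardized loadings as parameterized here, with the factor variance fixed at 1:

```mplus
MODEL:
  f BY y1* (l1)
       y2 (l2)
       y3 y4;
  f@1;
MODEL CONSTRAINT:
  l1 > 0.7;
  l2 < 0.5;
```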

Yanxia WANG posted on Tuesday, August 11, 2015 - 1:32 am

Hi Professor Muthen,

I tried to run a CFA in Mplus; however, the output keeps warning "unexpected end of file reached in data file". I read the related responses posted earlier, then checked the number of variables in the "names" part and found that it matches the number of variable columns in the data set. I really do not know how to deal with this issue; please help me figure it out. Thanks so much!

YX

Linda K. Muthen posted on Tuesday, August 11, 2015 - 6:13 am

It sounds like you have blanks in the data set and are reading it free format where blanks are not allowed. If you can't see the problem, send the output, data set, and your license number to support@statmodel.com.
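
One way around this, sketched under the assumption that the blanks in the data file have first been recoded to a numeric flag (the value -999, the file name, and the variable names are hypothetical):

```mplus
DATA:
  FILE = mydata.dat;
VARIABLE:
  NAMES = y1-y10;
  MISSING = ALL (-999);    ! blanks recoded to -999 in the file beforehand
```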

Yanxia WANG posted on Tuesday, August 11, 2015 - 11:38 pm

Thanks Muthen, I have already fixed it by dealing with the missing values in the data file. Thanks again for your reply.

YX

Robert Buch posted on Monday, October 19, 2015 - 12:16 am

Dear Dr. Muthen,

when using:

Analysis:
Type = COMPLEX;
MODEL=NOMEANSTRUCTURE;
ESTIMATOR = wlsmv;

I obtain the SRMR fit, but when adding:
CATEGORICAL =....

Then I no longer obtain the SRMR.. Is there a way to obtain it when using the "categorical=" option?

From reading an earlier post I thought "MODEL=NOMEANSTRUCTURE;" would do the trick, but seems this does not help?

Linda K. Muthen posted on Monday, October 19, 2015 - 6:31 am

With CATEGORICAL outcomes, SRMR is available only when there are no thresholds and no covariates. If this is your situation and you don't get SRMR, please send the output and your license number to support@statmodel.com.

Yoon Jae Kim posted on Monday, November 09, 2015 - 10:04 am

(I accidentally posted this in the "mean structures" thread, but I cannot figure out how to delete that comment; apologies)

Greetings,

I am using Mplus to test the theorized factor structure of a 6-item unidimensional measure. The items are rated on a 1-5 scale with strongly disagree <-> strongly agree anchors.

My question is this: is there a minimum percent of my cases that need to select a given answer option (e.g., "1: Strongly Disagree") in order to assume that my data is continuous? I.e., say only 2% of my cases indicate "1: Strongly Disagree" for item 1. Is this a problem? What about if only 2% of my cases indicate "1: Strongly Disagree" for the measure as a whole?

I have heard that there is a rough rule around 5%, but I have yet to find a citation for it. I.e., I have heard that if less than 5% of my cases indicate a response (e.g., "1: Strongly Disagree"), that A) I can no longer assume my data is continuous, and B) that I should combine that response with another (e.g., combine "1: Strongly Disagree" with "2: Disagree" to create a "Disagree" category).

I would appreciate any input on this. Thank you.

Linda K. Muthen posted on Monday, November 09, 2015 - 10:37 am

If you have floor or ceiling effects, you should treat the variable as categorical. If you don't, you can treat it as continuous. If you have small frequencies, you can collapse categories.

Roosevelt Vilar Lobo de Souza posted on Tuesday, February 02, 2016 - 6:31 pm

I am working with a model that is well established in the literature and has shown good fit in several cross-cultural studies. I'm trying to test this model (18 observed variables split equally across 6 latent variables) with my data (22,000 subjects across 20 countries), but the fit indices are not satisfactory, and I'm not sure whether I am using the correct approach. One difference between my data and the data on which the model has been tested is that mine have greater variability in age. I'm not sure whether this makes for a heterogeneous population and consequently influences model fit (for the construct I'm working with, variation in scores over the life span is expected). I have tested the model with robust estimators for non-normal samples (MLR, WLS), and I have also tested it using age as a covariate (MIMIC). However, model fit did not improve. I wonder if it would be reasonable to try CFA mixture modeling in this case. I appreciate any comments.

Linda K. Muthen posted on Wednesday, February 03, 2016 - 6:47 am

It sounds like you are using a model that was validated on a sample from a different population. I am not sure mixture modeling would help here because the population it was validated on was not unobserved. Try an EFA to see if the CFA you are using is close to what the data show.

Richard Hermida posted on Wednesday, February 17, 2016 - 6:32 am

Hello,

I was curious as to what the best practices were regarding the assessment of measurement invariance when the items were ordered categorical in nature.

Is a multi-group approach appropriate with a non-ML estimator? Or are there better methods?

Any advice would be appreciated.

Thank you for your time.

-R

Linda K. Muthen posted on Wednesday, February 17, 2016 - 10:09 am

The models we suggest are described in the most recent user's guide on the website under measurement invariance. Models are given for the weighted least squares estimator and the maximum likelihood estimator.

Doris Matosic posted on Tuesday, March 08, 2016 - 10:30 am

Hi,

Is it plausible to run a 2-factor CFA with 2 indicators for each factor? Could you possibly suggest any reference on this topic?

Any help would be appreciated. Thank you!

Bengt O. Muthen posted on Tuesday, March 08, 2016 - 1:51 pm

It is not recommended because it is a fragile model - a misspecification of one factor can distort the estimation of the other factor.

Don't know about references - try SEMNET.

Ald posted on Tuesday, March 29, 2016 - 9:36 am

I have the following model:

A by y1 y2 y3 y4 y5;
B by y6 y7 y8 y9 y10;
C by y11 y12 y13 y14 y15;
categorical are y1-y15;
B on A;
C on B;
y4 with y5;
y5 with y15;
y5 with y10;
y8 with y9;
y6 with y11;
y7 with y12;

My question is: Can I correlate the error terms of the indicators of A with the error terms of the indicators of C if A and C are not correlated?
Thank you.

Ashley Strickland posted on Wednesday, April 20, 2016 - 4:41 pm

Hi,
I have done everything I know to do to make a model run- made sure the data file was in the correct order, double checked my program, did a PCA to determine the factor structure most appropriate for the population, increased the iterations to 10,000 and the model still won't converge.

Any ideas?
thanks in advance,
ashley

Linda K. Muthen posted on Wednesday, April 20, 2016 - 5:36 pm

Please send the output and your license number to support@statmodel.com.

David R Lewis posted on Saturday, April 23, 2016 - 12:19 pm

I am trying to do CFA with two different versions. One where factor loadings are freely estimated and a second where the factor loadings for one latent variable are fixed at specific numerical value.

I tried the code below but it was not successful.

Model:
Distress BY Distres1 Distres2 Distre3;
PR BY PR1 PR2 PR3;
CMV BY CMV1 CMV2 CMV3;

MODEL CONSTRAINT:
CMV1 = 1.012;
CMV2 = 0.430;
CMV3 = 0.862;

Linda K. Muthen posted on Saturday, April 23, 2016 - 4:46 pm

Please send the output and your license number to support@statmodel.com.
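
For reference, loadings are commonly fixed at specific values with the @ operator in the BY statement rather than through MODEL CONSTRAINT, which operates on parameter labels and not on variable names. A sketch using the values from the post above, offered as an illustration and not as a diagnosis of the reported run:

```mplus
MODEL:
  CMV BY CMV1@1.012 CMV2@0.430 CMV3@0.862;
```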

Frodi Debes posted on Thursday, April 28, 2016 - 3:45 pm

In the video, Muthen_Mplus_Topic1.avi at http://statmodel.com/course_materials.shtml, from (h.mm.ss) 2.31.06 to 2.32.25, Bengt Muthén advises against specifying a correlated two factor model with only two indicators each, since identification of the model is achieved by each factor borrowing information from the other factor through the covariation of the factors. The factors are then not self sufficient and independently identified. I intuitively agree with this argument and think that factors, in order to be good latent measurement variables, should be able to stand as separate building blocks to be entered into a more complex model. Unfortunately, I have not been able to find this argument presented in any of my text books. Could you please inform me on any literature arguing for this point? I need to convince someone else who thinks differently.
Best
Fróði.

Frodi Debes posted on Thursday, April 28, 2016 - 3:50 pm

In addition, if two indicators is all that we've got for each factor, would the best thing to do not be to restrain the loadings of each indicator to 1?
Best
Fróði.

Bengt O. Muthen posted on Friday, April 29, 2016 - 6:56 pm

I am not aware of literature on my claim.

I would not restrain all the loadings to 1.

Peter Tomsett posted on Tuesday, May 31, 2016 - 12:12 pm

Hello,

I am trying to run a bi-factor CFA on a 30-item scale with 8 specific factors and one general factor (n = 389). The data are ordinal and I am using the WLSMV estimator. I want to account for nested data using the TYPE = COMPLEX command, but whenever I do this, the analysis returns a non-positive definite residual covariance matrix warning, and standard errors cannot be computed. This problem only seems to occur when applying the COMPLEX command, as the same syntax runs fine if you revert to GENERAL. I have managed to run an exploratory bi-factor model using a target rotation on the same data while using the COMPLEX command, with no warnings.

Is there a specific reason why COMPLEX might cause this type of warning? Why might the exploratory model not be vulnerable to the same problem?

Thanks,
Peter

Bengt O. Muthen posted on Tuesday, May 31, 2016 - 5:52 pm

I assume that your Complex run uses weights which changes the data and therefore the estimates. The CFA is probably worse fitting than the EFA, resulting in the problem.

Maren Schulze posted on Thursday, June 02, 2016 - 8:32 am

I am running a CFA on m = 10 imputed datasets, n = 7290. The amount of missing data is not large, less than 5 %.

I have a three factor model:

F1 BY x1* x2 x3;
F2 BY x4* x5;
F3 BY x6* x7;
F1@1;
F2@1;
F3@1;

For this model, I get the following information in the output file:
"THE CHI-SQUARE COULD NOT BE COMPUTED. THIS MAY BE DUE TO AN
INSUFFICIENT NUMBER OF IMPUTATIONS OR A LARGE AMOUNT OF MISSING DATA."

This confuses me, as the amount of missing data is comparatively low and I have 10 imputed datasets.
How trustworthy are the factor loadings that are estimated?

When altering the above mentioned model slightly to

F1 BY x1* x2 x3;
F2 BY x4* x5 x7;
F3 BY x6* x7;
F1@1;
F2@1;
F3@1;

the model is estimated normally. How is that possible?

Linda K. Muthen posted on Thursday, June 02, 2016 - 1:08 pm

This message is related to only chi-square. The parameter estimates and standard errors should be fine. Check TECH9 to be sure there are no other messages. I can't say more without seeing the full output.

Maren Schulze posted on Friday, June 03, 2016 - 9:47 am

Dear Linda,

thanks for your reply.

In TECH 9, I get the following information:
"WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.
THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED
VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES.
CHECK THE RESULTS SECTION FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE x5."

Would this explain why chi square is not computed?

Linda K. Muthen posted on Friday, June 03, 2016 - 10:31 am

You should run one of the imputed data sets separately to see if that reveals the problem better. The output should say for which data set the message is relevant.

Maren Schulze posted on Monday, June 06, 2016 - 2:57 am

Dear Linda,

thanks for pointing this out.

It was the third imputed dataset that caused the problem.

I have run analysis with this dataset separately, getting the following error message:

"WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.
THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED
VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES.
CHECK THE RESULTS SECTION FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE x5."

For this variable, the standardized loading for x5 is 2.38.

I do get chi-square, CFI etc. though.

I have additionally tried
F2 BY x5* x4;
F2@1;

getting the same result.

What can I do?

Bengt O. Muthen posted on Monday, June 06, 2016 - 5:45 am

There is not much you can do except re-specify your model. If you get this problem already in your 3rd draw, your model may be fragile. Perhaps the CFA structure is too strict.

Maren Schulze posted on Wednesday, June 08, 2016 - 7:51 am

Dear Bengt,

thanks for your reply. So would you think that my conclusion "an additional loading of x7 on F2 is needed" is justified when fit indices for this revised model can be estimated and indicate good fit (chi-square (10, N = 7290) = 46.85; p < .01; CFI = .933; RMSEA = .022; SRMR = .030)?

Thank you!

Bengt O. Muthen posted on Wednesday, June 08, 2016 - 10:51 am

I don't want to commit to an answer without having done a full analysis of your data - which we can't do. You might want to try this kind of question on SEMNET.

David Smith posted on Tuesday, June 14, 2016 - 10:34 pm

Hello,
I have conducted a Bayesian CFA without any issues using data from a community-based survey. To account for potential distortion in standard errors from interviewer clustering (39 interviewers, average cluster size = 23), I specified TYPE=TWOLEVEL, where the theoretical interest is at the within level. My question is whether the interpretation of these findings would differ greatly from that of an ML CFA using a marginal approach (TYPE=COMPLEX). My understanding is that TYPE=COMPLEX does not fit Bayes estimator theory. Is this correct?

Linda K. Muthen posted on Wednesday, June 15, 2016 - 9:12 am

Yes, this is correct about Bayes. To understand the difference between multilevel and complex, see

Muthén, B. & Satorra, A. (1995). Complex sample data in structural equation modeling. Sociological Methodology, 25, 267-316.

which is available on the website.

David Smith posted on Wednesday, June 15, 2016 - 4:45 pm

Thank you.

Helen Norman posted on Thursday, June 16, 2016 - 6:18 am

I have a sample of data (n=6000 cases) from a much larger dataset (n=19,000 cases) which I am working on in STATA. In order to weight data correctly, I cannot simply delete the 13,000 unwanted cases from my dataset but rather I must retain all the cases and run sub group analyses on my 6,000 households (in STATA).

I now need to transfer my data to Mplus to run a CFA. However, in doing this, I transfer the entire dataset (19,000 cases) but I only want to run a CFA on my particular sample of 6,000 cases. Is it possible to run subgroup analyses in Mplus like it is in STATA? If so, how?

ps I don’t want to just delete these additional cases before I transfer to Mplus – I know this is probably not correct plus I would like to run CFAs on different sub samples of data.

Linda K. Muthen posted on Thursday, June 16, 2016 - 6:21 am

The USEOBSERVATIONS option can be used to subset data using values of variables in the data set.
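
A minimal sketch, assuming a hypothetical 0/1 indicator variable named insample that flags the 6,000 cases of interest; all other variable names are placeholders as well:

```mplus
VARIABLE:
  NAMES = id insample y1-y8;
  USEVARIABLES = y1-y8;
  USEOBSERVATIONS = insample EQ 1;   ! analyze only the flagged cases
MODEL:
  f BY y1-y8;
```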

Helen Norman posted on Thursday, June 16, 2016 - 6:33 am

Thank you Linda for your very speedy response!

SABA posted on Friday, June 17, 2016 - 9:48 am

Hi, I have positive and negative statements in my questionnaire for each factor. Is it important to recode the negative statements before doing a CFA? Does it affect the CFA results and factor loadings?
What do negative factor loadings indicate?
Thank you

Bengt O. Muthen posted on Friday, June 17, 2016 - 12:59 pm

You may want to ask these general analysis questions on SEMNET.

Adnane Belakhdar posted on Wednesday, August 03, 2016 - 9:53 am

Hi, I'm trying to run a growth curve model, but when I try to run the model, I get the following error message:

Mismatched parentheses:
(MODEL SYNTAX GOES HERE TO BE CHANGED FOR EACH MODEL

Bengt O. Muthen posted on Wednesday, August 03, 2016 - 10:31 am

It looks like your comment about MODEL SYNTAX...gets interpreted as an Mplus command. Check that you have commented out this line.

If this doesn't help, we need to see the full output. Please send to Support along with your license number.
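
In Mplus input, everything after an exclamation mark on a line is treated as a comment. A sketch of how the placeholder line might be commented out, followed by a hypothetical linear growth model (the variable names y1-y4 are assumptions):

```mplus
MODEL:
  ! MODEL SYNTAX GOES HERE TO BE CHANGED FOR EACH MODEL
  i s | y1@0 y2@1 y3@2 y4@3;
```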

Leslie Rutkowski posted on Monday, September 05, 2016 - 8:15 am

Hi Bengt and Linda,

I'm writing out fscores with a fixed format using

FORMAT = 8F10.3, 2I9, F10.3, I4;

But in the .out file, the SAVEDATA summary gives me a format of

Order and format of variables

Q01 8F10.3
Q02 8F10.3
Q03 8F10.3
Q04 8F10.3
Q05 8F10.3
Q06 8F10.3
Q07 8F10.3
Q08 8F10.3
IDSCH 8F10.3
IDSTU 8F10.3
BULLY 8F10.3
CNT I4

Save file format
118F10.3 I4

What am I missing here?

Thanks,
Leslie

Linda K. Muthen posted on Monday, September 05, 2016 - 2:26 pm

Please send the full output and your license number to support@statmodel.com.

Lisa McGarrigle posted on Sunday, September 25, 2016 - 7:53 am

Hi Linda and Bengt,

I have run a CFA using WLSMV estimator due to a combination of continuous and categorical indicators and non-normal data. My understanding was that, when running a CFA, analysis was conducted on the sample variance-covariance matrix. However, as this matrix is not generated in sample statistics, would you be able to tell me what matrix the analysis is based on?

Many thanks,
Lisa

Linda K. Muthen posted on Sunday, September 25, 2016 - 10:54 am

For the categorical variables, tetrachoric or polychoric correlations and thresholds. For the continuous variables, variances, covariances and means.

Lisa McGarrigle posted on Monday, September 26, 2016 - 2:16 am

Thank you for the reply. As variances and covariances are used for the continuous variables I would like to include the sample covariance matrix in my report as an appendix. Could you please let me know how to generate this in Mplus?

Linda K. Muthen posted on Monday, September 26, 2016 - 3:12 pm

Put the continuous variables on the USEVARIABLES list. Do a TYPE=BASIC with no MODEL command.
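
A minimal sketch of such a run, assuming hypothetical names where `u1-u4` are the categorical items and `y1-y4` the continuous ones:

```mplus
DATA:
  FILE = mydata.dat;          ! hypothetical file name
VARIABLE:
  NAMES = u1-u4 y1-y4;        ! hypothetical variable names
  USEVARIABLES = y1-y4;       ! continuous variables only
ANALYSIS:
  TYPE = BASIC;               ! sample statistics only, no MODEL command
```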

Beth Moroney posted on Tuesday, November 29, 2016 - 9:15 pm

I am trying to replicate a bifactor model with eight indicators (diagnosis symptom counts) loading on an Internalizing factor, three indicators loading on an Externalizing factor, and all indicators loading on a General Psychopathology factor. The model runs and has good fit, but I receive the "non positive definite" error message about one of the variables. I have tried rescaling all of the variables because the variable in question has a larger scale than the others. I currently have the covariances between the factors set to 0, but I have also tried allowing them to covary, as well as fixing the variable in question's variance to 1. The residual variance estimate for this variable is undefined, and a negative value if using the scaled (Z) scores. Not sure if it may be of help, but there is a moderate correlation between the raw data of the variable in question and another observed variable, but the parameters have a very low covariance.

I would very much appreciate any suggestions on other things that I can try.

Thank you so much,
Beth

Linda K. Muthen posted on Wednesday, November 30, 2016 - 6:02 pm

Please send the output and your license number to support@statmodel.com. Include TECH4 in the OUTPUT command.

Marco Pannacci posted on Wednesday, December 07, 2016 - 3:44 am

Dear MPLUS support,
I estimated a CFA with a two-level structure in Mplus and saved the data set to obtain the factor scores for my latent variable.
I am including 768 units in this model, but I have complete data without missing values for only 420 units.
I would therefore expect to have factor scores only for the units without missing values.
But looking at the factor scores, I have a factor score value for all 768 units.

How is it possible?

Many thanks
Marco Pannacci

Marco Pannacci posted on Wednesday, December 07, 2016 - 4:54 am

In addition, in my model I have all continuous indicators (no covariates) and I am using the ML estimator (default) for the CFA.
Which imputation method does Mplus use by default?

thanks

Bengt O. Muthen posted on Wednesday, December 07, 2016 - 10:23 am

Mplus uses the standard ML approach assuming MAR, also referred to as FIML. There is no imputation. You get factor scores also for subjects who have some missingness. The scores are produced based on two sources: The estimated model parameters (the subjects with missing are assumed to come from the same population as the rest) and any observed scores for the subject.
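
A rough single-level sketch of such a setup, with a hypothetical missing-value code and hypothetical names (the two-level part of the model in the question is left out for brevity):

```mplus
VARIABLE:
  NAMES = id y1-y10;          ! hypothetical variable names
  USEVARIABLES = y1-y10;
  MISSING = ALL (-999);       ! assumed missing-value code
  IDVARIABLE = id;
MODEL:
  f BY y1-y10;
SAVEDATA:
  FILE = fscores.dat;         ! hypothetical output file
  SAVE = FSCORES;             ! scores are saved for cases with partial data too
```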

Marco Pannacci posted on Thursday, December 08, 2016 - 12:17 am

Thank you very much. Not it is clear.
Therefore my last question.
I have 768 units and 10 quantitative variables. For 6 variables I performed single stochastic imputation before performing the CFA, since these had less than 30% missing values. But I have 4 variables with 50% missing values, and I did not perform stochastic imputation for them because it did not work well.
Question: Do you think it is fine if I perform the CFA on this dataset with FIML (not listwise deletion) even though I imputed 6 variables beforehand?
In other words, is it a problem to use partial imputation and then FIML on this dataset for CFA?

Thank you very much,
Marco

Marco Pannacci posted on Thursday, December 08, 2016 - 12:25 am

*sorry I meant "Now it is clear" (typing error).

Bengt O. Muthen posted on Thursday, December 08, 2016 - 5:08 pm

I would not use FIML and imputation in combination - that might lead to incorrect SEs. Always use FIML instead of imputation if FIML is possible.

Zoltan Kozinszky posted on Friday, December 30, 2016 - 11:54 am

Dear Professors,

I am not a statistician. I cannot decide whether I have to perform my CFA with the WLSMV or the MLR estimator.
We distributed a 10-item questionnaire to 5000 women. Four answers could be given to each question, and the answers are scored 0-3 (0 is the most favourable answer and 3 is the least favourable answer, indicating depression in each item). The distributions of points on each item are non-normal and very skewed. There are 2-4% of people with high scores (probable depression). Is the WLSMV estimator what I need?

Bengt O. Muthen posted on Saturday, December 31, 2016 - 6:47 am

You should treat your 10 variables as categorical (Categorical = option of the Variable command). You can use either WLSMV or MLR - for choices see the FAQ on our website

Estimator choices with categorical outcomes
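
A minimal sketch of declaring the items as categorical, with hypothetical item names `q1-q10` and a one-factor model assumed only for illustration:

```mplus
VARIABLE:
  NAMES = q1-q10;             ! hypothetical item names
  CATEGORICAL = q1-q10;       ! treat the 0-3 items as ordered categorical
ANALYSIS:
  ESTIMATOR = WLSMV;          ! or ESTIMATOR = MLR;
MODEL:
  dep BY q1-q10;
```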

Sakshi Bhargava posted on Saturday, January 07, 2017 - 1:34 pm

Hello,
I am trying to perform multigroup CFA to test for measurement invariance. I am not sure if I am using the correct syntax for estimating the constrained model.

Can you please tell which of the following syntax is correct?

Thank you!

Constrained model:

ANALYSIS:
ESTIMATOR IS MLR;
TYPE = MGROUP;

MODEL:

HI BY XHI1
XHI4
XHI5
XHI6
XHI7
XHI8;

MODEL BLACK:
HI BY
XHI4 (1)
XHI5 (2)
XHI6 (3)
XHI7 (4)
XHI8 (5);

MODEL LATINO:
HI BY
XHI4 (1)
XHI5 (2)
XHI6 (3)
XHI7 (4)
XHI8 (5);

OR

ANALYSIS:
ESTIMATOR IS MLR;
TYPE = MGROUP;
MODEL:

HI BY XHI1 (1)
XHI4 (2)
XHI5 (3)
XHI6 (4)
XHI7 (5)
XHI8 (6);

Bengt O. Muthen posted on Saturday, January 07, 2017 - 5:00 pm

See the Topic 1 course handout and the Chapter 14 UG treatment of multiple-group analysis of measurement invariance.
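
As a hedged sketch of the general setup (not a replacement for the handout), groups are usually defined with the GROUPING option. The grouping variable `ethnic`, its codes, and the group labels below are assumptions. With continuous indicators, loadings and intercepts are held equal across groups by default, so the constrained model needs no equality labels:

```mplus
VARIABLE:
  NAMES = ethnic XHI1 XHI4-XHI8;                   ! hypothetical layout
  USEVARIABLES = XHI1 XHI4-XHI8;
  GROUPING = ethnic (1 = white 2 = black 3 = latino);  ! assumed codes
ANALYSIS:
  ESTIMATOR = MLR;
MODEL:
  HI BY XHI1 XHI4-XHI8;      ! loadings equal across groups by default
! To relax the loadings in one group, mention them in a
! group-specific MODEL command, for example:
! MODEL black:
!   HI BY XHI4-XHI8;
```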

Zi Yan posted on Monday, January 16, 2017 - 5:04 pm

I did a CFA. The Mplus output file said " NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED. FACTOR SCORES WILL NOT BE COMPUTED DUE TO NONCONVERGENCE OR NONIDENTIFIED MODEL."
Why is that? Thanks.

Linda K. Muthen posted on Monday, January 16, 2017 - 5:59 pm

Please send the output and your license number to support@statmodel.com.

Mercy Oyet posted on Friday, February 03, 2017 - 3:35 am

Hi, I am new to CFA. I am testing a model comprising three latent factors made up of ten items each. The three factors are highly correlated (ranging from .9 to .92). My fit indices are not very good. Each model has some items repeated because what is being measured are people's perceptions of themselves when they take on different identities. For instance, as a mother I see myself as hardworking; as a wife I may also see myself as hardworking. So each item counts. I used different samples in generating the items. Then I ran separate EFAs using another set of different samples and they were very good. Next, the CFA for the three factors was conducted using a different sample. The Cronbach's alpha of each scale is really good using the CFA sample. The only way I can get good fit indices is to correlate measurement errors. Some of the correlations make sense. For instance, the repeated items are expected to correlate, as are other somewhat similar items (e.g., reliable and dependable). However, other error correlations recommended by the modification indices appear atheoretical. I am unsure how to proceed and would appreciate any advice. I don't know whether I should collect data from three separate samples and conduct separate CFAs testing against other factors to demonstrate discriminant validity. I am new to CFA and would appreciate a step-by-step guide on how to do this. Thanks so much!

Mercy Oyet posted on Friday, February 03, 2017 - 3:37 am

Hi, I forgot to add that I am running the CFA using AMOS. Thanks.

Linda K. Muthen posted on Friday, February 03, 2017 - 6:18 am

General modeling questions like this should be posted on a general discussion forum like SEMNET. This discussion forum is for questions specific to Mplus.

Mercy Oyet posted on Friday, February 03, 2017 - 7:15 am

Thanks Linda for your prompt reply. I will seek guidance there.

Luis Anunciação posted on Wednesday, February 08, 2017 - 6:44 am

Dr. Muthen, could you help me understand the output for fscores?

First file:
NAMES ARE i1-i32;
USEVARIABLES ARE i13 i24 i20 i18 i7 i4 i25 i8 i31 i22 i23 i16 i26 i15 i11
i29 i17 i27 i30 i1 i19 i9 i3 i14 i5;
CATEGORICAL i13 i24 i20 i18 i7 i4 i25 i8 i31 i22 i23 i16 i26 i15 i11
i29 i17 i27 i30 i1 i19 i9 i3 i14 i5;
MODEL:
emocional BY i13 i24 i20 i18 i7 i4 i25 i8 i31 i22 i23 i16 i26 i15 i11;
social BY i29 i17 i27 i30 i1 i19 i9 i3 i14 i5;

Second file (with same data, but more variables)
NAMES ARE i1-i68;
USEVARIABLES ARE i43 i54 i50 i48 i37 i34 i55 i38 i61 i52 i53 i46 i56 i45 i41
i59 i47 i57 i60 i31 i49 i39 i33 i44 i35;
CATEGORICAL IS i43 i54 i50 i48 i37 i34 i55 i38 i61 i52 i53 i46 i56 i45 i41
i59 i47 i57 i60 i31 i49 i39 i33 i44 i35;
MODEL:
emocional BY i43 i54 i50 i48 i37 i34 i55 i38 i61 i52 i53 i46 i56 i45 i41;
social BY i59 i47 i57 i60 i31 i49 i39 i33 i44 i35;

Results 1
Number of observations 6530
Number of dependent variables 25
I13
Category 1 0.810 5292.000
(etc)

Results 2
Number of observations 6530
Number of dependent variables 25
I43
Category 1 0.810 5292.000
OK, these look the same (I checked this many times).
Fscores 1
1.711 0.796
0.898 0.642
(etc)
Fscores 2
-0.087 -0.213
-0.025 -0.19

I appreciate any help.

Bengt O. Muthen posted on Wednesday, February 08, 2017 - 3:53 pm

Please send your files to Support along with your license number.

Kelly M Allred posted on Wednesday, March 01, 2017 - 10:57 am

I am trying to run CFA and ESEM models in a sample of patient and relative dyads. Is there a way to model these data to account for non-independence among participants?

Bengt O. Muthen posted on Wednesday, March 01, 2017 - 12:12 pm

You can do the analysis using a single-level, wide data format approach. Then the model accounts for the non-independence. I think we have papers on this on our website under Papers, Dyadic Analysis.

Jonelle Reynolds posted on Wednesday, March 22, 2017 - 9:41 am

I am conducting a longitudinal dyadic data analysis of 1285 coparents for 3 waves of data. I have relationship quality and age (continuous) and categorical variables (race, education and relationship status). The two latent constructs are RelQ and RelS. The data is in the wide format. The estimator is WLSMV.

I am trying to run a CFA beginning with year 1. After univariate proportions and counts for cat variables it says: no convergence. number of iterations exceeded. I am not sure what I should do next.

Here is the syntax used:
VARIABLE:
usevar are m2d6T f2d6T cm2relf cm1age cf1age m1race f1race cm1edu cf1edu;

categorical are cm2relf m1race f1race
cm1edu cf1edu;

MODEL:
RelQ1 BY m2d6T cm1age m1race cm1edu;
RelQ1 BY f2d6T cf1age f1race cf1edu;
RelQ1@1.0;

RelS1 BY cm2relf cm1age m1race cm1edu;
RelS1 BY cm2relf cf1age f1race cf1edu;
RelS1@1.0;

m2d6T cm1age m1race cm1edu pwith f2d6T cf1age f1race cf1edu;
cm2relf cm1age m1race cm1edu pwith cm2relf cf1age f1race cf1edu;

OUTPUT: TECH1 MODINDICES STANDARDIZED;

Bengt O. Muthen posted on Wednesday, March 22, 2017 - 6:49 pm

Send your output to Support along with your license number.

Binhindi posted on Friday, March 24, 2017 - 2:01 pm

Hi,
I am running a MCFA in 2 levels and I have 2 questions:
First, I saw some researchers went through 4-5 steps to do that. Can you refer me to a good example doing that?
Second, when I tried to get the descriptive parameters using WLS for categorical in 2 level, I got this message:
Categorical variable "A" contains less than 2 categories!!
I checked the variable and found that was not correct. It has values from 1-4! Any suggestions?

Linda K. Muthen posted on Friday, March 24, 2017 - 5:34 pm

You should send the output, data set, and your license number to support@statmodel.com regarding question 2. You are most likely reading the data incorrectly.

You may want to ask your other question on a general discussion forum like SEMNET.

fulan fu posted on Thursday, July 27, 2017 - 12:21 pm

Dear Dr. Muthens,

I am trying to run a multiple-group CFA model, but I want to use a Bayesian estimator. The grouping variable is race, and I want to study factor mean differences and factor loading differences among the groups simultaneously. The next step is to examine the composite reliability of the CFA model in the different groups.

My question is:
1. Is the Bayes estimator available for multiple-group CFA?
2. If not, what is the difference between MG-CFA and independent CFA models if I set the latent variable variance to 1 and the first item loading to 1?

Thanks for this great forum!

Lydia August posted on Monday, August 28, 2017 - 2:15 pm

Hello,

I'm relatively new to CFA and have a few questions:

1) If I have a two-factor CFA model based on categorical indicators estimated using WLSMV that does not fit well (RMSEA=0.14, CFI=0.63, TLI=0.57, WRMR=1.72), would it be appropriate to refine the model by dropping indicators with low, non-significant loadings? If so, should I drop items after estimating the whole (two factor) model, or should I first drop items for each construct, estimate those single factor models, and then add them into the larger model?

2) Once I have a refined model, how can I compare the refined model to the original model, since I do not believe that dropping indicators constitutes a nested model? Would using MLR to get BICs and comparing those be an appropriate way to do this? Unfortunately I have a small sample of 157 cases and therefore splitting the sample to do an EFA after the CFA was not an option.

3) If theory indicates that there should be a correlation between the two factors and I am finding that there is almost no correlation between factors in my model, am I justified in leaving both factors in the CFA and presenting the model as is? Or should I choose one scale and ignore the other?

Thanks very much!

Bengt O. Muthen posted on Monday, August 28, 2017 - 4:35 pm

I would not recommend dropping items to get a better fit. Use Modification indices instead. But these general analysis strategy questions are better suited for SEMNET.

Tom Young posted on Thursday, October 12, 2017 - 7:24 am

Dear All,

I am examining the factor structure of a questionnaire and my input instructions are incorrect. I have consulted Mplus guidelines but still no luck. Therefore, could I have some help on the input please?

The questionnaire has 15-items and 1 factor and I am using zero-mean small variance priors using a Bayesian estimator: Here is my input:

TITLE: Bayesian model with cross-loadings and zero-mean and small-variance priors

DATA: FILE IS MTI Validation.dat;

VARIABLE: NAMES ARE MTI1 MTI2 MTI3 MTI4 MTI5 MTI6 MTI7 MTI8 MTI9 MT1I0 MTI11 MTI12 MTI13 MTI14 MTI15;

USEVARIABLES MTI1-MTI15;

ANALYSIS:
ESTIMATOR = BAYES;
FBITERATIONS = 100000;

MODEL:
MTI BY MTI1* MTI2 MTI3 MTI4 MTI5 MTI6 MTI7 MTI8 MTI9 MT1I0 MTI11 MTI12 MTI13 MTI14 MTI15;
MTI-MTI15@1;

MTI BY MTI1-MTI15*0(A1-A15);

MTI1-MTI15(RV1-RV15);
MTI1-MTI15 WITH MTI1-MTI15(CR16-CR109);
!(K*(K-1)/2)!K=number of items (this example = 15 items*(15-1 = 14)/2

MODEL PRIORS:

A1-A15 ~ N(0,.01);

RV1-RV15 ~IW(1,21); !K=number of items +6
CR16-CR110 ~IW(0,21); !inverse wishart distribution

OUTPUT:tech1 tech8;

PLOT: type = plot 2

Please help!

Bengt O. Muthen posted on Thursday, October 12, 2017 - 7:51 am

Please send your output to Support along with your license number.

Tom Young posted on Monday, November 20, 2017 - 6:53 am

Hello there,

I have successfully conducted a Bayes analysis on the 15-item questionnaire I am looking to validate. However, I have a posterior predictive p-value of 0.021, and values of 2.018 and 117.962 for the Difference Between the Observed and the Replicated Chi-Square Values. Does this mean the model is rejected and possibly misspecified? Do you have any advice or research papers you could point me towards, please?

Kind Regards

Tom Young

Bengt O. Muthen posted on Monday, November 20, 2017 - 3:49 pm

Have a look at the paper on our website:

Muthén, B. & Asparouhov, T. (2012). Bayesian SEM: A more flexible representation of substantive theory. Psychological Methods, 17, 313-335.

This uses small-variance priors to obtain what amounts to modification indices for e.g. cross-loadings that need to be included in the model.

Tom Young posted on Tuesday, November 21, 2017 - 3:14 am

Thanks Bengt, I have read at length about including small-variance priors and cross-loadings in the model. I have only one factor with 15 items; therefore, do I need cross-loadings? Also, I have tried including cross-loadings in the syntax, but it just comes back with an error message saying invalid commands and won't tell me which commands to change. Any help would be much appreciated.

Kind Regards

Bengt O. Muthen posted on Tuesday, November 21, 2017 - 2:59 pm

Then you can't add cross-loadings. The non-optimal fit is a matter of residual covariances between the factor indicators. That's a much harder Bayes topic. If you don't have that experience, I would use ML modification indices to find the res covs that need freeing.

Tom Young posted on Wednesday, November 22, 2017 - 4:08 am

OK, thanks Bengt. The only remaining error message it has come back with is:

*** ERROR in MODEL command
Unknown variable(s) in a BY statement: -DF8
I have gone back through the syntax and checked for a space or an incorrect dash, but it still won't run, and DF8 is clearly a variable in my model. Any help would be much appreciated.

Bengt O. Muthen posted on Wednesday, November 22, 2017 - 3:48 pm

Are you sure you haven't called the variable "-DF8" instead of "DF8"? If that doesn't help, send to Support along with your license number.

Tom Young posted on Wednesday, November 29, 2017 - 6:00 am

Thanks for all the support. I have one final issue. I am running a Bayesian CFA with 15 items loading onto 1 factor, and the output shows good model convergence with a PSR value of 1.0 on the two Markov chains, a posterior predictive p-value of 0.56, and 95% Confidence Interval values for the Difference Between the Observed and the Replicated Chi-Square Values of -51.656 and 44.576. However, my factor loadings are very low (none is higher than 0.025). Could you shed some light on what might be happening here, or point me in the direction of some good sources, please?

Bengt O. Muthen posted on Wednesday, November 29, 2017 - 11:46 am

I assume you have set the factor metric by fixing the factor variance at 1. If so, the low loadings reflect low sample correlations between the items.

tianyi fan posted on Friday, December 01, 2017 - 12:53 pm

Dear Dr. Muthen,
I have run into an issue in multi-group CFA (with the known class option). I fix all factor loadings to certain values. I first ran the multi-group CFA with covariates using the 2005 rural sample and the 2005 urban sample. Then I ran a second multi-group CFA with the same set of covariates using the 2005 rural sample and the 2010 rural sample. From the output, I found that even though I fix the loadings for the indicators of the latent factor to the same values for the 2005 rural sample in these two multi-group analyses, the regression coefficients for the covariates changed slightly. Because I use the same 2005 rural sample and fix the loadings to the same values, I expected the regression coefficients not to change. Does changing "the other group" in the two-group analysis influence the regression coefficients of the covariates for the 2005 rural sample? Thanks very much for answering my questions.

Bengt O. Muthen posted on Friday, December 01, 2017 - 2:46 pm

Send the 2 outputs that disagree to Support along with your license number.

Tom Young posted on Thursday, December 14, 2017 - 3:09 am

Dear Dr Muthen,

In reply to your previous message: 'I assume you have set the factor metric by fixing the factor variance at 1. If so, the low loadings reflect low sample correlations between the items'.

Is there any way I can resolve this issue? Is it a case of collecting more data? I have also noticed that the standardized factor loadings in the standardized model results are all 0.70 or above. Is there a rationale for using standardized factor loadings, please?

Bengt O. Muthen posted on Thursday, December 14, 2017 - 12:18 pm

This is a good question for SEMNET.

Victoria Thompson posted on Saturday, December 16, 2017 - 9:59 am

Hi,

I have a couple of questions regarding the MIMIC model and its syntax in Mplus. I am looking at DIF. I have used the information from Topic 2 on the Mplus website, but I want to be sure I am clear. I ran a CFA which showed good fit indices. I have a four-factor second-order model. I then went ahead and added 4 covariates to the CFA. What I am confused about is this: when I run this model, I constrain the covariates to zero, check the MIs, and then add direct effects. The next time I run the model, do I keep the covariates fixed at zero?

I have 20 items. Do I use the following syntax and just include the direct effects between the covariates and the items with the largest MIs one at a time? If a significant direct effect is evident, can I report DIF in that item? I have read so much on MIMIC and have utterly confused myself. I apologize if this doesn't make sense.

Thanks

Bengt O. Muthen posted on Saturday, December 16, 2017 - 2:42 pm

Q1: I don't know what you mean by "keeping the covariates fixed at zero".

Q2: You let the covariates influence the factor(s) and add one direct effect at a time. See also my 1989 Psychometrika article on our website.
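
A minimal sketch of that sequence, using a hypothetical one-factor model with indicators `y1-y5` and covariates `x1 x2` (the second-order structure in the question is left out). In the first run the direct effects are fixed at zero so that modification indices are printed for them:

```mplus
MODEL:
  f BY y1-y5;
  f ON x1 x2;               ! covariates influence the factor
  y1-y5 ON x1-x2@0;         ! direct effects fixed at zero, inspect their MIs
OUTPUT:
  MODINDICES;
```

In the next run, the direct effect with the largest modification index would be freed by adding a statement such as `y3 ON x1;` after the fixed statement, and the step is repeated one direct effect at a time.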

xu shuangfei posted on Wednesday, December 20, 2017 - 7:01 pm

Dear Dr Muthen,
When I do a CFA containing nominal, categorical, and continuous variables, the default output displays limited model fit indices, such as chi-square, loglikelihood, and AIC, rather than CFI and TLI. Could you give me any suggestions?

Linda K. Muthen posted on Thursday, December 21, 2017 - 7:02 am

Please send the output and your license number to support@statmodel.com.

Cheng posted on Tuesday, January 09, 2018 - 8:34 pm

Dear Linda,
In the CFA output under STDYX standardization, do we call the estimate corresponding to a WITH statement (e.g., A WITH B) a covariance? Is that right?

Bengt O. Muthen posted on Wednesday, January 10, 2018 - 10:15 am

When standardized it is a correlation between the two residuals.

Cheng posted on Wednesday, January 10, 2018 - 4:54 pm

Dear Dr. Muthen, how about latent A WITH latent B under STDYX? They don't have residuals. Is it called the "covariance" between latent A and latent B? Then under MODEL (standardized), do we call it the correlation between latent A and latent B?

Sorry for asking such basic questions. I would like to make sure I am reporting the right values in my report.

Thank you.

Bengt O. Muthen posted on Thursday, January 11, 2018 - 4:28 pm

Right.

Karlygash Assylkhan posted on Wednesday, February 14, 2018 - 9:04 am

Hi Dr. Muthen,
Can you advise why the number of observations is halved from the sample size?
When I ran CFA, the number of observations is halved from 714 to 357, although the number of groups was 1. The syntax terminated normally but only took half of the sample.

The following syntax:

MODEL:
TYPE IS GENERAL;
ESTIMATOR IS ML;
ITERATIONS = 5000;
CONVERGENCE = 0.00005;
! MODEL IS NOMEANSTRUCTURE;

OUTPUT: MODINDICES(1) STANDARDIZED CINTERVAL;
SAVEDATA: FILE IS FSCORESZscoresCONTROL.DAT;
SAVE=FSCORES;

INPUT READING TERMINATED NORMALLY

Model Factor Scores WITH CONTROL VARIABLES 02-11-2018

SUMMARY OF ANALYSIS

Number of groups 1
Number of observations 357

Number of dependent variables 74
Number of independent variables 0
Number of continuous latent variables 13

Bengt O. Muthen posted on Wednesday, February 14, 2018 - 4:08 pm

This is usually due to listing more variables on the NAMES = list than there are columns in the data. If this doesn't help, send data and output to Support along with your license number.

Tom Young posted on Wednesday, April 04, 2018 - 8:46 am

Dear Dr Muthen,

I am wondering whether I could have some help, please? I am having a problem with a Bayesian CFA analysis where I have 15 items loading onto 1 factor. I have adequate model convergence, but my PPP is 0.001 and the 95% confidence interval is 145.636 to 224.885. I have exhausted every avenue I can think of. Here is my model:

ANALYSIS:
ESTIMATOR = BAYES;
FBITERATIONS = 100000;

MODEL:

MTI BY MTI1* MTI2 MTI3 MTI4 MTI5 MTI6 MTI7
MTI8 MTI9 MTI10 MTI11 MTI12 MTI13 MTI14 MTI15;
MTI@1;

OUTPUT: TECH1 TECH8;
PLOT: TYPE=PLOT2;

Is there anything I am missing here? Is the model misspecified? I do not think I need cross-loadings, as I only have one factor with 15 items loading onto it. Do I still need to include priors? I have tried that and the model fit is still horrendous.

Any help would be much appreciated.

Regards Tom

Bengt O. Muthen posted on Wednesday, April 04, 2018 - 3:33 pm

The model may fit poorly because of residual correlations or because other factors are at play. You can do an EFA to find out if other factors are relevant.
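
As a minimal sketch, an exploratory run extracting one to three factors from the 15 items could look like this. The file name is hypothetical and the upper limit of three factors is an arbitrary choice for illustration:

```mplus
DATA:
  FILE = mti.dat;             ! hypothetical file name
VARIABLE:
  NAMES = MTI1-MTI15;
ANALYSIS:
  TYPE = EFA 1 3;             ! solutions with 1, 2, and 3 factors
```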

Tom Young posted on Monday, April 09, 2018 - 5:06 am

Thanks for the advice, I have added small variance informative priors for the residual covariances for my 15-item 1 factor model.

The model fit is excellent indicating a good match between the model and the data:

PPP: 0.55
95% confidence interval: -50.41 to 45.30
PSRF values: 1.01

However, I have a couple of questions:

1. 11 of the 15 unstandardized factor loading values are higher than 1.0. How do I interpret this, please?

2. I also see that a few authors have completed a sensitivity analysis to see the influence of the priors on the estimates. I used p1-p15~IW(1,21) and p16-p120~IW(0,21) from your 2012 paper, where you used an inverse-Wishart distribution with degrees of freedom equal to the number of items + 6, corresponding to prior means and standard deviations for residual covariances of zero and 0.1. How do I change the IW to obtain other prior means and standard deviations, please?

Kind Regards

Bengt O. Muthen posted on Monday, April 09, 2018 - 3:57 pm

1. Unstandardized loadings don't need to be less than 1. They are totally dependent on the scales of the indicator and the factor just like regular regression coefficients.

2. For guidelines on this, see the paper on our website:

Asparouhov, T., Muthén, B. & Morin, A. J. S. (2015). Bayesian structural equation modeling with cross-loadings and residual covariances: Comments on Stromeyer et al. Journal of Management, 41, 1561-1577.
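
As a quick check on the label ranges quoted in the question (p1-p15 for the residual variances and p16-p120 for the residual covariances), the count for K = 15 items works out as follows:

```latex
\[
\frac{K(K-1)}{2} = \frac{15 \cdot 14}{2} = 105,
\qquad
15 + 105 = 120 .
\]
```

So 15 residual variances plus 105 residual covariances use labels 1 through 120, which matches p16-p120 for the covariances.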

Sophia Winkler-Schor posted on Friday, June 29, 2018 - 4:37 am

Hello,
I ran an LCA with 15 variables and 4 classes. I need the standard deviations (or variances) for each item in the 4 classes. When I run the commands below, I receive output that has the means and variances for the pooled data, but when the means and variances are provided for each class, the means all differ (as they should) while the variances are the same for each item in each class. I am struggling to understand why this is happening.

Is there another way to get the variance/SD for each item in the 4 separate classes?

Analysis:
Type = Mixture;
Savedata:
file is LCA_4_save.txt;
save is cprob;
format is free;
!Analysis:
!Estimator is ML;
!Model:
!Alt by IV1-IV3;
!Bio by IV4-IV6;
!Ego by IV7-IV9;
!Hed by IV10-IV12;
!Eud by IV13-IV15;
!Output: tech11 tech14 standardized sampstat mod residual;
Output: tech11 tech14 sampstat;

Tihomir Asparouhov posted on Monday, July 02, 2018 - 11:57 am

Use this model command

Model:
%overall%
IV1-IV15;
%C#1%
IV1-IV15;
%C#2%
IV1-IV15;
%C#3%
IV1-IV15;
%C#4%
IV1-IV15;

The default is to hold the variances equal as it is a more introductory model (easier to estimate). Using the above model you directly specify that variances are estimated as class-specific parameters.

Lisa Dragoni posted on Tuesday, July 17, 2018 - 12:00 pm

I am seeking to generate validation evidence for my focal construct, vqlty. One rater assessed multiple individuals so I am controlling for rater effect by using the Twolevel type analysis and group mean centering my variables (have all items at the within level and set the means to zero).

I get the following error message:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.103D-16. PROBLEM INVOLVING THE FOLLOWING PARAMETER:
Parameter 3, %WITHIN%: [ VQLTY3 ]

1. I don't think my model is unidentified--is it?
2. If the problem is with my start values, how do I fix this?
3. If the problem is with one of my parameters, how do I fix this?

Thank you! Lisa

Bengt O. Muthen posted on Wednesday, July 18, 2018 - 7:00 am

Send your output to Support along with your license number.

Mira Patel posted on Monday, July 23, 2018 - 6:25 pm

I have a questionnaire (10 items, with the same 5 response choices for each item) that is collected daily for 12 weeks. For each week, I can average each item, as well as the total score, to get weekly averages per person. Can I do a CFA on weekly averages?

We also plan to assess measurement invariance between baseline and Week 12, so I am wondering if it's okay to use the weekly averages, or should I pick a single day from that week?

Bengt O. Muthen posted on Tuesday, July 24, 2018 - 2:25 pm

If you like, you don't have to average but could analyze the 10 items using all 7*12 time points in one analysis. This can be done using Dynamic SEM as shown in the UG ex 9.34.

Mira Patel posted on Tuesday, July 24, 2018 - 3:11 pm

Thanks Dr. Muthen. That's a helpful reference to look into!

And a follow-up question. In a second dataset, I only have the weekly averages (no daily scores were provided), and only at three time points (baseline, Week 8, and Week 16). Can I do an MG-CFA on this dataset, or is there another type of approach I should take?

Bengt O. Muthen posted on Tuesday, July 24, 2018 - 5:01 pm

Having the same subjects measured at different time points cannot be represented by a multiple-group approach because you don't have independent observations across time given that the subjects are the same. Instead, you have a factor model for each time point - and you can apply the same measurement invariance testing across time as you would do across groups.
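
A rough sketch of what invariance across time can look like in a single-group, wide-format model. The three-indicator factors and the variable names (suffixes a, b, c for the three time points) are hypothetical. Matching labels hold loadings and intercepts equal over time:

```mplus
MODEL:
  f1 BY y1a
        y2a (L2)
        y3a (L3);
  f2 BY y1b
        y2b (L2)
        y3b (L3);
  f3 BY y1c
        y2c (L2)
        y3c (L3);
  [y1a y1b y1c] (I1);        ! equal intercepts over time
  [y2a y2b y2c] (I2);
  [y3a y3b y3c] (I3);
  [f1@0 f2 f3];              ! factor means free after the first time point
```

Dropping the labels gives the less restricted model to compare against, in the same way as when testing invariance across groups.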

Mike posted on Tuesday, August 07, 2018 - 12:38 pm

Hello,

We are fitting a multi-group CFA model to test measurement invariance and wondering about the interpretation of some modification indices for a particular group. The model has three factors, each identified by 3 items, and the factors are allowed to correlate.

For one group, the modification indices are dramatically higher than the other groups under the heading "ON/BY statements":

F1 ON F2 /
F2 BY F1

Is this telling us that a hierarchical factor structure (in just this group) would improve model fit? Or, what is it telling us?

We fit the original CFA model to just the data for this group, and the model fit is good, but when we move to a multi-group framework the fit falls apart - suggesting non-invariance of the factor structure across groups?

Thanks for your help!

Bengt O. Muthen posted on Tuesday, August 07, 2018 - 2:48 pm

Please send your full output with these mod indices to Support along with your license number.

Robin Robin posted on Thursday, August 09, 2018 - 7:30 am

Hi,

I have 260 observations. Is this enough to run a CFA using Mplus? My model contains one independent variable, two mediators, and one dependent variable.

Thank you in advance.

Regards

Bengt O. Muthen posted on Thursday, August 09, 2018 - 2:15 pm

The sample size N that you need depends on how many observed variables you have (p) and how many parameters you have (r). You typically want N > p and N >> r.
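
To make p and r concrete, here is a worked count for a purely hypothetical CFA (not the poster's model): three correlated factors with three continuous indicators each, the first loading of each factor fixed at 1, and item intercepts estimated.

```latex
p = 9, \qquad
r = \underbrace{6}_{\text{free loadings}}
  + \underbrace{9}_{\text{residual variances}}
  + \underbrace{3}_{\text{factor variances}}
  + \underbrace{3}_{\text{factor covariances}}
  + \underbrace{9}_{\text{intercepts}}
  = 30
```

With N = 260 this hypothetical model has N > p and roughly 8.7 observations per parameter; whether that counts as N >> r is a judgment call.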

Cesar Daniel Costa Ball posted on Monday, August 13, 2018 - 3:44 pm

I am trying to conduct a CFA, but the output only shows chi-square, TLI, CFI, and RMSEA. How can I get the SRMR index?

Mira Patel posted on Tuesday, August 21, 2018 - 7:22 pm

Can having a high correlation between two items produce a low CFI? What is the syntax to adjust for the high correlation?

Bengt O. Muthen posted on Wednesday, August 22, 2018 - 4:41 pm

You can add a residual correlation using

i1 WITH i2;

Mira Patel posted on Wednesday, August 22, 2018 - 5:08 pm

Thanks, Dr. Muthen. That worked perfectly.

I added residual correlations for an eight-item measure (I chose pairs with correlations greater than 0.80):

q3 with q7;
q4 with q5;
q4 with q8;
q5 with q8;

This increased the CFI from 0.79 to 0.97.

I was wondering if doing something like this is justified?

I am trying to understand whether this is a good way to improve the CFI or whether another way would be better. What's the downside of this?

Bengt O. Muthen posted on Thursday, August 23, 2018 - 6:26 pm

If you have to add that many residual correlations, it seems your original model was not suitable for your data. Adding that many seems like data fishing. They should really make strong substantive sense.

Mira Patel posted on Thursday, August 23, 2018 - 6:58 pm

Got it! Thanks so much for all your help!

Eric Knowles posted on Thursday, September 20, 2018 - 6:41 am

Hi,

I'm attempting to do a CFA in Mplus to test the discriminability between two latent variables -- called ess and col -- and am having some trouble. I've read that I should assess the difference in chi-square between a model in which I freely estimate the correlation between ess and col and a model in which I fix the correlation between ess and col to 1. Here's my model:

model:
ess by ess1* ess3 ess4 ess6; ! * = don't fix @1
col by collude1* collude3 collude4; ! * = don't fix @1
ess@1; ! factor variance = 1
col@1; ! factor variance = 1

ess with col@1; ! comment out to freely estimate correlation

Although the freely estimated model runs fine, the constrained model (as above) leads to a psi matrix error:

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A
LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES.
CHECK THE TECH4 OUTPUT FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE COL.

Am I missing something? (I usually am!)

Thanks so much,

Eric

Eric Knowles posted on Thursday, September 20, 2018 - 7:33 am

Followup to my previous message ... I did a better search of the Mplus discussion board, and found that I can do this:

model:
ess by ess1* ess3 ess4 ess6; ! * = don't fix @1
col by collude1* collude3 collude4; ! * = don't fix @1
ess@1; ! factor variance = 1
col@1; ! factor variance = 1

ess with col (corr_ec);

model test:
corr_ec = 1;

Am I correct that the "Wald Test of Parameter Constraints" is equivalent to a chi-square difference test comparing the fixed and freely-estimated models?

Thanks!

Eric

Bengt O. Muthen posted on Thursday, September 20, 2018 - 5:46 pm

The Wald test is asymptotically equivalent to the chi-2 difference test. But if you are concerned about only the one correlation parameter, why not simply test against 1 and use (Est-1)/SE - the square of that is asymptotically the same as chi-2 with df=1.
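
Written out, the single-parameter test described here is as follows. The symbol \psi_{ec} for the ess-col covariance is notation introduced for this illustration; because both factor variances are fixed at 1 in the model above, that covariance is the factor correlation.

```latex
z = \frac{\hat{\psi}_{ec} - 1}{\mathrm{SE}(\hat{\psi}_{ec})},
\qquad
z^{2} \;\overset{a}{\sim}\; \chi^{2}_{1}
\quad \text{under } H_{0}\colon \psi_{ec} = 1

% Hypothetical numbers, not taken from any output in this thread:
% Est = 0.85, SE = 0.05
z = \frac{0.85 - 1}{0.05} = -3.0,
\qquad
z^{2} = 9.0 > 3.84
```

In this made-up example the squared ratio exceeds 3.84, the 5% critical value of chi-2 with df=1, so a correlation of 1 would be rejected.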

Raffaele Zanoli posted on Sunday, February 03, 2019 - 8:11 am

Hi
For a number of items I have ratings for both performance and importance.
Is there a way in Mplus to run a CFA using the importance scores as weights for the performance scores when testing a multidimensional construct?

Thanks!

Best regards

Raffaele

Bengt O. Muthen posted on Monday, February 04, 2019 - 1:15 pm

I am not aware of how this is done in your literature. Can one use the importance scores as item-specific covariates? I did this in the opportunity-to-learn context but that has more to do with differential item functioning. You may want to ask on SEMNET.

Snigdha Dutta posted on Friday, May 17, 2019 - 10:17 am

I am unable to figure out the cause for this error: I can't find anything wrong in my syntax or my .dat file.

*** ERROR
The following MODEL statements are ignored:
* Statements in the GENERAL group:
[ MMOP9T1$3 ]
[ MMOP9T2$3 ]

Bengt O. Muthen posted on Friday, May 17, 2019 - 11:12 am

We need to see your full output - send to Support along with your license number.

Jane Lee posted on Wednesday, November 13, 2019 - 11:51 am

Dear Dr. Muthen,

I am attempting a CFA for measurement invariance over three waves. My latent variable is "self regulation" and will be measured by 10 items from a self-control scale. This questionnaire has 3 different forms according to the age range of the child (preschool, elementary, secondary). My sample is 600 children who ranged from 0-18 years at baseline. These children therefore answered the self-control questionnaire that corresponded to their age and legitimately skipped the other forms. Over the three waves of data, some children age out of one form and into a later one (from the preschool form up to the secondary form). Could you provide advice as to how I should go about the CFA to test measurement invariance over time? Should I create models for the same age groups over different waves (e.g., self regulation preschool age (wave 1) - self regulation preschool age (wave 2) - self regulation preschool age (wave 3))? Or should I create models for different age groups within the same wave? Also, would Mplus find the large amount of missing data due to the legitimate skips a problem? If yes, how should I correct this? Lastly, each item on the scale is rated on a 3-point Likert scale. Should I treat the items as categorical and use the WLSMV estimator? Thank you.

Bengt O. Muthen posted on Wednesday, November 13, 2019 - 12:05 pm

Many of these issues are discussed in our Short Course Topic 4 video and handout. See slides 79-99, especially 98-99. If you have some items that are the same across forms that makes it possible to put all forms on the same scale in the modeling. That may not be the case. If many children use the same form over time that may be helpful in the modeling but perhaps that is not the case either. Regarding missing data, you can think of the data as 3 groups corresponding to the 3 forms, each group using the same number of items. So that's planned missingness which doesn't cause problems.

Svane Blume posted on Friday, November 22, 2019 - 1:07 am

Hello,

I would like to perform CFA with ordinal data and missing values.

I purchased Mplus because the literature suggests that the latent variable approach for imputing ordinal data, followed by limited-information diagonally weighted least squares estimation, is the best option to address the ordinal and missing data structure in CFA.

However, I am having trouble finding out how to implement this approach, because I am not sure which of the MI approaches described in the User's Guide matches my target model.

Can you help me with this? Thanks in advance!

Tihomir Asparouhov posted on Friday, November 22, 2019 - 10:01 am

The Bayes estimator and the ML estimator are both fine to use as well. We discuss various imputation approaches here
http://www.statmodel.com/download/Imputations7.pdf
but it is fine to use our defaults with type=basic. See User's Guide example 11.5.

Svane Blume posted on Monday, November 25, 2019 - 12:31 am

Thanks for your response.
I already knew the document "Multiple Imputation with Mplus", but I was not able to find the specific estimator I was asking for. Since there is a study by Shi et al. (2019) recommending this approach based on a simulation study, I would really like to stick to it. They used Mplus for this - this is why I purchased Mplus in the first place.

The study can be found here:
https://journals.sagepub.com/doi/full/10.1177/0013164419845039

Thanks again!

Svane Blume posted on Monday, November 25, 2019 - 6:31 am

Sorry, me again. The more I try to get an overview of the different specifications and the corresponding Mplus language, the more confused I get.

In the document "Multiple Imputation with Mplus" you introduce 3 different MI models that are implemented in Mplus, i.e. the Variance Covariance Model, the Sequential Regression Model and the Regression Model (pp.3-5).
In chapter 3.4 you give an example for the imputation with a large number of categorical variables (which I believe represents my data best). From your simulation you conclude that the H1 imputation based on the variance covariance model is the preferred one here (p.13).
However, there is no hint on how to specify this model in Mplus.
I thought I was supposed to carry the newly gained information on the preferred model over to the User's Guide and that it would be easy to find the corresponding Mplus code there. But the term "variance covariance model" doesn't even appear in the User's Guide.
I am wondering what I am missing: How is it possible to find out how the various options can be translated into Mplus code (or whether it is the default anyway)?

Tihomir Asparouhov posted on Monday, November 25, 2019 - 2:39 pm

Take a look at User's Guide page 565 under "MODEL IMPUTATION" and at User's Guide page 578 under "MODEL". The COVARIANCE option is the "variance covariance model" and is also our default so you can essentially skip that command.

I think User's Guide example 11.5 is similar to what you want to do.
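
As a rough sketch of what such an imputation run could look like, modeled loosely on the structure of User's Guide example 11.5: the file names, the variable names u1-u30, the missing value code 999, and the number of data sets are all assumptions for illustration. MODEL = COVARIANCE is written out only to show where the option goes; since it is the default it can be omitted.

```
TITLE:    Sketch of multiple imputation for ordinal items with hypothetical names
DATA:     FILE = satis.dat;           ! hypothetical raw data file
VARIABLE: NAMES = u1-u30;             ! hypothetical item names
          USEVARIABLES = u1-u30;
          MISSING = ALL (999);        ! assumed missing value code
DATA IMPUTATION:
          IMPUTE = u1-u30 (c);        ! (c) treats the items as categorical
          NDATASETS = 20;             ! assumed number of imputed data sets
          MODEL = COVARIANCE;         ! the default, shown for clarity
          SAVE = satisimp*.dat;       ! the asterisk is replaced by the data set number
ANALYSIS: TYPE = BASIC;
```

The saved data sets can then be analyzed in a second run with the estimator of choice.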

Svane Blume posted on Tuesday, November 26, 2019 - 1:58 am

Okay, thanks - I see.

May I ask one more question?

In the "Multiple Imputation with Mplus" document it says that missingness indicators should not be included in the imputation model (p.20).

In our case the missing values are not really missing in the sense of a true value which is unobserved and "masked" by a missing value.
I have items on patient satisfaction with nursing care during a hospital stay. Some of them are not applicable to all patients because certain services were not necessary during their hospital stay. This is why these items contain a category "service was not necessary" which is not part of the satisfaction scale and, thus, coded as missing. Because we couldn't find research on how to handle this kind of "missingness" in factor analysis, we decided to go for the best practice for addressing missing values in general and perform multiple imputation (supplemented by sensitivity analyses using pairwise deletion).
Since the missingness depends on the observed variable "service was not necessary" we assumed MAR and planned to include this indicator in the model. However, the indicator is essentially equal to a missingness indicator.

Is there any chance to stick to this approach and include the indicator in the model? Otherwise we couldn't assume MAR, right?

Tihomir Asparouhov posted on Tuesday, November 26, 2019 - 10:14 am

What is the purpose of trying to include the observations from a patient that didn't use the service? I would need to see the model to be able to tell you more. I can't think of a model where that would be necessary. Consider this - in Mplus - any row that has missing on all dependent variables is automatically discarded. There is no information there and it won't contribute anything to the model. Even if you impute these missing data it won't change the model estimates or standard errors.

Svane Blume posted on Tuesday, November 26, 2019 - 12:13 pm

I have about 30 items on patient satisfaction with different aspects of nursing care. So each patient did need nursing care as a whole, but single items are not applicable to all patients. For example, the satisfaction with wound management can only be evaluated by patients who actually had a wound to take care of. Thus, not all dependent variables are missing, only up to seven variables. Since we know exactly why the values are missing ("patient didn't need service XY"), we assumed MAR and planned to include it as an indicator in the model.

Tihomir Asparouhov posted on Tuesday, November 26, 2019 - 3:04 pm

I see. This sounds perfect then.

I would not recommend including the missing data indicator in the model. For example, in the imputation model if you include the missing data indicator you will create a problem. The correlation between the indicator and the variable it is an indicator for cannot be estimated because when the variable is observed the indicator is constant. We point that out in the paper. Just record "patient didn't need service" as missing.

Svane Blume posted on Wednesday, November 27, 2019 - 1:28 am

Thanks for the recommendation!

Svane Blume posted on Friday, January 17, 2020 - 6:58 am

Hello,

I have two questions regarding my CFA output.

I estimate a WLSMV model based on a multiply imputed dataset. Does Mplus provide neither modification indices nor standardized residuals for this estimation? How do you recommend I proceed in evaluating my model in terms of areas of misfit and possible modifications instead?

R-Square estimates: For one item the R-Squared value is 0.000 with p-value 999.0. The values for all other items are reasonable. Does this imply model misspecification or how should I interpret these values?

Thanks!

Tihomir Asparouhov posted on Friday, January 17, 2020 - 12:10 pm

You can use those for each of the imputed data sets. If certain modifications are suggested consistently across the data sets you can modify the original model and test particular hypotheses with Model Test.

You might also consider using the unimputed original data to search for model modifications.

The R2 alone doesn't necessarily imply misspecification.

Svane Blume posted on Monday, January 20, 2020 - 1:14 am

Thanks. This is quite some effort, but if it's the only possible way, I'll go for it.

Okay, but what exactly does the 0.000 R-Square tell me? I also tested a sub-model for which most factors represent only 2 to 3 items. The respective CFA output essentially contains only 0.000 and 999.0 values although overall model fit is good. Is there no way to get "real" values for the model results?

Svane Blume posted on Monday, January 20, 2020 - 1:49 am

And even if I analyse each of the imputed datasets separately, only unstandardized residuals are available under WLSMV, correct?

Tihomir Asparouhov posted on Tuesday, January 21, 2020 - 9:25 am

Standardized residuals are available under WLSMV for both of these sections

UNIVARIATE PROPORTIONS FOR CATEGORICAL VARIABLES

BIVARIATE PROPORTIONS FOR CATEGORICAL VARIABLES

but not for the polychoric correlation.

Let me walk back the statement on the modification indices etc. You can just run a couple of the imputed data sets. If the amount of missing data is 10% or less the results are not going to be very different across the imputed data sets. Since you are using this step as an exploratory step, you don't need to run all the imputed data sets. Ultimately significance is determined by the combined run that uses all the data sets. You would be using individual data sets to search for model modification ideas not as an ultimate decision maker.

There are other options. You can use the Bayes estimator on the unimputed data and use PPP.

The problem with the 0 R2 and what you are describing there as getting 0 or 999 only might be a bit more serious and you might have to figure that out. Do you get the same when running individual data sets? You can run this also with ML and Bayes to see if you can get any further insights. You can also send data input and output to support@statmodel.com.

R2=0 means that none of the predictors (if there are any) have any effect and you should be able to see this in the model results section.
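
A sketch of the exploratory step mentioned above, run on one imputed data set at a time: the file name, the variable names, and the two-factor model are hypothetical, and the NAMES list assumes the saved file keeps the 30 items in their original order, which should be checked against the variable order reported at the end of the imputation output.

```
TITLE:    Sketch of an exploratory WLSMV run on a single imputed data set
DATA:     FILE = satisimp1.dat;       ! hypothetical name of the first imputed data set
VARIABLE: NAMES = u1-u30;             ! assumed column order, verify before use
          USEVARIABLES = u1-u6;
          CATEGORICAL = u1-u6;
ANALYSIS: ESTIMATOR = WLSMV;
MODEL:    f1 BY u1-u3;                ! hypothetical two-factor structure
          f2 BY u4-u6;
OUTPUT:   MODINDICES RESIDUAL;
```

Repeating this for a couple of data sets and looking for modifications that show up consistently follows the suggestion above; the final decision still rests on the combined run over all imputed data sets.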

Jane Lee posted on Tuesday, February 18, 2020 - 1:46 am

Hello,
I am running an EFA with the WEIGHT IS command because it is a complex sample. Negative residual variances begin to appear at the 4-factor model, but the S.E.s of the estimated residual variances remain positive. Which one should I be reading when deciding the number of factors for my model? Thank you for your help.

Bengt O. Muthen posted on Tuesday, February 18, 2020 - 4:58 pm

You don't want negative residual variance estimates. SEs are always positive so that is not informative.

Jane Lee posted on Wednesday, February 19, 2020 - 1:42 am

Hi Bengt,

Thank you for your answer. The problem is that the 1-, 2-, and 3-factor models without negative residual variances have bad model fit (SRMR higher than 0.1). I get adequate model fit at the 7-factor model, which includes negative residual variances for some variables. If I only extract variables with a positive residual variance to create one latent variable, is this still a problem?

Bengt O. Muthen posted on Thursday, February 20, 2020 - 4:10 pm

It is difficult to find the reason that the EFA model doesn't fit. If you delete one factor indicator with a negative residual variance, another one may appear.

Jordan Thomas posted on Wednesday, July 29, 2020 - 4:21 pm

Hello!
I am interested in establishing invariance of a 5-factor PTSD model in a large sample of three racial/ethnic groups (N=1663). There are no missing data. As my data are categorical (ordinal; e.g., frequency of PTSD symptoms on a 5-point Likert scale), I am using WLSMV for this multigroup CFA approach. (I have already fit the data in the full sample using WLSMV estimation for single-group CFA and it fits quite well.) As per recommendations from the User's Guide and discussion board, theta parameterization is being used for the multigroup approach. My input generated the following error message: "Group 2 does not contain all values of categorical variable: C3_AMNES." I read elsewhere on this board that each group must have the same values on categorical observed variables, and that categories need to be collapsed until the same number of values exist in each group. I am having trouble understanding how this could be meaningful for my research question and data. Are there any other recommendations for how to troubleshoot this error and move forward with invariance testing?

Bengt O. Muthen posted on Wednesday, July 29, 2020 - 5:16 pm

With maximum-likelihood, you can use the * option

Categorical = u1-u5(*);

See UG page 607.

But ML estimation is heavy with 5 factors even with Monte Carlo integration, so this may not be practical for you. Collapsing categories to have the same number of categories in each group is more practical.
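
Collapsing can be done in the input itself with DEFINE. The fragment below is only a sketch: it assumes the item is coded 1-5 and that the highest category is the one not observed in group 2, neither of which is stated above, and every name other than c3_amnes is hypothetical.

```
VARIABLE: NAMES = group c1_intr c2_avoid c3_amnes;   ! hypothetical except c3_amnes
          CATEGORICAL = c1_intr c2_avoid c3_amnes;
          GROUPING = group (1 = g1 2 = g2 3 = g3);
DEFINE:   ! Assumed coding 1-5: merge category 5 into category 4 for everyone,
          ! so that all groups have the same set of observed categories
          IF (c3_amnes EQ 5) THEN c3_amnes = 4;
ANALYSIS: ESTIMATOR = WLSMV;
          PARAMETERIZATION = THETA;
```

Because the recode applies to the whole sample, the item keeps the same meaning in every group; the cost is that the two merged response options can no longer be distinguished.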

Jeongwook CHOI posted on Tuesday, October 27, 2020 - 10:48 pm

Hello!

I am having trouble running a CFA.

The problem is an error in the VARIABLE command.

The error is below.

*** ERROR in VARIABLE command
On the USEVARIABLES list, variables from the NAMES list must come before
all new variables created using the DEFINE command. The variables(s)
violating this order are: UN_03

But I didn't define any new variables.

The same error occurs with another input file.

Why does it occur?

Bengt O. Muthen posted on Wednesday, October 28, 2020 - 4:09 pm

To diagnose this, we need to see your full output - send to Support along with your license number.

Jeongwook CHOI posted on Thursday, October 29, 2020 - 6:17 am

Hi, Bengt

I solved the above error about the variable.

UN_03 was mis-typed.

I typed the letter 'O' in NAMES ARE, but typed the number 0 in the USEVARIABLES and MODEL commands.

thank you.
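
For anyone who hits the same message, the mismatch looks roughly like the fragment below. The names other than UN_03 are hypothetical. The error presumably arises because a USEVARIABLES name that does not appear in NAMES is treated as a new variable created in DEFINE.

```
VARIABLE: NAMES = un_01 un_02 un_O3;          ! third name typed with the letter O
          USEVARIABLES = un_01 un_02 un_03;   ! typed with the digit 0, no match in NAMES
```

Making the spelling identical in NAMES, USEVARIABLES, and MODEL removes the error.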