Q1. Under the "Theta parameterization", theta, the residual variance (associated with the measurement part of the SEM), is a model parameter, right Professor? However, the corresponding "Tech 1" output shows NO parameter assignment for it. Why is that? I thought it was because my analysis is single-group, so I tried making it multiple-group, but there too it does NOT assign any parameters in Tech 1 (I could have written the code wrong).
(Following your Mplus Web Note 4) We get the scale factor as the remainder, which is = (factor loading^2)*factor variance + theta.
We have the factor loading values from the Lambda matrix (Tech 1 shows the parameter assignment), and we have the factor variance from the Psi matrix (Tech 1 shows the parameter assignment here as well).
Q2. The Mplus output reports "scale factors". How does it calculate the scale factor? The web note says that the "Theta approach standardizes to theta=1". In that case the scale factor should be greater than one (since factor loading^2 and the factor variance are both positive numbers). However, my results are different, so I am confused once more.
Q3. What does the R-square associated with the scale factor stand for? The reason I am asking Q3 in particular is that our dependent variables are categorical (ordinal). Can we interpret the "r-square" values for categorical outcomes as the proportion of variance explained, as we do with continuous outcomes?
thanks and regards
bmuthen posted on Saturday, June 11, 2005 - 6:05 am
Theta is identified only in groups after the first in multiple-group analysis and at time points after the first in growth modeling.
The scale factor delta is the inverted y* SD, not the y* variance.
R2 is the regular R2 for y*.
See also Appendix 1 of the technical appendices on the Mplus web site.
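A minimal sketch of the arithmetic behind this reply, assuming a single factor with loading `lam`, factor variance `psi`, and the Theta convention of residual variance theta = 1. The numeric values are hypothetical and not taken from any output. It illustrates why the scale factor comes out below 1: it is the reciprocal of the y* standard deviation, not the y* variance.

```python
import math

def ystar_variance(lam, psi, theta=1.0):
    """Variance of y* for a one-factor indicator: lam^2 * psi + theta."""
    return lam ** 2 * psi + theta

def scale_factor(lam, psi, theta=1.0):
    """Scale factor delta = 1 / SD(y*)."""
    return 1.0 / math.sqrt(ystar_variance(lam, psi, theta))

def r_square(lam, psi, theta=1.0):
    """Regular R-square for y*: explained variance over total variance."""
    return lam ** 2 * psi / ystar_variance(lam, psi, theta)

# Hypothetical values, for illustration only
lam, psi = 0.8, 1.5
print("Var(y*):", ystar_variance(lam, psi))
print("delta  :", scale_factor(lam, psi))
print("R2     :", r_square(lam, psi))
```

With theta = 1 and positive loading and factor variance, Var(y*) exceeds 1, so delta is necessarily below 1.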
Sanjoy posted on Saturday, June 11, 2005 - 10:36 pm
Thank you very much Professor
jenny yu posted on Saturday, September 09, 2006 - 12:34 pm
Dear Drs. Muthen,
When we build a MIMIC model with ordinal indicators and categorical background variables, how should we choose the parameterization? Dr. Bengt mentioned that either delta or theta can be used. But I am wondering whether there are some criteria we need to consider to choose between delta and theta? How can I justify my selection over the other?
The Delta parameterization has been chosen as the default in Mplus because there are some cases where it works better than Theta - see Mplus Web Note #4.
jenny yu posted on Sunday, September 10, 2006 - 1:59 pm
Thank you, Professor Muthen. I have another question, on adding a direct effect in a MIMIC model:
Can the model fit obtained after adding the direct effect be used as one of the criteria for deciding whether a DIF effect exists? Say, if the model fit gets worse, then the variable is not appropriate to add for examining a direct effect.
I would recommend looking at the z score (Est/SE) of the direct effect instead.
Jungeun Lee posted on Monday, August 20, 2007 - 5:42 pm
I am working on an SEM model in which a categorical (ordinal) dependent variable is influenced by and influences another latent variable. I looked at the parameterization section of the Mplus User's Guide (v.3) and thought that THETA would be the way to go. I ran the model with THETA. Out of curiosity, I re-ran the model with DELTA. Both models ran fine but gave me somewhat different results. Usually, I go with the default setting in statistical packages unless I have a clear idea about why the default setting wouldn't work. Since the models with either DELTA or THETA worked fine and I am not super clear about when to use which parameterization, my usual self tells me to go with the results from the model with the DELTA parameterization. Do you think that is a good way to go? Thanks in advance!!
Y1* is related to a binary variable, Y1 and Y2* is related to an ordinal variable Y2 (3 categories). X's are strictly exogenous.
I am struggling to understand the following.
The R-square value of Y1 increases significantly (.32 to .50) when we use theta instead of delta parameterization.
Q1. I am wondering why?
Q2. Coefficient values change from one parameterization to the other. However, their relative strength remains the same. I think that's how it should be, because the parameterization affects the limit of integration through the scaling factor. Is that right?
Q3. As an extension of the model, I add X4 to the Y2 equation. As expected, the R-square value of Y2 goes up, but the R-square value of Y1 goes down. Given that Y2 is significant in explaining Y1, why does this happen? I think only the "adjusted R-square" could go down with the addition of a new variable, but I am not sure what Mplus is doing here.
There have been changes that could change the results. The only way to know is to use the latest version and to send the files and your license number to email@example.com. The information you give is not sufficient to answer your question.
I just wanted to quickly check something with you: I am running a MC simulation of a CFA with categorical factor indicators (5 response choices; i.e. 4 thresholds) using your two step process. In step one I specify the parameterization as theta and set the metric of the factor by fixing factor variance to 1. I specify the same with the model and model population in step 2. The results from the MC seem plausible but I just wanted to check I wasn't missing something important. I note in example 4.2 you set the values to make delta and theta parameterizations equivalent but presumably this is because the original CFA used the delta parameterization. Is this correct? Many thanks for your help. best wishes Paul
I'd like to fit a model to real data and save parameter estimates (as in the example 12.7 Step 1 of the User's Guide), and then read the parameters to use in a Monte Carlo simulation (as in the example 12.7 Step 2).
This works fine for as long as I am treating the indicators as continuous.
However, when I declare indicators as ordered categorical, I can no longer save estimates, and I am getting an error message "PARAMETERIZATION=THETA is required when estimates are to be saved".
Once I switch to THETA, I can save parameter estimates, but I cannot read them from the Monte Carlo simulation.
I wonder if there is somewhere an example similar to the example 12.7, but using ordered categorical indicators?
I created a Monte Carlo model by adding SVALUES in the model I used to analyse real data (the model declares all indicators as ordered categorical). I copy the SVALUES output both under MODEL POPULATION and under MODEL in the Monte Carlo model (for the latter, minus the part where threshold values are set).
I also needed to set residual variances for the indicators under MODEL POPULATION, which I found under the R-SQUARE section of the output generated by the model I used to analyse the real data (hopefully, these are the right ones - it is not very clear from the model output).
The resulting Monte Carlo model does run. However, I found that on most iterations, it fails with a "CATEGORICAL VARIABLE" "HAS ZERO OBSERVATIONS IN CATEGORY" message. Indeed, some of the categories on some of the items occur very rarely in the original data, so one would expect that in a Monte Carlo simulation fairly often they would not occur at all.
The Monte Carlo simulation does generate some reasonable-looking results based on the iterations that do not fail. (Coverage values, though, are not as good as I expected - often rather far from 0.95).
What would be the best way to resolve the "ZERO OBSERVATIONS IN CATEGORY" problem?
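As a rough sketch of why this message can appear on most replications, the chance that a category with population probability `p` is never drawn in a sample of size `n` is (1 - p)^n. The probabilities and sample size below are hypothetical, and the "any category empty" figure assumes independence across categories and items, which is a simplification made only for illustration.

```python
def prob_category_empty(p, n):
    """Probability that a category with probability p has zero observations in n draws."""
    return (1.0 - p) ** n

def prob_any_empty(probs, n):
    """Probability that at least one of several rare categories is empty,
    assuming (for illustration only) that they behave independently."""
    none_empty = 1.0
    for p in probs:
        none_empty *= 1.0 - prob_category_empty(p, n)
    return 1.0 - none_empty

# Hypothetical rare-category probabilities and sample size
rare = [0.01, 0.02, 0.005]
n = 200
for p in rare:
    print(p, prob_category_empty(p, n))
print("at least one empty:", prob_any_empty(rare, n))
```

Even a few rare categories can make an empty cell in some replication quite likely at moderate sample sizes.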
Sara posted on Tuesday, February 15, 2011 - 4:49 pm
When I conduct a CFA with binary (0,1) indicators using the WLSMV estimator and specify "parameterization is Theta", how does one interpret the first set of parameter estimates Mplus produces (under "Model Results")?
I realize the standardized parameters (which come after the first set of parameters) indicate the standardized relationship between y* for the indicator and the latent factor. The standardized thresholds can be interpreted as z-scores. Can I call these "standardized probit regression coefficients"?
Can the values above the standardized coefficients (the first set produced) be interpreted and labeled as the "unstandardized probit coefficients"? That is, can they be interpreted as the unit change in the probit of the indicator for every unit change in the factor? Can the unstandardized threshold be interpreted as the expected unstandardized probit that the observed response equals the lower category (zero in this case) when the factor is zero?
I didn't know if there was a naming convention that was used (or should be used) when the Theta parameterization is employed and the 2 sets of parameters are reported (and interpreted).
I think you can characterize the unstandardized Theta estimates as unstandardized probit estimates in the sense that both probit regression and Theta parameterization fix the residual variance to 1. So your interpretations are on target. The Delta param. instead fixes the u* variance to 1 (which means that the residual variance is not 1). That also gives probit estimates, but not in the usual probit regression metric. The Delta advantage is that the loading estimates are in usual standardized factor analysis metric.
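A minimal sketch of how the two metrics relate, assuming a single-group, one-factor model with factor variance `psi`, no covariates, and the Delta scale factor fixed at 1 (so Var(u*) = 1 under Delta and the residual variance is 1 under Theta). The function names and values are hypothetical. Moving between the two metrics is then just a rescaling of u*.

```python
import math

def delta_to_theta(lam_d, tau_d, psi=1.0):
    """Rescale a Delta-metric loading and threshold so the residual variance is 1."""
    resid = 1.0 - lam_d ** 2 * psi  # residual variance implied under Delta
    if resid <= 0.0:
        raise ValueError("implied residual variance must be positive")
    s = math.sqrt(resid)
    return lam_d / s, tau_d / s

def theta_to_delta(lam_t, tau_t, psi=1.0):
    """Rescale a Theta-metric loading and threshold so Var(u*) is 1."""
    s = math.sqrt(lam_t ** 2 * psi + 1.0)  # SD of u* under Theta
    return lam_t / s, tau_t / s

# Hypothetical Delta-metric estimates
lam_t, tau_t = delta_to_theta(0.6, 0.5)
print(lam_t, tau_t)                   # Theta-metric loading and threshold
print(theta_to_delta(lam_t, tau_t))   # round trip back to the Delta metric
```

The ratio of threshold to loading is the same in both metrics, which is one way to see that the two are equivalent rescalings in this simple case.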
Sara posted on Wednesday, February 16, 2011 - 4:51 am
Thanks so much! Could you then clarify the interpretation of the standardized vs. unstandardized probit coefficients in this context? Can I interpret the standardized threshold of a binary item (0, 1) as the z-score that indicates the proportion of respondents answering 0 (and thus the probability of answering 0 for that item)? Then how is the unstandardized probit threshold interpreted? It does NOT correspond to the cumulative area under the normal curve for that response (like the standardized probit thresholds), correct? If not, how does one interpret these values? This is the main parameter I am having a difficult time interpreting. As you said, I can interpret the standardized probit regressions (loadings) as standardized factor loadings (if the factor is standardized, the loading indicates the SD change in y* for the item for every SD change in the factor). Then how should one interpret the unstandardized probit regression/loading? If the factor is standardized, do I say the value represents the change in the probit of the item for every 1 SD change in the factor? Is saying "change in the probit of Item 1" correct, given that, because the item isn't standardized (that is, the total variance of y* doesn't equal 1), I can't say "change in standard deviations of Item 1"?
I just want to be sure I accurately represent the parameters Mplus reports with the Theta parameterization.
Yes, the standardized threshold refers to a z-score at the factor value(s) of zero.
The unstandardized threshold refers to a normal score where the variance is not 1 as for the z score - so the interpretation is more involved. That's why the default Delta parameterization is easier to interpret.
With Theta param., I would focus on the standardized loadings. The unstandardized loadings refer to DV's (namely the u*'s) that don't have variance 1 and therefore are harder to interpret.
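One way to make the two thresholds concrete, as a sketch under stated assumptions: a binary item with u = 0 when u* falls below the threshold, a normally distributed factor with mean zero and variance `psi`, no covariates, and the Theta residual variance of 1. All names and values are hypothetical. The standardized threshold is the unstandardized one divided by SD(u*), and the normal CDF turns a z-score into a probability.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def standardized_threshold(tau, lam, psi=1.0, theta=1.0):
    """Threshold divided by SD(u*), i.e. on the z-score scale."""
    return tau / math.sqrt(lam ** 2 * psi + theta)

def prob_zero_given_factor(tau, lam, f=0.0, theta=1.0):
    """P(u = 0 | factor = f) under the probit model with residual variance theta."""
    return normal_cdf((tau - lam * f) / math.sqrt(theta))

# Hypothetical Theta-metric estimates
tau, lam, psi = 0.625, 0.75, 1.0
z = standardized_threshold(tau, lam, psi)
print("standardized threshold:", z)
print("P(u = 0), marginal over the factor:", normal_cdf(z))
print("P(u = 0 | f = 0):", prob_zero_given_factor(tau, lam))
```

Under these assumptions the standardized threshold maps to the overall proportion answering 0, while the unstandardized Theta threshold maps to the probability of answering 0 for someone at a factor value of exactly zero.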
Sara posted on Thursday, February 17, 2011 - 2:12 pm
Thanks for the clarification.
Given the difficulty in interpretation, when would one want to employ and report the parameters from the Theta scaling?
In Mplus, Theta is used as a backup to Delta. There are certain mediation models where only Theta can be used (see also UG). The Theta vs Delta parameterization is further discussed in Mplus Web Note #4.
Dear Dr. Muthen, I am fitting a multi-group cross-lagged model between negative affect (latent) and smoking (observed, 1 item). When I use the Delta parameterization, I get the error: “ The model is not supported by DELTA parameterization. Use THETA parameterization.”
Then, when I use the Theta parameterization, I get the error:
“Scale factors for categorical outcomes can only be specified using PARAMETERIZATION=DELTA with estimators WLS, WLSM, or WLSMV.”
In the Delta parametrization, scale factors are parameters in the model. In the Theta parametrization, residual variances are parameters in the model. You should refer to residual variances instead of scale factors.
On pp. 485-486 of the version 5 UG, it says: " In addition, there are certain models that can be estimated using only the THETA parameterization because they have been found to impose improper parameter constraints with the DELTA parameterization. These are models where a categorical dependent variable is both influenced by and influences either another observed dependent variable or a latent variable."
Could you please elaborate on what the improper parameter constraints might be with the Delta parameterization? I understand that when we are interested in the indirect or mediated effect with categorical data, we have to fix the scaling factor (Theta parameterization) because the indirect effect (i.e., the product of regression coefficients) is not scale free. However, in your scenario where a categorical dependent variable is both influenced by and influences other variables, AND if I am only interested in the regression coefficients, there should not be any scaling problem. So why can't I use the Delta parameterization?
Yes, but why use theta over delta? Other programs such as Lavaan can estimate the same models without a theta parameterization. I guess I want to know exactly what the theta parameterization is doing, what type of transformation is happening, etc.
There are certain models where the default Delta parameterization is not suitable and Theta is needed. These include multiple-group models where you have across-group hypotheses about the residual variances for the factor indicators and path models where a categorical dependent variable is both influenced by and influences another variable.
The Theta parameterization lets you access the residual variances of the factor indicators as parameters, whereas in the Delta parameterization, the delta parameters are functions of factor variances, factor loadings, and residual variances. So if, for instance, you are interested in testing group invariance of residual variances, that is not achieved by holding Delta equal across groups, because you may have group differences in factor variances and/or loadings.
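A small numeric sketch of that last point, assuming one factor, a loading held equal across two groups, and delta defined as the reciprocal of the u* standard deviation. The values are hypothetical. Equal residual variances do not give equal deltas when the factor variances differ, and forcing the deltas to be equal implies unequal residual variances.

```python
import math

def delta_from(lam, psi, theta):
    """Scale factor implied by loading, factor variance, and residual variance."""
    return 1.0 / math.sqrt(lam ** 2 * psi + theta)

def theta_from(lam, psi, delta):
    """Residual variance implied by loading, factor variance, and scale factor."""
    return 1.0 / delta ** 2 - lam ** 2 * psi

lam = 0.8                      # hypothetical loading, equal in both groups
psi_g1, psi_g2 = 1.0, 2.0      # hypothetical factor variances

# Equal residual variances (theta = 1 in both groups) give different deltas
d1 = delta_from(lam, psi_g1, 1.0)
d2 = delta_from(lam, psi_g2, 1.0)
print(d1, d2)

# Holding delta equal across groups instead implies a different theta in group 2
print(theta_from(lam, psi_g2, d1))
```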
Tyler Mason posted on Thursday, October 03, 2013 - 5:04 pm
I ran a mediation model with two IVs, two mediators, and a categorical DV (2 categories). The analysis failed when I ran it with the default Delta parameterization. The error said I needed to do Monte Carlo. However, I tried the Theta parameterization and it worked. Is it okay for me to use the results from the Theta parameterization? Thanks!
My question is regarding different results obtained when using different parameterizations of Multiple-Group Factor Analysis when outcome variables are dichotomous. Two groups are considered.
I have 4 models. Models 1 and 2 use the Theta parameterization and Models 3 and 4 the Delta parameterization. The metric of the factor is set by fixing one loading at one in each group in Models 1 and 3, and the metric of the factor is set by fixing the factor variance at one in the first group in Models 2 and 4. In all models the factor mean is fixed at zero in the first group. In all models factor loadings and item thresholds are constrained to be invariant over group.
I was expecting to find the Wald test of differences in the factor variance between groups to be similar (asymptotically equivalent) regardless of the parameterization. But this doesn't appear to be the case.
Wald test of factor variance:
Model 1: 15.212 (1) p=0.0001
Model 2: 46.423 (1) p=0.0000
Model 3: 31.584 (1) p=0.0000
Model 4: 46.434 (1) p=0.0000
I understand from Web Note 4 that the Delta parameterization may perform better. However, my intention is to, in a next step, fix residual variances at one in both groups and compare the results I obtain using the "traditional" multiple-group testing approach to the results from the new Alignment approach.
Testing z1-1=0 (models 2 and 4) or testing L1*L1*(z1-1)=0 (model 1) are two different tests and will have different power to detect significance.
If you want to avoid that, use the DIFFTEST command.
Alternatively instead of testing Var(f1)-Var(f2)=0 you can test Var(f1)/Var(f2)=1 or Var(f2)/Var(f1)=1 that will make the test scale independent, and you will get the same value, i.e., the formulation of the test will be independent of the parameterization.
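A short numeric sketch of why the ratio formulation does not depend on how the factor metric is set. The variances are hypothetical, and the sketch only shows the algebra of the tested quantities, not the Wald statistic itself. Switching from "first loading fixed at 1" to "group 1 factor variance fixed at 1" rescales every factor variance by the same constant, so differences change but ratios do not.

```python
import math

def rescale(variances, c):
    """Factor variances after rescaling the factor by c (f -> c * f)."""
    return [c ** 2 * v for v in variances]

# Hypothetical factor variances in groups 1 and 2, metric set by a loading fixed at 1
v_loading_metric = [2.25, 3.6]

# Same model with the group 1 factor variance fixed at 1: the freed loading is L1
L1 = math.sqrt(v_loading_metric[0])
v_variance_metric = rescale(v_loading_metric, 1.0 / L1)

diff_a = v_loading_metric[1] - v_loading_metric[0]     # equals L1*L1*(z1 - 1)
diff_b = v_variance_metric[1] - v_variance_metric[0]   # equals z1 - 1
ratio_a = v_loading_metric[1] / v_loading_metric[0]
ratio_b = v_variance_metric[1] / v_variance_metric[0]

print("differences:", diff_a, diff_b)   # depend on the metric
print("ratios     :", ratio_a, ratio_b) # do not depend on the metric
```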
Tait Medina posted on Wednesday, May 14, 2014 - 1:18 pm
Is Var(f1)/Var(f2)=1 possible using the MODEL TEST command? I have tried labeling the parameters, say p1 and p2, and using the syntax:
MODEL TEST: p1/p2=1
However, I receive this message:
*** ERROR in MODEL TEST command
A parameter label or the constant 0 must appear on the left-hand side of a MODEL TEST statement. Problem with the following: p1/p2 = 1
Keri Wong posted on Monday, August 18, 2014 - 4:26 am
Dear Dr Muthen, I ran a CFA 3-factor model with intercorrelated latent factors in two separate samples. I get the R-squares for individual items but not estimates or R-squares for individual factors. They all appear to be 1.00 in the output. Is there a way to freely estimate that?
Also, I would like to know whether the R-squares for each item are significantly different across my samples. Is there a way to do this? Or would this be the equivalent of assessing measurement invariance at the item level?
In prior posts you mention that Mplus will tell me if I need to use the theta parameterization, and that theta is necessary to access the residual variances of the categorical dependent variables and in "path models where a categorical dependent variable is both influenced by and influences another variable." I have the latter case (single group), specifically a panel design longitudinal model, in which I allow the residuals for the categorical indicators to correlate with residuals for other latent variables within time points. I thought I would need to use theta for this, but Mplus does allow me to use delta. Should it actually be possible to use delta in this case (or have I perhaps specified the model incorrectly?) and is it appropriate?
I am running a cross-lagged model examining the bidirectional effects between a binary and a continuous variable. The analysis type = complex to account for the survey design, and the parameterization is theta. Looking at the standardized coefficients, are they all probit regression coefficients? Is it okay to interpret differences in magnitude as differences in the strength of the association across the different paths? Thanks much.