Anonymous posted on Wednesday, May 09, 2001 - 2:44 pm
When you say the metric of the factor is set by fixing the first factor loading to 1, does that mean an indicator with a scale of 0-5 would set the factor to a scale of 0-5? And would a mean of 3 on the factor correspond to a response of 3 on the indicator in meaning? I am guessing this is not right, especially if the other indicators are on a different scale and the factor loadings vary.
The metric of the factor would have the same unit change value but not necessarily the same range. So if the item with factor loading one had a range from 0-5, the factor would not necessarily have that range but going from 2 to 3 on the item and the factor would have the same meaning.
Anonymous posted on Monday, June 18, 2001 - 10:32 am
I have a question about the structure / scaling of the latent variable (y*) used in Mplus probit models. Does Mplus center the latent continuous variable at zero?
To clarify: I have a 3-category variable I want to use as an outcome in Mplus. Responses are: 1=Never, 2=Once or twice, 3=More than two times. I'm concerned that if Mplus centers the continuous latent y* outcome variable at zero, the results might not make sense; respondents could end up with y* values that would correspond to a response of less than "never".
Also: do you have any plans to introduce tobit / limited outcomes capabilities into Mplus?
bmuthen posted on Tuesday, June 19, 2001 - 4:15 pm
There is no need to be concerned here. For a 3-category variable such as yours, Mplus estimates 2 threshold parameters: if y* is below the lower threshold you observe Never, if y* falls between the two thresholds you observe Once or twice, and if y* is above the higher threshold you observe More than two times. The thresholds are tied to the scale of y*. When there are no x's in the model, the mean of y* is standardized to zero and the variance to one, so the thresholds are in a standard normal metric. Some form of limited outcomes modeling may be incorporated in future program versions.
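The threshold mechanism described above can be sketched in a few lines of Python. This is only an illustration under the stated case of no x's in the model (y* standard normal); the threshold values and the helper names `norm_cdf` and `category_probabilities` are hypothetical and do not come from any Mplus output.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probabilities(thresholds):
    """Return P(y = k) for each ordered category, given thresholds on a
    standard-normal y* (the no-covariate case)."""
    cuts = [-math.inf] + sorted(thresholds) + [math.inf]
    return [norm_cdf(hi) - norm_cdf(lo) for lo, hi in zip(cuts[:-1], cuts[1:])]

# Hypothetical thresholds for a 3-category item (Never / Once or twice / More).
tau = [-0.5, 0.8]
probs = category_probabilities(tau)
for label, p in zip(["Never", "Once or twice", "More than two times"], probs):
    print(f"{label}: {p:.3f}")
```

Negative y* values are not a problem here: only the position of y* relative to the two thresholds determines the observed category.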
Hi, I have three questions: 1) How does one obtain threshold parameters when using ordinal dependent variables and the WLSMV method? I am using latent exogenous variables and a single indicator ordinal dependent variable. Running version 1, they don't show up in the output. 2) I am assuming that by using the unstandardized regression coefficients, and the threshold parameters, I should be able to look at the predicted probabilities as I would in an Ordered Probit Model. Am I wrong? 3) Would this still be appropriate if I have two or three ordinal dependent variables? Thanks in advance for your answer.
I know that when a latent factor is measured with continuous indicators the scale of the latent factor is set to the scale of one of the indicators by fixing a factor loading at 1.0. How is the scale of the latent factor determined when the indicators are all categorical and thresholds are estimated for each indicator? I'd like to say something about a one-unit change in the latent variable but it's not clear to me how that is to be substantively interpreted.
bmuthen posted on Monday, August 05, 2002 - 2:40 pm
The interpretation is a bit more involved with categorical indicators. A one-unit change in the latent variable causes a change of lambda in the y*, the continuous latent variable underlying the categorical indicators. This translates into a change of the probability of the categorical y (see Mplus Tech Appendix 1), but this probability change is different depending on where on the latent variable scale you are. This is due to the non-linear (s-shaped) relationship between the latent variable and the probability of y. A useful approach is perhaps to consider a change from 1 SD below the mean of the latent variable to 1 SD above the mean and translate that into probabilities.
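As a rough sketch of that suggestion, the Python below evaluates the probability of a binary indicator at 1 SD below the factor mean, at the mean, and at 1 SD above it. It assumes a factor mean of zero and a unit residual variance for y*, so that P(y = 1 | factor) is the normal CDF of (-threshold + lambda * factor). All numeric values and the function names are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_y1(eta, lam, tau):
    """P(y = 1 | eta) for a binary indicator, assuming unit residual variance for y*."""
    return norm_cdf(-tau + lam * eta)

# Hypothetical estimates, not taken from any Mplus output.
lam = 0.8          # factor loading (lambda)
tau = 0.5          # threshold of the indicator
factor_var = 1.5   # factor variance; the SD is its square root
sd = math.sqrt(factor_var)

p_low = prob_y1(-sd, lam, tau)    # 1 SD below the factor mean of zero
p_mean = prob_y1(0.0, lam, tau)   # at the factor mean
p_high = prob_y1(+sd, lam, tau)   # 1 SD above the factor mean

# The same one-unit change in the factor shifts the probability by
# different amounts depending on where on the factor scale it happens.
step_near_mean = prob_y1(1.0, lam, tau) - prob_y1(0.0, lam, tau)
step_in_tail = prob_y1(3.0, lam, tau) - prob_y1(2.0, lam, tau)

print(f"-1 SD: {p_low:.3f}, mean: {p_mean:.3f}, +1 SD: {p_high:.3f}")
print(f"step 0->1: {step_near_mean:.3f}, step 2->3: {step_in_tail:.3f}")
```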
H0 should have the mean of the factor fixed at zero for the first time point and free for the second - and equal measurement intercepts across time points.
H1 should have both factor means fixed at zero (which is the default) and all measurement intercepts free.
Anonymous posted on Wednesday, November 17, 2004 - 12:58 pm
Hello - I am a bit confused. I am attempting to convert my probit regression coefficients into probabilities, and I am not sure I am doing this correctly. Is this right?
Probability of y/x = normdist * (threshold of y + B1(mean of x))
If so, I am confused about my threshold for y. Mplus sampstat output gives me a threshold for y in the means/intercepts/thresholds output that is different from the threshold for y it gives me in the normal output under the factor loadings for the latent variable. Which threshold is right? Also, do I use standardized probit regression coefficients in the computation of probabilities? - my latent variable is standardized so the mean is 0.
bmuthen posted on Wednesday, November 17, 2004 - 3:14 pm
The normdist argument (what's in the parenthesis) should have a negative sign for the threshold - see Tech Appendix 1 on our web site. This is the probability of y=1 for y scored 0/1. The threshold should be taken from the regular results section where you find the other estimates, not from the sample statistics. You should not use the standardized estimates but those in the first column. For an example, see the ASB example of our "Day 3" handout.
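A minimal sketch of the corrected computation, for y scored 0/1 with a single predictor: the threshold enters the normal CDF argument with a negative sign. The numbers are hypothetical stand-ins for unstandardized estimates from the regular results section, and `prob_y1_given_x` is an illustrative name, not an Mplus function.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_y1_given_x(threshold, slope, x):
    """P(y = 1 | x) in a probit regression: Phi(-threshold + slope * x)."""
    return norm_cdf(-threshold + slope * x)

# Hypothetical unstandardized estimates.
threshold = 1.2
slope = 0.5
x = 2.0

p_right = prob_y1_given_x(threshold, slope, x)
# For comparison: the argument with the threshold sign left positive,
# as in the formula in the question.
p_wrong_sign = norm_cdf(threshold + slope * x)

print(f"correct sign: {p_right:.3f}, wrong sign: {p_wrong_sign:.3f}")
```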
A latent variable needs to be given a metric, typically mean zero and either variance 1 or setting a loading to 1. With multiple-group or longitudinal analysis, you can estimate means and variances when comparing to a reference group or time point where the metric has been set.
I understand that the scale of a latent factor is set to the scale of one of its indicators by fixing that indicator's factor loading at 1.0. But, how is the actual latent factor mean determined?
I have a latent factor with two indicators, each of which is on an 8 point scale (0 to 7) with means around 5, but my estimated latent factor mean is negative 2.5. Why might this be happening, and is it problematic?
Unless you have a multiple-group analysis or a growth model, the means of the latent variables are fixed at zero.
A factor model with two indicators is not identified. I am not sure how you get the results you mention.
Susan Scott posted on Wednesday, October 18, 2006 - 11:18 am
I also have a model where the Tech4 output indicates that the means of many of my latent variables are <0. I have 2 LVs with 3 indicators, many others (5) with only 2, and most (9) with only a single indicator, to which I have applied 90% reliability. A path model of similar structure gives similar results (slightly lower betas, better fit statistics), so I assumed everything is OK. For one of my single indicator latent variables, Tech4 indicates the variable has a mean of 92 (scored 0-100), but the latent has a mean of -15. Should I be concerned about model identification? If not, how do I interpret these latent variable means?
The latent variables I'm trying to model (SEM) cannot be reproduced with a different data arrangement. I'm using the same data, but when the number of variables in the data file differs, the Mplus results differ substantially. The results differ even when I only list a different number of variables in Mplus (under NAMES ARE). Should I just reduce the data file to the variables I actually use? What am I doing wrong?
I have several latent predictor variables in my SEM with a binary, observed DV (WLSMV estimator). Using the resulting probit coefficients and the threshold for the DV, I am trying to use the equation provided in the Mplus manual (calculating probabilities from probit regression coefficients) to produce probabilities. For the calculations, I am contemplating fixing all my IVs except the one of interest at their means (or mode for binary). Thus, I am trying to determine the scale for the latent IVs.
My latent variables' observed indicators all have ordinal scales of 1-4. On some posts, I have read that the latent variable takes on the scale of the first indicator (so in my case 1-4). Yet, upon reading other posts, I think that the latent variable has a mean of 0 and standard deviation of 1 and not a range of 1-4. My Tech 4 output shows means for the latent variables that are not 0; in my case they range from .093 to .411.
Please address 2 questions for me:
1) What is the scale and mean of the latent variables in this example? 2) Is there a way for Mplus to compute the probabilities for me?
I have examined the probability graph produced by plot 2 for a binary DV and a latent predictor. Now I want to explain to policymakers what it means.
I have saved the factor scores for my latent predictor variables and noted the lowest value for my latent predictor and the highest value for my latent predictor. Perhaps as expected, they appear to be about within 3 standard deviations of the mean, based on the mean and standard deviation reported by Mplus while creating the graphs.
Given this, is it reasonable to explain to a policy audience that for the lowest estimated value of the predictor, the probability is [probability value at the approximate number of standard deviations below mean--generally 3] and for the highest estimated value of the predictor, the probability is [probability value at the approximate number of standard deviations above mean--generally 3]?
I would not use factor scores to compute probabilities. Instead I would estimate probabilities at plus and minus one standard deviation of the factor from the factor mean of zero. You can get the factor variance from TECH4.
I have created four separate latent factors. When I examine the Tech4 output I can see that the mean of each is zero, which is consistent with what I've read in other posts. Each factor was created using indicators on a 1-5 scale. Above you said that the metric of the factors will have the same unit change value but not necessarily the same range as the indicators. How can I determine the range (min-max) of my latent factors? Thanks!
I have an additional question regarding the posting on November 02, 2012 and the posting on April 02, 2013 about the range and variance of the latent variable. If the variance of an endogenous latent variable is not a model parameter, does this mean that it is not possible to change the range of the latent variable? I have two datasets where I have a latent dependent variable with binary categorical indicators and the same predictors in both datasets, but not the same indicators for the latent variable. It would be nice to be able to directly compare the size of the unstandardized coefficients, as is possible in ordinary regression when the dependent variable has the same scale. Seeing as the indicator variables for the latent dependent variables in both datasets are binary, can the unstandardized coefficients be compared as if the scale is the same? When I look at Tech4, the means and variances of the latent factors are not identical.
If you do not have the same indicators of the factors in each group, you cannot establish measurement invariance because the constructs are not the same. Because of this, you cannot make comparisons across groups.
Sarah Lowe posted on Friday, August 02, 2013 - 1:54 pm
I am trying to test whether the means of a latent construct differ over time. Above, you suggest the following approach:
"H0 should have the mean of the factor fixed at zero for the first time point and free for the second - and equal measurement intercepts across time points.
H1 should have both factor means fixed at zero (which is the default) and all measurement intercepts free."
2 follow-up questions:
1) Do you know of a good citation?
2) What would a significant or non-significant chi-square difference test mean in this scenario?
1. Millsap, R.E. (2011). Statistical approaches to measurement invariance. Taylor and Francis Group: New York.
2. A significant chi-square difference test says the means are not the same across time.
Meike Slagt posted on Wednesday, October 15, 2014 - 5:16 pm
Dear Dr. Muthen,
I’m using CFA to estimate a latent variable that has continuous indicators measured on different scales (e.g., X1 measured on 7-point scale, X2 measured on 9-point scale). For interpreting the factor loadings this doesn’t matter, as I understand. But what if I want to interpret the latent variable mean?
I am estimating the same latent variable repeatedly across waves, and want to be able to tell how much the scores on my latent variable increase or decrease. This seems hard if indicators with different scales feed into it. Would it make sense to standardize the indicators after all? A 1-unit change on the latent variable from Time1 to Time2 could then mean an increase of 1 SD on the latent variable.
This is not a problem. The factor scale is tied to the scale of the factor indicator that has its loading fixed at 1: when the factor increases one unit, that indicator is expected to increase one unit. The other indicators increase lambda units, where lambda is indicator-specific. An increase from a factor mean of zero at time 1 to alpha at time 2 would give an expected increase of alpha in the first indicator and of lambda*alpha in the other indicators, again with lambda indicator-specific.
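To make the arithmetic concrete, here is a small sketch with made-up numbers. The loadings, the time-2 factor mean alpha, and the indicator names are all hypothetical; the only point is that each indicator's expected change is its own lambda times alpha, with the loading-1 indicator changing by exactly alpha.

```python
# Hypothetical unstandardized loadings; x1 is the indicator with loading fixed at 1.
loadings = {"x1": 1.0, "x2": 1.4, "x3": 0.7}

# Hypothetical factor mean at time 2 (the time-1 mean is fixed at zero).
alpha = 0.5

# Expected change in each indicator from time 1 to time 2: lambda * alpha,
# expressed in that indicator's own units.
expected_change = {name: lam * alpha for name, lam in loadings.items()}

for name, change in expected_change.items():
    print(f"{name}: expected change of {change:.2f} in its own units")
```

Because each change is in the indicator's own metric, indicators on a 7-point and a 9-point scale can load on the same factor without being standardized first.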
Those values are logits that represent the class probabilities, which are printed separately. See our Topic 5 course and handout on the website.
Anonymous posted on Friday, September 04, 2015 - 6:45 am
I am running a CFA with multiple latent variables. Each latent variable has indicators that are on a different metric. For example, my first latent variable is measured by both binary (0,1) and ordered categorical (1,2,3) indicators.
1) Is this problematic? If so, will standardizing the indicator variables prior to running my CFA solve this problem?
2) If I do not need to standardize them, do you know of a reference/citation I can review?
I have a non-recursive path model with categorical endogenous variables. CATEGORICAL ARE Y1 Y2; Y1 <- f(Y2, X1) Y2 <- f(Y1, X2)
The TECH4 output gives me the estimated mean of the underlying latent variables Y1* and Y2*. However, I am trying to get the values of Y1* and Y2* for each and every response. The SAVEDATA = FSCORES did not work. Please advise. Thanks a lot
Thanks Dr. Muthen. I adopted Option #1 of your suggestion.
However, the method led to substantial changes in estimated coefficients and their standard errors.
For example the coefficient of Y2 in the equation Y1 <- f(Y2, X1) was originally -0.187 with p-values of 0.000. In the new method you suggested, the coefficient of Factor.Y2 in the equation FactorY1 <- f(FactorY2, X1) is -0.216 with p-value of 0.117.
Similarly, large changes were observed in the thresholds (for ordered probit model).
Why? Do the two coefficients have the same interpretation? Can you point me to some literature?
I am trying to calculate predicted probabilities from probit regression coefficients. I have 3 factors (and multiple binary covariates, which I will hold constant/set to 0) and a binary outcome. I see that earlier in this thread Dr. Muthen recommended estimating probabilities at plus and minus one standard deviation of the factor from the factor mean of zero. I'd like to confirm that I am properly interpreting the formula provided in chapter 14 of the user guide.
The coefficient estimate for f1 is 0.152. The threshold is 1.765. The standard error of f1 (obtained from TECH4) is 0.032.
To calculate the probability (for example) at +1 standard deviation from the mean of f1: F(-1.765 + 0.032*0.152) = F(-1.760136) = 0.039
Is this correct?
I am surprised that the SE of f1 is so small, but perhaps this is because all of the indicators for f1 are binary?
This looks correct if you are using ML. WLSMV in Delta parameterization uses a different formula. Also, you say SE but I think you mean SD. You also need to know that you are holding the means of the other factors at zero which may not be the mean for those factors that you see in TECH4.
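For anyone who wants to check the arithmetic in the question, the sketch below reproduces it with a standard normal CDF. It uses the three numbers exactly as posted, treats 0.032 as the factor SD as the poster did (if TECH4 reports a variance, its square root would be the SD), and holds the other factors and covariates at zero. It is only an arithmetic check of the posted expression, not the WLSMV formula mentioned in the reply.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Values as given in the post above.
threshold = 1.765   # threshold of the binary outcome
slope_f1 = 0.152    # unstandardized coefficient for f1
sd_f1 = 0.032       # value the poster used as the SD of f1

# Probability at +1 SD of f1, with all other predictors held at zero.
z = -threshold + slope_f1 * sd_f1
p_plus_1sd = norm_cdf(z)

print(round(z, 6), round(p_plus_1sd, 3))  # -1.760136 0.039
```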
Thank you for your reply and for the reminder about the non-zero means of the other factors. I am using WLSMV (default) and theta parameterization (my model is not supported by delta, perhaps because of categorical mediators).
Can you point me to the formula for WLSMV with theta parameterization?