Metric of the latent variable
Mplus Discussion > Structural Equation Modeling >
 Anonymous posted on Wednesday, May 09, 2001 - 2:44 pm
When you say the metric of the factor is set by fixing the first factor loading to 1, does that mean an indicator with a scale of 0-5 would set the factor to a scale of 0-5? And would a mean of 3 on the factor correspond to a response of 3 on the indicator in meaning? I am guessing this is not right, especially if the other indicators are on a different scale and the factor loadings vary.
 Linda K. Muthen posted on Thursday, May 10, 2001 - 9:56 am
The metric of the factor would have the same unit change value but not necessarily the same range. So if the item with factor loading one had a range from 0-5, the factor would not necessarily have that range but going from 2 to 3 on the item and the factor would have the same meaning.
 Anonymous posted on Monday, June 18, 2001 - 10:32 am
I have a question about the structure / scaling of the latent variable (y*) used in Mplus probit models. Does Mplus center the latent continuous variable at zero?

To clarify: I have a 3-category variable I want to use as an outcome in Mplus. Responses are: 1=Never, 2=Once or twice, 3=More than two times. I'm concerned that if Mplus centers the continuous latent y* outcome variable at zero, the results might not make sense; respondents could end up with y* values that would correspond to a response of less than "never".

Also: do you have any plans to introduce tobit / limited outcomes capabilities into Mplus?
 bmuthen posted on Tuesday, June 19, 2001 - 4:15 pm
There is no need to be concerned here. For a 3-category variable such as yours, Mplus estimates 2 threshold parameters such that if y* is less than the lower threshold you observe Never, if y* is between the two you observe Once or twice, and if y* is greater than the higher threshold you observe More than two times. The thresholds are tied to the scale of y*. When there are no x's in the model, the mean of y* is standardized to zero and the variance standardized to one, and the thresholds are in a standard normal metric. Some form of limited outcomes modeling may be incorporated in future program versions.
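As an illustration of how two thresholds partition y*, here is a minimal Python sketch. It assumes a model with no x's, so that y* is standard normal as described above. The threshold values 0.25 and 1.10 are made up for the example and do not come from any Mplus output.

```python
from statistics import NormalDist

# Hypothetical thresholds (assumed values, not Mplus estimates).
tau1, tau2 = 0.25, 1.10

# Standard normal CDF: with no x's, y* has mean 0 and variance 1.
phi = NormalDist().cdf

category_probs = {
    "Never": phi(tau1),                       # y* below the lower threshold
    "Once or twice": phi(tau2) - phi(tau1),   # y* between the two thresholds
    "More than two times": 1.0 - phi(tau2),   # y* above the higher threshold
}

for label, prob in category_probs.items():
    print(f"{label}: {prob:.3f}")
```

Because every value of y*, including negative ones, falls into one of the three intervals, a y* centered at zero never implies a response "below Never".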
 CarlosElordi posted on Thursday, August 01, 2002 - 8:36 am
I have three questions:
1) How does one obtain threshold parameters when using ordinal dependent variables and the WLSMV method? I am using latent exogenous variables and a single indicator ordinal dependent variable. Running version 1, they don't show up in the output.
2) I am assuming that by using the unstandardized regression coefficients, and the threshold parameters, I should be able to look at the predicted probabilities as I would in an Ordered Probit Model. Am I wrong?
3) Would this still be appropriate if I have two or three ordinal dependent variables?
Thanks in advance for your answer.
 Linda K. Muthen posted on Thursday, August 01, 2002 - 9:34 am
1. You can get thresholds by asking for TYPE=MEANSTRUCTURE.

2. Yes.

3. Yes.
 CarlosElordi posted on Thursday, August 01, 2002 - 12:01 pm
Thanks so much.
 Stanley Feldman posted on Monday, August 05, 2002 - 10:52 am
I know that when a latent factor is measured with continuous indicators the scale of the latent factor is set to the scale of one of the indicators by fixing a factor loading at 1.0. How is the scale of the latent factor determined when the indicators are all categorical and thresholds are estimated for each indicator? I'd like to say something about a one-unit change in the latent variable but it's not clear to me how that is to be substantively interpreted.
 bmuthen posted on Monday, August 05, 2002 - 2:40 pm
The interpretation is a bit more involved with categorical indicators. A one-unit change in the latent variable causes a change of lambda in y*, the continuous latent variable underlying the categorical indicator. This translates into a change in the probability of the categorical y (see Mplus Tech Appendix 1), but this probability change differs depending on where on the latent variable scale you are. This is due to the non-linear (s-shaped) relationship between the latent variable and the probability of y. A useful approach is perhaps to consider a change from 1 SD below the mean of the latent variable to 1 SD above the mean and translate that into probabilities.
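The suggestion of comparing 1 SD below with 1 SD above the factor mean can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: a binary indicator, a probit link, a factor mean of zero, and a residual variance of y* equal to 1. The loading, threshold, and factor SD are hypothetical values, and the function name is made up here.

```python
from statistics import NormalDist

def prob_y1(eta, loading, threshold, resid_sd=1.0):
    """P(y = 1 | factor = eta) under a probit link (assumed residual SD of y* = 1)."""
    return NormalDist().cdf((-threshold + loading * eta) / resid_sd)

# Hypothetical estimates for one binary indicator (not from any real output).
loading, threshold, factor_sd = 0.8, 0.5, 1.2

p_low = prob_y1(-factor_sd, loading, threshold)   # 1 SD below the factor mean of 0
p_high = prob_y1(+factor_sd, loading, threshold)  # 1 SD above the factor mean of 0
print(f"P(y=1) at -1 SD: {p_low:.3f}, at +1 SD: {p_high:.3f}")

# The same one-unit step in the factor changes the probability by different amounts
# depending on where it starts, which is the s-shaped relationship described above.
step_near_middle = prob_y1(1, loading, threshold) - prob_y1(0, loading, threshold)
step_in_tail = prob_y1(3, loading, threshold) - prob_y1(2, loading, threshold)
print(f"Change from 0 to 1: {step_near_middle:.3f}; from 2 to 3: {step_in_tail:.3f}")
```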
 Kristy Weight posted on Monday, February 03, 2003 - 4:01 pm
Do you recommend standardizing (to a mean of 0 and a s.d. of 1) the items making up a latent variable?
 bmuthen posted on Monday, February 03, 2003 - 4:04 pm
No, that's not necessary.
 Anonymous posted on Friday, July 02, 2004 - 2:15 pm
I have a one group model in which I have a latent variable measured at two time points. I'd like to test whether the mean on the latent variable changes over time.

My guess is that I should estimate a model with freely estimated means and compare it to a model in which the means are constrained to be equal.

If I free the means on the latent variables, I get an error message that the model is not identified. What should I do?
 bmuthen posted on Friday, July 02, 2004 - 2:19 pm
H0 should have the mean of the factor fixed at zero for the first time point and free for the second - and equal measurement intercepts across time points.

H1 should have both factor means fixed at zero (which is the default) and all measurement intercepts free.
 Anonymous posted on Wednesday, November 17, 2004 - 12:58 pm
Hello - I am a bit confused. I am attempting to convert my probit regression coefficients into probabilities, and I am not sure I am doing this correctly. Is this right?

Probability of y/x = normdist * (threshold of y + B1(mean of x))

If so, I am confused about my threshold for y. Mplus sampstat output gives me a threshold for y in the means/intercepts/thresholds output that is different from the threshold for y it gives me in the normal output under the factor loadings for the latent variable. Which threshold is right? Also, do I use standardized probit regression coefficients in the computation of probabilities? My latent variable is standardized so the mean is 0.

Thank you
 bmuthen posted on Wednesday, November 17, 2004 - 3:14 pm
The normdist argument (what's in the parenthesis) should have a negative sign for the threshold - see Tech Appendix 1 on our web site. This is the probability of y=1 for y scored 0/1. The threshold should be taken from the regular results section where you find the other estimates, not from the sample statistics. You should not use the standardized estimates but those in the first column. For an example, see the ASB example of our "Day 3" handout.
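Written out, the corrected expression looks roughly like this. The symbols are notation chosen here for illustration: \Phi is the standard normal distribution function (normdist), \tau is the threshold from the regular results section, and \beta is the unstandardized coefficient from the first column. It assumes a single x and a residual variance of 1 for y*.

```latex
P(y = 1 \mid x) = \Phi(-\tau + \beta x),
\qquad
P(y = 0 \mid x) = 1 - \Phi(-\tau + \beta x) = \Phi(\tau - \beta x)
```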
 HW posted on Thursday, June 15, 2006 - 11:39 am
Is it possible to obtain the mean value and standard deviation for a latent variable?
 Bengt O. Muthen posted on Friday, June 16, 2006 - 8:02 am
A latent variable needs to be given a metric, typically mean zero and either variance 1 or setting a loading to 1. With multiple-group or longitudinal analysis, you can estimate means and variances when comparing to a reference group or time point where the metric has been set.
 Inna Altschul posted on Saturday, July 08, 2006 - 10:22 am
I understand that the scale of a latent factor is set to the scale of one of its indicators by fixing that indicator's factor loading at 1.0. But, how is the actual latent factor mean determined?

I have a latent factor with two indicators, each of which is on an 8 point scale (0 to 7) with means around 5, but my estimated latent factor mean is negative 2.5. Why might this be happening, and is it problematic?
 Linda K. Muthen posted on Wednesday, July 12, 2006 - 1:54 am
Unless you have multiple group analysis or a growth model, the means of the latent variables are fixed at zero.

A factor model with two indicators is not identified. I am not sure how you get the results you mention.
 Susan Scott posted on Wednesday, October 18, 2006 - 11:18 am
I also have a model where the Tech4 output indicates that the means of many of my latent variables are <0. I have 2 LVs with 3 indicators, many others (5) with only 2, and most (9) with only a single indicator, to which I have applied 90% reliability. A path model of similar structure gives similar results (slightly lower betas, better fit statistics), so I assumed everything is OK. For one of my single indicator latent variables, Tech4 indicates the variable has a mean of 92 (scored 0-100), but the latent has a mean of -15. Should I be concerned about model identification? If not, how do I interpret these latent variable means?
 Linda K. Muthen posted on Wednesday, October 18, 2006 - 1:49 pm
I need more information to answer your question. Please send your input, data, output, and license number to
 Miriam Marleen Gebauer posted on Monday, June 09, 2008 - 4:28 am

The latent variables I'm trying to model (SEM) cannot be reproduced with a different data arrangement. I'm using the same data (!), but the number of variables in the data file differs and the Mplus results differ greatly. The results differ even when I list a different number of variables in Mplus (in the NAMES ARE option). Should I just reduce the data file to the variables I use? What am I doing wrong?

Thank you.
 Linda K. Muthen posted on Monday, June 09, 2008 - 5:58 am
I don't understand your question. Can you send the output files in question and your license number to
 Tammy Kochel posted on Tuesday, November 09, 2010 - 9:24 am
I have several latent predictor variables in my SEM with a binary, observed DV (WLSMV estimator). Using the resulting probit coefficients and the threshold for the DV, I am trying to use the equation provided in the Mplus manual (calculating probabilities from probit regression coefficients) to produce probabilities. For the calculations, I am contemplating fixing all my IVs except the one of interest at their means (or mode for binary). Thus, I am trying to determine the scale for the latent IVs.

My latent variables' observed indicators all have ordinal scales of 1-4. On some posts, I have read that the latent variable takes on the scale of the first indicator (so in my case 1-4). Yet, upon reading other posts, I think that the latent variable has a mean of 0 and standard deviation of 1 and not a range of 1-4. My Tech 4 output shows means for the latent variables which are not = 0, rather in my case range from .093 to .411.

Please address 2 questions for me:

1) What is the scale and mean of the latent variables in this example?
2) Is there a way for Mplus to compute the probabilities for me?

Thanks in advance for the clarification.
 Linda K. Muthen posted on Wednesday, November 10, 2010 - 11:07 am
1. Latent variables have means of zero and an estimated variance in single group analysis. Given that you do not have means of zero in TECH4, you must have covariates in the model.

2. No.
 Tammy Kochel posted on Friday, November 12, 2010 - 7:12 am
I have examined the probability graph produced by plot 2 for a binary DV and a latent predictor. Now I want to explain to policymakers what it means.

I have saved the factor scores for my latent predictor variables and noted the lowest value for my latent predictor and the highest value for my latent predictor. Perhaps as expected, they appear to be about within 3 standard deviations of the mean, based on the mean and standard deviation reported by Mplus while creating the graphs.

Given this, is it reasonable to explain to a policy audience that for the lowest estimated value of the predictor, the probability is [probability value at the approximate number of standard deviations below mean--generally 3] and for the highest estimated value of the predictor, the probability is [probability value at the approximate number of standard deviations above mean--generally 3]?

Thanks again for your assistance.
 Linda K. Muthen posted on Friday, November 12, 2010 - 9:21 am
I would not use factor scores to compute probabilities. Instead I would estimate probabilities at plus and minus one standard deviation of the factor from the factor mean of zero. You can get the factor variance from TECH4.
 Maureen Brinkworth posted on Friday, November 02, 2012 - 9:20 am
I have created four separate latent factors. When I examine the Tech4 output I can see that the mean of each is zero, which is consistent with what I've read in other posts. Each factor was created using indicators on a 1-5 scale. Above you said that the metric of the factors will have the same unit change value but not necessarily the same range as the indicators. How can I determine the range (min-max) of my latent factors? Thanks!
 Linda K. Muthen posted on Sunday, November 04, 2012 - 11:01 am
You can estimate the minimum and the maximum values by using the estimated factor variance to look at plus and minus 2 or 3 standard deviations from the factor mean of zero.
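A minimal Python sketch of that calculation, assuming a factor mean of zero as stated. The factor variance of 0.64 is a hypothetical value standing in for the estimate from the output, and the helper name is made up here.

```python
import math

# Hypothetical estimated factor variance (assumed value, not from real output).
factor_variance = 0.64
factor_sd = math.sqrt(factor_variance)

def approx_range(sd, k, mean=0.0):
    """Approximate (min, max) as mean -/+ k standard deviations."""
    return (mean - k * sd, mean + k * sd)

range_2sd = approx_range(factor_sd, 2)
range_3sd = approx_range(factor_sd, 3)
print(f"+/- 2 SD: {range_2sd[0]:.2f} to {range_2sd[1]:.2f}")
print(f"+/- 3 SD: {range_3sd[0]:.2f} to {range_3sd[1]:.2f}")
```

These are rough bounds for a roughly normal factor, not hard limits like the 1-5 range of the indicators.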
 Tobias Stark posted on Tuesday, April 02, 2013 - 10:36 am
Dear Dr. Muthen,

I ran this model:

V1 BY VV1N* VV2N* VV3N*;

V2 BY symb1* symb2* symb3*;

V1@1 V2@1;

V2 ON V1;

I was under the impression that V1@1 V2@1 would cause the two latent factors, V1 and V2, to have variances fixed at 1.

But the output for the unstandardized solution reports that the residual variance of V2 (the endogenous variable) is 1, and the TECH4 output says that the total variance of V2 is 1.55.

Did I fix the residual variance to 1 instead of fixing the variance to 1?

Is it possible to fix the variance of an endogenous variable to 1 in Mplus? If so, how?

Thank you!
 Linda K. Muthen posted on Tuesday, April 02, 2013 - 12:21 pm
In a conditional model, the variance of an endogenous variable is not a model parameter. The residual variance is. So you can't fix the variance.
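To see why TECH4 can report a total variance above 1 when the residual variance is fixed at 1, here is a small Python sketch of the variance decomposition for V2 ON V1. It assumes V1 is the only predictor of V2 and that the variance of V1 is 1, as in the model above. The slope is not reported in the post, so the value used here is back-calculated from the reported total of 1.55 and is only implied, not observed.

```python
import math

def total_variance(beta, var_predictor, resid_var):
    """Model-implied variance of an endogenous variable with one predictor."""
    return beta ** 2 * var_predictor + resid_var

var_v1 = 1.0          # V1@1 fixes the variance of the exogenous factor
resid_var_v2 = 1.0    # V2@1 fixes the residual variance of V2, not its total variance
reported_total = 1.55 # total variance of V2 reported in TECH4

# Back out the slope magnitude implied by these three numbers (an inference, not output).
implied_abs_beta = math.sqrt((reported_total - resid_var_v2) / var_v1)
print(f"Implied |beta|: {implied_abs_beta:.3f}")
print(f"Total variance: {total_variance(implied_abs_beta, var_v1, resid_var_v2):.2f}")
```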
 Stig Hebbelstrup Rye Rasmussen posted on Friday, April 05, 2013 - 2:18 am
I have an additional question regarding the posting on November 02, 2012 and the posting on April 02, 2013 about the range and variance of the latent variable. If the variance of an endogenous latent variable is not a model parameter, does this mean that it is not possible to change the range of the latent variable?
I have two datasets where I have a latent dependent variable with binary categorical indicators and the same predictors in both datasets, but not the same indicators for the latent variable. It would be nice to be able to directly compare the size of the unstandardized coefficients, as is possible in ordinary regression if, for instance, the dependent variable has the same scale. Seeing as the indicators of the latent dependent variable are binary in both datasets, can the unstandardized coefficients be compared as if the scale is the same? When I look at TECH4, the means and variances of the latent factors are not identical.
 Linda K. Muthen posted on Friday, April 05, 2013 - 9:21 am
If you do not have the same indicators of the factors in each group, you cannot establish measurement invariance because the constructs are not the same. Because of this, you cannot make comparisons across groups.
 Sarah Lowe posted on Friday, August 02, 2013 - 1:54 pm

I am trying to test whether the means of a latent construct differ over time. Above, you suggest the following approach:

"H0 should have the mean of the factor fixed at zero for the first time point and free for the second - and equal measurement intercepts across time points.

H1 should have both factor means fixed at zero (which is the default) and all measurement intercepts free."

2 follow-up questions:

1) Do you know of a good citation?

2) What would a significant or non-significant chi-square difference test mean in this scenario?

Thanks very much!!!
 Linda K. Muthen posted on Friday, August 02, 2013 - 4:06 pm
1. Millsap, R.E. (2011). Statistical approaches to measurement invariance. Taylor and Francis Group: New York.

2. A significant chi-square difference test says the means are not the same across time.
 Meike Slagt posted on Wednesday, October 15, 2014 - 5:16 pm
Dear Dr. Muthen,

I'm using CFA to estimate a latent variable that has continuous indicators measured on different scales (e.g., X1 measured on 7-point scale, X2 measured on 9-point scale). For interpreting the factor loadings this doesn't matter, as I understand. But what if I want to interpret the latent variable mean?

I am estimating the same latent variable repeatedly across waves, and want to be able to see how much the scores on my latent variable increase or decrease. This seems hard if indicators with different scales feed into it. Would it make sense to standardize the indicators after all? A 1-unit change on the latent variable from Time1 to Time2 could then mean an increase of 1 SD on the latent variable.

Thank you for your help!

Best, Meike
 Bengt O. Muthen posted on Thursday, October 16, 2014 - 9:37 am
This is not a problem. The factor scale is tied to the scale of the factor indicator that has its loading fixed at 1: when the factor increases one unit, that indicator is expected to increase one unit. The other indicators increase lambda units, where lambda is indicator-specific. An increase from a factor mean of zero at time 1 to alpha at time 2 would give an expected increase of alpha in the first indicator and of lambda*alpha in the other indicators, again with lambda indicator-specific.
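As a small numeric illustration, the expected change in each indicator is its loading times the change in the factor mean. The loadings and the value of alpha in this Python sketch are hypothetical, and the indicator names X1 to X3 are placeholders.

```python
# Hypothetical loadings; X1 is the indicator with its loading fixed at 1.
loadings = {"X1": 1.0, "X2": 0.7, "X3": 1.3}

# Hypothetical factor mean at time 2 (the time 1 mean is zero).
alpha = 0.5

# Expected change in each indicator, in that indicator's own units.
expected_change = {name: lam * alpha for name, lam in loadings.items()}

for name, change in expected_change.items():
    print(f"{name}: expected change of {change:.2f}")
```

Each expected change is in the original units of its own indicator, which is why indicators measured on different scales do not need to be standardized first.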
 jen jang sheu posted on Saturday, August 22, 2015 - 12:28 am
Dear Professor Muthen,

When I read the results in the output file of a mixture model, I see, for example:

Categorical Latent Variables

C#1 -0.601
C#2 -0.735

Could you kindly tell me how these values are calculated, and why they are always negative? Thank you very much!!

 Bengt O. Muthen posted on Saturday, August 22, 2015 - 6:57 am
Those values are logits that represent the class probabilities, which are printed separately. See our Topic 5 course and handout on the website.
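For illustration, the two logits in the question can be converted to class probabilities with a short Python sketch. It assumes a model with three classes and no covariates, with the last class as the reference class whose logit is zero. The number of classes is not stated in the post, so that part is an assumption.

```python
import math

# Logits reported in the question for C#1 and C#2.
logits = [-0.601, -0.735]

# Assumed reference class (the last class) with a logit of zero.
all_logits = logits + [0.0]

# Multinomial logit transformation: exp(logit) divided by the sum over all classes.
exp_values = [math.exp(value) for value in all_logits]
total = sum(exp_values)
class_probs = [e / total for e in exp_values]

for index, prob in enumerate(class_probs, start=1):
    print(f"Class {index}: {prob:.3f}")
```

Under these assumptions a negative logit simply means that the class is smaller than the reference class, so the values are not negative in general, only when the last class happens to be the largest.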
 Anonymous  posted on Friday, September 04, 2015 - 6:45 am

I am running a CFA with multiple latent variables. Each latent variable has indicators that are on a different metric. For example, my first latent variable is measured by both binary (0,1) and ordered categorical (1,2,3) indicators.

1) Is this problematic? If so, will standardizing the indicator variables prior to running my CFA solve this problem?

2) If I do not need to standardize them, do you know of a reference/citation I can review?

Thank you.
 Bengt O. Muthen posted on Friday, September 04, 2015 - 7:46 am
This is not a problem. I can't think of a specific reference off hand, but this is common knowledge and is often the case in IRT applications with binary and graded responses.
 Gouri Shankar Mishra posted on Monday, October 12, 2015 - 11:26 pm
Dr. Muthen

I have a non-recursive path model with categorical endogenous variables.
Y1 <- f(Y2, X1)
Y2 <- f(Y1, X2)

The TECH4 output gives me the estimated mean of the underlying latent variables Y1* and Y2*. However, I am trying to get the values of Y1* and Y2* for each and every response. The SAVEDATA = FSCORES did not work. Please advise. Thanks a lot
 Bengt O. Muthen posted on Tuesday, October 13, 2015 - 6:28 pm
You can either put a factor behind each Y and ask for factor scores, or use Bayes and ask for Y* scores.
 Gouri Shankar Mishra posted on Saturday, October 17, 2015 - 6:40 pm
Thanks Dr. Muthen. I adopted Option #1 of your suggestion.

However, the method led to substantial changes in estimated coefficients and their standard errors.

For example, the coefficient of Y2 in the equation Y1 <- f(Y2, X1) was originally -0.187 with a p-value of 0.000. In the new method you suggested, the coefficient of FactorY2 in the equation FactorY1 <- f(FactorY2, X1) is -0.216 with a p-value of 0.117.

Similarly, large changes were observed in the thresholds (for ordered probit model).

Why? Do the two coefficients have the same interpretation? Can you point me to some literature?

Thanks for your time.
 Bengt O. Muthen posted on Sunday, October 18, 2015 - 10:54 am
Check that the two models have the same number of parameters and log likelihood.
 Travis Salway Hottes posted on Tuesday, March 15, 2016 - 9:17 pm
Dear Drs. Muthen,

I am trying to calculate predicted probabilities from probit regression coefficients. I have 3 factors (and multiple binary covariates, which I will hold constant/set to 0) and a binary outcome. I see that earlier in this thread Dr. Muthen recommended estimating probabilities at plus and minus one standard deviation of the factor from the factor mean of zero. I'd like to confirm that I am properly interpreting the formula provided in chapter 14 of the user guide.

The coefficient estimate for f1 is 0.152.
The threshold is 1.765.
The standard error of f1 (obtained from TECH4) is 0.032.

To calculate the probability (for example) at +1 standard deviation from the mean of f1:
F(-1.765 + 0.032*0.152) = F(-1.760136) = 0.039

Is this correct?

I am surprised that the SE of f1 is so small, but perhaps this is because all of the indicators for f1 are binary?

Thank you for your guidance.
 Bengt O. Muthen posted on Wednesday, March 16, 2016 - 6:10 pm
This looks correct if you are using ML. WLSMV in Delta parameterization uses a different formula. Also, you say SE but I think you mean SD. You also need to know that you are holding the means of the other factors at zero which may not be the mean for those factors that you see in TECH4.
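The arithmetic in the question can be reproduced with a few lines of Python. This sketch only checks the numbers as posted: it takes 0.032 as the value the poster plugs in for +1 SD of f1, holds the other factors and covariates at zero, and does not address which formula applies under a given estimator or parameterization.

```python
from statistics import NormalDist

threshold = 1.765   # threshold reported in the question
coef_f1 = 0.152     # coefficient estimate for f1
f1_value = 0.032    # value used by the poster for +1 SD of f1

# Argument of the normal distribution function, with a negative sign on the threshold.
argument = -threshold + coef_f1 * f1_value
probability = NormalDist().cdf(argument)

print(f"F({argument:.6f}) = {probability:.3f}")
```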
 Travis Salway Hottes posted on Thursday, March 17, 2016 - 9:56 am
Dr. Muthen:

Thank you for your reply and for the reminder about the non-zero means of the other factors. I am using WLSMV (default) and theta parameterization (my model is not supported by delta, perhaps because of categorical mediators).

Can you point me to the formula for WLSMV with theta parameterization?

Thank you,
 Bengt O. Muthen posted on Sunday, March 20, 2016 - 7:07 am
One description of Theta is in Mplus web note 4 on our website.
 Travis Salway Hottes posted on Monday, March 21, 2016 - 10:16 am
Thank you for your help.