
Anonymous posted on Wednesday, May 09, 2001  2:44 pm



When you say the metric of the factor is set by fixing the first factor loading to 1, does that mean an indicator with a scale of 0-5 would set the factor to a scale of 0-5? And would a mean of 3 on the factor correspond to a response of 3 on the indicator in meaning? I am guessing this is not right, especially if the other indicators are on a different scale and the factor loadings vary.


The metric of the factor would have the same unit change value but not necessarily the same range. So if the item with factor loading one had a range of 0-5, the factor would not necessarily have that range, but going from 2 to 3 on the item and on the factor would have the same meaning.

Anonymous posted on Monday, June 18, 2001  10:32 am



I have a question about the structure / scaling of the latent variable (y*) used in Mplus probit models. Does Mplus center the latent continuous variable at zero? To clarify: I have a 3-category variable I want to use as an outcome in Mplus. Responses are: 1=Never, 2=Once or twice, 3=More than two times. I'm concerned that if Mplus centers the continuous latent y* outcome variable at zero, the results might not make sense; respondents could end up with y* values that would correspond to a response of less than "never". Also: do you have any plans to introduce tobit / limited outcomes capabilities into Mplus?

bmuthen posted on Tuesday, June 19, 2001  4:15 pm



There is no need to be concerned here. For a 3-category variable such as yours, Mplus estimates 2 threshold parameters such that if y* is less than the lower threshold you observe Never, if y* is between the two you observe Once or twice, and if y* is greater than the higher threshold you observe More than two times. The thresholds are tied to the scale of y*. When there are no x's in the model, the mean of y* is standardized to zero and the variance standardized to one, and the thresholds are in a standard normal metric. Some form of limited outcomes modeling may be incorporated in future program versions.
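To make the threshold idea concrete, here is a minimal Python sketch of how two thresholds on a standard normal y* split it into three response probabilities. The threshold values and the function name are hypothetical, chosen only for illustration, and the sketch assumes the no-covariate case described above (y* with mean zero and variance one).

```python
from statistics import NormalDist

def category_probabilities(thresholds, mean=0.0, sd=1.0):
    """Return P(y = k) for an ordinal y obtained by cutting a normal y* at the thresholds."""
    dist = NormalDist(mean, sd)
    # Cumulative probability of y* falling below each threshold
    cuts = [dist.cdf(t) for t in sorted(thresholds)]
    bounds = [0.0] + cuts + [1.0]
    # Each category is the probability mass between two adjacent cut points
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]

# Hypothetical thresholds, not taken from any output in this thread
probs = category_probabilities([0.2, 1.1])
for label, p in zip(["Never", "Once or twice", "More than two times"], probs):
    print(f"{label}: {p:.3f}")
```

Every value of y*, however negative, falls in the lowest category, so no respondent can land "below Never".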


Hi, I have three questions: 1) How does one obtain threshold parameters when using ordinal dependent variables and the WLSMV method? I am using latent exogenous variables and a single indicator ordinal dependent variable. Running version 1, they don't show up in the output. 2) I am assuming that by using the unstandardized regression coefficients, and the threshold parameters, I should be able to look at the predicted probabilities as I would in an Ordered Probit Model. Am I wrong? 3) Would this still be appropriate if I have two or three ordinal dependent variables? Thanks in advance for your answer. 


1. You can get thresholds by asking for TYPE=MEANSTRUCTURE. 2. Yes. 3. Yes. 


Thanks so much. 


I know that when a latent factor is measured with continuous indicators the scale of the latent factor is set to the scale of one of the indicators by fixing a factor loading at 1.0. How is the scale of the latent factor determined when the indicators are all categorical and thresholds are estimated for each indicator? I'd like to say something about a one-unit change in the latent variable but it's not clear to me how that is to be substantively interpreted.

bmuthen posted on Monday, August 05, 2002  2:40 pm



The interpretation is a bit more involved with categorical indicators. A one-unit change in the latent variable causes a change of lambda in y*, the continuous latent variable underlying the categorical indicators. This translates into a change of the probability of the categorical y (see Mplus Tech Appendix 1), but this probability change is different depending on where on the latent variable scale you are. This is due to the nonlinear (S-shaped) relationship between the latent variable and the probability of y. A useful approach is perhaps to consider a change from 1 SD below the mean of the latent variable to 1 SD above the mean and translate that into probabilities.
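As a rough illustration of that suggestion, the Python sketch below converts factor values into probabilities for one binary indicator. The loading, threshold, and factor SD are hypothetical, and the residual standard deviation of y* is assumed to be 1; the appropriate value depends on the parameterization used, so treat this as a sketch rather than the exact Mplus computation.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def prob_y1(eta, loading, threshold, residual_sd=1.0):
    """P(y = 1 | eta) for a binary indicator with y* = loading * eta + residual."""
    return PHI((loading * eta - threshold) / residual_sd)

# Hypothetical estimates, not taken from any output in this thread
loading, threshold, factor_sd = 0.8, 0.5, 1.2

# Probabilities at 1 SD below and 1 SD above the factor mean of zero
p_low = prob_y1(-factor_sd, loading, threshold)
p_high = prob_y1(+factor_sd, loading, threshold)

# The same one-unit step changes the probability by different amounts
step_near_mean = prob_y1(1.0, loading, threshold) - prob_y1(0.0, loading, threshold)
step_in_tail = prob_y1(4.0, loading, threshold) - prob_y1(3.0, loading, threshold)

print(f"P(y=1) at -1 SD: {p_low:.3f}, at +1 SD: {p_high:.3f}")
print(f"one-unit step near the mean: {step_near_mean:.3f}, in the tail: {step_in_tail:.3f}")
```

The two one-unit steps differ in size because of the S-shaped curve, which is why a fixed comparison such as -1 SD versus +1 SD is easier to report.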


Do you recommend standardizing (to a mean of 0 and a s.d. of 1) the items making up a latent variable? 

bmuthen posted on Monday, February 03, 2003  4:04 pm



No, that's not necessary. 

Anonymous posted on Friday, July 02, 2004  2:15 pm



I have a one group model in which I have a latent variable measured at two time points. I'd like to test whether the mean on the latent variable changes over time. My guess is that I should estimate a model with freely estimated means and compare it to a model in which the means are constrained to be equal. If I free the means on the latent variables, I get an error message that the model is not identified. What should I do? 

bmuthen posted on Friday, July 02, 2004  2:19 pm



H0 should have the mean of the factor fixed at zero for the first time point and free for the second, and equal measurement intercepts across time points. H1 should have both factor means fixed at zero (which is the default) and all measurement intercepts free.

Anonymous posted on Wednesday, November 17, 2004  12:58 pm



Hello, I am a bit confused. I am attempting to convert my probit regression coefficients into probabilities, and I am not sure I am doing this correctly. Is this right? Probability of y/x = normdist * (threshold of y + B1(mean of x)) If so, I am confused about my threshold for y. Mplus sampstat output gives me a threshold for y in the means/intercepts/thresholds output that is different from the threshold for y it gives me in the normal output under the factor loadings for the latent variable. Which threshold is right? Also, do I use standardized probit regression coefficients in the computation of probabilities? My latent variable is standardized so the mean is 0. Thank you

bmuthen posted on Wednesday, November 17, 2004  3:14 pm



The normdist argument (what's in the parentheses) should have a negative sign for the threshold; see Tech Appendix 1 on our web site. This is the probability of y=1 for y scored 0/1. The threshold should be taken from the regular results section where you find the other estimates, not from the sample statistics. You should not use the standardized estimates but those in the first column. For an example, see the ASB example of our "Day 3" handout.
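The sign convention can be sketched in a few lines of Python. The estimates below are hypothetical stand-ins for the unstandardized threshold and slope from the regular results section, and the function name is made up for illustration.

```python
from statistics import NormalDist

def probit_probability(threshold, slope, x):
    """P(y = 1 | x) for y scored 0/1; the threshold enters with a negative sign."""
    return NormalDist().cdf(-threshold + slope * x)

# Hypothetical unstandardized estimates (first column of the results)
tau, b = 0.4, 0.6

p_at_zero = probit_probability(tau, b, x=0.0)  # predictor at a mean of zero
p_one_up = probit_probability(tau, b, x=1.0)   # predictor one unit higher

print(f"P(y=1 | x=0) = {p_at_zero:.3f}")
print(f"P(y=1 | x=1) = {p_one_up:.3f}")
```

With a positive threshold and the predictor at zero, the probability of y=1 falls below 0.5, which is the behavior the negative sign is meant to produce.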

HW posted on Thursday, June 15, 2006  11:39 am



Is it possible to obtain the mean value and standard deviation for a latent variable? 


A latent variable needs to be given a metric, typically mean zero and either variance 1 or setting a loading to 1. With multiple-group or longitudinal analysis, you can estimate means and variances when comparing to a reference group or time point where the metric has been set.


I understand that the scale of a latent factor is set to the scale of one of its indicators by fixing that indicator's factor loading at 1.0. But how is the actual latent factor mean determined? I have a latent factor with two indicators, each of which is on an 8-point scale (0 to 7) with means around 5, but my estimated latent factor mean is negative 2.5. Why might this be happening, and is it problematic?


Unless you have multiple group analysis or a growth model, the means of the latent variables are fixed at zero. A factor model with two indicators is not identified. I am not sure how you get the results you mention.

Susan Scott posted on Wednesday, October 18, 2006  11:18 am



I also have a model where the Tech4 output indicates that the means of many of my latent variables are <0. I have 2 LVs with 3 indicators, many others (5) with only 2, and most (9) with only a single indicator, to which I have applied 90% reliability. A path model of similar structure gives similar results (slightly lower betas, better fit statistics), so I assumed everything is OK. For one of my single indicator latent variables, Tech4 indicates the variable has a mean of 92 (scored 0-100), but the latent has a mean of 15. Should I be concerned about model identification? If not, how do I interpret these latent variable means?


I need more information to answer your question. Please send your input, data, output, and license number to support@statmodel.com. 


Hey, the latent variables I'm trying to model (SEM) cannot be reproduced with a different data arrangement. I'm using the same data (!), but when the number of variables in the data file differs, the Mplus results differ greatly. The results differ even when I only list a different number of variables in Mplus (in the NAMES ARE option). Should I just reduce the data file to the variables I use? What am I doing wrong? Thank you.


I don't understand your question. Can you send the output files in question and your license number to support@statmodel.com?


I have several latent predictor variables in my SEM with a binary, observed DV (WLSMV estimator). Using the resulting probit coefficients and the threshold for the DV, I am trying to use the equation provided in the Mplus manual (calculating probabilities from probit regression coefficients) to produce probabilities. For the calculations, I am contemplating fixing all my IVs except the one of interest at their means (or mode for binary). Thus, I am trying to determine the scale for the latent IVs. My latent variables' observed indicators all have ordinal scales of 1-4. On some posts, I have read that the latent variable takes on the scale of the first indicator (so in my case 1-4). Yet, upon reading other posts, I think that the latent variable has a mean of 0 and standard deviation of 1 and not a range of 1-4. My Tech 4 output shows means for the latent variables which are not = 0, rather in my case range from .093 to .411. Please address 2 questions for me: 1) What is the scale and mean of the latent variables in this example? 2) Is there a way for Mplus to compute the probabilities for me? Thanks in advance for the clarification.


1. Latent variables have means zero and an estimated variance in single group analysis. Given that you do not have means of zero in TECH4, you must have covariates in the model. 2. No. 


I have examined the probability graph produced by plot 2 for a binary DV and a latent predictor. Now I want to explain to policymakers what it means. I have saved the factor scores for my latent predictor variables and noted the lowest value for my latent predictor and the highest value for my latent predictor. Perhaps as expected, they appear to be about within 3 standard deviations of the mean, based on the mean and standard deviation reported by Mplus while creating the graphs. Given this, is it reasonable to explain to a policy audience that for the lowest estimated value of the predictor, the probability is [probability value at the approximate number of standard deviations below mean, generally 3] and for the highest estimated value of the predictor, the probability is [probability value at the approximate number of standard deviations above mean, generally 3]? Thanks again for your assistance.


I would not use factor scores to compute probabilities. Instead I would estimate probabilities at plus and minus one standard deviation of the factor from the factor mean of zero. You can get the factor variance from TECH4. 


I have created four separate latent factors. When I examine the Tech4 output I can see that the mean of each is zero, which is consistent with what I've read in other posts. Each factor was created using indicators on a 1-5 scale. Above you said that the metric of the factors will have the same unit change value but not necessarily the same range as the indicators. How can I determine the range (min-max) of my latent factors? Thanks!


You can estimate the minimum and the maximum values by using the estimated factor variance to look at plus and minus 2 or 3 standard deviations from the factor mean of zero. 


Dear Dr. Muthen, I ran this model:

V1 BY VV1N* VV2N* VV3N*;
V2 BY symb1* symb2* symb3*;
V1@1 V2@1;
V2 ON V1;

I was under the impression that V1@1 V2@1 would cause the two latent factors, V1 and V2, to have variances fixed at 1. But the output for the unstandardized solution reports that the residual variance of V2 (the endogenous variable) is 1, and the TECH4 output says that the total variance of V2 is 1.55. Did I fix the residual variance to 1 instead of fixing the variance to 1? Is it possible to fix the variance of an endogenous variable to 1 in Mplus? If so, how? Thank you!


In a conditional model, the variance of an endogenous variable is not a model parameter. The residual variance is. So you can't fix the variance. 
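The numbers in the question fit the usual variance decomposition for a regression of one factor on another. As a minimal Python sketch, assuming V1 is the only predictor of V2, the total variance is the squared slope times the predictor variance plus the residual variance. The slope is not reported in the post, so the value computed below is only what the reported variances imply.

```python
import math

def total_variance(slope, predictor_variance, residual_variance):
    """Model-implied variance of an endogenous variable with a single predictor."""
    return slope ** 2 * predictor_variance + residual_variance

# Values quoted in the question: Var(V1) fixed at 1, residual variance of V2
# fixed at 1, and a TECH4 total variance of 1.55 for V2
var_v1, resid_v2, tech4_total = 1.0, 1.0, 1.55

# Size of the slope implied by those numbers (its sign cannot be recovered)
implied_slope = math.sqrt((tech4_total - resid_v2) / var_v1)

print(f"implied |slope| of V2 ON V1: {implied_slope:.3f}")
print(f"reconstructed total variance: {total_variance(implied_slope, var_v1, resid_v2):.2f}")
```

So V2@1 fixed the residual variance, and the extra 0.55 in TECH4 is the part of V2's variance explained by V1.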


I have an additional question regarding the posting on November 02, 2012 and the posting on April 02, 2013 about the range and variance of the latent variable. If the variance of an endogenous latent variable is not a model parameter, does this mean that it is not possible to change the range of the latent variable? I have two datasets where I have a latent dependent variable with binary categorical indicators and the same predictors in both datasets, but not the same indicators for the latent variable. It would be nice to be able to directly compare the size of the unstandardized coefficients, as is possible in ordinary regression if, for instance, the dependent variable has the same scale. Seeing as the indicator variables for the latent dependent variables in both datasets are binary, can the unstandardized coefficients be compared as if the scale is the same? When I look at TECH4, the means and variances of the latent factors are not identical.


If you do not have the same indicators of the factors in each group, you cannot establish measurement invariance because the constructs are not the same. Because of this, you cannot make comparisons across groups. 

Sarah Lowe posted on Friday, August 02, 2013  1:54 pm



Hello! I am trying to test whether the means of a latent construct differ over time. Above, you suggest the following approach: "H0 should have the mean of the factor fixed at zero for the first time point and free for the second, and equal measurement intercepts across time points. H1 should have both factor means fixed at zero (which is the default) and all measurement intercepts free." 2 follow-up questions: 1) Do you know of a good citation? 2) What would a significant or non-significant chi-square difference test mean in this scenario? Thanks very much!!! Sarah


1. Millsap, R.E. (2011). Statistical approaches to measurement invariance. Taylor and Francis Group: New York. 2. A significant chi-square difference test says the means are not the same across time.

Meike Slagt posted on Wednesday, October 15, 2014  5:16 pm



Dear Dr. Muthen, I'm using CFA to estimate a latent variable that has continuous indicators measured on different scales (e.g., X1 measured on a 7-point scale, X2 measured on a 9-point scale). For interpreting the factor loadings this doesn't matter, as I understand. But what if I want to interpret the latent variable mean? I am estimating the same latent variable repeatedly across waves, and want to be able to tell how much the scores on my latent variable increase or decrease. This seems hard if indicators with different scales feed into it. Would it make sense to standardize the indicators after all? A 1-unit change on the latent variable from Time 1 to Time 2 could then mean an increase of 1 SD on the latent variable. Thank you for your help! Best, Meike


This is not a problem. The factor scale is tied to the scale of the factor indicator that has loading fixed at 1: when the factor increases one unit, that indicator is expected to increase one unit. The other indicators increase lambda units where lambda is indicator-specific. An increase from a factor mean of zero at time 1 to alpha at time 2 would give an expected increase in the first indicator of alpha and in the other indicators of lambda*alpha, again with lambda indicator-specific.
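A small Python sketch of that arithmetic follows. The loadings and the factor mean change alpha are hypothetical, and X1 is assumed to be the reference indicator with loading fixed at 1.

```python
def expected_indicator_changes(loadings, alpha):
    """Expected change in each indicator when the factor mean moves from 0 to alpha."""
    return {name: lam * alpha for name, lam in loadings.items()}

# Hypothetical loadings; X1 is the reference indicator (loading fixed at 1)
loadings = {"X1": 1.0, "X2": 1.4}

changes = expected_indicator_changes(loadings, alpha=0.5)
for name, change in changes.items():
    print(f"{name}: expected change of {change:.2f} in its own units")
```

Each expected change is expressed in the units of its own indicator, so differing response scales do not need to be standardized first.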


Dear Professor Muthen, When I read the output file for a mixture model, I see, for example: Categorical Latent Variables Means C#1 -0.601 C#2 -0.735. Could you kindly tell me how these values are calculated, and why they are always negative? Thank you very much! Jen


Those values are logits that represent the class probabilities, which are printed separately. See our Topic 5 course and handout on the website. 
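For readers who want to see the conversion, here is a hedged Python sketch. It reads the two means in the question as -0.601 and -0.735 (negative, as the poster describes) and assumes a three-class model in which the last class is the reference with its logit fixed at zero. The number of classes is an assumption, since the post does not state it.

```python
import math

def class_probabilities(logits):
    """Convert class logits (reference class last, logit 0) into class probabilities."""
    all_logits = list(logits) + [0.0]  # append the reference class
    weights = [math.exp(v) for v in all_logits]
    total = sum(weights)
    return [w / total for w in weights]

# Means read from the question; the three-class structure is an assumption
probs = class_probabilities([-0.601, -0.735])
for k, p in enumerate(probs, start=1):
    print(f"class {k}: {p:.3f}")
```

A negative logit only means the class is smaller than the reference class, so the means come out negative whenever the last class is the largest.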

Anonymous posted on Friday, September 04, 2015  6:45 am



Hello, I am running a CFA with multiple latent variables. Each latent variable has indicators that are on a different metric. For example, my first latent variable is measured by both binary (0,1) and ordered categorical (1,2,3) indicators. 1) Is this problematic? If so, will standardizing the indicator variables prior to running my CFA solve this problem? 2) If I do not need to standardize them, do you know of a reference/citation I can review? Thank you.


This is not a problem. I can't think of a specific reference offhand, but this is common knowledge and is often the case in IRT applications with binary and graded responses.


Dr. Muthen, I have a nonrecursive path model with categorical endogenous variables. CATEGORICAL ARE Y1 Y2; Y1 <- f(Y2, X1); Y2 <- f(Y1, X2). The TECH4 output gives me the estimated mean of the underlying latent variables Y1* and Y2*. However, I am trying to get the values of Y1* and Y2* for each and every response. The SAVEDATA = FSCORES did not work. Please advise. Thanks a lot


You can either put a factor behind each Y and ask for factor scores, or use Bayes and ask for Y* scores. 


Thanks Dr. Muthen. I adopted Option #1 of your suggestion. However, the method led to substantial changes in estimated coefficients and their standard errors. For example, the coefficient of Y2 in the equation Y1 <- f(Y2, X1) was originally 0.187 with a p-value of 0.000. In the new method you suggested, the coefficient of FactorY2 in the equation FactorY1 <- f(FactorY2, X1) is 0.216 with a p-value of 0.117. Similarly, large changes were observed in the thresholds (for the ordered probit model). Why? Do the two coefficients have the same interpretation? Can you point me to some literature? Thanks for your time.


Check that the two models have the same number of parameters and log likelihood. 
