
Bill Roberts posted on Wednesday, February 27, 2002  2:42 pm



I am in the process of testing the measurement model that will be used when I test structural relationships between latent variables of interest. If I mix binary and continuous dependent variables together as measures of a latent factor and specify that the binary dependent variables are categorical, what problems or misunderstandings does this method lead to? For example, I tried running Mplus using the default analysis procedure (generalML), treating both the binary and continuous variables as continuous. I then ran the analysis again, where the only change was to specify that the binary variables are categorical. I think that Mplus automatically switches to WLSMV according to the table on page 38. Comparing the results of these two analyses, model fit was about the same, but the regression coefficients between the latent factor and the dependent variables were much higher when the dichotomous variables were specified as categorical. What do you recommend?


When you treat all of the variables as continuous, the factor loadings are ordinary linear regression coefficients. When you treat them as categorical, the factor loadings are probit regression coefficients and therefore not on the same scale. I would generally recommend treating dichotomous items as categorical, particularly if they are far from a 50/50 split. The loadings are higher when you treat the variables as dichotomous because correlations are attenuated when categorical variables are treated as continuous.
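To illustrate why the split matters, here is a minimal sketch of the attenuation for one continuous variable and one binary variable. It assumes the binary variable arises by cutting a standard-normal latent response at a threshold, which is the usual probit setup but is not stated in the reply. The function name and the example value of 0.5 for the latent correlation are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def observed_corr_with_binary(rho, p):
    """Correlation seen when a binary item is treated as continuous.

    rho: correlation between the continuous variable and the latent
         (normal) response behind the binary item.
    p:   proportion of cases scoring 1 on the binary item.
    Assumes bivariate normality of the continuous variable and the
    latent response (an assumption made for this sketch).
    """
    std_normal = NormalDist()
    tau = std_normal.inv_cdf(1 - p)  # threshold that yields proportion p
    return rho * std_normal.pdf(tau) / sqrt(p * (1 - p))


# The further the split is from 50/50, the stronger the attenuation.
for p in (0.5, 0.7, 0.9, 0.95):
    print(p, round(observed_corr_with_binary(0.5, p), 3))
```

Even at a 50/50 split the observed correlation falls below the latent one, and it shrinks further as the split becomes more extreme.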

Bill Roberts posted on Thursday, February 28, 2002  7:58 am



If I fix the scale of the latent factor to the continuous variable and include categorical (binary) indicators, I will be using indicators measured on different scales. Is it correct that Mplus will compute regression coefficients for the continuous variable using ordinary linear regression and compute regression coefficients for the binary variables using probit regression? Thanks for your help.


Mplus can accommodate a combination of continuous and categorical factor indicators. The factor loadings for the categorical indicators are probit regression coefficients. The factor loadings for the continuous indicators are ordinary linear regression coefficients. The estimator is not OLS but WLSMV. These will result in the same estimates in large samples.
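The two kinds of loading can be sketched as two different measurement equations. This is an illustration only, not Mplus's estimation routine: the function and parameter names are hypothetical, and the probit part assumes the latent response has unit residual variance.

```python
from statistics import NormalDist


def expected_continuous(eta, intercept, loading):
    """Linear measurement equation: E[y | eta] = intercept + loading * eta."""
    return intercept + loading * eta


def prob_binary(eta, threshold, loading):
    """Probit measurement equation: P(u = 1 | eta) = Phi(loading * eta - threshold).

    Assumes unit residual variance for the latent response
    (an assumption made for this sketch).
    """
    return NormalDist().cdf(loading * eta - threshold)


# A one-unit change in the factor shifts y by `loading` units of y,
# but shifts the binary item by `loading` units on the probit (z) scale.
print(expected_continuous(1.0, 0.0, 0.8))
print(prob_binary(1.0, 0.0, 1.2))
```

Because the first loading is in the units of the observed variable and the second is in z-score units of a latent response, their magnitudes are not directly comparable.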

ncg posted on Thursday, October 29, 2009  10:32 am



Are there any situations where you recommend treating binary variables as continuous? In another post, you referenced small sample size as a reason to do this. I have 199 observations and am using WLSMV. My model looks like this: L1 by f1 f2 f3 f4 f5; X1 on L1; Y1 on X1 X2 X3 X4 X5; I am trying to determine the best way to treat binary variable X1, which does have about a 50/50 split.


I would treat it as categorical. With a 50/50 split, a true correlation of .5 is attenuated to .33.
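The reply does not say how the .33 figure was obtained. One calculation that reproduces it, assuming both variables are binary, each split at the median of a normally distributed latent response, is the standard relation between the latent correlation and the correlation of the two 0/1 variables. The function name below is hypothetical.

```python
from math import asin, pi


def corr_of_median_split_pair(rho):
    """Correlation between two 0/1 variables formed by median splits.

    rho is the correlation of the underlying bivariate-normal latent
    responses. Valid only for 50/50 splits on both variables
    (an assumption made for this sketch).
    """
    return (2.0 / pi) * asin(rho)


print(round(corr_of_median_split_pair(0.5), 2))
```

For a latent correlation of .5 this gives one third, which matches the attenuated value quoted above.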

Mohamad K posted on Saturday, November 18, 2017  12:53 pm



Dear Linda Muthen, First, thank you everyone for this informative discussion. I am trying to compute a socioeconomic index score from various social, economic, and educational variables such as employment (yes/no), income (categories), type of job, education (four levels), position at work, and car ownership (yes/no). These variables are a mixture of types. Does it work to treat the whole set as continuous? I treated them as continuous in a latent factor analysis in Mplus, and the model fit amazingly well once we added a correlation between education and income, but I am not certain whether that would be acceptable.


I wouldn't treat categorical variables as continuous. You may want to ask on SEMNET. 

Jordanize posted on Monday, November 20, 2017  11:23 pm



Thank you, Bengt O. Muthen.

Jordanize posted on Monday, November 20, 2017  11:29 pm



I am sorry that I did not clarify in my previous message that I first rescaled these categorical variables with the optimal scaling suite in SPSS before testing them in a CFA, assuming that a socioeconomic factor exists and affects them all unequally. The resulting factors were presumed to be rescaled metric variables. I will post the question on SEMNET, but I appreciate your feedback too. Many thanks in advance, Bengt. Best regards, MK
