

Correlation or covariance?


Subert Wu posted on Wednesday, January 03, 2001  5:57 pm



When performing a Monte Carlo simulation with categorical indicators, I create a data set. Should it be a correlation or a covariance matrix? I consulted the Mplus User's Guide, but I am still confused. Chapter 3 states that only a correlation matrix can be analyzed if at least one variable is categorical. But according to Chapter 12, the input file must contain the means and variance/covariance structure. Please help me solve the problem.

bmuthen posted on Wednesday, January 03, 2001  9:13 pm



A distinction is needed between population values, raw data generated from the population, and sample statistics analyzed. Chapter 12 says that the population quantities that data are generated from are means, variances and covariances involving the continuous latent response variables y* behind the categorical y's. In the analysis, the sample raw data generated from this population are then used to compute the sample correlations and weight matrix elements. 
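The three levels described above (population values, generated raw data, sample statistics) can be illustrated with a minimal sketch for two binary indicators. It is written in plain Python rather than Mplus, and the function name, the threshold values, and the population numbers are all hypothetical choices made for illustration; it does not reproduce the Mplus Monte Carlo facility.

```python
import math
import random


def generate_binary_pair(n, mean, var, cov, thresholds, seed=0):
    """Draw n observations of (y1*, y2*) from a bivariate normal population
    and cut each latent response at its threshold to obtain binary y's."""
    rng = random.Random(seed)
    sd1, sd2 = math.sqrt(var[0]), math.sqrt(var[1])
    rho = cov / (sd1 * sd2)  # population correlation implied by the covariance
    data = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Population level: continuous latent responses y* with the chosen
        # means, variances, and covariance.
        y1_star = mean[0] + sd1 * z1
        y2_star = mean[1] + sd2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        # Raw data level: only the categorized y's are observed.
        data.append((int(y1_star > thresholds[0]), int(y2_star > thresholds[1])))
    return data


# Hypothetical population values: variances 1 and 2, covariance 0.7.
sample = generate_binary_pair(
    n=500, mean=(0.0, 0.0), var=(1.0, 2.0), cov=0.7, thresholds=(0.0, 0.5)
)
```

The analysis step would then compute sample correlations for the latent responses from `sample`; that estimation is not shown here.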

Anonymous posted on Wednesday, March 14, 2001  7:36 am



The population quantities that the data are generated from are means, variances, and covariances, so we know the true population values of the model parameters that relate to the covariances. But in a Monte Carlo simulation with WLS or WLSMV, the sample correlations and weight matrix elements are what is used for estimation. The true value of a correlation does not equal the true value of the corresponding covariance, so the model results in the output do not match the true values in the covariance metric. We only know the true values of the covariances; how do we know the true population values of the correlations?


If you choose population parameter values that give unit variances for the covariance matrix used to generate the data, then you are generating from a correlation matrix. With unit variances your estimates will be computed in the metric of the population parameters you chose. If you don't have unit variances you can scale your estimates back to the correct metric. 
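As a rough illustration of the rescaling mentioned above, the sketch below converts a population covariance matrix for the y* variables into the corresponding correlation matrix and standardizes the thresholds to match. The function name and the numbers are hypothetical, and the sketch only covers covariances and thresholds; how other model parameters rescale depends on the model and is not shown.

```python
import math


def to_correlation_metric(cov, thresholds, means=None):
    """Rescale a y* covariance matrix to a correlation matrix and
    standardize the thresholds so the category probabilities are unchanged."""
    p = len(cov)
    if means is None:
        means = [0.0] * p
    sds = [math.sqrt(cov[i][i]) for i in range(p)]
    # Correlation: covariance divided by the product of standard deviations.
    corr = [[cov[i][j] / (sds[i] * sds[j]) for j in range(p)] for i in range(p)]
    # P(y* > tau) is unchanged when y* and tau are standardized together.
    std_thresholds = [(thresholds[i] - means[i]) / sds[i] for i in range(p)]
    return corr, std_thresholds


# Hypothetical population values with non-unit variances.
cov = [[1.0, 0.7],
       [0.7, 2.0]]
corr, taus = to_correlation_metric(cov, thresholds=[0.0, 0.5])
```

With unit variances on the diagonal, `corr` is identical to `cov`, which is the case where the estimates are already in the metric of the chosen population values.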

Anonymous posted on Wednesday, August 14, 2002  7:46 am



I ran an SEM to examine the correlations between the total score of a test scale and its 41 items (also called "Corrected Item-Total Correlation (CITC) analysis"). My concern is the internal consistency of the scale. Because each item is binary, I wrote the syntax as follows:

VARIABLE: NAMES ARE y1-y41 totalscore;
          CATEGORICAL ARE y1-y41;
MODEL:    totalscore WITH y1-y41;
ANALYSIS: TYPE IS GENERAL;
          ESTIMATOR IS WLSMV;
          ITERATIONS = 1000;
          CONVERGENCE = 0.00005;

As a result, I got the following warning message:

Serious computational problems occurred in the bivariate estimation of the correlation for variables totalscore and y4. Check your data. If the program recovers for this pair of variables (see technical 6 output), the estimates are valid. The problem occurred for the following observation(s): observation 10

I don't understand what the message means. I would appreciate any suggestions. If there are alternative methods, I would like to know about them.


The message means that when Mplus tried to estimate the correlation between totalscore and y4, there were computational problems. If the program continued and you got results, they are trustworthy. If the program was not able to continue, then you need to look at the distribution of y4 to see what the problem is.
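One possible way to look at the distribution of an item such as y4 outside Mplus is sketched below in plain Python. The helper name and the toy data are hypothetical, and the sketch only tabulates category frequencies and the mean total score per category; it does not diagnose the cause of the warning by itself.

```python
from collections import Counter
from statistics import mean


def describe_binary_item(item, total):
    """Summarize a binary item: count, proportion, and mean total score
    for each observed category."""
    counts = Counter(item)
    n = len(item)
    summary = {}
    for category in sorted(counts):
        scores = [t for y, t in zip(item, total) if y == category]
        summary[category] = {
            "n": counts[category],
            "proportion": counts[category] / n,
            "mean_total": mean(scores),
        }
    return summary


# Hypothetical toy data: the item is almost constant and the single
# endorsing case has an unusual total score.
y4 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
totalscore = [12, 15, 9, 20, 18, 11, 14, 16, 13, 40]
summary = describe_binary_item(y4, totalscore)
```

A category with very few observations, or a single case with an extreme total score, would be an obvious thing to inspect first, together with the observation number listed in the warning.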


