Path Analysis using categorical laten...
Mplus Discussion > Multilevel Data/Complex Sample >
 aparthan posted on Tuesday, October 14, 2003 - 2:48 pm

I would like to know if I can use path analysis/SEM in this case.

Outcome variable: Categorical latent factor (dichotomous) measured using two categorical observed variables (dichotomous).

Indicator variables: 13 categorical variables (ordinal/categorical)

I am using a national database which has multistage sampling.

I would like to know if Mplus can handle a multistage sample in a similar manner to SUDAAN/Stata.

And am I violating any assumptions if I run path analysis/SEM using Mplus (provided Mplus can handle a clustered sample design)?

 bmuthen posted on Tuesday, October 14, 2003 - 3:25 pm
The outcome variable scenario can be handled by Mplus, but can you please clarify what you mean by "Indicator variables"?
Mplus offers computation of SEs that take into account non-independence of observations due to cluster sampling. Sampling weights can also be used. Stratification can be handled via multiple-group analysis or using covariates. SE adjustments due to multistage sampling in line with SUDAAN are not in the current version, but a project is due to start on implementing that in a future version. However, the allowance for sampling weights and clustering should go a long way towards dealing with the complex sampling.
 aparthan posted on Tuesday, October 14, 2003 - 5:27 pm

What I mean by indicator variables is predictor variables, that is, observed variables.

And like I mentioned earlier, these variables are categorical.
 bmuthen posted on Tuesday, October 14, 2003 - 5:36 pm
Yes, adding such covariates is not a problem.
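A minimal sketch of how the setup discussed above might look as an Mplus input, assuming a continuous factor behind the two binary indicators (the original question could also be read as a latent class variable) and using the TYPE = COMPLEX syntax from later Mplus versions. The file name and all variable names (u1, u2, x1-x13, psu, wt) are hypothetical.

```
TITLE:    complex-sample sketch (all names are hypothetical)
DATA:     FILE IS mydata.dat;
VARIABLE: NAMES ARE u1 u2 x1-x13 psu wt;
          USEVARIABLES ARE u1 u2 x1-x13;
          CATEGORICAL ARE u1 u2;   ! the two binary indicators
          CLUSTER IS psu;          ! primary sampling unit
          WEIGHT IS wt;            ! sampling weight
ANALYSIS: TYPE = COMPLEX;          ! SEs adjusted for clustering
MODEL:    f BY u1 u2;              ! latent outcome (assumed continuous)
          f ON x1-x13;             ! 13 observed covariates
```

The covariates enter as observed predictors, so they are not listed under CATEGORICAL; that option applies to dependent variables only.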
 aparthan posted on Wednesday, October 15, 2003 - 8:10 am

Any idea when the newer version of Mplus with SE adjustments due to multistage sampling in line with SUDAAN will be released?

Thanks once again for your prompt responses to all my questions.
 bmuthen posted on Wednesday, October 15, 2003 - 8:52 am
That's a long-term project just about to start, so this is several years into the future.
 Anonymous posted on Thursday, August 19, 2004 - 5:44 am
Dear Linda and Bengt,

I have a model containing an endogenous (and observed) binary variable as well as other dependent variables with an ordinal scale. I ran this model with AMOS. But since AMOS isn't able to estimate models with dichotomous endogenous variables, I ran the same model with Mplus and the results are quite similar, but only if I define only my binary endogenous variable as CATEGORICAL. As soon as I define the ordinal indicators (5 to 6 categories) as CATEGORICAL, I have severe convergence problems. The program gives me the following hint:


Is it necessary to define those ordinal variables (not the binary outcome variable), which (theoretically) have an underlying continuous scale, as CATEGORICAL?

Thank you in advance
 Linda K. Muthen posted on Thursday, August 19, 2004 - 9:46 am
I would need to see your full output. You can send it to Please include TECH1 and SAMPSTAT in the OUTPUT command.
 Anonymous posted on Thursday, December 30, 2004 - 9:29 am
Hi there,

I would like to know whether Mplus has a subpopn or subpop option (like SUDAAN or Stata) for analyzing subpopulations of a national dataset. I believe that in order to have correct variance estimates (using the stratification and weight options) I will need to define a subpopulation so the software can correctly handle the primary sampling units (PSUs) that are missing in each level of the stratification variable. Any ideas as to how I could do this in Mplus?

 Linda K. Muthen posted on Friday, December 31, 2004 - 4:45 pm
There are two ways to do subpopulation analysis in Mplus. One way is to use the USEOBSERVATIONS command. The second way is to zero out the weights for non-subpopulation members, like this:

WEIGHT = w0;
if (in subpopulation) w0=w else w0=0.0000001;

The second method is equivalent to the subpop command in SUDAAN. We are currently investigating the merits of the two alternatives.
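The two alternatives above could be sketched roughly as follows. This is an illustration only: the variable names (y1, y2, w, insub) are hypothetical, and the two options are meant as alternatives, not to be combined in one input.

```
! Option 1: keep only subpopulation members in the analysis
VARIABLE: USEVARIABLES ARE y1 y2;
          USEOBSERVATIONS ARE insub EQ 1;
          WEIGHT IS w;

! Option 2: keep everyone, but give non-members a near-zero weight
VARIABLE: USEVARIABLES ARE y1 y2 w0;     ! variables created in DEFINE go last
          WEIGHT IS w0;
DEFINE:   IF (insub EQ 1) THEN w0 = w;
          IF (insub EQ 0) THEN w0 = 0.0000001;
```

With Option 2 all cases stay in the file, so the reported sample size is that of the full sample, not of the subpopulation.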
 Anonymous posted on Sunday, January 02, 2005 - 1:08 pm
Hi there,

Thank you for your message above. I have a quick follow up question. I tried the second way (zero out weights) which resulted in a weird 'n'. I have 15,214 people in my sample. My subpopulation contains 834 people with complete data on variables of interest (used in usevars command). The n displayed in my output is 11,585. In addition, everyone in my full and subpopulation has complete data for stratification, weights, and cluster vars. I am wondering how Mplus calculated this n (11585). Is the analysis using only the 834 people for the LCA and using a random sample to calculate the weights? I am confused why my n is not 834. If I ignore my subpopulation the N for complete data in the full sample should be 12,419. Could you help by explaining to me what is going on? Here is my code...


TITLE: mixture 2-class
DATA: FILE IS subyouth.dat;

plan attempt lbinge dbinge rbinge lalcfreq dalcfreq ralcfreq
lfight dfight rfight dep sex rage dage qnumatt lnumatt medatt
frace nrace psu stratum weight subpop asubpop;

stratification = stratum;
USEVARIABLES ARE si plan dep lnumatt lbinge lfight w0;

classes= c (2);
weight is w0;
cluster is psu;

categorical = si plan dep lfight lbinge lnumatt;

if (asubpop eq 1) then w0 eq weight;
if (asubpop eq 0) then w0 = .0000000001;

ANALYSIS: TYPE = mixture complex;

thanks in advance!
 LMuthen posted on Monday, January 03, 2005 - 8:40 am
Can you send your input/output and data to so we can see what is happening?
 Linda K. Muthen posted on Monday, January 03, 2005 - 12:10 pm
There's an error in the DEFINE command. If you change

if (asubpop eq 1) then w0 eq weight;

to

if (asubpop eq 1) then w0 = weight;

The sample size should be the full size of the population.
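The counts in the question are consistent with this diagnosis: 12,419 - 834 = 11,585, which is what one would expect if w0 was never assigned for the 834 subpopulation members, so that those cases were dropped and only the near-zero-weight cases remained. That reading is an inference from the reported numbers and is not confirmed in the thread. A sketch of the corrected statements, assuming they sit under a DEFINE: heading (the heading is not visible in the posted input):

```
DEFINE:
  IF (asubpop EQ 1) THEN w0 = weight;        ! "=" assigns; "EQ" only compares
  IF (asubpop EQ 0) THEN w0 = .0000000001;   ! near-zero weight for non-members
```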
 Filipa Alexandra da Costa Rico Cala posted on Thursday, July 07, 2016 - 7:22 am
Dear Linda,

In my data, my dependent variable is a latent dichotomous variable measured by 9 dichotomous items. In fact, this latent variable is the sum of these 9 items. I would like to run a regression with this variable to see which predictors are most important. However, I have a question: is it better to first sum the 9 items in more basic software (such as SPSS) and then use the summed variable in Mplus for a logistic regression, or is it better to skip the sum and use the measurement model, that is, to write F1 BY I1-I9 and then F1 ON X1 (where X1 is the independent variable) in the MODEL command? In the latter case, can I also conduct a logistic regression?
Many thanks in advance for all your help,
 Bengt O. Muthen posted on Thursday, July 07, 2016 - 9:49 am
You say first that your latent variable is dichotomous and then that it is a sum/continuous. I will answer as if the latter.

If you find that your measurement model has good fit, I would use the F1 approach you mention. If you use ML you can use the default logit link so that the regressions of the 9 indicators on the factor are logistic regressions. The regression of the factor on the x's is of course linear.
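A minimal sketch of the F1 approach described above, assuming a data file that holds only the nine items and one predictor. The file name is hypothetical; the variable names follow the question.

```
TITLE:    factor measured by 9 binary items, regressed on x1
DATA:     FILE IS items.dat;       ! hypothetical file name
VARIABLE: NAMES ARE i1-i9 x1;
          CATEGORICAL ARE i1-i9;   ! the 9 dichotomous items
ANALYSIS: ESTIMATOR = ML;          ! logit link is the default with ML
MODEL:    f1 BY i1-i9;             ! item-on-factor regressions are logistic
          f1 ON x1;                ! factor-on-covariate regression is linear
```

Without ESTIMATOR = ML, Mplus would use its default weighted least squares estimator for categorical items, which works with a probit link instead of a logit link.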