
Jon Heron posted on Tuesday, February 27, 2007  3:22 am



I have a 4-level latent outcome, derived from an LCA, and wish to fit some multinomial models using this. I have fixed the thresholds for these trajectories so that inclusion of covariates will not affect the trajectory shapes. Some of my predictors are categorical, e.g. gender. To estimate the univariable relationship between gender and the outcome, I have simply been adding gender to the usevariables and categorical lists within the variable section. This gives me a gender ratio within each class and also the odds ratio for girls versus boys for each class referenced to the 4th class, which is just what I want. If I was just to fit a manifest logistic model, it would not matter if I included gender as a continuous or a binary variable; the estimates would be the same. However, if I fit (what I believed to be) the same univariable model, but remove gender from the categorical list and add the line: c#1 c#2 c#3 on gender; I get a completely different answer for the effect of gender. Why is this different, and what is the meaning of the new set of estimates? (The motivation for this was that I wished to include a number of continuous predictors in my multinomial model, so it seemed more appropriate to have all predictors specified within the ON statement.) 


In most cases, the predictors should not be placed on the CATEGORICAL list. This option is for dependent variables. I would have to see the two analyses to comment on why you get different results. If you would like me to do this, please send your inputs, data, outputs, and license number to support@statmodel.com. 

Jon Heron posted on Thursday, March 01, 2007  1:11 am



Thanks Linda, before we explore that option, perhaps I can ask another question which might help me get to the bottom of this. Simple example with my 4-class latent variable plus gender. I want to derive 3 odds ratios for gender, with class 4 as the reference category. Odds ratios don't distinguish between dependent and independent variables; it's simply an association. Seems to me that if I fix the trajectories and then add gender to the mix (as a categorical variable), this is simply a contingency table and that I should get exactly the same results were I to save out the posterior probabilities and match in gender for a simple analysis in a standard stats package. The advantage of doing this within Mplus is that it's much more straightforward to model the latent variable rather than resorting to modal classes. Is there a big flaw in my logic here? 

Jon Heron posted on Thursday, March 01, 2007  1:32 am



There IS a flaw, but I'm not sure if it's a major one. By modelling this contingency table within Mplus I am allowing the posterior probabilities to vary between genders whereas if I export the data to merge with gender outside Mplus, the probs are constant within each pattern. Hmm 


We do not advocate putting exogenous independent variables on the CATEGORICAL list. This is not what is done in regression, linear or logistic. If you get very different results when you do this, we would need to see more information as I mentioned earlier. Doing the analysis in two steps will introduce estimation errors in your parameter estimates and their standard errors because people are not in one class during model estimation. They are proportionally in all classes. Therefore, using most likely class membership where they are assigned to one class is not correct. 
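
The contrast between most likely class membership and proportional membership can be sketched outside Mplus. The following is a minimal illustration only: the posterior probabilities are made up, and the function names `modal_counts` and `fractional_counts` are hypothetical, not part of any package.

```python
# Made-up posterior class probabilities for 4 subjects and 4 classes
# (each row sums to 1). These values are purely illustrative.
posteriors = [
    [0.70, 0.20, 0.05, 0.05],
    [0.40, 0.35, 0.15, 0.10],
    [0.10, 0.45, 0.40, 0.05],
    [0.05, 0.05, 0.30, 0.60],
]


def modal_counts(post):
    """Assign each subject wholly to the most likely class, then count."""
    counts = [0] * len(post[0])
    for row in post:
        counts[row.index(max(row))] += 1
    return counts


def fractional_counts(post):
    """Let each subject contribute to every class in proportion to its
    posterior probability, then sum the contributions per class."""
    return [sum(row[k] for row in post) for k in range(len(post[0]))]


modal = modal_counts(posteriors)
fractional = fractional_counts(posteriors)
print("modal class counts:     ", modal)
print("fractional class counts:", [round(x, 2) for x in fractional])
```

In this toy example the third class receives no subjects under modal assignment, although it carries a non-trivial share of the fractional counts. This is the kind of distortion that hard assignment can introduce.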

Jon Heron posted on Friday, March 02, 2007  2:23 am



Thanks Linda, it looks like we may need to go down the route of my sending some files, if that is OK with you. Just to pick up on your last point: rather than using the most likely class, my method was to actually use the posterior probabilities within a weighted regression analysis (e.g. convert a 4-class model on 10,000 kids into a dataset of 40,000 observations and a set of probabilities). I think this should avoid the introduction of estimation errors. 
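
The expansion described in this post can be sketched as follows. This is a hedged illustration, not the poster's actual analysis. The records are invented, gender is assumed to be coded 1 = girl and 0 = boy, and the helper names `expand`, `weighted_table`, and `odds_ratios` are hypothetical. With a single binary predictor, the odds ratios from the weighted contingency table coincide with those of a weighted multinomial logit, so no regression routine is needed here. Only point estimates are produced; nothing is claimed about standard errors.

```python
# Invented data: (gender, posterior probabilities for classes 1..4).
# Gender coding (1 = girl, 0 = boy) is an assumption for illustration.
records = [
    (1, [0.6, 0.2, 0.1, 0.1]),
    (1, [0.1, 0.5, 0.2, 0.2]),
    (0, [0.2, 0.2, 0.2, 0.4]),
    (0, [0.1, 0.1, 0.4, 0.4]),
]


def expand(recs):
    """Turn each subject into one row per class, weighted by the
    posterior probability of that class (N subjects -> N * K rows)."""
    return [(g, k, p) for g, probs in recs for k, p in enumerate(probs)]


def weighted_table(rows, n_classes):
    """Sum the weights into a gender-by-class contingency table."""
    table = {0: [0.0] * n_classes, 1: [0.0] * n_classes}
    for g, k, w in rows:
        table[g][k] += w
    return table


def odds_ratios(table, ref):
    """Odds ratio for girls versus boys of class k against class `ref`."""
    return {
        k: (table[1][k] / table[1][ref]) / (table[0][k] / table[0][ref])
        for k in range(len(table[0]))
        if k != ref
    }


long_rows = expand(records)                 # 4 subjects x 4 classes = 16 rows
table = weighted_table(long_rows, n_classes=4)
ors = odds_ratios(table, ref=3)             # index 3 is class 4, the reference
for k, value in ors.items():
    print(f"class {k + 1} vs class 4: OR = {value:.3f}")
```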


Sounds like your approach is letting subjects have fractional class probabilities, as they should, which avoids the parameter estimate bias. However, the SE biases will still be there. Although perhaps useful as an exploratory technique, a better way to approach having a covariate influence a latent class variable is to do the analysis in one step. When requesting Residual in the output, you can see how the covariate mean changes over the classes. When including a covariate, such as gender, changes the solution, then that is an indication of lack of measurement invariance, i.e. direct effects from gender to the indicators. P.S. The fractional membership approach does not have to be done via expansion of the data set; just use the Training = ...(Membership) option described in the UG under TRAINING. 

Jon Heron posted on Monday, March 05, 2007  1:08 am



Hmm, all 'residual' appears to do is give me within-class univariate and bivariate distributions for the u variables. Am I missing something here?

variable:
    names <snip>;
    classes = c (4);
    categorical = ds_kk ds_km ds_kp ds_kr ds_ku;
    usevariables ds_kk ds_km ds_kp ds_kr ds_ku sex;
    missing are ds_kk ds_km ds_kp ds_kr ds_ku (999);
analysis:
    type = mixture;
    type = missing;
    starts = 500 250;
    stiterations = 10;
    stscale = 15;
model:
    %OVERALL%
    c#1 c#2 c#3 on sex;
output:
    cint;
    residual;


It gives model estimated values and the difference between observed and expected. What would you expect it to give you? 

Jon Heron posted on Tuesday, March 06, 2007  12:35 am



To quote Bengt above: "how the covariate mean changes over the classes" 


If you have a covariate that is not on the CATEGORICAL list, you will also get that. When you put a covariate on the CATEGORICAL list, it is treated as a dependent variable. 

Jon Heron posted on Tuesday, March 06, 2007  7:53 am



Not in the categorical list here (I've learnt my lesson). There's no mention of my covariate in the residual output. 


Then you will need to send the whole output to support@statmodel.com so I can see exactly what you are doing. 
