Latent Profile Analysis - Normalising...
Mplus Discussion > Latent Variable Mixture Modeling
 Asher Lederman posted on Tuesday, September 05, 2017 - 7:36 pm

I am performing a latent profile analysis of 15 items that were selected from 3 different questionnaires. They are based on Likert scales with 4, 5, and 6 response options respectively.

How does Mplus handle items with different scales? I am concerned that having different scales for different items means that the means and variances that make up the class response patterns would differ across classes not only because of inter-class differences in response patterns, but also because of the differing scale ranges.

If this is indeed a problem the user needs to address, how would you suggest normalising/standardising each item's data before inputting it into Mplus? Or does Mplus do some form of item normalisation/standardisation for the user?

Most of my item data shows a skewed distribution (in the univariate/total-sample sense), so I am not sure how I would normalise it without changing its overall shape, which (I assume) I want to retain.

 Bengt O. Muthen posted on Wednesday, September 06, 2017 - 4:41 pm
If you are not comparing LPA results across the 3 questionnaires, I don't see that the differences in scales matter. But you are right that having different scales makes it hard to compare LPA results.

Yes, defining the variables is something the user has to do.

You can treat the variables as categorical (ordinal) to deal with the skewness (perhaps there are also floor/ceiling effects, which would be handled that way).
 Asher Lederman posted on Wednesday, September 06, 2017 - 8:41 pm
Thanks for the quick reply, Bengt.

Just to clarify, this LPA model contains 15 items derived from three different questionnaires.

For this set of items, I am not comparing LPA models (i.e. 2-class vs 3-class, or 2-class constrained vs 2-class unconstrained). I have accepted that a 2-class model is the best fit, based on theory.

So, do I not need to standardise the variables?

What I will be doing is comparing this 2-class LPA model to a completely different 2-class LPA model (which is based on items from a single scale). It turns out that the two classes in both models classify essentially the same cases, and relate to the same theoretical latent variable.

In the comparison of the LPA models (three questionnaire LPA model vs one questionnaire LPA model), I will only be comparing the models in their distribution of posterior probabilities within classes. It is a test of the measurement precision for classes in each model, using different indicators.

So, can you confirm that I don't need to normalise the 2-class LPA model comprised of items from 3 questionnaires?

 Bengt O. Muthen posted on Thursday, September 07, 2017 - 3:21 pm
I don't think standardization or normalization is helpful here - I would not do that.
 Asher Lederman posted on Thursday, September 07, 2017 - 6:07 pm
Ok, thanks. One additional thought though.

If I make all the scales equal in size by simply multiplying the item scores (so they all run up to 7 in my case), is that ok? Does that make sense?

An item whose scale was 1-4 would be multiplied by 7/4, etc.
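(To make the idea above concrete, here is a minimal sketch in plain Python of that multiplicative rescaling; the function name and example values are my own illustration, not anything Mplus does. Note that multiplying only equalises the maxima, not the full ranges: a 1-4 item becomes 1.75-7 rather than 1-7.)

```python
# Sketch of multiplicative rescaling to a common maximum of 7.
# Each item is multiplied by 7 / (its own scale maximum), so every
# item tops out at 7; the minimum shifts too (1-4 becomes 1.75-7).

def rescale_to_max(scores, item_max, target_max=7):
    """Multiply raw item scores by target_max / item_max."""
    factor = target_max / item_max
    return [s * factor for s in scores]

item_1to4 = [1, 2, 3, 4]             # e.g. a 4-point Likert item
print(rescale_to_max(item_1to4, 4))  # [1.75, 3.5, 5.25, 7.0]
```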
 Bengt O. Muthen posted on Sunday, September 10, 2017 - 1:16 pm
That should be ok, but it should not change the class formation.
 J.D. Haltigan posted on Tuesday, September 11, 2018 - 4:42 pm
Related question:

I am doing an LPA with 14 outcomes and 5 covariates. I am running both R3STEP and regular LCA without R3STEP using the MODEL statement.

The R3STEP model runs with no problems.

The latter model (regular LCA with covariates) never replicates the LL. I am convinced this is because one of the covariates is on a massively different scale than the other variables, and that this makes estimation of the class posterior probabilities more cumbersome (which wouldn't be the case using R3STEP). This variable is birthweight in grams. The other covariates are the usual demographics, including sex, ethnicity, etc. The indicators are all continuous, on either 1-5 or 1-9 scales.

Do I need to transform birthweight in grams, do you think? Or would this not matter? Looking at the LL list, it replicates quite a bit until the best LL value right at the end. I'm puzzled.
 Bengt O. Muthen posted on Wednesday, September 12, 2018 - 4:57 pm
Q1: Try it.

Perhaps you also have direct effects.
 J.D. Haltigan posted on Wednesday, September 12, 2018 - 9:46 pm
So z-scoring the birthweight-in-grams variable did the trick (the best LL replicated many times). I'm wondering, though, if there is a 'short' answer as to what putting this variable on a scale more like the other covariates (sex, ethnicity, age in years) does to achieve the desired outcome. My thinking is that because the original variable was on such a massively different metric, a global solution would almost never be reached (i.e., the grams variable is in essence its own local solution). Is this reasoning in the ballpark at all?
 Bengt O. Muthen posted on Friday, September 14, 2018 - 1:32 pm
It has to do with numerical precision, which is higher when all variables have variances of similar size.
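(For readers following along, a minimal sketch of the z-scoring step that resolved the replication problem above; the birthweight values here are hypothetical, and the helper is my own illustration rather than anything in Mplus itself.)

```python
import statistics

def z_score(values):
    """Centre values at 0 and scale to unit (sample) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n-1 denominator
    return [(v - mean) / sd for v in values]

# Hypothetical birthweights in grams: raw variance in the hundreds of
# thousands, dwarfing 0/1 demographic covariates and 1-9 indicators.
bw = [2500, 3100, 3300, 3600, 4100]
bw_z = z_score(bw)
# After z-scoring, mean is 0 and variance is 1, comparable in size to
# the other covariates, which helps the optimizer's numerical precision.
```

I believe the equivalent can also be done inside Mplus via the DEFINE command's STANDARDIZE option, rather than preprocessing the data externally.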