Defining a new variable
 Susan Scott posted on Wednesday, November 22, 2006 - 10:39 am
I have the feeling I may be missing something very obvious, but when I add the following statement to my model

DEFINE:
sf36 = (pcs_ob + mcs_ob)/2;

with sf36 used in the following model statement:

MODEL:
HRQL BY eqsum eq5dw10 hu3m eqtherm sf36;


I get the error message:

** FATAL ERROR

THE SAMPLE COVARIANCE MATRIX COULD NOT BE INVERTED. THIS CAN OCCUR IF A
VARIABLE HAS NO VARIATION, OR IF TWO VARIABLES ARE PERFECTLY CORRELATED, OR
IF THE NUMBER OF OBSERVATIONS IS NOT GREATER THAN THE NUMBER OF VARIABLES.
CHECK YOUR DATA. THIS PROBLEM IS DUE TO:


VARIABLE : SF36

I am able to run the model with each of these variables individually, i.e., with pcs_ob and mcs_ob instead of sf36. I also tried defining sf36 as sf36 = pcs_ob, and I still get this message. So I am guessing the problem is with my DEFINE statement.

Thank you,
Susan
 Linda K. Muthen posted on Wednesday, November 22, 2006 - 10:52 am
Please send your input, data, output, and license number to support@statmodel.com. It is not obvious what the problem is without further information.
 David Bard posted on Wednesday, November 22, 2006 - 9:44 pm
It's not obvious to me why, but adding a define command to my syntax resulted in loss of observations for a model that did not even contain the variables used in the define command. The define syntax was used for a previous model and was simply left in this new model after a cut and paste. It's not a big deal, I will simply exclude define commands not needed for a given model, but can you explain why this happens? tx, db.

If it helps, here's the define syntax:
DEFINE:
citpreg = citynum*pregnant;
twkhighc = twkhigh;
if twkhigh > 5 then twkhighc = 5;
twkamhdc = twkamhd2;
if twkamhd2 > 5 then twkamhdc = 5;
formrbc = formrb;
 Linda K. Muthen posted on Thursday, November 23, 2006 - 6:35 am
I would need more information to answer this question. Please send your input, data, output, and license number to support@statmodel.com.
 Roxana Dragan posted on Monday, June 08, 2009 - 9:13 am
I want to add a random variable to an existing variable. I do not plan to do this repeatedly like in a Monte Carlo study. I just need to use a command like DEFINE y=x+e, where:
y - new variable
x - old variable
e - random variable from the normal distribution with zero mean and a standard error I can choose. Can I do this in Mplus and, if so, how? Thank you.
 Bengt O. Muthen posted on Monday, June 08, 2009 - 11:33 am
If e is latent, you don't need Define but can do it in the Model command as:

e BY;

If e is observed, Mplus does not currently offer this capability.
 PETER TOYINBO posted on Saturday, July 17, 2010 - 11:24 am
I wish to create MD_A from G12A and G15A, where G12A tests for the presence of a condition (yes=1, no=5) while G15A scores severity on a 3-level scale if 'yes' was checked on G12A but is assigned 'missing' if 'no' was checked.

I wish to combine G12A and G15A in a new 4-level scale MD_A where 0 indicates absence of condition. Below is the partial syntax and error I am getting.

USEVARIABLES ARE

G18A G18B G18C G18D G18E G18F G18G G18H G18I G18J
MD_A MD_B MD_C MD_D MD_E MD_F MD_G MD_H MD_I ;

CATEGORICAL ARE

G18A G18B G18C G18D G18E G18F G18G G18H G18I G18J
MD_A MD_B MD_C MD_D MD_E MD_F MD_G MD_H MD_I ;

MISSING=. ;

DEFINE:

IF (G12A == 1) THEN MD_A == 0 ;
IF (G12A == 5 AND G15A == 1) THEN MD_A == 1 ;
IF (G12A == 5 AND G15A == 2) THEN MD_A == 2 ;
IF (G12A == 5 AND G15A >= 3) THEN MD_A == 3 ;

IF (G12B == 1) THEN MD_B == 0 ;
IF (G12B == 5 AND G15B == 1) THEN MD_B == 1 ;
IF (G12B == 5 AND G15B == 2) THEN MD_B == 2 ;
IF (G12B == 5 AND G15B >= 3) THEN MD_B == 3 ;

....

ANALYSIS: TYPE = EFA 1 4;

PLOT: TYPE IS PLOT3 ;

*** ERROR
Variable names must begin with an alphabet character:
EQ
 Linda K. Muthen posted on Saturday, July 17, 2010 - 11:31 am
See the DEFINE option of the user's guide. You can't use == on both the right and left-hand sides of THEN. On the right-hand side use =.
 PETER TOYINBO posted on Saturday, July 17, 2010 - 12:19 pm
I spotted an error in my syntax above and corrected it to now read:

IF (G12A == 5) THEN MD_A == 0 ;
IF (G12A == 1 AND G15A == 1) THEN MD_A == 1 ;
IF (G12A == 1 AND G15A == 2) THEN MD_A == 2 ;
IF (G12A == 1 AND G15A >= 3) THEN MD_A == 3 ;

But I am still getting the same error (below) about variable names which I could not figure out :

*** ERROR
Variable names must begin with an alphabet character:
EQ

Thanks for your help.
 Linda K. Muthen posted on Saturday, July 17, 2010 - 12:34 pm
On the right-hand side use = not ==.
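
Applied to the corrected coding posted above, the advice would give something like the following sketch. It assumes MD_A is not in the data file and is created entirely by DEFINE; only the operator after THEN changes, and the remaining MD_B to MD_I blocks would follow the same pattern.

```mplus
DEFINE:
! == tests equality inside the IF condition
! =  assigns the value after THEN
IF (G12A == 5) THEN MD_A = 0;
IF (G12A == 1 AND G15A == 1) THEN MD_A = 1;
IF (G12A == 1 AND G15A == 2) THEN MD_A = 2;
IF (G12A == 1 AND G15A >= 3) THEN MD_A = 3;
```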
 Spencer James posted on Tuesday, November 16, 2010 - 11:03 am
Why does Mplus use a degree of freedom to compute a new variable? How do I ensure that the added degree(s) of freedom does not influence my measures of model fit? Thanks for your help.
 Linda K. Muthen posted on Tuesday, November 16, 2010 - 2:11 pm
I don't understand your question. Please send outputs that illustrate what you are saying along with your license number to support@statmodel.com.
 Kerry Lee posted on Monday, December 06, 2010 - 3:15 am
Dear Dr. Muthen,

Is it necessary to list both the original variables and new variables created using DEFINE in USEVARIABLES? I thought this was the case after reading the User's Guide. However, when I did this, both original and new were included in a CFA model even though only the new variables were specified in MODEL.

On a related matter, I ran the same analysis using either the original or the new variables (original/x to bring the scale back to 1 - 10). The raw bivariate correlations naturally remain the same, but the standardized CFA factor loadings and correlations are different.
Are such differences to be expected?

The difference in time needed to run the two analyses was astonishing: 37 min versus 56 sec (scaled).

Sincerely,
Kerry.
 Linda K. Muthen posted on Monday, December 06, 2010 - 6:34 am
Every variable specified on the USEVARIABLES list is included in the model to be estimated. New variables created using DEFINE must be placed on the USEVARIABLES list if they are used in the MODEL command. If any original variables are used in the MODEL command, the new variables created in DEFINE must follow them on the USEVARIABLES list.

Large variances make model convergence more difficult so this could increase the time. I would have to see the two outputs and your license number at support@statmodel.com to comment on the standardized coefficients.
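
As an illustration of the USEVARIABLES rule described in this reply, here is a minimal sketch. The file name and the variable names (y1-y3, x_raw, x_new) and the divisor are hypothetical and not taken from the original post.

```mplus
DATA:
  FILE = example.dat;            ! hypothetical file name
VARIABLE:
  NAMES = id y1 y2 y3 x_raw;     ! hypothetical variable names
  ! original variables first, the DEFINE variable last
  ! x_raw is left off because it is not used in MODEL
  USEVARIABLES = y1 y2 y3 x_new;
DEFINE:
  x_new = x_raw/10;              ! hypothetical rescaling
MODEL:
  f BY y1 y2 y3 x_new;
```

Leaving x_raw off the USEVARIABLES list keeps it out of the estimated model while still letting DEFINE read it from the NAMES list.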
 Luo Wenshu posted on Thursday, February 19, 2015 - 2:53 am
Dear Dr. Muthen,

I am using Mplus 7.3 to do a two-level analysis. I created cluster means for some observed variables and want to use these cluster means at level 2. I listed these cluster means after the original variables on the USEVARIABLES list. Running the analysis then led to an error saying that the number of records is 0.
Currently I have the DEFINE command directly following the USEVARIABLES list. How should I position USEVARIABLES and the DEFINE command?
 Linda K. Muthen posted on Thursday, February 19, 2015 - 5:40 am
DEFINE should precede or follow another command. It should not be placed among the options of another command.
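
One possible layout that follows this advice is sketched below. It assumes the cluster means are created with the CLUSTER_MEAN function of DEFINE; the file name and the variable names (school, y, x, xmean) are hypothetical. The point is that every VARIABLE option is finished before DEFINE begins.

```mplus
DATA:
  FILE = example.dat;          ! hypothetical file name
VARIABLE:
  NAMES = school y x;          ! hypothetical variable names
  USEVARIABLES = y x xmean;    ! new variable listed last
  CLUSTER = school;
  WITHIN = x;
  BETWEEN = xmean;
DEFINE:                        ! starts only after all VARIABLE options
  xmean = CLUSTER_MEAN(x);
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y ON x;
  %BETWEEN%
  y ON xmean;
```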
 Jane Doe posted on Monday, March 02, 2015 - 7:49 am
I know how to include an interaction of two latent variables in my analysis. But how about a linear combination of two latent variables?

Say, I want to define a latent variable (call it f3) which is a linear combination of two other latent variables (e.g. f3=f1+f2) and then use this f3 as (for example):

x ON z f3;

where x and z are observed variables.

How can I do this?
 Bengt O. Muthen posted on Monday, March 02, 2015 - 11:01 am
Try

f3 BY;

f3 ON f1@1 f2@1; f3@0; ! this is f3=f1+f2
f3 with f1-f2@0;

x ON z f3;
 Jane Doe posted on Monday, March 02, 2015 - 1:19 pm
Thank you. It worked.
In the meanwhile I also tried:

x On z
f1 (a1)
f2 (a2);

MODEL CONSTRAINT: a1 = a2;

This gave me identical results.

Are these doing the same thing basically?

Thanks a lot.
 Bengt O. Muthen posted on Monday, March 02, 2015 - 4:25 pm
Yes.
 Jane Doe posted on Thursday, March 12, 2015 - 11:11 am
Is it possible to use her absolute value of a latent variable. Say I want to estimate the following model:

f1 BY x1 x2 x3;
z ON f1 x4;

But in the regression of z on f1 and x4 I want to use the obsolete value of f1. Is this possible?

Thanks.
 Jane Doe posted on Thursday, March 12, 2015 - 11:13 am
The question above is full of typos! Apologies!

Is it possible to use the absolute value of a latent variable? Say I want to estimate the following model:

f1 BY x1 x2 x3;
z ON f1 x4;

But in the regression of z on f1 and x4 I want to use the absolute value of f1. Is this possible?

Thanks.
 Linda K. Muthen posted on Thursday, March 12, 2015 - 11:14 am
No, this is not possible.
 Jane Doe posted on Thursday, March 12, 2015 - 11:35 am
Ok.

Let me elaborate the question a bit. How about I have two latent variables f1 and f2. I generate a third latent variable which is the difference between these two latent variables: f3=f1-f2. (With your help I can now do this.) And I use f3 in my model further on as: x ON z f3;

But what I am interested in is the absolute difference between f1 and f2. Hence the previous question: can I use the absolute value of f3?

If the answer is still no. Then is it reasonable to save the factor scores for f3, take their absolute value and use that subsequently?

Or can I at least generate a variable that takes the value 1 if f3 is positive, 0 if f3 is 0, and -1 if f3 is negative?

Apologies for the long question.
Thanks in advance.

Thanks.
 Bengt O. Muthen posted on Thursday, March 12, 2015 - 3:35 pm
Using plausible values (sets of factor scores for each subject) and then getting the absolute difference would seem the way to go. No automatic option of the absolute kind.
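
For reference, the difference f3 = f1 - f2 mentioned in the question can presumably be set up by adapting the f3 = f1 + f2 specification given earlier in this thread, with the second slope fixed at -1. This is a sketch of that adaptation, not syntax posted by the original authors; the absolute value would still have to be computed outside the model from the saved plausible values.

```mplus
f3 BY;                     ! f3 has no indicators of its own
f3 ON f1@1 f2@-1; f3@0;    ! f3 = f1 - f2, residual variance fixed at 0
f3 WITH f1-f2@0;

x ON z f3;
```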
 Jane Doe posted on Friday, March 13, 2015 - 5:12 am
Thanks a lot. This is helpful.
 Lisa M. Yarnell posted on Wednesday, March 18, 2015 - 12:56 pm
Hi Linda and Bengt,

Is it possible to use a "by" statement on the DEFINE line--or something that will achieve the same result as a "by" statement?

For instance, I want to calculate the Black-White achievement gap separately for each of the numerous schools in my sample by calculating Black students' average achievement by school and White students' average achievement by school, and subtracting one from the other (also by school).

I will then use this variable in my model.

Is there a "by" statement available for the DEFINE line, in order to do this? I did not see one mentioned in the User's Guide.

Thank you,
Lisa
 Linda K. Muthen posted on Wednesday, March 18, 2015 - 2:29 pm
No, the DEFINE command is for observed variables only.
 Lisa M. Yarnell posted on Wednesday, March 18, 2015 - 2:52 pm
Hi Linda,
I do have School ID as an observed variable. Can you clarify?
Thank you sincerely.
 Linda K. Muthen posted on Wednesday, March 18, 2015 - 5:44 pm
There is no BY statement in DEFINE. A BY statement defines a latent variable. If you want the difference between two observed variables, say

DEFINE:
diff = y - x;
 Abbas Firoozabadi posted on Tuesday, March 24, 2015 - 5:55 am
Dear Linda,
Below I described my data and what I am working on:
My main hypothesis is: the effect of recovery during weekend (positive activation change over weekend) on health over time.
I had three measurements over 1 year. For each, I have one Health score and three positive affect scores: before the weekend (PAb), during the weekend (PAd), and at the end of the weekend (PAe). In a within-person design I have to define the slope of positive affect change over the weekend as the proxy for recovery, which in turn will be used as the predictor of Health. I have to analyze my data at two levels, between and within person:
Variables are: Health1 Health2 Health3 gender age and PAb1 PAd1 PAe1 (will define slope of recovery1)
PAb2 PAd2 PAe2 (will define slope of recovery2) PAb3 PAd3 PAe3 (will define slope of recovery3)

For %between%: intercept and slope of health: I S |Health1@0 Health2@1 Health3@2
Then: I S ON gender age
For %within%:
Health1 ON slope of recovery1
Health2 ON slope of recovery2
Health3 ON slope of recovery3
So I need to DEFINE the slope of recovery (as a new variable) by taking 3 points of positive affect over each weekend.
How can I have all of these analyses in one syntax of Mplus?
 Bengt O. Muthen posted on Tuesday, March 24, 2015 - 11:31 am
One approach is to do it in a wide, single-level format. Then you have 3*3 recovery outcomes and 3 health outcomes, so 12 columns in the data, not counting any covariates. You can formulate 3 growth models for the recoveries and let their growth factors predict the 3 health outcomes. I am not sure you need/want a growth model for the 3 health outcomes.
 IW posted on Sunday, August 23, 2015 - 4:08 pm
Is there a way to check defined variables by exporting the raw data set without having to run an analysis?
 Linda K. Muthen posted on Tuesday, August 25, 2015 - 7:29 am
You can use TYPE=BASIC; in the ANALYSIS command with no MODEL command. Then use SAVEDATA and the FILE option to save the variables.
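
A minimal sketch of such an input is shown below. The file names and the variable names are hypothetical, and the DEFINE statement simply reuses the diff example from earlier in this thread.

```mplus
DATA:
  FILE = mydata.dat;           ! hypothetical input file
VARIABLE:
  NAMES = y x;                 ! hypothetical variable names
  USEVARIABLES = y x diff;     ! defined variable listed last
DEFINE:
  diff = y - x;
ANALYSIS:
  TYPE = BASIC;                ! descriptive statistics only, no MODEL
SAVEDATA:
  FILE = check.dat;            ! hypothetical output file
```

The saved file can then be opened in any text editor or statistics package to inspect the values of the defined variable.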