Set intercept to first measurement oc...
 Andrea Hasl posted on Monday, July 27, 2020 - 5:03 am
Dear Dr. Muthen,

If I understand correctly, the intercept term in DSEM estimation denotes the mean of each individual's time series. Is it possible to set the intercept to the first measurement occasion (t0) instead? I am interested in the covariation between the intercept at t0 and the growth term.

Thank you very much,
Andrea Hasl
 Tihomir Asparouhov posted on Monday, July 27, 2020 - 10:43 am
If Yt is the original variable you can form two new variables
W=Y0
Zt=Yt-Y0
Then, model W as a between variable and Zt via DSEM. If you want the intercept to be Y0 you can specify Zt as within only (but I don't think you have to). You can also add the needed correlations on the between level.
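
The construction of W and Zt can be sketched outside Mplus as follows. This is only a minimal illustration in plain Python: the (id, t, Y) row layout, the use of None for a missing value, and the function name are assumptions, not part of the original data.

```python
# Hypothetical long-format records: (person id, time index, Y).
# None marks a missing measurement (an assumption of this sketch).
def center_on_first_occasion(rows):
    """Return rows of (id, t, Z, W) with W = Y at t = 0 and Z = Y - W."""
    # W: the value recorded at t = 0 for each person (None if it is missing).
    w = {pid: y for pid, t, y in rows if t == 0}
    out = []
    for pid, t, y in rows:
        y0 = w.get(pid)
        # Z is undefined whenever either Y_t or Y_0 is missing.
        z = None if y is None or y0 is None else y - y0
        out.append((pid, t, z, y0))
    return out


rows = [(1, 0, 10.0), (1, 1, 12.5), (1, 2, 11.0),
        (2, 0, None), (2, 1, 8.0)]
centered = center_on_first_occasion(rows)
```

W is constant within a person, which is what allows it to be declared as a between variable. Under this construction a person whose Y0 is missing has no defined Z values at all.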
 Andrea Hasl posted on Tuesday, July 28, 2020 - 5:43 am
Thank you very much for the reply, Timohir! Do I understand it correctly that I would just add Y@0 at the between level? Or is there a way to define Y0 using the "Define" command (my data is in long format)? Another question for my understanding: What exactly does Zt tell me, or why exactly do I need it?

At the moment my code states (in its simplest form and without Y@0)

%WITHIN%
phi | Y ON Y&1;
logv | Y;

%BETWEEN%
[phi];
[Y];
[logv];
phi;
Y;
logv;
phi WITH Y;

Thank you!
 Tihomir Asparouhov posted on Tuesday, July 28, 2020 - 9:41 am
Instead of having Y@0, you would use the command
within=Z;

The Y0 variable will need to be made into another column outside of Mplus and declared as
between=Y0;

%WITHIN%
phi | Z ON Z&1;
logv | Z;

%BETWEEN%
[phi];
[logv];
phi;
logv;
Y0; [Y0];
phi WITH Y0;
 Andrea Hasl posted on Wednesday, August 05, 2020 - 3:21 am
Dear Timohir,

thank you very much for your help. Now another problem has arisen: when I use your "trick" with Y0 and Z, Y0 is estimated without using the information from the rest of the time series (the results correspond to those under listwise deletion). Unfortunately, Y0 has many missing values, and the results no longer make sense. Thus, I need Y0 to be estimated simultaneously in the model. Could you help me out again?

Thanks a lot,
Andrea
 Andrea Hasl posted on Wednesday, August 05, 2020 - 9:38 am
Sorry I spelled your name incorrectly! I meant Tihomir, of course!
 Tihomir Asparouhov posted on Wednesday, August 05, 2020 - 10:12 am
Y0 should be the first observed value. There is nothing in your model that is time specific so 0 shouldn't really be attached to a specific time.
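
Taking Y0 as the first observed value can be sketched in plain Python as below. The (id, t, Y) row layout, None as the missing marker, and the function name are assumptions made for illustration only.

```python
def first_observed(rows):
    """Map each person id to (time index, value) of the first non-missing Y."""
    first = {}
    # Sort by person and time so that the earliest occasion is seen first.
    for pid, t, y in sorted(rows, key=lambda r: (r[0], r[1])):
        if y is not None and pid not in first:
            first[pid] = (t, y)
    return first


rows = [(2, 1, 8.0), (2, 0, None), (1, 0, 10.0), (1, 1, 12.5), (2, 2, 9.0)]
y0 = first_observed(rows)
```

Keeping the time index alongside the value shows how far each person's first observation lies from the start of the series.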
 Andrea Hasl posted on Thursday, August 06, 2020 - 3:26 am
Unfortunately, in our data structure, 0 is attached to a specific time. These are wage data in which each person started their job in a specific year (Variable A), which we defined as t0. Many persons, however, did not report their wages (Variable B) at that point, even though they were already receiving them. Thus, the first wage observation is not necessarily the first time point of the actual wage series, so missing values at t0 are possible.

 Tihomir Asparouhov posted on Thursday, August 06, 2020 - 5:01 pm
I would recommend that you look at
savedata: file is 1.dat;
in your run. This will show you what data is being analyzed.

Your main issue may not be the missing values themselves but the need to account for the time distance between t0 and the point at which data first become available, i.e., the number of years for which it is missing.

You probably must use Time as a predictor. You might also need to switch to type=cross DSEM, where this time alignment is modeled. See User's Guide example 9.38 (even if you use an empty model on the %BETWEEN time% level).

The Y0 variable is being correlated with all the other variables on the between level, i.e., it is being imputed from those variables. I think it is a mistake to use within=Zt. Instead I would use Zt=Yt for t>0 and then have Z with Y0 on the between level, since that would be your strongest predictor for the missing values.
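
A minimal sketch of this alternative data layout, in plain Python, might look as follows. The (id, t, Y) row layout, the -999 missing flag, the function name, and the choice to drop the t = 0 row from the Z series are all assumptions of this sketch rather than anything prescribed above.

```python
MISSING = -999  # assumed missing-data flag; it must match the MISSING option in the Mplus input


def split_initial_value(rows):
    """Return (id, t, Z, Y0) rows with Z_t = Y_t for t > 0 and Y0 = Y at t = 0."""
    y0 = {pid: y for pid, t, y in rows if t == 0}
    out = []
    for pid, t, y in rows:
        if t == 0:
            continue  # the t = 0 value is carried in the Y0 column, not in Z
        z = MISSING if y is None else y
        w = y0.get(pid)
        # Y0 may be missing; Z is kept as observed rather than differenced.
        out.append((pid, t, z, MISSING if w is None else w))
    return out


rows = [(1, 0, 10.0), (1, 1, 12.5), (1, 2, 11.0),
        (2, 0, None), (2, 1, 8.0)]
prepared = split_initial_value(rows)
```

Because Z is not differenced against Y0 here, a person with a missing Y0 still contributes their observed Z values.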

I would stick with standard models rather than trying to have the initial value as a random intercept. Mplus is a multivariate framework. If you had data from the two jobs, you could model two time-series processes. But if you don't have such data, you can use the wage from job A as a separate variable and the wage from job B as a time-series variable and correlate them (rather than trying to interpret the wage from job A as a starting value).

It might also be useful to step back from time-series models and first try two-level models to figure out what the fundamentals are - then add the auto-correlation.