Mplus Discussion >> Forecast in DSEM

Forecast in DSEM

zahra posted on Sunday, December 15, 2019 - 1:17 am

Hello,
I want to forecast the number of new cases of a disease in 2030 for every region using DSEM. Can I do this, and if so, how?

Tihomir Asparouhov posted on Monday, December 16, 2019 - 1:26 pm

You should be able to. I would recommend looking at Appendix D
http://www.statmodel.com/download/DSEM.pdf

If you have trends you might consider using RDSEM.

You can get the estimates for the region-specific random effects with a command like this:
SAVEDATA: FILE = 1.dat;
SAVE = FSCORES (100);
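
As a rough illustration of how the saved file might be used afterwards, here is a minimal Python sketch. It assumes the file is whitespace-delimited, that missing values appear as `*`, and that the cluster id and the posterior median of interest sit in known columns. The function name and the column positions (`id_col`, `median_col`) are hypothetical; the actual order of the saved variables is listed at the end of the Mplus output and should be checked there.

```python
def cluster_medians(lines, id_col, median_col):
    """Return {cluster id: value}, using the first usable row for each cluster."""
    medians = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        value = fields[median_col]
        if value == "*":
            continue  # assumed missing-value flag
        cluster = int(float(fields[id_col]))
        if cluster not in medians:
            # Between-level estimates repeat on every row of a cluster,
            # so the first row is enough.
            medians[cluster] = float(value)
    return medians


# Made-up rows standing in for the saved file: y, a median column, cluster id.
example_rows = [
    "12.000  0.810  1.000",
    "15.000  0.810  1.000",
    "30.000  0.650  2.000",
]
p_by_city = cluster_medians(example_rows, id_col=2, median_col=1)
```

With a real file, an open file handle (for example `open("1.dat")`) could be passed in place of `example_rows`.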

zahra posted on Monday, December 30, 2019 - 9:51 am

Thank you.
Which example can I use to model my data?
I have the number of new cases of a disease for 10 cities, and I want to use DSEM to model these data.

Tihomir Asparouhov posted on Monday, December 30, 2019 - 10:44 am

I would recommend starting with some of the variations of User's Guide example 9.30. Since you only have 10 cities, the model should be simple, and you should probably start with something like this:

MODEL:
%WITHIN%
y ON y&1 y&2;
%BETWEEN%
y;

zahra posted on Tuesday, December 31, 2019 - 11:15 am

Thank you.
I have the number of new cases of a disease for 10 cities from 1990 to 2017. How can I add the time variable as well? And what if I have more than 10 cities?

Tihomir Asparouhov posted on Thursday, January 02, 2020 - 9:18 am

I would recommend that you dig into some of the papers and training videos on our web site
http://statmodel.com/TimeSeries.shtml

Also the Mplus User's Guide examples 9.30-9.40 give an excellent introduction.

Maybe you will find this paper to be a good introduction
https://psyarxiv.com/j56bm

zahra posted on Thursday, January 09, 2020 - 7:53 am

Thanks.
I read the articles and wrote this program. I have two questions. 1) Is this correct? (I have the number of new cases of a disease for some cities from 1990 to 2017.) 2) How can I use FSCORES to forecast new cases?
DATA: file is "H:/africa2.dat"; ! Calling data
VARIABLE: NAMES = id time y;
  CLUSTER = id; ! Specify the cluster id variable (here, the city)
  USEVAR = y; ! Specify which variables are used in the model
  MISSING = ALL(-999);
  LAGGED = y(1); ! This creates lagged variables
  !TINTERVAL = sessdate(1); ! This is to account for unequal intervals
ANALYSIS: TYPE = TWOLEVEL RANDOM; ! This allows for random slopes
  ESTIMATOR = BAYES; ! DSEM requires Bayesian estimation
  PROC = 2; ! Using 2 processors makes it faster
  BITER = (5000); ! This implies at least 5000 iterations are used
  THIN = 10; ! Thinning helps with getting more stable results
MODEL: %WITHIN% ! Specify the random lagged relationships
  p | y ON y&1;
  %BETWEEN%
  p WITH y;
OUTPUT: TECH1 TECH8 STDYX;
PLOT: TYPE = PLOT3;
  FACTORS = ALL;
SAVEDATA: FILE = 1.dat;
  SAVE = FSCORES (100);

Tihomir Asparouhov posted on Thursday, January 09, 2020 - 9:30 am

For a particular city/cluster you can get the median estimates for YB (Y on the between level) and p from the 1.dat file.

Since
Y=YW+YB
and since YB doesn't change across time, getting predictions for Y is the same as getting predictions for YW and adding the estimate of YB.

Since YW(t)=p*YW(t-1)+e, the predicted value for YW(2018) is p*YW(2017)=p*(Y(2017)-YB). The predicted value for YW(2019) is p^2*(Y(2017)-YB). The predicted value for YW(2020) is p^3*(Y(2017)-YB), etc. The predicted value of YW(t) for a very distant year t (such as t=2050) will be approximately 0 since p<1, and thus the predicted value for Y(t) will be simply YB.
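
The recursion above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Mplus output: the function name is hypothetical, and the values for Y(2017), YB and p are made up rather than taken from any real 1.dat file.

```python
def forecast_ar1(y_last, yb, p, horizon):
    """Predicted Y for 1..horizon steps after the last observed value."""
    deviation = y_last - yb  # YW at the last observed time point
    forecasts = []
    for _ in range(horizon):
        deviation *= p  # YW(t) = p * YW(t-1), with e set to its mean of 0
        forecasts.append(yb + deviation)  # Y = YW + YB
    return forecasts


# Made-up values for one city: Y(2017) = 120, YB = 100, p = 0.5
preds = forecast_ar1(y_last=120.0, yb=100.0, p=0.5, horizon=3)
# preds == [110.0, 105.0, 102.5] for 2018, 2019, 2020
```

With p well below 1 the forecasts shrink quickly toward YB, which is the long-run behavior described above.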

In this context, however, the above model doesn't account for population increase, so it would not be a good model. You have two options:

1. Model "the number of new cases of a disease per 10000 people" instead of the absolute number. This rate would not depend as much on the population increase.

2. You can incorporate the population increase in the model with RDSEM:
%within%
Y on t;
p | Y^ on Y^1;
That model will have a different prediction scheme: beta*t + predicted value for YW + YB
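
A minimal Python sketch of that second prediction scheme follows. It assumes that t is coded in steps of 1 per year and that the residual at the last observed time point is Y minus YB minus beta*t. The function name and all numbers are hypothetical.

```python
def forecast_with_trend(y_last, t_last, yb, beta, p, horizon):
    """Predicted Y for 1..horizon steps ahead: beta*t + predicted residual + YB."""
    residual = y_last - yb - beta * t_last  # residual at the last observed time
    forecasts = []
    for h in range(1, horizon + 1):
        residual *= p  # the residual follows the AR(1) recursion
        forecasts.append(beta * (t_last + h) + residual + yb)
    return forecasts


# Made-up values: t = 27 is the last observed year, Y = 160, YB = 100, beta = 2, p = 0.5
preds = forecast_with_trend(y_last=160.0, t_last=27, yb=100.0, beta=2.0, p=0.5, horizon=2)
# preds == [159.0, 159.5]
```

Here the autoregressive part fades out as before, but the forecasts keep following the trend line beta*t + YB instead of settling at YB.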

Zahra posted on Tuesday, January 28, 2020 - 7:00 am

Thank you.
In our analysis p is 1 or near 1, and so we saw a jump in our predictions. Why does this happen, and how can we solve it?
thanks

Tihomir Asparouhov posted on Tuesday, January 28, 2020 - 9:32 am

It probably happens because the trend isn't modeled correctly. See point 2 in my previous answer.
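
One way to see how an unmodeled trend can push p toward 1 is a small simulation, sketched here in plain Python. It is not DSEM estimation: it uses ordinary least squares on a single made-up series (a linear trend plus white noise over 28 time points), and all function names are hypothetical.

```python
import random


def lag1_slope(series):
    """OLS slope from regressing series[t] on series[t-1], with an intercept."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def detrend(series):
    """Residuals from an OLS line fitted against t = 0, 1, 2, ..."""
    t = list(range(len(series)))
    mt, ms = sum(t) / len(t), sum(series) / len(series)
    slope = (sum((a - mt) * (b - ms) for a, b in zip(t, series))
             / sum((a - mt) ** 2 for a in t))
    return [b - ms - slope * (a - mt) for a, b in zip(t, series)]


rng = random.Random(1)
y = [2.0 * t + rng.gauss(0.0, 1.0) for t in range(28)]  # trend plus white noise

phi_raw = lag1_slope(y)                  # lag-1 coefficient ignoring the trend
phi_detrended = lag1_slope(detrend(y))   # lag-1 coefficient after removing it
```

In this setup `phi_raw` should come out close to 1 even though the noise has no autocorrelation at all, while `phi_detrended` should be much smaller in magnitude.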

Zahra posted on Sunday, February 02, 2020 - 1:05 am

We ran the model from point 2, but our p is still near 1. What should I do?
Thanks

Zahra posted on Sunday, February 02, 2020 - 10:07 am

I also have a question about this warning, which occurs for all 53 clusters. Is it important?
"WARNING: PROBLEMS OCCURRED IN SEVERAL ITERATIONS IN THE COMPUTATION OF THE STANDARDIZED ESTIMATES FOR SEVERAL
CLUSTERS. THIS IS MOST LIKELY DUE TO AR COEFFICIENTS GREATER THAN 1 OR PARAMETERS GIVING NON-STATIONARY MODELS.
SUCH POSTERIOR DRAWS ARE REMOVED. THE FOLLOWING CLUSTERS HAD SUCH PROBLEMS:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53"

Tihomir Asparouhov posted on Monday, February 03, 2020 - 10:35 am

The above warning message concerns only the standardized coefficients, not the model results.

You probably need this model to get the trend modeled correctly:

%within%
beta | Y on t;
p | Y^ on Y^1;

zahra posted on Tuesday, February 04, 2020 - 6:35 am

I ran the last model, but p is still near 1. We have the number of new cases of the disease per 100000 people, so I do not think point 2 is suitable. Any other suggestions?
thanks

Tihomir Asparouhov posted on Tuesday, February 04, 2020 - 9:21 am

You can run that particular cluster by itself as a one-level model. If that doesn't help with understanding the issue, send the example (input and data) to support@statmodel.com.

sam posted on Sunday, May 31, 2020 - 7:23 am

Hello,
I ran this program in Mplus, but my coefficients are zero. Why does this happen?

DATA: file is ; ! Calling data
VARIABLE: NAMES = person time y;
  CLUSTER = person; ! Specify the person id variable
  USEVAR = y; ! Specify which variables are used in the model
  MISSING = ALL(-999);
  LAGGED = y(1); ! This creates lagged variables
  !TINTERVAL = sessdate(1); ! This is to account for unequal intervals
ANALYSIS: TYPE = TWOLEVEL RANDOM; ! This allows for random slopes
  ESTIMATOR = BAYES; ! DSEM requires Bayesian estimation
  PROC = 2; ! Using 2 processors makes it faster
  BITER = (5000); ! This implies at least 5000 iterations are used
  THIN = 10; ! Thinning helps with getting more stable results
MODEL: %WITHIN% ! Specify the random lagged relationships
  p | y ON y&1;
  %BETWEEN% ! Allow the random slope and the random intercept to be correlated
  p WITH y;
OUTPUT: TECH1 TECH8 STDYX;
PLOT: TYPE = PLOT3;
  FACTORS = ALL;
SAVEDATA: FILE = 1.dat;
  SAVE = FSCORES (100);

Bengt O. Muthen posted on Sunday, May 31, 2020 - 4:47 pm

We need to see your full output - send to Support along with your license number.

hadi posted on Wednesday, June 24, 2020 - 9:10 am

Hello,
I have the number of new cases of a disease for some cities from 1990 to 2017, and I want to run DSEM. Does the number of cities matter? If so, how many cities are enough for this analysis?
thanks

Bengt O. Muthen posted on Wednesday, June 24, 2020 - 4:21 pm

See the paper on our website:

Schultzberg, M. & Muthén, B. (2018). Number of subjects and time points needed for multilevel time series analysis: A simulation study of dynamic structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 25:4, 495-515, DOI:10.1080/10705511.2017.1392862. (Supplementary material).