Shige posted on Wednesday, July 13, 2005 - 8:06 am
Dear Linda and Bengt,
I am trying to use a Markov model to study cross-temporal policy change. On p. 200 of the Mplus manual it reads "The logit intercept of c1 is fixed at 10 which means that the probability of being in class 1 of c1 is fixed at one in class 2 of c, the stayer class." My question is: is the fixed value of "10" a magic number? Can I use "11" or "9"?
Also, in the class-specific models, the intercepts were fixed at "15" and "-15". Are there reasons behind these values, and can they be replaced by other numbers?
bmuthen posted on Wednesday, July 13, 2005 - 2:01 pm
+15 (or +10) and -15 (or -10) are threshold values used to represent probabilities of 0 and 1. Fixing at any value between 9 and 15 hardly makes any difference. In the type of model you consider here, I use -10 in combination with 20 so that the sum is +10. This way I can represent a probability of 0 (the -10 alone) and a probability of 1 (the sum, +10).
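As a rough numerical check of the point above, the following Python sketch converts a few fixed logit values to probabilities with the inverse-logit formula. The function name `logit_to_prob` is made up for illustration and is not part of Mplus; the values 9, 10, 11, 15 and the -10 / +20 combination are the ones discussed in this thread.

```python
import math

def logit_to_prob(logit):
    """Inverse logit: P = 1 / (1 + exp(-logit))."""
    return 1.0 / (1.0 + math.exp(-logit))

# Any fixed value from 9 to 15 gives a probability that is 1 for practical purposes.
for value in (9, 10, 11, 15):
    print(value, logit_to_prob(value))

# The stayer-class device described above: an intercept of -10 alone
# gives a probability near 0, and adding a slope of 20 gives +10,
# i.e. a probability near 1.
intercept = -10
slope = 20
print(logit_to_prob(intercept))          # close to 0
print(logit_to_prob(intercept + slope))  # close to 1
```

The probabilities for 9 and 15 differ only in the fourth decimal place, which is why the exact choice hardly matters.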
Shige posted on Wednesday, July 13, 2005 - 3:03 pm
I am trying to get a feel for Markov models by trying out some of the examples provided by ATS computing (http://www.ats.ucla.edu/stat/MPlus/examples/alca/chap11.htm). By comparing their implementation of the Markov model with the one given in the manual (pp. 192-193), I found some interesting differences. For example, following the manual, a latent Markov model with one indicator and five time points can be formulated as:
...
model:
%overall%
[cb#1 cc#1 cd#1 ce#1] (1);
ce#1 on cd#1 (2);
cd#1 on cc#1 (2);
cc#1 on cb#1 (2);
cb#1 on ca#1 (2);
model ca:
%ca#1% [a$1] (3);
%ca#2% [a$1] (4);
model cb:
%cb#1% [b$1] (3);
%cb#2% [b$1] (4);
model cc:
%cc#1% [c$1] (3);
%cc#2% [c$1] (4);
model cd:
%cd#1% [d$1] (3);
%cd#2% [d$1] (4);
model ce:
%ce#1% [e$1] (3);
%ce#2% [e$1] (4);
...
Following the web site, they should be:
...
model:
%overall%
cb#1 on ca#1 (1);
cc#1 on cb#1 (1);
cd#1 on cc#1 (1);
ce#1 on cd#1 (1);
[ca#1];
[cb#1 cc#1 cd#1 ce#1] (2);
model ca:
%ca#1% [a$1@15];
%ca#2% [a$1@-15]; !for variable a;
model cb:
%cb#1% [b$1@15];
%cb#2% [b$1@-15]; !for variable b;
model cc:
%cc#1% [c$1@15];
%cc#2% [c$1@-15]; !for variable c;
model cd:
%cd#1% [d$1@15];
%cd#2% [d$1@-15]; !for variable d;
model ce:
%ce#1% [e$1@15];
%ce#2% [e$1@-15]; !for variable e;
...
If I understand correctly, the first approach estimates what is called a "latent Markov model" while the second estimates what is called a "simple Markov model". However, this "latent Markov model" produces drastically different results from those reported in Langeheine (2002), which can be found at (http://www.ats.ucla.edu/stat/lem/examples/alca/chap11.htm) under the name "Model 7 (LM) - part A: latent Markov with time homogeneous transitions".
I must have missed something here.
Shige posted on Wednesday, July 13, 2005 - 3:06 pm
Also, while the ATS guys provide code examples for almost all of Langeheine's chapter, they provide only three examples in Mplus. Does this mean that fitting these Markov models in Mplus is very difficult? I know Mplus is a very flexible package, but I don't know much about this specific area (Markov models), so I'd better ask before investing more time in this. Thanks!
bmuthen posted on Wednesday, July 13, 2005 - 3:22 pm
I think ATS simply hasn't had time to get to all examples; they are working on it.
In your July 13, 9:03 message, can you please email me the output and data from the Mplus Latent Markov model run for which you found different results than LEM? Also, the Mplus manual pages you refer to concern ex 8.12 (Hidden Markov Model), I assume (I have different page numbers in my version)?
Note also that LEM and Table 3 in the LCA book give the BIC version that LEM calls BIC (L-squared), not the logL-based BIC that Mplus gives.
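To make the distinction between the two BIC versions concrete, here is a small Python sketch. It assumes the usual definitions: the logL-based BIC is -2 logL + p ln(N), and the L-squared-based BIC is L² - df ln(N). The function names and all numbers below are hypothetical and are not taken from the Mplus or LEM runs discussed in this thread.

```python
import math

def bic_loglik(loglik, n_params, n_obs):
    """logL-based BIC: -2*logL + p*ln(N)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def bic_l_squared(l_squared, df, n_obs):
    """L-squared-based BIC: L2 - df*ln(N)."""
    return l_squared - df * math.log(n_obs)

# Hypothetical values, for illustration only
loglik = -1200.0    # model log likelihood
n_params = 6        # number of free parameters
l_squared = 12.5    # likelihood-ratio chi-square against the saturated model
df = 9              # degrees of freedom
n_obs = 500         # sample size

print(bic_loglik(loglik, n_params, n_obs))
print(bic_l_squared(l_squared, df, n_obs))
```

Under these definitions the two quantities are on very different scales, so they should not be compared across programs. For models fitted to the same frequency table they differ only by a constant that does not depend on the model, so they should rank the models in the same order.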
bmuthen posted on Saturday, July 16, 2005 - 12:02 am
Ok, so according to your email the latent (= Hidden) Markov model from Table 3 of Chapter 11 gets the same loglikelihood in both Mplus and LEM, but the programs differ in terms of predicted frequencies. I would suggest that you use the Mplus output options (Tech10 is most relevant I believe) to get the printout of each cell's observed and estimated frequency (or proportion) and make sure that you get the same results as in the top part of the Mplus output where it says "FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS BASED ON THE ESTIMATED MODEL". If this agrees, then we would have to investigate the disagreement between the programs further. If so, please also email me the data.
Clearly, they disagree with each other. Tech10 output is identical to LEM output.
bmuthen posted on Saturday, July 16, 2005 - 2:06 pm
Thanks. If you are not using the current version 3.12, please download it. If you are, please send the input, output, and data to firstname.lastname@example.org.
bmuthen posted on Saturday, July 16, 2005 - 4:57 pm
I misspoke - the Mplus Tech10 frequencies are for the patterns of the observed variables, whereas the first set of Mplus frequencies you give above are for the latent variable patterns (the latent classes). It is the former that LEM prints. They can be quite different in the Latent (Hidden) Markov model.
Note that Tech10 produces two sets of counts: observed and estimated. It's easy to understand the difference between the observed and the estimated patterns, but what is the difference between the "estimated" patterns of the observed variables and the latent variable patterns?
bmuthen posted on Monday, July 18, 2005 - 11:22 pm
The latent variable pattern frequencies are the estimated frequencies for the latent class variables - that is they are based on the estimated probabilities of being in the latent classes of the model. At each time point the observed binary variable need not take on the same value as the latent binary variable because the Hidden Markov model allows for "measurement error". Let me know if I am unclear.
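The difference between the two sets of frequencies can be illustrated with a toy hidden Markov model in Python. This is only a sketch: it uses three time points and made-up initial, transition, and response probabilities rather than the estimates from the runs discussed here. Probabilities of latent class patterns come from the Markov chain alone, while probabilities of observed response patterns also sum over all latent paths weighted by the measurement part.

```python
from itertools import product

# Hypothetical parameters for a 2-class hidden Markov model
init = [0.6, 0.4]                  # init[k]     = P(c1 = k)
trans = [[0.8, 0.2], [0.3, 0.7]]   # trans[j][k] = P(c_{t+1} = k | c_t = j)
emit = [[0.9, 0.1], [0.2, 0.8]]    # emit[k][u]  = P(u_t = u | c_t = k)
T = 3                              # number of time points

def latent_prob(path):
    """Probability of one latent class pattern (path through the chain)."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= trans[a][b]
    return p

def observed_prob(obs):
    """Probability of one observed pattern, summing over all latent paths."""
    total = 0.0
    for path in product(range(2), repeat=T):
        p = latent_prob(path)
        for c, u in zip(path, obs):
            p *= emit[c][u]
        total += p
    return total

patterns = list(product(range(2), repeat=T))
latent = {path: latent_prob(path) for path in patterns}
observed = {obs: observed_prob(obs) for obs in patterns}

for pattern in patterns:
    print(pattern, round(latent[pattern], 4), round(observed[pattern], 4))
```

Both sets of probabilities sum to one, but the entries for the same pattern generally differ because of the measurement error. If the response probabilities in `emit` were all 0 or 1, the two sets would coincide.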
I am trying to replicate at least some of the results in Langeheine and van de Pol's chapter. I may have further questions concerning latent/mixed Markov models.
By the way, do you agree that the Mplus code I provided (with Tech10 option) is indeed the same as the "latent Markov model" in the chapter? Now the likelihood, AIC, BIC, transition matrix are the same, and now the predicted class counts (as displayed in Tech10 output) are the same.
bmuthen posted on Tuesday, July 19, 2005 - 5:16 pm
Yes, Shige, your Mplus run gives the Latent Markov model in their chapter. The agreement of the log likelihoods is the best single indicator of agreement between programs, and all other things seem to agree as well in this case.
I am trying to estimate a mover-stayer model (based on example 8.14). The BIC and AIC are somewhat close to what is reported in LEM but not quite the same. Could you please take a quick look at the code and maybe give me some tips?
------------------------------------
data: file is chap11.dat ;
variable:
names are u1 u2 u3 u4 u5 male female count;
missing are all (-9999) ;
usevariables are u1 u2 u3 u4 u5 count;
weight is count (freq);
categorical = u1 u2 u3 u4 u5;
classes = c(2) c1(2) c2(2) c3(2) c4(2) c5(2);
analysis: type = mixture;
model:
%overall%
c1#1 on c#1; [c1#1@10];
c2#1 on c#1; [c2#1@-10];
c3#1 on c#1; [c3#1@-10];
c4#1 on c#1; [c4#1@-10];
c5#1 on c#1; [c5#1@-10];
model c:
%c#1% ! movers
c2#1 on c1#1 (1);
c3#1 on c2#1 (1);
c4#1 on c3#1 (1);
c5#1 on c4#1 (1);
%c#2% ! stayers
c2#1 on c1#1@20;
c3#1 on c2#1@20;
c4#1 on c3#1@20;
c5#1 on c4#1@20;
model c.c1:
%c#1.c1#1% [u1$1@15];
%c#1.c1#2% [u1$1@-15];
%c#2.c1#1% [u1$1@15];
%c#2.c1#2% [u1$1@-15];
model c.c2:
%c#1.c2#1% [u2$1@15];
%c#1.c2#2% [u2$1@-15];
%c#2.c2#1% [u2$1@15];
%c#2.c2#2% [u2$1@-15];
model c.c3:
%c#1.c3#1% [u3$1@15];
%c#1.c3#2% [u3$1@-15];
%c#2.c3#1% [u3$1@15];
%c#2.c3#2% [u3$1@-15];
model c.c4:
%c#1.c4#1% [u4$1@15];
%c#1.c4#2% [u4$1@-15];
%c#2.c4#1% [u4$1@15];
%c#2.c4#2% [u4$1@-15];
model c.c5:
%c#1.c5#1% [u5$1@15];
%c#1.c5#2% [u5$1@-15];
%c#2.c5#1% [u5$1@15];
%c#2.c5#2% [u5$1@-15];
After looking at the sample input you sent me, I have a much better idea of how to proceed. There is no need to spend time on the old input file I posted.
In the input file for the mixed Markov model (mixedmarkov1.inp) you sent me, you have a comment like this:
%overall%
...
c2#1 on c#1 (2); ! adding this to [c2#1@-10] estimates prob
! of being in the not depressed c2#1 class
! in the mover class (c#1),
! for the depressed c1#1 class: 0.701
c3#1 on c#1 (2);
c4#1 on c#1 (2);
...
This should report the time-homogeneous transition probability from time t to time t+1. It should be 0.701, as you noted in the comments. However, I have a hard time finding this piece of information in the actual output. The only place that seems to have that information is at the very bottom of "Model Results", which looks like:
------------------------------------------------
Latent Class Pattern 1 1 1 1 1
C2#1 ON C1#1 0.322 0.211 1.523
C3#1 ON C2#1 0.322 0.211 1.523
C4#1 ON C3#1 0.322 0.211 1.523
Latent Class Pattern 2 1 1 1 1
C2#1 ON C1#1 20.000 0.000 0.000
C3#1 ON C2#1 20.000 0.000 0.000
C4#1 ON C3#1 20.000 0.000 0.000
----------------------------------------------
How can I obtain a transition matrix of (.764, .701; .236, .299) from the above output?
For mover-stayer models the output is a bit more complex to read since the transition tables are not printed broken down by movers and stayers. So, yes, you have to get what you want from the printed estimates. If you look under "Categorical Latent Variables", you will find the c2#1 on c#1 estimate 10.854. Adding this to the [c2#1@-10] mentioned in the comments gives you 0.854 and getting the probability from this logit via the formula
P = 1/(1+exp(-0.854))
gives you P = 0.701. And that is the probability mentioned in the comment "of being in the not depressed c2#1 class in the mover class (c#1), for the depressed c1#1 class." I guess you can also get this by calculations using the "FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS BASED ON THE ESTIMATED MODEL"
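The same calculation can be carried one step further to recover the whole mover-class transition matrix asked about above. The Python sketch below uses only numbers quoted in this thread (the fixed intercept -10, the c2#1 on c#1 estimate 10.854, and the C2#1 ON C1#1 estimate 0.322). It assumes the usual convention that the last class (c1#2) is the reference class, so the slope is added only for c1#1. The function and variable names are made up for illustration.

```python
import math

def logit_to_prob(logit):
    """Inverse logit: P = 1 / (1 + exp(-logit))."""
    return 1.0 / (1.0 + math.exp(-logit))

# Values quoted in the thread
fixed_intercept = -10.0   # [c2#1@-10]
c2_on_c = 10.854          # c2#1 on c#1 estimate
c2_on_c1 = 0.322          # C2#1 ON C1#1 estimate in the mover class

# Intercept that applies in the mover class (c#1): -10 + 10.854 = 0.854
mover_intercept = fixed_intercept + c2_on_c

# Assumption: c1#2 is the reference class, so the slope enters only for c1#1
p_c2_1_given_c1_2 = logit_to_prob(mover_intercept)
p_c2_1_given_c1_1 = logit_to_prob(mover_intercept + c2_on_c1)

# Rows: class at time 2; columns: class at time 1
transition = [
    [p_c2_1_given_c1_1, p_c2_1_given_c1_2],
    [1.0 - p_c2_1_given_c1_1, 1.0 - p_c2_1_given_c1_2],
]
for row in transition:
    print([round(p, 3) for p in row])
```

Rounded to three decimals this gives .764 and .701 in the first row and .236 and .299 in the second, which matches the matrix (.764, .701; .236, .299) in the question.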
One more question: in "mixedmarkov1.inp", the overall model reads as:
-----------------------------
...
%overall%
[c#1];
[c1#1];
c1#1 on c#1;
...
-----------------------------
while in "mixedlatentmarkov1.inp", it reads as:
-----------------------------
...
%overall%
[c#1];
[c1#1@10];
c1#1 on c#1;
...
-----------------------------
My question is: why was [c1#1] left free in the first case but fixed at 10 in the second case? Was it for identification reasons only, or is there something else? Thanks!
bmuthen posted on Monday, July 25, 2005 - 11:14 pm
I think the [c1#1@10] fixing was simply due to computational convenience - it should be free. This example was probably first run in an earlier version of Mplus where I might have had a problem with the parameter. This parameter can be freed and gets the large value of 4.158 (so not far from 10 in this metric) - freeing it gives a slightly better logL = -1,144.122 with 6 parameters. And then you have the 9 df that Mooijaart has in his Table 6.17 (his frequency table chi-squares are a bit higher than those of Mplus, perhaps due to Mplus using stricter convergence criteria).
"ONE OR MORE MULTINOMIAL LOGIT PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL LATENT VARIABLES AND ANY INDEPENDENT VARIABLES. THE FOLLOWING PARAMETERS WERE FIXED: 3"
if I do not fix that parameter; the error message goes away if I fix it. In this case, I probably do not have many choices but to fix it, do I?
Also, are there other alternatives (that is, other parameters to be fixed instead of c1#1)?
bmuthen posted on Tuesday, July 26, 2005 - 2:02 pm
This is not an error but an outcome that is ok and understandable. If you look at this parameter, you will probably see that it has a large value. For example, a threshold may be large for a certain variable and class, indicating that the item probability is low in this class. Such extreme values make the information matrix singular and it is perfectly ok to fix them at a high value. Another example is "c on x" parameters, where in a certain class you don't have x variability to define the slope.
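A small numerical illustration of why extreme values lead to a singular information matrix, as a sketch only: for a single binary outcome with probability P = 1/(1+exp(-logit)), the Fisher information about the logit is proportional to P(1-P), which goes to zero as the logit becomes large in absolute value. This is a simplified one-parameter stand-in for the full mixture model, and the function names are made up for illustration.

```python
import math

def logit_to_prob(logit):
    """Inverse logit: P = 1 / (1 + exp(-logit))."""
    return 1.0 / (1.0 + math.exp(-logit))

def bernoulli_logit_information(logit):
    """Per-observation Fisher information about the logit: P * (1 - P)."""
    p = logit_to_prob(logit)
    return p * (1.0 - p)

# Information shrinks rapidly as the logit moves away from 0
for value in (0, 2, 5, 10, 15):
    print(value, bernoulli_logit_information(value))
```

At a logit of 15 the information is essentially zero, so the data say almost nothing about the exact value of the parameter, and fixing it at a large value changes the likelihood very little.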
If you fix another parameter, your model may be changed and the parameter in question may no longer have this feature.
Shige posted on Saturday, July 30, 2005 - 12:56 am
Can Mplus use latent variables as mediators within a Markov model? I've looked at the mediation examples in the manual, but wondered if the same approach can be applied to transitions between states in the Markov model, using a latent variable as a mediator instead of the multinomial regression. Thanks in advance.
bmuthen posted on Tuesday, November 01, 2005 - 6:21 pm
Are you aware of any refs that address mediational analysis with Markov models?
bmuthen posted on Tuesday, November 01, 2005 - 9:34 pm
No. I am asking Dave MacKinnon.
bmuthen posted on Thursday, November 03, 2005 - 3:06 pm
Here is what Dave MacKinnon says:
The closest thing to this is the paper by Collins, L. M., Graham, J. W., & Flaherty, B. P. (1998). An alternative framework for defining mediation. Multivariate Behavioral Research, 33, 295-312.
I have tried applying some Markov models for repeated learning from my dissertation to the mediation case but I have not pursued it enough to fully develop the model. I think developing mediation in a Markov model is a good idea. The states of having the mediator and having the outcome are not necessarily absorbing so it may differ from typical Markov Models for learning.
I have been doing some more work on Markov modeling and have a few questions. I have found that a partially latent mover-stayer model fits my data quite well (binary repeated measure over 7 waves, equally spaced). However, I have been experimenting with allowing for a third chain with its own transition probabilities (time heterogeneous). I have also tried to turn on Tech11 and/or Tech14 to gain some insight into whether to continue increasing the number of chains beyond the current two. However, Mplus ignores these commands, saying that I cannot use either of these tests with this type of model. Is that true, and if so, is there any other way of identifying the optimal number of chains other than information criteria? With computational time being what it is (about four days on a new machine), I am hesitant to increase the number of classes needlessly.
If the message says TECH11 and TECH14 are not available and you are using the most recent version of Mplus, then they are not available. I'd need to see the output to say why. TECH11 is available in most cases. BIC and loglikelihood are alternatives. The next version of Mplus will have great speed improvements in general and some additional speed improvements for Markov models.