Good question. You can use MSEM to do it or you can use DSEM. I would expect the same answer from both if the clusters are reasonably large, say 50, but it does depend on how large the autocorrelation is. If the autocorrelation is very large, you would need clusters of size 100 for the two methods to agree. What I am saying is that, depending on how large the clusters are and how large the autocorrelation is, the MSEM model may produce a biased result, so it is actually better to use the DSEM model. You can run the variables one at a time with DSEM so that misspecification elsewhere in the model does not affect the computation of the ICC. In DSEM you have to use the option output:residual to get the estimated within and between variances. You may find Appendix D useful: http://www.statmodel.com/download/DSEM.pdf
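To illustrate the point about cluster size and autocorrelation, here is a rough simulation sketch in Python (not Mplus). It assumes an AR(1) within-cluster process with known variances, and it uses a simple ANOVA-style variance decomposition as a stand-in for a two-level model that ignores autocorrelation. It is not the MSEM or DSEM estimator itself, and the function name and default values are made up for the illustration.

```python
import numpy as np

def simulate_naive_icc(n_clusters, cluster_size, phi, vb=1.0, vw=1.0, seed=0):
    """ICC from an ANOVA-style decomposition that ignores autocorrelation.

    Assumption (illustrative only): y_it = mu_i + w_it, where mu_i has
    variance vb and w_it is a stationary AR(1) process with autocorrelation
    phi and marginal variance vw. The true ICC is vb / (vb + vw).
    """
    rng = np.random.default_rng(seed)
    mu = rng.normal(0.0, np.sqrt(vb), size=n_clusters)

    # Innovation SD chosen so the marginal variance of w stays equal to vw.
    innov_sd = np.sqrt(vw * (1.0 - phi ** 2))
    w = np.empty((n_clusters, cluster_size))
    w[:, 0] = rng.normal(0.0, np.sqrt(vw), size=n_clusters)  # stationary start
    for t in range(1, cluster_size):
        w[:, t] = phi * w[:, t - 1] + rng.normal(0.0, innov_sd, size=n_clusters)

    y = mu[:, None] + w
    cluster_means = y.mean(axis=1)

    # Pooled within-cluster mean square around the observed cluster means.
    msw = ((y - cluster_means[:, None]) ** 2).sum() / (n_clusters * (cluster_size - 1))
    # Between variance as if within-cluster observations were independent.
    vb_hat = cluster_means.var(ddof=1) - msw / cluster_size
    return vb_hat / (vb_hat + msw)

if __name__ == "__main__":
    for size in (20, 50, 100, 200):
        print(size, round(simulate_naive_icc(2000, size, phi=0.8), 3))
```

With a true ICC of 0.5, the naive estimate should drift upward when the clusters are short and the autocorrelation is high, and move back toward 0.5 as the cluster size grows. The exact numbers depend on the seed and on the assumed variances.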
To get the ICC with DSEM for a single variable Y, I would recommend this model
I don't have access to Shrout and Lane (2012), so I can't comment on it. ICC = VB/(VW+VB), where VB is the between variance and VW is the within variance, so yes.
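As a small worked example of that formula, here is a Python sketch that turns a pair of variances into an ICC. The function name and the numbers are placeholders for illustration only, not values from any Mplus output.

```python
def icc(vb, vw):
    """Intraclass correlation: between variance over total variance.

    vb: between-level variance (VB), vw: within-level variance (VW).
    """
    if vb < 0 or vw < 0:
        raise ValueError("variances must be non-negative")
    total = vb + vw
    if total == 0:
        raise ValueError("total variance is zero, ICC is undefined")
    return vb / total

if __name__ == "__main__":
    # Placeholder values: VB = 0.3, VW = 0.7.
    print(icc(0.3, 0.7))
```

With DSEM, the VB and VW you plug in would be the ones read from the residual output, not from the model results output.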
You can't really ignore the lagged output, because the residual output results are based on it. But as long as you take the variances from the residual output (and not from the model results output), you can ignore the lagged effect. See Appendix D.