(I understand that the Mplus User's Guide recommends fixing loadings and thresholds together. I was just curious, however.)
1. Is this decrease common, given that the equal-loadings model has more restrictions? Looking for a reason, I compared the modification indices of the two models. Many of the large MIs that suggested residual correlations in the configural invariance model disappeared in the metric invariance model, but I don't understand why.
2. I ran DIFFTEST between the two models, and the chi-square difference was only 48.658, which seems too small to me. The difference in degrees of freedom, however, was 15, as expected.
The WLSMV chi-square values cannot be compared unless you use DIFFTEST. You can draw no meaningful conclusion by comparing them.
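For a rough sense of how the DIFFTEST result quoted in the question (48.658 on 15 df) reads, here is a minimal sketch in Python. It assumes the DIFFTEST value can be referred to a chi-square distribution with the reported degrees of freedom. The helper name `chi2_sf` is made up for this illustration and is not part of Mplus. The p-value that Mplus prints with the DIFFTEST output is the one to report.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Hypothetical helper for illustration only; uses closed-form sums from the
    regularized incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        term = math.exp(-y)
        total = term
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_j y^(j + 1/2) / Gamma(j + 3/2)
    total = math.erfc(math.sqrt(y))
    for j in range((df - 1) // 2):
        total += math.exp(-y) * y ** (j + 0.5) / math.gamma(j + 1.5)
    return total


# Values taken from the question: DIFFTEST value 48.658 with 15 df.
p_value = chi2_sf(48.658, 15)
print(f"p = {p_value:.2e}")
```

Under that assumption the p-value falls well below 0.001, so a difference of 48.658 on 15 df is not small in the statistical sense: it indicates that the equal-loadings restrictions worsen fit significantly.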
You might want to look at the Version 7.1 Language Addendum on the website with the user's guide. We describe in detail the models that can be used for testing for measurement invariance in a variety of situations.