They thinned 250 million steps down to 500 samples, then it sounds like they dropped the first 50-100 of those samples post-hoc (the 10-30% burn-in gives an impression of manual tinkering).
Obviously this is going to raise questions about convergence, so the reviewers should have asked for the diagnostic charts. This would probably have been better suited to ABC: https://en.wikipedia.org/wiki/Approximate_Bayesian_computati...
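For concreteness, the thinning and burn-in numbers above work out as follows (a quick back-of-the-envelope sketch; the 250 million / 500 / 10-30% figures are taken from this discussion, not re-derived from the paper):

```python
# Sketch of the sampling scheme as described: a 250M-generation chain
# thinned to 500 retained samples, then 10-30% discarded as burn-in.
chain_length = 250_000_000            # total MCMC generations
retained = 500                        # samples kept after thinning
thin_interval = chain_length // retained  # i.e. log every 500,000th state

burnin_low = int(retained * 0.10)     # 10% burn-in -> drop first 50
burnin_high = int(retained * 0.30)    # 30% burn-in -> drop first 150
print(thin_interval)                            # 500000
print(retained - burnin_high, retained - burnin_low)  # 350 450
```

So per chain they end up with roughly 350-450 posterior samples, and the "first 50-100 dropped" range maps onto a 10-20% burn-in fraction.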
BEAST (which I assume they used) is not very fast. 250 million generations in BEAST, with the number of parameters they estimated and the size of their dataset, would likely take about 2-4 weeks to run. Even assuming they ran all 3 chains in parallel, that is still a really long time to wait for an analysis. (edit: just checked, it looks like they used 700+ sequences, so this is closer to 1-3 months of compute time)
The standard number of samples for phylogenetic analyses is 1,000-10,000, trending toward the smaller end as the phylogenies get larger. Since they ran 3 independent analyses and combined their samples, as long as they assessed convergence and confirmed that all 3 chains had converged, this is likely fine. Excluding burn-in is also quite typical, since the MCMC move operators for phylogenetics are not very efficient given the high dimensionality of the search space. ABC is also challenging to work with for the same reason (though I admit I haven't worked much with those methods in phylogenetics).
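The usual way to decide whether a few hundred thinned samples are "enough" is the effective sample size, which discounts for autocorrelation in the chain. Here is a rough sketch of an ESS estimate, summing autocorrelations until the first negative lag; this is a simplified version of what tools like Tracer report, not their exact algorithm:

```python
import numpy as np

def ess(samples):
    """Rough effective sample size: n divided by the integrated
    autocorrelation time, truncated at the first negative lag."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    x = x - x.mean()
    # autocorrelation at lags 0..n-1, normalized so acf[0] == 1
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    tau = 1.0
    for k in range(1, n):
        if acf[k] < 0:
            break
        tau += 2.0 * acf[k]
    return n / tau

rng = np.random.default_rng(0)
iid = rng.normal(size=500)      # 500 independent draws: ESS near 500
ar = np.empty(500)              # strongly autocorrelated chain: ESS much lower
ar[0] = 0.0
for t in range(1, 500):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
print(f"ESS (independent) ~ {ess(iid):.0f}, ESS (correlated) ~ {ess(ar):.0f}")
```

The point of thinning a 250M-step chain to 500 samples is exactly this: the retained samples are far apart, so their ESS is close to their nominal count.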
Typically in this field it is not enough to say "you didn't run your MCMC chains for long enough" or anything like that. Of course the chains should be run longer (theoretically, for an infinite amount of time), but there are routine disagreements over whether 100 or 1,000 effective samples are sufficient for phylogenetics. Unless there is a serious model inadequacy that the authors haven't addressed, there is typically no reason to nitpick these sorts of things. The authors have covered a lot of their bases by trying a number of different models and priors, and I don't see any reason to doubt this particular study on those grounds.
Disclaimer: I don't know anything about this stuff.
Would they be able to run these MCMC chains in a way that progressively renders results at greater and greater accuracy? Then they could get some initial "low resolution" results without necessarily having to wait for weeks, with the full results eventually becoming available.
In principle, nothing stops them from looking at the output every step, or every n steps. I don't know whether the software they used supports this, though.
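To illustrate why this works in principle: an MCMC sampler produces states one at a time, so a caller can consume the chain as a stream and report running estimates that sharpen as the chain grows. A toy random-walk Metropolis sampler on a standard normal target (illustrative only; real phylogenetic MCMC is far more involved, and this says nothing about what BEAST exposes):

```python
import math
import random

def metropolis(logpost, x0, step, rng):
    """Toy random-walk Metropolis generator. Yields the current state
    after every proposal, so the chain can be inspected while running."""
    x, lp = x0, logpost(x0)
    while True:
        prop = x + rng.gauss(0.0, step)
        lp_prop = logpost(prop)
        # accept with probability min(1, exp(lp_prop - lp))
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
        yield x

rng = random.Random(42)
chain = metropolis(lambda x: -0.5 * x * x, 0.0, 1.0, rng)  # target: N(0, 1)

running_sum, n = 0.0, 0
for state in chain:
    running_sum += state
    n += 1
    if n % 2000 == 0:                      # "low resolution" progress report
        print(f"after {n} steps: posterior mean ~ {running_sum / n:.3f}")
    if n == 10_000:
        break
```

The running mean drifts toward the true posterior mean (0 here) long before the chain finishes, which is exactly the progressive-refinement behaviour being asked about.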
It is not that 500 samples is too few that concerns me; it is not showing the posteriors for the parameters (which, it sounds like, may simply be the custom in this area). I would think that if people had such trouble with MCMC, common practice would be to publish diagnostic charts to reassure each other.
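The diagnostic charts in question are cheap to produce once the thinned samples exist: a trace plot per parameter plus its marginal posterior histogram. A minimal sketch using matplotlib (the gamma draws below are synthetic stand-ins for a thinned posterior sample, not anything from the paper):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; write the figure to a file
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
# stand-in for ~500 thinned posterior samples of one model parameter
samples = rng.gamma(shape=2.0, scale=0.5, size=500)

fig, (ax_trace, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))
ax_trace.plot(samples, lw=0.5)                 # trace: should look like "fuzzy caterpillar"
ax_trace.set(title="trace", xlabel="sample", ylabel="parameter")
ax_hist.hist(samples, bins=30)                 # marginal posterior
ax_hist.set(title="marginal posterior", xlabel="parameter")
fig.tight_layout()
fig.savefig("diagnostics.png")
```

A supplementary figure like this per key parameter is all it would take to answer the convergence question.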