Phylogeny from multi-locus sequences aka MLSA

For one of my Ph.D. projects, I had to generate a phylogeny from multi-locus sequence data. I often have to repeat similar analyses and end up going back to my previous workflow to check what I actually did. I’m sharing the protocol here mainly to help my future self, but maybe it will be useful to you too!

Multi-locus sequence analysis (MLSA) uses multiple genes or loci, usually conserved housekeeping genes, to construct phylogenies and perform other sequence-based analyses. Since different genes may have different mutation rates, MLSA generally gives a better approximation of the underlying evolutionary history and a more realistic resolution of phylogenetic relationships among taxa than a single gene does.

MLSA is also a better alternative to ribosomal 16S/ITS-based analysis, especially for many bacterial genera (including Bradyrhizobium, which I worked with), because the 16S/ITS sequences are often very similar within these genera and cannot be used to differentiate species.

A multi-locus sequence alignment may look like this. Source
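
To make the idea of a multi-locus alignment concrete, here is a minimal sketch in plain Python that joins per-locus alignments end to end into one concatenated alignment. This is only an illustration under assumptions, not necessarily the workflow used in this protocol: the function name `concatenate_loci`, the strain names, the locus names, and the toy sequences are all made up, and each locus is assumed to be already aligned (all sequences of equal length). Taxa missing from a locus are padded with gaps.

```python
from typing import Dict, List, Tuple


def concatenate_loci(
    loci: Dict[str, Dict[str, str]], gap: str = "-"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-locus alignments end to end (hypothetical helper).

    `loci` maps locus name -> {taxon name -> aligned sequence}.
    Returns the concatenated alignment and a list of
    (locus, start, end) partitions using 1-based inclusive coordinates.
    """
    # Collect every taxon seen in any locus.
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    combined = {taxon: "" for taxon in taxa}
    partitions = []
    start = 1
    for name, aln in loci.items():
        # An aligned locus must have a single, shared sequence length.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"locus {name!r} is empty or not aligned")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this locus with gap characters.
            combined[taxon] += aln.get(taxon, gap * length)
        partitions.append((name, start, start + length - 1))
        start += length
    return combined, partitions


# Toy data, invented for illustration only.
loci = {
    "recA": {"strain_A": "ATGGCT", "strain_B": "ATGGCA", "strain_C": "ATGACA"},
    "glnII": {"strain_A": "TTGACC", "strain_B": "TTGACC"},  # strain_C missing
}

combined, partitions = concatenate_loci(loci)
for taxon, seq in combined.items():
    print(f">{taxon}\n{seq}")
print(partitions)
```

Keeping track of the partition coordinates is a design choice that makes it possible to treat each locus separately later on, for example if the genes are expected to evolve at different rates.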
