We developed the tool PLTB (Phylogenetic likelihood (evaluator) and tree builder) in the context of a master's-level programming practical at KIT. Given a dataset of genetic data (in PHYLIP or FASTA format), it does two things:

  1. For each time-reversible substitution model: compute the likelihood of the given dataset and evaluate several information criteria (AIC, AICc, BIC…).
  2. For each model that ranks best under one of these information criteria: conduct a tree search under this model.
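The model ranking in step 1 can be illustrated with the textbook definitions of the three named criteria. This is a minimal sketch and not PLTB's actual code: the function names, the parameter `k` (number of free model parameters), and the choice of `n` as the number of alignment sites are assumptions made for illustration.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood

def aicc(log_likelihood, k, n):
    """AIC with the small-sample correction term 2k(k+1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs a sample size n larger than k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood

def best_model(scores):
    """Return the model name with the lowest criterion value."""
    return min(scores, key=scores.get)

# Hypothetical log-likelihoods for two models on a 1000-site alignment.
# The parameter counts here are made up for the example.
candidates = {"model_a": (-5230.4, 3), "model_b": (-5228.9, 8)}
bic_scores = {name: bic(lnl, k, 1000) for name, (lnl, k) in candidates.items()}
print(best_model(bic_scores))
```

For all three criteria a lower value is better, so different criteria can pick different best models, which is why step 2 may run more than one tree search.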

Step 1 tests all 203 possible time-reversible substitution models and can be executed in parallel with a highly scalable MPI master/worker approach: using MPI, the models are distributed among the available processes. In addition, the internal pthreads parallelization of the Phylogenetic Likelihood Library can be configured so that the likelihood evaluation and the tree search also run in parallel.
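The number 203 is the number of ways to group the six substitution rates of a time-reversible nucleotide model (AC, AG, AT, CG, CT, GT) into classes that share a rate, i.e. the number of set partitions of six elements. The sketch below enumerates these groupings as restricted growth strings. The function name and the tuple encoding are assumptions for illustration and need not match how PLTB represents models internally.

```python
def restricted_growth_strings(n):
    """Yield every set partition of n items as a tuple of class labels.

    Position i holds the class of item i. Labels are introduced in
    increasing order, so each partition appears exactly once.
    """
    def extend(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # Reuse any existing label or open exactly one new class.
        for label in range(max_label + 2):
            yield from extend(prefix + [label], max(max_label, label))

    yield from extend([], -1)

models = list(restricted_growth_strings(6))
print(len(models))   # 203
print(models[0])     # (0, 0, 0, 0, 0, 0): all six rates equal
print(models[-1])    # (0, 1, 2, 3, 4, 5): six independent rates
```

Because each model can be evaluated independently of the others, a list like this lends itself to a master/worker scheme in which a master process hands out the next unevaluated model to whichever worker is idle.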

In the GitHub repository you can also find evaluation scripts that use RAxML and gnuplot to visualize the results by plotting histograms of the Robinson-Foulds (RF) distances between the generated trees.
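To make the plotted quantity concrete: the RF distance between two trees on the same taxa counts the splits (bipartitions of the taxa induced by an internal branch) that occur in one tree but not in the other. The following is only an illustrative sketch under simplifying assumptions. Trees are written as nested tuples of taxon names rather than parsed from Newick files, and all function names are hypothetical; the repository's scripts rely on RAxML for the actual computation.

```python
def leaf_set(tree):
    """Collect all taxon names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Return the non-trivial splits of a tree, treated as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference taxon, so both trees describe a split the same way.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if reference in below else below
        # Skip trivial splits that separate zero or one taxon from the rest.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))

tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "B"), ("D", ("C", "E")))
print(rf_distance(tree_1, tree_2))  # 2
```

An RF distance of 0 means the two topologies are identical, so a histogram concentrated near 0 would indicate that the choice of model rarely changes the resulting tree.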

The question we wanted to answer with this program is whether the choice of substitution model makes a real difference to the topologies of the trees generated under different models. If you are interested, we published our findings on BioMed Central.