Friday, April 13, 2012

Share your trees and reduce your carbon footprint

I recently attended the SPDG in Glasgow. This is a discussion group on phylogenetics which takes place in various universities across Scotland. The guest lecturer at the last meeting was Alexandros Stamatakis from Heidelberg. The main part of his talk was about the PaPaRa software, which can be used to align short reads to a phylogeny. This is really useful for identifying next-gen reads from environmental samples. However, he also talked about reducing the carbon footprint of computational biology by writing better algorithms and code. The effect of heavy computation on the environment was not new to me, as I once sat through a video conference by Herve Philippe at the Entomological Society of America which was meant to be about "phylogenomics and the sister groups to Hexapoda" but ended up being about why he hadn't travelled to Reno, Nevada. He made valid points, which can be found here. At the SPDG, we also had a video conference from Erick Matsen, and fortunately this time it was on topic. Erick is the organiser of phyloseminar, which is well worth having a look at and could definitely lower your carbon footprint.

More efficient algorithms and programming, videoconferencing! This all got me thinking about the three Rs: REDUCE, REUSE and RECYCLE in the context of phylogenetics. We can all do our bit to REDUCE our carbon footprint when doing phylogenetics. For starters: is the analysis I want to do really necessary? Does it have to run as long? Can we use a better, more efficient algorithm? Secondly, we can REUSE the trees that others have already built, but this means that we need to get much better at sharing our trees. TreeBASE and DataDryad are undoubtedly playing an important role in enabling us to share phylogenies and thus reduce our carbon footprint. However, as discussed in "Towards a taxonomically intelligent phylogenetic database" by Rod Page, the pace at which we are publishing phylogenies is not being matched by the submissions to TreeBASE. This leaves us with the last option: to RECYCLE our trees. This should only be a last resort but ends up happening most of the time. For this we need to go back to our raw materials, the sequences, which fortunately are more consistently shared in GenBank, and redo the analyses.

Hopefully, this time round the algorithm will produce less carbon and the data will be submitted to TreeBASE!
