Evo-Karma: May 2011

Sunday, May 22, 2011

Recognition of tree images

I have just published a program for the automated recognition of phylogenies from tree images. First of all, I would like to apologise for the use of the word 'towards' in the title; I know that it is increasingly used and that it irritates some people. I just wanted to be honest: this program does not succeed on all tree images, but it is a step in the right direction.

I thought I would take the opportunity to post a few links to software that deals with the same problem. We have all been rather unoriginal with names!

TreeThief:

To my knowledge, this was the first program to tackle the problem of converting a phylogenetic image back into a more useful bracket format such as NEXUS or Newick. It requires the user to click on tips and nodes in a specific order and to type in the label at each tip. Unfortunately, this program only works on Mac OS 9.
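
For readers unfamiliar with the bracket format, here is a minimal Python sketch of what such a conversion ultimately has to produce. It is not TreeThief's code; the nested-tuple representation, the function name `to_newick` and the three tip labels are assumptions chosen purely for illustration.

```python
def to_newick(node):
    """Serialise a nested-tuple tree to Newick (without the trailing ';').

    A tip is a string label; an internal node is a tuple of child nodes.
    This representation is an assumption made for the example.
    """
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


# Illustrative three-tip tree; the labels are placeholders.
tree = (("Diptera", "Lepidoptera"), "Coleoptera")
newick = to_newick(tree) + ";"
print(newick)  # ((Diptera,Lepidoptera),Coleoptera);
```

Each pair of brackets corresponds to one internal node, which is why a tool only needs the tips, the nodes and their nesting to rebuild the string.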

TreeSnatcher:

TreeSnatcher was a conceptual advance on TreeThief; it relies heavily on Java libraries and is cross-platform. It requires only a limited amount of input from the user, such as selecting the foreground and background, and this interactivity lets the user improve the quality of the extraction.

TreeSnatcher Plus:

TreeSnatcher Plus is an improvement on TreeSnatcher in that it lets the user convert almost any tree image to a Newick file; for example, it works on radial tree images.

TreeRogue:

TreeRogue is essentially the same concept as TreeThief. I have only just come across it, so unfortunately I do not reference it in my paper (sorry). It uses an R script that converts coordinates to a tree file. These coordinates can be captured from an image using GraphClick, which costs $8.

TreeRipper:

TreeRipper is written in C++ and a version runs on the website; the code is available under the GNU GPL v3. It makes heavy use of Magick++, the C++ API to the ImageMagick image-processing library, and it uses tesseract-ocr to convert the leaf labels to text. This is a fully automated approach that unfortunately only works on a proportion of tree images. You could, for example, use TreeRipper to batch process a large number of trees and then use a semi-automated approach for the trees that weren't converted.
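
The batch-then-fallback workflow could be sketched roughly as follows in Python. This does not use TreeRipper's actual interface: `batch_convert`, the `convert` callable and `demo_converter` are hypothetical names, and the converter is assumed to return a Newick string on success and `None` (or raise an error) on failure.

```python
def batch_convert(images, convert):
    """Run an automated converter over many images.

    `convert` is any callable taking an image name and returning a Newick
    string, or None on failure (an assumed interface, not TreeRipper's).
    Returns the successful conversions and the images left for manual work.
    """
    converted = {}
    needs_manual = []
    for image in images:
        try:
            newick = convert(image)
        except Exception:
            newick = None  # treat a crash as a failed recognition
        if newick:
            converted[image] = newick
        else:
            needs_manual.append(image)
    return converted, needs_manual


def demo_converter(image):
    """Stand-in for a real recogniser; only 'tree1.png' succeeds here."""
    known = {"tree1.png": "((A,B),C);"}
    return known.get(image)


converted, needs_manual = batch_convert(["tree1.png", "tree2.png"], demo_converter)
print(converted)
print(needs_manual)
```

The images left in `needs_manual` are the ones you would then open in one of the semi-automated tools above.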

There is still a lot of room for improvement and I am hoping that someone out there will make further progress on this interesting challenge.

Of course, none of these programs would be necessary if we all shared our trees, and that would only be possible if we had a useful phylogenetic standard <= this statement should please the TDWG Interest Group on Phylogenetic Standards ;)

Reference:

Hughes, J. (2011). TreeRipper web application: towards a fully automated optical tree recognition software. BMC Bioinformatics 12: 178. doi: 10.1186/1471-2105-12-178

Friday, May 20, 2011

Insect systematics: you've got to laugh, if you didn't you'd cry

In continuation from my previous post, I have now assembled 43 order-level phylogenies of insects, each based on more or less independent sources of data. The oldest study included is from 1993, so I still have my work cut out to find trees published before then, especially as it becomes increasingly hard to get hold of the articles the further back in time you go.
As more phylogenies are included, it also becomes hard to visualize this increasingly complex network on a 2D screen, and I wish I could explore it in a more intuitive way.

Wednesday, May 04, 2011

Many outstanding questions in the phylogenetic relationships of insect orders

I am trying to get my head around the multiplicity of phylogenetic hypotheses for the relationships among insect orders, in continuation from my previous post. I have been gathering insect phylogenies from the literature, both morphological and molecular. I wanted to illustrate where the hypotheses conflict, so I used a SuperNetwork with no edge weights in SplitsTree. This gives an idea of how much conflicting evidence there still is at the base of the Pterygota, and also of the large number of studies that have focused on the Endopterygota, in particular the relationship of the Strepsiptera to the other orders. Many of the orders have only been included in one study, in particular the basal orders. What I would like to do at some point is show how the insect phylogeny has changed over time by layering the phylogenies chronologically onto one another to form the above SuperNetwork.
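
To make "conflicting hypotheses" concrete, here is a small Python sketch that lists the groupings on which two trees disagree. It is a simplification and not what SplitsTree does: it treats the trees as rooted, compares clades rather than unrooted splits, and assumes both trees contain the same taxa. The two topologies are toy examples that differ only in where Strepsiptera sits; they are not taken from the studies I collected, and the function names are my own.

```python
def clades(tree):
    """Return the leaf set of every internal node in a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a tip label
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def conflicts(tree_a, tree_b):
    """Pairs of clades that overlap without one containing the other."""
    return [
        (a, b)
        for a in clades(tree_a)
        for b in clades(tree_b)
        if a & b and not (a <= b or b <= a)
    ]


# Toy topologies (assumptions for illustration only).
tree_a = ((("Strepsiptera", "Diptera"), "Coleoptera"), "Hymenoptera")
tree_b = ((("Strepsiptera", "Coleoptera"), "Diptera"), "Hymenoptera")

for a, b in conflicts(tree_a, tree_b):
    print(sorted(a), "conflicts with", sorted(b))
```

Two clades that are nested or disjoint can sit in the same tree; the ones reported here cannot, and it is this kind of incompatibility that shows up as boxes in a network.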