Tree of Life Informatics Infrastructure
Tree of Life Programme
The Tree of Life projects will generate tens of thousands of high-quality genomes over the coming years – more than have ever been sequenced! It is a challenging and extremely exciting task that will shape the future of biology, and the team’s role is to provide the platform for assembling and analysing those genomes at an unprecedented scale. We are the interface between the Tree of Life teams (assembly production and faculty research) and Sanger’s IT teams, working together with the informatics teams of the other programmes.
The team is organised into three areas.
Data management
Our data curators and managers maintain the integrity, consistency, and quality of multiple databases used in production, including Genomes on a Tree (GoaT), Sample Tracking System (STS), Collaborative Open Plant Omics (COPO), and BioSamples.
Bioinformatics
Our bioinformaticians develop the suite of analysis pipelines that will run on every genome produced in Tree of Life, providing a central database of core results available to all.
Systems
We develop and maintain some core systems used in production, including the execution and tracking of all bioinformatics pipelines, and the deployment of third-party web applications for internal use.
The team uses a wide range of technologies, frameworks and programming languages, including Nextflow, Python, Conda, Jira, LSF, Singularity, and Kubernetes. The technology wheel below shows most of their logos. How many can you recognise? Let us know on the Sanger Tree of Life Twitter account.
Core team
Dr Tyler K. Chafin
Senior Bioinformatician
Mr Paul Davis
Data Manager
Ene Göktan
Informatics & Digital Associate
Dr Cibele Sotero-Caio
Genomic Data Curator - Tree of Life Genomics