Vertebrate Resequencing
Informatics
Archive Page
This page is maintained as a historical record and is no longer being updated.
The Vertebrate Resequencing group has become the Tree of Life Assembly group to support the work of the Tree of Life Programme.
We played lead or key roles in the data processing and analysis of large-scale sequencing projects such as 1000 Genomes, the Mouse Genomes Project, UK10K, HipSci, and the Haplotype Reference Consortium, among others.
In collaboration with the Durbin and GRIT groups at the Sanger Institute, along with a number of external partners, we joined the Vertebrate Genomes Project and Genome 10K to begin producing genome assemblies for hundreds to thousands of species, using cutting-edge long-read sequencing technologies like PacBio, Oxford Nanopore and 10x alongside Illumina.
Software
We developed tools and software to meet our data management and analysis needs at scale.
BCFtools is a set of tools for variant calling and manipulating variant data stored in VCF and BCF files. We also contributed to the development of HTSlib and SAMtools.
We developed pipelines and pipeline management systems to track and process our data. The 1000 Genomes and UK10K projects were made possible by the VRPipe and vr-runner systems. With the Sanger Institute recently moving to a cloud-oriented compute infrastructure, we are developing a new workflow runner (wr) system.
Services
As part of our work with the Haplotype Reference Consortium, we developed a free genotype imputation and phasing service, the Sanger Imputation Service.
Our people
Core team
Mr Sendu Bala
Principal Software Developer
Previous core team members
Dr Dirk-Dominik Dolle
Senior Bioinformatician
Yasin Memari
Senior Bioinformatician