DNA Pipelines
Scientific Operations
Here we outline how the sequencing platforms operate within DNA Pipelines Operations and discuss how researchers can access our technologies.
The Wellcome Sanger Institute was founded to exploit DNA sequence in order to understand biology and disease. That imperative remains as strong today as the Institute capitalises on leading-edge technologies to answer questions that were unanswerable only a few years ago. We have always used and will continue to use the most appropriate technology to provide the data to solve important questions in biomedicine and genomic variation.
The enormous capacity of DNA Pipelines Operations is essential to the scientific aims of the Wellcome Sanger Institute and its status as a world leader in genome research. It runs one of the largest DNA sequencing facilities in the world, comprising a large fleet of Illumina® instruments. The library preparation and sequencing facilities are further supported by an automated sample management lab, which processes in excess of 25,000 samples per month. The facility is capable of producing more than 3,000 terabases of DNA sequence per month.
DNA Pipelines Operations is split into a number of teams, each supported by a team leader, working closely together to ensure seamless processing of the samples through the pipelines:
- Scientific Customer Support
- Sample Management
- Illumina® high-throughput sequencing teams
- Bespoke sequencing
- Long read sequencing
- Data QC
Sample Management
The Sample Management pipeline registers and processes more than 350,000 samples per year. It provides extraction of DNA and RNA from a wide range of biological materials and delivers automated high-throughput DNA quantification and normalisation. This material then enters one of the library preparation teams for further processing and sequencing.
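The normalisation step mentioned above can be illustrated with the standard dilution relation C1 × V1 = C2 × V2. The following is a minimal sketch only, not the pipeline's actual software: the function name, the units (ng/µl and µl) and the example values are assumptions chosen for illustration.

```python
def normalisation_volumes(stock_conc, target_conc, final_volume):
    """Return (sample_volume, diluent_volume) for a simple dilution.

    Uses C1 * V1 = C2 * V2, where C1 is the measured stock concentration,
    C2 the target concentration and V2 the desired final volume.
    Units are assumed to be ng/ul for concentrations and ul for volumes.
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_conc < target_conc:
        # A sample below the target cannot be reached by dilution alone.
        raise ValueError("sample is already below the target concentration")
    sample_volume = target_conc * final_volume / stock_conc
    diluent_volume = final_volume - sample_volume
    return sample_volume, diluent_volume


# Hypothetical example: 50 ng/ul stock diluted to 10 ng/ul in 100 ul.
sample_ul, diluent_ul = normalisation_volumes(50.0, 10.0, 100.0)
print(f"sample: {sample_ul:.1f} ul, diluent: {diluent_ul:.1f} ul")
```

In an automated setting, a calculation of this kind would typically be applied to every well of a plate after quantification, with samples below the target concentration flagged for review instead of being diluted.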
Illumina® library preparation and sequencing
Library preparation for Illumina® sequencing is split into three main areas: DNA, RNA and bespoke products. Our laboratory processes are constantly assessed and upgraded to provide a high-throughput library preparation service that processes samples in a standardised manner and delivers Illumina-ready libraries at an ever-increasing throughput. During library preparation, samples are uniquely indexed, allowing multiple sample libraries to be multiplexed on a single lane; the customer can then demultiplex the reads bioinformatically for data analysis. Together, the capacity of the sequencing machines and this multiplexing of sample libraries into a single lane routinely generate more than 3.5 Tb per run. Typically, our libraries now achieve a 99.8 per cent pass rate, and sequencing lane failure rates have fallen to less than 5 per cent.
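To make the indexing idea concrete, the sketch below shows one simple way reads from a multiplexed lane could be assigned back to samples by their index sequence. It is an illustration under stated assumptions, not the method used by the facility or by Illumina software: the in-memory read representation, the sample names, the index sequences and the one-mismatch tolerance are all hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two index sequences."""
    if len(a) != len(b):
        # Treat sequences of different length as maximally different.
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, sample_indexes, max_mismatches=1):
    """Group reads by sample using their index sequence.

    reads          -- iterable of (index_sequence, read_sequence) pairs
    sample_indexes -- dict mapping sample name to its expected index
    A read is assigned only if exactly one sample index lies within
    max_mismatches; otherwise it goes to the "undetermined" bin.
    """
    bins = defaultdict(list)
    for index_seq, read_seq in reads:
        matches = [
            sample
            for sample, expected in sample_indexes.items()
            if hamming(index_seq, expected) <= max_mismatches
        ]
        key = matches[0] if len(matches) == 1 else "undetermined"
        bins[key].append(read_seq)
    return dict(bins)


# Hypothetical sample sheet and reads, for illustration only.
sample_indexes = {"sample_A": "ACGTACGT", "sample_B": "TGCATGCA"}
reads = [
    ("ACGTACGT", "read1"),  # exact match to sample_A
    ("ACGTACGA", "read2"),  # one mismatch from sample_A
    ("TGCATGCA", "read3"),  # exact match to sample_B
    ("GGGGGGGG", "read4"),  # matches neither index
]
result = demultiplex(reads, sample_indexes)
print(result)
```

Requiring a unique match within the mismatch tolerance is a conservative choice: a read whose index is ambiguous between two samples is left undetermined so that it cannot contaminate either sample's data.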
The Illumina® sequencing teams have the following platforms at their disposal: NovaSeq 6000, HiSeq X, HiSeq 4000 and MiSeq. This enormous sequencing capacity is continually expanding so that the facility can meet the needs of a scientific community undertaking studies of a design and scale that cannot be conducted in most biomedical research institutes. The next-generation sequencing performed by the Illumina® teams uses glass microchip-based methods and small-volume liquid handling (microfluidics) to sequence DNA more quickly and more cheaply than ever before.
Data QC
DNA Pipelines Operations has a dedicated data quality control (QC) team of highly trained analysts who evaluate all sequencing runs from the Illumina® instruments. Through this team, we work with the faculty scientists to resolve any issues before data are released and published to the scientific community, ensuring the integrity of the sequencing data released by the Wellcome Sanger Institute.
Long Read technologies
Alongside Illumina® sequencing, we have a team dedicated to long read sequencing technologies, with state-of-the-art machines from Pacific Biosciences and Oxford Nanopore Technologies. This team supports research projects requiring longer sequence read lengths and also runs an optical mapping platform using the BioNano Saphyr® instrument.
Running a facility of this size requires a huge amount of support. We work closely with the Wellcome Sanger Institute IT team, which maintains the extensive computing and storage infrastructure required; the sequencing informatics team, which develops software tools to process, analyse, store and track all the samples and data for the pipelines; and the DNA Pipelines Development team, who devise novel and improved protocols to take better advantage of this new technology. Where appropriate, we also work with the Institute faculty scientists to facilitate their research by providing novel and bespoke solutions to their sequencing needs.