Thirty years on: one human genome a day
“In general, sequences of from 15 to about 200 nucleotides … can be determined with reasonable accuracy.” With these words, Fred Sanger, Steve Nicklen and Alan Coulson outlined the success of the method for sequencing DNA they published 30 years ago this week: the technique that launched a thousand genome projects.
Although this technology will continue to be used, the Wellcome Trust Sanger Institute has now made a major investment in new-technology sequencing platforms that will drive new exploration of the genetics of disease to a scale unattainable by current methods. The investment reinforces the role of the Institute as one of the largest sequencing centres on earth.
Using the new platforms, Institute researchers will be able to detect genetic variants important for disease with a sensitivity not previously possible. Institute scientists will study common diseases with a genetic foundation, such as diabetes, carry out rapid discovery of variation in cancer samples and uncover sequence variants in pathogens, such as MRSA and malaria.
In addition, the Institute expects to increase its portfolio of research through developing new sequencing projects with collaborators.
The range of new-technology sequencing machines will run alongside the Institute’s existing capillary-based platforms. When fully deployed, the new platforms will boost the Institute’s DNA sequencing capacity from 110M bases per day to more than 6500M bases – or a complete diploid human genome – per day.
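As a rough check on these figures, the arithmetic can be sketched in a few lines of Python. The two capacity values come from the paragraph above; the diploid genome size of about 6,000 million bases is an assumption added here for illustration, and the calculation counts raw bases only, ignoring the repeated coverage that real sequencing projects need.

```python
# Capacity figures quoted in the article, in millions of bases per day.
OLD_CAPACITY_MB = 110
NEW_CAPACITY_MB = 6500

# Assumption (not stated in the article): a diploid human genome is
# roughly 6,000 million bases, i.e. two copies of a ~3,000M-base genome.
DIPLOID_GENOME_MB = 6000


def fold_increase(old_mb, new_mb):
    """Return how many times larger the new daily capacity is."""
    return new_mb / old_mb


def days_per_genome(capacity_mb_per_day, genome_mb=DIPLOID_GENOME_MB):
    """Return days needed to produce one genome's worth of raw bases."""
    return genome_mb / capacity_mb_per_day


if __name__ == "__main__":
    print(f"Fold increase: {fold_increase(OLD_CAPACITY_MB, NEW_CAPACITY_MB):.0f}x")
    print(f"Days per genome (capillary): {days_per_genome(OLD_CAPACITY_MB):.1f}")
    print(f"Days per genome (new platforms): {days_per_genome(NEW_CAPACITY_MB):.2f}")
```

Under these assumptions the increase is close to sixty-fold: a single pass over a diploid genome drops from about eight weeks of output to less than one day.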
“Fred Sanger is the often-overlooked father of genomics. His methods have driven the genetic exploration of human disease for thirty years – a remarkable achievement. By investing in new technologies, we build on his achievements to maximize the effectiveness of our dual role as a leading research institute and as a provider of data to the community.”
Professor Allan Bradley, Director of the Wellcome Trust Sanger Institute
We strive to be at the forefront of partnerships to understand human disease as well as providing the resources for others to make these gains. This investment will drive these programmes.
The Institute is already a leading partner in collective research efforts, such as the Wellcome Trust Case Control Consortium, the Copy Number Variation Project and the ENCODE project, that provide genomic and genetic information to biomedical researchers. It is also a leading provider of access to genomic information through its websites, such as Ensembl (co-run with the European Bioinformatics Institute), which attract more than 13M hits each week.
The new investment will enhance the Institute’s ability to contribute to the global efforts to understand the genetic underpinning of disease. Institute researchers will be able to study the genomes of hundreds of cancer samples to uncover the many mutations that can lead to cancers. Research on pathogens will also be accelerated: several pathogen genomes have been analysed in trials of the technologies, proving their value in comparative studies.
“Fred Sanger’s original sequences were from a virus that infects bacteria. The new technologies will revolutionize our ability to study organisms that cause infection in humans and animals.
“We will gain a deep view of genetic variation in organisms such as Salmonella typhi or Plasmodium. These organisms are extremes: S. typhi has very little variation, being virtually identical wherever it is isolated, whereas Plasmodium is highly variable.
“Very deep sequencing will allow us to identify rare variants of Salmonella to differentiate strains and trace transmission routes and response to vaccination. For Plasmodium, we can at last analyse many variants and many strains, which will allow us to identify the genetic causes of virulence and drug resistance.
“We can also undertake new programmes: we will sequence draft genomes from helminths – which infect 2 billion people worldwide – in only weeks: previously this would have taken months or years. Genome sequences for these intractable organisms from this effort will underpin and accelerate future efforts towards developing novel drugs and vaccines.”
Professor Julian Parkhill, Head of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute
Comparative genomics and large-scale genome biology – two effective tools in biomedicine – require production of and access to massive amounts of sequence data. In cancer biology, one aim is to uncover the mutations that drive cancer in hundreds of different cancer types.
“Fred Sanger and his colleagues developed the techniques that in turn led to the development of systems that have brought us so much understanding of the cancer genome. But the new systems will transform cancer genomics.
“Until now, ‘cancer genome’ surveys, although impressive, have been restricted to investigating either a large number of genes in a small number of cancers or a relatively small number of genes in tens of cancers. However, even these analyses have addressed only the 2 per cent of the human genome that includes protein-coding regions, leaving the rest untouched and unknown.
“New technologies open important new approaches. We now have the possibility of cataloguing changes in all of the genome in a number of cancer types. We will also be able to analyse hundreds of cancer samples, surveying essentially all genes. We will learn about the processes that generate cancer and the spectra of mutations in different cancers in a way that would take many years with dideoxy technology.”
Professor Mike Stratton, Co-Head of the Cancer Genome Project at the Sanger Institute
Sequence data will continue to be released as rapidly as possible and new IT infrastructure has been established to provide ready access for the worldwide research community.
“The techniques we have used to find differences have been either partial – usually studying only the protein-coding parts of genes – or indirect – looking at subsets of common sequence variants (SNPs), often linked to the HapMap. It has now become cost-effective to get all the differences for a sample, revealing all the variants that might underlie disease (or influence a pathogen’s virulence).
“An important goal is to ensure that this increased output does not simply add noise by pumping out more sequence into the research arena. We will provide the raw data but also supply the added value of information extracted from these massive datasets through resources such as Ensembl. We are committed to making biology and biomedicine easier, not more difficult.”
Dr Tim Hubbard, Head of Bioinformatics at the Sanger Institute
The Institute will devote more of its sequence capacity to novel collaborative projects that will support the wider community, enhancing access to this unique resource.
“Sequence production was the foundation for the Institute and, alongside our development of postgenomic research, will continue to play a vital role in the Sanger Institute. We are driven to understand the genetic basis of human disease and the new technologies will accelerate our quest.
“Fred Sanger is a quiet giant, whose discoveries and inventions transformed our research world.”
Professor Allan Bradley, Sanger Institute
Sanger developed more than the DNA sequencing method: with his colleagues, he pioneered the use of thin gels to give good resolution of DNA bases and the use of phage M13 to help sequencing and, 25 years ago, produced the first whole genome shotgun sequence – of a bacterial virus called lambda. And, of course, he won his first Nobel Prize in 1958 for developing a method to sequence proteins – still the dearest to his heart.
Sanger’s DNA legacy will continue as dideoxy sequencing runs alongside the new technologies. His achievements deserve continuing respect, although defence of them would never come from Fred Sanger.
In a review published in the Annual Review of Biochemistry in 1988, he commented: “Of the three main activities involved in scientific research, thinking, talking, and doing, I much prefer the last and am probably best at it. I am all right at the thinking, but not much good at the talking.”
But then his work speaks for him.
More information
The Wellcome Trust Sanger Institute
The Wellcome Trust Sanger Institute, which receives the majority of its funding from the Wellcome Trust, was founded in 1992. The Institute is responsible for the completion of the sequence of approximately one-third of the human genome as well as genomes of model organisms and more than 90 pathogen genomes. In October 2006, new funding was awarded by the Wellcome Trust to exploit the wealth of genome data now available to answer important questions about health and disease.
The Wellcome Trust
The Wellcome Trust is the largest charity in the UK. It funds innovative biomedical research, in the UK and internationally, spending around £500 million each year to support the brightest scientists with the best ideas. The Wellcome Trust supports public debate about biomedical research and its impact on health and wellbeing.