Missing pages from the "Book of Life"

New map of human genetic variation - International team identifies sites of copy number variation across all human chromosomes

We always knew that we each had our own, individual copy of The Book of Life, where the spellings of our genetic code differed ever so slightly. But a series of scientific studies published today show that it’s not only single letters but sentences, paragraphs, and even whole pages that can be missing or duplicated. In the leading publication in Nature, an international team has produced a map of such changes among 270 copies of the human genetic code that is already revealing new routes for finding genes involved in disease.

The Human Genome Project delivered a reference sequence for a human genome. To identify genes involved in disease, many focused studies, including the HapMap Project, have mapped single-letter differences (called single nucleotide polymorphisms or SNPs) between individuals and compared them to the human reference DNA sequence.

But the reference sequence has also provided the foundation for an entirely new search for variation, one that was not readily identifiable before. This is the search, not for single-letter differences, but for larger regions that are absent from, or duplicated in, different individuals. This analysis of copy number variation (CNV) has revealed a whole new vista of genetic variation, with dramatic implications for disease studies.

“Each one of us has a unique pattern of gains and losses of complete sections of DNA and one of the real surprises of these results was just how much of our DNA varies in copy number. We estimate this to be at least 12 per cent of the genome, similar in extent to SNPs. This has never been shown before.

“The copy number variation that researchers had seen before was simply the tip of the iceberg, while the bulk lay submerged, undetected. We now appreciate the immense contribution of this phenomenon to genetic differences between individuals.”

Dr Matthew Hurles, one of the project leaders at the Wellcome Trust Sanger Institute

The new map will change the way in which scientists search for genes involved in disease. While the SNP maps produced by the HapMap and other work are invaluable, most CNVs are missed by these maps. One striking example is resistance to infection by HIV, which is determined in part by multiple copies of the gene CCL3L1, and is essentially invisible to SNP-based maps of genomic variation.

“Many examples of diseases resulting from changes in copy number are emerging. A recent review lists 17 conditions of the nervous system alone – including Parkinson’s Disease and Alzheimer Disease – that can result from such copy number changes.

“Indeed, medical research will benefit enormously from this map, which provides new ways for identifying genes involved in common diseases.”

Charles Lee, one of the project leaders from Brigham and Women’s Hospital and Harvard Medical School in Boston, USA

In comparing their results with the authoritative database of disease-related genes Online Mendelian Inheritance in Man, the team found that 10 per cent of these genes were associated with CNVs. Genes that are involved in the immune system and in brain development and activity – two functions that have evolved rapidly in humans – tend to be enriched in CNVs. By contrast, genes that play a role in early development and some genes involved in cell division, both critical to fundamental biology, tend to be spared.

The conclusions are dramatic.

“I believe this paper will change forever the field of human genetics. One can no longer consider human traits as resulting primarily from single base-pair changes or influenced only by SNPs. With all due respect to Watson and Crick, many Mendelian and complex traits, as well as sporadic diseases, may indeed result from structural variation of the genome.”

Professor James R. Lupski, Vice Chair, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas

The global CNV map is transforming medical research in four areas. The first and major area is the hunt for genes underlying common diseases, a search that has not taken CNVs into account to date. Second, the CNV map is being used in the study of familial genetic conditions. Third, there are thousands of severe developmental defects caused by chromosomal rearrangements. The CNV map is being used to exclude variation found in unaffected individuals, helping researchers to home in on the region that might be involved. Finally, as with HIV, it will be possible to find variants that protect against other infectious diseases, such as malaria.

“In some ways, the methods we have used are molecular microscopes which have transformed the techniques used since the foundation of clinical genetics, where researchers used microscopes to look for visible deletions and rearrangements in chromosomes.

“With these new tools, we and our clinical colleagues are able to find previously undetectable deletions or duplications of the genome in a patient. The CNV map now allows us to identify which of these changes are unique to the disease.”

Dr Nigel Carter, another of the project leaders at the Wellcome Trust Sanger Institute
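
The filtering idea described above, setting aside changes in a patient that are also seen in unaffected individuals, can be illustrated with a minimal sketch. This is not the team's method: the `Region` type, the function names and all coordinates below are hypothetical, and treating any overlap with a mapped CNV as grounds for exclusion is a deliberate simplification.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A stretch of a chromosome (hypothetical, simplified representation)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one position."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def candidate_regions(patient_cnvs, reference_cnvs):
    """Keep only patient CNVs that overlap no CNV seen in unaffected individuals."""
    return [
        p for p in patient_cnvs
        if not any(overlaps(p, r) for r in reference_cnvs)
    ]


# Made-up coordinates, for illustration only.
reference_map = [Region("chr1", 1000, 5000), Region("chr17", 200, 900)]
patient = [
    Region("chr1", 4000, 6000),   # overlaps a mapped CNV, so it is set aside
    Region("chr2", 100, 300),     # not in the map, so it is kept
    Region("chr17", 900, 1200),   # adjacent to a mapped CNV but not overlapping
]

candidates = candidate_regions(patient, reference_map)
print(candidates)
```

In practice a comparison of this kind would also need to weigh the size of the overlap, the direction of the change (gain or loss) and how common the variant is, none of which this sketch attempts.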

To increase the value of the map to researchers, the Wellcome Trust Sanger Institute and its partners have developed a database of CNVs associated with clinical conditions. The database, called DECIPHER, allows researchers around the world to submit, over the internet, clinical information about patients together with details of their CNVs. This patient information is then mapped onto the human genome in the public ENSEMBL browser, which enables collaborative investigations of these rare disorders. In this way, DECIPHER has already helped in the identification of new syndromes with subsequent improvements in care and genetic advice for affected individuals and families.

“The wide variation between individuals in the number of repeated or deleted portions of our DNA has not been appreciated until now. This important work will help identify genetic causes of many diseases. All of the new data is in the public domain emphasizing the commitment of research funders in making the results of research accessible to all.”

Dr Mark Walport, Director of the Wellcome Trust

Copy number variation is the result of several different mechanisms, some of which remain poorly understood. Many studies to date suggest that larger CNVs occur in regions of the human genome that contain, or are flanked by, duplicated or repeated DNA sequences. Such regions are prone to errors when chromosomes are shuffled before being passed on from parent to child. Some smaller CNVs appear not to be dependent on these repeated sequences. The new research identifies many more of these smaller CNVs and will greatly advance our understanding of what is perhaps the most poorly understood mutational process operating in the human genome.

The map also tells us something of our shared history. As a result of our recent common origin in Africa, the vast majority of copy-number variation – around 89 per cent – is shared among the diverse human populations studied.

Nevertheless, the pattern of CNV that each of us inherits subtly reflects our ancestry and can be used to infer in which of the three continental populations our recent ancestry lies.

Striking differences in regions of our genome between different continental populations will define variants that have allowed different populations to adapt to their different environments. One example is the strikingly increased copy number of the HIV-related CCL3L1 gene in African populations. An understanding of how genetic variation is distributed among populations not only tells us about human prehistory but also improves our ability to find disease genes.

More information

Participating Centres

  • The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  • Genome Science, Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 Japan
  • Affymetrix, Inc., Santa Clara, CA, USA
  • The Centre for Applied Genomics and Program in Genetics and Genomic Biology, The Hospital for Sick Children, MaRS Centre - East Tower, 101 College Street, Rm. 14-701, Toronto, Ontario, M5G 1L7, Canada
  • Department of Molecular and Medical Genetics, University of Toronto, Canada
  • Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA 02115
  • Genes and Disease Program, and Barcelona CeGen Unit, Center for Genomic Regulation, Barcelona, Catalonia, Spain
  • Dependable and High Performance Computing, Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 Japan
  • Departments of Medical Genetics and Pediatrics, University of Alberta, Edmonton, Canada
  • Department of Human Genetics, University of Chicago, 920 East 58th Street, Chicago, Illinois, USA
  • Department of Life and Health Sciences, Pompeu Fabra University, Barcelona, Catalonia, Spain
  • Japan Science and Technology Agency, Kawaguchi, Saitama, 332-0012, Japan

Selected websites

  • The Hospital for Sick Children (SickKids)

    The Hospital for Sick Children, affiliated with the University of Toronto, is Canada’s most research-intensive hospital and the largest centre dedicated to improving children’s health in the country. As innovators in child health, SickKids improves the health of children by integrating care, research and teaching. Our mission is to provide the best in complex and specialized care by creating scientific and clinical advancements, sharing our knowledge and expertise and championing the development of an accessible, comprehensive and sustainable child health system. SickKids is committed to healthier children for a better world.

  • Brigham and Women's Hospital

    Brigham and Women’s Hospital is a 747-bed nonprofit teaching affiliate of Harvard Medical School and a founding member of Partners HealthCare System, an integrated health care delivery network. BWH is committed to excellence in patient care with expertise in virtually every specialty of medicine and surgery. The BWH medical preeminence dates back to 1832 and today that rich history in clinical care is coupled with its national leadership in quality improvement and patient safety initiatives, dedication to educating and training health care professionals, and strength in biomedical research. With $370M in funding and more than 500 research scientists, BWH is an acclaimed leader in clinical, basic and epidemiological investigation – including the landmark Nurses’ Health Study, Physicians’ Health Studies, and the Women’s Health Initiative.

  • The Wellcome Trust Sanger Institute

    The Wellcome Trust Sanger Institute, which receives the majority of its funding from the Wellcome Trust, was founded in 1992. The Institute is responsible for the completion of the sequence of approximately one-third of the human genome as well as genomes of model organisms and more than 90 pathogen genomes. In October 2006, new funding was awarded by the Wellcome Trust to exploit the wealth of genome data now available to answer important questions about health and disease.

  • The Wellcome Trust and Its Founder

    The Wellcome Trust is the most diverse biomedical research charity in the world, spending about £450 million every year both in the UK and internationally to support and promote research that will improve the health of humans and animals. The Trust was established under the will of Sir Henry Wellcome, and is funded from a private endowment, which is managed with long-term stability and growth in mind.