Sifting through the Genome Baggage

Press Office 25 Dec 2005

A New Method to Find Important Genome Regions

Evolutionary forces tend to retain important DNA sequences, whilst allowing unimportant sequences to change. Consequently, protein-coding regions – only about 1.5 per cent of the human genome – are similar in all mammalian species.

But there is a further 3 per cent of mammalian genome sequence that does not code for protein, yet is conserved. Are these sequences important or are they merely passengers on the evolutionary journey?

A new study from an international team co-directed by researchers at the Wellcome Trust Sanger Institute and the Broad Institute, published in Nature Genetics, shows that the vast majority of the conserved non-coding (CNC) regions are not simply areas that happen by chance to be free of mutation, but are selectively constrained in their variation. This remarkable conclusion suggests that searches in CNC regions might lead to new discoveries of clinically important variants.

“Although we were aware of CNC regions, we could not tell whether they represented areas of the human genome that were relevant to the working of our genome, or were relics that had no present importance.

“Single-letter differences – called single nucleotide polymorphisms, or SNPs – in our genetic code are rarer in CNCs than in other, non-conserved regions. Crucially, we showed that this was not due to a lower rate of mutation, but to selection in these regions – they are under evolutionary pressure. This suggests these regions, which do not code for protein, perform important functions in our genome.”

Dr Manolis Dermitzakis, Investigator, Division of Informatics at the Wellcome Trust Sanger Institute, and a corresponding author

Our genome includes regulatory DNA sequences, which are important in the control of genetic activity. The structure and sequence of these regions are emerging, but new methods to identify significant sequences are needed. Many of the CNC variants detected here lie in known regulatory regions, but many others lie elsewhere in the genome.

Finding regions of the genome where evolution has acted on variation is like finding a new pool of targets in which to search for mutations that predispose to disease. The study also suggests ways in which the hunt for disease-associated variation can be made more productive.

“Our research suggests that CNCs are as important as coding sequences – but our genome has more than twice as much CNC sequence as gene sequence. This means there will be many more mutations to discover in CNCs that are associated with disease than there are in genes.

“If we include in our research a focus on these locations, we would expect to identify important variants more quickly. Our aim is to use the power of genomic information to improve our understanding of disease. This work suggests a method to harness and focus that power.”

Dr Manolis Dermitzakis, Sanger Institute

Because SNPs in CNCs are relatively rare, they may not be well captured using standard methods of detecting variation (which tend to emphasize more common variants). If these regions are studied in more detail, greater biomedical benefit should follow.

More information

Corresponding Authors

Dr Manolis Dermitzakis, Wellcome Trust Sanger Institute
Joel N. Hirschhorn, Broad Institute of Harvard and MIT

Participating Centres

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, UK
Program in Genomics and Division in Endocrinology, Children’s Hospital, Boston, MA 02115, USA
Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02139, USA
Department of Biomolecular Engineering, University of California Santa Cruz, CA, 95064, USA
Division of Cardiology, Massachusetts General Hospital, Boston, MA 02114, USA
NHLBI’s Framingham Heart Study, Framingham, MA, 01702, USA
Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
Zoological Institute, University of Bern, Bern, Switzerland
Department of Genetics, Harvard Medical School, Boston, MA 02115, USA

Selected websites

The Wellcome Trust Sanger Institute

The Wellcome Trust Sanger Institute, which receives the majority of its funding from the Wellcome Trust, was founded in 1992. The Institute is responsible for the completion of the sequence of approximately one-third of the human genome as well as genomes of model organisms and more than 90 pathogen genomes. In October 2006, new funding was awarded by the Wellcome Trust to exploit the wealth of genome data now available to answer important questions about health and disease.

The Wellcome Trust and Its Founder

The Wellcome Trust is the most diverse biomedical research charity in the world, spending about £450 million every year both in the UK and internationally to support and promote research that will improve the health of humans and animals. The Trust was established under the will of Sir Henry Wellcome, and is funded from a private endowment, which is managed with long-term stability and growth in mind.
