Scientists identify over 140,000 virus species in the human gut, half of which are new to science

The results form the basis of the highly curated Gut Phage Database (GPD), which will be an invaluable resource for those studying bacteriophages and the role they play in regulating the health of both our gut bacteria and ourselves

Viruses are the most numerous biological entities on the planet. Now researchers at the Wellcome Sanger Institute and EMBL’s European Bioinformatics Institute (EMBL-EBI) have identified over 140,000 viral species living in the human gut, more than half of which have never been seen before.

The paper, published today (18 February 2021) in Cell, contains an analysis of over 28,000 gut microbiome samples collected in different parts of the world. The number and diversity of the viruses the researchers found were surprisingly high, and the data open up new research avenues for understanding how viruses living in the gut affect human health.

The human gut is an incredibly biodiverse environment. In addition to bacteria, hundreds of thousands of viruses called bacteriophages, which can infect bacteria, also live in the human gut.

It is known that imbalances in our gut microbiome can contribute to diseases and complex conditions such as inflammatory bowel disease, allergies and obesity. But relatively little is known about the role our gut bacteria, and the bacteriophages that infect them, play in human health and disease.

Using a DNA-sequencing method called metagenomics*, researchers at the Wellcome Sanger Institute and EMBL’s European Bioinformatics Institute (EMBL-EBI) explored and catalogued the biodiversity of the viral species found in 28,060 public human gut metagenomes and 2,898 bacterial isolate genomes cultured from the human gut.

The analysis identified over 140,000 viral species living in the human gut, more than half of which have never been seen before.

“It’s important to remember that not all viruses are harmful, but represent an integral component of the gut ecosystem. For one thing, most of the viruses we found have DNA as their genetic material, which is different from the pathogens most people know, such as SARS-CoV-2 or Zika, which are RNA viruses. Secondly, these samples came mainly from healthy individuals who didn’t share any specific diseases. It’s fascinating to see how many unknown species live in our gut, and to try and unravel the link between them and human health.”

Dr Alexandre Almeida, Postdoctoral Fellow at EMBL-EBI and the Wellcome Sanger Institute

Among the tens of thousands of viruses discovered, a new highly prevalent clade – or group of viruses believed to have a common ancestor – was identified, which the authors refer to as the Gubaphage. This was found to be the second most prevalent virus clade in the human gut, after the crAssphage, which was discovered in 2014.

Both of these viruses seem to infect similar types of human gut bacteria, but without further research it’s very difficult to know the exact functions of the newly discovered Gubaphage.

“An important aspect of our work was to ensure that the reconstructed viral genomes were of the highest quality. A stringent quality control pipeline coupled with a machine learning approach enabled us to mitigate contamination and obtain highly complete viral genomes. High-quality viral genomes pave the way to better understand what role viruses play in our gut microbiome, including the discovery of new treatments such as antimicrobials from bacteriophage origin.”

Dr Luis F. Camarillo-Guerrero, first author of the study from the Wellcome Sanger Institute

The results of the study form the basis of the Gut Phage Database (GPD), a highly curated database containing 142,809 non-redundant phage genomes that will be an invaluable resource for those studying bacteriophages and the role they play in regulating the health of both our gut bacteria and ourselves.

“Bacteriophage research is currently experiencing a renaissance. This high-quality, large-scale catalogue of human gut viruses comes at the right time to serve as a blueprint to guide ecological and evolutionary analysis in future virome studies.”

Dr Trevor Lawley, senior author of the study from the Wellcome Sanger Institute

More information

* Metagenomics is the study of a collection of genetic material (genomes) from a mixed community of organisms. Metagenomics usually refers to the study of microbial communities. The NIH National Human Genome Research Institute has more information here: https://www.genome.gov/genetics-glossary/Metagenomics

Publication:

Camarillo-Guerrero, L.F., et al. (2021). Massive expansion of human gut bacteriophage diversity. Cell. DOI: https://doi.org/10.1016/j.cell.2021.01.029

Funding:

This work was supported by Wellcome and EMBL.

Selected websites

  • European Bioinformatics Institute (EMBL-EBI)

    The European Bioinformatics Institute (EMBL-EBI) is a global leader in the storage, analysis and dissemination of large biological datasets. We help scientists realise the potential of big data by enhancing their ability to exploit complex information to make discoveries that benefit humankind.

    We are at the forefront of computational biology research, with work spanning sequence analysis methods, multi-dimensional statistical analysis and data-driven biological discovery, from plant biology to mammalian development and disease.

    We are part of EMBL and are located on the Wellcome Genome Campus, one of the world’s largest concentrations of scientific and technical expertise in genomics.

    Website: www.ebi.ac.uk

  • The Wellcome Sanger Institute

    The Wellcome Sanger Institute is a world-leading genomics research centre. We undertake large-scale research that forms the foundations of knowledge in biology and medicine. We are open and collaborative; our data, results, tools and technologies are shared across the globe to advance science. Our ambition is vast – we take on projects that are not possible anywhere else. We use the power of genome sequencing to understand and harness the information in DNA. Funded by Wellcome, we have the freedom and support to push the boundaries of genomics. Our findings are used to improve health and to understand life on Earth. Find out more at www.sanger.ac.uk or follow us on Twitter, Facebook, LinkedIn and on our Blog.

  • About Wellcome

    Wellcome exists to improve health by helping great ideas to thrive. We support researchers, we take on big health challenges, we campaign for better science, and we help everyone get involved with science and health research. We are a politically and financially independent foundation. https://wellcome.org/