Informatics Support Group
High Performance Computing
Today’s computational challenges have reached a scale that requires professional high performance computing infrastructure to tackle the largest problems in informatics. To provide the scale and flexibility our scientists need to conduct their research, while also maintaining industry-leading uptime, we deploy and link together high performance compute clusters and OpenStack private cloud environments.
We work in close partnership with scientists and informaticians across the Institute to design, deliver, manage, and develop new solutions. Depending on the project and the required solution, we employ waterfall or agile techniques, including Scrum, CI/CD, DevOps and similar workflow processes.
The infrastructure and solutions we provide have supported the delivery of the Institute’s COVID-19 Genomic Surveillance and variant analysis for the UK Government and almost 250,000 human genome sequences for the UK Biobank project without service interruptions.
Our goal is to deliver and support platforms that are scalable, resilient, cost-effective and, whenever possible, self-service.
We provide:
- High performance compute clusters engineered for the highest possible performance, to tackle the largest informatics jobs our scientists can devise.
- Private cloud environments to provide flexibility and enable transition to the public cloud when necessary, so that we can take advantage of developing technologies and enable our researchers to co-develop and collaborate on national and international projects.
Architecting, building and operating these platforms at scale requires us to develop and use tools that automate our processes whenever possible. In addition, our research colleagues require large quantities of data to be delivered to their compute systems in a timely manner so that they can conduct their analyses. To achieve this, we work in collaboration with the Institute’s scientists to develop an in-depth understanding of their needs, so that we can provide infrastructure solutions that can adapt and grow to meet future demands.
We are constantly re-evaluating and reconfiguring our systems to ensure that we can meet future requirements and power the next wave of research discovery. We have built strong working relationships with vendors and third parties so that, when off-the-shelf solutions are not available, we are able to co-develop, or shape the creation of, the next generation of solutions and features. For example, we currently have a network bandwidth that, in some areas, tops out at 1.6 terabits per second.
Our working relationships place us at the vanguard of delivering new technologies. For example, we developed secure Lustre with DDN and were among the first to adopt iRODS as a data management system for the informatics world. The system has been such a success at the Sanger Institute that it is now the standard for our informatics community: it holds more than 45 petabytes of usable data and has a capacity of almost 60 petabytes.
We are also developing a new solution for data management within clusters. The volumes of data our scientists need to analyse mean that it is more efficient to bring the compute to the data than to move the data itself. In collaboration with IBM, we have introduced ‘Data Manager’, a system that sends each compute job to where its data is held, moving the work rather than the data. In this way, we have been able to ensure that the Institute’s research data is available to our scientists in the most efficient way.
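The data-locality principle described above can be illustrated with a minimal sketch: jobs are dispatched to a node that already holds their input dataset, falling back to the least-loaded node only when no local copy exists. All names here (`Node`, `Job`, `schedule`) are hypothetical illustrations, not part of the actual ‘Data Manager’ system.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    datasets: set            # IDs of datasets stored locally on this node
    queue: list = field(default_factory=list)  # job IDs assigned so far

@dataclass
class Job:
    job_id: str
    dataset: str             # the dataset this job needs to read

def schedule(jobs, nodes):
    """Assign each job to a node that holds its dataset (data locality);
    if no node has a local copy, fall back to the least-loaded node,
    which would then require a data transfer."""
    placements = {}
    for job in jobs:
        local = [n for n in nodes if job.dataset in n.datasets]
        target = min(local or nodes, key=lambda n: len(n.queue))
        target.queue.append(job.job_id)
        placements[job.job_id] = target.name
    return placements
```

For example, with `node-a` holding dataset `ds1` and `node-b` holding `ds2`, jobs reading `ds1` land on `node-a` and jobs reading `ds2` land on `node-b`, so no dataset crosses the network.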
If you are interested in joining our team – where no two days are the same – please visit the Sanger Institute’s Careers board which advertises all our vacancies.
Core team
Helen Cousins
Senior Systems Administrator
Dave Holland
Principal Systems Administrator
Dr Kim Judge
Bioinformatician
Mr Martin O. Pollard
Technical Innovator