Informatics Support Group
High Performance Computing
Today’s computational challenges have reached a scale that requires professional high performance computing infrastructure to tackle the largest problems in informatics. To provide the scale and flexibility our scientists need to conduct their research, while also maintaining industry-leading uptime, we deploy and link together high performance compute clusters and OpenStack private cloud environments.
We work in close partnership with scientists and informaticians across the Institute to design, deliver, manage, and develop new solutions. Depending on the project and the required solution, we employ waterfall or agile techniques, including Scrum, CI/CD, DevOps and similar workflow processes.
The infrastructure and solutions we provide have supported the delivery of the Institute’s COVID-19 Genomic Surveillance and variant analysis for the UK Government and almost 250,000 human genome sequences for the UK Biobank project without service interruptions.
Our goal is to deliver and support platforms that are scalable, resilient, cost-effective and, whenever possible, self-service.
We provide:
- High performance compute clusters engineered for the highest possible performance, to tackle the largest informatics jobs our scientists can devise.
- Private cloud environments to provide flexibility and enable transition to the public cloud when necessary, so that we can take advantage of developing technologies and enable our researchers to co-develop and collaborate on national and international projects.
Architecting, building and operating these platforms at scale requires us to develop and use tools that automate our processes whenever possible. In addition, our research colleagues require large quantities of data to be delivered to their compute systems in a timely manner so that they can conduct their analyses. To achieve this, we work in collaboration with the Institute’s scientists to develop an in-depth understanding of their needs, so that we can provide infrastructure solutions that can adapt and grow to meet future demands.
We are constantly re-evaluating and reconfiguring our systems to ensure that we can meet future requirements and power the next wave of research discovery. We have built strong working relationships with vendors and third parties so that, when off-the-shelf solutions are not available, we are able to co-develop, or shape the creation of, the next generation of solutions and features. For example, we currently have a network bandwidth that, in some areas, tops out at 1.6 terabits per second.
Our working relationships place us at the vanguard of delivering new technologies. For example, we developed secure Lustre with DDN and were among the first to adopt iRODS as a data management system for the informatics world. The system has been such a success at the Sanger Institute that it is now the standard for our informatics community: it holds more than 45 petabytes of usable data and has a capacity of almost 60 petabytes.
We are also developing a new solution for data management within clusters. The volumes of data our scientists need to analyse mean that it is more efficient to bring the compute to the data than to move the data itself. In collaboration with IBM, we have introduced ‘Data Manager’, a system that sends each compute job to where its data is held, moving the work rather than the data. In this way, we have been able to ensure that the Institute’s research data is available to our scientists in the most efficient way.
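The data-locality principle described above can be illustrated with a minimal sketch: jobs are dispatched to a node that already holds their input dataset, falling back to the least-loaded node only when no local copy exists. All names here (`Node`, `Job`, `schedule`) are hypothetical illustrations, not part of the actual ‘Data Manager’ system.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    datasets: set            # IDs of datasets stored locally on this node
    queue: list = field(default_factory=list)  # job IDs assigned so far

@dataclass
class Job:
    job_id: str
    dataset: str             # the dataset this job needs to read

def schedule(jobs, nodes):
    """Assign each job to a node that holds its dataset (data locality);
    if no node has a local copy, fall back to the least-loaded node,
    which would then require a data transfer."""
    placements = {}
    for job in jobs:
        local = [n for n in nodes if job.dataset in n.datasets]
        target = min(local or nodes, key=lambda n: len(n.queue))
        target.queue.append(job.job_id)
        placements[job.job_id] = target.name
    return placements
```

For example, with `node-a` holding dataset `ds1` and `node-b` holding `ds2`, jobs reading `ds1` land on `node-a` and jobs reading `ds2` land on `node-b`, so no dataset crosses the network.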
If you are interested in joining our team – where no two days are the same – please visit the Sanger Institute’s Careers board which advertises all our vacancies.
Core team
Helen Cousins
Senior Systems Administrator
Dave Holland
Principal Systems Administrator
Dr Kim Judge
Bioinformatician
Mr Martin O. Pollard
Technical Innovator