Wellcome Sanger Institute

World's largest database for predicting cancer treatment response based on cancer proteins

Data analysed using deep learning technique to predict the response of cancer cells to treatment

Researchers from the Wellcome Sanger Institute and the Children’s Medical Research Institute (CMRI) in Sydney, Australia, have completed a protein map of 949 cancer cell lines spanning more than 40 cancer types, which have been tested with 650 different treatments. Advanced computational methods were then used to predict the response of cancer cells to treatment.

The paper, published today (14 July) in Cancer Cell, lays the foundation for ongoing efforts to predict the response of an individual cancer to drugs based on the proteins the cancer contains. These data will also inform the development of new treatments.

Every cell in the body contains thousands of different proteins, collectively referred to as the proteome. These proteins are responsible for most of the functions of life, including the behaviour of cancer cells and how they respond to treatment. Clinical cancer specialists have known for many decades that, for some types of cancer, measuring the quantities of a few specific proteins can help guide the choice of the most appropriate treatment. But methods for measuring the thousands of other types of proteins have not been readily available for clinical use.

In this study, CMRI’s ProCan® team developed a high-throughput workflow using mass spectrometry to measure thousands of different proteins in very large numbers of cancers. Using this methodology and 10,000 hours of mass spectrometry instrument time, they generated a proteomic database for the 949 cancer cell lines grown by the Sanger team. The Sanger team analysed the response of each cell line to up to 650 different drugs, and had previously carried out in-depth analysis of the genes and other key molecules in these cell lines.

In contrast to clinical trials, which can each test only one treatment or treatment combination, there is no limit to the number of drugs that can be tested on cancer cell cultures in the laboratory. Generating data on the response of such a large number of cancer cell lines to 650 drugs, together with their comprehensive molecular analysis, has required a major investment of resources and effort over many years by the Wellcome Sanger Institute.

Data scientists from CMRI and Sanger worked together to analyse the results with advanced computational methods, developing a new deep learning technique to use proteomic data to predict the response of the cancer cells to treatment. The results also pinpoint vulnerabilities in cancer cells that provide opportunities for developing new treatments.

“This study has been a collaborative team effort involving proteomics experts, software engineers, data scientists, cancer cell biologists, and oncology researchers that has resulted in important new insights into the interactions among thousands of key molecules within cancer cells, and the response of cancer cells to drug treatments. It is a major step towards ProCan’s goal of using proteogenomic data to help clinicians choose the best treatment for individual cancer patients.”

Professor Roger Reddel, ProCan

The cancer database, which is of unprecedented size for this type of data, is now being made available as a resource for cancer researchers and clinicians around the world. The work at ProCan was done under the auspices of a Memorandum of Understanding between CMRI and the U.S. National Cancer Institute’s International Cancer Proteogenomics Consortium (ICPC), which encourages cooperation among institutions and nations in proteogenomic cancer research in which datasets are made available to the public.

“In addition to revealing new insights about the biology of cancer, this study is also helping to fulfil the mission of my team to generate reference datasets for widespread use in the international cancer research community. This proteomic map will contribute to our Cancer Dependency Map, an effort to systematically identify vulnerabilities in cancer cells to guide drug development.”

Dr Mathew Garnett, Wellcome Sanger Institute

More information

Publication:

Emanuel Gonçalves et al. (2022). Pan-cancer proteomic map of 949 human cell lines. Cancer Cell. DOI: https://doi.org/10.1016/j.ccell.2022.06.010