Human ‘Domainome’ reveals root cause of inherited conditions
Most genetic changes that swap one amino acid for another cause disease by making the protein less stable, according to the largest study of human protein variants to date. Unstable proteins are more likely to misfold and degrade, causing them to stop working or accumulate in harmful amounts inside cells.
The study, published today (8 January) in Nature, helps explain why minimal changes in the human genome, also known as missense variants, can cause disease at the molecular level.
Researchers at the Wellcome Sanger Institute, the Centre for Genomic Regulation (CRG) in Barcelona, and their collaborators, discovered that protein instability is one of the main drivers of inherited cataract formation, and also contributes to some neurological, developmental, and muscle conditions.
The team analysed 563,534 variants, including 621 that are known to cause disease, and catalogued their impact on proteins. Three in five (61 per cent) of the disease-causing variants caused a detectable decrease in protein stability, highlighting the possibility of developing therapies aimed at stabilising proteins. The proportion of destabilising variants was even higher in recessive conditions, where altered copies of a gene from both parents are necessary to cause disease.
The dataset, Human Domainome 1.0 [1], is freely available and could help doctors pinpoint disease-causing variants, leading to faster, more accurate diagnoses. This study also marks a significant advance toward precision medicine and AI-driven bioengineering, where AI might eventually be able to design molecules that stabilise faulty proteins, restoring their shape and function.
Understanding how genetic changes, otherwise known as variants or mutations, affect protein stability is crucial for developing new treatments. However, until now, scientists have only studied a small number of proteins in detail, limiting our ability to predict how small genetic changes might cause disease or design treatments to correct them.
In this new study, researchers focused on ‘domains’ – small, stable sections of human proteins that are crucial targets for understanding disease. To get a broad view of how genetic changes impact function across different types of proteins, they used advanced laboratory techniques to create 563,534 unique protein variants [2] across 522 different domains and measured the effect of each variant on protein stability and abundance.
The team used the Human Domainome 1.0 dataset to investigate some disease-causing genetic changes more closely. For example, beta-gamma crystallins are a family of proteins essential for maintaining lens clarity in the human eye. They found that 72 per cent of genetic changes linked to cataract formation destabilise crystallin proteins, making the proteins more likely to clump together and form opaque regions in the lens.
The study also directly linked protein instability to the development of reducing body myopathy, a rare condition which causes muscle weakness and deterioration, as well as Ankyloblepharon-ectodermal defects-clefting (AEC) Syndrome, a condition characterised by the development of a cleft palate and other developmental symptoms.
However, some disease-causing genetic changes did not destabilise proteins and shed light on alternative molecular mechanisms at play.
For example, Rett Syndrome is a neurological condition which causes severe cognitive and physical symptoms. It is caused by genetic changes in the MECP2 gene, which produces a protein responsible for regulating gene expression in the brain. The study found that many genetic changes in MECP2 do not destabilise the protein but are instead found in regions which affect how MECP2 binds to DNA to regulate other genes. This loss of function could be disrupting brain development and function.
The study also found that the way genetic changes cause disease often relates to whether the disease is recessive or dominant [3].
Genetic changes that cause recessive conditions were more likely to destabilise proteins, while those causing dominant conditions often affected other aspects of protein function, such as interactions with DNA or other proteins, rather than just stability.
Though Human Domainome 1.0 is around 4.5 times bigger than previous libraries of protein variants, it still only covers 2.5 per cent of known human proteins. As researchers increase the size of the catalogue, the exact contribution of protein instability to human disease will become increasingly clear.
In the meantime, researchers can use the information from the 522 protein domains to extrapolate to similar proteins. This suggests that by studying a small subset of proteins deeply, scientists may be able to predict how related proteins will behave.
However, as the team examined protein domains in isolation rather than within full-length proteins, the study might not fully capture how genetic changes affect proteins in their natural habitat inside human cells. To overcome this, researchers plan on studying longer protein domains, and eventually, full-length proteins.
“We reveal, at unprecedented scale, how mutations cause disease at the molecular level. By distinguishing whether a mutation destabilises a protein or alters its function without affecting stability, we can tailor more precise treatment strategies. This could mean the difference between developing drugs that stabilise a protein versus those that inhibit a harmful activity. It’s a significant step toward personalised medicine.”
Dr Antoni Beltran, first author from the Centre for Genomic Regulation, Barcelona
“Human Domainome 1.0 represents a first step toward a comprehensive understanding of how genetic changes affect human health. This means that data from one protein domain can help predict how mutations will impact other proteins within the same family or with similar structures. The ‘rules’ from these 522 domains are enough to help us make educated predictions about many more proteins than there are in the catalogue. Ultimately, we want to map the effects of every possible mutation on every human protein. It’s an ambitious endeavour, and one that can transform precision medicine.”
Professor Ben Lehner, senior author from the Wellcome Sanger Institute
More information
Human Domainome 1.0 data can be accessed here: https://github.com/lehner-lab/domainome
References
1. Human Domainome 1.0 is an enormous library of protein variants. The catalogue includes more than half a million mutations across 522 human protein domains, the bits of a protein which determine its function. It is the largest catalogue of human protein domain variants to date. Protein domains are specific regions which can fold into a stable structure and perform a job independently of the rest of the protein. Human Domainome 1.0 was created by systematically changing each amino acid in these domains to every other possible amino acid, creating a catalogue of all possible mutations. The impact of these mutations on protein stability was discovered by introducing mutated protein domains into yeast cells. The transformed yeast could only produce one type of mutated protein domain, and cultures were grown in test tubes under conditions which linked the stability of the protein to the growth of the yeast. If a mutated protein was stable, the yeast cell would grow well. If the protein was unstable, the yeast cell’s growth would be poor. Using a special technique, the researchers ensured only the yeast cells producing stable proteins could survive and multiply. By comparing the frequency of each mutation before and after the yeast growth, they determined which mutations led to stable proteins and which caused instability.
2. A protein variant is a modified version of a protein that can have different characteristics than the original protein. This can be caused by changes in the gene that encodes the protein.
3. Dominant genetic disorders occur when a single copy of an altered gene is enough to cause the disease, even if the other copy is normal, while recessive conditions occur when an individual inherits two copies of an altered gene, one from each parent.
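For readers who want a concrete picture of the first note, the two ideas it describes (changing each amino acid to every other possible amino acid, then comparing how common each variant is before and after yeast growth) can be sketched in a few lines of Python. This is only an illustration and not the researchers' actual analysis pipeline: the function names, the toy three-letter sequence, and the pseudocount are all assumptions made for this example.

```python
import math

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Return every variant that differs from `sequence` at exactly one position."""
    variants = []
    for pos, wild_type in enumerate(sequence):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                variants.append(sequence[:pos] + aa + sequence[pos + 1:])
    return variants


def enrichment_score(before, after, wt_before, wt_after, pseudocount=0.5):
    """Log change in a variant's read count, relative to the unmutated domain.

    `before` and `after` are the variant's counts before and after growth;
    `wt_before` and `wt_after` are the same counts for the unmutated domain.
    Because both ratios come from the same two samples, the total read depth
    of each sample cancels out. The pseudocount (an assumption of this
    sketch) avoids division by zero for variants that disappear entirely.
    """
    variant_ratio = (after + pseudocount) / (before + pseudocount)
    wt_ratio = (wt_after + pseudocount) / (wt_before + pseudocount)
    return math.log(variant_ratio / wt_ratio)


# Toy example: a hypothetical 3-residue "domain".
library = single_substitutions("MKV")
print(len(library))  # 3 positions x 19 alternative amino acids = 57

# A variant that drops from 100 to 25 reads while the unmutated domain holds
# steady gets a negative score, which would suggest reduced stability.
print(enrichment_score(before=100, after=25, wt_before=100, wt_after=100))
```

A score near zero means the variant grew about as well as the unmutated domain, while a strongly negative score means it was depleted during growth. Real analyses would also need replicate experiments and error estimates, which this sketch leaves out.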
Publication:
A. Beltran et al. (2025) Site saturation mutagenesis of 500 human protein domains. Nature. DOI: 10.1038/s41586-024-08370-4
Funding:
This research was supported by a European Research Council (ERC) Advanced grant and Wellcome. For full funding acknowledgements, please refer to the publication.