Predicting candidate post-translational modification sites from protein sequence using deep learning.
Heyndrickx Sander, 2025
Proteins in our cells are often modified with chemical tags that can act like on/off switches, controlling virtually every cellular process. Understanding these modifications is crucial for medical research, but current technologies face a major bottleneck. When scientists use advanced instruments called mass spectrometers to identify these modifications, they must search through enormous databases of all possible combinations. Current computers simply cannot handle all the possibilities when looking at different types of modifications simultaneously, making comprehensive analyses practically impossible.
This research developed an approach that uses artificial intelligence to predict which protein modifications are biologically plausible. Our tool, peptidoform2speclib, considers only the sites where the AI predicts that modifications are likely to occur in real cells.
Our testing showed that this approach makes analyses 2 to 5 times faster while generally maintaining accuracy. More importantly, our filtering strategy enables analyses whose computational load computers previously could not handle. Although we discovered unexpected complexities in how search programs interpret data when the database composition changes, our method still has the potential to deliver significant performance improvements.
The societal impact extends to medical research and drug development. This strategy finds new modification sites at lower computational cost and makes previously impossible searches possible. This could help uncover unknown sites where modifications can occur and show how different modifications influence each other in cells. Such understanding will help develop diagnostics and therapies for diseases, potentially contributing to medical breakthroughs that improve human health.
| Supervisor | Robbin Bouwmeester |
| Programme | Biomedical Sciences |
| Domain | Systems Biology |
| Keywords | Deep learning, protein modifications, LC-MS/MS, data-independent acquisition |