A new study examines the structure of hundreds of protein complexes, tiny machines in cells that control everything from energy use to DNA replication. Researchers at the Institute for Protein Design (IPD) at the University of Washington led the study, which uses new deep learning tools and may lead to new ways to treat diseases.
This spring, IPD released its deep learning software for predicting protein folding, following a similar tool developed by DeepMind, a subsidiary of Alphabet. These tools stunned researchers with their speed and accuracy in predicting how proteins fold into three-dimensional shapes.
Proteins are chains of amino acids that must fold correctly to work. IPD's RoseTTAFold and DeepMind's AlphaFold have been used to predict the shapes of thousands of proteins since their release.
Within cells, proteins often interact in machine-like protein complexes that perform various tasks. Many approved drugs also interfere with protein complexes, such as chemotherapies that hijack the mechanisms of DNA replication and cell division.
Ian Humphreys, a researcher at the Institute for Protein Design at the University of Washington. (University of Washington photo)
“To truly understand the cellular conditions that lead to health and disease, we must understand how different proteins in cells work together,” Ian Humphreys, a graduate student in the laboratory of IPD head David Baker, said in a press release.
In the new study, Humphreys, Baker and their colleagues modeled most protein interactions in the yeast Saccharomyces cerevisiae. This single-celled organism resembles human cells in how it performs basic functions such as growth, division, waste disposal and environmental sensing, all of which are controlled by protein complexes.
Yeast contains about 6,000 proteins. To predict which of these proteins might interact, the researchers turned to evolutionary biology. As proteins evolve, they often accumulate mutations in tandem: if a building block in one protein changes, the corresponding building block in its partner protein changes too. This tandem change ensures that the complex remains intact.
The researchers looked for pairs of proteins that had acquired mutations in this linked fashion, a sign that they may interact physically. They then used RoseTTAFold and AlphaFold to model the three-dimensional shapes of these interacting proteins.
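The idea of tandem mutations can be illustrated with a toy calculation. The sketch below is not the researchers' pipeline; it swaps in a much simpler, well-known stand-in (mutual information between alignment columns) and runs on made-up sequences. The function names and the toy alignments are assumptions introduced purely for illustration.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), joint in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (joint / n) * math.log2(joint * n / (count_a[a] * count_b[b]))
    return mi


def top_coevolving_positions(msa_a, msa_b, top=3):
    """Rank position pairs (i in protein A, j in protein B) by mutual information.

    msa_a and msa_b are hypothetical toy alignments: row k of each one
    holds the sequence of that protein from the same species.
    """
    cols_a = list(zip(*msa_a))  # transpose rows into columns
    cols_b = list(zip(*msa_b))
    scores = [
        (mutual_information(ca, cb), i, j)
        for i, ca in enumerate(cols_a)
        for j, cb in enumerate(cols_b)
    ]
    # Highest score first; ties broken by position for a stable order
    scores.sort(key=lambda s: (-s[0], s[1], s[2]))
    return scores[:top]


# Made-up data: position 1 of protein A (K or D) changes in step with
# position 1 of protein B (E or R) across four species.
msa_a = ["AKL", "AKL", "ADL", "ADL"]
msa_b = ["GEV", "GEV", "GRV", "GRV"]

best_score, best_i, best_j = top_coevolving_positions(msa_a, msa_b, top=1)[0]
print(best_score, best_i, best_j)
```

In this toy case the only position pair with a nonzero score is the one where the two proteins change together. Real analyses work with thousands of sequences and have to correct for shared ancestry and indirect correlations, which this sketch ignores.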
After screening millions of potential pairs, the deep learning tools identified 1,506 proteins that are likely to interact. From these, the tools successfully modeled 712 protein complexes.
More than 100 of the protein interactions had never been identified before. One new complex contains a protein involved in DNA repair and cancer progression, and another contains an enzyme linked to neurodevelopmental disorders and cancer.
The new findings open the door to future research on these complexes and how they work. They may eventually lead to drugs that interfere with disease-related cellular mechanisms.
David Baker, director of the Institute for Protein Design. (University of Washington photo)
“These models provide experimenters with testable hypotheses,” Qian Cong said in an interview with Science, which published the study on Thursday. Cong, who co-authored the study with Baker, was previously a UW researcher and became an assistant professor at the University of Texas Southwestern Medical Center last year.
The findings also lay the groundwork for later studies that use RoseTTAFold and AlphaFold to map the universe of human protein complexes.
The research involved computational scientists, evolutionary researchers and structural biologists, who helped interpret the three-dimensional protein models.
“As computational methods become more powerful, it is easier than ever to generate large amounts of scientific data, but it still takes scientific experts to make sense of that data,” Baker said in a press release. “This is community science at its best.”