Learning to control protein-binding specificity is useful for both fundamental and applied biology. In fundamental research, a better understanding of complicated signaling networks could be achieved by engineering regulator proteins to bind only a subset of their effector proteins. In applied research such as drug design, nonspecific binding remains a major reason for the failure of many drug candidates. However, developing antibodies that simultaneously inhibit several disease-associated pathways is a rising trend in the pharmaceutical industry. Binding specificity can be manipulated experimentally through various display technologies that allow us to select desired binders from a large pool of candidate protein sequences. We developed an alternative approach for controlling binding specificity based on computational protein design. We can enhance the binding specificity of a protein by computationally optimizing its sequence for better interactions with one target and worse interactions with alternative target(s). Moreover, we can design multispecific proteins that simultaneously interact with a predefined set of proteins. Unlike combinatorial techniques, our computational methods for manipulating binding specificity are fast and low-cost and are, in principle, able to consider an unlimited number of desired and undesired binding partners.
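The idea of optimizing a sequence for one target and against alternatives can be sketched as a simple scoring rule. This is a minimal illustration, not the authors' implementation: the function names, the toy energy table, and the convention that a lower energy means tighter binding are all assumptions made for the example.

```python
def specificity_gap(energies, target, competitors):
    """Energy gap between the best-bound competitor and the target.

    `energies` maps partner name -> binding energy (lower = tighter; an
    assumed convention). A larger gap means a stronger preference for
    the target over every competitor.
    """
    return min(energies[c] for c in competitors) - energies[target]


def most_specific(candidates, target, competitors):
    """Return the candidate sequence with the largest specificity gap."""
    return max(
        candidates,
        key=lambda seq: specificity_gap(candidates[seq], target, competitors),
    )


# Hypothetical energies (arbitrary units) for three made-up sequences.
toy = {
    "seqA": {"T": -10.0, "X": -9.5, "Y": -4.0},
    "seqB": {"T": -9.0, "X": -3.0, "Y": -2.5},
    "seqC": {"T": -6.0, "X": -5.0, "Y": -1.0},
}
best = most_specific(toy, target="T", competitors=["X", "Y"])
print(best)
```

In this toy table, seqA binds the target most tightly but barely discriminates against partner X, whereas seqB gives up some affinity for a much larger gap. Scoring against the best-bound competitor is one design choice among several; a sum or weighted average over competitors would be an equally plausible assumption.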
Manipulations of PPIs (protein-protein interactions) are important for many biological applications such as synthetic biology and drug design. Combinatorial methods have traditionally been used for such manipulations but fail to explain the effects achieved. We developed a computational method for predicting changes in the free energy of binding due to mutation, which brings about a deeper understanding of the molecular forces underlying binding interactions. Our method can be used for computational scanning of binding interfaces and subsequent analysis of the interfacial sequence optimality. The computational method was validated in two biological systems. Computational saturation mutagenesis of a high-affinity complex between the enzyme AChE (acetylcholinesterase) and the snake toxin Fas (fasciculin) revealed the optimal nature of this interface, with only a few predicted affinity-enhancing mutations. Binding measurements confirmed the high optimality of this interface and identified a few mutations that could further improve interaction fitness. Computational interface scanning of a medium-affinity complex between TIMP-2 (tissue inhibitor of metalloproteinases-2) and MMP (matrix metalloproteinase) 14 revealed the non-optimal nature of the binding interface, with multiple mutations predicted to stabilize the complex. Experimental results corroborated our computational predictions, identifying a large number of mutations that improve the binding affinity for this interaction and some mutations that enhance binding specificity. Overall, our computational protocol greatly facilitates the discovery of affinity- and specificity-enhancing mutations and thus could be applied to the design of potent and highly specific inhibitors of any PPI.
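The interface scan described above can be sketched as a loop over single substitutions, each scored by the change in binding free energy relative to the wild type (ddG = dG of the mutant minus dG of the wild type). The scoring function below is a deliberately crude stand-in, and all names are hypothetical; the abstract does not specify the energy model used.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_scan(wild_type, positions, binding_energy):
    """Compute ddG = dG(mutant) - dG(wild type) for every single substitution.

    `binding_energy` is a caller-supplied (hypothetical) function mapping a
    sequence to a predicted binding free energy. A negative ddG is predicted
    to strengthen binding.
    """
    dg_wt = binding_energy(wild_type)
    ddg = {}
    for pos in positions:
        for aa in AMINO_ACIDS:
            if aa == wild_type[pos]:
                continue  # skip the wild-type residue itself
            mutant = wild_type[:pos] + aa + wild_type[pos + 1:]
            ddg[(pos, aa)] = binding_energy(mutant) - dg_wt
    return ddg


def toy_energy(seq):
    """Stand-in score: rewards tryptophan, penalizes glycine (illustrative only)."""
    return -1.0 * seq.count("W") + 0.5 * seq.count("G")


scan = saturation_scan("AGKW", positions=[1, 2], binding_energy=toy_energy)
improving = sorted(mut for mut, value in scan.items() if value < 0)
print(len(scan), len(improving))
```

Under this reading, the fraction of substitutions with negative ddG gives a rough measure of how far an interface is from optimal: few improving mutations suggest a near-optimal interface, many suggest room for affinity maturation.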
Accumulating evidence shows that many proteins have evolved to bind multiple targets, including other proteins, peptides, DNA, and small-molecule substrates. Multispecific recognition might be not only common but also necessary for the robustness of signaling and metabolic networks in the cell. It is also important for the immune response and for the regulation of transcription and translation. Multispecificity presents an apparent paradox: How can a protein encoded by a single sequence accommodate numerous targets? Analysis of the sequences and structures of multispecific proteins has revealed a number of mechanisms that achieve multispecificity. Interestingly, similar mechanisms appear in antibody-antigen, T-cell receptor-peptide, protein-DNA, enzyme-substrate, and protein-protein complexes. Directed evolution and protein design experiments with multispecific proteins offer insights into the evolution of such proteins and help dissect the molecular interactions that mediate multispecificity. Understanding the basic principles governing multispecificity could greatly assist in unraveling various complex processes in the cell. In addition, through manipulation of functional multispecificity, novel proteins could be created for use in various biotechnological and biomedical applications.
DNA cloning and protein engineering are basic methodologies employed for various applications in all life-science disciplines. Manipulation of DNA, however, can be a lengthy process that slows down subsequent experiments. To facilitate both DNA cloning and protein engineering, we present Transfer-PCR (TPCR), a novel approach that integrates, in a single tube, PCR amplification of the target DNA from an origin vector and its subsequent integration into the destination vector. TPCR can be applied to incorporate DNA fragments into any desired position within a circular plasmid without the need to purify the intermediate PCR product and without the use of any commercial kit. Using several examples, we demonstrate the applicability of the TPCR platform for both DNA cloning and multiple-site targeted mutagenesis. In both cases, we show that the TPCR reaction is most efficient within a narrow range of primer concentrations. In mutagenesis, TPCR is primarily advantageous for generating combinatorial libraries of targeted mutants but could also be applied to generate variants with specific multiple mutations throughout the target gene. Adoption of the TPCR platform should facilitate and simplify diverse protein structure and function studies and significantly reduce their time and cost.
Computational prediction of stabilizing mutations in monomeric proteins has become an almost ordinary task. Yet, computational stabilization of protein-protein complexes remains a challenge. Design of protein-protein interactions (PPIs) is impeded by the absence of an energy function that could reliably reproduce all favorable interactions between the binding partners. In this work, we present three energy functions: one function that was trained on monomeric proteins, while the other two were optimized by different techniques to predict side-chain conformations in a dataset of PPIs. The performance of these energy functions is evaluated in three different tasks related to the design of PPIs: predicting side-chain conformations in PPIs, recovering native binding-interface sequences, and predicting changes in the free energy of binding due to mutations. Our findings show that both functions optimized on side-chain repacking in PPIs are more suitable for PPI design than the function trained on monomeric proteins. Yet, no function performs best at all three tasks. Comparison of the three energy functions and their performance revealed that (1) burial of polar atoms should not be penalized as heavily in PPI design as in single-protein design and (2) the contribution of electrostatic interactions should be increased several-fold when switching from single-protein to PPI design. In addition, the use of a softer van der Waals potential is beneficial in cases where backbone flexibility is important. All things considered, we define an energy function that captures most of the nuances of the binding energetics and hence should be used in the future for the design of PPIs.
Natural proteins often partake in several highly specific protein-protein interactions. They are thus subject to multiple opposing forces during evolutionary selection. To be functional, such multispecific proteins need to be stable in complex with each interaction partner, and, at the same time, to maintain affinity toward all partners. How is this multispecificity acquired through natural evolution? To answer this compelling question, we study a prototypical multispecific protein, calmodulin (CaM), which has evolved to interact with hundreds of target proteins. Starting from high-resolution structures of sixteen CaM-target complexes, we employ state-of-the-art computational methods to predict a hundred CaM sequences best suited for interaction with each individual CaM target. Then, we design CaM sequences most compatible with each possible combination of two, three, and all sixteen targets simultaneously, producing almost 70,000 low energy CaM sequences. By comparing these sequences and their energies, we gain insight into how nature has managed to find the compromise between the need for favorable interaction energies and the need for multispecificity. We observe that designing for more partners simultaneously yields CaM sequences that better match natural sequence profiles, thus emphasizing the importance of such strategies in nature. Furthermore, we show that the CaM binding interface can be nicely partitioned into positions that are critical for the affinity of all CaM-target complexes and those that are molded to provide interaction specificity. We reveal several basic categories of sequence-level tradeoffs that enable the compromise necessary for the promiscuity of this protein. We also thoroughly quantify the tradeoff between interaction energetics and multispecificity and find that facilitating seemingly competing interactions requires only a small deviation from optimal energies. 
We conclude that multispecific proteins have been subjected to a rigorous optimization process that has fine-tuned their sequences for interactions with a precise set of targets, thus conferring their multiple cellular functions.
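The "almost 70,000" figure quoted above is consistent with a simple count of design problems. The sketch below assumes, which the abstract does not state explicitly, that a hundred sequences are kept for every design problem (single targets, pairs, triples, and the full set of sixteen) and not only for the single-target designs.

```python
from math import comb

N_TARGETS = 16          # CaM-target complexes stated in the abstract
SEQS_PER_PROBLEM = 100  # assumed to apply to every design problem

# Number of design problems of each kind described in the abstract.
problems = {
    "single targets": comb(N_TARGETS, 1),
    "pairs": comb(N_TARGETS, 2),
    "triples": comb(N_TARGETS, 3),
    "all sixteen": comb(N_TARGETS, N_TARGETS),
}
total_problems = sum(problems.values())
total_sequences = total_problems * SEQS_PER_PROBLEM
print(problems, total_problems, total_sequences)
```

Under these assumptions there are 16 + 120 + 560 + 1 = 697 design problems and 69,700 sequences, which matches the stated order of magnitude.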
Multistate protein design is the task of predicting the amino acid sequence that is best suited to selectively and stably fold to one state out of a set of competing structures. Computationally, it entails solving a challenging optimization problem. Therefore, notwithstanding the increased interest in multistate design, the only implementations reported are based on either genetic algorithms or Monte Carlo methods. The dead-end elimination (DEE) theorem cannot be readily transferred to multistate design problems despite its successful application to single-state protein design. In this article, we propose a variant of the standard DEE, called type-dependent DEE. Our method reduces the size of the conformational space of the multistate design problem while provably preserving the minimal-energy conformational assignment for any choice of amino acid sequence. Type-dependent DEE can therefore be used as a preprocessing step in any computational multistate design scheme. We demonstrate the applicability of type-dependent DEE on a set of multistate design problems and discuss its strengths and limitations.
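The abstract does not give the elimination criterion itself. The sketch below shows one natural reading: a Goldstein-style DEE test in which a rotamer may be eliminated only by a competing rotamer of the same amino acid type, while the minimum over neighboring rotamers is taken over all types. The function name, data layout, and toy energies are assumptions for illustration, and the published criterion may differ in detail.

```python
def type_dependent_dee(types, self_e, pair_e):
    """One pass of a Goldstein-style test restricted to same-type competitors.

    types[i][r]        : amino acid type of rotamer r at position i
    self_e[i][r]       : self energy of rotamer r at position i
    pair_e[i][j][r][s] : pair energy between rotamer r at i and rotamer s at j
    Returns the set of (position, rotamer) pairs flagged as dead-ending.
    """
    n = len(types)
    eliminated = set()
    for i in range(n):
        for r in range(len(types[i])):
            for t in range(len(types[i])):
                # Only a different rotamer of the SAME amino acid type may compete.
                if t == r or types[i][t] != types[i][r]:
                    continue
                gap = self_e[i][r] - self_e[i][t]
                for j in range(n):
                    if j != i:
                        # Worst case for r over every neighbor rotamer, any type.
                        gap += min(pair_e[i][j][r][s] - pair_e[i][j][t][s]
                                   for s in range(len(types[j])))
                if gap > 0:  # t beats r in every context, so r is dead-ending
                    eliminated.add((i, r))
                    break
    return eliminated


# Toy problem: position 0 has two Ala rotamers and one Leu; position 1 has two Ser.
types = [["A", "A", "L"], ["S", "S"]]
self_e = [[0.0, 5.0, 10.0], [0.0, 0.5]]
p01 = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]   # p01[r][s]: rotamer r at 0, s at 1
p10 = [list(col) for col in zip(*p01)]       # transpose for the reverse lookup
pair_e = {0: {1: p01}, 1: {0: p10}}

eliminated = type_dependent_dee(types, self_e, pair_e)
print(eliminated)
```

In this toy problem the high-energy Leu rotamer survives even though an Ala rotamer would dominate it in a standard single-state test, because it is the only option for any sequence that places Leu at that position. Keeping such rotamers is what allows the best conformation to be retained for every sequence, at the cost of fewer eliminations.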