AFsample2 is a generative model based on AlphaFold2 (AF2), capable of predicting multiple conformations of protein structures from sequence. AFsample2 achieves this by introducing more noise to the inference step of the AF2 neural network.
The ability to predict biologically relevant ensembles of protein structures would not only facilitate broader understanding of biological processes but also enable deeper insights into disease mechanisms, opening up new opportunities for targeted drug development.
To increase the diversity of structures predicted by AF2, AFsample2 randomly masks columns in the MSA supplied to AF2, debiasing the model from its reliance on co-evolutionary information. In other words, AFsample2 loosens the constraints imposed by co-evolutionary signals in input MSAs. This favors the prediction of alternative structural states of the query sequence, because breaking the covariance signals forces the network to arrive at varying solutions.
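The masking idea can be illustrated with a minimal sketch. This is not the AFsample2 implementation: the function name `mask_msa_columns`, the mask token `"X"`, the default mask fraction, and the choice to leave the first (query) row untouched are all assumptions made for illustration.

```python
import random


def mask_msa_columns(msa, mask_fraction=0.15, mask_token="X", seed=None):
    """Replace a random subset of MSA columns with a mask token.

    msa is a list of equal-length aligned sequences; msa[0] is treated as
    the query and is left unmasked (an assumption of this sketch).
    Returns (masked_msa, sorted list of masked column indices).
    """
    if not msa:
        return [], []
    if not 0.0 <= mask_fraction <= 1.0:
        raise ValueError("mask_fraction must be between 0 and 1")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all MSA rows must have the same length")

    rng = random.Random(seed)
    n_mask = round(mask_fraction * length)
    masked_cols = set(rng.sample(range(length), n_mask))

    masked = [msa[0]]  # keep the query row intact
    for seq in msa[1:]:
        masked.append(
            "".join(mask_token if i in masked_cols else aa
                    for i, aa in enumerate(seq))
        )
    return masked, sorted(masked_cols)


# Toy alignment: three aligned sequences of length 10.
toy_msa = ["ACDEFGHIKL", "ACDEFGHIKV", "TCDEYGHIKL"]
masked_msa, cols = mask_msa_columns(toy_msa, mask_fraction=0.3, seed=0)
print(cols)
print("\n".join(masked_msa))
```

Calling the function repeatedly with different seeds gives a differently masked MSA for each inference run, which is the sense in which masking injects randomness into the input.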
The column masking approach employed in AFsample2 shares some similarity with that used in SPEACH_AF. However, SPEACH_AF introduces a sliding window of alanines (i.e. alanine scanning) at specific columns, chosen using prior knowledge of interacting residues from existing structural information or from contacts in generated models. AFsample2 requires no such prior knowledge.
In a benchmark on open-closed conformation data sets, AFsample2 enabled the prediction of alternative states for 17 out of 23 cases, without loss of preference for the dominant end-state. For membrane protein transporters, AFsample2 achieved improved alternate state predictions for 12 of 16 test cases.
The improved sampling by AFsample2 also increased the TM-score of predicted end-state conformations relative to experimental structures, raising predictions that previously scored 0.58 to 0.98. Further, AFsample2 improved the prediction and diversity of intermediate states by 70% compared to AF2.
Compared to other methods, including AFcluster, the quality of models generated by AFsample2 was significantly better.
💪 While AFsample2 predicts protein ensembles, it also offers a novel way to select a single alternate end-state structure from the generated conformations. This approach does not require an experimental reference structure and follows a three-stage process:
1️⃣ Calculating the similarity of each conformation in the ensemble to the best model,
2️⃣ Confidence screening to filter out models below a certain threshold, and
3️⃣ Extremity selection to identify the model (alternative state) that is furthest from the most confident model.
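The three stages above can be sketched as follows. This is a hedged illustration rather than the published procedure: the `Model` class, the `select_alternate_state` function, the `min_confidence` threshold, and the toy similarity table are all hypothetical, and in practice the similarity would be a structural measure such as TM-score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    confidence: float  # higher is better; the scale is an assumption here


def select_alternate_state(models, similarity, min_confidence):
    """Pick the most confident model and a candidate alternate state.

    similarity(a, b) returns a score where higher means more alike.
    Returns (best, alternate); alternate is None if nothing passes the filter.
    """
    if not models:
        raise ValueError("the ensemble is empty")
    best = max(models, key=lambda m: m.confidence)

    # Stage 1: similarity of every conformation to the best model.
    sim_to_best = {m.name: similarity(m, best) for m in models}

    # Stage 2: confidence screening, dropping models below the threshold.
    candidates = [
        m for m in models
        if m is not best and m.confidence >= min_confidence
    ]
    if not candidates:
        return best, None

    # Stage 3: extremity selection, i.e. the remaining model least similar
    # to the most confident one.
    alternate = min(candidates, key=lambda m: sim_to_best[m.name])
    return best, alternate


# Toy ensemble with a made-up table of similarities to model "m1".
ensemble = [Model("m1", 0.92), Model("m2", 0.85),
            Model("m3", 0.40), Model("m4", 0.80)]
toy_similarity = {"m1": 1.0, "m2": 0.90, "m3": 0.30, "m4": 0.55}

best, alt = select_alternate_state(
    ensemble,
    similarity=lambda a, b: toy_similarity[a.name],  # b is always "m1" here
    min_confidence=0.7,
)
print(best.name, alt.name)  # m1 m4
```

In the toy data, "m3" is the least similar to the best model but is removed by the confidence screen, so "m4" is selected. This shows why the screening step comes before extremity selection: without it, a low-confidence outlier would be chosen.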
Open Questions and Future Directions:
By addressing these questions, the scientific community can continue to build on the advancements presented by AFsample2, pushing the boundaries of what is possible in protein structure prediction and drug discovery. The ongoing development and refinement of AFsample2 hold the promise of transforming our understanding of protein dynamics and enabling new therapeutic interventions.
Conclusion:
AFsample2 represents a significant advancement in the prediction of diverse protein conformations from sequence data. By introducing noise during the inference step of the AlphaFold2 neural network, AFsample2 overcomes the limitations imposed by co-evolutionary signals in input MSAs. This allows for the prediction of biologically relevant ensembles of protein structures, which can facilitate a deeper understanding of biological processes and disease mechanisms, and open up new opportunities for targeted drug development. The improved sampling capabilities of AFsample2 not only enhance the accuracy of end-state conformations but also significantly improve the diversity and prediction of intermediate states, positioning it as a powerful tool in the field of structural biology.
References:
Paper: AFsample2: Predicting multiple conformations and ensembles with AlphaFold2