Learning to Generate and Score Protein-RNA Complexes
The growing significance of RNA engineering in diverse biological applications has spurred interest in developing AI methods for structure-based RNA design. While diffusion models have excelled in protein design, adapting them for RNA presents new challenges due to RNA’s conformational flexibility and the computational cost of fine-tuning large structure prediction models. In this talk, we present an RNA design workflow with two main components: RNAFlow (generation)and pTMEnergy (scoring). RNAFLow is a flow matching model for protein-conditioned RNA sequence-structure co-generation. Its denoising network integrates an RNA inverse folding model and a pre-trained RosettaFold2NA network for generation of RNA sequences and structures. The integration of inverse folding in the structure denoising process allows us to simplify training by fixing the structure prediction network. PTMEnergy is an unsupervised binding energy prediction model for scoring designed protein-RNA complexes. PTMEnergy leverages the joint Energy-based Model (JEM) framework to convert AlphaFold3’s pAE confidence prediction head into an energy-based model. Compared to AlphaFold3’s ipTM score that operates on a fixed scale, pTMEnergy is continuous and able to capture fine-grained energy differences. We evaluate RNAFlow and pTMEnergy on a set of comprehensive benchmarks and show their superior performance to existing methods, unlocking new avenues for biomolecular interaction scoring and design.