Guest Column | March 31, 2026

AI Foundation Models For RNA Biology

By Haopeng Yu, Ph.D., and Yiliang Ding, Ph.D., John Innes Centre, Norwich Research Park


RNA biology is undergoing a transformative revolution driven by AI foundation models. These models learn the intricate relationships between RNA sequence, structure, and function by training on vast, diverse datasets spanning millions of RNA molecules across many species. Through self-supervised learning on these sequences, the models acquire a generalizable understanding of RNA, which can then be fine-tuned for a range of downstream tasks, thereby enabling the decoding of functional rules embedded in RNA sequences. As RNA foundation models continue to advance and integrate more multimodal biological data, they promise to uncover additional regulatory rules and functions encoded in RNA.

Over the past several decades, humanity’s capacity to generate and collect data has expanded dramatically. However, this explosion of information has also been accompanied by increasingly intricate internal relationships within these “big datasets.” In response, the quest to enable machines to learn patterns from complex, high-dimensional data, and thereby uncover the rules they encode, has driven the evolution of artificial intelligence (AI).1

AI has progressed from classical machine learning (ML), through the transformative era of deep learning (DL), and ultimately toward the rise of modern AI foundation models, which surpass previous paradigms by offering a unified and highly generalizable strategy for learning from vast and heterogeneous datasets.2,3,4 This breakthrough was first demonstrated in generative AI models such as GPT, trained on human language, enabling natural, conversational interaction and giving rise to today’s widely adopted AI agents and assistants.

In biology, researchers have been applying foundation model strategies to decipher the underlying rules of molecular biology. In the protein domain, foundation-model-based approaches such as ESM-2/ESM-3 and AlphaFold 3 have achieved highly accurate structural predictions, outperforming earlier generations of protein modelling tools and fundamentally transforming our ability to model protein structure.5,6

In RNA biology, foundation model-based approaches are only beginning to emerge, but they are rapidly reshaping our ability to decode RNA sequence–structure–function relationships. RNA does far more than carry the coding blueprint for proteins; it also embeds rich, post-transcriptional “regulatory grammar” that governs how transcripts are translated, how quickly they are degraded, and how they interact with RNA-binding proteins and chemical modifications such as RNA methylation.7,8,9

RNA is an intrinsically structural molecule.10 Through Watson–Crick and non-canonical base pairing between reverse-complementary segments, it folds into secondary and higher-order conformations that create binding surfaces, catalytic pockets, and regulatory switches, enabling diverse and sometimes highly complex biological functions.11,12,13 These properties make RNA an especially compelling target for foundation models.
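The base-pairing condition that underlies this folding can be made concrete with a toy sketch. This considers Watson–Crick pairs only; G–U wobble and the other non-canonical pairs mentioned above are deliberately omitted, so it is an illustration rather than a folding algorithm.

```python
# Minimal sketch: testing whether two RNA segments are reverse-complementary,
# the condition that lets them base-pair into a helical stem.
# Watson-Crick pairs only; non-canonical pairs (e.g., G-U wobble) are ignored.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def can_pair(segment_a: str, segment_b: str) -> bool:
    """True if segment_b can base-pair with segment_a in a stem."""
    return segment_b == reverse_complement(segment_a)

# Reversing "GGCAU" gives "UACGG"; complementing gives "AUGCC".
print(can_pair("GGCAU", "AUGCC"))  # True
```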

The Basis Of Foundation Models

Foundation models are large-scale models that learn broad, general-purpose representations from massive, diverse datasets and can be efficiently adapted to a wide range of downstream tasks.14

Building an RNA foundation model is typically described in two stages: pre-training and fine-tuning.14,15 During pre-training, the model is exposed to large collections of diverse, unlabeled RNA sequences from many species and is encouraged to “read” and “understand” them, a process known as self-supervised learning.16 After this stage, the model has acquired a general representation of RNA sequence diversity and context.

Fine-tuning then focuses this general knowledge on a specific biological question. Because the model has already learned rich RNA representations through large-scale pre-training, it can adapt quickly and typically achieves strong predictive performance even with modest amounts of data. Once fine-tuned, the model can be used as an in silico experimental partner.17,18 In addition, interpretability methods, known as explainable AI (XAI), can be applied to extract candidate functional RNA sequences or RNA structure motifs.17

Thus, pre-training equips the model with broad RNA “knowledge,” fine-tuning tailors it to a specific RNA biology question, and interpretability reveals the sequence or structural features behind the predictions.

Datasets For RNA Foundation Models

Pre-training an RNA foundation model usually starts with large-scale transcriptomic datasets spanning multiple species and tissues. The choice of data at this stage is important: if a study targets a particular RNA class or region, then the collected sequences should be constrained accordingly.

Universal RNA models such as Nucleotide Transformer are pre-trained on large multi-species datasets.19 Uni-RNA scales to an even broader corpus sourced from RNAcentral, NCBI and GenomeWarehouse (GWH) databases.20 For ncRNA-centered models, pre-training data are likewise restricted to non-coding transcripts.22,23,24 Other RNA region-specific models narrow the corpus to particular mRNA regions such as UTRs25 or coding sequences.26,27
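Constraining a corpus to a particular RNA class, as these region-specific models do, amounts to a simple filtering step over annotated records. The sketch below is illustrative only: the record layout and `rna_type` field are hypothetical, and real corpora such as RNAcentral carry far richer annotations.

```python
# Illustrative sketch of restricting a pre-training corpus to one RNA class.
# The record format and "rna_type" field are hypothetical stand-ins for the
# annotations a real database would provide.

corpus = [
    {"id": "seq1", "rna_type": "mRNA",  "sequence": "AUGGCUUAA"},
    {"id": "seq2", "rna_type": "ncRNA", "sequence": "GGCUAGCUAGC"},
    {"id": "seq3", "rna_type": "ncRNA", "sequence": "AUCGAUCG"},
]

def filter_corpus(records, rna_type, min_length=0):
    """Keep only sequences of the requested class above a length threshold."""
    return [r for r in records
            if r["rna_type"] == rna_type and len(r["sequence"]) >= min_length]

ncrna_only = filter_corpus(corpus, "ncRNA", min_length=10)
print([r["id"] for r in ncrna_only])  # ['seq2']
```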

More recently, multimodal biological foundation models have begun to jointly pre-train on DNA, RNA and protein information, enabling unified representations across the central dogma.30

Architectures Of The RNA Foundation Models

Most RNA foundation models are built on the Transformer architecture, in which stacked self-attention layers naturally capture the long-range base–base dependencies common in RNA.31

Encoder-only models process the full input sequence and are best suited for prediction tasks such as RNA structure prediction or RBP binding site identification. Decoder-only models generate sequences autoregressively and are better suited for RNA design tasks.34,35 Some frameworks explore encoder–decoder Transformers when mapping sequences to structural or activity representations.36,37
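In practice, the encoder/decoder distinction comes down to the attention mask: an encoder lets every position attend to every other position (good for whole-sequence prediction), while a decoder restricts each position to earlier ones so sequences can be generated left to right. The sketch below is purely illustrative and is not taken from any published model's code.

```python
# Sketch of the masks that distinguish encoder-only from decoder-only
# Transformers. mask[i][j] == 1 means position i may attend to position j.

def encoder_attention_mask(n: int) -> list[list[int]]:
    """Bidirectional mask: every position attends to every other position."""
    return [[1] * n for _ in range(n)]

def decoder_attention_mask(n: int) -> list[list[int]]:
    """Causal mask: position i attends only to positions j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

# For a 4-nucleotide input such as "GCAU":
print(encoder_attention_mask(4))  # every row is [1, 1, 1, 1]
print(decoder_attention_mask(4))  # lower-triangular rows: [1,0,0,0], [1,1,0,0], ...
```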

Several models adapt the backbone to better align with biological priors. RNA-MSM uses multiple-sequence alignments to exploit evolutionary conservation.38 Long-context architectures such as Mamba-based models and StripedHyena address scaling challenges for long RNA sequences.39,40,41

Pre-Training Strategy For RNA Foundation Models

Self-supervised learning tasks encourage the model to “read” and “understand” RNA sequences. The most widely used task is masked language modelling (MLM).42 RNA foundation models such as RNA-FM, AIDO.RNA, ERNIE-RNA, RNA-MSM, SpliceBERT and RiNALMo rely on MLM.22,23,29,33,38,43

Some models extend MLM with additional biological signals.27,44 Others use causal language modelling for generative applications.34,35
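The MLM objective can be sketched in a few lines: a fraction of nucleotides is hidden behind a mask token, and the model must recover them from the surrounding context. The `<mask>` token name and 15% rate follow BERT-style convention; real RNA models differ in their tokenization details.

```python
import random

# Hedged sketch of masked language modelling (MLM) on an RNA sequence.
# Token name and masking rate are BERT-style conventions, not any specific
# RNA model's exact recipe.

def mask_sequence(seq: str, mask_rate: float = 0.15, seed: int = 0):
    """Return (masked tokens, {position: original base}) for an MLM objective."""
    rng = random.Random(seed)
    tokens = list(seq)
    targets = {}
    for i, base in enumerate(tokens):
        if rng.random() < mask_rate:
            targets[i] = base          # the model must predict this base...
            tokens[i] = "<mask>"       # ...from the unmasked context around it
    return tokens, targets

tokens, targets = mask_sequence("AUGGCCAUUGCAUACGUU")
# Every masked position has a recorded target base for the training loss.
assert all(tokens[i] == "<mask>" for i in targets)
```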

Pre-training can combine multiple self-supervised tasks, but it is computationally expensive.21,23 However, once a pretrained model is available, it can serve as a starting point for fine-tuning on new biological questions.

Fine-Tuning And Adapting To Downstream RNA Tasks

After pre-training, fine-tuning adapts the RNA foundation model to specific RNA biology tasks using labeled data. Full fine-tuning updates all model parameters and often yields strong performance. Parameter-efficient fine-tuning freezes most parameters and trains only a subset.45,46
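The difference between the two regimes is easiest to see by counting which parameters receive gradient updates. The toy registry below stands in for a real deep learning framework, and the parameter names and counts are invented for illustration.

```python
# Illustrative sketch of full vs. parameter-efficient fine-tuning (PEFT),
# using a toy parameter registry. Names and counts are hypothetical.

model = {
    "backbone.layer1.weights": 1_000_000,   # pre-trained backbone (toy counts)
    "backbone.layer2.weights": 1_000_000,
    "task_head.weights": 10_000,            # small task-specific head
}

def trainable_parameters(model: dict, peft: bool) -> int:
    """Count parameters that will receive gradient updates during fine-tuning."""
    if peft:
        # PEFT: freeze the backbone, train only the task head (or an adapter).
        return sum(n for name, n in model.items() if name.startswith("task_head"))
    return sum(model.values())  # full fine-tuning updates every parameter

print(trainable_parameters(model, peft=False))  # 2010000
print(trainable_parameters(model, peft=True))   # 10000
```

Under PEFT only 0.5% of the toy model's parameters are updated, which is why it is feasible on modest hardware.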

Datasets are typically split into training, validation, and test sets. Strong performance on validation and test sets suggests that the model generalizes well to new data. Each biological question is treated as a separate fine-tuning task.47
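The split itself is mechanically simple, as the sketch below shows; the sequence identifiers are hypothetical. Note that for RNA tasks, splits are often made at the family or species level rather than by random shuffle, to avoid leakage between near-identical sequences.

```python
import random

# Hedged sketch of an 80/10/10 train/validation/test split.
# A naive random shuffle is shown; real RNA benchmarks often split by
# family or species to prevent leakage between similar sequences.

def split_dataset(ids, train=0.8, val=0.1, seed=42):
    """Shuffle and partition identifiers into train/validation/test sets."""
    rng = random.Random(seed)
    ids = list(ids)
    rng.shuffle(ids)
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * val)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

train_set, val_set, test_set = split_dataset([f"rna_{i}" for i in range(100)])
print(len(train_set), len(val_set), len(test_set))  # 80 10 10
```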

Benchmarks And Evaluation Of RNA Foundation Models

Evaluation is carried out at several complementary levels.

At the representation level, embeddings extracted from pre-trained models reflect the model’s “understanding” of RNA. Dimensionality-reduction methods such as UMAP are used to visualize patterns.48

At the task level, performance is assessed after fine-tuning across tasks such as RNA structure prediction, RBP binding prediction, and RNA stability classification. Metrics include F1 score, AUROC, AUPRC, and correlation measures.25
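To make one of these metrics concrete, the sketch below computes the F1 score from scratch for a binary task (e.g., "is this nucleotide base-paired?"). In practice a library such as scikit-learn would be used; this only unpacks the definition.

```python
# Minimal sketch of the F1 score for binary labels: the harmonic mean of
# precision (how many predicted positives are real) and recall (how many
# real positives are found).

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = [1, 1, 0, 0, 1, 0]   # toy ground-truth labels
y_pred = [1, 0, 0, 1, 1, 0]   # toy model predictions
print(round(f1_score(y_true, y_pred), 3))  # 0.667
```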

Generalization is also evaluated across species and RNA families.23

Interpreting RNA Foundation Models With Explainable AI

XAI methods are essential if RNA foundation models are to do more than predict labels.17

Attribution methods estimate nucleotide contributions using gradients or perturbation techniques such as in silico mutagenesis.55,56,57 Attention and representation analyses provide additional interpretability signals.33 These approaches transform models into tools that reveal candidate functional RNA elements.21,22 However, no single XAI method is optimal,59 and multiple methods should be applied and validated against biological knowledge.
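In silico mutagenesis, the perturbation technique mentioned above, can be sketched directly: mutate each position to every alternative nucleotide, re-score the mutant, and record how much the prediction shifts. Here `toy_score` is a hypothetical stand-in for a fine-tuned model, rewarding GC content only so the arithmetic stays transparent.

```python
# Hedged sketch of in silico mutagenesis as an attribution method.
# toy_score is a hypothetical stand-in for a fine-tuned foundation model.

def toy_score(seq: str) -> float:
    """Stand-in model: simply scores the GC fraction of the sequence."""
    return sum(1.0 for b in seq if b in "GC") / len(seq)

def in_silico_mutagenesis(seq: str, score=toy_score):
    """Per-position importance: max |change in score| over all point mutations."""
    baseline = score(seq)
    importance = []
    for i in range(len(seq)):
        deltas = []
        for base in "AUGC":
            if base != seq[i]:
                mutant = seq[:i] + base + seq[i + 1:]
                deltas.append(abs(score(mutant) - baseline))
        importance.append(max(deltas))
    return importance

print(in_silico_mutagenesis("GGAU"))  # one importance value per position
```

Positions whose mutation changes the prediction most are candidate functional sites, exactly the output a biologist would take forward for experimental validation.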

Conclusion

The development of RNA foundation models has marked a pivotal shift in our ability to understand RNA biology. Through self-supervised pre-training on large datasets, these models learn generalized representations of RNA sequences, structures, and functions.

Pre-training provides the backbone for model generalization, while fine-tuning enables task-specific predictions. Evaluation methods ensure accuracy and robustness across tasks.

The integration of explainable AI enables researchers to interpret predictions and identify key sequence elements or structural motifs.

Foundation models should be viewed as tools that augment human research. They enable systematic discovery across vast sequence spaces and support hypothesis generation.

In practice, while pre-training requires significant computational resources, fine-tuning and explainable AI analyses are more accessible. Researchers are encouraged to fine-tune existing models rather than train from scratch.

Looking forward, as datasets grow and multimodal models evolve, RNA foundation models will continue to improve. Advances in explainable AI will further enhance interpretability and support the discovery of regulatory elements. Ultimately, RNA foundation models represent a transformative approach to RNA biology.

References

  1. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015; 521:436–44.
  2. Sharifani K, Amini M. Machine learning and deep learning: a review of methods and applications [Internet]. 2023 [cited 2025 Nov 29].
  3. Yu H, Qi Y, Ding Y. Deep learning in RNA structure studies. Frontiers in Molecular Biosciences [Internet] 2022 [cited 2022 Sept 2]; 9. Available from: https://doi.org/10.3389/fmolb.2022.869601
  4. Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, Min Y, Zhang B, Zhang J, Dong Z, et al. A survey of large language models [Internet]. 2025 [cited 2025 Nov 29]; Available from: http://arxiv.org/abs/2303.18223
  5. Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, Smetanin N, Verkuil R, Kabeli O, Shmueli Y, et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023; 379:1123–30.
  6. Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, Lin Z, Verkuil R, Tran VQ, Deaton J, Wiggert M, et al. Simulating 500 million years of evolution with a language model [Internet]. 2024 [cited 2024 July 5]; Available from: http://doi.org/10.1101/2024.07.01.600583
  7. Zhao BS, Roundtree IA, He C. Post-transcriptional gene regulation by mRNA modifications. Nat Rev Mol Cell Biol 2017; 18:31–42.
  8. Glisovic T, Bachorik JL, Yong J, Dreyfuss G. RNA-binding proteins and post-transcriptional gene regulation. FEBS Letters 2008; 582:1977–86.
  9. Monaghan L, Longman D, Cáceres JF. Translation‐coupled mRNA quality control mechanisms. The EMBO Journal 2023; 42:e114378.
  10. Cao X, Zhang Y, Ding Y, Wan Y. Identification of RNA structures and their roles in RNA functions. Nat Rev Mol Cell Biol 2024; :1–18.
  11. Leppek K, Das R, Barna M. Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat Rev Mol Cell Biol 2018; 19:158–74.
  12. Strobel EJ, Yu AM, Lucks JB. High-throughput determination of RNA structures. Nat Rev Genet 2018; 19:615–34.
  13. Zhang J, Fei Y, Sun L, Zhang QC. Advances and opportunities in RNA structure experimental determination and computational modeling. Nat Methods 2022; 19:1193–207.
  14. Bommasani R, Hudson DA, Adeli E, Altman R, Arora S, Arx S von, Bernstein MS, Bohg J, Bosselut A, Brunskill E, et al. On the opportunities and risks of foundation models [Internet]. 2022 [cited 2025 Nov 25]; Available from: http://arxiv.org/abs/2108.07258
  15. Zheng H, Shen L, Tang A, Luo Y, Hu H, Du B, Wen Y, Tao D. Learning from models beyond fine-tuning. Nat Mach Intell 2025; 7:6–17.
  16. Liu X, Zhang F, Hou Z, Mian L, Wang Z, Zhang J, Tang J. Self-supervised learning: generative or contrastive. IEEE Transactions on Knowledge and Data Engineering 2023; 35:857–76.
  17. Xu F, Uszkoreit H, Du Y, Fan W, Zhao D, Zhu J. Explainable AI: a brief survey on history, research areas, approaches and challenges. In: Tang J, Kan M-Y, Zhao D, Li S, Zan H, editors. Natural language processing and Chinese computing. Cham: Springer International Publishing; 2019. page 563–74.
  18. Muiños F, Martínez-Jiménez F, Pich O, Gonzalez-Perez A, Lopez-Bigas N. In silico saturation mutagenesis of cancer genes. Nature 2021; 596:428–32.
  19. Dalla-Torre H, Gonzalez L, Mendoza-Revilla J, Lopez Carranza N, Grzywaczewski AH, Oteri F, Dallago C, Trop E, de Almeida BP, Sirelkhatim H, et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat Methods 2025; 22:287–97.
  20. Wang X, Gu R, Chen Z, Li Y, Ji X, Ke G, Wen H. Uni-RNA: universal pre-trained models revolutionize RNA research [Internet]. 2023 [cited 2025 Nov 25]; :2023.07.11.548588. Available from: https://doi.org/10.1101/2023.07.11.548588v1
  21. Yu H, Yang H, Sun W, Yan Z, Yang X, Zhang H, Ding Y, Li K. An interpretable RNA foundation model for exploring functional RNA motifs in plants. Nat Mach Intell 2024; :1–10.
  22. Chen J, Hu Z, Sun S, Tan Q, Wang Y, Yu Q, Zong L, Hong L, Xiao J, Shen T, et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions [Internet]. 2022 [cited 2023 Nov 30]; :2022.08.06.503062. Available from: https://doi.org/10.1101/2022.08.06.503062v2
  23. Penić RJ, Vlašić T, Huber RG, Wan Y, Šikić M. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. Nat Commun 2025; 16:5671.
  24. Zou S, Tao T, Mahbub S, Ellington CN, Algayres R, Li D, Zhuang Y, Wang H, Song L, Xing EP. A large-scale foundation model for RNA function and structure prediction [Internet]. 2024 [cited 2025 Nov 25]; :2024.11.28.625345. Available from: https://doi.org/10.1101/2024.11.28.625345v1
  25. Chu Y, Yu D, Li Y, Huang K, Shen Y, Cong L, Zhang J, Wang M. A 5′ UTR language model for decoding untranslated regions of mRNA and function predictions. Nat Mach Intell 2024; 6:449–60.
  26. Outeiral C, Deane CM. Codon language embeddings provide strong signals for use in protein engineering. Nat Mach Intell 2024; 6:170–9.
  27. Li S, Moayedpour S, Li R, Bailey M, Riahi S, Kogler-Anele L, Miladi M, Miner J, Pertuy F, Zheng D, et al. CodonBERT large language model for mRNA vaccines. Genome Res 2024; 34:1027–35.
  28. Yang Y, Li G, Pang K, Cao W, Zhang Z, Li X. Deciphering 3′UTR-mediated gene regulation using interpretable deep representation learning. Advanced Science 2024; 11:2407013.
  29. Chen K, Zhou Y, Ding M, Wang Y, Ren Z, Yang Y. Self-supervised learning on millions of pre-mRNA sequences improves sequence-based RNA splicing prediction [Internet]. 2023 [cited 2024 Apr 19]; :2023.01.31.526427. Available from: https://doi.org/10.1101/2023.01.31.526427v2
  30. Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, Li DB, Bartie LJ, Thomas AW, King SH, Brixi G, et al. Sequence modeling and design from molecular to genome scale with Evo. Science 2024; 386:eado9336.
  31. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need [Internet]. 2023 [cited 2023 Sept 5]; Available from: http://arxiv.org/abs/1706.03762
  32. Zhang S, Fan R, Liu Y, Chen S, Liu Q, Zeng W. Applications of transformer-based language models in bioinformatics: a survey. Bioinformatics Advances 2023; 3:vbad001.
  33. Yin W, Zhang Z, Zhang S, He L, Zhang R, Jiang R, Liu G, Wang J, Zhang X, Qin T, et al. Ernie-RNA: an RNA language model with structure-enhanced representations. Nat Commun 2025; 16:10076.
  34. Zhao Y, Oono K, Takizawa H, Kotera M. GenerRNA: a generative pre-trained language model for de novo RNA design. PLOS ONE 2024; 19:e0310814.
  35. Shulgina Y, Trinidad MI, Langeberg CJ, Nisonoff H, Chithrananda S, Skopintsev P, Nissley AJ, Patel J, Boger RS, Shi H, et al. RNA language models predict mutations that improve RNA function. Nat Commun 2024; 15:10627.
  36. Liu F, Huang S, Hu J, Chen X, Song Z, Dong J, Liu Y, Huang X, Wang S, Wang X, et al. Design of prime-editing guide RNAs with deep transfer learning. Nat Mach Intell 2023; 5:1261–74.
  37. Boyd N, Anderson BM, Townshend B, Chow R, Stephens CJ, Rangan R, Kaplan M, Corley M, Tambe A, Ido Y, et al. Atom-1: a foundation model for RNA structure and function built on chemical mapping data [Internet]. 2023 [cited 2025 Nov 25]; :2023.12.13.571579. Available from: https://doi.org/10.1101/2023.12.13.571579v1
  38. Zhang Y, Lang M, Jiang J, Gao Z, Xu F, Litfin T, Chen K, Singh J, Huang X, Song G, et al. Multiple sequence alignment-based RNA language model and its application to structural inference. Nucleic Acids Res 2024; 52:e3.
  39. Fradkin P, Shi R, Isaev K, Frey BJ, Morris Q, Lee LJ, Wang B. Orthrus: towards evolutionary and functional RNA foundation models [Internet]. 2024 [cited 2024 Oct 14]; :2024.10.10.617658. Available from: https://doi.org/10.1101/2024.10.10.617658v1
  40. Yuan Y, Chen Q, Pan X. DGRNA: a long-context RNA foundation model with bidirectional attention Mamba2 [Internet]. 2024 [cited 2025 Nov 25]; :2024.10.31.621427. Available from: https://doi.org/10.1101/2024.10.31.621427v1
  41. Saberi A, Choi B, Wang S, Hernández-Corchado A, Naghipourfar M, Namini AM, Ramani V, Emad A, Najafabadi HS, Goodarzi H. A long-context RNA foundation model for predicting transcriptome architecture [Internet]. 2024 [cited 2025 Nov 25]; :2024.08.26.609813. Available from: https://doi.org/10.1101/2024.08.26.609813v2
  42. Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding [Internet]. In: Burstein J, Doran C, Solorio T, editors. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics; 2019 [cited 2025 Nov 25]. page 4171–86. Available from: https://aclanthology.org/N19-1423/
  43. Xing E, Zou S, Tao T, Mahbub S, Ellington C, Algayres R, Li D, Zhuang Y, Wang H, Song L. A large-scale foundation model for RNA enables diverse function and structure prediction [Internet]. 2025 [cited 2025 Nov 25]; Available from: https://www.researchsquare.com/article/rs-6445344/v1
  44. Akiyama M, Sakakibara Y. Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genom Bioinform 2022; 4:lqac012.
  45. Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, Wang L, Chen W. LoRA: Low-rank adaptation of large language models [Internet]. 2021 [cited 2025 Nov 25]; Available from: http://arxiv.org/abs/2106.09685
  46. Chen J, Zhang A, Shi X, Li M, Smola A, Yang D. Parameter-efficient fine-tuning design spaces [Internet]. 2023 [cited 2025 Nov 25]; Available from: https://arxiv.org/abs/2301.01821v1
  47. Wang N, Bian J, Li Y, Li X, Mumtaz S, Kong L, Xiong H. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nat Mach Intell 2024; 6:548–57.
  48. Chen K, Zhou Y, Ding M, Wang Y, Ren Z, Yang Y. Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction. Briefings in Bioinformatics 2024; 25:bbae163.
  49. Danaee P, Rouches M, Wiley M, Deng D, Huang L, Hendrix D. BpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Research 2018; 46:5381–94.
  50. Sloma MF, Mathews DH. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures. RNA 2016; 22:1808–18.
  51. Tan Z, Fu Y, Sharma G, Mathews DH. TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs. Nucleic Acids Research 2017; 45:11570–81.
  52. Zablocki LI, Bugnon LA, Gerard M, Di Persia L, Stegmayer G, Milone DH. Comprehensive benchmarking of large language models for RNA secondary structure prediction. Brief Bioinform 2025; 26:bbaf137.
  53. Szikszai M, Magnus M, Sanghi S, Kadyan S, Bouatta N, Rivas E. RNA3DB: a structurally-dissimilar dataset split for training and benchmarking deep learning models for RNA structure prediction. Journal of Molecular Biology 2024; 436:168552.
  54. Ren Y, Chen Z, Qiao L, Jing H, Cai Y, Xu S, Ye P, Ma X, Sun S, Yan H, et al. BEACON: benchmark for comprehensive RNA tasks and language models [Internet]. 2024 [cited 2025 Nov 25]; Available from: http://arxiv.org/abs/2406.10391
  55. Koo PK, Majdandzic A, Ploenzke M, Anand P, Paul SB. Global importance analysis: an interpretability method to quantify importance of genomic features in deep neural networks. PLOS Computational Biology 2021; 17:e1008925.
  56. Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks [Internet]. In: Proceedings of the 34th International Conference on Machine Learning. PMLR; 2017 [cited 2025 Nov 25]. page 3319–28.
  57. Santorsola M, Lescai F. The promise of explainable deep learning for omics data analysis: adding new discovery tools to AI. New Biotechnology 2023; 77:1–11.
  58. Seitz EE, McCandlish DM, Kinney JB, Koo PK. Interpreting cis-regulatory mechanisms from genomic deep neural networks using surrogate models. Nat Mach Intell 2024; 6:701–13.
  59. Novakovsky G, Dexter N, Libbrecht MW, Wasserman WW, Mostafavi S. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat Rev Genet 2023; 24:125–37.

About The Authors

Yiliang Ding, Ph.D., is a prominent Chinese scientist and group leader at the John Innes Centre, where she has been working since 2014. Her research focuses on the functional roles of RNA structure in living cells. Ding received her bachelor’s degree from Shanghai Jiao Tong University in 2005 and completed her Ph.D. in 2009 at the John Innes Centre. She has also been an Honorary Group Leader at the Babraham Institute and an Honorary Professor at the University of East Anglia since 2024. Ding’s innovative methods for profiling RNA structures in living cells have delivered new insights into the functional roles of RNA structures in gene regulation. She has received several prestigious awards and grants, including the BBSRC David Phillips Fellowship and three ERC grants. Her work has led to the development of a single-molecule RNA structure profiling method and has revealed the functional importance of RNA structure in the regulation of long noncoding RNAs. Ding's technologies have been used to explain the role of RNA structure in targeted RNA degradation, which has been applied in RNA-based antiviral therapies for SARS-CoV-2.

Haopeng Yu is a researcher at the John Innes Centre with a focus on bioinformatics. His work includes AI technology modeling, high-throughput sequencing data mining, in vivo RNA structure and function analysis, and the full-stack development of biological websites and bioinformatics software design. Yu has contributed to various scientific publications and has been recognized as a fellow of the Federation of European Bioinformatics Societies (HFSP). His research has been published in reputable journals such as Nucleic Acids Research and Nature Communications, highlighting his significant impact in the field of molecular biology and bioinformatics.