Model Overview
RNAZoo bundles 15 deep learning models for RNA, organized into 5 tracks. Each model runs in its own Docker container with the pretrained weights baked into the image.
All models at a glance
| Model | Track | Task | Input | Output | Device | License |
|---|---|---|---|---|---|---|
| RiboNN | Translation | TE prediction (82 cell types) | Tab-separated (UTR+CDS) | TSV with TE per cell type | CPU/GPU | Apache 2.0 |
| Riboformer | Translation | Codon-level ribosome density | WIG + FASTA + GFF3 | Density predictions | CPU/GPU | MIT |
| RiboTIE | Translation | ORF detection from ribo-seq | FASTA + GTF + BAM | GTF + CSV | CPU/GPU | MIT |
| seq2ribo | Translation | Riboseq/TE/protein from sequence | FASTA (CDS) | JSON | GPU only | CMU Non-Commercial |
| TranslationAI | Translation | TIS/TTS/ORF prediction | FASTA (mRNA) | TIS/TTS/ORF text files | CPU/GPU | AGPL-3.0 + CC BY-NC 4.0 |
| Saluki | Translation | mRNA half-life | FASTA (case=UTR/CDS) | NumPy array | CPU/GPU | Apache 2.0 |
| CodonTransformer | Translation | Codon optimization | FASTA (protein) | FASTA (DNA) | CPU/GPU | Apache 2.0 |
| RNA-FM | Foundation | RNA embeddings (640-d) | FASTA (RNA) | NumPy (N x 640) | CPU/GPU | MIT |
| RiNALMo | Foundation | RNA embeddings (1280-d) | FASTA (RNA) | NumPy (N x 1280) | CPU/GPU | Apache 2.0 (code) + CC BY 4.0 (weights) |
| ERNIE-RNA | Foundation | Structure-aware embeddings (768-d) | FASTA (RNA) | NumPy (N x 768) | CPU/GPU | MIT |
| RNAformer | Structure | 2D structure (base-pair matrix) | FASTA (RNA) | Dot-bracket + prob matrix | CPU/GPU | Apache 2.0 |
| RhoFold | Structure | 3D structure prediction | FASTA (RNA) | PDB + CT | CPU/GPU | Apache 2.0 |
| SPOT-RNA | Structure | 2D structure + pseudoknots | FASTA (RNA) | bpseq + CT + prob + dot-bracket | CPU/GPU | MPL-2.0 |
| MultiRM | Modification | 12 RNA modification types | FASTA (RNA, min 51 nt) | TSV (probabilities + p-values) | CPU/GPU | MIT |
| UTR-LM | mRNA Design | MRL / TE / expression level | FASTA (5'UTR DNA) | TSV (predictions) | CPU/GPU | GPL-3.0 |
By track
Translation (7 models)
Models for translation efficiency prediction, ribosome profiling analysis, ORF detection, mRNA stability prediction, and codon optimization.
- RiboNN — Multi-task TE prediction across 82 human cell types from mRNA sequence
- Riboformer — Refine codon-level ribosome densities from ribo-seq data
- RiboTIE — Detect translated ORFs from ribo-seq + genomic sequence
- seq2ribo — Predict ribosome profiles/TE/protein from mRNA sequence (GPU only)
- TranslationAI — Identify translation initiation/termination sites and ORFs
- Saluki — Predict mRNA half-life from sequence (50-model ensemble)
- CodonTransformer — Optimize codon usage for 164 organisms
RNA Foundation Models (3 models)
General-purpose RNA language models that produce embeddings for downstream tasks.
- RNA-FM — 99M params, 640-d embeddings, max 1022 nt (MIT)
- RiNALMo — 650M params, 1280-d embeddings, no hard length limit (Apache 2.0)
- ERNIE-RNA — 86M params, 768-d embeddings, structure-aware attention (MIT)
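Because RNA-FM accepts at most 1022 nt, it can help to screen a FASTA file before submitting it. The following is a minimal sketch using only the Python standard library; `read_fasta` and `split_by_length` are hypothetical helper names, not part of RNAZoo or of any model's API. The same helper takes a `min_len` argument, which would cover a lower bound such as the 51 nt minimum listed for MultiRM in the overview table.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted text."""
    name: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_by_length(
    text: str,
    min_len: Optional[int] = None,
    max_len: Optional[int] = None,
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Partition records into (within bounds, outside bounds), mapping ID to length."""
    ok: Dict[str, int] = {}
    rejected: Dict[str, int] = {}
    for name, seq in read_fasta(text):
        too_short = min_len is not None and len(seq) < min_len
        too_long = max_len is not None and len(seq) > max_len
        (rejected if too_short or too_long else ok)[name] = len(seq)
    return ok, rejected


RNA_FM_MAX_NT = 1022  # limit stated in the list above

example = ">short\nACGUACGU\n>long\n" + "A" * 1500 + "\n"
ok, rejected = split_by_length(example, max_len=RNA_FM_MAX_NT)
print("within limit:", ok)
print("exceeds limit:", rejected)
```

How over-length records should be handled (truncation, windowing, or switching to a model without a hard limit such as RiNALMo) depends on the downstream task and is left open here.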
RNA Structure (3 models)
Secondary and 3D structure prediction from sequence.
- RNAformer — 2D base-pair matrix with recycling, pseudoknot-aware
- RhoFold — Full-atom 3D structure prediction (PDB output), single-sequence mode
- SPOT-RNA — 2D structure with pseudoknots, 5-model TF ensemble
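RNAformer and SPOT-RNA both report secondary structure in dot-bracket notation. As a rough illustration of how such a string maps to base pairs, the sketch below parses it with one stack per bracket family, so that crossing (pseudoknotted) pairs written with a second bracket type are recovered as well. `parse_dot_bracket` is a hypothetical helper, and the set of bracket families is an assumption: check it against the files the containers actually write, since some tools also use letter pairs (`Aa`, `Bb`) that this sketch does not handle.

```python
from typing import Dict, List, Tuple

# Bracket families commonly seen in extended dot-bracket notation. Which of
# these a given tool emits is an assumption to verify against its output.
BRACKETS: Dict[str, str] = {")": "(", "]": "[", "}": "{", ">": "<"}


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return sorted 0-based (i, j) base pairs encoded in a dot-bracket string."""
    openers = set(BRACKETS.values())
    # One stack per bracket family lets different families cross each other.
    stacks: Dict[str, List[int]] = {opener: [] for opener in openers}
    pairs: List[Tuple[int, int]] = []
    for j, ch in enumerate(structure):
        if ch in openers:
            stacks[ch].append(j)
        elif ch in BRACKETS:
            stack = stacks[BRACKETS[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {j}")
            pairs.append((stack.pop(), j))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {j}")
    leftover = sorted(i for stack in stacks.values() for i in stack)
    if leftover:
        raise ValueError(f"unmatched opening bracket(s) at {leftover}")
    return sorted(pairs)


# A hairpin in round brackets crossed by a second helix in square brackets.
print(parse_dot_bracket("((..[[..))..]]"))
```

For the example string, the round-bracket helix yields pairs (0, 9) and (1, 8), and the square-bracket helix yields (4, 13) and (5, 12).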
RNA Modification (1 model)
- MultiRM — Predicts 12 RNA modification types per position (m6A, m5C, pseudouridine, Am, Cm, Gm, Um, m1A, m5U, m6Am, m7G, A-to-I editing)
mRNA Design (1 model)
- UTR-LM — Predicts mean ribosome loading, translation efficiency, or expression level from 5'UTR sequences
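The overview table lists UTR-LM's input as 5'UTR DNA, whereas most other models in the zoo take RNA FASTA. If the same sequences are to be passed to both kinds of model, converting between alphabets is a one-line operation. The sketch below is illustrative only: `rna_to_dna` and `dna_to_rna` are hypothetical helper names, and whether a given container accepts the other alphabet unchanged is not stated in this overview.

```python
_RNA_TO_DNA = str.maketrans("Uu", "Tt")
_DNA_TO_RNA = str.maketrans("Tt", "Uu")


def rna_to_dna(seq: str) -> str:
    """Replace U with T, preserving case; all other characters are unchanged."""
    return seq.translate(_RNA_TO_DNA)


def dna_to_rna(seq: str) -> str:
    """Replace T with U, preserving case; all other characters are unchanged."""
    return seq.translate(_DNA_TO_RNA)


utr_rna = "GGGACUuacgAUG"
print(rna_to_dna(utr_rna))
```

Case is preserved deliberately, since some inputs in the table (for example Saluki's) use letter case to carry information.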
Fine-tuning support
Three models can be fine-tuned on your own data. Fine-tuned checkpoints are saved to disk and can be reused for subsequent predictions.
| Model | Fine-tuning | Details |
|---|---|---|
| RiboNN | Transfer learning | Freeze pretrained conv layers, train head on user TE data; use saved checkpoint via --ribonn_checkpoint |
| UTR-LM | Full fine-tuning | Train ESM2 backbone + head on user MRL/TE/EL data; use saved checkpoint for prediction |
| RiboTIE | Built-in | Automatically fine-tunes on user ribo-seq BAMs before ORF prediction |
Licenses
| Model | License | GitHub | Paper |
|---|---|---|---|
| RiboNN | Apache 2.0 | Sanofi-Public/RiboNN | Nature Biotechnology 2025 |
| Riboformer | MIT | lingxusb/Riboformer | Nature Communications 2024 |
| RiboTIE | MIT | TRISTAN-ORF/TRISTAN | Nature Communications 2025 |
| seq2ribo | CMU Non-Commercial | Kingsford-Group/seq2ribo | bioRxiv 2026 |
| TranslationAI | AGPL-3.0 + CC BY-NC 4.0 | rnasys/TranslationAI | NAR 2025 |
| Saluki | Apache 2.0 | calico/basenji | Genome Biology 2022 |
| CodonTransformer | Apache 2.0 | Adibvafa/CodonTransformer | Nature Communications 2025 |
| RNA-FM | MIT | ml4bio/RNA-FM | Nature Machine Intelligence 2024 |
| RiNALMo | Apache 2.0 (code) + CC BY 4.0 (weights) | lbcb-sci/RiNALMo | NeurIPS 2024 |
| ERNIE-RNA | MIT | Bruce-ywj/ERNIE-RNA | Nature Communications 2025 |
| RNAformer | Apache 2.0 | automl/RNAformer | ICLR 2024 |
| RhoFold | Apache 2.0 | ml4bio/RhoFold | Nature Methods 2024 |
| SPOT-RNA | MPL-2.0 | jaswindersingh2/SPOT-RNA | Nature Communications 2019 |
| MultiRM | MIT | Tsedao/MultiRM | NAR 2021 |
| UTR-LM | GPL-3.0 | a96123155/UTR-LM | Nature Machine Intelligence 2024 |
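When choosing models for a project, the license column can be screened programmatically. The sketch below copies the license strings from the table above into a dictionary and flags entries whose text mentions non-commercial terms. The `LICENSES` mapping and the `has_non_commercial_terms` helper are illustrative names, and the check is a plain substring match on the table text, so it is no substitute for reading the license itself.

```python
from typing import Dict, List

# License strings copied from the table above.
LICENSES: Dict[str, str] = {
    "RiboNN": "Apache 2.0",
    "Riboformer": "MIT",
    "RiboTIE": "MIT",
    "seq2ribo": "CMU Non-Commercial",
    "TranslationAI": "AGPL-3.0 + CC BY-NC 4.0",
    "Saluki": "Apache 2.0",
    "CodonTransformer": "Apache 2.0",
    "RNA-FM": "MIT",
    "RiNALMo": "Apache 2.0 (code) + CC BY 4.0 (weights)",
    "ERNIE-RNA": "MIT",
    "RNAformer": "Apache 2.0",
    "RhoFold": "Apache 2.0",
    "SPOT-RNA": "MPL-2.0",
    "MultiRM": "MIT",
    "UTR-LM": "GPL-3.0",
}

# Substrings (lowercase) that indicate a non-commercial restriction.
NON_COMMERCIAL_MARKERS = ("non-commercial", "cc by-nc")


def has_non_commercial_terms(license_text: str) -> bool:
    """Return True if the license string mentions a non-commercial restriction."""
    lowered = license_text.lower()
    return any(marker in lowered for marker in NON_COMMERCIAL_MARKERS)


flagged: List[str] = sorted(
    model for model, lic in LICENSES.items() if has_non_commercial_terms(lic)
)
print(flagged)  # ['TranslationAI', 'seq2ribo']
```

Copyleft licenses such as GPL-3.0 and AGPL-3.0 carry their own obligations, which this check does not capture.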