Ninetails detects and characterises non-adenosine nucleotides embedded within poly(A) tails of Oxford Nanopore sequencing reads. It uses convolutional neural networks applied to Gramian Angular Field representations of raw current signals to identify cytidine (C), guanosine (G), and uridine (U) residues within otherwise pure poly(A) tails.
For complete documentation, see the Ninetails Wiki.
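To give a feel for the signal representation mentioned above, here is a minimal base-R sketch of a Gramian Angular Summation Field computed from a one-dimensional signal. It is not the Ninetails implementation: the function name `gramian_angular_sum_field` and the toy signal are hypothetical, and which field variant Ninetails uses internally is not stated here.

```r
# Illustrative Gramian Angular Summation Field (GASF) for a 1-D signal.
# Not the Ninetails implementation; the function name is hypothetical.
gramian_angular_sum_field <- function(signal) {
  stopifnot(is.numeric(signal), length(signal) >= 2)
  rng <- range(signal)
  if (rng[1] == rng[2]) stop("signal must not be constant")
  # Rescale to [-1, 1] so that acos() is defined for every value
  scaled <- (2 * signal - rng[2] - rng[1]) / (rng[2] - rng[1])
  scaled <- pmin(pmax(scaled, -1), 1)  # guard against rounding error
  phi <- acos(scaled)                  # polar-angle encoding of each sample
  cos(outer(phi, phi, "+"))            # G[i, j] = cos(phi_i + phi_j)
}

toy_signal <- c(1, 2, 3, 2, 1)  # made-up values, not real current data
gaf <- gramian_angular_sum_field(toy_signal)
dim(gaf)
```

The result is a square, symmetric matrix with one row and column per signal sample, which is what makes it usable as image-like input for a convolutional network.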
Installation
Ninetails is not currently available on CRAN or
Bioconductor. Install it from GitHub using devtools:
install.packages("devtools")
devtools::install_github('LRB-IIMCB/ninetails')
library(ninetails)
Note for Windows users: Before installing devtools on Windows, install Rtools so packages compile correctly: https://cran.r-project.org/bin/windows/Rtools/
Additional dependencies
Depending on which pipeline you use, additional components are required:
For Dorado pipelines (DRS and cDNA):
- Python with pod5 module: pip install pod5
- Keras/TensorFlow for R: keras::install_keras()
- The reticulate R package for Python interoperability
For Guppy legacy pipeline:
- rhdf5 from Bioconductor: BiocManager::install("rhdf5")
- Keras/TensorFlow for R
- VBZ compression plugin (for newer MinKNOW data)
For the interactive dashboard (optional):
- shiny, plotly, htmltools, DT, base64enc, cowplot
- Install with: install.packages(c("shiny", "plotly", "htmltools", "DT", "base64enc", "cowplot"))
See the Wiki for detailed instructions.
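Before installing anything, it can help to see which of the listed packages are already present. The following is a small sketch using only base R; `missing_packages` is a hypothetical helper, not a Ninetails function. Note that `requireNamespace()` loads a package's namespace as a side effect when the package is installed.

```r
# Hypothetical helper: report which packages from a list are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

# Optional dashboard dependencies listed above
dashboard_pkgs <- c("shiny", "plotly", "htmltools", "DT", "base64enc", "cowplot")
to_install <- missing_packages(dashboard_pkgs)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

The returned character vector can be passed straight to install.packages() for the CRAN packages.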
Available pipelines
Ninetails provides three analysis pipelines:
| Pipeline | Basecaller | Input format | Function | Status |
|---|---|---|---|---|
| Dorado DRS | Dorado ≥ 1.0.0 | POD5 + summary | check_tails_dorado_DRS() | Recommended |
| Dorado cDNA | Dorado ≥ 1.0.0 | POD5 + BAM + summary | check_tails_dorado_cDNA() | Under development |
| Guppy | Guppy ≤ 6.0.0 | fast5 + Nanopolish | check_tails_guppy() | Legacy |
Choose the pipeline that matches your basecaller and sequencing protocol. The Dorado DRS pipeline is recommended for all new analyses.
Quick start: Dorado DRS pipeline
The recommended pipeline for direct RNA sequencing (DRS) data:
results <- ninetails::check_tails_dorado_DRS(
dorado_summary = "path/to/dorado_summary.txt",
pod5_dir = "path/to/pod5_dir/",
num_cores = 2,
qc = TRUE,
save_dir = "~/output/",
prefix = "my_experiment"
)
Required inputs
- dorado_summary: Dorado summary file with columns read_id, filename, poly_tail_length, poly_tail_start, poly_tail_end. Generate this with dorado summary on your aligned BAM file.
- pod5_dir: Directory containing POD5 files from the sequencing run.
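A quick pre-flight check of the summary file header can catch missing columns before a long run. The sketch below assumes the summary is tab-separated with an unquoted header line; `missing_summary_columns` is a hypothetical helper, and the demonstration file is made up.

```r
required_cols <- c("read_id", "filename", "poly_tail_length",
                   "poly_tail_start", "poly_tail_end")

# Hypothetical helper: return the required columns absent from the header.
# Assumes a tab-separated file whose first line holds the column names.
missing_summary_columns <- function(path, required = required_cols) {
  header <- strsplit(readLines(path, n = 1), "\t", fixed = TRUE)[[1]]
  setdiff(required, header)
}

# Demonstration on a made-up two-column file in a temporary location
demo_file <- tempfile(fileext = ".txt")
writeLines(c("read_id\tfilename", "r1\tbatch_0.pod5"), demo_file)
missing_cols <- missing_summary_columns(demo_file)
missing_cols
```

An empty result means all required columns are present.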
Output
The function returns a named list with two data frames:
- read_classes: Per-read classification (decorated, blank, or unclassified) with columns readname, contig, polya_length, qc_tag, class, and comments.
- nonadenosine_residues: Detailed non-A positions for decorated reads with columns readname, contig, prediction (C/G/U), est_nonA_pos, polya_length, and qc_tag.
Results are also saved as tab-separated files in save_dir.
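The returned list can be explored with ordinary data-frame operations. The sketch below uses a toy stand-in for the pipeline output (all values are made up and only a subset of the columns is shown) to count reads per class and compute the fraction of classified reads that are decorated.

```r
# Toy stand-in for the list returned by the pipeline (values are made up)
results <- list(
  read_classes = data.frame(
    readname = c("r1", "r2", "r3", "r4"),
    class = c("decorated", "blank", "blank", "unclassified"),
    stringsAsFactors = FALSE
  ),
  nonadenosine_residues = data.frame(
    readname = "r1",
    prediction = "C",
    est_nonA_pos = 42,
    stringsAsFactors = FALSE
  )
)

# Count reads per class
class_counts <- table(results$read_classes$class)

# Fraction of classified reads (decorated + blank) that are decorated
classified <- results$read_classes$class %in% c("decorated", "blank")
decorated_fraction <- mean(results$read_classes$class[classified] == "decorated")
```

Excluding unclassified reads from the denominator is a choice made for this example; whether that is appropriate depends on the analysis.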
Classification codes
| Class | Code | Meaning |
|---|---|---|
| decorated | YAY | Non-adenosine residue detected |
| blank | MAU | No signal deviation; pure poly(A) |
| blank | MPU | Signal deviation present but predicted as adenosine |
| unclassified | IRL | Poly(A) tail too short (< 10 nt) |
| unclassified | UNM | Read unmapped |
| unclassified | BAC | Invalid coordinates |
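The table above can be encoded as a lookup for downstream filtering or reporting. This sketch assumes the three-letter codes are the values found in the comments column of read_classes; `summarise_codes` is a hypothetical helper and the example codes are made up.

```r
# Lookup built from the table above: code -> class
code_to_class <- c(
  YAY = "decorated",
  MAU = "blank",
  MPU = "blank",
  IRL = "unclassified",
  UNM = "unclassified",
  BAC = "unclassified"
)

# Hypothetical helper: tabulate codes and warn about any not in the table
summarise_codes <- function(codes) {
  unknown <- setdiff(unique(codes), names(code_to_class))
  if (length(unknown) > 0) {
    warning("Unrecognised codes: ", paste(unknown, collapse = ", "))
  }
  table(factor(codes, levels = names(code_to_class)))
}

toy_codes <- c("YAY", "MAU", "MAU", "IRL")  # made-up example values
code_counts <- summarise_codes(toy_codes)
toy_classes <- unname(code_to_class[toy_codes])
```

Using a factor with fixed levels keeps codes with zero reads visible in the tabulation, which makes reports comparable across samples.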
Working with results
After running the pipeline, use the postprocessing functions to load, merge, annotate, and summarize the data:
# Load multiple samples
class_data <- ninetails::read_class_multiple(samples_table)
residue_data <- ninetails::read_residue_multiple(samples_table)
# Merge into one table
merged <- ninetails::merge_nonA_tables(class_data, residue_data)
# Annotate with gene symbols (requires biomaRt)
class_data <- ninetails::annotate_with_biomart(class_data, species = "mmusculus")
residue_data <- ninetails::annotate_with_biomart(residue_data, species = "mmusculus")
# Summarize per transcript
summary <- ninetails::summarize_nonA(merged, summary_factors = "group")
See vignette("postprocessing") for the complete workflow.
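The snippet above uses samples_table without defining it. The sketch below shows one plausible layout, with one row per sample; the column names (class_path, residue_path, sample_name, group) are an assumption inferred from the grouping arguments used in this README, and the paths are placeholders. Check vignette("postprocessing") for the authoritative format.

```r
# Assumed layout of samples_table: column names are an assumption,
# and all paths are placeholders to be replaced with real output files.
samples_table <- data.frame(
  class_path = c("path/to/wt_read_classes.txt",
                 "path/to/ko_read_classes.txt"),
  residue_path = c("path/to/wt_nonadenosine_residues.txt",
                   "path/to/ko_nonadenosine_residues.txt"),
  sample_name = c("WT_1", "KO_1"),
  group = c("WT", "KO"),
  stringsAsFactors = FALSE
)

# Sanity check before handing the table to the loading functions
paths <- c(samples_table$class_path, samples_table$residue_path)
missing_files <- paths[!file.exists(paths)]
if (length(missing_files) > 0) {
  message(length(missing_files), " listed file(s) not found")
}
```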
Visualization
Ninetails provides both static plotting functions and an interactive Shiny dashboard.
Static plots
# Read classification
ninetails::plot_class_counts(class_data, grouping_factor = "sample_name")
# Residue frequency
ninetails::plot_residue_counts(residue_data, grouping_factor = "sample_name")
# Non-A abundance per read
ninetails::plot_nonA_abundance(residue_data, grouping_factor = "sample_name")
# Poly(A) tail length
ninetails::plot_tail_distribution(class_data, grouping_factor = "sample_name")
# Non-A position distribution
ninetails::plot_rug_density(residue_data, base = "C", max_length = 200)
See vignette("plotting") for all plotting functions and parameters.
Interactive dashboard
Launch the Shiny dashboard for interactive exploration of results with configurable filters, per-sample rug plots, signal visualization, and downloadable reports:
ninetails::launch_signal_browser(
class_file = "/path/to/read_classes.txt",
residue_file = "/path/to/nonadenosine_residues.txt",
summary_file = "/path/to/dorado_summary.txt",
pod5_dir = "/path/to/pod5/"
)
See vignette("shiny_app") for complete dashboard documentation.
Next steps
- vignette("detection"): Detailed pipeline documentation and parameters
- vignette("postprocessing"): Loading, merging, correcting, and summarizing results
- vignette("plotting"): All visualization functions with parameter tables
- vignette("signal_inspection"): Raw signal visualization from fast5 and POD5 files
- vignette("shiny_app"): Interactive analysis dashboard
- Ninetails Wiki: Complete documentation
Troubleshooting
Keras/TensorFlow not found: Run keras::install_keras() to install the Python backend. Ninetails requires a working Keras installation for the CNN classification step.
POD5 module not available: Install the Python pod5 package with pip install pod5. Verify with reticulate::py_module_available("pod5").
Memory issues during processing: Reduce the part_size parameter (default: 40,000 reads per chunk). Use cleanup = TRUE to remove intermediate files automatically.
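To reason about the effect of a smaller chunk size, the arithmetic can be sketched as below. This only illustrates how a chunk size partitions a run; it is not how Ninetails splits reads internally, and the read count is made up.

```r
# Illustration only: how a chunk size splits a run into parts
n_reads <- 100000    # made-up read count
part_size <- 40000   # default stated above

# Number of chunks needed to cover all reads
n_parts <- ceiling(n_reads / part_size)

# Assign each read index to a chunk and tabulate chunk sizes
chunk_id <- ceiling(seq_len(n_reads) / part_size)
chunk_sizes <- as.integer(table(chunk_id))
```

Halving part_size roughly doubles the number of chunks while halving the number of reads held per chunk, which is the trade-off to weigh when memory is tight.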
Empty results: Check that your Dorado summary file contains the required columns (read_id, filename, poly_tail_length, poly_tail_start, poly_tail_end). Reads with poly_tail_start = 0 are classified as BAC (bad coordinates).
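When results come back empty, it is worth checking how many reads would fall into the BAC class. The sketch below does this on a toy table with made-up values; for a real run, the table could be loaded with read.delim(), assuming the summary is tab-separated.

```r
# Toy summary table; values are made up for illustration
summary_df <- data.frame(
  read_id = c("r1", "r2", "r3"),
  poly_tail_length = c(85, 0, 120),
  poly_tail_start = c(1500, 0, 2100),
  poly_tail_end = c(2400, 0, 3300)
)

# Reads with poly_tail_start equal to 0, which the text above says become BAC
n_bad <- sum(summary_df$poly_tail_start == 0, na.rm = TRUE)
frac_bad <- n_bad / nrow(summary_df)
message(n_bad, " of ", nrow(summary_df), " reads have poly_tail_start = 0")
```

If nearly all reads have poly_tail_start = 0, the problem most likely lies upstream of Ninetails, in how the summary file was produced.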
Citation
Please cite Ninetails as:
Gumińska, N., Matylla-Kulińska, K., Krawczyk, P.S. et al. Direct profiling of non-adenosines in poly(A) tails of endogenous and therapeutic mRNAs with Ninetails. Nat Commun 16, 2664 (2025). https://doi.org/10.1038/s41467-025-57787-6
