Get started with ninetails

Ninetails detects and characterises non-adenosine nucleotides embedded within poly(A) tails of Oxford Nanopore sequencing reads. It uses convolutional neural networks applied to Gramian Angular Field representations of raw current signals to identify cytidine (C), guanosine (G), and uridine (U) residues within otherwise pure poly(A) tails.
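
For readers unfamiliar with Gramian Angular Fields, the sketch below shows the textbook construction on a toy vector. It is not ninetails' own code: the function name gramian_angular_field, the min-max rescaling to [-1, 1], and the toy values are assumptions made purely for illustration.

```r
# Generic Gramian Angular Field of a numeric vector (illustrative only).
gramian_angular_field <- function(signal, method = c("summation", "difference")) {
  method <- match.arg(method)
  stopifnot(is.numeric(signal), length(signal) >= 2, diff(range(signal)) > 0)

  # Rescale to [-1, 1] so that every value has a valid arccosine.
  scaled <- 2 * (signal - min(signal)) / (max(signal) - min(signal)) - 1
  scaled <- pmin(pmax(scaled, -1), 1)  # guard against rounding overshoot

  # Encode each value as an angle, then combine all pairs of angles.
  phi <- acos(scaled)
  if (method == "summation") {
    outer(phi, phi, function(a, b) cos(a + b))
  } else {
    outer(phi, phi, function(a, b) sin(a - b))
  }
}

# Toy values standing in for a short stretch of raw current signal.
toy_signal <- c(80, 82, 81, 95, 97, 83, 80)
gasf <- gramian_angular_field(toy_signal)
dim(gasf)  # 7 7
```

Each signal of length n becomes an n x n matrix, which is the kind of image-like input a convolutional network can classify.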

For complete documentation, see the Ninetails Wiki.

Installation

Ninetails is not currently available on CRAN or Bioconductor. Install it from GitHub using devtools:

install.packages("devtools")
devtools::install_github('LRB-IIMCB/ninetails')
library(ninetails)

Note for Windows users: Install Rtools before installing devtools so that packages compile correctly: https://cran.r-project.org/bin/windows/Rtools/

Additional dependencies

Depending on which pipeline you use, additional components are required:

For Dorado pipelines (DRS and cDNA):

Python with pod5 module: pip install pod5
Keras/TensorFlow for R: keras::install_keras()
The reticulate R package for Python interoperability

For Guppy legacy pipeline:

rhdf5 from Bioconductor: BiocManager::install("rhdf5")
Keras/TensorFlow for R
VBZ compression plugin (for newer MinKNOW data)

For the interactive dashboard (optional):

shiny, plotly, htmltools, DT, base64enc, cowplot
Install with: install.packages(c("shiny", "plotly", "htmltools", "DT", "base64enc", "cowplot"))

See the Wiki for detailed instructions.
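
Before running a pipeline, it can help to confirm which of the components listed above are already present. The following is a minimal sketch in base R; the helper names check_packages and pod5_available and the grouping in required_packages are illustrative and not part of ninetails.

```r
# Packages named in the lists above, grouped by the component that needs them.
required_packages <- list(
  dorado    = c("keras", "reticulate"),
  guppy     = c("keras", "rhdf5"),
  dashboard = c("shiny", "plotly", "htmltools", "DT", "base64enc", "cowplot")
)

# Return a named logical vector: TRUE where the package namespace can be loaded.
check_packages <- function(packages) {
  vapply(packages, requireNamespace, logical(1), quietly = TRUE)
}

status <- lapply(required_packages, check_packages)
print(status)

# The Python pod5 module can only be probed once reticulate is installed.
# Calling this function initializes Python, so it is defined but not run here.
pod5_available <- function() {
  requireNamespace("reticulate", quietly = TRUE) &&
    reticulate::py_module_available("pod5")
}
```

Any FALSE entry in status points to a package that still needs installing; call pod5_available() separately to check the Python side.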

Available pipelines

Ninetails provides three analysis pipelines:

Pipeline	Basecaller	Input format	Function	Status
Dorado DRS	Dorado ≥ 1.0.0	POD5 + summary	`check_tails_dorado_DRS()`	Recommended
Dorado cDNA	Dorado ≥ 1.0.0	POD5 + BAM + summary	`check_tails_dorado_cDNA()`	Under development
Guppy	Guppy ≤ 6.0.0	fast5 + Nanopolish	`check_tails_guppy()`	Legacy

Choose the pipeline that matches your basecaller and sequencing protocol. For new direct RNA analyses, use the Dorado DRS pipeline; the Dorado cDNA pipeline is still under development.

Quick start: Dorado DRS pipeline

The recommended pipeline for direct RNA sequencing (DRS) data:

results <- ninetails::check_tails_dorado_DRS(
  dorado_summary = "path/to/dorado_summary.txt",
  pod5_dir       = "path/to/pod5_dir/",
  num_cores      = 2,
  qc             = TRUE,
  save_dir       = "~/output/",
  prefix         = "my_experiment"
)

Required inputs

dorado_summary: Dorado summary file with columns read_id, filename, poly_tail_length, poly_tail_start, poly_tail_end. Generate this with dorado summary on your aligned BAM file.
pod5_dir: Directory containing POD5 files from the sequencing run.
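
A quick header check can catch a malformed summary file before a long run. The sketch below assumes the summary is a tab-separated text file whose first line holds the column names; the helper missing_summary_columns and the toy file are illustrative and not part of ninetails.

```r
# Columns the Dorado DRS pipeline expects, as listed above.
required_columns <- c("read_id", "filename", "poly_tail_length",
                      "poly_tail_start", "poly_tail_end")

# Return the required columns that are absent from the file header.
missing_summary_columns <- function(summary_path) {
  header_line <- readLines(summary_path, n = 1)
  header <- strsplit(header_line, "\t", fixed = TRUE)[[1]]
  setdiff(required_columns, header)
}

# Toy header that lacks poly_tail_end, written to a temporary file.
toy_path <- tempfile(fileext = ".txt")
writeLines(
  paste(c("read_id", "filename", "poly_tail_length", "poly_tail_start"),
        collapse = "\t"),
  toy_path
)
missing_summary_columns(toy_path)  # "poly_tail_end"
```

An empty result (character(0)) means all required columns are present.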

Output

The function returns a named list with two data frames:

read_classes: Per-read classification (decorated, blank, or unclassified) with columns readname, contig, polya_length, qc_tag, class, and comments.
nonadenosine_residues: Detailed non-A positions for decorated reads with columns readname, contig, prediction (C/G/U), est_nonA_pos, polya_length, and qc_tag.

Results are also saved as tab-separated files in save_dir.
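
The returned list can be explored with ordinary data frame tools. The sketch below uses a small hand-made stand-in (toy_results) with a subset of the documented columns, so it runs without any sequencing data; all values in it are invented for illustration.

```r
# Toy stand-in for the list returned by the pipeline (values are invented).
toy_results <- list(
  read_classes = data.frame(
    readname = c("read_a", "read_b", "read_c", "read_d"),
    class    = c("decorated", "blank", "blank", "unclassified"),
    comments = c("YAY", "MAU", "MPU", "IRL")
  ),
  nonadenosine_residues = data.frame(
    readname     = "read_a",
    prediction   = "C",
    est_nonA_pos = 42
  )
)

# Number of reads in each class.
class_counts <- table(toy_results$read_classes$class)

# Fraction of reads carrying at least one non-adenosine residue.
decorated_fraction <- mean(toy_results$read_classes$class == "decorated")

# Number of reported non-A positions per decorated read.
residues_per_read <- table(toy_results$nonadenosine_residues$readname)

print(class_counts)
print(decorated_fraction)  # 0.25
```

With real output, replace toy_results with the object returned by the pipeline call.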

Classification codes

Class	Code	Meaning
`decorated`	`YAY`	Non-adenosine residue detected
`blank`	`MAU`	No signal deviation; pure poly(A)
`blank`	`MPU`	Signal deviation present but predicted as adenosine
`unclassified`	`IRL`	Poly(A) tail too short (< 10 nt)
`unclassified`	`UNM`	Read unmapped
`unclassified`	`BAC`	Invalid coordinates
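
To see why reads ended up in a given class, the codes can be tallied against the table above. This sketch assumes the code is stored in the comments column of read_classes; the helper describe_codes is illustrative, and any code not listed in the table is silently ignored.

```r
# Code-to-meaning lookup copied from the table above.
code_meanings <- c(
  YAY = "Non-adenosine residue detected",
  MAU = "No signal deviation; pure poly(A)",
  MPU = "Signal deviation present but predicted as adenosine",
  IRL = "Poly(A) tail too short (< 10 nt)",
  UNM = "Read unmapped",
  BAC = "Invalid coordinates"
)

# Count reads per code, keeping codes with zero reads in the output.
describe_codes <- function(codes) {
  counts <- table(factor(codes, levels = names(code_meanings)))
  data.frame(
    code    = names(code_meanings),
    meaning = unname(code_meanings),
    n_reads = as.integer(counts)
  )
}

# Toy vector standing in for read_classes$comments.
describe_codes(c("YAY", "MAU", "MAU", "IRL"))
```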

Working with results

After running the pipeline, use the postprocessing functions to load, merge, annotate, and summarize the data:

# Load multiple samples (samples_table describes each sample and its output files)
class_data <- ninetails::read_class_multiple(samples_table)
residue_data <- ninetails::read_residue_multiple(samples_table)

# Merge into one table
merged <- ninetails::merge_nonA_tables(class_data, residue_data)

# Annotate with gene symbols (requires biomaRt)
class_data <- ninetails::annotate_with_biomart(class_data, species = "mmusculus")
residue_data <- ninetails::annotate_with_biomart(residue_data, species = "mmusculus")

# Summarize per transcript
nonA_summary <- ninetails::summarize_nonA(merged, summary_factors = "group")

See vignette("postprocessing") for the complete workflow.
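
The samples_table object used above is not defined in this guide. A minimal sketch of how such a table might be built follows; the column names (sample_name, group, class_path, residue_path), the sample names, and the file naming scheme are all assumptions, so check ?read_class_multiple and ?read_residue_multiple for the layout the functions actually expect.

```r
# Hypothetical sample sheet: one row per sample (all names are placeholders).
sample_names <- c("wt_1", "wt_2", "ko_1", "ko_2")

samples_table <- data.frame(
  sample_name  = sample_names,
  group        = c("wt", "wt", "ko", "ko"),
  # Assumed file naming: <prefix>_read_classes.txt and
  # <prefix>_nonadenosine_residues.txt under a per-sample directory.
  class_path   = file.path("path/to", sample_names,
                           paste0(sample_names, "_read_classes.txt")),
  residue_path = file.path("path/to", sample_names,
                           paste0(sample_names, "_nonadenosine_residues.txt")),
  stringsAsFactors = FALSE
)

# List any referenced files that do not exist before loading.
all_paths <- c(samples_table$class_path, samples_table$residue_path)
missing_files <- all_paths[!file.exists(all_paths)]
print(missing_files)
```

The group column here matches the summary_factors = "group" argument used in the summarizing step.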

Visualization

Ninetails provides both static plotting functions and an interactive Shiny dashboard.

Static plots

# Read classification
ninetails::plot_class_counts(class_data, grouping_factor = "sample_name")

# Residue frequency
ninetails::plot_residue_counts(residue_data, grouping_factor = "sample_name")

# Non-A abundance per read
ninetails::plot_nonA_abundance(residue_data, grouping_factor = "sample_name")

# Poly(A) tail length
ninetails::plot_tail_distribution(class_data, grouping_factor = "sample_name")

# Non-A position distribution
ninetails::plot_rug_density(residue_data, base = "C", max_length = 200)

See vignette("plotting") for all plotting functions and parameters.

Interactive dashboard

Launch the Shiny dashboard for interactive exploration of results with configurable filters, per-sample rug plots, signal visualization, and downloadable reports:

ninetails::launch_signal_browser(
  class_file   = "/path/to/read_classes.txt",
  residue_file = "/path/to/nonadenosine_residues.txt",
  summary_file = "/path/to/dorado_summary.txt",
  pod5_dir     = "/path/to/pod5/"
)

See vignette("shiny_app") for complete dashboard documentation.

Next steps

vignette("detection") — Detailed pipeline documentation and parameters
vignette("postprocessing") — Loading, merging, correcting, and summarizing results
vignette("plotting") — All visualization functions with parameter tables
vignette("signal_inspection") — Raw signal visualization from fast5 and POD5 files
vignette("shiny_app") — Interactive analysis dashboard
Ninetails Wiki — Complete documentation

Troubleshooting

Keras/TensorFlow not found: Run keras::install_keras() to install the Python backend. Ninetails requires a working Keras installation for the CNN classification step.

POD5 module not available: Install the Python pod5 package: pip install pod5. Verify with reticulate::py_module_available("pod5").

Memory issues during processing: Reduce the part_size parameter (default: 40,000 reads per chunk). Use cleanup = TRUE to remove intermediate files automatically.

Empty results: Check that your Dorado summary file contains the required columns: read_id, filename, poly_tail_length, poly_tail_start, poly_tail_end. Reads with poly_tail_start = 0 are classified as BAC (bad coordinates).
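
When results come back empty, counting how many reads have poly_tail_start = 0 shows how much of the input would be classified as BAC. The sketch below assumes a tab-separated summary file with a header line; the helper count_zero_start and the toy rows are illustrative and not part of ninetails.

```r
# Count reads whose poly(A) start coordinate is 0 in a Dorado summary file.
count_zero_start <- function(summary_path) {
  summary_table <- utils::read.delim(summary_path, stringsAsFactors = FALSE)
  stopifnot("poly_tail_start" %in% names(summary_table))
  c(
    total      = nrow(summary_table),
    zero_start = sum(summary_table$poly_tail_start == 0, na.rm = TRUE)
  )
}

# Toy summary with three invented reads, one of which has a start of 0.
toy_summary <- data.frame(
  read_id          = c("read_a", "read_b", "read_c"),
  filename         = "toy.pod5",
  poly_tail_length = c(85, 0, 120),
  poly_tail_start  = c(1500, 0, 2100),
  poly_tail_end    = c(2400, 0, 3300)
)
toy_summary_path <- tempfile(fileext = ".txt")
write.table(toy_summary, toy_summary_path, sep = "\t",
            quote = FALSE, row.names = FALSE)

count_zero_start(toy_summary_path)  # total = 3, zero_start = 1
```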

Citation

Please cite Ninetails as:

Gumińska, N., Matylla-Kulińska, K., Krawczyk, P.S. et al. Direct profiling of non-adenosines in poly(A) tails of endogenous and therapeutic mRNAs with Ninetails. Nat Commun 16, 2664 (2025). https://doi.org/10.1038/s41467-025-57787-6