
Package index
Pipeline wrappers
Top-level functions for running the complete analysis pipeline. Choose the appropriate wrapper based on your basecaller and data type.
- check_tails_dorado_DRS() - Complete Oxford Nanopore poly(A) tail analysis pipeline for Dorado DRS data
- check_tails_dorado_cDNA() - Complete Oxford Nanopore poly(A)/poly(T) tail analysis pipeline for Dorado cDNA data
- check_tails_guppy() - Wrapper for complete DRS processing with the ninetails package (legacy mode)
Dorado DRS pipeline
Functions for processing direct RNA sequencing (DRS) data in POD5 format, basecalled with Dorado ≥ 1.0.0.
- preprocess_inputs() - Preprocess Dorado inputs for ninetails analysis (no BAM processing)
- process_dorado_summary() - Process Dorado summary file and split it into smaller parts
- filter_dorado_summary() - Filter Dorado summary for reads fulfilling ninetails quality criteria
- extract_tails_from_pod5() - Extract poly(A) tail signal segments from POD5 files using parallel Python processing
- create_tail_features_list_dorado() - Create a nested list of Dorado tail features (raw signal + pseudomoves)
- create_tail_chunk_list_dorado() - Create list of poly(A) tail chunks (Dorado mode) centered on significant signal deviations
- split_tail_centered_dorado() - Extract fragments of poly(A) tail signal (Dorado mode) containing potential modifications, along with their positional indices (coordinates) within the tail
- process_dorado_signal_files() - Process Dorado poly(A) signal files for non-A prediction and tail chunk extraction
- create_outputs_dorado() - Create ninetails output tables for Dorado DRS pipeline
Dorado cDNA pipeline
Functions for processing cDNA sequencing data, including BAM file processing, basecalled sequence extraction, and read orientation classification (polyA vs polyT).
- preprocess_inputs_cdna() - Preprocess Dorado inputs for ninetails cDNA analysis
- split_bam_file_cdna() - Split BAM file into parts based on read IDs from summary file
- extract_data_from_bam() - Extract data from BAM file for cDNA analysis
- detect_orientation_single() - Detect poly tail type for a single sequence using Dorado-style algorithm
- detect_orientation_multiple() - Classify multiple cDNA read orientations using Dorado-style poly tail detection
- process_polya_reads_cdna() - Process polyA reads using standard ninetails pipeline
- process_polyt_reads_cdna() - Process polyT reads using ninetails pipeline
- create_outputs_dorado_cdna() - Create ninetails output tables for Dorado cDNA pipeline
- merge_cdna_results() - Merge polyA and polyT processing results for cDNA analysis
- save_cdna_outputs() - Save cDNA pipeline outputs in standard ninetails format
Guppy legacy pipeline
Functions for processing DRS data basecalled with Guppy ≤ 6.0.0, using Fast5 files and nanopolish poly(A) coordinates. This pipeline is no longer actively developed.
- extract_polya_data() - Extract poly(A) data from nanopolish output and sequencing summary
- extract_tail_data() - Extract tail features of a single RNA read from a multi-Fast5 file
- create_tail_feature_list() - Create list of poly(A) tail features from multi-Fast5 files
- create_tail_chunk_list() - Create list of poly(A) tail chunks centered on significant signal deviations
- split_tail_centered() - Extract modification-centered signal fragments from a poly(A) tail
- create_gaf() - Convert ONT signal to Gramian Angular Field
- create_gaf_list() - Create list of Gramian Angular Field matrices from tail chunks
- process_polya_complete() - Process a single (unsplit) poly(A) data file through the Guppy pipeline
- process_polya_parts() - Process poly(A) data split into multiple parts through the Guppy pipeline
- split_polya_data() - Split large poly(A) data file into smaller parts
- create_outputs() - Create ninetails output tables (Guppy legacy pipeline)
- save_outputs() - Save pipeline outputs to files
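The Gramian Angular Field conversion named in this section can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the package's R implementation. The function names, the min-max rescaling to [-1, 1], and the example values are all assumptions. It computes both the summation field (GASF) and the difference field (GADF) for a short signal.

```python
import math

def rescale(signal):
    """Min-max rescale a signal to [-1, 1] so that arccos is defined."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0.0] * len(signal)  # flat signal: map every sample to 0
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]

def gramian_angular_fields(signal):
    """Return (GASF, GADF) as n x n nested lists (hypothetical helper)."""
    # Clamp to guard against floating-point drift just outside [-1, 1]
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in rescale(signal)]
    gasf = [[math.cos(a + b) for b in phi] for a in phi]  # summation field
    gadf = [[math.sin(a - b) for b in phi] for a in phi]  # difference field
    return gasf, gadf

signal = [80.0, 82.0, 95.0, 81.0]  # made-up current values, not real data
gasf, gadf = gramian_angular_fields(signal)
```

By construction the GASF is symmetric and the GADF is antisymmetric with a zero diagonal. Stacking the two matrices gives the kind of two-channel input that a CNN can consume.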
Training dataset production
Functions for preparing training and validation datasets for the convolutional neural network (CNN) model.
- prepare_trainingset() - Filter out signals of a given nucleotide type for neural network training-set preparation
- extract_tail_data_trainingset() - Extract tail features of a single RNA read from the corresponding basecalled multi-Fast5 file
- create_tail_feature_list_trainingset() - Extract features of poly(A) tails of ONT RNA reads required for finding non-A nucleotides within the given tails
- create_tail_feature_list_A() - Extract features of poly(A) tails containing only A nucleotides for training-set preparation
- create_tail_chunk_list_trainingset() - Extract decoration-centered fragments of poly(A) tails for all reads and append positional data to a nested list
- create_tail_chunk_list_A() - Create list of tail chunks containing only A nucleotides
- split_tail_centered_trainingset() - Extract decoration-centered fragments of poly(A) tail signal along with positional coordinates
- split_with_overlaps() - Split signal into overlapping fragments of equal length
- filter_nonA_chunks_trainingset() - Filter read chunks containing non-adenosine nucleotides of interest for neural network training-set preparation
- filter_signal_by_threshold_trainingset() - Detect outliers (peaks and valleys) in ONT signal using z-scores
- create_gaf_list_A() - Produce list of GAFs containing exclusively A nucleotides for neural network training
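Splitting a signal into overlapping fragments of equal length, as listed above, can be sketched as a sliding window. This is an illustrative Python sketch and not the package's code. The name `split_overlapping`, its parameters, and the choice to drop a trailing remainder shorter than one segment are assumptions.

```python
def split_overlapping(signal, segment, overlap):
    """Split a sequence into windows of `segment` items sharing `overlap` items.

    Assumption: a trailing remainder shorter than `segment` is dropped.
    """
    if not 0 <= overlap < segment:
        raise ValueError("overlap must satisfy 0 <= overlap < segment")
    step = segment - overlap  # distance between consecutive window starts
    return [signal[i:i + segment]
            for i in range(0, len(signal) - segment + 1, step)]

# Ten made-up samples, windows of 4 that share 2 samples with their neighbour
chunks = split_overlapping(list(range(10)), segment=4, overlap=2)
```

A larger overlap yields more fragments per tail, at the cost of more redundancy between neighbouring fragments.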
Data postprocessing
Functions for correcting, reclassifying, and reshaping ninetails output tables after the pipeline has run.
- correct_class_data() - Correct the classification of reads contained in the class_data table
- correct_residue_data() - Mark uncertain positions of non-A residues in ninetails output data
- correct_labels() - Correct read class labels for backward compatibility
- reclassify_ninetails_data() - Reclassify ambiguous non-A residues to mitigate potential errors inherited from nanopolish segmentation
- read_class_single() - Read ninetails read_classes data frame from file
- read_class_multiple() - Read multiple ninetails read_classes outputs at once
- read_residue_single() - Read ninetails nonadenosine_residues data from file
- read_residue_multiple() - Read multiple ninetails nonadenosine_residues outputs at once
- merge_nonA_tables() - Merge ninetails tabular outputs (read classes and nonadenosine residue data) into one concise table
- spread_nonA_residues() - Reshape nonadenosine_residues data frame to wide format
- annotate_with_biomart() - Annotate ninetails output data with biomaRt
Statistics
Functions for statistical analysis and quantification of non-adenosine residues across reads and conditions.
- calculate_fisher() - Perform Fisher's exact test per transcript with BH p-value adjustment
- nonA_fisher() - Perform Fisher's exact test on a single transcript in ninetails output
- count_class() - Count read classes found in a read_classes data frame produced by the ninetails pipeline
- count_nonA_abundance() - Count reads by number of non-A occurrences
- count_residues() - Count non-A residues found in a nonadenosine_residues data frame produced by the ninetails pipeline
- summarize_nonA() - Produce summary table of non-A occurrences within an analyzed dataset
- nanopolish_qc() - Aggregate nanopolish polya quality control information
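The per-transcript testing scheme named above (Fisher's exact test followed by Benjamini-Hochberg adjustment) can be sketched as follows. This is a standalone Python illustration and not the package's implementation. The helper names, the 2x2 table layout, and the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables no more likely than the observed one
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= observed * (1 + 1e-7))
    return min(1.0, p)

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts per transcript:
# (decorated reads in A, blank reads in A, decorated reads in B, blank reads in B)
tables = {"tx1": (8, 2, 1, 5), "tx2": (5, 5, 5, 5)}
pvals = {tx: fisher_exact_two_sided(*t) for tx, t in tables.items()}
padj = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))
```

Adjusting across transcripts controls the false discovery rate when many transcripts are tested at once.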
Visualisation
Plotting functions for inspection of raw signals, GAF images, classification results, and statistical summaries.
- plot_class_counts() - Plot read class data per category assigned to the analyzed reads
- plot_gaf() - Create a visual representation of the Gramian Angular Field corresponding to a given poly(A) tail fragment (chunk)
- plot_multiple_gaf() - Create a visual representation of multiple Gramian Angular Fields from the provided gaf_list (plots all GAFs in the list)
- plot_nanopolish_qc() - Plot QC data (qc_tag) inherited from the nanopolish polya function
- plot_nonA_abundance() - Plot abundances of reads with a given number of non-A residues per read
- plot_panel_characteristics() - Plot panel characteristics of ninetails output
- plot_residue_counts() - Plot counts of nonadenosine residues found in ninetails output data
- plot_rug_density() - Scatterplot of non-A residue positions within poly(A) tail
- plot_squiggle_fast5() - Draw an entire squiggle for a given read
- plot_squiggle_pod5() - Draw an entire squiggle for a given read from a POD5 file
- plot_tail_chunk() - Draw a portion of the poly(A) tail squiggle (chunk) for a given read
- plot_tail_distribution() - Plot poly(A) tail length (or estimated non-A position) distribution in analyzed sample(s)
- plot_tail_range_fast5() - Draw tail range squiggle for a given read
- plot_tail_range_pod5() - Draw tail range squiggle for a given read from a POD5 file
Analysis dashboard
Interactive Shiny application for exploring ninetails results. Supports single-sample and multi-sample analysis via YAML configuration, including read classification, residue composition, poly(A) distributions, and raw signal visualization with non-A modification overlay.
- launch_signal_browser() - Launch the Ninetails Analysis Dashboard
tailfindr compatibility
Functions for converting tailfindr output into a format compatible with the ninetails Guppy legacy pipeline.
- convert_tailfindr_output() - Convert tailfindr results to a format compatible with ninetails
- check_polya_length_filetype() - Check and convert poly(A) length file format
Signal processing
Core signal processing utilities and CNN-related helpers used internally by the pipeline functions.
- filter_signal_by_threshold() - Detect outliers (peaks and valleys) in ONT signal using z-scores
- winsorize_signal() - Winsorize nanopore signal
- substitute_gaps() - Substitute short zero-gaps surrounded by nonzero pseudomoves
- combine_gafs() - Combine GASF and GADF into a two-channel array
- predict_gaf_classes() - Classify Gramian Angular Field matrices with a pretrained CNN
- load_keras_model() - Load Keras model for multiclass signal prediction
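Two of the ideas listed above, winsorizing a signal and flagging peaks and valleys by z-score, can be sketched in a few lines. This is an illustrative Python sketch only and is not the package's code. The quantile rule, the global (non-moving) z-score, the threshold, and all names are assumptions.

```python
import statistics

def winsorize(signal, lower=0.05, upper=0.95):
    """Clamp values outside the given quantiles to the quantile values.

    Assumption: quantiles are picked by a simple floor-index rule.
    """
    ordered = sorted(signal)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]
    hi = ordered[int(upper * (n - 1))]
    return [min(max(x, lo), hi) for x in signal]

def zscore_flags(signal, threshold=3.0):
    """Return 1 for peaks, -1 for valleys, 0 otherwise (global z-score)."""
    mean = statistics.fmean(signal)
    sd = statistics.pstdev(signal)
    if sd == 0:
        return [0] * len(signal)  # flat signal: nothing to flag
    flags = []
    for x in signal:
        z = (x - mean) / sd
        flags.append(1 if z > threshold else -1 if z < -threshold else 0)
    return flags

# Made-up signal: a flat baseline with one spike in the middle
signal = [100.0] * 20 + [160.0] + [100.0] * 20
flags = zscore_flags(signal)
clipped = winsorize(list(range(1, 101)))
```

Winsorizing limits the influence of extreme samples without shortening the signal, while the z-score flags mark candidate positions for closer inspection.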
Sequence helpers
Helper functions for primer matching and DNA sequence manipulation used by the cDNA orientation classification step.
- reverse_complement() - Generate reverse complement of a DNA sequence
- edit_distance_hw() - Calculate edit distance with sliding window (HW mode)
- count_trailing_chars() - Count trailing occurrences of a character in a string
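The three helpers above correspond to small, well-known string operations, sketched here in Python for illustration rather than taken from the package. "HW mode" is read here as infix alignment, where the pattern may match anywhere inside the text without penalty for unaligned text ends. That reading, and all function names below, are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_trailing(text, char):
    """Count how many times a single character repeats at the end of text."""
    return len(text) - len(text.rstrip(char))

def edit_distance_infix(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    previous = [0] * (len(text) + 1)  # free start: pattern may begin anywhere
    for i, p in enumerate(pattern, start=1):
        current = [i]  # i pattern characters against empty text cost i edits
        for j, t in enumerate(text, start=1):
            current.append(min(
                previous[j] + 1,              # pattern character left unmatched
                current[j - 1] + 1,           # text character left unmatched
                previous[j - 1] + (p != t),   # match (0) or substitution (1)
            ))
        previous = current
    return min(previous)  # free end: pattern may finish anywhere

rc = reverse_complement("AACG")
tail_len = count_trailing("ACGTAAAA", "A")
dist = edit_distance_infix("ACGT", "TTACCTTT")
```

An infix distance of this kind suits primer matching, because a primer is expected somewhere near a read end rather than spanning the whole read.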
Input validation
Internal assertion and type-checking utilities used throughout the package for input validation.
- assert_condition() - Assert condition is TRUE, stop with message if FALSE
- assert_dir_exists() - Assert directory exists with informative error
- assert_file_exists() - Assert file exists with informative error
- check_fast5_filetype() - Check if the provided directory contains Fast5 files in the correct format
- check_output_directory() - Check and handle existing output directory for ninetails analysis
- is_RNA() - Check if Fast5 file contains RNA reads
- is_multifast5() - Check if Fast5 file is multi-read format
- is_string() - Test if x is a single non-empty character string
- no_na() - Check for no NA values
- get_mode() - Calculate the statistical mode of a numeric vector
- ninetails ninetails-package - ninetails: Nonadenosine Nucleotides in Poly(A) Tails
- `%>%` - Pipe operator