
Filter Dorado summary for reads fulfilling ninetails quality criteria
Source: R/ninetails_misc_helper_functions.R
filter_dorado_summary.Rd
Takes a Dorado summary file (from ONT Dorado basecaller) or a data frame containing equivalent summary information, and filters out reads that do not meet ninetails quality and alignment criteria.
Details
Reads must meet all four criteria to pass the filter:
Mapped: alignment_direction is "+" or "-", not "*"
High mapping quality: alignment_mapq > 0
Valid coordinates: poly_tail_start != 0 (in DRS, the adapter passes through the pore first, so a start position of 0 indicates a likely artifact)
Sufficient length: poly_tail_length >= 10 nt (shorter tails cannot be reliably processed by the CNN)
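The four criteria above can be expressed as a single logical mask over the summary columns. The following is a minimal base R sketch, not the package's actual implementation: the helper name sketch_filter is hypothetical, and treating rows with missing values as failing the filter is an assumption that the documentation above does not state.

```r
# Hypothetical helper illustrating the four criteria; NOT the ninetails source.
sketch_filter <- function(df) {
  keep <- df$alignment_direction %in% c("+", "-") &  # mapped
    df$alignment_mapq > 0 &                          # mapping quality above 0
    df$poly_tail_start != 0 &                        # valid tail coordinates
    df$poly_tail_length >= 10                        # tail of at least 10 nt
  # Assumption: rows with NA in any tested column are dropped
  keep <- !is.na(keep) & keep
  df[keep, , drop = FALSE]
}

df <- data.frame(
  read_id = c("read1", "read2"),
  alignment_direction = c("+", "*"),
  alignment_mapq = c(60, 0),
  poly_tail_start = c(100, 0),
  poly_tail_length = c(20, 5)
)

res <- sketch_filter(df)
print(res$read_id)  # only "read1" satisfies all four criteria
```

Because the conditions are combined with &, a read failing any single criterion is removed, which matches the "all four criteria" rule stated above.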
See also
preprocess_inputs and preprocess_inputs_cdna, where this function is called during pipeline initialization;
process_dorado_summary, for splitting large summary files
Examples
if (FALSE) { # \dontrun{
# From file
filtered <- filter_dorado_summary("dorado_summary.txt")
# From data frame
df <- data.frame(
read_id = c("read1", "read2"),
alignment_direction = c("+", "*"),
alignment_mapq = c(60, 0),
poly_tail_start = c(100, 0),
poly_tail_length = c(20, 5)
)
filtered <- filter_dorado_summary(df)
} # }