
Creates list of poly(A) tail chunks (Dorado mode) centered on significant signal deviations.
Source: R/ninetails_core_functions_dorado_DRS.R
create_tail_chunk_list_dorado.Rd
Processes raw poly(A) tail signals and pseudomoves (as generated by Dorado) in parallel to extract candidate signal fragments potentially containing non-A nucleotides. Each fragment is 100 signal points long, centered on pseudomove runs of sufficient length, and is returned with its positional coordinates. The resulting data are organized into a nested list keyed by read IDs.
Arguments
- tail_feature_list
list object produced by create_tail_feature_list or an equivalent Dorado-tail feature extraction function. Must contain per-read entries with $tail_signal and $tail_pseudomoves.
- num_cores
numeric [1]. Number of physical cores to use in processing. Use at most one fewer than the number of cores available on your machine.
Value
A nested list containing the segmented tail data (chunks and coordinates), organized by read IDs. Each read entry contains one or more fragments, where each fragment is a list with:
- chunk_sequence: numeric vector of raw signal values (length 100)
- chunk_start_pos: integer, starting index of the chunk
- chunk_end_pos: integer, ending index of the chunk
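The nested structure described above can be explored with base R alone. The sketch below builds a small mock object in the documented shape rather than calling the function itself. The read IDs, signal values, and coordinates are invented for illustration, and it assumes that fragments are stored as unnamed list elements and that chunk_end_pos is inclusive.

```r
# Hypothetical mock of the documented return structure; real output would come
# from create_tail_chunk_list_dorado(). Read IDs and values are invented.
set.seed(1)
tail_chunk_list <- list(
  read_0001 = list(
    list(chunk_sequence = rnorm(100), chunk_start_pos = 151L, chunk_end_pos = 250L),
    list(chunk_sequence = rnorm(100), chunk_start_pos = 420L, chunk_end_pos = 519L)
  ),
  read_0002 = list(
    list(chunk_sequence = rnorm(100), chunk_start_pos = 35L, chunk_end_pos = 134L)
  )
)

# Number of fragments found in each read
fragments_per_read <- lengths(tail_chunk_list)

# Flatten the coordinates into one data frame, one row per fragment
chunk_coords <- do.call(rbind, lapply(names(tail_chunk_list), function(read_id) {
  fragments <- tail_chunk_list[[read_id]]
  data.frame(
    read_id = read_id,
    chunk_start_pos = vapply(fragments, function(f) f$chunk_start_pos, integer(1)),
    chunk_end_pos = vapply(fragments, function(f) f$chunk_end_pos, integer(1))
  )
}))

print(fragments_per_read)
print(chunk_coords)
```

A flat table of this kind is convenient for checking how many candidate fragments each read contributes before passing the chunks downstream.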
Details
This Dorado-specific function differs from the Guppy-based version:
- moves are not used (to avoid costly BAM parsing and processing)
- pseudomoves are corrected at the tail ends (last 3 values forced to 0)
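The tail-end correction and the run-centered chunking can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name extract_chunks_sketch, the min_run threshold of 5, the assumption that non-zero pseudomoves mark deviations, the assumption that signal and pseudomoves have equal length, and the NA padding of chunks that overhang the signal are all hypothetical choices made for the example.

```r
# Illustrative sketch only. min_run, the centering rule, and NA padding are
# assumptions; the documented facts are the 100-point chunk length and the
# zeroing of the last 3 pseudomoves.
extract_chunks_sketch <- function(tail_signal, tail_pseudomoves,
                                  min_run = 5L, chunk_len = 100L) {
  n <- length(tail_pseudomoves)
  if (n == 0L) return(list())

  # Tail-end correction: force the last 3 pseudomoves to 0
  tail_pseudomoves[seq.int(max(1L, n - 2L), n)] <- 0

  # Run-length encode and keep non-zero runs of sufficient length
  runs <- rle(tail_pseudomoves)
  run_end <- cumsum(runs$lengths)
  run_start <- run_end - runs$lengths + 1L
  keep <- runs$values != 0 & runs$lengths >= min_run

  lapply(which(keep), function(i) {
    # Center a chunk_len-point window on the midpoint of the run
    centre <- (run_start[i] + run_end[i]) %/% 2L
    start <- centre - chunk_len %/% 2L + 1L
    end <- start + chunk_len - 1L
    idx <- seq.int(start, end)
    # Positions outside the signal are returned as NA (an assumption)
    idx[idx < 1L | idx > length(tail_signal)] <- NA_integer_
    list(chunk_sequence = tail_signal[idx],
         chunk_start_pos = start,
         chunk_end_pos = end)
  })
}

set.seed(42)
signal <- rnorm(300)
pseudomoves <- rep(0, 300)
pseudomoves[141:150] <- 1  # a 10-point run, long enough to qualify
pseudomoves[296:300] <- 1  # a 5-point run at the end; the correction cuts it to 2 points
chunks <- extract_chunks_sketch(signal, pseudomoves)
```

In this toy input the run at the very end of the tail would qualify on length alone, but zeroing the last 3 pseudomoves shortens it below the threshold, so only the interior run yields a chunk.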
Parallelization is handled with foreach and doSNOW, allowing efficient scaling across multiple CPU cores. A progress bar is displayed to monitor job completion.
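The general foreach and doSNOW pattern with a text progress bar looks roughly like the sketch below. It is a generic illustration of the idiom, not the function's internal code: the per-read task (computing the signal range) and the input data are hypothetical stand-ins, and the foreach, doSNOW, and snow packages are assumed to be installed.

```r
library(foreach)
library(doSNOW)

# Hypothetical stand-in for per-read tail signals
set.seed(7)
signals <- replicate(8, rnorm(200), simplify = FALSE)
names(signals) <- sprintf("read_%04d", seq_along(signals))

# Start a socket cluster and register it as the foreach backend
num_cores <- 2
cl <- snow::makeCluster(num_cores, type = "SOCK")
registerDoSNOW(cl)

# doSNOW calls the progress function each time a task completes
pb <- utils::txtProgressBar(max = length(signals), style = 3)
opts <- list(progress = function(n) utils::setTxtProgressBar(pb, n))

# One task per read; the body here is a placeholder for the real per-read work
result <- foreach(i = seq_along(signals), .options.snow = opts) %dopar% {
  range(signals[[i]])
}
names(result) <- names(signals)

close(pb)
snow::stopCluster(cl)
```

Stopping the cluster explicitly releases the worker processes; leaving one core free, as the num_cores argument recommends, keeps the machine responsive while the workers run.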