
Extracts decoration-centered fragments of poly(A) tails for all reads and appends positional data to a nested list.
Source: R/ninetails_training_dataset_production_functions.R
create_tail_chunk_list_trainingset.Rd
Parallel wrapper around split_tail_centered_trainingset.
For every read in the feature list, it extracts 100-element signal chunks
centered on potential non-A modifications and organises them in a nested
list keyed by read ID.
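The idea of a fixed-width, centered chunk can be illustrated with a minimal sketch. This is not the package's implementation: `extract_centered_chunk` is a hypothetical helper, the toy signal is random, and padding with `NA` at the signal edges is an assumption made only for illustration. The `pseudomoves` field is omitted for brevity.

```r
# Hypothetical helper, NOT part of ninetails: cut a fixed-width window
# around a candidate position in a numeric signal vector.
extract_centered_chunk <- function(signal, center, width = 100L) {
  start <- center - width %/% 2L
  end <- start + width - 1L
  idx <- seq.int(start, end)

  # Assumption: positions falling outside the signal are padded with NA.
  valid <- idx >= 1L & idx <= length(signal)
  chunk <- rep(NA_real_, width)
  chunk[valid] <- signal[idx[valid]]

  list(
    chunk_sequence = chunk,
    chunk_start_pos = start,
    chunk_end_pos = end
  )
}

set.seed(1)
toy_signal <- rnorm(300)  # stand-in for a poly(A) tail signal
chunk <- extract_centered_chunk(toy_signal, center = 150L)
length(chunk$chunk_sequence)
```

The real per-read logic lives in split_tail_centered_trainingset; consult its documentation for how candidate positions are chosen and how edge cases are handled.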
Arguments
- tail_feature_list
List object produced by
create_tail_feature_list_trainingset.
- num_cores
Numeric
[1]. Number of physical cores to use. Use at most one fewer than the number of cores available on your machine.
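One way to pick a value for num_cores is sketched below, using `parallel::detectCores()` from base R's parallel package. The fallback to a single core when detection returns `NA` is an assumption of this sketch, not a package requirement.

```r
library(parallel)

# Count physical (not logical) cores; detectCores() may return NA
# on platforms where this cannot be determined.
physical_cores <- parallel::detectCores(logical = FALSE)
if (is.na(physical_cores)) {
  physical_cores <- 1L  # assumption: fall back to a single core
}

# Leave one core free for the operating system, but never go below 1.
num_cores <- max(1L, physical_cores - 1L)

# The value could then be passed on, for example:
# tail_chunk_list <- create_tail_chunk_list_trainingset(
#   tail_feature_list = tail_feature_list,
#   num_cores = num_cores
# )
```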
Value
A named nested list organised by read IDs, where each element
is a list of chunk sublists as returned by
split_tail_centered_trainingset.
Details
This training-set variant is intended for preparing training and
validation datasets. Each chunk sublist contains four fields:
chunk_sequence, chunk_start_pos, chunk_end_pos,
and pseudomoves (see
split_tail_centered_trainingset).
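The shape of the returned object can be explored on a hand-built stand-in. In the sketch below, the read IDs, positions, and the contents and length of `pseudomoves` are invented for illustration; only the four field names come from the description above.

```r
# Build one mock chunk sublist with the four documented fields.
# All values are placeholders, not real nanopore data.
mock_chunk <- function(start) {
  list(
    chunk_sequence = rep(0, 100),
    chunk_start_pos = start,
    chunk_end_pos = start + 99L,
    pseudomoves = rep(0L, 100)  # assumption: same length as the chunk
  )
}

# Mock nested list keyed by (made-up) read IDs.
tail_chunk_list <- list(
  "read-0001" = list(mock_chunk(101L), mock_chunk(340L)),
  "read-0002" = list(mock_chunk(57L))
)

# Number of chunks extracted per read.
chunks_per_read <- lengths(tail_chunk_list)

# Flatten the positional fields into one data frame, one row per chunk.
chunk_index <- do.call(rbind, lapply(names(tail_chunk_list), function(read_id) {
  chunks <- tail_chunk_list[[read_id]]
  data.frame(
    read_id = read_id,
    chunk_start_pos = vapply(chunks, function(ch) ch$chunk_start_pos, integer(1)),
    chunk_end_pos = vapply(chunks, function(ch) ch$chunk_end_pos, integer(1))
  )
}))

chunk_index
```

A flat index like this is convenient for sanity checks, such as confirming that every chunk spans the expected number of positions before moving on to filtering.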
See also
create_tail_feature_list_trainingset for the
preceding pipeline step,
split_tail_centered_trainingset for the per-read
extraction logic,
filter_nonA_chunks_trainingset for the next pipeline
step,
create_gaf_list for GAF conversion downstream.