
Parallel wrapper around split_tail_centered_trainingset. For every read in the feature list, it extracts 100-element signal chunks centered on potential non-A modifications and organises them in a nested list keyed by read ID.

Usage

create_tail_chunk_list_trainingset(tail_feature_list, num_cores)

Arguments

tail_feature_list

List object produced by create_tail_feature_list_trainingset.

num_cores

Numeric [1]. Number of physical cores to use. Use at most one fewer than the total number of cores available on your machine.
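A minimal sketch of one way to pick a safe value for num_cores, using base R's parallel package (detectCores() can return NA on some platforms, so the sketch falls back to a single core):

```r
library(parallel)

# Count physical cores; detectCores() may return NA on some platforms.
n_physical <- detectCores(logical = FALSE)
if (is.na(n_physical)) n_physical <- 1L

# Leave one core free for the OS and other processes, but never go below 1.
safe_cores <- max(1L, n_physical - 1L)
```

The resulting safe_cores value can then be passed as the num_cores argument.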

Value

A named nested list organised by read IDs, where each element is a list of chunk sublists as returned by split_tail_centered_trainingset.

Details

This training-set variant is intended for preparing training and validation datasets. Each chunk sublist contains four fields: chunk_sequence, chunk_start_pos, chunk_end_pos, and pseudomoves (see split_tail_centered_trainingset).
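A hypothetical illustration of the nested structure described above, useful for orienting downstream code; the read ID and all values below are invented for illustration and do not come from a real run:

```r
# Mock-up of the returned structure: a named list keyed by read ID, where each
# read maps to a list of chunk sublists with the four documented fields.
tail_chunk_list <- list(
  `0a1b2c3d-mock-read-id` = list(
    list(
      chunk_sequence  = rnorm(100),  # 100-element signal chunk (invented values)
      chunk_start_pos = 250L,        # invented coordinates
      chunk_end_pos   = 349L,
      pseudomoves     = integer(100)
    )
  )
)

# Accessing the first chunk of the first read:
first_chunk <- tail_chunk_list[[1]][[1]]
length(first_chunk$chunk_sequence)
```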

See also

create_tail_feature_list_trainingset for the preceding pipeline step, split_tail_centered_trainingset for the per-read extraction logic, filter_nonA_chunks_trainingset for the next pipeline step, create_gaf_list for GAF conversion downstream.

Examples

if (FALSE) { # \dontrun{

create_tail_chunk_list_trainingset(
  tail_feature_list = tail_feature_list,
  num_cores = 2)

} # }