
Split BAM file into parts based on read IDs from summary file
Source: R/ninetails_core_functions_dorado_cDNA.R
split_bam_file_cdna.Rd
This function splits a large BAM file into smaller parts based on read IDs from corresponding Dorado summary files. This is essential for memory management when processing large cDNA datasets. The function filters the BAM file to include only reads present in the summary file and creates appropriately sized output files.
Usage
split_bam_file_cdna(
  bam_file,
  dorado_summary,
  part_size = 100000,
  save_dir,
  part_number,
  cli_log = message
)
Arguments
- bam_file
Character string. Path to input BAM file to be split.
- dorado_summary
Character string. Path to corresponding Dorado summary file containing read IDs to include in this part.
- part_size
Integer. Target number of reads per output file part. Default: 100000.
- save_dir
Character string. Directory where split BAM files will be saved.
- part_number
Integer. Part number for naming output files.
- cli_log
Function for logging messages and progress. Default: message.
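Example
The role of part_size can be illustrated with a minimal base R sketch that divides a vector of read IDs into consecutive chunks. This is an illustration only, not the package's implementation: the helper chunk_read_ids() is hypothetical, and the toy read IDs stand in for the read ID column of a real Dorado summary file.

```r
# Hypothetical helper, not part of ninetails: split a vector of read IDs
# into consecutive chunks holding at most `part_size` IDs each.
chunk_read_ids <- function(read_ids, part_size = 100000) {
  stopifnot(is.numeric(part_size), part_size >= 1)
  if (length(read_ids) == 0) {
    return(list())
  }
  # Reads 1..part_size get index 1, the next part_size reads get index 2, ...
  part_index <- ceiling(seq_along(read_ids) / part_size)
  unname(split(read_ids, part_index))
}

# Toy stand-in for the read IDs listed in a Dorado summary file
read_ids <- sprintf("read_%03d", 1:250)

parts <- chunk_read_ids(read_ids, part_size = 100)
lengths(parts)  # 100 100 50
```

Each chunk corresponds to the set of read IDs that one output part would contain; only the last chunk can be smaller than part_size.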