
Extract data from BAM file for cDNA analysis
Source: R/ninetails_core_functions_dorado_cDNA.R
extract_data_from_bam.Rd
This function extracts information from a BAM file for the reads listed in the corresponding Dorado summary file. It returns a data frame of read IDs with their associated fields, such as basecalled sequences, poly(A) tail lengths, and poly(A) coordinates, which can then be used in downstream analyses.
Arguments
- bam_file
Character string. Path to BAM file containing basecalled sequences.
- summary_file
Character string. Path to corresponding Dorado summary file containing read IDs to extract.
- seq_only
logical [TRUE]. When TRUE, only minimal information is extracted: read ID, pod5 file name, and basecalled sequence. If FALSE, more comprehensive information is extracted, including poly(A) tail length and poly(A) coordinates.
- cli_log
Function for logging messages and progress.
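The arguments above might be combined as in the following sketch. It is illustrative only: the file paths are hypothetical placeholders, the package name `ninetails` is inferred from the source file name, and the signature expected of `cli_log` is not documented here, so a permissive stand-in logger is used. The call is skipped unless the package and both input files are actually available.

```r
# Hypothetical input paths (assumptions); replace with real files.
bam_path <- "calls.bam"
summary_path <- "dorado_summary.txt"

# Stand-in logger (assumption): accepts any arguments and prints the first.
# The signature the function really expects for cli_log is not documented here.
simple_log <- function(...) {
  args <- list(...)
  if (length(args) > 0) message(as.character(args[[1]]))
  invisible(NULL)
}

if (requireNamespace("ninetails", quietly = TRUE) &&
    file.exists(bam_path) && file.exists(summary_path)) {
  # getFromNamespace() retrieves the function whether or not it is exported.
  extract_fn <- utils::getFromNamespace("extract_data_from_bam", "ninetails")

  reads <- extract_fn(
    bam_file = bam_path,
    summary_file = summary_path,
    seq_only = TRUE,   # minimal output: read ID, pod5 file name, sequence
    cli_log = simple_log
  )
  str(reads)
}
```

Setting `seq_only = FALSE` in the same call would request the more comprehensive output, including poly(A) tail length and coordinates.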