
Check and convert poly(A) length file format
Source: R/tailfindr_compatibility.R
check_polya_length_filetype.Rd
Detects whether an input poly(A) length file originates from nanopolish or tailfindr and, if necessary, converts it to the nanopolish-like format expected by the ninetails legacy (Guppy/Fast5) pipeline. Tailfindr cDNA output is explicitly rejected because the legacy pipeline supports only DRS (direct RNA sequencing) data.
Value
A named list with two elements:
- data
Data frame / tibble. Standardised poly(A) length data in nanopolish-like format, ready for downstream ninetails processing.
- file_type
Character. One of "nanopolish" or "tailfindr_drs", indicating the detected input format.
Details
File-type detection is based on column names:
- nanopolish
Identified by the presence of a qc_tag column. Returned as-is.
- tailfindr DRS
Identified by the presence of read_id, tail_start, tail_end, samples_per_nt, and tail_length. Converted via convert_tailfindr_output.
- tailfindr cDNA
Identified by the presence of a tail_is_valid column. Raises an error because cDNA data are not compatible with this pipeline.
If none of the above patterns match, an error is raised.
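The detection rules above can be sketched in a few lines of base R. This is an illustration only, not the ninetails implementation: the helper name detect_polya_filetype, the error messages, and the toy data frames are hypothetical. The order of the checks is also an assumption; here the cDNA rule is tested before the DRS rule so that a table carrying both tail_is_valid and the DRS columns is rejected rather than converted.

```r
# Hypothetical helper (not part of ninetails): classify a poly(A) length
# table from its column names alone.
detect_polya_filetype <- function(df) {
  cols <- colnames(df)
  drs_cols <- c("read_id", "tail_start", "tail_end",
                "samples_per_nt", "tail_length")

  # nanopolish output is recognised by its qc_tag column
  if ("qc_tag" %in% cols) {
    return("nanopolish")
  }
  # Assumption: check cDNA before DRS, in case a cDNA table also
  # contains the DRS columns
  if ("tail_is_valid" %in% cols) {
    stop("tailfindr cDNA output is not supported; only DRS data are accepted.")
  }
  # tailfindr DRS output must contain every expected column
  if (all(drs_cols %in% cols)) {
    return("tailfindr_drs")
  }
  stop("Unrecognised poly(A) length file format.")
}

# Toy inputs with made-up values, only the column names matter here
nanopolish_like <- data.frame(readname = "r1", polya_length = 50,
                              qc_tag = "PASS")
tailfindr_like <- data.frame(read_id = "r1", tail_start = 100,
                             tail_end = 600, samples_per_nt = 10,
                             tail_length = 50)

detect_polya_filetype(nanopolish_like)
detect_polya_filetype(tailfindr_like)
```

The sketch covers only the classification step; the conversion of tailfindr DRS data to the nanopolish-like layout is handled by convert_tailfindr_output and is not reproduced here.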
This function is part of the legacy pipeline (Guppy basecaller, multi-Fast5 files). It may be retired if the Fast5 format is deprecated.
See also
convert_tailfindr_output for the tailfindr conversion logic,
extract_polya_data for subsequent processing of the standardised output,
check_tails_guppy for the legacy pipeline entry point.
Examples
if (FALSE) { # \dontrun{
# run on nanopolish output
test <- ninetails::check_polya_length_filetype(
  input = system.file('extdata', 'test_data',
                      'nanopolish_output.tsv',
                      package = 'ninetails'))
# run on tailfindr output
test <- ninetails::check_polya_length_filetype(
  input = system.file('extdata', 'test_data',
                      'tailfindr_output.csv',
                      package = 'ninetails'))
} # }