From f3798951cfbd4d41389103eb4b14fe32d24ed315 Mon Sep 17 00:00:00 2001
From: James Bonfield
Date: Wed, 9 Nov 2022 11:55:01 +0000
Subject: [PATCH] SAM: add a sentence on case-insensitivity of RG PL (PR #684)

This is not changing what is valid / permitted, and indeed this
hopefully clarifies it further. However the practicality of dealing
with wide-spread non-compliant data with lowercase PL values is that
tools may wish to be lenient and use case-insensitive matching.

Also removes test/sam/failed/hdr.RG6.sam due to explicitly testing
against the use of lower-case PL fields. While strictly not
conforming, it's overly harsh if we are advocating a more
spec-tolerant testing regime for PL.

Fixes #679
---
 SAMv1.tex                   | 3 ++-
 test/sam/failed/hdr.RG6.sam | 1 -
 2 files changed, 2 insertions(+), 2 deletions(-)
 delete mode 100644 test/sam/failed/hdr.RG6.sam

diff --git a/SAMv1.tex b/SAMv1.tex
index 97b8e74c3..5f6cc30d9 100644
--- a/SAMv1.tex
+++ b/SAMv1.tex
@@ -318,7 +318,8 @@ \subsection{The header section}
 & {\tt PI} & Predicted median insert size.\\\cline{2-3}
 & {\tt PL} & Platform/technology used to produce the reads.
 \emph{Valid values}: {\tt CAPILLARY}, {\tt DNBSEQ} (MGI/BGI), {\tt ELEMENT}, {\tt HELICOS}, {\tt ILLUMINA}, {\tt IONTORRENT}, {\tt LS454}, {\tt ONT} (Oxford Nanopore), {\tt PACBIO} (Pacific Biosciences), {\tt SOLID}, and {\tt ULTIMA}.
-This field should be omitted when the technology is not in this list (though the {\tt PM} field may still be present in this case) or is unknown.\\\cline{2-3}
+This field should be omitted when the technology is not in this list (though the {\tt PM} field may still be present in this case) or is unknown.
+The values should be written as described in uppercase, however due to the existence of public data with lowercase values tools should also accept lowercase when decoding.\\\cline{2-3}
 & {\tt PM} & Platform model. Free-form text providing further details of the platform/technology used.\\\cline{2-3}
 & {\tt PU} & Platform unit (e.g., flowcell-barcode.lane for Illumina or slide for SOLiD). Unique identifier.\\\cline{2-3}
 & {\tt SM} & Sample. Use pool name where a pool is being sequenced.\\\cline{1-3}
diff --git a/test/sam/failed/hdr.RG6.sam b/test/sam/failed/hdr.RG6.sam
deleted file mode 100644
index 229863580..000000000
--- a/test/sam/failed/hdr.RG6.sam
+++ /dev/null
@@ -1 +0,0 @@
-@RG ID:1 PL:illumina
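
Reviewer note: a minimal sketch of how a decoder might apply the lenient matching described in the commit message. It is not part of the patch and not taken from any existing tool. The names `VALID_PL`, `normalize_pl`, and `parse_rg_pl` are hypothetical, and the sketch assumes `@RG` fields are TAB-separated. It matches case-insensitively (so mixed case is also accepted) and reports the canonical uppercase form.

```python
# Valid PL values, copied from the list in the patched SAMv1.tex table row.
VALID_PL = frozenset({
    "CAPILLARY", "DNBSEQ", "ELEMENT", "HELICOS", "ILLUMINA", "IONTORRENT",
    "LS454", "ONT", "PACBIO", "SOLID", "ULTIMA",
})


def normalize_pl(value):
    """Return the canonical uppercase PL value, or None if it is not in the list.

    Hypothetical helper: matching is case-insensitive on input, while the
    returned value is always the uppercase spelling the spec asks writers to use.
    """
    upper = value.upper()
    return upper if upper in VALID_PL else None


def parse_rg_pl(header_line):
    """Extract and normalise the PL field from one @RG header line.

    Hypothetical helper: assumes TAB-separated TAG:VALUE fields and returns
    None when the line is not an @RG line, has no PL field, or has an
    unrecognised PL value.
    """
    fields = header_line.rstrip("\n").split("\t")
    if fields[0] != "@RG":
        return None
    for field in fields[1:]:
        if field.startswith("PL:"):
            return normalize_pl(field[3:])
    return None
```

With this approach a header such as the one in the deleted hdr.RG6.sam test would be accepted on read, while a writer that emits the normalised value would still produce uppercase output.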