diff --git a/ARC specification.md b/ARC specification.md
index 3bfa8fa..846e483 100644
--- a/ARC specification.md
+++ b/ARC specification.md
@@ -255,19 +255,21 @@ All metadata references to files or directories located inside the ARC MUST foll
 
 ### Examples
 
-In this example, there are two `assays`, with `Assay1`containing a measurement of a `Source` material, producing an output `Raw Data file`. `Assay2` references this `Data file` for producing a new `Derived Data File`
+##### General Pattern
+
+In this example, there are two `assays`, with `Assay1` containing a measurement of a `Source` material, producing an output `Data`. `Assay2` references this `Data` for producing a new `Data`.
 
 Use of `general pattern` relative paths from the arc root folder:
 
 `assays/Assay1/isa.assay.xlsx`:
 
-| Input [Source Name] | Parameter[Instrument model] | Output [Raw Data File] |
+| Input [Source Name] | Parameter[Instrument model] | Output [Data] |
 |-------------|---------------------------------|----------------------------------|
 | input | Bruker 500 Avance | assays/Assay1/dataset/measurement.txt |
 
 `assays/Assay2/isa.assay.xlsx`:
 
-| Input [Raw Data File] | Parameter[script file] | Output [Derived Data File] |
+| Input [Data] | Parameter[script file] | Output [Data] |
 |----------------------------------|---------------------------------|----------------------------------|
 | assays/Assay1/dataset/measurement.txt | assays/Assay2/dataset/script.sh | assays/Assay2/dataset/result.txt |
 
diff --git a/ISA-XLSX.md b/ISA-XLSX.md
index bf4caf6..04d8e8a 100644
--- a/ISA-XLSX.md
+++ b/ISA-XLSX.md
@@ -620,15 +620,29 @@ Each annotation table sheet MUST contain at most one `Input` and at most one `Ou
 
 - An `Extract Material` MUST be indicated with the node type `Material Name`.
 
-- An `Image File` MUST be indicated with the node type `Image File`.
+- A `Data` object MUST be indicated with the node type `Data`.
 
-- A `Raw Data File` MUST be indicated with the node type `Raw Data File`.
+`Source Names`, `Sample Names`, `Material Names` MUST be unique across an ARC. If two of these entities with the same name exist in the same ARC, they are considered the same entity.
 
-- A `Derived Data File` MUST be indicated with the node type `Derived Data File`.
+The `Data` node type MUST correspond to a relevant data resource location, following the [Data Path Annotation](/ARC%20specification.md#data-path-annotation) patterns. If the annotation of the `Data` node refers not to the complete resource but to a part of it, a `Selector` MAY be added. This Selector MUST be separated from the resource location using a `#`, with no whitespace in between: `location#selector`. If appropriate, the Selector SHOULD be formatted according to the IRI fragment selectors specified by the [W3C](https://www.w3.org/TR/annotation-model/#fragment-selector).
+
+The format of the data resource MAY be further qualified using a `Data Format` column. The `Data Format` SHOULD be expressed using a [MIME format](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types), most commonly consisting of two parts, a type and a subtype, separated by a slash (/) with no whitespace in between: `type/subtype`. If appropriate, a format from the list maintained by [IANA](https://www.iana.org/assignments/media-types/media-types.xhtml)
+SHOULD be picked. Unregistered or niche encodings and file formats MAY instead be indicated via the most appropriate URL.
+
+The format and usage of the Selector MAY be further qualified using a `Data Selector Format` column. The `Data Selector Format` SHOULD point to a web resource containing instructions on how the Selector is formatted and how it should be interpreted.
+
+
+## Examples
+
+### Data Location and Selector
+
+In this example, there is a measurement of two `Samples`, namely `input1` and `input2`. The measured values are both written into the same data resource at the location `result.csv`, whose format is tabular, as indicated by the `Data Format` `text/csv`. To distinguish between the measurement values stemming from the different inputs, selectors were added to the resource location (separated by a `#`), namely `col=1` and `col=2`. The specification of the selector format can be found at the link given in the `Data Selector Format` column, namely `https://datatracker.ietf.org/doc/html/rfc7111`.
 
-`Source Names`, `Sample Names`, `Material Names` MUST be unique across an ARC. If two of these entities with the same name exist in the same ARC, they are considered the same entity.
 
-`Image File`, `Raw Data File` or `Derived Data File` node types MUST correspond to a relevant file location, following the [Data Path Annotation](/ARC%20specification.md#data-path-annotation) patterns.
+| Input [Sample Name] | Output [Data] | Data Format | Data Selector Format |
+|-------------|---------------------------------|----------------------------------|--|
+| input1 | result.csv#col=1 | text/csv | https://datatracker.ietf.org/doc/html/rfc7111 |
+| input2 | result.csv#col=2 | text/csv | https://datatracker.ietf.org/doc/html/rfc7111 |
 
 ## Protocol Columns
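
Note on the `location#selector` convention added in the ISA-XLSX.md hunk: the following is a minimal Python sketch of how a consumer might split a `Data` cell such as `result.csv#col=2` and apply a simple RFC 7111 column selector. The function names (`split_data_reference`, `parse_col_selector`, `select_column`) are hypothetical and not part of the specification, and only the plain `col=N` form is handled; ranges and the other RFC 7111 selector kinds are left out.

```python
import csv
import io


def split_data_reference(value):
    """Split a `Data` cell into (location, selector).

    The selector is None when the cell contains no `#`.
    """
    location, sep, selector = value.partition("#")
    return location, (selector if sep else None)


def parse_col_selector(selector):
    """Return the 1-based column index of a simple `col=N` selector.

    Assumption: only the single-column form is supported in this sketch.
    """
    key, _, raw = selector.partition("=")
    if key != "col" or not raw.isdigit():
        raise ValueError(f"unsupported selector: {selector!r}")
    return int(raw)


def select_column(csv_text, selector):
    """Extract the selected column from CSV text (RFC 7111 counts from 1)."""
    index = parse_col_selector(selector) - 1
    return [row[index] for row in csv.reader(io.StringIO(csv_text))]


if __name__ == "__main__":
    location, selector = split_data_reference("result.csv#col=2")
    sample = "1.0,2.0\n3.0,4.0\n"  # hypothetical file content
    print(location, selector, select_column(sample, selector))
```

Splitting on the first `#` keeps the location untouched, so it can still be resolved with the Data Path Annotation patterns before the selector is interpreted according to the `Data Selector Format` column.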