diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml index efabc11b2..7fd1ab26d 100644 --- a/specifications/xpath-functions-40/src/function-catalog.xml +++ b/specifications/xpath-functions-40/src/function-catalog.xml @@ -59,6 +59,7 @@ + @@ -67,7 +68,7 @@ - + @@ -16731,7 +16732,7 @@ else $c[1] + sum(subsequence($c, 2)) - item-separator + item-delimiter xs:string? @@ -21787,196 +21788,7 @@ return $M(collation-key("a", $C)) - - - - - - - deterministic - context-independent - focus-independent - - -

Parses CSV data supplied as a string, returning the results in the form of a sequence of arrays of strings.

-
- -

The effect of the one-argument form of this function is the same as calling the - two-argument form with an empty map as the value of the $options - argument.

- -

The first argument is CSV data, as defined in , in the form of a - sequence of xs:string values. The function parses this sequence to return - an XDM value.

- -

If $csv is the empty sequence, implementations must - return the empty sequence as the value of the body field of the returned - map.

- -

The $options argument can be used to control the way in which the parsing - takes place. The option parameter conventions apply.

- -

Implementations must treat any of CRLF, CR, or LF as a single line - separator, as with fn:unparsed-text-lines.

- -

Fields are regarded as simple xs:string values. Implementations - must leave whitespace within a field untouched, without - normalizing or otherwise altering it, unless whitespace trimming is explicitly requested - by the user using the trim-whitespace option.

- -

When whitespace trimming is requested, implementations must only - strip leading and trailing whitespace, this is not equivalent to calling - fn:normalize-space().

- -

The entries that may appear in the $options map are as follows:

- - - - The character used to delimit fields within a record. An instance of - xs:string whose length is exactly one. - xs:string - "," - - - The characters used to delimit records within the CSV string, if the - default use of line separator as record separator is to be overridden. - xs:string - () - - - The character used to quote fields within the CSV string. An instance of - xs:string whose length is exactly one. - xs:string - '"' - - - Determines whether fields should have leading and trailing whitespace - removed before being returned. - xs:boolean - false - - Fields will be returned with any leading or trailing - whitespace intact. Implementations must preserve whitespace - as it occurred in the CSV string. - - Fields will be returned with leading or trailing - whitespace removed, and all non-leading or -trailing whitespace preserved. - - - - - -

The result of the function is a sequence of arrays-of-strings - array(xs:string)*.

-

A blank row is represented as an empty array.

-

An empty field is represented by the empty string.

-
- -

A dynamic error occurs if the value of - $csv does not conform to the grammar for quoted - fields.

-

A dynamic error occurs if one or more of the values - for field-separator, record-separator, - quote-character are specified and are not a single character.

-

A dynamic error occurs if any of the values for - field-separator, record-separator, - quote-character are equal.

-
- -

All fields are returned as xs:string values.

-

Quoted fields in the input are returned without the quotes.

-

For more discussion of the returned data, see .

-
- - - - - -

Handling any of the default record separators:

- - parse-csv(`name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - - - parse-csv(`name,city{$cr}Bob,Berlin{$cr}Alice,Aachen{$cr}`) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - - - parse-csv(`name,city{$lf}Bob,Berlin{$lf}Alice,Aachen{$lf}`) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - -
- -

Quote handling:

- - parse-csv(`"name","city"${crlf}"Bob","Berlin"${crlf}"Alice","Aachen"${crlf}`) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - - - parse-csv(`"name","city"${crlf}"Bob ""The Exemplar"" Mustermann","Berlin"${crlf}`) - ( - ["name", "city"] - ['Bob "The Exemplar" Mustermann', "Berlin"], - ["Alice", "Aachen"] -) - -
- -

Non-default record- and field-separators:

- - parse-csv("name;city§Bob;Berlin§Alice;Aachen", map{"record-separator": "§", "field-separator": ";"}) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - -
- -

Non-default quote character:

- - parse-csv(`|name|,|city|${crlf}|Bob|,|Berlin|${crlf}`, map{"quote-character": "|"}) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - -
- -

Trimming whitespace in fields:

- - parse-csv(`name ,city ${crlf}Bob ,Berlin${crlf}Alice ,Aachen${crlf}`, map{"trim-whitespace: true()}) - ( - ["name", "city"] - ["Bob", "Berlin"], - ["Alice", "Aachen"] -) - -
-
-
- - - - + @@ -21999,7 +21811,7 @@ return $M(collation-key("a", $C))

The first argument is CSV data, as defined in , in the form of a sequence of xs:string values. The function parses this sequence using - fn:parse-csv, and then processes its result to return an XDM value.

+ fn:csv-to-simple-rows, and then processes its result to return an XDM value.

If $csv is the empty sequence, implementations must return a parsed-csv-structure-record whose rows entry is the empty sequence.

@@ -22015,12 +21827,13 @@ return $M(collation-key("a", $C)) def="option-parameter-conventions">option parameter conventions apply.

Handling of delimiters, and whitespace trimming, are handled using - fn:parse-csv, and the options controlling their use are defined + fn:csv-to-simple-rows, and the options controlling their use are defined there.

-

If the headers option is true, implementations must - exclude the first record from the returned map’s body key, and return it as - the value of the returned map’s headers-record key.

+

If the column-names option is true, implementations must + exclude the first record from the returned map’s rows key, and use it to + construct a csv-columns-record that is returned as the value of the + returned map’s columns key.

The entries that may appear in the $options map are as follows:

@@ -22030,11 +21843,10 @@ return $M(collation-key("a", $C)) xs:string "," - - The characters used to delimit records within the CSV string, if the - default use of line separator as record separator is to be overridden. - xs:string - () + + The sequence of strings used to delimit rows within the CSV string. Defaults to CRLF/LF/CR. + xs:string+ + ("
", "
", "
") The character used to quote fields within the CSV string. An instance of @@ -22060,8 +21872,10 @@ return $M(collation-key("a", $C)) Determines whether the first row of the CSV should be treated as a list of column names and returned as a csv-columns-record in the - columns entry of the returned map. - union(xs:boolean, map(xs:string, xs:integer)) + columns entry of the returned map. Permitted values are a map of type + map(xs:string, xs:integer) or an xs:boolean. + + item() false A csv-columns-record is constructed using the @@ -22079,7 +21893,7 @@ return $M(collation-key("a", $C)) the returned parsed-csv-structure-record. Implementations must not exclude the first row from the rows entry of the parsed-csv-structure-record. - A csv-columns-record is + A csv-columns-record is constructed using the supplied map and returned as the header entry of the parsed-csv-structure-record. The supplied map is used as the names entry, and a sequence of strings for the @@ -22090,8 +21904,21 @@ return $M(collation-key("a", $C)) + + A sequence indicating which fields to return and in which order. If this + option is missing or the empty sequence, all fields are returned in their natural + order. Items in the sequence are treated as the index of the column to return. In + the returned data, only fields from the specified columnms are returned, and in + the order specified. This option is mutually exclusive with the + number-of-columns option. Specifying both options will cause an error. + xs:integer* + () + - Specifies how many columns to return. + Specifies how many columns to return. This option is mutually exclusive with the + filter-columns option. Specifying both options will cause an error. union(enum("all", "first-row"), xs:integer) "all" @@ -22112,12 +21939,6 @@ return $M(collation-key("a", $C)) -

If column names were extracted from the first row of the CSV, when there are duplicate - column names, implementations must include only the first occurrence - in the names entry of the csv-columns-record, ignoring - subsequent entries. Any fields in the first record whose value is the empty string - must also be omitted.

-

The result of the function is a parsed-csv-structure-record, a map with string keys containing two entries: columns and rows.

@@ -22125,7 +21946,7 @@ return $M(collation-key("a", $C))

The entry with key "columns" holds a csv-columns-record record. If column names have been extracted, or supplied, then the record will have a names entry whose value is a map of column-name to - column-number, map(xs:integer, xs:string). The record’s + column-number, map(xs:string, xs:integer). The record’s fields entry will contain the column names as a sequence of strings, xs:string*, replicating the row they were taken from.

@@ -22170,27 +21991,65 @@ return $M(collation-key("a", $C)) supplied $key is a string and does not occur in the map of column names.

+ +

rules: The function returns the field in the sequence fields entry of this + csv-row-record at the position in + the sequence either explicitly provided (when the $key argument is an + xs:integer), or looked up from the map of name to position in the + names entry of the csv-columns-record of the + parsed-csv-structure-record this csv-row-record + was returned as part of.

+ +

When the argument is a string, if the string is missing from the keys + of the names map, then implementations must + raise an .

+ +

When the argument is an integer, if the integer position is outside the + bounds of the sequence contained in the fields entry of this + csv-row-record (i.e. is greater than the size of the + sequence), then implementations must return the empty + string.
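The lookup rules above can be sketched in Python as a minimal illustration, not as the specified implementation. The function name `fetch_field` is hypothetical. Raising `KeyError` stands in for the dynamic error, whose code is not shown here, and rejecting positions below 1 is an assumption the text does not state.

```python
from typing import Dict, List, Union


def fetch_field(fields: List[str], names: Dict[str, int], key: Union[str, int]) -> str:
    """Return a field by column name or by 1-based position.

    fields: the row's field values (the `fields` entry of a row record).
    names:  map of column name to 1-based position (the `names` entry).
    """
    if isinstance(key, str):
        if key not in names:
            # Stand-in for the dynamic error the text requires.
            raise KeyError(f"unknown column name: {key!r}")
        position = names[key]
    else:
        position = key
    if position < 1:
        # Assumption: non-positive positions are rejected; the text is silent.
        raise ValueError("column positions are 1-based")
    if position > len(fields):
        # A position past the end of the row yields the empty string.
        return ""
    return fields[position - 1]


names = {"name": 1, "city": 2}
row = ["Bob", "Berlin"]
print(fetch_field(row, names, "name"))   # Bob
print(fetch_field(row, names, 2))        # Berlin
print(repr(fetch_field(row, names, 3)))  # ''
```

The asymmetry is deliberate in the text: an unknown name is an error, while an out-of-range position quietly returns the empty string, which suits rows of uneven length.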

+
- -

This function behaves identically to fn:csv-fetch-field-by-column - would had the header entry of the containing - parsed-csv-structure-record and the fields entry of - this csv-row-record been supplied as its first two arguments, and - $key as its last. See the definition of - fn:csv-fetch-field-by-column for more details

+ +

If column names were extracted from the first row of the CSV, when there are duplicate + column names, implementations must include only the first occurrence + in the names entry of the csv-columns-record, ignoring + subsequent entries. Any fields in the first record whose value is the empty string + must also be omitted.
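As a rough Python sketch of the rule above, the `names` map could be built from the first row as follows. The helper name `build_names` is hypothetical. The sketch assumes that positions are 1-based and that skipped entries do not shift the positions of later columns, which the text does not spell out.

```python
from typing import Dict, List


def build_names(header_row: List[str]) -> Dict[str, int]:
    """Map each column name to its 1-based position in the header row."""
    names: Dict[str, int] = {}
    for position, name in enumerate(header_row, start=1):
        if name == "":
            continue  # empty header fields are omitted
        if name in names:
            continue  # only the first occurrence of a duplicate is kept
        names[name] = position
    return names


print(build_names(["name", "city", "", "name"]))  # {'name': 1, 'city': 2}
```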

+ +

If the number-of-columns option is set to "first-row" or an + integer, or the filter-columns option is set, and the + column-names option is set to true(), the filtering of + columns is performed before the extraction of the first row and creation of the + csv-columns-record.

+ +

If the number-of-columns option is set to "first-row" or an + integer, or the filter-columns option is set, and the + column-names option is set to a map(xs:string, xs:integer), + then filtering of columns does not affect the creation of the + csv-columns-record, and it is possible that the number of fields in the + rows is smaller than the number of fields in the csv-columns-record.
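The first of the two rules above (filter first, then take the header row) can be illustrated with a small Python sketch. The helper names are hypothetical. Returning the empty string for a requested column that a short row lacks is an assumption made for the sketch, not something the text states.

```python
from typing import List, Tuple


def filter_row(row: List[str], columns: List[int]) -> List[str]:
    """Pick the requested 1-based columns, in the requested order."""
    # Assumption: a column missing from a short row becomes "".
    return [row[i - 1] if i <= len(row) else "" for i in columns]


def filter_then_extract(rows: List[List[str]], columns: List[int]) -> Tuple[List[str], List[List[str]]]:
    """Apply the column filter to every row, then split off the header row."""
    filtered = [filter_row(row, columns) for row in rows]
    if not filtered:
        return [], []
    return filtered[0], filtered[1:]


rows = [
    ["date", "name", "city", "amount", "currency", "original amount", "note"],
    ["2023-07-19", "Bob", "Berlin", "10.00", "USD", "13.99"],
    ["2023-07-20", "Alice", "Aachen", "15.00"],
]
header, body = filter_then_extract(rows, [2, 1, 4])
print(header)  # ['name', 'date', 'amount']
print(body)    # [['Bob', '2023-07-19', '10.00'], ['Alice', '2023-07-20', '15.00']]
```

Because the header row passes through the same filter as the data rows, the extracted column names line up with the filtered fields.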

A dynamic error occurs if the value of $csv does not conform to the grammar for quoted fields.

A dynamic error occurs if one or more of the values - for field-separator, record-separator, - quote-character are specified and are not a single character.

+ for field-delimiter or quote-character are specified and are + not a single character.

A dynamic error occurs if any of the values for - field-separator, record-separator, + field-delimiter, row-delimiter, quote-character are equal.

+

A dynamic error occurs if any column-index integers, + such as the values in a map supplied to column-names, or as the value of + number-of-columns or filter-columns, are negative or + zero.

+

A dynamic error occurs if both the + number-of-columns and filter-columns options are set in a + call to fn:parse-csv.

All fields are returned as xs:string values.

@@ -22198,99 +22057,107 @@ return $M(collation-key("a", $C))

For more discussion of the returned data, see .

- + `name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}` - map { "record-separator": "§", "field-separator": ";", "quote-character": "|" } + map { "row-delimiter": "§", "field-delimiter": ";", "quote-character": "|" } `|name|;|city|§|Bob|;|Berlin|§|Alice|;|Aachen|` - map { "trim-whitespace": true() }

With defaults for delimiters and quotes, and default column extraction (false):

- map:keys(csv-to-xdm($csv-string)) - ("header", "rows") + map:keys(parse-csv($csv-string)) + ("columns", "rows") - csv-to-xdm($csv-string)?columns + parse-csv($csv-string)?columns map { "names": map {}, - "fields": (), + "fields": () } - count(csv-to-xdm($csv-string)?rows) + count(parse-csv($csv-string)?rows) 3 - csv-to-xdm($csv-string)?rows[1]?field("name") - + parse-csv($csv-string)?rows[1]?field("name") + - csv-to-xdm($csv-string)?rows[1]?field(2) + parse-csv($csv-string)?rows[1]?field(2) "city"

With defaults for delimiters and quotes, and column-names: true() set:

- csv-to-xdm($csv-string, map {"columns": true()})?columns + parse-csv($csv-string, map {"column-names": true()})?columns map { "names": map { "name": 1, "city": 2 }, - "fields": ("name", "city"), + "fields": ("name", "city") } - count(csv-to-xdm($csv-string, map {"columns": true()})?rows) + count(parse-csv($csv-string, map {"column-names": true()})?rows) 2 - csv-to-xdm($csv-string), map {"columns": true()}?rows[1]?fields + parse-csv($csv-string, map {"column-names": true()})?rows[1]?fields ("Bob", "Berlin") - csv-to-xdm($csv-string, map {"columns": true()})?rows[1]?field("name") + parse-csv($csv-string, map {"column-names": true()})?rows[1]?field("name") "Bob" - csv-to-xdm($csv-string, map {"columns": true()})?rows[1]?field(2) + parse-csv($csv-string, map {"column-names": true()})?rows[1]?field(2) "Berlin"

Non-default record- and field-delimiters, non-default quotes:

- map:keys(csv-to-xdm($non-std-csv, $options)) - ("header", "rows") + parse-csv($non-std-csv, $options)?rows[3]?field(1) + "Alice" - - csv-to-xdm($non-std-csv, $options)?columns +
+ `Alice,Aachen{$crlf}Bob,Berlin{$crlf}` + map { "column-names": map { "Person": 1, "Location": 2 } } + +

Specifying column names explicitly:

+ + map:keys(parse-csv($csv-string, $options)) + ("columns", "rows") + + + parse-csv($csv-string, $options)?columns map { - "names": map {}, - "fields": (), - } + "names": map { "Person": 1, "Location": 2 }, + "fields": ("Person", "Location") +} - - count(csv-to-xdm($non-std-csv, $options)?rows) - 3 + + count(parse-csv($csv-string, $options)?rows) + 2 - - csv-to-xdm($non-std-csv, $options)?rows[3]?field(1) + + parse-csv($csv-string, $options)?rows[1]?field(1) "Alice" -
- -

Trimming whitespace in fields:

- - csv-to-xdm(`name ,city ${crlf}Bob ,Berlin${crlf}Alice ,Aachen${crlf}`, $trim-opts)?rows?fields - ("name", "city", "Bob", "Berlin", "Alice", "Aachen") + + parse-csv($csv-string, $options)?rows[2]?field("Location") + "Berlin"
- `date,name,city,amount,currency,original amount,note{$crlf}2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}2023-07-20,Alice,Aachen,15.00{$crlf}2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}` + concat(`date,name,city,amount,currency,original amount,note{$crlf}`, +`2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}`, +`2023-07-20,Alice,Aachen,15.00{$crlf}`, +`2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}`) -

Filtering columns

+

Filtering columns, with column-names: true()

- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "filter-columns": (2,1,4) })?columns?fields + parse-csv($csv-uneven-cols, map { "column-names": true(), "filter-columns": (2,1,4) })?columns?fields ("name","date","amount") - for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "filter-columns": (2,1,4) })?rows return array { $r?fields } + for $r in parse-csv($csv-uneven-cols, map { "column-names": true(), "filter-columns": (2,1,4) })?rows return array { $r?fields } ( ["Bob","2023-07-19","10.00"], ["Alice","2023-07-20","15.00"], @@ -22299,29 +22166,21 @@ return $M(collation-key("a", $C))
-

Specifying the number of columns, using "all" (the default)

- - csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "all" })?columns?fields - ("date","name","city","amount","currency","original amount","note") - +

Filtering columns, with column-names: map { ... }

- for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "all" })?rows return array { $r?fields } - ( - ["2023-07-19","Bob","Berlin","10.00","USD","13.99"], - ["2023-07-20","Alice","Aachen","15.00"], - ["2023-07-20","Charlie","Celle","15.00","GBP","11.99","cake","not a lie"] -) + parse-csv($csv-uneven-cols, map { "column-names": map { "Person": 1, "Amount": 3 }, "filter-columns": (2,1,4) })?columns + map { + "names": map { "Person": 1, "Amount": 3 }, + "fields": ("Person", "", "Amount") +}
-

Specifying the number of columns using "first-row"

+

Specifying the number of columns using "first-row" and column-names: false()

- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "first-row" })?columns?fields - ("date","name","city","amount","currency","original amount","note") - - - for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "first-row" })?rows return array { $r?fields } + for $r in parse-csv($csv-uneven-cols, map { "number-of-columns": "first-row" })?rows return array { $r?fields } ( + ["date","name","city","amount","currency","original amount","note"], ["2023-07-19","Bob","Berlin","10.00","USD","13.99",""], ["2023-07-20","Alice","Aachen","15.00","","",""], ["2023-07-20","Charlie","Celle","15.00","GBP","11.99","cake"] @@ -22329,13 +22188,16 @@ return $M(collation-key("a", $C))
-

Specifying the number of columns with a number

+

Specifying the number of columns with a number and column-names: true()

- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": 6 })?columns?fields - ("date","name","city","amount","currency","original amount") + parse-csv($csv-uneven-cols, map { "column-names": true(), "number-of-columns": 6 })?columns + map { + "names": map { "date": 1, "name": 2, "city": 3, "amount": 4, "currency": 5, "original amount": 6 }, + "fields": ("date","name","city","amount","currency","original amount") +} - for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": 6 })?rows return array { $r?fields } + for $r in parse-csv($csv-uneven-cols, map { "column-names": true(), "number-of-columns": 6 })?rows return array { $r?fields } ( ["2023-07-19","Bob","Berlin","10.00","USD","13.99"], ["2023-07-20","Alice","Aachen","15.00","",""], @@ -22346,6 +22208,192 @@ return $M(collation-key("a", $C))
+ + + + + + + + + deterministic + context-independent + focus-independent + + +

Parses CSV data supplied as a string, returning the results in the form of a sequence of arrays of strings.

+
+ +

The effect of the one-argument form of this function is the same as calling the + two-argument form with an empty map as the value of the $options + argument.

+ +

The first argument is CSV data, as defined in , in the form of a + sequence of xs:string values. The function parses this sequence to return + an XDM value.

+ +

If $csv is the empty sequence, implementations must + return the empty sequence as the value of the body field of the returned + map.

+ +

The $options argument can be used to control the way in which the parsing + takes place. The option parameter conventions apply.

+ +

Implementations must treat any of CRLF, CR, or LF as a single line + separator, as with fn:unparsed-text-lines.

+ +

Fields are regarded as simple xs:string values. Implementations + must leave whitespace within a field untouched, without + normalizing or otherwise altering it, unless whitespace trimming is explicitly requested + by the user using the trim-whitespace option.

+ +

When whitespace trimming is requested, implementations must only + strip leading and trailing whitespace; this is not equivalent to calling + fn:normalize-space().
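The difference between trimming and normalizing can be shown with a short Python sketch. Both helper names are hypothetical, and the whitespace set (space, tab, CR, LF) is an assumption borrowed from XML whitespace; the text here does not enumerate it. `normalize_space` only approximates fn:normalize-space for comparison.

```python
import re

XML_WHITESPACE = " \t\r\n"  # assumed whitespace set


def trim_field(field: str) -> str:
    """Strip leading and trailing whitespace only; interior runs are kept."""
    return field.strip(XML_WHITESPACE)


def normalize_space(field: str) -> str:
    """Trim, then collapse each interior whitespace run to a single space."""
    return re.sub(r"[ \t\r\n]+", " ", field.strip(XML_WHITESPACE))


field = "  Bob   Mustermann  "
print(repr(trim_field(field)))       # 'Bob   Mustermann'
print(repr(normalize_space(field)))  # 'Bob Mustermann'
```

With trim-whitespace set, a field behaves like `trim_field`: the three interior spaces survive.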

+ +

The entries that may appear in the $options map are as follows:

+ + + + The character used to delimit fields within a record. An instance of + xs:string whose length is exactly one. + xs:string + "," + + + The sequence of strings used to delimit rows within the CSV string. Defaults to CRLF/LF/CR. + xs:string+ + ("
", "
", "
") + + + The character used to quote fields within the CSV string. An instance of + xs:string whose length is exactly one. + xs:string + '"' + + + Determines whether fields should have leading and trailing whitespace + removed before being returned. + xs:boolean + false + + Fields will be returned with any leading or trailing + whitespace intact. Implementations must preserve whitespace + as it occurred in the CSV string. + + Fields will be returned with leading or trailing + whitespace removed, and all non-leading or -trailing whitespace preserved. + + + + + +

The result of the function is a sequence of arrays-of-strings + array(xs:string)*.

+

A blank row is represented as an empty array.

+

An empty field is represented by the empty string.

+
+ +

A dynamic error occurs if the value of + $csv does not conform to the grammar for quoted + fields.

+

A dynamic error occurs if one or more of the values + for field-delimiter or quote-character are specified and are + not a single character.

+

A dynamic error occurs if any of the values for + field-delimiter, row-delimiter, + quote-character are equal.

+
+ +

All fields are returned as xs:string values.

+

Quoted fields in the input are returned without the quotes.

+

For more discussion of the returned data, see .

+
+ + + + + +

Handling any of the default record separators:

+ + csv-to-simple-rows(`name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`) + ( + ["name", "city"], + ["Bob", "Berlin"], + ["Alice", "Aachen"] +) + + + csv-to-simple-rows(`name,city{$cr}Bob,Berlin{$cr}Alice,Aachen{$cr}`) + ( + ["name", "city"], + ["Bob", "Berlin"], + ["Alice", "Aachen"] +) + + + csv-to-simple-rows(`name,city{$lf}Bob,Berlin{$lf}Alice,Aachen{$lf}`) + ( + ["name", "city"], + ["Bob", "Berlin"], + ["Alice", "Aachen"] +) + +
+ +

Quote handling:

+ + csv-to-simple-rows(`"name","city"{$crlf}"Bob","Berlin"{$crlf}"Alice","Aachen"{$crlf}`) + ( + ["name", "city"], + ["Bob", "Berlin"], + ["Alice", "Aachen"] +) + + + csv-to-simple-rows(`"name","city"{$crlf}"Bob ""The Exemplar"" Mustermann","Berlin"{$crlf}`) + ( + ["name", "city"], + ['Bob "The Exemplar" Mustermann', "Berlin"] +) + +
+ +

Non-default record- and field-delimiters:

+ + csv-to-simple-rows("name;city§Bob;Berlin§Alice;Aachen", map{"row-delimiter": "§", "field-delimiter": ";"}) + ( + ["name", "city"], + ["Bob", "Berlin"], + ["Alice", "Aachen"] +) + +
+ +

Non-default quote character:

+ + csv-to-simple-rows(`|name|,|city|{$crlf}|Bob|,|Berlin|{$crlf}`, map{"quote-character": "|"}) + ( + ["name", "city"], + ["Bob", "Berlin"] +) + +
+ +

Trimming whitespace in fields:

+ + csv-to-simple-rows(`name ,city {$crlf}Bob ,Berlin{$crlf}Alice ,Aachen{$crlf}`, map{"trim-whitespace": true()}) + ( + ["name", "city"], + ["Bob", "Berlin"], + ["Alice", "Aachen"] +) + +
+
+
+ @@ -22369,7 +22417,7 @@ return $M(collation-key("a", $C))

The first argument is CSV data, as defined in , in the form of a sequence of xs:string values. The function parses this sequence using - fn:parse-csv, and then processes its result to return an XML document.

+ fn:csv-to-simple-rows, and then processes its result to return an XML document.

If $csv is the empty sequence, implementations must return a ]]> whose ]]> element @@ -22390,7 +22438,7 @@ return $M(collation-key("a", $C)) def="option-parameter-conventions">option parameter conventions apply.

Handling of delimiters, and whitespace trimming, are handled using - fn:parse-csv, and the options controlling their use are defined + fn:csv-to-simple-rows, and the options controlling their use are defined there.

The entries that may appear in the $options map are as follows:

@@ -22402,8 +22450,8 @@ return $M(collation-key("a", $C)) xs:string "," - - The characters used to delimit records within the CSV string, if the + + The characters used to delimit rows within the CSV string, if the default use of line separator as record separator is to be overridden. xs:string () @@ -22431,9 +22479,11 @@ return $M(collation-key("a", $C)) Determines whether the first row of the CSV should be treated as a list - of column headers and returned as a csv-columns-record in the - header entry of the returned map. - union(xs:boolean, map(xs:integer, xs:string)) + of column headers and returned as ]]> elements in + the ]]> element. Permitted values are a map of type + map(xs:string, xs:integer) or an xs:boolean. + + item() false The ]]> element is populated @@ -22443,7 +22493,7 @@ return $M(collation-key("a", $C)) element. Implementations must not include a ]]> element in the output. - The supplied map is used to + The supplied map is used to construct a sequence of ]]> elements to populate the ]]> element. The xs:integer denotes the column number, and the xs:string the column name. Gaps @@ -22489,355 +22539,278 @@ return $M(collation-key("a", $C)) $csv does not conform to the grammar for quoted fields.

A dynamic error occurs if one or more of the values - for field-separator, record-separator, + for field-delimiter, row-delimiter, quote-character are specified and are not a single character.

A dynamic error occurs if any of the values for - field-separator, record-separator, + field-delimiter, row-delimiter, quote-character are equal.

- + `name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`

An empty CSV with default column extraction (false):

csv-to-xml("") - - + + + ]]>

An empty CSV with column extraction:

- csv-to-xml("", map { "columns": true() }) + csv-to-xml("", map { "column-names": true() }) - - - + + + + ]]>

An empty CSV with explicit column names:

- csv-to-xml("", map { "columns": map { "name": 1, "city": 3 }) + csv-to-xml("", map { "column-names": map { "name": 1, "city": 3 } }) - - name - - city - - - + + + name + + city + + + ]]>

With defaults for delimiters and quotes, and column extraction:

- csv-to-xml($csv-string, map { "columns": true() }) + csv-to-xml($csv-string, map { "column-names": true() }) - - name - city - - - - Bob - Berlin - - - Alice - Aachen - - - + + + name + city + + + + Bob + Berlin + + + Alice + Aachen + + + ]]>

With defaults for delimiters and quotes, and column extraction:

- csv-to-xml($csv-string, map { "columns": true() }) + csv-to-xml($csv-string, map { "column-names": true() }) - - name - city - - - - Bob - Berlin - - - Alice - Aachen - - - + + + name + city + + + + Bob + Berlin + + + Alice + Aachen + + + ]]>
- `date,name,city,amount,currency,original amount,note{$crlf}2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}2023-07-20,Alice,Aachen,15.00{$crlf}2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}` + concat(`date,name,city,amount,currency,original amount,note{$crlf}`, +`2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}`, +`2023-07-20,Alice,Aachen,15.00{$crlf}`, +`2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}`)

Filtering columns

- csv-to-xml($csv-string, map { "columns": true(), "filter-columns": (2,1,4) }) + csv-to-xml($csv-uneven-cols, map { "column-names": true(), "filter-columns": (2,1,4) }) - - name - date - amount - - - - Bob - 2023-07-19 - 10.00 - - - Alice - 2023-07-20 - 15.00 - - - Charlie - 2023-07-20 - 15.00 - - - + + + name + date + amount + + + + Bob + 2023-07-19 + 10.00 + + + Alice + 2023-07-20 + 15.00 + + + Charlie + 2023-07-20 + 15.00 + + + ]]>

Specifying the number of columns, using "all" (the default)

- csv-to-xml($csv-uneven-cols, map { "columns": true(), "number-of-columns": "all" }) + csv-to-xml($csv-uneven-cols, map { "column-names": true(), "number-of-columns": "all" }) - - date - name - city - amount - currency - original amount - note - - - - 2023-07-19 - Bob - Berlin - 10.00 - USD - 13.99 - - - 2023-07-20 - Alice - Aachen - 15.00 - - - 2023-07-20 - Charlie - Celle - 15.00 - GBP - 11.99 - cake - not a lie - - - + + + date + name + city + amount + currency + original amount + note + + + + 2023-07-19 + Bob + Berlin + 10.00 + USD + 13.99 + + + 2023-07-20 + Alice + Aachen + 15.00 + + + 2023-07-20 + Charlie + Celle + 15.00 + GBP + 11.99 + cake + not a lie + + + ]]>

Specifying the number of columns using "first-row"

- csv-to-xml($csv-uneven-cols, map { "columns": true(), "number-of-columns": "first-row" }) + csv-to-xml($csv-uneven-cols, map { "column-names": true(), "number-of-columns": "first-row" }) - - date - name - city - amount - currency - original amount - note - - - - 2023-07-19 - Bob - Berlin - 10.00 - USD - 13.99 - - - - 2023-07-20 - Alice - Aachen - 15.00 - - - - - - 2023-07-20 - Charlie - Celle - 15.00 - GBP - 11.99 - cake - - - + + + date + name + city + amount + currency + original amount + note + + + + 2023-07-19 + Bob + Berlin + 10.00 + USD + 13.99 + + + + 2023-07-20 + Alice + Aachen + 15.00 + + + + + + 2023-07-20 + Charlie + Celle + 15.00 + GBP + 11.99 + cake + + + ]]>

Specifying the number of columns with a number

- csv-to-xml($csv-uneven-cols, map { "columns": true(), "number-of-columns": 6 }) + csv-to-xml($csv-uneven-cols, map { "column-names": true(), "number-of-columns": 6 }) - - date - name - city - amount - currency - original amount - - - - 2023-07-19 - Bob - Berlin - 10.00 - USD - 13.99 - - - 2023-07-20 - Alice - Aachen - 15.00 - - - - - 2023-07-20 - Charlie - Celle - 15.00 - GBP - 11.99 - - - + + + date + name + city + amount + currency + original amount + + + + 2023-07-19 + Bob + Berlin + 10.00 + USD + 13.99 + + + 2023-07-20 + Alice + Aachen + 15.00 + + + + + 2023-07-20 + Charlie + Celle + 15.00 + GBP + 11.99 + + + ]]>
- - - - - - - - - - deterministic - context-independent - focus-independent - - -

Fetches a field from a parsed CSV row by name or position.

-
- -

The first argument is a csv-columns-record, as provided in the - header entry of the parsed-csv-structure-record returned by - fn:csv-to-xdm.

- -

The second argument is the row whose fields are being fetched, represented as a sequence - of strings as would be provided by the fields entry of a - csv-row-record returned by fn:csv-to-xdm.

- -

The final argument is the key to use for the lookup, supplied as either an - xs:string (the column name) or xs:integer (the column - position).

- -

When the argument is a string, if the string is missing from the keys of the map - contained in the names entry of the $columns argument’s - csv-columns-record, then implementations must raise - an .

- -

When the argument is an integer, if the integer position is outside the bounds of the - $fields sequence (i.e. is greater than the size of the sequence), then - implementations must return the empty string.

- -

The function returns the field in the sequence $fields at the position in - the sequence either explicitly provided (when $key is an - xs:integer), or looked up from the map of name to position in the - csv-columns-record provided in $columns.

-
- -

A dynamic error occurs if the value of - $key is an xs:string but is not a member of the keys of the - map contained in the names entry of the csv-columns-record in - $header. fields.

-
- - map { - "names": map { "name": 1, "city": 2 }, - "fields: ("name", "city") -} - ("Bob", "Berlin") - -

With a string key:

- - csv-fetch-field-by-column($columns, $fields, "name") - "Bob" - - - csv-fetch-field-by-column($columns, $fields, "amount") - - -
- -

With an integer key

- - csv-fetch-field-by-column($columns, $fields, 2) - "Berlin" - - - csv-fetch-field-by-column($columns, $fields, 3) - "" - -
-
-
- diff --git a/specifications/xpath-functions-40/src/xpath-functions.xml b/specifications/xpath-functions-40/src/xpath-functions.xml index 73c20d81f..4579c099b 100644 --- a/specifications/xpath-functions-40/src/xpath-functions.xml +++ b/specifications/xpath-functions-40/src/xpath-functions.xml @@ -5762,14 +5762,11 @@ correctly in all browsers, depending on the system configuration.

--> - - - - - + + @@ -6874,19 +6871,19 @@ correctly in all browsers, depending on the system configuration.

--> within a field. (See .)

The functions for processing CSV-formatted data are built on - fn:parse-csv, which provides a simple representation of a parsed CSV + fn:csv-to-simple-rows, which provides a simple representation of a parsed CSV as a sequence of arrays-of-strings, array(xs:string)*, handling row and column delimiters, and quoting.

-

The fn:csv-to-xml and fn:csv-to-xdm functions provide more +

The fn:csv-to-xml and fn:parse-csv functions provide more sophisticated processing.

Common parsing options -

All three functions: fn:parse-csv, fn:csv-to-xml, and - fn:csv-to-xdm, take options to control basic parsing, consisting +

All three functions: fn:csv-to-simple-rows, fn:csv-to-xml, and + fn:parse-csv, take options to control basic parsing, consisting of specifying the various delimiters. These core delimiter options are used by the functions that generate CSV data:

@@ -6895,7 +6892,7 @@ correctly in all browsers, depending on the system configuration.

-->

Additionally, the parsing functions share an additional option to control whether leading and trailing whitespace should be stripped or not.

- +
@@ -6984,11 +6981,11 @@ correctly in all browsers, depending on the system configuration.

--> Basic mapping of CSV to XDM -

The basic output from fn:parse-csv returns a sequence of rows, where +

The basic output from fn:csv-to-simple-rows is a sequence of rows, where each row is simply mapped to an array of xs:string values.

The first row of the CSV is returned as with all the other rows. - fn:parse-csv does not distinguish between a header row and data + fn:csv-to-simple-rows does not distinguish between a header row and data rows, and returns all of them.

@@ -7031,16 +7028,16 @@ Field 2A,Field 2B,Field 2C,Field 2D' However, the reality is that CSVs can, and sometimes do, contain a variable number of fields in a row. As a result, implementations of this function must not truncate or pad the number of fields in each row for any reason. - The fn:csv-to-xml and fn:csv-to-xdm functions provide + The fn:csv-to-xml and fn:parse-csv functions provide facilities to deal with enforcing uniformity and an expected number of columns.

- Mapping CSV data to XDM in fn:csv-to-xdm + Mapping CSV data to XDM in fn:parse-csv -

The fn:csv-to-xdm function returns a +

The fn:parse-csv function returns a parsed-csv-structure-record:

@@ -7090,28 +7087,6 @@ Field 2A,Field 2B,Field 2C,Field 2D' fields sequence by either column position (when passed an xs:integer) or column name (when passed an xs:string).

- -

This function is, effectively, a partial application of - fn:csv-fetch-field-by-column where its $columns - argument is bound to the columns entry of the - parsed-csv-structure-record, and its $row argument - is bound to array{csv-row?fields}. This is described in more - detail below:

- -

Given a string, $csv-string containing CSV data, implementations - must return a function that will return identical results - to fn:csv-fetch-field-by-column called with the same - csv-columns and an array() containing the same - items as the fields sequence:

- - let $csv-record := fn:csv-to-xdm($csv-string), - $csv-columns := $csv-record?columns, - $csv-row := head($csv-record?rows) - return if (empty($csv-row?field(1))) - then empty(fn:csv-fetch-field-by-column($csv-columns, array{$csv-row}, 1)) - else $csv-row?field(1) = fn:csv-fetch-field-by-column($csv-columns, array{$csv-row}, 1) - (: must return true :) -
@@ -7180,22 +7155,21 @@ Bob,2023-07-14,2.34 Illustrative examples of processing CSV data -

The following examples illustrate how an application can build more complex processing of the output of fn:parse-csv.

+

The following examples illustrate how an application can build more complex processing of the output of fn:csv-to-simple-rows.

A variable, $crlf, is assumed to be in scope, containing the CR and LF characters:

let $crlf := fn:char('x0D')||fn:char('x0A') - Converting a CSV into an HTML-style table using fn:csv-to-xdm + Converting a CSV into an HTML-style table using fn:parse-csv

Direct conversion is a matter of iterating across the records and fields to generate <tr> and <td> elements.

Using XQuery:

{ for $column in $csv?columns?fields @@ -7233,7 +7207,7 @@ return - +
@@ -10975,26 +10949,40 @@ ISBN 0 521 77752 6. -

Raised by fn:parse-csv if a syntax error in the quoting of one of the +

Raised by fn:csv-to-simple-rows if a syntax error in the quoting of one of the fields in the input CSV is found.

-

Raised by fn:parse-csv if the field-separator, +

Raised by fn:csv-to-simple-rows if the field-separator, record-separator, or quote-character option is set to an illegal value.

-

Raised by fn:parse-csv if any of the delimiter characters have been +

Raised by fn:csv-to-simple-rows if any of the delimiter characters have been set to the same value.

-

Raised by fn:csv-fetch-field-by-column, and the function from the - field entry of csv-columns-record, if its - $key argument is an xs:string and is not one of the - known column names.

+

Raised by the function from the field entry of + csv-columns-record, if its $key argument is an + xs:string and is not one of the known column names.

+
+ +

Raised by fn:parse-csv, fn:csv-to-xml, and the function + from the field entry of csv-columns-record, if an + argument referring to a column index is zero or negative. (This applies to the + number-of-columns and filter-columns options, to a map passed + to column-names, and to the argument to the field function.) +

+
+ +

Raised by fn:parse-csv and fn:csv-to-xml, if both the + number-of-columns and filter-columns options are set: + they are mutually exclusive.

Raised by fn:id, fn:idref, and fn:element-with-id @@ -11947,9 +11935,8 @@ declare function eg:distinct-nodes-stable ($arg as node()*) as node()* {

map:replace

map:substitute

fn:parse-csv

-

fn:csv-to-xdm

fn:csv-to-xml

-

fn:csv-fetch-field-by-column

+

fn:csv-to-simple-rows

array:replace

array:slice