- item-separator
+ item-delimiter
|
xs:string?
@@ -21787,196 +21788,7 @@ return $M(collation-key("a", $C))
-
-
-
-
-
-
- deterministic
- context-independent
- focus-independent
-
-
- Parses CSV data supplied as a string, returning the results in the form of a sequence of arrays of strings.
-
-
- The effect of the one-argument form of this function is the same as calling the
- two-argument form with an empty map as the value of the $options
- argument.
-
- The first argument is CSV data, as defined in , in the form of a
- sequence of xs:string values. The function parses this sequence to return
- an XDM value.
-
- If $csv is the empty sequence, implementations must
- return the empty sequence as the value of the body field of the returned
- map.
-
- The $options argument can be used to control the way in which the parsing
- takes place. The option parameter conventions apply.
-
- Implementations must treat any of CRLF, CR, or LF as a single line
- separator, as with fn:unparsed-text-lines .
-
- Fields are regarded as simple xs:string values. Implementations
- must leave whitespace within a field untouched, without
- normalizing or otherwise altering it, unless whitespace trimming is explicitly requested
- by the user using the trim-whitespace option.
-
- When whitespace trimming is requested, implementations must only
- strip leading and trailing whitespace, this is not equivalent to calling
- fn:normalize-space() .
-
- The entries that may appear in the $options map are as follows:
-
-
-
- The character used to delimit fields within a record. An instance of
- xs:string whose length is exactly one.
- xs:string
- ","
-
-
- The characters used to delimit records within the CSV string, if the
- default use of line separator as record separator is to be overridden.
- xs:string
- ()
-
-
- The character used to quote fields within the CSV string. An instance of
- xs:string whose length is exactly one.
- xs:string
- '"'
-
-
- Determines whether fields should have leading and trailing whitespace
- removed before being returned.
- xs:boolean
- false
-
- Fields will be returned with any leading or trailing
- whitespace intact. Implementations must preserve whitespace
- as it occurred in the CSV string.
-
- Fields will be returned with leading or trailing
- whitespace removed, and all non-leading or -trailing whitespace preserved.
-
-
-
-
-
- The result of the function is a sequence of arrays-of-strings
- array(xs:string)* .
- A blank row is represented as an empty array.
- An empty field is represented by the empty string.
-
-
- A dynamic error occurs if the value of
- $csv does not conform to the grammar for quoted
- fields.
- A dynamic error occurs if one or more of the values
- for field-separator , record-separator ,
- quote-character are specified and are not a single character.
- A dynamic error occurs if any of the values for
- field-separator , record-separator ,
- quote-character are equal.
-
-
- All fields are returned as xs:string values.
- Quoted fields in the input are returned without the quotes.
- For more discussion of the returned data, see .
-
-
-
-
-
-
- Handling any of the default record separators:
-
- parse-csv(`name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`)
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
- parse-csv(`name,city{$cr}Bob,Berlin{$cr}Alice,Aachen{$cr}`)
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
- parse-csv(`name,city{$lf}Bob,Berlin{$lf}Alice,Aachen{$lf}`)
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
-
- Quote handling:
-
- parse-csv(`"name","city"${crlf}"Bob","Berlin"${crlf}"Alice","Aachen"${crlf}`)
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
- parse-csv(`"name","city"${crlf}"Bob ""The Exemplar"" Mustermann","Berlin"${crlf}`)
- (
- ["name", "city"]
- ['Bob "The Exemplar" Mustermann', "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
-
- Non-default record- and field-separators:
-
- parse-csv("name;city§Bob;Berlin§Alice;Aachen", map{"record-separator": "§", "field-separator": ";"})
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
-
- Non-default quote character:
-
- parse-csv(`|name|,|city|${crlf}|Bob|,|Berlin|${crlf}`, map{"quote-character": "|"})
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
-
- Trimming whitespace in fields:
-
- parse-csv(`name ,city ${crlf}Bob ,Berlin${crlf}Alice ,Aachen${crlf}`, map{"trim-whitespace: true()})
- (
- ["name", "city"]
- ["Bob", "Berlin"],
- ["Alice", "Aachen"]
-)
-
-
-
-
-
-
-
-
+
@@ -21999,7 +21811,7 @@ return $M(collation-key("a", $C))
The first argument is CSV data, as defined in , in the form of a
sequence of xs:string values. The function parses this sequence using
- fn:parse-csv , and then processes its result to return an XDM value.
+ fn:csv-to-simple-rows , and then processes its result to return an XDM value.
If $csv is the empty sequence, implementations must
return a parsed-csv-structure-record whose rows entry is the empty sequence.
@@ -22015,12 +21827,13 @@ return $M(collation-key("a", $C))
def="option-parameter-conventions">option parameter conventions apply.
Delimiters and whitespace trimming are handled using
- fn:parse-csv , and the options controlling their use are defined
+ fn:csv-to-simple-rows , and the options controlling their use are defined
there.
- If the headers option is true, implementations must
- exclude the first record from the returned map’s body key, and return it as
- the value of the returned map’s headers-record key.
+ If the column-names option is true, implementations must
+ exclude the first record from the returned map’s rows key, and use it to
+ construct a csv-columns-record that is returned as the value of the
+ returned map’s columns key.
The entries that may appear in the $options map are as follows:
@@ -22030,11 +21843,10 @@ return $M(collation-key("a", $C))
xs:string
","
-
- The characters used to delimit records within the CSV string, if the
- default use of line separator as record separator is to be overridden.
- xs:string
- ()
+
+ The sequence of strings used to delimit rows within the CSV string. Defaults to CRLF/LF/CR.
+ xs:string+
+ ("&#xD;&#xA;", "&#xA;", "&#xD;")
The character used to quote fields within the CSV string. An instance of
@@ -22060,8 +21872,10 @@ return $M(collation-key("a", $C))
Determines whether the first row of the CSV should be treated as a list
of column names and returned as a csv-columns-record in the
- columns entry of the returned map.
- union(xs:boolean, map(xs:string, xs:integer))
+ columns entry of the returned map. Permitted values are a map of type
+ map(xs:string, xs:integer) or an xs:boolean .
+
+ item()
false
A csv-columns-record is constructed using the
@@ -22079,7 +21893,7 @@ return $M(collation-key("a", $C))
the returned parsed-csv-structure-record . Implementations
must not exclude the first row from the rows
entry of the parsed-csv-structure-record .
- A csv-columns-record is
+ A csv-columns-record is
constructed using the supplied map and returned as the header
entry of the parsed-csv-structure-record . The supplied map is used
as the names entry, and a sequence of strings for the
@@ -22090,8 +21904,21 @@ return $M(collation-key("a", $C))
+
+ A sequence indicating which fields to return and in which order. If this
+ option is missing or the empty sequence, all fields are returned in their natural
+ order. Items in the sequence are treated as the index of the column to return. In
+ the returned data, only fields from the specified columns are returned, and in
+ the order specified. This option is mutually exclusive with the
+ number-of-columns option. Specifying both options will cause an error.
+ xs:integer*
+ ()
+
- Specifies how many columns to return.
+ Specifies how many columns to return. This option is mutually exclusive with the
+ filter-columns option. Specifying both options will cause an error.
union(enum("all", "first-row"), xs:integer)
"all"
@@ -22112,12 +21939,6 @@ return $M(collation-key("a", $C))
- If column names were extracted from the first row of the CSV, when there are duplicate
- column names, implementations must include only the first occurrence
- in the names entry of the csv-columns-record , ignoring
- subsequent entries. Any fields in the first record whose value is the empty string
- must also be omitted.
-
The result of the function is a parsed-csv-structure-record , a map with
string keys containing two entries, columns , and rows .
@@ -22125,7 +21946,7 @@ return $M(collation-key("a", $C))
The entry with key "columns" holds a csv-columns-record
record. If column names have been extracted, or supplied, then the record will
have a names entry whose value is a map of column-name to
- column-number, map(xs:integer, xs:string) . The record’s
+ column-number, map(xs:string, xs:integer) . The record’s
fields entry will contain the column names as a sequence of
strings, xs:string* , replicating the row they were taken from.
@@ -22170,27 +21991,65 @@ return $M(collation-key("a", $C))
supplied $key is a string and does not occur in the map of column
names.
+ -
+
rules: The function returns the field in the sequence fields entry of this
+ csv-row-record at the position in
+ the sequence either explicitly provided (when the $key argument is an
+ xs:integer ), or looked up from the map of name to position in the
+ names entry of the csv-columns-record of the
+ parsed-csv-structure-record this csv-row-record
+ was returned as part of.
+
+ When the argument is a string, if the string is missing from the keys
+ of the names map , then implementations must
+ raise an .
+
+ When the argument is an integer, if the integer position is outside the
+ bounds of the sequence contained in the fields entry of this
+ csv-row-record (i.e. is greater than the size of the
+ sequence), then implementations must return the empty
+ string.
+
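The lookup rules above can be modelled outside XPath. The following is a minimal Python sketch, not the specified implementation: `fetch_field`, its parameters, and the use of `KeyError` in place of the (unnamed) dynamic error are all hypothetical. The treatment of positions below 1 is an assumption, since the text above only covers positions greater than the size of the sequence.

```python
def fetch_field(names, fields, key):
    """Return a field by column name or by 1-based position.

    names  -- dict of column name to 1-based position (the "names" entry)
    fields -- list of strings for one row (the "fields" entry)
    key    -- str (column name) or int (column position)
    """
    if isinstance(key, str):
        if key not in names:
            # Stand-in for the dynamic error required for unknown names.
            raise KeyError(f"unknown column name: {key!r}")
        position = names[key]
    else:
        position = key
    if position < 1:
        # Assumption: the text does not say what happens here.
        raise ValueError("positions are 1-based")
    if position > len(fields):
        # Out-of-bounds positions yield the empty string.
        return ""
    return fields[position - 1]
```

With `names = {"name": 1, "city": 2}` and `fields = ["Bob", "Berlin"]`, the key `"name"` selects the first field, the key `3` falls outside the row, and the key `"amount"` raises.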
-
- This function behaves identically to fn:csv-fetch-field-by-column
- would had the header entry of the containing
- parsed-csv-structure-record and the fields entry of
- this csv-row-record been supplied as its first two arguments, and
- $key as its last. See the definition of
- fn:csv-fetch-field-by-column for more details
+
+ If column names were extracted from the first row of the CSV, when there are duplicate
+ column names, implementations must include only the first occurrence
+ in the names entry of the csv-columns-record , ignoring
+ subsequent entries. Any fields in the first record whose value is the empty string
+ must also be omitted.
+
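The rule for duplicate and empty column names can be illustrated with a short Python model. This is a sketch under stated assumptions: `build_column_names` is a hypothetical helper, and it assumes positions are counted over the original first row, including the skipped fields.

```python
def build_column_names(header_row):
    """Build the name-to-position map from the first row of a CSV.

    Only the first occurrence of a duplicated name is kept, and
    fields whose value is the empty string are omitted.
    """
    names = {}
    for position, name in enumerate(header_row, start=1):
        if name == "" or name in names:
            continue  # skip empty names and later duplicates
        names[name] = position
    return names
```

For a first row of `["name", "city", "name", ""]`, only `name` and `city` survive, at positions 1 and 2.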
+ If the number-of-columns option is set to "first-row" or an
+ integer, or the filter-columns option is set, and the
+ column-names option is set to true() , the filtering of
+ columns is performed before the extraction of the first row and creation of the
+ csv-columns-record .
+
+ If the number-of-columns option is set to "first-row" or an
+ integer, or the filter-columns option is set, and the
+ column-names option is set to a map(xs:string, xs:integer) ,
+ then filtering of columns does not affect the creation of the
+ csv-columns-record , and it is possible that the number of fields in the
+ rows is smaller than the number of fields in the csv-columns-record .
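The ordering described above (column filtering first, then extraction of the first row) can be sketched in Python. This is an illustrative model only: `shape_row` and `apply_column_options` are hypothetical names, the input is assumed to be rows that have already been parsed into lists of strings, and returning `""` for a filter index beyond the end of a short row is an assumption rather than something the text states.

```python
def shape_row(row, filter_columns=None, width=None):
    """Apply filter-columns or a fixed width to one row of strings."""
    if filter_columns:
        # Assumption: an index past the end of a short row yields "".
        return [row[i - 1] if i <= len(row) else "" for i in filter_columns]
    if width is not None:
        # Pad short rows with "" and truncate long ones.
        return (list(row) + [""] * width)[:width]
    return list(row)


def apply_column_options(rows, column_names=False, filter_columns=None,
                         number_of_columns="all"):
    """Return (column_fields, data_rows) for already-parsed rows."""
    width = None
    if number_of_columns == "first-row":
        width = len(rows[0]) if rows else 0
    elif isinstance(number_of_columns, int):
        width = number_of_columns
    # Filtering happens before any header extraction.
    shaped = [shape_row(r, filter_columns, width) for r in rows]

    if isinstance(column_names, dict):
        # A supplied map is unaffected by filtering; gaps become "".
        size = max(column_names.values(), default=0)
        fields = [""] * size
        for name, position in column_names.items():
            fields[position - 1] = name
        return fields, shaped
    if column_names is True and shaped:
        # The (already filtered) first row supplies the column names.
        return shaped[0], shaped[1:]
    return [], shaped
```

With a supplied map, the number of column fields can exceed the number of fields in each row, which is the situation the last paragraph above allows for.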
A dynamic error occurs if the value of
$csv does not conform to the grammar for quoted
fields.
A dynamic error occurs if one or more of the values
- for field-separator , record-separator ,
- quote-character are specified and are not a single character.
+ for field-delimiter or quote-character are specified and are
+ not a single character.
A dynamic error occurs if any of the values for
- field-separator , record-separator ,
+ field-delimiter , row-delimiter ,
quote-character are equal.
+ A dynamic error occurs if any column-index integers,
+ such as the values in a map supplied to column-names , or as the value of
+ number-of-columns or filter-columns , are negative or
+ zero.
+ A dynamic error occurs if both the
+ number-of-columns and filter-columns options are set in a
+ call to fn:parse-csv .
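The error conditions listed above amount to a validation pass over the options map. The Python sketch below is one possible reading, not the specified behaviour: `validate_options` is a hypothetical helper, `ValueError` stands in for the dynamic errors (whose codes are not shown here), and treating the mere presence of both keys as "both set" is an assumption.

```python
def validate_options(options):
    """Raise ValueError if the options violate the rules for parse-csv."""
    field = options.get("field-delimiter", ",")
    quote = options.get("quote-character", '"')
    row = options.get("row-delimiter", ["\r\n", "\n", "\r"])
    row = [row] if isinstance(row, str) else list(row)

    # field-delimiter and quote-character must be single characters.
    for name, value in (("field-delimiter", field), ("quote-character", quote)):
        if len(value) != 1:
            raise ValueError(f"{name} must be a single character")

    # No two delimiters may be equal.
    if field == quote or field in row or quote in row:
        raise ValueError("delimiters must be distinct")

    # number-of-columns and filter-columns are mutually exclusive.
    if "number-of-columns" in options and "filter-columns" in options:
        raise ValueError("number-of-columns and filter-columns are exclusive")

    # Every column index must be a positive integer.
    indexes = list(options.get("filter-columns", ()))
    count = options.get("number-of-columns")
    if isinstance(count, int):
        indexes.append(count)
    names = options.get("column-names")
    if isinstance(names, dict):
        indexes.extend(names.values())
    if any(i <= 0 for i in indexes):
        raise ValueError("column indexes must be positive")
```

Running the checks before any parsing keeps the failure independent of the CSV content, although the text above does not require that ordering.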
All fields are returned as xs:string values.
@@ -22198,99 +22057,107 @@ return $M(collation-key("a", $C))
For more discussion of the returned data, see .
-
+
`name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`
- map { "record-separator": "§", "field-separator": ";", "quote-character": "|" }
+ map { "row-delimiter": "§", "field-delimiter": ";", "quote-character": "|" }
`|name|;|city|§|Bob|;|Berlin|§|Alice|;|Aachen|`
- map { "trim-whitespace": true() }
With defaults for delimiters and quotes, and default column extraction (false):
- map:keys(csv-to-xdm($csv-string))
- ("header", "rows")
+ map:keys(parse-csv($csv-string))
+ ("columns", "rows")
- csv-to-xdm($csv-string)?columns
+ parse-csv($csv-string)?columns
map {
"names": map {},
- "fields": (),
+ "fields": ()
}
- count(csv-to-xdm($csv-string)?rows)
+ count(parse-csv($csv-string)?rows)
3
- csv-to-xdm($csv-string)?rows[1]?field("name")
-
+ parse-csv($csv-string)?rows[1]?field("name")
+
- csv-to-xdm($csv-string)?rows[1]?field(2)
+ parse-csv($csv-string)?rows[1]?field(2)
"city"
With defaults for delimiters and quotes, and column-names: true() set:
- csv-to-xdm($csv-string, map {"columns": true()})?columns
+ parse-csv($csv-string, map {"column-names": true()})?columns
map {
"names": map { "name": 1, "city": 2 },
- "fields": ("name", "city"),
+ "fields": ("name", "city")
}
- count(csv-to-xdm($csv-string, map {"columns": true()})?rows)
+ count(parse-csv($csv-string, map {"column-names": true()})?rows)
2
- csv-to-xdm($csv-string), map {"columns": true()}?rows[1]?fields
+ parse-csv($csv-string, map {"column-names": true()})?rows[1]?fields
("Bob", "Berlin")
- csv-to-xdm($csv-string, map {"columns": true()})?rows[1]?field("name")
+ parse-csv($csv-string, map {"column-names": true()})?rows[1]?field("name")
"Bob"
- csv-to-xdm($csv-string, map {"columns": true()})?rows[1]?field(2)
+ parse-csv($csv-string, map {"column-names": true()})?rows[1]?field(2)
"Berlin"
Non-default record- and field-delimiters, non-default quotes:
- map:keys(csv-to-xdm($non-std-csv, $options))
- ("header", "rows")
+ parse-csv($non-std-csv, $options)?rows[3]?field(1)
+ "Alice"
-
- csv-to-xdm($non-std-csv, $options)?columns
+
+
+ map { "column-names": map { "Person": 1, "Location": 2 } }
+
+ Specifying column names explicitly:
+
+ map:keys(parse-csv($csv-string, $options))
+ ("columns", "rows")
+
+
+ parse-csv($csv-string, $options)?columns
map {
- "names": map {},
- "fields": (),
- }
+ "names": map { "Person": 1, "Location": 2 },
+ "fields": ("Person", "Location")
+}
-
- count(csv-to-xdm($non-std-csv, $options)?rows)
- 3
+
+ count(parse-csv($csv-string, $options)?rows)
+ 3
-
- csv-to-xdm($non-std-csv, $options)?rows[3]?field(1)
+
+ parse-csv($csv-string, $options)?rows[3]?field(1)
"Alice"
-
-
- Trimming whitespace in fields:
-
- csv-to-xdm(`name ,city ${crlf}Bob ,Berlin${crlf}Alice ,Aachen${crlf}`, $trim-opts)?rows?fields
- ("name", "city", "Bob", "Berlin", "Alice", "Aachen")
+
+ parse-csv($csv-string, $options)?rows[2]?field("Location")
+ "Berlin"
- `date,name,city,amount,currency,original amount,note{$crlf}2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}2023-07-20,Alice,Aachen,15.00{$crlf}2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}`
+ concat(`date,name,city,amount,currency,original amount,note{$crlf}`,
+`2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}`,
+`2023-07-20,Alice,Aachen,15.00{$crlf}`,
+`2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}`)
- Filtering columns
+ Filtering columns, with column-names: true()
- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "filter-columns": (2,1,4) })?columns?fields
+ parse-csv($csv-uneven-cols, map { "column-names": true(), "filter-columns": (2,1,4) })?columns?fields
("name","date","amount")
- for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "filter-columns": (2,1,4) })?rows return array { $r?fields }
+ for $r in parse-csv($csv-uneven-cols, map { "column-names": true(), "filter-columns": (2,1,4) })?rows return array { $r?fields }
(
["Bob","2023-07-19","10.00"],
["Alice","2023-07-20","15.00"],
@@ -22299,29 +22166,21 @@ return $M(collation-key("a", $C))
- Specifying the number of columns, using "all" (the default)
-
- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "all" })?columns?fields
- ("date","name","city","amount","currency","original amount","note")
-
+ Filtering columns, with column-names: map { ... }
- for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "all" })?rows return array { $r?fields }
- (
- ["2023-07-19","Bob","Berlin","10.00","USD","13.99"],
- ["2023-07-20","Alice","Aachen","15.00"],
- ["2023-07-20","Charlie","Celle","15.00","GBP","11.99","cake","not a lie"]
-)
+ parse-csv($csv-uneven-cols, map { "column-names": map { "Person": 1, "Amount": 3 }, "filter-columns": (2,1,4) })?columns
+ map {
+ "names": map { "Person": 1, "Amount": 3 },
+ "fields": ("Person", "", "Amount")
+}
- Specifying the number of columns using "first-row"
+ Specifying the number of columns using "first-row" and column-names: false()
- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "first-row" })?columns?fields
- ("date","name","city","amount","currency","original amount","note")
-
-
- for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": "first-row" })?rows return array { $r?fields }
+ for $r in parse-csv($csv-uneven-cols, map { "number-of-columns": "first-row" })?rows return array { $r?fields }
(
+ ["date","name","city","amount","currency","original amount","note"],
["2023-07-19","Bob","Berlin","10.00","USD","13.99",""],
["2023-07-20","Alice","Aachen","15.00","","",""],
["2023-07-20","Charlie","Celle","15.00","GBP","11.99","cake"]
@@ -22329,13 +22188,16 @@ return $M(collation-key("a", $C))
- Specifying the number of columns with a number
+ Specifying the number of columns with a number and column-names: true()
- csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": 6 })?columns?fields
- ("date","name","city","amount","currency","original amount")
+ parse-csv($csv-uneven-cols, map { "column-names": true(), "number-of-columns": 6 })?columns
+ map {
+ "names": map { "date": 1, "name": 2, "city": 3, "amount": 4, "currency": 5, "original amount": 6 },
+ "fields": ("date","name","city","amount","currency","original amount")
+}
- for $r in csv-to-xdm($csv-uneven-cols, map { "columns": true(), "number-of-columns": 6 })?rows return array { $r?fields }
+ for $r in parse-csv($csv-uneven-cols, map { "column-names": true(), "number-of-columns": 6 })?rows return array { $r?fields }
(
["2023-07-19","Bob","Berlin","10.00","USD","13.99"],
["2023-07-20","Alice","Aachen","15.00","",""],
@@ -22346,6 +22208,192 @@ return $M(collation-key("a", $C))
+
+
+
+
+
+
+
+
+ deterministic
+ context-independent
+ focus-independent
+
+
+ Parses CSV data supplied as a string, returning the results in the form of a sequence of arrays of strings.
+
+
+ The effect of the one-argument form of this function is the same as calling the
+ two-argument form with an empty map as the value of the $options
+ argument.
+
+ The first argument is CSV data, as defined in , in the form of a
+ sequence of xs:string values. The function parses this sequence to return
+ an XDM value.
+
+ If $csv is the empty sequence, implementations must
+ return the empty sequence.
+
+ The $options argument can be used to control the way in which the parsing
+ takes place. The option parameter conventions apply.
+
+ Implementations must treat any of CRLF, CR, or LF as a single line
+ separator, as with fn:unparsed-text-lines .
+
+ Fields are regarded as simple xs:string values. Implementations
+ must leave whitespace within a field untouched, without
+ normalizing or otherwise altering it, unless whitespace trimming is explicitly requested
+ by the user using the trim-whitespace option.
+
+ When whitespace trimming is requested, implementations must only
+ strip leading and trailing whitespace; this is not equivalent to calling
+ fn:normalize-space() .
+
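The difference between trimming and fn:normalize-space() is easiest to see side by side. The Python sketch below is illustrative only: both helper names are hypothetical, and it assumes that "whitespace" for trimming means the same four XML whitespace characters that fn:normalize-space() uses (space, tab, CR, LF).

```python
import re

XML_WHITESPACE = " \t\r\n"


def trim_field(field):
    """Model of trim-whitespace: strip leading and trailing whitespace only."""
    return field.strip(XML_WHITESPACE)


def normalize_space(field):
    """Model of fn:normalize-space(): trim, then collapse internal runs."""
    return re.sub(r"[ \t\r\n]+", " ", field).strip(XML_WHITESPACE)
```

For the field `"  Bob  The  Builder "`, trimming keeps the double spaces between words, whereas normalization collapses each run to a single space.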
+ The entries that may appear in the $options map are as follows:
+
+
+
+ The character used to delimit fields within a record. An instance of
+ xs:string whose length is exactly one.
+ xs:string
+ ","
+
+
+ The sequence of strings used to delimit rows within the CSV string. Defaults to CRLF/LF/CR.
+ xs:string+
+ ("&#xD;&#xA;", "&#xA;", "&#xD;")
+
+
+ The character used to quote fields within the CSV string. An instance of
+ xs:string whose length is exactly one.
+ xs:string
+ '"'
+
+
+ Determines whether fields should have leading and trailing whitespace
+ removed before being returned.
+ xs:boolean
+ false
+
+ Fields will be returned with any leading or trailing
+ whitespace intact. Implementations must preserve whitespace
+ as it occurred in the CSV string.
+
+ Fields will be returned with leading or trailing
+ whitespace removed, and all non-leading or -trailing whitespace preserved.
+
+
+
+
+
+ The result of the function is a sequence of arrays-of-strings
+ array(xs:string)* .
+ A blank row is represented as an empty array.
+ An empty field is represented by the empty string.
+
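The mapping from CSV text to a sequence of arrays can be modelled with a small state machine. The Python sketch below is a simplified illustration, not the specified algorithm: the function name mirrors fn:csv-to-simple-rows but the parameters are hypothetical, only the default CRLF/CR/LF row delimiters are handled, trimming is omitted, and a quote inside an unquoted field is accepted rather than rejected.

```python
def csv_to_simple_rows(text, field_delimiter=",", quote='"'):
    """Parse CSV text into a list of rows, each a list of strings."""
    rows, row, field = [], [], []
    in_quotes = False
    started = False  # True once the current row has a delimiter or quote
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < n and text[i + 1] == quote:
                    field.append(quote)  # doubled quote is a literal quote
                    i += 2
                    continue
                in_quotes = False
            else:
                field.append(ch)
        elif ch == quote:
            in_quotes = True
            started = True
        elif ch == field_delimiter:
            row.append("".join(field))
            field = []
            started = True
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CRLF as a single separator
            if started or field:
                row.append("".join(field))
            rows.append(row)  # a blank line yields an empty row
            row, field, started = [], [], False
        else:
            field.append(ch)
        i += 1
    if in_quotes:
        raise ValueError("unterminated quoted field")
    if started or field:
        row.append("".join(field))
        rows.append(row)
    return rows
```

In this model a trailing row delimiter does not produce an extra blank row, a blank line between records becomes an empty list, and a field with nothing between two delimiters becomes the empty string.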
+
+ A dynamic error occurs if the value of
+ $csv does not conform to the grammar for quoted
+ fields.
+ A dynamic error occurs if one or more of the values
+ for field-delimiter or quote-character are specified and are
+ not a single character.
+ A dynamic error occurs if any of the values for
+ field-delimiter , row-delimiter ,
+ quote-character are equal.
+
+
+ All fields are returned as xs:string values.
+ Quoted fields in the input are returned without the quotes.
+ For more discussion of the returned data, see .
+
+
+
+
+
+
+ Handling any of the default record separators:
+
+ csv-to-simple-rows(`name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`)
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"],
+ ["Alice", "Aachen"]
+)
+
+
+ csv-to-simple-rows(`name,city{$cr}Bob,Berlin{$cr}Alice,Aachen{$cr}`)
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"],
+ ["Alice", "Aachen"]
+)
+
+
+ csv-to-simple-rows(`name,city{$lf}Bob,Berlin{$lf}Alice,Aachen{$lf}`)
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"],
+ ["Alice", "Aachen"]
+)
+
+
+
+ Quote handling:
+
+ csv-to-simple-rows(`"name","city"{$crlf}"Bob","Berlin"{$crlf}"Alice","Aachen"{$crlf}`)
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"],
+ ["Alice", "Aachen"]
+)
+
+
+ csv-to-simple-rows(`"name","city"{$crlf}"Bob ""The Exemplar"" Mustermann","Berlin"{$crlf}`)
+ (
+ ["name", "city"],
+ ['Bob "The Exemplar" Mustermann', "Berlin"]
+)
+
+
+
+ Non-default row- and field-delimiters:
+
+ csv-to-simple-rows("name;city§Bob;Berlin§Alice;Aachen", map{"row-delimiter": "§", "field-delimiter": ";"})
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"],
+ ["Alice", "Aachen"]
+)
+
+
+
+ Non-default quote character:
+
+ csv-to-simple-rows(`|name|,|city|{$crlf}|Bob|,|Berlin|{$crlf}`, map{"quote-character": "|"})
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"]
+)
+
+
+
+ Trimming whitespace in fields:
+
+ csv-to-simple-rows(`name ,city {$crlf}Bob ,Berlin{$crlf}Alice ,Aachen{$crlf}`, map{"trim-whitespace": true()})
+ (
+ ["name", "city"],
+ ["Bob", "Berlin"],
+ ["Alice", "Aachen"]
+)
+
+
+
+
+
@@ -22369,7 +22417,7 @@ return $M(collation-key("a", $C))
The first argument is CSV data, as defined in , in the form of a
sequence of xs:string values. The function parses this sequence using
- fn:parse-csv , and then processes its result to return an XML document.
+ fn:csv-to-simple-rows , and then processes its result to return an XML document.
If $csv is the empty sequence, implementations must
return a ]]> whose ]]> element
@@ -22390,7 +22438,7 @@ return $M(collation-key("a", $C))
def="option-parameter-conventions">option parameter conventions apply.
Delimiters and whitespace trimming are handled using
- fn:parse-csv , and the options controlling their use are defined
+ fn:csv-to-simple-rows , and the options controlling their use are defined
there.
The entries that may appear in the $options map are as follows:
@@ -22402,8 +22450,8 @@ return $M(collation-key("a", $C))
xs:string
","
-
- The characters used to delimit records within the CSV string, if the
+
+ The characters used to delimit rows within the CSV string, if the
default use of line separator as record separator is to be overridden.
xs:string
()
@@ -22431,9 +22479,11 @@ return $M(collation-key("a", $C))
Determines whether the first row of the CSV should be treated as a list
- of column headers and returned as a csv-columns-record in the
- header entry of the returned map.
- union(xs:boolean, map(xs:integer, xs:string))
+ of column headers and returned as ]]> elements in
+ the ]]> element. Permitted values are a map of type
+ map(xs:string, xs:integer) or an xs:boolean .
+
+ item()
false
The ]]> element is populated
@@ -22443,7 +22493,7 @@ return $M(collation-key("a", $C))
element.
Implementations must not include a
]]> element in the output.
- The supplied map is used to
+ The supplied map is used to
construct a sequence of ]]> elements to populate
the ]]> element. The xs:integer
denotes the column number, and the xs:string the column name. Gaps
@@ -22489,355 +22539,278 @@ return $M(collation-key("a", $C))
$csv does not conform to the grammar for quoted
fields.
A dynamic error occurs if one or more of the values
- for field-separator , record-separator ,
+ for field-delimiter , row-delimiter ,
quote-character are specified and are not a single character.
A dynamic error occurs if any of the values for
- field-separator , record-separator ,
+ field-delimiter , row-delimiter ,
quote-character are equal.
-
+
`name,city{$crlf}Bob,Berlin{$crlf}Alice,Aachen{$crlf}`
An empty CSV with default column extraction (false):
csv-to-xml("")
-
-
+
+
+
]]>
An empty CSV with column extraction:
- csv-to-xml("", map { "columns": true() })
+ csv-to-xml("", map { "column-names": true() })
-
-
-
+
+
+
+
]]>
An empty CSV with explicit column names:
- csv-to-xml("", map { "columns": map { "name": 1, "city": 3 })
+ csv-to-xml("", map { "column-names": map { "name": 1, "city": 3 } })
-
- name
-
- city
-
-
-
+
+
+ name
+
+ city
+
+
+
]]>
With defaults for delimiters and quotes, and column extraction:
- csv-to-xml($csv-string, map { "columns": true() })
+ csv-to-xml($csv-string, map { "column-names": true() })
-
- name
- city
-
-
-
- Bob
- Berlin
-
-
- Alice
- Aachen
-
-
-
+
+
+ name
+ city
+
+
+
+ Bob
+ Berlin
+
+
+ Alice
+ Aachen
+
+
+
]]>
With defaults for delimiters and quotes, and column extraction:
- csv-to-xml($csv-string, map { "columns": true() })
+ csv-to-xml($csv-string, map { "column-names": true() })
-
- name
- city
-
-
-
- Bob
- Berlin
-
-
- Alice
- Aachen
-
-
-
+
+
+ name
+ city
+
+
+
+ Bob
+ Berlin
+
+
+ Alice
+ Aachen
+
+
+
]]>
- `date,name,city,amount,currency,original amount,note{$crlf}2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}2023-07-20,Alice,Aachen,15.00{$crlf}2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}`
+ concat(`date,name,city,amount,currency,original amount,note{$crlf}`,
+`2023-07-19,Bob,Berlin,10.00,USD,13.99{$crlf}`,
+`2023-07-20,Alice,Aachen,15.00{$crlf}`,
+`2023-07-20,Charlie,Celle,15.00,GBP,11.99,cake,not a lie{$crlf}`)
Filtering columns
- csv-to-xml($csv-string, map { "columns": true(), "filter-columns": (2,1,4) })
+ csv-to-xml($csv-uneven-cols, map { "column-names": true(), "filter-columns": (2,1,4) })
-
- name
- date
- amount
-
-
-
- Bob
- 2023-07-19
- 10.00
-
-
- Alice
- 2023-07-20
- 15.00
-
-
- Charlie
- 2023-07-20
- 15.00
-
-
-
+
+
+ name
+ date
+ amount
+
+
+
+ Bob
+ 2023-07-19
+ 10.00
+
+
+ Alice
+ 2023-07-20
+ 15.00
+
+
+ Charlie
+ 2023-07-20
+ 15.00
+
+
+
]]>
Specifying the number of columns, using "all" (the default)
- csv-to-xml($csv-uneven-cols, map { "columns": true(), "number-of-columns": "all" })
+ csv-to-xml($csv-uneven-cols, map { "column-names": true(), "number-of-columns": "all" })
-
- date
- name
- city
- amount
- currency
- original amount
- note
-
-
-
- 2023-07-19
- Bob
- Berlin
- 10.00
- USD
- 13.99
-
-
- 2023-07-20
- Alice
- Aachen
- 15.00
-
-
- 2023-07-20
- Charlie
- Celle
- 15.00
- GBP
- 11.99
- cake
- not a lie
-
-
-
+
+
+ date
+ name
+ city
+ amount
+ currency
+ original amount
+ note
+
+
+
+ 2023-07-19
+ Bob
+ Berlin
+ 10.00
+ USD
+ 13.99
+
+
+ 2023-07-20
+ Alice
+ Aachen
+ 15.00
+
+
+ 2023-07-20
+ Charlie
+ Celle
+ 15.00
+ GBP
+ 11.99
+ cake
+ not a lie
+
+
+
]]>
Specifying the number of columns using "first-row"
- csv-to-xml($csv-uneven-cols, map { "columns": true(), "number-of-columns": "first-row" })
+ csv-to-xml($csv-uneven-cols, map { "column-names": true(), "number-of-columns": "first-row" })
-
- date
- name
- city
- amount
- currency
- original amount
- note
-
-
-
- 2023-07-19
- Bob
- Berlin
- 10.00
- USD
- 13.99
-
-
-
- 2023-07-20
- Alice
- Aachen
- 15.00
-
-
-
-
-
- 2023-07-20
- Charlie
- Celle
- 15.00
- GBP
- 11.99
- cake
-
-
-
+
+
+ date
+ name
+ city
+ amount
+ currency
+ original amount
+ note
+
+
+
+ 2023-07-19
+ Bob
+ Berlin
+ 10.00
+ USD
+ 13.99
+
+
+
+ 2023-07-20
+ Alice
+ Aachen
+ 15.00
+
+
+
+
+
+ 2023-07-20
+ Charlie
+ Celle
+ 15.00
+ GBP
+ 11.99
+ cake
+
+
+
]]>
Specifying the number of columns with a number
- csv-to-xml($csv-uneven-cols, map { "columns": true(), "number-of-columns": 6 })
+ csv-to-xml($csv-uneven-cols, map { "column-names": true(), "number-of-columns": 6 })
-
- date
- name
- city
- amount
- currency
- original amount
-
-
-
- 2023-07-19
- Bob
- Berlin
- 10.00
- USD
- 13.99
-
-
- 2023-07-20
- Alice
- Aachen
- 15.00
-
-
-
-
- 2023-07-20
- Charlie
- Celle
- 15.00
- GBP
- 11.99
-
-
-
+
+
+ date
+ name
+ city
+ amount
+ currency
+ original amount
+
+
+
+ 2023-07-19
+ Bob
+ Berlin
+ 10.00
+ USD
+ 13.99
+
+
+ 2023-07-20
+ Alice
+ Aachen
+ 15.00
+
+
+
+
+ 2023-07-20
+ Charlie
+ Celle
+ 15.00
+ GBP
+ 11.99
+
+
+
]]>
-
-
-
-
-
-
-
-
-
- deterministic
- context-independent
- focus-independent
-
-
- Fetches a field from a parsed CSV row by name or position.
-
-
- The first argument is a csv-columns-record , as provided in the
- header entry of the parsed-csv-structure-record returned by
- fn:csv-to-xdm .
-
- The second argument is the row whose fields are being fetched, represented as a sequence
- of strings as would be provided by the fields entry of a
- csv-row-record returned by fn:csv-to-xdm .
-
- The final argument is the key to use for the lookup, supplied as either an
- xs:string (the column name) or xs:integer (the column
- position).
-
- When the argument is a string, if the string is missing from the keys of the map
- contained in the names entry of the $columns argument’s
- csv-columns-record , then implementations must raise
- an .
-
- When the argument is an integer, if the integer position is outside the bounds of the
- $fields sequence (i.e. is greater than the size of the sequence), then
- implementations must return the empty string.
-
- The function returns the field in the sequence $fields at the position in
- the sequence either explicitly provided (when $key is an
- xs:integer ), or looked up from the map of name to position in the
- csv-columns-record provided in $columns .
-
-
- A dynamic error occurs if the value of
- $key is an xs:string but is not a member of the keys of the
- map contained in the names entry of the csv-columns-record in
- $header . fields.
-
-
- map {
- "names": map { "name": 1, "city": 2 },
- "fields: ("name", "city")
-}
- ("Bob", "Berlin")
-
- With a string key:
-
- csv-fetch-field-by-column($columns, $fields, "name")
- "Bob"
-
-
- csv-fetch-field-by-column($columns, $fields, "amount")
-
-
-
-
- With an integer key
-
- csv-fetch-field-by-column($columns, $fields, 2)
- "Berlin"
-
-
- csv-fetch-field-by-column($columns, $fields, 3)
- ""
-
-
-
-
-
diff --git a/specifications/xpath-functions-40/src/xpath-functions.xml b/specifications/xpath-functions-40/src/xpath-functions.xml
index 73c20d81f..4579c099b 100644
--- a/specifications/xpath-functions-40/src/xpath-functions.xml
+++ b/specifications/xpath-functions-40/src/xpath-functions.xml
@@ -5762,14 +5762,11 @@ correctly in all browsers, depending on the system configuration.-->
-
-
-
-
-
+
+
@@ -6874,19 +6871,19 @@ correctly in all browsers, depending on the system configuration.-->
within a field. (See .)
The functions for processing CSV-formatted data are built on
- fn:parse-csv , which provides a simple representation of a parsed CSV
+ fn:csv-to-simple-rows , which provides a simple representation of a parsed CSV
as a sequence of arrays-of-strings, array(xs:string)* , handling row and
column delimiters, and quoting.
- The fn:csv-to-xml and fn:csv-to-xdm functions provide more
+ The fn:csv-to-xml and fn:parse-csv functions provide more
sophisticated processing.
Common parsing options
- All three functions: fn:parse-csv , fn:csv-to-xml , and
- fn:csv-to-xdm , take options to control basic parsing, consisting
+ All three functions: fn:csv-to-simple-rows , fn:csv-to-xml , and
+ fn:parse-csv , take options to control basic parsing, consisting
of specifying the various delimiters. These core delimiter options are used by the
functions that generate CSV data:
@@ -6895,7 +6892,7 @@ correctly in all browsers, depending on the system configuration.-->
Additionally, the parsing functions share an additional option to control whether
leading and trailing whitespace should be stripped or not.
-
+
@@ -6984,11 +6981,11 @@ correctly in all browsers, depending on the system configuration.-->
Basic mapping of CSV to XDM
- The basic output from fn:parse-csv returns a sequence of rows, where
+ The basic output from fn:csv-to-simple-rows returns a sequence of rows, where
each row is simply mapped to an array of xs:string values.
The first row of the CSV is returned as with all the other rows.
- fn:parse-csv does not distinguish between a header row and data
+ fn:csv-to-simple-rows does not distinguish between a header row and data
rows, and returns all of them.
@@ -7031,16 +7028,16 @@ Field 2A,Field 2B,Field 2C,Field 2D'
However, the reality is that CSVs can, and sometimes do, contain a variable number
of fields in a row. As a result, implementations of this function must
not truncate or pad the number of fields in each row for any reason.
- The fn:csv-to-xml and fn:csv-to-xdm functions provide
+ The fn:csv-to-xml and fn:parse-csv functions provide
facilities to deal with enforcing uniformity and an expected number of
columns.
- Mapping CSV data to XDM in fn:csv-to-xdm
+ Mapping CSV data to XDM in fn:parse-csv
- The fn:csv-to-xdm function returns a
+ The fn:parse-csv function returns a
parsed-csv-structure-record :
@@ -7090,28 +7087,6 @@ Field 2A,Field 2B,Field 2C,Field 2D'
fields sequence by either column position (when passed an
xs:integer ) or column name (when passed an
xs:string ).
-
- This function is, effectively, a partial application of
- fn:csv-fetch-field-by-column where its $columns
- argument is bound to the columns entry of the
- parsed-csv-structure-record , and its $row argument
- is bound to array{csv-row?fields} . This is described in more
- detail below:
-
- Given a string, $csv-string containing CSV data, implementations
- must return a function that will return identical results
- to fn:csv-fetch-field-by-column called with the same
- csv-columns and an array() containing the same
- items as the fields sequence:
-
- let $csv-record := fn:csv-to-xdm($csv-string),
- $csv-columns := $csv-record?columns,
- $csv-row := head($csv-record?rows)
- return if (empty($csv-row?field(1)))
- then empty(fn:csv-fetch-field-by-column($csv-columns, array{$csv-row}, 1))
- else $csv-row?field(1) = fn:csv-fetch-field-by-column($csv-columns, array{$csv-row}, 1)
- (: must return true :)
-
@@ -7180,22 +7155,21 @@ Bob,2023-07-14,2.34
Illustrative examples of processing CSV data
- The following examples illustrate how an application can build more complex processing of the output of fn:parse-csv .
+ The following examples illustrate how an application can build more complex processing of the output of fn:csv-to-simple-rows .
A variable, $crlf is assumed to be in scope containing the CR and LF characters
let $crlf := fn:char('x0D')||fn:char('x0A')
- Converting a CSV into an HTML-style table using fn:csv-to-xdm
+ Converting a CSV into an HTML-style table using fn:parse-csv
Direct conversion is a matter of iterating across the records and fields to
generate <tr> and <td> elements.
Using XQuery:
{
for $column in $csv?columns?fields
@@ -7233,7 +7207,7 @@ return
-
+
@@ -10975,26 +10949,40 @@ ISBN 0 521 77752 6.
- Raised by fn:parse-csv if a syntax error in the quoting of one of the
+ Raised by fn:csv-to-simple-rows if a syntax error in the quoting of one of the
fields in the input CSV is found.
- Raised by fn:parse-csv if the field-separator ,
+ Raised by fn:csv-to-simple-rows if the field-separator ,
record-separator , or quote-character option is set to
an illegal value.
- Raised by fn:parse-csv if any of the delimiter characters have been
+ Raised by fn:csv-to-simple-rows if any of the delimiter characters have been
set to the same value.
- Raised by fn:csv-fetch-field-by-column , and the function from the
- field entry of csv-columns-record , if its
- $key argument is an xs:string and is not one of the
- known column names.
+ Raised by the function from the field entry of
+ csv-columns-record , if its $key argument is an
+ xs:string and is not one of the known column names.
+
+
+ Raised by fn:parse-csv , fn:csv-to-xml , and the function
+ from the field entry of csv-columns-record , if a value
+ referring to a column index is zero or negative. (This applies to the
+ number-of-columns and filter-columns options, to values in a map
+ passed to column-names , and to the argument of the field function.)
+
+
+
+ Raised by fn:parse-csv and fn:csv-to-xml , if both the
+ number-of-columns and filter-columns options are set:
+ they are mutually exclusive.
Raised by fn:id , fn:idref , and fn:element-with-id
@@ -11947,9 +11935,8 @@ declare function eg:distinct-nodes-stable ($arg as node()*) as node()* {
map:replace
map:substitute
fn:parse-csv
- fn:csv-to-xdm
fn:csv-to-xml
- fn:csv-fetch-field-by-column
+ fn:csv-to-simple-rows
array:replace
array:slice
|