Commit

Merge pull request #1108 from ndw/build-uri
566-partial Describe a less aggressive %-encoding for fn:build-uri
ndw authored May 28, 2024
2 parents 51608cf + d3102df commit 474fd37
Showing 1 changed file with 47 additions and 7 deletions.
54 changes: 47 additions & 7 deletions specifications/xpath-functions-40/src/function-catalog.xml
@@ -30897,12 +30897,52 @@ path with an explicit <code>file:</code> scheme.</p>
is made to determine if a password or standard port are present,
the <code>authority</code> value is simply added to the string.)</p>

-<p>If the <code>path-segments</code> key exists in the map, then the
-path is constructed from the parts, with non-URI characters encoded:
-<code>string-join($parts?path-segments ! encode-for-uri(.), $options?path-separator)</code>,
-otherwise the value of the <code>path</code> key is used.
-If the <code>path</code> value is the empty sequence,
-the empty string is used for the path. The path is added to the URI.</p>
+<ulist>
+<item><p>If the <code>path-segments</code> key exists in the map,
+then the path is constructed from the segments.</p>
+<p>To construct the path, each
+segment is encoded, then they are joined together, separated by the path
+separator, to form the path.
+The encoding replaces control characters (code points less than 0x20)
+and, beyond those, only the following characters with their
+percent-escaped forms: <code> </code> (space), <code>%</code> (percent
+sign), <code>/</code> (solidus), <code>?</code> (question mark),
+<code>#</code> (number sign), <code>+</code> (plus sign),
+<code>[</code> (left square bracket), and <code>]</code> (right
+square bracket).</p>
+
+<p>This is a compromise. The <code>fn:parse-uri</code> function removes
+percent-escaping when it constructs the path segments because that
+behavior is often most convenient when dealing with
+common cases involving <code>file:</code> URIs. But the decoding process is not lossless (consider
+that both “<code>%3D</code>” and “<code>=</code>” will be realized as
+<code>=</code> in the path segments).</p>
+
+<p>The escaping described here protects the delimiters used in
+hierarchical URIs because leaving those unescaped would change the
+interpretation of the URI.
+An application with more stringent requirements can construct a <code>path</code>
+that satisfies the requirements and leave the <code>path-segments</code> key out of the
+map.</p>
+</item>
+<item>
+<p>Otherwise the value of the <code>path</code> key is used.</p>
+</item>
+<item>
+<p>If neither is present, the empty string is used for the path.</p>
+</item>
+</ulist>
+
+<note>
+<p>The compromise encoding used for the <code>path-segments</code> does not
+apply to the query parameters or fragment identifier. Those values are encoded
+with <code>encode-for-uri</code>. The compromise encoding isn’t appropriate
+because those fields can contain additional characters that must be encoded.
+Adopting a <emph>different</emph> compromise encoding for those values
+seems unnecessary in practice.</p>
+</note>
+
+<p>The path is added to the URI.</p>

<p>If the <code>query-parameters</code> key exists in the map, its value
must be a map. A sequence of strings is constructed from the values in the map.
@@ -30925,7 +30965,7 @@ path with an explicit <code>file:</code> scheme.</p>
a preceding question mark (<code>?</code>).</p>

<p>If the <code>fragment</code> key exists in the map, then
-the value of that key is added to the URI with
+the value of that key is encoded with <code>encode-for-uri</code> and added to the URI with
a preceding hash mark (<code>#</code>).</p>

<p>The resulting URI is returned.</p>
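
The path-segment encoding added in this commit can be illustrated outside XPath. The following is a minimal Python sketch, not the normative algorithm: the names `ESCAPED`, `encode_segment`, and `build_path` are hypothetical, the default separator `/` is an assumption standing in for the `path-separator` option, and uppercase hexadecimal digits in the escapes are assumed. Because the text says only the listed characters and control characters are escaped, the sketch leaves everything else, including `=`, `&`, and non-ASCII characters, untouched.

```python
# Characters the new spec text lists for percent-escaping in path segments,
# in addition to control characters (code points below 0x20).
ESCAPED = set(" %/?#+[]")


def encode_segment(segment: str) -> str:
    """Apply the less aggressive escaping to one path segment (illustrative)."""
    out = []
    for ch in segment:
        if ord(ch) < 0x20 or ch in ESCAPED:
            # Every character escaped here is ASCII, so it is a single byte.
            out.append(f"%{ord(ch):02X}")
        else:
            # Everything else is passed through unchanged.
            out.append(ch)
    return "".join(out)


def build_path(segments, separator="/"):
    """Encode each segment, then join them with the path separator."""
    return separator.join(encode_segment(s) for s in segments)


if __name__ == "__main__":
    # A leading empty segment produces a path that starts with the separator.
    print(build_path(["", "docs", "a/b?c#d", "x=1"]))
```

Under these assumptions, a solidus inside a segment is escaped so that it cannot be mistaken for a path separator, while `=` passes through unchanged, which is the lossy case the commit text describes for `%3D`.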