Node Updates: Functions #1702

Open
ChristianGruen opened this issue Jan 14, 2025 · 2 comments
Labels
Discussion A discussion on a general topic. XQFO An issue related to Functions and Operators

Comments

@ChristianGruen
Contributor

In #1225, I have summarized some thoughts on generalizing updates for both nodes and structured items (maps/arrays).

XQuery Update is complex, as updates are in general, so we may still decide that it is too ambitious to introduce update features in the core language. If we want to give it a try, we could offer functions that are based on XQUF, but that only perform one update operation at a time on a given input. This way, we could ignore the sophisticated Pending Update List semantics, which are only important when multiple updating expressions are specified and need to be checked and brought into order.

A function set that provides an equivalent functionality for all XQUF update operations could look as follows (the presented functions are valid XQuery Update code):

declare namespace update = 'http://www.w3.org/TR/xquery-update';

declare function update:delete(
  $node  as node(),
  $path  as fn(node()) as node()*
) as node() {
  copy $c := $node
  modify delete node $path($c)
  return $c
};

declare function update:rename(
  $node  as node(),
  $path  as fn(node()) as node()*,
  $name  as (xs:QName | xs:NCName | fn(node(), xs:integer) as (xs:QName | xs:NCName))
) as node() {
  copy $c := $node
  modify (
    for $target at $pos in $path($c)
    let $result := if($name instance of fn(*)) {
      $name($target, $pos)
    } else {
      $name
    }
    return rename node $target as $result
  )
  return $c
};

declare function update:replace(
  $node      as node(),
  $path      as fn(node()) as node()*,
  $contents  as (node() | xs:anyAtomicType | fn(node(), xs:integer) as node()*)*,
  $options   as record(value? as xs:boolean)? := {}
) as node() {
  copy $c := $node
  modify (
    for $target at $pos in $path($c)
    let $result := (
      for $content in $contents
      return if($content instance of fn(*)) {
        $content($target, $pos)
      } else {
        $content
      }
    )
    return if($options?value) {
      replace value of node $target with $result
    } else {
      replace node $target with $result
    }
  )
  return $c
};

declare function update:insert(
  $node      as node(),
  $path      as fn(node()) as node()*,
  $contents  as (node() | xs:anyAtomicType | fn(node(), xs:integer) as (node() | xs:anyAtomicType))*,
  $options   as record(position? as enum('last', 'first', 'before', 'after'))? := {}
) as node() {
  copy $c := $node
  modify (
    for $target at $pos in $path($c)
    let $result := (
      for $content in $contents
      return if($content instance of fn(*)) {
        $content($target, $pos)
      } else {
        $content
      }
    )
    return switch($options?position) {
      case 'before' return insert node $result before $target
      case 'after'  return insert node $result after $target
      case 'first'  return insert node $result as first into $target
      default       return insert node $result as last into $target
    }
  )
  return $c
};

Here are some example function calls:

let $node := <xml><e/><e/></xml>
return (
  (: deletes all <e/> child nodes :)
  update:delete($node, fn { e }),
  (: renames the <e/> child nodes to <f/> :)
  update:rename($node, fn { e }, 'f'),
  (: replaces the <e/> child nodes with <replaced/> :)
  update:replace($node, fn { e }, <replaced/>),
  (: replaces the string value of the <e/> child nodes with 'text' :)
  update:replace($node, fn { e }, 'text', { 'value': true() }),
  (: inserts a 'text' text node into the <e/> child nodes :)
  update:insert($node, fn { e }, 'text'),
  (: inserts 'text1' and 'text2' text nodes into the <e/> child nodes :)
  update:insert($node, fn { e }, fn($node, $pos) { 'text' || $pos }),
  (: inserts an <x/> element after each <e/> child node :)
  update:insert($node, fn { e }, <x/>, { 'position': 'after' })
)

Multiple update operations can easily be chained:

(: rename <e/> child nodes to <f/>, insert 'x' text nodes :)
<xml><e/><e/></xml>
=> update:rename(fn { e }, 'f')
=> update:insert(fn { f }, 'x')
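For comparison, here is a sketch of what this chain might correspond to in plain XQuery Update (copy/modify syntax), assuming each function call works on its own copy as in the declarations above. The variable names `$c1` and `$c2` are only illustrative. The explicit `for` loops are there because `rename node` and `insert node … into` each take a single target node in XQUF.

```xquery
(: first update: rename the <e/> child nodes to <f/> :)
copy $c1 := <xml><e/><e/></xml>
modify (
  for $target in $c1/e
  return rename node $target as 'f'
)
return
  (: second update: operates on a fresh copy of the first result :)
  copy $c2 := $c1
  modify (
    for $target in $c2/f
    return insert node 'x' as last into $target
  )
  return $c2
```

Each step has its own pending update list that is applied before the next step starts, so no ordering or compatibility checks across the two operations are needed.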

Ideally, we could offer a similar function set (or maybe even the same) for maps and arrays in a next step (see #77). The map/array syntax would be similar for deletions…

let $data := { 'a': [ 1, 2, 3 ] }
return update:delete($data, fn { ?a?2 })

…but it certainly gets trickier for other operations.

If some of you believe that the presented approach is something that we should pursue, I will be happy to add details. As an alternative, we could pursue the XQUF light approach that I have sketched in #1225, based on the existing XQUF update keywords.

Yet another solution could be to stick with what we have, but add map/array update features to XQUF.

@ChristianGruen ChristianGruen added XQFO An issue related to Functions and Operators Discussion A discussion on a general topic. labels Jan 14, 2025
@michaelhkay
Contributor

Yes, I think it would be a good idea to progress this. Note that the semantics could also be described in terms of XSLT equivalents, which avoids relying on a spec that isn't fully defined in an XQuery 4.0 context. (Or, perhaps a bit more laboriously, in terms of recursive XQuery functions using typeswitch).

I'm not sure what you gain by supplying the target of rename/delete etc as a function rather than just a list of nodes. Why

update:rename($node, fn { e }, 'f')

rather than

update:rename($node, $node/e, 'f')

And if we're doing update through functions, I think it would make sense to provide node construction through functions as well: see #573

@ChristianGruen
Contributor Author

> which avoids relying on a spec that isn't fully defined in an XQuery 4.0 context

I have learned to appreciate XQuery Update over time, so I think we will benefit a lot if we target a solution that can be expressed as a subset of XQuery Update as much as possible. Otherwise, multiple solutions would exist in parallel, and it will be difficult to argue which solution is the one to prefer in XQuery. Next, it should be simple to switch between both worlds, and for example enhance an XQuery 4.0 function call to an XQuery Update expression if it turns out that the function call is too limited.

But in principle I agree: If we manage to define functionality that is simple enough to work out without XQUF references, it would simplify things a lot.

> I'm not sure what you gain by supplying the target of rename/delete etc as a function rather than just a list of nodes.

From the XQUF perspective, it is certainly the more obvious way to specify it. update:delete($node, $path) is simply equivalent to $node transform with { delete node $path(.) }.

It would be much harder to specify an equivalent XQUF expression with a list of target nodes that is generated before the node is copied. If we supply a list of nodes instead of a function, we:

  • need to find out whether the target nodes are actually descendants of the input node.
  • If they are, we would need to find their counterparts in the copied instance so that they can be deleted.
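To illustrate the extra work, here is a minimal sketch of a node-list variant of the delete operation. It assumes that the copy preserves the document order of descendants, so that a target can be located in the copy by its position. The function name `local:delete-nodes` is hypothetical, attribute targets are not handled, and targets outside `$node` are silently ignored.

```xquery
declare function local:delete-nodes(
  $node     as node(),
  $targets  as node()*
) as node() {
  (: positions of the targets among the descendants of the original node :)
  let $positions :=
    for $n at $pos in $node/descendant-or-self::node()
    where some $t in $targets satisfies $t is $n
    return $pos
  return
    copy $c := $node
    (: delete the nodes at the same positions in the copy :)
    modify delete node ($c/descendant-or-self::node())[position() = $positions]
    return $c
};

let $xml := <xml><e/><e/></xml>
return local:delete-nodes($xml, $xml/e[2])
```

Both steps from the list show up here: the identity comparison filters out nodes that are not descendants of the input, and the positional lookup finds the counterparts in the copy. With a function argument, neither step is needed.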

Next, if we supply a list of nodes, things get challenging with nested target nodes. Imagine we wanted to rename nested a nodes to b:

(: expected: <xml><b><b><b/></b></b></xml> :)
let $xml := <xml><a><a><a/></a></a></xml>
return $xml transform with { for $a in .//a return rename node $a as 'b' }

With the following recursive XQuery code…

declare function update:rename(
  $node    as element(),
  $target  as node()*,
  $name    as xs:QName
) as node() {
  element { node-name($node) } {
    for $child in $node/*
    return (
      if(some $t in $target satisfies $t is $child) then (
        (: the new element receives copies of the children of $child :)
        element { $name } { $child/node() }
      ) else (
        $child
      )
    ) ! update:rename(., $target, $name)
  }
};

…the uppermost replaced element a would get a new node identity, and its children could no longer be compared with the $target nodes. Maybe we would need to start from the bottom nodes to avoid this. But maybe it would be simpler with XSLT. Or am I missing something?
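One possible way around the identity problem, given only as a sketch: choose the new name while still looking at the original node, and recurse over the original children instead of the newly constructed element. All identity comparisons then happen before any copy is made. The function name `local:rename` is hypothetical, and namespace handling is left aside.

```xquery
declare function local:rename(
  $node     as node(),
  $targets  as node()*,
  $name     as xs:QName
) as node() {
  typeswitch($node)
    case element() return
      (: the name is chosen by comparing the original node with the targets :)
      element {
        if(some $t in $targets satisfies $t is $node) then $name else node-name($node)
      } {
        $node/@*,
        (: recursion runs on the original children, so identities are intact :)
        for $child in $node/node()
        return local:rename($child, $targets, $name)
      }
    default return $node
};

(: expected: <xml><b><b><b/></b></b></xml> :)
let $xml := <xml><a><a><a/></a></a></xml>
return local:rename($xml, $xml//a, QName('', 'b'))
```

This still requires a full traversal with one identity comparison per node and target, so it does not remove the cost of relating the target nodes to the input.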

In principle, the approach resembles the first proposal for map/array updates (#832). In both cases, we could probably get rid of the function notation as long as we can be sure that we can always create a relationship between the copied and the target items.
