Generalization of Deep Updates #1225

Open
ChristianGruen opened this issue May 18, 2024 · 2 comments
Labels
Discussion A discussion on a general topic. PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-optional Categorized as "optional for 4.0" at the Prague f2f, 2024 Propose for V4.0 The WG should consider this item critical to 4.0 XPath An issue related to XPath

Comments

@ChristianGruen
Contributor

ChristianGruen commented May 18, 2024

Note: This is a discussion issue, as I cannot contribute anything substantial yet.

Observations

Our current approach to supporting updates in the languages may come as a surprise to developers:

  • The XQuery core specification (which includes X in its name) will include constructs for updating Maps and Arrays.
  • To update XML, an implementation must support the XQuery Update (XQUF) specification.

I think we should…

  1. either embed map/array updates in XQUF, or
  2. support a modified subset of XQUF in our core specs (while remaining fully compatible with XQUF).

I believe 2. is more realistic. By providing a simplified syntax, we could tackle some of the shortcomings of XQUF, such as its verbosity and its seemingly unnecessary restrictions:

XQUF: Verbosity

The Transform expression (or Copy Modify expression, as it’s called in 3.0) has a cumbersome and wordy syntax for doing very trivial things:

copy $node := <a><b/></a>
modify delete node $node/b
return $node

The 3.0 Transform With syntax is a bit simpler, as it utilizes the context item:

<a><b/></a> transform with {
  delete node ./b
}

It resulted from the BaseX update syntax…

<a><b/></a> update {
  delete node ./b
}

…which comes with an ambiguity that forbids its unchanged adoption: element update {} could be both an element constructor and an update statement. I think that dropping the curly braces (and, optionally, using parentheses) would resolve this issue.
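
To make the ambiguity concrete, here is a minimal sketch in standard XQuery. The expression is read as a computed element constructor, so it cannot also be parsed as the path step `element` followed by a BaseX-style `update {}` block:

```xquery
(: Computed element constructor: creates an empty element named "update". :)
(: Under the BaseX syntax, the same tokens could also mean "apply an      :)
(: empty update block to the child elements named 'element'".             :)
element update {}
```

In standard XQuery, this evaluates to `<update/>`.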

XQUF: Restrictions

The XQUF syntax is very powerful, but it has some restrictions that require the use of FLWOR expressions when addressing multiple nodes. For example, the following statement is illegal…

replace node //village with <village/>

…if the target is not a single node, which means that you have to write…

for $v in //village
return replace node $v with <village/>

…or…

(: only supported in BaseX :)
//village ! (replace node . with <village/>)

I’m pretty sure it would be safe to drop the restriction, which also exists for other update expressions, such as insert nodes NODES into SINGLE-NODE or rename node NODE as 'NAME' (delete nodes NODES is legal). Allowing multiple targets would greatly reduce the number of iterations required within update blocks in practice.
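
For illustration, a minimal sketch of the same workaround for `rename node`, written in standard XQUF 1.0 syntax. The sample document bound to `$doc` is hypothetical and only serves the example:

```xquery
(: Sketch only: the input document is a made-up example. :)
copy $doc := <country><village/><village/></country>
modify (
  (: "rename node" accepts a single target node, so each match is :)
  (: visited explicitly with a FLWOR expression.                   :)
  for $v in $doc//village
  return rename node $v as 'city'
)
return $doc
```

This returns `<country><city/><city/></country>`. If multiple targets were allowed, the loop could collapse to a single rename expression.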

XQuery Update light

I think the new update syntax should meet the following requirements (among others):

  • Compatible with the XQUF node semantics.
  • Similar syntax for supported input types.
  • Chaining of update operations.

First, we would need to decide on a syntax that would be applicable to both maps/arrays and nodes. We could:

  1. Build on the proposal in #832, which introduces a new syntax for maps and arrays, and extend it for nodes:
update map   INPUT-MAP   { ... }
update array INPUT-ARRAY { ... }
update node  INPUT-NODE  { ... }
  2. Build on XQUF 3.0:
INPUT-MAP   transform with { ... }
INPUT-ARRAY transform with { ... }
INPUT-NODE  transform with { ... }
  3. Build on BaseX (allowing multiple input items and chains):
INPUT-MAPS   update (...) update (...)
INPUT-ARRAYS update (...) update (...)
INPUT-NODES  update (...) update (...)

Syntaxes 2. and 3. are challenging, as the type of the input can only be determined at runtime (and for XQUF it has to be determined statically whether an expression is updating or non-updating).

As we currently have a proposal for 1., I will stick to that syntax, but allow an optional plural form for map and array (inspired by XQUF), and use chains. Within the update block, we could now also use the short syntax for nodes, without the node/nodes keywords:

update map $country-map {
  delete ??entry:city
},
update maps $country-maps {
  rename ?entry:village as 'city'
},

update node $country-node {
  delete //city
},
update nodes $country-nodes {
  insert <lakes/> into .,
  insert <mountains/> into .
} {
  insert <lake/> into //lakes
}

Semantics

  • Note that for XQUF update expressions it makes a difference whether multiple expressions are defined within the same block or in a subsequent block, which is why I think chains are essential.
  • Even though the syntax would be similar for node and map/array updates, the underlying semantics would differ a lot. Users would not need to care about this too much, however: Node updates would rely heavily on XQUF, whereas map/array updates would be based on the new proposal.
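
The difference between one block and a chain can be sketched in standard XQUF 1.0 syntax. The `<country/>` input is a hypothetical example. All updates in a single modify clause are collected and applied together at the end, so a second insert in the same clause would not yet see the new `lakes` element. Two nested copy/modify expressions are needed instead, which is what a chained block would express more compactly:

```xquery
(: Sketch only: the input element is a made-up example. :)
copy $c1 := <country/>
modify insert node <lakes/> into $c1
return
  (: The second step operates on the result of the first one, :)
  (: so $c2//lakes now selects the element inserted above.    :)
  copy $c2 := $c1
  modify insert node <lake/> into $c2//lakes
  return $c2
```

This returns `<country><lakes><lake/></lakes></country>`. With both inserts in the same modify clause, `//lakes` would select nothing and no `lake` element would be added.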

I’m looking forward to everyone’s opinions.

@ChristianGruen ChristianGruen added XPath An issue related to XPath Discussion A discussion on a general topic. labels May 18, 2024
@michaelhkay
Contributor

Yes, this probably needs to be addressed. I had been avoiding it, I admit, (a) because it's hard enough to get update for maps and arrays right without considering nodes at the same time, and (b) because defining XQUF 3.0 proved so troublesome. (In retrospect, though, I think that's because it was trying to achieve things that aren't actually needed).

@ndw ndw added PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-optional Categorized as "optional for 4.0" at the Prague f2f, 2024 labels Jun 5, 2024
@ChristianGruen ChristianGruen added the Propose for V4.0 The WG should consider this item critical to 4.0 label Jun 5, 2024
@ChristianGruen
Contributor Author

I’d like us to be strict about this requirement: As long as we consider XML to be a primary language feature, we should focus on it first. I would be happy if we could discuss my proposals when I’m back.

If we believe that XQUF 3.0 has taken the wrong direction, we can build on version 1.
