77 Lookup returning path selection #832
Exciting! And ambitious. Some questions (that could possibly have been answered by reading the rules in more detail):
(: will probably return [ 'x' ] ? :)
array:deep-update(
[ [ '1' ] ],
fn { ?1, ?1?1 }, (: → or, possibly, with #297: ??1 :)
fn { 'x' }
)
(: will probably return map { 'a': 1 } ? :)
map:deep-update(
map { 'a': map { 'a': map { 'a': 0 } } },
map:find(?, 'a'),
fn { 1 }
)
map:deep-update(
map { 'x': (1, 2) },
fn { head(?x) },
fn { 0 }
) |
Q1: I chose to make this an error; you can't select two items A and B if one contains the other. Dropping the update on the child would have been another option. Q2: yes, that example should work fine. |
With XQuery Update, it’s sometimes convenient to be able to write |
Upfront: The Thanks again for the proposal. As I was somewhat thrilled by the powerful semantics of the single update function, I presented this in an internal meeting. The first feedback I got was: Gosh, updates are really becoming cryptic now… One reason why I believe people are fond of XQuery Update is its verbosity (that doesn’t necessarily include myself). For the following XML document…
<xml>
<person><name>John</name><city>London</city></person>
<person><name>Jack</name><city>Manchester</city></person>
</xml>
…there’s hardly a need to explain what the following expressions do:
delete nodes //city,
replace node /xml/person[1]/name/text() with 'Joe',
insert node <country>UK</country> into /xml/person[1]
With the proposed functions, and with the full notation, we would probably get something like…
let $persons := [
map { 'name': 'John', 'city': 'London' },
map { 'name': 'Jack', 'city': 'Manchester' }
]
return $persons
=> array:deep-update( (: delete :)
function($persons) { array:values($persons) },
function($person) { map:remove($person, 'city') }
)
=> array:deep-update( (: insert :)
function($persons) { $persons(1) },
function($person) { map:put($person, 'country', 'UK') }
)
=> array:deep-update( (: replace :)
function($persons) { $persons(1)('name') },
function($person) { 'Joe' }
)
With a syntax as compact as possible, we get:
$persons
=> array:deep-update(fn { ?* }, fn { map:remove(., 'city') }) (: delete :)
=> array:deep-update(fn { ?1 }, fn { map:put(., 'country', 'UK') }) (: insert :)
=> array:deep-update(fn { ?1?name }, fn { 'Joe' }) (: replace :)
This probably looks accessible to people who use lookup operators and function items every day, but it could become a real hassle for more occasional users. One approach to mitigate this could be dedicated functions for deletions, insertions, and replacements:
$persons
=> deep-delete( fn($persons) { $persons?*?city } )
=> deep-insert( fn($persons) { $persons?1 }, fn { 'country' }, fn { 'UK' } )
=> deep-replace( fn($persons) { $persons?1?name }, fn { 'Joe' } )
The syntax could be further simplified by providing a simple sequence of steps to the target nodes, and getting rid of function items, at least for simple operations:
$persons
=> deep-delete( [1 to array:size($persons), 'city'] )
=> deep-insert( [1], 'country', 'UK' )
=> deep-replace( [1, 'name'], 'Joe' )
Remarks:
Maybe we just need both: Simple variants for non-experts, and Personally, I have a hard time convincing myself that annotations can be implemented without compromising the vast majority of our non-updating use cases. Even if the data model isn’t affected, it would seem to affect numerous places in the code that are critical in terms of runtime and memory consumption. A separate path array, as suggested for the simpler function variants, would certainly simplify things a lot. |
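To make the path-array idea concrete, here is a minimal sketch in plain XQuery 3.1 of what a path-driven replacement could look like. The function `local:deep-replace` and its signature are hypothetical and are not taken from the proposal; the sketch only uses the standard `map:put`, `array:put` and `array:tail` functions, and it assumes every step in the path is either a map key or an array position.

```xquery
xquery version "3.1";

(: Hypothetical helper, not part of any spec: follows $path (an array of
   map keys and array positions) into $input and replaces the value found
   there with $new. Parts of the structure not on the path are reused. :)
declare function local:deep-replace(
  $input as item()*,
  $path  as array(*),
  $new   as item()*
) as item()* {
  if (array:size($path) = 0) then
    $new
  else
    let $step := $path(1)
    let $rest := array:tail($path)
    return
      if ($input instance of map(*)) then
        map:put($input, $step, local:deep-replace($input($step), $rest, $new))
      else if ($input instance of array(*)) then
        array:put($input, $step, local:deep-replace($input($step), $rest, $new))
      else
        error(xs:QName('local:path'), 'Path step does not address a map or array')
};

let $persons := [
  map { 'name': 'John', 'city': 'London' },
  map { 'name': 'Jack', 'city': 'Manchester' }
]
(: Replaces the name of the first person with 'Joe'. :)
return local:deep-replace($persons, [1, 'name'], 'Joe')
```

Because the path is passed explicitly, no annotation has to travel with the selected value. The price is that the path can only name one target at a time, unlike a lookup expression such as `?*?city`.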
I see the deep-update function as a primitive onto which we can put XQuery/XSLT syntactic sugar, so I'm not too concerned about the usability of the function. The main reason I did it this way was that it made it easier to define the semantics in a way that was common to XSLT and XQuery. I've certainly considered the idea of restricting the expression that can be used in the An important aim is to ensure that the implementation can use immutable/persistent structures so that parts of the tree that don't change don't need to be physically copied. I've never managed to achieve that with node trees (despite several attempts) because node identity and parentage are so deeply embedded in the data model. I think it's achievable here because identity and parentage are transient. I do recognise that implementation is challenging and that there's a need to demonstrate feasibility with a proof-of-concept implementation. But I think it's a good idea to start with a spec that we would like to be able to deliver to users, and then work out how to solve the implementation problems. |
One potential simplification would be to select (for replacement) only values that are immediate members of an array or entry-values of a map, rather than individual items within those values. The main problem with this is that the lookup operators ( There might also be mileage in borrowing from JSONPath, where the result of a selection expression is not just a value, but a path to a value - equivalent in essentials to the "annotated value" in this proposal, but described rather differently. A related problem is handling entries where the value is an empty sequence. It's hard to select such values using lookup operators, and we can't currently replace an empty sequence by something else. If we replace a non-empty sequence by an empty sequence, it doesn't delete the array member or map entry, it merely sets it to empty. |
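The difference between setting an entry to the empty sequence and deleting it can be seen with the existing XQuery 3.1 map functions. This is only an illustration with made-up data, not part of the proposal:

```xquery
xquery version "3.1";

let $m := map { 'a': 1, 'b': 2 }
let $emptied := map:put($m, 'b', ())   (: entry 'b' stays, its value is () :)
let $removed := map:remove($m, 'b')    (: entry 'b' is gone :)
return (
  map:contains($emptied, 'b'),  (: true :)
  map:size($emptied),           (: 2 :)
  map:contains($removed, 'b'),  (: false :)
  map:size($removed),           (: 1 :)
  empty($emptied?b),            (: true :)
  empty($removed?b)             (: true: the lookup alone cannot tell the two cases apart :)
)
```

The last two results show why lookup operators struggle here: both maps return an empty sequence for `?b`, although only one of them still has the entry.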
Note that the proposal could be considerably simplified if we were to adopt the suggestion in #298 (comment) that arrays should be treated as a subtype of maps. |
I'm optimistic about the chance to avoid full copies and reference unmodified parts of the original map/array as that's a general requirement for immutable data structures – and I agree it's not applicable to XML node trees unless we drop node identities. My main concern is about the selection process, though. The annotation/pedigree approach reminds me of something we have done for XQuery Full Text scoring in the past ( My conclusion was that properties that can get lost unnoticed during processing (possibly via distinct-values, sorts or casts, in our case) – such properties shouldn't exist at all in a formally strict language; they should rather be made explicit. I think the XQUF spec is a valuable document to learn about various obstacles users might encounter for map/array updates. Luckily we don't need to care about different node types, namespaces or attributes vs. texts, but there are good reasons why more than one update primitive was introduced (technically, it would certainly have been possible to define fewer primitives). With deep-update( fn { ?1 }, fn { array:remove(., 2) } )
vs deep-delete( fn { ?1?2 } )
or deep-delete( [1, 2] ) |
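For comparison, the same deletion (second member of the first member) can already be written with standard XQuery 3.1 functions. The sample data is an assumption made for illustration:

```xquery
xquery version "3.1";

let $data := [ [ 'a', 'b', 'c' ], [ 'd' ] ]
(: Rebuild the first member without its second member, then put it back. :)
return array:put($data, 1, array:remove($data(1), 2))
(: → [ [ 'a', 'c' ], [ 'd' ] ] :)
```

This works, but the path to the target has to be spelled out twice (once to read, once to write), which is the repetition a dedicated delete primitive would avoid.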
<fos:expression><eg>map:deep-update( | ||
map{'x':map{'p':1, 'q':2}, 'y':map{'p':11, 'q':12}}, | ||
fn{map:find(.,'p')?*}, | ||
f{.+5})</eg></fos:expression> |
f{.+5} -> fn{.+5}
Thanks to the re-enabled previews; I’ve just spotted the new Our experience with XQUF shows that chains/pipelines of updates are important, if not essential. We should understand what this would look like with the new syntax. I’ve seen the following example:
let $temp :=
update map $data
replace ??Paris
with array:append(map{"to":"Frankfurt", "distance":480})
return update $temp
replace ??cities
with map:put("Frankfurt", [map{"to":"Paris", "distance":357}])
I assume it’s supposed to be We should head for a flat syntax; maybe something like:
update { $data } {
replace value ??Paris with array:append(., map { "to": "Frankfurt", "distance": 480 }),
for $city in ??cities
return replace value $city with map:put($city, "Frankfurt", [ map { "to": "Paris", "distance": 357 } ])
} {
delete values ??phone
}
And I bet that XQUF users will demand a similar syntax for nodes. In principle, the same construct could be used to wrap existing XQUF expressions:
update { $node } {
for $n in .//*[. = 'Paris']
return insert nodes (<to/>, <distance/>) into $n
} {
delete nodes .//phone
} |
Yes, doing the examples led me to similar conclusions. But I'm inclined to get something simple working first. It's clearly already quite powerful. That's why I stuck with "replace" for the time being, though it would clearly be useful to also have insert/remove. I'm not comfortable using "," between expressions for something other than sequence concatenation - though I know that's what we do in function calls! |
…and for basically all other operations that cannot be represented by a single expression ;·) In non-trivial XQuery code, it’s very common indeed. |
@michaelhkay A remark regarding test cases: Would it be possible to create PRs in the qt4tests repository for tests of features that haven’t been added to the specification yet, and only merge those once the PR has been finalized and accepted? |
Yes, it would probably be a good discipline to start using branches and PRs on the test suite. |
Glad to hear. To avoid too much overhead, I believe it’s fine to just merge things that have already been accepted by the group. |
I have revised the proposal very substantially. The changes also impact on the semantics of deep lookup, and the concept of labels. An update expression can now have multiple clauses, including delete, replace, and extend sub-operations. The creation of parent, ancestor, and root functions is now built-in to the deep lookup operator when the The semantics of the update expression are now described formally in terms of an XSLT-style recursive-descent transformation of the tree of maps and arrays. There's a glitch I still need to fix - multi-step lookup expressions like ??x??y don't retain ancestry information all the way back to the root. But I think the proposal is now in a state where detailed review is very welcome. |
Very interesting again; it will take me a while to give acceptable feedback. |
Some feedback (not as comprehensive as I had hoped):
update map $map {
delete (
let $root := .?entry::sub?parent()
let $sub := $root?sub
return $sub ! (?entry::leav)
)
}
I believe the hardest challenges remain terminology and usability. It doesn’t feel overwhelmingly intuitive that (only?) the last segment of a lookup expression must use the
The terminology of the update clauses is fairly descriptive: “delete Y”, “replace X with Y”. Maybe we should try something similar for the lookup and rename
update map $cities {
delete ?lookup::*??lookup::mail,
delete ?track::*??track::mail
}
An alternative (similar to the ones discussed in previous weeks) would be to get rid of the explicit specifier and implicitly track the path for
update map $cities {
delete .?*??mail
}
…and a possible extension for nodes:
update nodes $cities {
delete ./*//mail
} |
Fix #1247 |
A minor error found in one of the examples: Expression: [ "a", "b", (), "d" ]?value::* The correct result is actually: ( ["a"], ["b"], [()], ["d"] ) |
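The corrected result can be reproduced in plain XQuery 3.1 by wrapping each member in a single-member array. This is a sketch that assumes `?value::*` behaves as the corrected result above suggests; it is not a definition of the operator:

```xquery
xquery version "3.1";

(: Wrap every member, including the empty one, in its own array,
   then flatten the outer array into a sequence with ?* :)
array:for-each([ "a", "b", (), "d" ], function($member) { [ $member ] })?*
(: → ( ["a"], ["b"], [()], ["d"] ) :)
```

The wrapping is what keeps the third member visible: a plain `?*` on the original array would drop the empty sequence and return only three items.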
I think the changes in this PR were all either abandoned, or reworked as part of other PRs. Therefore closing this one with no further action. |
@michaelhkay I think we would need a new version of this PR before we can look at #1283. Things I would expect to be added/revised:
…and various changes proposed for |
Thanks, yes. I have just spent an hour or so reviewing PR #1283 and I agree it has dependencies on changes that need to be reintroduced or revised. |
Note that unlike many of the functions we have added, these are non-trivial: they cannot easily be implemented in XSLT or XQuery.
This is a first cut and I expect some refinement will be needed, but reviews are invited.
I might subsequently propose layering some XSLT syntax on top of this for convenience.