-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ordered Maps: Updates #1656
Comments
Examples for updating ordered maps with existing syntax: Insert an entry at index 10 (could be
|
I can't really comment on #1225 it requires too much local knowledge. For maps, lets assume, for the moment, maps are ordered. we have this
which presumably constructs a map, based on the order of the sequence of maps (and their inherent order) then
presumably these obey/preserve order then we have the alleged awkward squad
ok, this what i'd call update/insert, and for me the 'obvious' behaviour is it overwrites the existing entry (in its ordered position) OR it adds on on the end. I accept IF you are interested in order, then that's probably not very flexible, but I think losing the order completely is at best confusing, but then there is the question of how to insert things in order. well we already have merge and of-pairs, so we just need a way to create the parts of the map, for an insert, you need the bit before the place you want to insert and the bit after. I can't see how I would use filter to insert an item at the n'th position, the index isnt in the filter predicate, and to be frank, if it was, it feels a little clunky, I'd have to filter the map after the position i want to insert, and then before, and then merge. I think splitAt is a bit cleaner, an order is really just something that maps to the natural numbers (i.e. an index) so inserting etc at a position, is really just about specifying indexes. P.S.
seems a more awkward problem, 2 maps with different orders but otherwise identical, for me are not equal, but I sense that may be a problem. |
actually I think your position example is probably good enough, and splitAt is an unnecessary luxury (unless there is some big performance benefit that can be gained by having a dedicated function to split, I sense it could be a common task, and the maps can be large) what do you think about deep-equals? |
Your perspective is certainly interesting to hear. For me, it is the other way round: I would be surprised if someone used From an implementer’s perspective (but this should not be the deciding factor), one has certainly more freedom to decide what is going to happen internally if orderedness after updates is undefined.
I have edited my comment to show how As far as |
let me appeal to principle rather than intuition For me the driving motivation here is to derive a map that is "pure" especially in the context of me constructing and testing JSON. I actually don't care what the order is, as long as its consistent, and it being inconsistent is an (seemingly unnecessary) impure side effect, that causes me no end of pain when I test (thus I have to impose an order in the tests, to make the code effectively pure again). So for me a So I cannot reason about the code, my tests can't either. Intuitively the word "put" is just a word, if I rewrote your statement like this
this statement now seems highly dubious, yet all I've done is change the label 'put' to 'append', ok, we don't want to change the label in the language for backwards compatibility reasons, but interpreting 'put' as effective "update or append" rather than its current "update or inject (at random)" seems to solve the issue without causing any real harm, its just the same as the old put, except it has a well defined position, no previous implementations will break (based on the spec) because no order was ever previously defined, if like you, I didn't expect it to have a defined order, I wouldn't care, having an order does not break logic that expects no consistent order. So I'm struggling to see a problem. The cost is to the implementor of the language, but it doesn't seem (naively) onerous. For the deep-equal issue, I have reservations about the whole notion of deep-equals, but intuitively if I compare 2 lists in a different order, then they are not equal, even if the entries are the same, this seems to be the same principle, but I accept there is a backwards compatibility cost to that, for me deep-equals is the wrinkle in this proposal. (it would be solved by having ordered and unordered maps as separate data types, my observation is I would then no longer ever use unordered maps, and really for me this would just be a new version of map and thus a versioning issue, which I would happily accept, I don't share much of the backwards compatibility concerns) |
@MarkNicholls Thanks for your thoughts. Immutable/persistent maps are more complex than their mutable counterparts, and supporting efficient updates with a notion of orderedness makes it even harder. It would be interesting to get feedback from other implementers on how far we should go. |
yep I'm aware of the issues of persistent data structures, for me, the specifics of what the order is in a put is not important, whats important is its defined, so it can be the easiest option. |
I believe that the primary purpose of making maps ordered is for human readability of serialized content, not to provide a way of capturing additional information, and we should design the feature with this as the primary goal. That's why, for example, I proposed that deep-equal() should ignore order. I'm therefore not in favour of introducing additional features for manipulating maps as sequences. There's always the fallback of using As for implementation, I found that on the Java platform there's a persistent ordered map implementation in the VAVR library. I couldn't find one on .NET, but I've implemented my own by running a two-way linked list through the entries of a non-ordered I think we should allow bulk map creation operations like map:build and parse-json the options of either retaining or not retaining input order. To me it then seems natural that if you built the map retaining order, then put() and remove() should also respect the order, while if you built it as unordered, put() and remove() shouldn't impose an order on the result. That goes back to the idea that maps have orderedness as a property that affects the behaviour of put() and remove(). |
I cant see a C# or an F# one. |
I spent some time analyzing the existing uses of maps: In most cases, maps are only created once. We will probably offer only ordered maps. Internally, we will use a mutable map implementation for new maps, as it seems to be much faster than all immutable implementations that I encountered so far. At the moment, our current HAMT extensions for ordered entries have a similar weak spot like VAVR’s LinkedHashMap, which is the linear worst-case complexity for put operations. Maybe we will tackle this by keeping the ordered entry list unchanged, but remembering updated (and possibly deleted) entries in auxiliary data structures, expecting that those do not get too large. If they do, the map can be cleaned up and reorganized. |
After further tests and experiments, I have come to the opinion that we should drop unordered maps completely:
We could still provide options to disable orderedness, but I wonder who will use it? It seems like a fairly technical tweak that may only affect specific implementations, so maybe it should rather be a vendor-specific feature than part of the standard. I would still love to learn more about large XDM maps that are used in practical applications. The most impressive use cases I have seen so far did not process more than 10 million entries. For testing purposes, I have successfully created created, updated, serialized/stored and parsed/retrieved maps with ten times the amount of data in a reasonable amount of time. With regard to the update operations, I still think it would be helpful to have functions to control the insertion position, but I also think that many other language features should have a higher priority at this stage. @MarkNicholls With regard to |
I tend to agree with the comment about unordered maps, having a map with an unspecified order I wince at (I'd rather it defaulted to order by key, which tends to be out the box), but if the cost is minimal to adopt write sequence ordered maps everywhere then that's may just be simpler all round. |
Overriding is the most efficient, since it does not have to change the key storage |
That's a gross simplification, it depends entirely on the data structures that are chosen. And a |
There will certainly be cases where the cost is noticeable. Certainly the cost in memory consumption, which can be very significant for some applications. I certainly think I owe it to my users to provide an option to drop ordering for cases where it's an unnecessary overhead, though I don't object to ordering by default. |
After having played around with different approaches, I am pretty convinced now that the cost (both in terms of runtime and memory) depends a lot on the implementation, and I think the most important question rather is if we want to confront vendors with the effort involved in finding or writing an efficient implementations. It reminds me a lot of the XQuery
We believe to have found an efficient solution for both approaches (replace vs. delete/insert). From a user point of view, a replace would probably be most intuitive, as I assume that hardly anyone is available of the fact that |
I would much prefer to see either:
My primary concerns here are:
Regards performance, I am aware that there are various persistent and non-persistent data structures for Immutable Maps that may offer ordering. However I suspect that these can never be as efficient as a simple mutable Open Hash Map using something like Robin Hood hashing. I would expect an immutable structure, to exhibit worse CPU cache locality and increase the amount of GC required. I am willing to be corrected though. |
@adamretter Thanks for sharing your thoughts.
My observation is that the general overhead for immutability is much more significant than making an already immutable data structure ordered. One tweak that we have introduced just recently is to:
|
If maps are updated, insertion/deletion order may be an issue, even more if maps will be ordered by default (#1651).
This topic needs to be discussed in more depth before we take any actions.
Personally, I think we should focus on XML updates first (related: #1225).
The text was updated successfully, but these errors were encountered: