Skip to content
This repository has been archived by the owner on Oct 4, 2020. It is now read-only.

toUnfoldable leaks structure #135

Open
matthewleon opened this issue Jan 17, 2018 · 7 comments
Open

toUnfoldable leaks structure #135

matthewleon opened this issue Jan 17, 2018 · 7 comments

Comments

@matthewleon
Copy link
Contributor

matthewleon commented Jan 17, 2018

toUnfoldable currently leaks structure by doing a breadth-first traversal of the Map.

Potential solutions, subsequent to benchmarking:

  1. If it turns out that toUnfoldable really isn't faster than toAscUnfoldable, then toUnfoldable = toAscUnfoldable should be a simple fix. It would break anything that was depending on the order of toUnfoldable, which is not what people should be doing. Maybe you also deprecate toAscUnfoldable.

  2. Put some sort of constraint on toUnfoldable to ensure that it unfolds to a structure where ordering is not important?

  3. Place a strongly worded, ugly warning comment on toUnfoldable.

Other ideas?

@matthewleon
Copy link
Contributor Author

matthewleon commented Jan 17, 2018

There are some benchmarks here showing toUnfoldable to be significantly faster than toAscUnfoldable: #79 (comment)
Will try to run these again, since there have been some compiler changes since that time, to see if there is still a significant difference.

@matthewleon
Copy link
Contributor Author

I've done some benchmarking:

Sadly, toAscUnfoldable is over 10% slower than toUnfoldable on larger inputs. Example:

toUnfoldable: big map (1000000)
mean   = 805.20 ms
stddev = 91.14 ms
min    = 687.07 ms
max    = 927.82 ms

toAscUnfoldable: big map (1000000)
mean   = 910.99 ms
stddev = 6.16 ms
min    = 902.18 ms
max    = 920.25 ms

My inclination is to say that this means that toUnfoldable is a function that still has its uses in cases where the user doesn't care about order. There just needs to be a constraint or strong warning of some sort. @hdgarrood any thoughts here?

@matthewleon
Copy link
Contributor Author

Sorry, forgot to link to my benchmark branch: https://github.com/matthewleon/purescript-maps/tree/toUnfoldable-bench

@hdgarrood
Copy link
Contributor

My inclination is to say that this means that toUnfoldable is a function that still has its uses in cases where the user doesn't care about order. There just needs to be a constraint or strong warning of some sort.

Yes, I think this makes sense. It's a breaking change, but I'd like to have toUnfoldable be the 'safe' one, i.e. the one which gives you back results in ascending order of keys, and the current toUnfoldable be renamed to something like toUnfoldableUnsafe with a note that depending on the order of the structure returned can result in unpredictable behaviour, and in most cases you should use toUnfoldable instead.

@matthewleon
Copy link
Contributor Author

matthewleon commented Jan 17, 2018

a note that depending on the order of the structure returned can result in unpredictable behaviour, and in most cases you should use toUnfoldable instead.

I have to say that I really like the idea that a constraint could be used for something like this, too, in the same way that Partial goes a step further than simply putting a warning in docs. Now I imagine situations like this one are pretty rare, so it might not be worth the trouble, but what would a "this is only safe when order is irrelevant" constraint look like?

@hdgarrood
Copy link
Contributor

Right; it's difficult for me to imagine what that would look like, though. One way to expose this safely would be to define something like foldCommutative :: forall m k v. Commutative m => Monoid m => (k -> v -> m) -> Map k v -> m with implementation \f -> foldMap f <<< (toUnfoldableUnsafe :: _ -> Array _)... although if you are dealing with a performance bottleneck you're probably going to want to avoid foldMap anyway.

@matthewleon
Copy link
Contributor Author

One way to expose this safely would be to define something like foldCommutative :: forall m k v. Commutative m => Monoid m => (k -> v -> m) -> Map k v -> m with implementation \f -> foldMap f <<< (toUnfoldableUnsafe :: _ -> Array _)... although if you are dealing with a performance bottleneck you're probably going to want to avoid foldMap anyway.

As a thought experiment, I'm wondering if this would necessarily be true with the right compiler optimizations... I would naïvely think that the compiler might to be able to optimize a sum written in terms of foldMap to be (near-)equivalent to its foldl version. As Int under addition is a Commutative Monoid, you would then be able to realize the perf benefits of unordered traversals.

matthewleon added a commit to matthewleon/purescript-maps that referenced this issue Jan 18, 2018
addresses purescript-deprecated#135

toAscUnfoldable becomes a deprecated equivalent to toUnfoldable. We
introduce `unsafeToUnfoldable` for the unsorted traversal.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants