Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module-level factory functions #24

Merged
merged 1 commit into from
Feb 7, 2019
Merged

Conversation

gabbard
Copy link

@gabbard gabbard commented Jan 29, 2019

Partially addresses #12

@gabbard gabbard changed the title Switch immutableset creation to immutableset() factory function [WIP] Switch immutableset creation to immutableset() factory function Jan 29, 2019
@gabbard
Copy link
Author

gabbard commented Jan 29, 2019

This is a start on #12 . A few notes:

  • to test the error messages on different versions of Python, we probably need to switch to testing with tox. I am inclined to push this testing (and therefore testing the order checks for dicts) to a different issues.
  • I think we should factory implementing check_isinstance to its own issue because there are some tricky subtleties there (e.g. if the input is already an immutable set but your requested type is a sub-type of the type recoreded on it, what is the right thing to do?) and no one currently uses it.

@gabbard
Copy link
Author

gabbard commented Jan 29, 2019

@nicomitchell : can you add tests to this for the order check case of AbstractSet?

@ConstantineLignos
Copy link
Contributor

ConstantineLignos commented Jan 31, 2019

Another general comment on something that isn't in the diff so I can't leave a review: can we change iteration_order inside _FrozenSetBackedImmutableSet to just be Sequence[T] and not do any conversion?

Since it's effectively a private constructor, we can trust that the factory method is never sharing the iteration order with the caller, even if it is mutable.

Current implementation is:

    _iteration_order: immutablelist.ImmutableList[T] = attrib(
        converter=immutablelist.ImmutableList.of, cmp=False, hash=False  # type: ignore
    )

EDIT: The downside of this is that as_list can no longer just return the iteration order as-is. Since ImmutableSet[T] is now a Sequence[T], I have no problem with making as_list expensive.

@gabbard
Copy link
Author

gabbard commented Jan 31, 2019

@ConstantineLignos : actually, is there any reason to have as_list() now that we just implement Sequence? I think it was added before that was the case

@gabbard
Copy link
Author

gabbard commented Jan 31, 2019

@ConstantineLignos : I think an event nicer option is for ImmutableSet to just sub-class tuple and skip having a separate iteration_order field (instead passing the order to super). This may cause a slight slowdown on creation, but should save time on access and iteration.

@gabbard
Copy link
Author

gabbard commented Jan 31, 2019

(but this should be checked using benchmarks)

@ConstantineLignos
Copy link
Contributor

ConstantineLignos commented Jan 31, 2019

I'm fine with killing off as_list. If subclassing tuple makes sense, go for it. I'd like to explore the idea of a dict-backed implementation (see #14 (comment)) at the same time as when exploring subclassing tuple. The dict-backed one will should have the advantage of having the backing dict handle both containment and order in the dict itself, so the factory method would be different for maximum efficiency.

(Just in case anyone is wondering, the Python 3.6 dict implementation keeps the order of first insertion as you might expect, just like ImmutableSet, see below.)

>>> list({item: None for item in [4, 3, 2, 1, 4]})
[4, 3, 2, 1]

@gabbard
Copy link
Author

gabbard commented Jan 31, 2019

@ConstantineLignos : I experimented a little with using a dict in the factory method:

unique_values = tuple({ val : None for val in iterable}.keys())
return _FrozenSetBackedImmutableSet(
                frozenset(unique_values), unique_values)
...

In increases the time to initialize from a 3-element collection by ~5% but doubles performance on a 10k-element collection, so I think it is a win.

One challenge with using the dict to actually back the set is that you still need the tuple for iteration order because the Dict's KeyView just extends Collection, not Sequence. So the tradeoff becomes the time to initialize the frozenset for containment checks vs just keeping the dict vs. the (probable) increased memory usage of dict

@ConstantineLignos
Copy link
Contributor

@rgabbard Is there anything we'd need to do before merging if we stick with the frozenset backing right now?

@gabbard
Copy link
Author

gabbard commented Feb 6, 2019

@ConstantineLignos : I have some local changes on this branch to sync up with this before merging. Poke me if I haven't taken care of it by tomorrow morning

@ConstantineLignos
Copy link
Contributor

Sounds good. I'm on vacation at the start of next week so I'd like to get this and isi-vista/vistautils#44 sorted this week before I go.

@gabbard gabbard changed the title [WIP] Switch immutableset creation to immutableset() factory function module-level factory functions Feb 6, 2019
Copy link
Contributor

@ConstantineLignos ConstantineLignos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, one comment to remove. Also, I'm starting to think maybe we can just remove the type-checking features in favor of maybe adding some helpers to vistautils?

immutablecollections/_immutableset.py Outdated Show resolved Hide resolved
@gabbard
Copy link
Author

gabbard commented Feb 7, 2019

I'm all in favor of ditching the type checking (which I think also means, on a future PR, ditching ImmutableList, which no longer offers any advantage over tuple). I'll drop the TODO comments related to it.

@gabbard gabbard force-pushed the 12-immutableset-factory branch from e027933 to 21e9a02 Compare February 7, 2019 17:35
@codecov
Copy link

codecov bot commented Feb 7, 2019

Codecov Report

Merging #24 into master will decrease coverage by 2.16%.
The diff coverage is 85.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #24      +/-   ##
==========================================
- Coverage   91.82%   89.66%   -2.17%     
==========================================
  Files           7        7              
  Lines         575      619      +44     
==========================================
+ Hits          528      555      +27     
- Misses         47       64      +17
Impacted Files Coverage Δ
immutablecollections/_utils.py 100% <100%> (ø)
immutablecollections/_immutabledict.py 83.87% <80%> (ø)
immutablecollections/_immutablelist.py 90.16% <83.33%> (ø)
immutablecollections/_immutablemultidict.py 91.98% <85%> (ø)
immutablecollections/_immutableset.py 88.57% <88%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d861bd6...21e9a02. Read the comment docs.

@gabbard
Copy link
Author

gabbard commented Feb 7, 2019

I'm ignoring codecov's comments about test coverage for now because it is largely due to me switching tests to use the new factory methods, leaving parts of the builder and .of methods untested. This code is slated for deprecation shortly, so I'm not worried about it.

@gabbard gabbard merged commit 46dc162 into master Feb 7, 2019
@gabbard gabbard deleted the 12-immutableset-factory branch February 7, 2019 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants