Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add iterable concept #219

Open
wants to merge 45 commits into
base: main
Choose a base branch
from
Open

Add iterable concept #219

wants to merge 45 commits into from

Conversation

tcbrindle
Copy link
Owner

This PR adds a new iterable concept and updates various algorithms and adaptors to use it.

An iterable is, as the name suggests, something we can iterate over -- specifically, by using internal iteration. The required interface is:

struct sequence_traits<T> {
    auto element_type(T& iterable) -> E;

    template <typename T, typename Pred>
        requires std::predicate<Pred&, E>
    auto iterate(T& iterable, Pred&& pred) -> bool;
};

This is basically the existing for_each_while abstracted out into its own concept, returning a bool to indicate whether iteration was completed successfully rather than a cursor.

Once iterate() has returned, it is unspecified whether a second call to iterate() will start again from the beginning, carry on where it left off, do nothing, or do something else. Iterables are strictly weaker than sequences: that is, every sequence is an iterable, but not vice versa. Nonetheless, a large number of algorithms work on iterables (just about anything that can be written as a short-circuiting fold), and most of the existing sequence adaptors can work on iterables as well.

Since iterables only use internal iteration, there is no external state (i.e. a cursor) which could be invalidated. Not only does this make the iterable interface easier to implement for some types, but most importantly it means that we can provide a safe default implementation of the iterable protocol for all C++20 ranges.

This dramatically improves our interoperability with ranges, while the use of internal iteration means that converting ranges pipelines to the equivalent Flux code may yield some nice performance benefits.

Copy link

codecov bot commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 97.89790% with 7 lines in your changes missing coverage. Please review.

Project coverage is 98.40%. Comparing base (f539c11) to head (b5587e2).

Files with missing lines Patch % Lines
include/flux/algorithm/starts_with.hpp 75.00% 4 Missing ⚠️
include/flux/adaptor/scan.hpp 98.03% 1 Missing ⚠️
include/flux/core/concepts.hpp 91.66% 1 Missing ⚠️
include/flux/core/sequence_iterator.hpp 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #219      +/-   ##
==========================================
- Coverage   98.60%   98.40%   -0.21%     
==========================================
  Files          69       70       +1     
  Lines        2578     2629      +51     
==========================================
+ Hits         2542     2587      +45     
- Misses         36       42       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

An iterable is an object which can perform internal iteration, accepting a predicate which indicates if the iteration should terminate early.

Iterables need to supply an `iterate(it, pred)` function, and an `element_type(it)` function whose return type indicates the element type of the sequence. (For C++-y reasons this is a function rather than a plain alias like `using element_type = int`).

Every sequence is iterable, but we can also have iterables which are not sequences. In particular, every range can safely be iterable, because iterators are never exposed and so we never need to worry about invalidation, i.e.

```cpp
auto iterate(std::ranges::input_range auto&& rng, auto Pred) -> bool
{
    for (auto&& elem : rng) {
        if (!pred(FWD(elem)) { return false; }
    }
    return true;
}
```
Contiguous + sized ranges are still contiguous, sized, bounded sequences
Seems like a good place to start
...rather than sequences.

These are the algorithms which perform a single pass from the start of a sequence up to some end point, without needing to restart again.

Which is actually a suprising number of algorithms.

(Although not flux::to yet, because that's more complicated.)
Also const_iterable_sequence -> const_iterable

And make `flux::ref` work for iterables rather than just sequences
The current std::invocable and std::regular_invocable concepts are a bit of a pain to use. This commit adds three new callable concepts instead, each of which is specified with a signature argument:

 * `func_once` is a callable that will be invoked at most once

* `func_mut` may be called more than once, and may hold and modify internal state

* `func` may be called more than once, but may not modify state: it must be equality preserving, like `std::regular_invocable`

Rather than variadic template arguments, these concepts use function signatures instead, so you say something like

    template <typename F>
        requires func<F, void(int, float)>
    auto g(F f);

or just

    auto g(func<void(int, float)> auto f)
We already use this in the documentation, it's probably a good idea to make it exist in reality too
Like `adaptable_sequence`, this accepts either rvalue iterables or trivially copyable lvalue iterables, and (as the name suggests) is intended for use in sink functions, allowing the user to pass (presumably) cheap-to-copy types effectively by value, while requiring an explicit copy or move for things like std::vector.
This is a lot of changes
...and also filter_deref()
That was actually easier than expected
That was more work than expected
The final boss...

In theory, we could make `cartesian_product` and `cartesian_product_map` iterable when the first argument is a non-sequence iterable, but that doesn't seem worth the effort.
Somehow the changes in this PR triggered an ICE in GCC 12 while compiling `reverse_adaptor`... even though we haven't actually made any changes to it at all (other than changing the name of a function).

It's a mystery, but hopefully no longer a problematic one.
For some reason MSVC really doesn't like the

    static consteval auto element_type(auto& self)
        -> element_t<decltype((self.base_))>;

formulation that we've been using in various adaptors. Fortunately, the very first thing I tried, namely using a template parameter rather than an auto param, seems to fix it, i.e.

    template <typename Self>
    static consteval auto element_type(Self& self)
        -> element_t<decltype((self.base_))>;

even though the two formulations should be completely identical in meaning.
The new name is even worse than the old one. I really need to think of something better. Naming is hard...
Sequence flatten requires both outer and inner to be sequences
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant