Use `permute' or operator[] as the name of permutation functions? #55

danieltowner · 2023-03-15T15:02:12Z

Re: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2664r1.pdf

The basic proposal is to add 3 permute overloads to cover the common permutation cases:

outputSimd = permute<4>(inputSimd, [](auto idx) { return idx % 1; }); // Statically generated permute
outputSimd = permute(inputSimd, indexSimd);                           // simd lookup by runtime index
outputSimd = permute(inputSimd, mask);                                // Extract active elements and store contiguously

The first use probably has to stay since there needs to be some way of passing in the size of the desired output simd.

Should the other two uses overload operator[]?

outputSimd = inputSimd[indexes];
outputSimd = inputSimd[simd_mask];

If they do change operator[] then the changes would need to be rolled into std::simd, but providing permute instead would make this an extension on top of std::simd.

The text was updated successfully, but these errors were encountered:

mattkretz · 2023-03-15T16:11:17Z

I hope we'll get a simd<T> operator[](contiguous_range auto const&, simd<U> idxs). In that case I believe we want to have the following consistency:

void f(std::vector<float> const& data, simd<float> v, simd<int> idxs) {
  simd x = data[idxs];
  simd y = v[idxs];
}

So simd<T>::operator[](simd<U>) makes a lot of sense to me. It seems to be a must-have.

OTOH, do we want the following?

void f(std::vector<float> const& data, simd<float> v, simd_mask<float> mask) {
  simd x = data[mask];
  simd y = v[mask];
}

Subscripting a contiguous range with a simd_mask? That doesn't seem right. Also subscripting via bool is not an analogue operation we could refer to. I think a "compress" operation should neither be called permute nor operator[].

I have a question that your paper doesn't appear to address: What about inserting zeros into the output? Like if I want to permute [1, 2, 3, 4] into [2, 0, 4, 0]? I can always blend with a mask. But it's not uncommon that the positions of the zeros is an intrinsic property of the algorithm and could trivially be specified in the index mapping function. E.g. permute(v, [](int idx) -> int { return bool(idx & 1) ? idx ^ 1 : std::simd_permute_zero; }), where simd_permute_zero could be INT_MIN?

Another question: Have you considered indexing from the end? E.g. permute(v, [](int idx) -> int { return -1 - idx; }) would return reversed v. I'm not saying that I want it. But I believe it's part of the design space that the paper should cover.

danieltowner · 2023-03-15T18:03:40Z

I think subscripting a contiguous_iterator with a mask should be handled by an overload with a better name, like std::copy_if.

At the moment I handle the insertion of zeros likes this:

permute<simd<float>::size>(concat(values, simd<float>()),
                           [](auto idx) { return idx % 2 ? idx : simd<float>::size;});

I don't like making the indexes have special values to mean different things, like insert-a-zero or index-from-the-end.

danieltowner · 2023-03-15T18:13:18Z

For convenience, is it worth also having a C++-23 multi-index overload of operator[] to allow this to be written:

outputSimd = inputSimd[3, 5, 1, 7, 5];

This would be equivalent to:

constexpr simd indexes = [3, 5, 1, 7, 5];
outputSimd = inputSimd[simd_indexes];

(or with variables in there too)

mattkretz · 2023-03-17T12:31:59Z

is it worth also having a C++-23 multi-index overload of operator[]

This is a good question and belongs into the paper. As well as an exploration and your/our recommendation.

At this point [1, 2] has only one meaning in standard C++: that of mdspan. I.e. it returns a single element. Using the same syntax to mean shuffle and potentially at some other place to mean subrange is going to be inconsistent. inputSimd[simd{3, 5, 1, 7, 5}] would work (given #24) and still rather concise.

mattkretz · 2023-03-17T13:07:47Z

In order to write generic permutation mappings the mapping function needs the size of the input simd.
E.g. to write a generic broadcast_last mapping:

constexpr auto broadcast_last = [](unsigned idx) { return size - 1; };

But size isn't known at this point as it depends on the first permute parameter. We could add a second (optional) function argument for size. What do you think?

danieltowner · 2023-03-17T13:20:52Z

I had thought about doing that, but generally I've found that the size of the first parameter is often readily available in the context of its use anyway. For example, my reverse function:

template<typename _T, typename _Abi>
auto reverse(const simd<_T, _Abi>& v)
{
    return permute(v, [](auto idx) { return (simd_size_v<_T, _Abi> - 1) - idx; });
}

To start with I think we leave it without a size parameter and get more user feedback.

mattkretz · 2023-03-17T13:21:54Z

Right, but how do you write a generic / named index mapping function for reverse? Like the dupEven etc. examples in P2664?

mattkretz · 2023-03-17T13:40:52Z

I've just prototyped the permute functions and a few pre-defined index mappings: https://github.com/mattkretz/simd-prototyping/blob/main/permute.h
It detects whether the mapping function accepts a second size parameter and optionally passes the simd::size.

mattkretz · 2023-03-17T13:42:39Z

Oh, and for good measure, I also implemented setting negative indexing (from the end) and setting values to zero via the mapping. It comes at no runtime cost, so it's really just a question of usefulness / complexity.

mattkretz · 2023-03-17T13:52:13Z

Just realized: since I allow negative indexing, the size parameter is not necessary for any of the mappings I pre-defined. However, if you want to define a "interleave low and high halves" index mapping, then you need to know size/2. So there's still a use case.

danieltowner · 2023-03-17T16:48:52Z

Right, but how do you write a generic / named index mapping function for reverse? Like the dupEven etc. examples in P2664?

Do we want to provide pre-defined utility lambdas that we can hand over to permute:

permute(value, std::simd_getEven);

or do we define a top-level function which does invokes permute with a local lambda which can get the information about size from its own context:

auto getEven(simdable v) {
    return permute<decltype(v)::size/2>(v, [](auto idx) { return idx * 2; });
}

If we want the former, than having the size parameter is useful, and gives something very flexible. But the second creates a much tighter interface which makes it more obvious how to use it, and consequently harder for mistakes to creep in (e.g., getEven needs an output size which is no greater than half the input size, but a stand-alone lambda couldn't enforce that).

The function called getEven might not be a good example. Maybe better examples would be if we created functions which mirror their existing counterparts, like std::reverse, std::shift_left, std::rotate, and so on, overloaded for simd, and which are implemented using permute with a local lambda. I think that most users would want something which does the whole job (i.e., named functions), rather than a pre-defined utility lambda that they need to figure out how to call properly.

danieltowner · 2023-06-07T08:09:40Z

Oh, and for good measure, I also implemented setting negative indexing (from the end) and setting values to zero via the mapping. It comes at no runtime cost, so it's really just a question of usefulness / complexity.

This is what is in P2664R3, but I have had users feed back that they don't like negative indexes. Errors in computing the index (e.g., off-by-one) can result in apparently valid negative indexes which don't get flagged up at compile time and have to be caught with run-time testing. Also, having the size passed in (also in P2664R3) means redundancy in having two mechanisms for getting indexes from the end.

I propose that we keep the optional size parameter but drop negative indexes. Indexes will only ever be in the range [0..size), or the special zero and uninitialised constants, which is easy to check at compile time.

danieltowner added this to the P2664R2 milestone Mar 15, 2023

danieltowner added this to std::simd in C++26 Mar 15, 2023

danieltowner moved this to In Progress in std::simd in C++26 Mar 15, 2023

danieltowner self-assigned this Mar 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use `permute' or operator[] as the name of permutation functions? #55

Use `permute' or operator[] as the name of permutation functions? #55

danieltowner commented Mar 15, 2023

mattkretz commented Mar 15, 2023 •

edited

Loading

danieltowner commented Mar 15, 2023 •

edited

Loading

danieltowner commented Mar 15, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

danieltowner commented Mar 17, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

danieltowner commented Mar 17, 2023

danieltowner commented Jun 7, 2023

Use `permute' or operator[] as the name of permutation functions? #55

Use `permute' or operator[] as the name of permutation functions? #55

Comments

danieltowner commented Mar 15, 2023

mattkretz commented Mar 15, 2023 • edited Loading

danieltowner commented Mar 15, 2023 • edited Loading

danieltowner commented Mar 15, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

danieltowner commented Mar 17, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

mattkretz commented Mar 17, 2023

danieltowner commented Mar 17, 2023

danieltowner commented Jun 7, 2023

mattkretz commented Mar 15, 2023 •

edited

Loading

danieltowner commented Mar 15, 2023 •

edited

Loading