Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement non-axis windows and chunks iterators #276

Open
3 of 4 tasks
Robbepop opened this issue Mar 14, 2017 · 54 comments
Open
3 of 4 tasks

Implement non-axis windows and chunks iterators #276

Robbepop opened this issue Mar 14, 2017 · 54 comments

Comments

@Robbepop
Copy link
Contributor

Robbepop commented Mar 14, 2017

Already implemented features and todos:

  • ExactChunks and ExactChunksMut aswell as their NdProducer impl
  • RaggedChunks or simply Chunks aswell as their NdProducer impl
  • Windows
  • NdProducer impl for Windows

Implement n-dimensional windows and chunks iterators.

Windows-iterators are especially useful for convolutional neural networks while chunk iterations may prove useful for paralell algorithms.

N-Dimensionality for both iterators use N-dimensionally shaped items.

For example for a 1-dimensional Array the windows and chunks items are also 1-dimensional.
For a 2-dimensional Array the windows and chunks items are 2-dimensional, and so on.

For windows iterator it could be useful to support two different kinds of windows iterators.
One for "valid" windows only that iterate only over windows within a valid region.
For example in a valid-windows iterator over a 10x10 shaped 2d-array and window sizes of 4x4 the iterator would iterate totally over (10-4+1)x(10-4+1) items of size 4x4.
A full-windows iterator would also iterate over the windows with invalid elements (on the edges of the array) and would simulate missing elements with zero elements.

Semi-nice visualization of n-dimensional chunks and windows iterators.

A similar approach could be used for n-dimensional chunks that iterate beyond the array sizes and replace missing/invalid array elements with zero elements.

@bluss
Copy link
Member

bluss commented Mar 14, 2017

That sounds good, and I love a good illustration to go together with expressing an idea.

One concern I have here is that it's probably still not an efficient way to implement convolution, to go through this.

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 14, 2017

That sounds good, and I love a good illustration to go together with expressing an idea.

Thanks! :)

One concern I have here is that it's probably still not an efficient way to implement convolution, to go through this.

True! For performance the most promising implementation for convolution is to use a GPGPU approach. But what if one doesn't want to go that far? The main question for this issue for me is, if it is possible to provide such iterators with equal performance as extern definitions. But as far as I can imagine such iterators would perform much better if implemented internally in this crate with hidden data of ArrayBase.

Also: What implementation would fit best? We could implement such iterator for example by using an item-sized local Array within the iterator to reuse the memory for copying the data out of the nd-array.

Or we could also use iterators over iterators over the nd-array and thus avoid copying of elements completely but I don't know if this approach is faster since I do not know ndarray internals too well. As far as I can think the strike-throughed idea is not feasable if we want ArrayView as items for the new iterators.

@bluss
Copy link
Member

bluss commented Mar 14, 2017

Between as_ptr, as_mut_ptr, and the .axes() iterator (each axis' length and stride), all the information of the raw representation is available, except for issues related to ownership, and they don't matter here. So I think it's possible to do it outside of the crate, but not so natural.

@Robbepop
Copy link
Contributor Author

Between as_ptr, as_mut_ptr, and the .axes() iterator (each axis' length and stride), all the information of the raw representation is available, except for issues related to ownership, and they don't matter here. So I think it's possible to do it outside of the crate, but not so natural.

Okay, good to know about the currently possible "hack", but yeah it wouldn't be a very natural way to encode this behavious and would require unsafe rust to deref the pointers which is unnecessary.

For me both proposed iterators would be great and would be of great use for my projects!

@bluss
Copy link
Member

bluss commented Mar 14, 2017

Sure, the window iterator is a good idea. If I understand correctly, chunks has a chunk size in each dimension? .axis_chunks_iter() currently covers it if you just want to chunk it along one dimension, (using the full width of every other).

I also want to underline that using the unsafe methods or lower level code is not a hack but a feature, we need to allow it so that people can use it to improve their programs if needed.

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 14, 2017

For my proposed idea I cannot use the axis_chunks_iter since it has a different behaviour. The items of the chunks iterator I proposed has the same dimension as the underlying Array. It just acts like the 1-dimensional chunks iterator for slices but for n-dimensions.

For 2-dimensions I could really use this for the pooling function within the convolution.

And for a 3 dimensional array of size 10x10x10 I could for example chunk it with 5x5x5 cubes and would iterate over (10³ / 5³) chunks in total.

For both iterators it should be decided whether we also are in need of full-iterator that (as described) above iterate over the full area, instead of only the valid area. This is possible since in ndarray elements are all NDFloat which is Copy and has a default value 0.0 which we could simply use for invalid elements.

@bluss
Copy link
Member

bluss commented Mar 14, 2017

axis_chunks_iter gives views that have the same dimensionality (1d → 1d, 2d → 2d) though.

@Robbepop
Copy link
Contributor Author

axis_chunks_iter gives views that have the same dimensionality (1d → 1d, 2d → 2d) though.

Okay I will have to take another look into the docs. Maybe I have overseen things and it could really be used to simulate the chunks iterator that I proposed here.

@Robbepop
Copy link
Contributor Author

I have found a semi-nice visualization of the proposed chunks iterator for 2 dimensions here.

@bluss
Copy link
Member

bluss commented Mar 14, 2017

Right, the current axis_chunks_iter cuts stripes, not squares. So they are different in that way.

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 14, 2017

How do you feel about the importance of a chunks iterator facility for ndarray?

@bluss
Copy link
Member

bluss commented Mar 14, 2017

Not important, but we can try to do it

@Robbepop
Copy link
Contributor Author

How do you feel about me trying to do it via PR? (This would probably take a while since I have to wrap my head around the libraries internals.)

@bluss
Copy link
Member

bluss commented Mar 14, 2017

Go ahead.

  1. Remember how windows for slice has no mutable variant? That needs to be true for us too.

  2. Maybe I should tell about the trick that we usually use for iterators: We reuse the basic element iterator (Baseiter) . How to construct a chunks iterator? Make an iterator that visits exactly the top right corner of each chunk (This can be done using slicing with a stride, which we can also emulate manually). For each element, you map it to an ArrayView that starts in that element (its pointer is the arrayview's pointer!)

@Robbepop
Copy link
Contributor Author

Okay I will try! :)

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 22, 2017

@bluss

Today I looked into the ndarray code to wrap my head around the iterator and especially the BaseIterator and ArrayBase code.

It is not yet very clear to me but could it be possible to design a windows or chunks iterator without any copy operations into an iterator-internal owning ArrayBase object by exploiting the strides of the ArrayBase and just giving out cheap instantiations of ArrayViews that have a proper ptr and strides?

I need feedback on this.

@bluss
Copy link
Member

bluss commented Mar 22, 2017

I recognize it's not super easy to implement a new iterator, since it's so much about ndarray implementation details.

I think you would want do do something like InnerIter: have a Baseiter and also extra information. However for chunks/windows it needs to have a whole dimension/stride set, not just a single number (Basically D instead of an Ix). https://github.com/bluss/rust-ndarray/blob/master/src/iterators/mod.rs#L400

@Robbepop
Copy link
Contributor Author

Yep I got that point with the D: Dimension strides for the ArrayView but I was just curious if the whole thing is really implemented in that way as I thought that we have to store an internal copy of all the floats within the current iteration window or chunk but with this architecture all that copying is not needed and thus the whole iteration should be a lot more efficient than I thought in the first place. ^.^

Let's see if I can get it working. :S

@bluss
Copy link
Member

bluss commented Mar 22, 2017

If I think of the chunks iterator and am not mistaken I think that

We use sliced dimension and sliced stride to initialize Baseiter (One can slice in place using the array method islice, but it can be done with lower level functions as well). Each yielded ArrayView will have the dimensions of a chunk (clipped to the full array's boundaries) and strides that are equal to those of the original array(!)

@bluss
Copy link
Member

bluss commented Mar 22, 2017

This code could also be read to understand how to use a slice / subview of an array as a “map” of array views.

Imagine we have an array of N dimensions and want to cut it into separate one-dimensional “sticks” (or “lanes”).

That's what these methods do. Only difference is they use a callback instead of an iterator.

https://github.com/bluss/rust-ndarray/blob/d9492d7b5812e47dfeb23d062f2acd33a62b41cd/src/impl_methods.rs#L1371-L1386

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 24, 2017

I think I understood how I can utilize islice to create the iterators.
On iterator instantiation I create an ArrayView of the n-dimensional Array under iteration and with islice I mutate the internal ArrayView and give out references to it on a call to next.

I could also use slice method to create a new instance of the ArrayView instead of mutating an old one. The advantage here would be that I could easily move the newly created ArrayView on a call to next and would yield values instead of references on iteration which would be nice in case of non-owning ArrayViews.

The part that is currently unclear to me is, how I can perform the n-dimensional iteration of all possible Slices of the chunked or windowed Array. Is there built-initerator functionality for that to yield all possible indices of the array? - Is the Indices iterator the right utility for this approach?

We need an efficient way for iterating over all possible slices or indices. I saw that the iteration Item for Indices is a Vec<usize> which implies tons of potential malloc invokations. We should avoid that I think.

@bluss
Copy link
Member

bluss commented Mar 24, 2017

The iteration Item for Indices is D::Pattern where D is a dimension.

That means if D is Ix2, the item is (usize, usize). If D is IxDyn, dynamic number of axes, the item is indeed Vec<usize>. It's now IxDyn works, it uses a lot of dynamic allocations, but it's not the usual way you'd use ndarray. Ix1, Ix2 etc are the common ones.

@bluss
Copy link
Member

bluss commented Mar 24, 2017

This is maybe not so straightforward, you could leave this to me if you want.

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 24, 2017

That means if D is Ix2, the item is (usize, usize).

That's awesome!

This is maybe not so straightforward, you could leave this to me if you want.

I would like to try it out now if this is okay for you.

So I think I can design Windows like so:

struct Windows<'a, A: 'a, D: 'a + Dimension> {
    /// view of the entire array under iteration
    view   : ArrayView<'a, A, D>,
    /// iterator for all possible valid window indices
    indices: Indices<D>,
    /// the shape of the windows
    shape  : Shape<D>
}

pub fn windows<'a, A: 'a, D: 'a + Dimension>(array: ArrayView<'a, A, D>, shape: Shape<D>) -> Windows<'a, A, D> {
    Windows{
        view:    array.view(),
        indices: indices_of(&array), // restrict array to area for valid windows shapes:
                                     // e.g. array.shape() == [10, 8], window shape is [5, 4], then indices, area
                                     // should be (0..5, 0..4).
        shape:   shape
    }
}

impl<'a, A: 'a, D: 'a + Dimension> Iterator for Windows<'a, A, D> {
    type Item = ArrayView<'a, A, D>;

    fn next(&mut self) -> Option<Self::Item> {
        match self.indices.next() {
            None => None,
            Some(idx) => {
                let slice_arg = slice_at_idx_of_shape(self.shape, idx); // function that converts a shape and a given index into a SliceArg
                Some(self.view.slice(slice_arg)) // return window with given shape at current index
            }
        }
    }
}

Questions for me to solve:

  • How to restrict the indices iteration area to valid windows for given shape s.
  • How to convert a given idx in next into a shape of the given shape `s.

How close or far am I from your theory behind the codes for the windows iterator?

To create a windows-slice for slice method from a given shape Dim and an index (offset) Dim without dynamic memory allocation in the special case of 2 dimensions I came to this code:

fn slice_at_idx_of_shape_2d<D: Dimension>(shape: D, idx: D) -> [Si; 2] {
    [
        Si(idx[0] as isize, Some((idx[0] + shape[0]) as isize), 1),
        Si(idx[1] as isize, Some((idx[1] + shape[1]) as isize), 1)
    ]
}

Provided that this implementation is correct it would be best imo to create a 1D, 2D, ... 6D version for it and a generalize dynamic allocation requiring method to avoid dynamic allocation up to 5 dimensional arrays.
With strides we might be able to easily implement chunks-iterators.

@Robbepop
Copy link
Contributor Author

I have created a prototype implementation of a generic, malloc avoiding into_si_with_offset trait implementation which can be found here.

This makes it possible to map a pair of shape: Dim (windows/chunks size) and offset: Dim (from indices) into a slice argument for slicing.

@bluss
Copy link
Member

bluss commented Mar 24, 2017

Making SliceArg available like that is good

@bluss
Copy link
Member

bluss commented Mar 27, 2017

I have sort of gone into this territory with Zip, a chunk iterator might show up there.

@Robbepop
Copy link
Contributor Author

I haven't had quite enough time to work on this the last few days.
It is nice that you also worked on another path there which looks really promising!
Maybe I can actually use this zip a lot in order to improve already existing Prophet code.

I have sort of gone into this territory with Zip, a chunk iterator might show up there.

Does that mean that you are able to implement a Chunks iterator using this new facility?

I came to the conclusion that a Windows iterator is a lot easier to implement than a Chunks iterator.
Basically the template you just gave me needs only little changes for it to be a valid Windows iterator.

@bluss
Copy link
Member

bluss commented Mar 28, 2017

What I want now with Zip is that one can zip together chunk producers and arrays.

If an array is dim (10, 10) and we get a chunks producer with (2, 2) chunks, the chunks producer has dim (5, 5) and we can zip it with other (5, 5) arrays.

Iterators are stateful, and handling the halfway-iterated state is troublesome. So I'm going the route (like rayon) of having “Producers” (Think of that as IntoIterator for the array zip). A producer can be split etc for parallelization. Some objects like Chunks is a Producer and IntoIterator, other objects like AxisIter can implement Producer and Iterator directly (because the axis iter is one dimensional, this is not a problem).

@Robbepop
Copy link
Contributor Author

Sounds very nice so far - especially the improved integration with rayon!
The question that I am asking myself is: Do you still need a Chunks and Windows iterator implementation after this huge commit or is it self-contained in it?

If you no longer need it, I can at least serve you with the test cases implementation I came up with.
There are many for chunks: (just for 2 dimensional arrays so far)

  • only even chunks (easy one)
  • odd chunks on the right handside (horizontal) (hard)
  • odd chunks on the vertical axis (hard)
  • odd chunks on both axes
  • chunks of size 1x1 -> should be same as normal iteration. So should this do some warning on use?
  • chunks of size 0 -> error!
  • chunks of size greater than the underlying array -> what is the expected behaviour for this?

Odd chunks are shaped so fit into the remaining valid region, as in the standard chunks iterator.

We could maybe also implement an EvenChunks iterator besides this one, that would simply skip odd chunks or panic on misusage (what you like better) which should give us some performance when we do not need to cover these many edge cases.
What do you think?

@bluss
Copy link
Member

bluss commented Mar 28, 2017

I've only implemented the "EvenChunks" case so far (I called it WholeChunks to be less ambiguous). Basically it was there because I needed it to ensure that the trait for n-ary Zip could support it.

I think that yes we want at least the windows iterator (and windows producer!)

@bluss
Copy link
Member

bluss commented Mar 28, 2017

chunks of size greater than the underlying array -> what is the expected behaviour for this?

For whole chunks, the result must be 0 chunks (all chunks produced have the requested size), for ragged chunks it would be as many ragged chunks that fit. If the chunk is larger than the array in all dimensions, the result is 1 chunk, surely? Not sure what to do if the source array is size 0.

@Robbepop
Copy link
Contributor Author

How do you like working with the Result type in such cases?
Could it be worth returning a Result<Chunks> instead of Chunks iterator directly in cases where it doesn't make sense to create such an iterator, like in the case of a zero sized array?
Or should we just return a ragged chunk of size zero? Or panic on zero sized arrays?

I personally thing the cleanest solution would be to simply return a zero sized ragged chunk, however, this is not the best solution to point a user to a logic error if this scenario is the result of one.

@bluss
Copy link
Member

bluss commented Mar 28, 2017

If it's a logic error and programmer's error we have a convention to just panic; for example [].chunks(0) panics and [1, 2, 3][10] panics. Result is preferred for situations when errors is part of the regular use of the API.

I'm not sure zero sized arrays are common.

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 28, 2017

I think I have prototyped an implementation for the Windows iterator based on the IterBase - was actually easy compared to the Chunks iterator with all its edge cases.

The implementations for the Windows and Chunks iterators will be splitted into two different PRs and the more difficult Chunks iterator will follow soon, hopefully.

I am going to write some tests for the Windows iterator now to verify that it is working as far as the tests are verifying it.

@bluss
Copy link
Member

bluss commented Mar 28, 2017

Great! I wrote some chunks tests for whole chunks (the easy version), maybe those can inspire. tests/iterator_chunks.rs

I'm wishing that we make an NdProducer out of Windows, so that we can use it in azip.

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 28, 2017

I have inspected your WholeChunks iterator implementation. Maybe with your code as template I can get NdProducer working with my Windows implementation.
I also looked at your WholeChunks test - looks kinda complicated to me.
My tests would look more like that:

#[test]
fn test_windows_iterator_simple() {
    let a = Array::from_iter(10..30).into_shape((5, 4)).unwrap();

    let mut iter = windows(a.view(), Dim((3, 2)));
    assert_eq!( iter.next().unwrap(), arr2(&[ [10, 11], [14, 15], [18, 19] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [11, 12], [15, 16], [19, 20] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [12, 13], [16, 17], [20, 21] ]) );

    assert_eq!( iter.next().unwrap(), arr2(&[ [14, 15], [18, 19], [22, 23] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [15, 16], [19, 20], [23, 24] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [16, 17], [20, 21], [24, 25] ]) );

    assert_eq!( iter.next().unwrap(), arr2(&[ [18, 19], [22, 23], [26, 27] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [19, 20], [23, 24], [27, 28] ]) );
    assert_eq!( iter.next().unwrap(), arr2(&[ [20, 21], [24, 25], [28, 29] ]) );

    assert_eq!( iter.next(), None );
}

I wanted to (KISS) keep it simple stupid. :S

@bluss
Copy link
Member

bluss commented Mar 28, 2017

Looks pretty good. I'm trying to learn to write more “exhaustive” style tests than just inputting examples. Your tests look good as they are, I'd probably use itertools::assert_equal to test the iterator's sequence though.

@bluss
Copy link
Member

bluss commented Mar 28, 2017

So for example I'd prefer a test that tested all possible window sizes, since that's more exhaustive.

@Robbepop
Copy link
Contributor Author

Looks pretty good.

Okay thanks!

I'd probably use itertools::assert_equal to test the iterator's sequence though.

Awesome! Didn't know about its existance. :)

So for example I'd prefer a test that tested all possible window sizes, since that's more exhaustive.

True! I will first implement some naive implementations and later add some exhaustive tests. I would also like to test other dimensionalities, such as 3D.

@Robbepop
Copy link
Contributor Author

#[test]
fn test_windows_iterator_simple() {
    let a = Array::from_iter(10..30).into_shape((5, 4)).unwrap();
    itertools::assert_equal(
        windows(a.view(), Dim((3, 2))),
        vec![
            arr2(&[ [10, 11], [14, 15], [18, 19] ]),
            arr2(&[ [11, 12], [15, 16], [19, 20] ]),
            arr2(&[ [12, 13], [16, 17], [20, 21] ]),

            arr2(&[ [14, 15], [18, 19], [22, 23] ]),
            arr2(&[ [15, 16], [19, 20], [23, 24] ]),
            arr2(&[ [16, 17], [20, 21], [24, 25] ]),

            arr2(&[ [18, 19], [22, 23], [26, 27] ]),
            arr2(&[ [19, 20], [23, 24], [27, 28] ]),
            arr2(&[ [20, 21], [24, 25], [28, 29] ])
        ]);
}

@Robbepop
Copy link
Contributor Author

Robbepop commented Mar 29, 2017

All tests are passing now but this one:

/// Test that verifites that no windows are yielded on oversized window sizes.
#[test]
fn windows_iterator_oversized() {
    let a = Array::from_iter(10..37)
        .into_shape((3, 3, 3))
        .unwrap();
    let mut iter = windows(a.view(), Dim((4, 3, 2))); // (4,3,2) doesn't fit into (3,3,3) => oversized!
    assert_eq!(iter.next(), None);
}

How is this functionality achieved in itertors using BaseIter in general?
I do not want to implement it via a hack ...

@bluss
Copy link
Member

bluss commented Mar 29, 2017

Just make sure the dimension of the iterator is computed to be (0, 1, 2) which means 0 elements

@Robbepop
Copy link
Contributor Author

Ah nice, I got screwed by this because of another bug with underflowed subtraction. All tests are working now.

@bluss
Copy link
Member

bluss commented Apr 6, 2017

Sorry for all the churn around NdProducer and so on. It's a big project to make it work.. and a lot needs to change around it. Let me know, I'm fine with merging a windows iterator and fixing it to be ndproducer.

@Robbepop
Copy link
Contributor Author

Robbepop commented Apr 6, 2017

Sorry for not notifying you on the status of the current windows implementation and tests.
I haven't had that much time in the past days ...
I will make a PR today if I can get it working. =)

@Robbepop
Copy link
Contributor Author

Robbepop commented Apr 6, 2017

PR completed #306 - awaiting feedback

@Robbepop
Copy link
Contributor Author

Robbepop commented Apr 6, 2017

I have added a task-list for the todos until this issue can finally be closed.
We should discuss if we still need and/or want the RaggedChunks iterator.

@bluss
Copy link
Member

bluss commented Apr 6, 2017

Good. Lists are good. Checking things off is the next best thing to closing issues 😉 Thanks for working on Windows and working with ndarray in general.

@Robbepop
Copy link
Contributor Author

Robbepop commented Apr 6, 2017

  • Thanks! :)

And thank you for maintaining all these awesome, high-quality libraries that are critical for the well-being of the rust ecosystem!

@fmorency
Copy link

Thanks a lot for this awesome work!

Not being a Rust expert, I was wondering if it would be a lot of work to implement a full-window iterator with different array-border modes such as in scipy? Those kind of things are used a lot in image processing.

@Robbepop
Copy link
Contributor Author

I can remember that we were talking about similar things for the windows and chunks iterators but implemented only the simplest approaches of introducing them initially. I cannot tell you if implementation of different modes like wrap would incur overhead that'd make it impractical. Maybe @bluss can tell us more. :)

@jturner314
Copy link
Member

jturner314 commented Feb 27, 2018

One challenge with extending windows/chunks at the edges beyond the original array is that ideally we'd minimize copying the data as much as possible. The existing chunks and windows producers don't have to copy any of the data because they return ArrayView instances which merely view the original array. To handle extending windows/chunks beyond the original array, the producer can't just create an ArrayView that extends beyond the original array because something needs to own the data outside the original array. The question is, "Who should own the data outside the original array in the edge chunks/windows?" I came up with a few possible solutions:

  1. When creating the chunks/windows producer, create a larger owned array with the original data in the middle and with the extra data for the edges around it. Then, return chunks/windows as views into this expanded array. This would require copying the original array, so it would be expensive in most cases, although if the the windows/chunks are much larger than the original array, then it may be a good option. The lifetimes for this solution are also a little weird, and I'd have to think more about how to satisfy the borrow checker.
  2. When creating the chunks/windows producer, allocate and fill in new arrays for the edges. The producer could yield views of the original array for middle chunks/windows and views of the edge arrays for the edges. This would be complex to implement, though, and would require allocating all of the edge chunks/windows at once. The lifetimes for this solution are also a little weird, and I'd have to think more about how to satisfy the borrow checker.
  3. Have the producers return Array<A, D> (which owns the data) by copying the underlying data for each chunk/window. This would be expensive.
  4. Create a new Data representation for ArrayBase called CowRepr<'a, A> that behaves like ::std::borrow::Cow so the producer could return the Borrowed variant (equivalent to ViewRepr<&'a A>) for data inside the original array and the Owned variant (equivalent to OwnedRepr<A>) for the chunks/windows that extend beyond the original array.
  5. For the special case of contiguous-memory chunks, the chunks are non-aliasing and non-overlapping, so given an Array<A, D> we could (1) take ownership of the array (2) ::std::mem::forget the data, (3) yield chunks that own their data (taking ownership of their portion of the forgotten data or copying data as necessary at the edges), and (4) when the producer is dropped, clean up any forgotten data that hasn't been yielded in a chunk. This seems like a pretty bad hack, though, and it's limited to contiguous-memory chunks on owned arrays.
  6. Instead of a producer, we could have a method that takes a closure and applies the closure to views of chunks/windows. For chunks/windows in the middle, the views could just be of the original array, and for each chunk/window on the edges, the method could allocate a temporary array and call the closure on a view of that temporary array.

If all that's needed is an equivalent to scipy.ndimage.filters.convolve(), then that would be relatively straightforward because convolve() doesn't need to actually produce views of the array; it just computes the convolution with the given weights.

If all that's needed is the ability to call a closure on chunks/windows, then idea (6) would be the simplest to implement. It would also be the cheapest because it could allocate a single owned array for handling all of the edge chunks/windows (by reusing it for each edge chunk/window). (Producers based on idea (4) would have to allocate a new owned array for each edge chunk/window because they lose ownership.)

If we actually want producers that return the chunks/windows, though, I would recommend idea (4). (I've found other potential uses for a CowRepr<'a, A> type anyway, e.g. #390.)

@frjnn
Copy link

frjnn commented Mar 31, 2021

One challenge with extending windows/chunks at the edges beyond the original array is that ideally we'd minimize copying the data as much as possible. The existing chunks and windows producers don't have to copy any of the data because they return ArrayView instances which merely view the original array. To handle extending windows/chunks beyond the original array, the producer can't just create an ArrayView that extends beyond the original array because something needs to own the data outside the original array. The question is, "Who should own the data outside the original array in the edge chunks/windows?" I came up with a few possible solutions:

  1. When creating the chunks/windows producer, create a larger owned array with the original data in the middle and with the extra data for the edges around it. Then, return chunks/windows as views into this expanded array. This would require copying the original array, so it would be expensive in most cases, although if the the windows/chunks are much larger than the original array, then it may be a good option. The lifetimes for this solution are also a little weird, and I'd have to think more about how to satisfy the borrow checker.
  2. When creating the chunks/windows producer, allocate and fill in new arrays for the edges. The producer could yield views of the original array for middle chunks/windows and views of the edge arrays for the edges. This would be complex to implement, though, and would require allocating all of the edge chunks/windows at once. The lifetimes for this solution are also a little weird, and I'd have to think more about how to satisfy the borrow checker.
  3. Have the producers return Array<A, D> (which owns the data) by copying the underlying data for each chunk/window. This would be expensive.
  4. Create a new Data representation for ArrayBase called CowRepr<'a, A> that behaves like ::std::borrow::Cow so the producer could return the Borrowed variant (equivalent to ViewRepr<&'a A>) for data inside the original array and the Owned variant (equivalent to OwnedRepr<A>) for the chunks/windows that extend beyond the original array.
  5. For the special case of contiguous-memory chunks, the chunks are non-aliasing and non-overlapping, so given an Array<A, D> we could (1) take ownership of the array (2) ::std::mem::forget the data, (3) yield chunks that own their data (taking ownership of their portion of the forgotten data or copying data as necessary at the edges), and (4) when the producer is dropped, clean up any forgotten data that hasn't been yielded in a chunk. This seems like a pretty bad hack, though, and it's limited to contiguous-memory chunks on owned arrays.
  6. Instead of a producer, we could have a method that takes a closure and applies the closure to views of chunks/windows. For chunks/windows in the middle, the views could just be of the original array, and for each chunk/window on the edges, the method could allocate a temporary array and call the closure on a view of that temporary array.

If all that's needed is an equivalent to scipy.ndimage.filters.convolve(), then that would be relatively straightforward because convolve() doesn't need to actually produce views of the array; it just computes the convolution with the given weights.

If all that's needed is the ability to call a closure on chunks/windows, then idea (6) would be the simplest to implement. It would also be the cheapest because it could allocate a single owned array for handling all of the edge chunks/windows (by reusing it for each edge chunk/window). (Producers based on idea (4) would have to allocate a new owned array for each edge chunk/window because they lose ownership.)

If we actually want producers that return the chunks/windows, though, I would recommend idea (4). (I've found other potential uses for a CowRepr<'a, A> type anyway, e.g. #390.)

What is the status of this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants