-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds triu
and tril
methods directly to ArrayBase
#1386
Conversation
triu
and tril
methods directly to ArrayBase to fix #1323triu
and tril
methods directly to ArrayBase
Hold off on any merge for a moment; |
Ok, fixed. Not to move conversations, but two thoughts on the (now resolved) discussion about order of the input array:
running 3 tests (64 x 64)
test triu_c ... bench: 1,681 ns/iter (+/- 45)
test triu_f ... bench: 2,011 ns/iter (+/- 32)
test triu_f_via_t ... bench: 2,133 ns/iter (+/- 29)
running 3 tests (2048 x 2048)
test triu_c ... bench: 2,418,649 ns/iter (+/- 320,729)
test triu_f ... bench: 22,184,991 ns/iter (+/- 4,532,682)
test triu_f_via_t ... bench: 4,259,493 ns/iter (+/- 218,849) |
Well, I know that I'm interested in those functions, but it may not be what @bluss had in mind. The more methods we have, the harder it is to find them. As for the benchmarks, it's kinda normal because those larger arrays don't fit in the cache and you're iterating in the wrong order. Seeing those results, I think adding a condition on the memory order is worth it. Some people (like me) often work in f-order :) |
This is true; I don't want to bloat the API. That being said, I think some (difficult and time-consuming and I don't mean to imply it's trivial) documentation work could make the docs easier to navigate and alleviate concerns about more functions.
Me too :) too many column-vector algorithms / math. I was interested in seeing how well NumPy does on the transposed version and I guess they didn't write one that dispatches on order: In [1]: import numpy as np
In [2]: x = np.ones((2048, 2048))
In [3]: %timeit np.triu(x)
4.34 ms ± 53.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [4]: x = np.ones((2048, 2048), order="F")
In [5]: %timeit np.triu(x)
19 ms ± 557 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) (Fun to see that If we want to create a specialization, I think we can do it via the transpose-transpose method I mentioned above, with a slight tweak to avoid the copy: (EDITED) pub fn triu(&self, k: isize) -> Array<A, D> {
match self.ndim() > 1 && is_layout_f(&self.dim, &self.strides) {
true => {
let n = self.ndim();
let mut x = self.view();
x.swap_axes(n - 2, n - 1);
let mut tril = x.tril(-k).into_shared();
tril.swap_axes(n - 2, n - 1);
tril.into_owned()
},
false => {
let mut res = Array::zeros(self.raw_dim());
Zip::indexed(self.rows())
.and(res.rows_mut())
.for_each(|i, src, mut dst| {
let row_num = i.into_dimension().last_elem();
let lower = max(row_num as isize + k, 0);
dst.slice_mut(s![lower..]).assign(&src.slice(s![lower..]));
});
res
}
}
} which gets us the following performance: (EDITED) running 3 tests (64 x 64)
test triu_c ... bench: 1,709 ns/iter (+/- 58)
test triu_f ... bench: 1,723 ns/iter (+/- 43)
running 3 tests (2048 x 2048)
test triu_c ... bench: 2,429,621 ns/iter (+/- 171,843)
test triu_f ... bench: 2,484,196 ns/iter (+/- 159,421) If this gets some thumbs up I'll add it to the PR (and maybe squash at this point to reduce commits?) Glad I got to explore the |
7d24c54
to
6316fbe
Compare
Includes branched implementations for f- and c-order arrays.
Thank you for your contribution @akern40 |
They added `tril` and `triu` in rust-ndarray/ndarray#1386, but the maintainers are busy, and there hasn't been an ndarray release since July 2022 (despite active development?). I could write my own tril (or copy–paste it), or ... we could run tip of master?! It is written that Y.O.L.O.
Thanks @akern40!! To anyone waiting to use this before 0.16 is released, remember that you can run master by changing the version specification in your Cargo.toml to |
The implementation seems to work, but unsure about whether this should be a free function vs bound directly to ArrayBase vs a trait. Closes #1323.