Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filt perf optimizations #516

Merged
merged 22 commits into from
Feb 6, 2024
Merged

filt perf optimizations #516

merged 22 commits into from
Feb 6, 2024

Conversation

wheeheee
Copy link
Member

@wheeheee wheeheee commented Dec 5, 2023

  • instead of allocating for every column, reuse si (different from input si)
  • filt currently errors for N>2 (edit: if the trailing dimensions aren't all of length 1); this PR doesn't change that

@wheeheee
Copy link
Member Author

managed to squeeze performance of filt! to 2x on my device for _small_filt_fir, N < 18, by omitting inbounds.

@wheeheee wheeheee changed the title filt: create si only once, and reuse filt perf optimizations Dec 27, 2023
@wheeheee
Copy link
Member Author

Bump, couldn't figure out how to make BenchmarkPlots plot benchmarks in order, but the GitHub Actions tests take significantly less time to run (50 -> 30 minutes total usage). Some of this is maybe due to faster compilation for the generated functions. Tests in filt_stream.jl benefit quite a lot.

@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2024

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (453c7e6) 97.46% compared to head (553b433) 97.49%.

Files Patch % Lines
src/Filters/remez_fir.jl 94.73% 1 Missing ⚠️
src/dspbase.jl 98.14% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #516      +/-   ##
==========================================
+ Coverage   97.46%   97.49%   +0.03%     
==========================================
  Files          18       18              
  Lines        3080     3078       -2     
==========================================
- Hits         3002     3001       -1     
+ Misses         78       77       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ViralBShah
Copy link
Contributor

@martinholters Ok to merge? Hope you don't mind the ping.

Copy link
Member

@martinholters martinholters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinholters Ok to merge? Hope you don't mind the ping.

Don't mind at all. In principle ok with me. I'm not too excited about all those muladds; I find them less readable than the straight a*b+c and don't think they bring benefits to justify that everywhere. That said, now that all the replacements have been done, I don't think it's worth it to go through them and see whether they bring any tangible benefits.

One thing I'm not sure about and would like a word of explanation is all those deleted filt(self, ...) methods. What's up with those? Why don't we need them anymore?

@wheeheee
Copy link
Member Author

They were duplicates of each other except for the kernels so I edited the last one to accept all the FIRKernels and deleted the rest.

@martinholters
Copy link
Member

Ah, yes, of course. I had missed the edited one.

@ViralBShah
Copy link
Contributor

Merge? Also @wheeheee would you be ok to be invited to the organization to help maintain these packages?

@wheeheee
Copy link
Member Author

wheeheee commented Feb 4, 2024

I will say I don't have serious expertise in DSP, although I don't mind helping clean up issues / add documentation occasionally, and actually have a few more minor commits saved up to contribute.

Also, it might be good to do some benchmarking on other hardware for this commit. My laptop cpu (i5-1135g7) has avx512 instructions, which may possibly be the reason why the inbounds fiddling in _filt_fir! works.

@ViralBShah
Copy link
Contributor

@wheeheee Thanks for helping with the maintenance here. Since we have approval here from @martinholters, it would be great to rebase and get this one merged.

- reduced exceptions in `@code_llvm` for `N < 18`
- `@noinline mul!`, `broadcast`, only for julia > v1.8
- name `SMALL_FILT_VECT_CUTOFF`, selective bounds check
@wheeheee
Copy link
Member Author

wheeheee commented Feb 6, 2024

Right, I think I'll just go ahead and merge to speed up CI. If the newest commits need modification, I'll open a PR later.

@wheeheee wheeheee merged commit 9de4c0c into JuliaDSP:master Feb 6, 2024
11 checks passed
@wheeheee wheeheee deleted the filt_copysi branch February 6, 2024 06:52
@martinholters
Copy link
Member

Thanks @wheeheee, any help here is appreciated. And also thanks @ViralBShah for pushing this forward.

@ViralBShah
Copy link
Contributor

Thanks @martinholters for all the work on this package, and @wheeheee for jumping in. We should see if there are other contributors who would like to step up as well. I'm just trying to help out - amazing how the small improvements over the last few days are making a big cosmetic difference at least.

@wheeheee wheeheee mentioned this pull request Feb 25, 2024
martinholters pushed a commit that referenced this pull request Sep 13, 2024
* create `si` only once, and reuse

* no inbounds faster for N < 24

* @generated `_small_filt_fir!`

- use Base.Cartesian.@ntuple

* inbounds if N > 18

major speedups for N > 18, much slower if inbounds for small N

* reorder store in `_filt_fir` and `_filt_iir`

>20% faster for fir

* maybe prevent undefvarerror

no need to assign values to enum

* small things

- eachindex stuff
- `==` to `===`  nothing
- `npairs = max(nz, np)`

* isone, iszero, etc.

* `filt` for `FIRFilter`: remove redundant definitions

* extract type instability from loop

* remove fft conv type instability

- in `unsafe_conv_kern_os!`
- probably no real impact

* correct `filt` arg types

* nicer loop

* inv plan

* cleanup filt_order

* use `muladd`s in filt.jl

* `_prod_freq`

* more `muladd`s

* `fill!`, `reverse`, don't broadcast scalar /

* micro-optimizations, cosmetics

- reduced exceptions in `@code_llvm` for `N < 18`
- `@noinline mul!`, `broadcast`, only for julia > v1.8
- name `SMALL_FILT_VECT_CUTOFF`, selective bounds check

* Compat noinline

* add test for error

(cherry picked from commit 9de4c0c)
martinholters pushed a commit that referenced this pull request Sep 13, 2024
* create `si` only once, and reuse

* no inbounds faster for N < 24

* @generated `_small_filt_fir!`

- use Base.Cartesian.@ntuple

* inbounds if N > 18

major speedups for N > 18, much slower if inbounds for small N

* reorder store in `_filt_fir` and `_filt_iir`

>20% faster for fir

* maybe prevent undefvarerror

no need to assign values to enum

* small things

- eachindex stuff
- `==` to `===`  nothing
- `npairs = max(nz, np)`

* isone, iszero, etc.

* `filt` for `FIRFilter`: remove redundant definitions

* extract type instability from loop

* remove fft conv type instability

- in `unsafe_conv_kern_os!`
- probably no real impact

* correct `filt` arg types

* nicer loop

* inv plan

* cleanup filt_order

* use `muladd`s in filt.jl

* `_prod_freq`

* more `muladd`s

* `fill!`, `reverse`, don't broadcast scalar /

* micro-optimizations, cosmetics

- reduced exceptions in `@code_llvm` for `N < 18`
- `@noinline mul!`, `broadcast`, only for julia > v1.8
- name `SMALL_FILT_VECT_CUTOFF`, selective bounds check

* Compat noinline

* add test for error

(cherry picked from commit 9de4c0c)
martinholters added a commit that referenced this pull request Sep 27, 2024
Backport `filt` fix (#516 and #539) and bump version to 0.7.10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants