Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

conv is slower than necessary #13996

Closed
sloisel opened this issue Nov 15, 2015 · 5 comments
Closed

conv is slower than necessary #13996

sloisel opened this issue Nov 15, 2015 · 5 comments
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request performance Must go faster
Milestone

Comments

@sloisel
Copy link

sloisel commented Nov 15, 2015

It seems that conv is slower than necessary. By implementing a naive convolution, I am able to outperform conv for moderate vector sizes. On my particular machine, naive convolution is better than conv if the product of vector sizes is less than 100 000.

Just so everyone sees what the naive solution is, I'm pasting it below.

slowconvcost(m,n) = 2e-9*m*n+1e-6
fastconvcost(m,n) = 7e-8*(m+n)*log(m+n)+1e-4
function sebconv{T}(a::Array{T}, b::Array{T})
    m = length(a)
    n = length(b)
    if(fastconvcost(m,n)<slowconvcost(m,n)) return conv(a,b); end
    c = zeros(T,m+n-1)
    @inbound for j=1:m
        for k=1:n
            c[j+k-1] += a[j]*b[k]
        end
    end
    return c
end

Here is a performance plot:

convperf

Edit: tweaked code for better heuristic as to when slowconv is faster than fastconv.

@ViralBShah ViralBShah added the performance Must go faster label Nov 15, 2015
@ViralBShah ViralBShah added this to the 0.5.0 milestone Nov 15, 2015
@simonster
Copy link
Member

You can also use filt(y, 1, [x; zeros(length(y)-1)]) to perform convolution, where x is the longer of the two vectors and y is the shorter one.

DSP.jl has an implementation of filt that decides whether or not to use the FFT algorithm based on the vector lengths. The FFT algorithm in DSP.jl should also be a little more efficient than conv when one vector is much longer than the other, since it does overlap-save. I guess we could move that here, but that seems contrary to #5155.

@sloisel
Copy link
Author

sloisel commented Nov 15, 2015

filt seems to run slightly slower than naive convolution. I've tested that my switching heuristic (between naive and FFT) above seems to be the correct one at all scales.

function anotherconv{T}(x::Array{T},y::Array{T})
    if(length(y)>length(x))
        foo = x
        x = y
        y = foo
    end
    return filt(y, 1, [x; zeros(length(y)-1)])
end

convperf
convperf2

@jrevels jrevels added the potential benchmark Could make a good benchmark in BaseBenchmarks label Nov 17, 2015
@JeffBezanson JeffBezanson modified the milestones: 0.5.x, 0.5.0 Jan 27, 2016
@StefanKarpinski StefanKarpinski added help wanted Indicates that a maintainer wants help on an issue or pull request and removed help wanted Indicates that a maintainer wants help on an issue or pull request labels Oct 27, 2016
@dehann
Copy link
Contributor

dehann commented Jan 21, 2017

FYI: recent work at MIT Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia
(link to pdf on arxiv)
of which the implementation should be available somewhere. First author repos seem empty though. aamini/FastConv.jl and issue aamini/TimeToContact.jl#1.

Maybe you already know about this and have it wrapped in, but I can't find it and then this pointer might be useful to others also.

@aamini
Copy link

aamini commented Jan 23, 2017

Thanks for the mention. Both packages have been updated.
Fast convolutions for Julia: amini/FastConv.jl
Time to Contact in Julia: amini/TimeToContact.jl

@KristofferC
Copy link
Member

Moved to JuliaDSP/DSP.jl#167

@KristofferC KristofferC removed the potential benchmark Could make a good benchmark in BaseBenchmarks label Oct 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request performance Must go faster
Projects
None yet
Development

No branches or pull requests

9 participants