Incompatibility between rand(MvNormal()) and AutoDiff #813
@theogf PR welcome on that |
So I found the source of the error, but it's in common.jl, so I'm not sure whether the change would be breaking: Distributions.jl/src/common.jl Line 51 in 817bd83
It says that, whatever the type of the distribution is, eltype will return Float64 for a continuous distribution. Since eltype is called many times during MvNormal sampling, one always ends up with Float64 samples. I don't know how other distributions depend on eltype, but a quick fix would be to overload it for MvNormal types
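To illustrate, here is a minimal sketch of the fallback in question, assuming Distributions.jl's MvNormal carries its element type as a type parameter; the overload at the end is a proposal, not existing API:

```julia
using Distributions, LinearAlgebra

# Under the generic fallback discussed above, eltype of any continuous
# distribution is hard-coded to Float64, regardless of the parameter type:
d = MvNormal(zeros(2), Matrix{Float64}(I, 2, 2))
eltype(d)  # Float64 under the generic fallback

# A possible type-specific overload (proposal, not current API) that
# would preserve the parameter element type T instead:
# Base.eltype(::Type{<:MvNormal{T}}) where {T} = T
```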
|
Yes, this should be overloaded for (almost?) all distributions
|
Is it really reasonable to expect |
Yes, I think this could be important, differentiability of |
Related: TuringLang/DistributionsAD.jl#123. I think in many cases it is not important to implement the samplers in a differentiable way, but it would be useful to add custom adjoints, probably based on ChainRulesCore. |
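One way to sketch that custom-adjoint idea, assuming the reparameterization x = μ + σ·ε for a location-scale draw (the name `reparam_rand` is hypothetical, not Distributions.jl API):

```julia
using ChainRulesCore, Random

# Hypothetical reparameterized sampler for a univariate normal:
# x = μ + σ * ε with ε ~ N(0, 1), so ∂x/∂μ = 1 and ∂x/∂σ = ε.
reparam_rand(rng::AbstractRNG, μ, σ) = μ + σ * randn(rng)

function ChainRulesCore.rrule(::typeof(reparam_rand), rng::AbstractRNG, μ, σ)
    ε = randn(rng)
    x = μ + σ * ε
    # The RNG is non-differentiable; μ and σ receive the
    # reparameterization gradients, so AD never has to trace the
    # sampler internals.
    reparam_rand_pullback(x̄) = (NoTangent(), NoTangent(), x̄, x̄ * ε)
    return x, reparam_rand_pullback
end
```

With a rule like this, a reverse-mode AD system built on ChainRules (e.g. Zygote) can differentiate through the draw without ever looking inside the sampler.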
But does this generalize beyond location/scale? I don't think e.g. |
Every |
It would be great if we could figure out a way to handle this that isn't generally broken and only sometimes (even if very often) works. Recall that the

    julia> _seed
    165

    julia> tmp = [rand(Random.MersenneTwister(_seed), Distributions.GammaGDSampler(Gamma(_s, 0.1))) for _s in s];

I generally think you should be very reluctant to allow loose signatures for methods that exploit details about floating point numbers, such as their precision. I'm wondering if, instead, we could define AD rules for |
This is what I had in mind when I said that it might be better to add custom adjoints/AD rules instead of trying to make the sampling algorithm itself differentiable.
This would only be a problem for AD systems that operate with special number types such as |
That single type parameter shouldn't pose a problem; one can promote the other parameter to a dual too. |
The point I tried to make was that you'd have to restrict the argument type for the shape parameter to |
It's not differentiable in the shape, but it can run on duals |
Let me elaborate. Roughly speaking, there are two kinds of floating point methods.
The first group, the "Core" methods, exploits details of the computer representation of the numbers, e.g. by calling an LLVM intrinsic or a Distributions.jl/src/samplers/gamma.jl Lines 43 to 55 in 8c7f400
Distributions.jl/src/samplers/gamma.jl Lines 65 to 74 in 8c7f400
The second group consists of compositions of "Core" methods, and I completely agree that such definitions should have as loose a signature as possible, to allow for as many number types as possible. Regarding AD, we then need rules for the "Core" group, and the beauty is that AD automatically works for the second group, provided we have used sufficiently loose signatures in the method definitions. What we are currently doing is treating the sampler as a "Composition" method. I'm arguing that this isn't sound and that we'd have to make it a "Core" method and define AD rules for it. Specifically, we only need to consider the version for |
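Marking the "Core" draw as an AD primitive can be as simple as this sketch (`my_core_sampler` is a hypothetical stand-in for the real sampler, not Distributions.jl API):

```julia
using ChainRulesCore, Random

# Hypothetical "Core" method: a raw draw that exploits floating point
# representation details and therefore should not be traced by AD.
my_core_sampler(rng::AbstractRNG) = randn(rng)

# Tell every ChainRules-compatible AD system to treat the draw as a
# constant; gradients with respect to distribution parameters then come
# from explicit rules (e.g. a reparameterization pullback) rather than
# from differentiating through the sampler's bit-level tricks.
ChainRulesCore.@non_differentiable my_core_sampler(rng::AbstractRNG)
```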
I think we are on the same page; I agree with your line of reasoning. Only, in isolation, I wouldn't restrict
to
at the right precision. |
Hello, it is unfortunately not possible to use automatic differentiation with (at least) the MvNormal distribution. The following code will fail at rand(p) due to a wrong conversion to Float64
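The snippet itself is not reproduced above; the following is a hedged reconstruction of the kind of code that triggers the error, using ForwardDiff dual numbers for the mean:

```julia
using Distributions, ForwardDiff, LinearAlgebra

# Sum of one multivariate normal draw as a function of the mean μ.
# The covariance is the identity; eltype(μ) keeps the matrix type in sync.
f(μ) = sum(rand(MvNormal(μ, Matrix{eltype(μ)}(I, 2, 2))))

f(zeros(2))  # plain Float64 input works fine

# With Dual inputs the sampler converts to Float64 internally, so the
# next line errors (the reported bug) instead of returning a gradient:
# ForwardDiff.gradient(f, zeros(2))
```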