Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Choice of second-order AD and warnings #129

Open
gdalle opened this issue Nov 12, 2024 · 1 comment
Open

Choice of second-order AD and warnings #129

gdalle opened this issue Nov 12, 2024 · 1 comment

Comments

@gdalle
Copy link
Contributor

gdalle commented Nov 12, 2024

This issue is about the machinery for choosing backends and throwing warnings:

function generate_adtype(adtype)
if !(adtype isa SciMLBase.NoAD || adtype isa DifferentiationInterface.SecondOrder ||
adtype isa AutoZygote)
soadtype = DifferentiationInterface.SecondOrder(adtype, adtype)
elseif adtype isa AutoZygote
soadtype = DifferentiationInterface.SecondOrder(AutoForwardDiff(), adtype)
elseif adtype isa DifferentiationInterface.SecondOrder
soadtype = adtype
adtype = adtype.inner
elseif adtype isa SciMLBase.NoAD
soadtype = adtype
adtype = adtype
end
return adtype, soadtype
end

if !(prob.f.adtype isa DifferentiationInterface.SecondOrder ||
prob.f.adtype isa AutoZygote) &&
(SciMLBase.requireshessian(opt) || SciMLBase.requiresconshess(opt) ||
SciMLBase.requireslagh(opt))
@warn "The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided.
So a `SecondOrder` with $(prob.f.adtype) for both inner and outer will be created, this can be suboptimal and not work in some cases so
an explicit `SecondOrder` ADtype is recommended."
elseif prob.f.adtype isa AutoZygote &&
(SciMLBase.requiresconshess(opt) || SciMLBase.requireslagh(opt) ||
SciMLBase.requireshessian(opt))
@warn "The selected optimization algorithm requires second order derivatives, but `AutoZygote` ADtype was provided.
So a `SecondOrder` with `AutoZygote` for inner and `AutoForwardDiff` for outer will be created, for choosing another pair
an explicit `SecondOrder` ADtype is recommended."
end

I think that this could be both optimized and simplified due to recent changes in DI.

Nowadays, DI.inner and DI.outer can also be called on backends which are not SecondOrder, they just act as the identity. Thus, you don't need to explicitly create a SecondOrder(adtype, adtype). Passing adtype alone will be equivalent in most cases, and faster in some because it can leverage custom Hessian implementations within a single backend (e.g. SecondOrder(AutoForwardDiff(), AutoForwardDiff()) cannot call ForwardDiff.hessian whereas AutoForwardDiff() can).
Furthermore, DI's hvp and hessian for AutoZygote() already use ForwardDiff-over-Zygote.

Here are my suggestions:

  • Simplify the generate_adtype logic and its variants to avoid creating SecondOrder objects altogether.
  • Throw a warning based on the modes DI.inner and DI.outer, e.g. when the inner backend is not a reverse mode backend. This can be checked with ADTypes.mode(DI.inner(adtype)) isa Union{ADTypes.ReverseMode,ADTypes.ForwardOrReverseMode}. Of course you also want to allow ForwardDiff so feel free to refine.
  • Document this behavior so that users are less confused by the warnings (see this Discourse thread).

What do you think @Vaibhavdixit02?

@gdalle gdalle changed the title Choice of second-order AD Choice of second-order AD and warnings Nov 12, 2024
@Vaibhavdixit02
Copy link
Member

This sounds very reasonable, sorry for the long delay in response. I will fix this up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants