@warn "The selected optimization algorithm requires second order derivatives, but `AutoZygote` ADtype was provided.
So a `SecondOrder` with `AutoZygote` for inner and `AutoForwardDiff` for outer will be created, for choosing another pair
an explicit `SecondOrder` ADtype is recommended."
end
I think that this could be both optimized and simplified due to recent changes in DI.
Nowadays, DI.inner and DI.outer can also be called on backends which are not SecondOrder, in which case they just act as the identity. Thus, you don't need to explicitly create a SecondOrder(adtype, adtype). Passing adtype alone will be equivalent in most cases, and faster in some because it can leverage custom Hessian implementations within a single backend (e.g. SecondOrder(AutoForwardDiff(), AutoForwardDiff()) cannot call ForwardDiff.hessian whereas AutoForwardDiff() can).
Furthermore, DI's hvp and hessian for AutoZygote() already use ForwardDiff-over-Zygote.
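To illustrate the equivalence claimed above, here is a minimal sketch comparing the two ways of asking DI for a Hessian. The function `f` and the input `x` are made up for the example, and it assumes ForwardDiff is installed and loaded alongside DI.

```julia
import DifferentiationInterface as DI
using ADTypes: AutoForwardDiff
import ForwardDiff  # must be loaded so that DI can use it

# Toy objective chosen for illustration; its Hessian is the identity matrix.
f(x) = sum(abs2, x) / 2
x = [1.0, 2.0, 3.0]

# Single backend: DI is free to dispatch to a backend-specific Hessian.
H_single = DI.hessian(f, AutoForwardDiff(), x)

# Explicit SecondOrder(outer, inner): DI composes the two backends itself.
H_second = DI.hessian(f, DI.SecondOrder(AutoForwardDiff(), AutoForwardDiff()), x)

# Both routes should agree on the result; only the code path differs.
@assert H_single ≈ H_second
```

The results match, so dropping the explicit SecondOrder wrapper should not change what users get back, only which implementation runs.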
Here are my suggestions:
Simplify the generate_adtype logic and its variants to avoid creating SecondOrder objects altogether.
Throw a warning based on the modes DI.inner and DI.outer, e.g. when the inner backend is not a reverse mode backend. This can be checked with ADTypes.mode(DI.inner(adtype)) isa Union{ADTypes.ReverseMode,ADTypes.ForwardOrReverseMode}. Of course you also want to allow ForwardDiff so feel free to refine.
Document this behavior so that users are less confused by the warnings (see this Discourse thread).
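The mode-based check from the second suggestion could look roughly like the sketch below. The helper names `inner_backend_ok` and `warn_on_inner_mode` and the warning text are hypothetical and not part of OptimizationBase.jl; the ForwardDiff exception is just one possible refinement.

```julia
using ADTypes: ADTypes, AutoFiniteDiff, AutoForwardDiff, AutoZygote
import DifferentiationInterface as DI

# Hypothetical helper: true when the inner backend is reverse mode
# (or can act as reverse mode), or is ForwardDiff.
function inner_backend_ok(adtype::ADTypes.AbstractADType)
    # DI.inner acts as the identity on backends that are not SecondOrder.
    inner = DI.inner(adtype)
    # Assumed refinement: always accept ForwardDiff as the inner backend.
    inner isa AutoForwardDiff && return true
    return ADTypes.mode(inner) isa
           Union{ADTypes.ReverseMode, ADTypes.ForwardOrReverseMode}
end

# Hypothetical wrapper: warn instead of building a SecondOrder object.
# The message wording is illustrative only.
function warn_on_inner_mode(adtype::ADTypes.AbstractADType)
    if !inner_backend_ok(adtype)
        @warn "The inner backend $(DI.inner(adtype)) is not a reverse mode backend; consider passing an explicit `SecondOrder` ADtype."
    end
    return adtype
end
```

With this sketch, `warn_on_inner_mode(AutoZygote())` returns the backend silently, while `warn_on_inner_mode(AutoFiniteDiff())` emits the warning. In both cases the backend is passed through unchanged.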
This issue is about the machinery for choosing backends and throwing warnings:
OptimizationBase.jl/src/adtypes.jl, lines 222 to 236 in 2ffab7e
OptimizationBase.jl/src/cache.jl, lines 45 to 58 in 2ffab7e
What do you think @Vaibhavdixit02?