DI and Enzyme #721

gdalle · 2025-02-08T18:47:03Z

gdalle
Feb 8, 2025
Maintainer

I'm starting this discussion as a more permanent location than Slack for exchanging ideas.

The main question we have been debating is where DI's Enzyme bindings and tests - let's call this code DI+Enzyme - should live, and who should maintain them. In general, given packages A and B, there's no definitive rule on where glue code for A+B belongs. The question is made more thorny here because DI+Enzyme relies on internals of both packages.

On the one hand, DI does define a generic interface which AD backends can implement. In an ideal world, each backend would be responsible for how the interface is implemented, and DI would be extension-free. But in practice, it is unlikely to work that way anytime soon, for a variety of reasons:

- DI is a young project which is still evolving fast. Keeping all the code in one place is much better for joint design and synchronized testing with every backend.
- The package extensions and test files for each backend rely heavily on DI and DITest internals, so it will be harder to avoid breakage if they are not modified in lockstep.
- Regardless of where that code lives, I'll probably be the one to care for it, so it makes sense to keep it in a repo where I control the rest (tests, docs, CI).

As a result, I am against a blanket copy, like the one suggested in EnzymeAD/Enzyme.jl#2301. However, I agree that the more functionality we upstream to Enzyme, the more reliable DI becomes. So let's discuss which parts can be upstreamed and to where!

gdalle · 2025-02-08T19:00:22Z

gdalle
Feb 8, 2025
Maintainer Author

Here is a first list of upstreaming ideas:

| DI+Enzyme functionality | Current location | Envisioned location |
| --- | --- | --- |
| Batch size choice | here | EnzymeCore |
| Function annotation | here | EnzymeCoreADTypesExt |
| Mode manipulation | here | EnzymeCoreADTypesExt |
| Handling of wrong-type tangents | here and everywhere `_righttype` appears | get rid of it? |
| General pullback | here | Enzyme |
| In-place gradient with several arguments | here | Enzyme |

Note however that, by definition, there will be things that DI users do with Enzyme which are not directly implemented with Enzyme calls. For instance, reverse-mode Jacobians, second derivatives, HVPs or Hessians currently use DI's generic machinery until they land on gradients, pushforwards or pullbacks. This is the whole point of the interface, so the idea is to shore up the set of basic functions in order to confidently build upon them.
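To make the "generic machinery" point concrete, here is a minimal, language-agnostic sketch (in Python rather than Julia, and not DI's actual code) of how a reverse-mode Jacobian can be assembled from nothing but a pullback. The function `f`, the hand-written `pullback_f` and the helper `jacobian_from_pullback` are all hypothetical names made up for illustration; in practice the pullback would come from the AD backend.

```python
# Illustrative only: a hand-written pullback stands in for an AD backend's
# reverse-mode primitive. None of these names are DI or Enzyme API.

def f(x):
    """Example function from R^2 to R^3."""
    return [x[0] * x[1], x[0] ** 2, 3.0 * x[1]]


def pullback_f(x, dy):
    """Vector-Jacobian product J(x)^T dy, derived by hand for f."""
    return [
        dy[0] * x[1] + dy[1] * 2.0 * x[0],  # derivative w.r.t. x[0]
        dy[0] * x[0] + dy[2] * 3.0,         # derivative w.r.t. x[1]
    ]


def jacobian_from_pullback(pullback, x, output_dim):
    """Build the Jacobian row by row: one pullback per output basis vector."""
    rows = []
    for i in range(output_dim):
        # Seed with the i-th basis vector of the output space.
        e_i = [1.0 if j == i else 0.0 for j in range(output_dim)]
        rows.append(pullback(x, e_i))
    return rows


J = jacobian_from_pullback(pullback_f, [2.0, 5.0], 3)
print(J)
```

The generic layer only ever calls the pullback, so its reliability and speed depend directly on that one primitive.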

wsmoses · 2025-02-08T19:06:51Z

wsmoses
Feb 8, 2025

We should definitely start with those, but in phase 2 I think we should make the higher-level ones call directly into Enzyme. For one thing, Enzyme already implements many of them (like the reverse-mode Jacobian, though perhaps not to DI's spec), and in doing so it’s able to take advantage of Enzyme-specific information that makes it more efficient.

For those that don’t exist yet, we should write such a function and build on it as a baseline.

At the end of the day, this will hopefully mean that we can be reasonably confident that the performance-engineered functions in Enzyme are being called properly, and that there isn’t substantial overhead from DI itself.
