
Decouple from Distributions.jl #523

Closed
torfjelde opened this issue Aug 30, 2023 · 9 comments
Comments

@torfjelde
Member

Good people of DynamicPPL-land. I'm back from vacation, which also means I'm back with crazy ideas.

Here's my first one for now:

What if we add our own logpdf, etc. in DynamicPPL?

I ask for the following reasons:

  1. Allows us to define specific AD rules, e.g. we could use Enzyme for logpdf computations without affecting the broader community, making inference much faster for certain models. This is useful when you want to squeeze extra performance out of models that aren't fully compatible with a given AD framework, but don't want to mess with other packages in the environment.
  2. Broadening support for RHS types. By decoupling the logpdf and rand methods from Distributions, it becomes much easier to support non-Distributions, e.g. MeasureTheory.jl.
  3. We're not using the full Distributions.jl interface, nor do we need it.

Making the change would be trivial since it would just be a matter of adding

logpdf(d, x) = Distributions.logpdf(d, x)

and so on. It would also, depending on how we execute this, provide us with a way to avoid all the type piracy we currently do in DistributionsAD.jl (we could potentially define a logpdf-like function there, which is then used in DynamicPPL.jl).
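A minimal sketch of what such a shim could look like (the module name `LogDensityShim` and the `ImproperFlatPrior` type are made up for illustration; this is not actual DynamicPPL code). The generic methods forward to Distributions.jl, and third parties extend the shim's own functions for their types, avoiding type piracy:

```julia
import Distributions
import Random

module LogDensityShim
    import Distributions, Random

    # Generic fallbacks: anything Distributions.jl supports works out of the box.
    logpdf(d::Distributions.Distribution, x) = Distributions.logpdf(d, x)
    rand(rng::Random.AbstractRNG, d::Distributions.Distribution) = Random.rand(rng, d)
end

# A non-Distributions type can now participate by extending the shim's
# functions (its own functions, so no piracy of Distributions.jl methods):
struct ImproperFlatPrior end
LogDensityShim.logpdf(::ImproperFlatPrior, x) = 0.0
```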

Thoughts?

@torfjelde
Member Author

@devmotion @yebai maybe?

@yebai
Member

yebai commented Aug 30, 2023

It is sensible, but perhaps for DynamicPPL/Turing 2.0.

@torfjelde
Member Author

It is a trivial change to make, though, with no real maintenance cost, while allowing third-party developers to more easily extend how things are treated. I personally would have found this useful in some of the performance-sensitive projects we've been involved in.

@yebai
Member

yebai commented Aug 30, 2023

Here are some functions that we should consider if this goes ahead:

  • logpdf
  • rand
  • flatten <-- vectorise
  • unflatten <-- reconstruct

We could have generic fallbacks for all distributions from Distributions.jl, then allow users to overload these functions for their own distributions.

EDIT: optional APIs that could be very helpful:

  • condition(d::DynamicPPLDistribution, x::NamedTuple)
  • gradient rules, including rrule and frule.
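The four core functions above could be given generic fallbacks along these lines (a sketch only: `flatten`/`unflatten` follow the naming in this comment and are not an existing API; the dispatch structure is an assumption):

```julia
using Distributions, Random

# Generic fallbacks for all Distributions.jl distributions; users overload
# these four functions for their own non-Distributions types.
logpdf(d::Distribution, x) = Distributions.logpdf(d, x)
rand(rng::AbstractRNG, d::Distribution) = Base.rand(rng, d)

# flatten <-- vectorise: turn a value from d's support into a flat vector.
flatten(d::UnivariateDistribution, x::Real) = [x]
flatten(d::MultivariateDistribution, x::AbstractVector) = copy(vec(x))
flatten(d::MatrixDistribution, x::AbstractMatrix) = copy(vec(x))

# unflatten <-- reconstruct: rebuild a correctly-shaped value from the vector.
unflatten(d::UnivariateDistribution, v::AbstractVector) = only(v)
unflatten(d::MultivariateDistribution, v::AbstractVector) = copy(v)
unflatten(d::MatrixDistribution, v::AbstractVector) = reshape(copy(v), size(d))
```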

@devmotion
Member

I'm on vacation, hence only some quick comments:

> Allows us to define specific AD-rules, e.g. we can use Enzyme for logpdf-computations without affecting the broader community

IMO this is a major disadvantage: why should we not let the whole community benefit from AD improvements? IMO we should just add AD rules for logpdf (that improve on plain AD) to Distributions.

> Broadening support for RHS types. By decoupling the logpdf and rand methods from Distributions, it becomes much easier to support non-Distributions, e.g. MeasureTheory.jl.

Isn't this what interfaces such as DensityInterface are designed for, which are AFAIK supported by both Distributions and MeasureTheory?
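For reference, Distributions.jl does implement DensityInterface.jl, so the package-agnostic call already works today (MeasureTheory.jl reportedly supports the same interface):

```julia
using Distributions, DensityInterface

# logdensityof is the generic DensityInterface entry point; for a
# Distributions.jl object it agrees with logpdf.
d = Normal(0.0, 1.0)
logdensityof(d, 0.0) == logpdf(d, 0.0)
```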

@torfjelde
Member Author

> IMO this is a major disadvantage: why should we not let the whole community benefit from AD improvements? IMO we should just add AD rules for logpdf (that are an improvement over AD) to Distributions.

I most certainly agree with this! But sometimes we might want to make more "opinionated" changes that are specifically beneficial to Turing. For example, would you be willing to introduce an rrule for Distributions.logpdf that uses Enzyme to implement it? To me that seems a bit "too" opinionated to make its way into the entire ecosystem, but it would most certainly benefit 95% of Turing.jl users.

> Isn't this what interfaces such as DensityInterface are designed for, which are AFAIK supported by both Distributions and MeasureTheory?

Aaah that might be true! Had completely forgotten about DensityInterface; my bad.

@devmotion
Member

> For example, would you be willing to introduce an rrule for Distributions.logpdf that uses Enzyme to implement it?

I think in general that should be done neither in Distributions nor in DynamicPPL. According to my understanding, rrules shouldn't call out to completely different AD systems. Or at least only to those that are as well-established, stable, and fundamental as ForwardDiff.
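To illustrate the pattern under discussion, here is a hedged sketch of an rrule that calls out to ForwardDiff (the kind of well-established AD mentioned above) for the gradient with respect to x; `_logpdf` is a made-up DynamicPPL-owned name, not actual code from either package:

```julia
using ChainRulesCore, ForwardDiff, Distributions

# Hypothetical package-owned wrapper; owning the function avoids pirating
# an rrule for Distributions.logpdf itself.
_logpdf(d, x) = logpdf(d, x)

function ChainRulesCore.rrule(::typeof(_logpdf), d::ContinuousUnivariateDistribution, x::Real)
    y = _logpdf(d, x)
    # Delegate the x-derivative to ForwardDiff; an Enzyme-backed rule
    # would slot in at this same point.
    dydx = ForwardDiff.derivative(z -> logpdf(d, z), x)
    _logpdf_pullback(ȳ) = (NoTangent(), NoTangent(), ȳ * dydx)
    return y, _logpdf_pullback
end
```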

@torfjelde
Member Author

I very much agree with you, but the reality is that AD performance is a huge bottleneck for us. Being able to do some specialized stuff here and there that is "not great in general" could be quite useful to squeeze out more perf.

@yebai closed this as not planned (won't fix, can't repro, duplicate, stale) Dec 19, 2024
@yebai
Member

yebai commented Dec 19, 2024

There are not enough benefits, since Distributions.jl will be the only probability distribution package for the foreseeable future. Furthermore, `to_distribution`/`to_sampleable` provide a way to import user-specified distributions into DynamicPPL.

Please reopen if the situation changes.

@yebai reopened this Dec 19, 2024
@yebai closed this as completed Dec 19, 2024