Using JET.jl to determine if typed varinfo is okay #728

Merged
merged 65 commits into from
Dec 10, 2024

torfjelde
Member

After a quick experiment with JET.jl I found some bugs in DynamicPPL.jl (#726), but also realized that we can use JET.jl to properly check whether a given model supports the usage of TypedVarInfo rather than requiring UntypedVarInfo.

This has been a looooooong standing issue, and this seems to work really, really well.

The problem

In Turing.jl, we use TypedVarInfo almost everywhere due to the performance characteristics that come with it. The problem is that we do so by simply evaluating the given model once and then using the resulting (hopefully concretely typed) varinfo for all subsequent computations. This works nicely for most typical models, but fails horribly (and uninformatively) for a good chunk of models, such as

@model function demo1()
    x ~ Bernoulli()
    if x
        y ~ Normal()
    else
        z ~ Normal()
    end
end

Here we will execute the model once and get, say, a TypedVarInfo containing the variables x and y (because x happened to result in a true sample). If we then re-use this varinfo for sampling, we will ofc run into issues since z is nowhere to be seen.

Technically we can handle this by just widening the container a bit, but if we do that, we need to capture the new varinfo, which isn't always possible, e.g. when using the LogDensityFunction in a sampler.

As a result, we have a lot of code that just makes the assumption "surely this model is 'static' in what variables and types it contains", which can sometimes be false.
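
To make the problem concrete, here is a minimal sketch in plain Julia. It uses a NamedTuple as a stand-in for the typed varinfo and a Dict as a stand-in for the untyped one; these stand-ins are assumptions for illustration only, not DynamicPPL's actual data structures.

```julia
# Toy stand-ins (assumptions for illustration, not DynamicPPL types):
# a NamedTuple plays the role of the typed varinfo, a Dict the untyped one.
typed = (x = true, y = 0.3)
untyped = Dict{Symbol,Any}(:x => true, :y => 0.3)

# The variable names are part of the NamedTuple's type, so `z` has no slot.
typed_has_z = haskey(typed, :z)

# "Widening" works, but it returns a new object of a different type;
# `typed` itself is unchanged, so the caller has to capture `widened`.
widened = merge(typed, (z = -1.2,))
same_type = typeof(widened) == typeof(typed)

# The untyped stand-in can simply be mutated in place.
untyped[:z] = -1.2
```

The typed stand-in is fast precisely because its set of variables is fixed in the type, which is also why a model that sometimes produces `z` cannot reuse it without the caller capturing a new object.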

The solution

This PR introduces a determine_varinfo method, which uses the abstract interpretation offered by JET.jl to automagically figure out, statically, whether we can use the type-stable varinfo properly (i.e. without having to always capture the resulting varinfo, etc.) or whether we need to use the untyped varinfo.

Effectively what determine_varinfo does is:

  1. Execute the model once to get the typed varinfo.
  2. Using JET.jl, statically check if we can run into type issues, e.g. a container of type NamedTuple{(:x, :y)} cannot handle the value for z being updated (because the entry does not exist).
  3. If we do run into errors, we return an untyped varinfo. If we don't, we return a typed one.

Note that this method doesn't say anything about whether there might be type instabilities; this only checks if we would encounter errors. We can also use JET to check type instabilities, etc., but I think that's a separate functionality and thus PR.
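
Step 2 can be illustrated outside of DynamicPPL with a small sketch. It assumes JET.jl is installed, and it uses a hypothetical `toy_model` function plus a NamedTuple stand-in for the typed varinfo; the real check in this PR analyses the model evaluation rather than a hand-written function.

```julia
using JET

# Hypothetical stand-in for a model with a data-dependent branch:
# one branch touches `y`, the other touches `z`.
toy_model(container, flag::Bool) = flag ? container.y : container.z

# Stand-in for the typed varinfo obtained from a single run in which
# only `x` and `y` were sampled.
typed = (x = true, y = 0.3)

# Analyse the call statically, from argument types only; nothing is executed.
result = JET.report_call(toy_model, (typeof(typed), Bool))
reports = JET.get_reports(result)

# Mirror step 3: keep the typed container only if no error was reported.
use_typed = isempty(reports)
```

In this sketch the `z` branch accesses a field that the NamedTuple type does not have, which is the kind of issue JET is expected to report, so `use_typed` should come out `false` even though a run with `flag == true` would succeed.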

@torfjelde
Member Author

See the tests for what we can properly check here. It honestly seems really good for our purposes 👀

torfjelde and others added 3 commits November 28, 2024 15:44
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@yebai
Member

yebai commented Nov 28, 2024

That seems like an elegant trick!

codecov bot commented Nov 28, 2024

Codecov Report

Attention: Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.

Project coverage is 86.49%. Comparing base (0548ddf) to head (4a17e82).
Report is 1 commits behind head on master.

Files with missing lines   Patch %   Lines
src/experimental.jl        60.00%    2 Missing ⚠️
src/DynamicPPL.jl          80.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #728      +/-   ##
==========================================
+ Coverage   86.34%   86.49%   +0.15%     
==========================================
  Files          34       36       +2     
  Lines        4254     4272      +18     
==========================================
+ Hits         3673     3695      +22     
+ Misses        581      577       -4     


@coveralls

coveralls commented Nov 28, 2024

Pull Request Test Coverage Report for Build 12236332369

Details

  • 25 of 28 (89.29%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 86.493%

Changes Missing Coverage   Covered Lines   Changed/Added Lines   %
src/DynamicPPL.jl          4               5                     80.0%
src/experimental.jl        3               5                     60.0%

Totals Coverage Status
Change from base Build 12224895333: 0.2%
Covered Lines: 3695
Relevant Lines: 4272

💛 - Coveralls

torfjelde and others added 11 commits November 29, 2024 09:25
fallback to current behavior + `supports_varinfo` to `is_suitable_varinfo`
longer needed on Julia 1.10 and onwards + added error hint for when
JET.jl has not been loaded
provided context, but uses `SamplingContext` by default (as this
should be a stricter check than just evaluation)
in sampling context now so no need to handle this explicitly elsewhere
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@torfjelde
Member Author

This honestly seems to work really well. I've now made it so that default_sampler and LogDensityFunction also make use of this. The question is just how well it works with Turing.jl (will try this now).

@torfjelde
Member Author

Added some docs 👍

Member

@mhauru mhauru left a comment

Thanks @torfjelde, this seems really handy. Had a few localised comments, nothing major.

Project.toml (outdated, resolved)
src/varinfo.jl (outdated, resolved)
src/experimental.jl (outdated, resolved)
src/experimental.jl (resolved)
src/experimental.jl (resolved)
src/DynamicPPL.jl (outdated, resolved)
Comment on lines +212 to +230
Base.Experimental.register_error_hint(MethodError) do io, exc, argtypes, _
    requires_jet =
        exc.f === DynamicPPL.Experimental._determine_varinfo_jet &&
        length(argtypes) >= 2 &&
        argtypes[1] <: Model &&
        argtypes[2] <: AbstractContext
    requires_jet |=
        exc.f === DynamicPPL.Experimental.is_suitable_varinfo &&
        length(argtypes) >= 3 &&
        argtypes[1] <: Model &&
        argtypes[2] <: AbstractContext &&
        argtypes[3] <: AbstractVarInfo
    if requires_jet
        print(
            io,
            "\n$(exc.f) requires JET.jl to be loaded. Please run `using JET` before calling $(exc.f).",
        )
    end
end
Member

Could there be some way to test this? I do see that it's tricky. I'm a bit uncomfortable having this in without any testing.

Member Author

Yeah was thinking the same. We could put in a test strictly before loading JET.jl ofc. It's a bit messy, but seems like the best way 😕

Member

Does wrapping the tests in separate modules save us?

Member Author

Nah. AFAIK extensions trigger if the package is loaded at any point, e.g. even if a dep loads it

Member Author

It's also a thing where it doesn't seem like we can nicely get the resulting error message (the error hint is not in the msg of the error or something). So I think we just leave this for now 😕

Member

Sure, it does seem nasty to test for. Have you tried locally that it does what you expect?
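
On the testing question in this thread: since the hint is added when the error is displayed rather than stored in the exception, one option might be to render the caught MethodError with showerror and inspect the text. A minimal sketch in plain Julia, using a hypothetical stub `needs_extension` in place of the DynamicPPL functions:

```julia
# Hypothetical stub with no methods, standing in for a function whose
# methods would only be defined by a package extension.
function needs_extension end

# Register a hint for MethodErrors raised by that stub.
Base.Experimental.register_error_hint(MethodError) do io, exc, argtypes, kwargs
    if exc.f === needs_extension
        print(io, "\nneeds_extension requires the extension package to be loaded.")
    end
end

# Trigger the MethodError and render it the way the REPL would.
rendered = try
    needs_extension(1)
catch err
    sprint(showerror, err)
end
```

A test could then check `occursin("requires the extension package", rendered)`. This does not address the other problem raised above, namely that the extension triggers as soon as JET.jl is loaded anywhere in the session.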

src/experimental.jl (resolved)
ext/DynamicPPLJETExt.jl (outdated, resolved)
Project.toml (outdated, resolved)
src/experimental.jl (outdated, resolved)
Comment on lines +46 to +60
@model function demo5()
    x ~ Normal()
    xs = Any[]
    push!(xs, x)
    # `sum(::Vector{Any})` can potentially error unless the dynamic dispatch manages to resolve the
    # correct `zero` method. As a result, this code will run, but JET will raise this as an issue.
    return sum(xs)
end
# Should pass if we're only checking the tilde statements.
@test DynamicPPL.Experimental.determine_suitable_varinfo(demo5()) isa
    DynamicPPL.TypedVarInfo
# Should fail if we're including errors in the model body.
@test DynamicPPL.Experimental.determine_suitable_varinfo(
    demo5(); only_ddpl=false
) isa DynamicPPL.UntypedVarInfo
Member Author

This is the example mentioned above @mhauru :)

@torfjelde
Member Author

You happy with this now @mhauru ?:) It's only failing because of the x86 OOM thingy

Member

@mhauru mhauru left a comment

Yup, am happy, thanks!

@torfjelde torfjelde merged commit 145f471 into master Dec 10, 2024
11 of 13 checks passed
@torfjelde torfjelde deleted the torfjelde/determine-varinfo branch December 10, 2024 09:48