Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove selector stuff from VarInfo tests and link/invlink #780

Open
wants to merge 30 commits into
base: release-0.35
Choose a base branch
from

Conversation

mhauru
Copy link
Member

@mhauru mhauru commented Jan 16, 2025

This PR implements a part of moving away from using Selectors/Gibbs IDs/indexing by samplers. It does two things:

  • It removes any and all mentions of indexing VarInfos by samplers in test/varinfo.jl.
  • It reworks link, link!!, invlink, and invlink!! to not take samplers as arguments anymore, but rather (iterables of) VarNames.

These two changes were easiest to do at the same time, for reasons.

Note that more clean up Selector/Gibbs ID stuff is to come, but I'm asking for code review at this stage to avoid one megareview. This PR is not to be merged to master, but rather to #779, which will collect all Selector-removal changes.

@coveralls
Copy link

coveralls commented Jan 16, 2025

Pull Request Test Coverage Report for Build 12813097029

Details

  • 75 of 88 (85.23%) changed or added relevant lines in 4 files are covered.
  • 59 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-12.7%) to 73.636%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/abstract_varinfo.jl 12 16 75.0%
src/threadsafe.jl 2 6 33.33%
src/varinfo.jl 59 64 92.19%
Files with Coverage Reduction New Missed Lines %
src/simple_varinfo.jl 2 84.02%
src/sampler.jl 2 92.54%
src/selector.jl 3 42.86%
src/abstract_varinfo.jl 4 75.82%
src/model.jl 5 80.0%
src/varinfo.jl 21 84.2%
src/threadsafe.jl 22 46.61%
Totals Coverage Status
Change from base Build 12812614355: -12.7%
Covered Lines: 3212
Relevant Lines: 4362

💛 - Coveralls

@coveralls
Copy link

coveralls commented Jan 16, 2025

Pull Request Test Coverage Report for Build 13038103492

Details

  • 146 of 188 (77.66%) changed or added relevant lines in 6 files are covered.
  • 48 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-0.9%) to 85.403%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/abstract_varinfo.jl 20 24 83.33%
src/threadsafe.jl 2 18 11.11%
src/varinfo.jl 105 127 82.68%
Files with Coverage Reduction New Missed Lines %
src/utils.jl 1 74.67%
src/threadsafe.jl 2 54.24%
src/sampler.jl 2 92.54%
src/simple_varinfo.jl 3 80.93%
src/selector.jl 3 42.86%
src/abstract_varinfo.jl 4 78.68%
src/varinfo.jl 33 81.34%
Totals Coverage Status
Change from base Build 13010021727: -0.9%
Covered Lines: 3721
Relevant Lines: 4357

💛 - Coveralls

Copy link

codecov bot commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 83.42246% with 31 lines in your changes missing coverage. Please review.

Project coverage is 85.24%. Comparing base (af65bb0) to head (853f47e).
Report is 4 commits behind head on release-0.35.

Files with missing lines Patch % Lines
src/varinfo.jl 86.40% 17 Missing ⚠️
src/threadsafe.jl 37.50% 10 Missing ⚠️
src/abstract_varinfo.jl 80.95% 4 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff                @@
##           release-0.35     #780      +/-   ##
================================================
- Coverage         86.16%   85.24%   -0.92%     
================================================
  Files                36       36              
  Lines              4301     4372      +71     
================================================
+ Hits               3706     3727      +21     
- Misses              595      645      +50     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mhauru mhauru changed the title Remove selector stuff from VarInfo tests Remove selector stuff from VarInfo tests and link/invlink Jan 23, 2025
@mhauru mhauru marked this pull request as ready for review January 23, 2025 13:53
@mhauru mhauru requested review from sunxd3 and penelopeysm January 23, 2025 15:01
@mhauru mhauru marked this pull request as draft January 23, 2025 15:18
@mhauru
Copy link
Member Author

mhauru commented Jan 23, 2025

Sorry, spoke too soon.

@mhauru mhauru marked this pull request as ready for review January 23, 2025 16:04
@mhauru
Copy link
Member Author

mhauru commented Jan 23, 2025

Okay, we are back.

@sunxd3
Copy link
Member

sunxd3 commented Jan 24, 2025

Markus, could you say something about why using sampler for link is not a good idea? (I do prefer the current approach, but just to be more certain). Particularly the logic/reason behind using sampler to index link (before) and using varnames to index link (after)?

@mhauru
Copy link
Member Author

mhauru commented Jan 24, 2025

Sure, @sunxd3. This all part of a general push to get rid of marking different variables as belonging to different samplers in VarInfo (#573). By the time #779 is ready to merge, I expect Ctrl+F of "Sampler" in varinfo.jl to bring up nothing.

I view this mostly as a modularity/separation of concerns thing: VarInfo's job is to keep track of which variable has which value, and how they've been transformed, and maybe what the accumulated logprob is. What exactly is being done the model that warrants keeping track of these things, like whether we are drawing a single sample from the prior or doing MCMC or variational inference or predicting or whatever it is, is a higher level issue downstream of VarInfo, and VarInfo should be blind to it.

To be philosophical about it, this is how I think it currently looks:

                                                                             
 ┌──────────────────────────────────────────┬──────────────────────────────┐ 
 │                                          │                              │ 
 │          ┌───────┐               VarInfo │ Sampler                      │ 
 │          └─────┐ │                       │                              │ 
 │ ┌─────┐        │ └──────────┐            │                              │ 
 │ └───┐ │        │   ┌─────┐  │            │                              │ 
 │     │ └────────┘   └─┐   └─┐└────┐       │                              │ 
 │     └──────────────┐ │     └───┐ │       │                              │ 
 │                    │ │         │ └───────┘                              │ 
 │                    │ │         └─────────┐                              │ 
 │                    │ │ ◄──────┐    ▲     │                              │ 
 │                    │ │        ├────┘     │                              │ 
 │      ┌─────────────┘ │        │          │                              │ 
 │      │ ┌─────────────┘        │          │                              │ 
 │      │ │                      │          │                              │ 
 │      │ │     Indexing a VarInfo by       │                              │ 
 │      └─┘     samplers                    │                              │ 
 └──────────────────────────────────────────┴──────────────────────────────┘ 

and this is how I think it should look:

 ┌───────────────────────────┬──────────────────────────────┐ 
 │                           │                              │ 
 │                   VarInfo └──┐ Sampler                   │ 
 │                              │                           │ 
 │                           ┌──┘                           │ 
 │                           │                              │ 
 │                           └──┐                           │ 
 │                              │                           │ 
 │                           ┌──┘                           │ 
 │                           │                              │ 
 │                           └──┐                           │ 
 │                              │                           │ 
 │                           ┌──┘                           │ 
 │                           │                              │ 
 └───────────────────────────┴──────────────────────────────┘ 

There are also some more practical benefits:

  • The VarInfo code can be greatly simplified, and we can switch over to using VarNamedVector. Probably some edge case bugs will be fixed as a side effect too.
  • We can have a well-defined interface for VarInfo, that is stable and properly documented. This makes code easier to reason about.
  • Turing's own samplers can be on equal footing with external samplers. Currently a sampler needs to be a of a particular type to be used with VarInfo, and conforming to that requires glue code in Turing.jl. In the future a sampler can just take a state of the sampling process, as a vector or maybe as a VarInfo, modify it and return it, using a natural interface where it just sees a bunch of variables with values, and is blind to the internals of how VarInfo works.

Note that the only reason (I think) why this indexing of VarInfos by samplers was introduced in the first place was Gibbs sampling. The old Gibbs sampler used this mechanism to keep track of which component sampler applied to which variable. Now with the new Gibbs sampler that is no longer needed.

@sunxd3
Copy link
Member

sunxd3 commented Jan 24, 2025

Thanks a lot for the explanation, crystal clear!

Copy link
Member

@sunxd3 sunxd3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid improvements, thanks.

src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
Copy link
Member

@penelopeysm penelopeysm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My overarching feeling about this so far is that we are being really loose with types. I think it's great to offer the user some convenience, but in this case I fear that we have too many pathways which makes it very hard to reason about. e.g. if you pass a tuple to link!!, it's really hard to keep track of the output type, and then that output type becomes the input to another function which might convert it to a namedtuple, but in a type stable way; but on the other hand if you pass it a vector, sure, the eventual outcome is the same, but it follows a rather different winding path through abstract_varinfo.jl and varinfo.jl.

The main question I have is really whether we need this flexibility? All the linking functions are entirely internal to Turing and so external user convenience isn't really a consideration.

If we made sure that all of these functions could only take one container, then it appears to me that we could potentially:

(1) cut some of the code that is repeated
(2) guarantee that it proceeds via a path that is type stable
(3) make it easier to follow the dispatch chain when we revisit this

which would be a win in my books 😄 and I'd be super, super happy to wrap my lists in a tuple before calling link!! if it means that we get the above.

Comment on lines 541 to 545
link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, model::Model)
link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, spl::AbstractSampler, model::Model)
link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, vn::VarName, model::Model)
link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, vn::Tuple{N,VarName}, model::Model)
link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, vn::AbstractVector{<:VarName}, model::Model)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mightttt have mentioned this before, but I still would really prefer if we didn't define a single-VarName method.

  • There are many settings in DynamicPPL / Turing where we have some variable vn which might be a VarName or an iterable of VarNames depending on the context, and it's difficult to guess which one is happening when it gets passed to a function that can take either.
  • Another general reason to not declare convenience methods is that complexity grows in a multiplicative sense. If we have M types of AbstractTransformation and N types of AbstractVarInfo, then for every possible type that we let the vns parameter be, we are essentially declaring M*N more methods, for which we need to make sure that they all exist, behave correctly, and don't generate any method ambiguities (the act of resolving method ambiguities itself leads to code duplication).
    (But at least it's better than allowing a new optional parameter, for which complexity grows exponentially, cf. our situation with evaluate!!.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this now, even though I would like to have the single-VarName version. The reason I've removed it is that the dispatch on link/invlink is already too complicated, which indeed causes method ambiguities. I think if the dispatch got cleaned up otherwise we could reintroduce the single-VarName version, which we should be able to do with just a single method without ambiguities (because no other method would accept a single VarName). However, for now it's not worth the code complexity, given that it's not really needed anywhere.

For the first point of being able to see the type at the call site, I see what you mean, but I think this is something you have to largely give up in a dynamically typed language like Julia, and I think the ergonomy of the single-VarName method would outweigh it.

Copy link
Member

@penelopeysm penelopeysm Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is something you have to largely give up in a dynamically typed language like Julia

Maybe not very surprising, but I don't agree with this 🙈

Dynamic typing means that it's not possible to unambiguously determine the type of something (using type inference algorithms, or in practice, LSP), but often it's possible to make limited inferences from the known function signatures. In a dynamically typed language, it's not possible to verify these inferences at compile time, but IMO it doesn't follow from this that we should make it harder for ourselves to make such inferences.

The most frequently cited benefit of dynamic typing is the ease of prototyping. However, I think this primarily applies to the end users of the language, i.e. people who end up using Turing. If this was a case of providing nice APIs for end users, I'd be totally on board with extra convenience methods, but I think the (inv)?link(!!)? functions are pretty much entirely for internal use. I believe (and I hope it's not a controversial opinion!) that it's easier to maintain, reason about, and guarantee the correctness of software that is statically typed, and although we can't do that in Julia, I do think that any step we can take in that direction is a net positive, and our lives will be easier if we are more disciplined about our internal APIs 😉

Anyway, I just wanted to write it out because I suspect that me pushing back against multiple dispatch 'overuse' (quotes because others may not agree that it's overuse!) might be a continuing trend in PRs. I do recognise it's not my personal codebase and I'll always be happy to compromise on the outcome, in the sense that if you say you really want a convenience method I'll just say 'ehh, whatever', but I think it's unlikely that I'll really change my mind on the underlying principle. So I write it out once and I'll spare all of you this next time 😄 Maybe I will bookmark it and copy paste the link whenever people ask me about it hahah

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Btw just on this PR I'm happy with how it stands with dispatch so feel free to resolve this convo!)

src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/varinfo.jl Outdated Show resolved Hide resolved
Copy link
Member

@penelopeysm penelopeysm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a handful more comments to the previous

src/varinfo.jl Outdated Show resolved Hide resolved
src/varinfo.jl Outdated Show resolved Hide resolved
test/varinfo.jl Show resolved Hide resolved
@mhauru
Copy link
Member Author

mhauru commented Jan 27, 2025

The main question I have is really whether we need this flexibility?

I agree that this is the big question. I, too, would greatly enjoy the simplicity of only supporting tuples.

We clearly want tuples, because vectors are impossible for type stability.

If we only use tuples, the case I worry about is something like

x = Vector{T}(undef, 1000)
for i in 1:1000
    x[i] ~ dist()
end

creating 1000-element tuples, and type inference having a heart attack, resulting in either exploding inference times or it giving up on inference and giving us abstract types and horrible runtime performance. This could be premature optimisation, but I have a rule of thumb living in my head that very long tuples are bad and will make you sad at some point. Doing a search on Discourse about issues relating to big tuples causing slow/failing inference brings up many threads.

One option would be defining a best-of-all-worlds VarName container type, a lot like the NamedTuple of symbol => vector of varnames that I use for TypedVarInfo, and starting all calls to link/invlink by converting to that type and having the rest of the call stack handle nothing else. This would require rewriting the actual workhorse functions like _link for UntypedVarInfo to handle the awkward and unnecessarily complicated nesting of NamedTuples.

One aspect that might make my worries about large tuples premature optimisation is that I don't think Turing internals have much of a need for calling link/invlink on some variables and not others. We even had plans of removing the possibility of having some variables linked and others unlinked. The reason we did not do that, is that Gibbs ends up creating varinfos with mixed linking by merging varinfos that are all-linked with ones that are unlinked. However, we might still never need to call explicitly link with a list of varnames, it might all get done with merge and subset. This could save us from having the vector methods, though I feel a bit uncomfortable having downstream usage define interface details: Having a way to link even large subsets of variables is a very natural part of the interface that feels like it should be there even if we don't see a current use.

Another factor in this question is the use of subsumes. I'm not at all certain we are consistent on how things like link(vi, @varname(x)) work when vi has x[1], x[2], etc. I've tried to ignore that question here, to focus on removing sampler stuff, but it could be an escape hatch from 1000-element tuples.

I should go through TuringLang repos and see what are all the ways link and invlink actually get called, see if we could do without the vector version.

@torfjelde
Copy link
Member

This is great @mhauru 🎉

I'm a bit OOL, but my recommendation would be against exposing varname-specific linking to the "user".

As you said, ideally we'd move away from this (and it can be done by a combination of subset and merge), and it will simplify things quite a bit I'd imagine.

I would do:

  1. Define link / invlink for the entire VarInfo, which simply defers to its metadata.
  2. Define link / invlink only for (metadata, varname) pairs, i.e. no tuples or vectors.

Is there a reason why this wouldn't work / is not ideal?

Happy to have a quick chat about all of this sometime this week if you want:)

@torfjelde
Copy link
Member

Also loved the illustrations above 😅

@mhauru
Copy link
Member Author

mhauru commented Jan 28, 2025

I searched in TuringLang for all uses of link/invlink, and internally we don't use anything other than the link/invlink(varinfo, model) version, which links all variables. Hence I'm happy with not having the vector option, and I've removed it.

I also removed the single VarName version just to simplify dispatch and reduce the multitude of methods. I would ideally like to reintroduce it at some point since I think it's nice for ergonomics, but since we don't need it I think the simplicity is worth it for now.

I would like to keep some way for users to manually link individual variables, so I've left the tuple version. I would like to keep it because we export merge, which allows a user to create VarInfos with mixed linking, and thus I think they should have the ability to manipulate the linking as well.

In the process I also added some code duplication to reduce the use of Union types, make the dispatch cascade easier to reason about, and make some error messages more informative. The principle is now that for each subtype of AbstractVarInfo you have to separately define link and invlink for the set of arguments that includes a tuple of VarNames and for the set that doesn't; i.e. there is no fallback where if no list of VarNames is given we would get a default option with keys(varinfo).

Copy link
Member

@penelopeysm penelopeysm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor stuff remaining

src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/abstract_varinfo.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/varinfo.jl Outdated Show resolved Hide resolved
src/varinfo.jl Outdated Show resolved Hide resolved
src/varinfo.jl Outdated Show resolved Hide resolved
@penelopeysm
Copy link
Member

penelopeysm commented Jan 29, 2025

I'll go for a walk, if I come back and CI has passed i'll approve 😄

Last question, which doesn't necessarily need to be acted upon now: I wonder if it's worth keeping a note of what was changed somewhere. DPPL doesn't have a changelog (not yet, at least, my other PR #792 puts one in) but it makes sense to me that each PR should add / modify the changelog entry so that when we do the big 0.35 merge we're not trying to collate all the changes.
I figure that this might be an intermediate stage, so maybe you don't want to write something now, but I thought it might just be something to keep in mind

@mhauru mhauru closed this Jan 29, 2025
@mhauru mhauru reopened this Jan 29, 2025
@penelopeysm
Copy link
Member

Thanks for spending the time to clean it up and to bear with all my comments 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants