Remove selector stuff from VarInfo tests and link/invlink #780
base: release-0.35
Conversation
Codecov Report
Attention: Patch coverage is […]

@@              Coverage Diff                @@
##           release-0.35     #780      +/-  ##
================================================
- Coverage         86.16%   85.24%   -0.92%
================================================
  Files                36       36
  Lines              4301     4372      +71
================================================
+ Hits               3706     3727      +21
- Misses              595      645      +50
Sorry, spoke too soon.
Okay, we are back.
Markus, could you say something about why using samplers for `link`/`invlink` is being removed?
Sure, @sunxd3. This is all part of a general push to get rid of marking different variables as belonging to different samplers in `VarInfo`.

I view this mostly as a modularity/separation-of-concerns thing: `VarInfo`'s job is to keep track of which variable has which value, how they've been transformed, and maybe what the accumulated logprob is. What exactly is being done with the model that warrants keeping track of these things, like whether we are drawing a single sample from the prior or doing MCMC or variational inference or predicting or whatever it is, is a higher-level issue downstream of `VarInfo`, and `VarInfo` should be blind to it.

To be philosophical about it, this is how I think it currently looks:

[diagram]

and this is how I think it should look:

[diagram]
There are also some more practical benefits: […]
Note that the only reason (I think) why this indexing of […]
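For concreteness, here is a small, hedged illustration of the responsibilities listed above, using a toy model and public DynamicPPL functions (the exact API surface differs between versions, so treat this as a sketch rather than a reference):

```julia
using DynamicPPL, Distributions

@model function demo()
    x ~ LogNormal()
end

vi = VarInfo(demo())

vi[@varname(x)]           # which variable has which value
istrans(vi, @varname(x))  # whether the variable is stored in transformed (linked) form
getlogp(vi)               # the accumulated log probability
```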
Thanks a lot for the explanation, crystal clear!
Solid improvements, thanks.
Co-authored-by: Xianda Sun <[email protected]>
My overarching feeling about this so far is that we are being really loose with types. I think it's great to offer the user some convenience, but in this case I fear that we have too many pathways, which makes it very hard to reason about. E.g. if you pass a tuple to `link!!`, it's really hard to keep track of the output type, and then that output type becomes the input to another function which might convert it to a NamedTuple, but in a type-stable way; but on the other hand if you pass it a vector, sure, the eventual outcome is the same, but it follows a rather different winding path through `abstract_varinfo.jl` and `varinfo.jl`.

The main question I have is really whether we need this flexibility? All the linking functions are entirely internal to Turing and so external user convenience isn't really a consideration.

If we made sure that all of these functions could only take one container, then it appears to me that we could potentially:

(1) cut some of the code that is repeated
(2) guarantee that it proceeds via a path that is type stable
(3) make it easier to follow the dispatch chain when we revisit this

which would be a win in my books 😄 and I'd be super, super happy to wrap my lists in a tuple before calling `link!!` if it means that we get the above.
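To make the type-visibility point concrete, here is a purely illustrative sketch (not DynamicPPL's actual `link!!` code; the helper `sym` and the function `syms` are invented for this example). A `VarName` carries its symbol in its type, so a tuple of `VarName`s exposes every symbol to the compiler, whereas a `Vector{<:VarName}` hides them behind its element type:

```julia
using DynamicPPL: VarName, @varname

# Local helper mirroring AbstractPPL's symbol accessor, defined here so the
# snippet is self-contained.
sym(::VarName{s}) where {s} = s

# Tuple input: each element's concrete type (e.g. VarName{:x, ...}) is part of
# the tuple's type, so downstream code can, say, build a NamedTuple keyed by
# these symbols in a type-stable way.
syms(vns::Tuple{Vararg{VarName}}) = map(sym, vns)

# Vector input: the same operation, but the individual symbols are no longer
# visible in the container's type, only at runtime.
syms(vns::AbstractVector{<:VarName}) = map(sym, vns)

syms((@varname(x), @varname(y)))   # (:x, :y)
syms([@varname(x), @varname(y)])   # [:x, :y], structure invisible to inference
```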
src/abstract_varinfo.jl
Outdated
    link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, model::Model)
    link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, spl::AbstractSampler, model::Model)
    link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, vn::VarName, model::Model)
    link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, vn::Tuple{N,VarName}, model::Model)
    link!!([t::AbstractTransformation, ]vi::AbstractVarInfo, vn::AbstractVector{<:VarName}, model::Model)
I mightttt have mentioned this before, but I still would really prefer if we didn't define a single-VarName method.

- There are many settings in DynamicPPL / Turing where we have some variable `vn` which might be a VarName or an iterable of VarNames depending on the context, and it's difficult to guess which one is happening when it gets passed to a function that can take either.
- Another general reason to not declare convenience methods is that complexity grows in a multiplicative sense. If we have M types of `AbstractTransformation` and N types of `AbstractVarInfo`, then for every possible type that we let the `vns` parameter be, we are essentially declaring M*N more methods, for which we need to make sure that they all exist, behave correctly, and don't generate any method ambiguities (the act of resolving method ambiguities itself leads to code duplication). See the toy sketch after this comment.

(But at least it's better than allowing a new optional parameter, for which complexity grows exponentially, cf. our situation with `evaluate!!`.)
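A toy sketch of the multiplicative growth described in the second bullet above; all types and the `toylink` function are invented for illustration and are not DynamicPPL's:

```julia
abstract type ToyTransformation end
struct StaticT <: ToyTransformation end
struct DynamicT <: ToyTransformation end

abstract type ToyVarInfo end
struct TypedVI <: ToyVarInfo end
struct UntypedVI <: ToyVarInfo end

# Supporting a tuple of names already means M*N = 2*2 combinations that must all
# exist, behave correctly, and stay free of method ambiguities.
toylink(::StaticT, ::TypedVI, vns::Tuple) = :static_typed
toylink(::DynamicT, ::TypedVI, vns::Tuple) = :dynamic_typed
toylink(::StaticT, ::UntypedVI, vns::Tuple) = :static_untyped
toylink(::DynamicT, ::UntypedVI, vns::Tuple) = :dynamic_untyped

# Accepting one more container type (here a bare Symbol standing in for a single
# VarName) adds another M*N combinations, even when written as one forwarding
# method, because it has to interact cleanly with all of the methods above.
toylink(t::ToyTransformation, vi::ToyVarInfo, vn::Symbol) = toylink(t, vi, (vn,))

toylink(StaticT(), TypedVI(), (:x, :y))  # :static_typed
toylink(DynamicT(), UntypedVI(), :x)     # :dynamic_untyped
```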
I've removed this now, even though I would like to have the single-VarName version. The reason I've removed it is that the dispatch on `link`/`invlink` is already too complicated, which indeed causes method ambiguities. I think if the dispatch got cleaned up otherwise we could reintroduce the single-VarName version, which we should be able to do with just a single method without ambiguities (because no other method would accept a single VarName). However, for now it's not worth the code complexity, given that it's not really needed anywhere.

For the first point of being able to see the type at the call site, I see what you mean, but I think this is something you have to largely give up in a dynamically typed language like Julia, and I think the ergonomics of the single-VarName method would outweigh it.
> I think this is something you have to largely give up in a dynamically typed language like Julia

Maybe not very surprising, but I don't agree with this 🙈
Dynamic typing means that it's not possible to unambiguously determine the type of something (using type inference algorithms, or in practice, LSP), but often it's possible to make limited inferences from the known function signatures. In a dynamically typed language, it's not possible to verify these inferences at compile time, but IMO it doesn't follow from this that we should make it harder for ourselves to make such inferences.
The most frequently cited benefit of dynamic typing is the ease of prototyping. However, I think this primarily applies to the end users of the language, i.e. people who end up using Turing. If this was a case of providing nice APIs for end users, I'd be totally on board with extra convenience methods, but I think the `(inv)?link(!!)?` functions are pretty much entirely for internal use. I believe (and I hope it's not a controversial opinion!) that it's easier to maintain, reason about, and guarantee the correctness of software that is statically typed, and although we can't do that in Julia, I do think that any step we can take in that direction is a net positive, and our lives will be easier if we are more disciplined about our internal APIs 😉
Anyway, I just wanted to write it out because I suspect that me pushing back against multiple dispatch 'overuse' (quotes because others may not agree that it's overuse!) might be a continuing trend in PRs. I do recognise it's not my personal codebase and I'll always be happy to compromise on the outcome, in the sense that if you say you really want a convenience method I'll just say 'ehh, whatever', but I think it's unlikely that I'll really change my mind on the underlying principle. So I write it out once and I'll spare all of you this next time 😄 Maybe I will bookmark it and copy paste the link whenever people ask me about it hahah
(Btw just on this PR I'm happy with how it stands with dispatch so feel free to resolve this convo!)
Just a handful more comments to the previous […]

I agree that this is the big question. I, too, would greatly enjoy the simplicity of only supporting tuples. We clearly want tuples, because vectors are impossible for type stability. If we only use tuples, the case I worry about is something like creating 1000-element tuples, and type inference having a heart attack, resulting in either exploding inference times or it giving up on inference and giving us abstract types and horrible runtime performance. This could be premature optimisation, but I have a rule of thumb living in my head that very long tuples are bad and will make you sad at some point. Doing a search on Discourse about issues relating to big tuples causing slow/failing inference brings up many threads.

One option would be defining a best-of-all-worlds […]

One aspect that might make my worries about large tuples premature optimisation is that I don't think Turing internals have much of a need for calling […]

Another factor in this question is the use of […] I should go through TuringLang repos and see what are all the ways […]
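As a rough, hedged illustration of the long-tuple worry (the exact thresholds are Julia implementation details and vary between versions, so this is a sketch, not a benchmark):

```julia
# A heterogeneous 1000-element tuple has a type with 1000 parameters.
small = Tuple(isodd(i) ? i : Float64(i) for i in 1:10)
big   = Tuple(isodd(i) ? i : Float64(i) for i in 1:1000)

addone(xs) = map(x -> x + 1, xs)

@time addone(small)  # compiles quickly and infers a concrete tuple type
@time addone(big)    # first call can be dominated by compilation, or inference
                     # may give up on the huge tuple type and fall back to
                     # abstractly-typed code
```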
This is great @mhauru 🎉

I'm a bit OOL, but my recommendation would be against exposing varname-specific linking to the "user". As you said, ideally we'd move away from this (and it can be done by a combination of […]).

I would do: […]

Is there a reason why this wouldn't work / is not ideal? Happy to have a quick chat about all of this sometime this week if you want :)
Also loved the illustrations above 😅
Co-authored-by: Penelope Yong <[email protected]>
I searched in TuringLang for all uses of […]

I also removed the single […]

I would like to keep some way for users to manually link individual variables, so I've left the tuple version. I would like to keep it because we export […]

In the process I also added some code duplication to reduce the use of […]
Just some minor stuff remaining
I'll go for a walk; if I come back and CI has passed I'll approve 😄

Last question, which doesn't necessarily need to be acted upon now: I wonder if it's worth keeping a note of what was changed somewhere. DPPL doesn't have a changelog (not yet, at least; my other PR #792 puts one in), but it makes sense to me that each PR should add / modify the changelog entry so that when we do the big 0.35 merge we're not trying to collate all the changes.
Thanks for spending the time to clean it up and for bearing with all my comments 😄
This PR implements a part of moving away from using Selectors/Gibbs IDs/indexing by samplers. It does two things:

1. Removes the indexing of `VarInfo`s by samplers in `test/varinfo.jl`.
2. Changes `link`, `link!!`, `invlink`, and `invlink!!` to not take samplers as arguments anymore, but rather (iterables of) `VarName`s.

These two changes were easiest to do at the same time, for reasons.
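A hedged before/after sketch of the call-site change (the new signatures follow the docstring quoted earlier in this thread; the toy model and exact details here are made up and may differ from the merged code):

```julia
using DynamicPPL, Distributions

@model function demo()
    x ~ LogNormal()
end

model = demo()
vi = VarInfo(model)

# Before this PR: which variables got linked was decided via a sampler argument.
# vi = link!!(vi, SampleFromPrior(), model)

# After this PR: the variables are named explicitly, as a VarName or an iterable
# of VarNames (here a one-element tuple).
vi = link!!(vi, (@varname(x),), model)
```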
Note that more cleanup of Selector/Gibbs ID stuff is to come, but I'm asking for code review at this stage to avoid one megareview. This PR is not to be merged to `master`, but rather to #779, which will collect all Selector-removal changes.