-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement new sort inference algorithm #3673
Conversation
Let's make this into a blog post! Thanks for the diligent research :)
|
Most of this I understand, but I don't completely understand the theory behind the work you're doing with ambiguities. However, that's okay. As long as it works in practice, I understand the algorithm itself well enough to be able to say that it should be fine. Let's see what the implementation looks like and how it performs and behaves once it's complete. |
Road MapThe first step is to implement the core of SimpleSub, but working with our data structures
After this, we need to make the necessary adjustments to change the inferred SimpleSub types to actual K sorts
Finally, we need to handle the more serious complications related to K:
|
93237dd
to
8c1c057
Compare
…tory to be in -kompiled.
64c7f50
to
2b4ed78
Compare
…leBody to differentiate macro rules vs rules using a macro on the LHS
final class MonomorphizationError extends SortInferenceError { | ||
// TODO: Produce better error messages! | ||
public MonomorphizationError(HasLocation loc) { | ||
super("Term is not well-sorted due to monomorphization failure.", loc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error message is quite poor, but I don't think it's actually possible to hit it currently (see my other comments on the "Over-Simplification" conceptual issue).
Conceptual Issue - Unrealizable Sub-Term TypesThe current algorithm consists of two phases:
In particular, this process is run on the inferred "function type" of the overall term (i.e. converting However, I am concerned we could run into the following situation:
For example, consider
The rewrite here has type For rewrites, this is still caught by a later check,
so this has not actually caused issues in practice yet. However, it's likely possible to construct other examples not involving the rewrite itself which this check will miss. SolutionEvery Conceptual Issue - Over-SimiplificationBecause the top-level sort of the term is almost always From our view of K
with
then the inferred type instead because SolutionFor now, this doesn't actually matter because we fully monomorphize types, but if we eventually want to support parametric rules we will need to ensure that the exact sort of (at least some of the) sub-terms is expressible over the sorts of the variables. For every sort that we need to express, we can just consider the usages of any type variable there during co-occurrence analysis to avoid simplifying them aways. However, I'm unsure which sorts actually matter, i.e. which of the following is relevant:
|
bounds.removeIf( | ||
s -> | ||
subsorts.lessThanEq(s, Sorts.KLabel()) | ||
|| subsorts.lessThanEq(s, Sorts.KBott()) | ||
|| subsorts.greaterThan(s, Sorts.K())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not confident that filtering the bounds this way is correct, but it was necessary lest we get multiple unrelated bounds like KLabel
and K
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I fully understand this code, but it seems well thought out, well documented, and well architected, and I'm going to assume that you've run the entire test suite with the CHECKED
mode as the default, which would at least tell you that it's inferring the exact same terms as the old algorithm, so assuming that's correct, this seems reasonable.
See my comment though; if we've adequately tested this on the entire test suite and know that it's identical in behavior to the old system on those tests, and if it's easy to turn off in the event of bugs in other semantics, and it's faster than the old code, we ought to turn this on by default.
"Choose between the Z3-based and SimpleSub-based type inference algorithms, or run both" | ||
+ " and check that their results are equal. Must be one of [z3|simplesub|checked].", | ||
hidden = true) | ||
public TypeInferenceMode typeInferenceMode = TypeInferenceMode.SIMPLESUB; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the rationale behind leaving it off by default. Surely we want it on so that it gets tested by our integration testing moving forward, right?
There might be room to add a bit more documentation on fields and classes that don't currently have javadocs associated with them, though. |
… distinct Variables instances are unequal.
… could be re-processed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've nitpicked the code a bit, but broadly speaking I agree with Dwight - this looks really well thought through, and I'm confident enough in the surrounding infrastructure that we should be OK to get it merged and try it out on live projects using a flag.
Excellent work Scott; well done for seeing this through to the point of merging!
"Choose between the Z3-based and SimpleSub-based type inference algorithms, or run both" | ||
+ " and check that their results are equal. Must be one of [z3|simplesub|checked].", | ||
hidden = true) | ||
public TypeInferenceMode typeInferenceMode = TypeInferenceMode.SIMPLESUB; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will also let us selectively roll the feature out to projects one-by-one and keep an eye on them to make sure we don't see bugs or performance regressions there. I agree that in the end we definitely want this on by default, but it's a big enough feature that for now we should take care rolling it out.
kernel/src/main/java/org/kframework/parser/inner/ParseInModule.java
Outdated
Show resolved
Hide resolved
kernel/src/main/java/org/kframework/parser/inner/disambiguation/inference/CompactSort.java
Outdated
Show resolved
Hide resolved
kernel/src/main/java/org/kframework/parser/inner/disambiguation/inference/CompactSort.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/kframework/parser/inner/disambiguation/inference/SortInferenceError.java
Show resolved
Hide resolved
kernel/src/main/java/org/kframework/parser/inner/disambiguation/inference/SortVariable.java
Outdated
Show resolved
Hide resolved
kernel/src/main/java/org/kframework/parser/inner/disambiguation/inference/SortInferencer.java
Show resolved
Hide resolved
kernel/src/main/java/org/kframework/parser/inner/disambiguation/inference/SortInferencer.java
Outdated
Show resolved
Hide resolved
kernel/src/main/java/org/kframework/parser/inner/disambiguation/inference/SortInferencer.java
Outdated
Show resolved
Hide resolved
…il.mak for testing.
3c3f2bb
to
cef8679
Compare
…t is respected everywhere
…e the only option.
…g in LUB/GLB computation.
I've tested the c-semantics with the new and old algorithm and here are some findings:
The new implementation is much faster. |
Closes #3601.
This PR introduces a new sort inference algorithm, aiming to eventually replace the Z3-based approach and pave the way for parametric rules and sorts.
The design is heavily inspired by The Simple Essence of Algebraic Subtyping: Principal Type Inference with Subtyping Made Easy by Lionel Parreaux. A high-level description explaining the relevant background can be found in
docs/developers/sort_inference.md
.The current PR implements a limited form of the proposed algorithm, still falling back to the Z3-based algorithm for any terms containing:
For reviewers, I would begin by reading
docs/developers/sort_inference.md
for a high-level conceptual understandingSortInferencer.java
briefly explaining how the high-level design is actually implemented with our data structures