Develop a more principled type inference algorithm #3601

Scott-Guest · 2023-08-24T16:17:08Z

In order to improve performance and eventually add support for more powerful types, we should develop a robust type inference algorithm and only rely on the current Z3 implementation as a fallback.

Scott-Guest · 2023-08-24T16:17:35Z

Ideally, we want to support:

subtyping
parametric sorts
parametric rules

Scott-Guest · 2023-08-24T16:19:32Z

The SimpleSub algorithm provides some nice inspiration:

Paper: The Simple Essence of Algebraic Subtyping: Principal Type Inference with Subtyping Made Easy
Demo: https://lptk.github.io/simple-sub/
Scala Implementation: https://github.com/LPTK/simple-sub

In summary, this algorithm is quite similar to the usual HM-style Algorithm W, but tracking subtyping constraints rather than performing unification. The results are then encoded using union and intersection types, roughly,

encoding "any type which is a subtype of X" as α ∧ X
encoding "any type which is a supertype of X" as α ∨ X

followed by a few simplifications passes to produce compact types.

We can use this to infer a general solution for the sorts, then can find all the maximal models (in the sense of the current type inference algorithm) by enumerating the monomorphizations and interpreting ∧/∨ as meet/join in the sort poset.

Scott-Guest · 2023-09-11T19:21:11Z

Translation from Sort Inference to SimpleSub

Disregarding ambiguities, we can apply the SimpleSub algorithm to our problem as follows:

Create an initial typing context Γ by letting P : ∀α1. ... ∀αN. S1 -> ... -> SK -> S for each Production with parameters α1, ..., αN, sort S, and non-terminals of sorts S1, ..., SK. Here, P is just some fresh identifier we associate to the Production.
For T a given Term, construct a corresponding SimpleSub-like term λx1. ... λxN. T, where x1, ..., xN are the variables occurring in the rule by treatingProductionReferences as function applications of the corresponding identifier we mention above.
Infer the type of this SimpleSub-like term in the context Γ. The resulting type will be of the form ∀α1. ... ∀αK. S1 -> ... -> SN -> S. In some sense, this tells us that x1 : S1, ..., xN : SN and the rule itself has sort S.

More explicitly, if K = 0 and the type is fully monomorphic, this gives use the exact sorts of the variables. Otherwise, this give us the general form of all solutions, and we can find the all maximal solutions by enumerating the monomorphizations.

This translation is just conceptual - the actual implementation can work with the existing Term data structures as-is.

To extend this to handle ambiguities, we begin by performing type inference on each unambiguous section of the parse-forest, but parametric over any contained ambiguities. Specifically,

Split the provided Term into numerous sub-trees by cutting it at each Ambiguity. That is, each sub-tree will have a root which is either the top-most Term or some sub-Term whose parent was originally an Ambiguity node, and the branches of this sub-tree only extend until the first encountered child Ambiguity node.
For each of these sub-trees, construct a corresponding SimpleSub-term as described before, but replace each leaf Ambiguity node with a term amb x1 ... xN where amb is a fresh-variable and x1, ..., xN are again all the variables we are attempting to infer. Abstract this SimpleSub-term over amb, i.e., prepend it with λamb. ...
For each sub-tree, the inferred type will now be of the form ∀α1. ... ∀αK. (T1 -> ... -> TN -> T) -> .... -> S1 -> ... -> SN -> S, with an inferred argument type T1 -> .... TN -> T for each ambiguity. Here, T will tell us the expected type of the term in the ambiguity

Once we have this local information for each unambiguous section, it in effect shows us all the typing-constraints that a particular section of the parse-forest places on the variables. We can then use this to prune the parse-forest by subsituting in the inferred types of ambiguities, forming a set of all possibilities at each ambiguity point.

Scott-Guest · 2023-09-17T12:29:21Z

Parametric Sorts with Non-Structural Subtyping

Consider types as finite trees built up from a finite number of type constructors. There are two types of subtyping orders you can have over such types:

structural subtyping, where subtyping relationships can only occur between types with the same constructor and arity.
non-structural subtyping, where subtyping relationships can occur even when the types may not have the same constructor or arity.

On first glance, I believed this algorithm would support parametric sorts quite easily. However, it in fact only does so in a limited context, namely

structural subtyping where each parameter is explicitly specified as either covariant, contravariant, or invariant.
a limited form of non-structural subtyping where constructors much match but arity may not (e.g. extensible records or variants)

K, on the other hand, supports completely arbitrary subsorting regardless of constructor or arity. As far as I'm aware, there is no standard algorithm which can actually handle this in full generality.

As a fallback then, if we eventually want to fully support parametric sorts, we'll likely still need to rely on a constraint solver when we run into these sort of constraints. I'll try to design things though so that we can still handle most cases directly with the inference algorithm and only pass off some small required formulas to Z3.

radumereuta · 2023-09-19T14:36:07Z

When this is concluded we should close this issue: #3551

radumereuta · 2023-09-19T15:29:52Z

Performance issues with the C-semantics
run with --profile-rule-parsing <file>

Scott-Guest · 2023-10-10T15:49:21Z

#3673's description includes a more flushed out explanation of the proposal above

Baltoli · 2023-10-26T15:55:20Z

It seems like the "fall back to Z3" strategy we discussed previously will be viable; @Scott-Guest will look into this once there is an end-to-end implementation for the partial deterministic inferencer.

@radumereuta mentions that we should keep consistency in mind as well as correctness - @Scott-Guest has a script to make sure that the inference results are the same as in Z3; if results are different it needs to be checked why that's the case. Z3 might not be correct!

…ms (#3814) Previously, when parsing programs, the subsort declarations `syntax KItem ::= Sort` were only added to the parsing module. However, sort inference is run in the disambiguation module, so they should be included there as well. This is necessary for #3601.

Scott-Guest self-assigned this Aug 24, 2023

radumereuta mentioned this issue Sep 19, 2023

[K-Bug] #Equals of #Equals produces invalid kore #3605

Open

6 tasks

Scott-Guest mentioned this issue Sep 26, 2023

Inconsistent type inference on MacOS 12 #3667

Open

Scott-Guest mentioned this issue Nov 15, 2023

Add KItem subsorts to the disambiguation module when parsing programs #3814

Merged

Scott-Guest mentioned this issue Nov 28, 2023

Implement new sort inference algorithm #3673

Merged

rv-jenkins closed this as completed in fe2916e Dec 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Develop a more principled type inference algorithm #3601

Develop a more principled type inference algorithm #3601

Scott-Guest commented Aug 24, 2023

Scott-Guest commented Aug 24, 2023 •

edited

Loading

Scott-Guest commented Aug 24, 2023 •

edited

Loading

Scott-Guest commented Sep 11, 2023 •

edited

Loading

Scott-Guest commented Sep 17, 2023

radumereuta commented Sep 19, 2023

radumereuta commented Sep 19, 2023

Scott-Guest commented Oct 10, 2023

Baltoli commented Oct 26, 2023

Develop a more principled type inference algorithm #3601

Develop a more principled type inference algorithm #3601

Comments

Scott-Guest commented Aug 24, 2023

Scott-Guest commented Aug 24, 2023 • edited Loading

Scott-Guest commented Aug 24, 2023 • edited Loading

Scott-Guest commented Sep 11, 2023 • edited Loading

Translation from Sort Inference to SimpleSub

Scott-Guest commented Sep 17, 2023

Parametric Sorts with Non-Structural Subtyping

radumereuta commented Sep 19, 2023

radumereuta commented Sep 19, 2023

Scott-Guest commented Oct 10, 2023

Baltoli commented Oct 26, 2023

Scott-Guest commented Aug 24, 2023 •

edited

Loading

Scott-Guest commented Aug 24, 2023 •

edited

Loading

Scott-Guest commented Sep 11, 2023 •

edited

Loading