Document the intuition behind the optimized ruleset #175

ecstatic-morse · 2021-08-03T22:32:04Z

The "optimized" variant of the borrow-checking rules—defined here—isn't explained in detail anywhere in this repo AFAICT. You can read through the relations one-by-one to get a sense of what's happening, but it's easy to miss the forest for the trees. For example, what is so special about "dying regions"? The book has only a short placeholder section that doesn't provide any answers:

A more detailed description of the rules in this Opt variant will be added later but it computes the same data as the Naive variant described above, more efficiently, by limiting where the subset transitive closure is computed: some origins are short-lived, or part of a subsection of the subset graph into which no loan ever flows, and therefore don't contribute to errors or loan propagation. There's no need to track these specific cases.

Niko is pretty busy, so I don't foresee him writing another sweet blog post like this one. There's actually a pretty good explanation in the first paragraph of one of Niko's comments on #153, but obviously that needs to be expanded upon.

I've developed my own intuition about what the optimized variant is doing, although I might be missing some of the subtleties, and @lqd has some insight as well from their work on #156. That should be enough to improve the state of things. During the sprint, Niko mentioned removing the optimized variant while still in the prototyping phase, since it's difficult to iterate on top of. Even if that happens, documenting its ideas is still important lest they fade into obscurity.

edit
I've written down my intuition for datafrog-opt below, although I'm not 100% sure that it's correct. We should discuss a bit before putting anything in the book. I've explained it by showing what happens when either the left or the right of a single subset relation ('a <: 'b) goes dead, but it may be clearer to talk the middle origin in a transitive subset relation ('a <: 'b <: 'c). Also, I've not documented which ideas correspond to specific relations in the optimized variant (e.g. dying_can_reach, dying_region_requires, etc.).

The text was updated successfully, but these errors were encountered:

ecstatic-morse · 2021-08-12T17:14:32Z

Intuition for `datafrog_opt`

This datalog program computes the same results (errors and subset_errors) as the "naive" version of the Polonius, but with less intermediate tuples.

Redundant `errors`

Let's look at the naive version of errors (with the naive version of loan_live_at inlined into the rule body):

errors(loan, node) :-
      loan_invalidated_at(loan, node),
      origin_contains_loan_on_entry(origin, loan, node),
      origin_live_on_entry(origin, node).

In plain english, a borrow error occurs when a loan is invalidated at a given point while contained by region that is live at that point. Notably, if the same loan is contained in multiple live regions while it is invalidated, we
don't gain any additional information: the same error will be generated multiple times.

For example, perhaps we have a two (non-placeholder) origins 'a and 'b such that 'a is a subset of 'b ('a <: 'b). If both 'a and 'b are live at some point in the control-flow graph, any borrow errors caused by a loan in 'a will also be generated by that same loan in 'b. For this reason, while both regions are live, there is no need to propagate loans from 'a to 'b. As long as each live loan is contained by at least one live region, we will emit a borrow error if that loan is invalidated.

In the past, this meant we could stop computing the full transitive closure of the subset relation as soon as we get to a live origin. Any edges from the live origin to its supersets can only create redundant borrow errors. The addition of placeholder region errors complicates things, since we need to derive all possible subset constraints between two placeholder origins, which
may pass through any number of non-placeholders. We deal with this by computing the placeholder_subset (all subset relationships that may involve a placeholder region) separately from the subset used to compute borrow errors.

Remember, all this is only valid while the regions on both sides of our subset constraint 'a <: 'b. If 'a (the subset) goes dead, we must propagate all of its loans to 'b (the superset), guaranteeing that each live loan is contained in at least one live origin. If 'b goes dead, we must propagate all loans in 'a through 'b to all of its live supersets (origins 'c such that 'c is
live and 'b <: 'c). That way, if 'a goes dead at a later point, its loans will be remembered.

lqd · 2021-08-13T08:47:10Z

👍 this also matches my intuition and the little documentation we have about it.

We deal with this by computing the placeholder_subset (all subset relationships that may involve a placeholder region) separately from the subset used to compute borrow errors.

Since we only want to know the origins into which placeholders flow (to ultimately know all the placeholders that a placeholder can reach), "all subset relationships that may involve a placeholder region" may give the impression that non-placeholders flowing into a placeholder (since a placeholder is involved) are of interest here, but it's really that the subset is downstream from a placeholder that is the most important property. No big deal of course, especially for an aside like this.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document the intuition behind the optimized ruleset #175

Document the intuition behind the optimized ruleset #175

ecstatic-morse commented Aug 3, 2021 •

edited

Loading

ecstatic-morse commented Aug 12, 2021

lqd commented Aug 13, 2021

Document the intuition behind the optimized ruleset #175

Document the intuition behind the optimized ruleset #175

Comments

ecstatic-morse commented Aug 3, 2021 • edited Loading

ecstatic-morse commented Aug 12, 2021

Intuition for datafrog_opt

Redundant errors

lqd commented Aug 13, 2021

ecstatic-morse commented Aug 3, 2021 •

edited

Loading

Intuition for `datafrog_opt`

Redundant `errors`