Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate compilation #62

Open
wants to merge 30 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
291fdd1
Separate compilation doc.
gmalecha Feb 3, 2022
5d19d22
Apply 1 suggestion(s) to 1 file(s)
Blaisorblade Feb 9, 2022
8b5e4e9
Update text/separate-compilation.md
gmalecha Feb 9, 2022
cade9ed
Elaborating on the different possibilities.
gmalecha Feb 10, 2022
b006139
Removing more duplication.
gmalecha Feb 10, 2022
648cec1
Removing some more redundant text.
gmalecha Feb 10, 2022
0d29e79
More consolidation.
gmalecha Feb 10, 2022
ec81a87
A note about full functorization (though it might be off-topic).
gmalecha Feb 10, 2022
2b15916
Summary: significant revision/rewrite
Blaisorblade Feb 12, 2022
9659510
can not -> cannot
Blaisorblade Feb 12, 2022
15fb9b3
Other typos
Blaisorblade Feb 12, 2022
aacf72a
Drop my old background (now covered in intro)
Blaisorblade Feb 12, 2022
aeb21f7
Intro tweaks
Blaisorblade Feb 12, 2022
6bfae81
Reinforce (indirectly) that interfaces affect compilation
Blaisorblade Feb 12, 2022
0e2c2a1
Remove caveat: addressed earlier indirectly
Blaisorblade Feb 12, 2022
50bcff8
Revise section titles
Blaisorblade Feb 12, 2022
8882a74
Section 2: restructure/reflow
Blaisorblade Feb 12, 2022
28cacc9
Both `.vi` and `.v`: significant revisions, demonstrate impl.-only re…
Blaisorblade Feb 12, 2022
3231fbf
s/value/answer/ to match the later example
Blaisorblade Feb 13, 2022
fc02480
Typo
Blaisorblade Feb 13, 2022
e1eb2c6
Swap `.vi` only and `.v` only for better flow
Blaisorblade Feb 13, 2022
43e5f55
TODO: remove most of "only `v` file"
Blaisorblade Feb 13, 2022
e533ad2
Typos
Blaisorblade Feb 13, 2022
6f9dc01
Removals agreed with Gregory
gmalecha Feb 13, 2022
086dfca
markdown fix.
gmalecha Feb 13, 2022
11e2c71
Revisions and formatting.
gmalecha Feb 13, 2022
bd2e5eb
Additional clarification.
gmalecha Feb 13, 2022
71391ba
Minor revision
Blaisorblade Feb 14, 2022
37a04a4
Alternative implementation strategy
Blaisorblade Feb 14, 2022
bd11d59
Implementation plan: mention build support
Blaisorblade Feb 14, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
229 changes: 229 additions & 0 deletions text/separate-compilation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,229 @@
- Title: Separate Compilation in Coq

- Drivers: David Swasey, Paolo Giarrusso, Gregory Malecha

# Summary

Coq provides a module system that can be used explicitly through commands such
as `Module` and `Module Type`. These can be quite heavyweight in many instances,
and have some limitations when it comes to separately compiling files and
building generic libraries.
To address these problems, and support information hiding and Cardelli-style separate
compilation, in this CEP we propose a notion of Coq interface files, which we
Comment on lines +11 to +12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be helpful to briefly spell out what "Cardelli-style" means. A reference would be nice, although I personally am not always able to access referenced papers.

will call a `.vi` file, and which are inspired by OCaml's `.mli` files.
Intuitively, a Coq interface file called `module.vi` defines the public
interface for `module.v`. The `module.vi` interface shall enable developing and
typechecking clients, even before `module.v` has been implemented.

If both `module.vi` and `module.v` are present, `module.vi` shall act as an
opaque ascription for the top-level module defined by `module.v`; this opaque
ascription ensures that clients that typecheck against `module.vi` shall still
typecheck against the combination of `module.vi` and `module.v`, regardless of
the implementation details of `module.v`, including any non-logical side effects
such as hints (but excluding [universes](#universes)). Instead, without
interfaces, adding implementations is sufficient to break clients, and changing
implementations can break clients again, hindering modular development.

This has a few advantages compared to Coq's state of the art:
- It enables separate development: after agreeing on `module.vi`, `module.v` and
its clients can be developed independently. To ensure this, unlike today's
opaque ascription, `.vi` files can even hide side effects due to `Require`.
- It reduces compile-time dependencies and improves compile times, even compared
to today's `vos` builds (initial builds can be more parallel, and incremental
builds need to recompile fewer files).

NOTE: For concreteness, in this CEP we use the `.vi` extension for for interface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NOTE: For concreteness, in this CEP we use the `.vi` extension for for interface
NOTE: For concreteness, in this CEP we use the `.vi` extension for interface

source files, and the `.vio` extension for interface object files. However,
`.vio` files should not be confused with `.vio` files produced by `-quick`
builds: to avoid confusion, we could choose other file extensions or remove
`-quick` builds entirely.

# Proposed Semantics

In this section, we describe the meaning of our Coq extension: instead of
suggesting an implementation strategy, we sketch a semantics as a
source-to-source transformation, to be taken as an informal and _approximate_
specification. However, this is an approximate specification, because the exact
semantics cannot be expressed via source-to-source transformation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would much prefer a version where this can be fully implemented as a source-to-source translation (in part because that makes things easier for the bug minimizer). I think it also solves some design questions without making things significantly worse for the user.

I propose the following:

  1. Interfaces are interpreted as module types
  2. Compiling a .v file with a corresponding .vi acts as if the module type were ascribed to it (except possibly Requires, which may be considered to be above the module type ascription as far as global side effects go)
  3. At Require time there should be three flavors: a file can Require just the compiled .vi file, it can Require both the compiled .vi file and the compiled .v file, or it can Require just a compiled .v file. The last should be an error if a .vi file (compiled or source) exists alongside the .v/.vo file. Requiring both the .v(o) and compiled .vi should be like Requiring the compiled .vi and then attaching bodies to the constants for Print Assumptions, etc (in the same way it would happen for modules opaquely ascribed with a type), and also including the Requires of the .v file.

This pushes all the issues with side effects of require, universe constraints, ability to use Extraction, etc, onto the user side in a systematic way, by putting linking under the user's control (linking corresponds to requiring the compiled .v file).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except possibly Requires, which may be considered to be above the module type ascription as far as global side effects go

To actually achieve separate compilation, the Requires would have to be inside the module and be hidden by the opaque ascription. Today this gives a warning, but in the Coq call it wasn't clear why, but until that's resolved I consider a source-to-source translation impossible today :-).

Copy link
Member

@gares gares Feb 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I tried to explain (very badly) by saying that an interface should morally be

Require stuff.
Module Type this.
Import some stuff.
...
End this.

There are a few "actions" performed by require which don't play well with modules, which are usually seen as a linguistic construct to limit the scope of things.

Ideally, if one does not import a module, he does not get any effect. But require, today, leaks stuff, like global hints. So placing it inside a (sub) module does not do what you may think. Also, there is special code hoisting some actions at the top level, like loading ML plugins and co which is ugly and has no real use case.

Finally, it would be nice to have a header in each file which lets one compute the dependencies of the file easily. If requires are only allowed in there, then we kind of have this header. If we allow them to be anywhere, well... Similarly, require is slow, if they are all at the beginning of the file, a UI pays the cost once upfront, and then it is good to go. If they are interleaved with other sentences, well, you may need to backtrack over that slow line way more often.

To sum up, it is both an "hygienic" and would allow some code simplification if all requires were on top.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally (though this is a fairly purist point of view), I have been thinking of .v files as corresponding to functors from their Required dependencies. Currently, Coq does not provide a mechanism to ascribe a type to these modules, so there is no means to abstract. With this approach, you can change what the module type being abstracted over is which provides more flexibility to the implementer of the interface. If I follow this mental model, then the second flavor of Require that you describe above means that you can "abstract over" the "natural" module type (the one that does not hide anything, including the body of opaque definitions); however, it becomes a bit odd because to typecheck those definitions you might need the body of other constants which could introduce problems.


As a key principle: `.vi` interfaces are meant to hide implementations and
Copy link
Member

@jfehrle jfehrle Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case for hiding "implementations" is primarily proofs (plus necessary definitions)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Qed is typically good enough to hide proofs, but you often want to hide more: both hints needed by the implementation, but also the bodies of definitions, etc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. same job as interfaces in java

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just, it can be useful to hide bodies of definitions that you do not want clients to think about though this does not work well if you want to compute on the definitions.

support separate compilation in Cardelli's sense. Hence, compiling a module
`consumer.v` that consumes the interface of `producer.vi` shall not depend
(directly or indirectly) on the existence of `producer.v` or its contents. As a
consequence, no change to `producer.v` can affect whether `consumer.v`
typechecks.

More concretely, our semantics for compiling top-level module (say, `lib`)
distinguishes three scenarios, depending on the existence of:
- both `lib.vi` and `lib.v`;
- only `lib.vi`;
- or only `lib.v`.

## Both a `.vi` and `.v` File
Copy link
Member

@jfehrle jfehrle Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any things that can't appear in a .vi file (or in practice wouldn't be included in a .vi file)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so.


For concreteness, consider the following example:

```coq
(* lib.vi *)
Global Open Scope Z_scope.
Parameter answer : nat.
Axiom answer_is_42 : answer = 42%nat.

(* lib.v *)
Require Import stdpp.prelude.
Global Open Scope N_scope.

Definition answer : nat := 42.
Definition answer_is_42 : answer = 42%nat := ltac:(reflexivity).

Definition other_answer : N := 42 + 1.
```

Here, `lib.v` contains both the implementation of `lib.vi` and some commands
that perform global side effects and that should be hidden from clients:
`stdpp.prelude` modifies many Coq settings aggressively, even when simply
`Require`d, so adding this `Require` is a significant breaking change.

In this example, the semantics of `lib` would resemble the semantics of the
following Coq source:

```coq
(* lib_composed.v: *)
Module Type __LIB.
Global Open Scope Z_scope.
Parameter answer : nat.
Axiom answer_is_42 : answer = 42.
End __LIB.

Module __lib : __LIB.
Require Import stdpp.prelude.
(* ^ Require inside a Module is strongly discouraged. What is relevant is
that side effects of the Module are not visible outside of this Module
*even* when the Module is Import-ed *)
Global Open Scope N_scope.

Definition answer : nat := 42.
Definition answer_is_42 : answer = 42%nat := ltac:(reflexivity).
End __lib.

Export __lib. (* make the declarations from lib available from [Import]ing lib (rather that lib.lib *)
```

In this example translation, we use the `\_\_` prefix for generated identifiers
used as a translation device; these identifiers should be hidden from clients.

This translation turns the `.vi` interface into a module type, the `.v`
implementation into a module, and uses opaque module ascription to hide the
module contents.
The only visible side effects are those that appear in the interface: here,
only `Global Open Scope Z_scope`. As we only `Require` stdpp.prelude in the
module body, we intend this to be **hidden** from clients that perform `Require
lib`, even if this might not be guaranteed by `lib_composed.v`.

## Only a `.vi` File

When only an interface file exists, there is (potentially) no underlying
implementation, but the existance of the interface should still provide definite
references to an implementation. Casting this in the feature set of the current
Coq ecosystem, this essentially translates to a `Declare Module`
[^full-functorization]. Concretely,

```coq
(* lib.vi *)
Global Open Scope Z_scope.
Parameter answer : nat.
Axiom answer_is_42 : answer = 42.
```

would be "translated" to:

```coq
(* lib.v *)
Module Type __LIB.
Global Open Scope Z_scope.
Parameter answer : nat.
Axiom answer_is_42 : answer = 42.
End __LIB.

Declare Module __lib : __LIB.
Export __lib.
```

## Only a `.v` File
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect that this will be the most common case for the long term?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I would hope that there are a lot of libraries that would have meaningful theory that could be put in a .vi file. @charguer mentioned some things like this in his standard library talk.


If only a `.v` file is available, Coq's semantics should be the same as today:
the entire implementation shall be exposed in full builds, and it shall be
hidden in `vos` builds.

## Universes

As some readers will anticipate, universe checks do not admit fully separate
compilation. Module bodies might add universe constraints that are note explicit
(or even visible) from its interface. This is already an issue with `.vos`
builds today, and is a problem inherent to Coq's global universe graph and
(generally) implicit treatment of universes in the surface syntax, so any
solution to this problem could be shared.

We consider `vos` builds a special case of this proposal, where interfaces are
inferred as the strictest possible ones for the given implementation; `.vi`
files enable hiding more implementation details. In both cases, the interface
omits universe constraints that are derived from hidden bodies (for `vos`
builds, just `Qed` bodies). While some universe checks are performed anyway,
omitted constraints might make the universe graph unsatisfiable.

To remedy this problem, we propose an additional "global" check. By analogy with
separate compilation in other languages, we call this "link-time" universe
checking. Such a link-time would be effectively the same as `Require`ing all of
the modules in a library preferring their *implemenation* (`.v` file) rather
than their interface (`.vi` file). If the combined set of constraints is
satisfiable, then there is no problem.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with universes is not so much satisfiability than the fact you can get wildly different results depending on the link order, because this affects unification and the like. It is somewhat easy to craft a Definition foo : bool. whose value in separate compilation mode depends on the scheduling order.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because this affects unification and the like.

It is somewhat easy to craft a Definition foo : bool. whose value in separate compilation mode depends on the scheduling order.

Link-time checks happen on concrete proof terms, so what's the unification in question?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I mentioned happens for vos files, but a similar problem arises with separate compilation. Namely, there are cases where replacing the interface with the implementation will result in later files being compiled differently, breaking the abstraction boundary. For a simplified and contrived example, you can consider

Universe i j.

Lemma dummy : True.
Proof.
constr_eq Type@{i} Type@{j}.
constructor.
Qed.

Definition b : bool.
Proof.
(let _ := constr:(let _ : Type@{i} := Type@{j} in tt) in exact true) || exact false.
Defined.

where the value of b depends whether the proof script of dummy has been evaluated or not at the time b is computed. In separate compilation mode, it means that an interface where dummy is an axiom won't produce the same proof as one where the proof is explicitly given, despite being clearly compatible as per the module rules. This is not specific to Qed proofs, one can also trigger this kind of behaviours with explicit constraints hidden under module interfaces.

Copy link
Contributor

@SkySkimmer SkySkimmer Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the separate compilation model we don't expect to be able to remove the interface and still compile dependants, so it's not a problem IMO

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, “removing the interface breaks clients” is an intended feature. But the only existing way to fully compile this code in non-vos mode would leak the extra constraint to clients, despite opaque ascriptions; this was brought up in December and it’s concerning. I’ve seen no evidence this is a real problem (compared to the periodic universe inconsistencies we do see), but in principle it seems fixable (very invasively!) if you’re that serious about separate compilation: always defer checking of these constraints to the link-time check. But that’s independent of this proposal, and relevant to vos builds in general.

despite being clearly compatible as per the module rules

I agree that the subsumption rule would suggest ”dropping interfaces” should work, but subsumption on source-language modules doesn’t actually work (in Coq, OCaml and many other languages). Even ignoring hints etc., Import + width subtyping is enough to break subsumption: if you Import foo and foo adds a member bar, then foo changes to a more specific type, yet bar can shadow existing bar and break clients.


If the combined set of constraints is *not* satisfiable, then the problem must
arise from an interface not exposing a constraint that an implementation
requires. Diffing the universe constraints between an interface and an
implementation would provide the right information necessary to diagnose and fix
the problem.

### "Full compilation" Semantics

The notion of full compilation semantics, i.e. a sound full-build semantics akin
to a "vo-style" build can be achieved (at the cost of build parallelism) by
introducing a dependency between the `.vio` file and the `.vo` file and
elaborating the resulting `.vio` file with universe constraints introduced by the
implementation. It is important that nothing but universe constraints (say
hints, tactics or plugin requirements) leak from `.vo` files into `.vio` files.

Alternatively, the extra universe constraints could be fetched directly from
`.vo` files without altering the `.vio` files: again, nothing but universe
constraints should leak from implementations to clients.

In this case, we could maybe ensure that `.vio` files coincide across separate
builds and "full" builds.
To this end, processing `Require library` when building `client.vio` file might
have to not load constraints from `library.vo`, even in a full build.

# Implementation

The implementation would require (at least) the following:

1. Extending the build infrastructure to support `.vi` compilation.
2. Modifying the implementation of `Require` to search for `.vio` files in
addition to `.vo` files. For backwards compatibility, we believe it is
necessary to search for both `.vio` and `.vo` files *simultaneously* rather
than first searching for a `.vio` everywhere and then for a `.vo` everywhere
because the later would mean that adding a `.vi` files could change the
library that is used.
3. We believe that the bit-level representation of `.vio` could be the same as
`.vo` files, though an alternative would be to leverage the representation of
`.vos` files (which might be the same).
4. Build support for the new mode; ideally, some code could be shared with
support for `vos` builds.


# Footnotes

[^full-functorization]: An alternative characterization of `Declare Module` is
to implicitly functorize all dependent modules over the module type of the
declared module. This understanding of `Declare Module` does enable some
additional flexibility at "link-time" but is beyond the scope of this
proposal.