Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CEP] Proposal for libobject API refactoring #65

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
237 changes: 237 additions & 0 deletions text/libobject-refactor.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,237 @@
- Title: Libobject metadata and state redesign

- Driver: Emilio J. Gallego Arias

- Status: Draft, (partial) PR available, PR tested

---

# Summary and Objectives

Coq's module system is (mainly) implemented by an abstract notion of
"object" (`Libobject.t`), which corresponds to an existential type `t`
which has to be (implicitly) `Marshall`able, plus a set of operations
`Libobject.object_declaration` which specify the effects the object
will perform when affected by several operations such as `Require`,
`Import`, etc... in a fully imperative (that is to say `unit -> unit`)
way.

Examples of such objects are declarations, implicit arguments, modules
themselves, implicit arguments and quantifiers, classes, notations, ...

There are quite a few downsides to this approach, in particular:

- there is no mechanism for objects to declare common meta-data such
as documentation, origin (user or auto-generated), location, etc...

- there is no mechanism for objects to declare common behavior such as
registering with the nametab, declaring tables or maps, etc..

- there is no control of interaction of objects with the parser, kernel

- there is no possibility to systematically explore the list of
objects currently in scope (other than hoping the ad-hoc internal
structures are somehow exposed)

- in some cases (such as printing), we need to perform a costly (and
messy) global imperative state update as to simulate that a module
has been opened, due to the imperative nature of the module system
implementation

These shortcoming are handled today by introducing ad-hoc effects in
the imperative handler, but clients such as UI interfaces or Search
have a hard time working in a clean way, the API state is far from
satisfactory.

This CEP proposes a remodelling of the `Libobject` API based on our
experience with alternative Coq toplevels and interaction systems. In
particular, we aim to:

- split concerns in the module system implementation: serialization,
scope handling, object life-time and effects

- provide a common API for indexing a retrieving object metadata

- provide a common API for declaration of side-effects, such as
creating tables, references, registering with the kernel, the
parser, or the nametab

- set the path for a functional implementation of some basic
operations of the module system, pushing the imperative state to the
upper layers

- replace `.glob` files by a query to the object database, provide
rich documentation, jump to definition features

## Coq DB Workshop

This CEP has its origins on the "Coq DB" workshop that was held
Thursday, March 31st, 14h00 (Paris time)

https://github.com/coq/coq/wiki/jsCoq--and-CoqDB-Working-Day

There are videos and a small summary that will be uploaded ASAP.

# Lifecycle of a `Libobject.t` object:

Lifecycle of `Libobject.t` values can be understood from the creation
point of view, which happens when new objects are registered, and its
dual, when existing objects are imported from `.vo` files.

The lifecycle is roughly as follows:

## Creation

- when clients want to "persist" (or declare) a new object, they will
register it with `Lib.add_leaf`. `Lib`'s job is mainly to sit below
all clients that can register objects, and track the (ephemeral)
list of objects that have been declared while a module / library is
built.

- when a module or section is closed, `Declaremods` will call `Lib` to
collect the corresponding objects, and in some cases they will be
re-declared if they survive the module / section.

- at the end of a library compilation, `Library` will call
`Declaremods.end_library` to grab the list of objects to persist in
the `.vo` file.

## Loading

- when a new library or module is `Required`, `Declaremods` will
sequentially "load" the objects, basically performing the
side-effects declared in their interface

- the objects are then stored in a internal map, as their interface
may need to be called when importing for example

- users of `Declaremod` have to be careful about the state

# Logical and meta-logical objects

A common discussion topic is about the role that objects registered
with the kernel play. Indeed, some objects present in a `.vo` can be
understood as "logical" objects, as they will be registered with the
kernel.

A large set of objects are meta-logical: they are not registered with
the kernel, but still essential to Coq operation. Such objects are
notations, instance declarations, hints, etc...

Indeed it is a common source of discussion where the data for logical
objects should be stored. Some data will be stored invariant in the
kernel, but some other data, like documentation or location, has to
live outside it.

# Part I: common meta-data

For this part, we propose to decorate the `Libobject.t` type with a
`Data.t` structure. This approach is modelled after the approach used
in `Declare` for constant information, which has proven successful in
reducing code duplication, uniformizing the API, and making it more
extensible (see for example per-definition typing flags).

In this preliminary proposal, `Data.t`, will contain:

- `loc`: Location of the object, if it is coming from a source file
- `doc`: String containing the documentation of the object, to be extended
- `name`: Name of the object, as an `Id.t`, if not anonymous
- `category`: Already exiting in `Libobject`, used for import, could be also used by `Search`.

A draft implementation or Part I is available in PR
https://github.com/coq/coq/pull/16484

This changes affect the lifetime in the following way:

## Creation

Clients must provide the metadata when registering the objects.

## Loading

`Declaremods` already stores the objects in a map indexed by their
prefix, we propose to:

- create a reverse map or maps to allow clients to query the `Data.t`
efficiently. In particular we could have reverse maps for
categories, names, locations. For now we implement a reverse map for
names, but more options could be desirable after more experimentation.
Comment on lines +157 to +158
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this refer to a branch somewhere or is more like "for now we propose to start with a reverse map for names"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is implemented in the prototype branch, cleaning it up to submit as we speak.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR done, you can use (hack hack) Print Graph to print the reverse map


- expose the regular map for anonymous objects so clients can explore.

## Extra

It would be desirable to allow clients to register as to allow
incremental indexing of objects, of for example, to maintain a
location-based table.

Another interesting topic is versioning of objects. While preliminary,
we could include versioning information in the objects as for example
to allow to have hybrid Coq 8/9 objects, or several versions of the
same `Definition` for example.

# Part II: behavior specification

The second part of this CEP, not yet implemented, is to phase out the
`declare_object` API in favor a set of `Behavior.t` traits that an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a description of the Behavior API?

Copy link
Member Author

@ejgallego ejgallego Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a too precise view on how to implement it, there are several options so here we can use some feedback.

I guess for the first prototype I'd try to have a list of traits, indeed due to the existential nature of traits that would mean turning the declaration of objects into a module, but I think I'm fine with that (as in the case for the proposal about errors in coq/coq#12602)

object can implement, in a functional way.

The main goal here is to allow the upper layer to control the state,
and have more flexibility in updating the state. This has many
applications.

In particular, we can distinguish a few common behaviors:

- `Kernel` : objects with this trait provide a function to register with a `safe_env`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that currently I believe the only part of libobject which modifies the safe_env is the Require replay after closing section/module (in_require in library.ml).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's correct. Tho the trait is meant to capture objects that call the diverse Global.* functions, so at some point the env can be owned by declaremods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've clarified this a bit in an update.


```ocaml
module type KernelTrait : sig
type t
val register : env -> t -> env
end
```
currently the objects that register with the kernel are mainly
inductives and definitions. Also, note that this allows to define a
notion of soundness for document interpretation, namely, the
document manager has to map all objects with kernel trait to the kernel.

- `Named` : objects with this trait can register with the `Nametab`, in a uniform way
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that currently nametab itself is not uniform (there are separate tables for globref, modules, ltacs, some other stuff) so this may require a nametab redesign first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR I think that unifying the various nametabs is gong the wrong way. It's a feature that each plugin can define its own set of names, so the design critically needs to account for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the idea is that when an object declares a nametab trait, it does so with their own nametab. Refactoring is indeed needed, you can see current progress here, but need to advance more https://github.com/ejgallego/coq/blob/f059e5a57e95c5c80a3c9026e91ba91936d90db5/library/nametab.mli

Ideally, if we think this is a good idea we would do such a change in two steps:

  • we would first define the trait, which takes a module as a parameter and performs the imperative state
  • we would then push the trait to be functional, and have Declaremods own the nametab.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified a bit.

```ocaml
module type NamedTrait : sig

type path
type elt

module NameTable : NameTab.S with elt := elt and path := path

end
```

- `Parser` : objects with this trait can update the parser
- `Summary` : objects with this trait will register a map or value with the summary

In all cases, the registration of the trait exposes the state
manipulation in a functional way, so the upper layer [`DeclareMods`]
is responsible of threading the state in a proper way.

This should allow to reduce quite a bit of ad-hoc code duplication in
the current codebase.

We can think of a possible implementation strategy in two-phases:

1. declare the traits to be imperative
2. have to declaremods own the state, make the traits functional

I think this makes a lot sense at least for the objects exposed here.

# Open questions / risk

Main issue for the implementation of behaviors is what to do with
functors / module include. The state of this is not clear yet, we will
soon provide a first behavior implementation PR so we can evaluate the
impact on code.

# Extra notes and related work

- See also https://github.com/herbelin/ceps/blob/state-passing/text/state-passing.md
- Can we remove `Lib` at all when this is done and have all the logic in `DeclareMods`?