working with gc types #317

Draft: wants to merge 6 commits into base: main
1 change: 1 addition & 0 deletions design/mvp/Binary.md
@@ -280,6 +280,7 @@ canonopt ::= 0x00 => string-encod
| 0x03 m:<core:memidx> => (memory m)
| 0x04 f:<core:funcidx> => (realloc f)
| 0x05 f:<core:funcidx> => (post-return f)
| 0x06 => reference-type 📡
```
Notes:
* The second `0x00` byte in `canon` stands for the `func` sort and thus the
44 changes: 44 additions & 0 deletions design/mvp/Explainer.md
@@ -44,6 +44,7 @@ implemented, considered stable and included in a future milestone:
* 🪙: value imports/exports and component-level start function
* 🪺: nested namespaces and packages in import/export names
* 🧵: threading built-ins
* 📡: reference types ([gc] proposal integration)

(Based on the previous [scoping and layering] proposal to the WebAssembly CG,
this repo merges and supersedes the [module-linking] and [interface-types]
@@ -598,6 +599,7 @@ sets of abstract values:
| `own` | a unique, opaque address of a resource that will be destroyed when this value is dropped |
| `borrow` | an opaque address of a resource that must be dropped before the current export call returns |


How these abstract values are produced and consumed from Core WebAssembly
values and linear memory is configured by the component via *canonical lifting
and lowering definitions*, which are introduced [below](#canonical-definitions).
@@ -1168,11 +1170,14 @@ canonopt ::= string-encoding=utf8
| (memory <core:memidx>)
| (realloc <core:funcidx>)
| (post-return <core:funcidx>)
| reference-type 📡
```
While the production `externdesc` accepts any `sort`, the validation rules
for `canon lift` would only allow the `func` sort. In the future, other sorts
may be added (viz., types), hence the explicit sort.

##### `string-encoding` options

The `string-encoding` option specifies the encoding the Canonical ABI will use
for the `string` type. The `latin1+utf16` encoding captures a common string
encoding across Java, JavaScript and .NET VMs and allows a dynamic choice
@@ -1182,6 +1187,8 @@ Point range) or UTF-16 (which can express all Code Points, but uses either
default is UTF-8. It is a validation error to include more than one
`string-encoding` option.

##### `memory` options

The `(memory ...)` option specifies the memory that the Canonical ABI will
use to load and store values. If the Canonical ABI needs to load or store,
validation requires this option to be present (there is no default).
@@ -1199,6 +1206,43 @@ The Canonical ABI will use `realloc` both to allocate (passing `0` for the
first two parameters) and reallocate. If the Canonical ABI needs `realloc`,
validation requires this option to be present (there is no default).

##### `reference-type` options

📡 The `reference-type` option is off by default. When it is present, the
Canonical ABI is required to pass parameters using reference types. This
option conflicts with `memory` and `realloc`: it cannot appear together with
either of them.
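
The conflict rule above can be sketched as a small validation check. This is only an illustration: `CanonOpts`, its field names, and `validate_canon_opts` are hypothetical and are not taken from `definitions.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CanonOpts:
    # Hypothetical container for parsed canonopts; names are illustrative only.
    string_encoding: str = "utf8"
    memory: Optional[int] = None        # core memory index, if the option is present
    realloc: Optional[int] = None       # core function index, if the option is present
    post_return: Optional[int] = None   # core function index, if the option is present
    reference_type: bool = False        # off unless the option is present


def validate_canon_opts(opts: CanonOpts) -> None:
    """Reject option lists that combine reference-type with memory or realloc."""
    if not opts.reference_type:
        return
    conflicts = [name for name in ("memory", "realloc")
                 if getattr(opts, name) is not None]
    if conflicts:
        raise ValueError("reference-type conflicts with: " + ", ".join(conflicts))
```

Under this sketch, `CanonOpts(reference_type=True, memory=0)` is rejected, while `reference-type` on its own or `memory` plus `realloc` without it pass.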

📡 When `reference-type` is enabled, the parameter types change as follows:

Member:

I think the right place to put this spec information is in `CanonicalABI.md` (alongside the current spec information for how non-gc works) and in `canonical-abi/definitions.py` (which has the benefit that you can write tests for it in `run_tests.py`).

Author:

I don't know how to rewrite `definitions.py`. The binary needs to reference an index of an already defined type.

Member:

Since `definitions.py` is not a complete Python reference implementation but, rather, just a "suggestive" subset that describes just the lifting/lowering/built-in rules, what we do is, for canon definitions that take a typeidx in the binary format, we have the corresponding `canon_*` Python function take a Python object that directly represents the type (e.g., see how `canon_lift` takes a `FuncType` directly). That being said, I don't think there are any cases where you'll need to take a core wasm gc type as an immediate -- core wasm types are programmatically derived from component-level types by, e.g., `flatten_functype`, which is called by `canon_lift` and `canon_lower`. Thus, I think you just need to add `CoreArrayType` and `CoreStructType` Python classes (analogous to `CoreFuncType`) so they can be created by `flatten_functype` in the appropriate cases when `cx.opts.gc` is true (which you'd also add to `CanonicalOptions`).


| wit type | wasm w/o `reference-type` | wasm w/ `reference-type` |
| :------------------- | :-------------------------- | :-------------------------------- |
| `bool` | `(i32,)` | `(i32,)` |
| `char` | `(i32,)` | `(i32,)` |
| `u8` | `(i32,)` | `(i32,)` |
| `u16` | `(i32,)` | `(i32,)` |
| `u32` | `(i32,)` | `(i32,)` |
| `u64` | `(i64,)` | `(i64,)` |
| `resource` | `(ptr: i32)` | `(ref extern,)` |
| `list<T>` | `(ptr: i32, len: i32)` | `(ref array (mut $t),)` |
| `record { a, b }` | `(a: A, b: B)` (flatten) | `(ref struct $record,)` |
| `tuple<A, B>`(in⩽16) | `(a: A, b: B)` (flatten) | `(a: A, b: B)` (flatten) |
| `tuple<A, B>`(out>1) | `(ptr: i32)` | `(ref struct (mut $a) (mut $b),)` |
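
A minimal sketch of how a few rows of this table could be derived programmatically, in the spirit of the `CoreArrayType` and `CoreStructType` classes suggested in the review discussion. Everything here is an assumption made for illustration: the `Prim`, `ListType` and `Record` classes, the `flatten` function and its `gc` flag are hypothetical stand-ins and do not reproduce the real `flatten_functype` in `definitions.py`. Only the `i32` scalars, `list<T>` and `record` rows are covered.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Prim:
    name: str  # e.g. "bool", "u8", "u32"


@dataclass(frozen=True)
class ListType:
    elem: "WitType"


@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, "WitType"], ...]


WitType = Union[Prim, ListType, Record]


@dataclass(frozen=True)
class CoreArrayType:
    elem: "CoreType"      # element storage type
    mutable: bool = True  # the table uses (mut $t)


@dataclass(frozen=True)
class CoreStructType:
    fields: Tuple["CoreType", ...]


CoreType = Union[str, CoreArrayType, CoreStructType]

# Scalar rows of the table that map to i32 in both columns.
I32_PRIMS = {"bool", "char", "u8", "u16", "u32"}


def flatten(t: WitType, gc: bool) -> List[CoreType]:
    """Return the core value types for `t`, with or without reference-type."""
    if isinstance(t, Prim):
        if t.name not in I32_PRIMS:
            raise TypeError(f"primitive not covered by this sketch: {t.name}")
        return ["i32"]
    if isinstance(t, ListType):
        if gc:
            # In this subset every type flattens to one value when gc is on.
            (elem,) = flatten(t.elem, gc)
            return [CoreArrayType(elem)]
        return ["i32", "i32"]  # (ptr, len) into linear memory
    if isinstance(t, Record):
        if gc:
            return [CoreStructType(tuple(flatten(ft, gc)[0] for _, ft in t.fields))]
        # Without gc, record fields are flattened in order.
        return [core for _, ft in t.fields for core in flatten(ft, gc)]
    raise TypeError(f"unsupported type: {t!r}")
```

For a record with two `u32` fields, the sketch yields two `i32` values without `reference-type` and a single struct reference with it, matching the `record { a, b }` row.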

📡 Variant types are unpacked into two parts: an enumeration value that selects the case, and the data for that case.

| wit type | wasm w/o `reference-type` | wasm w/ `reference-type` |
| :---------------------- | :-------------------------- | :----------------------- |
| `option<A>` (heap type) | `(i32, [A_SIZE])` | `(i32, (ref null eq))` |
| `none` (heap type) | `(0, [FILL_ZEROES])` | `(1, null)` |
| `(some A)` (heap type) | `(1, [A_SIZE])` | `(0, ref $a)` |
| `result<A, B>` | `(i32, [MAX_VARIANT_SIZE])` | `(i32, (ref null eq))` |
| `(ok A)` | `(0, [A_SIZE])` | `(0, ref $a)` |
| `(err B)` | `(1, [B_SIZE])` | `(1, ref $b)` |
| `variant` | `(i32, [MAX_VARIANT_SIZE])` | `(i32, (ref null eq))` |
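
The `result<A, B>` rows can be illustrated with a small lowering and lifting sketch. It is hypothetical throughout: `Ref`, `lower_result` and `lift_result` are illustrative names, a Python object stands in for a GC reference, and representing a missing payload as a null reference is an assumption of this sketch rather than something the table states.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Ref:
    """Stand-in for a non-null GC reference; `value` is the boxed payload."""
    value: Any


CASES = ("ok", "err")  # case index doubles as the i32 discriminant


def lower_result(case: str, payload: Any = None) -> Tuple[int, Optional[Ref]]:
    """Split a result value into (discriminant, nullable reference)."""
    discriminant = CASES.index(case)  # raises ValueError for unknown cases
    # Assumption of this sketch: a case without payload lowers to a null reference.
    ref = None if payload is None else Ref(payload)
    return (discriminant, ref)


def lift_result(discriminant: int, ref: Optional[Ref]) -> Tuple[str, Any]:
    """Rebuild the (case, payload) pair from the two core values."""
    if not 0 <= discriminant < len(CASES):
        raise ValueError(f"invalid discriminant: {discriminant}")
    return (CASES[discriminant], None if ref is None else ref.value)
```

Lowering and then lifting returns the original case and payload, which is the round-trip property a real implementation would also need to preserve.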

##### `post-return` options

The `(post-return ...)` option may only be present in `canon lift`
and specifies a core function to be called with the original return values
after they have finished being read, allowing memory to be deallocated and
9 changes: 9 additions & 0 deletions design/mvp/FutureFeatures.md
@@ -38,6 +38,15 @@ to and from the statically-compiled host implementation language). See
[`list.lift_canon` and `list.lower_canon`] for more details.


## Mutability constraints on reference types 📡

Mutability constraints are a complex issue, especially in object-oriented
languages. For example, whether an array is internally mutable determines
whether some operations can accept specific subtypes. The current MVP version
imposes no such constraints, which may lead to unexpected semantic breakage.
For the time being, correct behavior can only be guaranteed by each language's
compiler, so it is not safe across languages.

## Shared-some-things linking via "adapter modules"

The original [Interface Types proposal] and the re-layered [Module Linking