This document describes the portable binary encoding of the WebAssembly modules.
The binary encoding is a dense representation of module information that enables small files, fast decoding, and reduced memory usage. See the rationale document for more detail.
The encoding is split into three layers:
- Layer 0 is a simple binary encoding of the bytecode instructions and related data structures. The encoding is dense and trivial to interact with, making it suitable for scenarios like JIT, instrumentation tools, and debugging.
- Layer 1 🦄 provides structural compression on top of layer 0, exploiting specific knowledge about the nature of the syntax tree and its nodes. The structural compression introduces more efficient encoding of values, rearranges values within the module, and prunes structurally identical tree nodes.
- Layer 2 🦄 Layer 2 applies generic compression algorithms, like gzip and Brotli, that are already available in browsers and other tooling.
Most importantly, the layering approach allows development and standardization to occur incrementally. For example, Layer 1 and Layer 2 encoding techniques can be experimented with by application-level decompressing to the layer below. As compression techniques stabilize, they can be standardized and moved into native implementations.
See proposed layer 1 compression 🦄 for a proposal for layer 1 structural compression.
An unsigned integer of N bits, represented in N/8 bytes in little endian order. N is either 8, 16, or 32.
A LEB128 variable-length integer, limited to N bits (i.e., the values [0, 2^N-1]),
represented by at most ceil(N/7) bytes that may contain padding 0x80
bytes.
Note: Currently, the only sizes used are varuint1
, varuint7
, and varuint32
,
where the former two are used for compatibility with potential future extensions.
A Signed LEB128 variable-length integer, limited to N bits (i.e., the values [-2^(N-1), +2^(N-1)-1]),
represented by at most ceil(N/7) bytes that may contain padding 0x80
or 0xFF
bytes.
Note: Currently, the only sizes used are varint7
, varint32
and varint64
.
All types are distinguished by a negative varint7
values that is the first byte of their encoding (representing a type constructor):
Opcode | Type constructor |
---|---|
-0x01 (i.e., the byte 0x7f ) |
i32 |
-0x02 (i.e., the byte 0x7e ) |
i64 |
-0x03 (i.e., the byte 0x7d ) |
f32 |
-0x04 (i.e., the byte 0x7c ) |
f64 |
-0x10 (i.e., the byte 0x70 ) |
anyfunc |
-0x20 (i.e., the byte 0x60 ) |
func |
-0x40 (i.e., the byte 0x40 ) |
pseudo type for representing an empty block_type |
Some of these will be followed by additional fields, see below.
Note: Gaps are reserved for future 🦄 extensions. The use of a signed scheme is so that types can coexist in a single space with (positive) indices into the type section, which may be relevant for future extensions of the type system.
A varint7
indicating a value type. One of:
i32
i64
f32
f64
as encoded above.
A varint7
indicating a block signature. These types are encoded as:
- either a
value_type
indicating a signature with a single result - or
-0x40
(i.e., the byte0x40
) indicating a signature with 0 results.
A varint7
indicating the types of elements in a table.
In the MVP, only one type is available:
Note: In the future 🦄, other element types may be allowed.
The description of a function signature. Its type constructor is followed by an additional description:
Field | Type | Description |
---|---|---|
form | varint7 |
the value for the func type constructor as defined above |
param_count | varuint32 |
the number of parameters to the function |
param_types | value_type* |
the parameter types of the function |
return_count | varuint1 |
the number of results from the function |
return_type | value_type? |
the result type of the function (if return_count is 1) |
Note: In the future 🦄, return_count
and return_type
might be generalised to allow multiple values.
The description of a global variable.
Field | Type | Description |
---|---|---|
content_type | value_type |
type of the value |
mutability | varuint1 |
0 if immutable, 1 if mutable |
The description of a table.
Field | Type | Description |
---|---|---|
element_type | elem_type |
the type of elements |
limits | resizable_limits |
see below |
The description of a memory.
Field | Type | Description |
---|---|---|
limits | resizable_limits |
see below |
A single-byte unsigned integer indicating the kind of definition being imported or defined:
0
indicating aFunction
import or definition1
indicating aTable
import or definition2
indicating aMemory
import or definition3
indicating aGlobal
import or definition
A packed tuple that describes the limits of a table or memory:
Field | Type | Description |
---|---|---|
flags | varuint32 |
bit 0x1 is set if the maximum field is present |
initial | varuint32 |
initial length (in units of table elements or wasm pages) |
maximum | varuint32 ? |
only present if specified by flags |
Note: In the future 🦄, the "flags" field may be extended, e.g., to include a flag for sharing between threads.
The encoding of an initializer expression
is the normal encoding of the expression followed by the end
opcode as a
delimiter.
Note that get_global
in an initializer expression can only refer to immutable
imported globals and all uses of init_expr
can only appear after the Imports
section.
The following documents the current prototype format. This format is based on and supersedes the v8-native prototype format, originally in a public design doc.
The module starts with a preamble of two fields:
Field | Type | Description |
---|---|---|
magic number | uint32 |
Magic number 0x6d736100 (i.e., '\0asm') |
version | uint32 |
Version number, currently 0xd. The version for MVP will be reset to 1. |
The module preamble is followed by a sequence of sections.
Each section is identified by a 1-byte section code that encodes either a known section or a user-defined section.
The section length and payload data then follow.
Known sections have non-zero ids, while unknown sections have a 0
id followed by an identifying string as
part of the payload.
Unknown sections are ignored by the WebAssembly implementation, and thus validation errors within them do not
invalidate a module.
Field | Type | Description |
---|---|---|
id | varuint7 |
section code |
payload_len | varuint32 |
size of this section in bytes |
name_len | varuint32 ? |
length of the section name in bytes, present if id == 0 |
name | bytes ? |
section name string, present if id == 0 |
payload_data | bytes |
content of this section, of length payload_len - sizeof(name) - sizeof(name_len) |
Each known section is optional and may appear at most once.
Unknown sections all have the same id
, and can be named non-uniquely (all bytes composing their names can be identical).
Known sections from this list may not appear out of order.
The content of each section is encoded in its payload_data
.
Section Name | Code | Description |
---|---|---|
Type | 1 |
Function signature declarations |
Import | 2 |
Import declarations |
Function | 3 |
Function declarations |
Table | 4 |
Indirect function table and other tables |
Memory | 5 |
Memory attributes |
Global | 6 |
Global declarations |
Export | 7 |
Exports |
Start | 8 |
Start function declaration |
Element | 9 |
Elements section |
Code | 10 |
Function bodies (code) |
Data | 11 |
Data segments |
The end of the last present section must coincide with the last byte of the
module. The shortest valid module is 8 bytes (magic number
, version
,
followed by zero sections).
The type section declares all function signatures that will be used in the module.
Field | Type | Description |
---|---|---|
count | varuint32 |
count of type entries to follow |
entries | func_type* |
repeated type entries as described below |
Note: In the future 🦄,
this section may contain other forms of type entries as well, which can be distinguished by the form
field of the type encoding.
The import section declares all imports that will be used in the module.
Field | Type | Description |
---|---|---|
count | varuint32 |
count of import entries to follow |
entries | import_entry* |
repeated import entries as described below |
Field | Type | Description |
---|---|---|
module_len | varuint32 |
module string length |
module_str | bytes |
module string of module_len bytes |
field_len | varuint32 |
field name length |
field_str | bytes |
field name string of field_len bytes |
kind | external_kind |
the kind of definition being imported |
Followed by, if the kind
is Function
:
Field | Type | Description |
---|---|---|
type | varuint32 |
type index of the function signature |
or, if the kind
is Table
:
Field | Type | Description |
---|---|---|
type | table_type |
type of the imported table |
or, if the kind
is Memory
:
Field | Type | Description |
---|---|---|
type | memory_type |
type of the imported memory |
or, if the kind
is Global
:
Field | Type | Description |
---|---|---|
type | global_type |
type of the imported global |
Note that, in the MVP, only immutable global variables can be imported.
The function section declares the signatures of all functions in the module (their definitions appear in the code section).
Field | Type | Description |
---|---|---|
count | varuint32 |
count of signature indices to follow |
types | varuint32* |
sequence of indices into the type section |
The encoding of a Table section:
Field | Type | Description |
---|---|---|
count | varuint32 |
indicating the number of tables defined by the module |
entries | table_type* |
repeated table_type entries as described below |
In the MVP, the number of tables must be no more than 1.
ID: memory
The encoding of a Memory section:
Field | Type | Description |
---|---|---|
count | varuint32 |
indicating the number of memories defined by the module |
entries | memory_type* |
repeated memory_type entries as described below |
Note that the initial/maximum fields are specified in units of WebAssembly pages.
In the MVP, the number of memories must be no more than 1.
The encoding of the Global section:
Field | Type | Description |
---|---|---|
count | varuint32 |
count of global variable entries |
globals | global_variable* |
global variables, as described below |
Each global_variable
declares a single global variable of a given type, mutability
and with the given initializer.
Field | Type | Description |
---|---|---|
type | global_type |
type of the variables |
init | init_expr |
the initial value of the global |
Note that, in the MVP, only immutable global variables can be exported.
The encoding of the Export section:
Field | Type | Description |
---|---|---|
count | varuint32 |
count of export entries to follow |
entries | export_entry* |
repeated export entries as described below |
Field | Type | Description |
---|---|---|
field_len | varuint32 |
field name string length |
field_str | bytes |
field name string of field_len bytes |
kind | external_kind |
the kind of definition being exported |
index | varuint32 |
the index into the corresponding index space |
For example, if the "kind" is Function
, then "index" is a
function index. Note that, in the MVP, the
only valid index value for a memory or table export is 0.
The start section declares the start function.
Field | Type | Description |
---|---|---|
index | varuint32 |
start function index |
The encoding of the Elements section:
Field | Type | Description |
---|---|---|
count | varuint32 |
count of element segments to follow |
entries | elem_segment* |
repeated element segments as described below |
a elem_segment
is:
Field | Type | Description |
---|---|---|
index | varuint32 |
the table index (0 in the MVP) |
offset | init_expr |
an i32 initializer expression that computes the offset at which to place the elements |
num_elem | varuint32 |
number of elements to follow |
elems | varuint32* |
sequence of function indices |
ID: code
The code section contains a body for every function in the module.
The count of function declared in the function section
and function bodies defined in this section must be the same and the i
th
declaration corresponds to the i
th function body.
Field | Type | Description |
---|---|---|
count | varuint32 |
count of function bodies to follow |
bodies | function_body* |
sequence of Function Bodies |
The data section declares the initialized data that is loaded into the linear memory.
Field | Type | Description |
---|---|---|
count | varuint32 |
count of data segments to follow |
entries | data_segment* |
repeated data segments as described below |
a data_segment
is:
Field | Type | Description |
---|---|---|
index | varuint32 |
the linear memory index (0 in the MVP) |
offset | init_expr |
an i32 initializer expression that computes the offset at which to place the data |
size | varuint32 |
size of data (in bytes) |
data | bytes |
sequence of size bytes |
User-defined section string: "name"
The names section does not change execution semantics, and thus is not allocated a section code.
It is encoded as an unknown section (id 0
) followed by the identification string "name"
.
Like all unknown sections, a validation error in this section does not cause validation of the module to fail.
The name section may appear only once, and only after the Data section.
The expectation is that, when a binary WebAssembly module is viewed in a browser or other development
environment, the names in this section will be used as the names of functions
and locals in the text format.
Field | Type | Description |
---|---|---|
count | varuint32 |
count of entries to follow |
entries | function_names* |
sequence of names |
The sequence of function_names
assigns names to the corresponding
function index. (Note: this assigns names to both
imported and module-defined functions.) The count may be greater or less than
the actual number of functions.
Field | Type | Description |
---|---|---|
fun_name_len | varuint32 |
string length, in bytes |
fun_name_str | bytes |
valid utf8 encoding |
local_count | varuint32 |
count of local names to follow |
local_names | local_name* |
sequence of local names |
The sequence of local_name
assigns names to the corresponding local index. The
count may be greater or less than the actual number of locals.
Field | Type | Description |
---|---|---|
local_name_len | varuint32 |
string length, in bytes |
local_name_str | bytes |
valid utf8 encoding |
Function bodies consist of a sequence of local variable declarations followed by
bytecode instructions. Each function body must end with the end
opcode.
Field | Type | Description |
---|---|---|
body_size | varuint32 |
size of function body to follow, in bytes |
local_count | varuint32 |
number of local entries |
locals | local_entry* |
local variables |
code | byte* |
bytecode of the function |
end | byte |
0x0b , indicating the end of the body |
Each local entry declares a number of local variables of a given type. It is legal to have several entries with the same type.
Field | Type | Description |
---|---|---|
count | varuint32 |
number of local variables of the following type |
type | value_type |
type of the variables |
Control flow operators (described here)
Name | Opcode | Immediates | Description |
---|---|---|---|
unreachable |
0x00 |
trap immediately | |
nop |
0x01 |
no operation | |
block |
0x02 |
sig : block_type |
begin a sequence of expressions, yielding 0 or 1 values |
loop |
0x03 |
sig : block_type |
begin a block which can also form control flow loops |
if |
0x04 |
sig : block_type |
begin if expression |
else |
0x05 |
begin else expression of if | |
end |
0x0b |
end a block, loop, or if | |
br |
0x0c |
relative_depth : varuint32 |
break that targets an outer nested block |
br_if |
0x0d |
relative_depth : varuint32 |
conditional break that targets an outer nested block |
br_table |
0x0e |
see below | branch table control flow construct |
return |
0x0f |
return zero or one value from this function |
The sig fields of block
and if
operators specify function signatures
which describe their use of the operand stack.
The br_table
operator has an immediate operand which is encoded as follows:
Field | Type | Description |
---|---|---|
target_count | varuint32 |
number of entries in the target_table |
target_table | varuint32* |
target entries that indicate an outer block or loop to which to break |
default_target | varuint32 |
an outer block or loop to which to break in the default case |
The br_table
operator implements an indirect branch. It accepts an optional value argument
(like other branches) and an additional i32
expression as input, and
branches to the block or loop at the given offset within the target_table
. If the input value is
out of range, br_table
branches to the default target.
Note: Gaps in the opcode space, here and elsewhere, are reserved for future 🦄 extensions.
Call operators (described here)
Name | Opcode | Immediates | Description |
---|---|---|---|
call |
0x10 |
function_index : varuint32 |
call a function by its index |
call_indirect |
0x11 |
type_index : varuint32 , reserved : varuint1 |
call a function indirect with an expected signature |
The call_indirect
operator takes a list of function arguments and as the last
operand the index into the table. Its reserved
immediate is for
future 🦄 use and must be 0
in the MVP.
Parametric operators (described here)
Name | Opcode | Immediates | Description |
---|---|---|---|
drop |
0x1a |
ignore value | |
select |
0x1b |
select one of two values based on condition |
Variable access (described here)
Name | Opcode | Immediates | Description |
---|---|---|---|
get_local |
0x20 |
local_index : varuint32 |
read a local variable or parameter |
set_local |
0x21 |
local_index : varuint32 |
write a local variable or parameter |
tee_local |
0x22 |
local_index : varuint32 |
write a local variable or parameter and return the same value |
get_global |
0x23 |
global_index : varuint32 |
read a global variable |
set_global |
0x24 |
global_index : varuint32 |
write a global variable |
Memory-related operators (described here)
Name | Opcode | Immediate | Description |
---|---|---|---|
i32.load |
0x28 |
memory_immediate |
load from memory |
i64.load |
0x29 |
memory_immediate |
load from memory |
f32.load |
0x2a |
memory_immediate |
load from memory |
f64.load |
0x2b |
memory_immediate |
load from memory |
i32.load8_s |
0x2c |
memory_immediate |
load from memory |
i32.load8_u |
0x2d |
memory_immediate |
load from memory |
i32.load16_s |
0x2e |
memory_immediate |
load from memory |
i32.load16_u |
0x2f |
memory_immediate |
load from memory |
i64.load8_s |
0x30 |
memory_immediate |
load from memory |
i64.load8_u |
0x31 |
memory_immediate |
load from memory |
i64.load16_s |
0x32 |
memory_immediate |
load from memory |
i64.load16_u |
0x33 |
memory_immediate |
load from memory |
i64.load32_s |
0x34 |
memory_immediate |
load from memory |
i64.load32_u |
0x35 |
memory_immediate |
load from memory |
i32.store |
0x36 |
memory_immediate |
store to memory |
i64.store |
0x37 |
memory_immediate |
store to memory |
f32.store |
0x38 |
memory_immediate |
store to memory |
f64.store |
0x39 |
memory_immediate |
store to memory |
i32.store8 |
0x3a |
memory_immediate |
store to memory |
i32.store16 |
0x3b |
memory_immediate |
store to memory |
i64.store8 |
0x3c |
memory_immediate |
store to memory |
i64.store16 |
0x3d |
memory_immediate |
store to memory |
i64.store32 |
0x3e |
memory_immediate |
store to memory |
current_memory |
0x3f |
reserved : varuint1 |
query the size of memory |
grow_memory |
0x40 |
reserved : varuint1 |
grow the size of memory |
The memory_immediate
type is encoded as follows:
Name | Type | Description |
---|---|---|
flags | varuint32 |
a bitfield which currently contains the alignment in the least significant bits, encoded as log2(alignment) |
offset | varuint32 |
the value of the offset |
As implied by the log2(alignment)
encoding, the alignment must be a power of 2.
As an additional validation criteria, the alignment must be less or equal to
natural alignment. The bits after the
log(memory-access-size)
least-significant bits must be set to 0. These bits
are reserved for future 🦄 use
(e.g., for shared memory ordering requirements).
The reserved
immediate to the current_memory
and grow_memory
operators is
for future 🦄 use and must be 0 in the MVP.
Constants (described here)
Name | Opcode | Immediates | Description |
---|---|---|---|
i32.const |
0x41 |
value : varint32 |
a constant value interpreted as i32 |
i64.const |
0x42 |
value : varint64 |
a constant value interpreted as i64 |
f32.const |
0x43 |
value : uint32 |
a constant value interpreted as f32 |
f64.const |
0x44 |
value : uint64 |
a constant value interpreted as f64 |
Comparison operators (described here)
Name | Opcode | Immediate | Description |
---|---|---|---|
i32.eqz |
0x45 |
||
i32.eq |
0x46 |
||
i32.ne |
0x47 |
||
i32.lt_s |
0x48 |
||
i32.lt_u |
0x49 |
||
i32.gt_s |
0x4a |
||
i32.gt_u |
0x4b |
||
i32.le_s |
0x4c |
||
i32.le_u |
0x4d |
||
i32.ge_s |
0x4e |
||
i32.ge_u |
0x4f |
||
i64.eqz |
0x50 |
||
i64.eq |
0x51 |
||
i64.ne |
0x52 |
||
i64.lt_s |
0x53 |
||
i64.lt_u |
0x54 |
||
i64.gt_s |
0x55 |
||
i64.gt_u |
0x56 |
||
i64.le_s |
0x57 |
||
i64.le_u |
0x58 |
||
i64.ge_s |
0x59 |
||
i64.ge_u |
0x5a |
||
f32.eq |
0x5b |
||
f32.ne |
0x5c |
||
f32.lt |
0x5d |
||
f32.gt |
0x5e |
||
f32.le |
0x5f |
||
f32.ge |
0x60 |
||
f64.eq |
0x61 |
||
f64.ne |
0x62 |
||
f64.lt |
0x63 |
||
f64.gt |
0x64 |
||
f64.le |
0x65 |
||
f64.ge |
0x66 |
Numeric operators (described here)
Name | Opcode | Immediate | Description |
---|---|---|---|
i32.clz |
0x67 |
||
i32.ctz |
0x68 |
||
i32.popcnt |
0x69 |
||
i32.add |
0x6a |
||
i32.sub |
0x6b |
||
i32.mul |
0x6c |
||
i32.div_s |
0x6d |
||
i32.div_u |
0x6e |
||
i32.rem_s |
0x6f |
||
i32.rem_u |
0x70 |
||
i32.and |
0x71 |
||
i32.or |
0x72 |
||
i32.xor |
0x73 |
||
i32.shl |
0x74 |
||
i32.shr_s |
0x75 |
||
i32.shr_u |
0x76 |
||
i32.rotl |
0x77 |
||
i32.rotr |
0x78 |
||
i64.clz |
0x79 |
||
i64.ctz |
0x7a |
||
i64.popcnt |
0x7b |
||
i64.add |
0x7c |
||
i64.sub |
0x7d |
||
i64.mul |
0x7e |
||
i64.div_s |
0x7f |
||
i64.div_u |
0x80 |
||
i64.rem_s |
0x81 |
||
i64.rem_u |
0x82 |
||
i64.and |
0x83 |
||
i64.or |
0x84 |
||
i64.xor |
0x85 |
||
i64.shl |
0x86 |
||
i64.shr_s |
0x87 |
||
i64.shr_u |
0x88 |
||
i64.rotl |
0x89 |
||
i64.rotr |
0x8a |
||
f32.abs |
0x8b |
||
f32.neg |
0x8c |
||
f32.ceil |
0x8d |
||
f32.floor |
0x8e |
||
f32.trunc |
0x8f |
||
f32.nearest |
0x90 |
||
f32.sqrt |
0x91 |
||
f32.add |
0x92 |
||
f32.sub |
0x93 |
||
f32.mul |
0x94 |
||
f32.div |
0x95 |
||
f32.min |
0x96 |
||
f32.max |
0x97 |
||
f32.copysign |
0x98 |
||
f64.abs |
0x99 |
||
f64.neg |
0x9a |
||
f64.ceil |
0x9b |
||
f64.floor |
0x9c |
||
f64.trunc |
0x9d |
||
f64.nearest |
0x9e |
||
f64.sqrt |
0x9f |
||
f64.add |
0xa0 |
||
f64.sub |
0xa1 |
||
f64.mul |
0xa2 |
||
f64.div |
0xa3 |
||
f64.min |
0xa4 |
||
f64.max |
0xa5 |
||
f64.copysign |
0xa6 |
Conversions (described here)
Name | Opcode | Immediate | Description |
---|---|---|---|
i32.wrap/i64 |
0xa7 |
||
i32.trunc_s/f32 |
0xa8 |
||
i32.trunc_u/f32 |
0xa9 |
||
i32.trunc_s/f64 |
0xaa |
||
i32.trunc_u/f64 |
0xab |
||
i64.extend_s/i32 |
0xac |
||
i64.extend_u/i32 |
0xad |
||
i64.trunc_s/f32 |
0xae |
||
i64.trunc_u/f32 |
0xaf |
||
i64.trunc_s/f64 |
0xb0 |
||
i64.trunc_u/f64 |
0xb1 |
||
f32.convert_s/i32 |
0xb2 |
||
f32.convert_u/i32 |
0xb3 |
||
f32.convert_s/i64 |
0xb4 |
||
f32.convert_u/i64 |
0xb5 |
||
f32.demote/f64 |
0xb6 |
||
f64.convert_s/i32 |
0xb7 |
||
f64.convert_u/i32 |
0xb8 |
||
f64.convert_s/i64 |
0xb9 |
||
f64.convert_u/i64 |
0xba |
||
f64.promote/f32 |
0xbb |
Reinterpretations (described here)
Name | Opcode | Immediate | Description |
---|---|---|---|
i32.reinterpret/f32 |
0xbc |
||
i64.reinterpret/f64 |
0xbd |
||
f32.reinterpret/i32 |
0xbe |
||
f64.reinterpret/i64 |
0xbf |