That is, the module is (1) validated, then (2) translated to an Intermediate Representation (IR).
The wazero IR can then be executed directly (in the case of the interpreter) or further processed and translated into native code by the compiler. This compiler performs a straightforward translation from the IR to native code, without any further passes: the wazero IR is not intended for processing beyond immediate execution or straightforward translation.

```goat
       +---- wazero IR ----+
       |                   |
       v                   v
+--------------+    +--------------+
|   Compiler   |    | Interpreter  |- - - executable
+--------------+    +--------------+
       |
       +------------+
       |            |
       v            v
  +---------+  +---------+
  |  ARM64  |  |  AMD64  |
  | Backend |  | Backend | - - - - - - - - - executable
  +---------+  +---------+
```
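
This choice is also visible in wazero's public API. As a minimal sketch, this is how a user picks one engine or the other (the compiler is the default on supported platforms):

```go
package main

import (
	"context"

	"github.com/tetratelabs/wazero"
)

func main() {
	ctx := context.Background()

	// Interpreter: executes the wazero IR directly.
	interp := wazero.NewRuntimeWithConfig(ctx, wazero.NewRuntimeConfigInterpreter())
	defer interp.Close(ctx)

	// Compiler: translates the IR to native code for the host (ARM64 or AMD64).
	// This is also the default on supported platforms.
	comp := wazero.NewRuntimeWithConfig(ctx, wazero.NewRuntimeConfigCompiler())
	defer comp.Close(ctx)
}
```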


Validation and translation to an IR make up what is usually called the **front-end** of a compiler, while code generation happens in the **back-end**. The front-end is the part of a compiler closer to the input; it generally performs machine-independent processing, such as parsing and static validation. The back-end is the part closer to the output; it generally includes machine-specific procedures, such as code generation.

In the **optimizing** compiler, we still decode and translate Wasm binaries to an intermediate representation in the front-end, but we use a textbook representation called **SSA**, or "Static Single-Assignment Form", which is intended for further transformation.
The wazero optimizing compiler implements the following compilation passes:

```goat
                   +-------------------+           +-------------------+
                   |                   |           |                   |
   Wasm Binary --->|   DecodeModule    |---------->|   CompileModule   |--+
                   |                   |           |                   |  |
                   +-------------------+           +-------------------+  |
                                                                          |
  +-----------------------------------------------------------------------+
  |
  |   +---------------+                                  +---------------+
  +-->|   Front-End   |--------------------------------->|   Back-End    |
      +---------------+                                  +---------------+
              |                                                  |
              v                                                  v
             SSA                                      Instruction Selection
              |                                                  |
              v                                                  v
         Optimization                                   Register Allocation
                                                                 |
                                                                 v
                                                      Finalization/Encoding
```
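
To make the data flow between these passes concrete, here is a purely illustrative Go sketch of the pipeline. Every type and function name below is hypothetical; it does not correspond to wazero's actual internal API:

```go
package main

// Hypothetical placeholder types; wazero's real data structures differ.
type wasmModule struct{}   // decoded module: sections, function bodies, ...
type ssaFunction struct{}  // a function in SSA form: blocks, values, ...
type machineInsts struct{} // machine-specific instructions (ARM64 or AMD64)
type machineCode []byte    // encoded native code

// Hypothetical pass stubs, named after the boxes in the diagram above.
func decodeModule(bin []byte) (*wasmModule, error)      { return &wasmModule{}, nil }
func buildSSA(m *wasmModule) []ssaFunction              { return nil }
func optimize(fns []ssaFunction) []ssaFunction          { return fns }
func selectInstructions(fns []ssaFunction) machineInsts { return machineInsts{} }
func allocateRegisters(in machineInsts) machineInsts    { return in }
func encode(in machineInsts) machineCode                { return nil }

// compile mirrors the diagram: decode/compile the module, run the front-end
// (SSA construction and optimization), then the back-end (instruction
// selection, register allocation, finalization/encoding).
func compile(wasmBinary []byte) (machineCode, error) {
	mod, err := decodeModule(wasmBinary)
	if err != nil {
		return nil, err
	}
	fns := optimize(buildSSA(mod))                       // front-end
	insts := allocateRegisters(selectInstructions(fns)) // back-end
	return encode(insts), nil
}

func main() {
	_, _ = compile(nil)
}
```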

## Front-End: Translation to SSA
In short, every program, or, in our case, every Wasm function, can be translated into a control-flow graph.
The control-flow graph is a directed graph whose nodes are **basic blocks**: sequences of statements that contain no control-flow instructions.
Control-flow instructions are instead translated into the edges that connect the basic blocks.

For instance, take the following implementation of the `abs` function:

```wasm
(module
  (func (;0;) (param i32) (result i32)
    (if (result i32) (i32.lt_s (local.get 0) (i32.const 0))
      (then
        (i32.sub (i32.const 0) (local.get 0)))
      (else
        (local.get 0))
    )
  )
  (export "f" (func 0))
)
```
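
As a side note, if you want to try this example, the following sketch runs it with wazero; it assumes the text above has been assembled to a binary module (for example with `wat2wasm`) and made available as `absWasm`:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/tetratelabs/wazero"
)

// absWasm is assumed to hold the binary encoding of the module above,
// e.g. produced by wat2wasm and loaded with go:embed or os.ReadFile.
var absWasm []byte

func main() {
	ctx := context.Background()
	r := wazero.NewRuntime(ctx)
	defer r.Close(ctx)

	mod, err := r.Instantiate(ctx, absWasm)
	if err != nil {
		log.Fatal(err)
	}

	// i32 parameters and results are passed as uint64 in wazero's Call API.
	arg := uint64(uint32(int32(-5)))
	results, err := mod.ExportedFunction("f").Call(ctx, arg)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(int32(uint32(results[0]))) // prints 5
}
```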

The Wasm function above is translated to the following block diagram:

```goat
       +---------------------------------------------+
       |blk0: (exec_ctx:i64, module_ctx:i64, v2:i32) |
       |    v3:i32 = Iconst_32 0x0                   |
       |    v4:i32 = Icmp lt_s, v2, v3               |
       |    Brz v4, blk2                             |
       |    Jump blk1                                |
       +---------------------------------------------+
                              |
                              |
              +---(v4 != 0)---+--(v4 == 0)----+
              |                               |
              v                               v
+---------------------------+   +---------------------------+
|blk1: () <-- (blk0)        |   |blk2: () <-- (blk0)        |
|    v6:i32 = Iconst_32 0x0 |   |    Jump blk3, v2          |
|    v7:i32 = Isub v6, v2   |   |                           |
|    Jump blk3, v7          |   |                           |
+---------------------------+   +---------------------------+
              |                               |
              |                               |
              +-{v5 := v7}----+---{v5 := v2}--+
                              |
                              v
               +------------------------------+
               |blk3: (v5:i32) <-- (blk1,blk2)|
               |    Jump blk_ret, v5          |
               +------------------------------+
                              |
                         {return v5}
                              |
                              v
```

We use the ["block argument" variant of SSA][ssa-blocks], which is also the same representation [used in LLVM's MLIR][llvm-mlir]. In this variant, each block takes a list of arguments. Each block ends with a jump instruction with an optional list of arguments; these arguments, are assigned to the target block's arguments like a function.

Consider the first block `blk0`. You will notice that, compared to the original function, it takes two extra parameters (`exec_ctx` and `module_ctx`), followed by one parameter `v2`, corresponding to the function's only parameter. It then defines two variables, `v3` and `v4`: `v3` is the constant 0, and `v4` is the result of comparing `v2` to `v3` using the `i32.lt_s` instruction.

You might also have noticed that the instructions do not correspond strictly to the original Wasm opcodes. This is because, like the wazero IR used by the old compiler, this is a custom IR.

You will also notice that no name ever occurs twice on the left-hand side of an assignment: every variable is assigned exactly once, which is why this form is called "single-assignment".

<!--
We use the "block argument" variant of SSA: https://en.wikipedia.org/wiki/Static_single-assignment_form#Block_arguments
which is equivalent to the traditional PHI-function-based form, but more convenient during optimizations.
However, in this package's source code comments, we might use PHI whenever it seems necessary in order to stay aligned with the
existing literature, e.g. SSA-level optimization algorithms are often described using PHI nodes.
The rationale doc for LLVM MLIR's choice of "block arguments" is worth a read:
https://mlir.llvm.org/docs/Rationale/Rationale/#block-arguments-vs-phi-nodes
The algorithm to resolve variable definitions used here is based on the paper
"Simple and Efficient Construction of Static Single Assignment Form": https://link.springer.com/content/pdf/10.1007/978-3-642-37051-9_6.pdf.
-->

[ssa-blocks]: https://en.wikipedia.org/wiki/Static_single-assignment_form#Block_arguments
[llvm-mlir]: https://mlir.llvm.org/docs/Rationale/Rationale/#block-arguments-vs-phi-nodes
