Compiler Technical Info

Joe Pfeiffer edited this page Sep 13, 2022 · 3 revisions

Technical Info about the compiler.

The BC Compiler is a compiler that targets the Synacor VM. It compiles a language named BC to Synacor bytecode that can be run on any spec-compliant Synacor VM that was built for the Synacor Challenge.

The syntax is loosely based on C. Pointers are used heavily, as in C, but they are declared differently: with a C# generics-like syntax in which a pointer to T is written ptr<T>. At the time of creation, I thought this was more readable. Jury's still out here.

The compiler is somewhat crudely written in C#. It emits assembly code that can then be fed into the assembler in the repo to create a binary.

Built-in types

int

ints are the basic building block of Synacor bytecode. They are 16-bit unsigned little-endian (LE) integers. This is all you get as far as numeric types go in the Synacor VM.
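
To make the representation concrete, here is a minimal Python sketch (not part of the compiler) that decodes a byte stream into 16-bit little-endian words. The helper names are made up for illustration. The wraparound in `vm_add` follows the Synacor architecture spec, which defines all math as modulo 32768; it is not something this page states.

```python
import struct

# Per the Synacor architecture spec (not this page), arithmetic wraps modulo 32768.
MODULUS = 32768


def decode_words(raw: bytes) -> list[int]:
    """Decode raw bytes into 16-bit unsigned little-endian words."""
    if len(raw) % 2 != 0:
        raise ValueError("expected an even number of bytes")
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}H", raw))


def encode_words(words: list[int]) -> bytes:
    """Encode words back into 16-bit unsigned little-endian bytes."""
    return struct.pack(f"<{len(words)}H", *words)


def vm_add(a: int, b: int) -> int:
    """Add two values the way the VM would, wrapping at MODULUS."""
    return (a + b) % MODULUS


if __name__ == "__main__":
    print(decode_words(bytes([0x15, 0x00, 0x13, 0x00, 0x41, 0x00])))  # [21, 19, 65]
    print(vm_add(32758, 15))  # 5
```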

string

A string is a wrapper around a \0-terminated sequence of ints that represents the text. It is basically a wrapper around a void pointer and can be cast to one interchangeably, e.g. "test" as ptr
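
As a rough illustration of that layout, the Python sketch below walks a flat list of ints from a start address until it reaches the 0 terminator. It assumes one character code per int, and `read_string` and `write_string` are hypothetical helpers, not functions from the compiler or its standard library.

```python
def write_string(memory: list[int], address: int, text: str) -> int:
    """Store text as one character code per word, followed by a 0 terminator.

    Returns the address of the first word, which plays the role of the pointer.
    """
    for offset, ch in enumerate(text):
        memory[address + offset] = ord(ch)
    memory[address + len(text)] = 0
    return address


def read_string(memory: list[int], address: int) -> str:
    """Read words starting at address until the 0 terminator is reached."""
    chars = []
    while memory[address] != 0:
        chars.append(chr(memory[address]))
        address += 1
    return "".join(chars)


if __name__ == "__main__":
    memory = [0] * 16
    pointer = write_string(memory, 4, "test")
    print(pointer, read_string(memory, pointer))  # 4 test
```

The returned address is all a caller needs to hold, which is why treating a string as a plain pointer is cheap.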

ptr<T>

Pointers can point to any type, and the type checker uses the pointed-to type to enforce type safety even when using pointers. This does not, of course, apply if you are casting to a void pointer. e.g.:

type test
{
  int id
}
ptr<test> &test_instance;
do_something(test_instance);

The declaration syntax for the pointer here, ptr<test> &test_instance;, is a shortcut that automatically creates the backing variable for the pointer. Currently, dereferencing does not work correctly when modifying values, for either complex types or primitives, so the current workaround is to use arrow access for complex types and copies for primitives. e.g.:

Complex (Custom types)

//                      V-- here the value is passed by reference by virtue of using the ptr.
function do_something(ptr<test> test_thing)
{
  // The ptr is returned with a register, copied, and then used to point to the same data that we used in the function scope
  return test_thing->id = 123;
}
//                      V-- here the value is passed by stack.
function do_something(test test_thing)
{
  // The test_thing is returned by loading up the stack with its members, copied, and then used to point to the same data that we used in the function scope
  return test_thing.id = 123;
}

Primitives (think int, string)

//                          V-- here the value is passed by copy. Just like primitives in C#
function do_something_basic(int id) : int
{
  // The value is returned and copied out of a register here as well.
  return id + 123;
}
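
The difference between the three calling styles above can be modelled outside BC. The Python sketch below is only an analogy: it treats memory as a flat list of words, assumes a hypothetical one-word layout for the test type (id at offset 0), and uses made-up function names. It does not describe the code the compiler actually emits.

```python
# Hypothetical layout: the single field `id` sits at offset 0 of a `test` value.
TEST_ID_OFFSET = 0


def set_id_via_pointer(memory: list[int], address: int) -> int:
    """Model of ptr<test>: write through the address, so the caller sees the change."""
    memory[address + TEST_ID_OFFSET] = 123
    return address


def set_id_on_copy(fields: list[int]) -> list[int]:
    """Model of passing `test` by value: only a local copy of the members changes."""
    local = list(fields)
    local[TEST_ID_OFFSET] = 123
    return local


def add_basic(value: int) -> int:
    """Model of passing an int by copy; the caller's variable is never touched."""
    return value + 123


if __name__ == "__main__":
    memory = [0] * 8
    set_id_via_pointer(memory, 2)
    print(memory[2])  # 123

    original = [7]
    changed = set_id_on_copy(original)
    print(original, changed)  # [7] [123]
```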

func<Tin.., Tout>

Function pointers are implemented in much the same way as raw pointers. They add an extra layer of abstraction and type safety by requiring you to specify the parameter and return types. For example, a function that takes an int and returns a string is specified as func<int, string>. You always have to specify the return type, so a function that takes a string and returns nothing is specified as func<string, void>. Function declarations are func types. They can be explicitly assigned just like in JavaScript and C#:

type test_type
{
    int id
}

func<int, ptr<test_type>> test_type_factory = function(int id) : ptr<test_type>
{
    ptr<test_type> &t;
    t->id = id;
    return t;
}
ptr<test_type> shiny_new_test = test_type_factory(123);
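
The naming rule for func types (parameter types first, return type always last) can be summarised with a small Python helper. This is a hypothetical illustration of the notation only; `func_type` is not part of the compiler, and how a function with no parameters is spelled is not covered on this page.

```python
def func_type(param_types: list[str], return_type: str = "void") -> str:
    """Build a func<...> type name: parameters in order, return type last."""
    return "func<" + ", ".join([*param_types, return_type]) + ">"


if __name__ == "__main__":
    print(func_type(["int"], "string"))          # func<int, string>
    print(func_type(["string"]))                 # func<string, void>
    print(func_type(["int"], "ptr<test_type>"))  # func<int, ptr<test_type>>
```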

Implicit typing

As the types in the example above are a bit cumbersome to type a lot, you could use an implicit type here as follows:

#include "stdlib.bc"
var test_type_factory = function(int id) : ptr<test_type>
{
    ptr<test_type> &t;
    t->id = id;
    return t;
}
println(typeof(test_type_factory)); // Prints "func<int, ptr<test_type>>"

var shiny_new_test = test_type_factory(123);
println(typeof(shiny_new_test)); // Prints "ptr<test_type>"

Implicit types can be used wherever the type of the assigned object can reasonably be inferred. This obviously does not work for standalone variable declarations, as the type cannot be known if you aren't assigning anything to the variable. For the same reason, implicit typing cannot be used for parameters in function definitions either. Another case is the declaration of pointer types, since the type of the pointer must be known, e.g. ptr<test_type> &test_type
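
The rule boils down to "var takes the type of its initializer, and is an error without one". A minimal Python sketch of that decision follows, using a hypothetical `resolve_var_type` helper that works on type names as strings. The real type checker is written in C# and is certainly more involved.

```python
from typing import Optional


def resolve_var_type(declared_type: str, initializer_type: Optional[str]) -> str:
    """Return the type a declaration ends up with.

    declared_type is the keyword written in the source ("var" or a concrete type);
    initializer_type is the type of the assigned expression, or None if absent.
    This sketch does not check that an explicit type matches its initializer.
    """
    if declared_type != "var":
        return declared_type
    if initializer_type is None:
        raise TypeError("cannot infer a type for 'var' without an initializer")
    return initializer_type


if __name__ == "__main__":
    print(resolve_var_type("var", "ptr<test_type>"))  # ptr<test_type>
    print(resolve_var_type("int", None))              # int
```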

Custom Data structures

Structs, or types as they are called in BC, are created with the type keyword. There is no need for any typedef stuff: as soon as a type is declared, it can be used to bind variables. No initializer syntax is present yet.

type person
{
  int age;
  string name;
}

Preprocessor Directives

#include "<filepath>"

Includes the specified BC file into the current one.
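
A directive like this is commonly handled as plain text substitution before parsing. The Python sketch below shows that idea over an in-memory dictionary of files. It assumes textual inclusion and an include-once guard, neither of which this page confirms, and `expand_includes` is a hypothetical name.

```python
import re
from typing import Optional

# Matches a line of the form: #include "path"
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand_includes(name: str, files: dict[str, str], seen: Optional[set[str]] = None) -> str:
    """Return the text of files[name] with every #include line replaced by file contents."""
    seen = set() if seen is None else seen
    if name in seen:
        return ""  # assumption: a file already included is skipped, which also stops cycles
    seen.add(name)

    output = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            output.append(expand_includes(match.group(1), files, seen))
        else:
            output.append(line)
    return "\n".join(output)


if __name__ == "__main__":
    files = {
        "main.bc": '#include "stdlib.bc"\n// main contents',
        "stdlib.bc": "// stdlib contents",
    }
    print(expand_includes("main.bc", files))
```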