Writing Bops: The Bebop Schema Language
Bebop schemas are written in the custom Bebop Schema Language, which this page documents.
A Bebop schema consists of a series of definitions, each introduced by a keyword, followed by a name and a description in curly braces. Here is an example schema, demonstrating the three kinds of definition Bebop accepts:
enum Instrument {
    Sax = 0;
    Trumpet = 1;
    Clarinet = 2;
}
struct Performer {
    string name;
    Instrument plays;
}
message Song {
    string title = 1;
    uint16 year = 2;
    Performer[] performers = 3;
}
Let's go over each of these:
An enum defines a type that acts as a wrapper around uint32, with certain named constants, each having a corresponding underlying integer value. It is used much like an enum in C.
The syntax is: enum Flavor { Vanilla = 1; Chocolate = 2; Mint = 3; }
Unlike in C, all constants must be explicitly given an integer literal value.
You should never remove a constant from an enum definition. Instead, put [deprecated("reason here")] in front of the name.
You're free to add new constants to an enum at any point in the future.
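Because an enum is a wrapper around uint32, a constant can be written to the wire as a plain 4-byte integer. The following is a minimal sketch of that idea, not generated Bebop code: the Python names (`Instrument`, `encode_instrument`, `decode_instrument`) are hypothetical, and little-endian byte order is an assumption this page does not state.

```python
import struct
from enum import IntEnum


class Instrument(IntEnum):
    """Hypothetical Python mirror of the Instrument enum in the example schema."""
    SAX = 0
    TRUMPET = 1
    CLARINET = 2


def encode_instrument(value: Instrument) -> bytes:
    # Assumption: the underlying uint32 is written in little-endian order.
    return struct.pack("<I", value)


def decode_instrument(data: bytes) -> Instrument:
    # Read the uint32 back and map it onto a named constant.
    (raw,) = struct.unpack_from("<I", data)
    return Instrument(raw)
```

In this sketch, `Instrument(raw)` raises `ValueError` for a value the reader does not know about, which is one reason to keep old constants in the schema (marked as deprecated) rather than removing them.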
A struct defines an aggregation of "fields", containing typed values in a fixed order. All values are always present. It is used much like a struct in C.
The syntax is: struct Point { int32 x; int32 y; }
The binary representation of a struct is simply that of all field values in order. This means it's more compact and efficient than message.
When you define a struct, you're promising to never add or remove fields from it. (If this turns out to be necessary, you'll have to define a struct MyStructV2 and deprecate the old struct MyStruct.)
When you define a struct with the readonly modifier, the Bebop compiler guarantees that its values cannot be modified or updated after decoding takes place. Use this to ensure data integrity when marshalling between language domains.
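To illustrate "all field values in order", here is a rough sketch of how the Point struct from the syntax example could be laid out. It is only an illustration: the helper names are hypothetical, and little-endian byte order is an assumption that this page does not spell out.

```python
import struct

# struct Point { int32 x; int32 y; }
# Two signed 32-bit integers back to back, with no length prefix and no
# field indices. "<" (little-endian) is an assumption made for this sketch.
POINT_LAYOUT = struct.Struct("<ii")


def encode_point(x: int, y: int) -> bytes:
    return POINT_LAYOUT.pack(x, y)


def decode_point(data: bytes) -> tuple:
    # Every field is always present, so decoding is a fixed-size read.
    return POINT_LAYOUT.unpack(data)
```

The encoded Point is exactly 8 bytes. There is no room in this layout to signal a missing or extra field, which is why the field list of a struct has to stay frozen.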
A message defines an indexed aggregation of fields containing typed values, each of which may be absent. It might correspond to something like a class in Java, or a JSON object.
The syntax is: message Song { string title = 1; uint16 year = 2; } (note the indices).
In the binary representation of a message, the message is prefixed with its length, and each field is prefixed with its index.
It's okay to add fields to a message with new indices later; in fact, this is the whole point of message. (When an unrecognized field index is encountered in the process of decoding a message, it is skipped over. This allows for compatibility with versions of your app that use an older version of the schema.)
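The length prefix and per-field indices are what make forward compatibility possible. The sketch below encodes and decodes the two-field Song message from the syntax example. Several details are assumptions made for illustration rather than facts stated on this page: little-endian integers, a one-byte field index, a zero byte marking the end of the message, and a length that counts the bytes following the prefix. The function names are hypothetical.

```python
import struct


def encode_string(s: str) -> bytes:
    raw = s.encode("utf-8")
    return struct.pack("<I", len(raw)) + raw


def encode_song(title=None, year=None) -> bytes:
    body = b""
    if title is not None:            # absent fields are simply not written
        body += bytes([1]) + encode_string(title)
    if year is not None:
        body += bytes([2]) + struct.pack("<H", year)
    body += bytes([0])               # assumed end-of-message marker
    return struct.pack("<I", len(body)) + body


def decode_song(data: bytes) -> dict:
    (length,) = struct.unpack_from("<I", data, 0)
    pos, end = 4, 4 + length
    song = {}
    while pos < end:
        index = data[pos]
        pos += 1
        if index == 0:               # end of message
            break
        elif index == 1:             # string title = 1;
            (n,) = struct.unpack_from("<I", data, pos)
            pos += 4
            song["title"] = data[pos:pos + n].decode("utf-8")
            pos += n
        elif index == 2:             # uint16 year = 2;
            (song["year"],) = struct.unpack_from("<H", data, pos)
            pos += 2
        else:
            # Unrecognized index: its size is unknown to this reader, so
            # use the length prefix to jump past the rest of the message.
            pos = end
    return song
```

In this sketch an old reader that meets a newer field keeps everything it decoded up to that point and ignores the rest, so new fields are best appended with higher indices.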
When talking about Bebop, the word "record" is used to mean "either a struct or a message".
In any definition, ; is used to delimit items. In a record definition, each field is specified by giving the name of the type of the field, followed by the name of the field, followed by ;.
The following types are built-ins:
Name | Description |
---|---|
bool | A Boolean value, true or false. |
byte | An unsigned 8-bit integer. |
uint16 | An unsigned 16-bit integer. |
int16 | A signed 16-bit integer. |
uint32 | An unsigned 32-bit integer. |
int32 | A signed 32-bit integer. |
uint64 | An unsigned 64-bit integer. |
int64 | A signed 64-bit integer. |
float32 | A 32-bit IEEE single-precision floating point number. |
float64 | A 64-bit IEEE double-precision floating point number. |
string | A length-prefixed UTF-8-encoded string. |
guid | A GUID. |
date | A UTC date / timestamp. |
T[] | A length-prefixed array of T values. array[T] is an alias. |
map[T1, T2] | A map, as a length-prefixed array of (T1, T2) association pairs. |
You may also use user-defined types (enums and other records) as field types.
A string is stored as a length-prefixed array of bytes. All length-prefixes are 32-bit unsigned integers, which means the maximum number of bytes in a string, or entries in an array or map, is about 4 billion (2^32).
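As a small illustration of length prefixes, the sketch below encodes a string and a uint16[] array. The helper names are hypothetical, and little-endian byte order is an assumption. Note that the string prefix counts UTF-8 bytes, not characters, while the array prefix counts entries.

```python
import struct


def encode_string(s: str) -> bytes:
    raw = s.encode("utf-8")
    # 32-bit unsigned prefix holding the number of bytes that follow.
    return struct.pack("<I", len(raw)) + raw


def decode_string(data: bytes, pos: int = 0):
    (n,) = struct.unpack_from("<I", data, pos)
    start = pos + 4
    # Return the text and the position just past it.
    return data[start:start + n].decode("utf-8"), start + n


def encode_uint16_array(values) -> bytes:
    # 32-bit unsigned prefix holding the number of entries, then the entries.
    return struct.pack("<I", len(values)) + struct.pack(f"<{len(values)}H", *values)
```

For example, "héllo" has five characters but six UTF-8 bytes, so its prefix holds 6.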
A guid is stored as 16 bytes, in Guid.ToByteArray order.
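Guid.ToByteArray is the .NET method, and it writes the first three groups of the GUID in little-endian order, which differs from the order of the printed hex digits. Python's standard `uuid` module exposes the same layout as `bytes_le`, so a sketch of the conversion (with hypothetical helper names) could look like this:

```python
import uuid


def encode_guid(g: uuid.UUID) -> bytes:
    # bytes_le matches .NET's Guid.ToByteArray: the first three groups
    # are byte-swapped, the last eight bytes are kept in printed order.
    return g.bytes_le


def decode_guid(data: bytes) -> uuid.UUID:
    return uuid.UUID(bytes_le=data[:16])


g = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
wire = encode_guid(g)
print(wire.hex())  # 33221100554477668899aabbccddeeff
```

Using `g.bytes` instead would put the first eight bytes in a different order, so the other side would decode a different GUID.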
A date is stored as a 64-bit integer amount of “ticks” since 00:00:00 UTC on January 1 of year 1 A.D. in the Gregorian calendar, where a “tick” is 100 nanoseconds.
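The tick arithmetic can be sketched with Python's `datetime`, whose finest resolution is one microsecond, or 10 ticks. The names `to_ticks` and `from_ticks` are hypothetical, and the sketch only computes the integer; how that integer is laid out on the wire is not covered here.

```python
from datetime import datetime, timedelta, timezone

# 00:00:00 UTC on January 1 of year 1, the origin of the tick count.
EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
TICKS_PER_MICROSECOND = 10  # one tick is 100 ns


def to_ticks(dt: datetime) -> int:
    # Dividing one timedelta by another gives an exact integer count.
    microseconds = (dt - EPOCH) // timedelta(microseconds=1)
    return microseconds * TICKS_PER_MICROSECOND


def from_ticks(ticks: int) -> datetime:
    # Sub-microsecond precision is dropped; Python cannot represent it.
    return EPOCH + timedelta(microseconds=ticks // TICKS_PER_MICROSECOND)


unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(to_ticks(unix_epoch))  # 621355968000000000
```

The datetime passed to `to_ticks` needs to be timezone-aware, since subtracting an aware datetime from a naive one raises `TypeError`.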
Use [deprecated("We no longer use this")] before a field to mark it as deprecated. When encoding a message, deprecated fields are skipped. A notice will also be copied into the generated code.
Use [opcode(0x12345678)] before a record definition to associate an identifying "opcode" with it. You can also use a 4-byte ASCII string as an opcode: [opcode("Ping")].
Strictly speaking, Bebop is not opinionated about what you do with these opcodes. But you may find it useful to send this kind of thing over the wire:
12 34 56 78 03 00 00 00 18 00 ...
[4-byte opcode] [Bebop-encoded data]
And use the 4-byte opcode to decide which decoder/handler to dispatch the rest of the packet to. For more information see Mirrors.
All the compiler does is check that no opcode is used twice, and add something like class Foo { const int Opcode = 0x12345678; ... } in the generated code for you to use in your dispatching code.
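A dispatcher built on this convention might look like the sketch below. Everything in it is hypothetical: the handler functions, the `HANDLERS` table, and the choice to key the table on the raw 4 bytes as they appear on the wire. Keying on raw bytes sidesteps the question of how a generated integer constant maps to byte order, which this page leaves up to you.

```python
def handle_ping(payload: bytes):
    # Placeholder: a real handler would decode the Bebop payload here.
    return ("ping", payload)


def handle_song(payload: bytes):
    return ("song", payload)


# Hypothetical table mapping the 4 opcode bytes on the wire to a handler.
HANDLERS = {
    b"Ping": handle_ping,
    bytes.fromhex("12345678"): handle_song,
}


def dispatch(packet: bytes):
    if len(packet) < 4:
        raise ValueError("packet too short to contain an opcode")
    opcode, payload = packet[:4], packet[4:]
    try:
        handler = HANDLERS[opcode]
    except KeyError:
        raise ValueError(f"unknown opcode {opcode.hex()}") from None
    return handler(payload)
```

Since the compiler rejects duplicate opcodes, a table like this has at most one handler per record.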
As in many C-like languages, // starts a comment until the end of the line, whereas /* and */ delimit a block comment.
If a comment is placed directly before a field specification (/* like so */ int32 x;) or before a definition (/* like so */ struct S { ... }), that comment will be copied over as "documentation" to the corresponding bit of generated code.