Skip to content

Latest commit

 

History

History
143 lines (121 loc) · 9.02 KB

TECHNICAL_DETAILS.md

File metadata and controls

143 lines (121 loc) · 9.02 KB

Gaia Technical Details

Intro

Gaia is a powerful tool designed to create and restore snapshots of parts of Minecraft worlds efficiently, without the need for a server restart. While minigame servers might prefer to not save changes and reload the whole world after a round ends, other gamemodes might need a less intrusive way to restore it to its original state. The purpose of this document is to explain how Gaia achieves its extreme performance and tiny memory footprint.

Creating snapshots

Creating and storing snapshots can be tricky considering arenas can have dozens or hundreds of million blocks.

To create a snapshot we need two things, to load the area and iterate over it to capture the block states for every position.

Chunk loading in modern minecraft can be performed asynchronously but interacting with the world needs to be done in the main thread therefore it is still a costly operation. The obvious solution is to split the workload into smaller tasks that can be performed over multiple ticks (the same concept will also be applied to restoring the snapshots). Therefore, we should strive to only load chunks that we currently need to operate on during that tick. For that reason, Gaia will split the area into chunk sub-regions to optimize chunk loading. We'll also utilize chunk tickets to let the server know we need a chunk to be loaded and kept in memory until we are done with it.

To understand chunk snapshots we first need to understand how Minecraft manages them (Chunk Format). The gist is that chunks are split into sections (16x16x16 blocks) and each section is paletted. A palette stores an id -> BlockState map and the raw data are basically just tightly arrays of those ids. We won't bother with iterating over each position as minecraft allows us to copy whole sections with relative ease.

Storing the snapshots is already solved for us. We can just use schematics! It utilizes a similar paletted format plus gzip compression allowing for relatively small files. Adopting the Sponge standard allows for compatibility with other tools such as WorldEdit. Gaia is currently storing Sponge v3 compliant schematics but only saving and reading the required fields and blocks data. It ignores any biomes, block entities etc as they are out of scope for Gaia's snapshot restoration.

Reading stored snapshots

A major issue with loading large schematics or multiple chunk-sized ones is the total amount of memory they need. WorldEdit for example, will unpack the data into 3-dimensional arrays of BlockStates. That requires a lot of memory but is necessary for them because they need to be able to transform the clipboard (eg rotate). Gaia however has no such requirement, we simply need to paste it at the same original position. Therefore, we'll load the data into a palette and a packed byte array. After some debugging and analyzing heap dumps, it's estimated that Gaia will use ~35MB of RAM for 40 million blocks. For comparison, a WorldEdit clipboard of equal volume requires over four times that memory just for the 3-dimensional array alone. Depending on the schematic you load, ram usage can spike by several GB putting the GC under pressure.

We can also theorize that the particular packed data structure Gaia uses improves memory locality. Proving that would require an even more in-depth analysis of JIT, the generated bytecode as well as depend on the executing CPU's cache which is out of scope for this document. However, the results below seem to (at least partially)
support that idea.

Restoring snapshots from memory

Restoring the snapshots is rather straight-forward. Once the data is loaded, we queue a chunk load and when ready we simply iterate over our data (same order as written, ZXY), map the id to the actual block state from the palette and set it on the chunk directly. The code looks something like this:

class SnapshotRestore {
  public void restoreSnapshot(Snapshot snapshot) {
    // calculate offsets and init some local variables
    final IndexedIterator<BlockState> it = gaiaSnapshot.iterator(); // Indexed iterator is a custom class allowing us to track the index while reading varints
    BlockState toRestore;
    while (it.hasNext()) {
      // get the index before it's incremented by reading the next varint
      int index = it.index();
      toRestore = it.next();
      // while encoding: index = (y * width * length) + (z * width) + x
      final int y = yOffset + (index / 256);
      final int z = zOffset + ((index % 256) / 16);
      final int x = xOffset + ((index % 256) % 16);
      // ensure the block is inside the region, this basically applies to non chunk aligned snapshots
      if (snapshot.chunk().region().contains(x, y, z)) {
        BlockState result = levelChunk.setBlockState(mutablePos.set(x, y, z), toRestore, false);
        if (result != null && result != toRestore) {
          chunkSource.blockChanged(mutablePos);
        }
      }
    }
  }
}

The current way allows processing a whole chunk (16 x 16 x 384 = 98,304 blocks) in ~2ms on average on a test system with a Ryzen 9 5900x.

It's true we could operate on a lower level by setting the raw data in the chunk sections or their palettes, but it would require duplicating a lot of vanilla logic while being a lot more prone to breaking after an update. Furthermore, both Vanilla and server software like Paper heavily optimize empty sections. If we were operate on a chunk sections and defer heightmap calculations it would actually be 25-35% slower.

Operation processing

As mentioned, the two distinct operations are creating and restoring a snapshot which are represented as CompletableFutures. Creating a snapshot is computationally trivial, we just make a copy of the internal data structure on the main thread and then process it asynchronously before finally storing it as a schematic. However, restoring a chunk requires us to be on the main thread for the entire duration of block checking and setting.

Operations can be created in any thread and are added to the tail of a ConcurrentLinkedQueue. Every tick, Gaia processes the first x elements and removes them once complete. The amount of processing done depends on the configuration (sections-per-tick and concurrent-chunks) as well as an internal timer that ensures Gaia doesn't go over 60% of the tick (30ms). Moreover, events are propagated when operations complete to allow other plugins to monitor Gaia's operations. Expanding on the previous test, restoring an area of 24x24 chunks (56,623,104 blocks) at around 14 operations per tick usually takes <2500ms including file IO on an SSD.

There's a third type of operation for calculating light data after an individual chunk or full arena has been restored. That type of operation is only supported on PaperMC servers as Gaia simply notifies the provided Starlight engine and lets the server handle the whole relighting. That operation is applied in a second pass after all local block setting has been completed to avoid unnecessary recalculations.

Benchmarks

Continuously queuing operations for an area of 24x24 chunks (56 million blocks) and measuring individual operation times:

System: Ryzen 9 5900x

Benchmark sample size: 576 operations each test

Restoring:

[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.647/2.125/2.068/2.705/3.627
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.635/2.089/2.072/2.348/3.315
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.666/2.148/2.089/2.705/4.614
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.667/2.191/2.099/2.974/4.315
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.624/2.118/2.082/2.493/3.780
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.694/2.118/2.081/2.549/3.349
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.684/2.165/2.039/3.087/3.997
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.658/2.094/2.055/2.460/4.906
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.718/2.216/2.108/2.983/3.941
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 1.707/2.206/2.109/3.041/3.595
Total operation times in milliseconds

[1989, 1980, 2047, 2040, 1984, 1988, 2046, 1999, 2080, 2083]

Analyzing

[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.005/0.018/0.009/0.027/0.929
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.008/0.006/0.012/0.429
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.007/0.006/0.012/0.280
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.006/0.005/0.009/0.276
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.006/0.005/0.008/0.273
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.006/0.005/0.007/0.294
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.005/0.005/0.008/0.138
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.007/0.005/0.011/0.453
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.004/0.017/0.010/0.022/2.075
[INFO]: [GaiaBenchmark] Min/Avg/Median/95th/Max: 0.003/0.017/0.011/0.021/2.164