This repo contains my code for the implementation of https://worldmodels.github.io/
The report is the file report.pdf in the main directory of this repository.
This informal readme explains a bit of the structure of the project at the time of submission. Some heavy refactoring would be needed to make the code more cohesive and modular, but it doesn't seem like a good idea to risk breaking something just before the submission, so it will happen after the deadline.
Also, the folders containing datasets, replays and checkpoints will not be on GitHub because of their size, so this file explains how the project structure should work.
CarRacer/ contains all of the code needed to replicate the results from the paper on the original game, CarRacing. It is completely isolated so that later experiments cannot accidentally mess up the earlier results. The only external file that it relies on is global_config.py, which contains shared configuration variables.
- CarRacer/checkpoints/ (which may or may not be on GitHub because of file size) contains the trained models.
- CarRacer/replays/ (which will definitely not be on GitHub) contains a large number of replays of random games.
- CarRacer/latent_states/ (not on GitHub either) contains all the latent representations generated by the VAE for the frames of the replays. These are used to train the memory, because it is much faster to dump them once than to compute them on the fly in the DataLoader.
- CarRacer/videos/ contains interesting videos of a full game, the same game encoded and then decoded by the VAE, and a game generated by the agent's MDN-RNN model.
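The idea behind dumping the latent representations once can be sketched as below. This is only an illustration, not the code used in this repo: `dump_latents`, `load_latents`, `toy_encode` and the pickle file format are all assumptions made for the example, and the toy encoder stands in for the real VAE.

```python
import pickle
import tempfile
from pathlib import Path


def dump_latents(frames, encode, out_file):
    """Encode every frame once and write the results to disk."""
    latents = [encode(frame) for frame in frames]
    with open(out_file, "wb") as fh:
        pickle.dump(latents, fh)
    return len(latents)


def load_latents(out_file):
    """Read back the cached latents; the encoder is not called here."""
    with open(out_file, "rb") as fh:
        return pickle.load(fh)


# Hypothetical stand-in for the VAE encoder: it averages the pixel values
# and counts how many times it has been called.
calls = {"n": 0}


def toy_encode(frame):
    calls["n"] += 1
    return [sum(frame) / len(frame)]


frames = [[0, 2, 4], [1, 1, 1]]
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "replay_0.pkl"
    dump_latents(frames, toy_encode, cache)
    # Every later "epoch" only reads the file, so the encoder cost is paid once.
    first_epoch = load_latents(cache)
    second_epoch = load_latents(cache)
```

However many times the cached file is read, the encoder runs only once per frame, which is where the speed-up for training the memory comes from.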
The files controller.py, memory.py, mdn.py and vae.py contain the implementation of the main models, used for all the games except CarRacing. These started out as copies of the same files in the directory CarRacer/ and were later modified during the experiments.
Procgen/ is the same as CarRacer/, but is used for every other game I tested (to avoid messing up the original results in any way). After the submission, CarRacer/ and Procgen/ could be merged into a single directory, removing duplicate code.
- Procgen/checkpoints, Procgen/latent_stuff and Procgen/replays work in the same way as for CarRacer/, but they contain one subdirectory for each game tested.
For example:
Procgen/checkpoints/chaser/
Procgen/checkpoints/dodgeball/
Procgen/checkpoints/enduro/
- Procgen/vae_dataset/ (definitely not on GitHub, it's about 16GB) contains a dataset for the VAE. After switching from CarRacing to different games, I figured it would be faster to extract every frame from the replays into images and load them that way, so this directory contains an image for each frame of each random game played.
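A one-image-per-frame dataset can be read lazily with a small map-style class like the one below. It is a sketch under assumptions, not the loader used in this repo: the `FrameFolder` class, the `write_ppm` helper, the PPM format and the `<game>/<replay>_<frame>` naming are all made up for the example, chosen so that it runs with the standard library alone.

```python
import tempfile
from pathlib import Path


class FrameFolder:
    """Map-style dataset over image files: one file per frame."""

    def __init__(self, root, pattern="*.ppm"):
        # Sort so that the index -> file mapping is stable across runs.
        self.files = sorted(Path(root).rglob(pattern))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        # Read lazily: only the requested frame is loaded from disk.
        return self.files[index].read_bytes()


def write_ppm(path, width, height, rgb_bytes):
    """Save raw RGB bytes as a binary PPM image."""
    header = f"P6 {width} {height} 255\n".encode("ascii")
    Path(path).write_bytes(header + bytes(rgb_bytes))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <root>/<game>/<replay>_<frame>.ppm
    game_dir = Path(tmp) / "chaser"
    game_dir.mkdir()
    for i in range(3):
        write_ppm(game_dir / f"replay0_{i:05d}.ppm", 1, 1, [i, i, i])

    dataset = FrameFolder(tmp)
    n_frames = len(dataset)
    last_frame = dataset[2]
```

Because the class only implements `__len__` and `__getitem__`, the same shape can be reused as a PyTorch map-style dataset, with decoding into tensors added inside `__getitem__`.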
This is taken from this repo and contains a collection of VAE models. I ended up trying out most of these VAEs, first to solve posterior collapse and later to try to achieve less blurry reconstructions.
Contains some outputs I wanted to keep, including a couple of simulations of games invented by the memory (dream games).
This project tackles only the first and main algorithm presented in the paper, in which the controller is trained on actual games, receiving the world model's output as input.
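In the paper, the controller is a single linear layer applied to the concatenation of the VAE latent z and the memory's hidden state h. The sketch below shows that mapping in plain Python. The function name, the tiny dimensions and the weights are illustrative assumptions, and the controller.py in this repo may differ in its details.

```python
def controller_action(z, h, weights, bias):
    """Linear controller: action = W @ [z, h] + b, written without any library."""
    features = list(z) + list(h)  # concatenate latent and hidden state
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


# Toy sizes (2-dim latent, 1-dim hidden state, 2 actions), for illustration only.
z = [1.0, 2.0]
h = [3.0]
weights = [
    [1.0, 0.0, 0.0],  # first action looks only at z[0]
    [0.0, 1.0, 1.0],  # second action sums z[1] and h[0]
]
bias = [0.5, -1.0]

action = controller_action(z, h, weights, bias)
```

Keeping the controller this small is what makes it practical to optimise with an evolution strategy, as the paper does, instead of backpropagation.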
The alternative method shown in the paper is to train the controller on dream games, meaning games generated by the MDN-RNN (memory) of the world model. I intend to extend this project at some point to explore this alternative as well, but not at the moment, since it would require a lot of further configuration and training time.
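The structure of a dream game can be sketched as a loop in which the memory, not the real environment, produces the next latent state. Everything below is a hypothetical illustration: `dream_rollout`, the two callables and the scalar toy states are assumptions made for the example, and the reward returned by the toy memory is a placeholder, since how reward or termination is obtained inside a dream depends on the game.

```python
def dream_rollout(controller, memory_step, z0, h0, steps):
    """Roll out an episode entirely inside the memory model.

    controller(z, h) -> action
    memory_step(z, action, h) -> (next_z, next_h, reward)
    Both callables are hypothetical stand-ins, not this repo's API.
    """
    z, h, total_reward = z0, h0, 0.0
    for _ in range(steps):
        action = controller(z, h)
        # The next latent comes from the memory, not from the real game.
        z, h, reward = memory_step(z, action, h)
        total_reward += reward
    return total_reward


# Toy stand-ins so the sketch runs: scalar latent, scalar hidden state.
def toy_controller(z, h):
    return 1.0 if z < 3 else 0.0


def toy_memory_step(z, action, h):
    return z + action, h + 1, action


total = dream_rollout(toy_controller, toy_memory_step, z0=0.0, h0=0, steps=5)
```

In a real setup the memory step would sample the next latent from the MDN-RNN's predicted mixture, and the controller's fitness would be the return accumulated over such rollouts.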