We investigate whether Graph Element Networks (a graph convolutional neural network architecture we published at ICML 2019) can be used to organize memories spatially, for the problem of generating scene images from novel viewpoints.
We sample 3D mazes from the DeepMind Lab game platform dataset; each maze comes with a series of images. Each image shows how the maze looks from a specific 2D coordinate, with the camera at a specific (yaw, pitch, roll) orientation.
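For concreteness, here is a minimal sketch of what one such view record could look like once loaded into PyTorch; the `View` class and its field names are illustrative, not the repo's actual schema.

```python
import math
from typing import NamedTuple

import torch

class View(NamedTuple):
    image: torch.Tensor        # RGB frame, e.g. of shape (3, 64, 64)
    position: torch.Tensor     # (x, y) coordinate in the maze
    orientation: torch.Tensor  # (yaw, pitch, roll) of the camera

# Example: a camera at (1.5, -0.75) looking along yaw = pi/2 with level pitch and roll.
view = View(
    image=torch.zeros(3, 64, 64),
    position=torch.tensor([1.5, -0.75]),
    orientation=torch.tensor([math.pi / 2, 0.0, 0.0]),
)
```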
In the animation, 9 mazes are placed in a 3x3 grid. Generated scenes are shown on the left and a top-down view of the 9 mazes on the right. We first sample views from different places inside the mazes and insert them into the GEN. We then query the GEN for the inferred view at new query coordinates, rotating the camera a full 360 degrees at each position. The red nodes in the top-down map are the active nodes whose information is interpolated to generate the view at each query location.
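The loop that produces the animation frames is roughly the following. `StubGEN`, `encode`, and `render` are trivial placeholders standing in for the trained GEN, its view encoder, and the image generator, so only the insert-then-query control flow mirrors what the real code does.

```python
import math

import torch

class StubGEN:
    """Placeholder for the trained GEN: stores encoded views and returns their mean."""
    def __init__(self):
        self.memory = []                              # (position, code) pairs

    def insert(self, code, position):
        self.memory.append((position, code))

    def query(self, position):
        return torch.stack([code for _, code in self.memory]).mean(dim=0)

def encode(image):
    return image.flatten()[:16]                       # placeholder view encoder

def render(state, position, yaw):
    return torch.zeros(3, 64, 64)                     # placeholder image generator

gen = StubGEN()
context = [(torch.rand(3, 64, 64), torch.tensor([0.3, 0.7]))]   # sampled (image, position) views

# 1. Insert the sampled views into the GEN.
for image, position in context:
    gen.insert(encode(image), position)

# 2. Query new coordinates, rotating the camera a full 360 degrees at each one.
frames = []
for position in [torch.tensor([0.5, 0.5])]:
    for k in range(36):
        yaw = 2 * math.pi * k / 36
        frames.append(render(gen.query(position), position, yaw))
```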
In this problem, the GEN:
- has its nodes spread across the 2D ground plane of the mazes (see the white circles in the right image)
- learns a useful representation of what the mazes look like; we interpolate information from its nodes to generate new images (see the sketch after this list)
- compartmentalizes spatial memories: it is trained on one maze at a time, yet at test time it absorbs information from 9 mazes simultaneously
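To make the first two points concrete, here is a small sketch of nodes laid out on a regular grid over the ground plane, plus an interpolation of node states at a query coordinate. The grid size, state dimension, and inverse-distance weighting over the nearest nodes are illustrative choices only; in the actual model the node states are learned and the interpolation scheme differs in detail.

```python
import torch

side, state_dim = 5, 256
xs = torch.linspace(0.0, 1.0, side)
node_pos = torch.cartesian_prod(xs, xs)            # (25, 2): node coordinates on the ground plane
node_state = torch.zeros(side * side, state_dim)   # one hidden state per node (learned in practice)

def interpolate(query_xy, k=3, eps=1e-6):
    """Blend the states of the k nodes nearest to the query location."""
    dist = torch.cdist(query_xy[None, None, :], node_pos[None]).squeeze()  # (25,) distances
    nearest_dist, idx = dist.topk(k, largest=False)                        # the "active" nodes
    w = 1.0 / (nearest_dist + eps)
    w = w / w.sum()
    return (w[:, None] * node_state[idx]).sum(dim=0)                       # (state_dim,)

query_state = interpolate(torch.tensor([0.42, 0.13]))
```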
How do we decode node states to draw scene images? This work improves on DeepMind's GQN (Eslami et al.), which pairs a representation-learning network with an image-generation network resembling the standard DRAW architecture. Their model can only represent one maze at a time because it absorbs information without spatially disentangling it. We use our GEN for representation learning and apply their drawing architecture to decode our hidden states.
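As a rough sketch of the decoding interface (interpolated node state plus query pose in, image out), the toy network below stands in for the DRAW-style generator that is actually used; it only illustrates the input/output contract, not the real recurrent architecture.

```python
import torch
import torch.nn as nn

class ToyRenderer(nn.Module):
    """Placeholder decoder: maps (node state, query pose) to a 64x64 RGB image."""
    def __init__(self, state_dim=256, pose_dim=5):   # pose = (x, y, yaw, pitch, roll)
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + pose_dim, 512 * 4 * 4),
            nn.Unflatten(1, (512, 4, 4)),
            nn.ConvTranspose2d(512, 128, 4, stride=2, padding=1), nn.ReLU(),  # 8x8
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 16x16
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 32x32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 64x64 RGB
        )

    def forward(self, state, pose):
        return self.net(torch.cat([state, pose], dim=-1))

renderer = ToyRenderer()
state = torch.randn(1, 256)    # interpolated node state at the query point
pose = torch.randn(1, 5)       # query position and camera angles
image = renderer(state, pose)  # (1, 3, 64, 64)
```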
- You need to download the dataset (mazes, rooms, etc.) and save the train and test data separately.
- You need to use convert.py in utils (providing the name of your dataset) to convert the data from the DeepMind format to .pt.gz files, and then extract all files to .pt format (a small extraction sketch follows below).
If you'd like to skip these two steps and try our code before committing to downloading these huge datasets, we provide a few already-processed sample images in the data_samples folder.
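If you do download a full dataset, the second half of that conversion step (unpacking .pt.gz into .pt) only needs the standard library; a minimal sketch, assuming the converted files sit in the default train directory, is:

```python
import gzip
import shutil
from pathlib import Path

data_dir = Path("/home/jaks19/mazes-torch/train")   # adjust to wherever convert.py wrote its output
for gz_path in data_dir.glob("*.pt.gz"):
    with gzip.open(gz_path, "rb") as fin, open(gz_path.with_suffix(""), "wb") as fout:
        shutil.copyfileobj(fin, fout)               # e.g. writes 0001.pt next to 0001.pt.gz
```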
- Then you can run our code with `python train_scene_rendering.py`, using arguments matching our argparse header:
```python
parser.add_argument('--dataset', type=str, default='Labyrinth', help='dataset (default: Labyrinth)')
parser.add_argument('--train_data_dir', type=str, help='location of training data', \
    default="/home/jaks19/mazes-torch/train")
parser.add_argument('--test_data_dir', type=str, help='location of test data', \
    default="/home/jaks19/mazes-torch/test")
parser.add_argument('--root_log_dir', type=str, help='root location of log', default='/home/jaks19/logs/')
parser.add_argument('--log_dir', type=str, help='log directory (default: GQN)', default='GQN')
parser.add_argument('--workers', type=int, help='number of data loading workers', default=32)
parser.add_argument('--device_ids', type=int, nargs='+', help='list of CUDA devices (default: [0,1,2,3,4,5,6,7])', default=[0,1,2,3,4,5,6,7])
parser.add_argument('--layers', type=int, help='number of generative layers (default: 8)', default=8)
parser.add_argument('--saved_model', type=str, help='path to model', default=None)
```
It took about a full week of non-stop training on 4 GPUs to generate the scenes shown in the animation, but we do better than the DeepMind GQN when many mazes are placed adjacent to each other: their model performs well on one maze at a time, yet fails with many mazes because it squashes all of their information into a single representation. We are confident that the quality of our images can get much better with more compute resources.