This repo contains our solution that won second place in the Ego4D Natural Language Queries (NLQ) challenge at ECCV 2022. Our arXiv version can be found at this link. We invite our audience to try out the code.
This code repo implements an ActionFormer variant for single-stage temporal sentence grounding on the Ego4D NLQ challenge. Our model differs from ActionFormer in the following aspects:
- An additional transformer-based text encoder.
- Transformer-based classification and regression heads.
- Attention-based fusion of video and text features.
- Frame-level contrastive loss.
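As a rough illustration of the attention-based fusion listed above, here is a minimal pure-Python sketch of single-head cross-attention in which video frames act as queries and text tokens act as keys and values. It is not the code in ./libs/model: the function names are hypothetical, and learned projections, multiple heads, and normalization layers are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(video_feats, text_feats):
    """Fuse text information into video features (illustrative only).

    video_feats: list of T frame vectors (queries), each of length D.
    text_feats:  list of N token vectors (keys and values), each of length D.
    Returns T vectors: each frame plus its attention-weighted sum of tokens.
    """
    dim = len(video_feats[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for frame in video_feats:
        # Scaled dot-product score between this frame and every text token.
        scores = [scale * sum(q * k for q, k in zip(frame, token))
                  for token in text_feats]
        weights = softmax(scores)
        # Text context for this frame: weighted average of the token vectors.
        context = [sum(w * token[d] for w, token in zip(weights, text_feats))
                   for d in range(dim)]
        # Residual connection keeps the original video feature.
        fused.append([f + c for f, c in zip(frame, context)])
    return fused
```

The output has one fused vector per video frame, so a single-stage detector can still predict a moment at every time step.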
The structure of this code repo is heavily inspired by ActionFormer. Some of the main components are:
- ./libs/core: Parameter configuration module.
- ./libs/datasets: Data loader and IO module.
- ./libs/model: Our main model with all its building blocks.
- ./libs/utils: Utility functions for training, inference, and postprocessing.
- Follow INSTALL.md for installing necessary dependencies and compiling the code.
- Download ego_vlp_reshape.zip from this Google Drive link. This file contains EgoVLP features in pt format.
- Download the official SlowFast and Omnivore features from the official Ego4D repo.
Details: These are EgoVLP features extracted using the official EgoVLP code. The features are extracted using clips of 16 frames and a stride of 16 frames. We reshaped these features to align with the official Ego4D SlowFast features.
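To make the clip length and stride concrete, the sketch below lists which frames each feature vector would cover under a 16-frame clip and 16-frame stride. The helper names are hypothetical, the frame rate is a caller-supplied assumption, and dropping trailing frames that do not fill a whole clip is also an assumption rather than a documented behavior of the EgoVLP extraction code.

```python
def clip_windows(num_frames, clip_len=16, stride=16):
    """Return (start, end) frame indices, end exclusive, for each clip."""
    if num_frames < clip_len:
        return []
    # Assumption: a trailing partial clip is dropped.
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, stride)]


def feature_timestamps(num_frames, fps, clip_len=16, stride=16):
    """Time in seconds of the center of each clip, given the video fps."""
    return [(start + clip_len / 2) / fps
            for start, _ in clip_windows(num_frames, clip_len, stride)]


if __name__ == "__main__":
    # A 40-frame video yields two non-overlapping clips; 8 frames are left over.
    print(clip_windows(40))
    print(feature_timestamps(40, fps=30.0))
```

With stride equal to clip length the clips do not overlap, so a video with N frames gives roughly N / 16 feature vectors.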
- Follow data/DATA_README.md to prepare the video features.
- Unpack the file under ./data/ego4d (or elsewhere and link to ./data).
- The folder structure should look like
This folder
│ README.md
│ ...
│
└───data/
│ └───ego4d/
│ │ └───annotations
│ │ └───video_features
│ │ └───ego_vlp_reshape
│ │ └───official_slowfast
│ │ └───official_omnivore
│ │ └───fusion
│ └───...
│
└───libs
│
│ ...
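Before training, it can help to confirm that the folders named in the tree above are in place. The following is a small sketch of such a check; `find_missing_dirs` is a hypothetical helper, not part of this repo. Because the exact nesting of the feature folders is not assumed here, it searches for each folder name anywhere under the given root.

```python
from pathlib import Path

# Folder names taken from the tree above. Where each one sits relative to
# the others is not assumed; see data/DATA_README.md for the exact layout.
EXPECTED_DIRS = [
    "annotations",
    "video_features",
    "ego_vlp_reshape",
    "official_slowfast",
    "official_omnivore",
    "fusion",
]


def find_missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected folder names not found anywhere under root."""
    root = Path(root)
    present = {p.name for p in root.rglob("*") if p.is_dir()}
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    missing = find_missing_dirs("./data/ego4d")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders found.")
```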
Training and Evaluation
- Train our model on the Ego4D dataset. This will create an experiment folder under ./log that stores training config, logs, and checkpoints.
python ./train.py --config configs/ego4d.yaml -n ego4d -g 0
- [Optional] Monitor the training using TensorBoard
tensorboard --logdir=./log/ego4d/
- Evaluate the trained model. The [email protected] metric for Ego4D should be around 15.5%.
python ./eval.py -n ego4d -c last -ema
- Generate the submission file for the Ego4D NLQ challenge.
python ./submit.py -n ego4d -c last -ema
Reproduce Our Results
- Our checkpoint can be downloaded from here. Download the checkpoint file, move it to the ./log folder, and use the following command to reproduce our results. (If the data format is correct, the results should be Rank@1, [email protected] = 17.58 and Rank@1, [email protected] = 9.76.)
python ./eval.py -n ego4d -c 08 -ema
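For readers unfamiliar with the Rank@1 numbers above, here is a minimal sketch of how such a metric is commonly computed: a query counts as a hit when its top-ranked predicted window overlaps the ground-truth window with a temporal IoU at or above a threshold. This assumes the standard definition and is not the evaluation code in eval.py; the function names and the thresholds 0.3 and 0.5 in the example are illustrative.

```python
def temporal_iou(pred, gt):
    """IoU of two (start, end) time windows, both in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def rank1_recall(top1_preds, gts, threshold):
    """Percentage of queries whose top-1 window reaches the tIoU threshold."""
    hits = sum(temporal_iou(p, g) >= threshold
               for p, g in zip(top1_preds, gts))
    return 100.0 * hits / len(gts)


if __name__ == "__main__":
    # Toy data: one top-1 prediction and one ground-truth window per query.
    preds = [(0.0, 10.0), (0.0, 10.0), (20.0, 30.0)]
    gts = [(5.0, 15.0), (0.0, 10.0), (0.0, 5.0)]
    for thr in (0.3, 0.5):
        print(f"Rank@1 at tIoU {thr}: {rank1_recall(preds, gts, thr):.2f}")
```

A higher threshold demands tighter localization, which is why the score at the stricter threshold is lower.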
Sicheng Mo ([email protected])
If you are using our code, please consider citing our paper.
@misc{mo2022simple,
title={A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge},
author={Sicheng Mo and Fangzhou Mu and Yin Li},
year={2022},
eprint={2211.08704},
archivePrefix={arXiv},
primaryClass={cs.CV}
}