The task is split into three stages:
- Analyse: Data Visualization and Explanation
- Make Splits: Preprocessing data for training
- Training: Model training and validation
The first two stages are provided in the notebooks folder.
Code for the last stage is stored in the indoor folder.
- Recreate the train/val split using the notebooks
- Install the package in editable mode (egg)
pip3 install -e .
- Install dependencies
pip3 install -r requirements.txt
Training: python3 indoor/training/train.py --kwargs...
Visualization: python3 indoor/project_utils/visualize.py --kwargs...
The trained model is available on Google Drive
Implemented
Model Training
Prediction visualization
mAP and mAR metrics
per class mAP metrics
Not Implemented but could be useful
Share info between nearby frames
Sequence business metric*
Modulated deformable convs**
Per-class box_score thresholds
Ablation study for parameters, architectures, etc.
*Metric that shows model quality on each sequence of frames
**Should work fine for this task
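As an illustration of the per-class box_score threshold idea listed above, here is a minimal sketch. It is not part of this repository: the `Detection` layout, the function name `filter_by_class_threshold` and the threshold values are all hypothetical, and real thresholds would have to be tuned on the validation split.

```python
from typing import Dict, List, TypedDict


class Detection(TypedDict):
    label: str        # class name, e.g. "chair"
    score: float      # detector confidence in [0, 1]
    box: List[float]  # [x1, y1, x2, y2]


def filter_by_class_threshold(
    detections: List[Detection],
    thresholds: Dict[str, float],
    default: float = 0.5,
) -> List[Detection]:
    """Keep detections whose score reaches the threshold of their class.

    Classes missing from `thresholds` fall back to `default`.
    """
    return [
        d for d in detections
        if d["score"] >= thresholds.get(d["label"], default)
    ]


# Illustrative values only (assumption): stricter for a weak class, looser for a strong one.
thresholds = {"trashbin": 0.7, "chair": 0.4}
detections: List[Detection] = [
    {"label": "chair", "score": 0.45, "box": [0.0, 0.0, 10.0, 10.0]},
    {"label": "trashbin", "score": 0.45, "box": [5.0, 5.0, 20.0, 20.0]},
    {"label": "clock", "score": 0.55, "box": [1.0, 1.0, 4.0, 4.0]},
]
kept = filter_by_class_threshold(detections, thresholds)
print([d["label"] for d in kept])  # ['chair', 'clock']
```

A single global threshold would either keep the low-score trashbin or drop the chair; separate thresholds let each class trade precision for recall independently.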
- DL lib: PyTorch x Torchvision
- MLOps & Tracking: ClearML
- Detector: Faster R-CNN based two-stage detector
- Backbone: ResNeSt-50 (ResNet-50 with Split-Attention)
- Batch size: 4
- Epochs: 50
- Base LR: 0.0005
- Pretrain: None
- Optimizer: Ranger (Lookahead + RAdam)
- LR scheduler: ReduceOnPlateau with reloading of the best checkpoint*
*That's my favourite, always SOTA
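The "reduce on plateau with reloading of the best checkpoint" schedule can be sketched in plain Python as follows. This is an assumption about how such a schedule might work, not the code from indoor/training: the class `PlateauWithReload`, its `factor` and `patience` values, and the validation numbers are hypothetical. In PyTorch the LR part corresponds to `torch.optim.lr_scheduler.ReduceLROnPlateau`, which does not reload weights by itself; the reload would be a separate `load_state_dict` call.

```python
import copy
from typing import Any, Dict, List, Optional, Tuple


class PlateauWithReload:
    """Reduce the LR when the metric stalls and hand back the best checkpoint."""

    def __init__(self, lr: float, factor: float = 0.5, patience: int = 2) -> None:
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best_metric = float("-inf")
        self.best_state: Optional[Dict[str, Any]] = None
        self.bad_epochs = 0

    def step(self, metric: float, state: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """Return the best state when the LR is reduced, otherwise None."""
        if metric > self.best_metric:
            # New best: remember a copy of the weights and reset the counter.
            self.best_metric = metric
            self.best_state = copy.deepcopy(state)
            self.bad_epochs = 0
            return None
        self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            # Plateau: shrink the LR and restart from the best checkpoint.
            self.lr *= self.factor
            self.bad_epochs = 0
            return copy.deepcopy(self.best_state)
        return None


sched = PlateauWithReload(lr=0.0005, factor=0.5, patience=1)
history = [0.60, 0.70, 0.68, 0.69, 0.72]  # made-up validation mAP values
reloads: List[Tuple[int, int]] = []
for epoch, val_map in enumerate(history):
    state = {"epoch": epoch}  # stands in for model.state_dict()
    restored = sched.step(val_map, state)
    if restored is not None:
        reloads.append((epoch, restored["epoch"]))
print(sched.lr, reloads)
```

In this toy run the metric fails to improve for two epochs after epoch 1, so at epoch 3 the LR is halved and the epoch-1 state is returned for reloading.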
| Class | mAP |
| --- | --- |
| fireextinguisher | 0.75 |
| chair | 0.79 |
| exit | 0.79 |
| clock | 0.76 |
| trashbin | 0.61 |
| screen | 0.69 |
| printer | 0.79 |
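For a quick summary of the table, the per-class values can be averaged without weighting. The snippet below only does arithmetic on the numbers listed above; the resulting mean is not a figure reported by the training code, and it assumes every class counts equally.

```python
# Per-class values copied from the table above.
per_class_map = {
    "fireextinguisher": 0.75,
    "chair": 0.79,
    "exit": 0.79,
    "clock": 0.76,
    "trashbin": 0.61,
    "screen": 0.69,
    "printer": 0.79,
}

# Unweighted (macro) mean over classes and the weakest class.
mean_map = sum(per_class_map.values()) / len(per_class_map)
worst = min(per_class_map, key=per_class_map.get)
print(f"mean over classes: {mean_map:.2f}, weakest class: {worst}")
# mean over classes: 0.74, weakest class: trashbin
```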
FP for different classes
At evaluation time, predictions from nearby frames should be aggregated because of cases like this
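One simple way to aggregate predictions from nearby frames is a sliding-window vote over class labels. The sketch below is an assumption for illustration, not something implemented in this work: it ignores box positions entirely, and `smooth_labels`, `radius` and `min_votes` are hypothetical names and parameters.

```python
from typing import Dict, List, Set


def smooth_labels(
    frames: List[Set[str]], radius: int = 1, min_votes: int = 2
) -> List[Set[str]]:
    """For each frame i, return every class detected in at least `min_votes`
    frames of the window [i - radius, i + radius] (clipped at sequence ends)."""
    smoothed: List[Set[str]] = []
    for i in range(len(frames)):
        window = frames[max(0, i - radius): i + radius + 1]
        votes: Dict[str, int] = {}
        for labels in window:
            for label in labels:
                votes[label] = votes.get(label, 0) + 1
        smoothed.append({label for label, n in votes.items() if n >= min_votes})
    return smoothed


# Made-up per-frame detections: "clock" and "exit" each appear in one frame only.
frames = [{"chair"}, {"chair", "clock"}, {"chair"}, {"exit"}, {"chair"}]
smoothed = smooth_labels(frames)
print(smoothed)
```

In this toy sequence the one-frame "clock" and "exit" detections are dropped as likely false positives, and "chair" is filled in for the frame where it was missed. The last frame ends up empty because its clipped window has only two frames that disagree, which shows that sequence edges need extra care.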
- The resulting model works reasonably well, but there are many ways to improve the final quality
- The items listed under "Not Implemented but could be useful" are crucial but are not included in this work