Skip to content

Commit

Permalink
Code of CVPR 2019 Paper: Region Proposal by Guided Anchoring (#594)
Browse files Browse the repository at this point in the history
* add two stage w/o neck and w/ upperneck

* add rpn r50 c4

* update c4 configs

* fix

* config update

* update config

* minor update

* mask rcnn support c4 train and test

* lr fix

* cascade support upper_neck

* add cascade c4 config

* update config

* update

* update res_layer to new interface

* refactoring

* c4 configs update

* refactoring

* update rpn_c4 config

* rename upper_neck as shared_head

* update

* update configs

* update

* update c4 configs

* update according to commits

* update

* add ga rpn

* test bug fix

* test bug fix with loc_filter_thr is large

* update configs

* update configs

* add ga_retinanet

* ga test bug fix

* update configs

* update

* init masked conv

* update

* update masked conv

* update

* support no ga_sampler

* update

* update

* test with masked_conv

* update comment

* fix flake errors

* fix flake 8 errors

* refactor bounded iou loss

* refactor ga_retina_head

* update configs

* refactor masked conv

* fix flake8 error

* refactor guided_anchor_head and ga_rpn_head

* update configs

* use_sigmoid_cls -> cls_sigmoid_loss; use_focal_loss -> cls_focal_loss

* refactoring

* cls_sigmoid_loss -> use_sigmoid_cls

* fix flake8 error

* add some docs

* rename normalize to norm_cfg

* update configs

* add readme

* update ga_faster config

* update readme

* update readme

* rename configs as r50_caffe

* merge master

* refactor guided anchor target

* update readme

* update approx mas iou assigner

* refactor guided anchor target

* update docstring

* refactor ga heads

* fix flake8 error

* update readme

* update model url

* update comments

* refactor get anchors

* update docstring

* not use_loc_filter during training

* add R-101 results

* update to support build loss api

* fix flake8 error

* update readme with x-101 performances

* update readme

* add a link in project readme

* refactor code about ga shape inside flags

* update

* update

* add x101 config files

* add ga_rpn r101 config

* update some comments

* add comments

* add comments

* update comments

* fix flake8 error
  • Loading branch information
myownskyW7 authored and hellock committed May 23, 2019
1 parent 3cb84ac commit 89022a2
Show file tree
Hide file tree
Showing 32 changed files with 2,983 additions and 21 deletions.
4 changes: 4 additions & 0 deletions MODEL_ZOO.md
Original file line number Diff line number Diff line change
Expand Up @@ -214,6 +214,10 @@ Please refer to [Weight Standardization](configs/gn+ws/README.md) for details.

Please refer to [Deformable Convolutional Networks](configs/dcn/README.md) for details.

### Guided Anchoring

Please refer to [Guided Anchoring](configs/guided_anchoring/README.md) for details.


## Comparison with Detectron and maskrcnn-benchmark

Expand Down
7 changes: 7 additions & 0 deletions compile.sh
Original file line number Diff line number Diff line change
Expand Up @@ -36,3 +36,10 @@ if [ -d "build" ]; then
rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building masked conv op..."
cd ../masked_conv
if [ -d "build" ]; then
rm -r build
fi
$PYTHON setup.py build_ext --inplace
42 changes: 42 additions & 0 deletions configs/guided_anchoring/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# Region Proposal by Guided Anchoring

## Introduction

We provide config files to reproduce the results in the CVPR 2019 paper for [Region Proposal by Guided Anchoring](https://arxiv.org/abs/1901.03278).

```
@inproceedings{wang2019region,
title={Region Proposal by Guided Anchoring},
author={Jiaqi Wang and Kai Chen and Shuo Yang and Chen Change Loy and Dahua Lin},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
year={2019}
}
```

## Results and Models

The results on COCO 2017 val is shown in the below table. (results on test-dev are usually slightly higher than val).

| Method | Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | AR 1000 | Download |
| :----: | :-------------: | :-----: | :-----: | :------: | :-----------------: | :------------: | :-----: | :-------------------------------------------------------------------------------------------------------------------------------------------: |
| GA-RPN | R-50-FPN | caffe | 1x | 5.0 | 0.55 | 13.3 | 68.5 | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/guided_anchoring/ga_rpn_r50_caffe_fpn_1x_20190513-95e91886.pth) |
| GA-RPN | R-101-FPN | caffe | 1x | - | - | - | 69.6 | - |
| GA-RPN | X-101-32x4d-FPN | pytorch | 1x | - | - | - | 70.0 | - |
| GA-RPN | X-101-64x4d-FPN | pytorch | 1x | - | - | - | 70.5 | - |


| Method | Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
| :------------: | :-------------: | :-----: | :-----: | :------: | :-----------------: | :------------: | :----: | :-------------------------------------------------------------------------------------------------------------------------------------------------: |
| GA-Fast RCNN | R-50-FPN | caffe | 1x | 3.3 | 0.23 | 14.9 | 39.5 | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/guided_anchoring/ga_fast_r50_caffe_fpn_1x_20190513-c5af9f8b.pth) |
| GA-Faster RCNN | R-50-FPN | caffe | 1x | 5.1 | 0.64 | 9.6 | 39.9 | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/guided_anchoring/ga_faster_r50_caffe_fpn_1x_20190513-a52b31fa.pth) |
| GA-Faster RCNN | R-101-FPN | caffe | 1x | - | - | - | 41.5 | - |
| GA-Faster RCNN | X-101-32x4d-FPN | pytorch | 1x | - | - | - | 42.9 | - |
| GA-Faster RCNN | X-101-64x4d-FPN | pytorch | 1x | - | - | - | 43.9 | - |
| GA-RetinaNet | R-50-FPN | caffe | 1x | 3.2 | 0.50 | 10.7 | 37.0 | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/guided_anchoring/ga_retinanet_r50_caffe_fpn_1x_20190513-29905101.pth) |
| GA-RetinaNet | R-101-FPN | caffe | 1x | - | - | - | 38.9 | - |
| GA-RetinaNet | X-101-32x4d-FPN | pytorch | 1x | - | - | - | 40.3 | - |
| GA-RetinaNet | X-101-64x4d-FPN | pytorch | 1x | - | - | - | 40.8 | - |



- In the Guided Anchoring paper, `score_thr` is set to 0.001 in Fast/Faster RCNN and 0.05 in RetinaNet for both baselines and Guided Anchoring.
130 changes: 130 additions & 0 deletions configs/guided_anchoring/ga_fast_r50_caffe_fpn_1x.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
# model settings
model = dict(
type='FastRCNN',
pretrained='open-mmlab://resnet50_caffe',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=False),
norm_eval=True,
style='caffe'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='SharedFCBBoxHead',
num_fcs=2,
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=81,
target_means=[0., 0., 0., 0.],
target_stds=[0.05, 0.05, 0.1, 0.1],
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)))
# model training and testing settings
train_cfg = dict(
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
pos_weight=-1,
debug=False))
test_cfg = dict(
rcnn=dict(
score_thr=1e-3, nms=dict(type='nms', iou_thr=0.5), max_per_img=100))
# dataset settings
dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_train2017.json',
img_prefix=data_root + 'train2017/',
img_scale=(1333, 800),
img_norm_cfg=img_norm_cfg,
size_divisor=32,
num_max_proposals=300,
proposal_file=data_root + 'proposals/ga_rpn_r50_fpn_1x_train2017.pkl',
flip_ratio=0.5,
with_mask=False,
with_crowd=True,
with_label=True),
val=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
img_scale=(1333, 800),
img_norm_cfg=img_norm_cfg,
size_divisor=32,
num_max_proposals=300,
proposal_file=data_root + 'proposals/ga_rpn_r50_fpn_1x_val2017.pkl',
flip_ratio=0,
with_mask=False,
with_crowd=True,
with_label=True),
test=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
img_scale=(1333, 800),
img_norm_cfg=img_norm_cfg,
size_divisor=32,
num_max_proposals=300,
proposal_file=data_root + 'proposals/ga_rpn_r50_fpn_1x_val2017.pkl',
flip_ratio=0,
with_mask=False,
with_label=False,
test_mode=True))
# optimizer
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=1.0 / 3,
step=[8, 11])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/ga_fast_rcnn_r50_caffe_fpn_1x'
load_from = None
resume_from = None
workflow = [('train', 1)]
193 changes: 193 additions & 0 deletions configs/guided_anchoring/ga_faster_r50_caffe_fpn_1x.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,193 @@
# model settings
model = dict(
type='FasterRCNN',
pretrained='open-mmlab://resnet50_caffe',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=False),
norm_eval=True,
style='caffe'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='GARPNHead',
in_channels=256,
feat_channels=256,
octave_base_scale=8,
scales_per_octave=3,
octave_ratios=[0.5, 1.0, 2.0],
anchor_strides=[4, 8, 16, 32, 64],
anchor_base_sizes=None,
anchoring_means=[.0, .0, .0, .0],
anchoring_stds=[0.07, 0.07, 0.14, 0.14],
target_means=(.0, .0, .0, .0),
target_stds=[0.07, 0.07, 0.11, 0.11],
loc_filter_thr=0.01,
loss_loc=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_shape=dict(
type='IoULoss', style='bounded', beta=0.2, loss_weight=1.0),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='SharedFCBBoxHead',
num_fcs=2,
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=81,
target_means=[0., 0., 0., 0.],
target_stds=[0.05, 0.05, 0.1, 0.1],
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)))
# model training and testing settings
train_cfg = dict(
rpn=dict(
ga_assigner=dict(
type='ApproxMaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
ga_sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=-1,
pos_weight=-1,
center_ratio=0.2,
ignore_ratio=0.5,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=2000,
max_num=300,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
pos_weight=-1,
debug=False))
test_cfg = dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_num=300,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
score_thr=1e-3, nms=dict(type='nms', iou_thr=0.5), max_per_img=100))
# dataset settings
dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_train2017.json',
img_prefix=data_root + 'train2017/',
img_scale=(1333, 800),
img_norm_cfg=img_norm_cfg,
size_divisor=32,
flip_ratio=0.5,
with_mask=False,
with_crowd=True,
with_label=True),
val=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
img_scale=(1333, 800),
img_norm_cfg=img_norm_cfg,
size_divisor=32,
flip_ratio=0,
with_mask=False,
with_crowd=True,
with_label=True),
test=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
img_scale=(1333, 800),
img_norm_cfg=img_norm_cfg,
size_divisor=32,
flip_ratio=0,
with_mask=False,
with_label=False,
test_mode=True))
# optimizer
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=1.0 / 3,
step=[8, 11])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/ga_faster_rcnn_r50_caffe_fpn_1x'
load_from = None
resume_from = None
workflow = [('train', 1)]
Loading

0 comments on commit 89022a2

Please sign in to comment.