Commit: Update

lartpang committed Jun 26, 2024
1 parent 727a24d commit 3a89cf1
Showing 37 changed files with 5,473 additions and 23 deletions.
179 changes: 179 additions & 0 deletions .gitignore
### JupyterNotebooks template
# gitignore template for Jupyter Notebooks
# website: http://jupyter.org/

.ipynb_checkpoints
*/.ipynb_checkpoints/*

# IPython
profile_default/
ipython_config.py

# Remove previous ipynb_checkpoints
# git rm -r .ipynb_checkpoints/

### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

.idea/
.vscode/
weights/
outputs/
pretrained/
dataset.yaml

# Custom Scripts
*.ps1
*.sh
178 changes: 155 additions & 23 deletions README.md


```bibtex
@article{ZoomNeXt,
  title   = {ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection},
  author  = {Youwei Pang and Xiaoqi Zhao and Tian-Zhu Xiang and Lihe Zhang and Huchuan Lu},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2024},
  doi     = {10.1109/TPAMI.2024.3417329},
}
```


### Weights

| Backbone | CAMO-TE | | | CHAMELEON | | | COD10K-TE | | | NC4K | | | Links |
| --------------- | ------- | -------------------- | ----- | --------- | -------------------- | ----- | --------- | -------------------- | ----- | ----- | -------------------- | ----- | --------------------------------------------------------------------------------------------------- |
| | $S_m$ | $F^{\omega}_{\beta}$ | MAE | $S_m$ | $F^{\omega}_{\beta}$ | MAE | $S_m$ | $F^{\omega}_{\beta}$ | MAE | $S_m$ | $F^{\omega}_{\beta}$ | MAE | |
| ResNet-50 | 0.833 | 0.774 | 0.065 | 0.908 | 0.858 | 0.021 | 0.861 | 0.768 | 0.026 | 0.874 | 0.816 | 0.037 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/resnet50-zoomnext.pth) |
| EfficientNet-B1 | 0.848 | 0.803 | 0.056 | 0.916 | 0.870 | 0.020 | 0.863 | 0.773 | 0.024 | 0.876 | 0.823 | 0.036 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/eff-b1-zoomnext.pth) |
| EfficientNet-B4 | 0.867 | 0.824 | 0.046 | 0.911 | 0.865 | 0.020 | 0.875 | 0.797 | 0.021 | 0.884 | 0.837 | 0.032 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/eff-b4-zoomnext.pth) |
| PVTv2-B2 | 0.874 | 0.839 | 0.047 | 0.922 | 0.884 | 0.017 | 0.887 | 0.818 | 0.019 | 0.892 | 0.852 | 0.030 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/pvtv2-b2-zoomnext.pth) |
| PVTv2-B3 | 0.885 | 0.854 | 0.042 | 0.927 | 0.898 | 0.017 | 0.895 | 0.829 | 0.018 | 0.900 | 0.861 | 0.028 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/pvtv2-b3-zoomnext.pth) |
| PVTv2-B4 | 0.888 | 0.859 | 0.040 | 0.925 | 0.897 | 0.016 | 0.898 | 0.838 | 0.017 | 0.900 | 0.865 | 0.028 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/pvtv2-b4-zoomnext.pth) |
| PVTv2-B5 | 0.889 | 0.857 | 0.041 | 0.924 | 0.885 | 0.018 | 0.898 | 0.827 | 0.018 | 0.903 | 0.863 | 0.028 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/pvtv2-b5-zoomnext.pth) |

| Backbone | CAD | | | | | MoCA-Mask-TE | | | | | Links |
| -------------- | ----- | -------------------- | ----- | ----- | ----- | ------------ | -------------------- | ----- | ----- | ----- | ---------------------------------------------------------------------------------------------------------- |
| | $S_m$ | $F^{\omega}_{\beta}$ | MAE | mDice | mIoU | $S_m$ | $F^{\omega}_{\beta}$ | MAE | mDice | mIoU | |
| PVTv2-B5 (T=5) | 0.757 | 0.593 | 0.020 | 0.599 | 0.510 | 0.734 | 0.476 | 0.010 | 0.497 | 0.422 | [Weight](https://github.com/lartpang/ZoomNeXt/releases/download/weights-v0.2/pvtv2-b5-5frame-zoomnext.pth) |
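The tables above report the structure measure $S_m$, weighted F-measure $F^{\omega}_{\beta}$, and mean absolute error (MAE). As a quick reference for the simplest of these, a minimal pure-Python sketch of MAE between a predicted map and a binary ground-truth mask (illustrative only; the official numbers come from the evaluation toolkit, not this snippet):

```python
def mae(pred, gt):
    """Mean absolute error between a predicted map and a ground-truth mask.

    Both inputs are 2D lists of values in [0, 1] with identical shapes.
    """
    flat_pred = [p for row in pred for p in row]
    flat_gt = [g for row in gt for g in row]
    assert len(flat_pred) == len(flat_gt), "pred and gt must have the same shape"
    return sum(abs(p - g) for p, g in zip(flat_pred, flat_gt)) / len(flat_pred)

# Toy 2x2 example: average pixel-wise absolute deviation.
pred = [[0.9, 0.1], [0.8, 0.2]]
gt = [[1.0, 0.0], [1.0, 0.0]]
print(round(mae(pred, gt), 4))  # 0.15
```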


## Prepare Data

Set all dataset information in `dataset.yaml` as shown below.

<details>
<summary>
Example of the config file (dataset.yaml):
</summary>

```yaml
# VCOD Datasets
moca_mask_tr:
{
root: "YOUR-VCOD-DATASETS-ROOT/MoCA-Mask/MoCA_Video/TrainDataset_per_sq",
image: { path: "*/Imgs", suffix: ".jpg" },
mask: { path: "*/GT", suffix: ".png" },
}
moca_mask_te:
{
root: "YOUR-VCOD-DATASETS-ROOT/MoCA-Mask/MoCA_Video/TestDataset_per_sq",
image: { path: "*/Imgs", suffix: ".jpg" },
mask: { path: "*/GT", suffix: ".png" },
}
cad:
{
root: "YOUR-VCOD-DATASETS-ROOT/CamouflagedAnimalDataset",
image: { path: "original_data/*/frames", suffix: ".png" },
mask: { path: "converted_mask/*/groundtruth", suffix: ".png" },
}

# ICOD Datasets
cod10k_tr:
{
root: "YOUR-ICOD-DATASETS-ROOT/Train/COD10K-TR",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
camo_tr:
{
root: "YOUR-ICOD-DATASETS-ROOT/Train/CAMO-TR",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
cod10k_te:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/COD10K-TE",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
camo_te:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/CAMO-TE",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
chameleon:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/CHAMELEON",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
nc4k:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/NC4K",
image: { path: "Imgs", suffix: ".jpg" },
mask: { path: "GT", suffix: ".png" },
}
```
</details>
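Each entry joins `root` with the `path` of the image and mask folders and matches files by `suffix`. A minimal stdlib sketch of how such an entry could be resolved into (image, mask) pairs for the flat ICOD layout (the function name and pairing-by-stem logic are illustrative assumptions, not the repo's actual loader; the `*` per-sequence globbing used by the VCOD entries is omitted):

```python
import os
from glob import glob

# Hypothetical dict mirroring one record from dataset.yaml.
cfg = {
    "root": "YOUR-ICOD-DATASETS-ROOT/Test/NC4K",
    "image": {"path": "Imgs", "suffix": ".jpg"},
    "mask": {"path": "GT", "suffix": ".png"},
}

def collect_pairs(cfg):
    """Pair each image with the mask that shares its file stem."""
    img_dir = os.path.join(cfg["root"], cfg["image"]["path"])
    images = sorted(glob(os.path.join(img_dir, "*" + cfg["image"]["suffix"])))
    pairs = []
    for img in images:
        stem = os.path.splitext(os.path.basename(img))[0]
        mask = os.path.join(cfg["root"], cfg["mask"]["path"],
                            stem + cfg["mask"]["suffix"])
        pairs.append((img, mask))
    return pairs
```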

## Install Requirements

- torch==2.1.2
- torchvision==0.16.2
- Others: `pip install -r requirements.txt`

## Evaluation

```shell
# ICOD
python main_for_image.py --config configs/icod_train.py --model-name <MODEL_NAME> --evaluate --load-from <TRAINED_WEIGHT>
# VCOD
python main_for_video.py --config configs/vcod_finetune.py --model-name <MODEL_NAME> --evaluate --load-from <TRAINED_WEIGHT>
```

> [!note]
>
> Evaluating performance on the VCOD datasets directly with the training scripts does not match the paper.
> This is because the evaluation in the paper follows the strategy of the prior work [SLT-Net](https://github.com/XuelianCheng/SLT-Net), which adjusts the range of valid frames in each sequence.
> To reproduce the results in our paper, use [PySODEvalToolkit](https://github.com/lartpang/PySODEvalToolkit) with commands similar to:

```shell
python ./eval.py `
--dataset-json vcod-datasets.json `
--method-json vcod-methods.json `
--include-datasets CAD `
--include-methods videoPvtV2B5_ZoomNeXt `
--data-type video `
--valid-frame-start "0" `
--valid-frame-end "0" `
--metric-names "sm" "wfm" "mae" "fmeasure" "em" "dice" "iou"

python ./eval.py `
--dataset-json vcod-datasets.json `
--method-json vcod-methods.json `
--include-datasets MOCA-MASK-TE `
--include-methods videoPvtV2B5_ZoomNeXt `
--data-type video `
--valid-frame-start "0" `
--valid-frame-end "-2" `
--metric-names "sm" "wfm" "mae" "fmeasure" "em" "dice" "iou"
```
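The `--valid-frame-start`/`--valid-frame-end` flags restrict which frames of each sequence are scored (e.g. `0`/`-2` for MoCA-Mask-TE above). A minimal sketch of one plausible trimming convention, assuming a non-positive `end` counts back from the last frame (the exact SLT-Net convention should be checked against the toolkit itself):

```python
def trim_valid_frames(frames, start=0, end=0):
    """Return the assumed valid sub-sequence of a frame list.

    Assumption: `end == 0` keeps the tail; a negative `end` drops that many
    trailing frames, mirroring --valid-frame-start/--valid-frame-end.
    """
    stop = len(frames) + end if end < 0 else len(frames)
    return frames[start:stop]

# --valid-frame-start 0 --valid-frame-end -2 on a 10-frame sequence
# drops the last two frames.
print(trim_valid_frames(list(range(10)), 0, -2))  # [0, 1, 2, 3, 4, 5, 6, 7]
```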

## Training

### Image Camouflaged Object Detection

```shell
python main_for_image.py --config configs/icod_train.py --pretrained --model-name EffB1_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name EffB4_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B2_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B3_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B4_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B5_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name RN50_ZoomNeXt
```

### Video Camouflaged Object Detection

1. Pretrain on COD10K-TR: `python main_for_image.py --config configs/icod_pretrain.py --info pretrain --model-name PvtV2B5_ZoomNeXt --pretrained`
2. Finetune on MoCA-Mask-TR: `python main_for_video.py --config configs/vcod_finetune.py --info finetune --model-name videoPvtV2B5_ZoomNeXt --load-from <PRETRAINED_WEIGHT>`