Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
0 parents
commit 471df37
Showing 36 changed files with 3,771 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
ZIM | ||
Copyright (c) 2024-present NAVER Cloud Corp. | ||
|
||
Creative Commons Attribution-NonCommercial 4.0 International | ||
|
||
A summary of the CC BY-NC 4.0 license is located here: | ||
https://creativecommons.org/licenses/by-nc/4.0/ | ||
|
||
This project contains subcomponents with separate copyright notices and license terms. | ||
Your use of the source code for these subcomponents is subject to the terms and conditions of the following licenses. | ||
|
||
===== | ||
|
||
facebookresearch/segment-anything | ||
https://github.com/facebookresearch/segment-anything | ||
|
||
|
||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
|
||
===== |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# ZIM | ||
|
||
**ZIM: Zero-Shot Image Matting for Anything** <br /> | ||
[Beomyoung Kim](https://beomyoung-kim.github.io/), Chanyong Shin, Joonhyun Jeong, Hyungsik Jung, Se-Yun Lee, Sewhan Chun, Dong-Hyun Hwang, Joonsang Yu<br> | ||
|
||
<sub>NAVER Cloud, ImageVision</sub><br /> | ||
|
||
[![Paper](https://img.shields.io/badge/Paper-arxiv)](https://arxiv.org) | ||
[![Page](https://img.shields.io/badge/Project_page-blue)](https://naver-ai.github.io/ZIM) | ||
[![Demo](https://img.shields.io/badge/Demo-yellow)](https://huggingface.co/spaces/naver-iv/ZIM_Zero-Shot-Image-Matting) | ||
[![Data](https://img.shields.io/badge/Data-gray)](https://huggingface.co/datasets/naver-iv/MicroMat-3K) | ||
|
||
|
||
|
||
## Introduction | ||
|
||
In this paper, we introduce a novel zero-shot image matting model. Recent models like SAM (Segment Anything Model) exhibit strong zero-shot capabilities, but they fall short in generating fine-grained, high-precision masks. To address this limitation, we propose two key contributions: First, we develop a label converter that transforms segmentation labels into detailed matte labels, creating the new SA1B-Matte dataset. This enables the model to generate high-quality, micro-level matte masks without costly manual annotations. Second, we design a zero-shot matting model equipped with a hierarchical pixel decoder and prompt-aware masked attention mechanism, improving both the resolution of mask outputs and the model’s ability to focus on specific regions based on user prompts. We evaluate our model using the newly introduced ZIM test set, which contains high-quality micro-level matte labels. Experimental results show that our model outperforms SAM and other existing methods in precision and zero-shot generalization. Furthermore, we demonstrate the versatility of our approach in downstream tasks, including image inpainting and 3D neural radiance fields (NeRF), where the ability to produce precise matte masks is crucial. Our contributions provide a robust foundation for advancing zero-shot image matting and its applications across a wide range of computer vision tasks. | ||
|
||
|
||
## Updates | ||
**Available Soon** | ||
|
||
|
||
## Installation | ||
|
||
Our implementation is based on [SAM](https://github.com/facebookresearch/segment-anything). | ||
|
||
Please check the [installation instructions](INSTALL.md). | ||
|
||
## License | ||
|
||
Available Soon |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
from config.config import generate_config |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
""" | ||
Copyright (c) 2024-present Naver Cloud Corp. | ||
This source code is licensed under the license found in the | ||
LICENSE file in the root directory of this source tree. | ||
""" | ||
|
||
from easydict import EasyDict as edict | ||
|
||
config_ = edict() | ||
|
||
""" | ||
Common configs | ||
""" | ||
config_.data_root = "/mnt/tmp" | ||
config_.use_ddp = True | ||
config_.use_amp = False | ||
config_.local_rank = 0 | ||
config_.world_size = 1 | ||
config_.random_seed = 3407 | ||
""" | ||
Network configs | ||
""" | ||
config_.network = edict() | ||
config_.network.encoder = "vit_b" | ||
config_.network.decoder = "zim" | ||
config_.network.encode_kernel = 21 | ||
""" | ||
Evaluation configs | ||
""" | ||
config_.eval = edict() | ||
config_.eval.workers = 4 | ||
config_.eval.image_size = 1024 | ||
config_.eval.prompt_type = "point,bbox" | ||
config_.eval.model_list = "zim,sam" | ||
config_.eval.zim_weights = "" | ||
config_.eval.sam_weights = "" | ||
""" | ||
Dataset configs | ||
""" | ||
config_.dataset = edict() | ||
config_.dataset.valset = "MicroMat3K" | ||
config_.dataset.data_type = "fine,coarse" | ||
config_.dataset.data_list_txt = "data_list.txt" | ||
|
||
|
||
def remove_prefix(text, prefix): | ||
    if text.startswith(prefix): | ||
        return text[len(prefix) :] | ||
    return text | ||
|
||
|
||
def generate_config(args): | ||
    # merge args & config | ||
    for k, v in args.items(): | ||
        if k.startswith("network_"): | ||
            config_["network"][remove_prefix(k, "network_")] = v | ||
        elif k.startswith("eval_"): | ||
            config_["eval"][remove_prefix(k, "eval_")] = v | ||
        elif k.startswith("dataset_"): | ||
            config_["dataset"][remove_prefix(k, "dataset_")] = v | ||
        elif k == "amp": | ||
            config_["use_amp"] = v | ||
        else: | ||
            config_[k] = v | ||
    return config_ |
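
The merge in `generate_config` routes flat argument names into nested config sections by prefix. The sketch below mirrors that routing with plain dictionaries so it runs without `easydict`. It is an illustration under assumptions, not the repository's code: `merge_args`, the trimmed `config` dictionary, and the argument values (including `"vit_l"`) are hypothetical.

```python
# Sketch of the prefix-based merge, using plain dicts instead of EasyDict
# so that it runs without third-party packages.

def remove_prefix(text, prefix):
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


def merge_args(config, args):
    """Route flat keys such as 'eval_image_size' into nested sections."""
    for key, value in args.items():
        for section in ("network", "eval", "dataset"):
            prefix = section + "_"
            if key.startswith(prefix):
                config[section][remove_prefix(key, prefix)] = value
                break
        else:
            # No section prefix matched: "amp" is stored as "use_amp",
            # and every other key lands at the top level.
            config["use_amp" if key == "amp" else key] = value
    return config


# Trimmed-down stand-in for the defaults defined in the config module.
config = {
    "use_amp": False,
    "network": {"encoder": "vit_b", "decoder": "zim"},
    "eval": {"image_size": 1024, "prompt_type": "point,bbox"},
    "dataset": {"valset": "MicroMat3K"},
}

# Hypothetical argument values, e.g. what vars(parser.parse_args()) might hold.
args = {
    "network_encoder": "vit_l",
    "eval_image_size": 512,
    "amp": True,
    "local_rank": 1,
}

merged = merge_args(config, args)
print(merged["network"]["encoder"], merged["eval"]["image_size"],
      merged["use_amp"], merged["local_rank"])
```

As in the original function, the merge mutates the config object in place and returns it, so a second call starts from the values left by the first call, not from the defaults.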