initial publishing
innimu committed Jan 10, 2025
1 parent f025c8a commit 0542fef
Showing 11 changed files with 36 additions and 42 deletions.
Binary file modified _images/Untitled.png
36 changes: 16 additions & 20 deletions _sources/docs/review/zero123.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# zero 1-to-3 : Zero-shot One Image to 3D Object
# zero 123

``` {admonition} Information
- **Title:** zero 1-to-3 : Zero-shot One Image to 3D Object
@@ -46,13 +46,9 @@
4. **Proposing Zero 1-to-3**
- **Zero 1-to-3 :**
- Given a single RGB image as input, a large-scale diffusion model such as Stable Diffusion

<aside>
1. **zero-shot novel view synthesis**
2. **3D shape reconstruction**
</aside>
**learns a mechanism for controlling the camera viewpoint** in order to perform the tasks above
**learns a mechanism for controlling the camera viewpoint** in order to perform the tasks above

- Performing these tasks from a single RGB input image is severely under-constrained
→ Build on a diffusion model, generating and using a large number of object images from diverse viewpoints
@@ -86,7 +82,7 @@
- [**Existing approaches**:](https://velog.io/@dldydldy75/3D-Understanding)

:::{figure-md}
<img src="../../pics/zero123/image 1.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image1.png" alt="tag" class="bg-primary mb-1">
:::

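1. **Models conditioned on global features, depending on how the data was collected**
@@ -134,7 +130,7 @@
2. **Viewpoint bias reflected in internet-scale datasets**: generative models inherit the viewpoint bias of internet data and tend to generate images in particular poses and from particular viewpoints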

:::{figure-md}
<img src="../../pics/zero123/image 2.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image2.png" alt="tag" class="bg-primary mb-1">
:::


@@ -146,14 +142,14 @@
- **Approach [Figure 3]**

:::{figure-md}
<img src="../../pics/zero123/image 3.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image3.png" alt="tag" class="bg-primary mb-1">
:::

- Fine-tune a pretrained diffusion model so that it learns to control the camera parameters without damaging the rest of its representation.
- **Uses the Latent Diffusion architecture**
- **[Reference] LDM**
:::{figure-md}
<img src="../../pics/zero123/image 4.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image4.png" alt="tag" class="bg-primary mb-1">
:::
- Consists of an Encoder ($\mathcal{E}$), a Denoiser (U-Net, $\epsilon_\theta$), and a Decoder ($\mathcal{D}$)
- **The Objective**
@@ -212,7 +208,7 @@
- $\nabla \mathcal{L}_{SJC}$ : PAAS score

:::{figure-md}
<img src="../../pics/zero123/image 7.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image7.png" alt="tag" class="bg-primary mb-1">
:::

- Optimized with an **MSE** loss against the input view
@@ -312,18 +308,18 @@
## 4.4. Novel View Synthesis Results

:::{figure-md}
<img src="../../pics/zero123/image 12.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image12.png" alt="tag" class="bg-primary mb-1">
:::
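- **Table 2**: Novel view synthesis results on RTMV. The RTMV scenes are out of distribution relative to the Objaverse training data, yet our model still outperforms the baselines by a significant margin.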

:::{figure-md}
<img src="../../pics/zero123/image 13.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image13.png" alt="tag" class="bg-primary mb-1">
:::
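- **Figure 5: Novel view synthesis on Google Scanned Objects**
The input view shown on the left is used to synthesize two randomly sampled novel views. The corresponding ground-truth views are shown on the right. Compared with the baseline methods, our synthesized novel views contain rich texture and geometric detail that closely matches the ground truth, whereas the baselines show a significant loss of high-frequency detail.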

:::{figure-md}
<img src="../../pics/zero123/image 14.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image14.png" alt="tag" class="bg-primary mb-1">
:::
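- **Figure 6: Novel view synthesis on RTMV**
The input view shown on the left is used to synthesize two randomly sampled novel views. The corresponding ground-truth views are shown on the right. Our synthesized views maintain high fidelity even under large camera viewpoint changes, while most other methods degrade sharply in quality.
@@ -333,7 +329,7 @@
- However, the small size of the generated point clouds limits Point-E's applicability to novel view synthesis.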

:::{figure-md}
<img src="../../pics/zero123/image 15.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image15.png" alt="tag" class="bg-primary mb-1">
:::

- **Figure 7 Novel view synthesis on in-the-wild images.**
@@ -343,7 +339,7 @@
- Because the input image is 2D, it always shows only a partial view of the object and leaves much of it unobserved, so samples are randomly generated from various viewpoints via diffusion

:::{figure-md}
<img src="../../pics/zero123/image 16.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image16.png" alt="tag" class="bg-primary mb-1">
:::

- Figure 8: With the input view fixed, new samples are randomly generated from various viewpoints and visualized
@@ -352,16 +348,16 @@
## 4.5. 3D Reconstruction Results

:::{figure-md}
<img src="../../pics/zero123/image 17.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image17.png" alt="tag" class="bg-primary mb-1">
:::

- Reconstructs high-fidelity 3D meshes that closely resemble the ground truth

:::{figure-md}
<img src="../../pics/zero123/image 18.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image18.png" alt="tag" class="bg-primary mb-1">
:::
:::{figure-md}
<img src="../../pics/zero123/image 19.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image19.png" alt="tag" class="bg-primary mb-1">
:::

- **MCC (Multiview Compressive Coding)**:
@@ -376,7 +372,7 @@
## 4.6. Text to Image to 3D-

:::{figure-md}
<img src="../../pics/zero123/image 20.png" alt="tag" class="bg-primary mb-1">
<img src="../../pics/zero123/image20.png" alt="tag" class="bg-primary mb-1">
:::

- In addition to images captured in the wild, the model is also tested on images generated by txt2img models such as Dall-E-2
2 changes: 1 addition & 1 deletion _sources/intro.md
@@ -45,4 +45,4 @@ CVPR 2024 Workshop
- 한동현 [Donghyun Han] | [GitHub](https://github.com/donghyun99) | [Linkedin](https://www.linkedin.com/in/donghyun99/) |
- 이창환 [ChangHwan Lee] | [Github](https://github.com/Hwan-I) | Linkedin
- 유경민 [Kyeongmin Yu] | [Github](https://github.com/yukyeongmin) | Linkdedin |
- 이정인 [Jeongin Lee] | Github | Linkdedin |
- 이정인 [Jeongin Lee] | Github | [Linkdedin](https://www.linkedin.com/in/innimu/) |
2 changes: 1 addition & 1 deletion docs/review/DreamFusion.html
@@ -321,7 +321,7 @@



<li class="toctree-l1"><a class="reference internal" href="zero123.html">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toctree-l1"><a class="reference internal" href="zero123.html">zero 123</a></li>



2 changes: 1 addition & 1 deletion docs/review/DreamPose.html
@@ -321,7 +321,7 @@



<li class="toctree-l1"><a class="reference internal" href="zero123.html">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toctree-l1"><a class="reference internal" href="zero123.html">zero 123</a></li>



26 changes: 12 additions & 14 deletions docs/review/zero123.html
@@ -9,7 +9,7 @@
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="generator" content="Docutils 0.18.1: http://docutils.sourceforge.net/" />

<title>zero 1-to-3 : Zero-shot One Image to 3D Object &#8212; Text-to-Image Generation-feat-Diffusion</title>
<title>zero 123 &#8212; Text-to-Image Generation-feat-Diffusion</title>



@@ -321,7 +321,7 @@



<li class="toctree-l1 current active"><a class="current reference internal" href="#">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toctree-l1 current active"><a class="current reference internal" href="#">zero 123</a></li>



@@ -534,7 +534,7 @@


<div id="jb-print-docs-body" class="onlyprint">
<h1>zero 1-to-3 : Zero-shot One Image to 3D Object</h1>
<h1>zero 123</h1>
<!-- Table of contents -->
<div id="print-main-content">
<div id="jb-print-toc">
@@ -544,7 +544,7 @@ <h2> Contents </h2>
</div>
<nav aria-label="Page">
<ul class="visible nav section-nav flex-column">
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#">zero 123</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#abstract-zero-shot-one-image-to-3d-object">Abstract : Zero-shot One Image to 3D Object</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#introduction">1. Introduction</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#related-work">2. Related Work</a></li>
@@ -590,8 +590,8 @@ <h2> Contents </h2>
<div id="searchbox"></div>
<article class="bd-article">

<section class="tex2jax_ignore mathjax_ignore" id="zero-1-to-3-zero-shot-one-image-to-3d-object">
<h1>zero 1-to-3 : Zero-shot One Image to 3D Object<a class="headerlink" href="#zero-1-to-3-zero-shot-one-image-to-3d-object" title="Permalink to this heading">#</a></h1>
<section class="tex2jax_ignore mathjax_ignore" id="zero-123">
<h1>zero 123<a class="headerlink" href="#zero-123" title="Permalink to this heading">#</a></h1>
<div class="admonition-information admonition">
<p class="admonition-title">Information</p>
<ul class="simple">
@@ -656,13 +656,11 @@ <h1>1. Introduction<a class="headerlink" href="#introduction" title="Permalink t
<ul class="simple">
<li><p>Given a single RGB image as input, a large-scale diffusion model such as Stable Diffusion</p></li>
</ul>
<aside>
1. **zero-shot novel view synthesis**
2. **3D shape reconstruction**
</aside>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span> **learns a mechanism for controlling the camera viewpoint** in order to perform the tasks above
</pre></div>
</div>
<ol class="arabic simple">
<li><p><strong>zero-shot novel view synthesis</strong></p></li>
<li><p><strong>3D shape reconstruction</strong>
<strong>learns a mechanism for controlling the camera viewpoint</strong> in order to perform the tasks above</p></li>
</ol>
</li>
<li><p>Performing these tasks from a single RGB input image is severely under-constrained
→ Build on a diffusion model, generating and using a large number of object images from diverse viewpoints</p>
@@ -1247,7 +1245,7 @@ <h2>D. 3D Reconstruction<a class="headerlink" href="#d-3d-reconstruction" title=
</div>
<nav class="bd-toc-nav page-toc">
<ul class="visible nav section-nav flex-column">
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#">zero 123</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#abstract-zero-shot-one-image-to-3d-object">Abstract : Zero-shot One Image to 3D Object</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#introduction">1. Introduction</a></li>
<li class="toc-h1 nav-item toc-entry"><a class="reference internal nav-link" href="#related-work">2. Related Work</a></li>
2 changes: 1 addition & 1 deletion genindex.html
@@ -318,7 +318,7 @@



<li class="toctree-l1"><a class="reference internal" href="docs/review/zero123.html">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toctree-l1"><a class="reference internal" href="docs/review/zero123.html">zero 123</a></li>



4 changes: 2 additions & 2 deletions intro.html
@@ -320,7 +320,7 @@



<li class="toctree-l1"><a class="reference internal" href="docs/review/zero123.html">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toctree-l1"><a class="reference internal" href="docs/review/zero123.html">zero 123</a></li>



@@ -604,7 +604,7 @@ <h2>Contributors<a class="headerlink" href="#contributors" title="Permalink to t
<li><p>한동현 [Donghyun Han] | <a class="reference external" href="https://github.com/donghyun99">GitHub</a> | <a class="reference external" href="https://www.linkedin.com/in/donghyun99/">Linkedin</a> |</p></li>
<li><p>이창환 [ChangHwan Lee] | <a class="reference external" href="https://github.com/Hwan-I">Github</a> | Linkedin</p></li>
<li><p>유경민 [Kyeongmin Yu] | <a class="reference external" href="https://github.com/yukyeongmin">Github</a> | Linkdedin |</p></li>
<li><p>이정인 [Jeongin Lee] | Github | Linkdedin |</p></li>
<li><p>이정인 [Jeongin Lee] | Github | <a class="reference external" href="https://www.linkedin.com/in/innimu/">Linkdedin</a> |</p></li>
</ul>
</section>
<div class="toctree-wrapper compound">
Binary file modified objects.inv
Binary file not shown.
2 changes: 1 addition & 1 deletion search.html
@@ -320,7 +320,7 @@



<li class="toctree-l1"><a class="reference internal" href="docs/review/zero123.html">zero 1-to-3 : Zero-shot One Image to 3D Object</a></li>
<li class="toctree-l1"><a class="reference internal" href="docs/review/zero123.html">zero 123</a></li>



Expand Down
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
