<!DOCTYPE html>
<html>
<head lang="en">
<meta charset="UTF-8">
<meta http-equiv="x-ua-compatible" content="ie=edge">
<title>mip-NeRF</title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- <base href="/"> -->
<!--FACEBOOK-->
<meta property="og:image" content="https://jonbarron.info/mipnerf/img/rays_square.png">
<meta property="og:image:type" content="image/png">
<meta property="og:image:width" content="682">
<meta property="og:image:height" content="682">
<meta property="og:type" content="website" />
<meta property="og:url" content="https://jonbarron.info/mipnerf/"/>
<meta property="og:title" content="mip-NeRF" />
<meta property="og:description" content="Project page for Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields." />
<!--TWITTER-->
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="mip-NeRF" />
<meta name="twitter:description" content="Project page for Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields." />
<meta name="twitter:image" content="https://jonbarron.info/mipnerf/img/rays_square.png" />
<!-- <link rel="apple-touch-icon" href="apple-touch-icon.png"> -->
<!-- <link rel="icon" type="image/png" href="img/seal_icon.png"> -->
<!-- Place favicon.ico in the root directory -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.8.0/codemirror.min.css">
<link rel="stylesheet" href="css/app.css">
<link rel="stylesheet" href="css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.8.0/codemirror.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/1.5.3/clipboard.min.js"></script>
<script src="js/app.js"></script>
</head>
<body>
<div class="container" id="main">
<div class="row">
<h2 class="col-md-12 text-center">
<b>Mip-NeRF</b>: A Multiscale Representation <br> for Anti-Aliasing Neural Radiance Fields
<small>
ICCV 2021 (Oral, Best Paper Honorable Mention)
</small>
</h2>
</div>
<div class="row">
<div class="col-md-12 text-center">
<ul class="list-inline">
<li>
<a href="https://jonbarron.info/">
Jonathan T. Barron
</a>
<br>Google
</li>
<li>
<a href="http://bmild.github.io/">
Ben Mildenhall
</a>
<br>Google
</li>
<li>
<a href="http://matthewtancik.com/">
Matthew Tancik
</a>
<br>UC Berkeley
</li><br>
<li>
<a href="https://phogzone.com/">
Peter Hedman
</a>
<br>Google
</li>
<li>
<a href="http://www.ricardomartinbrualla.com/">
Ricardo Martin-Brualla
</a>
<br>Google
</li>
<li>
<a href="https://pratulsrinivasan.github.io/">
Pratul P. Srinivasan
</a>
<br>Google
</li>
</ul>
</div>
</div>
<div class="row">
<div class="col-md-4 col-md-offset-4 text-center">
<ul class="nav nav-pills nav-justified">
<li>
<a href="https://arxiv.org/abs/2103.13415">
<img src="img/mip_paper_image.jpg" height="60px">
<h4><strong>Paper</strong></h4>
</a>
</li>
<li>
<a href="https://youtu.be/EpH175PY1A0">
<img src="img/youtube_icon.png" height="60px">
<h4><strong>Video</strong></h4>
</a>
</li>
<li>
<a href="https://github.com/google/mipnerf">
<img src="img/github.png" height="60px">
<h4><strong>Code</strong></h4>
</a>
</li>
</ul>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Abstract
</h3>
<img src="img/rays.png" class="img-responsive" alt="overview"><br>
<p class="text-justify">
The rendering procedure used by neural radiance fields (NeRF) samples a scene with a single ray per pixel and may therefore produce renderings that are excessively blurred or aliased when training or testing images observe scene content at different resolutions. The straightforward solution of supersampling by rendering with multiple rays per pixel is impractical for NeRF, because rendering each ray requires querying a multilayer perceptron hundreds of times. Our solution, which we call "mip-NeRF" (à la "mipmap"), extends NeRF to represent the scene at a continuously-valued scale.
By efficiently rendering anti-aliased conical frustums instead of rays, mip-NeRF reduces objectionable aliasing artifacts and significantly improves NeRF's ability to represent fine details, while also being 7% faster than NeRF and half the size.
Compared to NeRF, mip-NeRF reduces average error rates by 17% on the dataset presented with NeRF and by 60% on a challenging multiscale variant of that dataset that we present. mip-NeRF is also able to match the accuracy of a brute-force supersampled NeRF on our multiscale dataset while being 22x faster.
</p>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Video
</h3>
<div class="text-center">
<div style="position:relative;padding-top:56.25%;">
<iframe src="https://www.youtube.com/embed/EpH175PY1A0" allowfullscreen style="position:absolute;top:0;left:0;width:100%;height:100%;"></iframe>
</div>
</div>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Integrated Positional Encoding
</h3>
<p class="text-justify">
Typical positional encoding (as used in Transformer networks and Neural Radiance Fields) maps a single point in space to a feature vector, where each element is generated by a sinusoid with an exponentially increasing frequency:
</p>
<p style="text-align:center;">
<img src="img/pe_seq_eqn_pad.png" height="50px" class="img-responsive">
</p>
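<p class="text-justify">
As a concrete (unofficial) illustration, this encoding can be sketched in a few lines of numpy; the function name and the choice of four frequencies here are ours, not taken from the released code:
</p>

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """Map coordinates to sinusoidal features with exponentially
    increasing frequencies 2^0, 2^1, ..., 2^(L-1)."""
    freqs = 2.0 ** np.arange(num_freqs)           # [1, 2, 4, 8]
    angles = x[..., None] * freqs                 # shape (..., L)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# A point at x = 0 encodes to [0, 0, 0, 0, 1, 1, 1, 1]:
feat = positional_encoding(np.array([0.0]))
```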
<video id="v0" width="100%" autoplay loop muted>
<source src="img/pe_anim_horiz.mp4" type="video/mp4" />
</video>
<p class="text-justify">
Here, we show how these feature vectors change as a function of a point moving in 1D space.
<br><br>
Our <em>integrated positional encoding</em> considers Gaussian <em>regions</em> of space, rather than infinitesimal points. This provides a natural way to input a "region" of space as a query to a coordinate-based neural network, allowing the network to reason about sampling and aliasing. The expected value of each positional encoding component has a simple closed form:
</p>
<p style="text-align:center;">
<img src="img/ipe_eqn_under_pad.png" height="30px" class="img-responsive">
</p>
<video width="100%" autoplay loop muted>
<source src="img/ipe_anim_horiz.mp4" type="video/mp4" />
</video>
<p class="text-justify">
We can see that when considering a wider region, the higher frequency features automatically shrink toward zero, providing the network with lower-frequency inputs. As the region narrows, these features converge to the original positional encoding.
</p>
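<p class="text-justify">
A hedged numpy sketch of this closed form (diagonal-covariance case; names are ours): each sinusoid's expectation under a Gaussian is attenuated by exp(&minus;frequency&sup2; &middot; variance / 2), so wider regions zero out the high-frequency features while a near-zero variance recovers ordinary positional encoding:
</p>

```python
import numpy as np

def integrated_pos_enc(mu, var, num_freqs=4):
    """Expected positional encoding of x ~ N(mu, var), diagonal case:
    E[sin(f x)] = sin(f mu) * exp(-f^2 var / 2), likewise for cos."""
    freqs = 2.0 ** np.arange(num_freqs)
    scaled_mu = mu[..., None] * freqs
    weight = np.exp(-0.5 * var[..., None] * freqs ** 2)
    return np.concatenate([weight * np.sin(scaled_mu),
                           weight * np.cos(scaled_mu)], axis=-1)

mu = np.array([1.0])
narrow = integrated_pos_enc(mu, np.array([1e-8]))  # ~ordinary PE
wide = integrated_pos_enc(mu, np.array([1.0]))     # high freqs shrink
```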
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Mip-NeRF
</h3>
<p class="text-justify">
We use integrated positional encoding to train NeRF to generate anti-aliased renderings. Rather than casting an infinitesimal ray through each pixel, we instead cast a full 3D <em>cone</em>. For each queried point along a ray, we consider its associated 3D conical frustum. Two different cameras viewing the same point in space may result in vastly different conical frustums, as illustrated here in 2D:
</p>
<p style="text-align:center;">
<img src="img/scales_toy.png" class="img-responsive" alt="scales">
</p>
<p class="text-justify">
In order to pass this information through the NeRF network, we fit a multivariate Gaussian to the conical frustum and use the integrated positional encoding described above to create the input feature vector to the network.
</p>
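<p class="text-justify">
The fitting step can be sketched as follows. This is an illustrative reimplementation, not the released code, and the closed-form frustum moments are reproduced from memory of the paper's derivation, so the constants should be checked against the paper before use:
</p>

```python
import numpy as np

def frustum_to_gaussian(origin, direction, t0, t1, base_radius):
    """Approximate the conical frustum between distances t0 and t1
    along a ray with a 3D Gaussian (mean and covariance)."""
    t_mu, t_d = (t0 + t1) / 2.0, (t1 - t0) / 2.0
    denom = 3.0 * t_mu**2 + t_d**2
    # 1D moments of the frustum along the ray and perpendicular to it:
    mu_t = t_mu + 2.0 * t_mu * t_d**2 / denom
    var_t = t_d**2 / 3.0 - (4.0 * t_d**4 * (12.0 * t_mu**2 - t_d**2)
                            / (15.0 * denom**2))
    var_r = base_radius**2 * (t_mu**2 / 4.0 + 5.0 * t_d**2 / 12.0
                              - 4.0 * t_d**4 / (15.0 * denom))
    # Lift to world coordinates: variance along the ray direction
    # plus isotropic variance in the perpendicular plane.
    d_outer = np.outer(direction, direction)
    mean = origin + mu_t * direction
    cov = (var_t * d_outer
           + var_r * (np.eye(3) - d_outer / np.dot(direction, direction)))
    return mean, cov

mean, cov = frustum_to_gaussian(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                                t0=1.0, t1=2.0, base_radius=0.01)
```

<p class="text-justify">
The resulting mean and covariance feed directly into the integrated positional encoding above, in place of the point coordinate and region variance.
</p>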
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Results
</h3>
<p class="text-justify">
We train NeRF and mip-NeRF on a dataset with images at four different resolutions. Normal NeRF (left) cannot learn to represent the same scene at multiple levels of detail, blurring close-up shots and aliasing low-resolution views, while mip-NeRF (right) both preserves sharp details in close-ups and correctly renders the zoomed-out images.
</p>
<br>
<video width="100%" autoplay loop muted controls>
<source src="img/ship_sbs_path1.mp4" type="video/mp4" />
</video>
<video width="100%" autoplay loop muted controls>
<source src="img/chair_sbs_path1.mp4" type="video/mp4" />
</video>
<video width="100%" autoplay loop muted controls>
<source src="img/lego_sbs_path1.mp4" type="video/mp4" />
</video>
<video width="100%" autoplay loop muted controls>
<source src="img/mic_sbs_path1.mp4" type="video/mp4" />
</video>
<br><br>
<p class="text-justify">
We can also manipulate the integrated positional encoding by using a larger or smaller radius than the true pixel footprint, exposing the continuous level of detail learned within a single network:
</p>
<video width="100%" autoplay loop muted controls>
<source src="img/lego_radii_manip_slider_200p.mp4" type="video/mp4" />
</video>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Related links
</h3>
<p class="text-justify">
<a href="https://en.wikipedia.org/wiki/Spatial_anti-aliasing">Wikipedia</a> provides an excellent introduction to spatial anti-aliasing techniques.
</p>
<p class="text-justify">
Mipmaps were introduced by <a href="https://software.intel.com/sites/default/files/m/7/2/c/p1-williams.pdf">Williams (1983)</a> in his paper "Pyramidal Parametrics".
</p>
<p class="text-justify">
<a href="https://dl.acm.org/doi/abs/10.1145/964965.808589">Amanatides (1984)</a> first proposed the idea of replacing rays with cones in computer graphics rendering.
</p>
<p class="text-justify">
The closely related concept of <em>ray differentials</em> (<a href="https://graphics.stanford.edu/papers/trd/">Igehy (1999)</a>) is used in most modern renderers to antialias textures and other material buffers during ray tracing.
</p>
<p class="text-justify">
Cone tracing has been used along with prefiltered voxel-based representations of scene geometry for speeding up indirect illumination calculations in <a href="https://research.nvidia.com/sites/default/files/publications/GIVoxels-pg2011-authors.pdf">Crassin et al. (2011)</a>.
</p>
<p class="text-justify">
Mip-NeRF was implemented on top of the <a href="https://github.com/google-research/google-research/tree/master/jaxnerf">JAXNeRF</a> codebase.
</p>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Citation
</h3>
<div class="form-group col-md-10 col-md-offset-1">
<textarea id="bibtex" class="form-control" readonly>
@inproceedings{barron2021mipnerf,
  title={Mip-NeRF: A Multiscale Representation
         for Anti-Aliasing Neural Radiance Fields},
  author={Jonathan T. Barron and Ben Mildenhall and
          Matthew Tancik and Peter Hedman and
          Ricardo Martin-Brualla and Pratul P. Srinivasan},
  booktitle={ICCV},
  year={2021}
}</textarea>
</div>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Acknowledgements
</h3>
<p class="text-justify">
We thank Janne Kontkanen and David Salesin for their comments on the text, Paul Debevec for constructive discussions, and Boyang Deng for JaxNeRF.
<br>
MT is funded by an NSF Graduate Fellowship.
<br>
The website template was borrowed from <a href="http://mgharbi.com/">Michaël Gharbi</a>.
</p>
</div>
</div>
</div>
</body>
</html>