diff --git a/.cproject b/.cproject index 69419af..307064c 100644 --- a/.cproject +++ b/.cproject @@ -181,7 +181,7 @@ - + @@ -189,10 +189,10 @@ - + - + diff --git a/.project b/.project index f7a36d9..4d29320 100644 --- a/.project +++ b/.project @@ -1,6 +1,6 @@ - Project3-CUDA-Path-Tracer + Project4-CUDA-Denoiser diff --git a/CMakeLists.txt b/CMakeLists.txt index 036d874..162568b 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -1,6 +1,6 @@ cmake_minimum_required(VERSION 3.1) -project(cis565_path_tracer) +project(cis565_denoiser) set_property(GLOBAL PROPERTY USE_FOLDERS ON) diff --git a/GNUmakefile b/GNUmakefile index 8d548ae..d944325 100644 --- a/GNUmakefile +++ b/GNUmakefile @@ -23,7 +23,7 @@ RelWithDebugInfo: build run: - build/cis565_path_tracer scenes/sphere.txt + build/cis565_denoiser scenes/sphere.txt build: mkdir -p build diff --git a/INSTRUCTION.md b/INSTRUCTION.md index 9a7ccb1..1834d60 100644 --- a/INSTRUCTION.md +++ b/INSTRUCTION.md @@ -1,31 +1,55 @@ -Proj 3 CUDA Path Tracer - Instructions +Project 4 CUDA Denoiser - Instructions ======================== -This is due **Wednesday October 7th** at 11:59pm. A mid-project submission of the core requirements is due **Tuesday Sept 30th** at 11:59pm. - -[Link to "Pathtracing Primer" Slides](https://1drv.ms/p/s!AiLXbdZHgbemhedscBCjlYs-dpL59A) +This is due **Monday October 19th** at 11:59pm EST. **Summary:** -In this project, you'll implement a CUDA-based path tracer capable of rendering globally-illuminated images very quickly. Since in this class we are concerned with working in GPU programming, performance, and the generation of actual beautiful images (and not with mundane programming tasks like I/O), this project includes base code for loading a scene description file, described below, and various other things that generally make up a framework for previewing and saving images. +In this project, you'll implement a pathtracing denoiser that uses geometry buffers (G-buffers) to guide a smoothing filter. + +We would like you to base your technique on the paper "Edge-Avoiding A-Trous Wavelet Transform for fast Global Illumination Filtering," by Dammertz, Sewtz, Hanika, and Lensch. +You can find the paper here: https://jo.dreggn.org/home/2010_atrous.pdf + +Denoisers can help produce a smoother appearance in a pathtraced image with fewer samples-per-pixel/iterations, although the actual improvement often varies from scene-to-scene. +Smoothing an image can be accomplished by blurring pixels - a simple pixel-by-pixel blur filter may sample the color from a pixel's neighbors in the image, weight them by distance, and write the result back into the pixel. + +However, just running a simple blur filter on an image often reduces the amount of detail, smoothing sharp edges. This can get worse as the blur filter gets larger, or with more blurring passes. +Fortunately in a 3D scene, we can use per-pixel metrics to help the filter detect and preserve edges. + +| raw pathtraced image | simple blur | blur guided by G-buffers | +|---|---|---| +|![](img/noisy.png)|![](img/simple_blur.png)|![](img/denoised.png)| -The core renderer is left for you to implement. Finally, note that, while this base code is meant to serve as a strong starting point for a CUDA path tracer, you are not required to use it if you don't want to. You may also change any part of the base code as you please. 
**This is YOUR project.**
-
-**Recommendations:**
-* Every image you save should automatically get a different filename. Don't delete all of them! For the benefit of your README, keep a bunch of them around so you can pick a few to document your progress at the end. Outtakes are highly appreciated!
-* Remember to save your debug images - these will make for a great README.
-* Also remember to save and share your bloopers. Every image has a story to tell and we want to hear about it.

+ per-pixel normals | per-pixel positions (scaled down) | ???! (dummy data, time-of-flight)|
+|---|---|---|
+|![](img/normals.png)|![](img/positions.png)|![](img/time-of-flight.png)|
+
+
+
+We expect you to integrate denoising into your own pathtracers if possible.
+This project's base code is forked from the CUDA pathtracer basecode in Project 3, and exists so that the
+assignment can stand on its own as well as provide some guidance on how to implement some useful tools.
+The main changes are that we have added some GUI controls, a *very* simple pathtracer without stream
+compaction, and a G-buffer with some dummy data in it.
+Feel free to use it as a playground for working out the code changes that your denoiser needs.
+
+You may also change any part of the base code as you please. **This is YOUR project.**

## Contents

* `src/` C++/CUDA source files.
* `scenes/` Example scene description files.
* `img/` Renders of example scene description files. (These probably won't match precisely with yours.)
+  * note that we have added a `cornell_ceiling_light` scene
+  * simple pathtracers often benefit from scenes with very large lights
* `external/` Includes and static libraries for 3rd party libraries.
+* `imgui/` Library code from https://github.com/ocornut/imgui

## Running the code

-The main function requires a scene description file. Call the program with one as an argument: `cis565_path_tracer scenes/sphere.txt`. (In Visual Studio, `../scenes/sphere.txt`.)
+The main function requires a scene description file. Call the program with one as an argument: `cis565_denoiser scenes/cornell_ceiling_light.txt`. (In Visual Studio, `../scenes/cornell_ceiling_light.txt`.)

If you are using Visual Studio, you can set this in the `Debugging > Command Arguments` section in the `Project Properties`. Make sure you get the path right - read the console for errors.

@@ -38,283 +62,127 @@ If you are using Visual Studio, you can set this in the `Debugging > Command Arg
* Right mouse button on the vertical axis to zoom in/out.
* Middle mouse button to move the LOOKAT point in the scene's X/Z plane.

-## Requirements
-
-**Ask in piazza for clarifications.**
-
-In this project, you are given code for:
-
-* Loading and reading the scene description format.
-* Sphere and box intersection functions.
-* Support for saving images.
-* Working CUDA-GL interop for previewing your render while it's running.
-* A skeleton renderer with:
-  * Naive ray-scene intersection.
-  * A "fake" shading kernel that colors rays based on the material and intersection properties but does NOT compute a new ray based on the BSDF.
-
-### Part 1 - Core Features
-
-**You need to complete these features for your mid-project submission on due by Sept, 30**.
- -Follow all the same guidelines for README and Pull Request for your mid-project submission, except that you should create a branch called `mid-project-submission` and open a pull request with that branch. This way you can continue to work on your projects in the master branch. - -You will need to implement the following features: -* A shading kernel with BSDF evaluation for: - * Ideal Diffuse surfaces (using provided cosine-weighted scatter function, see below.) [PBRT 8.3]. - * Perfectly specular-reflective (mirrored) surfaces (e.g. using `glm::reflect`). - * See notes on diffuse/specular in `scatterRay` and on imperfect specular below. -* Path continuation/termination using Stream Compaction from Project 2. -* After you have a [basic pathtracer up and running](img/REFERENCE_cornell.5000samp.png), -implement a means of making rays/pathSegments/intersections contiguous in memory by material type. This should be easily toggleable. - * Consider the problems with coloring every path segment in a buffer and performing BSDF evaluation using one big shading kernel: different materials/BSDF evaluations within the kernel will take different amounts of time to complete. - * Sort the rays/path segments so that rays/paths interacting with the same material are contiguous in memory before shading. How does this impact performance? Why? -* A toggleable option to cache the first bounce intersections for re-use across all subsequent iterations. Provide performance benefit analysis across different max ray depths. - -### Part 2 - Advance Features (Required) - -1. 2 of these 3 smaller features: - * Refraction (e.g. glass/water) [PBRT 8.2] with Frensel effects using [Schlick's approximation](https://en.wikipedia.org/wiki/Schlick's_approximation) or more accurate methods [PBRT 8.5]. You can use `glm::refract` for Snell's law. - * Recommended but not required: non-perfect specular surfaces. (See below.) - * Physically-based depth-of-field (by jittering rays within an aperture). [PBRT 6.2.3] - * Stochastic Sampled Antialiasing. See Paul Bourke's [notes](http://paulbourke.net/miscellaneous/aliasing/). Keep in mind how this influences the first-bounce cache in part 1. - - > Note you may choose to implement the third feature as well for extra credit as noted in Part 3. - -2. Arbitrary mesh loading and rendering (e.g. glTF 2.0 (preferred) or `obj` files) with -toggleable bounding volume intersection culling - * You can find models online or export them from your favorite 3D modeling application. - With approval, you may use a third-party loading code to bring the data - into C++. - * [tinygltf](https://github.com/syoyo/tinygltf/) is highly recommended for glTF. - * [tinyObj](https://github.com/syoyo/tinyobjloader) is highly recommended for OBJ. - * [obj2gltf](https://github.com/CesiumGS/obj2gltf) can be used to convert - OBJ to glTF files. You can find similar projects for FBX and other - formats. - * You can use the triangle intersection function `glm::intersectRayTriangle`. - * Bounding volume intersection culling: reduce the number of rays that have to - be checked against the entire mesh by first checking rays against a volume - that completely bounds the mesh. For full credit, provide performance analysis - with and without this optimization. - - > Note: This goes great with the Hierarcical Spatial Data Structures extra credit. - -3. [Better hemisphere sampling methods](https://cseweb.ucsd.edu/classes/sp17/cse168-a/CSE168_07_Random.pdf) - -### Part 3 - Make Your Pathtracer Unique! 
- -You are required to choose and implement at least: -* Any 2 Visual Improvements, or -* 1 of Heirarchical Spatial Data Structure or Open Image AI Denoiser. - -This is part of the base project requirements. - -**Extra credit**: implement more features on top of the above required ones, with point value up to +20/100 at the grader's discretion (based on difficulty and coolness). - -#### Visual Improvements -* Implement the 3rd feature from Part 2.1. -* Procedural Shapes & Textures. - * You must generate a minimum of two different complex shapes procedurally. (Not primitives) - * You must be able to shade object with a minimum of two different textures -* Texture mapping [PBRT 10.4] and Bump mapping [PBRT 9.3]. - * Implement file-loaded textures AND a basic procedural texture - * Provide a performance comparison between the two -* Direct lighting (by taking a final ray directly to a random point on an emissive object acting as a light source). Or more advanced [PBRT 15.1.1]. -* Subsurface scattering [PBRT 5.6.2, 11.6]. -* Some method of defining object motion, and motion blur by averaging samples at different times in the animation. -* Use final rays to apply post-processing shaders. Please post your ideas on Piazza before starting. - -#### Performance Improvements -* Work-efficient stream compaction using shared memory across multiple blocks. (See [*GPU Gems 3*, Chapter 39](https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-39-parallel-prefix-sum-scan-cuda).) - * Note that you will NOT receieve extra credit for this if you implemented shared memory stream compaction as extra credit for Project 2. -* Hierarchical spatial data structures - for better ray/scene intersection testing - * Octree recommended - this feature is more about traversal on the GPU than perfect tree structure - * CPU-side data structure construction is sufficient - GPU-side construction was a [final project.](https://github.com/jeremynewlin/Accel) - * Make sure this is toggleable for performance comparisons - * If implemented in conjunction with Arbitrary mesh loading (required for this year), this qualifies as the toggleable bounding volume intersection culling. - * See below for more resources -* [Wavefront pathtracing](https://research.nvidia.com/publication/megakernels-considered-harmful-wavefront-path-tracing-gpus): -Group rays by material without a sorting pass. A sane implementation will require considerable refactoring, since every supported material suddenly needs its own kernel. -* [*Open Image AI Denoiser* ](https://github.com/OpenImageDenoise/oidn) Open Image Denoiser is an image denoiser which works by applying a filter on Monte-Carlo-based pathtracer output. The denoiser runs on the CPU and takes in path tracer output from 1spp to beyond. In order to get full credit for this, you must pass in at least one extra buffer along with the [raw "beauty" buffer](https://github.com/OpenImageDenoise/oidn#open-image-denoise-overview). **Ex:** Beauty + Normals. - * Part of this extra credit is figuring out where the filter should be called, and how you should manage the data for the filter step. - * It is important to note that integrating this is not as simple as it may seem at first glance. Library integration, buffer creation, device compatibility, and more are all real problems which will appear, and it may be hard to debug them. Please only try this if you have finished the Part 2 early and would like extra points. 
While this is difficult, the result would be a significantly faster resolution of the path traced image. -* Re-startable Path tracing: Save some application state (iteration number, samples so far, acceleration structure) so you can start and stop rendering instead of leaving your computer running for hours at end (which will happen in this project) - -**This 'extra features' list is not comprehensive. If you have a particular idea you would like to implement (e.g. acceleration structures, etc.), please post on Piazza.** - -For each extra feature, you must provide the following analysis: - -* Overview write-up of the feature -* Performance impact of the feature -* If you did something to accelerate the feature, what did you do and why? -* Compare your GPU version of the feature to a HYPOTHETICAL CPU version (you don't have to implement it!)? Does it benefit or suffer from being implemented on the GPU? -* How might this feature be optimized beyond your current implementation? - -## Base Code Tour - -You'll be working in the following files. Look for important parts of the code: -* Search for `CHECKITOUT`. -* You'll have to implement parts labeled with `TODO`. (But don't let these constrain you - you have free rein!) +We have also added simple GUI controls that change variables in the code, letting you tune denoising +parameters without having to recompile the project. -* `src/pathtrace.cu`: path tracing kernels, device functions, and calling code - * `pathtraceInit` initializes the path tracer state - it should copy scene data (e.g. geometry, materials) from `Scene`. - * `pathtraceFree` frees memory allocated by `pathtraceInit` - * `pathtrace` performs one iteration of the rendering - it handles kernel launches, memory copies, transferring some data, etc. - * See comments for a low-level path tracing recap. -* `src/intersections.h`: ray intersection functions - * `boxIntersectionTest` and `sphereIntersectionTest`, which take in a ray and a geometry object and return various properties of the intersection. -* `src/interactions.h`: ray scattering functions - * `calculateRandomDirectionInHemisphere`: a cosine-weighted random direction in a hemisphere. Needed for implementing diffuse surfaces. - * `scatterRay`: this function should perform all ray scattering, and will call `calculateRandomDirectionInHemisphere`. See comments for details. -* `src/main.cpp`: you don't need to do anything here, but you can change the program to save `.hdr` image files, if you want (for postprocessing). -* `stream_compaction`: A dummy folder into which you should place your Stream Compaction implementation from Project 2. It should be sufficient to copy the files from [here](https://github.com/CIS565-Fall-2018/Project2-Stream-Compaction/tree/master/stream_compaction) +Requirements +=== -### Generating random numbers - -``` -thrust::default_random_engine rng(hash(index)); -thrust::uniform_real_distribution u01(0, 1); -float result = u01(rng); -``` +**Ask in piazza for clarifications.** -There is a convenience function for generating a random engine using a -combination of index, iteration, and depth as the seed: +## Part 1 - Read! -``` -thrust::default_random_engine rng = makeSeededRandomEngine(iter, index, path.remainingBounces); -``` +One meta-goal for this project is to help you gain some experience in reading technical papers and implementing their concepts. This is an important skill in graphics software engineering, and will also be helpful for your final projects. 
-
-### Imperfect specular lighting
+For part one, try to skim the paper, and then read through it in depth a couple times: https://jo.dreggn.org/home/2010_atrous.pdf
-In path tracing, like diffuse materials, specular materials are simulated using a probability distribution instead computing the strength of a ray bounce based on angles.
+Try to look up anything that you don't understand, and feel free to discuss with your fellow students on Piazza. We were also able to locate presentation slides for this paper that may be helpful: https://www.highperformancegraphics.org/previous/www_2010/media/RayTracing_I/HPG2010_RayTracing_I_Dammertz.pdf
-Equations 7, 8, and 9 of [*GPU Gems 3*, Chapter 20](https://developer.nvidia.com/gpugems/gpugems3/part-iii-rendering/chapter-20-gpu-based-importance-sampling) give the formulas for generating a random specular ray. (Note that there is a typographical error: χ in the text = ξ in the formulas.)
+This paper is also helpful in that it includes a code sample illustrating some of the math, although not
+all the details are given away - for example, parameter tuning in denoising can be very implementation-dependent.
-Also see the notes in `scatterRay` for probability splits between diffuse/specular/other material types.
+This project will focus on this paper; however, it may be helpful to read some of the references as well as
+more recent papers on denoising, such as "Spatiotemporal Variance-Guided Filtering" from NVIDIA, available here: https://research.nvidia.com/publication/2017-07_Spatiotemporal-Variance-Guided-Filtering%3A
-See also: PBRT 8.2.2.
+## Part 2 - A-trous wavelet filter
-### Hierarchical spatial datastructures
+Implement the A-trous wavelet filter from the paper.
-One method for avoiding checking a ray against every primitive in the scene or every triangle in a mesh is to bin the primitives in a hierarchical spatial datastructure such as an [octree](https://en.wikipedia.org/wiki/Octree).
+It's always good to break down techniques into steps that you can individually verify.
+Such a breakdown for this paper could include (a sketch of the core kernel follows this list):
+1. add UI controls to your project - we've done this for you in this base code, but see `Base Code Tour`
+1. implement G-Buffers for normals and positions and visualize them to confirm (see `Base Code Tour`)
+1. implement the A-trous kernel and its iterations without weighting and compare with a blur applied in, say, GIMP or Photoshop
+1. use the G-Buffers to preserve perceived edges
+1. tune parameters to see if they respond in ways that you expect
+1. test more advanced scenes
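To make step 3 (and later step 4) concrete, here is a minimal sketch of one A-trous iteration with the paper's three edge-stopping weights. Treat it as a sketch under assumptions, not as base code: the `normal` and `position` fields on `GBufferPixel`, and the parameter names `c_phi`/`n_phi`/`p_phi`, are ours - the shipped struct only stores a dummy `t` until you extend it. Dropping the three `w_*` factors (so `w = h[i + 2] * h[j + 2]`) gives the unweighted blur from step 3.

```
// One a-trous iteration: a 5x5 B3-spline kernel whose taps are spread
// `stepWidth` pixels apart, weighted by color/normal/position differences.
__global__ void atrousFilter(
        glm::ivec2 resolution, int stepWidth,
        float c_phi, float n_phi, float p_phi,
        const glm::vec3* colorIn, const GBufferPixel* gBuffer,
        glm::vec3* colorOut) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= resolution.x || y >= resolution.y) return;
    int p = x + y * resolution.x;

    // 1D B3-spline coefficients; the 5x5 kernel is their outer product.
    const float h[5] = { 1.f / 16.f, 1.f / 4.f, 3.f / 8.f, 1.f / 4.f, 1.f / 16.f };

    glm::vec3 cval = colorIn[p];
    glm::vec3 nval = gBuffer[p].normal;
    glm::vec3 pval = gBuffer[p].position;

    glm::vec3 sum(0.f);
    float cum_w = 0.f;
    for (int j = -2; j <= 2; j++) {
        for (int i = -2; i <= 2; i++) {
            int qx = glm::clamp(x + i * stepWidth, 0, resolution.x - 1);
            int qy = glm::clamp(y + j * stepWidth, 0, resolution.y - 1);
            int q = qx + qy * resolution.x;

            // Edge-stopping: each weight falls off with the squared difference.
            glm::vec3 d = cval - colorIn[q];
            float w_c = glm::min(expf(-glm::dot(d, d) / c_phi), 1.f);
            d = nval - gBuffer[q].normal;
            float w_n = glm::min(expf(-glm::dot(d, d) / n_phi), 1.f);
            d = pval - gBuffer[q].position;
            float w_p = glm::min(expf(-glm::dot(d, d) / p_phi), 1.f);

            float w = h[i + 2] * h[j + 2] * w_c * w_n * w_p;
            sum += colorIn[q] * w;
            cum_w += w;
        }
    }
    colorOut[p] = sum / cum_w;
}
```

Each successive iteration doubles `stepWidth` (1, 2, 4, ...) and swaps the input/output buffers, so a small number of iterations covers a large effective filter radius.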
-Ray-primitive intersection then involves recursively testing the ray against bounding volumes at different levels in the tree until a leaf containing a subset of primitives/triangles is reached, at which point the ray is checked against all the primitives/triangles in the leaf.
+## Base Code Tour
-* We highly recommend building the datastructure on the CPU and encapsulating the tree buffers into their own struct, with its own dedicated GPU memory management functions.
-* We highly recommend working through your tree construction algorithm by hand with a couple cases before writing any actual code.
-  * How does the algorithm distribute triangles uniformly distributed in space?
-  * What if the model is a perfect axis-aligned cube with 12 triangles in 6 faces? This test can often bring up numerous edge cases!
-* Note that traversal on the GPU must be coded iteratively!
-* Good execution on the GPU requires tuning the maximum tree depth. Make this configurable from the start.
-* If a primitive spans more than one leaf cell in the datastructure, it is sufficient for this project to count the primitive in each leaf cell.
+This base code is derived from Project 3. Some notable differences:
-### Handling Long-Running CUDA Threads
+* `src/pathtrace.cu` - we've added functions `showGBuffer` and `showImage` to help you visualize G-Buffer info and your denoised results. There's also a `generateGBuffer` kernel on the first bounce of `pathtrace`.
+* `src/sceneStructs.h` - there's a new `GBufferPixel` struct
+  * the term G-buffer is more common in the world of rasterizing APIs like OpenGL or WebGL, where many G-buffers may be needed due to limited pixel channels (RGB, RGBA)
+  * in CUDA we can pack everything into one G-buffer with comparatively huge pixels.
+  * at the moment this just contains some dummy "time-to-intersect" data so you can see how `showGBuffer` works.
+* `src/main.h` and `src/main.cpp` - we've added a bunch of `ui_` variables - these connect to the UI sliders in `src/preview.cpp`, and let you toggle between `showGBuffer` and `showImage`, among other things.
+* `scenes` - we've added `cornell_ceiling_light.txt`, which uses a much larger light and fewer iterations. This can be a good scene to start denoising with, since even in the first iteration many rays will terminate at the light.
+* As usual, be sure to search across the project for `CHECKITOUT` and `TODO`
-By default, your GPU driver will probably kill a CUDA kernel if it runs for more than 5 seconds. There's a way to disable this timeout. Just beware of infinite loops - they may lock up your computer.
+Note that the image saving functionality isn't hooked up to gbuffers or denoised images yet - you may need to hook this up yourself, but doing so will be considerably more convenient than taking a screenshot of every image.
-> The easiest way to disable TDR for Cuda programming, assuming you have the NVIDIA Nsight tools installed, is to open the Nsight Monitor, click on "Nsight Monitor options", and under "General" set "WDDM TDR enabled" to false. This will change the registry setting for you. Close and reboot. Any change to the TDR registry setting won't take effect until you reboot. [Stack Overflow](http://stackoverflow.com/questions/497685/cuda-apps-time-out-fail-after-several-seconds-how-to-work-around-this)
+There are also a couple of specific git commits that you can look at for guidance on how to add some of these changes to your own pathtracer, such as `imgui`. You can view these changes on the command line using `git diff [commit hash]`, or on github, for example: https://github.com/CIS565-Fall-2020/Project4-CUDA-Denoiser/commit/0857d1f8f477a39a9ba28a1e0a584b79bd7ec466
-### Notes on GLM
+* 0857d1f8f477a39a9ba28a1e0a584b79bd7ec466 - visualization code for a gbuffer with dummy data as time-to-intersection
+* 1178307347e32da064dce1ef4c217ce0ca6153a8 - add iterations slider and save-and-exit button to UI
-This project uses GLM for linear algebra.
+* 5feb60366e03687bfc245579523402221950c9c5 - add imgui and set up basic sliders for denoising parameters (all the gory cmake changes)
+## Part 3 - Performance Analysis
-On NVIDIA cards pre-Fermi (pre-DX12), you may have issues with mat4-vec4 multiplication. If you have one of these cards, be careful! If you have issues, you might need to grab `cudamat4` and `multiplyMV` from the [Fall 2014 project](https://github.com/CIS565-Fall-2014/Project3-Pathtracer).
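A practical note before the analysis below: kernel cost is easiest to measure with CUDA events. Here is a minimal sketch; `denoiseImage` is a hypothetical stand-in for however you launch your filter passes, not a base-code function.

```
// Times one denoise pass with CUDA events; returns milliseconds.
#include <cuda_runtime.h>

void denoiseImage();  // assumed: wraps your a-trous iteration launches

float timeDenoiseMs() {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    denoiseImage();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait for the recorded work to finish

    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```

Averaging this over many frames, and weighing it against the cost of extra pathtracing iterations, gives exactly the trade-off this section asks you to analyze.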
+The point of denoising is to reduce the number of samples-per-pixel/pathtracing iterations needed to achieve an acceptably smooth image. -Let us know if you need to do this. +You should assess how much time denoising adds to your renders, as well as: +* how denoising influences the number of iterations needed to get an "acceptably smooth" result +* how it impacts runtime at different resolutions +* how effective/ineffective it is with different material types -### Scene File Format +Note that "acceptably smooth" is somewhat subjective - we will leave the means for image comparison up to you, but image diffing tools may be a good place to start, and can help visually convey differences between two images. -This project uses a custom scene description format. Scene files are flat text files that describe all geometry, materials, lights, cameras, and render settings inside of the scene. Items in the format are delimited by new lines, and comments can be added using C-style `// comments`. +Also compare visual results and performance for varying filter sizes. -Materials are defined in the following fashion: +Be sure to compare across different scenes as well - for example, between `cornell.txt` and `cornell_ceiling_light.txt`. Does one scene produce better denoised results? Why or why not? -* MATERIAL (material ID) //material header -* RGB (float r) (float g) (float b) //diffuse color -* SPECX (float specx) //specular exponent -* SPECRGB (float r) (float g) (float b) //specular color -* REFL (bool refl) //reflectivity flag, 0 for no, 1 for yes -* REFR (bool refr) //refractivity flag, 0 for no, 1 for yes -* REFRIOR (float ior) //index of refraction for Fresnel effects -* EMITTANCE (float emittance) //the emittance strength of the material. Material is a light source iff emittance > 0. +Extra Credit +=== -Cameras are defined in the following fashion: +The following extra credit items are listed roughly in order of level-of-effort, and are just suggestions - if you have an idea for something else you want to add, just ask on Piazza! -* CAMERA //camera header -* RES (float x) (float y) //resolution -* FOVY (float fovy) //vertical field of view half-angle. the horizonal angle is calculated from this and the reslution -* ITERATIONS (float interations) //how many iterations to refine the image -* DEPTH (int depth) //maximum depth (number of times the path will bounce) -* FILE (string filename) //file to output render to upon completion -* EYE (float x) (float y) (float z) //camera's position in worldspace -* LOOKAT (float x) (float y) (float z) //point in space that the camera orbits around and points at -* UP (float x) (float y) (float z) //camera's up vector -Objects are defined in the following fashion: +## G-Buffer optimization -* OBJECT (object ID) //object header -* (cube OR sphere OR mesh) //type of object, can be either "cube", "sphere", or "mesh". Note that cubes and spheres are unit sized and centered at the origin. -* material (material ID) //material to assign this object -* TRANS (float transx) (float transy) (float transz) //translation -* ROTAT (float rotationx) (float rotationy) (float rotationz) //rotation -* SCALE (float scalex) (float scaley) (float scalez) //scale +When starting out with gbuffers, it's probably easiest to start storing per-pixel positions and normals as glm::vec3s. However, this can be a decent amount of per-pixel data, which must be read from memory. 
-Two examples are provided in the `scenes/` directory: a single emissive sphere, and a simple cornell box made using cubes for walls and lights and a sphere in the middle. You may want to add to this file for features you implement. (DOF, Anti-aliasing, etc...) +Implement methods to store positions and normals more compactly. Two places to start include: +* storing Z-depth instead of position, and reconstruct position based on pixel coordinates and an inverted projection matrix +* oct-encoding normals: http://jcgt.org/published/0003/02/01/paper.pdf -## Third-Party Code Policy +Be sure to provide performance comparison numbers between optimized and unoptimized implementations. -* Use of any third-party code must be approved by asking on our Piazza. -* If it is approved, all students are welcome to use it. Generally, we approve use of third-party code that is not a core part of the project. For example, for the path tracer, we would approve using a third-party library for loading models, but would not approve copying and pasting a CUDA function for doing refraction. -* Third-party code **MUST** be credited in README.md. -* Using third-party code without its approval, including using another student's code, is an academic integrity violation, and will, at minimum, result in you receiving an F for the semester. +## Comparing A-trous and Gaussian filtering -## README +Dammertz-et-al mention in their section 2.2 that A-trous filtering is a means for approximating gaussian fitlering. Implement gaussian filtering and compare with A-trous to see if one method is significantly faster. -Please see: [**TIPS FOR WRITING AN AWESOME README**](https://github.com/pjcozzi/Articles/blob/master/CIS565/GitHubRepo/README.md) +## Shared Memory Filtering -* Sell your project. -* Assume the reader has a little knowledge of path tracing - don't go into - detail explaining what it is. Focus on your project. -* Don't talk about it like it's an assignment - don't say what is and isn't - "extra" or "extra credit." Talk about what you accomplished. -* Use this to document what you've done. -* *DO NOT* leave the README to the last minute! It is a crucial part of the - project, and we will not be able to grade you without a good README. +Filtering techniques can be somewhat memory-expensive - for each pixel, the technique reads several neighboring pixels to compute a final value. This only gets more expensive with the aditional data in G-Buffers, so these tecniques are likely to benefit from shared memory. -In addition: +Be sure to provide performance comparison numbers between implementations with and without shared memory. +Also pay attention to how shared memory use impacts the block size for your kernels, and how this may change as the filter width changes. -* This is a renderer, so include images that you've made! -* Be sure to back your claims for optimization with numbers and comparisons. -* If you reference any other material, please provide a link to it. -* You wil not be graded on how fast your path tracer runs, but getting close to - real-time is always nice! -* If you have a fast GPU renderer, it is very good to show case this with a - video to show interactivity. If you do so, please include a link! 
+## Implement Temporal Sampling
-### Analysis
+High-performance raytracers in dynamic applications (like games, or real-time visualization engines) now often use temporal sampling, borrowing and repositioning samples from previous frames so that each frame effectively only computes 1 sample per pixel, but can denoise from many frames.
-* Stream compaction helps most after a few bounces. Print and plot the
-  effects of stream compaction within a single iteration (i.e. the number of
-  unterminated rays after each bounce) and evaluate the benefits you get from
-  stream compaction.
-* Compare scenes which are open (like the given cornell box) and closed
-  (i.e. no light can escape the scene). Again, compare the performance effects
-  of stream compaction! Remember, stream compaction only affects rays which
-  terminate, so what might you expect?
-* For optimizations that target specific kernels, we recommend using
-  stacked bar graphs to convey total execution time and improvements in
-  individual kernels. For example:
+This will require additional buffers, as well as reprojection code to move samples from where they were in a previous frame to the current frame.
-  ![Clearly the Macchiato is optimal.](img/stacked_bar_graph.png)
+Note that our basic pathtracer doesn't do animation, so you will also need to implement some kind of dynamic aspect in your scene - this may be as simple as an automated panning camera, or as complex as translating models.
-  Timings from NSight should be very useful for generating these kinds of charts.
+See https://research.nvidia.com/publication/2017-07_Spatiotemporal-Variance-Guided-Filtering%3A for more details.
-## Submit
+Submission
+===

If you have modified any of the `CMakeLists.txt` files at all (aside from the list of `SOURCE_FILES`), mention it explicitly.
Beware of any build issues discussed on Piazza.

Open a GitHub pull request so that we can see that you have finished.
-The title should be "Project 3: YOUR NAME".
+If you are completing this assignment off of your Project 3 pathtracer, you can open a pull request to Project 3. However, **before you start committing code**, please create a separate branch from the one that you used to submit Project 3. This will help us distinguish the changes you made for this project.
+
+Alternatively, if you decide to use a single branch throughout, please let us know in your Project 3 and Project 4 Pull Requests which commits you would like us to grade from. You can just let us know in the Pull Request comments.
+
+The title should be "Project 4: YOUR NAME".
The template for the comment section of your pull request is attached below; feel free to copy and paste:

* [Repo Link](https://link-to-your-repo)
@@ -324,8 +192,10 @@ The template of the comment section of your pull request is attached below, you
  * ...
* Feedback on the project itself, if any.

-## References
+References
+===

-* [PBRT] Physically Based Rendering, Second Edition: From Theory To Implementation. Pharr, Matt and Humphreys, Greg. 2010.
-* Antialiasing and Raytracing.
Chris Cooksey and Paul Bourke, http://paulbourke.net/miscellaneous/aliasing/ -* [Sampling notes](http://graphics.ucsd.edu/courses/cse168_s14/) from Steve Rotenberg and Matteo Mannino, University of California, San Diego, CSE168: Rendering Algorithms +* [Edge-Avoiding A-Trous Wavelet Transform for fast Global Illumination Filtering](https://jo.dreggn.org/home/2010_atrous.pdf) +* [Spatiotemporal Variance-Guided Filtering](https://research.nvidia.com/publication/2017-07_Spatiotemporal-Variance-Guided-Filtering%3A) +* [A Survey of Efficient Representations for Independent Unit Vectors](http://jcgt.org/published/0003/02/01/paper.pdf) +* ocornut/imgui - https://github.com/ocornut/imgui diff --git a/Project3-CUDA-Path-Tracer.launch b/Project4-CUDA-Denoiser.launch similarity index 92% rename from Project3-CUDA-Path-Tracer.launch rename to Project4-CUDA-Denoiser.launch index 0222434..192751f 100644 --- a/Project3-CUDA-Path-Tracer.launch +++ b/Project4-CUDA-Denoiser.launch @@ -7,13 +7,13 @@ - - + + - + diff --git a/README.md b/README.md index 110697c..c6f2823 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@ CUDA Path Tracer ================ -**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3** +**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4** * (TODO) YOUR NAME HERE * Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) diff --git a/img/denoised.png b/img/denoised.png new file mode 100644 index 0000000..ed51537 Binary files /dev/null and b/img/denoised.png differ diff --git a/img/noisy.png b/img/noisy.png new file mode 100644 index 0000000..42ca179 Binary files /dev/null and b/img/noisy.png differ diff --git a/img/normals.png b/img/normals.png new file mode 100644 index 0000000..4b8dc47 Binary files /dev/null and b/img/normals.png differ diff --git a/img/positions.png b/img/positions.png new file mode 100644 index 0000000..50f9742 Binary files /dev/null and b/img/positions.png differ diff --git a/img/simple_blur.png b/img/simple_blur.png new file mode 100644 index 0000000..94f243a Binary files /dev/null and b/img/simple_blur.png differ diff --git a/img/time-of-flight.png b/img/time-of-flight.png new file mode 100644 index 0000000..65b637f Binary files /dev/null and b/img/time-of-flight.png differ diff --git a/scenes/cornell_ceiling_light.txt b/scenes/cornell_ceiling_light.txt new file mode 100644 index 0000000..15af5f1 --- /dev/null +++ b/scenes/cornell_ceiling_light.txt @@ -0,0 +1,117 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 1 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +ITERATIONS 10 +DEPTH 8 +FILE cornell +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 10 .3 10 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 
+ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 4 +TRANS -1 4 -1 +ROTAT 0 0 0 +SCALE 3 3 3 diff --git a/src/interactions.h b/src/interactions.h index a9d3968..144a9f5 100644 --- a/src/interactions.h +++ b/src/interactions.h @@ -2,7 +2,6 @@ #include "intersections.h" -// CHECKITOUT /** * Computes a cosine-weighted random direction in a hemisphere. * Used for diffuse lighting. @@ -42,29 +41,7 @@ glm::vec3 calculateRandomDirectionInHemisphere( } /** - * Scatter a ray with some probabilities according to the material properties. - * For example, a diffuse surface scatters in a cosine-weighted hemisphere. - * A perfect specular surface scatters in the reflected ray direction. - * In order to apply multiple effects to one surface, probabilistically choose - * between them. - * - * The visual effect you want is to straight-up add the diffuse and specular - * components. You can do this in a few ways. This logic also applies to - * combining other types of materias (such as refractive). - * - * - Always take an even (50/50) split between a each effect (a diffuse bounce - * and a specular bounce), but divide the resulting color of either branch - * by its probability (0.5), to counteract the chance (0.5) of the branch - * being taken. - * - This way is inefficient, but serves as a good starting point - it - * converges slowly, especially for pure-diffuse or pure-specular. - * - Pick the split based on the intensity of each material color, and divide - * branch result by that branch's probability (whatever probability you use). - * - * This method applies its changes to the Ray parameter `ray` in place. - * It also modifies the color `color` of the ray in place. - * - * You may need to change the parameter list for your purposes! + * Simple ray scattering with diffuse and perfect specular support. */ __host__ __device__ void scatterRay( @@ -73,9 +50,6 @@ void scatterRay( glm::vec3 normal, const Material &m, thrust::default_random_engine &rng) { - // TODO: implement this. - // A basic implementation of pure-diffuse shading will just call the - // calculateRandomDirectionInHemisphere defined above. glm::vec3 newDirection; if (m.hasReflective) { newDirection = glm::reflect(pathSegment.ray.direction, normal); diff --git a/src/intersections.h b/src/intersections.h index b150407..c3e81f4 100644 --- a/src/intersections.h +++ b/src/intersections.h @@ -19,7 +19,6 @@ __host__ __device__ inline unsigned int utilhash(unsigned int a) { return a; } -// CHECKITOUT /** * Compute a point at parameter value `t` on ray `r`. * Falls slightly short so that it doesn't intersect the object it's hitting. @@ -35,7 +34,6 @@ __host__ __device__ glm::vec3 multiplyMV(glm::mat4 m, glm::vec4 v) { return glm::vec3(m * v); } -// CHECKITOUT /** * Test intersection between a ray and a transformed cube. Untransformed, * the cube ranges from -0.5 to 0.5 in each axis and is centered at the origin. @@ -89,7 +87,6 @@ __host__ __device__ float boxIntersectionTest(Geom box, Ray r, return -1; } -// CHECKITOUT /** * Test intersection between a ray and a transformed sphere. Untransformed, * the sphere always has radius 0.5 and is centered at the origin. 
diff --git a/src/main.cpp b/src/main.cpp index 26cef7b..4092ae4 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -15,6 +15,10 @@ static bool middleMousePressed = false; static double lastX; static double lastY; +// CHECKITOUT: simple UI parameters. +// Search for any of these across the whole project to see how these are used, +// or look at the diff for commit 1178307347e32da064dce1ef4c217ce0ca6153a8. +// For all the gory GUI details, look at commit 5feb60366e03687bfc245579523402221950c9c5. int ui_iterations = 0; int startupIterations = 0; int lastLoopIterations = 0; diff --git a/src/pathtrace.cu b/src/pathtrace.cu index b57dd97..23e5f90 100644 --- a/src/pathtrace.cu +++ b/src/pathtrace.cu @@ -150,7 +150,6 @@ __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, Path segment.ray.origin = cam.position; segment.color = glm::vec3(1.0f, 1.0f, 1.0f); - // TODO: implement antialiasing by jittering the ray segment.ray.direction = glm::normalize(cam.view - cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f) - cam.up * cam.pixelLength.y * ((float)y - (float)cam.resolution.y * 0.5f) @@ -161,10 +160,6 @@ __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, Path } } -// TODO: -// computeIntersections handles generating ray intersections ONLY. -// Generating new rays is handled in your shader(s). -// Feel free to modify the code below. __global__ void computeIntersections( int depth , int num_paths @@ -204,7 +199,6 @@ __global__ void computeIntersections( { t = sphereIntersectionTest(geom, pathSegment.ray, tmp_intersect, tmp_normal, outside); } - // TODO: add more intersection tests here... triangle? metaball? CSG? // Compute the minimum t from the intersection tests to determine what // scene geometry object was hit first. @@ -231,15 +225,6 @@ __global__ void computeIntersections( } } -// LOOK: "fake" shader demonstrating what you might do with the info in -// a ShadeableIntersection, as well as how to use thrust's random number -// generator. Observe that since the thrust random number generator basically -// adds "noise" to the iteration, the image should start off noisy and get -// cleaner as more iterations are computed. -// -// Note that this shader does NOT do a BSDF evaluation! -// Your shaders should handle that - this can allow techniques such as -// bump mapping. __global__ void shadeSimpleMaterials ( int iter , int num_paths @@ -260,8 +245,6 @@ __global__ void shadeSimpleMaterials ( if (intersection.t > 0.0f) { // if the intersection exists... segment.remainingBounces--; // Set up the RNG - // LOOK: this is how you use thrust's RNG! Please look at - // makeSeededRandomEngine as well. thrust::default_random_engine rng = makeSeededRandomEngine(iter, idx, segment.remainingBounces); Material material = materials[intersection.materialId]; @@ -272,9 +255,6 @@ __global__ void shadeSimpleMaterials ( segment.color *= (materialColor * material.emittance); segment.remainingBounces = 0; } - // Otherwise, do some pseudo-lighting computation. This is actually more - // like what you would expect from shading in a rasterizer like OpenGL. - // TODO: replace this! 
you should be able to start with basically a one-liner
      else {
        segment.color *= materialColor;
        glm::vec3 intersectPos = intersection.t * segment.ray.direction + segment.ray.origin;
@@ -337,12 +317,13 @@ void pathtrace(int frame, int iter) {

    ///////////////////////////////////////////////////////////////////////////

-    // Recap:
+    // Pathtracing Recap:
    //   * Initialize array of path rays (using rays that come out of the camera)
    //   * You can pass the Camera object to that kernel.
    //   * Each path ray must carry at minimum a (ray, color) pair,
    //   * where color starts as the multiplicative identity, white = (1, 1, 1).
    //   * This has already been done for you.
+    //   * NEW: For the first depth, generate geometry buffers (gbuffers)
    //   * For each depth:
    //     * Compute an intersection in the scene for each path ray.
    //       A very naive version of this has been implemented for you, but feel
    //       Currently, intersection distance is recorded as a parametric distance,
    //       t, or a "distance along the ray." t = -1.0 indicates no intersection.
    //     * Color is attenuated (multiplied) by reflections off of any object
-    //     * TODO: Stream compact away all of the terminated paths.
+    //     * Stream compact away all of the terminated paths.
    //       You may use either your implementation or `thrust::remove_if` or its
    //       cousins.
    //       * Note that you can't really use a 2D kernel launch any more - switch
    //         to 1D.
-    //     * TODO: Shade the rays that intersected something or didn't bottom out.
+    //     * Shade the rays that intersected something or didn't bottom out.
    //       That is, color the ray by performing a color computation according
    //       to the shader, then generate a new ray to continue the ray path.
    //       We recommend just updating the ray's PathSegment in place.
    //       Note that this step may come before or after stream compaction,
    //       since some shaders you write may also cause a path to terminate.
-    //     * Finally, add this iteration's results to the image. This has been done
-    //       for you.
-
-    // TODO: perform one iteration of path tracing
+    //     * Finally:
+    //       * if not denoising, add this iteration's results to the image
+    //       * TODO: if denoising, run kernels that take both the raw pathtraced result and the gbuffer, and put the result in the "pbo" from opengl

    generateRayFromCamera <<<blocksPerGrid2d, blockSize2d>>>(cam, iter, traceDepth, dev_paths);
    checkCUDAError("generate camera ray");
@@ -404,15 +384,6 @@ void pathtrace(int frame, int iter) {

    depth++;

-    // TODO:
-    // --- Shading Stage ---
-    // Shade path segments based on intersections and generate new rays by
-    // evaluating the BSDF.
-    // Start off with just a big kernel that handles all the different
-    // materials you have in the scenefile.
-    // TODO: compare between directly shading the path segments and shading
-    // path segments that have been reshuffled to be contiguous in memory.
-
    shadeSimpleMaterials<<<numblocksPathSegmentTracing, blockSize1d>>> (
      iter,
      num_paths,
      dev_intersections,
      dev_paths,
      dev_materials
    );
@@ -429,6 +400,8 @@

    ///////////////////////////////////////////////////////////////////////////

+    // CHECKITOUT: use dev_image as reference if you want to implement saving denoised images.
+    // Otherwise, screenshots are also acceptable.
    // Retrieve image from GPU
    cudaMemcpy(hst_scene->state.image.data(), dev_image, pixelcount * sizeof(glm::vec3), cudaMemcpyDeviceToHost);

    checkCUDAError("pathtrace");
}

+// CHECKITOUT: this kernel "post-processes" the gbuffer/gbuffers into something that you can visualize for debugging.
void showGBuffer(uchar4* pbo) {
    const Camera &cam = hst_scene->state.camera;
    const dim3 blockSize2d(8, 8);
    const dim3 blocksPerGrid2d(
            (cam.resolution.x + blockSize2d.x - 1) / blockSize2d.x,
            (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y);

+    // CHECKITOUT: process the gbuffer results and send them to OpenGL buffer for visualization
    gbufferToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, cam.resolution, dev_gBuffer);
}
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index 77a5cde..da7e558 100644
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -75,6 +75,8 @@ struct ShadeableIntersection {
   int materialId;
 };

+// CHECKITOUT - a simple struct for storing scene geometry information per-pixel.
+// What information might be helpful for guiding a denoising filter?
 struct GBufferPixel {
   float t;
 };
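One hedged answer to the question in that comment, as a sketch rather than base code: store the first-bounce normal and world-space position, and fill them in `generateGBuffer`. This assumes the usual `surfaceNormal` field on `ShadeableIntersection`; the new struct fields and the kernel body below are our own illustration.

```
// Sketch: a denoiser-oriented G-buffer pixel and how generateGBuffer
// might populate it on the first bounce.
struct GBufferPixel {
    float t;             // existing dummy "time to intersect"
    glm::vec3 normal;    // first-bounce surface normal
    glm::vec3 position;  // first-bounce world-space hit point
};

__global__ void generateGBuffer(
        int num_paths,
        const ShadeableIntersection* shadeableIntersections,
        const PathSegment* pathSegments,
        GBufferPixel* gBuffer) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= num_paths) return;

    ShadeableIntersection isect = shadeableIntersections[idx];
    Ray ray = pathSegments[idx].ray;

    gBuffer[idx].t = isect.t;
    gBuffer[idx].normal = isect.surfaceNormal;
    // t < 0 marks a miss; position is only meaningful for hits.
    gBuffer[idx].position = ray.origin + isect.t * ray.direction;
}
```

A normals view for `showGBuffer` can then map `normal * 0.5f + 0.5f` to RGB, which should reproduce something like the `img/normals.png` reference above.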