University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3
- Wenqing Wang
- Tested on: Windows 11, i7-11370H @ 3.30GHz 16.0 GB, RTX 3050 Ti
This project implements a GPU-based path tracer in CUDA with several visual and performance improvements.
- Visual
  - Shading kernel with BSDF evaluation for diffuse, specular-reflective, and refractive surfaces
  - Stochastic sampled antialiasing
  - Depth of field
  - Motion blur
- Mesh
  - Arbitrary mesh loading and rendering based on tinyOBJ, with toggleable bounding volume intersection culling
- Performance
  - Path continuation/termination using stream compaction
  - Material sorting
  - First bounce caching
Diffuse cube, refractive sphere & purely reflective floor |
---|
![]() |
This project evaluates a BSDF for each material type (diffuse, reflective, and refractive). For diffuse materials, the ray is scattered in a randomly sampled direction, while for fully reflective materials, the reflected direction is computed with `glm::reflect`. For refractive materials, I first check whether refraction is possible by testing `ior * sin_theta < 1.f` (where theta is the angle of incidence), and if it is, I use `glm::refract` to scatter the ray. I also use Schlick's approximation to estimate the contribution of the Fresnel factor.
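The refractive branch described above can be sketched on the CPU as follows. This is a minimal illustration, not the project's kernel: `Vec3`, `reflectDir`, `schlick`, and `scatterDielectric` are hypothetical names, the reflect and refract formulas mirror what `glm::reflect` and `glm::refract` compute, and choosing between reflection and refraction by comparing a uniform random number `u` against the Schlick reflectance is an assumption about how the Fresnel term is applied.

```cpp
#include <cmath>

// Hypothetical minimal vector type standing in for glm::vec3.
struct Vec3 { float x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Mirror reflection of incident direction i about unit normal n
// (same formula as glm::reflect).
inline Vec3 reflectDir(Vec3 i, Vec3 n) { return i - n * (2.0f * dot(n, i)); }

// Schlick's approximation of the Fresnel reflectance.
inline float schlick(float cosTheta, float etaI, float etaT) {
    float r0 = (etaI - etaT) / (etaI + etaT);
    r0 *= r0;
    float m = 1.0f - cosTheta;
    return r0 + (1.0f - r0) * m * m * m * m * m;
}

// i: unit incident direction pointing toward the surface.
// n: unit normal on the incident side (dot(i, n) <= 0).
// etaI / etaT: indices of refraction on the incident / transmitted side.
// u: uniform random number in [0, 1).
inline Vec3 scatterDielectric(Vec3 i, Vec3 n, float etaI, float etaT, float u) {
    float eta = etaI / etaT;
    float cosTheta = std::fmin(-dot(i, n), 1.0f);
    float sinTheta = std::sqrt(std::fmax(0.0f, 1.0f - cosTheta * cosTheta));

    // Refraction is only possible while eta * sinTheta < 1 (Snell's law).
    bool totalInternalReflection = eta * sinTheta >= 1.0f;
    if (totalInternalReflection || u < schlick(cosTheta, etaI, etaT)) {
        return reflectDir(i, n);
    }
    // Refracted direction (same formula as glm::refract).
    float k = 1.0f - eta * eta * (1.0f - cosTheta * cosTheta);
    return i * eta + n * (eta * cosTheta - std::sqrt(k));
}
```

At normal incidence from air into glass (`etaI = 1`, `etaT = 1.5`) the Schlick reflectance is 0.04, so most samples take the refraction branch and pass straight through.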
Without Anti-aliasing | With Anti-aliasing |
---|---|
![]() | ![]() |
However, anti-aliasing conflicts with first bounce caching: the camera rays are jittered, so the first intersection is supposed to differ in each iteration and a cached first bounce cannot be reused. I think this is a good example of the trade-off between rendering quality and rendering speed.
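To make the conflict concrete, here is a small CPU sketch of per-iteration sub-pixel jitter. `pixelSample` and its seed mixing are hypothetical and only stand in for whatever random number generator the renderer uses. Without antialiasing every iteration samples the pixel center, so the first hit is identical and can be cached. With antialiasing the sample position depends on the iteration, so the first hit must be recomputed.

```cpp
#include <random>
#include <utility>

// Returns the image-plane sample position for pixel (px, py) in iteration iter.
// The seed mixing below is an arbitrary illustrative choice.
inline std::pair<float, float> pixelSample(int px, int py, int iter, bool antialias) {
    if (!antialias) {
        // Fixed pixel center: identical in every iteration, so cacheable.
        return {px + 0.5f, py + 0.5f};
    }
    std::mt19937 rng(static_cast<unsigned>(iter) * 73856093u ^
                     static_cast<unsigned>(px) * 19349663u ^
                     static_cast<unsigned>(py) * 83492791u);
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    float jx = u01(rng);
    float jy = u01(rng);
    // Random offset inside the pixel: depends on the iteration.
    return {px + jx, py + jy};
}
```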
Focal Distance = 8.0 | Focal Distance = 12.0 |
---|---|
![]() | ![]() |
This path tracer implements depth of field using two parameters: `LENS_RADIUS` and `FOCAL_DISTANCE`. `FOCAL_DISTANCE` determines how far from the camera an object must be to appear in focus; `LENS_RADIUS` determines how blurry out-of-focus objects appear.
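One common way to use these two parameters is the thin-lens camera model, sketched below in camera space. This is an assumption about the approach, not a copy of the project's code: `thinLensRay`, `Ray`, and `Vec3` are hypothetical names, the camera is assumed to look along +z from the origin, and the lens is sampled with a simple polar mapping driven by two uniform random numbers `u1` and `u2`.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin; Vec3 dir; };

// dir: unit pinhole ray direction in camera space (camera at origin, looking along +z).
// lensRadius: aperture radius (0 gives a pinhole camera with everything in focus).
// focalDistance: distance along +z to the plane that is in perfect focus.
// u1, u2: uniform random numbers in [0, 1).
inline Ray thinLensRay(Vec3 dir, float lensRadius, float focalDistance, float u1, float u2) {
    const float kTwoPi = 6.28318530718f;

    // Point on the focal plane that the pinhole ray would hit.
    float t = focalDistance / dir.z;
    Vec3 focus = {dir.x * t, dir.y * t, dir.z * t};

    // Uniformly sample a point on the lens disk.
    float r = lensRadius * std::sqrt(u1);
    float phi = kTwoPi * u2;
    Vec3 lens = {r * std::cos(phi), r * std::sin(phi), 0.0f};

    // New ray starts on the lens and passes through the focus point.
    Vec3 d = {focus.x - lens.x, focus.y - lens.y, focus.z - lens.z};
    float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return {lens, {d.x / len, d.y / len, d.z / len}};
}
```

All rays generated for one pixel converge on the focal plane, so objects at that distance stay sharp, while a larger lens radius spreads the rays further apart everywhere else.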
Without Motion Blur | Motion Blur |
---|---|
![]() | ![]() |
Teapot | Cow |
---|---|
![]() | ![]() |
This path tracer supports simple .obj mesh loading using tinyobj (I used an earlier version from the CIS 561 project, in which I rewrote `loadOBJ()` for easier attribute parsing). I parse each mesh into triangles and run `glm::intersectRayTriangle` on each triangle. I also compute a bounding box for each mesh so that its triangles are tested only if the ray first intersects the mesh's bounding box.
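The bounding-box check can be done with the standard slab test, sketched here on the CPU. `Aabb` and `hitsAabb` are hypothetical names, and the box is assumed to be axis-aligned in the same space as the ray. The project's actual culling test may differ.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

// Axis-aligned bounding box given by its minimum and maximum corners.
struct Aabb { float min[3]; float max[3]; };

// Returns true if the ray origin + t * dir (t >= 0) intersects the box.
inline bool hitsAabb(const float origin[3], const float dir[3], const Aabb& box) {
    float tNear = 0.0f;
    float tFar = std::numeric_limits<float>::infinity();
    for (int a = 0; a < 3; ++a) {
        if (dir[a] == 0.0f) {
            // Ray is parallel to this slab: it must start inside it.
            if (origin[a] < box.min[a] || origin[a] > box.max[a]) return false;
            continue;
        }
        float inv = 1.0f / dir[a];
        float t0 = (box.min[a] - origin[a]) * inv;
        float t1 = (box.max[a] - origin[a]) * inv;
        if (t0 > t1) std::swap(t0, t1);
        // Shrink the interval in which the ray is inside all slabs so far.
        tNear = std::max(tNear, t0);
        tFar = std::min(tFar, t1);
        if (tNear > tFar) return false;
    }
    return true;
}
```

Only when `hitsAabb` returns true does the per-triangle loop need to run, which is where the culling saves work for rays that miss the mesh entirely.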
As shown above, first bounce caching improves performance at every maximum depth tested. However, as the maximum depth increases, the gain from caching the first bounce keeps shrinking, because the first bounce accounts for a smaller share of the total computation in each iteration.
In this project, I used `thrust::sort_by_key` to sort the intersection points (light paths) by surface material. However, contrary to my expectation, overall performance degraded significantly after sorting. I tried increasing the number of materials, but, probably because the scene is so simple, the sorted implementation was still slower than the unsorted one. I expect this optimization to have a significant effect in a much more complex scene, but I have not yet obtained test results due to time constraints.
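As a CPU analogue of the key-value sort described above, the sketch below reorders path indices so that paths hitting the same material become contiguous. `sortByMaterial` is a hypothetical helper built on the standard library. It is not the Thrust call itself, and using a stable sort is a choice made here for deterministic output.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Sorts materialIds ascending and applies the same permutation to pathIndices,
// so that paths sharing a material end up next to each other.
inline void sortByMaterial(std::vector<int>& materialIds, std::vector<int>& pathIndices) {
    std::vector<std::size_t> order(materialIds.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return materialIds[a] < materialIds[b]; });

    std::vector<int> sortedIds(order.size());
    std::vector<int> sortedPaths(order.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sortedIds[i] = materialIds[order[i]];
        sortedPaths[i] = pathIndices[order[i]];
    }
    materialIds = std::move(sortedIds);
    pathIndices = std::move(sortedPaths);
}
```

On the GPU the intent is that neighboring threads then shade the same material and take the same branch. The sort itself touches every live path each bounce, which is a plausible reason for the slowdown in a scene with few materials.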
Upside down |
---|
![]() |
Ghost cube |
---|
![]() |