Improved meshing process and added denoising of the generated point cloud
Lewis Stuart authored and Lewis Stuart committed Jan 8, 2025
1 parent e71a102 commit dd1281b
Showing 6 changed files with 132 additions and 71 deletions.
36 changes: 10 additions & 26 deletions README.md
@@ -4,7 +4,8 @@ Gaussian Splatting can generate extremely high quality 3D representations of a s

This repo offers scripts for converting a 3D Gaussian Splatting scene into a dense point cloud. The generated point clouds are high-quality and effectively imitate the original 3DGS scenes. Extra functionality is offered to customise the creation of the point cloud, as well as to produce a mesh of the scene.

1) **Research Article:** *https://radiancefields.com/3dgs-to-dense-ply*
1) **Research Paper:** *TO ADD*
2) **Research Article:** *https://radiancefields.com/3dgs-to-dense-ply*
2) **YouTube Video:** *https://www.youtube.com/watch?v=cOXfKRFqqxg*

<p>
@@ -44,8 +45,9 @@ The transform path can either be to a transforms.json file or COLMAP output file
| output_path | 3dgs_pc.ply | Path to output file (must be ply file) |
| transform_path | - | Path to COLMAP or Transform file used for loading in camera positions for rendering colours |
| generate_mesh | False | Set to also generate a mesh based on the created point cloud |
| poisson_depth | 12 | The depth used in the poisson surface reconstruction algorithm that is used for meshing (larger value = more quality) |
| poisson_depth | 10 | The depth used in the poisson surface reconstruction algorithm that is used for meshing (larger value = more quality) |
| mesh_output_path | 3dgs_mesh.ply | Path to mesh output file (must be ply file) |
| clean_pointcloud | False | Set to remove outliers on the point cloud after generation (requires Open3D) |
| camera_skip_rate | 0 | Number of cameras to skip for each rendered image (reduces compute time; only use if the cameras follow a linear trajectory) |
| num_points | 10000000 | Total number of points to generate for the pointcloud |
| exact_num_points | False | Set if the number of generated points should more closely match the num_points argument (slower) |
@@ -72,37 +74,19 @@ Our mesh reconstruction works by generating a point cloud only containing the pr
![Comparison of the generated point cloud and mesh for the bulldozer scene](https://i.imgur.com/Lzwhatr.png)

Some tips to improve the results:
1) Set a bounding box to only mesh specific parts of the scene that are you need in the mesh
2) If the final mesh is too sharp, we recommend using some of the features in CloudCompare (e.g. smoothing) to get the desired output.
1) Set the ```poisson_depth``` argument to a higher value (we found that 12 produced the best results, but anything higher produced an infeasible mesh); a sketch of the meshing and denoising steps follows this list
2) Set a bounding box so that only the specific parts of the scene that you need are meshed
3) If the final mesh is too sharp, we recommend using some of the features in CloudCompare (e.g. smoothing) to get the desired output.
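
For reference, the same denoising and meshing steps can be prototyped directly with Open3D (already required for the ```clean_pointcloud``` option). This is a minimal sketch; the file names and outlier-removal parameters are illustrative rather than the repo's defaults:

```python
import open3d as o3d

# Load the generated point cloud (path is illustrative)
pcd = o3d.io.read_point_cloud("3dgs_pc.ply")

# Denoise: drop points whose mean neighbour distance deviates from the average
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Poisson reconstruction needs normals; estimate them if the file lacks them
if not pcd.has_normals():
    pcd.estimate_normals()

# Higher depth gives a finer (but slower and potentially noisier) mesh
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=10)
o3d.io.write_triangle_mesh("3dgs_mesh.ply", mesh)
```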

## How to increase speed

While the generated point clouds have a high accuracy and precise colours, the process can be slower than desired (especially for scenes with millions of Gaussians). There are several ways that speed can be increased without substantially impacting the final quality of the point cloud:
1) Set camera_skip_rate to a value where overlapping images are not rendered (e.g. we set camera_skip_rate = 4 for the mip dataset). We found that setting this value significantly reduced compute time, while not directly impacting the quality of the final reconstruction. Only do this if the camera poses are ordered in a linear trajectory around your scene and the camera poses overlap considerably.
2) Set colour_quality to a lower option. This value is used to determine what resolution to render images of the scene; a lower quality will result in a faster render time.

## How this works
# Citation

Firstly, the gaussians are loaded from the input file, with the 3D covariance matrices being calculated using the scales and rotations of each gaussian. The original gaussian colours are calculated from the spherical harmonics (with the degree=0 since these points do not change based on direction when they are part of the point cloud). Gaussians are then culled based on the bounding box, size cut off and minimum capacity arguments. Alongside this, the normals of each Gaussians are calculated by taking the smallest axis of the Gaussian; this facilitates meshing of the point cloud later.
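
A minimal single-Gaussian sketch of this step (the repo works on batched tensors; the function and argument names are illustrative, and the quaternion is taken in w-x-y-z order):

```python
import torch

def covariance_and_normal(quat, scale):
    # Normalise the quaternion (w, x, y, z) and build the rotation matrix
    w, x, y, z = (quat / quat.norm()).tolist()
    R = torch.tensor([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    S = torch.diag(scale)
    cov3d = R @ S @ S.T @ R.T                # Sigma = R S S^T R^T
    normal = R[:, int(torch.argmin(scale))]  # axis with the smallest scale
    return cov3d, normal

cov3d, normal = covariance_and_normal(torch.tensor([1.0, 0.0, 0.0, 0.0]),
                                      torch.tensor([0.05, 0.02, 0.01]))
```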
If you want to know more about how this works, we recommend reading our paper below. Also, if you found our work useful, please consider citing:

There is an issue with using the loaded gaussian colours for generating new points; these colours do not accurately represent the scene. When rendering an image, gaussians that overlap on each pixel each contribute to the final colour of that pixel based on their opacity and order. Hence, a gaussian that is always behind another gaussian may only contribute 10% to the final pixel colour, and thus their colour does not accurately represent the contribution to the scene.
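
For reference, this is the standard alpha-compositing rule (it is also what ```gauss_render.py``` computes per pixel): each Gaussian contributes its colour with a weight equal to its alpha times the transmittance accumulated in front of it,

```math
C = \sum_{i} c_i \,\alpha_i \, T_i, \qquad T_i = \prod_{j<i} \left(1 - \alpha_j\right)
```

so a Gaussian that always sits behind others has a small weight even if its stored colour is bright.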
TO ADD

![Comparison of point clouds generated using original gaussian colours vs rendered colours](https://i.imgur.com/Y9ZVZaQ.png)

The fix is to use colours generated by rendering images of the scene and tracking the contributions of each gaussian at each frame. When a gaussian contributes the most to a particular pixel colour compared to other times it is rendered (e.g. the gaussian is closer to a surface at that particular camera perspective) then we assign the colour of that pixel to that gaussian. Hence, gaussians behind other colours will not have erroneous colours and will have the same colour as the rendered images. The results are much better.
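
A compact sketch of this assignment rule (shapes and names are illustrative; the actual per-tile implementation is in ```gauss_render.py```):

```python
import torch

def update_gaussian_colours(contribution, pixel_colours, max_contribution, gaussian_colours):
    # contribution: (num_pixels, num_gaussians) weights alpha * T for one render
    # pixel_colours: (num_pixels, 3) rendered RGB values for those pixels
    best_vals, best_pixels = contribution.max(dim=0)   # best pixel per Gaussian
    improved = best_vals > max_contribution            # beats its previous best?
    max_contribution[improved] = best_vals[improved]
    gaussian_colours[improved] = pixel_colours[best_pixels[improved]]
    return max_contribution, gaussian_colours
```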

![Showcase of how rendering colours works compared to utilising the original gaussian colours](https://i.imgur.com/Kbkp4wI.png)

Now we have all the information required to start sampling points at each gaussian. Firstly, points are distributed to all gaussians based on the volume each gaussian has. Hence, larger gaussians will have more points compared to smaller ones, meaning that areas, such as backgrounds, have proper representation.
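
A sketch of this allocation step, assuming an ```(N, 3)``` tensor of per-axis Gaussian scales (the ellipsoid volume is proportional to the product of the three scales):

```python
import torch

def allocate_points(scales, total_points):
    volumes = scales.prod(dim=1)                    # proportional to each Gaussian's volume
    weights = volumes / volumes.sum()
    counts = (weights * total_points).floor().long()
    return counts                                   # points to sample per Gaussian

counts = allocate_points(torch.rand(5, 3), total_points=10_000)
```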

Each of these gaussians are batched, with gaussians with the same number of points to be generated being batched together. Since larger gaussians are less common, the number of gaussians in each batch diminishes as the number of assigned points increases, which is inefficient when generating the points. Hence, after a certain number of points, these batches are 'binned' together. While this does mean that the number of generated points does not exactly match the specified argument, it is much more efficient.
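
A sketch of the binning idea, with illustrative thresholds (the repo's actual bin boundaries are not shown in this commit):

```python
import torch

def bin_point_counts(counts, bin_start=64, bin_size=32):
    # Snap large per-Gaussian counts to the nearest multiple of bin_size so that
    # Gaussians with similar counts can share one sampling batch
    binned = counts.clone()
    large = counts >= bin_start
    binned[large] = torch.round(counts[large].float() / bin_size).long() * bin_size
    return binned
```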

The Torch multivariate normal function is used to sample over the gaussian distribution in batches. However, since a gaussian distribution is not definite, points can be generated that are 'outliers' as they differ too far from the gaussian's centre. Hence, for each point, the Mahalanobis distance is calculated from the centre of each gaussian to its points. If a point has a distance greater than 2 STD, then it is considered an outlier and removed. To ensure that gaussians with lots of random outliers are represented fairly, points that were removed are regenerated and checked again, and removed if they are outliers. This process is repeated until all points have been correctly generated or a max number of attempts has been made (we set this as 5).
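
A single-Gaussian sketch of this sampling loop (the repo batches Gaussians with equal point counts; the names here are illustrative, while the 2-standard-deviation threshold and 5 attempts follow the description above):

```python
import torch

def sample_gaussian_points(mean, cov, num_points, std_threshold=2.0, max_attempts=5):
    dist = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    cov_inv = torch.linalg.inv(cov)

    points = dist.sample((num_points,))
    for _ in range(max_attempts):
        diff = points - mean
        # Squared Mahalanobis distance of each point from the Gaussian centre
        m2 = torch.einsum("ni,ij,nj->n", diff, cov_inv, diff)
        outliers = m2 > std_threshold ** 2
        if not outliers.any():
            break
        # Regenerate only the rejected points and re-check on the next pass
        points[outliers] = dist.sample((int(outliers.sum()),))
    else:
        # Drop any points that are still outliers after the final attempt
        diff = points - mean
        m2 = torch.einsum("ni,ij,nj->n", diff, cov_inv, diff)
        points = points[m2 <= std_threshold ** 2]
    return points
```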

Once all the points have been generated for all of the gaussians these points are exported to a .ply file.
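
One way to write the final points, colours and normals to a ```.ply``` file, here using Open3D (the repo's own writer may differ; the arrays below are dummy data):

```python
import numpy as np
import open3d as o3d

points = np.random.rand(1000, 3)
colours = np.random.rand(1000, 3)              # RGB in [0, 1]
normals = np.tile([0.0, 0.0, 1.0], (1000, 1))

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)
pcd.colors = o3d.utility.Vector3dVector(colours)
pcd.normals = o3d.utility.Vector3dVector(normals)
o3d.io.write_point_cloud("3dgs_pc.ply", pcd)
```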

## Issues

Currently, we are using an altered version of the gaussian renderer introduced in the Torch Splatting repo. While our alterations allow us to accurately calculate the colours of each point, the entire rendering process is slow compared to the original CUDA implementation (around 2 seconds per render). We plan on eventually implementing this into the original gaussian renderer, but this is a future plan. If anyone is up for the challenge, feel free to implement it and push to this repo :)

Another improvement can be automatically generating camera positions for rendering the colours, rather than requiring a set of camera transforms.
9 changes: 9 additions & 0 deletions gauss_handler.py
@@ -199,6 +199,15 @@ def apply_bounding_box(self, bounding_box_min, bounding_box_max):

self.filter_gaussians(valid_gaussians_indices)

"""def apply_k_nearest_neighbours(self, k=10):
tree = KDTree(self.xyz.detach().cpu().numpy())
distances, indices = tree.query(gaussian_positions, k=k)
total_distances = (np.sum(distances, axis=1)/k)
invalid_gaussians = torch.tensor(total_distances > max_dist, device="cuda:0")"""

def cull_large_gaussians(self, cull_gauss_size_percent):
"""
Orders the gaussians by size and removes gaussians with a size greater than the 'cull_gauss_size_percent'
60 changes: 35 additions & 25 deletions gauss_render.py
@@ -194,52 +194,58 @@ def get_rect(pix_coord, radii, width, height):

class GaussRenderer():
"""
A gaussian splatting renderer
A Gaussian splatting renderer
"""

def __init__(self, means3D, opacity, colour, cov3d, white_bkgd=True):
self.white_bkgd = white_bkgd

self.device = means3D.get_device()

# Tensor of the maximum contributions each gaussian made
# Tensor of the maximum contributions each Gaussian made
self.gaussian_max_contribution = torch.zeros(means3D.shape[0], device=self.device)

# Tensor of new gaussian colours calculated for point cloud generation
# Tensor of new Gaussian colours calculated for point cloud generation
self.gaussian_colours = torch.zeros((means3D.shape[0], 3), device=self.device, dtype=torch.double)

"""# Tensor of surface Gaussians
self.surface_gaussian_idxs = torch.zeros(means3D.shape[0], device=self.device)"""

self.means3D = means3D
self.opacity = opacity
self.cov3d = cov3d
self.colour = colour

def get_colours(self):
"""
Returns the new calculated gaussian colours
Returns the new calculated Gaussian colours
"""
return self.gaussian_colours * 255

def get_seen_gaussians(self):
"""
Returns indices of gaussians that have been rendered
Returns a boolean mask of Gaussians that have been rendered
"""
return self.gaussian_max_contribution > 0

def get_surface_gaussians(self):
return self.gaussian_max_contribution > 0.4
"""
Returns a boolean mask of Gaussians that are predicted to be on the surface of the scene
"""
return self.gaussian_max_contribution > torch.mean(self.gaussian_max_contribution)

def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mask, max_tile_size=60, max_gaussians_per_tile=60000):
"""
Renders an image given a set of gaussians and camera transform
Renders an image given a set of Gaussians and camera transform
Args:
camera: the camera parameters to render the image
means2D: positions of the gaussians in 2D spacse
cov2D: 2D covariance matrices for gaussians
colour: colours of the gaussians
opacity: opacity of the gaussians
depths: depths of the gaussians
projection_mask: mask that filters gaussians not included in camera frame
means2D: positions of the Gaussians in 2D space
cov2D: 2D covariance matrices for Gaussians
colour: colours of the Gaussians
opacity: opacity of the Gaussians
depths: depths of the Gaussians
projection_mask: mask that filters Gaussians not included in camera frame
Returns:
render_image: the rendered RGB image
"""
@@ -301,14 +307,14 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas

tile_coord = self.pix_coord[start_px[0]: start_px[0]+tile_size[1], start_px[1]: start_px[1]+tile_size[0]].flatten(0,-2)

# Order gaussians based on the depth (descending away from cam)
# Order Gaussians based on the depth (descending away from cam)
sorted_depths, index = torch.sort(depths[tile_mask])

index = torch.flip(index, [0,])

inverse_index = index.argsort(0)

# Filter gaussians to only those in mask and reorder
# Filter Gaussians to only those in mask and reorder
sorted_means2D = means2D[tile_mask][index]
sorted_cov2d = cov2d[tile_mask][index]
sorted_conic = sorted_cov2d.inverse()
@@ -317,14 +323,14 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas

dx = (tile_coord[:,None,:] - sorted_means2D[None,:])

# Calculate contributions of each gaussian
# Calculate contributions of each Gaussian
gauss_weight = torch.exp(-0.5 * (
dx[:, :, 0]**2 * sorted_conic[:, 0, 0]
+ dx[:, :, 1]**2 * sorted_conic[:, 1, 1]
+ dx[:,:,0]*dx[:,:,1] * sorted_conic[:, 0, 1]
+ dx[:,:,0]*dx[:,:,1] * sorted_conic[:, 1, 0]))

# Calculate alpha and transmittance of each gaussian in pixel
# Calculate alpha and transmittance of each Gaussian in pixel
alpha = (gauss_weight[..., None] * sorted_opacity[None]).clip(max=0.99)
T = torch.cat([torch.ones_like(alpha[:,:1]), 1-alpha[:,:-1]], dim=1).cumprod(dim=1)
acc_alpha = (alpha * T).sum(dim=1)
@@ -333,23 +339,22 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas

render_colour[start_px[0]: start_px[0]+tile_size[1], start_px[1]: start_px[1]+tile_size[0]] = tile_colour.reshape(tile_size[1], tile_size[0], -1)

# Get the current max representations of each gaussian by applying projection and tile mask
# Get the current max representations of each Gaussian by applying projection and tile mask
combined_mask = torch.zeros_like(self.gaussian_max_contribution, dtype=torch.bool)
combined_mask[projection_mask] = tile_mask
current_gaussian_reps = self.gaussian_max_contribution[combined_mask]

indices_in_mask = combined_mask.nonzero(as_tuple=True)[0]

# Calculate the represntation of each gaussian for the current pixels
# This is the amount it contributed to the pixel colour and is what is used to determine what colour the points of each gaussian should have!
# Calculate the representation of each Gaussian for the current pixels
# This is the amount it contributed to the pixel colour and is what is used to determine what colour the points of each Gaussian should have!
contribution = ((T * alpha)).squeeze(2)[:, inverse_index]

# Get what pixel the gaussian contributed the most and what its biggest contribution value was
# Get what pixel the Gaussian contributed the most and what its biggest contribution value was
biggest_contribution_in_tile = torch.max(contribution, 0)
biggest_contribution_in_tile_vals = biggest_contribution_in_tile[0]
biggest_contribution_in_tile_pixel = biggest_contribution_in_tile[1]

# Filter gaussians that have a new biggest contribution
# Filter Gaussians that have a new biggest contribution
new_gaussians = biggest_contribution_in_tile_vals > current_gaussian_reps

new_gaussian_mask_indices = new_gaussians.nonzero()
@@ -360,6 +365,11 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas
self.gaussian_max_contribution[gaussians_to_update] = biggest_contribution_in_tile_vals[new_gaussians].unsqueeze(1)
self.gaussian_colours[gaussians_to_update] = tile_colour[biggest_contribution_in_tile_pixel[new_gaussians]].unsqueeze(1)

"""# Get the Gaussian that contributed most for each pixel in the tile, and set the surface Gaussian ids to 1 for these Gaussians
biggest_contribution_per_pixel = torch.max(contribution, 1)[1]
surface_gaussians_to_update = indices_in_mask[biggest_contribution_per_pixel]
self.surface_gaussian_idxs[surface_gaussians_to_update] += 1"""

return torch.flip(render_colour, [1,])

def add_img(self, camera, **kwargs):
@@ -381,7 +391,7 @@ def add_img(self, camera, **kwargs):
focal_x=camera.focal_x,
focal_y=camera.focal_y)

# Project gaussians into 2D and filter gaussians outside of view range
# Project Gaussians into 2D and filter Gaussians outside of view range
mean_ndc, mean_view, in_mask = projection_ndc(self.means3D,
viewmatrix=camera.world_view_transform,
projmatrix=camera.projection_matrix)
@@ -397,7 +407,7 @@ def add_img(self, camera, **kwargs):
mean_coord_y = ((mean_ndc[..., 1] + 1) * camera.image_height - 1.0) * 0.5
means2D = torch.stack([mean_coord_x, mean_coord_y], dim=-1)

# Estimate of the maximum gaussians for each tile for the provided amount of memory
# Estimate of the maximum Gaussians for each tile for the provided amount of memory
estimated_mem_per_gaussian = 175000
max_gaussians_per_tile = int((torch.cuda.mem_get_info()[1] - torch.cuda.memory_allocated(self.device))/estimated_mem_per_gaussian)

