Improved meshing process and added denoising of the generated point cloud
Lewis Stuart authored and Lewis Stuart committed Jan 8, 2025
1 parent e71a102 commit dd1281b
Showing 6 changed files with 132 additions and 71 deletions.
36 changes: 10 additions & 26 deletions README.md
@@ -4,7 +4,8 @@ Gaussian Splatting can generate extremely high quality 3D representations of a s

This repo offers scripts for converting a 3D Gaussian Splatting scene into a dense point cloud. The generated point clouds are high-quality and effectively imitate the original 3DGS scenes. Extra functionality is offered to customise the creation of the point cloud, as well as to produce a mesh of the scene.

1) **Research Article:** *https://radiancefields.com/3dgs-to-dense-ply*
1) **Research Paper:** *TO ADD*
2) **Research Article:** *https://radiancefields.com/3dgs-to-dense-ply*
2) **YouTube Video:** *https://www.youtube.com/watch?v=cOXfKRFqqxg*

<p>
@@ -44,8 +45,9 @@ The transform path can either be to a transforms.json file or COLMAP output file
| output_path | 3dgs_pc.ply | Path to output file (must be ply file) |
| transform_path | - | Path to COLMAP or Transform file used for loading in camera positions for rendering colours |
| generate_mesh | False | Set to also generate a mesh based on the created point cloud |
| poisson_depth | 12 | The depth used in the poisson surface reconstruction algorithm that is used for meshing (larger value = more quality) |
| poisson_depth | 10 | The depth used in the poisson surface reconstruction algorithm that is used for meshing (larger value = more quality) |
| mesh_output_path | 3dgs_mesh.ply | Path to mesh output file (must be ply file) |
| clean_pointcloud | False | Set to remove outliers on the point cloud after generation (requires Open3D) |
| camera_skip_rate | 0 | Number of cameras to skip for each rendered image (reduces compute time; only use if the cameras follow a linear trajectory) |
| num_points | 10000000 | Total number of points to generate for the pointcloud |
| exact_num_points | False | Set if the number of generated points should more closely match the num_points argument (slower) |
@@ -72,37 +74,19 @@ Our mesh reconstruction works by generating a point cloud only containing the pr
![Comparison of the generated point cloud and mesh for the bulldozer scene](https://i.imgur.com/Lzwhatr.png)

Some tips to improve the results:
1) Set a bounding box to only mesh specific parts of the scene that are you need in the mesh
2) If the final mesh is too sharp, we recommend using some of the features in CloudCompare (e.g. smoothing) to get the desired output.
1) Set the ```poisson_depth``` argument to a higher value (we found that 12 produced the best results, but anything higher produced an infeasible mesh); a sketch of the meshing and denoising steps follows this list
2) Set a bounding box so that only the specific parts of the scene that you need are meshed
3) If the final mesh is too sharp, we recommend using some of the features in CloudCompare (e.g. smoothing) to get the desired output.
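
For reference, the same denoising and meshing steps can be prototyped directly with Open3D (already required for the ```clean_pointcloud``` option). This is a minimal sketch; the file names and outlier-removal parameters are illustrative rather than the repo's defaults:

```python
import open3d as o3d

# Load the generated point cloud (path is illustrative)
pcd = o3d.io.read_point_cloud("3dgs_pc.ply")

# Denoise: drop points whose mean neighbour distance deviates from the average
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Poisson reconstruction needs normals; estimate them if the file lacks them
if not pcd.has_normals():
    pcd.estimate_normals()

# Higher depth gives a finer (but slower and potentially noisier) mesh
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=10)
o3d.io.write_triangle_mesh("3dgs_mesh.ply", mesh)
```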

## How to increase speed

While the generated point clouds have a high accuracy and precise colours, the process can be slower than desired (especially for scenes with millions of Gaussians). There are several ways that speed can be increased without substantially impacting the final quality of the point cloud:
1) Set camera_skip_rate to a value where overlapping images are not rendered (e.g. we set camera_skip_rate = 4 for the mip dataset). We found that setting this value significantly reduced compute time, while not directly impacting the quality of the final reconstruction. Only do this if the camera poses are ordered in a linear trajectory around your scene and the camera poses overlap considerably.
2) Set colour_quality to a lower option. This value is used to determine what resolution to render images of the scene; a lower quality will result in a faster render time.

## How this works
# Citation

Firstly, the gaussians are loaded from the input file, with the 3D covariance matrices being calculated using the scales and rotations of each gaussian. The original gaussian colours are calculated from the spherical harmonics (with the degree=0 since these points do not change based on direction when they are part of the point cloud). Gaussians are then culled based on the bounding box, size cut off and minimum capacity arguments. Alongside this, the normals of each Gaussians are calculated by taking the smallest axis of the Gaussian; this facilitates meshing of the point cloud later.
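
A minimal single-Gaussian sketch of this step (the repo works on batched tensors; the function and argument names are illustrative, and the quaternion is taken in w-x-y-z order):

```python
import torch

def covariance_and_normal(quat, scale):
    # Normalise the quaternion (w, x, y, z) and build the rotation matrix
    w, x, y, z = (quat / quat.norm()).tolist()
    R = torch.tensor([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    S = torch.diag(scale)
    cov3d = R @ S @ S.T @ R.T                # Sigma = R S S^T R^T
    normal = R[:, int(torch.argmin(scale))]  # axis with the smallest scale
    return cov3d, normal

cov3d, normal = covariance_and_normal(torch.tensor([1.0, 0.0, 0.0, 0.0]),
                                      torch.tensor([0.05, 0.02, 0.01]))
```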
If you want to know more about how this works, we recommend reading our paper below. Also, if you found our work useful, please consider citing:

There is an issue with using the loaded gaussian colours for generating new points; these colours do not accurately represent the scene. When rendering an image, gaussians that overlap on each pixel each contribute to the final colour of that pixel based on their opacity and order. Hence, a gaussian that is always behind another gaussian may only contribute 10% to the final pixel colour, and thus their colour does not accurately represent the contribution to the scene.
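
For reference, this is the standard alpha-compositing rule (it is also what ```gauss_render.py``` computes per pixel): each Gaussian contributes its colour with a weight equal to its alpha times the transmittance accumulated in front of it,

```math
C = \sum_{i} c_i \,\alpha_i \, T_i, \qquad T_i = \prod_{j<i} \left(1 - \alpha_j\right)
```

so a Gaussian that always sits behind others has a small weight even if its stored colour is bright.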
TO ADD

![Comparison of point clouds generated using original gaussian colours vs rendered colours](https://i.imgur.com/Y9ZVZaQ.png)

The fix is to use colours generated by rendering images of the scene and tracking the contributions of each gaussian at each frame. When a gaussian contributes the most to a particular pixel colour compared to other times it is rendered (e.g. the gaussian is closer to a surface at that particular camera perspective) then we assign the colour of that pixel to that gaussian. Hence, gaussians behind other colours will not have erroneous colours and will have the same colour as the rendered images. The results are much better.
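
A compact sketch of this assignment rule (shapes and names are illustrative; the actual per-tile implementation is in ```gauss_render.py```):

```python
import torch

def update_gaussian_colours(contribution, pixel_colours, max_contribution, gaussian_colours):
    # contribution: (num_pixels, num_gaussians) weights alpha * T for one render
    # pixel_colours: (num_pixels, 3) rendered RGB values for those pixels
    best_vals, best_pixels = contribution.max(dim=0)   # best pixel per Gaussian
    improved = best_vals > max_contribution            # beats its previous best?
    max_contribution[improved] = best_vals[improved]
    gaussian_colours[improved] = pixel_colours[best_pixels[improved]]
    return max_contribution, gaussian_colours
```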

![Showcase of how rendering colours works compared to utilising the original gaussian colours](https://i.imgur.com/Kbkp4wI.png)

Now we have all the information required to start sampling points at each gaussian. Firstly, points are distributed to all gaussians based on the volume each gaussian has. Hence, larger gaussians will have more points compared to smaller ones, meaning that areas, such as backgrounds, have proper representation.
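
A sketch of this allocation step, assuming an ```(N, 3)``` tensor of per-axis Gaussian scales (the ellipsoid volume is proportional to the product of the three scales):

```python
import torch

def allocate_points(scales, total_points):
    volumes = scales.prod(dim=1)                    # proportional to each Gaussian's volume
    weights = volumes / volumes.sum()
    counts = (weights * total_points).floor().long()
    return counts                                   # points to sample per Gaussian

counts = allocate_points(torch.rand(5, 3), total_points=10_000)
```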

Each of these gaussians are batched, with gaussians with the same number of points to be generated being batched together. Since larger gaussians are less common, the number of gaussians in each batch diminishes as the number of assigned points increases, which is inefficient when generating the points. Hence, after a certain number of points, these batches are 'binned' together. While this does mean that the number of generated points does not exactly match the specified argument, it is much more efficient.
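
A sketch of the binning idea, with illustrative thresholds (the repo's actual bin boundaries are not shown in this commit):

```python
import torch

def bin_point_counts(counts, bin_start=64, bin_size=32):
    # Snap large per-Gaussian counts to the nearest multiple of bin_size so that
    # Gaussians with similar counts can share one sampling batch
    binned = counts.clone()
    large = counts >= bin_start
    binned[large] = torch.round(counts[large].float() / bin_size).long() * bin_size
    return binned
```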

The Torch multivariate normal function is used to sample over the gaussian distribution in batches. However, since a gaussian distribution is not definite, points can be generated that are 'outliers' as they differ too far from the gaussian's centre. Hence, for each point, the Mahalanobis distance is calculated from the centre of each gaussian to its points. If a point has a distance greater than 2 STD, then it is considered an outlier and removed. To ensure that gaussians with lots of random outliers are represented fairly, points that were removed are regenerated and checked again, and removed if they are outliers. This process is repeated until all points have been correctly generated or a max number of attempts has been made (we set this as 5).
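
A single-Gaussian sketch of this sampling loop (the repo batches Gaussians with equal point counts; the names here are illustrative, while the 2-standard-deviation threshold and 5 attempts follow the description above):

```python
import torch

def sample_gaussian_points(mean, cov, num_points, std_threshold=2.0, max_attempts=5):
    dist = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    cov_inv = torch.linalg.inv(cov)

    points = dist.sample((num_points,))
    for _ in range(max_attempts):
        diff = points - mean
        # Squared Mahalanobis distance of each point from the Gaussian centre
        m2 = torch.einsum("ni,ij,nj->n", diff, cov_inv, diff)
        outliers = m2 > std_threshold ** 2
        if not outliers.any():
            break
        # Regenerate only the rejected points and re-check on the next pass
        points[outliers] = dist.sample((int(outliers.sum()),))
    else:
        # Drop any points that are still outliers after the final attempt
        diff = points - mean
        m2 = torch.einsum("ni,ij,nj->n", diff, cov_inv, diff)
        points = points[m2 <= std_threshold ** 2]
    return points
```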

Once all the points have been generated for all of the gaussians these points are exported to a .ply file.
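
One way to write the final points, colours and normals to a ```.ply``` file, here using Open3D (the repo's own writer may differ; the arrays below are dummy data):

```python
import numpy as np
import open3d as o3d

points = np.random.rand(1000, 3)
colours = np.random.rand(1000, 3)              # RGB in [0, 1]
normals = np.tile([0.0, 0.0, 1.0], (1000, 1))

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)
pcd.colors = o3d.utility.Vector3dVector(colours)
pcd.normals = o3d.utility.Vector3dVector(normals)
o3d.io.write_point_cloud("3dgs_pc.ply", pcd)
```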

## Issues

Currently, we are using an altered version of the gaussian renderer introduced in the Torch Splatting repo. While our alterations allow us to accurately calculate the colours of each point, the entire rendering process is slow compared to the original CUDA implementation (around 2 seconds per render). We plan on eventually implementing this into the original gaussian renderer, but this is a future plan. If anyone is up for the challenge, feel free to implement it and push to this repo :)

Another improvement can be automatically generating camera positions for rendering the colours, rather than requiring a set of camera transforms.
9 changes: 9 additions & 0 deletions gauss_handler.py
@@ -199,6 +199,15 @@ def apply_bounding_box(self, bounding_box_min, bounding_box_max):

self.filter_gaussians(valid_gaussians_indices)

"""def apply_k_nearest_neighbours(self, k=10):
tree = KDTree(self.xyz.detach().cpu().numpy())
distances, indices = tree.query(gaussian_positions, k=k)
total_distances = (np.sum(distances, axis=1)/k)
invalid_gaussians = torch.tensor(total_distances > max_dist, device="cuda:0")"""

def cull_large_gaussians(self, cull_gauss_size_percent):
"""
Orders the gaussians by size and removes gaussians with a size greater than the 'cull_gauss_size_percent'
60 changes: 35 additions & 25 deletions gauss_render.py
@@ -194,52 +194,58 @@ def get_rect(pix_coord, radii, width, height):

class GaussRenderer():
"""
A gaussian splatting renderer
A Gaussian splatting renderer
"""

def __init__(self, means3D, opacity, colour, cov3d, white_bkgd=True):
self.white_bkgd = white_bkgd

self.device = means3D.get_device()

# Tensor of the maximum contributions each gaussian made
# Tensor of the maximum contributions each Gaussian made
self.gaussian_max_contribution = torch.zeros(means3D.shape[0], device=self.device)

# Tensor of new gaussian colours calculated for point cloud generation
# Tensor of new Gaussian colours calculated for point cloud generation
self.gaussian_colours = torch.zeros((means3D.shape[0], 3), device=self.device, dtype=torch.double)

"""# Tensor of surface Gaussians
self.surface_gaussian_idxs = torch.zeros(means3D.shape[0], device=self.device)"""

self.means3D = means3D
self.opacity = opacity
self.cov3d = cov3d
self.colour = colour

def get_colours(self):
"""
Returns the new calculated gaussian colours
Returns the new calculated Gaussian colours
"""
return self.gaussian_colours * 255

def get_seen_gaussians(self):
"""
Returns indices of gaussians that have been rendered
Returns a boolean mask of Gaussians that have been rendered
"""
return self.gaussian_max_contribution > 0

def get_surface_gaussians(self):
return self.gaussian_max_contribution > 0.4
"""
Returns a boolean mask of Gaussians that are predicted to be on the surface of the scene
"""
return self.gaussian_max_contribution > torch.mean(self.gaussian_max_contribution)

def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mask, max_tile_size=60, max_gaussians_per_tile=60000):
"""
Renders an image given a set of gaussians and camera transform
Renders an image given a set of Gaussians and camera transform
Args:
camera: the camera parameters to render the image
means2D: positions of the gaussians in 2D spacse
cov2D: 2D covariance matrices for gaussians
colour: colours of the gaussians
opacity: opacity of the gaussians
depths: depths of the gaussians
projection_mask: mask that filters gaussians not included in camera frame
means2D: positions of the Gaussians in 2D space
cov2D: 2D covariance matrices for Gaussians
colour: colours of the Gaussians
opacity: opacity of the Gaussians
depths: depths of the Gaussians
projection_mask: mask that filters Gaussians not included in camera frame
Returns:
render_image: the rendered RGB image
"""
@@ -301,14 +307,14 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas

tile_coord = self.pix_coord[start_px[0]: start_px[0]+tile_size[1], start_px[1]: start_px[1]+tile_size[0]].flatten(0,-2)

# Order gaussians based on the depth (descending away from cam)
# Order Gaussians based on the depth (descending away from cam)
sorted_depths, index = torch.sort(depths[tile_mask])

index = torch.flip(index, [0,])

inverse_index = index.argsort(0)

# Filter gaussians to only those in mask and reorder
# Filter Gaussians to only those in mask and reorder
sorted_means2D = means2D[tile_mask][index]
sorted_cov2d = cov2d[tile_mask][index]
sorted_conic = sorted_cov2d.inverse()
@@ -317,14 +323,14 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas

dx = (tile_coord[:,None,:] - sorted_means2D[None,:])

# Calculate contributions of each gaussian
# Calculate contributions of each Gaussian
gauss_weight = torch.exp(-0.5 * (
dx[:, :, 0]**2 * sorted_conic[:, 0, 0]
+ dx[:, :, 1]**2 * sorted_conic[:, 1, 1]
+ dx[:,:,0]*dx[:,:,1] * sorted_conic[:, 0, 1]
+ dx[:,:,0]*dx[:,:,1] * sorted_conic[:, 1, 0]))

# Calculate alpha and transmittance of each gaussian in pixel
# Calculate alpha and transmittance of each Gaussian in pixel
alpha = (gauss_weight[..., None] * sorted_opacity[None]).clip(max=0.99)
T = torch.cat([torch.ones_like(alpha[:,:1]), 1-alpha[:,:-1]], dim=1).cumprod(dim=1)
acc_alpha = (alpha * T).sum(dim=1)
@@ -333,23 +339,22 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas

render_colour[start_px[0]: start_px[0]+tile_size[1], start_px[1]: start_px[1]+tile_size[0]] = tile_colour.reshape(tile_size[1], tile_size[0], -1)

# Get the current max representations of each gaussian by applying projection and tile mask
# Get the current max representations of each Gaussian by applying projection and tile mask
combined_mask = torch.zeros_like(self.gaussian_max_contribution, dtype=torch.bool)
combined_mask[projection_mask] = tile_mask
current_gaussian_reps = self.gaussian_max_contribution[combined_mask]

indices_in_mask = combined_mask.nonzero(as_tuple=True)[0]

# Calculate the represntation of each gaussian for the current pixels
# This is the amount it contributed to the pixel colour and is what is used to determine what colour the points of each gaussian should have!
# Calculate the representation of each Gaussian for the current pixels
# This is the amount it contributed to the pixel colour and is what is used to determine what colour the points of each Gaussian should have!
contribution = ((T * alpha)).squeeze(2)[:, inverse_index]

# Get what pixel the gaussian contributed the most and what its biggest contribution value was
# Get what pixel the Gaussian contributed the most and what its biggest contribution value was
biggest_contribution_in_tile = torch.max(contribution, 0)
biggest_contribution_in_tile_vals = biggest_contribution_in_tile[0]
biggest_contribution_in_tile_pixel = biggest_contribution_in_tile[1]

# Filter gaussians that have a new biggest contribution
# Filter Gaussians that have a new biggest contribution
new_gaussians = biggest_contribution_in_tile_vals > current_gaussian_reps

new_gaussian_mask_indices = new_gaussians.nonzero()
@@ -360,6 +365,11 @@ def render(self, camera, means2D, cov2d, colour, opacity, depths, projection_mas
self.gaussian_max_contribution[gaussians_to_update] = biggest_contribution_in_tile_vals[new_gaussians].unsqueeze(1)
self.gaussian_colours[gaussians_to_update] = tile_colour[biggest_contribution_in_tile_pixel[new_gaussians]].unsqueeze(1)

"""# Get the Gaussian that contributed most for each pixel in the tile, and set the surface Gaussian ids to 1 for these Gaussians
biggest_contribution_per_pixel = torch.max(contribution, 1)[1]
surface_gaussians_to_update = indices_in_mask[biggest_contribution_per_pixel]
self.surface_gaussian_idxs[surface_gaussians_to_update] += 1"""

return torch.flip(render_colour, [1,])

def add_img(self, camera, **kwargs):
@@ -381,7 +391,7 @@ def add_img(self, camera, **kwargs):
focal_x=camera.focal_x,
focal_y=camera.focal_y)

# Project gaussians into 2D and filter gaussians outside of view range
# Project Gaussians into 2D and filter Gaussians outside of view range
mean_ndc, mean_view, in_mask = projection_ndc(self.means3D,
viewmatrix=camera.world_view_transform,
projmatrix=camera.projection_matrix)
@@ -397,7 +407,7 @@ def add_img(self, camera, **kwargs):
mean_coord_y = ((mean_ndc[..., 1] + 1) * camera.image_height - 1.0) * 0.5
means2D = torch.stack([mean_coord_x, mean_coord_y], dim=-1)

# Estimate of the maximum gaussians for each tile for the provided amount of memory
# Estimate of the maximum Gaussians for each tile for the provided amount of memory
estimated_mem_per_gaussian = 175000
max_gaussians_per_tile = int((torch.cuda.mem_get_info()[1] - torch.cuda.memory_allocated(self.device))/estimated_mem_per_gaussian)

