Generate k-means segs with no labels #14

Open
jordandeklerk opened this issue Jul 2, 2024 · 3 comments

@jordandeklerk

jordandeklerk commented Jul 2, 2024

First of all, great work on this paper. It is very interesting.

I was curious about generating k-means segmentations without providing ground truth labels. Is this possible with this framework? In helper_generate_kmeans.py, the true label is fed into the label_hint_seg function in segmentation.py, which uses the ground truth label as a hint to generate clusters (as far as I know).

So it would seem that you can't run generate_kmeans.py without ground truth labels, unless I'm missing something.

I am trying to generate segmentations on MRI spine images without any labels. The training process generates latent images, but when I run generate_kmeans.py on the training output, it doesn't return anything.

Here is how I set up my custom dataset class. I think I followed all of the steps in the instructions.

from glob import glob
from typing import Tuple
import cv2
import numpy as np
from torch.utils.data import Dataset
from PIL import Image

class SpineDataset(Dataset):
    def __init__(self,
                 base_path: str = './data/spine',
                 out_shape: Tuple[int, int] = (512, 512)):
        self.out_shape = out_shape
        # This dataset contains multiple PNG files.
        self.img_paths = glob('%s/*.png' % (base_path))
        assert len(self.img_paths) > 0, f"No PNG files found in {base_path}"
        self.img_paths.sort()  # Ensure consistent ordering

    def __len__(self) -> int:
        return len(self.img_paths)

    def __getitem__(self, idx) -> Tuple[np.ndarray, float]:
        image_path = self.img_paths[idx]
        image = np.array(Image.open(image_path).convert('L'))  # Open as grayscale
        
        epsilon = 1e-12
        image = 2 * (image - image.min() + epsilon) / (image.max() - image.min() + epsilon) - 1
        
        assert len(image.shape) == 2
        
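        # Resize so the image fits within `out_shape`, preserving aspect ratio.
        resize_factor = np.array(self.out_shape) / image.shape
        scale = resize_factor.min()
        # `cv2.resize` expects `dsize` as (width, height), i.e. the reverse of `image.shape`.
        dsize = (int(scale * image.shape[1]), int(scale * image.shape[0]))
        image = cv2.resize(src=image,
                           dsize=dsize,
                           interpolation=cv2.INTER_CUBIC)
        
        # Add a leading channel dimension: [H, W] -> [1, H, W].
        image = image[None, ...]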
        
        # Use an `np.nan` as a placeholder for the non-existent label.
        return image, np.nan

    def num_image_channel(self) -> int:
        # [B, C, H, W]
        return 1

    def num_classes(self) -> int:
        return None

I also set up the config.yaml file following the no-label setup suggested in the README. Any help with this would be much appreciated! Thank you!

@ChenLiu-1996
Owner

ChenLiu-1996 commented Jul 3, 2024

Thank you for the feedback.

In the current codebase, the code is written assuming that labels are available, and I wouldn't be surprised if some parts of it break when you run it on custom data without labels.

That said, even with the current codebase, you should probably get some output rather than nothing when you run the training and spectral k-means clustering on unlabeled images.

You are correct that label_hint_seg will be a useless function if you don't have labels. However, when you run generate_kmeans.py, besides producing the binary segmentations (seg_pred), the code should also produce the predicted multi-valued label map (label_pred).
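
To illustrate why a multi-valued label map does not need ground truth, here is a minimal sketch that clusters per-pixel embeddings into an [H, W] label map. It is plain k-means written in NumPy, not the repository's spectral k-means; the function name `kmeans_label_map` and the [H, W, C] embedding layout are assumptions made for illustration only.

```python
import numpy as np


def kmeans_label_map(embeddings: np.ndarray, n_clusters: int = 10,
                     n_iters: int = 20, seed: int = 0) -> np.ndarray:
    """Cluster per-pixel embeddings of shape [H, W, C] into an [H, W] label map.

    Hypothetical helper: plain k-means, used here only to show that no
    ground truth label is involved in producing cluster assignments.
    """
    H, W, C = embeddings.shape
    points = embeddings.reshape(-1, C).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen pixels.
    centroids = points[rng.choice(len(points), size=n_clusters, replace=False)]
    labels = np.zeros(len(points), dtype=np.int64)
    for _ in range(n_iters):
        # Squared distance from every pixel to every centroid: [N, K].
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = points[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return labels.reshape(H, W)
```

The cluster ids in the result are arbitrary integers; mapping them to anatomical structures would still require manual inspection or a label hint.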

So if the code works properly, running generate_kmeans.py will give you npz files where the label_pred key corresponds to the spectral k-means cluster results (default is 10 classes, which you can modify by updating n_clusters inside helper_generate_kmeans.py).
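
A quick way to check whether a generated npz file contains usable cluster results is to load it and count the pixels per cluster. The sketch below assumes only what is stated above, namely that the file has a `label_pred` key; the helper name `summarize_label_pred` and any file path passed to it are hypothetical.

```python
import numpy as np


def summarize_label_pred(npz_path: str) -> dict:
    """Return {cluster_id: pixel_count} for the `label_pred` array in one npz file."""
    with np.load(npz_path) as data:
        if 'label_pred' not in data.files:
            # Listing the available keys helps when the script wrote something else.
            raise KeyError(f"'label_pred' not found; available keys: {data.files}")
        label_pred = data['label_pred']
    ids, counts = np.unique(label_pred, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts)}
```

If the returned dictionary has a single key, every pixel was assigned to the same cluster, which would point to a problem upstream of the visualization step.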

Then (hopefully) you can visualize the result with plot_paper_figure_main.py, WITHOUT the --binary flag (see main Readme).

@jordandeklerk
Author

Thank you for the quick reply! This makes sense. I will have to do some more testing and debugging to see why I'm not getting any output from the generate_kmeans.py script. I suspect it is probably something on my end.

Thank you!

@ChenLiu-1996
Owner

I have not tested whether the latest code supports unlabeled data well. Please let me know if you run into trouble.
