[ROIs] Use FOV ROIs for labeling (at arbitrary pyramid level) #19

tcompa · 2022-07-20T09:44:02Z

Starting from discussion in

we are now splitting the work on ROIs into several steps/issues:

Parsing pixel sizes and FOV ROIs.
Transform physical-unit ROI into pixels, at arbitrary pyramid level.
Use FOV ROIs for illumination correction (at highest-resolution level).
Use FOV ROIs for labeling (at arbitrary pyramid level). (THIS ISSUE)

Useful resources:

This is the slightly more complex that the illumination-correction case, because:

We should read/write at an arbitrary resolution level.
We may have to force a sequential scan of ROIs (this is to be verified).

Other than that, the tentative plan is similar to the illumination-correction one:

Read ROIs from some path, and transform them to indices at an arbitrary level.
Switch from map_blocks to an explicit loop over ROIs which constructs an array by delayed correction functions. In v0, this should be constrained to never act on more than a single ROI at a time.
Write to disk, triggering execution of the correction function for all FOVs. Keep in mind that when working at level>0 there are some additional scale prefactors to introduce - see https://github.com/fractal-analytics-platform/fractal/blob/tasks/fractal/tasks/image_labeling_whole_well.py#L159-L175.

The text was updated successfully, but these errors were encountered:

* Generalize image_labeling to work with ROIs; * Generalize image_labeling to work at arbitrary resolution level; * Add lib_zattrs_utils.py, also with rescale_datasets; * Rename extract_zyx_pixel_sizes_from_zattrs to extract_zyx_pixel_sizes;

tcompa · 2022-07-25T06:58:08Z

Current status of the per-FOV image-labeling task:

It takes an argument labeling_level (in the same way as image_labeling_whole_well), which can be larger than 0.
It loops over FOV ROIs at this arbitrary resolution level and performs segmentation.
The loop is currently not in parallel - this is something to explore later.
After segmentation, the task writes the mask with the correct scale factors. Since the rescaling operation is shared among the two labeling tasks, it is now obtained with the function rescale_datasets from lib_zattrs_utils.py:

def rescale_datasets(datasets : List[Dict] = None,
                     coarsening_xy : int = None,
                     reference_level : int = None,) -> List[Dict]:
    if datasets is None or coarsening_xy is None or reference_level is None:
        raise TypeError("Missing argument in rescale_datasets")

    # Construct rescaled datasets
    new_datasets = []
    for ds in datasets:
        new_ds = {}

        # Copy all keys that are not coordinateTransformations (e.g. path)
        for key in ds.keys():
            if key != "coordinateTransformations":
                new_ds[key] = ds[key]

        # Update coordinateTransformations
        old_transformations = ds["coordinateTransformations"]
        new_transformations = []
        for t in old_transformations:
            if t["type"] == "scale":
                new_t = {"type": "scale"}
                new_t["scale"] = [
                    t["scale"][0],
                    t["scale"][1] * coarsening_xy**reference_level,
                    t["scale"][2] * coarsening_xy**reference_level,
                ]
                new_transformations.append(new_t)
            else:
                new_transformations.append(t)
        new_ds["coordinateTransformations"] = new_transformations
        new_datasets.append(new_ds)

Note that we explicitly forbid the existence of a global coordinateTransformation, at the moment. We can also decide to allow this feature, and then we need to combine the two scale's (here and also somewhere else - e.g. when parsing pixel sizes).

There a few of relatively simple open points:

Make sure that output is chunked correctly;
Perhaps include the global scale transformation;
Explore parallel writing.
Make sure that image_labeling still works also for 2D (concerning this point: do we actually need to support the per-FOV 2D segmentation? Or should we drop it?)

jluethi · 2022-07-25T07:06:32Z

Sounds great! 👏🏻

Make sure that image_labeling still works also for 2D (concerning this point: do we actually need to support the per-FOV 2D segmentation? Or should we drop it?)

Per FOV 2D segmentation can still be useful, because it allows for 2D segmentation at full resolution, while whole well 2D segmentation needs to be done at a lower resolution for memory reasons.

Perhaps include the global scale transformation;

If it's simple to do, why not. Otherwise, let's wait how transformations change in the v0.5 spec and eventually refactor for that

Explore parallel writing.

There are some cases we know would be save. e.g. at level 0. Is there a way to detect if parallel writing will be save? Also interesting to explore whether it can be safely done in cases where FOV are in overlapping chunks.

tcompa · 2022-07-25T08:08:02Z

Make sure that image_labeling still works also for 2D (concerning this point: do we actually need to support the per-FOV 2D segmentation? Or should we drop it?)

Per FOV 2D segmentation can still be useful, because it allows for 2D segmentation at full resolution, while whole well 2D segmentation needs to be done at a lower resolution for memory reasons.

Ok.
This feature is in-place, and I just tested it on a MIP array. It cannot be tested within Fractal, at the moment, because of our "poor" handling of task arguments (the per-FOV segmentation always acts on the "main" zarr file, and in the current version it cannot be overridden), but it works smoothly from the command-line interface.

tcompa · 2022-07-25T14:03:57Z

Partial update on status:

~~Make sure that output is chunked correctly.~~
Done. Chunk sizes correspond to images (e.g. 1x2160x2560), and the image size is obtained by combining FOV ROIs (from the AnnData table) and pixel sizes (from the zattrs file) - both for the highest-resolution level.
~~Make sure that image_labeling still works also for 2D.~~
Done (although not as part of Fractal).
~~Perhaps include the global scale transformation.~~
This is indeed something simple, but perhaps let's postpone since more changes will come with v0.5. As of fractal-analytics-platform/fractal-client@273ab07, we raise a NotImplementedError with a decent error message when a global transformation is found.
Explore parallel writing.
Ongoing. The first bunch of tests with num_threads>1 just ran through (with reasonable output), even if they included possibly-non-safe parallel writing. Maybe to_zarr already takes care of it? More tests/understanding needed. We may also provisionally change the chunking choices (making chunks very large), to create more edge cases where many processes should write to the same chunk.
Re: Is there a way to detect if parallel writing will be save?
There is a way, of course, but it requires an explicit check that each chunk only hosts a single FOV/ROI. If expected safe operations are only the ones acting on single images, then I wouldn't take the road towards a general automatic check of safety - since this can be achieved directly by requiring that chunks are of the size of images. If @jluethi has in mind something else (like checking that no pair of organoids belong to the same chunk), then the automatic check would be needed.
I'd say let's re-assess after we explore point 4. In the best case, parallel writing is always safe and we can skip this discussion.
~~Profile memory usage. This is just to make sure that we've not introduced any unexpected large memory usage.~~
EDIT: this is under control - see next comment.

tcompa · 2022-07-25T14:46:59Z

Concerning memory usage, here is the profiling for three single-well cases. TL;DR Everything looks OK.

Note that multi-well examples may lead to a known issue - see fractal-analytics-platform/fractal-client#109 (comment).

In all three cases below, segmentation takes place per-FOV, at the highest-resolution level, and with relabeling.

Case 1: 2x2 well with 10 Z planes (num_threads=2):

Case 2: 2x2 well with 42 Z planes (num_threads=1):

Case 3: 9x8 well with 19 Z planes (num_threads=2):

jluethi · 2022-07-25T15:06:12Z

Awesome progress, very nice to see that memory is stable as well!

A NotImplementedError is harsh, but I guess we don't have good ways to parse warnings so far, right? We probably wouldn't want the whole pipeline to crash if someone specifies extra metadata. So as soon as we think the library may gain broader usage, this would have to go. But fine for now :)

Explore parallel writing.

Sound good. Let's make this the focus. I agree, only if parallel writing is possible, but causes issues in some cases will we need to implement some check as in 5. Eventually, we may also want to check that there is no ROI overlap (e.g. for the organoid case). But let's have a look first at how save the to_zarr is. I like the idea of provoking simultaneous writes, could also do this for some synthetic examples. If to_zarr handles this well, than we only need to worry about actual pixel overlap, not parallel chunk access :)

👏🏻👏🏻👏🏻

tcompa · 2022-07-26T07:45:34Z

Here is some first anecdotal evidence of successful parallel writing.

For a single 9x8 well, I performed per-FOV segmentation at resolution level 4, where a FOV has YX shape 135x160 and a single 2160x2560 chunk may host 256 FOVs (that is, more than the 72 FOVs of the well).
I ran this test with num_threads=10 parallel executions of Cellpose (which is possible, memory-wise, because each FOV has shape 10x135x160), which is a highly "dangerous" operation because all segmentation calls write on the same (single) chunk.

Observations:

The code ran through, with no obvious errors/warnings.
The labels appear reasonable in napari (apart from the fact that they are extremely coarse and irregular, but that is an expected consequence of working at level 4).

This points in the good direction!
We should now find some more stringent tests to be sure that everything is happening safely (maybe we can quickly come up with an artificial test of to_zarr.. let's see).

jluethi · 2022-07-26T07:50:35Z

Oh, this sounds great! Very promising indeed! 🎉

tcompa · 2022-07-28T12:43:17Z

Based on the discussion in #20 (and under the assumptions described there), we consider parallel writing as safe - until further notice. This ends the current work on per-ROI labeling, and I'm closing this issue.

Things that are not addressed here:

There is still a broader memory issue which comes up during labeling, but it is not related to the per-ROI scheme -- see Labeling-tasks integration into fractal + (Torch) memory errors fractal-client#109.
Another minor missing point (support for a global coordinate transformation) is not specific for labeling but rather shared among tasks. It can be addressed in the future, the first time that we actually need it. See Support global scale transformations at the multiscale level #50.

This was referenced Jul 20, 2022

Creation of object-defined feature tables fractal-analytics-platform/fractal-client#27

Closed

Parsing metadata from Yokogawa experiments #25

Closed

Field of view parallelization via OME-NGFF ROI tables #24

Closed

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022

Fix coordinateTransformations in image_labeling_whole_well (ref #115)

6e7eee6

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022

Fix bug in image_labeling_whole_well (ref #112 #115)

12e267e

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 22, 2022

Make per-FOV labeling loop over ROIs (ref #115)

746e450

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 25, 2022

Fix bug in rescale_datasets (missing return) - ref #115

8b339f5

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 25, 2022

Fix relabeling in per-FOV labeling, after switching to ROIs (ref #115)

1d73c2a

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 25, 2022

Remove all logic around chunks from image_labeling (ref #115)

08108e0

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 25, 2022

Better exceptions re: coordinateTransformations (ref #115)

273ab07

This was referenced Jul 27, 2022

Labeling-tasks integration into fractal + (Torch) memory errors fractal-analytics-platform/fractal-client#109

Closed

Parallel writing of zarr array from single process #20

Closed

tcompa closed this as completed Jul 28, 2022

jluethi transferred this issue from fractal-analytics-platform/fractal-client Sep 2, 2022

jluethi added this to the Create ROIs & support processing by ROIs milestone Sep 2, 2022

jluethi moved this from Done to Done Archive in Fractal Project Management Oct 5, 2022

tcompa added the Tables AnnData and ROI/feature tables label Sep 19, 2023

jluethi removed this from Fractal Project Management Apr 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ROIs] Use FOV ROIs for labeling (at arbitrary pyramid level) #19

[ROIs] Use FOV ROIs for labeling (at arbitrary pyramid level) #19

tcompa commented Jul 20, 2022

tcompa commented Jul 25, 2022

jluethi commented Jul 25, 2022

tcompa commented Jul 25, 2022

tcompa commented Jul 25, 2022 •

edited

Loading

tcompa commented Jul 25, 2022

jluethi commented Jul 25, 2022

tcompa commented Jul 26, 2022

jluethi commented Jul 26, 2022

tcompa commented Jul 28, 2022

[ROIs] Use FOV ROIs for labeling (at arbitrary pyramid level) #19

[ROIs] Use FOV ROIs for labeling (at arbitrary pyramid level) #19

Comments

tcompa commented Jul 20, 2022

tcompa commented Jul 25, 2022

jluethi commented Jul 25, 2022

tcompa commented Jul 25, 2022

tcompa commented Jul 25, 2022 • edited Loading

tcompa commented Jul 25, 2022

jluethi commented Jul 25, 2022

tcompa commented Jul 26, 2022

jluethi commented Jul 26, 2022

tcompa commented Jul 28, 2022

tcompa commented Jul 25, 2022 •

edited

Loading