Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROIs] Parsing pixel sizes and FOV ROIs #23

Closed
tcompa opened this issue Jul 20, 2022 · 28 comments
Closed

[ROIs] Parsing pixel sizes and FOV ROIs #23

tcompa opened this issue Jul 20, 2022 · 28 comments
Labels
Tables AnnData and ROI/feature tables

Comments

@tcompa
Copy link
Collaborator

tcompa commented Jul 20, 2022

Starting from discussion in

we are now splitting the work on ROIs into several steps/issues:

  1. Parsing pixel sizes and FOV ROIs. (THIS ISSUE)
  2. Transform physical-unit ROI into pixels, at arbitrary pyramid level.
  3. Use FOV ROIs for illumination correction (at highest-resolution level).
  4. Use FOV ROIs for labeling (at arbitrary pyramid level).

Useful resources:


First sketch of a detailed to-do list:

  1. In create_zarr_structure, during the loop over wells, parse metadata using parse_yokogawa_metadata, which returns a table like Parsing metadata from Yokogawa experiments #25.
  2. When writing the per-FOV .zattrs (corresponding to a well, in our single-FOV scheme), add a global scale with pixel sizes in physical units, and a per-dataset scale with coarsening prefactors.
  3. Convert the FOV-ROI table from step 1 into a table like Field of view parallelization via OME-NGFF ROI tables #24 (this should be done with an external function, because later we may want to provide a custom table to use).
  4. Save it as an AnnData table, in the correct path. See writer in https://github.com/kevinyamauchi/ome-ngff-tables-prototype.
@jluethi
Copy link
Collaborator

jluethi commented Jul 20, 2022

Great summary @tcompa. I also created a corresponding milestone for it, so that we can keep an overview over the related issues.

Sounds like a good first implementation. In order for this to generalize afterwards, the function that will vary is creating the table in 1. That table is either parsed from the mlf & mrf files as described in 1 (using #25), calculated by Fractal based on some inputs (tbd) or generated by the user (support tbd).

If 2-4 are then based on that table, we can either have different tasks for the different input schemes or an input parameter that specifies how the metadata is parsed :)

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 20, 2022
@tcompa
Copy link
Collaborator Author

tcompa commented Jul 20, 2022

@jluethi

Your metadata-parsing function produces a global list of FOV ROIs in the plate, which are not divided into wells - right?
However we should know the well corresponding to each FOV/ROI.

Is this a change you could implement quickly?
Probably just adding a "well" column would do it, because then we can quickly select the ROIs corresponding to a given well.

Let us know if you prefer that we take care of it or you can work on it.

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 20, 2022

(it would be best if you can do it in the tasks-ROIs branch)

@jluethi
Copy link
Collaborator

jluethi commented Jul 20, 2022

Hmm, what kind of table to you get when you parse it? In my test data, I got a table that was double-indexed by well & field of view. Thus, by using the first index column (well_id), you should get the well.

See here: #25

Screenshot 2022-06-17 at 17 42 29

If that doesn't work or you don't get a well_id column, let me know and I'll look into it!

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 20, 2022

My bad, I take it back.

The printout of the table in the terminal is not very clear, and I didn't see the well_id column, sorry.

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 20, 2022

One more question: would you mind exposing also the image size in pixels (that is, 2160x2560)?
We'll start by hard-coding it, but it would be useful to read it directly from the mlf/mrf files, if possible.

@jluethi
Copy link
Collaborator

jluethi commented Jul 20, 2022

Good idea. I will look into making sure this is parsed & checked from the metadata file and will expose it during the refactor to improve code quality of the metadata parsing :)

From the Yokogawa side, it's always specified globally (per channel). For our current parsing, I'd say we enforce that the different channels have the same x & y dimensions. But we still add it as a measurement per site, in case users want to provide tables where different sites were imaged differently at some point.

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 20, 2022
@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

It seems that napari does not recognize the global coordinateTransformations key, despite having it in the ome-ngff specs. I'm not sure of whether we did something wrong, or whether it's a bug upstream.
With the following .zattrs file, the global scale transformation (with prefactors 1x0.1625x0.1625) is ignored

{
    "multiscales": [
        {
            "axes": [
                {
                    "name": "c",
                    "type": "channel"
                },
                {
                    "name": "z",
                    "type": "space",
                    "unit": "micrometer"
                },
                {
                    "name": "y",
                    "type": "space",
                    "unit": "micrometer"
                },
                {
                    "name": "x",
                    "type": "space",
                    "unit": "micrometer"
                }
            ],
            "coordinateTransformations": [
                {
                    "scale": [
                        1.0,
                        0.16249999999999984,
                        0.16249999999999984
                    ],
                    "type": "scale"
                }
            ],
            "datasets": [
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                1.0,
                                1.0
                            ],
                            "type": "scale"
                        }
                    ],
                    "path": "0"
                },
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                2.0,
                                2.0
                            ],
                            "type": "scale"
                        }
                    ],
                    "path": "1"
                },
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                4.0,
                                4.0
                            ],
                            "type": "scale"
                        }
                    ],
                    "path": "2"
                },
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                8.0,
                                8.0
                            ],
                            "type": "scale"
                        }
                    ],
                    "path": "3"
                },
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                16.0,
                                16.0
                            ],
                            "type": "scale"
                        }
                    ],
                    "path": "4"
                }
            ],
            "version": "0.3"
        }
    ],
    "omero": {
        "channels": [
            {
                "coefficient": 1,
                "color": "00FFFF",
                "family": "linear",
                "label": "DAPI",
                "window": {
                    "end": 700,
                    "max": 65535,
                    "min": 0,
                    "start": 0
                }
            },
            {
                "coefficient": 1,
                "color": "FF00FF",
                "family": "linear",
                "label": "nanog",
                "window": {
                    "end": 180,
                    "max": 65535,
                    "min": 0,
                    "start": 0
                }
            },
            {
                "coefficient": 1,
                "color": "FFFF00",
                "family": "linear",
                "label": "Lamin B1",
                "window": {
                    "end": 1500,
                    "max": 65535,
                    "min": 0,
                    "start": 0
                }
            }
        ],
        "id": 1,
        "name": "TBD",
        "version": "0.4"
    }
}

This is with

$ napari --info
napari: 0.4.16
Platform: Linux-5.15.0-41-generic-x86_64-with-glibc2.17
System: Ubuntu 22.04 LTS
Python: 3.8.13 (default, Mar 28 2022, 11:38:47)  [GCC 7.5.0]
Qt: 5.15.2
PyQt5: 5.15.7
NumPy: 1.22.3
SciPy: 1.6.1
Dask: 2022.6.1
VisPy: 0.10.0

OpenGL:
  - GL version:  4.6 (Compatibility Profile) Mesa 22.2.0-devel (git-1951065 2022-06-28 jammy-oibaf-ppa)
  - MAX_TEXTURE_SIZE: 16384

Screens:
  - screen 1: resolution 3840x2160, scale 1.0

Plugins:
  - console: 0.0.4
  - napari-ome-zarr: 0.5.1
  - napari-svg: 0.1.6
  - scikit-image: 0.4.16

It is trivial to move this scale information in the different datasets, but I'm wondering where the problem is.

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

Update: my guess is that it's a ome-zarr issue, since in https://github.com/ome/ome-zarr-py/blob/fe811a635a9edf9be0d97d08ad9ee46e59c60083/ome_zarr/reader.py they only load the dataset transformations.

EDIT
This is the relevant issue: ome/ome-zarr-py#172

@jluethi
Copy link
Collaborator

jluethi commented Jul 21, 2022

Hmm, do we have a strong reason to use global coordinateTransformations? To my mind, the two solutions are very similar. One has extra work during writing (making sure every dataset has the correct coordinates), the other during reading (combining transforms).

Work on coordinate transforms in OME-NGFF is planned for the v0.5 release for the fall. I don't know all the details, but I think the system will be generalized a bit. Maybe we stay with the "simple" format of having coordinateTransformations per dataset until then?

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

Hmm, do we have a strong reason to use global coordinateTransformations? To my mind, the two solutions are very similar. One has extra work during writing (making sure every dataset has the correct coordinates), the other during reading (combining transforms).

I'm not thinking about how much work it is, but about how clear it would be to look at a .zattrs file. If we split the global physical-unit rescaling from the pyramid-coarsening one, as soon as we open a zattrs file we immediately know if it was created at the highest-resolution level (where the first dataset has scale 1,1,1) or at a coarsened one (e.g. 1,2,2 means that this was at level 1, with coarsening factor 2).

It's a small detail, not a serious issue.

Work on coordinate transforms in OME-NGFF is planned for the v0.5 release for the fall. I don't know all the details, but I think the system will be generalized a bit. Maybe we stay with the "simple" format of having coordinateTransformations per dataset until then?

It is not an OME-NGFF issue, since the specs are very clear about this option.
It is a missing feature in ome-zarr reader, see ome/ome-zarr-py#172.

The best option would be to fix the ome-zarr reader (I can open that discussion over there, if we decide so.. it seems easy to add this feature in a naive way, but not if they prefer to have a more complete PR), or we also can keep it simple and move on with the other solution (store everything in the datasets). I'm fine with both.

@jluethi
Copy link
Collaborator

jluethi commented Jul 21, 2022

Hmm, I see the point. If it's easy to fix on the ome-zarr front, sure, let's do it.

But if not, we always know the pyramid level from the folder index (i.e. 0 is lowest level). Or are you thinking about always specifying the same globale scale and using dataset scalings for e.g. label images generated at level 1?
If so, clever idea! But I'm not sure we'd want to enforce this as a standard, would be more robust to look at the combined scale (global + dataset) to figure out if things are scaled the same.

In this spirit: We can also ensure that our way of reading scale always combines global & per dataset (via a lib function). Then, we always switch back and forth, depending on what is implemented downstream

Lastly, I agree, it's not an OME-NGFF issues. But changes are coming to OME-NGFF regarding coordinate transformations in the fall. That may be the reason why supporting all of the current spec hasn't been a priority.

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

But if not, we always know the pyramid level from the folder index (i.e. 0 is lowest level).

The folder name does not contain information about the actual resolution: when we segment level-2 images, the mask will sit in a folder named 0 but that mask has a shape which does not match with the highest-resolution images.

Or are you thinking about always specifying the same globale scale and using dataset scalings for e.g. label images generated at level 1?If so, clever idea!

Yes, the first plan was to use the multiscales scaling to set the physical size of pixels at level 0, and the scales of single datasets to set the coarsening scale.
I'm not yet sure of whether this is the safest way to go (what happens when we want to add translations at the dataset level? In which order are the three scale/scale/translate transformations applied? Depending on the order, we may have to specify translations in either pixel or physical units), but it has the nice advantage I mentioned above.

But I'm not sure we'd want to enforce this as a standard, would be more robust to look at the combined scale (global + dataset) to figure out if things are scaled the same.

Agreed, this is not a standard. It would just be an implementation detail, but we obviously always read/copy all transformations (both at the multiscale and dataset level).

In this spirit: We can also ensure that our way of reading scale always combines global & per dataset (via a lib function). Then, we always switch back and forth, depending on what is implemented downstream

Agreed. At the moment this is just copying zattrs from one (existing) zarr to another one, and a lib didn't seem necessary. But we can factor out this operation if it becomes useful.

All in all, I think the easiest is just to combine scale transformations in a single place, so that we can visualize things with the current ome-zarr reader and we don't have to think too much about the order of transformations (even though we don't use translations at the moment).

(let's also touch this point in our call today)

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

As of our last call, we proceed as in

All in all, I think the easiest is just to combine scale transformations in a single place, so that we can visualize things with the current ome-zarr reader and we don't have to think too much about the order of transformations (even though we don't use translations at the moment).

and we re-evaluate this issue later on, when related changes in OME-NGFF appear.

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
…rs file, including coordinateTransformations (ref #112)
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

Current status:

  • When first creating a zarr file, we parse the mlf/mrf files in the image folder and extract a DataFrame with FOV ROIs. In the future, this DataFrame could come from other sources.
  • Physical units (AKA pixel sizes) are read from this DataFrame and used to produce a correct set of scale transformations for all datasets (combining both the pixel->micrometer scaling and the coarsening-related one). These transformations are propagated in all subsequent tasks (notably in zarr replication, MIP and labeling tasks).
  • As a check of the last point, we ran a typical workflow (zarr generation, illumination correction, per-FOV labeling, MIP, per-well 2D labeling at level 2) and we verified that all scales are correctly read and aligned with each other.
  • The function prepare_FOV_ROI_table (in lib_regions_of_interest.py) translates the DataFrame into an AnnData table, which is then stored within the zarr hierarchy (in plate.zarr/B/03/0/tables/FOV_ROI_table/). Other tasks can read from that table and extract useful information (about FOV ROIs or about pixel sizes, e.g. to automatically set the anisotropy parameter for 3D labeling).
  • The originally computed FOV ROIs are, by default, 3D shapes that extend across the entire Z depth.
  • During the replicate_zarr_structure_mip task, this table is converted to new ROIs which have the same 2D shape but with a Z extension of just one pixel size (e.g. 1 micrometer). Such table has the same shape as the standard one, and it is saved in the same position (but within the MIP zarr folder). At the moment we are not using it for anything, but it could be useful in the future.

Missing points:

  1. We should parse some more metadata, notably the image size (e.g. 2160x2560) and the number of Z planes (e.g. 10).

After point 1 is in-place, we can close this issue.

Other comments?

@jluethi
Copy link
Collaborator

jluethi commented Jul 21, 2022

Wow, awesome progress! I'll make sure to update the metadata parsing by the end of the week so we can indeed wrap this up then!

This status sounds great. One nitpick:

Other tasks can read from that table and extract useful information (about FOV ROIs or about pixel sizes, e.g. to automatically set the anisotropy parameter for 3D labeling).

I would agree for extracting ROIs, that is the point. But for pixel sizes, I think the Zarr file should be the universal ground truth. I don't know whether pixel sizes should remain in the ROI table (they aren't part of a ROI definition, right?). If the ROI is defined in physical units and we adopt some general ROI standard at some point, pixel sizes probably wouldn't be a part of it.

How does the current ROI table actually look like?

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 21, 2022

This is the content of a (current) ROI table:

Screenshot from 2022-07-21 14-11-09

The places where the pixel sizes are used are:

  1. When mapping this table to a set of pixel indices, for a given level and coarsening factor (an operation that takes place frequently, during tasks).
  2. For the specific case of anisotropy, in 3D labeling.

If that's preferable, we can remove pixel sizes from the ROI table. In that case we will have to retrieve them (for both cases 1 and 2) from the scale attribute of level 0 in a zattrs file.

@jluethi
Copy link
Collaborator

jluethi commented Jul 21, 2022

Great!

Yes, it makes a lot of sense that those two tasks need the pixel size information. But would be great to read this from the Zarr file. Because then, the logic of the tasks work on OME-Zarr files in general and read the metadata directly from the Zarr files.

Plus, this reinforces again that the Zarr file is the ground truth and there is just one place where one reads pixel size data from, which is the zarr metadata in the .zattrs (where also the viewer goes to read this metadata from).

jluethi referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
@jluethi
Copy link
Collaborator

jluethi commented Jul 21, 2022

I now updated the metadata_parsing.py. It now passes pre-commit, includes x_pixel, y_pixel & z_pixel (=> pixel dimensions of the images) and has relevant values cast to integers.

An example table looks like this:
Screenshot 2022-07-21 at 17 11 33

I pushed it to the tasks branch though. Can you move it over to tasks-ROI if you need it there @tcompa ?

Also, I introduced a new dependency: instead of using the xml library (which triggered a bandit B314 issue), I now use defusedxml. Where do I add the defusedxml dependency now? Does this go into pyproject.toml? Or into poetry.lock? And how does it need to be specified?

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 22, 2022

Where do I add the defusedxml dependency now? Does this go into pyproject.toml? Or into poetry.lock? And how does it need to be specified?

The simplest is the following:

  1. If you installed it with pip, remove it (pip uninstall defusedxml).
  2. poetry add defusedxml --group runner does the work for you: it checks that this module can be added and is consistent with the other dependencies, it installs the module, it updates the pyproject.toml file, it writes the poetry.lock file.
  3. After poetry add, you can commit and push the two poetry files (pyproject.toml and poetry.lock).

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 22, 2022
@tcompa
Copy link
Collaborator Author

tcompa commented Jul 22, 2022

I pushed it to the tasks branch though. Can you move it over to tasks-ROI if you need it there @tcompa ?

Sure. FYI, from tasks-ROIs I did:

git pull
git cherry-pick 7a39e2a675447d34764a2f9210fe00c51b9ff434
git push

and obtained 0888d509b20402a387a6a42a2e5cbb78da078441.

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 22, 2022

I now updated the metadata_parsing.py. It now passes pre-commit, includes x_pixel, y_pixel & z_pixel (=> pixel dimensions of the images) and has relevant values cast to integers.

Great, thanks!
I'll include this new information right now.

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 22, 2022

@jluethi, I'm confused by the table shown in #23 (which is the same I obtain, with your code).
For some reason it says there are 19 Z planes, although the dataset I'm using only has 10. Could you please double check the behavior on a 10-Z-planes dataset?

EDIT: it's most probably my fault, I should be doing something wrong somewhere.. no need you really go and double check, sorry

@jluethi
Copy link
Collaborator

jluethi commented Jul 22, 2022

Oh, maybe I created a fake metadata file for the test case that still has too many Z level in there. Let me have a quick check!

@jluethi
Copy link
Collaborator

jluethi commented Jul 22, 2022

Ah yes, my bad. I'll update the metadata file to be correct for the subset of data with have in the 2x2 subset

@jluethi
Copy link
Collaborator

jluethi commented Jul 22, 2022

Ok, should be fixed now @tcompa . The synthetic MeasurementData.mlf I created wasn't correctly tailored to the subset we use in the 2x2 test case. I now removed the metadata relating to planes 11-19. Now the parsing should produce a result that also fits the Z specification of that test.

Let me know if it works :)

@tcompa tcompa mentioned this issue Sep 2, 2022
tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 22, 2022
* Do not store pixel sizes in the FOV-ROI table;
* Always parse pixel sizes from the zattrs file, when needed;
* Add level arg to extract_zyx_pixel_sizes_from_zattrs.
@tcompa
Copy link
Collaborator Author

tcompa commented Jul 22, 2022

Ok, should be fixed now @tcompa . The synthetic MeasurementData.mlf I created wasn't correctly tailored to the subset we use in the 2x2 test case. I now removed the metadata relating to planes 11-19. Now the parsing should produce a result that also fits the Z specification of that test.

Let me know if it works :)

It does work now, thanks

@tcompa tcompa closed this as completed Jul 22, 2022
@tcompa
Copy link
Collaborator Author

tcompa commented Jul 22, 2022

After

  1. Including the new parsed data (x_pixel, y_pixel, z_pixel),
  2. Removing information on pixel sizes from the ROI table,
  3. Adding a feature for reading pixel sizes from zattrs file,
    I closed this issue.

Let's reopen it if something else comes up.

@jluethi jluethi transferred this issue from fractal-analytics-platform/fractal-client Sep 2, 2022
@jluethi jluethi moved this from Done to Done Archive in Fractal Project Management Oct 5, 2022
@tcompa tcompa added the Tables AnnData and ROI/feature tables label Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Tables AnnData and ROI/feature tables
Projects
None yet
Development

No branches or pull requests

2 participants