
Making coordinateTransformations flexible for axes & comply with spec (include channels transformation) #420

Closed · jluethi opened this issue Jun 9, 2023 · 14 comments

@jluethi (Collaborator) commented Jun 9, 2023

This is about whether we're putting the right number of elements into the coordinateTransformations in our OME-Zarr files, and whether we're reading them out flexibly.

I have been working more with the OME-Zarr files from the MD converter and ran into an issue with our MIP conversion: the convert_ROI_table_to_indices function in lib_regions_of_interest assumes that coordinateTransformations are always z, y, x.

There are always 3 entries in the scale transformations of our OME-Zarr files; see this example:

```json
{
    "multiscales": [
        {
            "axes": [
                {
                    "name": "c",
                    "type": "channel"
                },
                {
                    "name": "z",
                    "type": "space",
                    "unit": "micrometer"
                },
                {
                    "name": "y",
                    "type": "space",
                    "unit": "micrometer"
                },
                {
                    "name": "x",
                    "type": "space",
                    "unit": "micrometer"
                }
            ],
            "datasets": [
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                0.1625,
                                0.1625
                            ],
                            "type": "scale"
                        }
                    ],
                    "path": "0"
                },
```

This example does have 4 axes, though. Thus, according to the spec (both the latest version and v0.4), the scale transformation should contain one entry per axis, i.e. in this case entries for c, z, y and x.
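To make the mismatch concrete, here is a minimal sketch of a consistency check (the .zattrs path is hypothetical; this is not code from fractal-tasks-core):

```python
# Minimal sketch: verify that every scale transformation has one entry
# per axis, as the OME-NGFF spec requires. The path is a placeholder.
import json

with open("plate.zarr/B/03/0/.zattrs") as f:
    multiscale = json.load(f)["multiscales"][0]

n_axes = len(multiscale["axes"])
for dataset in multiscale["datasets"]:
    for t in dataset["coordinateTransformations"]:
        if t["type"] == "scale" and len(t["scale"]) != n_axes:
            print(
                f"Dataset {dataset['path']}: scale has {len(t['scale'])}"
                f" entries, but there are {n_axes} axes"
            )
```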

@tcompa Any idea why we have only 3 entries in our coordinate transformations? In some early issues, I found examples where we were looking at 4 entries (e.g. in #25). But shortly thereafter, we already had examples with 3: #23.

As part of handling axes more flexibly:

  • We should ensure we always write coordinateTransformations that cover all axes
  • We will need to adapt the loading of coordinate transformations, for example in extract_zyx_pixel_sizes in lib_zattrs_utils, and maybe in convert_ROI_table_to_indices and other places (see the sketch below)
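As a sketch of what such flexible loading could look like (a hypothetical helper, not existing library code), one could match scale entries against the axes names instead of assuming a fixed layout:

```python
# Sketch: pick out the z/y/x scale entries by matching them against the
# axes names, instead of assuming a fixed zyx layout. Hypothetical helper.
def get_zyx_pixel_sizes(multiscale: dict, level: int = 0) -> list:
    axes_names = [ax["name"] for ax in multiscale["axes"]]
    for t in multiscale["datasets"][level]["coordinateTransformations"]:
        if t["type"] == "scale":
            return [t["scale"][axes_names.index(name)] for name in "zyx"]
    raise ValueError(f"No scale transformation found for level {level}")
```
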
@jluethi (Collaborator, Author) commented Jun 9, 2023

An issue that will appear in some tasks while we have this problem:

```
ValueError                                Traceback (most recent call last)
Cell In[8], line 18
     14 full_res_pxl_sizes_zyx = extract_zyx_pixel_sizes(
     15     f"{zarrurl_old}/.zattrs", level=0
     16 )
     17 # Create list of indices for 3D FOVs spanning the entire Z direction
---> 18 list_indices = convert_ROI_table_to_indices(
     19     FOV_ROI_table,
     20     level=0,
     21     coarsening_xy=coarsening_xy,
     22     full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,
     23 )

Cell In[5], line 32, in convert_ROI_table_to_indices(ROI, full_res_pxl_sizes_zyx, level, coarsening_xy, cols_xyz_pos, cols_xyz_len, reset_origin)
     30 # Set pyramid-level pixel sizes
     31 print(full_res_pxl_sizes_zyx)
---> 32 pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx
     33 prefactor = coarsening_xy**level
     34 pxl_size_x *= prefactor

ValueError: too many values to unpack (expected 3)
```

A very manual workaround is to use full_res_pxl_sizes_zyx[-3:] instead of full_res_pxl_sizes_zyx to cut off the additional entries when we load the transformations in the MIP task. But that clearly isn't the way we actually want to go.

jluethi added a commit that referenced this issue Jun 9, 2023
@jluethi (Collaborator, Author) commented Jun 9, 2023

Actually, for the moment I added a workaround to the extract_zyx_pixel_sizes function that avoids running into crashes in our tasks: it only returns the last 3 entries of the scale transformation list.
This isn't the general solution of actually having correct metadata. But extract_zyx_pixel_sizes should only return the zyx scaling anyway, not any other scaling.
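In essence, the workaround amounts to the following (a simplified sketch, not the exact implementation):

```python
import json

# Simplified sketch of the workaround: whatever length the scale array
# has, only the trailing three (z, y, x) entries are returned.
def extract_zyx_pixel_sizes(zattrs_path: str, level: int = 0) -> list:
    with open(zattrs_path) as f:
        datasets = json.load(f)["multiscales"][0]["datasets"]
    for t in datasets[level]["coordinateTransformations"]:
        if t["type"] == "scale":
            return t["scale"][-3:]  # cut off any leading c/t entries
    raise ValueError(f"No scale transformation found for level {level}")
```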

This is now part of #403. All tests still pass. With this workaround, we can process 3D MD data, and our normal examples (e.g. example 01) also still run.

@jluethi (Collaborator, Author) commented Jun 9, 2023

There was a second place in lib_zattrs_utils where we currently assume that coordinateTransformations always consist of z, y, x: the rescale_datasets function. I have now switched it so that it also works with czyx (or tczyx). We should of course make this more general; this fix just ensures that our tasks don't break on OME-Zarrs with czyx coordinateTransformations.
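Roughly, the fix boils down to rescaling only the trailing y/x entries, so that any leading c/t (and z) entries pass through untouched (a simplified sketch, not the exact code):

```python
from copy import deepcopy

# Simplified sketch: multiply the trailing y/x scale entries by the
# coarsening prefactor; leading c/t/z entries are left untouched, so this
# works for zyx, czyx and tczyx alike.
def rescale_datasets(datasets: list, coarsening_xy: int, reference_level: int) -> list:
    prefactor = coarsening_xy**reference_level
    new_datasets = deepcopy(datasets)
    for dataset in new_datasets:
        for t in dataset["coordinateTransformations"]:
            if t["type"] == "scale":
                t["scale"][-2] *= prefactor  # y
                t["scale"][-1] *= prefactor  # x
    return new_datasets
```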

tcompa added a commit that referenced this issue Jun 12, 2023
@tcompa (Collaborator) commented Jun 12, 2023

> There was a second place in lib_zattrs_utils where we currently assume that coordinateTransformations always consist of z, y, x: the rescale_datasets function. I have now switched it so that it also works with czyx (or tczyx). We should of course make this more general; this fix just ensures that our tasks don't break on OME-Zarrs with czyx coordinateTransformations.

I made this slightly more general with 63e04d2; let me know if it makes sense to you.

@jluethi (Collaborator, Author) commented Jun 12, 2023

@tcompa This indeed looks like a more general solution. I'll try to test it this afternoon. I think it should work, but there are some edge cases where we were, e.g., relying on the fact that the scale arrays have 3 entries, which is not always the case.

But if this works, it would already be a good step towards dropping this assumption.

@tcompa (Collaborator) commented Jun 12, 2023

> I think it should work, but there are some edge cases where we were, e.g., relying on the fact that the scale arrays have 3 entries

Shall we take this opportunity to make everything a bit more consistent?
For the moment, that means always having 4 (CZYX) items in any scale transformation (which will later be superseded by #150).

The most relevant modules dealing with scale transformations are:

  1. lib_zattrs_utils.py, which is currently under control (even if the fix in extract_zyx_pixel_sizes is clearly a workaround)
  2. The create-ome-zarr tasks, where we are currently writing scale transformations of length 3. I will now switch to always including a 1 for the channel axis in the scale transformation, since for the moment we always write CZYX arrays (see the sketch after this list).
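For point 2, the written metadata would then look roughly like this (placeholder pixel sizes and variables; not the actual task code):

```python
# Sketch: length-4 (CZYX) scale transformation for a pyramid level, with
# a leading 1.0 for the channel axis. All values here are placeholders.
pixel_size_z, pixel_size_y, pixel_size_x = 1.0, 0.1625, 0.1625
coarsening_xy, level = 2, 1
prefactor = coarsening_xy**level

dataset = {
    "path": str(level),
    "coordinateTransformations": [
        {
            "type": "scale",
            "scale": [
                1.0,  # channel axis is never scaled
                pixel_size_z,
                pixel_size_y * prefactor,
                pixel_size_x * prefactor,
            ],
        }
    ],
}
```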

I cannot easily find other relevant modules to look at, but I can try to set up a couple more tests to spot possible issues:

```console
$ git grep scale fractal_tasks_core | grep -v upscale | grep -v multiscale | grep -v escale | grep scale
fractal_tasks_core/lib_zattrs_utils.py:            if t["type"] == "scale":
fractal_tasks_core/lib_zattrs_utils.py:                pixel_sizes = t["scale"]
fractal_tasks_core/lib_zattrs_utils.py:            f" no scale transformation found for level {level}"
fractal_tasks_core/lib_zattrs_utils.py:    Given a set of datasets (as per OME-NGFF specs), update their "scale"
fractal_tasks_core/lib_zattrs_utils.py:            if t["type"] == "scale":
fractal_tasks_core/lib_zattrs_utils.py:                new_t["scale"][-2] = new_t["scale"][-1] * prefactor
fractal_tasks_core/lib_zattrs_utils.py:                new_t["scale"][-1] = new_t["scale"][-1] * prefactor
fractal_tasks_core/tasks/create_ome_zarr.py:                                    "type": "scale",
fractal_tasks_core/tasks/create_ome_zarr.py:                                    "scale": [
fractal_tasks_core/tasks/create_ome_zarr_multiplex.py:                                    "type": "scale",
fractal_tasks_core/tasks/create_ome_zarr_multiplex.py:                                    "scale": [
```

@tcompa (Collaborator) commented Jun 12, 2023

> @tcompa Any idea why we have only 3 entries in our coordinate transformations?

I have no idea, TBH.

> The create-ome-zarr tasks, where we are currently writing scale transformations of length 3. I will now switch to always including a 1 for the channel axis in the scale transformation, since for the moment we always write CZYX arrays.

See a0ab7e7

@jluethi (Collaborator, Author) commented Jun 12, 2023

> Shall we take this opportunity to make everything a bit more consistent?

Yes, that's great! I was a bit hesitant about opening too broad a topic here. But this sounds like we can tackle it in a contained manner, generalize our processing to always create the CZYX coordinate transformations and then test that this does not interfere with our existing workflows.

If this more general way works well for the existing examples we have, I'd be in favor of merging it into the 0.10.0 release candidates.
I'll need to update the MD parsing tasks to be 0.10.0 (& server 1.3.0) compatible and then test them with these new changes. But I'd open new issues if we run into problems there.

@tcompa (Collaborator) commented Jun 12, 2023

> Yes, that's great! I was a bit hesitant about opening too broad a topic here. But this sounds like we can tackle it in a contained manner, generalize our processing to always create the CZYX coordinate transformations and then test that this does not interfere with our existing workflows.

Even if we don't go too far in flexibility (and for sure we are not going to address #150 now), at least we should be aware of how things are now. I've added a unit test for the lib_zattrs_utils.py functions, which is meant as a showcase of how things work now.

> and then test that this does not interfere with our existing workflows.

> I'll need to update the MD parsing tasks to be 0.10.0 (& server 1.3.0) compatible and then test them with these new changes. But I'd open new issues if we run into problems there.

The CI of #403 is passing (locally), but it's always best to double-check things by running an actual workflow. And I have started to update things in fractal-analytics-platform/fractal-demos#54, so that testing examples (at least example 01) should be relatively smooth. I'll also make frequent fractal-tasks-core prereleases, so that we can speed up testing.

> If this more general way works well for the existing examples we have, I'd be in favor of merging it into the 0.10.0 release candidates.

I still need to review the tasks part of #403, but for the moment I have nothing to add about the transformations part. I'd push for having it in 0.10.0, for sure.

@tcompa (Collaborator) commented Jun 12, 2023

Note that the Zarr examples on Zenodo also have the wrong number of items in the scale transformation. This is currently addressed in tests via 8657a6f, but at some point we should update those files.

@jluethi (Collaborator, Author) commented Jun 28, 2023

@tcompa An updated Zenodo dataset is now available here: https://zenodo.org/record/8091756

That should replace the old test dataset and we can get rid of the workarounds in the tests again :)

Do you also need the 2x2 datasets in the tests somewhere? I'm updating this atm, will upload in a bit

tcompa added a commit that referenced this issue Jun 29, 2023
1. Update DOI and zenodo link;
2. Revert workaround introduced in 8657a6f.
tcompa added a commit that referenced this issue Jun 29, 2023
@tcompa (Collaborator) commented Jun 29, 2023

> @tcompa An updated Zenodo dataset is now available here: https://zenodo.org/record/8091756

Thanks!

> That should replace the old test dataset and we can get rid of the workarounds in the tests again :)

Great, done with #454.

> Do you also need the 2x2 datasets in the tests somewhere?

No. The ones we use are:

Then this is all good.

@tcompa (Collaborator) commented Jun 29, 2023

Is there anything left to do on this issue, or should it all be deferred to #150?

@jluethi (Collaborator, Author) commented Jun 29, 2023

The rest is deferred to #150; it seems to work decently now :)
