wk-libs: Add Dataset.add_layer_from_images() using pims #741

Merged: 16 commits merged on Jun 22, 2022

Contributor

@jstriebel jstriebel commented Jun 1, 2022

Description:

Adds Dataset.add_layer_from_images(), which converts many image and image-stack formats to a wk-compatible layer.

For usage, please check the added test. The test can also be executed standalone, which uploads the resulting datasets to the webknossos instance configured in your .env, so the outputs can be inspected manually. I'd recommend using a local wk instance for this.

I moved the respective test files to the webknossos package and linked them from wkcuber, since changes in webknossos also trigger CI runs for wkcuber, but not the other way around.

I also adapted the warning behaviour for multiprocessing in the cluster-tools. Should we also adapt this for the other executors? That might be out of scope for this PR, though.

Issues:

Todos:

  • Updated Changelog
  • Added / Updated Tests
  • Updated Documentation
  • Considered adding this to the Examples
  • Add issue with detailed comparison to wkcuber conversion features (e.g. offset is currently missing, maybe some specialized image formats): Expand add_layer_from_images to cubing script #748

@jstriebel jstriebel self-assigned this Jun 1, 2022
@jstriebel
Contributor Author

@philippotto I guess this is ready for review now. Some minor parts are still missing, see TODO, but test + code is ready. Let me know if we should do a walkthrough together. I'd also be very happy if you have a good idea how to simplify the logic in pims_images.py.

@jstriebel jstriebel requested a review from philippotto June 1, 2022 12:38
@jstriebel
Contributor Author

PS: The new tests add ~5 minutes. Probably we should look into parallelizing tests or reducing the time somehow. The downloaded datasets might also be cached in some folder (locally gitignored in the repo and via extra caching steps in the CI).
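
The caching idea could look roughly like the sketch below. It is only an illustration: `cached_download`, the cache directory and the injected `fetch` callable are hypothetical names, not part of the existing test suite.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_download(url: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return the local path for `url`, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Derive a stable file name from the URL so repeated runs hit the same file.
    target = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if not target.exists():
        # Cache miss: download once and store the bytes for later runs.
        target.write_bytes(fetch(url))
    return target
```

In practice `fetch` could wrap `urllib.request.urlopen`, and the cache directory would be gitignored locally and restored by a caching step in the CI.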

Member

@philippotto philippotto left a comment

Great stuff! I already reviewed most of the PR and left some feedback, but maybe we can have a call about the pims_images.py module, as I think I could use some kind of introduction before reviewing it :)

webknossos/webknossos/dataset/view.py (outdated, resolved)
webknossos/webknossos/dataset/dataset.py (outdated, resolved)
webknossos/webknossos/dataset/dataset.py (resolved)
webknossos/webknossos/dataset/dataset.py (resolved)
Comment on lines 833 to 835
# if a slice is larger than the others it might happen that
# a new chunk is partially written, leading to a warning,
# which is ignored in this context
Member

wouldn't this even happen if all slices have the same size but that size is not shard-aligned? also, why is the warning hidden? the user could set compress=False to avoid the performance penalty, no?

Contributor Author

I think the first case should work, since the first write sets the new bounding box, which then fits all subsequent writes. Writes must be either shard-aligned or bounding-box-aligned (per bbox border). I think this warning is not useful at all in this case, since it only happens at the dataset borders, where the user cannot change much. The only important warning is about the different sizes, which is handled elsewhere. The performance penalty should be negligible, since it only affects the borders of the dataset. It's not that the general chunk size doesn't fit the blocks during iteration.
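
The alignment rule described here can be sketched as follows. This is only an illustration under stated assumptions, not the webknossos implementation: `is_write_aligned`, the tuple-based boxes and the shard shape of 32 are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[int, int, int]


def is_write_aligned(
    write_topleft: Vec3,
    write_bottomright: Vec3,
    bbox_topleft: Vec3,
    bbox_bottomright: Vec3,
    shard_shape: Vec3,
) -> bool:
    """Check, per border, that the write is shard-aligned or ends on the bbox border."""
    for axis in range(3):
        low_ok = (
            write_topleft[axis] % shard_shape[axis] == 0
            or write_topleft[axis] == bbox_topleft[axis]
        )
        high_ok = (
            write_bottomright[axis] % shard_shape[axis] == 0
            or write_bottomright[axis] == bbox_bottomright[axis]
        )
        if not (low_ok and high_ok):
            return False
    return True


# Hypothetical numbers: a 1500 x 1500 x 64 layer with 32^3 shards.
bbox = ((0, 0, 0), (1500, 1500, 64))
shard = (32, 32, 32)
# Ends on the bbox border in x/y and on a shard border in z.
print(is_write_aligned((0, 0, 0), (1500, 1500, 32), *bbox, shard))  # True
# A larger slice overshoots the bbox in x without being shard-aligned.
print(is_write_aligned((0, 0, 32), (1610, 1500, 64), *bbox, shard))  # False
```

The second call corresponds to the "slice is larger than the others" case: the write no longer ends on the existing bounding-box border, so a chunk is written partially.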

Member

ok, please explain this in the code comment too then :)

Contributor Author

I expanded the comment, I hope it makes more sense now 👍
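
Hiding a warning only for one specific block, as discussed in this thread, can be done with the standard-library `warnings` module. The following is a minimal sketch; `write_border_chunk` and the message text are hypothetical stand-ins, not the actual webknossos code.

```python
import warnings


def write_border_chunk() -> str:
    # Hypothetical stand-in for a write that touches a partially filled chunk.
    warnings.warn("partial chunk write at dataset border", RuntimeWarning)
    return "written"


def write_ignoring_border_warning() -> str:
    # Suppress only this specific warning, and only inside this block;
    # the previous filter state is restored when the block exits.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message="partial chunk write",
            category=RuntimeWarning,
        )
        return write_border_chunk()
```

Matching on both `message` and `category` keeps other warnings, such as the one about differing slice sizes, visible to the user.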

webknossos/webknossos/dataset/dataset.py (outdated, resolved)
webknossos/webknossos/dataset/dataset.py (outdated, resolved)
webknossos/webknossos/dataset/dataset.py (outdated, resolved)
Member

@philippotto philippotto left a comment

awesome stuff! didn't look at the tests yet, but I already have some feedback :)

webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
@jstriebel
Contributor Author

@philippotto Thanks a lot for the review! I think I addressed all points, either in the newest commits or by commenting there directly. Please check my newest changes again, thanks 🙏

@jstriebel
Contributor Author

ping @philippotto

Member

@philippotto philippotto left a comment

Sorry for the late review! I only left some smaller comments (mostly about code comments) :)

webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
webknossos/webknossos/dataset/_utils/pims_images.py (outdated, resolved)
webknossos/webknossos/dataset/dataset.py (outdated, resolved)
webknossos/tests/test_from_images.py (resolved)
@jstriebel jstriebel enabled auto-merge (squash) June 22, 2022 16:08
@jstriebel jstriebel merged commit 9b2955a into master Jun 22, 2022
@jstriebel jstriebel deleted the add-layer-from-images branch June 22, 2022 16:37
Linked issue (may be closed by this pull request): Add create_from_image_sequence
2 participants