
Fix chunking bug with compound dtypes #1146

Open
wants to merge 24 commits into
base: main


pauladkisson
Member

@pauladkisson pauladkisson commented Nov 22, 2024

@pauladkisson pauladkisson marked this pull request as ready for review December 5, 2024 00:48
@pauladkisson pauladkisson requested a review from rly December 5, 2024 00:50
@pauladkisson
Member Author

@rly, lmk what you think

@pauladkisson
Member Author

@h-mayorquin, this is ready for review. Basically I use the hdmf.build.builders.BaseBuilder to check if a neurodata object would have a compound dtype. Most of the complexity is introduced by the need to find a match between the neurodata object and its location in the builder, which is outlined in the docstrings. Lmk what you think!
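
A minimal sketch of the idea described above, using hypothetical stand-in classes rather than hdmf's actual builders: walk a builder tree to the dataset at a given location, then treat a list-valued dtype as compound. The names `ToyGroup`, `ToyDataset`, and `find_dataset`, this toy version of `has_compound_dtype`, the example locations, and the convention that a compound dtype is represented as a list of field specs are all assumptions made for illustration. The PR's real matching logic is the one outlined in its docstrings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset builder."""

    name: str
    # Assumption for this sketch: a list of field specs means "compound".
    dtype: Union[str, list]


@dataclass
class ToyGroup:
    """Hypothetical stand-in for a group builder holding children by name."""

    name: str
    children: Dict[str, Union["ToyGroup", ToyDataset]] = field(default_factory=dict)


def find_dataset(root: ToyGroup, location: str) -> Optional[ToyDataset]:
    """Follow a slash-separated location from the root; return None if no dataset is there."""
    node = root
    for part in location.strip("/").split("/"):
        if not isinstance(node, ToyGroup) or part not in node.children:
            return None
        node = node.children[part]
    return node if isinstance(node, ToyDataset) else None


def has_compound_dtype(root: ToyGroup, location: str) -> bool:
    """Toy check: the dataset exists at the location and its dtype is a list of fields."""
    dataset = find_dataset(root, location)
    return dataset is not None and isinstance(dataset.dtype, list)


# Illustrative tree; the field names and dtypes are assumed, not taken from the PR.
pixel_mask = ToyDataset(
    "pixel_mask",
    dtype=[
        {"name": "x", "dtype": "uint32"},
        {"name": "y", "dtype": "uint32"},
        {"name": "weight", "dtype": "float32"},
    ],
)
raw = ToyDataset("raw", dtype="int16")
root = ToyGroup(
    "root",
    children={
        "processing": ToyGroup("processing", children={"pixel_mask": pixel_mask}),
        "acquisition": ToyGroup("acquisition", children={"raw": raw}),
    },
)

print(has_compound_dtype(root, "processing/pixel_mask"))  # True
print(has_compound_dtype(root, "acquisition/raw"))  # False
```

Returning `False` for a location that does not resolve to a dataset keeps the check safe to call on every candidate object; a real implementation might prefer to raise instead.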

@h-mayorquin
Collaborator

I did a first reading. Two things:

  • I think we should update hdmf in the pyproject to the latest version or greater.
  • Looking at the tests that were failing before, they seem related to the pixel mask. Can we create a more direct test of this in the dataset configuration tests? I think it would be better to have something more unit-test-like that would fail quicker if we break this (or if we want to refactor). Can we build something simpler with pixel-mask so we don't rely on the full segmentation conversion test?

@pauladkisson
Member Author

I think we should update hdmf in the pyproject to the latest version or greater.

I updated to include everything <4, since 4.0 has zarr-related issues. Should be able to add hdmf 4.0 soon -- see: #1191

@pauladkisson
Member Author

Looking at the tests that were failing before, they seem related to the pixel mask. Can we create a more direct test of this in the dataset configuration tests? I think it would be better to have something more unit-test-like that would fail quicker if we break this (or if we want to refactor). Can we build something simpler with pixel-mask so we don't rely on the full segmentation conversion test?

Definitely needs some unit tests. I'll put together some.

@pauladkisson
Member Author

From Meeting: move has_compound_dtype inside get_data_shape

@pauladkisson
Member Author

From Meeting: move has_compound_dtype inside get_data_shape

Actually, get_data_shape comes from hdmf.utils, so no way to move that in this PR. Also, if I remember correctly, they use get_data_shape in ways that would make incorporating this compound_dtype fix difficult.
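
To illustrate why the compound check matters for chunking, here is a toy sketch (not hdmf's `get_data_shape`) in which a shape inferred by recursing into nested sequences counts the fields of each record as an extra axis. `naive_shape` and `shape_for_chunking` are hypothetical helpers, the sample records are made up, and the behavior shown is an assumption about how the mismatch can arise, not a description of hdmf internals.

```python
from typing import Tuple


def naive_shape(data) -> Tuple[int, ...]:
    """Infer a shape by descending into the first element of each nested sequence."""
    shape = []
    while isinstance(data, (list, tuple)) and len(data) > 0:
        shape.append(len(data))
        data = data[0]
    return tuple(shape)


def shape_for_chunking(data, compound: bool) -> Tuple[int, ...]:
    """For compound records, keep only the record axis; the fields are not an axis."""
    shape = naive_shape(data)
    return shape[:1] if compound else shape


# Made-up pixel-mask-like records of (x, y, weight).
records = [(0, 0, 1.0), (0, 1, 0.5), (1, 1, 0.25), (2, 3, 0.75)]

print(naive_shape(records))  # (4, 3)
print(shape_for_chunking(records, compound=True))  # (4,)
```

A chunk shape derived from `(4, 3)` would have two dimensions, while a dataset of four compound records has only one, which is the kind of mismatch the linked issue describes.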

@pauladkisson
Member Author

@h-mayorquin, I added tests, so this should be good to go!


codecov bot commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.72%. Comparing base (96dfdff) to head (23c0c89).
Report is 32 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1146      +/-   ##
==========================================
- Coverage   90.69%   89.72%   -0.98%     
==========================================
  Files         129      129              
  Lines        8189     8415     +226     
==========================================
+ Hits         7427     7550     +123     
- Misses        762      865     +103     
Flag Coverage Δ
unittests 89.72% <100.00%> (-0.98%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
..._helpers/_configuration_models/_base_dataset_io.py 98.80% <100.00%> (+2.03%) ⬆️
...roconv/tools/nwb_helpers/_dataset_configuration.py 93.67% <100.00%> (+0.16%) ⬆️

... and 20 files with indirect coverage changes


Successfully merging this pull request may close these issues.

[Bug]: Recommended Chunk Shape doesn't take into account compound dtypes
2 participants