
[FIX] NormalizePhaseReference.transformed() should not return a Table #787

Merged
1 commit merged into Quasars:master on Jan 29, 2025

Conversation

stuart-cls
Member

_NormalizePhaseReferenceCommon.transformed() would return a Table instead of an array when the table had length 0

This is what the previously failing test (test_no_samples) is supposed to check, but I'm not sure why it was only failing in the latest dask branch?!
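For illustration, a minimal sketch of the kind of fix described here (a simplified, hypothetical structure, not the actual orange-spectroscopy implementation):

```python
import numpy as np


class _NormalizePhaseReferenceCommon:
    """Simplified sketch only -- not the real orange-spectroscopy code."""

    def transformed(self, data):
        if len(data) == 0:
            # Before the fix: an early `return data` handed the (empty) Table
            # back to callers that expect an ndarray.
            # After the fix: return an empty array with the expected column
            # count, so the return type is the same regardless of table length.
            return np.empty((0, len(data.domain.attributes)), dtype=float)
        # ... normal per-row normalization against the phase reference ...
        return np.asarray(data.X, dtype=float)
```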

@markotoplak
Collaborator

Thanks!

@stuart-cls
Member Author

I'm going to merge this, but I'm a little concerned that the existing test didn't catch this bug.

stuart-cls merged commit 886e9de into Quasars:master on Jan 29, 2025
20 checks passed
@markotoplak
Collaborator

This is why I had not merged it yet: I wanted to see where the difference came from. It is entirely possible that DaskTable and Table behave differently in edge cases.

If the table was really empty (i.e. had no rows), then the transformation should not do anything with the result of this function, so in theory it should not matter what it returns in these cases. But it seems different code paths get executed...

@stuart-cls
Member Author

Oops, sorry Marko! This function was definitely being called in the test, but I didn't follow it further to see what was being done with the result. One surprising thing is that I don't think the dataset tested here is actually loaded as a DaskTable. It may be a funny interaction between the thread caching machinery in Table and dask.distributed.LocalCluster, which also uses threading.

@markotoplak
Collaborator

The dask branch also has refactored table transformations. Perhaps something is different there. :)

@markotoplak
Collaborator

I checked. The dask branch code does not make any exception for zero-length tables when transforming, so it still expects arrays of the proper sizes (with 0 rows :) ) as transformation outputs.

I am not completely sure whether that was a smart choice. It certainly keeps the transformation code shorter, because otherwise we would have to write a new function that makes "blank" zero-row arrays of certain sizes and types, but it also unmasks bugs like this one.
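A hypothetical helper of the kind mentioned above, if the transformation code were instead to special-case empty tables (the name and signature are illustrative only):

```python
import numpy as np


def blank_rows(n_columns, dtype=float):
    # Hypothetical helper: build a "blank" zero-row array of a given width and
    # dtype, so transformation outputs keep consistent shapes for empty tables.
    return np.empty((0, n_columns), dtype=dtype)


blank_rows(5).shape                                 # (0, 5)
np.vstack([blank_rows(5), np.ones((2, 5))]).shape   # (2, 5) -- stacks fine
```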

@stuart-cls
Member Author

I think it's better to keep the .transformed() functions simple and free of tricky conditionals (as fixed in this PR). It doesn't make sense to me to have the function return completely different types depending on the Table's length.

I suppose the issue is that if the underlying transformation code being called doesn't support zero-row arrays, then we're back to having to handle that in the transformed() function.

Could there be some typing to indicate what the table transformation code is expecting? I guess that wouldn't help with the array size issue.
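As a sketch only of how such an expectation could be spelled out (hypothetical class name; and, as noted, a plain annotation cannot express the row-count constraint):

```python
import numpy as np
from Orange.data import Table


class SomeTransformCommon:
    # Hypothetical class, for illustration only.
    def transformed(self, data: Table) -> np.ndarray:
        # The return annotation documents the expected type, but it cannot
        # express the shape constraint (len(data) rows, matching column
        # count), which is the part that actually bit us here.
        ...
```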
