Added round option to recording.astype #2513

Merged
merged 7 commits into from
Mar 13, 2024

Conversation

DradeAW
Contributor

@DradeAW DradeAW commented Feb 26, 2024

By default False to mimic numpy's astype

@@ -19,6 +21,7 @@ def __init__(
         self,
         recording,
         dtype=None,
+        round: bool = False,
Member

I would put True by default. no ?

Contributor Author

I also think it makes more sense for spike-sorting, but @h-mayorquin made a good point that users might expect the same behaviour as numpy.

I'm fine with both

Member

Fine with default True. I think that's the preferred default behavior even if it deviates from numpy astype

Collaborator

@h-mayorquin h-mayorquin Mar 13, 2024

Wait, so the default ended up being None, which means that you round only when integer. Was this discussed outside of the PR, or did it just get through as a non-issue?

Member

Yes, because rounding is what we normally want in this case. Even though that deviates from numpy behavior, that's better for ephys processing :)

Collaborator

I think your answering with a question instead of a statement makes it more difficult for me to understand. I am thinking as I write, but I feel that something is missing.

Let's make a list to see if I can get you.

  • recording dtype is int and you cast to float -> rounding by default should not make a difference.
  • recording dtype is float and you cast to int -> rounding by default was the reason this was added.
  • recording dtype is int and you cast to int -> rounding by default should not make a difference.
  • recording dtype is float and you cast to float -> this would be problematic if someone wanted to cast float64 to float32.

Is that last case the problematic one? Am I understanding correctly?
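
The four cases listed above can be checked directly with numpy. This is only a sketch: `cast` is a hypothetical helper written for illustration, not the SpikeInterface implementation, and the sample values are made up.

```python
import numpy as np


def cast(values, dtype, round_first):
    """Hypothetical helper: optionally round, then cast like numpy's astype."""
    values = np.asarray(values)
    if round_first:
        values = np.round(values)
    return values.astype(dtype)


float_traces = np.array([-1.7, -0.6, 0.6, 1.7], dtype="float64")
int_traces = np.array([-2, -1, 1, 2], dtype="int16")

# int -> float: rounding first makes no difference.
int_to_float_same = np.array_equal(
    cast(int_traces, "float32", True), cast(int_traces, "float32", False)
)

# float -> int: plain astype truncates toward zero, rounding goes to the nearest integer.
truncated = cast(float_traces, "int16", False)  # [-1, 0, 0, 1]
rounded = cast(float_traces, "int16", True)     # [-2, -1, 1, 2]

# int -> int: rounding first makes no difference.
int_to_int_same = np.array_equal(
    cast(int_traces, "int32", True), cast(int_traces, "int32", False)
)

# float -> float: rounding changes the values, which is rarely what a
# float64 -> float32 cast is meant to do.
float_plain = cast(float_traces, "float32", False)
float_rounded = cast(float_traces, "float32", True)  # [-2., -1., 1., 2.]
```

Only the second and fourth cases are sensitive to the rounding flag, which is why the default for the float-to-float case is the point under discussion.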

Member

Yes. In that case the logic would return a float64 but rounded. Do you agree?

Collaborator

Right. Thanks for explaining.

I propose the following:

I think we can simplify this by making round (preferably called round_float_to_int to be fully descriptive) only True and False.

It should be True by default, as you guys wanted, and it can only take two values:

  • When True and the input is float, we round before casting to int. If the input is int or the output is float, we don't do anything and the function behaves as numpy astype.
  • When False it defaults to numpy astype behavior for all the cases.

This fulfils the original purpose and has some advantages over the current approach:

Advantages:

  • Easier to document. The behavior is fully described by what I wrote above. We can enhance this by adding examples of why we did this (i.e. inclusive of 0 in np.histogram).
  • Avoids having the third unnecessary branch of round=True that we have now.
  • Simplifies the type of round / round_float_to_int by making it boolean only.
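
A minimal sketch of the boolean-only proposal, assuming a standalone function rather than the recording class. The name `astype_proposed` is hypothetical, and `round_float_to_int` is the parameter name suggested in this comment, not something that exists in the merged code.

```python
import numpy as np


def astype_proposed(traces, dtype, round_float_to_int=True):
    """Sketch of the proposal: round only for float -> int casts (hypothetical)."""
    traces = np.asarray(traces)
    dtype = np.dtype(dtype)
    float_to_int = np.issubdtype(traces.dtype, np.floating) and np.issubdtype(
        dtype, np.integer
    )
    if round_float_to_int and float_to_int:
        traces = np.round(traces)
    # In every other case this is exactly numpy's astype.
    return traces.astype(dtype)


x = np.array([-1.7, -0.6, 0.6, 1.7])
rounded_ints = astype_proposed(x, "int16")                              # [-2, -1, 1, 2]
truncated_ints = astype_proposed(x, "int16", round_float_to_int=False)  # [-1, 0, 0, 1]
as_float32 = astype_proposed(x, "float32")                              # never rounded
```

With this shape there is no way to request rounding while keeping a float dtype, which is the trade-off raised in the reply that follows.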

Contributor Author

If you change it to round_float_to_int, then you remove the possibility that someone, for some reason, would want to round but keep it as a float.

I don't see what the problem would be when converting from float64 to float32?*
By default there would be no rounding (as there is no need), but the user can explicitly ask for it if they want to.
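
For comparison, here is a sketch of the three-valued behaviour as it is described in this thread: None rounds only when the target dtype is integer, True always rounds, False never rounds. The function `astype_tristate` is hypothetical and is an assumption about the semantics, not a copy of the merged code; the parameter is named `round` to match the diff, which shadows the Python builtin inside the function.

```python
import numpy as np


def astype_tristate(traces, dtype, round=None):
    """Sketch of the None/True/False behaviour discussed here (assumed semantics)."""
    traces = np.asarray(traces)
    dtype = np.dtype(dtype)
    if round is None:
        # Default: round only when the target dtype is an integer type.
        round = np.issubdtype(dtype, np.integer)
    if round:
        traces = np.round(traces)
    return traces.astype(dtype)


x = np.array([-1.7, -0.6, 0.6, 1.7])
default_to_int = astype_tristate(x, "int16")                   # rounded: [-2, -1, 1, 2]
default_to_float = astype_tristate(x, "float32")               # not rounded
forced_round_float = astype_tristate(x, "float32", round=True)  # [-2., -1., 1., 2.]
numpy_like = astype_tristate(x, "int16", round=False)          # truncated: [-1, 0, 0, 1]
```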

Collaborator

@h-mayorquin h-mayorquin Mar 15, 2024

Yeah, I want to reduce the generality of the code to make it easier. Less general code is good. Easier to understand, maintain docstrings (which are lacking in this case), and write tests if needed. Plus, in this case I can't really see a use case for the generalization, that is, it does not buy us anything other than generality for the sake of generality. If the use case comes to be, then we will be in a better position to write a general implementation with a concrete use case in mind.

The cost of generality, on the other hand? It took me some time to understand what each of the options does and where and why I would want them. Not all cases are documented, which illustrates my point that it is more work. We can save a future developer time by making this more concrete and save ourselves the work of documenting cases that are unlikely to be used. Plus, it is way easier to document something that is more concrete. Case in point: try writing a docstring for the more general case of the CAR preprocessor; no one has done it for a reason.

@alejoe91 alejoe91 added the preprocessing Related to preprocessing module label Feb 29, 2024
@@ -19,20 +21,26 @@ def __init__(
         self,
         recording,
         dtype=None,
+        round: bool | None = None,
Member

Can you add a small doc please ?

Contributor Author

Done :)

Collaborator

The case of False which is recovering numpy default behavior is not documented.

For future developers maybe we can add a comment saying that we do this because np.histogram is inclusive at 0 and we do this to counteract that.
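
The histogram remark is not expanded on in this thread. One plausible reading, stated here as an assumption, is that a plain float-to-int cast truncates toward zero, so every value in (-1, 1) lands on 0 and the zero bin of any histogram of the cast data collects twice as many samples as its neighbours. The sketch below uses a made-up evenly spaced signal to show the effect, and counts values directly instead of calling np.histogram.

```python
import numpy as np

# 600 evenly spaced samples in (-3, 3); none falls on an integer or on a .5 boundary.
samples = (np.arange(600) - 299.5) / 100.0

truncated = samples.astype("int64")          # numpy default: truncate toward zero
rounded = np.round(samples).astype("int64")  # round to nearest, then cast

zero_truncated = int(np.count_nonzero(truncated == 0))  # samples in (-1, 1)
one_truncated = int(np.count_nonzero(truncated == 1))   # samples in (1, 2)
zero_rounded = int(np.count_nonzero(rounded == 0))      # samples in (-0.5, 0.5)
one_rounded = int(np.count_nonzero(rounded == 1))       # samples in (0.5, 1.5)
```

With truncation the zero value is over-represented (200 samples against 100 for the value 1), while rounding first gives both values the same count.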

@alejoe91 alejoe91 merged commit 62f5199 into SpikeInterface:main Mar 13, 2024
11 checks passed
@DradeAW DradeAW deleted the astype_rounding branch March 13, 2024 09:48
Labels
preprocessing Related to preprocessing module
4 participants