Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce mode_any aggregation #1793

Open
MarcoGorelli opened this issue Jan 11, 2025 · 3 comments
Open

feat: introduce mode_any aggregation #1793

MarcoGorelli opened this issue Jan 11, 2025 · 3 comments

Comments

@MarcoGorelli
Copy link
Member

We currently have Expr.mode and Series.mode, like Polars does. What's not great about them is that they don't aggregate

I'd like to suggest a departure from Polars here. We introduce Expr.mode_any and Series.mode_any, which actually aggregate, but make no guarantees about which mode will be returned. For example, if a column contains values [1, 2, 2, 3, 3, 4], then it may return 2 or it may return 3.

@MarcoGorelli
Copy link
Member Author

Alternatively, per #1657, if we disallow length-changing-non-aggregating expressions which aren't followed by aggregations, then we might be able to get away with just Expr.mode, and just require that it be followed by an aggregation, .e. nw.col('a').mode().first(order_by='id'). We could even introduce Expr.any_value and allow nw.col('a').mode().any_value()

@FBruzzesi
Copy link
Member

Returning a list would definitly be consistent with polars mode behavior in a group by context. We might always return a list type (and enhance a bit the list namespace with .first, .last, .__getitem__)

@MarcoGorelli
Copy link
Member Author

I'm not sold on returning lists here for expressions which don't aggregate, and I think Polars is a bit internally inconsistent here. .list is also a bit problematic for some backends, like pandas non-pyarrow-backed, whereas with nw.col('a').mode().any_value() / nw.col('a').mode().first(order_by='id') we can bypass the list and have a return value which works well for all implementations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants