[pipeline] Use conj! as the default pipeline rf #191

alexander-yakushev · 2024-10-08T17:50:16Z

Another small, non-invasive improvement. This works best when many rows are processed, but it shouldn't add much overhead when only one or a few rows are selected.

camsaul · 2024-10-14T17:29:32Z

Not sure this is really "non-invasive" if everything using query-reducible and select-reducible and the like has to be updated to use transduce instead of reduce (as evidenced by the tests you had to update)... I'm fairly certain this is going to require us to make changes in upstream Metabase code which makes it a breaking change. I'm honestly not sure the minor (?) performance benefits we get here are worth making the code harder to use correctly. (What are the performance benefits, btw? I would still love to see some benchmarks for these PRs)

alexander-yakushev · 2024-10-14T17:46:23Z

I understand what you are saying. However, the tests that I've changed, for example, extend the method pipeline/transduce-execute-with-connection. I'm not sure there is a hard rule regarding this, but I would expect anything with the word "transduce" in the name to call the 1-arity of the reducing function. Otherwise, it's violating the "transduce contract," so to speak.

EDIT: this may be relevant https://clojure.org/reference/transducers#_creating_transducible_processes.

"A completing process must call the completion operation on the final accumulated value exactly once."

It is possible to rewrite this PR without having to resort to 1-arity if the compatibility here is crucial.
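To make the compatibility concern concrete, here is a minimal sketch of a transient-based reducing function. The name `conj!-rf` is hypothetical and is not taken from this PR; it only illustrates why a caller that uses plain `reduce` would have to run the completion arity itself, while `transduce` does so automatically.

```clojure
;; Hypothetical rf for illustration only; not the code from this PR.
(defn conj!-rf
  ([] (transient []))           ; init: start with an empty transient vector
  ([acc] (persistent! acc))     ; completion: freeze into a persistent vector
  ([acc x] (conj! acc x)))      ; step: append to the transient

;; transduce calls the 0-arity for init and the 1-arity once at the end.
(def via-transduce (transduce (map inc) conj!-rf [1 2 3]))
;; => [2 3 4]

;; reduce never calls the 1-arity, so the raw result is still a transient...
(def raw (reduce conj!-rf (conj!-rf) [1 2 3]))
(def raw-is-transient? (instance? clojure.lang.ITransientCollection raw))

;; ...and the caller has to complete it explicitly.
(def via-reduce (conj!-rf raw))
;; => [1 2 3]
```

Under this assumption, any caller that currently does `reduce` with the default rf would start receiving a transient unless it switches to `transduce` or invokes the completion arity by hand.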

Regarding the benchmarks: I did benchmarks for multiple changes at once. Each change is not very impressive number-wise on its own, but together they chip away quite a bit. I'll do separate benchmarks for this PR and post them.

codecov · 2024-10-15T07:46:49Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.65%. Comparing base (b412026) to head (08af6fc).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #191      +/-   ##
==========================================
+ Coverage   83.55%   83.65%   +0.09%     
==========================================
  Files          37       37              
  Lines        2506     2515       +9     
  Branches      212      212              
==========================================
+ Hits         2094     2104      +10     
+ Misses        200      199       -1     
  Partials      212      212


alexander-yakushev · 2024-10-15T11:50:28Z

I benchmarked this against the current master: allocations are 20% lower, but the timing results are inconclusive. Let's hold off on this PR in favor of the others and revisit it once the others are merged.

alexander-yakushev · 2025-01-05T11:28:08Z

Update

Here are the results for this PR on the usual select 10k rows benchmark:

- master
Time per call: 3.91 ms   Alloc per call: 8,448,707b   Iterations: 2577
Time per call: 3.84 ms   Alloc per call: 8,447,663b   Iterations: 2647
Time per call: 3.93 ms   Alloc per call: 8,447,364b   Iterations: 2703

- rf-conj!
Time per call: 3.56 ms   Alloc per call: 7,157,359b   Iterations: 2881
Time per call: 3.62 ms   Alloc per call: 7,156,383b   Iterations: 2814
Time per call: 3.53 ms   Alloc per call: 7,155,533b   Iterations: 2945

So, it nets ~8% time improvement and ~15% allocation reduction. The reduced allocations are quite tasty; this almost achieves the level of overhead that the raw next.jdbc has (it did ~5MB per select the last time I checked).
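For anyone who wants to re-derive the percentages, a short sketch of the arithmetic over the six runs above. The helper names `mean` and `pct-reduction` are made up for this example, and the numbers are copied from the benchmark output as posted.

```clojure
;; Hypothetical helpers for checking the summary figures.
(defn mean [xs]
  (/ (reduce + xs) (double (count xs))))

(defn pct-reduction [before after]
  (* 100.0 (/ (- before after) before)))

(def master-ms   [3.91 3.84 3.93])
(def rf-conj-ms  [3.56 3.62 3.53])
(def master-bytes  [8448707 8447663 8447364])
(def rf-conj-bytes [7157359 7156383 7155533])

;; Roughly 8.3% less time per call.
(def time-gain (pct-reduction (mean master-ms) (mean rf-conj-ms)))

;; Roughly 15.3% fewer bytes allocated per call.
(def alloc-gain (pct-reduction (mean master-bytes) (mean rf-conj-bytes)))
```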

Given that you are concerned about how much change will be necessary in Metabase, I'm going to try this out in Metabase in advance.

UPD: It turns out that Metabase tests don't fail after this change (see metabase/metabase#51772). I suppose Metabase already uses the rf "properly", invoking the one-arg arity for completion.

alexander-yakushev requested a review from camsaul as a code owner October 8, 2024 17:50

alexander-yakushev force-pushed the rf-conj! branch from 7c4d1cc to 5596fb2 Compare October 8, 2024 18:10

alexander-yakushev force-pushed the rf-conj! branch from 5596fb2 to dd09a66 Compare October 15, 2024 07:45

[pipeline] Use conj! as the default pipeline rf

08af6fc

alexander-yakushev force-pushed the rf-conj! branch from dd09a66 to 08af6fc Compare January 5, 2025 11:19
