Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[jdbc.read] Don't track realized keys in TransientRow #185

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

alexander-yakushev
Copy link
Contributor

@alexander-yakushev alexander-yakushev commented Oct 5, 2024

Tracking the hashset of realized keys for each row in the resultset has noticeable overhead. Since this is only needed to print the transient row during debugging (as far as I can tell), it makes sense to shave the overhead here. All column names can be computed upfront and then used to print the values available in transient row. Please tell me if I'm wrong or missing something.

@alexander-yakushev alexander-yakushev force-pushed the no-realized-keys branch 3 times, most recently from 84ff99e to 7319b51 Compare October 5, 2024 11:35
Copy link

codecov bot commented Oct 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.57%. Comparing base (b412026) to head (e90e5f2).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #185      +/-   ##
==========================================
+ Coverage   83.55%   83.57%   +0.01%     
==========================================
  Files          37       37              
  Lines        2506     2508       +2     
  Branches      212      212              
==========================================
+ Hits         2094     2096       +2     
  Misses        200      200              
  Partials      212      212              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alexander-yakushev alexander-yakushev force-pushed the no-realized-keys branch 2 times, most recently from 7330740 to 76a77f9 Compare October 5, 2024 11:39
@@ -142,6 +145,7 @@
col-names (get builder :cols (next.jdbc.rs/get-modified-column-names
(.getMetaData rset)
combined-opts))
col-names-kw (vec (keep #(column-name->keyword % label-fn) col-names))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're really worried about performance, (into [] (keep ...) ...) would be faster, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code only runs once per ResultSet, so it's not as critical to make optimizations here. But sure, there's no harm in switching to a transducer.

Comment on lines 35 to 36
(defn- realized-keys [^TransientRow row]
@(.realized_keys row))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this still be calculated by looking at .column_names and removing that ones that are ::not-found? That way we don't need to delete a ton of test code below

@@ -142,6 +145,7 @@
col-names (get builder :cols (next.jdbc.rs/get-modified-column-names
(.getMetaData rset)
combined-opts))
col-names-kw (vec (keep #(column-name->keyword % label-fn) col-names))
col-name->index (make-column-name->index col-names label-fn)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it will be bad to performance to calculate all the column name keywords ahead of time in the line above and then create a memorized function that has to recalculate all those names a second time here... surely if you want to go down this road it would make sense to have make-column-name->index take col-names-kw so it doesn't need to call column-name->keyword again. Right?

Copy link
Owner

@camsaul camsaul Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an added bonus I guess we'd avoid the overhead of having to do column-name->keyword calculations on the key passed to the without method since presumably it should already be a keyword that is formatted correctly

Copy link
Contributor Author

@alexander-yakushev alexander-yakushev Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I've said above, column names here are calculated only once per ResultSet. The current realized-keys update is triggered for every individual cell access (N columns * M rows). That's why optimizing the once per ResultSet part is not as important. But if there are easy wins there, it makes sense to do them as well.

The reason why I left the current memoized make-column-name->index conversion is because, if I read the code correctly, the user may pass a string, a keyword, or a symbol as a column name, and they all should work. We can perhaps calculate a map for all possible valid column names. I am not against it but it feels a bit dirty. We can go that route if you wish. We can't precalculate all valid column names because of arbitrary label-fn. I see that it transforms names to kebab case for example, so :my_good_column, "my-good_column", 'my_good-column are all valid keys that should be transformed to :my-good-coumn in the end. We can't precalculate that.

@camsaul
Copy link
Owner

camsaul commented Oct 7, 2024

It would be nice to see something like Criterium benchmarks of before and after these changes to get a sense of how significant of a performance impact this really makes. I would be interested in seeing whether doing away with realized-keys but always calculating all of the column names ahead of time even when we do something like only realize the value of a single column from the transient row actually speeds things up significantly or not.

We lose a bit of info by taking this stuff out (as evidenced by the number of lines you had to delete from the tests) so if we're talking like a performance improvement of a few microseconds then I'm not sure it's really worth it or not.

If column-names is really only needed for printing a transient row, could we only calculate it on demand only when printing a transient row, that way we avoid the overhead of calculating it for every single row when 99.99% of the time we never print the transient rows? That seems like it would be an even bigger performance improvement, right? Is there a way to do that?

Copy link
Owner

@camsaul camsaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Is there a way we can stop pre-calculating all of the keyword column names and only calculate them on demand if and when we print a transient row?
  • If we actually do need to pre-calculate all of the keyword column names, make-column-name->index can probably be updated to use those so we're not calculating them twice for every row.
  • If we're getting really crazy with performance optimizations, (into [] ...) is faster than (vec (keep ...))
  • If we remove realized-keys but still have column-names, you should still be able to calculate the realized keys by looking at column names and seeing what is not ::not-found. If we write a function for that in row-test then you don't need to remove all the tests you've removed

@alexander-yakushev alexander-yakushev force-pushed the no-realized-keys branch 2 times, most recently from 463e943 to 3e6e748 Compare October 8, 2024 09:25
@alexander-yakushev
Copy link
Contributor Author

alexander-yakushev commented Oct 8, 2024

UPD regarding tests: the current tests are written using :venues model that has columns [:id :name :category :created-at :updated-at], but e.g. the assoc test uses :k as a key and then checks that it is among realized-keys. Is this testing the intentional behavior that you want to preserve?

@alexander-yakushev
Copy link
Contributor Author

Here's the benchmark:

;; preparation - insert 10k rows into people table
(toucan2.insert/insert!
 ::test/people
 (for [id (range 10 10000)]
   (instance/instance ::test/people
                      {:id id
                       :name       "Somebody"
                       :created-at (LocalDateTime/parse "2017-01-01T00:00")})))

;; the benchmark - I use time+ to get allocations too, but criterium would work just as well
(time+ (select/select ::test/people {:select [:*], :from [[:people]]}))

Results:

Before:
Time per call: 18.89 ms   Alloc per call: 17,484,278b
Time per call: 18.90 ms   Alloc per call: 17,483,998b
Time per call: 18.87 ms   Alloc per call: 17,483,817b

After:
Time per call: 15.17 ms   Alloc per call: 16,771,048b
Time per call: 15.14 ms   Alloc per call: 16,770,815b
Time per call: 15.18 ms   Alloc per call: 16,770,697b

@alexander-yakushev
Copy link
Contributor Author

And another follow-up on having to calculate all columns ahead of time: I'm mostly looking at Metabase benchmarks (not microbenchmarks of Toucan2 itself) and I see that in Metabase fetch-all-columns! is usually called; hence, all column names are eventually calculated anyway.

@alexander-yakushev
Copy link
Contributor Author

alexander-yakushev commented Jan 5, 2025

Update

Here's a completely different approach. Don't track realized keys, don't pre-compute column names. Use a bit of reflection to work around the inability to officially grab the keys of a transient collection. Dig into the private underlying array of a transient map, get the values from there. Not entirely safe indeed, but this is a devtime thing, and it's already protected with a try/catch.

Benchmark:

- master:
Time per call: 3.91 ms   Alloc per call: 8,448,707b   Iterations: 2577
Time per call: 3.84 ms   Alloc per call: 8,447,663b   Iterations: 2647
Time per call: 3.93 ms   Alloc per call: 8,447,364b   Iterations: 2703

- no-realized-keys:
Time per call: 3.60 ms   Alloc per call: 7,888,220b   Iterations: 2776
Time per call: 3.60 ms   Alloc per call: 7,887,431b   Iterations: 2781
Time per call: 3.70 ms   Alloc per call: 7,886,683b   Iterations: 2702

-5% time, -7% allocations, and simplifying the main path of the implementation which is the biggest win.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants