Add caching #86

KingMob · 2024-03-18T18:46:10Z

Add caching, tests, and cache scalar pdf/prob/condition/constrain/mutual-info fns in clj

Also bump up inferenceql.inference version to avoid arrow constructor bug

Notes

CLJS does not have core.cache or core.memoize, and JS has surprisingly few comprehensive caching options, so I went with one (memoizee) that seemed popular, well-maintained, and flexible enough to handle custom keys.

The Clojure hash fn hashes both 0 and nil to the same value, so hash alone cannot distinguish between {:x 0} and {:x nil}. (Hash maps have fallbacks to handle collisions and disambiguate between 0 and nil, but these don't work with most JS cache implementations, afaict.)
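The collision is easy to check at a REPL. A minimal sketch using only `hash` and `=` from clojure.core; the var names are illustrative and not part of the PR:

```clojure
;; Both 0 and nil hash to 0, so a cache keyed on hash codes alone
;; cannot tell these two argument maps apart.
(def hash-zero (hash 0))
(def hash-nil (hash nil))

(println (= hash-zero hash-nil))            ; the scalar hashes are equal
(println (= (hash {:x 0}) (hash {:x nil}))) ; and so are the map hashes
(println (= {:x 0} {:x nil}))               ; yet the maps themselves are not equal
```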

There are two popular solutions to this problem:

- Rely on identity. This might work well, given persistent data structures, at the cost of cache misses on different vars with identical contents.
- Use JSON.stringify to build the cache key. This is the recommended approach, and it should be pretty well optimized in modern browsers. May need to revisit if benchmarks show otherwise.
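The serialization option can be sketched in portable Clojure. This is an illustration under stated assumptions, not the PR's implementation: `memoize-by`, `slow-lookup`, and `cached-lookup` are hypothetical names, `pr-str` stands in for JSON.stringify as the normalizer, and the cache is an unbounded map with no LRU eviction.

```clojure
(defn memoize-by
  "Like clojure.core/memoize, but looks results up under (key-fn args)
  instead of under args directly. Hypothetical helper for illustration;
  unbounded, with no LRU eviction."
  [key-fn f]
  (let [cache (atom {})]
    (fn [& args]
      (let [k (key-fn args)]
        ;; find (rather than get) so that cached nil results still count as hits
        (if-let [hit (find @cache k)]
          (val hit)
          (let [result (apply f args)]
            (swap! cache assoc k result)
            result))))))

(def calls (atom 0))

(defn slow-lookup
  "Stand-in for an expensive scalar function; counts its invocations."
  [m]
  (swap! calls inc)
  (:x m))

;; pr-str stands in here for JSON.stringify as the serializing normalizer.
(def cached-lookup (memoize-by pr-str slow-lookup))

(cached-lookup {:x 0})   ; miss: computes and stores the result
(cached-lookup {:x 0})   ; hit: equal contents serialize to the same key
(cached-lookup {:x nil}) ; miss: {:x nil} serializes to a different key
```

Keying on the serialized arguments keeps {:x 0} and {:x nil} apart, which hash codes alone cannot do. An identity-based cache would instead treat the two separately constructed {:x 0} maps as different keys.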


littleredcomputer

LGTM

mhuebert · 2024-03-22T15:34:34Z

src/inferenceql/query/cache.cljc

+  ([f lru-threshold]
+   #?(:clj (memo/lru f :lru/threshold lru-threshold)
+      :cljs (memoizee f #js {"max" lru-threshold
+                             "normalizer" js/JSON.stringify}))))


I wonder if we'll run into trouble using js/JSON.stringify with some of the parameters passed to these functions, like model (which can be a special type, e.g. a reify) or cljs data types, which look pretty weird when serialized (though maybe that's ok, if it's more performant than using clj->js first).

It might be worth doing some simple benchmarks before committing to an approach, as some of these conversions can be surprisingly costly.

KingMob · 2024-03-22 (edited)

My prediction is that it should be fine, but it won't hurt to check. Will have to wait until I get back from vacation, though.

The serialized weirdness should be ok as long as it never results in accidentally identical string representations for different inputs. If anything, I think the opposite is true. There are probably things that could share keys, but won't. Luckily, that just means some cache misses.

I would be very surprised if clj->js + JSON/stringify was faster than JSON/stringify on the original, but no need to guess! I'll try it out when I get back. We'll see what my posteriors are then.

KingMob · 2024-04-02

Well, it turns out I am very surprised. Unfortunately, I still don't think it will work out to use clj->js first.

I ran a quick-and-dirty test in cljs:

(let [n 100000
      m {"x" 0 :foo {:bar 1 :moop/floop "asdf"} :baz [1 2 3]}
      a (reify IAtom)]
  (js/console.log "JSON.stringify:")
  (time (dotimes [_ n]
          (js/JSON.stringify m)
          (js/JSON.stringify :foo)
          (js/JSON.stringify a)))
  (js/console.log "clj->js, then JSON.stringify:")
  (time (dotimes [_ n]
          (-> m clj->js (js/JSON.stringify))
          (-> :foo clj->js (js/JSON.stringify))
          (-> a clj->js (js/JSON.stringify))))
  (js/console.log "clj->js w/ str keyword-fn, then JSON.stringify:")
  (time (dotimes [_ n]
          (-> m (clj->js :keyword-fn str) (js/JSON.stringify))
          (-> :foo (clj->js :keyword-fn str) (js/JSON.stringify))
          (-> a (clj->js :keyword-fn str) (js/JSON.stringify))))
  (js/console.log "bean/->js, then JSON.stringify:")
  (time (dotimes [_ n]
          (-> m (bean/->js :key->prop str) (js/JSON.stringify))
          (-> :foo (bean/->js :key->prop str) (js/JSON.stringify))
          (-> a (bean/->js :key->prop str) (js/JSON.stringify)))))

and got:

JSON.stringify: "Elapsed time: 1274.645467 msecs"
clj->js, then JSON.stringify: "Elapsed time: 1208.113983 msecs"
clj->js w/ str keyword-fn, then JSON.stringify: "Elapsed time: 1443.455173 msecs"
bean/->js, then JSON.stringify: "Elapsed time: 1506.302295 msecs"

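Using plain clj->js before JSON.stringify is slightly faster (about 1208 ms vs. 1275 ms), but unfortunately, it can't distinguish between string and keyword keys, e.g. {"x" 0} and {:x 0}. The variants that keep them distinct (str as the keyword-fn or key->prop) were slower than JSON.stringify alone. Having duplicate string and keyword keys would cause other problems anyway, so plain clj->js might be safe to use. But I'd rather err on the side of correctness, especially since these timings aren't too far off from each other.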
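The key collapse can be illustrated without a JS runtime. In this sketch, `clojure.walk/stringify-keys` is an assumed stand-in for what clj->js does to keyword keys by default (it replaces each keyword with its name), and `pr-str` again stands in for serializing the original value. The var names are hypothetical.

```clojure
(require '[clojure.walk :as walk])

(def kw-map  {:x 0})
(def str-map {"x" 0})

;; stringify-keys replaces each keyword key with its name, mirroring the
;; default keyword handling of clj->js.
(def kw-converted  (walk/stringify-keys kw-map))
(def str-converted (walk/stringify-keys str-map))

(println (= kw-converted str-converted))       ; true: distinct inputs now collide
(println (= (pr-str kw-map) (pr-str str-map))) ; false: serializing the originals keeps them apart
```

With a converting normalizer, both maps would land on the same cache entry. Serializing the original value avoids that, at the cost of a few extra cache misses.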

We can always revisit our caching strategy, if necessary.

KingMob force-pushed the add-caching branch from 8eeee78 to 7fa5631 Compare March 19, 2024 09:28

KingMob force-pushed the added-shadow-cljs branch from 7faca04 to be3b194 Compare March 19, 2024 09:38

KingMob force-pushed the add-caching branch from 7fa5631 to 7e9d431 Compare March 19, 2024 09:38

KingMob requested review from zane, sritchie, littleredcomputer and Schaechtle March 19, 2024 09:45

KingMob force-pushed the added-shadow-cljs branch from be3b194 to 42c66b3 Compare March 21, 2024 16:18

KingMob force-pushed the add-caching branch from 7e9d431 to 4149bed Compare March 21, 2024 16:18

KingMob force-pushed the added-shadow-cljs branch from 42c66b3 to f0a164d Compare March 21, 2024 16:31

KingMob force-pushed the add-caching branch 2 times, most recently from 6662f3b to 0af2cb4 Compare March 21, 2024 16:42

littleredcomputer reviewed Mar 21, 2024

test/inferenceql/query/cache_test.cljc (resolved)

src/inferenceql/query/scalar.cljc (outdated, resolved)

ships mentioned this pull request Mar 21, 2024

build: bump inferenceql.inference to latest ChiSym/GenSQL.gpm.sppl#12

Merged

KingMob force-pushed the added-shadow-cljs branch from f0a164d to ed10267 Compare March 22, 2024 09:58

KingMob force-pushed the add-caching branch 3 times, most recently from 1cac8f8 to 0eed776 Compare March 22, 2024 12:19

KingMob force-pushed the added-shadow-cljs branch from ed10267 to 4805515 Compare March 22, 2024 14:47

KingMob force-pushed the add-caching branch from 0eed776 to e3842ca Compare March 22, 2024 14:47

littleredcomputer approved these changes Mar 22, 2024

mhuebert reviewed Mar 22, 2024

KingMob force-pushed the added-shadow-cljs branch from 4805515 to 255366a Compare March 22, 2024 16:12

KingMob force-pushed the add-caching branch 2 times, most recently from 9fe096b to a680fc2 Compare March 22, 2024 16:22

KingMob force-pushed the added-shadow-cljs branch from 255366a to f971a58 Compare March 22, 2024 16:22

KingMob force-pushed the add-caching branch from a680fc2 to 6393f58 Compare March 22, 2024 16:41

KingMob force-pushed the add-caching branch from 6393f58 to cd1bb6a Compare April 1, 2024 14:43

KingMob force-pushed the added-shadow-cljs branch 2 times, most recently from e57d5bf to 630bff6 Compare April 1, 2024 16:14

KingMob force-pushed the add-caching branch from cd1bb6a to a97735c Compare April 2, 2024 11:56

refactor: Add caching, tests, and cache scalar pdf/prob/condition/constrain/mutual-info fns in clj. Also bump up inferenceql.inference version to avoid arrow constructor bug (68000ea)

KingMob force-pushed the added-shadow-cljs branch from 1a7a78f to 328df8f Compare April 2, 2024 14:49

KingMob force-pushed the add-caching branch from a97735c to 68000ea Compare April 2, 2024 14:49

KingMob merged commit eb6709b into added-shadow-cljs Apr 3, 2024
3 checks passed

KingMob deleted the add-caching branch April 3, 2024 10:09
