Add caching #86

Merged
merged 1 commit into from
Apr 3, 2024

Conversation

KingMob
Contributor

@KingMob KingMob commented Mar 18, 2024

Add caching, tests, and cache scalar pdf/prob/condition/constrain/mutual-info fns in clj

Also bump up inferenceql.inference version to avoid arrow constructor bug

Notes

CLJS does not have core.cache or core.memoize, and in JS there were surprisingly few comprehensive options for caching, so I went with one (memoizee) that seemed popular, well-maintained, and flexible enough to handle custom keys.

The Clojure hash fn hashes both 0 and nil to the same value, so hash alone cannot distinguish between {:x 0} and {:x nil}. (Hash maps have fallbacks to handle collisions and can tell 0 from nil, but these don't work with most JS cache implementations, afaict.)
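
The collision is easy to confirm at a REPL. A minimal check, which should behave the same way in Clojure and ClojureScript:

```clojure
;; 0 and nil hash to the same value...
(= (hash 0) (hash nil))            ; true
;; ...and so do maps that differ only in that value,
(= (hash {:x 0}) (hash {:x nil}))  ; true
;; even though the maps themselves are not equal.
(= {:x 0} {:x nil})                ; false
```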

There are two popular solutions to this problem:

  1. Rely on identity. This might work well, given persistent data structures, at the cost of cache misses when distinct objects have identical contents.
  2. Use JSON.stringify. This is the recommended approach, and it should be well optimized in modern browsers. May need to revisit if benchmarks show otherwise.
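
To make option 2 concrete, here is a minimal, portable sketch. It is not the PR's implementation: `memoize-by` is a hypothetical helper, the cache is an unbounded map rather than an LRU, and `pr-str` stands in for `JSON.stringify` so that the example also runs on the JVM.

```clojure
(defn memoize-by
  "Hypothetical helper: memoizes f, storing results under (key-fn args)."
  [key-fn f]
  (let [cache (atom {})]
    (fn [& args]
      (let [k (key-fn args)]
        ;; `find` returns the whole entry, so cached nil values still count as hits.
        (if-let [entry (find @cache k)]
          (val entry)
          (let [v (apply f args)]
            (swap! cache assoc k v)
            v))))))

;; Same underlying fn, two different ways of deriving the cache key.
(def by-hash   (memoize-by hash   (fn [m] (:x m))))
(def by-string (memoize-by pr-str (fn [m] (:x m))))

(by-hash {:x 0})      ; 0
(by-hash {:x nil})    ; 0, a stale hit: both argument lists hash alike
(by-string {:x 0})    ; 0
(by-string {:x nil})  ; nil, because the serialized keys differ
```

A serialized key costs a string conversion on every call, which is why benchmarking the conversion matters.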

@littleredcomputer littleredcomputer left a comment

LGTM

  ([f lru-threshold]
   #?(:clj  (memo/lru f :lru/threshold lru-threshold)
      :cljs (memoizee f #js {"max"        lru-threshold
                             "normalizer" js/JSON.stringify}))))

I wonder if we'll run into trouble using js/JSON.stringify with some of the parameters passed to these functions, like model (which can be a special type, e.g. a reify) or cljs data types, whose serialized forms look pretty weird (though maybe that's ok, if it's more performant than using clj->js first).

It might be worth doing some simple benchmarks before committing to an approach, as some of these conversions can be surprisingly costly.

Contributor Author

@KingMob KingMob Mar 22, 2024

My predictions are that it should be fine, but it won't hurt to check. Will have to wait until I get back from vacation, though.

The serialized weirdness should be ok as long as it never results in accidentally identical string representations for different values. If anything, I think the opposite is true: there are probably values that could share a key, but won't. Luckily, that just means some cache misses.
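
One way such a missed sharing can arise is key order. The sketch below uses `pr-str` as a stand-in for `JSON.stringify` (an assumption made so the example is portable; how `JSON.stringify` serializes raw cljs maps is not tested here). Two small map literals that are equal as values can still print differently:

```clojure
(def a {:a 1 :b 2})
(def b {:b 2 :a 1})

(= a b)                    ; true, the maps are equal as values
(= (pr-str a) (pr-str b))  ; false: "{:a 1, :b 2}" vs "{:b 2, :a 1}"
```

With serialized keys, `a` and `b` would occupy two cache entries. That wastes a slot and a computation, but never returns a wrong result.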

I would be very surprised if clj->js + JSON.stringify was faster than JSON.stringify on the original, but no need to guess! I'll try it out when I get back. We'll see what my posteriors are then.

Contributor Author

Well, it turns out I am very surprised. Unfortunately, I don't think it will work out to use clj->js first.

I ran a quick-and-dirty test in cljs:

;; Assumes `bean` is an alias for cljs-bean, e.g. (:require [cljs-bean.core :as bean]).
(let [n 100000
      m {"x" 0 :foo {:bar 1 :moop/floop "asdf"} :baz [1 2 3]}
      a (reify IAtom)]

  ;; Baseline: stringify the cljs values directly.
  (js/console.log "JSON.stringify:")
  (time
    (dotimes [_ n]
      (js/JSON.stringify m)
      (js/JSON.stringify :foo)
      (js/JSON.stringify a)))

  ;; Convert to plain JS first (keyword keys become bare names).
  (js/console.log "clj->js, then JSON.stringify:")
  (time
    (dotimes [_ n]
      (-> m clj->js (js/JSON.stringify))
      (-> :foo clj->js (js/JSON.stringify))
      (-> a clj->js (js/JSON.stringify))))

  ;; Same, but keep the leading colon so keyword keys stay distinct from string keys.
  (js/console.log "clj->js w/ str keyword-fn, then JSON.stringify:")
  (time
    (dotimes [_ n]
      (-> m (clj->js :keyword-fn str) (js/JSON.stringify))
      (-> :foo (clj->js :keyword-fn str) (js/JSON.stringify))
      (-> a (clj->js :keyword-fn str) (js/JSON.stringify))))

  ;; cljs-bean's converter, also with str for keys.
  (js/console.log "bean/->js, then JSON.stringify:")
  (time
    (dotimes [_ n]
      (-> m (bean/->js :key->prop str) (js/JSON.stringify))
      (-> :foo (bean/->js :key->prop str) (js/JSON.stringify))
      (-> a (bean/->js :key->prop str) (js/JSON.stringify)))))

and got:

JSON.stringify:
"Elapsed time: 1274.645467 msecs"
clj->js, then JSON.stringify:
"Elapsed time: 1208.113983 msecs"
clj->js w/ str keyword-fn, then JSON.stringify:
"Elapsed time: 1443.455173 msecs"
bean/->js, then JSON.stringify:
"Elapsed time: 1506.302295 msecs"

Using clj->js before JSON.stringify is slightly faster (about 1208 ms vs. 1275 ms, roughly 5%), but unfortunately it can't distinguish between string and keyword keys. A map that contains both the string and the keyword form of the same key would cause other problems anyway, so in practice it might be safe to do. But I'd rather err on the side of correctness, especially since these timings aren't too far off from each other.
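
The key collision comes from how keyword keys are converted: `clj->js` applies `name` to them unless a `:keyword-fn` is supplied. The sketch below imitates only that key handling in portable Clojure. `convert-keys` is a hypothetical helper written for this illustration, not part of ClojureScript.

```clojure
(defn convert-keys
  "Hypothetical stand-in for the key handling in clj->js: applies keyword-fn
  to keyword keys and leaves all other keys alone."
  [keyword-fn m]
  (into {} (map (fn [[k v]] [(if (keyword? k) (keyword-fn k) k) v])) m))

;; With `name`, :x and "x" become the same key, so the two maps collide.
(= (convert-keys name {:x 1}) (convert-keys name {"x" 1}))  ; true
;; With `str`, the keyword keeps its colon (":x"), so they stay distinct.
(= (convert-keys str {:x 1}) (convert-keys str {"x" 1}))    ; false
;; `name` also drops the namespace of a qualified keyword.
(convert-keys name {:moop/floop "asdf"})                    ; {"floop" "asdf"}
```

This is why the `:keyword-fn str` variant was included in the benchmark: it avoids the collision, at the cost of the extra time measured above.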

We can always revisit our caching strategy, if necessary.

@KingMob KingMob force-pushed the added-shadow-cljs branch from 4805515 to 255366a Compare March 22, 2024 16:12
@KingMob KingMob force-pushed the add-caching branch 2 times, most recently from 9fe096b to a680fc2 Compare March 22, 2024 16:22
@KingMob KingMob force-pushed the added-shadow-cljs branch from 255366a to f971a58 Compare March 22, 2024 16:22
@KingMob KingMob force-pushed the added-shadow-cljs branch 2 times, most recently from e57d5bf to 630bff6 Compare April 1, 2024 16:14
…strain/mutual-info fns in clj

Also bump up inferenceql.inference version to avoid arrow constructor bug
@KingMob KingMob force-pushed the added-shadow-cljs branch from 1a7a78f to 328df8f Compare April 2, 2024 14:49
@KingMob KingMob merged commit eb6709b into added-shadow-cljs Apr 3, 2024
3 checks passed
@KingMob KingMob deleted the add-caching branch April 3, 2024 10:09