Docs on SGHMC / SGLD? #2270
No, there is currently no support for mini-batching out of the box. Also, it is generally not a great idea to spin up BNNs with Turing since we currently don't support GPUs.
On very small datasets (~100 observations), I've managed to get okay results at a fraction of the time NUTS takes. The NNs I've tested aren't very large either (e.g. 4 layers of 8 neurons each). In my experience it is very sensitive to random variation, so I've had to make a point of setting a random number seed when tweaking parameters and resampling, and the step size needed to be around 1e-3 to avoid NaNs, so try setting it smaller if you get NaNs with the default config. Big caveat though: convergence of individual weights isn't very strong, but I've found it doesn't make a huge impact on results. Here, NUTS is the clear winner in my experience, though it takes much, much longer to sample.

Invoking SGLD looks something like this:

```julia
using Turing, Random  # Xoshiro comes from Random

N_sims = 10_000
ch_SGLD = sample(
    Xoshiro(323),
    bayes_nn(Xs, ys),
    SGLD(; stepsize=PolynomialStepsize(1e-3), adtype=AutoTracker()),
    N_sims;
    discard_adapt=false,
)
```

(Takes about 8 seconds per 10,000 samples on my laptop.) Agree though that mini-batching would be great one day, along with GPU support to tackle larger datasets.
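For context on the step size argument: the usual SGLD schedule from Welling & Teh (2011) decays polynomially, εₜ = a·(b + t)^(−γ). Below is a tiny illustrative sketch of that schedule in plain Julia; whether Turing's `PolynomialStepsize` takes exactly these arguments and defaults is an assumption on my part, so check its docstring.

```julia
# Illustrative only: the standard SGLD polynomial step size, ε_t = a * (b + t)^(-γ).
# Parameter names and defaults here are assumptions, not Turing's actual API.
polystep(t; a=1e-3, b=0.0, γ=0.55) = a * (b + t)^(-γ)

polystep(1)       # ≈ 1e-3 at the first iteration
polystep(10_000)  # much smaller after many iterations
```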
Hi, @patrickm663, just to be clear, what you just did is technically not SGLD but unadjusted Langevin / Langevin Monte Carlo. SGLD is the special case where minibatching is used. Without minibatching, it is very unlikely for unadjusted Langevin to be competitive against NUTS for such high-dimensional problems.
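To make the distinction concrete, here is a minimal, self-contained sketch of SGLD with minibatching on a toy 1-D Gaussian mean problem. It is deliberately independent of Turing: the model, gradient functions, and step size schedule are all assumptions for illustration, not library API.

```julia
using Random, Statistics

# Toy data: 1-D Gaussian with unknown mean θ (true mean = 2), unit variance.
N = 1_000
data = randn(Xoshiro(1), N) .+ 2.0

grad_logprior(θ) = -θ        # gradient of log N(θ | 0, 1)
grad_loglik(θ, x) = x - θ    # gradient of log N(x | θ, 1) w.r.t. θ

# Polynomial step size schedule, ε_t = a * (b + t)^(-γ).
stepsize(t; a=1e-2, b=10.0, γ=0.55) = a * (b + t)^(-γ)

function sgld(T; batchsize=32, rng=Xoshiro(323))
    θ = 0.0
    chain = Vector{Float64}(undef, T)
    for t in 1:T
        idx = rand(rng, 1:N, batchsize)  # draw a minibatch
        # Unbiased full-data gradient estimate: prior + rescaled minibatch likelihood.
        g = grad_logprior(θ) +
            (N / batchsize) * sum(grad_loglik(θ, data[i]) for i in idx)
        ε = stepsize(t)
        # Langevin step: half a gradient step plus injected Gaussian noise of variance ε.
        θ += (ε / 2) * g + sqrt(ε) * randn(rng)
        chain[t] = θ
    end
    return chain
end

chain = sgld(5_000)
mean(chain[1_000:end])  # should land near 2
```

The key detail is the `(N / batchsize)` rescaling, which keeps the minibatch gradient an unbiased estimate of the full-data gradient; if you drop the minibatching and use all of `data` every step, you recover exactly the unadjusted Langevin algorithm described in the comment above.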
Hi, thanks very much for clarifying! I did not realise that.
Hey,
There are stochastic samplers in the Turing codebase, but there is no information on how to actually use them.
Do they actually work? How do you define mini-batches?
I could add something to the BNN tutorial about it if someone could explain to me how these samplers work.
Thanks