Tempered mcmc #180
cobaya/samplers/mcmc/mcmc.py
Outdated
if self.temperature:
    raise LoggedError(
        self.log,
        "Temperature != 1 and dragging are not compatible at the moment.")
At a quick glance, I think dragging is correct with just the current PR changes (since temperature just scales all the logposts by the same linear factor)?
It should, I guess, for any sampling algorithm, since it's just importance reweighting? Anyway, will test before merging with baseline Planck.
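For reference, the importance reweighting mentioned here can be sketched as below. This is a minimal sketch, not Cobaya's implementation: it assumes the chain was drawn from the tempered density p^(1/T), that `logpost` holds the untempered log-posterior, and the function name `cool_weights` is hypothetical.

```python
import numpy as np


def cool_weights(weights, logpost, temperature):
    """Reweight samples drawn from p**(1/T) so that they represent p.

    Assumes `logpost` is the *untempered* log-posterior (an assumption of
    this sketch). The importance ratio is p / p**(1/T) = p**(1 - 1/T).
    """
    weights = np.asarray(weights, dtype=float)
    logpost = np.asarray(logpost, dtype=float)
    log_ratio = (1.0 - 1.0 / temperature) * logpost
    # Subtract the maximum for numerical stability; the overall
    # normalisation of the weights is irrelevant.
    log_ratio -= log_ratio.max()
    return weights * np.exp(log_ratio)
```

With T=1 the weights come back unchanged; for T>1 the low-probability samples are down-weighted, so the cooled weights are no longer integers.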
Codecov Report
@@            Coverage Diff             @@
##           master     #180      +/-   ##
==========================================
- Coverage   87.59%   87.55%   -0.05%
==========================================
  Files          91       91
  Lines        8345     8390      +45
==========================================
+ Hits         7310     7346      +36
- Misses       1035     1044       +9
@cmbant Mostly done. As expected by getdist, now storing tempered weights (integer) and logpost in the collection. Statistical functions such as A couple of missing things:
Back from hols now. I agree thinning should just use the integer weights, and saving covmats with T=1 makes sense.
Thanks!
I'll open a GetDist issue about this.
Looks good to me, thanks, though I'm not very sure what you mean by "by weighting with samples with their probability". It can be useful for evaluating small tail probabilities. In high dimensions, I think using a single higher temperature usually doesn't help much for importance sampling (you'd really want to flatten just in a small number of relevant directions). Did you fix the issue with the covariance scaling? I don't have a very strong opinion about default GetDist behaviour. Ideally temperature, cooling, thinning, and burn-in removal metadata should probably be propagated consistently (including when saving with MCSamples.saveAsText, and in Cobaya importance-sampled outputs).
Thanks a lot for the review!
The first "with" should not be there, sorry. I mean when computing any quantity for which one weights samples with their posterior/prior/likelihood, e.g. in the GP sampler we try to do quick estimates of the covariance matrix from small high-temperature samples, too small to be fair, but sparse enough to get a decent estimation (hopefully).
I'll add that.
Ok, I guess I was thinking too naively about this. Do you think it's still worth mentioning that it may be useful in low dimensions, or should we remove any mention of re-weighting there?
Yes, it was indeed the prior, thanks!
My strongest opinion is that, if the default is plotting/using the high-temperature chain, users should at least get a warning when loading the chain, with an explanation of how to cool it down. Another strong opinion: some hint/checkbox in the GUI for cooling down on the spot (even if the cooled-down chain only stays in RAM). Otherwise users would have to load the chains by hand in a Python shell/notebook and cool them to be able to plot them? Not sure how you did that before with CosmoMC. In general, I think GetDist should work with it more like Cobaya's collections: stored as high-temperature, but cooled-down weights and posterior can be requested via methods without modifying the internal high-temperature data. This way you don't need to cool down the sample to get "cool" statistics and plots.
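To make the "stored hot, cooled on request" idea concrete, here is a rough sketch. Everything in it is hypothetical: `TemperedChain` is neither Cobaya's collection class nor GetDist's `MCSamples`, and it assumes the stored log-posterior is the untempered one.

```python
import numpy as np


class TemperedChain:
    """Hypothetical container that never modifies the high-temperature data."""

    def __init__(self, samples, weights, logpost, temperature):
        self.samples = np.asarray(samples, dtype=float)  # shape (n, d)
        self.weights = np.asarray(weights, dtype=float)  # tempered weights
        self.logpost = np.asarray(logpost, dtype=float)  # assumed untempered
        self.temperature = float(temperature)

    def cooled_weights(self):
        # Importance ratio p / p**(1/T), computed in log space; returns a
        # new array and leaves self.weights untouched.
        log_ratio = (1.0 - 1.0 / self.temperature) * self.logpost
        return self.weights * np.exp(log_ratio - log_ratio.max())

    def mean(self, cooled=True):
        w = self.cooled_weights() if cooled else self.weights
        return np.average(self.samples, weights=w, axis=0)

    def cov(self, cooled=True):
        w = self.cooled_weights() if cooled else self.weights
        return np.cov(self.samples, rowvar=False, aweights=w)
```

Statistics and plots would then call `mean()` or `cov()` with `cooled=True` by default, while `cooled=False` still gives access to the raw high-temperature sample.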
I'll turn this into another issue and get to work on it over the following weeks. As for post processing, the easiest way to go at the moment is to automatically cool on load, and thus not preserve temperature. Unless you strongly disagree, I'd leave it like that for now and open another issue to be worked on in the following weeks. (I need this one merged soon).
I started support for auto-cool in getdist at https://github.com/cmbant/getdist/tree/autocool. The GetDist GUI currently doesn't have a good way to change settings per chain (since you often operate on many at once), but I think this works analogously to ignore_rows/burn_removed, in that it is set globally, but chain metadata specifies whether it should be applied to each specific chain.
Thanks! Could work. Since this is new, we can test it ourselves and see whether it feels convenient. I am going to be using it this coming week myself. To get the temperature from a yaml, assuming you have got the sampler name with the
So if I got it correctly, GetDistGUI will cool down on load unless the metadata requests otherwise. That would look good to me then. Any comment on my answer above re documentation?
Doc changes sound OK; you can certainly mention that it could be useful for importance sampling in some cases.
TODO (at least): post processing, how to store temperature/cool status (#202), getdist setting of temperature property on cobaya load. I pushed a few of the tempering-independent fixes in this PR to master.
@JesusTorrado: @AndreasNygaard is interested in using temperatures for training Connect - how far off do you think this is from merge?
[WIP]
Missing testing and getdist side of things.