Used by KataGo, described here.
Basically, this weights the self-play-generated sample points based on how "surprising" they are. That is, if the MCTS-generated visit-count distribution looks very different from the policy prior, the training data includes multiple copies of that row, so that the next-generation neural network puts more weight on correcting it.
Implement this and validate its value through experimentation.
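As a possible starting point, here is a minimal sketch of the idea in Python. It is not taken from KataGo or from this project's code. It assumes that "surprise" is measured as the KL divergence from the policy prior to the MCTS visit distribution, and that a fraction of the total weight is spread uniformly so unsurprising rows are not dropped entirely. The function names and the `uniform_frac` parameter are hypothetical and would need tuning as part of the experiments.

```python
import math
import random


def kl_divergence(target, prior, eps=1e-12):
    """KL(target || prior) for two same-length probability lists."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, prior)
        if t > 0
    )


def surprise_weights(targets, priors, uniform_frac=0.5):
    """Per-row weights that average 1.0.

    uniform_frac of the total weight is shared equally (assumed value);
    the rest is shared in proportion to each row's KL surprise.
    """
    n = len(targets)
    kls = [kl_divergence(t, p) for t, p in zip(targets, priors)]
    total_kl = sum(kls)
    if total_kl <= 0:
        # No row is surprising: fall back to uniform weighting.
        return [1.0] * n
    return [uniform_frac + (1 - uniform_frac) * n * kl / total_kl for kl in kls]


def copies_for_weight(weight, rng=random):
    """Stochastically round a weight to a copy count with the same expectation."""
    base = math.floor(weight)
    return base + (1 if rng.random() < weight - base else 0)


def resample_rows(rows, targets, priors, uniform_frac=0.5, rng=random):
    """Return the training rows with each row repeated according to its weight."""
    weights = surprise_weights(targets, priors, uniform_frac)
    out = []
    for row, w in zip(rows, weights):
        out.extend([row] * copies_for_weight(w, rng))
    return out
```

Here `targets` would hold the normalized MCTS visit counts for each position and `priors` the network's policy output for the same positions. Because the weights average 1.0, the expected size of the resampled data set equals the original size, which should make a comparison against unweighted training more direct.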