Supersedes #287
I added a SubprocVecEnv so multiple games are played at once, which captures training data about 5x faster. I trained for 4.96 days straight (100,000,000 timesteps) with this configuration; the resulting model.zip is 1525MB, unfortunately too big to upload to git. After those 5 days of training, PPOPlayer has an 8% win rate against AB-pruning and an 11% win rate against ValueFunctionPlayer. Attached is the wandb graph output: episode_reward_mean has not plateaued, but training is simply too slow on my RTX 4070 for the agent to realistically surpass the AB-pruning player. The model may have too many layers, which slows training, but I've experimented with a range of hyperparameters and model sizes and this is the best configuration I've found.
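For anyone wanting to reproduce the parallel-rollout setup, this is roughly the shape of it with Stable-Baselines3's SubprocVecEnv. The env id, `N_ENVS`, and the hyperparameters below are placeholder assumptions for illustration, not necessarily what this branch uses:

```python
import gymnasium as gym  # older setups may need `import gym` instead

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import SubprocVecEnv

N_ENVS = 8  # placeholder; tune to available CPU cores
ENV_ID = "catanatron_gym:catanatron-v1"  # assumed id; use whatever this branch registers


def make_env():
    # Each callable builds its own env instance inside its own worker subprocess.
    def _init():
        return gym.make(ENV_ID)

    return _init


if __name__ == "__main__":  # guard required: SubprocVecEnv spawns worker processes
    # Rollouts are collected from N_ENVS games in parallel instead of one at a time.
    vec_env = SubprocVecEnv([make_env() for _ in range(N_ENVS)])

    model = PPO("MlpPolicy", vec_env, verbose=1)
    model.learn(total_timesteps=100_000_000)
    model.save("model.zip")
```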
The features_extractor CNN doesn't seem to help much on shorter training runs, even with much smaller model sizes. I'm starting to think Stable-Baselines isn't the best way to go: AlphaZero combines MCTS with an actor/critic network like this one, and maybe we should pursue recreating that approach for Catan.
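For reference, the custom features extractor follows the standard Stable-Baselines3 pattern sketched below. `BoardCNN`, the layer sizes, and the assumption of a channel-first (C, H, W) board observation are illustrative, not the exact extractor in this branch:

```python
import torch
import torch.nn as nn

from stable_baselines3 import PPO
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor


class BoardCNN(BaseFeaturesExtractor):
    """Illustrative extractor; assumes a channel-first (C, H, W) board tensor."""

    def __init__(self, observation_space, features_dim=256):
        super().__init__(observation_space, features_dim)
        n_channels = observation_space.shape[0]
        self.cnn = nn.Sequential(
            nn.Conv2d(n_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        # Infer the flattened size by running one dummy observation through the CNN.
        with torch.no_grad():
            sample = torch.as_tensor(observation_space.sample()[None]).float()
            n_flatten = self.cnn(sample).shape[1]
        self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.ReLU())

    def forward(self, observations):
        return self.linear(self.cnn(observations))


policy_kwargs = dict(
    features_extractor_class=BoardCNN,
    features_extractor_kwargs=dict(features_dim=256),
)
# `vec_env` as built in the previous sketch.
model = PPO("CnnPolicy", vec_env, policy_kwargs=policy_kwargs, verbose=1)
```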
Note that if you want to pull the branch and play around with it, you'll have to delete the model.zip before each run to reset the architecture.
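The reason is that the training script resumes from model.zip when it exists, roughly along the lines of the load-or-create pattern below (path and kwargs are illustrative, not the exact code in the branch); `PPO.load` restores the saved weights and architecture, so any new `policy_kwargs` only take effect on a fresh start:

```python
import os

from stable_baselines3 import PPO

MODEL_PATH = "model.zip"  # illustrative path

if os.path.exists(MODEL_PATH):
    # Resuming pins the saved architecture; delete model.zip to change the network.
    model = PPO.load(MODEL_PATH, env=vec_env)
else:
    model = PPO("CnnPolicy", vec_env, policy_kwargs=policy_kwargs, verbose=1)
```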