-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor rl examples - updated README #223
Conversation
Thanks for opening your first pull request in this repository! Someone will review it when they have a chance. In the mean time, please be sure that you've handled the following things, to make the review process quicker and easier:
Thank you again for your contributions! 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that there are a lot of changes here and for the more gym-related stuff, I'm not confident and would prefer for someone else to review the changes too. Hopefully the comments I've left are helpful.
@@ -93,3 +95,9 @@ extract all the necessary dataset in order for examples to work perfectly: | |||
cd tools/ | |||
./download_data_set.py | |||
``` | |||
|
|||
### 5. Setup | |||
To setup a jupyter local environment that work with C++ using xeus-cling you shall execute the following command: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't think the intention was for users to run that script directly. It would be better to just use Binderhub or similar.
It's true that you could run this script, but it has a number of assumptions that may not be true for users:
- Users may not be using conda.
- Users may not be interested in the C++ notebook examples at all, but might be using the
Makefile
-built examples. - Users may not even be interested in C++ at all and may be focusing on other languages.
So I don't think that I would want to include this in the general README; users will then attempt to run the command, and may encounter problems that may not even be relevant if they're not looking to use Jupyterlab.
I think as an alternative it may be more reasonable to comment that script a little bit better. Or, if we restructured the examples in the repository to organize them by language, then perhaps in a directory specific to C++ notebook examples, it makes more sense to have this documentation.
@@ -193,8 +197,9 @@ int main() | |||
agent.Deterministic() = true; | |||
|
|||
// Creating and setting up the gym environment for testing. | |||
envTest.monitor.start("./dummy/", true, true); | |||
|
|||
// envTest.monitor.start("./dummy/", true, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason to comment this out, or add the compression call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The envTest environment wasn't functioning properly upon reuse. I attempted to troubleshoot by commenting out the monitor section or adding compression, but these adjustments didn't resolve the issue. So, i would just revert it to its original state.
@@ -120,7 +124,7 @@ int main() | |||
|
|||
// Preparation for training the agent | |||
// Set up the gym training environment. | |||
gym::Environment env("gym.kurg.org", "4040", "Acrobot-v1"); | |||
gym::Environment env("localhost", "4040", "Acrobot-v1"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of the status of gym.kurg.org, but I don't know if this is the right thing to do here, otherwise we would now need to expect a user to be running the gym locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True this would need the user to be running gym locally. Running gym_tcp_api locally was the only way I could get it to work. I couldn't find any working examples using gym.kurg.org, so I assumed it's not functional anymore. Also, the example in the gym_tcp_api directory used localhost.
@@ -16,6 +16,10 @@ | |||
using namespace mlpack; | |||
using namespace ens; | |||
|
|||
// Set up the state and action space. | |||
constexpr size_t stateDimension = 2; | |||
constexpr size_t actionSize = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these paired with specific changes in an mlpack PR, or is this necessary to ensure that these examples work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is necessary to ensure that these examples work as the new DiscreteActionEnv and ContinuousActionEnv requires template parameters. I also defined these globally so that i could use them in the Train function.
template<size_t Dimension, size_t Size, size_t RewardSize = 0>
class DiscreteActionEnv
{
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification. I definitely agree that these changes appear to be necessary, but like I mentioned somewhere else, I am not confident enough about the state of the RL code to say what the best way forward is. In the most ideal world, we could get CI to automatically test all of these examples, but I think right now the RL and notebook examples aren't built. I'll see if I can find some time to learn a little bit more so that we can merge this in, or maybe someone else will come along who knows the code better than I do. 👍
…eventing reusing the test env.
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍 |
No description provided.