Refactor rl examples - updated README #223

Ali-Hossam · 2024-04-01T12:38:35Z

No description provided.

mlpack-bot · 2024-04-01T12:38:38Z

Thanks for opening your first pull request in this repository! Someone will review it when they have a chance. In the meantime, please be sure that you've handled the following things, to make the review process quicker and easier:

All code should follow the style guide
Documentation added for any new functionality
Tests added for any new functionality
Tests that are added follow the testing guide
Headers and license information added to the top of any new code files
HISTORY.md updated if the changes are big or user-facing
All CI checks should be passing

Thank you again for your contributions! 👍

github-actions · 2024-04-01T12:38:45Z

👈 Launch a binder notebook on branch Ali-Hossam/examples/master


rcurtin

There are a lot of changes here, and I'm not confident about the more gym-related parts, so I would prefer for someone else to review those changes too. Hopefully the comments I've left are helpful.

rcurtin · 2024-04-10T23:25:34Z

README.md

@@ -93,3 +95,9 @@ extract all the necessary dataset in order for examples to work perfectly:
 cd tools/
 ./download_data_set.py
 ```
+
+### 5. Setup
+To setup a jupyter local environment that work with C++ using xeus-cling you shall execute the following command:


But I don't think the intention was for users to run that script directly. It would be better to just use Binderhub or similar.

It's true that you could run this script, but it has a number of assumptions that may not be true for users:

Users may not be using conda.

Users may not be interested in the C++ notebook examples at all, but might be using the Makefile-built examples.

Users may not even be interested in C++ at all and may be focusing on other languages.

So I don't think that I would want to include this in the general README; users will then attempt to run the command, and may encounter problems that may not even be relevant if they're not looking to use Jupyterlab.

I think as an alternative it may be more reasonable to comment that script a little bit better. Or, if we restructured the examples in the repository to organize them by language, then perhaps in a directory specific to C++ notebook examples, it makes more sense to have this documentation.

rcurtin · 2024-04-10T23:26:04Z

reinforcement_learning_gym/acrobot_dqn/acrobot_dqn.cpp

@@ -193,8 +197,9 @@ int main()
  agent.Deterministic() = true;

  // Creating and setting up the gym environment for testing.
-  envTest.monitor.start("./dummy/", true, true);
-
+  // envTest.monitor.start("./dummy/", true, true);


Any particular reason to comment this out, or add the compression call?

The envTest environment wasn't functioning properly upon reuse. I attempted to troubleshoot by commenting out the monitor section and by adding compression, but these adjustments didn't resolve the issue. So I will just revert it to its original state.

rcurtin · 2024-04-10T23:27:34Z

reinforcement_learning_gym/acrobot_dqn/acrobot_dqn.cpp

@@ -120,7 +124,7 @@ int main()

  // Preparation for training the agent
  // Set up the gym training environment.
-  gym::Environment env("gym.kurg.org", "4040", "Acrobot-v1");
+  gym::Environment env("localhost", "4040", "Acrobot-v1");


I'm not sure of the status of gym.kurg.org, but I don't know if this is the right thing to do here; with this change, we would now need to expect a user to be running the gym locally.

True, this would require the user to run gym locally. Running gym_tcp_api locally was the only way I could get it to work. I couldn't find any working examples using gym.kurg.org, so I assumed it's no longer functional. Also, the example in the gym_tcp_api directory used localhost.
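One possible middle ground, sketched here only as an illustration, is to keep `localhost` as the default but let the user override the server address without editing the example. The environment variable names `GYM_HOST` and `GYM_PORT` and the helpers `EnvOr`, `GymHost`, and `GymPort` are hypothetical; nothing in the examples or in gym_tcp_api reads them today.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: return the value of an environment variable, or the
// given fallback when the variable is unset or empty.
inline std::string EnvOr(const char* name, const std::string& fallback)
{
  assert(name != nullptr);
  const char* value = std::getenv(name);
  if (value == nullptr || value[0] == '\0')
    return fallback;
  return std::string(value);
}

// GYM_HOST and GYM_PORT are assumed names chosen for this sketch. The
// defaults match the values used in the diff above.
inline std::string GymHost() { return EnvOr("GYM_HOST", "localhost"); }
inline std::string GymPort() { return EnvOr("GYM_PORT", "4040"); }

// In an example, the results would be passed to the constructor shown in the
// diff:
//   gym::Environment env(GymHost(), GymPort(), "Acrobot-v1");
```

With something like this, a user with a locally running gym_tcp_api server needs no configuration, while a user with a remote server could set the two variables before launching the example.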

rcurtin · 2024-04-10T23:28:00Z

reinforcement_learning_gym/mountain_car_dqn/mountain_car_dqn.cpp

@@ -16,6 +16,10 @@
 using namespace mlpack;
 using namespace ens;

+// Set up the state and action space.
+constexpr size_t stateDimension = 2;
+constexpr size_t actionSize = 3;


Are these paired with specific changes in an mlpack PR, or is this necessary to ensure that these examples work?

It is necessary to ensure that these examples work, as the new DiscreteActionEnv and ContinuousActionEnv classes require template parameters. I also defined these globally so that I could use them in the Train function.

template<size_t Dimension, size_t Size, size_t RewardSize = 0> class DiscreteActionEnv { ... }

Thanks for the clarification. I definitely agree that these changes appear to be necessary, but like I mentioned somewhere else, I am not confident enough about the state of the RL code to say what the best way forward is. In the most ideal world, we could get CI to automatically test all of these examples, but I think right now the RL and notebook examples aren't built. I'll see if I can find some time to learn a little bit more so that we can merge this in, or maybe someone else will come along who knows the code better than I do. 👍
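To illustrate why the constants have to be compile-time values, here is a minimal sketch that mirrors only the template signature quoted above. `DiscreteActionEnvSketch` is a simplified stand-in, not mlpack's class, and `MakeAction` is a hypothetical helper standing in for code such as the Train function that needs the same sizes.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in mirroring the quoted template signature. This is an
// assumption-laden sketch, not the real mlpack DiscreteActionEnv.
template<std::size_t Dimension, std::size_t Size, std::size_t RewardSize = 0>
class DiscreteActionEnvSketch
{
 public:
  struct State
  {
    static constexpr std::size_t dimension = Dimension;
  };

  struct Action
  {
    static constexpr std::size_t size = Size;
    std::size_t action = 0;
  };

  static constexpr std::size_t rewardSize = RewardSize;
};

// Values taken from the mountain_car_dqn diff. They must be constexpr so
// that they can be used as template arguments.
constexpr std::size_t stateDimension = 2;
constexpr std::size_t actionSize = 3;

using MountainCarEnv = DiscreteActionEnvSketch<stateDimension, actionSize>;

// Hypothetical helper: build an action and check it against the action
// space size carried by the environment type.
template<typename EnvironmentType>
typename EnvironmentType::Action MakeAction(const std::size_t index)
{
  assert(index < EnvironmentType::Action::size);
  typename EnvironmentType::Action a;
  a.action = index;
  return a;
}

static_assert(MountainCarEnv::State::dimension == 2,
    "state dimension comes from the template argument");
```

Because the sizes live in the type, defining `stateDimension` and `actionSize` once at file scope lets both `main()` and any helper function name the same environment type.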


mlpack-bot · 2024-05-11T13:14:32Z

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

updated README

1edb991

mlpack-bot bot added s: needs review s: unanswered s: unlabeled labels Apr 1, 2024

Ali-Hossam added 10 commits April 6, 2024 14:39

Adapted to work with new DiscreteActionEnv class

5c621a9

Adapt bidpedal_walker_sac to work with new ContinuousActionEnv class

8144891

Adapt lunar_lander_dqn to work with new DiscreteActionEnv class

bb9e8d3

Adapt cartpole_dqn.cpp to work with new DicreteActionEnv class

3bf7dd7

Adapt pendulum_dqn.cpp to work with new DiscreteActionEnv class

ad4f08c

Refactor pendulum_sac.cpp to work with the new version of ContinousActionEnv class

fc108dc

Refactor pendulum_sac.cpp to work with the new ContinuousActionEnv class

89bbe91

Replace gym env V0 with V1

a62a164

Update dqn Network

efc730b

replaced {envname}_render with env.render

be140e3

rcurtin reviewed Apr 10, 2024


Ali-Hossam changed the title ~~updated README - Jupyter setup command~~ Refactor rl examples - updated README Apr 11, 2024

removed first envTest.close and second envTest.render as they were preventing reusing the test env.

b56cdd9

rcurtin mentioned this pull request Apr 11, 2024

Refactor rl examples #225

Closed

mlpack-bot bot added the s: stale label May 11, 2024

mlpack-bot bot closed this May 18, 2024
