Commit

YouTube links added
danipeix13 committed Sep 10, 2022
1 parent 79ebe70 commit 3985c6b
Showing 3 changed files with 7 additions and 3 deletions.
3 changes: 2 additions & 1 deletion gsoc/2022/posts/daniel_peix/4-2d_DQN.md
@@ -27,6 +27,7 @@ The reward function is one of the most important aspects of a Reinforcement Learning problem.
In order to create a decent reward function, the 2D distance is used. With two thresholds, the environment can check whether the gripper is above the cube, far away from it, or in any other possible situation. Depending on which situation is taking place, the reward will be different.
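The two-threshold scheme described above might look something like the following sketch. This is a hypothetical illustration: the threshold values, reward magnitudes, and function name are assumptions, not the post's actual code.

```python
# Hypothetical sketch of a two-threshold 2D reward function.
# Threshold and reward values are illustrative assumptions.

def reward_2d(dist_2d: float,
              near_threshold: float = 0.05,
              far_threshold: float = 0.30) -> float:
    """Reward based on the 2D gripper-to-cube distance."""
    if dist_2d < near_threshold:   # gripper is above the cube
        return 1.0
    if dist_2d > far_threshold:    # gripper is far away from the cube
        return -1.0
    return -0.1                    # any other situation: small penalty
```

The small negative reward in the intermediate zone nudges the agent to keep moving toward the cube rather than idle between the two thresholds.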

## Results
-TODO: YouTube links
+Start of the training: https://youtu.be/fx7trTHZsjk
+End of the training: https://youtu.be/bgIHhoa8vrQ

__Daniel Peix del Río__
3 changes: 2 additions & 1 deletion gsoc/2022/posts/daniel_peix/5-3d_DQN.md
@@ -22,6 +22,7 @@ In this case, the size of the observation needs to change. As we are using 3 dim
This reward function is not as trivial as the two-dimensional one. In this case, there is more information available from the environment, so the reward function can be a little more elaborate. The data used in this new reward function is: the 2D distance, the 3D distance, and the gripper's 'fingers' data. Using two distance values (2D and 3D) allows us to give more weight to the 2D distance than to the 3D one, because it is crucial to first be above the cube.
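A weighted combination of the two distances, as described, could be sketched as follows. The weights, the grasp bonus, and the function signature are assumptions for illustration only, not the actual implementation.

```python
# Hypothetical sketch: weighted 2D/3D distance reward with a grasp bonus.
# Weights and thresholds are illustrative assumptions.

def reward_3d(dist_2d: float, dist_3d: float, fingers_closed: bool,
              w2d: float = 2.0, w3d: float = 1.0) -> float:
    """Penalize distance, weighting the 2D distance more heavily,
    since the gripper must first be above the cube."""
    reward = -(w2d * dist_2d + w3d * dist_3d)
    if fingers_closed and dist_3d < 0.05:  # fingers grasp near the cube
        reward += 1.0                      # illustrative grasp bonus
    return reward
```

Setting `w2d > w3d` encodes the stated priority: reducing the 2D (overhead) distance pays off more than reducing the 3D distance alone.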

## Results
-TODO: YouTube links
+Start of the training: https://youtu.be/zBbi9Xjelkg
+End of the training: https://youtu.be/T5mk46UGFe8

__Daniel Peix del Río__
4 changes: 3 additions & 1 deletion gsoc/2022/posts/daniel_peix/6-4d_DQN.md
@@ -22,6 +22,8 @@ In this case, the size of the observation needs to change. As we are now using t
This reward function is quite similar to the three-dimensional one, just adding the gripper's 'hand' info. The data used in this new reward function is: the 2D distance, the 3D distance, the gripper's 'fingers' data, and the gripper's 'hand' data.
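Extending the previous scheme with the hand term might look like this. Again a hypothetical sketch: treating the 'hand' data as an orientation error with its own weight is an assumption, not the post's actual code.

```python
# Hypothetical sketch: the 3D reward extended with a weighted 'hand' term.
# Interpreting the hand data as an orientation error is an assumption.

def reward_4d(dist_2d: float, dist_3d: float, fingers_closed: bool,
              hand_error: float,
              w2d: float = 2.0, w3d: float = 1.0, w_hand: float = 0.5) -> float:
    """Same weighted-distance penalty as the 3D case,
    plus a penalty on the gripper's 'hand' error."""
    reward = -(w2d * dist_2d + w3d * dist_3d + w_hand * hand_error)
    if fingers_closed and dist_3d < 0.05:  # fingers grasp near the cube
        reward += 1.0                      # illustrative grasp bonus
    return reward
```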

## Results
-TODO: YouTube links
+Start of the training: https://youtu.be/TjRCTKmOpRg
+Middle of the training: https://youtu.be/VOYvWodl6Ik
+End of the training: The computer froze because CoppeliaSim used all the available RAM

__Daniel Peix del Río__
