- There are two models:
  - A state constructor, which reconstructs the environment state from video
  - A control model, which generates actions from the environment state
- We train a custom CNN (convolutional neural network) + LSTM (long short-term memory) hybrid model to reconstruct the environment state from the states encountered during control-model training.
- The environment state can be read directly from a simulated environment, but it can't be collected from a physical one, which is what makes the state constructor important.
- The LSTM retains past information, letting the model remember where the cubes and the other robots are.
- The CNN extracts numeric features from image inputs, so each frame captured by the camera can be fed to the LSTM.
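The state-constructor pipeline above can be sketched end to end: a CNN turns each frame into a feature vector, and an LSTM accumulates those features over time into a state estimate. This is a minimal numpy sketch under assumed dimensions (8x8 grayscale frames, a single 3x3 kernel, a 16-dimensional state); the real model's architecture, sizes, and weights would differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_relu(frame, kernel):
    """Valid 2D convolution followed by ReLU (a stand-in for the full CNN stack)."""
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step: gates computed from input features x and hidden state h."""
    z = W @ x + U @ h + b               # pre-activations for all four gates
    i, f, g, o = np.split(z, 4)         # input, forget, candidate, output
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c + i * g                   # cell state: the LSTM's long-term memory
    h = o * np.tanh(c)                  # hidden state: the running state estimate
    return h, c

# Hypothetical sizes: 8x8 frames with a 3x3 kernel give 6x6 = 36 features.
FEAT, HID = 36, 16
kernel = rng.normal(size=(3, 3))
W = rng.normal(scale=0.1, size=(4 * HID, FEAT))
U = rng.normal(scale=0.1, size=(4 * HID, HID))
b = np.zeros(4 * HID)

h = np.zeros(HID)
c = np.zeros(HID)
for _ in range(5):                                # five camera frames in sequence
    frame = rng.normal(size=(8, 8))               # stand-in for a captured frame
    feats = conv2d_relu(frame, kernel).ravel()    # CNN: image -> feature vector
    h, c = lstm_step(feats, h, c, W, U, b)        # LSTM: accumulate over time

state_estimate = h   # the reconstructed environment state after the sequence
```

The cell state `c` is what lets the model carry information (e.g. last seen cube positions) across frames even when a later frame doesn't show them.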
- We train a control model based on SimBa1 using PPO (Proximal Policy Optimization)
- This uses a custom environment and reward function
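Setting the SimBa1 backbone and the custom environment aside, the core of the PPO update is the clipped surrogate objective, which caps how far the probability ratio between the new and old policies can move the update. A minimal sketch with made-up numbers (the function name and the batch values are illustrative, not from the actual training code):

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate: mean of min(r * A, clip(r, 1-eps, 1+eps) * A),
    where r = pi_new(a|s) / pi_old(a|s)."""
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))

# Made-up batch: log-probs of the taken actions under the new and old
# policies, plus advantage estimates from the custom reward function.
logp_old = np.log(np.array([0.2, 0.5, 0.1]))
logp_new = np.log(np.array([0.3, 0.4, 0.3]))   # policy moved a lot on sample 3
adv = np.array([1.0, -0.5, 2.0])

obj = ppo_clip_objective(logp_new, logp_old, adv)
```

Sample 3 has a ratio of 3.0, but the clip to [0.8, 1.2] caps its contribution at 1.2 * A, which is what keeps any single policy update from being too large.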
TODO:
- Make the environment (see https://github.com/Unity-Technologies/ml-agents/blob/develop/docs/Learning-Environment-Create-New.md)
- Verify that the code makes sense