Towards Human-Level Safe Reinforcement Learning in Atari Library Environment

This project uses Gym Super Mario Bros as the environment and Stable Baselines, a fork of OpenAI's popular Baselines reinforcement learning library, to train the agent.

Safe behaviour is enforced through a modified reward, the simplest form of safety constraint.

Falling into a pit is treated as the unsafe (catastrophic) state.
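The modified-reward idea can be sketched as a thin wrapper that subtracts a penalty whenever an episode ends in the unsafe state. This is a minimal illustration, not the repository's actual code: the class name `PitPenaltyWrapper`, the predicate `fell_into_pit`, the `in_pit` info key, and the penalty value are all hypothetical, and `FakeEnv` is only a stand-in so the example runs without an emulator. It assumes the classic Gym `step` signature that returns `(obs, reward, done, info)`.

```python
class PitPenaltyWrapper:
    """Wraps any environment that exposes the classic Gym reset/step API."""

    def __init__(self, env, is_unsafe, penalty=50.0):
        self.env = env
        self.is_unsafe = is_unsafe  # hypothetical predicate: info dict -> bool
        self.penalty = penalty      # assumed magnitude, not taken from the source
        self.violations = 0         # number of episodes that ended unsafely

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Assumption: a pit fall ends the episode, so check only on `done`.
        if done and self.is_unsafe(info):
            reward -= self.penalty
            self.violations += 1
            info = dict(info, violation=True)
        return obs, reward, done, info


class FakeEnv:
    """Tiny scripted environment used only to exercise the wrapper."""

    def __init__(self, transitions):
        self.transitions = list(transitions)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        reward, done, info = self.transitions[self.t]
        self.t += 1
        return self.t, reward, done, info


def fell_into_pit(info):
    # Hypothetical detector: assumes the info dict carries an "in_pit" flag.
    return info.get("in_pit", False)


scripted = [(1.0, False, {}), (1.0, True, {"in_pit": True})]
env = PitPenaltyWrapper(FakeEnv(scripted), fell_into_pit, penalty=50.0)
env.reset()
_, r1, d1, _ = env.step(0)
_, r2, d2, info2 = env.step(0)
print(r1, r2, env.violations)
```

In a real setup the predicate would have to be derived from whatever the environment reports about Mario's state, and the penalty size would need tuning against the environment's native reward scale.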

Below are the results from experiments at multiple training iterations:

Iteration	Notes	Reward	Violations	Completion rate
100k	Mostly still fails	629.2	20	0%
500k	Mario shows hesitation	1001.7	19	0%
1m	Proceeds smoothly but commits a violation at the end	921.5	24	0%
5m	Agent freezes for fear of the pit	2331	18	18%
10m	Mostly wins without problems	2703.9	2	71%

Iteration	Notes	Reward	Violations	Completion rate
100k	Can avoid enemies but violates safety	679.6	24	0%
500k	Still violates safety, with occasional wins	1151.4	23	0%
1m	Farthest run recorded for the model	700.2	27	0%
5m	Completes the level but gets stuck for a while	2755.5	24	47%
10m	Mostly completes the level without problems	2637.5	18	62%

Iteration	Notes	Reward	Violations	Completion rate
100k	Good start, then mostly stuck at a pipe	295.5	0	0%
500k	Mostly fails	719.7	23	0%
1m	Still fails quickly, but the jump pattern changes slightly	165.5	0	0%
5m	Somewhat smooth but still fails	815.3	30	0%
10m	Makes progress, but not much	858.7	32	0%
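The source does not state how the reward, violation, and completion-rate figures were computed. One plausible reading, sketched here purely as an assumption, is that reward is the mean episode return, violation counts evaluation episodes that ended in a pit, and completion rate is the share of episodes that reached the end of the level. The function `summarize`, the record keys, and the sample episodes are hypothetical and do not reproduce any number from the tables.

```python
from statistics import mean


def summarize(episodes):
    """Aggregate per-episode records into the three reported metrics."""
    n = len(episodes)
    return {
        # Mean total reward per episode, rounded to one decimal place.
        "reward": round(mean(e["reward"] for e in episodes), 1),
        # Number of episodes that ended in the unsafe state.
        "violation": sum(1 for e in episodes if e["fell_into_pit"]),
        # Percentage of episodes in which the level was completed.
        "completion_rate": 100.0 * sum(1 for e in episodes if e["completed"]) / n,
    }


# Made-up evaluation records, for illustration only.
episodes = [
    {"reward": 3000.0, "fell_into_pit": False, "completed": True},
    {"reward": 1000.0, "fell_into_pit": True, "completed": False},
    {"reward": 2000.0, "fell_into_pit": False, "completed": True},
    {"reward": 2000.0, "fell_into_pit": False, "completed": False},
]
summary = summarize(episodes)
print(summary)
```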

Setup

A Python version below 3.8 is required, preferably Python 3.7.6. This research used VS Code with a virtual environment. Install the dependencies with

pip install -r requirements.txt

Start training with

python train.py

Start evaluation with

python eval.py