Batch, Off-policy and Model-free Apprenticeship Learning
(Projection method [Abbeel and Ng, 2004] + LSPI + LSTD-mu)
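As a rough illustration of the projection step named above, the sketch below follows the update described in Abbeel and Ng (2004). It is not taken from this repository: `projection_step` and its argument names are hypothetical, and it assumes the feature expectations (for example as estimated by LSTD-mu) are already available as NumPy vectors.

```python
import numpy as np


def projection_step(mu_expert, mu_bar_prev, mu_new):
    """One projection-method update (hypothetical helper, not the repo's code).

    mu_expert   : expert feature expectations
    mu_bar_prev : previous projected point (mu_bar^(i-2) in the paper)
    mu_new      : feature expectations of the latest policy (mu^(i-1))
    Returns the new projected point, the reward weights w, and the margin t.
    """
    direction = mu_new - mu_bar_prev
    denom = direction @ direction
    if denom == 0.0:
        # Latest policy adds no new direction; keep the previous point.
        mu_bar = mu_bar_prev.copy()
    else:
        # Orthogonal projection of mu_expert onto the line through
        # mu_bar_prev and mu_new.
        step = (direction @ (mu_expert - mu_bar_prev)) / denom
        mu_bar = mu_bar_prev + step * direction
    w = mu_expert - mu_bar      # weights of the linear reward r(s) = w . phi(s)
    t = np.linalg.norm(w)       # distance to the expert; stop when t is small
    return mu_bar, w, t
```

In a full loop, `w` would be handed to the policy solver (LSPI here) to obtain the next policy, whose feature expectations then feed the next call.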
- Python 3.6
- TensorFlow 1.5.0
- gym (OpenAI Gym)
- NumPy
python3 bomap_main.py
(default)
python3 bomap_{}_main.py
(under construction)
(under construction)
Deep Action Network (DAN): deep basis-function features instead of simple basis functions
IRL_DAN + Deep Reward Network (DRN): used for IRL instead of the projection method