- Add your policy to `policies.py` (see the sketch after the example below).
- Add your environment to `envs.py`.
- Run the generation module:
Usage:
generate_dataset.py [-h] [-t POLICY_TYPE] [-p POLICY_PATH] [-e ENV] [-n NUM_EPISODES] [-o OUTPUT_PATH] [--render] [--seed SEED]
Example:
python generate_dataset.py -t random -e hopper -n 1000 -o cache/hopper.pkl --render --seed 0
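As a rough illustration of the first step, a new policy added to `policies.py` could look like the sketch below. The class name and the `predict` interface are assumptions modelled on the Stable-Baselines3 convention, not the repository's actual API:

```python
class RandomPolicy:
    """Hypothetical example policy: samples uniformly from the action space."""

    def __init__(self, env):
        self.action_space = env.action_space

    def predict(self, observation, deterministic=False):
        # Ignore the observation and return a random action, mimicking the
        # (action, state) tuple returned by SB3 policies.
        return self.action_space.sample(), None
```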
gsutil -m cp -R gs://atari-replay-datasets/dqn/Breakout/ ./cache/
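After the download, one of the replay buffers can be sanity-checked with gzip and NumPy. The path below assumes the standard DQN Replay layout (`<game>/<run>/replay_logs/`); adjust it to whatever `gsutil` actually fetched:

```python
import gzip

import numpy as np

# Hypothetical path following the DQN Replay dataset layout; adjust as needed.
path = "cache/Breakout/1/replay_logs/$store$_observation_ckpt.0.gz"

with gzip.open(path, "rb") as f:
    observations = np.load(f, allow_pickle=False)

# Each checkpoint stores a batch of 84x84 grayscale frames.
print(observations.shape)
```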
python -m experiment.atari --seed 1234 --context-length 30 --epochs 5 --model-type reward_conditioned --num-steps 500000 --num-buffers 50 --game Breakout --batch-size 128
Generate the PandaReach dataset. The demonstrator needs the time-feature wrapper.
python generate_dataset.py -t reach -e panda-reach-dense -n 100000 -o datasets/panda_reach_dense_random.pkl -w time-feature
python generate_dataset.py -t reach -p demonstrators/tqcher_panda_reach_dense_tf.zip -e panda-reach-dense -n 100000 -o datasets/panda_reach_dense_expert.pkl -w time-feature
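The `-w time-feature` flag corresponds to wrapping the environment before the demonstrator interacts with it. A minimal sketch of that wrapping, assuming the `TimeFeatureWrapper` from `sb3_contrib` and the `panda_gym` environment IDs:

```python
import gymnasium as gym
import panda_gym  # registers the Panda environments
from sb3_contrib.common.wrappers import TimeFeatureWrapper

# The demonstrator was trained with a time feature appended to observations,
# so the same wrapper must be applied when generating the dataset.
env = TimeFeatureWrapper(gym.make("PandaReachDense-v3"))
```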
Generate the PandaPush dataset. The demonstrator needs the time-feature wrapper.
python generate_dataset.py -t tqc+her -p demonstrators/sb3_tqc_panda_push_sparse.zip -e panda-push-sparse -n 100000 -o datasets/panda_push_sparse_100k_expert.pkl -w time-feature
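For reference, the TQC+HER demonstrator checkpoint can be loaded and queried directly with `sb3_contrib` before generating data. This is a sketch under the assumption that it is a standard Stable-Baselines3 `.zip` checkpoint (in `panda_gym`, `PandaPush-v3` is the sparse-reward task):

```python
import gymnasium as gym
import panda_gym
from sb3_contrib import TQC
from sb3_contrib.common.wrappers import TimeFeatureWrapper

# Load the TQC(+HER) demonstrator saved by stable-baselines3 / rl_zoo3.
model = TQC.load("demonstrators/sb3_tqc_panda_push_sparse.zip")

# The checkpoint expects the same time-feature wrapper used during its training.
env = TimeFeatureWrapper(gym.make("PandaPush-v3"))
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```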
Train Decision Transformer on PandaPush.
python train_single.py -c configs/ICRA_1mln_exp_ratio_1_seed_1234/dt_panda_push_dense_tf.yaml --dataset datasets/split/panda_push_dense_1m_expert.pkl
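The `--dataset` argument points to a pickle produced by `generate_dataset.py`. A quick, non-authoritative way to inspect it before training (the exact trajectory format depends on the repository's serialization code):

```python
import pickle

with open("datasets/split/panda_push_dense_1m_expert.pkl", "rb") as f:
    data = pickle.load(f)

# Print only coarse information, since the per-trajectory layout is repo-specific.
print(type(data))
if isinstance(data, (list, tuple, dict)):
    print(f"{len(data)} top-level entries")
```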
- Train TQC on all envs.
- Generate datasets of sizes 1m, 500k, 250k, 100k, 50k, and 10k.
- Train DT on all envs and all dataset sizes.
Repeat on a different seed?
python -m rl_zoo3.train --env PandaPushDense-v3 --algo tqc --conf-file configs/tqcher_zoo.yaml --folder trained --save-freq 100000 --hyperparams n_envs:4 gradient_steps:-1
`n_envs` specifies how many environments run in parallel.
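Internally, rl_zoo3 builds a vectorized environment from that hyperparameter. A rough Stable-Baselines3 equivalent of `n_envs:4` (the wrapper and other settings from the YAML config are omitted here):

```python
import panda_gym  # registers PandaPushDense-v3
from stable_baselines3.common.env_util import make_vec_env

# Four copies of the environment collect experience in parallel,
# matching the n_envs:4 hyperparameter override passed to rl_zoo3.
vec_env = make_vec_env("PandaPushDense-v3", n_envs=4)
```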