This repository contains the code for the paper "Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations", accepted at NeurIPS 2021.
Create the environment
conda create -n safe -y python=3.9
pip install -r requirements.txt
pip install -r lunzi/requirements.txt
If you don't use wandb, run wandb offline before running any of the commands below.
Suppose we want to run CRABS on the Swing task. Intermediate checkpoints are also provided.
There are three steps to get CRABS working:
- Run
export PYTHONPATH=$PYTHONPATH:.
- Train an initial barrier certificate. The barrier certificate can be found at /tmp/crabs/pretrain/wandb/run-xxx/ckpt.pt.
python run/main.py --root_dir /tmp/crabs/pretrain/ -c ./configs/train_h.json5
- Train a policy iteratively. The policies can be found at /tmp/crabs/iterative/wandb/run-xxx/ckpt.pt and the log file can be found at /tmp/crabs/iterative/wandb/run-xxx/run-xxx.wandb.
python run/main.py --root_dir /tmp/crabs/iterative/ -c ./configs/train_policy.json5
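Because the run-xxx part of each path changes on every run, it can be handy to look up the newest checkpoint programmatically. The following is a minimal sketch, not part of the repository: the helper name latest_checkpoint is hypothetical, and it assumes checkpoints sit directly inside a wandb run directory, as in the paths above. The glob pattern also matches directories prefixed with offline-, which is an assumption about how wandb names runs in offline mode.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(root_dir: str, name: str = "ckpt.pt") -> Optional[Path]:
    """Return the most recently modified <root_dir>/wandb/*run-*/<name>.

    Hypothetical helper (not provided by this repository). Returns None
    when no matching checkpoint file exists under root_dir.
    """
    # "*run-*" is meant to cover both "run-xxx" and "offline-run-xxx" (assumed).
    candidates = list(Path(root_dir).glob(f"wandb/*run-*/{name}"))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Example usage with the directories from the steps above:
#   latest_checkpoint("/tmp/crabs/pretrain/")
#   latest_checkpoint("/tmp/crabs/iterative/")
```

The function only inspects file modification times, so it does not need wandb or PyTorch to be installed.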
To read the log file and generate the plot, please check run/read_log.py.