Commit 8f2494b (1 parent: 6464c86). Showing 3 changed files with 26 additions and 12 deletions.
````diff
@@ -21,22 +21,19 @@ The black-box LM is frozen during the whole procedure.
 All required packages can be found in ``requirements.txt``.
 You can install them in a new environment with
 ```shell
-conda env create -n icl python=3.7
+conda create -n icl python=3.7
 conda activate icl
 
 git clone git@github.com:HKUNLP/icl-ceil.git
-pip install -r requirements.txt
-#[Optional] If you want to experiment on the Break dataset with the LF-EM evaluation metric, you have to clone recursively with the following commands to include third-party dependencies:
-#git clone --recurse-submodules git@github.com:HKUNLP/HKUNLP.git
 
 # The following line to be replaced depending on your cuda version.
 pip install torch==1.10.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html
 ```
 
+Optional: If you want to experiment on the Break dataset with the LF-EM evaluation metric, you have to clone recursively with the following commands to include third-party dependencies:
+```shell
+git clone --recurse-submodules git@github.com:HKUNLP/HKUNLP.git
+```
+
 Activate the environment by running
 ```shell
 conda activate icl
+cd icl-ceil
+pip install -r requirements.txt
+# if you don't want to use the OpenAI API, just comment out the `openai` package in `requirements.txt`.
 ```
 
 Setup WandB for tracking the training status for `EPR` and `CEIL` in `scripts/run_epr.sh` and `scripts/run_dpp_epr.sh`:
````
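The commit does not show which WandB settings those two scripts actually read, so the following is only a sketch of a typical WandB setup; the `WANDB_PROJECT` and `WANDB_ENTITY` values are assumptions, not taken from the repository:

```shell
# Assumed WandB bootstrap; adjust to whatever scripts/run_epr.sh and
# scripts/run_dpp_epr.sh actually expect.
pip install wandb
wandb login                      # paste the API key from https://wandb.ai/authorize
export WANDB_PROJECT=icl-ceil    # hypothetical project name
export WANDB_ENTITY=<your-wandb-username>
```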
@@ -0,0 +1,17 @@
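The torch install line in the README diff above is meant to be adapted to the local CUDA toolkit. For example, assuming a CUDA 11.1 machine, the matching wheel from the same index would be:

```shell
# Example substitution for CUDA 11.1; torch 1.10.1 also ships cu102, cu113,
# and cpu builds on this index.
pip install torch==1.10.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
```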
```python
#!/usr/bin/python3
# -*- coding: utf-8 -*-
from transformers import AutoTokenizer


def model_to_tokenizer(model_name):
    # Map a black-box LM name to a compatible tokenizer name on the
    # Hugging Face Hub: Codex-style engines ("code-*") use a community
    # Codex-like tokenizer, GPT-3 models reuse the gpt2 tokenizer, and any
    # other name is assumed to already be a Hub model id.
    if "code-" in model_name:
        return "SaulLu/codex-like-tokenizer"
    if "gpt3" in model_name:
        return "gpt2"
    return model_name


def get_tokenizer(model_name):
    # bm25 is a sparse retriever and needs no neural tokenizer, so the name
    # is returned unchanged; everything else loads a tokenizer from the Hub.
    if model_name == 'bm25':
        return model_name
    return AutoTokenizer.from_pretrained(model_to_tokenizer(model_name))
```
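A minimal usage sketch (not part of the commit; the model names are illustrative) of how the two helpers resolve names; only the last call actually downloads anything from the Hugging Face Hub:

```python
# Usage sketch; assumes model_to_tokenizer and get_tokenizer from above are in scope.
print(model_to_tokenizer("code-davinci-002"))  # -> "SaulLu/codex-like-tokenizer"
print(model_to_tokenizer("gpt3-davinci"))      # -> "gpt2"
print(get_tokenizer("bm25"))                   # -> "bm25" (sparse retriever, no tokenizer)

tokenizer = get_tokenizer("gpt2")              # loads the gpt2 tokenizer from the Hub
print(tokenizer.tokenize("in-context learning"))
```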