Commit

rename data to match stimuli convention
jennhu committed Oct 2, 2019
1 parent edef395 commit e15a98d
Showing 164 changed files with 8 additions and 641,381 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -7,6 +7,8 @@ get_data_df.py
template/
*.DS_Store
run_experiment.sh
+move
+data/unused

stimuli/unused
stimuli/old/
11 changes: 6 additions & 5 deletions README.md
@@ -25,13 +25,12 @@ If you use any of our code, data, or analyses, please cite the paper using the b
## Stimuli

For each experiment, a `.csv` file containing the stimuli can be found at
-`stimuli/<EXPERIMENT>.csv`, where `<EXPERIMENT>` corresponds to **TODO**.
-The file is structured as follows:
+`stimuli/<EXPERIMENT>/<PRONOUN>.csv`. The file is structured as follows:

**SHERRY TODO: explain how stimuli file is structured**

To extract the sentences from this file, use the script
-`extract_sentences.py`. You can toggle flags like `--uncased` and `--eos`
+`stimuli/extract_sentences.py`. You can toggle flags like `--uncased` and `--eos`
depending on the requirements of your model. **Please note that the final period
at the end of each sentence is separated by whitespace.** Otherwise, no
tokenization assumptions are made.
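
Reviewer note: the README text above says the final period of each sentence is separated by whitespace. For models that expect the period attached to the last word, a small post-processing step may be needed. The following is a minimal sketch under that assumption; `attach_final_period` is a hypothetical helper, not part of this repository, and the example sentence is made up for illustration.

```python
def attach_final_period(sentence: str) -> str:
    """Join a whitespace-separated final period back onto the last word.

    Hypothetical helper (not part of the repository). Sentences that do not
    end in " ." are returned unchanged.
    """
    stripped = sentence.rstrip()
    if stripped.endswith(" ."):
        # Drop the trailing " ." and any extra spaces, then re-attach the period.
        return stripped[:-2].rstrip() + "."
    return sentence


# Made-up example sentence, for illustration only.
print(attach_final_period("the author hurt himself ."))
```

Whether this step is appropriate depends on the tokenizer of the model being evaluated; the stimuli themselves are left untouched.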
@@ -50,10 +49,12 @@ Penn Treebank. This is not the case for the materials used in Experiment 1, the

## Data
The per-token surprisal values for each model can be found in the [data](data)
-folder, following the following naming convention:
+folder, following this naming convention:
```
-data/<MODEL>/<EXPERIMENT>_surprisal_<MODEL>.txt
+data/<MODEL>/<EXPERIMENT>/<PRONOUN>_<MODEL>.txt
```
+The BERT data is in a slightly different `.csv` format, but otherwise
+follows the same naming convention.

## Dependencies
Our analysis code requires a basic scientific installation of Python
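
Reviewer note: after this commit, the stimuli and data paths share the same `<EXPERIMENT>/<PRONOUN>` layout, so both can be derived from the same three values. The sketch below illustrates the new convention only. The function names, the `root` and `ext` parameters, and the placeholder values in the usage example are assumptions introduced here, not names from the repository. In particular, the README does not state the BERT file extension explicitly, so `ext` is left configurable.

```python
from pathlib import Path


def stimuli_path(experiment: str, pronoun: str, root: str = ".") -> Path:
    """Build stimuli/<EXPERIMENT>/<PRONOUN>.csv (hypothetical helper)."""
    return Path(root) / "stimuli" / experiment / f"{pronoun}.csv"


def surprisal_path(model: str, experiment: str, pronoun: str,
                   root: str = ".", ext: str = "txt") -> Path:
    """Build data/<MODEL>/<EXPERIMENT>/<PRONOUN>_<MODEL>.<ext> (hypothetical helper).

    `ext` defaults to "txt" as in the README; passing another extension for
    BERT is an assumption, not something the README confirms.
    """
    return Path(root) / "data" / model / experiment / f"{pronoun}_{model}.{ext}"


# Placeholder values, for illustration only.
print(stimuli_path("EXPERIMENT", "PRONOUN").as_posix())
print(surprisal_path("MODEL", "EXPERIMENT", "PRONOUN").as_posix())
```

Keeping the two layouts parallel means a single `(experiment, pronoun)` pair is enough to locate both a stimuli file and the matching surprisal file for any model.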
File renamed without changes.
