Added encoder-decoder support to model view. Changed order of params in head view. Added two notebooks.
jessevig committed May 5, 2021
1 parent 29ae773 commit 8b18522
Showing 7 changed files with 1,209 additions and 91 deletions.
79 changes: 72 additions & 7 deletions README.md
@@ -78,29 +78,56 @@

```
pip install bertviz
```
You must also have [Jupyter Notebook](https://jupyter.org/install) installed.

## Quickstart

First start Jupyter Notebook:

```
jupyter notebook
```

Click *New* to start a Jupyter notebook.

Add the following cell:

```
from transformers import AutoTokenizer, AutoModel
from bertviz import model_view
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
inputs = tokenizer.encode("The cat sat on the mat", return_tensors='pt')
outputs = model(inputs)
attention = outputs[-1] # Output includes attention weights when output_attentions=True
tokens = tokenizer.convert_ids_to_tokens(inputs[0])
model_view(attention, tokens)
```

And run it! It will take a few seconds to load.
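Under the hood, each weight matrix that BertViz renders is ordinary scaled dot-product attention: every token's query vector is dotted with every key vector, scaled by the square root of the hidden size, and softmax-normalized. A minimal pure-Python sketch with made-up vectors (illustrative only, not a real model's weights):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up query/key vectors for 3 tokens with hidden size 2
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
d = len(keys[0])

# One attention row per token: softmax of scaled query-key dot products
attention = []
for q in queries:
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    attention.append(softmax(scores))

# Each row is a probability distribution over the input tokens
for row in attention:
    assert abs(sum(row) - 1.0) < 1e-9
```

Each row is one token's distribution over all tokens; the head view and model view visualize exactly these per-head rows.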

## Detailed Instructions

### Self-Attention Models (BERT, GPT-2, etc.)

#### Head view / model view
First load a Huggingface model, either a pre-trained model as shown below, or your own fine-tuned model.
Be sure to set `output_attentions=True`.
```
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
```

Then prepare inputs and compute attention:

```
inputs = tokenizer.encode("The cat sat on the mat", return_tensors='pt')
outputs = model(inputs)
attention = outputs[-1] # Output includes attention weights when output_attentions=True
tokens = tokenizer.convert_ids_to_tokens(inputs[0])
```
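With `output_attentions=True`, the `attention` object above is a tuple with one entry per layer; each entry has shape (batch, heads, seq_len, seq_len). A pure-Python sketch of that structure, using zeros in place of real weights and toy dimensions (12 layers and 12 heads as in bert-base-uncased, an assumed 8-token input):

```python
num_layers, heads, seq_len = 12, 12, 8  # toy dimensions, not read from a model

def zeros(*dims):
    """Build a nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0.0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]

# One (batch, heads, seq_len, seq_len) entry per layer, like the model output
attention = tuple(zeros(1, heads, seq_len, seq_len) for _ in range(num_layers))

# attention[layer][batch][head] is a seq_len x seq_len matrix of weights
assert len(attention) == num_layers
assert len(attention[0][0]) == heads
assert len(attention[0][0][0]) == seq_len
```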

Finally, display the attention weights using the `head_view` or `model_view` function:

```
from bertviz import head_view
head_view(attention, tokens)
```

For more advanced use cases, e.g., specifying a two-sentence input to the model, please refer to the
sample notebooks.

#### Neuron view

The neuron view is invoked differently from the head view and model view because it requires access to the model's
query/key vectors, which are not returned through the Huggingface API. It is currently limited to the custom versions of BERT, GPT-2, and
RoBERTa included with the tool.

```
from bertviz.neuron_view import show

# assumes model, tokenizer, sentence_a and sentence_b were loaded beforehand
model_type = 'bert'
show(model, model_type, tokenizer, sentence_a, sentence_b, layer=2, head=0)
```
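For the two-sentence head-view use case mentioned above, `head_view` accepts a `sentence_b_start` argument marking where the second segment begins (the parameter is visible in the `head_view` signature in this commit's `head_view.py` diff). A sketch of how that index is derived; the token strings here are hypothetical, not real tokenizer output:

```python
# Hypothetical BERT-style tokens for a two-sentence input
tokens_a = ["[CLS]", "the", "cat", "sat", "[SEP]"]
tokens_b = ["on", "the", "mat", "[SEP]"]
tokens = tokens_a + tokens_b

# Index of the first token of sentence B within the combined sequence
sentence_b_start = len(tokens_a)

assert tokens[sentence_b_start] == "on"
```

The call would then look like `head_view(attention, tokens, sentence_b_start=sentence_b_start)`.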

### Encoder-Decoder Models (MarianMT, etc.)

The head view and model view both support encoder-decoder models.

First, load an encoder-decoder model:

```
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
model = AutoModel.from_pretrained("Helsinki-NLP/opus-mt-en-de")
```

Then prepare the inputs and compute attention:
```
# encode the source (encoder) input
encoder_input_ids = tokenizer("She sees the small elephant.", return_tensors="pt", add_special_tokens=True).input_ids
# encode the target (decoder) input
decoder_input_ids = tokenizer("Sie sieht den kleinen Elefanten.", return_tensors="pt", add_special_tokens=True).input_ids
outputs = model(input_ids=encoder_input_ids, decoder_input_ids=decoder_input_ids, output_attentions=True)
encoder_text = tokenizer.convert_ids_to_tokens(encoder_input_ids[0])
decoder_text = tokenizer.convert_ids_to_tokens(decoder_input_ids[0])
```

Finally, display the visualization using either `head_view` or `model_view`; both accept the same encoder-decoder keyword arguments.
```
from bertviz import model_view
model_view(
    encoder_attention=outputs.encoder_attentions,
    decoder_attention=outputs.decoder_attentions,
    cross_attention=outputs.cross_attentions,
    encoder_tokens=encoder_text,
    decoder_tokens=decoder_text
)
```
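For orientation when indexing into these outputs: each of `encoder_attentions`, `decoder_attentions`, and `cross_attentions` is a tuple with one tensor per layer, and the cross-attention tensors have shape (batch, heads, decoder_len, encoder_len). A pure-Python sketch of that bookkeeping with zeros in place of real weights (the layer/head counts are assumptions, not read from the MarianMT config):

```python
num_layers, heads = 6, 8           # assumed toy config
decoder_len, encoder_len = 9, 8    # token counts would come from the tokenizer

# One head's matrix: each decoder position attends over all encoder positions
one_layer = [
    [[0.0] * encoder_len           # attended-to encoder positions
     for _ in range(decoder_len)]  # attending decoder positions
    for _ in range(heads)
]
# Wrap in a batch dimension of 1 and repeat per layer, like the model output
cross_attentions = tuple([one_layer] for _ in range(num_layers))

# cross_attentions[layer][batch][head] is decoder_len x encoder_len
assert len(cross_attentions) == num_layers
assert len(cross_attentions[0][0][0]) == decoder_len
assert len(cross_attentions[0][0][0][0]) == encoder_len
```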

### Running a sample notebook

@@ -176,7 +242,6 @@

returned from Huggingface models). In some cases, Tensorflow checkpoints may be l
* The neuron view only supports the custom BERT, GPT-2, and RoBERTa models included with the tool. This view needs access to the query and key vectors,
which required modifying the model code (see the `transformers_neuron_view` directory); this has only been done for these three models.
Also, only one neuron view may be included per notebook.
### Attention as "explanation"
Visualizing attention weights illuminates a particular mechanism within the model architecture but does not
necessarily provide a direct *explanation* for model predictions. See [[1](https://arxiv.org/pdf/1909.11218.pdf)], [[2](https://arxiv.org/abs/1902.10186)], [[3](https://arxiv.org/pdf/1908.04626.pdf)].
Expand Down
1 change: 1 addition & 0 deletions bertviz/head_view.js
@@ -10,6 +10,7 @@
* 12/29/20 Jesse Vig Significant refactor.
* 12/31/20 Jesse Vig Support multiple visualizations in single notebook.
* 02/06/21 Jesse Vig Move require config from separate jupyter notebook step
* 05/03/21 Jesse Vig Adjust height of visualization dynamically
**/

require.config({
11 changes: 5 additions & 6 deletions bertviz/head_view.py
@@ -11,14 +11,14 @@ def head_view(
attention=None,
tokens=None,
sentence_b_start=None,
prettify_tokens=True,
layer=None,
heads=None,
encoder_attention=None,
decoder_attention=None,
cross_attention = None,
cross_attention=None,
encoder_tokens=None,
decoder_tokens=None,
prettify_tokens=True,
layer=None,
heads=None
):
"""Render head view
@@ -200,5 +200,4 @@ def head_view(
__location__ = os.path.realpath(
os.path.join(os.getcwd(), os.path.dirname(__file__)))
vis_js = open(os.path.join(__location__, 'head_view.js')).read().replace("PYTHON_PARAMS", json.dumps(params))
display(Javascript(vis_js))

display(Javascript(vis_js))
13 changes: 1 addition & 12 deletions bertviz/model_view.js
@@ -9,6 +9,7 @@
* 12/31/20 Jesse Vig Support multiple visualizations in single notebook.
* 01/19/21 Jesse Vig Support light/dark modes
* 02/06/21 Jesse Vig Move require config from separate jupyter notebook step
* 05/03/21 Jesse Vig Adjust visualization height dynamically
**/

require.config({
@@ -358,18 +359,6 @@ requirejs(['jquery', 'd3'], function($, d3) {
config.filter = e.currentTarget.value;
render();
});
// // Configure the display mode drop down
// var select = $(`#${config.rootDivId} #mode`)
// console.log('select', select)
// for(var i = 0;i < select.length;i++){
// if(select[i].value == config.mode ){
// select[i].selected = true;
// }
// }
// $(`#${config.rootDivId} #mode`).on('change', function (e) {
// config.mode = e.currentTarget.value;
// render();
// });
}

initialize();
