Update notes
Jonas1312 committed Apr 28, 2024
1 parent 30f406f commit 9619fe5
Showing 4 changed files with 32 additions and 0 deletions.
@@ -35,11 +35,13 @@
- [Fixed or variable length?](#fixed-or-variable-length)
- [Softmax is useless](#softmax-is-useless)
- [Loss](#loss)
- [RLHF, PPO, DPO, IPO, KTO](#rlhf-ppo-dpo-ipo-kto)
- [Transformers in NLP](#transformers-in-nlp)
- [GPT](#gpt)
- [GPT2](#gpt2)
- [GPT3](#gpt3)
- [BERT](#bert)
- [Sentence Embeddings](#sentence-embeddings)
- [Transformers in computer vision](#transformers-in-computer-vision)
- [Adapting transformers to CV](#adapting-transformers-to-cv)
- [Patch embeddings and tokenization](#patch-embeddings-and-tokenization)
@@ -710,6 +712,16 @@ During training, we can use the logits directly. During inference, we can use th

The loss is usually the cross-entropy loss, but we don't want the model to become overconfident, so we can apply label smoothing.
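A sketch of how label smoothing changes the target distribution (plain Python; `probs` stands for the model's predicted probabilities at one position, and the function name is illustrative):

```python
import math

def smoothed_cross_entropy(probs, target, eps=0.1):
    # Label-smoothed target: the true class gets 1 - eps + eps/K,
    # every other class gets eps/K (K = number of classes).
    k = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1.0 - eps + eps / k) if i == target else eps / k
        loss -= q * math.log(p)
    return loss
```

With `eps=0` this reduces to standard cross-entropy; a positive `eps` keeps the model from pushing the true-class probability all the way to 1.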

### RLHF, PPO, DPO, IPO, KTO

After pre-training, we fine-tune the model to follow instructions and behave as an "instruct" or "chat" model.

RLHF (Reinforcement Learning from Human Feedback): a reward model is trained on human preference data, then the language model is optimized against it with an RL algorithm, classically PPO (Proximal Policy Optimization).

DPO (Direct Preference Optimization): a training method that removes the need for a separate reward model, which significantly simplifies the RLHF pipeline.

IPO (Identity Preference Optimization): a change to the DPO objective that is simpler and less prone to overfitting.

KTO (Kahneman-Tversky Optimization): while PPO, DPO, and IPO require pairs of accepted vs. rejected generations, KTO only needs a binary label (accepted or rejected) per example, which makes it possible to scale to much more data.
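A minimal sketch of the DPO loss for a single preference pair, assuming you already have summed token log-probabilities under the trained policy and under the frozen reference model (function and argument names are illustrative):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Implicit reward of each completion: beta * (policy logp - reference logp).
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # Maximize the probability that the chosen completion beats the rejected one:
    # loss = -log(sigmoid(margin))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

No reward model is trained here: the reference model's log-probabilities play that role, which is the simplification DPO brings over PPO-based RLHF.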

## Transformers in NLP

### GPT
@@ -773,3 +785,17 @@ That’s why BERT is a “bidirectional” transformer. A model has a better cha
The pretraining of these models usually revolves around somehow corrupting a given sentence (for instance, by masking random words in it) and tasking the model with finding or reconstructing the initial sentence.
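A toy sketch of that corruption step for masked language modeling, using token strings instead of ids (names are illustrative):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            corrupted.append(mask_token)  # hide the token from the model...
            labels.append(tok)            # ...and train it to reconstruct it
        else:
            corrupted.append(tok)
            labels.append(None)           # no loss at unmasked positions
    return corrupted, labels
```

Real BERT pretraining is slightly more involved: a selected token is usually replaced by `[MASK]`, but sometimes kept or swapped for a random token instead.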

<https://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/>

### Sentence Embeddings

Sometimes, we need a single embedding vector to represent a sentence.

There are usually three ways to do this:

1. Mean pooling: take the mean of the token embeddings in the sentence.
2. Max pooling: take the element-wise max of the token embeddings in the sentence.
3. BERT [CLS] token: take the embedding of the special [CLS] token, which is meant to represent the meaning of the whole sentence. During pretraining, the model is trained to predict from this token whether two sentences are consecutive (i.e., from the same document).
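A minimal mean-pooling sketch in plain Python, assuming per-token embedding vectors and an attention mask that is 0 on padding positions (padding should be excluded from the average):

```python
def mean_pool(token_embeddings, attention_mask):
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:  # skip padding tokens
            count += 1
            for i in range(dim):
                summed[i] += vec[i]
    return [s / count for s in summed]
```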

To compare two sentences/texts, one can compare their embeddings using cosine similarity (bi-encoder) or use a cross-encoder, which scores the pair jointly. Cross-encoders are more powerful but slower.
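For the bi-encoder route, cosine similarity is just the normalized dot product of the two sentence embeddings:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)  # 1 = same direction, 0 = orthogonal
```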

![](./cross_encoder.png)
@@ -7,3 +7,4 @@
- <https://medium.com/mcgill-artificial-intelligence-review/tutorial-setting-up-a-gpu-enabled-virtual-machine-on-microsoft-azure-f9a32fa1b536>
- <https://azure.microsoft.com/en-gb/global-infrastructure/services/>
- <https://azure.microsoft.com/en-gb/pricing/details/virtual-machines/linux/>
- <https://datalab.sspcloud.fr/catalog/ide>
@@ -27,6 +27,7 @@
- [Contravariance: `ContravariantType[SuperType, ...] <: ContravariantType[SubType, ...]`](#contravariance-contravarianttypesupertype---contravarianttypesubtype-)
- [Invariant](#invariant)
- [None vs Noreturn](#none-vs-noreturn)
- [TypeGuard, TypeIs](#typeguard-typeis)
- [Sequences](#sequences)
- [Filter Map Reduce](#filter-map-reduce)
- [Comprehension lists/dicts](#comprehension-listsdicts)
@@ -581,6 +582,10 @@ Python will always add an implicit `return None` to the end of any function. Thi

Use `NoReturn` to indicate that a function never returns normally. For example, it always raises an exception or has an infinite loop.
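A minimal example:

```python
from typing import NoReturn

def fail(message: str) -> NoReturn:
    # Never returns normally: the annotation tells the type checker
    # that code after a call to fail() is unreachable.
    raise RuntimeError(message)
```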

### TypeGuard, TypeIs

Both let you write user-defined type-narrowing functions; `TypeIs` (Python 3.13+) also narrows the type in the negative branch, unlike `TypeGuard`. Narrowing types with `TypeIs`: <https://rednafi.com/python/typeguard_vs_typeis/>

## Sequences

### Filter Map Reduce