vault backup: 2024-06-10 - 1 files
Affected files:
Monthly Notes/Jun 2024 notes.md
swyx committed Jun 10, 2024
1 parent bb0e3ad commit 9ccbc01
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions Monthly Notes/Jun 2024 notes.md
@@ -0,0 +1,15 @@

## notable reads

- [A Picture is Worth 170 Tokens: How Does GPT-4o Encode Images?](https://www.oranlooney.com/post/gpt-cnn/)
- sigma-gpt https://news.ycombinator.com/item?id=40608413
  - The authors randomly permute (i.e., shuffle) input tokens in training and add two positional encodings to each token: one with the token's position and another with the position of the token to be predicted. Otherwise, the model is a standard autoregressive GPT. The consequences of this seemingly "simple" modification are significant:
    - The authors can prompt the trained model with part of a sequence and then decode the missing tokens, all at once, in parallel, regardless of order -- i.e., the model can in-fill in parallel.
    - The authors can compute conditional probability densities for every missing token in a sequence, again in parallel, i.e., densities for all missing tokens at once.
    - The authors propose a rejection-sampling method for generating in-fill tokens, again in parallel. Their method seems to work well in practice.
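
A rough sketch of the shuffled-input idea described above, assuming plain Python and a toy word-level "tokenizer". The helper name `shuffled_training_example` is hypothetical and not from the paper, and the sketch simplifies by skipping any start token: it only shows how each input step could carry both its own position and the position of the token to be predicted.

```python
import random

def shuffled_training_example(tokens, rng):
    """Build one shuffled example: (token, own position, target position) per step.

    Hypothetical helper for illustration only, not the authors' code.
    """
    order = list(range(len(tokens)))
    rng.shuffle(order)  # random generation order over the sequence positions
    inputs, targets = [], []
    # Step i feeds the token at order[i] and asks for the token at order[i + 1].
    for cur, nxt in zip(order, order[1:]):
        inputs.append((tokens[cur], cur, nxt))
        targets.append(tokens[nxt])
    return order, inputs, targets

rng = random.Random(0)
tokens = ["the", "cat", "sat", "down"]
order, inputs, targets = shuffled_training_example(tokens, rng)
for (tok, pos, next_pos), tgt in zip(inputs, targets):
    print(f"input={tok!r} at {pos}, predict position {next_pos} -> {tgt!r}")
```

In a real model the two position indices would presumably each be mapped to a positional encoding and combined with the token embedding; that part is left out here.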

## discussions


- forcing AI onto us
  - msft recall default https://news.ycombinator.com/item?id=40610435
