Commit

manual vault backup: 2024-06-01 - 1 files
Affected files:
Resources/BENCHMARKS.md
swyx committed Jun 1, 2024
1 parent 17c7623 commit bb0e3ad
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions Resources/BENCHMARKS.md
@@ -3,6 +3,7 @@ Benchmarks exist between the Data and Models, and are the least obvious/glamorou

easiest way i know to run the benchmarks yourself is https://github.com/EleutherAI/lm-evaluation-harness
- which has its own MMLU implementation, distinct from the original MMLU test and the stanford HELM impl (compared in https://huggingface.co/blog/evaluating-mmlu-leaderboard)
- and drives the Open LLM Leaderboard https://github.com/huggingface/blog/blob/main/open-llm-leaderboard-mmlu.md
openai evals is promising but doesn't have most of them implemented yet
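
a minimal sketch of what "run the benchmarks yourself" can look like with lm-evaluation-harness, assuming the package is installed and exposes its `lm_eval` command. the helper name `build_lm_eval_command`, the model `EleutherAI/pythia-160m`, and the task `hellaswag` are illustrative choices, not something these notes prescribe. the snippet only builds and prints the command so it can be inspected before anything heavy is launched:

```python
import shlex


def build_lm_eval_command(pretrained, tasks, num_fewshot=0, batch_size=8, device="cpu"):
    """Build the argument list for the harness's `lm_eval` command.

    `pretrained` is a Hugging Face model id (hypothetical example below);
    `tasks` is a list of harness task names, joined with commas for the CLI.
    """
    return [
        "lm_eval",
        "--model", "hf",                              # Hugging Face transformers backend
        "--model_args", f"pretrained={pretrained}",
        "--tasks", ",".join(tasks),
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),
        "--device", device,
    ]


# Illustrative model and task, chosen only to keep the example small.
cmd = build_lm_eval_command("EleutherAI/pythia-160m", ["hellaswag"])
print(shlex.join(cmd))
```

paste the printed command into a shell, or hand `cmd` to `subprocess.run(cmd, check=True)`, to actually start the evaluation. note that this downloads the model and the dataset on first use.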

