Releases: embeddings-benchmark/mteb

1.30.0

25 Jan 04:05

1.30.0 (2025-01-25)

Feature

  • feat: Integrating ChemTEB (#1708)

  • Add SMILES, AI Paraphrase and Inter-Source Paragraphs PairClassification Tasks

  • Add chemical subsets of NQ and HotpotQA datasets as Retrieval tasks

  • Add PubChem Synonyms PairClassification task

  • Update task init for previously added tasks

  • Add nomic-bert loader

  • Add a script to run the evaluation pipeline for chemical-related tasks

  • Add 15 Wikipedia article classification tasks

  • Add PairClassification and BitextMining tasks for Coconut SMILES

  • Fix naming of some Classification and PairClassification tasks

  • Fix some classification tasks naming issues

  • Integrate WANDB with benchmarking script

  • Update .gitignore

  • Fix nomic_models.py issue with retrieval tasks, similar to issue #1115 in original repo

  • Add one chemical model and some SentenceTransformer models

  • Fix a naming issue for SentenceTransformer models

  • Add OpenAI, bge-m3 and matscibert models

  • Add PubChem SMILES Bitext Mining tasks

  • Change metric names to be more descriptive

  • Add English e5 and bge v1 models, all the sizes

  • Add two Wikipedia Clustering tasks

  • Add a try-except in evaluation script to skip faulty models during the benchmark.
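The skip-on-failure guard described above can be sketched roughly as follows. This is a hypothetical illustration, not the repository's actual script: `run_benchmark` and `evaluate` are placeholder names, and the real error handling may differ.

```python
def run_benchmark(model_names, evaluate):
    """Evaluate each model, skipping any whose evaluation raises an error.

    `evaluate` stands in for whatever function runs a single model; it is an
    assumption for illustration, not part of the mteb API.
    """
    results, skipped = {}, []
    for name in model_names:
        try:
            results[name] = evaluate(name)
        except Exception as exc:
            # A single faulty model should not abort the whole benchmark run.
            print(f"Skipping {name}: {exc}")
            skipped.append(name)
    return results, skipped
```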

  • Add bge v1.5 models and clustering score extraction to json parser

  • Add Amazon Titan embedding models

  • Add Cohere Bedrock models

  • Add two SDS Classification tasks

  • Add SDS Classification tasks to classification init and chem_eval

  • Add a retrieval dataset, update dataset names and revisions

  • Update revision for the CoconutRetrieval dataset: handle duplicate SMILES (documents)

  • Update CoconutSMILES2FormulaPC task

  • Change CoconutRetrieval dataset to a smaller one

  • Update some models

  • Integrate models added in ChemTEB (such as amazon, cohere bedrock and nomic bert) with latest modeling format in mteb.
  • Update the metadata for the mentioned models
  • Fix a typo
    the open_weights argument was repeated

  • Update ChemTEB tasks

  • Rename some tasks for better readability.
  • Merge some BitextMining and PairClassification tasks into a single task with subsets (PubChemSMILESBitextMining and PubChemSMILESPC)
  • Add a new multilingual task (PubChemWikiPairClassification) consisting of 12 languages.
  • Update dataset paths, revisions and metadata for most tasks.
  • Add a Chemistry domain to TaskMetadata
  • Remove unnecessary files and tasks for MTEB

  • Update some ChemTEB tasks

  • Move PubChemSMILESBitextMining to eng folder
  • Add citations for tasks involving SDS, NQ, Hotpot, PubChem data
  • Update Clustering tasks category
  • Change main_score for PubChemAISentenceParaphrasePC
  • Create ChemTEB benchmark

  • Remove CoconutRetrieval

  • Update tasks and benchmarks tables with ChemTEB

  • Mention ChemTEB in readme

  • Fix some issues, update task metadata, lint

  • eval_langs fixed
  • Dataset path was fixed for two datasets
  • Metadata was completed for all tasks, mainly following fields: date, task_subtypes, dialect, sample_creation
  • ruff lint
  • rename nomic_bert_models.py to nomic_bert_model.py and update it.
  • Remove nomic_bert_model.py as it is now compatible with SentenceTransformer.

  • Remove WikipediaAIParagraphsParaphrasePC task due to being trivial.

  • Merge amazon_models and cohere_bedrock_models.py into bedrock_models.py

  • Remove unnecessary load_data for some tasks.

  • Update bedrock_models.py, openai_models.py and two dataset revisions

  • Text should be truncated for amazon text embedding models.
  • text-embedding-ada-002 returns null embeddings for some inputs with 8192 tokens.
  • Two datasets are updated, dropping very long samples (len > 99th percentile)
  • Add a layer of dynamic truncation for amazon models in bedrock_models.py
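The dynamic truncation added for the Amazon models can be sketched as below. This is a minimal sketch under stated assumptions: `count_tokens` is a stand-in for the provider's tokenizer, and the actual logic in bedrock_models.py may differ.

```python
def truncate_dynamically(text, count_tokens, max_tokens=8192, shrink=0.9):
    """Repeatedly shorten `text` until it fits the model's token limit.

    Shrinking by a fixed ratio avoids guessing an exact characters-per-token
    rate, at the cost of a few extra tokenizer calls for very long inputs.
    """
    while text and count_tokens(text) > max_tokens:
        text = text[: int(len(text) * shrink)]
    return text
```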

  • Replace metadata_dict with self.metadata in PubChemSMILESPC.py

  • fix model meta for bedrock models

  • Add reference comment to original Cohere API implementation (4d66434)

Unknown

1.29.16

22 Jan 12:11

1.29.16 (2025-01-22)

Fix

  • fix: Added correct training data annotation to LENS (#1859)

Added correct training data annotation to LENS (e775436)

1.29.15

22 Jan 11:50

1.29.15 (2025-01-22)

Fix

  • fix: Adding missing model meta (#1856)

  • Added CDE models

  • Added bge-en-icl

  • Updated CDE to bge_full_data

  • Fixed public_training_data flag type to include boolean, as this is how all models are annotated

  • Added public training data link instead of bool to CDE and BGE

  • Added GME models

  • Changed Torch to PyTorch

  • Added metadata on LENS models

  • Added ember_v1

  • Added metadata for amazon titan

  • Removed GME implementation (692bd26)
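The public_training_data change above amounts to widening the type of one metadata field so it can hold a link (str), a bare flag (bool), or nothing. A simplified sketch, assuming a dataclass stand-in; the real mteb ModelMeta has many more fields:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ModelMetaSketch:
    """Illustrative stand-in for mteb's ModelMeta (not the real class).

    public_training_data accepts a URL to the training data (str), a plain
    availability flag (bool), or None when the information is unknown.
    """

    name: str
    public_training_data: Union[str, bool, None] = None
```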

1.29.14

22 Jan 09:41

1.29.14 (2025-01-22)

Fix

  • fix: Fix zeta alpha mistral (#1736)

  • fix zeta alpha mistral

  • update use_instructions

  • update training datasets

  • Update mteb/models/e5_instruct.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

  • update float

  • Update mteb/models/e5_instruct.py


Co-authored-by: Kenneth Enevoldsen <[email protected]> (4985da9)

  • fix: Hotfixed public_training_data type annotation (#1857)

Fixed public_training_data flag type to include boolean, as this is how all models are annotated (4bd7328)

Unknown

  • Add more annotations (#1833)

  • apply additions from #1794

  • add annotations for rumodels

  • add nomic training data

  • fix metadata

  • update rest of model meta

  • fix bge reranker (12ed9c5)

1.29.13

22 Jan 07:12

1.29.13 (2025-01-22)

Fix

  • fix: Fixed leaderboard search bar (#1852)

Fixed leaderboard search bar (fe33061)

1.29.12

21 Jan 11:37

1.29.12 (2025-01-21)

Fix

  • fix: Leaderboard Refinements (#1849)

  • Added better descriptions to benchmarks and removed beta tags

  • Fixed zero-shot filtering on app loading

  • Added zero-shot definition in an accordion

  • NaN values are now filled with blank

  • Added type hints to filter_models (a8cc887)

1.29.11

21 Jan 10:54

1.29.11 (2025-01-21)

Fix

  • fix: Add reported annotation and re-added public_training_data (#1846)

  • fix: Add additional dataset annotations

  • fix: readded public training data

  • update voyage annotations (a7a8144)

1.29.10

20 Jan 06:08

1.29.10 (2025-01-20)

Fix

  • fix: Remove default params, public_training_data and memory usage in ModelMeta (#1794)

  • fix: Leaderboard: K instead of M
    Fixes #1752

  • format

  • fixed existing annotations to refer to task name instead of hf dataset

  • added annotation to nvidia

  • added voyage

  • added uae annotations

  • Added stella annotations

  • sentence trf models

  • added salesforce and e5

  • jina

  • bge + model2vec

  • added llm2vec annotations

  • add jasper

  • format

  • format

  • Updated annotations and moved jina models

  • make model parameters required to be filled

  • fix tests

  • remove comments

  • remove model meta from test

  • fix model meta from split

  • fix: add even more training dataset annotations (#1793)

  • fix: update max tokens for OpenAI (#1772)
    update max tokens

  • ci: skip AfriSentiLID for now (#1785)

  • skip AfriSentiLID for now

  • skip relevant test case instead


Co-authored-by: Isaac Chung <[email protected]>

  • 1.28.7
    Automatically generated by python-semantic-release
  • ci: fix model loading test (#1775)
  • pass base branch into the make command as an arg
  • test a file that has custom wrapper
  • what about overview
  • just dont check overview
  • revert instance check
  • explicitly omit overview and init
  • remove test change
  • try on a lot of models
  • revert test model file

Co-authored-by: Isaac Chung <[email protected]>

  • feat: Update task filtering, fixing bug which included cross-lingual tasks in overly many benchmarks (#1787)
  • feat: Update task filtering, fixing bug on MTEB
  • Updated task filtering adding exclusive_language_filter and hf_subset
  • fix bug in MTEB where cross-lingual splits were included
  • added missing language filtering to MTEB(europe, beta) and MTEB(indic, beta)
    The following code outlines the problems:
import mteb
from mteb.benchmarks import MTEB_ENG_CLASSIC
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
# was eq. to:
task = mteb.get_task("STS22", languages=["eng"])
task.hf_subsets
# correct filtering to English datasets:
# ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']
# However it should be:
# ['en']
# with the changes it is:
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
task.hf_subsets
# ['en']
# eq. to
task = mteb.get_task("STS22", hf_subsets=["en"])
# which you can also obtain using the exclusive_language_filter (though not if there were multiple English splits):
task = mteb.get_task("STS22", languages=["eng"], exclusive_language_filter=True)
  • format
  • remove "en-ext" from AmazonCounterfactualClassification
  • fixed mteb(deu)
  • fix: simplify in a few areas
  • fix: Add gritlm
  • 1.29.0
    Automatically generated by python-semantic-release
  • fix: Added more annotations!
  • fix: Added C-MTEB (#1786)
    Added C-MTEB
  • 1.29.1
    Automatically generated by python-semantic-release
  • docs: Add contact to MMTEB benchmarks (#1796)
  • Add myself to MMTEB benchmarks
  • lint
  • fix: loading pre 11 (#1798)
  • fix loading pre 11
  • add similarity
  • lint
  • run all task types
  • 1.29.2
    Automatically generated by python-semantic-release
  • fix: allow to load no revision available (#1801)
  • fix allow to load no revision available
  • lint
  • add require_model_meta to leaderboard
  • lint
  • 1.29.3
    Automatically generated by python-semantic-release

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Márton Kardos <[email protected]>

  • fix merges
  • update models info
  • change public_training_code to str
  • change public_training_code=False to None
  • remove annotations
  • remove annotations
  • remove changed annotations
  • remove changed annotations
  • remove public_training_data and memory usage
  • make framework not optional
  • make framework non-optional
  • empty frameworks
  • add framework
  • fix tests
  • Update mteb/models/overview.py
    Co-authored-by: Isaac Chung <[email protected]>

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Márton Kardos <[email protected]> (0a83e38)

  • fix: subsets to run (#1830)

  • fix split evals

  • add test

  • lint

  • fix moka

  • add assert (8be6b2e)

1.29.9

17 Jan 15:09

1.29.9 (2025-01-17)

Fix

  • fix: Fixed eval split for MultilingualSentiment in C-MTEB (#1804)

  • Fixed eval split for MultilingualSentiment in C-MTEB

  • Fixed splits for atec, bq and stsb in C-MTEB (96f639b)

1.29.8

17 Jan 14:04

1.29.8 (2025-01-17)

Fix

  • fix: Added Misc Chinese models (#1819)

  • Added moka and piccolo models to overview file

  • Added Text2Vec models

  • Added various Chinese embedding models


Co-authored-by: Isaac Chung <[email protected]> (9823529)

  • fix: Added way more training dataset annotations (#1765)

  • fix: Leaderboard: K instead of M
    Fixes #1752

  • format

  • fixed existing annotations to refer to task name instead of hf dataset

  • added annotation to nvidia

  • added voyage

  • added uae annotations

  • Added stella annotations

  • sentence trf models

  • added salesforce and e5

  • jina

  • bge + model2vec

  • added llm2vec annotations

  • add jasper

  • format

  • format

  • Updated annotations and moved jina models

  • fix: add even more training dataset annotations (#1793)

  • fix: update max tokens for OpenAI (#1772)

update max tokens

  • ci: skip AfriSentiLID for now (#1785)

  • skip AfriSentiLID for now

  • skip relevant test case instead


Co-authored-by: Isaac Chung <[email protected]>

  • 1.28.7

Automatically generated by python-semantic-release

  • ci: fix model loading test (#1775)

  • pass base branch into the make command as an arg

  • test a file that has custom wrapper

  • what about overview

  • just dont check overview

  • revert instance check

  • explicitly omit overview and init

  • remove test change

  • try on a lot of models

  • revert test model file


Co-authored-by: Isaac Chung <[email protected]>

  • feat: Update task filtering, fixing bug which included cross-lingual tasks in overly many benchmarks (#1787)

  • feat: Update task filtering, fixing bug on MTEB

  • Updated task filtering adding exclusive_language_filter and hf_subset
  • fix bug in MTEB where cross-lingual splits were included
  • added missing language filtering to MTEB(europe, beta) and MTEB(indic, beta)

The following code outlines the problems:

import mteb
from mteb.benchmarks import MTEB_ENG_CLASSIC

task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
# was eq. to:
task = mteb.get_task("STS22", languages=["eng"])
task.hf_subsets
# correct filtering to English datasets:
# ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']
# However it should be:
# ['en']

# with the changes it is:
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
task.hf_subsets
# ['en']
# eq. to
task = mteb.get_task("STS22", hf_subsets=["en"])
# which you can also obtain using the exclusive_language_filter (though not if there were multiple English splits):
task = mteb.get_task("STS22", languages=["eng"], exclusive_language_filter=True)
  • format

  • remove "en-ext" from AmazonCounterfactualClassification

  • fixed mteb(deu)

  • fix: simplify in a few areas

  • fix: Add gritlm

  • 1.29.0

Automatically generated by python-semantic-release

  • fix: Added more annotations!

  • fix: Added C-MTEB (#1786)

Added C-MTEB

  • 1.29.1

Automatically generated by python-semantic-release

  • docs: Add contact to MMTEB benchmarks (#1796)

  • Add myself to MMTEB benchmarks

  • lint

  • fix: loading pre 11 (#1798)

  • fix loading pre 11

  • add similarity

  • lint

  • run all task types

  • 1.29.2

Automatically generated by python-semantic-release

  • fix: allow to load no revision available (#1801)

  • fix allow to load no revision available

  • lint

  • add require_model_meta to leaderboard

  • lint

  • 1.29.3

Automatically generated by python-semantic-release


Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Márton Kardos <[email protected]>


Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Márton Kardos <[email protected]> (3b2d074)

Co-authored-by: sam021313 <[email protected]> (96420a2)

  • fix: Added Chinese Stella models (#1824)

Added Chinese Stella models (74b495c)