Releases: embeddings-benchmark/mteb
1.4.0
1.3.4
1.3.3
1.3.3 (2024-03-31)
Documentation
- docs: Added information related to the automatic release (#290)
  - docs: added information related to the automatic release
  - docs: removed test-parallel from docs
  - docs: minor additions to contributing guidelines
  - ci: removed changelog
    As it is already present in the git releases
  - Apply suggestions from code review
  Co-authored-by: Niklas Muennighoff (6821d23)
Fix
1.3.2
1.3.1
v0.10.0
v0.10.0 (2024-03-26)
Ci
- ci: renamed test job and workflow (#282)
  ci: Added tests (6675bb8)
Documentation
- docs: add dataset schemas (#255)
  - docs: update AbsTaskClassification.py document schema for classification task
  - update AbsTaskBitextMining.py
  - update BornholmskBitextMining.py
  - update AbsTaskClustering.py and BlurbsClusteringP2P.py
  - update 8 files
  - update 9 files
  - update AbsTaskReranking.py
  - update BlurbsClusteringP2P.py
  - update CMTEBPairClassification.py
  - update GerDaLIRRetrieval.py
  - update 7 files
  - update AbsTaskBitextMining.py
  - update AbsTaskClassification.py (c3ce1ac)
- docs: Add development installation instructions (#246)
  - docs: Add development installation instructions
  - removed unused requirements file
    I don't believe this is necessary with the setup.py specifying the same dependencies
  - docs: Updated make file with new dependencies
  - ci: Update ci to use make commands
    This ensures that the user runs exactly what the CI expects
  - ci: Avoid specifying tests folder as it causes issues with tests
  - ci: removed unnecessary args for test ci
  - Added dev install (0048878)
Feature
- feat: update revision id of wikicitiesclustering task (fb90c02)
Fix
- fix: dead link in readme (ecbb776)
- fix: Added sizes to the metadata (#276)
  - restructuring the readme
  - added mmteb
  - removed unnecessary method
  - Added docstring to metadata
  - Updated outdated examples
  - formatting documents
  - fix: Updated form to be parsed correctly
  - fix: Added sizes to the metadata
    this allows for automatic metadata generation
  - Updated based on feedback
  - Apply suggestions from code review
    Co-authored-by: Niklas Muennighoff
  - updated based on feedback
  - Added suggestion from review
  - added correction based on review
  - reformatted empty fields to None
  Co-authored-by: Niklas Muennighoff (cd4a012)
- fix: remove debugging print statement (d292d93)
- fix: pass parallel_retrieval kwarg to use DenseRetrievalParallelExactSearch (19b8f66)
- fix: msmarco-v2 uses dev.tsv, not dev1.tsv (6908d21)
Refactor
- refactor: add metadata basemodel (#260)
  - refactor: rename description to metadata dict
  - refactor: add TaskMetadata and first example
  - update 9 files
  - update TaskMetadata.py
  - update TaskMetadata.py
  - update TaskMetadata.py
  - update LICENSE, TaskMetadata.py and requirements.dev.txt
  - update 151 files
  - update 150 files
  - update 43 files and delete 1 file
  - update 106 files
  - update 45 files
  - update 6 files
  - update 14 files
  - Added model results to repo and updated CLI to create consistent folder structure. (#254)
  - Added model results to repo and updated CLI to create consistent folder structure.
  - ci: updated ci to use make install
  - Added missing pytest dependencies
  - Update README.md
    Co-authored-by: Niklas Muennighoff
  - Restructuring the readme (#262)
  - restructuring the readme
  - removed double specification of versions and moved all setup to pyproject.toml
  - correctly use flat-layout for the package
  - build(deps): update TaskMetadata.py and pyproject.toml
  - update 221 files
  - build(deps): update pyproject.toml
  - build(deps): update pyproject.toml
  - build(deps): update pyproject.toml
  Co-authored-by: Kenneth Enevoldsen
  Co-authored-by: Niklas Muennighoff (dd5d617)
Unknown
- Ci-fix (#289)
  - added release pipeline
  - v1.3.0
  - ci: moved release to the correct folder (7f56c1a)
- v1.3.0
  - added release pipeline
  - v1.3.0 (5e4d10e)
- tests: speed up tests (#283)
  update Makefile and test_all_abstasks.py (2155bf6)
- Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (c9d1a03)
- Enable ruff ci (#279)
  - restructuring the readme
  - added mmteb
  - removed unnecessary method
  - Added docstring to metadata
  - Updated outdated examples
  - formatting documents
  - fix: Updated form to be parsed correctly
  - fix: Added sizes to the metadata
    this allows for automatic metadata generation
  - Updated based on feedback
  - Apply suggestions from code review
    Co-authored-by: Niklas Muennighoff
  - updated based on feedback
  - Added suggestion from review
  - added correction based on review
  - reformatted empty fields to None
  - CI: Enable linter
  Co-authored-by: Niklas Muennighoff (a16eb07)
- Added MMTEB (#275)
  - restructuring the readme
  - added mmteb
  - removed unnecessary method
  - Added docstring to metadata
  - Updated outdated examples
  - formatting documents
  - fix: Updated form to be parsed correctly
  - Updated based on feedback
  - Apply suggestions from code review
    Co-authored-by: Niklas Muennighoff
  - updated based on feedback
  - Added suggestion from review
  - added correction based on review
  Co-authored-by: Niklas Muennighoff (c0dc49a)
- dev: add isort (#271)
  - dev: add isort
  - dev: add isort (845099d)
- dev: run tests on pull request towards any branch (13f759a)
- Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (b42abe4)
- replaced linter with ruff (#265)
  - restructuring the readme
  - removed double specification of versions and moved all setup to pyproject.toml
  - correctly use flat-layout for the package
  - replaced linter with ruff
  - rerun tests
  - ci: Added in newer workflow
    some of them are disabled as they require other issues to be solved
  - Update Makefile
  Co-authored-by: Niklas Muennighoff (023e881)
- Restructuring the readme (#262)
  - restructuring the readme
  - removed double specification of versions and moved all setup to pyproject.toml
  - correctly use flat-layout for the package (769157b)
- restructuring the readme (364be7f)
- Added model results to repo and updated CLI to create consistent folder structure. (#254)
  - Added model results to repo and updated CLI to create consistent folder structure.
  - ci: updated ci to use make install
  - Added missing pytest dependencies
  - Update README.md
  Co-authored-by: Niklas Muennighoff (8a758bc)
- dev: add workspace defaults in VSCode (#253)
  - dev: add black as default formatter in vscode
  - Update .vscode/settings.json
  Co-authored-by: Kenneth Enevoldsen (30e5b9e)
- Add Danish Discourse dataset (#247)
  - misc.
  - update dd...
1.2.0 Spanish & French, Simpler Retrieval
Updates
- 🇪🇸 New Spanish datasets thanks to @violenil & team
- 🇫🇷 New French datasets thanks to @GabrielSequeira & team. There is also a new French Overall leaderboard tab thanks to their massive benchmarking.
- Retrieval has become much simpler and is now standardized to align with other tasks. You can inspect all Retrieval datasets on the hub, adding new Retrieval datasets is much easier, and there are fewer dependencies, which makes installing MTEB easier. While this change is backward-compatible, it represents a significant change in how MTEB works, so we decided to increment the minor version for this release (1.1.2 -> 1.2.0).
What's Changed
- Add tasks for Spanish Embedding Evaluation by @violenil in #227
- Extend MTEB with French datasets by @GabrielSequeira in #218
- Remove HAGRID from french benchmark by @MathieuCiancone in #235
- Fixed missing revision error on Norwegian Bitext Mining by @x-tabdeveloping in #221
- Simplify retrieval by @Muennighoff in #233
New Contributors
- @GabrielSequeira made their first contribution in #218
- @MathieuCiancone made their first contribution in #235
Full Changelog: 1.1.2...1.2.0
1.1.2 New English, German, Korean datasets & bug fixes
What's Changed
- fix RerankingEvaluator's compute_metrics_individual by @novak2000 in #165
- Add Long Document Evaluation Datasets by @violenil in #166
- Fix medrxiv mislinkage by @zhimin-z in #187
- Fix Dalaj linkage by @zhimin-z in #195
- Fix SummEval linkage by @zhimin-z in #194
- Fix SweFAQ linkage by @zhimin-z in #193
- Added Norwegian Bokmål-Nynorsk bitext mining task by @x-tabdeveloping in #202
- Add support for cache results by @hongjin-su in #207
- Retrieval benchmark based on GermanQuAD by @rasdani in #197
- Refer to other works by @Muennighoff in #212
- Fix selection of DRES/DRPES by @Markus28 in #179
- Add tasks for German Embedding Evaluation by @guenthermi in #214
- only save top-k by @hongjin-su in #209
- Add MultiLongDocRetrieval task to MTEB. by @hanhainebula in #224
- Add Korean Text Search Tasks to MTEB by @taeminlee in #210
- Update BeIRPLTask.py by @kwojtasi in #225
- Add task list by @Muennighoff in #228
New Contributors
- @novak2000 made their first contribution in #165
- @violenil made their first contribution in #166
- @zhimin-z made their first contribution in #187
- @x-tabdeveloping made their first contribution in #202
- @hongjin-su made their first contribution in #207
- @rasdani made their first contribution in #197
- @Markus28 made their first contribution in #179
- @hanhainebula made their first contribution in #224
- @taeminlee made their first contribution in #210
Full Changelog: 1.1.1...1.1.2
1.1.1 C-MTEB, PL-MTEB, Multi-GPU
Updates
- 🇨🇳 C-MTEB was released and integrated thanks to @staoxiao. See the C-MTEB paper for details. Together with C-MTEB, the team also released other great embedding resources, such as new SoTA models on MTEB & C-MTEB called BGE, as well as datasets and source code.
- 🇵🇱 PL-MTEB & BEIR-PL were released and integrated thanks to @rafalposwiata & @kwojtasi. Check out the new leaderboard tab for PL-MTEB: https://huggingface.co/spaces/mteb/leaderboard. Some BEIR-PL datasets are still missing and will be added soon, cc @kwojtasi.
- Clarifications on multi-GPU: native multi-GPU support for Retrieval thanks to @NouamaneTazi. We also added a clarification in the README on how any task can be run in a multi-GPU setup without requiring any changes in MTEB. MTEB abstracts away how the encodings are produced: whether the `encode` function uses a single GPU or several is entirely up to the user.
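To illustrate the multi-GPU point above, here is a minimal sketch of a model wrapper whose `encode` method hides how the work is distributed. It assumes the benchmark only needs a call shaped like `encode(sentences, batch_size=..., **kwargs)` that returns one vector per input sentence. The class name `ShardedEncoder`, its `num_workers` parameter, and the toy embedding are hypothetical and only stand in for real devices and a real model.

```python
from typing import List, Sequence


class ShardedEncoder:
    """Hypothetical wrapper: the caller only ever sees `encode`."""

    def __init__(self, num_workers: int = 2, dim: int = 4) -> None:
        # `num_workers` stands in for the number of GPUs (an assumption
        # made for illustration; no real devices are used here).
        self.num_workers = max(1, num_workers)
        self.dim = dim

    def _encode_on_worker(self, worker_id: int, texts: Sequence[str]) -> List[List[float]]:
        # Placeholder for a real forward pass on device `worker_id`.
        # A toy deterministic embedding keeps the sketch runnable anywhere.
        return [[float(len(t) + i) for i in range(self.dim)] for t in texts]

    def encode(self, sentences: Sequence[str], batch_size: int = 32, **kwargs) -> List[List[float]]:
        # `batch_size` is accepted for interface compatibility but unused here.
        # Split the input into contiguous shards, one per worker, then
        # concatenate the results so the output order matches the input order.
        shard_size = -(-len(sentences) // self.num_workers)  # ceiling division
        embeddings: List[List[float]] = []
        for worker_id in range(self.num_workers):
            shard = sentences[worker_id * shard_size:(worker_id + 1) * shard_size]
            embeddings.extend(self._encode_on_worker(worker_id, shard))
        return embeddings
```

Because the sharding lives entirely inside `encode`, switching between one worker and several does not change what the caller receives.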
What's Changed
- Code cleanup by @NouamaneTazi in #131
- Replaced prints with logging by @KennethEnevoldsen in #133
- Add BEIR-PL datasets to MTEB by @kwojtasi in #121
- Add Polish tasks (PL-MTEB) by @rafalposwiata in #137
- Add Chinese tasks (C-MTEB) by @staoxiao in #134
- Support Multi-node Evaluation by @NouamaneTazi in #132
- Add multi gpu eval to readme by @NouamaneTazi in #140
- Default to false by @Muennighoff in #143
- Rely on standard encode kwargs only by @Muennighoff in #145
- Fix splits by @Muennighoff in #149
- fix: add missing task-langs attribute by @guenthermi in #152
- Clarify multi-gpu usage by @Muennighoff in #153
- Simplify code snippets by @Muennighoff in #154
- fix: msmarco-v2 uses dev.tsv, not dev1.tsv by @garrett361 in #155
- Fix eval langs by @Muennighoff in #157
New Contributors
- @kwojtasi made their first contribution in #121
- @rafalposwiata made their first contribution in #137
- @staoxiao made their first contribution in #134
- @guenthermi made their first contribution in #152
- @garrett361 made their first contribution in #155
Full Changelog: 1.1.0...1.1.1
1.1.0 New languages, default cluster setting & default error raising
Updates
- 🇩🇰🇳🇴🇸🇪 New Danish, Norwegian and Swedish BitextMining & Classification tasks `AngryTweetsClassification`, `BornholmBitextMining`, `DKHateClassification`, `DalajClassification`, `LccSentimentClassification`, `NordicLangClassification`, `NorwegianParliament`, `ScalaDaClassification`, `ScalaNbClassification` & `ScalaSvClassification` thanks to @KennethEnevoldsen
- 🇩🇪 New German Clustering tasks `BlurbsClusteringP2P`, `BlurbsClusteringS2S`, `TenKGnadClusteringP2P` & `TenKGnadClusteringS2S` thanks to @slvnwhrl
- Change in cluster initialization from `3` to the sklearn recommended default of `auto`. This leads to tiny changes in clustering scores going forward and hence makes this release not backwards-compatible. See here for a discussion. Thanks to @stephantul for this change.
- Errors are now directly raised by default. This behavior can be deactivated by passing a kwarg at evaluation. Previously, they were just written to a `.txt` file. Thanks to @KennethEnevoldsen for introducing this change.
- Code cleanups thanks to @stephantul @izhx @permutohedra
- The leaderboard has also improved a lot with new task-based rankings, better caching and many new models
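The error-handling change described above can be sketched as a small evaluation loop. This is not MTEB's implementation: the function `run_tasks`, the flag name `raise_error`, and the log file name are assumptions chosen for illustration. The sketch only shows the two behaviors the note contrasts, raising immediately versus appending the error to a text file and continuing.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def run_tasks(
    tasks: Dict[str, Callable[[], float]],
    raise_error: bool = True,  # hypothetical flag name
    error_log: Path = Path("error_logs.txt"),  # hypothetical log location
) -> Dict[str, float]:
    """Run each task; either propagate failures or log them and continue."""
    results: Dict[str, float] = {}
    for name, task in tasks.items():
        try:
            results[name] = task()
        except Exception as exc:
            if raise_error:
                # New default described in the release note: fail loudly.
                raise
            # Previous behavior: record the error in a text file and move on.
            with error_log.open("a") as fh:
                fh.write(f"{name}: {exc!r}\n")
    return results


def ok_task() -> float:
    return 0.5


def failing_task() -> float:
    raise ValueError("dataset revision missing")


# Usage: opt out of raising, so the failing task is only logged.
log_path = Path(tempfile.mkdtemp()) / "error_logs.txt"
scores = run_tasks({"ok": ok_task, "bad": failing_task}, raise_error=False, error_log=log_path)
```

Raising by default surfaces broken tasks during a run instead of leaving them to be discovered later in a log file.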
What's Changed
- Fix kNN Multiclass by @Muennighoff in #92
- Fix SemmEval description by @ahoho in #97
- Make inputs always List[str] & call in one by @Muennighoff in #99
- Fix clustering warning by @stephantul in #104
- Fix the extending of language pairs in `MTEB` by @izhx in #106
- Add `@property` annotation to description method of AbsTask by @permutohedra in #111
- Add German clustering datasets by @slvnwhrl in #116
- Added support for Scandinavian Languages by @KennethEnevoldsen in #124
- Bump version ID and update PyPI by @KennethEnevoldsen in #128
New Contributors
- @ahoho made their first contribution in #97
- @stephantul made their first contribution in #104
- @izhx made their first contribution in #106
- @permutohedra made their first contribution in #111
- @slvnwhrl made their first contribution in #116
- @KennethEnevoldsen made their first contribution in #124
Full Changelog: 1.0.1...1.1.0