[DOCS] Update Benchmarks - Include Wikipedia-100k and Wikipedia-500k run timings #156

Merged 3 commits on Jan 29, 2025

bhavnicksm
Collaborator

This pull request updates benchmarks/README.md, expanding the speed benchmarks and reorganizing the size comparison sections. The main changes are updated benchmark results for several datasets and a reformatted document for clarity.

Updates to speed benchmarks:

  • Added new benchmarks for 100K and 500K Wikipedia articles, including token chunking, sentence chunking, recursive chunking, and semantic chunking.
  • Updated the Paul Graham Essays dataset benchmarks to include more detailed timing results.

Reorganization of size comparison sections:

  • Moved the size comparison section to the end of the document and updated the package sizes for Chonkie, LangChain, and LlamaIndex.
  • Reformatted the document to improve readability and clarity, including the addition of new headings and sections.

Additional improvements:

  • Updated the "Why These Numbers Matter" section to highlight both speed and size benefits.
  • Revised the introductory and concluding remarks for better alignment with the updated benchmarks and comparisons.

shreyashnigam and others added 3 commits January 29, 2025 20:08
* Add wiki 500k benchmark results

* Update benchmarks

* chonkie is very fast, bro

* blah blah

---------

Co-authored-by: Bhavnick Minhas <[email protected]>
bhavnicksm merged commit 7ed2c13 into main on Jan 29, 2025
5 checks passed