Performance Optimizations #49

Open
do-me opened this issue Apr 11, 2024 · 6 comments

Comments

@do-me
Owner

do-me commented Apr 11, 2024

This issue is grouping a few things that might be optimized to improve performance:

tbc

@varunneal
Collaborator

I'll take a look at the dot product today. Have you seen the FAISS library at all? Some of its algorithms might offer terrific implementations for clustering:

https://github.com/facebookresearch/faiss
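
For reference, a minimal sketch of the brute-force dot product search that FAISS-style indexes would be compared against. The function names (`dot`, `normalize`, `topK`) are hypothetical and not taken from SemanticFinder, and the sketch assumes embeddings are L2-normalized so that the dot product equals cosine similarity.

```javascript
// Hypothetical helpers, not the project's actual code.
// Plain loop over two equal-length vectors (arrays or typed arrays).
function dot(a, b) {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to unit length; a zero vector is returned unchanged.
function normalize(v) {
  const norm = Math.sqrt(dot(v, v));
  if (norm === 0) return Float32Array.from(v);
  return Float32Array.from(v, (x) => x / norm);
}

// Brute-force top-k: score every embedding against the query,
// then sort by score in descending order.
function topK(query, embeddings, k) {
  return embeddings
    .map((e, index) => ({ index, score: dot(query, e) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This scans every vector per query, so the cost grows linearly with the number of chunks; approximate indexes trade some accuracy to avoid that full scan.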

@do-me
Owner Author

do-me commented May 14, 2024

Sure! What algorithm are you referring to particularly?

For dimensionality reduction, I was considering t-SNE, PCA and UMAP. They all have pros and cons: PCA is a great, widely used and efficient algorithm (it's just projecting the points, no expensive iterations required), UMAP is more computation-intense, and t-SNE is, afaik, the computation-heaviest algorithm but usually generates "visually pleasing" results. From what I can see, t-SNE is the most promising option at the moment. Just yesterday the author of https://github.com/Lv-291/wasm-bhtsne released a new version which supports multithreading, so once updated, that should lead to a huge speedup.
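
To illustrate why PCA is cheap to apply, here is a minimal sketch that finds only the first principal component with power iteration and then projects each point onto it with a single dot product. The function names are hypothetical, nothing here is taken from SemanticFinder or wasm-bhtsne, and a real implementation would compute more components and handle a start vector that happens to be orthogonal to the component.

```javascript
// Center the data: subtract the per-dimension mean from every point.
function center(points) {
  const n = points.length;
  const d = points[0].length;
  const mean = new Array(d).fill(0);
  for (const p of points) for (let j = 0; j < d; j++) mean[j] += p[j] / n;
  return points.map((p) => p.map((x, j) => x - mean[j]));
}

// First principal component via power iteration on X^T X.
// Each step computes X^T (X v), so the d x d matrix is never materialized.
function firstComponent(centered, iterations = 100) {
  const d = centered[0].length;
  let v = new Array(d).fill(1 / Math.sqrt(d));
  for (let it = 0; it < iterations; it++) {
    const next = new Array(d).fill(0);
    for (const row of centered) {
      let proj = 0;
      for (let j = 0; j < d; j++) proj += row[j] * v[j];
      for (let j = 0; j < d; j++) next[j] += proj * row[j];
    }
    const norm = Math.sqrt(next.reduce((s, x) => s + x * x, 0));
    if (norm === 0) break; // degenerate data: keep the previous vector
    v = next.map((x) => x / norm);
  }
  return v;
}

// Projection is one dot product per point.
function projectOnto(centered, component) {
  return centered.map((row) =>
    row.reduce((s, x, j) => s + x * component[j], 0)
  );
}
```

Once the component is known, projecting a new (centered) point costs one pass over its dimensions, which is the cheap part that distinguishes PCA from iterative layout methods.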

If you're referring to the general logic or the DB-like JSON object backbone holding the chunks and embeddings, I already looked a little into JS vector DBs like Orama. However, as I haven't received an answer to my question about performance yet, I haven't run any tests. From an application perspective, it might make much more sense to have the app and DB/data more cleanly separated, which would also allow easier imports/exports etc., but I definitely do not want to compromise on performance.
There are also other (vector DB) projects like DuckDB (https://github.com/duckdb/duckdb-wasm), but it might not be mature enough yet. If you find anything that looks worth trying, we could give it a go!

Also, for a while I've had the idea of allowing connections to external/local vector DBs like Qdrant. That way the web app would be the inference interface and the memory-intensive processes would run somewhere else, allowing for really scalable apps! The setup would be optional of course and work like the Ollama connection.

@do-me do-me mentioned this issue May 15, 2024
@do-me
Owner Author

do-me commented May 19, 2024

fyi: there is also https://github.com/tantaraio/voy, a Rust-based WASM DB, as an alternative to Orama. However, it seems a little dead?

Seeing all these projects, I think we created a pretty solid "vector DB" ourselves as part of SemanticFinder. Makes me wonder whether it might be worth extracting the logic... like a lean, no-fuss, JS-native, JSON-based DB.
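
A minimal sketch of what such a lean JSON-based store could look like. The class name `TinyVectorStore` and its methods are hypothetical and do not reflect SemanticFinder's actual data layout; the sketch assumes plain number arrays for embeddings and brute-force dot product scoring.

```javascript
// Hypothetical JSON-backed store: chunks of text plus their embeddings.
class TinyVectorStore {
  constructor(items = []) {
    this.items = items;
  }

  // Store a text chunk with a copy of its embedding as a plain array,
  // so the whole store stays JSON-serializable.
  add(text, embedding) {
    this.items.push({ text, embedding: Array.from(embedding) });
  }

  // Brute-force search: dot product against every stored embedding.
  search(query, k = 5) {
    const score = (e) => e.reduce((s, x, i) => s + x * query[i], 0);
    return this.items
      .map((item) => ({ text: item.text, score: score(item.embedding) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }

  // Export and import as a JSON string for easy file downloads/uploads.
  serialize() {
    return JSON.stringify({ items: this.items });
  }

  static deserialize(json) {
    return new TinyVectorStore(JSON.parse(json).items);
  }
}
```

Usage would be along the lines of `store.add("some chunk", vector)` followed by `store.search(queryVector, 3)`, with `serialize()` covering the import/export case.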

@varunneal
Collaborator

I agree a lean database seems quite suitable. It would be nice to make a local-storage JS library with "guarantees" like limits on the total memory that can be used.
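
One way such a memory "guarantee" could work is a byte budget with eviction, sketched below. The class name `BoundedEmbeddingStore` is hypothetical, the sketch keeps everything in memory rather than using any browser storage API, and it counts only the raw `Float32Array` bytes (keys and object overhead are ignored), so the limit is an approximation.

```javascript
// Hypothetical store with a hard cap on the bytes used by embeddings.
class BoundedEmbeddingStore {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.usedBytes = 0;
    this.entries = new Map(); // Map preserves insertion order
  }

  set(key, embedding) {
    const vec = Float32Array.from(embedding);
    const size = vec.byteLength;
    if (size > this.maxBytes) {
      throw new RangeError("embedding exceeds the total budget");
    }
    if (this.entries.has(key)) this.delete(key);
    // Evict the oldest entries until the new one fits.
    for (const oldestKey of this.entries.keys()) {
      if (this.usedBytes + size <= this.maxBytes) break;
      this.delete(oldestKey);
    }
    this.entries.set(key, vec);
    this.usedBytes += size;
  }

  get(key) {
    return this.entries.get(key);
  }

  delete(key) {
    const vec = this.entries.get(key);
    if (!vec) return false;
    this.usedBytes -= vec.byteLength;
    return this.entries.delete(key);
  }
}
```

Evicting the oldest entry is just one possible policy; rejecting new inserts once the budget is reached would give a stricter guarantee at the cost of failing writes.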

FAISS has HNSW (super fast approximate nearest neighbor search) and also newer and faster approximate nearest neighbor algorithms. Here is an overview.

@do-me
Owner Author

do-me commented Oct 11, 2024

Here's another hot candidate for crazy speed improvements on the indexing side: static models with model2vec: MinishLab/model2vec#75. Curious how to run this in JS.
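
A rough sketch of how static-model inference could look in JS, assuming the model reduces to a token-to-vector lookup table whose vectors are averaged (mean pooling). The token table, its values and the whitespace tokenizer below are made up for illustration; a real model would ship its own tokenizer and a much larger embedding matrix, and model2vec's actual pipeline may differ in details.

```javascript
// Toy token table with invented values, standing in for a real embedding matrix.
const tokenVectors = new Map([
  ["semantic", [1, 0, 0]],
  ["search", [0, 1, 0]],
  ["fast", [0, 0, 1]],
]);

// Placeholder tokenizer: lowercase and split on whitespace.
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Embed a text by averaging the vectors of its known tokens.
// No neural network forward pass is involved, only lookups and additions.
function embed(text, table, dim) {
  const sum = new Float32Array(dim);
  let count = 0;
  for (const token of tokenize(text)) {
    const vec = table.get(token);
    if (!vec) continue; // unknown tokens are skipped in this sketch
    for (let i = 0; i < dim; i++) sum[i] += vec[i];
    count++;
  }
  if (count === 0) return sum; // all-zero vector for unknown input
  for (let i = 0; i < dim; i++) sum[i] /= count;
  return sum;
}
```

Because each chunk only needs table lookups and an average, indexing cost in this scheme is dominated by tokenization rather than model inference.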

@do-me
Owner Author

do-me commented Oct 16, 2024

Fyi: lancedb seems like the best file-based vector DB out there (https://github.com/lancedb/lancedb), similar to sqlite-vec but with more functionality (full-text search etc.). Seems superior to voy and is also written in Rust. Might be charming to be able to export the whole lancedb and be able to connect other frontends to it.

2 participants