Performance Optimizations #49
Comments
I'll take a look at the dot product today. Have you seen the FAISS library at all? Some of its algorithms might offer terrific implementations for clustering.
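For reference, a minimal sketch of the dot product in question, assuming embeddings are stored as `Float32Array`s. The helper names `dot` and `normalize` are hypothetical and not taken from the SemanticFinder codebase. If vectors are normalized to unit length once at indexing time, cosine similarity reduces to this single loop per comparison.

```typescript
// Hypothetical helpers for illustration; not the project's actual code.

/** Plain dot product over two equally sized Float32Arrays. */
function dot(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error("Dimension mismatch: " + a.length + " vs " + b.length);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

/** Scale a vector to unit length in place (zero vectors stay unchanged). */
function normalize(v: Float32Array): Float32Array {
  const norm = Math.sqrt(dot(v, v));
  if (norm > 0) {
    for (let i = 0; i < v.length; i++) {
      v[i] /= norm;
    }
  }
  return v;
}

// Two vectors pointing in the same direction: similarity is close to 1.
const queryVec = normalize(new Float32Array([1, 2, 2]));
const chunkVec = normalize(new Float32Array([2, 4, 4]));
const similarity = dot(queryVec, chunkVec);
```

Normalizing up front trades one pass over each vector at indexing time for a cheaper comparison at query time.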
Sure! Which algorithm are you referring to in particular? For dimensionality reduction, I was considering t-SNE, PCA and UMAP. They all have pros and cons: PCA is a great, widely used and efficient algorithm; UMAP is less computation-intense (as it's just projecting the points, no expensive iterations required); and t-SNE is, afaik, the most computation-heavy algorithm but usually generates "visually pleasing" results. From what I can see, t-SNE is the most promising option at the moment. Just yesterday the author of https://github.com/Lv-291/wasm-bhtsne released a new version which supports multithreading, so once updated, that should lead to a huge speedup. If you're referring to the general logic or the DB-like JSON object backbone holding the chunks and embeddings, I already looked a little into JS vector DBs like Orama. However, as I haven't received an answer to my question about performance yet, I haven't run any tests. From an application perspective, it might make much more sense to have the app and the DB/data more cleanly separated, which would also make imports/exports easier, but I definitely do not want to compromise on performance. Also, for a while I've had the idea of allowing connections to external/local vector DBs like Qdrant. That way the web app would be the inference interface and the memory-intensive processes would run somewhere else, allowing for really scalable apps! The setup would be optional of course and work like the Ollama connection.
fyi: there is also https://github.com/tantaraio/voy, a Rust-based wasm DB, as an alternative to Orama. However, it seems a little dead? Seeing all these projects, I think we created a pretty solid "vector DB" ourselves as part of SemanticFinder. Makes me wonder whether it might be worth extracting the logic... like a lean, no-fuss, JS-native, JSON-based DB.
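To make the "lean, JSON-based DB" idea concrete, here is a rough sketch of what such an extracted store could look like. Everything in it (`TinyVectorStore`, `Entry`, `Hit`, `serialize`, `deserialize`) is a hypothetical name chosen for illustration, not existing SemanticFinder code, and the search is a plain brute-force scan by dot product rather than an indexed lookup.

```typescript
// Hypothetical sketch of a lean in-memory store; all names are illustrative.
interface Entry {
  id: string;
  text: string;
  embedding: number[]; // plain arrays, so the store serializes to JSON directly
}

interface Hit {
  id: string;
  text: string;
  score: number;
}

class TinyVectorStore {
  private entries: Entry[] = [];

  add(entry: Entry): void {
    this.entries.push(entry);
  }

  /** Brute-force search: score every entry by dot product, return the top k. */
  search(query: number[], k: number): Hit[] {
    const hits: Hit[] = this.entries.map((e) => {
      let score = 0;
      for (let i = 0; i < query.length; i++) {
        score += query[i] * e.embedding[i];
      }
      return { id: e.id, text: e.text, score: score };
    });
    hits.sort((a, b) => b.score - a.score);
    return hits.slice(0, k);
  }

  /** Export all entries as a JSON string. */
  serialize(): string {
    return JSON.stringify(this.entries);
  }

  /** Rebuild a store from a string produced by serialize(). */
  static deserialize(json: string): TinyVectorStore {
    const store = new TinyVectorStore();
    for (const e of JSON.parse(json) as Entry[]) {
      store.add(e);
    }
    return store;
  }
}

// Usage: export, re-import, then query the restored copy.
const store = new TinyVectorStore();
store.add({ id: "a", text: "cats", embedding: [1, 0] });
store.add({ id: "b", text: "dogs", embedding: [0, 1] });
store.add({ id: "c", text: "kittens", embedding: [0.9, 0.1] });
const restored = TinyVectorStore.deserialize(store.serialize());
const topHits = restored.search([1, 0], 2);
```

Keeping embeddings as plain number arrays makes import/export trivial at the cost of memory; a typed-array representation would be more compact but needs an explicit conversion step when serializing.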
I agree, a lean database seems quite suitable. It would be nice to make a local-storage JS library with "guarantees" like limits on the total memory that can be used. FAISS has HNSW (super fast approximate nearest neighbor) as well as newer and faster approximate nearest neighbor algorithms. Here is an overview.
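One way the memory "guarantee" could be sketched is a store that tracks the bytes held by its embeddings and evicts the oldest entries once a budget is exceeded. The class `BoundedEmbeddingStore` and its methods are hypothetical, and the accounting is an assumption: it only counts `Float32Array` payload bytes, not JS object overhead or the chunk text.

```typescript
// Hypothetical sketch: counts embedding payload bytes only (an assumption),
// and evicts in insertion order (oldest first) when the budget is exceeded.
class BoundedEmbeddingStore {
  private items: { id: string; embedding: Float32Array }[] = [];
  private usedBytes = 0;
  private maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  add(id: string, embedding: Float32Array): void {
    if (embedding.byteLength > this.maxBytes) {
      throw new Error("Embedding alone exceeds the memory budget");
    }
    // Drop the oldest entries until the new embedding fits.
    while (this.usedBytes + embedding.byteLength > this.maxBytes) {
      const oldest = this.items.shift();
      if (!oldest) break;
      this.usedBytes -= oldest.embedding.byteLength;
    }
    this.items.push({ id: id, embedding: embedding });
    this.usedBytes += embedding.byteLength;
  }

  has(id: string): boolean {
    return this.items.some((item) => item.id === id);
  }

  bytesUsed(): number {
    return this.usedBytes;
  }
}

// Usage: a 32-byte budget holds two 4-dimensional Float32Arrays (16 bytes each).
const bounded = new BoundedEmbeddingStore(32);
bounded.add("a", new Float32Array(4));
bounded.add("b", new Float32Array(4));
bounded.add("c", new Float32Array(4)); // evicts "a" to stay within budget
```

Rejecting new entries instead of evicting old ones would be an equally valid policy; which one fits depends on whether the index must stay complete.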
Here's another hot candidate for crazy speed improvements on the indexing side: static models with model2vec: MinishLab/model2vec#75. Curious how to run this in JS.
Fyi: lancedb seems like the best file-based vector DB out there (https://github.com/lancedb/lancedb), similar to sqlite-vec but with more functionality (full-text search etc.). It seems superior to voy and is also written in Rust. It might be charming to be able to export the whole lancedb and connect other frontends to it.
This issue groups a few things that might be optimized to improve performance:
tbc