Prototype RAG on DuckDB and File Attachments #803

Open
humphd opened this issue Jan 27, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@humphd
Collaborator

humphd commented Jan 27, 2025

ChatCraft has been expanded to include File Attachments and DuckDB, which supports querying files. The two features have been connected, so you can attach files, run SQL queries on them, get back results, download them, etc.

Now that we have this foundation, I think we have most of what we need to build a RAG solution for cases where file attachments are too large to fit in the chat context.

I think the process would work like this:

  • user attaches some files with text we can extract (PDF, source code, Word Doc, etc)
  • somehow (UI? automatically based on file size?) we decide when to use these file attachments for RAG vs. embedding them directly in the chat messages
  • we take the set of RAG-attachment-files and "index" them in DuckDB. Maybe we use full-text search or maybe we use vector search (see part 1, part 2)
  • when the user asks a question, we use their prompt to create a query, get back results from the indexed docs, and include relevant text context along with the original prompt
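
The steps above could be sketched roughly like this. It is only an illustration under assumptions: `Attachment`, `MAX_INLINE_CHARS`, `split_attachments`, `build_prompt`, and the `search` callback are hypothetical names, not ChatCraft APIs, and the size cutoff is an arbitrary placeholder. The `search` callback stands in for whatever DuckDB query (full-text or vector) ends up being used.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical cutoff: longer attachments go to RAG instead of the chat context.
MAX_INLINE_CHARS = 8_000


@dataclass
class Attachment:
    name: str
    text: str  # text already extracted from the PDF, source file, Word doc, etc.


def split_attachments(
    files: List[Attachment], max_inline_chars: int = MAX_INLINE_CHARS
) -> Tuple[List[Attachment], List[Attachment]]:
    """Return (inline, rag): small files are embedded directly, large ones get indexed."""
    inline = [f for f in files if len(f.text) <= max_inline_chars]
    rag = [f for f in files if len(f.text) > max_inline_chars]
    return inline, rag


def build_prompt(
    question: str,
    inline: List[Attachment],
    search: Callable[[str, int], List[str]],
    top_k: int = 3,
) -> str:
    """Combine inline files, retrieved excerpts, and the original question."""
    parts = [f"File: {f.name}\n{f.text}" for f in inline]
    # `search` is assumed to query the indexed RAG files and return text excerpts.
    passages = search(question, top_k)
    parts += [f"Relevant excerpt:\n{p}" for p in passages]
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)
```

Passing the retrieval step in as a callback keeps the routing and prompt assembly independent of the indexing choice, so full-text and vector search can be swapped while experimenting.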

The initial version of this can be crude, without proper UI, optimal indexing, etc. We need to play a bit to get this right.

Likely, the best way to begin this work is to prototype it outside of ChatCraft using DuckDB and text files locally.
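
A local prototype along these lines might look like the sketch below, which uses the `duckdb` Python package and DuckDB's full-text search extension. The table name `chunks`, the helper names `chunk_text` and `search_files`, and the paragraph-based chunking are assumptions made for illustration, and vector search would need a different approach. Installing the `fts` extension may require network access the first time.

```python
from pathlib import Path
from typing import List, Tuple


def chunk_text(text: str, max_chars: int = 1000) -> List[str]:
    """Group paragraphs into chunks of about max_chars (a long paragraph stays whole)."""
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def search_files(paths: List[str], question: str, limit: int = 3) -> List[Tuple]:
    """Index text files in an in-memory DuckDB and return (source, body, score) rows."""
    import duckdb  # third-party package: pip install duckdb

    rows = []
    for path in paths:
        for chunk in chunk_text(Path(path).read_text(encoding="utf-8")):
            rows.append((len(rows), path, chunk))
    if not rows:
        return []

    con = duckdb.connect()  # in-memory database
    con.execute("INSTALL fts")
    con.execute("LOAD fts")
    con.execute("CREATE TABLE chunks (id INTEGER, source VARCHAR, body VARCHAR)")
    con.executemany("INSERT INTO chunks VALUES (?, ?, ?)", rows)
    # Build a BM25 full-text index over the chunk bodies.
    con.execute("PRAGMA create_fts_index('chunks', 'id', 'body')")
    return con.execute(
        """
        SELECT source, body, score
        FROM (
            SELECT *, fts_main_chunks.match_bm25(id, ?) AS score
            FROM chunks
        ) sq
        WHERE score IS NOT NULL
        ORDER BY score DESC
        LIMIT ?
        """,
        [question, limit],
    ).fetchall()
```

Calling `search_files(["notes.txt"], "how do attachments work")` with a real local file path would return the best-matching chunks, which could then be added to the prompt as context.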

@mulla028 mulla028 added the enhancement New feature or request label Jan 28, 2025