A Streamlit-based web application demonstrating Retrieval Augmented Generation (RAG) using local LLMs via Ollama.
- Upload and process PDF, TXT, and DOCX documents
- Organize documents in collections using ChromaDB
- Query documents using natural language
- Compare responses with and without RAG (see the sketch after this list)
- View source context for responses
- Built-in demo documents and tutorial
- Configurable LLM and embedding models
- Advanced settings for fine-tuning
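To make the feature list concrete, the snippet below is a minimal sketch of the underlying RAG flow, not the actual `rag.py` code. It assumes the LlamaIndex Ollama and Chroma integration packages are installed; the `./docs` folder, `./chroma_db` path, `demo` collection name, and the example question are placeholders.

```python
# Minimal RAG sketch: chunk and embed local documents with an Ollama embedding
# model, store them in a ChromaDB collection, then answer a question with and
# without retrieval. Paths, model names, and the collection name are examples.
import chromadb
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.chroma import ChromaVectorStore

# Local Ollama models, as pulled in the setup steps below.
Settings.llm = Ollama(model="llama2", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# One ChromaDB collection per document set (a "collection" in the UI).
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("demo")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Ingest PDF/TXT/DOCX files from a folder and build the vector index.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

question = "What does the demo document say about the solar system?"

# With RAG: retrieve the most similar chunks and ground the answer in them.
print("With RAG:", index.as_query_engine(similarity_top_k=3).query(question))

# Without RAG: ask the bare LLM the same question for comparison.
print("Without RAG:", Settings.llm.complete(question))
```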
- Python 3.11+
- Docker (optional)
- Ollama running locally or accessible via network
- Start Ollama on your machine
- Pull and run the container:

  ```bash
  docker-compose up
  ```
- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Start Ollama and pull required models:

  ```bash
  ollama pull llama2
  ollama pull nomic-embed-text
  ```
- Run the application:

  ```bash
  streamlit run rag.py
  ```
- Access the web interface at http://localhost:8501
- Click "Ingest Demo Data" to load sample documents
- Or upload your own documents using the sidebar
- Select a collection and enter your query
- Experiment with different models and settings
- `OLLAMA_BASE_URL`: Set this environment variable to connect to Ollama running on a different host (see the configuration sketch after this list)
- Advanced settings available in the UI for:
- Chunk size and overlap
- Context window size
- Number of similar documents
- Response length
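These knobs correspond roughly to LlamaIndex and Ollama parameters. The sketch below shows one plausible mapping with illustrative values, not the app's actual defaults; for example, exporting `OLLAMA_BASE_URL=http://other-host:11434` before `streamlit run rag.py` points the app at a remote Ollama server.

```python
# Illustrative mapping of the settings above onto LlamaIndex/Ollama parameters;
# the values shown are examples, not the application's defaults.
import os

from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# OLLAMA_BASE_URL: where to reach Ollama (defaults to the local instance).
base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")

Settings.llm = Ollama(
    model="llama2",
    base_url=base_url,
    context_window=4096,                      # context window size
    additional_kwargs={"num_predict": 512},   # response length (Ollama option)
    request_timeout=120.0,
)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text", base_url=base_url)

# Chunk size and overlap used when splitting documents before embedding.
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)

# Number of similar chunks retrieved per query, passed when building the engine:
# index.as_query_engine(similarity_top_k=4)
```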
Includes sample documents covering:
- Solar system and space exploration
- Science fiction (Star Trek)
- Children's stories
- Calendar data
- Technical documentation
The project uses the following components (a sketch of how they fit together follows the list):
- Streamlit for the web interface
- LlamaIndex for document processing
- ChromaDB for vector storage
- Ollama for local LLM integration
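As a rough illustration of how these pieces could fit together, here is a hypothetical Streamlit front end over an existing ChromaDB collection. It is not the actual structure of `rag.py`; the widget labels, `get_query_engine` helper, collection name, and `./chroma_db` path are all assumptions for the example.

```python
# Hypothetical Streamlit layer: pick a collection, query it through an
# Ollama-backed LlamaIndex query engine, and show the retrieved source context.
import chromadb
import streamlit as st
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.chroma import ChromaVectorStore


@st.cache_resource
def get_query_engine(collection_name: str):
    """Build and cache a query engine backed by one ChromaDB collection."""
    Settings.llm = Ollama(model="llama2", request_timeout=120.0)
    Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
    chroma_client = chromadb.PersistentClient(path="./chroma_db")
    store = ChromaVectorStore(
        chroma_collection=chroma_client.get_or_create_collection(collection_name)
    )
    index = VectorStoreIndex.from_vector_store(store)
    return index.as_query_engine(similarity_top_k=3)


st.title("RAG with local LLMs via Ollama")
collection = st.sidebar.selectbox("Collection", ["demo"])
query = st.text_input("Ask a question about your documents")
if query:
    result = get_query_engine(collection).query(query)
    st.write(result.response)
    with st.expander("Source context"):
        for node in result.source_nodes:
            st.write(node.node.get_content())
```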
© 2024 Dennis Kruyt, AT Computing