change theme, modularize docs

tanmay17061 · Nov 3, 2024 · d3df8fc
1 parent 157a6d1
commit d3df8fc

Showing 11 changed files with 1,022 additions and 211 deletions.
diff --git a/docs/_config.yml b/docs/_config.yml
@@ -1,3 +1,34 @@
-theme: jekyll-theme-cayman
+remote_theme: just-the-docs/just-the-docs
 title: Batch-GPT Documentation
 description: Documentation for the Batch-GPT service
+
+# Theme settings
+color_scheme: dark
+search_enabled: true
+heading_anchors: true
+
+# Aux links for the upper right navigation
+aux_links:
+  "Batch-GPT on GitHub":
+    - "//github.com/tanmay17061/batch-gpt"
+
+# Makes Aux links open in a new tab
+aux_links_new_tab: true
+
+# Navigation Structure
+nav_external_links:
+  - title: Batch-GPT on GitHub
+    url: https://github.com/tanmay17061/batch-gpt
+    hide_icon: false
+
+# Enable copy code button
+enable_copy_code_button: true
+
+# Footer content
+footer_content: "Copyright &copy; 2024 Batch-GPT. Distributed under the <a href=\"https://github.com/tanmay17061/batch-gpt/blob/main/LICENSE\">Apache License 2.0.</a>"
+
+# Collections for organizing documentation
+collections:
+  docs:
+    permalink: "/:collection/:path/"
+    output: true
diff --git a/docs/advanced-features.md b/docs/advanced-features.md
@@ -0,0 +1,94 @@
+---
+layout: default
+title: Advanced Features
+nav_order: 6
+---
+
+# Advanced Features
+{: .no_toc }
+
+## Table of contents
+{: .no_toc .text-delta }
+
+1. TOC
+{:toc}
+
+---
+
+## Serving Modes
+
+### Synchronous Mode
+```bash
+export CLIENT_SERVING_MODE=sync  # Default
+```
+- Blocks until response is available
+- Similar to standard OpenAI API
+- Best for low-volume scenarios
+
+### Asynchronous Mode
+```bash
+export CLIENT_SERVING_MODE=async
+```
+- Returns immediately with submission confirmation
+- Suitable for high-volume applications
+- Requires separate status checking
+
+### Cache-Only Mode
+```bash
+export CLIENT_SERVING_MODE=cache
+```
+- Serves only cached responses
+- No new API calls
+- Processes pending batches
+
+## Caching System
+
+### Cache Configuration
+- Automatic request hashing
+- MongoDB-based storage
+- Cross-session persistence
+
+### Cache Operations
+```python
+# Cache hit example
+response1 = client.chat.completions.create(...)
+response2 = client.chat.completions.create(...)  # Same request returns cached response
+```
+
+## Batch Recovery
+
+### Automatic Recovery
+- Detects interrupted batches
+- Resumes processing on restart
+- Updates original requesters
+
+### Manual Recovery
+```bash
+# Check dangling batches
+python client.py --api status_all_batches --status_filter not_completed
+```
+
+## Advanced Monitoring
+
+### Custom Polling Intervals
+```bash
+export COLLECT_BATCH_STATS_POLLING_MAX_INTERVAL_SECONDS=600
+```
+
+### Progress Tracking
+- Real-time completion statistics
+- Request counts monitoring
+- Error tracking
+
+## Performance Tuning
+
+### Batch Window Optimization
+```bash
+# Adjust batch collection window
+export COLLATE_BATCHES_FOR_DURATION_IN_MS=3000  # 3 seconds
+```
+
+### MongoDB Optimization
+- Index management
+- Connection pooling
+- Query optimization
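
Note on `docs/advanced-features.md`: the Caching System section lists "Automatic request hashing" but does not show what makes two requests count as the same. A minimal sketch, assuming the cache key is a SHA-256 digest over a canonical JSON form of the request body. The helper name `request_cache_key` and the choice of hash are illustrative assumptions, not the server's confirmed implementation.

```python
import hashlib
import json


def request_cache_key(request_body: dict) -> str:
    """Hypothetical cache key: SHA-256 of the canonical JSON request body."""
    # Sorting keys and fixing separators makes the serialization independent
    # of dict insertion order and incidental whitespace.
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two logically identical requests whose keys were built in a different order.
first = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}
second = {"messages": [{"role": "user", "content": "Hello!"}], "model": "gpt-3.5-turbo"}
different = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Bye!"}]}

same_key = request_cache_key(first) == request_cache_key(second)
other_key = request_cache_key(first) == request_cache_key(different)
```

Under this assumption, `first` and `second` map to one cache entry, while changing the message content produces a different key.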
diff --git a/docs/api-reference.md b/docs/api-reference.md
@@ -0,0 +1,120 @@
+---
+layout: default
+title: API Reference
+nav_order: 9
+---
+
+# API Reference
+{: .no_toc }
+
+## Table of contents
+{: .no_toc .text-delta }
+
+1. TOC
+{:toc}
+
+---
+
+## Chat Completions
+
+### Create Chat Completion
+
+`POST /v1/chat/completions`
+
+Request:
+```json
+{
+  "model": "gpt-3.5-turbo",
+  "messages": [
+    {
+      "role": "user",
+      "content": "Hello!"
+    }
+  ]
+}
+```
+
+Response:
+```json
+{
+  "id": "chatcmpl-123",
+  "object": "chat.completion",
+  "created": 1677652288,
+  "choices": [{
+    "index": 0,
+    "message": {
+      "role": "assistant",
+      "content": "Hello! How can I help you today?"
+    },
+    "finish_reason": "stop"
+  }],
+  "usage": {
+    "prompt_tokens": 9,
+    "completion_tokens": 12,
+    "total_tokens": 21
+  }
+}
+```
+
+## Batch Operations
+
+### Retrieve Batch
+
+`GET /v1/batches/{batch_id}`
+
+Response:
+```json
+{
+  "batch": {
+    "id": "batch_123",
+    "status": "completed",
+    "created_at": 1678901234,
+    "expires_at": 1678987634,
+    "request_counts": {
+      "total": 10,
+      "completed": 10
+    }
+  }
+}
+```
+
+### List Batches
+
+`GET /v1/batches`
+
+Response:
+```json
+{
+  "data": [
+    {
+      "batch": {
+        "id": "batch_123",
+        "status": "completed",
+        "created_at": 1678901234,
+        "expires_at": 1678987634,
+        "request_counts": {
+          "total": 10,
+          "completed": 10
+        }
+      }
+    }
+  ]
+}
+```
+
+## Error Handling
+
+### Error Response Format
+```json
+{
+  "error": {
+    "type": "invalid_request_error",
+    "message": "Description of the error"
+  }
+}
+```
+
+### Common Error Types
+- `invalid_request_error`
+- `internal_server_error`
+- `batch_processing_error`
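
Note on `docs/api-reference.md`: the batch and error payloads above are shown without any client-side handling. A small sketch of how a caller might read them, assuming the JSON shapes are exactly as documented in this file. The helper names `batch_progress` and `error_type` are hypothetical and not part of the Batch-GPT client.

```python
import json
from typing import Optional


def batch_progress(payload: str) -> float:
    """Return the completed fraction from a Retrieve Batch response body."""
    counts = json.loads(payload)["batch"]["request_counts"]
    total = counts["total"]
    # Guard against an empty batch to avoid dividing by zero.
    return counts["completed"] / total if total else 0.0


def error_type(payload: str) -> Optional[str]:
    """Return the error type if the body is an error response, else None."""
    error = json.loads(payload).get("error")
    return error["type"] if error else None


# Sample bodies copied from the documented response formats.
batch_body = json.dumps({
    "batch": {
        "id": "batch_123",
        "status": "completed",
        "created_at": 1678901234,
        "expires_at": 1678987634,
        "request_counts": {"total": 10, "completed": 10},
    }
})
error_body = json.dumps({
    "error": {"type": "invalid_request_error", "message": "Description of the error"}
})

progress = batch_progress(batch_body)
kind = error_type(error_body)
```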
diff --git a/docs/architecture.md b/docs/architecture.md
@@ -0,0 +1,114 @@
+---
+layout: default
+title: Architecture
+nav_order: 4
+---
+
+# Architecture
+{: .no_toc }
+
+## Table of contents
+{: .no_toc .text-delta }
+
+1. TOC
+{:toc}
+
+---
+
+## Overview
+
+Batch-GPT follows a modular architecture with clear separation of concerns:
+
+```
+server/
+├── db/         # Database interactions
+├── handlers/   # HTTP request handlers
+├── logger/     # Custom logging
+├── models/     # Data models
+└── services/   # Business logic
+    ├── batch/  # Batch processing
+    ├── cache/  # Response caching
+    ├── client/ # OpenAI client
+    ├── config/ # Configuration
+    └── utils/  # Common utilities
+```
+
+## Core Components
+
+### Server Core
+- Request routing and validation
+- HTTP handlers
+- Response formatting
+
+### Batch Orchestrator
+- Request collection
+- Batch timing management
+- Response distribution
+
+### Cache System
+- Request hashing
+- Response storage
+- Cache invalidation
+
+### Monitor Tool
+- Status visualization
+- Batch tracking
+- Progress reporting
+
+### Database Layer
+- MongoDB integration
+- Persistence management
+- Status tracking
+
+## Data Flow
+
+1. **Request Reception**
+   - Request validation
+   - Cache checking
+   - Hash generation
+
+2. **Batch Processing**
+   - Request collection
+   - Batch formation
+   - OpenAI submission
+
+3. **Response Handling**
+   - Response collection
+   - Cache updating
+   - Client delivery
+
+4. **Monitoring**
+   - Status tracking
+   - Progress updates
+   - Error logging
+
+## Component Interaction
+
+```mermaid
+graph TD
+    A[Client] --> B[Server Core]
+    B --> C[Cache System]
+    B --> D[Batch Orchestrator]
+    D --> E[OpenAI API]
+    D --> F[MongoDB]
+    C --> F
+    G[Monitor Tool] --> F
+```
+
+## Design Decisions
+
+### Batch Processing
+- Configurable batch windows
+- Hash-based deduplication
+- Async/sync flexibility
+
+### Caching Strategy
+- MongoDB-based persistence
+- Hash-based indexing
+- Cross-session availability
+
+### Monitoring
+- Real-time updates
+- Interactive interface
+- Status aggregation
+```
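
Note on `docs/architecture.md`: the Batch Orchestrator and Design Decisions sections mention configurable batch windows and hash-based deduplication without illustrating how the two combine. A rough sketch, assuming a window opens at the first request it receives and closes a fixed number of milliseconds later, and that duplicate hashes within one window are submitted once. The function `collate` and the sample timings are hypothetical, and the real orchestrator may behave differently.

```python
from typing import List, Optional, Set, Tuple


def collate(arrivals: List[Tuple[int, str]], window_ms: int) -> List[List[str]]:
    """Group (arrival_ms, request_hash) pairs into deduplicated batches."""
    batches: List[List[str]] = []
    window_start: Optional[int] = None
    seen: Set[str] = set()
    for arrival_ms, request_hash in sorted(arrivals):
        # Open a new batch for the first request, or once the window has elapsed.
        if window_start is None or arrival_ms - window_start >= window_ms:
            batches.append([])
            window_start = arrival_ms
            seen = set()
        # Within one window, an already-seen hash is not submitted again.
        if request_hash not in seen:
            seen.add(request_hash)
            batches[-1].append(request_hash)
    return batches


# Hypothetical arrivals with a 3000 ms window, matching the value used for
# COLLATE_BATCHES_FOR_DURATION_IN_MS in the docs.
arrivals = [(0, "h1"), (1200, "h2"), (2900, "h1"), (3000, "h3"), (3500, "h2")]
batches = collate(arrivals, window_ms=3000)
```

Here the repeat of `h1` at 2900 ms joins the first window and is dropped, while `h2` at 3500 ms falls into the second window and is submitted again.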