change theme, modularize docs
tanmay17061 committed Nov 3, 2024
1 parent 157a6d1 commit d3df8fc
Showing 11 changed files with 1,022 additions and 211 deletions.
33 changes: 32 additions & 1 deletion docs/_config.yml
Original file line number Diff line number Diff line change
@@ -1,3 +1,34 @@
theme: jekyll-theme-cayman
remote_theme: just-the-docs/just-the-docs
title: Batch-GPT Documentation
description: Documentation for the Batch-GPT service

# Theme settings
color_scheme: dark
search_enabled: true
heading_anchors: true

# Aux links for the upper right navigation
aux_links:
  "Batch-GPT on GitHub":
    - "//github.com/tanmay17061/batch-gpt"

# Makes Aux links open in a new tab
aux_links_new_tab: true

# Navigation Structure
nav_external_links:
  - title: Batch-GPT on GitHub
    url: https://github.com/tanmay17061/batch-gpt
    hide_icon: false

# Enable copy code button
enable_copy_code_button: true

# Footer content
footer_content: "Copyright &copy; 2024 Batch-GPT. Distributed under the <a href=\"https://github.com/tanmay17061/batch-gpt/blob/main/LICENSE\">Apache License 2.0.</a>"

# Collections for organizing documentation
collections:
  docs:
    permalink: "/:collection/:path/"
    output: true
94 changes: 94 additions & 0 deletions docs/advanced-features.md
@@ -0,0 +1,94 @@
---
layout: default
title: Advanced Features
nav_order: 6
---

# Advanced Features
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Serving Modes

### Synchronous Mode
```bash
export CLIENT_SERVING_MODE=sync # Default
```
- Blocks until response is available
- Similar to standard OpenAI API
- Best for low-volume scenarios

### Asynchronous Mode
```bash
export CLIENT_SERVING_MODE=async
```
- Returns immediately with submission confirmation
- Suitable for high-volume applications
- Requires separate status checking

### Cache-Only Mode
```bash
export CLIENT_SERVING_MODE=cache
```
- Serves only cached responses
- No new API calls
- Processes pending batches

## Caching System

### Cache Configuration
- Automatic request hashing
- MongoDB-based storage
- Cross-session persistence

### Cache Operations
```python
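# Cache hit example: `client` is assumed to be an OpenAI client pointed at the Batch-GPT server
request = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}
response1 = client.chat.completions.create(**request)  # First call: no cached response yet
response2 = client.chat.completions.create(**request)  # Same request returns cached response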
```
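
The docs list "automatic request hashing" without describing the scheme. One plausible approach is to hash a canonical JSON serialization of the request body, so that two requests with the same content map to the same cache key. The sketch below assumes SHA-256 over sorted-key JSON; the helper name `request_cache_key` is hypothetical and Batch-GPT's actual hashing may differ.

```python
import hashlib
import json


def request_cache_key(request_body: dict) -> str:
    """Return a deterministic hex digest for a chat-completion request body.

    Hypothetical helper: the real service may hash different fields.
    """
    # Sorting keys and fixing separators makes the serialization independent
    # of dict insertion order and whitespace.
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order: both map to the same cache key.
a = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}
b = {"messages": [{"content": "Hello!", "role": "user"}], "model": "gpt-3.5-turbo"}
print(request_cache_key(a) == request_cache_key(b))
```

Canonicalizing before hashing matters because clients rarely serialize fields in the same order.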

## Batch Recovery

### Automatic Recovery
- Detects interrupted batches
- Resumes processing on restart
- Updates original requesters

### Manual Recovery
```bash
# Check dangling batches
python client.py --api status_all_batches --status_filter not_completed
```

## Advanced Monitoring

### Custom Polling Intervals
```bash
export COLLECT_BATCH_STATS_POLLING_MAX_INTERVAL_SECONDS=600
```
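
The variable name suggests an upper bound on how long the stats collector waits between polls. The sketch below shows what a capped, growing interval could look like. The doubling schedule, the initial interval, and the function name `polling_intervals` are assumptions for illustration; the source only documents the maximum.

```python
import os


def polling_intervals(initial_seconds: float, max_seconds: float, steps: int) -> list:
    """Return `steps` wait times that double each round but never exceed the cap."""
    intervals = []
    current = initial_seconds
    for _ in range(steps):
        intervals.append(min(current, max_seconds))
        current *= 2  # Assumption: exponential growth between polls.
    return intervals


# Read the cap the same way the export above sets it, falling back to 600.
max_interval = float(os.environ.get("COLLECT_BATCH_STATS_POLLING_MAX_INTERVAL_SECONDS", "600"))
print(polling_intervals(5, max_interval, 9))
```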

### Progress Tracking
- Real-time completion statistics
- Request counts monitoring
- Error tracking

## Performance Tuning

### Batch Window Optimization
```bash
# Adjust batch collection window
export COLLATE_BATCHES_FOR_DURATION_IN_MS=3000 # 3 seconds
```
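
To see how the window size affects batching, it helps to model collation on arrival timestamps. The sketch below assumes a window opens when its first request arrives and closes `window_ms` later; the function name `collate_by_window` and that exact windowing rule are illustrative assumptions, not the server's confirmed behavior.

```python
def collate_by_window(arrivals, window_ms):
    """Group (arrival_ms, request_id) pairs into batches of request IDs.

    A batch opens with its first request and accepts requests that arrive
    less than `window_ms` after that first one.
    """
    batches = []
    window_start = None
    for arrival_ms, request_id in sorted(arrivals):
        if window_start is None or arrival_ms - window_start >= window_ms:
            batches.append([])  # Start a new batch and a new window.
            window_start = arrival_ms
        batches[-1].append(request_id)
    return batches


arrivals = [(0, "a"), (1200, "b"), (2999, "c"), (3000, "d"), (6500, "e")]
print(collate_by_window(arrivals, 3000))
```

A longer window produces fewer, larger batches at the cost of added latency for the first request in each window.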

### MongoDB Optimization
- Index management
- Connection pooling
- Query optimization
120 changes: 120 additions & 0 deletions docs/api-reference.md
@@ -0,0 +1,120 @@
---
layout: default
title: API Reference
nav_order: 9
---

# API Reference
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Chat Completions

### Create Chat Completion

`POST /v1/chat/completions`

Request:
```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}
```

Response:
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
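
As a rough illustration, the request above can be built with the Python standard library alone. The base URL `http://localhost:8080` and the helper name `build_chat_completion_request` are assumptions; substitute the address your Batch-GPT server actually listens on. The sketch only constructs the request; sending it is left commented out because it requires a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # Assumption: adjust to your deployment.


def build_chat_completion_request(content: str, model: str = "gpt-3.5-turbo") -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_completion_request("Hello!")

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```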

## Batch Operations

### Retrieve Batch

`GET /v1/batches/{batch_id}`

Response:
```json
{
  "batch": {
    "id": "batch_123",
    "status": "completed",
    "created_at": 1678901234,
    "expires_at": 1678987634,
    "request_counts": {
      "total": 10,
      "completed": 10
    }
  }
}
```

### List Batches

`GET /v1/batches`

Response:
```json
{
  "data": [
    {
      "batch": {
        "id": "batch_123",
        "status": "completed",
        "created_at": 1678901234,
        "expires_at": 1678987634,
        "request_counts": {
          "total": 10,
          "completed": 10
        }
      }
    }
  ]
}
```
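
A client can turn the list response into a per-batch completion ratio using the `request_counts` fields shown above. This is a minimal sketch that assumes the response has exactly the shape in the example; the function name `completion_by_batch` is hypothetical.

```python
def completion_by_batch(list_response: dict) -> dict:
    """Map each batch ID to the fraction of its requests that have completed."""
    progress = {}
    for item in list_response["data"]:
        batch = item["batch"]
        counts = batch["request_counts"]
        total = counts["total"]
        # Guard against empty batches to avoid dividing by zero.
        progress[batch["id"]] = counts["completed"] / total if total else 0.0
    return progress


sample = {
    "data": [
        {
            "batch": {
                "id": "batch_123",
                "status": "completed",
                "created_at": 1678901234,
                "expires_at": 1678987634,
                "request_counts": {"total": 10, "completed": 10},
            }
        }
    ]
}
print(completion_by_batch(sample))
```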

## Error Handling

### Error Response Format
```json
{
"error": {
"type": "invalid_request_error",
"message": "Description of the error"
}
}
```

### Common Error Types
- `invalid_request_error`
- `internal_server_error`
- `batch_processing_error`
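
On the client side, the error format above maps naturally onto an exception. The sketch below assumes error bodies always follow the documented `{"error": {"type", "message"}}` shape; `BatchGPTError` and `raise_for_error` are hypothetical names, not part of any published client.

```python
class BatchGPTError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.message = message


def raise_for_error(body: dict) -> None:
    """Raise BatchGPTError if a decoded response body contains an error object."""
    error = body.get("error")
    if error is not None:
        raise BatchGPTError(error.get("type", "unknown_error"), error.get("message", ""))


try:
    raise_for_error({"error": {"type": "invalid_request_error", "message": "Description of the error"}})
except BatchGPTError as exc:
    print(exc.error_type, "-", exc.message)
```

Keeping `error_type` as an attribute lets callers branch on it, for example to retry only on `internal_server_error`.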
114 changes: 114 additions & 0 deletions docs/architecture.md
@@ -0,0 +1,114 @@
---
layout: default
title: Architecture
nav_order: 4
---

# Architecture
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

Batch-GPT follows a modular architecture with clear separation of concerns:

```
server/
├── db/                 # Database interactions
├── handlers/           # HTTP request handlers
├── logger/             # Custom logging
├── models/             # Data models
└── services/           # Business logic
    ├── batch/          # Batch processing
    ├── cache/          # Response caching
    ├── client/         # OpenAI client
    ├── config/         # Configuration
    └── utils/          # Common utilities

## Core Components

### Server Core
- Request routing and validation
- HTTP handlers
- Response formatting

### Batch Orchestrator
- Request collection
- Batch timing management
- Response distribution

### Cache System
- Request hashing
- Response storage
- Cache invalidation

### Monitor Tool
- Status visualization
- Batch tracking
- Progress reporting

### Database Layer
- MongoDB integration
- Persistence management
- Status tracking

## Data Flow

1. **Request Reception**
   - Request validation
   - Cache checking
   - Hash generation

2. **Batch Processing**
   - Request collection
   - Batch formation
   - OpenAI submission

3. **Response Handling**
   - Response collection
   - Cache updating
   - Client delivery

4. **Monitoring**
   - Status tracking
   - Progress updates
   - Error logging
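
The first three stages above can be modeled in a few lines to make the flow concrete. This is a toy sketch, not the server's implementation: a dict stands in for MongoDB, `submit_batch` stands in for the OpenAI submission, and the class name `InMemoryPipeline` and the SHA-256 hashing are assumptions.

```python
import hashlib
import json


class InMemoryPipeline:
    """Toy model of reception, batching, and response handling."""

    def __init__(self, submit_batch):
        self.cache = {}    # hash -> response (stand-in for the cache store)
        self.pending = {}  # hash -> request awaiting the next batch
        self.submit_batch = submit_batch  # callable: list of requests -> list of responses

    def receive(self, request: dict):
        """Validate, hash, and check the cache; queue the request on a miss."""
        if "model" not in request or "messages" not in request:
            raise ValueError("request needs 'model' and 'messages'")
        key = hashlib.sha256(json.dumps(request, sort_keys=True).encode("utf-8")).hexdigest()
        if key in self.cache:
            return self.cache[key]
        self.pending[key] = request  # Identical requests collapse onto one key.
        return None

    def flush(self) -> int:
        """Submit all pending requests as one batch and cache the responses."""
        keys = list(self.pending)
        responses = self.submit_batch([self.pending[key] for key in keys])
        for key, response in zip(keys, responses):
            self.cache[key] = response
        self.pending.clear()
        return len(keys)


def fake_upstream(requests):
    """Stand-in for the upstream API: echoes the last message of each request."""
    return [{"echo": r["messages"][-1]["content"]} for r in requests]


pipeline = InMemoryPipeline(fake_upstream)
hello = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}
pipeline.receive(hello)         # Cache miss: queued for the next batch.
pipeline.receive(hello)         # Duplicate: deduplicated by hash.
pipeline.flush()                # One request submitted upstream.
print(pipeline.receive(hello))  # Now served from the cache.
```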

## Component Interaction

```mermaid
graph TD
A[Client] --> B[Server Core]
B --> C[Cache System]
B --> D[Batch Orchestrator]
D --> E[OpenAI API]
D --> F[MongoDB]
C --> F
G[Monitor Tool] --> F
```

## Design Decisions

### Batch Processing
- Configurable batch windows
- Hash-based deduplication
- Async/sync flexibility

### Caching Strategy
- MongoDB-based persistence
- Hash-based indexing
- Cross-session availability

### Monitoring
- Real-time updates
- Interactive interface
- Status aggregation