Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 157a6d1, commit d3df8fc
Showing 11 changed files with 1,022 additions and 211 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,34 @@ | ||
theme: jekyll-theme-cayman | ||
remote_theme: just-the-docs/just-the-docs | ||
title: Batch-GPT Documentation | ||
description: Documentation for the Batch-GPT service | ||
|
||
# Theme settings | ||
color_scheme: dark | ||
search_enabled: true | ||
heading_anchors: true | ||
|
||
# Aux links for the upper right navigation | ||
aux_links: | ||
  "Batch-GPT on GitHub": | ||
    - "//github.com/tanmay17061/batch-gpt" | ||
|
||
# Makes Aux links open in a new tab | ||
aux_links_new_tab: true | ||
|
||
# Navigation Structure | ||
nav_external_links: | ||
  - title: Batch-GPT on GitHub | ||
    url: https://github.com/tanmay17061/batch-gpt | ||
    hide_icon: false | ||
|
||
# Enable copy code button | ||
enable_copy_code_button: true | ||
|
||
# Footer content | ||
footer_content: "Copyright © 2024 Batch-GPT. Distributed under the <a href=\"https://github.com/tanmay17061/batch-gpt/blob/main/LICENSE\">Apache License 2.0.</a>" | ||
|
||
# Collections for organizing documentation | ||
collections: | ||
  docs: | ||
    permalink: "/:collection/:path/" | ||
    output: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
--- | ||
layout: default | ||
title: Advanced Features | ||
nav_order: 6 | ||
--- | ||
|
||
# Advanced Features | ||
{: .no_toc } | ||
|
||
## Table of contents | ||
{: .no_toc .text-delta } | ||
|
||
1. TOC | ||
{:toc} | ||
|
||
--- | ||
|
||
## Serving Modes | ||
|
||
### Synchronous Mode | ||
```bash | ||
export CLIENT_SERVING_MODE=sync # Default | ||
``` | ||
- Blocks until response is available | ||
- Similar to standard OpenAI API | ||
- Best for low-volume scenarios | ||
|
||
### Asynchronous Mode | ||
```bash | ||
export CLIENT_SERVING_MODE=async | ||
``` | ||
- Returns immediately with submission confirmation | ||
- Suitable for high-volume applications | ||
- Requires separate status checking | ||
|
||
### Cache-Only Mode | ||
```bash | ||
export CLIENT_SERVING_MODE=cache | ||
``` | ||
- Serves only cached responses | ||
- No new API calls | ||
- Processes pending batches | ||
|
||
## Caching System | ||
|
||
### Cache Configuration | ||
- Automatic request hashing | ||
- MongoDB-based storage | ||
- Cross-session persistence | ||
|
||
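The automatic request hashing listed above could be sketched as follows. This is a minimal illustration rather than Batch-GPT's actual implementation: the helper name `request_hash`, the choice of SHA-256, and the canonical JSON serialization are all assumptions.

```python
import hashlib
import json


def request_hash(request: dict) -> str:
    """Return a stable hex digest for a chat completion request (hypothetical helper)."""
    # Sort keys and drop optional whitespace so logically equal requests
    # serialize to the same byte string.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}
b = {"messages": [{"role": "user", "content": "Hello!"}], "model": "gpt-3.5-turbo"}
same_key = request_hash(a) == request_hash(b)  # key order does not change the digest
```

A digest like this can serve as the lookup key in a MongoDB collection, which is one way cached responses might persist across sessions.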
### Cache Operations | ||
```python | ||
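# Cache hit example: an identical request is answered from the cache | ||
request = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]} | ||
response1 = client.chat.completions.create(**request) | ||
response2 = client.chat.completions.create(**request) # Same request returns cached response | ||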
``` | ||
|
||
## Batch Recovery | ||
|
||
### Automatic Recovery | ||
- Detects interrupted batches | ||
- Resumes processing on restart | ||
- Updates original requesters | ||
|
||
### Manual Recovery | ||
```bash | ||
# Check dangling batches | ||
python client.py --api status_all_batches --status_filter not_completed | ||
``` | ||
|
||
## Advanced Monitoring | ||
|
||
### Custom Polling Intervals | ||
```bash | ||
export COLLECT_BATCH_STATS_POLLING_MAX_INTERVAL_SECONDS=600 | ||
``` | ||
|
||
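How the service uses this value is not spelled out here. One plausible reading is that it caps a growing delay between batch status checks. The sketch below works under that assumption; `polling_delays`, the 5-second starting delay, and the doubling factor are hypothetical.

```python
import itertools
import os
from typing import Iterator


def polling_delays(max_interval: float, initial: float = 5.0, factor: float = 2.0) -> Iterator[float]:
    """Yield delays that grow geometrically but never exceed max_interval."""
    delay = initial
    while True:
        yield min(delay, max_interval)
        delay = min(delay * factor, max_interval)


# Read the cap from the environment, falling back to the value shown above.
max_interval = float(os.environ.get("COLLECT_BATCH_STATS_POLLING_MAX_INTERVAL_SECONDS", "600"))
schedule = list(itertools.islice(polling_delays(max_interval), 9))
```

With a cap of 600 seconds, the delays grow from 5 seconds and then stay at 600 once the doubled value would exceed it.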
### Progress Tracking | ||
- Real-time completion statistics | ||
- Request counts monitoring | ||
- Error tracking | ||
|
||
## Performance Tuning | ||
|
||
### Batch Window Optimization | ||
```bash | ||
# Adjust batch collection window | ||
export COLLATE_BATCHES_FOR_DURATION_IN_MS=3000 # 3 seconds | ||
``` | ||
|
||
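The collection window can be pictured as grouping requests by arrival time. The sketch below assumes a batch opens with the first request and closes once the window has elapsed; the function `collate` and the sample timestamps are hypothetical and not taken from the server code.

```python
import os
from typing import List, Optional, Tuple


def collate(arrivals: List[Tuple[int, str]], window_ms: int) -> List[List[str]]:
    """Group (arrival_ms, request_id) pairs into batches spanning at most window_ms."""
    batches: List[List[str]] = []
    window_start: Optional[int] = None
    for arrival_ms, request_id in sorted(arrivals):
        # Open a new batch when none is open or the current window has elapsed.
        if window_start is None or arrival_ms - window_start >= window_ms:
            batches.append([])
            window_start = arrival_ms
        batches[-1].append(request_id)
    return batches


window_ms = int(os.environ.get("COLLATE_BATCHES_FOR_DURATION_IN_MS", "3000"))
arrivals = [(0, "a"), (1200, "b"), (2900, "c"), (3100, "d"), (9000, "e")]
batches = collate(arrivals, window_ms)
```

A longer window tends to put more requests into each batch at the cost of a longer wait before submission.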
### MongoDB Optimization | ||
- Index management | ||
- Connection pooling | ||
- Query optimization |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
--- | ||
layout: default | ||
title: API Reference | ||
nav_order: 9 | ||
--- | ||
|
||
# API Reference | ||
{: .no_toc } | ||
|
||
## Table of contents | ||
{: .no_toc .text-delta } | ||
|
||
1. TOC | ||
{:toc} | ||
|
||
--- | ||
|
||
## Chat Completions | ||
|
||
### Create Chat Completion | ||
|
||
`POST /v1/chat/completions` | ||
|
||
Request: | ||
```json | ||
{ | ||
  "model": "gpt-3.5-turbo", | ||
  "messages": [ | ||
    { | ||
      "role": "user", | ||
      "content": "Hello!" | ||
    } | ||
  ] | ||
} | ||
``` | ||
|
||
Response: | ||
```json | ||
{ | ||
  "id": "chatcmpl-123", | ||
  "object": "chat.completion", | ||
  "created": 1677652288, | ||
  "choices": [{ | ||
    "index": 0, | ||
    "message": { | ||
      "role": "assistant", | ||
      "content": "Hello! How can I help you today?" | ||
    }, | ||
    "finish_reason": "stop" | ||
  }], | ||
  "usage": { | ||
    "prompt_tokens": 9, | ||
    "completion_tokens": 12, | ||
    "total_tokens": 21 | ||
  } | ||
} | ||
``` | ||
|
||
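A client typically needs only a few fields from this response. The sketch below parses an abbreviated copy of the sample response and pulls out the assistant text. The helper names `first_reply` and `usage_is_consistent` are hypothetical and not part of any Batch-GPT client.

```python
import json

# Abbreviated copy of the sample response shown above.
RAW = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
"""


def first_reply(response: dict) -> str:
    """Return the assistant text of the first choice."""
    return response["choices"][0]["message"]["content"]


def usage_is_consistent(response: dict) -> bool:
    """Check that prompt and completion tokens add up to the reported total."""
    usage = response["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


response = json.loads(RAW)
reply = first_reply(response)
```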
## Batch Operations | ||
|
||
### Retrieve Batch | ||
|
||
`GET /v1/batches/{batch_id}` | ||
|
||
Response: | ||
```json | ||
{ | ||
  "batch": { | ||
    "id": "batch_123", | ||
    "status": "completed", | ||
    "created_at": 1678901234, | ||
    "expires_at": 1678987634, | ||
    "request_counts": { | ||
      "total": 10, | ||
      "completed": 10 | ||
    } | ||
  } | ||
} | ||
``` | ||
|
||
### List Batches | ||
|
||
`GET /v1/batches` | ||
|
||
Response: | ||
```json | ||
{ | ||
  "data": [ | ||
    { | ||
      "batch": { | ||
        "id": "batch_123", | ||
        "status": "completed", | ||
        "created_at": 1678901234, | ||
        "expires_at": 1678987634, | ||
        "request_counts": { | ||
          "total": 10, | ||
          "completed": 10 | ||
        } | ||
      } | ||
    } | ||
  ] | ||
} | ||
``` | ||
|
||
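Given a listing in the shape above, a caller might want the batches that are still outstanding and a rough progress figure for each. The following is a sketch only: the helpers `unfinished_batches` and `completion_ratio`, the second batch `batch_456`, and its `in_progress` status are illustrative assumptions.

```python
from typing import List


def unfinished_batches(listing: dict) -> List[str]:
    """Return the ids of batches whose status is not "completed"."""
    return [
        entry["batch"]["id"]
        for entry in listing["data"]
        if entry["batch"]["status"] != "completed"
    ]


def completion_ratio(batch: dict) -> float:
    """Fraction of requests completed, or 0.0 for an empty batch."""
    counts = batch["request_counts"]
    return counts["completed"] / counts["total"] if counts["total"] else 0.0


listing = {
    "data": [
        {"batch": {"id": "batch_123", "status": "completed",
                   "request_counts": {"total": 10, "completed": 10}}},
        # Hypothetical second entry, added for illustration.
        {"batch": {"id": "batch_456", "status": "in_progress",
                   "request_counts": {"total": 8, "completed": 2}}},
    ]
}
pending_ids = unfinished_batches(listing)
```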
## Error Handling | ||
|
||
### Error Response Format | ||
```json | ||
{ | ||
  "error": { | ||
    "type": "invalid_request_error", | ||
    "message": "Description of the error" | ||
  } | ||
} | ||
``` | ||
|
||
### Common Error Types | ||
- `invalid_request_error` | ||
- `internal_server_error` | ||
- `batch_processing_error` |
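
On the client side, the error format above can be turned into an exception. This is a minimal sketch: the `BatchGPTError` class and the `raise_for_error` helper are hypothetical names, not part of the service or of the OpenAI client library.

```python
class BatchGPTError(Exception):
    """Hypothetical client-side exception wrapping an error response."""

    def __init__(self, error_type: str, message: str) -> None:
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.message = message


def raise_for_error(body: dict) -> dict:
    """Return body unchanged, or raise BatchGPTError if it carries an "error" object."""
    error = body.get("error")
    if error is not None:
        raise BatchGPTError(error.get("type", "unknown_error"), error.get("message", ""))
    return body
```

Callers can then branch on `error_type`, for example retrying on `internal_server_error` but not on `invalid_request_error`.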
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
--- | ||
layout: default | ||
title: Architecture | ||
nav_order: 4 | ||
--- | ||
|
||
# Architecture | ||
{: .no_toc } | ||
|
||
## Table of contents | ||
{: .no_toc .text-delta } | ||
|
||
1. TOC | ||
{:toc} | ||
|
||
--- | ||
|
||
## Overview | ||
|
||
Batch-GPT follows a modular architecture with clear separation of concerns: | ||
|
||
``` | ||
server/ | ||
├── db/           # Database interactions | ||
├── handlers/     # HTTP request handlers | ||
├── logger/       # Custom logging | ||
├── models/       # Data models | ||
└── services/     # Business logic | ||
    ├── batch/    # Batch processing | ||
    ├── cache/    # Response caching | ||
    ├── client/   # OpenAI client | ||
    ├── config/   # Configuration | ||
    └── utils/    # Common utilities | ||
``` | ||
|
||
## Core Components | ||
|
||
### Server Core | ||
- Request routing and validation | ||
- HTTP handlers | ||
- Response formatting | ||
|
||
### Batch Orchestrator | ||
- Request collection | ||
- Batch timing management | ||
- Response distribution | ||
|
||
### Cache System | ||
- Request hashing | ||
- Response storage | ||
- Cache invalidation | ||
|
||
### Monitor Tool | ||
- Status visualization | ||
- Batch tracking | ||
- Progress reporting | ||
|
||
### Database Layer | ||
- MongoDB integration | ||
- Persistence management | ||
- Status tracking | ||
|
||
## Data Flow | ||
|
||
1. **Request Reception** | ||
   - Request validation | ||
   - Cache checking | ||
   - Hash generation | ||
|
||
2. **Batch Processing** | ||
   - Request collection | ||
   - Batch formation | ||
   - OpenAI submission | ||
|
||
3. **Response Handling** | ||
   - Response collection | ||
   - Cache updating | ||
   - Client delivery | ||
|
||
4. **Monitoring** | ||
   - Status tracking | ||
   - Progress updates | ||
   - Error logging | ||
|
||
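The first three steps of the flow above can be mimicked with a small in-memory model. This is a toy sketch and not the real server: the class `ToyBatchServer`, its method names, and the `submit_batch` callback standing in for the OpenAI submission are all hypothetical, and plain dictionaries replace MongoDB.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional


class ToyBatchServer:
    """In-memory stand-in for the request flow; not the real server."""

    def __init__(self, submit_batch: Callable[[List[dict]], List[str]]) -> None:
        self.cache: Dict[str, str] = {}     # request hash -> response
        self.pending: Dict[str, dict] = {}  # request hash -> request awaiting a batch
        self.submit_batch = submit_batch    # stand-in for the OpenAI submission

    @staticmethod
    def _key(request: dict) -> str:
        return hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()

    def receive(self, request: dict) -> Optional[str]:
        """Step 1: validate, hash, and check the cache; queue the request on a miss."""
        if "messages" not in request:
            raise ValueError("request needs a 'messages' field")
        key = self._key(request)
        if key in self.cache:
            return self.cache[key]
        self.pending[key] = request  # identical requests collapse onto one key
        return None

    def flush(self) -> int:
        """Steps 2 and 3: submit the pending batch and cache each response."""
        keys = list(self.pending)
        responses = self.submit_batch([self.pending[k] for k in keys])
        self.cache.update(zip(keys, responses))
        self.pending.clear()
        return len(keys)


server = ToyBatchServer(lambda batch: [f"echo: {r['messages'][0]['content']}" for r in batch])
req = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}
first = server.receive(req)   # cache miss: the request is queued
server.receive(req)           # duplicate: deduplicated by its hash
flushed = server.flush()      # one request is submitted
second = server.receive(req)  # served from the cache
```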
## Component Interaction | ||
|
||
```mermaid | ||
graph TD | ||
A[Client] --> B[Server Core] | ||
B --> C[Cache System] | ||
B --> D[Batch Orchestrator] | ||
D --> E[OpenAI API] | ||
D --> F[MongoDB] | ||
C --> F | ||
G[Monitor Tool] --> F | ||
``` | ||
|
||
## Design Decisions | ||
|
||
### Batch Processing | ||
- Configurable batch windows | ||
- Hash-based deduplication | ||
- Async/sync flexibility | ||
|
||
### Caching Strategy | ||
- MongoDB-based persistence | ||
- Hash-based indexing | ||
- Cross-session availability | ||
|
||
### Monitoring | ||
- Real-time updates | ||
- Interactive interface | ||
- Status aggregation |