docs: refactoring #105

Open · wants to merge 2 commits into base: develop
Binary file added docs/app/icon.png
10 changes: 1 addition & 9 deletions docs/content/docs/apis/introduction.mdx
Original file line number Diff line number Diff line change
@@ -13,15 +13,7 @@ With OramaCore we aim to provide a set of APIs that are backward compatible - or

At the time of writing, with OramaCore being in beta, we are still working on the APIs and SDKs. We are also working on the documentation, so please bear with us.

## Philosophy

The one imperative we have when designing the OramaCore APIs is to make them as simple as possible. We want to make it easy for developers to get started with OramaCore, and to make it easy for them to build applications that use OramaCore.

Any additional steps, any additional complexity, any additional boilerplate, is a failure on our part. We want to make it as easy as possible for you to use OramaCore.

If you think we should improve on this front, please let us know at [[email protected]](mailto:[email protected]). We are always looking for feedback.

## REST APIs vs SDKs
## APIs & SDKs

We will provide both REST APIs and SDKs for OramaCore.

2 changes: 1 addition & 1 deletion docs/content/docs/apis/meta.json
@@ -1,5 +1,5 @@
{
"title": "APIs",
"title": "APIs Reference",
"pages": [
"introduction",
"create-collection",
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: OramaCore Architecture
title: Overview
description: A deep dive into the OramaCore architecture.
---

@@ -82,7 +82,7 @@ Future versions of OramaCore will move away from this approach by either integra

### Embeddings Generation

OramaCore automatically generates embeddings for your data. You can configure which models to use via the [configuration](/docs/getting-started/configuration).
OramaCore automatically generates embeddings for your data. You can configure which models to use via the [configuration](/docs/guide/configuration).

Current benchmarks indicate this implementation can generate up to 1,200 embeddings per second on an RTX 4080 Super. We acknowledge this seems optimistic and will release reproducible benchmarks soon.

38 changes: 38 additions & 0 deletions docs/content/docs/architecture/write-read.mdx
@@ -0,0 +1,38 @@
---
title: Write & Read Side
description: OramaCore is a modular system, allowing it to run as a monolith or as a distributed system. We split the system into two distinct sides.
---

OramaCore is a modular system. We allow it to run as a monolith - where all the components are running in a single process - or as a distributed system, where you can scale each component independently.

To allow this, we split the system into two distinct sides: the **Write Side** and the **Read Side**.

If you're running OramaCore in a single node, you won't notice the difference. But if you're running it in a distributed system, you can scale the write side independently from the read side.

## Write Side

The write side is responsible for ingesting data, generating embeddings, and storing them in the vector database. It's also responsible for generating the full-text search index.

It's the part of the system that requires the most GPU power and memory, as it needs to generate a lot of content, embeddings, and indexes.

In detail, the write side is responsible for:

- **Ingesting data**. It creates a buffer of documents and flushes them to the vector database and the full-text search index, rebuilding the immutable data structures used for search.
- **Generating embeddings**. It generates text embeddings for large datasets without interfering with the search performance.
- **Expanding content (coming soon)**. It is capable of reading images, code blocks, and other types of content, and generating descriptions and metadata for them.

Every insertion, deletion, or update of a document will be handled by the write side.

## Read Side

The read side is responsible for handling queries, searching for documents, and returning the results to the user.

It's also the home of the Answer Engine, which is responsible for generating answers to questions and performing chain of actions based on the user's input.

In detail, the read side is responsible for:

- **Handling queries**. It receives the user's query, translates it into a query that the vector database can understand, and returns the results.
- **Searching for documents**. It searches for documents in the full-text search index and the vector database.
- **Answer Engine**. It generates answers to questions, performs chain of actions, and runs custom agents.

Every query, question, or action will be handled by the read side.
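
As a sketch of how the split surfaces in configuration, each side keeps its own section in `config.yaml`, so a deployment can tune or scale them independently (the `reader_side` value shown here is an assumption; see the configuration guide for the real schema):

```yaml
# Hypothetical minimal layout: one section per side.
writer_side:
  data_dir: ./.data/writer
reader_side:
  data_dir: ./.data/reader   # assumed default, mirroring the writer side
```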
Original file line number Diff line number Diff line change
@@ -188,4 +188,4 @@ In this example, we create a collection named `products` that uses the `BGESmall

Since OramaCore ships with a JavaScript runtime integrated, you can use JavaScript hooks to customize text extraction and transformation.

Since this is a more advanced topic, we decided to dedicate it an entire section. Please refer to the [JavaScript Hooks](/docs/getting-started/javascript-hooks#selectembeddingproperties) documentation for more information.
Since this is a more advanced topic, we decided to dedicate it an entire section. Please refer to the [JavaScript Hooks](/docs/customizations/javascript-hooks/selectEmbeddingProperties) documentation for more information.
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ description: "API keys are used to authenticate requests to the OramaCore API"
---
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';

[As explained in the introduction](/docs#write-side-read-side), OramaCore is split in two sides: the **reader side** and the **writer side**.
As explained in the [Architecture](/docs/architecture/write-read) section, OramaCore is split in two sides: the **reader side** and the **writer side**.

Therefore, depending on the operation you want to perform, you will need to use different API keys.

Original file line number Diff line number Diff line change
@@ -1,12 +1,13 @@
---
title: Configuration
description: Learn how to configure OramaCore
icon: Cog
---

<Callout type='warn'>
OramaCore is currently under active development. Our goal is to release the first Beta version (**v0.1.0**) on **Jan 31st, 2025**, and the first stable version (**v1.0.0**) on **Feb 28th, 2025**.

While the system is already quite stable, please note that APIs will undergo changes in **v0.1.0** and **v1.0.0**.
</Callout>

## Configuring OramaCore

@@ -84,7 +85,7 @@ The `writer_side` section configures the writer side of OramaCore. Here are the
- `data_dir`: The directory where the writer side will persist the data on disk. By default, it's set to `./.data/writer`.
- `embedding_queue_limit`: The maximum number of embeddings that can be stored in the queue before the writer starts to be blocked. By default, it's set to `50000`.
- `insert_batch_commit_size`: The number of document insertions after which the write side will commit the changes. By default, it's set to `5000`.
- `default_embedding_model`: The default embedding model used to calculate the embeddings if not specified in the collection creation. By default, it's set to `MultilingualE5Small`. See more about the available models in the [Embedding Models](/docs/getting-started/text-embeddings) section.
- `default_embedding_model`: The default embedding model used to calculate the embeddings if not specified in the collection creation. By default, it's set to `MultilingualE5Small`. See more about the available models in the [Embedding Models](/docs/customizations/text-embeddings) section.
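
Putting these defaults together, a `writer_side` section might look like the following sketch (key names are as listed above; the exact nesting is an assumption):

```yaml
writer_side:
  data_dir: ./.data/writer
  embedding_queue_limit: 50000
  insert_batch_commit_size: 5000
  default_embedding_model: MultilingualE5Small
```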

## `reader_side`

@@ -108,7 +109,7 @@ The `ai_server` section configures the Python gRPC server that is responsible fo

The `embeddings` section configures the embeddings calculation. Here are the available options:

- `default_model_group`: The default model group used to calculate the embeddings if not specified in the collection creation. By default, it's set to `multilingual`. See more about the available models in the [Embedding Models](/docs/getting-started/text-embeddings) section.
- `default_model_group`: The default model group used to calculate the embeddings if not specified in the collection creation. By default, it's set to `multilingual`. See more about the available models in the [Embedding Models](/docs/customizations/text-embeddings) section.
- `dynamically_load_models`: Whether to dynamically load the models. By default, it's set to `false`.
- `execution_providers`: The execution providers used to calculate the embeddings. By default, it's set to `CUDAExecutionProvider` and `CPUExecutionProvider`.
- `total_threads`: The total number of threads used to calculate the embeddings. By default, it's set to `8`.
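
Putting these defaults together, a sketch of the embeddings configuration (nesting under `ai_server` is an assumption based on the section layout):

```yaml
ai_server:
  embeddings:
    default_model_group: multilingual
    dynamically_load_models: false
    execution_providers:
      - CUDAExecutionProvider
      - CPUExecutionProvider
    total_threads: 8
```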
Original file line number Diff line number Diff line change
@@ -1,12 +1,13 @@
---
title: Running OramaCore
description: "Downloading, building, and running OramaCore on your machine or in production. "
icon: Play
title: Install OramaCore
description: "Downloading, building, and running OramaCore on your machine or in production."
---

<Callout type='warn'>
OramaCore is currently under active development. Our goal is to release the first Beta version (**v0.1.0**) on **Jan 31st, 2025**, and the first stable version (**v1.0.0**) on **Feb 28th, 2025**.

While the system is already quite stable, please note that APIs will undergo changes in **v0.1.0** and **v1.0.0**.
</Callout>

## Using Docker

@@ -16,7 +17,7 @@ The simplest way to get started is by pulling the official Docker image from Doc
docker pull oramasearch/oramacore:latest
```

Create a [config.yaml configuration file](/docs/getting-started/configuration) and then run the Docker image:
Create a [config.yaml configuration file](/docs/guide/configuration) and then run the Docker image:

```sh
docker run \
@@ -91,7 +92,7 @@ Then, install the dependencies:
pip install -r requirements.txt # or pip install -r requirements-cpu.txt
```

When you run the server, OramaCore will automatically download the required models specified in the [configuration file](/docs/getting-started/configuration).
When you run the server, OramaCore will automatically download the required models specified in the [configuration file](/docs/guide/configuration).

The download time will depend on your internet connection.

114 changes: 78 additions & 36 deletions docs/content/docs/index.mdx
@@ -1,21 +1,14 @@
---
title: Introduction
description: An introduction to OramaCore - a complex AI architecture made easy and open-source.
icon: Album
title: Getting Started
description: Getting started with OramaCore - a complex AI architecture made easy and open-source.
---
import { File, Folder, Files } from 'fumadocs-ui/components/files';
import { SearchIcon, DatabaseIcon, WholeWordIcon, FileJson } from 'lucide-react';

Building search engines, copilots, answer systems, or pretty much any AI project is harder than it should be.

Even in the simplest cases, you'll need a vector database, a connection to an LLM for generating embeddings, a solid chunking mechanism, and another LLM to generate answers.
And that's without even considering your specific needs, where all these pieces need to work together in a way that's unique to your use case.

On top of that, you're likely forced to add multiple layers of network-based communication, deal with third-party slowdowns beyond your control, and address all the typical challenges we consider when building high-performance, high-quality applications.
Building AI projects like search engines or copilots **is harder than it should be**, requiring vector databases, LLMs, chunking, and seamless integration while handling network slowdowns and performance issues. OramaCore simplifies this with a unified, opinionated server for easier development and customization.

OramaCore simplifies the chaos of setting up and maintaining a complex architecture. It gives you a single, easy-to-use, opinionated server that's designed to help you create tailored solutions for your own unique challenges.

## Why OramaCore
## Quick Start

OramaCore gives you everything you need **in a single Dockerfile**.

@@ -45,48 +38,97 @@ You're getting access to:
</Card>
</Cards>

All from a single, self-contained image.
All from a single, self-contained image.

To run the image, you can use the following command:

```sh
docker run \
-p 8080:8080 \
-v ${HOME}/.cache/huggingface:/root/.cache/huggingface \
-v ./config.yaml:/app/config.yaml \
--gpus all \
oramacore
```

## On being opinionated
### Configuration

When building OramaCore, we made a deliberate choice to create an opinionated system. We offer strong, general-purpose default configurations while still giving you the flexibility to customize them as needed.
To get started with OramaCore, you can use the default configuration. But if you want to customize it, you can do so by editing the `config.yaml` file.
You can customize the system to fit your specific needs. Check out the [configuration](/docs/guide/configuration) guide to learn more.

There are plenty of great vector databases and full-text search engines out there. But most of them don't work seamlessly together out of the box—they often require extensive fine-tuning to arrive at a functional solution.
### Create a collection

Our goal is to provide you with a platform that's ready to go the moment you pull a single Docker file.
To import data into OramaCore, you need to create a collection. A collection is a group of documents that you can search and interact with. You can create a collection by sending a POST request to the `/collections` endpoint with the collection ID and the API keys that will secure it.
The request must include an `Authorization` header with the master API key. Learn more about [API Keys](/docs/guide/api-keys).

```sh
curl -X POST \
http://localhost:8080/v0/collections \
-H 'Authorization: Bearer <master-api-key>' \
-d '{
"id": "products",
"write_api_key": "my-write-api-key",
"read_api_key": "my-read-api-key"
}'
```

### Add documents

## Write Side, Read Side
Once you have created a collection, you can add documents to it. A document is a JSON object that contains the data you want to search. You can add a document by sending a PATCH request to the `/collections/{COLLECTION_ID}/documents` endpoint with the document data.

OramaCore is a modular system. We allow it to run as a monolith - where all the components are running in a single process - or as a distributed system, where you can scale each component independently.
```sh
curl -X PATCH \
http://localhost:8080/v0/collections/{COLLECTION_ID}/documents \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <write_api_key>' \
-d '{
"id": "1",
"title": "My first document",
"content": "The quick brown fox jumps over the lazy dog."
}'
```
You can explore the [insert documents](/docs/apis/insert-documents) API reference to learn more about adding documents to a collection.

### Search

Now that you have added documents to your collection, you can perform your first search. Send a POST request to the `/search` endpoint with the search term and the collection ID to get the results.

```sh
curl -X POST \
  "http://localhost:8080/v0/collections/{COLLECTION_ID}/search?api-key=<read_api_key>" \
-H 'Content-Type: application/json' \
-d '{ "term": "The quick brown fox" }'
```

You can now perform unlimited, fast searches on your data using OramaCore! Check out the supported [Search Parameters](/docs/apis/search-documents#search-parameters) to learn how to customize your search results.
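
Responses come back as JSON, so they are easy to post-process from the shell. A sketch, assuming a hypothetical response shape with a `hits` array (the actual field names may differ; check the search API reference):

```sh
# Hypothetical search response; field names are assumptions for illustration.
RESPONSE='{"count": 1, "hits": [{"id": "1", "score": 0.98, "document": {"title": "My first document"}}]}'
echo "$RESPONSE" | python3 -c 'import sys, json; print(json.load(sys.stdin)["hits"][0]["document"]["title"])'
# prints: My first document
```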

To allow this, we split the system into two distinct sides: the **write side** and the **read side**.
Out of the box, OramaCore is ready to go with a powerful search engine, featuring Full Text search, Vector Search and Hybrid Search. You can start building your AI projects right away! 🚀

If you're running OramaCore in a single node, you won't notice the difference. But if you're running it in a distributed system, you can scale the write side independently from the read side.
---

## Why OramaCore?

### Write Side
Building search engines, copilots, answer systems, or pretty much any AI project is pretty challenging.
Even in the simplest cases, you'll need a vector database, a connection to an LLM for generating embeddings, a solid chunking mechanism, and another LLM to generate answers.
And that's without even considering your specific needs, where all these pieces need to work together in a way that's unique to your use case.

The write side is responsible for ingesting data, generating embeddings, and storing them in the vector database. It's also responsible for generating the full-text search index.
On top of that, you're likely forced to add multiple layers of network-based communication, deal with third-party slowdowns beyond your control, and address all the typical challenges we consider when building high-performance, high-quality applications.

It's the part of the system that requires the most GPU power and memory, as it needs to generate a lot of content, embeddings, and indexes.
OramaCore simplifies the chaos of setting up and maintaining a complex architecture. It gives you a single, easy-to-use, opinionated server that's designed to help you create tailored solutions for your own unique challenges.

In detail, the write side is responsible for:
### Philosophy

- **Ingesting data**. It creates a buffer of documents and flushes them to the vector database and the full-text search index, rebuilding the immutable data structures used for search.
- **Generating embeddings**. It generates text embeddings for large datasets without interfering with the search performance.
- **Expanding content (coming soon)**. It is capable of reading images, code blocks, and other types of content, and generating descriptions and metadata for them.
When building OramaCore, we made a deliberate choice to create **an opinionated system**. We offer strong, general-purpose default configurations while still giving you the flexibility to customize them as needed.

Every insertion, deletion, or update of a document will be handled by the write side.
There are plenty of great vector databases and full-text search engines out there. But most of them don't work seamlessly together out of the box—they often require extensive fine-tuning to arrive at a functional solution.

### Read Side
Our goal is to provide you with a platform that's ready to go the moment you pull a single Docker file.

The read side is responsible for handling queries, searching for documents, and returning the results to the user.
### OramaCore APIs

It's also the home of the Answer Engine, which is responsible for generating answers to questions and performing chain of actions based on the user's input.
The one imperative we have when designing the OramaCore APIs is to make them as simple as possible. We want to make it easy for developers to get started with OramaCore, and to make it easy for them to build applications that use OramaCore.

In detail, the read side is responsible for:
Any additional steps, any additional complexity, any additional boilerplate, is a failure on our part. We want to make it as easy as possible for you to use OramaCore.

- **Handling queries**. It receives the user's query, translates it into a query that the vector database can understand, and returns the results.
- **Searching for documents**. It searches for documents in the full-text search index and the vector database.
- **Answer Engine**. It generates answers to questions, performs chain of actions, and runs custom agents.
If you think we should improve on this front, please let us know at [[email protected]](mailto:[email protected]). We are always looking for feedback.

Every query, question, or action will be handled by the read side.
14 changes: 6 additions & 8 deletions docs/content/docs/meta.json
@@ -3,17 +3,15 @@
"description": "OramaCore Documentation",
"root": true,
"pages": [
"---Getting Started---",
"---Guide---",
"index",
"api-key",
"configuration",
"running-oramacore",
"...guide",
"apis",
"---Customizations---",
"text-embeddings",
"javascript-hooks",
"...customizations",
"---Architecture---",
"architecture",
"party-planner"
"architecture/overview",
"architecture/write-read",
"architecture/party-planner"
]
}