Update API Reference section with improved structure and flow #231

Merged
merged 8 commits into from
Mar 22, 2024
Changes from 6 commits
15 changes: 8 additions & 7 deletions www/docs/api-reference/admin-apis/admin.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
id: admin
title: Corpus Administration APIs
sidebar_label: Corpus Administration APIs
title: Corpus Administration
sidebar_label: Corpus Administration
---

import Tabs from '@theme/Tabs';
@@ -13,24 +13,25 @@ The Vectara Console is a good way for you to get started with <Config v="names.p
you're ready to integrate the platform more deeply into your application, the
Corpus Admin APIs allow you to programmatically manipulate corpora and perform
many other operations within the system. These APIs enable new workflows for
organizations, like tracking usage of accounts and corpora. Check out this [blog post about managing multi-tenancy](https://vectara.com/managing-multi-tenancy-with-vectaras-new-management-apis/) for more details.
organizations, like managing corpora and tracking usage of accounts
and corpora. Check out this [blog post about managing multi-tenancy](https://vectara.com/managing-multi-tenancy-with-vectaras-new-management-apis/) for more details.
Review comment from a contributor:

All blog posts now are at https://vectara.com/blog/XXX
This still works due to a redirect, but perhaps it's best to start pointing to the right URL?


:::tip

The [**interactive API Playground**](/docs/rest-api/admin-service) lets you experiment with these API endpoints.

:::

## Create, Delete, and Reset API Definitions
## Create, Delete, and Reset Corpus API Definitions

The full definitions of the Create, Reset, and Delete gRPC APIs are covered
in [admin.proto](https://github.com/vectara/protos/blob/main/admin.proto).

* The **Create API** allows corpora to be created programmatically, up to the
* The **Create Corpus API** allows corpora to be created programmatically, up to the
limit defined for the account.
* The **Reset API** deletes all data from a corpus, without
* The **Reset Corpus API** deletes all data from a corpus, without
deleting its definition.
* The **Delete API** expunges both the data in the corpus and
* The **Delete Corpus API** expunges both the data in the corpus and
its definition.


4 changes: 2 additions & 2 deletions www/docs/api-reference/admin-apis/compute-account-size.md
@@ -7,7 +7,7 @@ sidebar_label: Compute Account Size API Definition
import {Config} from '@site/docs/definitions.md';
import {vars} from '@site/static/variables.json';

The Compute Account Size endpoint lets you view how much quota you consumed
The Compute Account Size API lets you view how much quota you consumed
across the entire account. This capability is useful for administrators who
want to track and monitor usage of multiple accounts. For example, you manage
multiple tenants and notice that your account usage is higher than expected.
@@ -43,7 +43,7 @@ characters and metadata characters.
### Compute Account Size REST Endpoint Address

<Config v="names.product"/> exposes a REST endpoint at the following URL
to update the status of a corpus:
to compute the account size:
<code>https://<Config v="domains.rest.admin"/>/v1/compute-account-size</code>

### Compute Account Size Example
@@ -7,7 +7,7 @@ sidebar_label: Compute Corpus Size API Definition
import {Config} from '@site/docs/definitions.md';
import {vars} from '@site/static/variables.json';

The Compute Corpus Size endpoint lets you view the amount of quota consumed
The Compute Corpus Size API lets you view the amount of quota consumed
by a corpus. This capability is useful for administrators to track and monitor
the amount of usage for specific corpora. For example, you manage multiple
tenants and determine that a user consumed too much quota and you might decide
@@ -37,7 +37,7 @@ values.
### Compute Corpus Size REST Endpoint Address

<Config v="names.product"/> exposes a REST endpoint at the following URL
to update the status of a corpus:
to compute the size of a corpus:
<code>https://<Config v="domains.rest.admin"/>/v1/compute-corpus-size</code>

### Compute Corpus Size Example
8 changes: 4 additions & 4 deletions www/docs/api-reference/admin-apis/corpus/list-documents.md
@@ -7,7 +7,7 @@ sidebar_label: List Documents API Definition
import {Config} from '@site/docs/definitions.md';
import {vars} from '@site/static/variables.json';

The List Documents endpoint lets you view the Document IDs and their metadata
The List Documents API lets you view the Document IDs and their metadata
in a corpus. This is useful for viewing documents indexed so far and helping
you decide to remove documents that are no longer needed. It helps you manage
the document lifecycle in your environment.
@@ -19,8 +19,8 @@ capabilities into their applications.

:::tip

Check out our [interactive API Playground](/docs/rest-api/list-documents) that lets you experiment with this
REST endpoint to manage your documents.
Check out our [**interactive API Playground**](/docs/rest-api/list-documents) that lets you experiment with this
REST endpoint to list your documents.

:::

@@ -43,7 +43,7 @@ configure up to 1000.
### List Documents REST Endpoint Address

<Config v="names.product"/> exposes a REST endpoint at the following URL
to update the status of a corpus:
to list documents:
<code>https://<Config v="domains.rest.admin"/>/v1/list-documents</code>

### List Documents Request Example
6 changes: 3 additions & 3 deletions www/docs/api-reference/admin-apis/corpus/read-corpus.md
@@ -7,7 +7,7 @@ sidebar_label: Read Corpus API Definition
import {Config} from '@site/docs/definitions.md';
import {vars} from '@site/static/variables.json';

The Read Corpus endpoint lets you view detailed information about corpora
The Read Corpus API lets you view detailed information about corpora
within your account. It enables you to view different aspects about the corpus
including basic information like the ID, name, whether it is enabled or
disabled, and other metadata. You can also view the corpus size, associated
@@ -29,7 +29,7 @@ because of information returned by this endpoint.
:::tip

Check out our [**interactive API Playground**](/docs/rest-api/read-corpus) that lets you experiment with this
REST endpoint to manage your corpus details.
REST endpoint to read your corpus details.

:::

@@ -52,7 +52,7 @@ API keys with a specific corpus.
### Read Corpus REST Endpoint Address

<Config v="names.product"/> exposes a REST endpoint at the following URL
to ingest content into a corpus:
to read information about the corpus:
<code>https://<Config v="domains.rest.admin"/>/v1/read-corpus</code>

### Read Corpus Request Example
@@ -8,7 +8,7 @@ import {Config} from '@site/docs/definitions.md';
import {vars} from '@site/static/variables.json';


The Update Corpus Enablement endpoint lets you enable or disable a corpus.
The Update Corpus Enablement API lets you enable or disable a corpus.
This is useful to manage the availability of data within the system, such as
when you need to take the corpus offline without having to delete the corpus.

@@ -36,7 +36,7 @@ The request to enable or disable a corpus requires the following parameters:
### Update Corpus Enablement REST Endpoint Address

<Config v="names.product"/> exposes a REST endpoint at the following URL
to update the status of a corpus:
to enable or disable a corpus:
<code>https://<Config v="domains.rest.admin"/>/v1/update-corpus-enablement</code>

### Update Corpus Enablement Request Example
176 changes: 74 additions & 102 deletions www/docs/api-reference/admin-apis/create-corpus.md
@@ -1,62 +1,99 @@
---
id: create-corpus
title: Create Corpus
title: Create Corpus API Definition
sidebar_label: Create Corpus API Definition
---

import {Config} from '@site/docs/definitions.md';
import {vars} from '@site/static/variables.json';

The Create Corpus endpoint lets you create a corpus that contains specific
properties and attributes.

## Create Corpus REST Endpoint
The Create Corpus API lets you create a corpus that contains specific
properties and attributes. A corpus is a container where you upload your data
to be ingested for querying.

<Config v="names.product"/> exposes a REST endpoint at the following URL
to ingest content into a corpus:
<code>https://<Config v="domains.rest.admin"/>/v1/create-corpus</code>
:::tip

### Create Corpus Request Headers
Check out our [**interactive API Playground**](/docs/rest-api/create-corpus) that lets
you experiment with this REST endpoint to create a corpus.

To interact with the Create Corpus service via REST calls, you need the following
headers:
:::

* `customer_id` is the customer ID to use for the request.
* JWT token as your `Bearer Token`
* (Optional) `grpc-timeout` lets you specify how long to wait for the calls
that have the potential to take longer to process. We recommend
`-H "grpc-timeout: 30S"`

### Create Corpus Request Body
## Create Corpus Request Body and Response

Only the `name` and `description` fields are mandatory when creating a corpus.

The response message returns a unique id, `corpus_id`, by which the corpus can
be subsequently referenced.

:::note

The name does not need to be unique within an account.

:::

In order to reference metadata in [filter expressions](/docs/learn/metadata-search-filtering/filter-overview), the
referenceable attributes must be declared at creation time in the **filter
attributes**. This list cannot be changed once the corpus is created.

For information on **custom dimensions** please see
[Custom Dimensions](/docs/learn/semantic-search/add-custom-dimensions).
Like filter attributes, custom dimensions cannot be changed after the corpus is created.


## Filter Attributes Definition

The `filterAttributes` object must specify a `name`, and a `level` which indicates
whether it exists in the document or part level metadata. At indexing time,
metadata with this name will be extracted and made available for filter
expressions to operate on.

If `indexed` is true, the system will build an index on the extracted values
to further improve the performance of filter expressions involving the
attribute.

Finally, filter attributes must specify a `type`, which is validated when
documents are indexed. The four supported types are `integer`, which stores
signed whole-number values up to eight bytes in length; `real`, for storing
floating point values in [IEEE 754 8-byte format][1]; `text` for storing
textual strings in [UTF-8 encoding][2], and `boolean` for storing true/false
values.

[1]: https://en.wikipedia.org/wiki/Double-precision_floating-point_format
[2]: https://en.wikipedia.org/wiki/UTF-8
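
As a rough illustration of the four types described above, the following Python sketch checks a metadata value against a declared type before a document is sent for indexing. It is a client-side pre-check written under assumptions, not the platform's own validation logic: the function name `matches_filter_type` is hypothetical, and accepting whole numbers for `real` is a choice made here rather than documented behavior.

```python
# Bounds of a signed eight-byte integer, matching the `integer` type description.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def matches_filter_type(value, attr_type):
    """Return True if `value` fits the declared filter attribute type.

    `attr_type` is one of "integer", "real", "text", or "boolean".
    This helper is hypothetical and only mirrors the prose description.
    """
    if attr_type == "boolean":
        return isinstance(value, bool)
    if attr_type == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (
            isinstance(value, int)
            and not isinstance(value, bool)
            and INT64_MIN <= value <= INT64_MAX
        )
    if attr_type == "real":
        # Assumption: whole numbers are acceptable where a float is expected.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if attr_type == "text":
        return isinstance(value, str)
    raise ValueError(f"unsupported filter attribute type: {attr_type!r}")
```

Running a check like this before indexing can surface mismatched metadata early, for example a string supplied for an attribute declared as `integer`.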


## REST Example

### Create Corpus REST Endpoint

<Config v="names.product"/> exposes a REST endpoint at the following URL
to create a corpus:
<code>https://<Config v="domains.rest.admin"/>/v1/create-corpus</code>

### Create Corpus Request Body

```json
{
"corpus": {
"id": 0,
"name": "string",
"description": "string",
"dtProvision": "string",
"id": 1,
"name": "NHL Rules",
"description": "Contains rulebooks for the NHL",
"dtProvision": "",
"enabled": true,
"swapQenc": true,
"swapIenc": true,
"textless": true,
"encrypted": true,
"encoderId": "string",
"encoderId": "1",
"metadataMaxBytes": 0,
"customDimensions": [
{
"name": "string",
"description": "string",
"servingDefault": 0,
"indexingDefault": 0
}
{}
],
"filterAttributes": [
{
"name": "string",
"description": "string",
"name": "name-of-field",
"description": "Description about the name",
"indexed": true,
"type": "FILTER_ATTRIBUTE_TYPE__UNDEFINED",
"level": "FILTER_ATTRIBUTE_LEVEL__UNDEFINED"
@@ -66,39 +103,25 @@ Only the `name` and `description` fields are mandatory when creating a corpus.
}
```
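
A minimal Python sketch of assembling this request with the standard library follows. It only builds the request object and does not send it. The host `admin.example.com` is a placeholder for the admin REST domain, the function name `build_create_corpus_request` is hypothetical, and the header names (`customer_id`, a bearer JWT, and the optional `grpc-timeout`) follow the request headers listed in the earlier revision of this page, so confirm them in the API Playground before relying on them.

```python
import json
import urllib.request


def build_create_corpus_request(host, customer_id, jwt_token, name,
                                description, filter_attributes=None):
    """Build (but do not send) a POST request for the create-corpus endpoint."""
    # Only `name` and `description` are mandatory when creating a corpus.
    corpus = {"name": name, "description": description}
    if filter_attributes:
        # Filter attributes must be declared at creation time.
        corpus["filterAttributes"] = filter_attributes
    body = json.dumps({"corpus": corpus}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/v1/create-corpus",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt_token}",
            "customer_id": str(customer_id),  # assumed header name
            "grpc-timeout": "30S",            # optional
        },
    )


request = build_create_corpus_request(
    host="admin.example.com",  # placeholder, not a real endpoint
    customer_id=123456789,     # placeholder customer ID
    jwt_token="example-jwt",   # placeholder token
    name="NHL Rules",
    description="Contains rulebooks for the NHL",
    filter_attributes=[{
        "name": "name-of-field",
        "indexed": True,
        "type": "FILTER_ATTRIBUTE_TYPE__TEXT",
        "level": "FILTER_ATTRIBUTE_LEVEL__DOCUMENT",
    }],
)
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch leaves out the fields that the service ignores during creation, such as `id`, `dtProvision`, and `enabled`.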

The response message returns a unique id, `corpus_id`, by which the corpus can
be subsequently referenced. Note that the name needn't be unique within an
account.
## gRPC Example

In order to reference metadata in [filter expressions](/docs/learn/metadata-search-filtering/filter-overview), the
referenceable attributes must be declared at creation time in the **filter
attributes**. This list cannot be changed once the corpus is created.

For information on **custom dimensions** please see
[Custom Dimensions](/docs/learn/semantic-search/add-custom-dimensions).
Like filter attributes, custom dimensions cannot be changed after the corpus is created.
You can find the full Create Corpus gRPC definition
at [admin.proto](https://github.com/vectara/protos/blob/main/admin.proto).

```protobuf
message CreateCorpusRequest {
Corpus corpus = 1;
}

message CreateCorpusResponse {
// The corpus id that was created.
uint32 corpus_id = 1;
Status status = 2;
}

message Corpus {
// The corpus id.
// The Corpus ID.
// This value is ignored during Corpus creation.
uint32 id = 1;
// The name of the corpus.
string name = 2;
// A description for the corpus.
string description = 3;
// The time at which the corpus was provisioned.
// This value is ignored during Corpus creation.
int64 dt_provision = 4;
// Whether the corpus is enabled for use or not.
// This value is ignored during Corpus creation.
bool enabled = 5;


@@ -127,54 +150,3 @@ message Corpus {
repeated FilterAttribute filter_attributes = 14;
}
```

#### Filter Attribute

A filter attribute must specify a **name**, and a **level** which indicates
whether it exists in the document or part level metadata. At indexing time,
metadata with this name will be extracted and made available for filter
expressions to operate on.

If **indexed** is true, the system will build an index on the extracted values
to further improve the performance of filter expressions involving the
attribute.

Finally, filter attributes must specify a **type**, which is validated when
documents are indexed. The four supported types are **integer**, which stores
signed whole-number values up to eight bytes in length; **real**, for storing
floating point values in [IEEE 754 8-byte format][1]; **text** for storing
textual strings in [UTF-8 encoding][2], and **boolean** for storing true/false
values.

[1]: https://en.wikipedia.org/wiki/Double-precision_floating-point_format
[2]: https://en.wikipedia.org/wiki/UTF-8


```protobuf
message FilterAttribute {
// Name of the field, as seen in metadata.
string name = 5;
// An optional description.
string description = 10;
// Whether the field is indexed for maximum query speed.
bool indexed = 15;
// The data type of the attribute.
FilterAttributeType type = 20;
// Whether the attribute lives at the document or part level.
FilterAttributeLevel level = 25;
}

enum FilterAttributeType {
FILTER_ATTRIBUTE_TYPE__UNDEFINED = 0;
FILTER_ATTRIBUTE_TYPE__INTEGER = 5;
FILTER_ATTRIBUTE_TYPE__REAL = 15;
FILTER_ATTRIBUTE_TYPE__TEXT = 25;
FILTER_ATTRIBUTE_TYPE__BOOLEAN = 35;
}

enum FilterAttributeLevel {
FILTER_ATTRIBUTE_LEVEL__UNDEFINED = 0;
FILTER_ATTRIBUTE_LEVEL__DOCUMENT = 5; // Document-level attribute
FILTER_ATTRIBUTE_LEVEL__DOCUMENT_PART = 10; // Part-level attribute
}
```