Add documentation for max_number_processors #8157

Merged 6 commits on Sep 11, 2024
Changes from 1 commit
6 changes: 6 additions & 0 deletions _ingest-pipelines/processors/index-processors.md
@@ -69,6 +69,12 @@
`urldecode` | Decodes a string from URL-encoded format.
`user_agent` | Extracts details from the user agent sent by a browser to its web requests.

## Validations on Processors

Check failure on line 72 in _ingest-pipelines/processors/index-processors.md (GitHub Actions / style-job)

[vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'Validations on Processors' is a heading and should be in sentence case. Location: line 72, column 4. Severity: ERROR.
anandkrrai marked this conversation as resolved.

You can limit the number of ingest processors by using the `cluster.ingest.max_number_processors` setting. The limit applies to the total processor count, which is the sum of the number of processors and the number of `on_failure` processors.
anandkrrai marked this conversation as resolved.

The default value for `cluster.ingest.max_number_processors` is `Integer.MAX_VALUE` (2,147,483,647), so the number of processors is effectively unlimited by default. If you try to add more processors than the value configured in `cluster.ingest.max_number_processors`, an `IllegalStateException` is thrown.
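
The counting rule can be illustrated with a minimal Python sketch. This is not OpenSearch code: `count_processors` and `validate_pipeline` are hypothetical helpers, the sample pipeline is made up, and the sketch assumes that only top-level processors and pipeline-level `on_failure` processors are counted. A `RuntimeError` stands in for the `IllegalStateException` that OpenSearch throws.

```python
# Illustrative only: a hypothetical model of the processor limit check.
JAVA_INTEGER_MAX_VALUE = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def count_processors(pipeline: dict) -> int:
    """Return the number of processors plus the number of on_failure processors.

    Assumption: only the top-level lists of the pipeline definition are counted.
    """
    return len(pipeline.get("processors", [])) + len(pipeline.get("on_failure", []))


def validate_pipeline(pipeline: dict,
                      max_number_processors: int = JAVA_INTEGER_MAX_VALUE) -> int:
    """Raise if the total processor count exceeds the configured limit."""
    total = count_processors(pipeline)
    if total > max_number_processors:
        # Stand-in for the IllegalStateException thrown by OpenSearch.
        raise RuntimeError(
            f"Pipeline has {total} processors, but the limit is {max_number_processors}"
        )
    return total


# Hypothetical pipeline: two processors and one on_failure processor.
pipeline = {
    "processors": [
        {"lowercase": {"field": "name"}},
        {"urldecode": {"field": "url"}},
    ],
    "on_failure": [
        {"set": {"field": "error", "value": "ingest failed"}},
    ],
}
```

Under these assumptions, the sample pipeline counts as three processors, so it passes with a limit of 3 and is rejected with a limit of 2.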

## Batch-enabled processors

Some processors support batch ingestion---they can process multiple documents at the same time as a batch. These batch-enabled processors usually provide better performance when using batch processing. For batch processing, use the [Bulk API]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/) and provide a `batch_size` parameter. All batch-enabled processors have a batch mode and a single-document mode. When you ingest documents using the `PUT` method, the processor functions in single-document mode and processes documents in series. Currently, only the `text_embedding` and `sparse_encoding` processors are batch enabled. All other processors process documents one at a time.
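
As a rough sketch of what a batch request might look like, the following Python snippet builds the path and newline-delimited body for a Bulk API call without sending it. The helper `build_bulk_request`, the index name, and the pipeline name are hypothetical, and passing `batch_size` and `pipeline` as query parameters is an assumption that should be checked against the Bulk API reference for your version.

```python
import json


def build_bulk_request(index: str, docs: list, batch_size: int, pipeline: str):
    """Build the request path and NDJSON body for a bulk indexing call.

    Assumption: batch_size and pipeline are supplied as query parameters.
    """
    lines = []
    for doc in docs:
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk body is newline-delimited JSON and must end with a newline.
    body = "\n".join(lines) + "\n"
    path = f"/_bulk?batch_size={batch_size}&pipeline={pipeline}"
    return path, body


# Hypothetical index, pipeline, and documents.
path, body = build_bulk_request(
    index="my-index",
    docs=[{"text": "first document"}, {"text": "second document"}],
    batch_size=2,
    pipeline="my-embedding-pipeline",
)
```

The resulting `path` and `body` could then be sent with any HTTP client as a `POST` request with the `application/x-ndjson` content type.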