diff --git a/elasticsearch/_async/client/__init__.py b/elasticsearch/_async/client/__init__.py index 7920715f4..1c966b828 100644 --- a/elasticsearch/_async/client/__init__.py +++ b/elasticsearch/_async/client/__init__.py @@ -646,83 +646,89 @@ async def bulk( ] = None, ) -> ObjectApiResponse[t.Any]: """ - Bulk index or delete documents. Perform multiple `index`, `create`, `delete`, - and `update` actions in a single request. This reduces overhead and can greatly - increase indexing speed. If the Elasticsearch security features are enabled, - you must have the following index privileges for the target data stream, index, - or index alias: * To use the `create` action, you must have the `create_doc`, - `create`, `index`, or `write` index privilege. Data streams support only the - `create` action. * To use the `index` action, you must have the `create`, `index`, - or `write` index privilege. * To use the `delete` action, you must have the `delete` - or `write` index privilege. * To use the `update` action, you must have the `index` - or `write` index privilege. * To automatically create a data stream or index - with a bulk API request, you must have the `auto_configure`, `create_index`, - or `manage` index privilege. * To make the result of a bulk operation visible - to search using the `refresh` parameter, you must have the `maintenance` or `manage` - index privilege. Automatic data stream creation requires a matching index template - with data stream enabled. The actions are specified in the request body using - a newline delimited JSON (NDJSON) structure: ``` action_and_meta_data\\n optional_source\\n - action_and_meta_data\\n optional_source\\n .... action_and_meta_data\\n optional_source\\n - ``` The `index` and `create` actions expect a source on the next line and have - the same semantics as the `op_type` parameter in the standard index API. A `create` - action fails if a document with the same ID already exists in the target An `index` - action adds or replaces a document as necessary. NOTE: Data streams support only - the `create` action. To update or delete a document in a data stream, you must - target the backing index containing the document. An `update` action expects - that the partial doc, upsert, and script and its options are specified on the - next line. A `delete` action does not expect a source on the next line and has - the same semantics as the standard delete API. NOTE: The final line of data must - end with a newline character (`\\n`). Each newline character may be preceded - by a carriage return (`\\r`). When sending NDJSON data to the `_bulk` endpoint, - use a `Content-Type` header of `application/json` or `application/x-ndjson`. - Because this format uses literal newline characters (`\\n`) as delimiters, make - sure that the JSON actions and sources are not pretty printed. If you provide - a target in the request path, it is used for any actions that don't explicitly - specify an `_index` argument. A note on the format: the idea here is to make - processing as fast as possible. As some of the actions are redirected to other - shards on other nodes, only `action_meta_data` is parsed on the receiving node - side. Client libraries using this protocol should try and strive to do something - similar on the client side, and reduce buffering as much as possible. There is - no "correct" number of actions to perform in a single bulk request. Experiment - with different settings to find the optimal size for your particular workload. 
- Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by - default so clients must ensure that no request exceeds this size. It is not possible - to index a single document that exceeds the size limit, so you must pre-process - any such documents into smaller pieces before sending them to Elasticsearch. - For instance, split documents into pages or chapters before indexing them, or - store raw binary data in a system outside Elasticsearch and replace the raw data - with a link to the external system in the documents that you send to Elasticsearch. - **Client suppport for bulk requests** Some of the officially supported clients - provide helpers to assist with bulk requests and reindexing: * Go: Check out - `esutil.BulkIndexer` * Perl: Check out `Search::Elasticsearch::Client::5_0::Bulk` - and `Search::Elasticsearch::Client::5_0::Scroll` * Python: Check out `elasticsearch.helpers.*` - * JavaScript: Check out `client.helpers.*` * .NET: Check out `BulkAllObservable` - * PHP: Check out bulk indexing. **Submitting bulk requests with cURL** If you're - providing text file input to `curl`, you must use the `--data-binary` flag instead - of plain `-d`. The latter doesn't preserve newlines. For example: ``` $ cat requests - { "index" : { "_index" : "test", "_id" : "1" } } { "field1" : "value1" } $ curl - -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary - "@requests"; echo {"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]} - ``` **Optimistic concurrency control** Each `index` and `delete` action within - a bulk API call may include the `if_seq_no` and `if_primary_term` parameters - in their respective action and meta data lines. The `if_seq_no` and `if_primary_term` - parameters control how operations are run, based on the last modification to - existing documents. See Optimistic concurrency control for more details. **Versioning** - Each bulk item can include the version value using the `version` field. It automatically - follows the behavior of the index or delete operation based on the `_version` - mapping. It also support the `version_type`. **Routing** Each bulk item can include - the routing value using the `routing` field. It automatically follows the behavior - of the index or delete operation based on the `_routing` mapping. NOTE: Data - streams do not support custom routing unless they were created with the `allow_custom_routing` - setting enabled in the template. **Wait for active shards** When making bulk - calls, you can set the `wait_for_active_shards` parameter to require a minimum - number of shard copies to be active before starting to process the bulk request. - **Refresh** Control when the changes made by this request are visible to search. - NOTE: Only the shards that receive the bulk request will be affected by refresh. - Imagine a `_bulk?refresh=wait_for` request with three documents in it that happen - to be routed to different shards in an index with five shards. The request will - only wait for those three shards to refresh. The other two shards that make up - the index do not participate in the `_bulk` request at all. + .. raw:: html + +
Bulk index or delete documents.
Perform multiple `index`, `create`, `delete`, and `update` actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

* To use the `create` action, you must have the `create_doc`, `create`, `index`, or `write` index privilege. Data streams support only the `create` action.
* To use the `index` action, you must have the `create`, `index`, or `write` index privilege.
* To use the `delete` action, you must have the `delete` or `write` index privilege.
* To use the `update` action, you must have the `index` or `write` index privilege.
* To automatically create a data stream or index with a bulk API request, you must have the `auto_configure`, `create_index`, or `manage` index privilege.
* To make the result of a bulk operation visible to search using the `refresh` parameter, you must have the `maintenance` or `manage` index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.
The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

    action_and_meta_data\\n
    optional_source\\n
    action_and_meta_data\\n
    optional_source\\n
    ....
    action_and_meta_data\\n
    optional_source\\n
The `index` and `create` actions expect a source on the next line and have the same semantics as the `op_type` parameter in the standard index API. A `create` action fails if a document with the same ID already exists in the target. An `index` action adds or replaces a document as necessary.

NOTE: Data streams support only the `create` action. To update or delete a document in a data stream, you must target the backing index containing the document.

An `update` action expects that the partial doc, upsert, and script and its options are specified on the next line.

A `delete` action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (`\\n`). Each newline character may be preceded by a carriage return (`\\r`). When sending NDJSON data to the `_bulk` endpoint, use a `Content-Type` header of `application/json` or `application/x-ndjson`. Because this format uses literal newline characters (`\\n`) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an `_index` argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only `action_meta_data` is parsed on the receiving node side. Client libraries using this protocol should try to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of an HTTP request to 100mb by default, so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.
**Client support for bulk requests**

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

* Go: Check out `esutil.BulkIndexer`
* Perl: Check out `Search::Elasticsearch::Client::5_0::Bulk` and `Search::Elasticsearch::Client::5_0::Scroll`
* Python: Check out `elasticsearch.helpers.*`
* JavaScript: Check out `client.helpers.*`
* .NET: Check out `BulkAllObservable`
* PHP: Check out bulk indexing.

**Submitting bulk requests with cURL**

If you're providing text file input to `curl`, you must use the `--data-binary` flag instead of plain `-d`. The latter doesn't preserve newlines. For example:

    $ cat requests
    { "index" : { "_index" : "test", "_id" : "1" } }
    { "field1" : "value1" }
    $ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
    {"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}
**Optimistic concurrency control**

Each `index` and `delete` action within a bulk API call may include the `if_seq_no` and `if_primary_term` parameters in their respective action and meta data lines. The `if_seq_no` and `if_primary_term` parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

**Versioning**

Each bulk item can include the version value using the `version` field. It automatically follows the behavior of the index or delete operation based on the `_version` mapping. It also supports the `version_type`.

**Routing**

Each bulk item can include the routing value using the `routing` field. It automatically follows the behavior of the index or delete operation based on the `_routing` mapping.

NOTE: Data streams do not support custom routing unless they were created with the `allow_custom_routing` setting enabled in the template.

**Wait for active shards**

When making bulk calls, you can set the `wait_for_active_shards` parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

**Refresh**

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a `_bulk?refresh=wait_for` request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the `_bulk` request at all.
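As a rough, non-authoritative sketch of driving this endpoint from this async client (host, index name, and documents below are illustrative; the NDJSON body is passed via the `operations` keyword in the 8.x-style typed API):

```python
import asyncio

from elasticsearch import AsyncElasticsearch


async def main() -> None:
    # Placeholder connection details; adjust host and auth for your cluster.
    client = AsyncElasticsearch("http://localhost:9200")

    # Alternate action lines and source lines, mirroring the NDJSON format above.
    operations = [
        {"index": {"_index": "test", "_id": "1"}},
        {"field1": "value1"},
        {"delete": {"_index": "test", "_id": "2"}},
    ]
    resp = await client.bulk(operations=operations, refresh="wait_for")
    print(resp["errors"], resp["took"])

    await client.close()


asyncio.run(main())
```

The sketches that follow assume the same `client` instance inside an async context.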
Clear a scrolling search.
Clear the search context and results for a scrolling search.
Close a point in time.
A point in time must be opened explicitly before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should persist. A point in time is automatically closed when the `keep_alive` period has elapsed. However, keeping points in time has a cost; close them as soon as they are no longer required for search requests.
Count search results.
Get the number of documents matching a query.

The query can either be provided using a simple query string as a parameter or using the Query DSL defined within the request body. The latter must be nested in a `query` key, which is the same as the search API.

The count API supports multi-target syntax. You can run a single count API search across multiple data streams and indices.

The operation is broadcast across all shards. For each shard ID group, a replica is chosen and the search is run against it. This means that replicas increase the scalability of the count.
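A minimal sketch, assuming the `client` from the earlier bulk example and a placeholder index name:

```python
# Count documents matching a Query DSL query nested under the `query` key.
resp = await client.count(
    index="my-index-000001",
    query={"match": {"user.id": "elkbee"}},
)
print(resp["count"])
```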
Create a new document in the index.

You can index a new JSON document with the `/<target>/_doc/` or `/<target>/_create/<_id>` APIs. Using `_create` guarantees that the document is indexed only if it does not already exist. It returns a 409 response when a document with the same ID already exists in the index. To update an existing document, you must use the `/<target>/_doc/` API.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

* To add a document using the `PUT /<target>/_create/<_id>` or `POST /<target>/_create/<_id>` request formats, you must have the `create_doc`, `create`, `index`, or `write` index privilege.
* To automatically create a data stream or index with this API request, you must have the `auto_configure`, `create_index`, or `manage` index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.
**Automatically create data streams and indices**

If the request's target doesn't exist and matches an index template with a `data_stream` definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the `action.auto_create_index` setting. If it is `true`, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to `false` to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow, or prefix each pattern with `+` or `-` to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The `action.auto_create_index` setting affects the automatic creation of indices only. It does not affect the creation of data streams.
**Routing**

By default, shard placement, or routing, is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the `routing` parameter.

When setting up explicit mapping, you can also use the `_routing` field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the `_routing` mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the `allow_custom_routing` setting enabled in the template.
**Distributed**

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.
**Active shards**

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say `wait_for_active_shards` is `1`). This default can be overridden in the index settings dynamically by setting `index.write.wait_for_active_shards`. To alter this behavior per operation, use the `wait_for_active_shards` request parameter.

Valid values are `all` or any positive integer up to the total number of configured copies per shard in the index (which is `number_of_replicas`+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C, and you create an index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If `wait_for_active_shards` is set on the request to `3` (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set `wait_for_active_shards` to `all` (or to `4`, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will time out unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The `_shards` section of the API response reveals the number of shard copies on which replication succeeded and failed.
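A rough illustration of the `_create` behavior from this async client (index name and document are placeholders):

```python
# Index the document only if ID "1" does not already exist; a second call
# with the same ID is expected to fail with a 409 conflict.
resp = await client.create(
    index="my-index-000001",
    id="1",
    document={"user": {"id": "elkbee"}},
)
print(resp["result"])  # "created"
```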
Delete a document.

Remove a JSON document from the specified index.

NOTE: You cannot send deletion requests directly to a data stream. To delete a document in a data stream, you must target the backing index containing the document.

**Optimistic concurrency control**

Delete operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the `if_seq_no` and `if_primary_term` parameters. If a mismatch is detected, the operation will result in a `VersionConflictException` and a status code of `409`.

**Versioning**

Each document indexed is versioned. When deleting a document, the version can be specified to make sure the relevant document you are trying to delete is actually being deleted and it has not changed in the meantime. Every write operation run on a document, deletes included, causes its version to be incremented. The version number of a deleted document remains available for a short time after deletion to allow for control of concurrent operations. The length of time for which a deleted document's version remains available is determined by the `index.gc_deletes` index setting.

**Routing**

If routing is used during indexing, the routing value also needs to be specified to delete a document.

If the `_routing` mapping is set to `required` and no routing value is specified, the delete API throws a `RoutingMissingException` and rejects the request.

For example:

    DELETE /my-index-000001/_doc/1?routing=shard-1

This request deletes the document with ID 1, but it is routed based on the user. The document is not deleted if the correct routing is not specified.

**Distributed**

The delete operation gets hashed into a specific shard ID. It then gets redirected into the primary shard within that ID group and replicated (if needed) to shard replicas within that ID group.
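A small sketch with the async client, assuming the same placeholder index and a custom routing value:

```python
# Delete by ID; the routing value must match the one used at index time.
resp = await client.delete(
    index="my-index-000001",
    id="1",
    routing="shard-1",
)
print(resp["result"])  # "deleted"
```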
Delete documents.
Deletes documents that match the specified query.
Throttle a delete by query operation.

Change the number of requests per second for a particular delete by query operation. Rethrottling that speeds up the query takes effect immediately, but rethrottling that slows down the query takes effect after completing the current batch to prevent scroll timeouts.
Delete a script or search template.
Deletes a stored script or search template.
Check a document.

Verify that a document exists. For example, check to see if a document with the `_id` 0 exists:

    HEAD my-index-000001/_doc/0

If the document exists, the API returns a status code of `200 - OK`. If the document doesn't exist, the API returns `404 - Not Found`.

**Versioning support**

You can use the `version` parameter to check the document only if its current version is equal to the specified one.

Internally, Elasticsearch has marked the old document as deleted and added an entirely new document. The old version of the document doesn't disappear immediately, although you won't be able to access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.
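A minimal sketch using the same placeholder index; in recent client versions the response behaves like a boolean:

```python
# HEAD-style existence check for the document with ID "0".
exists = await client.exists(index="my-index-000001", id="0")
print(bool(exists))
```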
Check for a document source.

Check whether a document source exists in an index. For example:

    HEAD my-index-000001/_source/1

A document's source is not available if it is disabled in the mapping.
Explain a document match result.
Returns information about why a specific document matches, or doesn't match, a query.
Get the field capabilities.

Get information about the capabilities of fields among multiple indices.

For data streams, the API returns field capabilities among the stream's backing indices. It returns runtime fields like any other field. For example, a runtime field with a type of keyword is returned the same as any other field that belongs to the `keyword` family.
Get a document by its ID.

Get a document and its source or stored fields from an index.

By default, this API is realtime and is not affected by the refresh rate of the index (when data will become visible for search). In the case where stored fields are requested with the `stored_fields` parameter and the document has been updated but is not yet refreshed, the API will have to parse and analyze the source to extract the stored fields. To turn off realtime behavior, set the `realtime` parameter to false.

**Source filtering**

By default, the API returns the contents of the `_source` field unless you have used the `stored_fields` parameter or the `_source` field is turned off. You can turn off `_source` retrieval by using the `_source` parameter:

    GET my-index-000001/_doc/0?_source=false

If you only need one or two fields from the `_source`, use the `_source_includes` or `_source_excludes` parameters to include or filter out particular fields. This can be helpful with large documents where partial retrieval can save on network overhead. Both parameters take a comma separated list of fields or wildcard expressions. For example:

    GET my-index-000001/_doc/0?_source_includes=*.id&_source_excludes=entities

If you only want to specify includes, you can use a shorter notation:

    GET my-index-000001/_doc/0?_source=*.id

**Routing**

If routing is used during indexing, the routing value also needs to be specified to retrieve a document. For example:

    GET my-index-000001/_doc/2?routing=user1

This request gets the document with ID 2, but it is routed based on the user. The document is not fetched if the correct routing is not specified.

**Distributed**

The GET operation is hashed into a specific shard ID. It is then redirected to one of the replicas within that shard ID and returns the result. The replicas are the primary shard and its replicas within that shard ID group. This means that the more replicas you have, the better your GET scaling will be.

**Versioning support**

You can use the `version` parameter to retrieve the document only if its current version is equal to the specified one.

Internally, Elasticsearch has marked the old document as deleted and added an entirely new document. The old version of the document doesn't disappear immediately, although you won't be able to access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.
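A sketch of source filtering and routing from the async client, assuming the 8.x-style keyword arguments where the URL parameters `_source_includes` and `_source_excludes` map to `source_includes` and `source_excludes`:

```python
# Fetch only selected parts of _source for a routed document.
resp = await client.get(
    index="my-index-000001",
    id="2",
    routing="user1",
    source_includes=["*.id"],
    source_excludes=["entities"],
)
print(resp["_source"])
```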
Get a script or search template.
Retrieves a stored script or search template.
Get script contexts.

Get a list of supported script contexts and their methods.
Get script languages.

Get a list of available script types, languages, and contexts.
Get a document's source.

Get the source of a document. For example:

    GET my-index-000001/_source/1

You can use the source filtering parameters to control which parts of the `_source` are returned:

    GET my-index-000001/_source/1/?_source_includes=*.id&_source_excludes=entities
Get the cluster health.
Get a report with the health status of an Elasticsearch cluster. The report contains a list of indicators that compose Elasticsearch functionality.

Each indicator has a health status of: green, unknown, yellow or red. The indicator will provide an explanation and metadata describing the reason for its current health status.

The cluster's status is controlled by the worst indicator status.

In the event that an indicator's status is non-green, a list of impacts may be present in the indicator result which detail the functionalities that are negatively affected by the health issue. Each impact carries with it a severity level, an area of the system that is affected, and a simple description of the impact on the system.

Some health indicators can determine the root cause of a health problem and prescribe a set of steps that can be performed in order to improve the health of the system. The root cause and remediation steps are encapsulated in a diagnosis. A diagnosis contains a cause detailing a root cause analysis, an action containing a brief description of the steps to take to fix the problem, the list of affected resources (if applicable), and a detailed step-by-step troubleshooting guide to fix the diagnosed problem.

NOTE: The health indicators perform root cause analysis of non-green health statuses. This can be computationally expensive when called frequently. When setting up automated polling of the API for health status, set `verbose` to false to disable the more expensive analysis logic.
Create or update a document in an index.

Add a JSON document to the specified data stream or index and make it searchable. If the target is an index and the document already exists, the request updates the document and increments its version.

NOTE: You cannot use this API to send update requests for existing documents in a data stream.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

* To add or overwrite a document using the `PUT /<target>/_doc/<_id>` request format, you must have the `create`, `index`, or `write` index privilege.
* To add a document using the `POST /<target>/_doc/` request format, you must have the `create_doc`, `create`, `index`, or `write` index privilege.
* To automatically create a data stream or index with this API request, you must have the `auto_configure`, `create_index`, or `manage` index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

NOTE: Replica shards might not all be started when an indexing operation returns successfully. By default, only the primary is required. Set `wait_for_active_shards` to change this default behavior.
**Automatically create data streams and indices**

If the request's target doesn't exist and matches an index template with a `data_stream` definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the `action.auto_create_index` setting. If it is `true`, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to `false` to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow, or prefix each pattern with `+` or `-` to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The `action.auto_create_index` setting affects the automatic creation of indices only. It does not affect the creation of data streams.
**Optimistic concurrency control**

Index operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the `if_seq_no` and `if_primary_term` parameters. If a mismatch is detected, the operation will result in a `VersionConflictException` and a status code of `409`.
**Routing**

By default, shard placement, or routing, is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the `routing` parameter.

When setting up explicit mapping, you can also use the `_routing` field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the `_routing` mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the `allow_custom_routing` setting enabled in the template.
**Distributed**

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.
**Active shards**

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say `wait_for_active_shards` is `1`). This default can be overridden in the index settings dynamically by setting `index.write.wait_for_active_shards`. To alter this behavior per operation, use the `wait_for_active_shards` request parameter.

Valid values are `all` or any positive integer up to the total number of configured copies per shard in the index (which is `number_of_replicas`+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C, and you create an index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If `wait_for_active_shards` is set on the request to `3` (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set `wait_for_active_shards` to `all` (or to `4`, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will time out unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The `_shards` section of the API response reveals the number of shard copies on which replication succeeded and failed.
**No operation (noop) updates**

When updating a document by using this API, a new version of the document is always created even if the document hasn't changed. If this isn't acceptable, use the `_update` API with `detect_noop` set to `true`. The `detect_noop` option isn't available on this API because it doesn't fetch the old source and isn't able to compare it against the new source.

There isn't a definitive rule for when noop updates aren't acceptable. It's a combination of lots of factors like how frequently your data source sends updates that are actually noops and how many queries per second Elasticsearch runs on the shard receiving the updates.
**Versioning**

Each indexed document is given a version number. By default, internal versioning is used that starts at 1 and increments with each update, deletes included. Optionally, the version number can be set to an external value (for example, if maintained in a database). To enable this functionality, `version_type` should be set to `external`. The value provided must be a numeric, long value greater than or equal to 0, and less than around `9.2e+18`.

NOTE: Versioning is completely real time, and is not affected by the near real time aspects of search operations. If no version is provided, the operation runs without any version checks.

When using the external version type, the system checks to see if the version number passed to the index request is greater than the version of the currently stored document. If true, the document will be indexed and the new version number used. If the value provided is less than or equal to the stored document's version number, a version conflict will occur and the index operation will fail. For example:

    PUT my-index-000001/_doc/1?version=2&version_type=external
    {
      "user": {
        "id": "elkbee"
      }
    }

In this example, the operation will succeed since the supplied version of 2 is higher than the current document version of 1. If the document was already updated and its version was set to 2 or higher, the indexing command will fail and result in a conflict (409 HTTP status code).

A nice side effect is that there is no need to maintain strict ordering of async indexing operations run as a result of changes to a source database, as long as version numbers from the source database are used. Even the simple case of updating the Elasticsearch index using data from a database is simplified if external versioning is used, as only the latest version will be used if the index operations arrive out of order.
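The same external-versioning example expressed through the async client, as a rough sketch with placeholder values:

```python
# External versioning: the write wins only if the supplied version is greater
# than the version currently stored for this ID.
resp = await client.index(
    index="my-index-000001",
    id="1",
    document={"user": {"id": "elkbee"}},
    version=2,
    version_type="external",
)
print(resp["result"], resp["_version"])
```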
Get cluster info.
Get basic build, version, and cluster information.
Run a knn search.

NOTE: The kNN search API has been replaced by the `knn` option in the search API.

Perform a k-nearest neighbor (kNN) search on a dense_vector field and return the matching documents. Given a query vector, the API finds the k closest vectors and returns those documents as search hits.

Elasticsearch uses the HNSW algorithm to support efficient kNN search. Like most kNN algorithms, HNSW is an approximate method that sacrifices result accuracy for improved search speed. This means the results returned are not always the true k closest neighbors.

The kNN search API supports restricting the search using a filter. The search will return the top k documents that also match the filter query.
Get multiple documents.

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.
Run multiple searches.

The format of the request is similar to the bulk API format and makes use of the newline delimited JSON (NDJSON) format. The structure is as follows:

    header\\n
    body\\n
    header\\n
    body\\n

This structure is specifically optimized to reduce parsing if a specific search ends up redirected to another node.

IMPORTANT: The final line of data must end with a newline character `\\n`. Each newline character may be preceded by a carriage return `\\r`. When sending requests to this endpoint the `Content-Type` header should be set to `application/x-ndjson`.
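A minimal sketch from the async client, assuming the 8.x-style API where the alternating header/body pairs are passed as `searches` (index names and queries are placeholders):

```python
# Two searches in one request, mirroring the header/body NDJSON structure above.
resp = await client.msearch(
    searches=[
        {"index": "my-index-000001"},
        {"query": {"match": {"message": "hello"}}},
        {"index": "my-index-000002"},
        {"query": {"match_all": {}}},
    ]
)
for item in resp["responses"]:
    print(item.get("hits", {}).get("total"))
```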
Run multiple templated searches.
Get multiple term vectors.

You can specify existing documents by index and ID or provide artificial documents in the body of the request. You can specify the index in the request body or request URI. The response contains a `docs` array with all the fetched termvectors. Each element has the structure provided by the termvectors API.
Open a point in time.

A search request by default runs against the most recent visible data of the target indices, which is called point in time. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. In some cases, it's preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between `search_after` requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time.

A point in time must be opened explicitly before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should persist.
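A rough sketch of the open/search/close lifecycle with the async client; the index name, `keep_alive` value, and sort key are illustrative:

```python
# Open a point in time, search against it, then close it explicitly.
pit = await client.open_point_in_time(index="my-index-000001", keep_alive="1m")
try:
    resp = await client.search(
        pit={"id": pit["id"], "keep_alive": "1m"},
        query={"match_all": {}},
        sort=["_shard_doc"],
    )
    print(resp["hits"]["total"])
finally:
    await client.close_point_in_time(id=pit["id"])
```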
Create or update a script or search template.
Creates or updates a stored script or search template.
Evaluate ranked search results.

Evaluate the quality of ranked search results over a set of typical search queries.
Reindex documents.

Copy documents from a source to a destination. You can copy all documents to the destination index or reindex a subset of the documents. The source can be any existing index, alias, or data stream. The destination must differ from the source. For example, you cannot reindex a data stream into itself.

IMPORTANT: Reindex requires `_source` to be enabled for all documents in the source. The destination should be configured as wanted before calling the reindex API. Reindex does not copy the settings from the source or its associated template. Mappings, shard counts, and replicas, for example, must be configured ahead of time.

If the Elasticsearch security features are enabled, you must have the following security privileges:

* The `read` index privilege for the source data stream, index, or alias.
* The `write` index privilege for the destination data stream, index, or index alias.
* To automatically create a data stream or index with a reindex API request, you must have the `auto_configure`, `create_index`, or `manage` index privilege for the destination data stream, index, or alias.
* If reindexing from a remote cluster, the `source.remote.user` must have the `monitor` cluster privilege and the `read` index privilege for the source data stream, index, or alias.

If reindexing from a remote cluster, you must explicitly allow the remote host in the `reindex.remote.whitelist` setting. Automatic data stream creation requires a matching index template with data stream enabled.
The `dest` element can be configured like the index API to control optimistic concurrency control. Omitting `version_type` or setting it to `internal` causes Elasticsearch to blindly dump documents into the destination, overwriting any that happen to have the same ID.

Setting `version_type` to `external` causes Elasticsearch to preserve the `version` from the source, create any documents that are missing, and update any documents that have an older version in the destination than they do in the source.

Setting `op_type` to `create` causes the reindex API to create only missing documents in the destination. All existing documents will cause a version conflict.

IMPORTANT: Because data streams are append-only, any reindex request to a destination data stream must have an `op_type` of `create`. A reindex can only add new documents to a destination data stream. It cannot update existing documents in a destination data stream.

By default, version conflicts abort the reindex process. To continue reindexing if there are conflicts, set the `conflicts` request body property to `proceed`. In this case, the response includes a count of the version conflicts that were encountered. Note that the handling of other error types is unaffected by the `conflicts` property. Additionally, if you opt to count version conflicts, the operation could attempt to reindex more documents from the source than `max_docs` until it has successfully indexed `max_docs` documents into the target or it has gone through every document in the source query.
NOTE: The reindex API makes no effort to handle ID collisions. The last document written will "win" but the order isn't usually predictable so it is not a good idea to rely on this behavior. Instead, make sure that IDs are unique by using a script.

**Running reindex asynchronously**

If the request contains `wait_for_completion=false`, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at `_tasks/<task_id>`.
**Reindex from multiple sources**

If you have many sources to reindex it is generally better to reindex them one at a time rather than using a glob pattern to pick up multiple sources. That way you can resume the process if there are any errors by removing the partially completed source and starting over. It also makes parallelizing the process fairly simple: split the list of sources to reindex and run each list in parallel.

For example, you can use a bash script like this:

    for index in i1 i2 i3 i4 i5; do
      curl -HContent-Type:application/json -XPOST localhost:9200/_reindex?pretty -d'{
        "source": {
          "index": "'$index'"
        },
        "dest": {
          "index": "'$index'-reindexed"
        }
      }'
    done
**Throttling**

Set `requests_per_second` to any positive decimal number (`1.4`, `6`, `1000`, for example) to throttle the rate at which reindex issues batches of index operations. Requests are throttled by padding each batch with a wait time. To turn off throttling, set `requests_per_second` to `-1`.

The throttling is done by waiting between batches so that the scroll that reindex uses internally can be given a timeout that takes into account the padding. The padding time is the difference between the batch size divided by the `requests_per_second` and the time spent writing. By default the batch size is `1000`, so if `requests_per_second` is set to `500`:

    target_time = 1000 / 500 per second = 2 seconds
    wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds

Since the batch is issued as a single bulk request, large batch sizes cause Elasticsearch to create many requests and then wait for a while before starting the next set. This is "bursty" instead of "smooth".
**Slicing**

Reindex supports sliced scroll to parallelize the reindexing process. This parallelization can improve efficiency and provide a convenient way to break the request down into smaller parts.

NOTE: Reindexing from remote clusters does not support manual or automatic slicing.

You can slice a reindex request manually by providing a slice ID and total number of slices to each request. You can also let reindex automatically parallelize by using sliced scroll to slice on `_id`. The `slices` parameter specifies the number of slices to use.

Adding `slices` to the reindex request just automates the manual process, creating sub-requests, which means it has some quirks:

* Fetching the status of the task for the request with `slices` only contains the status of completed slices.
* Rethrottling the request with `slices` will rethrottle the unfinished sub-request proportionally.
* Canceling the request with `slices` will cancel each sub-request.
* Due to the nature of `slices`, each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
* Parameters like `requests_per_second` and `max_docs` on a request with `slices` are distributed proportionally to each sub-request. Combine that with the previous point about distribution being uneven and you should conclude that using `max_docs` with `slices` might not result in exactly `max_docs` documents being reindexed.

If slicing automatically, setting `slices` to `auto` will choose a reasonable number for most indices. If slicing manually or otherwise tuning automatic slicing, use the following guidelines.

Query performance is most efficient when the number of slices is equal to the number of shards in the index. If that number is large (for example, `500`), choose a lower number as too many slices will hurt performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.

Indexing performance scales linearly across available resources with the number of slices.

Whether query or indexing performance dominates the runtime depends on the documents being reindexed and cluster resources.
**Modify documents during reindexing**

Like `_update_by_query`, reindex operations support a script that modifies the document. Unlike `_update_by_query`, the script is allowed to modify the document's metadata.

Just as in `_update_by_query`, you can set `ctx.op` to change the operation that is run on the destination. For example, set `ctx.op` to `noop` if your script decides that the document doesn't have to be indexed in the destination. This "no operation" will be reported in the `noop` counter in the response body. Set `ctx.op` to `delete` if your script decides that the document must be deleted from the destination. The deletion will be reported in the `deleted` counter in the response body. Setting `ctx.op` to anything else will return an error, as will setting any other field in `ctx`.

Think of the possibilities! Just be careful; you are able to change:

* `_id`
* `_index`
* `_version`
* `_routing`

Setting `_version` to `null` or clearing it from the `ctx` map is just like not sending the version in an indexing request. It will cause the document to be overwritten in the destination regardless of the version on the target or the version type you use in the reindex API.
**Reindex from remote**

Reindex supports reindexing from a remote Elasticsearch cluster. The `host` parameter must contain a scheme, host, port, and optional path. The `username` and `password` parameters are optional and when they are present the reindex operation will connect to the remote Elasticsearch node using basic authentication. Be sure to use HTTPS when using basic authentication or the password will be sent in plain text. There are a range of settings available to configure the behavior of the HTTPS connection.

When using Elastic Cloud, it is also possible to authenticate against the remote cluster through the use of a valid API key. Remote hosts must be explicitly allowed with the `reindex.remote.whitelist` setting. It can be set to a comma delimited list of allowed remote host and port combinations. Scheme is ignored; only the host and port are used. For example:

    reindex.remote.whitelist: [otherhost:9200, another:9200, 127.0.10.*:9200, localhost:*]

The list of allowed hosts must be configured on any nodes that will coordinate the reindex. This feature should work with remote clusters of any version of Elasticsearch. This should enable you to upgrade from any version of Elasticsearch to the current version by reindexing from a cluster of the old version.

WARNING: Elasticsearch does not support forward compatibility across major versions. For example, you cannot reindex from a 7.x cluster into a 6.x cluster.

To enable queries sent to older versions of Elasticsearch, the `query` parameter is sent directly to the remote host without validation or modification.

NOTE: Reindexing from remote clusters does not support manual or automatic slicing.

Reindexing from a remote server uses an on-heap buffer that defaults to a maximum size of 100mb. If the remote index includes very large documents you'll need to use a smaller batch size. It is also possible to set the socket read timeout on the remote connection with the `socket_timeout` field and the connection timeout with the `connect_timeout` field. Both default to 30 seconds.

**Configuring SSL parameters**

Reindex from remote supports configurable SSL settings. These must be specified in the `elasticsearch.yml` file, with the exception of the secure settings, which you add in the Elasticsearch keystore. It is not possible to configure SSL in the body of the reindex request.
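A rough sketch of a local reindex through the async client, assuming placeholder index names and the `conflicts=proceed` behavior described above:

```python
# Reindex into a new index, creating only documents that are missing in the
# destination and counting (rather than aborting on) version conflicts.
resp = await client.reindex(
    source={"index": "my-index-000001"},
    dest={"index": "my-new-index-000001", "op_type": "create"},
    conflicts="proceed",
    wait_for_completion=True,
)
print(resp.get("created"), resp.get("version_conflicts"))
```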
Throttle a reindex operation.

Change the number of requests per second for a particular reindex operation. For example:

    POST _reindex/r1A2WoRbTwKZ516z6NEs5A:36619/_rethrottle?requests_per_second=-1

Rethrottling that speeds up the query takes effect immediately. Rethrottling that slows down the query will take effect after completing the current batch. This behavior prevents scroll timeouts.
Render a search template.

Render a search template as a search request body.
Run a script.
Runs a script and returns a result.
Run a scrolling search.

IMPORTANT: The scroll API is no longer recommended for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the `search_after` parameter with a point in time (PIT).

The scroll API gets large sets of results from a single scrolling search request. To get the necessary scroll ID, submit a search API request that includes an argument for the `scroll` query parameter. The `scroll` parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the `_scroll_id` response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request. If the Elasticsearch security features are enabled, the access to the results of a specific scroll ID is restricted to the user or API key that submitted the search.

You can also use the scroll API to specify a new scroll parameter that extends or shortens the retention period for the search context.

IMPORTANT: Results from a scrolling search reflect the state of the index at the time of the initial search request. Subsequent indexing or document changes only affect later search and scroll requests.
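A short sketch of the scroll loop with the async client, using placeholder values:

```python
# Start a scrolling search, then page through the remaining results.
resp = await client.search(
    index="my-index-000001", scroll="1m", size=100, query={"match_all": {}}
)
scroll_id = resp["_scroll_id"]
while resp["hits"]["hits"]:
    resp = await client.scroll(scroll_id=scroll_id, scroll="1m")
    scroll_id = resp["_scroll_id"]
# Free the search context as soon as the scroll is finished.
await client.clear_scroll(scroll_id=scroll_id)
```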
Run a search.

Get search hits that match the query defined in the request. You can provide search queries using the `q` query string parameter or the request body. If both are specified, only the query parameter is used.
Search a vector tile.

Search a vector tile for geospatial values.
Get the search shards.

Get the indices and shards that a search request would be run against. This information can be useful for working out issues or planning optimizations with routing and shard preferences. When filtered aliases are used, the filter is returned as part of the indices section.
Run a search with a search template.
Get terms in an index.

Discover terms that match a partial string in an index. This "terms enum" API is designed for low-latency look-ups used in auto-complete scenarios.

If the `complete` property in the response is false, the returned terms set may be incomplete and should be treated as approximate. This can occur due to a few reasons, such as a request timeout or a node error.

NOTE: The terms enum API may return terms from deleted documents. Deleted documents are initially only marked as deleted. It is not until their segments are merged that documents are actually deleted. Until that happens, the terms enum API will return terms from these documents.
Get term vector information.

Get information and statistics about terms in the fields of a particular document.
Update a document.

Update a document by running a script or passing a partial document.

If the Elasticsearch security features are enabled, you must have the `index` or `write` index privilege for the target index or index alias.

The script can update, delete, or skip modifying the document. The API also supports passing a partial document, which is merged into the existing document. To fully replace an existing document, use the index API. This operation:

* Gets the document (collocated with the shard) from the index.
* Runs the specified script.
* Indexes the result.

The document must still be reindexed, but using this API removes some network roundtrips and reduces chances of version conflicts between the GET and the index operation.

The `_source` field must be enabled to use this API. In addition to `_source`, you can access the following variables through the `ctx` map: `_index`, `_type`, `_id`, `_version`, `_routing`, and `_now` (the current timestamp).
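A small sketch of a partial-document update with the async client; index, ID, and fields are placeholders:

```python
# Merge a partial document into the existing one; detect_noop avoids creating
# a new version when the merged result is unchanged.
resp = await client.update(
    index="my-index-000001",
    id="1",
    doc={"user": {"id": "elkbee"}},
    detect_noop=True,
)
print(resp["result"])  # "updated" or "noop"
```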
Update documents.
Updates documents that match the specified query. If no query is specified, performs an update on every document in the data stream or index without modifying the source, which is useful for picking up mapping changes.
Throttle an update by query operation.

Change the number of requests per second for a particular update by query operation. Rethrottling that speeds up the query takes effect immediately, but rethrottling that slows down the query takes effect after completing the current batch to prevent scroll timeouts.
Delete an async search.

If the asynchronous search is still running, it is cancelled. Otherwise, the saved search results are deleted. If the Elasticsearch security features are enabled, the deletion of a specific async search is restricted to: the authenticated user that submitted the original search request; users that have the `cancel_task` cluster privilege.
Get async search results.

Retrieve the results of a previously submitted asynchronous search request. If the Elasticsearch security features are enabled, access to the results of a specific async search is restricted to the user or API key that submitted it.
Get the async search status.

Get the status of a previously submitted async search request given its identifier, without retrieving search results. If the Elasticsearch security features are enabled, use of this API is restricted to the `monitoring_user` role.
Run an async search.

When the primary sort of the results is an indexed field, shards get sorted based on minimum and maximum value that they hold for that field. Partial results become available following the sort criteria that was requested.

Warning: Asynchronous search does not support scroll or search requests that include only the suggest section.

By default, Elasticsearch does not allow you to store an async search response larger than 10Mb and an attempt to do this results in an error. The maximum allowed size for a stored async search response can be set by changing the `search.max_async_search_response_size` cluster level setting.
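A rough, non-authoritative sketch of submitting an async search and polling its status through the namespaced async client (index, query, and timeout are illustrative):

```python
# Submit an async search; if it does not finish within the timeout, poll its status.
submit = await client.async_search.submit(
    index="my-index-000001",
    query={"match_all": {}},
    wait_for_completion_timeout="1s",
)
if submit["is_running"]:
    status = await client.async_search.status(id=submit["id"])
    print(status["is_running"])
```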
Delete an autoscaling policy.

NOTE: This feature is designed for indirect use by Elasticsearch Service, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported.
+ Get the autoscaling capacity.
+ NOTE: This feature is designed for indirect use by Elasticsearch Service, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported.
+ This API gets the current autoscaling capacity based on the configured autoscaling policy.
+ It will return information to size the cluster appropriately to the current workload.
+ The `required_capacity` is calculated as the maximum of the `required_capacity` result of all individual deciders that are enabled for the policy.
+ The operator should verify that the `current_nodes` match the operator's knowledge of the cluster to avoid making autoscaling decisions based on stale or incomplete information.
+ The response contains decider-specific information you can use to diagnose how and why autoscaling determined a certain capacity was required.
+ This information is provided for diagnosis only.
+ Do not use this information to make autoscaling decisions.
+ `Get an autoscaling policy.
+NOTE: This feature is designed for indirect use by Elasticsearch Service, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported.
+ `Create or update an autoscaling policy.
+NOTE: This feature is designed for indirect use by Elasticsearch Service, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported.
+ `Get aliases.
+Get the cluster's index aliases, including filter and routing information. + This API does not return data stream aliases.
+IMPORTANT: CAT APIs are only intended for human consumption using the command line or the Kibana console. They are not intended for use by applications. For application consumption, use the aliases API.
+ `Get shard allocation information.
+Get a snapshot of the number of shards allocated to each data node and their disk space.
+IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications.
+ `Get component templates.
+Get information about component templates in a cluster. + Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
+IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. + They are not intended for use by applications. For application consumption, use the get component template API.
+ `Get a document count.
+Get quick access to a document count for a data stream, an index, or an entire cluster. + The document count only includes live documents, not deleted documents which have not yet been removed by the merge process.
+IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. + They are not intended for use by applications. For application consumption, use the count API.
+ `Get field data cache information.
+Get the amount of heap memory currently used by the field data cache on every data node in the cluster.
+IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. + They are not intended for use by applications. For application consumption, use the nodes stats API.
+ `Get the cluster health status.
+IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console.
+ They are not intended for use by applications. For application consumption, use the cluster health API.
+ This API is often used to check malfunctioning clusters.
+ To help you track cluster health alongside log files and alerting systems, the API returns timestamps in two formats: `HH:MM:SS`, which is human-readable but includes no date information; and Unix epoch time, which is machine-sortable and includes date information.
+ The latter format is useful for cluster recoveries that take multiple days.
+ You can use the cat health API to verify cluster health across multiple nodes.
+ You also can use the API to track the recovery of a large cluster over a longer period of time.
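As a quick, hedged sketch of reading the cat health timestamps from the Python client (JSON output chosen only to make the columns easy to print; remember cat APIs are for humans, not applications):

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    health = await client.cat.health(format="json")
    for row in health:
        # "epoch" is the machine-sortable Unix time, "timestamp" the HH:MM:SS form.
        print(row["epoch"], row["timestamp"], row["status"])
    await client.close()

asyncio.run(main())
```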
Get CAT help.
+Get help for the CAT APIs.
+ `Get index information.
+Get high-level information about indices in a cluster, including backing indices for data streams.
+Use this request to get the following information for each index in a cluster:
+These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents. + To get an accurate count of Elasticsearch documents, use the cat count or count APIs.
+CAT APIs are only intended for human consumption using the command line or Kibana console. + They are not intended for use by applications. For application consumption, use an index endpoint.
+ `Get master node information.
+Get information about the master node, including the ID, bound IP address, and name.
+IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
+ `Get data frame analytics jobs.
+Get configuration and usage information about data frame analytics jobs.
+ IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get data frame analytics jobs statistics API.
+ `Get datafeeds.
+Get configuration and usage information about datafeeds.
+ This API returns a maximum of 10,000 datafeeds.
+ If the Elasticsearch security features are enabled, you must have `monitor_ml`, `monitor`, `manage_ml`, or `manage` cluster privileges to use this API.
+ IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get datafeed statistics API.
+ `Get anomaly detection jobs.
+Get configuration and usage information for anomaly detection jobs.
+ This API returns a maximum of 10,000 jobs.
+ If the Elasticsearch security features are enabled, you must have `monitor_ml`, `monitor`, `manage_ml`, or `manage` cluster privileges to use this API.
+ IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get anomaly detection job statistics API.
+ `Get trained models.
+Get configuration and usage information about inference trained models.
+ IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get trained models statistics API.
+ `Get node attribute information.
+Get information about custom node attributes. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
+ `Get node information.
+Get information about the nodes in a cluster. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
+ `Get pending task information.
+Get information about cluster-level changes that have not yet taken effect. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the pending cluster tasks API.
+ `Get plugin information.
+Get a list of plugins running on each node of a cluster. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
+ `Get shard recovery information.
+Get information about ongoing and completed shard recoveries. + Shard recovery is the process of initializing a shard copy, such as restoring a primary shard from a snapshot or syncing a replica shard from a primary shard. When a shard recovery completes, the recovered shard is available for search and indexing. + For data streams, the API returns information about the stream’s backing indices. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the index recovery API.
+ `Get snapshot repository information.
+Get a list of snapshot repositories for a cluster. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get snapshot repository API.
+ `Get segment information.
+Get low-level information about the Lucene segments in index shards. + For data streams, the API returns information about the backing indices. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the index segments API.
+ `Get shard information.
+Get information about the shards in a cluster. + For data streams, the API returns information about the backing indices. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications.
+ `Get snapshot information.
+Get information about the snapshots stored in one or more repositories. + A snapshot is a backup of an index or running Elasticsearch cluster. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get snapshot API.
+ `Get task information.
+Get information about tasks currently running in the cluster. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the task management API.
+ `Get index template information.
+Get information about the index templates in a cluster. + You can use index templates to apply index settings and field mappings to new indices at creation. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get index template API.
+ `Get thread pool statistics.
+Get thread pool statistics for each node in a cluster. + Returned information includes all built-in thread pools and custom thread pools. + IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
+ `Get transform information.
+Get configuration and usage information about transforms.
+ CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get transform statistics API.
+ `Delete auto-follow patterns. + Delete a collection of cross-cluster replication auto-follow patterns.
+ `Create a follower. + Create a cross-cluster replication follower index that follows a specific leader index. + When the API returns, the follower index exists and cross-cluster replication starts replicating operations from the leader index to the follower index.
+ `Get follower information. + Get information about all cross-cluster replication follower indices. + For example, the results include follower index names, leader index names, replication options, and whether the follower indices are active or paused.
+ `Get follower stats. + Get cross-cluster replication follower stats. + The API returns shard-level stats about the "following tasks" associated with each shard for the specified indices.
+ `Forget a follower. + Remove the cross-cluster replication follower retention leases from the leader.
+A following index takes out retention leases on its leader index. + These leases are used to increase the likelihood that the shards of the leader index retain the history of operations that the shards of the following index need to run replication. + When a follower index is converted to a regular index by the unfollow API (either by directly calling the API or by index lifecycle management tasks), these leases are removed. + However, removal of the leases can fail, for example when the remote cluster containing the leader index is unavailable. + While the leases will eventually expire on their own, their extended existence can cause the leader index to hold more history than necessary and prevent index lifecycle management from performing some operations on the leader index. + This API exists to enable manually removing the leases when the unfollow API is unable to do so.
+NOTE: This API does not stop replication by a following index. If you use this API with a follower index that is still actively following, the following index will add back retention leases on the leader. + The only purpose of this API is to handle the case of failure to remove the following retention leases after the unfollow API is invoked.
+ `Get auto-follow patterns. + Get cross-cluster replication auto-follow patterns.
+ `Pause an auto-follow pattern. + Pause a cross-cluster replication auto-follow pattern. + When the API returns, the auto-follow pattern is inactive. + New indices that are created on the remote cluster and match the auto-follow patterns are ignored.
+You can resume auto-following with the resume auto-follow pattern API. + When it resumes, the auto-follow pattern is active again and automatically configures follower indices for newly created indices on the remote cluster that match its patterns. + Remote indices that were created while the pattern was paused will also be followed, unless they have been deleted or closed in the interim.
+ `Pause a follower. + Pause a cross-cluster replication follower index. + The follower index will not fetch any additional operations from the leader index. + You can resume following with the resume follower API. + You can pause and resume a follower index to change the configuration of the following task.
+ `Create or update auto-follow patterns. + Create a collection of cross-cluster replication auto-follow patterns for a remote cluster. + Newly created indices on the remote cluster that match any of the patterns are automatically configured as follower indices. + Indices on the remote cluster that were created before the auto-follow pattern was created will not be auto-followed even if they match the pattern.
+This API can also be used to update auto-follow patterns. + NOTE: Follower indices that were configured automatically before updating an auto-follow pattern will remain unchanged even if they do not match against the new patterns.
+ `Resume an auto-follow pattern. + Resume a cross-cluster replication auto-follow pattern that was paused. + The auto-follow pattern will resume configuring following indices for newly created indices that match its patterns on the remote cluster. + Remote indices created while the pattern was paused will also be followed unless they have been deleted or closed in the interim.
+ `Resume a follower. + Resume a cross-cluster replication follower index that was paused. + The follower index could have been paused with the pause follower API. + Alternatively it could be paused due to replication that cannot be retried due to failures during following tasks. + When this API returns, the follower index will resume fetching operations from the leader index.
+ `Get cross-cluster replication stats. + This API returns stats about auto-following and the same shard-level stats as the get follower stats API.
+ `Unfollow an index. + Convert a cross-cluster replication follower index to a regular index. + The API stops the following task associated with a follower index and removes index metadata and settings associated with cross-cluster replication. + The follower index must be paused and closed before you call the unfollow API.
+NOTE: Currently cross-cluster replication does not support converting an existing regular index to a follower index. Converting a follower index to a regular index is an irreversible operation.
+ `Explain the shard allocations. + Get explanations for shard allocations in the cluster. + For unassigned shards, it provides an explanation for why the shard is unassigned. + For assigned shards, it provides an explanation for why the shard is remaining on its current node and has not moved or rebalanced to another node. + This API can be very useful when attempting to diagnose why a shard is unassigned or why a shard continues to remain on its current node when you might expect otherwise.
+ `Delete component templates. + Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
+ `Clear cluster voting config exclusions. + Remove master-eligible nodes from the voting configuration exclusion list.
+ `Check component templates. + Returns information about whether a particular component template exists.
+ `Get component templates. + Get information about component templates.
+ `Get cluster-wide settings. + By default, it returns only settings that have been explicitly defined.
+ `Get the cluster health status. + You can also use the API to get the health status of only specified data streams and indices. + For data streams, the API retrieves the health status of the stream’s backing indices.
+The cluster health status is: green, yellow or red. + On the shard level, a red status indicates that the specific shard is not allocated in the cluster. Yellow means that the primary shard is allocated but replicas are not. Green means that all shards are allocated. + The index level status is controlled by the worst shard status.
+One of the main benefits of the API is the ability to wait until the cluster reaches a certain high watermark health level. + The cluster status is controlled by the worst index status.
+ `Get cluster info. + Returns basic information about the cluster.
+ `Get the pending cluster tasks. + Get information about cluster-level changes (such as create index, update mapping, allocate or fail shard) that have not yet taken effect.
+NOTE: This API returns a list of any pending updates to the cluster state. + These are distinct from the tasks reported by the task management API which include periodic tasks and tasks initiated by the user, such as node stats, search queries, or create index requests. + However, if a user-initiated task such as a create index command causes a cluster state update, the activity of this task might be reported by both task api and pending cluster tasks API.
+ `Update voting configuration exclusions. + Update the cluster voting config exclusions by node IDs or node names. + By default, if there are more than three master-eligible nodes in the cluster and you remove fewer than half of the master-eligible nodes in the cluster at once, the voting configuration automatically shrinks. + If you want to shrink the voting configuration to contain fewer than three nodes or to remove half or more of the master-eligible nodes in the cluster at once, use this API to remove departing nodes from the voting configuration manually. + The API adds an entry for each specified node to the cluster’s voting configuration exclusions list. + It then waits until the cluster has reconfigured its voting configuration to exclude the specified nodes.
+Clusters should have no voting configuration exclusions in normal operation.
+ Once the excluded nodes have stopped, clear the voting configuration exclusions with `DELETE /_cluster/voting_config_exclusions`.
+ This API waits for the nodes to be fully removed from the cluster before it returns.
+ If your cluster has voting configuration exclusions for nodes that you no longer intend to remove, use `DELETE /_cluster/voting_config_exclusions?wait_for_removal=false` to clear the voting configuration exclusions without waiting for the nodes to leave the cluster.
+ A response to `POST /_cluster/voting_config_exclusions` with an HTTP status code of 200 OK guarantees that the node has been removed from the voting configuration and will not be reinstated until the voting configuration exclusions are cleared by calling `DELETE /_cluster/voting_config_exclusions`.
+ If the call to `POST /_cluster/voting_config_exclusions` fails or returns a response with an HTTP status code other than 200 OK then the node may not have been removed from the voting configuration.
+ In that case, you may safely retry the call.
+ NOTE: Voting exclusions are required only when you remove at least half of the master-eligible nodes from a cluster in a short time period.
+ They are not required when removing master-ineligible nodes or when removing fewer than half of the master-eligible nodes.
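For illustration, a minimal sketch of the exclusion workflow with the Python client; the node names are hypothetical:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Exclude two master-eligible nodes before taking them offline.
    await client.cluster.post_voting_config_exclusions(node_names=["master-1", "master-2"])
    # ... shut the excluded nodes down ...
    # Once the excluded nodes have stopped, clear the exclusions again.
    await client.cluster.delete_voting_config_exclusions()
    await client.close()

asyncio.run(main())
```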
+ `Create or update a component template. + Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
+An index template can be composed of multiple component templates.
+ To use a component template, specify it in an index template's `composed_of` list.
+ Component templates are only applied to new data streams and indices as part of a matching index template.
+ Settings and mappings specified directly in the index template or the create index request override any settings or mappings specified in a component template.
+ Component templates are only used during index creation.
+ For data streams, this includes data stream creation and the creation of a stream's backing indices.
+ Changes to component templates do not affect existing indices, including a stream's backing indices.
+ You can use C-style `/* */` block comments in component templates.
+ You can include comments anywhere in the request body except before the opening curly bracket.
+ **Applying component templates**
+ You cannot directly apply a component template to a data stream or index.
+ To be applied, a component template must be included in an index template's `composed_of` list.
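A hedged sketch of the `composed_of` relationship using the Python client; the template and index pattern names are hypothetical:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Create a component template holding shared settings.
    await client.cluster.put_component_template(
        name="my-settings",
        template={"settings": {"number_of_shards": 1}},
    )
    # Reference it from an index template via composed_of; new matching indices pick it up.
    await client.indices.put_index_template(
        name="my-template",
        index_patterns=["my-index-*"],
        composed_of=["my-settings"],
    )
    await client.close()

asyncio.run(main())
```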
Update the cluster settings.
+ Configure and update dynamic settings on a running cluster.
+ You can also configure dynamic settings locally on an unstarted or shut down node in `elasticsearch.yml`.
+ Updates made with this API can be persistent, which apply across cluster restarts, or transient, which reset after a cluster restart.
+ You can also reset transient or persistent settings by assigning them a null value.
+ If you configure the same setting using multiple methods, Elasticsearch applies the settings in the following order of precedence: 1) Transient setting; 2) Persistent setting; 3) `elasticsearch.yml` setting; 4) Default setting value.
+ For example, you can apply a transient setting to override a persistent setting or `elasticsearch.yml` setting.
+ However, a change to an `elasticsearch.yml` setting will not override a defined transient or persistent setting.
+ TIP: In Elastic Cloud, use the user settings feature to configure all cluster settings. This method automatically rejects unsafe settings that could break your cluster.
+ If you run Elasticsearch on your own hardware, use this API to configure dynamic cluster settings.
+ Only use `elasticsearch.yml` for static cluster settings and node settings.
+ The API doesn't require a restart and ensures a setting's value is the same on all nodes.
+ WARNING: Transient cluster settings are no longer recommended. Use persistent cluster settings instead.
+ If a cluster becomes unstable, transient settings can clear unexpectedly, resulting in a potentially undesired cluster configuration.
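A minimal sketch of setting and then resetting a persistent cluster setting from the Python client; the recovery throttle value is only an example:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Persistent settings survive a full cluster restart.
    await client.cluster.put_settings(
        persistent={"indices.recovery.max_bytes_per_sec": "50mb"}
    )
    # Assigning null (None) resets the setting to its default.
    await client.cluster.put_settings(
        persistent={"indices.recovery.max_bytes_per_sec": None}
    )
    await client.close()

asyncio.run(main())
```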
+ `Get remote cluster information. + Get all of the configured remote cluster information. + This API returns connection and endpoint information keyed by the configured remote cluster alias.
+ `Reroute the cluster. + Manually change the allocation of individual shards in the cluster. + For example, a shard can be moved from one node to another explicitly, an allocation can be canceled, and an unassigned shard can be explicitly allocated to a specific node.
+ It is important to note that after processing any reroute commands Elasticsearch will perform rebalancing as normal (respecting the values of settings such as `cluster.routing.rebalance.enable`) in order to remain in a balanced state.
+ For example, if the requested allocation includes moving a shard from node1 to node2 then this may cause a shard to be moved from node2 back to node1 to even things out.
+ The cluster can be set to disable allocations using the `cluster.routing.allocation.enable` setting.
+ If allocations are disabled then the only allocations that will be performed are explicit ones given using the reroute command, and consequent allocations due to rebalancing.
+ The cluster will attempt to allocate a shard a maximum of `index.allocation.max_retries` times in a row (defaults to `5`), before giving up and leaving the shard unallocated.
+ This scenario can be caused by structural problems such as having an analyzer which refers to a stopwords file which doesn't exist on all nodes.
+ Once the problem has been corrected, allocation can be manually retried by calling the reroute API with the `?retry_failed` URI query parameter, which will attempt a single retry round for these shards.
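For example, a hedged one-call sketch of retrying failed allocations after the underlying problem has been fixed:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Ask the cluster to retry allocation of shards that exhausted their automatic retries.
    await client.cluster.reroute(retry_failed=True)
    await client.close()

asyncio.run(main())
```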
Get the cluster state. + Get comprehensive information about the state of the cluster.
+The cluster state is an internal data structure which keeps track of a variety of information needed by every node, including the identity and attributes of the other nodes in the cluster; cluster-wide settings; index metadata, including the mapping and settings for each index; the location and status of every shard copy in the cluster.
+The elected master node ensures that every node in the cluster has a copy of the same cluster state. + This API lets you retrieve a representation of this internal state for debugging or diagnostic purposes. + You may need to consult the Elasticsearch source code to determine the precise meaning of the response.
+By default the API will route requests to the elected master node since this node is the authoritative source of cluster states.
+ You can also retrieve the cluster state held on the node handling the API request by adding the `?local=true` query parameter.
Elasticsearch may need to expend significant effort to compute a response to this API in larger clusters, and the response may comprise a very large quantity of data. + If you use this API repeatedly, your cluster may become unstable.
+WARNING: The response is a representation of an internal data structure. + Its format is not subject to the same compatibility guarantees as other more stable APIs and may change from version to version. + Do not query this API using external monitoring tools. + Instead, obtain the information you require using other more stable cluster APIs.
+ `Get cluster statistics. + Get basic index metrics (shard numbers, store size, memory usage) and information about the current nodes that form the cluster (number, roles, os, jvm versions, memory usage, cpu and installed plugins).
+ Check in a connector.
+ Update the `last_seen` field in the connector and set it to the current timestamp.
Delete a connector.
+Removes a connector and associated sync jobs. + This is a destructive action that is not recoverable. + NOTE: This action doesn’t delete any API keys, ingest pipelines, or data indices associated with the connector. + These need to be removed manually.
+ `Get a connector.
+Get the details about a connector.
+ `Update the connector last sync stats.
+Update the fields related to the last sync of a connector. + This action is used for analytics and monitoring.
+ `Get all connectors.
+Get information about all connectors.
+ `Create a connector.
+Connectors are Elasticsearch integrations that bring content from third-party data sources, which can be deployed on Elastic Cloud or hosted on your own infrastructure. + Elastic managed connectors (Native connectors) are a managed service on Elastic Cloud. + Self-managed connectors (Connector clients) are self-managed on your infrastructure.
+ `Create or update a connector.
+ Cancel a connector sync job.
+ Cancel a connector sync job, which sets the status to cancelling and updates `cancellation_requested_at` to the current time.
+ The connector service is then responsible for setting the status of connector sync jobs to cancelled.
+ Check in a connector sync job.
+ Check in a connector sync job and set the `last_seen` field to the current time before updating it in the internal index.
To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. + This service runs automatically on Elastic Cloud for Elastic managed connectors.
+ Claim a connector sync job.
+ This action updates the job status to `in_progress` and sets the `last_seen` and `started_at` timestamps to the current time.
+ Additionally, it can set the `sync_cursor` property for the sync job.
This API is not intended for direct connector management by users. + It supports the implementation of services that utilize the connector protocol to communicate with Elasticsearch.
+To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. + This service runs automatically on Elastic Cloud for Elastic managed connectors.
+ `Delete a connector sync job.
+Remove a connector sync job and its associated data. + This is a destructive action that is not recoverable.
+ Set a connector sync job error.
+ Set the `error` field for a connector sync job and set its `status` to `error`.
To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. + This service runs automatically on Elastic Cloud for Elastic managed connectors.
+ `Get a connector sync job.
+ `Get all connector sync jobs.
+Get information about all stored connector sync jobs listed by their creation date in ascending order.
+ `Create a connector sync job.
+Create a connector sync job document in the internal index and initialize its counters and timestamps with default values.
+ Set the connector sync job stats.
+ Stats include: `deleted_document_count`, `indexed_document_count`, `indexed_document_volume`, and `total_document_count`.
+ You can also update `last_seen`.
+ This API is mainly used by the connector service for updating sync job information.
To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. + This service runs automatically on Elastic Cloud for Elastic managed connectors.
+ `Activate the connector draft filter.
+Activates the valid draft filtering for a connector.
+ Update the connector API key ID.
+ Update the `api_key_id` and `api_key_secret_id` fields of a connector.
+ You can specify the ID of the API key used for authorization and the ID of the connector secret where the API key is stored.
+ The connector secret ID is required only for Elastic managed (native) connectors.
+ Self-managed connectors (connector clients) do not use this field.
Update the connector configuration.
+Update the configuration field in the connector document.
+ `Update the connector error field.
+Set the error field for the connector. + If the error provided in the request body is non-null, the connector’s status is updated to error. + Otherwise, if the error is reset to null, the connector status is updated to connected.
+ `Update the connector features. + Update the connector features in the connector document. + This API can be used to control the following aspects of a connector:
+Normally, the running connector service automatically manages these features. + However, you can use this API to override the default behavior.
+To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. + This service runs automatically on Elastic Cloud for Elastic managed connectors.
+ `Update the connector filtering.
+Update the draft filtering configuration of a connector and marks the draft validation state as edited. + The filtering draft is activated once validated by the running Elastic connector service. + The filtering property is used to configure sync rules (both basic and advanced) for a connector.
+ `Update the connector draft filtering validation.
+Update the draft filtering validation info for a connector.
+ Update the connector index name.
+ Update the `index_name` field of a connector, specifying the index where the data ingested by the connector is stored.
Update the connector name and description.
+ `Update the connector is_native flag.
+ `Update the connector pipeline.
+When you create a new connector, the configuration of an ingest pipeline is populated with default settings.
+ `Update the connector scheduling.
+ `Update the connector service type.
+ `Update the connector status.
+ Delete a dangling index.
+ If Elasticsearch encounters index data that is absent from the current cluster state, those indices are considered to be dangling.
+ For example, this can happen if you delete more than `cluster.indices.tombstones.size` indices while an Elasticsearch node is offline.
+ Import a dangling index.
+ If Elasticsearch encounters index data that is absent from the current cluster state, those indices are considered to be dangling.
+ For example, this can happen if you delete more than `cluster.indices.tombstones.size` indices while an Elasticsearch node is offline.
+ Get the dangling indices.
+ If Elasticsearch encounters index data that is absent from the current cluster state, those indices are considered to be dangling.
+ For example, this can happen if you delete more than `cluster.indices.tombstones.size` indices while an Elasticsearch node is offline.
Use this API to list dangling indices, which you can then import or delete.
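A minimal sketch of listing dangling indices and importing them with the Python client (importing, like deleting, requires explicitly accepting possible data loss):

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    dangling = await client.dangling_indices.list_dangling_indices()
    for idx in dangling["dangling_indices"]:
        # Import each dangling index by its UUID; use delete_dangling_index to discard instead.
        await client.dangling_indices.import_dangling_index(
            index_uuid=idx["index_uuid"], accept_data_loss=True
        )
    await client.close()

asyncio.run(main())
```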
+ `Delete an enrich policy. + Deletes an existing enrich policy and its enrich index.
+ `Run an enrich policy. + Create the enrich index for an existing enrich policy.
+ `Get an enrich policy. + Returns information about an enrich policy.
+ `Create an enrich policy. + Creates an enrich policy.
+ `Get enrich stats. + Returns enrich coordinator statistics and information about enrich policies that are currently executing.
+ `Delete an async EQL search. + Delete an async EQL search or a stored synchronous EQL search. + The API also deletes results for the search.
+ `Get async EQL search results. + Get the current status and available results for an async EQL search or a stored synchronous EQL search.
+ `Get the async EQL status. + Get the current status for an async EQL search or a stored synchronous EQL search without returning results.
+ `Get EQL search results. + Returns search results for an Event Query Language (EQL) query. + EQL assumes each document in a data stream or index corresponds to an event.
+ `Run an async ES|QL query. + Asynchronously run an ES|QL (Elasticsearch query language) query, monitor its progress, and retrieve results when they become available.
+The API accepts the same parameters and request body as the synchronous query API, along with additional async related properties.
+ `Delete an async ES|QL query. + If the query is still running, it is cancelled. + Otherwise, the stored results are deleted.
+ If the Elasticsearch security features are enabled, only the following users can use this API to delete a query: the authenticated user that submitted the original query request; users with the `cancel_task` cluster privilege.
+ Get async ES|QL query results.
+ Get the current status and available results or stored results for an ES|QL asynchronous query.
+ If the Elasticsearch security features are enabled, only the user who first submitted the ES|QL query can retrieve the results using this API.
+ `Run an ES|QL query. + Get search results for an ES|QL (Elasticsearch query language) query.
+ Get the features.
+ Get a list of features that can be included in snapshots using the `feature_states` field when creating a snapshot.
+ You can use this API to determine which feature states to include when taking a snapshot.
+ By default, all feature states are included in a snapshot if that snapshot includes the global state, or none if it does not.
A feature state includes one or more system indices necessary for a given feature to function. + In order to ensure data integrity, all system indices that comprise a feature state are snapshotted and restored together.
+The features listed by this API are a combination of built-in features and features defined by plugins. + In order for a feature state to be listed in this API and recognized as a valid feature state by the create snapshot API, the plugin that defines that feature must be installed on the master node.
+ `Reset the features. + Clear all of the state information stored in system indices by Elasticsearch features, including the security and machine learning indices.
+WARNING: Intended for development and testing use only. Do not reset features on a production cluster.
+Return a cluster to the same state as a new installation by resetting the feature state for all Elasticsearch features. + This deletes all state information stored in system indices.
+The response code is HTTP 200 if the state is successfully reset for all features. + It is HTTP 500 if the reset operation failed for any feature.
+Note that select features might provide a way to reset particular system indices. + Using this API resets all features, both those that are built-in and implemented as plugins.
+To list the features that will be affected, use the get features API.
+IMPORTANT: The features installed on the node you submit this request to are the features that will be reset. Run on the master node if you have any doubts about which plugins are installed on individual nodes.
+ Returns the current global checkpoints for an index. This API is designed for internal use by the Fleet server project.
+ `Executes several fleet searches with a single API request. + The API follows the same structure as the multi search API. However, similar to the fleet search API, it + supports the wait_for_checkpoints parameter.
+ :param searches: :param index: A single target to search. If the target is an index alias, it @@ -378,9 +382,11 @@ async def search( body: t.Optional[t.Dict[str, t.Any]] = None, ) -> ObjectApiResponse[t.Any]: """ - The purpose of the fleet search api is to provide a search api where the search - will only be executed after provided checkpoint has been processed and is visible - for searches inside of Elasticsearch. + .. raw:: html + +The purpose of the fleet search api is to provide a search api where the search will only be executed + after provided checkpoint has been processed and is visible for searches inside of Elasticsearch.
+ :param index: A single target to search. If the target is an index alias, it must resolve to a single index. diff --git a/elasticsearch/_async/client/graph.py b/elasticsearch/_async/client/graph.py index e713aa26b..5b86970b1 100644 --- a/elasticsearch/_async/client/graph.py +++ b/elasticsearch/_async/client/graph.py @@ -45,14 +45,15 @@ async def explore( body: t.Optional[t.Dict[str, t.Any]] = None, ) -> ObjectApiResponse[t.Any]: """ - Explore graph analytics. Extract and summarize information about the documents - and terms in an Elasticsearch data stream or index. The easiest way to understand - the behavior of this API is to use the Graph UI to explore connections. An initial - request to the `_explore` API contains a seed query that identifies the documents - of interest and specifies the fields that define the vertices and connections - you want to include in the graph. Subsequent requests enable you to spider out - from one more vertices of interest. You can exclude vertices that have already - been returned. + .. raw:: html + +Explore graph analytics.
+ Extract and summarize information about the documents and terms in an Elasticsearch data stream or index.
+ The easiest way to understand the behavior of this API is to use the Graph UI to explore connections.
+ An initial request to the `_explore` API contains a seed query that identifies the documents of interest and specifies the fields that define the vertices and connections you want to include in the graph.
+ Subsequent requests enable you to spider out from one or more vertices of interest.
+ You can exclude vertices that have already been returned.
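As a rough sketch of a seed query plus vertex and connection definitions with the Python client; the `clicklogs` index and field names are hypothetical:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    resp = await client.graph.explore(
        index="clicklogs",
        query={"match": {"query.raw": "midi"}},        # seed query
        vertices=[{"field": "product"}],               # fields that define vertices
        connections={"vertices": [{"field": "query.raw"}]},
    )
    print(resp["vertices"])
    await client.close()

asyncio.run(main())
```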
Delete a lifecycle policy. + You cannot delete policies that are currently in use. If the policy is being used to manage any indices, the request fails and returns an error.
+ `Explain the lifecycle state. + Get the current lifecycle status for one or more indices. + For data streams, the API retrieves the current lifecycle status for the stream's backing indices.
+The response indicates when the index entered each lifecycle state, provides the definition of the running phase, and information about any failures.
+ `Get lifecycle policies.
+ `Get the ILM status. + Get the current index lifecycle management status.
+ `Migrate to data tiers routing. + Switch the indices, ILM policies, and legacy, composable, and component templates from using custom node attributes and attribute-based allocation filters to using data tiers. + Optionally, delete one legacy index template. + Using node roles enables ILM to automatically move the indices between data tiers.
+Migrating away from custom node attributes routing can be manually performed. + This API provides an automated way of performing three out of the four manual steps listed in the migration guide:
+ILM must be stopped before performing the migration.
+ Use the stop ILM and get ILM status APIs to wait until the reported operation mode is `STOPPED`.
Move to a lifecycle step. + Manually move an index into a specific step in the lifecycle policy and run that step.
+WARNING: This operation can result in the loss of data. Manually moving an index into a specific step runs that step even if it has already been performed. This is a potentially destructive action and this should be considered an expert level API.
+ You must specify both the current step and the step to be executed in the body of the request.
+ The request will fail if the current step does not match the step currently running for the index.
+ This is to prevent the index from being moved from an unexpected step into the next step.
+ When specifying the target (`next_step`) to which the index will be moved, either the name or both the action and name fields are optional.
+ If only the phase is specified, the index will move to the first step of the first action in the target phase.
+ If the phase and action are specified, the index will move to the first step of the specified action in the specified phase.
+ Only actions specified in the ILM policy are considered valid.
+ An index cannot move to a step that is not part of its policy.
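For illustration, a hedged sketch of moving an index into a specific step; the index name and the chosen phases, actions, and step names are hypothetical and must match the policy and the index's actual current step:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Move a hypothetical index from the end of the "new" phase
    # into the warm phase's forcemerge action.
    await client.ilm.move_to_step(
        index="my-index",
        current_step={"phase": "new", "action": "complete", "name": "complete"},
        next_step={"phase": "warm", "action": "forcemerge", "name": "forcemerge"},
    )
    await client.close()

asyncio.run(main())
```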
Create or update a lifecycle policy. + If the specified policy exists, it is replaced and the policy version is incremented.
+NOTE: Only the latest version of the policy is stored, you cannot revert to previous versions.
+ `Remove policies from an index. + Remove the assigned lifecycle policies from an index or a data stream's backing indices. + It also stops managing the indices.
+ `Retry a policy. + Retry running the lifecycle policy for an index that is in the ERROR step. + The API sets the policy back to the step where the error occurred and runs the step. + Use the explain lifecycle state API to determine whether an index is in the ERROR step.
+ `Start the ILM plugin. + Start the index lifecycle management plugin if it is currently stopped. + ILM is started automatically when the cluster is formed. + Restarting ILM is necessary only when it has been stopped using the stop ILM API.
+ `Stop the ILM plugin. + Halt all lifecycle management operations and stop the index lifecycle management plugin. + This is useful when you are performing maintenance on the cluster and need to prevent ILM from performing any actions on your indices.
+The API returns as soon as the stop request has been acknowledged, but the plugin might continue to run until in-progress operations complete and the plugin can be safely stopped. + Use the get ILM status API to check whether ILM is running.
+ `Add an index block. + Limits the operations allowed on an index by blocking specific operation types.
+ `Get tokens from text analysis. + The analyze API performs analysis on a text string and returns the resulting tokens.
+ Generating an excessive amount of tokens may cause a node to run out of memory.
+ The `index.analyze.max_token_count` setting enables you to limit the number of tokens that can be produced.
+ If more than this limit of tokens gets generated, an error occurs.
+ The `_analyze` endpoint without a specified index will always use `10000` as its limit.
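A minimal sketch of running text analysis with the Python client; the analyzer choice and sample text are arbitrary:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Analyze a short string with the standard analyzer and print the tokens.
    resp = await client.indices.analyze(
        analyzer="standard",
        text="The quick brown fox",
    )
    print([token["token"] for token in resp["tokens"]])
    await client.close()

asyncio.run(main())
```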
Clear the cache. + Clear the cache of one or more indices. + For data streams, the API clears the caches of the stream's backing indices.
+By default, the clear cache API clears all caches.
+ To clear only specific caches, use the `fielddata`, `query`, or `request` parameters.
+ To clear the cache only of specific fields, use the `fields` parameter.
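For example, a hedged sketch of clearing only selected caches for a hypothetical index:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Clear only the field data and request caches, leaving the query cache untouched.
    await client.indices.clear_cache(
        index="my-index",   # hypothetical index name
        fielddata=True,
        request=True,
        query=False,
    )
    await client.close()

asyncio.run(main())
```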
Clone an index. + Clone an existing index into a new index. + Each original primary shard is cloned into a new primary shard in the new index.
+IMPORTANT: Elasticsearch does not apply index templates to the resulting index. + The API also does not copy index metadata from the original index. + Index metadata includes aliases, index lifecycle management phase definitions, and cross-cluster replication (CCR) follower information. + For example, if you clone a CCR follower index, the resulting clone will not be a follower index.
+ The clone API copies most index settings from the source index to the resulting index, with the exception of `index.number_of_replicas` and `index.auto_expand_replicas`.
+ To set the number of replicas in the resulting index, configure these settings in the clone request.
Cloning works as follows:
+IMPORTANT: Indices can only be cloned if they meet the following requirements:
+The current write index on a data stream cannot be cloned. + In order to clone the current write index, the data stream must first be rolled over so that a new write index is created and then the previous write index can be cloned.
+ NOTE: Mappings cannot be specified in the `_clone` request. The mappings of the source index will be used for the target index.
+ **Monitor the cloning process**
+ The cloning process can be monitored with the cat recovery API, or the cluster health API can be used to wait until all primary shards have been allocated by setting the `wait_for_status` parameter to `yellow`.
+ The `_clone` API returns as soon as the target index has been added to the cluster state, before any shards have been allocated.
+ At this point, all shards are in the state unassigned.
+ If, for any reason, the target index can't be allocated, its primary shard will remain unassigned until it can be allocated on that node.
Once the primary shard is allocated, it moves to state initializing, and the clone process begins. + When the clone operation completes, the shard will become active. + At that point, Elasticsearch will try to allocate any replicas and may decide to relocate the primary shard to another node.
+ **Wait for active shards**
+Because the clone operation creates a new index to clone the shards to, the wait for active shards setting on index creation applies to the clone index action as well.
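A hedged sketch of the clone workflow with the Python client; the source and target index names and the replica override are hypothetical:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # The source index must be made read-only before it can be cloned.
    await client.indices.add_block(index="my-source-index", block="write")
    # Clone it, overriding the replica count on the target.
    await client.indices.clone(
        index="my-source-index",
        target="my-cloned-index",
        settings={"index.number_of_replicas": 0},
    )
    await client.close()

asyncio.run(main())
```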
+ `Close an index. + A closed index is blocked for read or write operations and does not allow all operations that opened indices allow. + It is not possible to index documents or to search for documents in a closed index. + Closed indices do not have to maintain internal data structures for indexing or searching documents, which results in a smaller overhead on the cluster.
+When opening or closing an index, the master node is responsible for restarting the index shards to reflect the new state of the index. + The shards will then go through the normal recovery process. + The data of opened and closed indices is automatically replicated by the cluster to ensure that enough shard copies are safely kept around at all times.
+You can open and close multiple indices.
+ An error is thrown if the request explicitly refers to a missing index.
+ This behaviour can be turned off using the `ignore_unavailable=true` parameter.
+ By default, you must explicitly name the indices you are opening or closing.
+ To open or close indices with `_all`, `*`, or other wildcard expressions, change the `action.destructive_requires_name` setting to `false`. This setting can also be changed with the cluster update settings API.
+ Closed indices consume a significant amount of disk space which can cause problems in managed environments.
+ Closing indices can be turned off with the cluster settings API by setting `cluster.indices.close.enable` to `false`.
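A minimal sketch of closing and later reopening an index from the Python client; the index name is hypothetical:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Close the index to block reads and writes and free indexing/search structures.
    await client.indices.close(index="my-index")
    # Reopen it when it is needed again; shards go through normal recovery.
    await client.indices.open(index="my-index")
    await client.close()

asyncio.run(main())
```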
+ Create an index.
+ You can use the create index API to add a new index to an Elasticsearch cluster.
+ When creating an index, you can specify the following: settings for the index; mappings for fields in the index; index aliases.
+ **Wait for active shards**
+ By default, index creation will only return a response to the client when the primary copies of each shard have been started, or the request times out.
+ The index creation response will indicate what happened.
+ For example, `acknowledged` indicates whether the index was successfully created in the cluster, while `shards_acknowledged` indicates whether the requisite number of shard copies were started for each shard in the index before timing out.
+ Note that it is still possible for either `acknowledged` or `shards_acknowledged` to be `false`, but for the index creation to be successful.
+ These values simply indicate whether the operation completed before the timeout.
+ If `acknowledged` is false, the request timed out before the cluster state was updated with the newly created index, but it probably will be created sometime soon.
+ If `shards_acknowledged` is false, then the request timed out before the requisite number of shards were started (by default just the primaries), even if the cluster state was successfully updated to reflect the newly created index (that is to say, `acknowledged` is `true`).
+ You can change the default of only waiting for the primary shards to start through the index setting `index.write.wait_for_active_shards`.
+ Note that changing this setting will also affect the `wait_for_active_shards` value on all subsequent write operations.
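As a rough illustration of creating an index with explicit settings, mappings, and a shard-activation requirement; the index name, mapping, and shard counts are hypothetical:

```
import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    client = AsyncElasticsearch("http://localhost:9200")
    # Wait until at least two copies of each shard are active before responding.
    await client.indices.create(
        index="my-index",
        settings={"number_of_shards": 1, "number_of_replicas": 1},
        mappings={"properties": {"title": {"type": "text"}}},
        wait_for_active_shards=2,
    )
    await client.close()

asyncio.run(main())
```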
Create a data stream. + Creates a data stream. + You must have a matching index template with data stream enabled.
+ `Get data stream stats. + Retrieves statistics for one or more data streams.
+ `Delete indices. + Deleting an index deletes its documents, shards, and metadata. + It does not delete related Kibana components, such as data views, visualizations, or dashboards.
+You cannot delete the current write index of a data stream. + To delete the index, you must roll over the data stream so a new write index is created. + You can then use the delete index API to delete the previous write index.
+ `Delete an alias. + Removes a data stream or index from an alias.
+ `Delete data stream lifecycles. + Removes the data stream lifecycle from a data stream, rendering it not managed by the data stream lifecycle.
+ `Delete data streams. + Deletes one or more data streams and their backing indices.
+ Delete an index template.
+ The provided `name` may contain multiple template names separated by a comma. If multiple template names are specified then there is no wildcard support and the provided names should match completely with existing templates.
+ `Delete a legacy index template.
+ `Analyze the index disk usage. + Analyze the disk usage of each field of an index or data stream. + This API might not support indices created in previous Elasticsearch versions. + The result of a small index can be inaccurate as some parts of an index might not be analyzed by the API.
+ NOTE: The total size of fields of the analyzed shards of the index in the response is usually smaller than the index `store_size` value because some small metadata files are ignored and some parts of data files might not be scanned by the API.
+ Since stored fields are stored together in a compressed format, the sizes of stored fields are also estimates and can be inaccurate.
+ The stored size of the `_id` field is likely underestimated while the `_source` field is overestimated.
Downsample an index.
+ Aggregate a time series (TSDS) index and store pre-computed statistical summaries (`min`, `max`, `sum`, `value_count` and `avg`) for each metric field grouped by a configured time interval.
+ For example, a TSDS index that contains metrics sampled every 10 seconds can be downsampled to an hourly index.
+ All documents within an hour interval are summarized and stored as a single document in the downsample index.
NOTE: Only indices in a time series data stream are supported.
+ Neither field nor document level security can be defined on the source index.
+ The source index must be read only (`index.blocks.write: true`).
Check indices. + Check if one or more indices, index aliases, or data streams exist.
+ `Check aliases. + Checks if one or more data stream or index aliases exist.
+ `Check index templates. + Check whether index templates exist.
+ `Check existence of index templates. + Get information about whether index templates exist. + Index templates define settings, mappings, and aliases that can be applied automatically to new indices.
+IMPORTANT: This documentation is about legacy index templates, which are deprecated and will be replaced by the composable templates introduced in Elasticsearch 7.8.
+ `Get the status for a data stream lifecycle. + Get information about an index or data stream's current data stream lifecycle status, such as time since index creation, time since rollover, the lifecycle configuration managing the index, or any errors encountered during lifecycle execution.
+ `Get field usage stats. + Get field usage information for each shard and field of an index. + Field usage statistics are automatically captured when queries are running on a cluster. + A shard-level search request that accesses a given field, even if multiple times during that request, is counted as a single use.
+The response body reports the per-shard usage count of the data structures that back the fields in the index. + A given request will increment each count by a maximum value of 1, even if the request accesses the same field multiple times.
+ `Flush data streams or indices. + Flushing a data stream or index is the process of making sure that any data that is currently only stored in the transaction log is also permanently stored in the Lucene index. + When restarting, Elasticsearch replays any unflushed operations from the transaction log into the Lucene index to bring it back into the state that it was in before the restart. + Elasticsearch automatically triggers flushes as needed, using heuristics that trade off the size of the unflushed transaction log against the cost of performing each flush.
+After each operation has been flushed it is permanently stored in the Lucene index. + This may mean that there is no need to maintain an additional copy of it in the transaction log. + The transaction log is made up of multiple files, called generations, and Elasticsearch will delete any generation files when they are no longer needed, freeing up disk space.
+It is also possible to trigger a flush on one or more indices using the flush API, although it is rare for users to need to call this API directly. + If you call the flush API after indexing some documents then a successful response indicates that Elasticsearch has flushed all the documents that were indexed before the flush API was called.
+ `Force a merge. + Perform the force merge operation on the shards of one or more indices. + For data streams, the API forces a merge on the shards of the stream's backing indices.
+Merging reduces the number of segments in each shard by merging some of them together and also frees up the space used by deleted documents. + Merging normally happens automatically, but sometimes it is useful to trigger a merge manually.
+WARNING: We recommend force merging only a read-only index (meaning the index is no longer receiving writes). + When documents are updated or deleted, the old version is not immediately removed but instead soft-deleted and marked with a "tombstone". + These soft-deleted documents are automatically cleaned up during regular segment merges. + But force merge can cause very large (greater than 5 GB) segments to be produced, which are not eligible for regular merges. + So the number of soft-deleted documents can then grow rapidly, resulting in higher disk usage and worse search performance. + If you regularly force merge an index receiving writes, this can also make snapshots more expensive, since the new documents can't be backed up incrementally.
+Blocks during a force merge
+Calls to this API block until the merge is complete (unless the request contains `wait_for_completion=false`).
+ If the client connection is lost before completion then the force merge process will continue in the background.
+ Any new requests to force merge the same indices will also block until the ongoing force merge is complete.
Running force merge asynchronously
+If the request contains `wait_for_completion=false`, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to get the status of the task.
+ However, you cannot cancel this task as the force merge task is not cancelable.
+ Elasticsearch creates a record of this task as a document at `_tasks/<task_id>`.
+ When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.
Force merging multiple indices
+You can force merge multiple indices with a single request by targeting:
+Each targeted shard is force-merged separately using the `force_merge` threadpool.
+ By default each node only has a single `force_merge` thread which means that the shards on that node are force-merged one at a time.
+ If you expand the `force_merge` threadpool on a node then it will force merge its shards in parallel.
+Force merge makes the storage for the shard being merged temporarily increase, as it may require free space up to triple its size in case the `max_num_segments` parameter is set to `1`, to rewrite all segments into a new one.
Data streams and time-based indices
+Force-merging is useful for managing a data stream's older backing indices and other time-based indices, particularly after a rollover. + In these cases, each index only receives indexing traffic for a certain period of time. + Once an index receives no more writes, its shards can be force-merged to a single segment. + This can be a good idea because single-segment shards can sometimes use simpler and more efficient data structures to perform searches. + For example:
+POST /.ds-my-data-stream-2099.03.07-000001/_forcemerge?max_num_segments=1
+
+
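+A hedged Python sketch of the same call through this async client (the backing index name and cluster URL are illustrative; `wait_for_completion` is passed as a query parameter):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def merge_old_backing_index() -> None:
    # Force-merge a rolled-over backing index down to a single segment;
    # wait_for_completion=False returns a task instead of blocking.
    resp = await client.indices.forcemerge(
        index=".ds-my-data-stream-2099.03.07-000001",
        max_num_segments=1,
        wait_for_completion=False,
    )
    print(resp)  # contains the task ID to poll via the tasks API
```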
`Get index information. + Get information about one or more indices. For data streams, the API returns information about the + stream’s backing indices.
+ `Get aliases. + Retrieves information for one or more data stream or index aliases.
+ `Get data stream lifecycles. + Retrieves the data stream lifecycle configuration of one or more data streams.
+ `Get data stream lifecycle stats. + Get statistics about the data streams that are managed by a data stream lifecycle.
+ `Get data streams. + Retrieves information about one or more data streams.
+ `Get mapping definitions. + Retrieves mapping definitions for one or more fields. + For data streams, the API retrieves field mappings for the stream’s backing indices.
+This API is useful if you don't need a complete mapping or if an index mapping contains a large number of fields.
+ `Get index templates. + Get information about one or more index templates.
+ `Get mapping definitions. + For data streams, the API retrieves mappings for the stream’s backing indices.
+ `Get index settings. + Get setting information for one or more indices. + For data streams, it returns setting information for the stream's backing indices.
+ `Get index templates. + Get information about one or more index templates.
+IMPORTANT: This documentation is about legacy index templates, which are deprecated and will be replaced by the composable templates introduced in Elasticsearch 7.8.
+ `Convert an index alias to a data stream.
+ Converts an index alias to a data stream.
+ You must have a matching index template that is data stream enabled.
+ The alias must meet the following criteria:
+ The alias must have a write index;
+ All indices for the alias must have a `@timestamp` field mapping of a `date` or `date_nanos` field type;
+ The alias must not have any filters;
+ The alias must not use custom routing.
+ If successful, the request removes the alias and creates a data stream with the same name.
+ The indices for the alias become hidden backing indices for the stream.
+ The write index for the alias becomes the write index for the stream.
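+A minimal sketch of the conversion using this async client, assuming a matching data stream template already exists; the alias name and cluster URL are hypothetical:

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def convert_alias() -> None:
    # Convert the alias "my-time-series-data" into a data stream;
    # the alias's indices become hidden backing indices of the stream.
    resp = await client.indices.migrate_to_data_stream(name="my-time-series-data")
    print(resp)
```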
Update data streams. + Performs one or more data stream modification actions in a single atomic operation.
+ `Open a closed index. + For data streams, the API opens any closed backing indices.
+A closed index is blocked for read/write operations and does not allow all operations that opened indices allow. + It is not possible to index documents or to search for documents in a closed index. + This allows closed indices to not have to maintain internal data structures for indexing or searching documents, resulting in a smaller overhead on the cluster.
+When opening or closing an index, the master is responsible for restarting the index shards to reflect the new state of the index. + The shards will then go through the normal recovery process. + The data of opened or closed indices is automatically replicated by the cluster to ensure that enough shard copies are safely kept around at all times.
+You can open and close multiple indices.
+ An error is thrown if the request explicitly refers to a missing index.
+ This behavior can be turned off by using the `ignore_unavailable=true` parameter.
+By default, you must explicitly name the indices you are opening or closing.
+ To open or close indices with `_all`, `*`, or other wildcard expressions, change the `action.destructive_requires_name` setting to `false`.
+ This setting can also be changed with the cluster update settings API.
+Closed indices consume a significant amount of disk space, which can cause problems in managed environments.
+ Closing indices can be turned off with the cluster settings API by setting `cluster.indices.close.enable` to `false`.
+Because opening or closing an index allocates its shards, the `wait_for_active_shards` setting on index creation applies to the `_open` and `_close` index actions as well.
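+A small sketch of closing and reopening an index with this async client (the index name and cluster URL are assumptions):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def close_then_open() -> None:
    # Close an index (blocks reads/writes and frees its internal structures),
    # then reopen it; the shards go through normal recovery on open.
    await client.indices.close(index="my-index-000001")
    resp = await client.indices.open(index="my-index-000001")
    print(resp)  # e.g. {'acknowledged': True, 'shards_acknowledged': True}
```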
Promote a data stream. + Promote a data stream from a replicated data stream managed by cross-cluster replication (CCR) to a regular data stream.
+With CCR auto following, a data stream from a remote cluster can be replicated to the local cluster. + These data streams can't be rolled over in the local cluster. + These replicated data streams roll over only if the upstream data stream rolls over. + In the event that the remote cluster is no longer available, the data stream in the local cluster can be promoted to a regular data stream, which allows these data streams to be rolled over in the local cluster.
+NOTE: When promoting a data stream, ensure the local cluster has a data stream enabled index template that matches the data stream. + If this is missing, the data stream will not be able to roll over until a matching index template is created. + This will affect the lifecycle management of the data stream and interfere with the data stream size and retention.
+ `Create or update an alias. + Adds a data stream or index to an alias.
+ `Update data stream lifecycles. + Update the data stream lifecycle of the specified data streams.
+ `Create or update an index template. + Index templates define settings, mappings, and aliases that can be applied automatically to new indices.
+Elasticsearch applies templates to new indices based on a wildcard pattern that matches the index name. + Index templates are applied during data stream or index creation. + For data streams, these settings and mappings are applied when the stream's backing indices are created. + Settings and mappings specified in a create index API request override any settings or mappings specified in an index template. + Changes to index templates do not affect existing indices, including the existing backing indices of a data stream.
+You can use C-style `/* */` block comments in index templates.
+ You can include comments anywhere in the request body, except before the opening curly bracket.
Multiple matching templates
+If multiple index templates match the name of a new index or data stream, the template with the highest priority is used.
+Multiple templates with overlapping index patterns at the same priority are not allowed and an error will be thrown when attempting to create a template matching an existing index template at identical priorities.
+Composing aliases, mappings, and settings
+When multiple component templates are specified in the `composed_of` field for an index template, they are merged in the order specified, meaning that later component templates override earlier component templates.
+ Any mappings, settings, or aliases from the parent index template are merged in next.
+ Finally, any configuration on the index request itself is merged.
+ Mapping definitions are merged recursively, which means that later mapping components can introduce new field mappings and update the mapping configuration.
+ If a field mapping is already contained in an earlier component, its definition will be completely overwritten by the later one.
+ This recursive merging strategy applies not only to field mappings, but also root options like `dynamic_templates` and `meta`.
+ If an earlier component contains a `dynamic_templates` block, then by default new `dynamic_templates` entries are appended onto the end.
+ If an entry already exists with the same key, then it is overwritten by the new definition.
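+A sketch of creating a composable template with `composed_of` through this async client; the template and component names are hypothetical and the referenced component templates must already exist for the request to succeed:

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def create_template() -> None:
    # Component templates "my-settings" and "my-mappings" are merged in order;
    # the inline `template` section is merged in after them.
    resp = await client.indices.put_index_template(
        name="my-template",
        index_patterns=["my-index-*"],
        composed_of=["my-settings", "my-mappings"],
        priority=500,
        template={"settings": {"number_of_shards": 1}},
    )
    print(resp)
```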
Update field mappings. + Add new fields to an existing data stream or index. + You can also use this API to change the search settings of existing fields and add new properties to existing object fields. + For data streams, these changes are applied to all backing indices by default.
+Add multi-fields to an existing field
+Multi-fields let you index the same field in different ways. + You can use this API to update the fields mapping parameter and enable multi-fields for an existing field. + WARNING: If an index (or data stream) contains documents when you add a multi-field, those documents will not have values for the new multi-field. + You can populate the new multi-field with the update by query API.
+Change supported mapping parameters for an existing field
+The documentation for each mapping parameter indicates whether you can update it for an existing field using this API.
+ For example, you can use the update mapping API to update the `ignore_above` parameter.
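+A sketch of adding a multi-field to an existing field with this async client (the index and field names are illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def add_multi_field() -> None:
    # Add a `city.raw` keyword multi-field to an existing `city` text field.
    # Existing documents won't have values for the new multi-field until
    # they are reindexed or updated (for example, via update by query).
    resp = await client.indices.put_mapping(
        index="my-index-000001",
        properties={
            "city": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            }
        },
    )
    print(resp)
```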
Change the mapping of an existing field
+Except for supported mapping parameters, you can't change the mapping or field type of an existing field. + Changing an existing field could invalidate data that's already indexed.
+If you need to change the mapping of a field in a data stream's backing indices, refer to documentation about modifying data streams. + If you need to change the mapping of a field in other indices, create a new index with the correct mapping and reindex your data into that index.
+Rename a field
+Renaming a field would invalidate data already indexed under the old field name. + Instead, add an alias field to create an alternate field name.
+ `Update index settings. + Changes dynamic index settings in real time. + For data streams, index setting changes are applied to all backing indices by default.
+To revert a setting to the default value, use a null value.
+ The list of per-index settings that can be updated dynamically on live indices can be found in index module documentation.
+ To preserve existing settings from being updated, set the `preserve_existing` parameter to `true`.
NOTE: You can only define new analyzers on closed indices. + To add an analyzer, you must close the index, define the analyzer, and reopen the index. + You cannot close the write index of a data stream. + To update the analyzer for a data stream's write index and future backing indices, update the analyzer in the index template used by the stream. + Then roll over the data stream to apply the new analyzer to the stream's write index and future backing indices. + This affects searches and any new data added to the stream after the rollover. + However, it does not affect the data stream's backing indices or their existing data. + To change the analyzer for existing backing indices, you must create a new data stream and reindex your data into it.
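+A sketch of the close/define/reopen sequence described above, using this async client; the index name and the analyzer definition are illustrative:

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def add_analyzer() -> None:
    # New analyzers can only be defined on a closed index: close it, add the
    # analyzer via the update settings API, then reopen the index.
    await client.indices.close(index="my-index-000001")
    await client.indices.put_settings(
        index="my-index-000001",
        settings={"analysis": {"analyzer": {"my_analyzer": {"type": "whitespace"}}}},
    )
    resp = await client.indices.open(index="my-index-000001")
    print(resp)
```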
+ `Create or update an index template. + Index templates define settings, mappings, and aliases that can be applied automatically to new indices. + Elasticsearch applies templates to new indices based on an index pattern that matches the index name.
+IMPORTANT: This documentation is about legacy index templates, which are deprecated and will be replaced by the composable templates introduced in Elasticsearch 7.8.
+Composable templates always take precedence over legacy templates. + If no composable template matches a new index, matching legacy templates are applied according to their order.
+Index templates are only applied during index creation. + Changes to index templates do not affect existing indices. + Settings and mappings specified in create index API requests override any settings or mappings specified in an index template.
+You can use C-style `/* */` block comments in index templates.
+ You can include comments anywhere in the request body, except before the opening curly bracket.
Indices matching multiple templates
+Multiple index templates can potentially match an index; in this case, both the settings and mappings are merged into the final configuration of the index. + The order of the merging can be controlled using the `order` parameter, with lower orders being applied first and higher orders overriding them. + NOTE: Multiple matching templates with the same order value will result in a non-deterministic merging order.
+ `Get index recovery information. + Get information about ongoing and completed shard recoveries for one or more indices. + For data streams, the API returns information for the stream's backing indices.
+All recoveries, whether ongoing or complete, are kept in the cluster state and may be reported on at any time.
+Shard recovery is the process of initializing a shard copy, such as restoring a primary shard from a snapshot or creating a replica shard from a primary shard. + When a shard recovery completes, the recovered shard is available for search and indexing.
+Recovery automatically occurs during the following processes:
+You can determine the cause of a shard recovery using the recovery or cat recovery APIs.
+The index recovery API reports information about completed recoveries only for shard copies that currently exist in the cluster. + It only reports the last recovery for each shard copy and does not report historical information about earlier recoveries, nor does it report information about the recoveries of shard copies that no longer exist. + This means that if a shard copy completes a recovery and is then relocated onto a different node, the information about the original recovery will not be shown in the recovery API.
+ `Refresh an index. + A refresh makes recent operations performed on one or more indices available for search. + For data streams, the API runs the refresh operation on the stream’s backing indices.
+By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds.
+ You can change this default interval with the `index.refresh_interval` setting.
Refresh requests are synchronous and do not return a response until the refresh operation completes.
+Refreshes are resource-intensive. + To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.
+If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's `refresh=wait_for` query parameter option.
+ This option ensures the indexing operation waits for a periodic refresh before running the search.
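+A small sketch of the index-then-search workflow with `refresh="wait_for"` using this async client (the index name and document are illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def index_then_search() -> None:
    # refresh="wait_for" makes the index call wait for the next periodic
    # refresh, so the document is visible to the search that follows.
    await client.index(
        index="my-index-000001",
        id="1",
        document={"title": "hello"},
        refresh="wait_for",
    )
    resp = await client.search(index="my-index-000001", query={"match": {"title": "hello"}})
    print(resp["hits"]["total"])
```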
Reload search analyzers. + Reload an index's search analyzers and their resources. + For data streams, the API reloads search analyzers and resources for the stream's backing indices.
+IMPORTANT: After reloading the search analyzers you should clear the request cache to make sure it doesn't contain responses derived from the previous versions of the analyzer.
+You can use the reload search analyzers API to pick up changes to synonym files used in the `synonym_graph` or `synonym` token filter of a search analyzer.
+ To be eligible, the token filter must have an `updateable` flag of `true` and only be used in search analyzers.
NOTE: This API does not perform a reload for each shard of an index. + Instead, it performs a reload for each node containing index shards. + As a result, the total shard count returned by the API can differ from the number of index shards. + Because reloading affects every node with an index shard, it is important to update the synonym file on every data node in the cluster, including nodes that don't contain a shard replica, before using this API. + This ensures the synonym file is updated everywhere in the cluster in case shards are relocated in the future.
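+A sketch of reloading search analyzers and then clearing the request cache with this async client (the index name is illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def reload_analyzers() -> None:
    # Pick up an updated synonyms file for search analyzers on this index,
    # then clear the request cache so stale cached responses are dropped.
    resp = await client.indices.reload_search_analyzers(index="my-index-000001")
    await client.indices.clear_cache(index="my-index-000001", request=True)
    print(resp)
```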
+ `Resolve the cluster. + Resolve the specified index expressions to return information about each cluster, including the local cluster, if included. + Multiple patterns and remote clusters are supported.
+This endpoint is useful before doing a cross-cluster search in order to determine which remote clusters should be included in a search.
+You use the same index expression with this endpoint as you would for cross-cluster search. + Index and cluster exclusions are also supported with this endpoint.
+For each cluster in the index expression, the response includes information such as whether the cluster is connected and whether it is configured with `skip_unavailable` as `true` or `false`.
+For example, `GET /_resolve/cluster/my-index-*,cluster*:my-index-*` returns information about the local cluster and all remotely configured clusters that start with the alias `cluster*`.
+ Each cluster returns information about whether it has any indices, aliases or data streams that match `my-index-*`.
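+A sketch using this async client, assuming a client version that exposes `indices.resolve_cluster`; the index expression and the response fields printed are illustrative:

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def check_clusters() -> None:
    # Use the same index expression you would pass to a cross-cluster search.
    resp = await client.indices.resolve_cluster(name="my-index-*,cluster*:my-index-*")
    for cluster, info in resp.items():
        print(cluster, info.get("connected"), info.get("matching_indices"))
```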
Advantages of using this endpoint before a cross-cluster search
+You may want to exclude a cluster or index from a search when:
+ * A remote cluster is not currently connected and is configured with `skip_unavailable=false`. Running a cross-cluster search under those conditions will cause the entire search to fail.
+ * A cluster has no matching indices, aliases or data streams for the index expression. For example, if your index expression is `logs*,remote1:logs*` and the remote1 cluster has no indices, aliases or data streams that match `logs*`, then that cluster will return no results from that cluster if you include it in a cross-cluster search.
+ * The index expression is likely to cause an error when the search runs. In these cases, the error for that cluster in the `_resolve/cluster` response will be present. (This is also where security/permission errors will be shown.)
+ `Resolve indices. + Resolve the names and/or index patterns for indices, aliases, and data streams. + Multiple patterns and remote clusters are supported.
+ `Roll over to a new index. + TIP: It is recommended to use the index lifecycle rollover action to automate rollovers.
+The rollover API creates a new index for a data stream or index alias. + The API behavior depends on the rollover target.
+Roll over a data stream
+If you roll over a data stream, the API creates a new write index for the stream. + The stream's previous write index becomes a regular backing index. + A rollover also increments the data stream's generation.
+Roll over an index alias with a write index
+TIP: Prior to Elasticsearch 7.9, you'd typically use an index alias with a write index to manage time series data. + Data streams replace this functionality, require less maintenance, and automatically integrate with data tiers.
+If an index alias points to multiple indices, one of the indices must be a write index.
+ The rollover API creates a new write index for the alias with is_write_index
set to true
.
+ The API also sets is_write_index
to false
for the previous write index.
Roll over an index alias with one index
+If you roll over an index alias that points to only one index, the API creates a new index for the alias and removes the original index from the alias.
+NOTE: A rollover creates a new index and is subject to the `wait_for_active_shards` setting.
Increment index names for an alias
+When you roll over an index alias, you can specify a name for the new index.
+ If you don't specify a name and the current index ends with `-` and a number, such as `my-index-000001` or `my-index-3`, the new index name increments that number.
+ For example, if you roll over an alias with a current index of `my-index-000001`, the rollover creates a new index named `my-index-000002`.
+ This number is always six characters and zero-padded, regardless of the previous index's name.
+If you use an index alias for time series data, you can use date math in the index name to track the rollover date.
+ For example, you can create an alias that points to an index named `<my-index-{now/d}-000001>`.
+ If you create the index on May 6, 2099, the index's name is `my-index-2099.05.06-000001`.
+ If you roll over the alias on May 7, 2099, the new index's name is `my-index-2099.05.07-000002`.
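+A sketch of rolling over an alias with a condition using this async client (the alias name and condition are illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def roll_over_alias() -> None:
    # Roll the alias over only once its write index holds 1,000 documents
    # or more; omit `conditions` to roll over unconditionally.
    resp = await client.indices.rollover(
        alias="my-alias",
        conditions={"max_docs": 1000},
    )
    print(resp["old_index"], "->", resp["new_index"], resp["rolled_over"])
```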
Get index segments. + Get low-level information about the Lucene segments in index shards. + For data streams, the API returns information about the stream's backing indices.
+ `Get index shard stores. + Get store information about replica shards in one or more indices. + For data streams, the API retrieves store information for the stream's backing indices.
+The index shard stores API returns the following information:
+By default, the API returns store information only for primary shards that are unassigned or have one or more unassigned replica shards.
+ `Shrink an index. + Shrink an index into a new index with fewer primary shards.
+Before you can shrink an index:
+To make shard allocation easier, we recommend you also remove the index's replica shards. + You can later re-add replica shards as part of the shrink operation.
+The requested number of primary shards in the target index must be a factor of the number of shards in the source index. + For example, an index with 8 primary shards can be shrunk into 4, 2, or 1 primary shards, or an index with 15 primary shards can be shrunk into 5, 3, or 1. + If the number of shards in the index is a prime number, it can only be shrunk into a single primary shard. + Before shrinking, a (primary or replica) copy of every shard in the index must be present on the same node.
+The current write index on a data stream cannot be shrunk. In order to shrink the current write index, the data stream must first be rolled over so that a new write index is created and then the previous write index can be shrunk.
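+A sketch of blocking writes and then shrinking with this async client; the index names and shard counts are illustrative, and the prerequisite of relocating a copy of every shard onto one node is not shown:

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def shrink_index() -> None:
    # Block writes on the source index, then shrink it to one primary shard.
    await client.indices.add_block(index="my_source_index", block="write")
    resp = await client.indices.shrink(
        index="my_source_index",
        target="my_target_index",
        settings={"index.number_of_shards": 1, "index.number_of_replicas": 0},
    )
    print(resp)
```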
+A shrink operation:
+ ...the `index.routing.allocation.initial_recovery._id` index setting.
+IMPORTANT: Indices can only be shrunk if they satisfy the following requirements:
+Simulate an index. + Get the index configuration that would be applied to the specified index from an existing index template.
+ `Simulate an index template. + Get the index configuration that would be applied by a particular index template.
+ `Split an index. + Split an index into a new index with more primary shards.
+Before you can split an index:
+The index must be read-only.
+The cluster health status must be green.
+You can make an index read-only with the following request using the add index block API:
+PUT /my_source_index/_block/write
+
+ The current write index on a data stream cannot be split. + In order to split the current write index, the data stream must first be rolled over so that a new write index is created and then the previous write index can be split.
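+A Python sketch of the same write block followed by a split, using this async client (the index names and shard count are illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def split_index() -> None:
    # Make the source index read-only (equivalent to the PUT _block/write
    # request above), then split it into more primary shards.
    await client.indices.add_block(index="my_source_index", block="write")
    resp = await client.indices.split(
        index="my_source_index",
        target="my_target_index",
        settings={"index.number_of_shards": 2},
    )
    print(resp)
```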
+The number of times the index can be split (and the number of shards that each original shard can be split into) is determined by the `index.number_of_routing_shards` setting.
+ The number of routing shards specifies the hashing space that is used internally to distribute documents across shards with consistent hashing.
+ For instance, a 5 shard index with `number_of_routing_shards` set to `30` (5 x 2 x 3) could be split by a factor of 2 or 3.
A split operation:
+IMPORTANT: Indices can only be split if they satisfy the following requirements:
+Get index statistics. + For data streams, the API retrieves statistics for the stream's backing indices.
+By default, the returned statistics are index-level with `primaries` and `total` aggregations.
+ `primaries` are the values for only the primary shards.
+ `total` are the accumulated values for both primary and replica shards.
+To get shard-level statistics, set the `level` parameter to `shards`.
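+A sketch of requesting shard-level statistics with this async client (the index name is illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def shard_level_stats() -> None:
    # level="shards" adds a per-shard breakdown to the usual
    # primaries/total index-level aggregations.
    resp = await client.indices.stats(index="my-index-000001", level="shards")
    print(resp["_all"]["primaries"]["docs"])
```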
NOTE: When moving to another node, the shard-level statistics for a shard are cleared. + Although the shard is no longer part of the node, that node retains any node-level statistics to which the shard contributed.
+ `Unfreeze an index. + When a frozen index is unfrozen, the index goes through the normal recovery process and becomes writeable again.
+ `Create or update an alias. + Adds a data stream or index to an alias.
+ `Validate a query. + Validates a query without running it.
+ `Delete an inference endpoint
+ `Get an inference endpoint
+ `Perform inference on the service
+ `Create an inference endpoint.
+ When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running.
+ After creating the endpoint, wait for the model deployment to complete before using it.
+ To verify the deployment status, use the get trained model statistics API.
+ Look for "state": "fully_allocated"
in the response and ensure that the "allocation_count"
matches the "target_allocation_count"
.
+ Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
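+A sketch of checking deployment status via the trained model statistics API with this async client; the model ID and the exact response layout are assumptions:

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def is_fully_allocated(model_id: str = ".elser_model_2") -> bool:
    # Inspect the allocation status of the model deployment and compare the
    # allocation count against the target allocation count.
    resp = await client.ml.get_trained_models_stats(model_id=model_id)
    deployment = resp["trained_model_stats"][0].get("deployment_stats", {})
    status = deployment.get("allocation_status", {})
    return (
        status.get("state") == "fully_allocated"
        and status.get("allocation_count") == status.get("target_allocation_count")
    )
```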
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. + For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. + However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
+ `Update an inference endpoint.
+Modify `task_settings`, secrets (within `service_settings`), or `num_allocations` for an inference endpoint, depending on the specific endpoint service and `task_type`.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. + For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. + However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
+ `Delete GeoIP database configurations. + Delete one or more IP geolocation database configurations.
+ `Delete IP geolocation database configurations.
+ `Delete pipelines. + Delete one or more ingest pipelines.
+ `Get GeoIP statistics. + Get download statistics for GeoIP2 databases that are used with the GeoIP processor.
+ `Get GeoIP database configurations. + Get information about one or more IP geolocation database configurations.
+ `Get IP geolocation database configurations.
+ `Get pipelines. + Get information about one or more ingest pipelines. + This API returns a local reference of the pipeline.
+ `Run a grok processor. + Extract structured fields out of a single text field within a document. + You must choose which field to extract matched fields from, as well as the grok pattern you expect will match. + A grok pattern is like a regular expression that supports aliased expressions that can be reused.
+ `Create or update a GeoIP database configuration. + Refer to the create or update IP geolocation database configuration API.
+ `Create or update an IP geolocation database configuration.
+ `Create or update a pipeline. + Changes made using this API take effect immediately.
+ `Simulate a pipeline. + Run an ingest pipeline against a set of provided documents. + You can either specify an existing pipeline to use with the provided documents or supply a pipeline definition in the body of the request.
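+A sketch of simulating an inline pipeline definition against a sample document with this async client (the processor and document are illustrative):

```python
from elasticsearch import AsyncElasticsearch

client = AsyncElasticsearch("http://localhost:9200")  # assumed local cluster

async def test_pipeline() -> None:
    # Run an inline pipeline definition against a sample document without
    # creating the pipeline or indexing anything.
    resp = await client.ingest.simulate(
        pipeline={"processors": [{"lowercase": {"field": "message"}}]},
        docs=[{"_source": {"message": "HELLO WORLD"}}],
    )
    print(resp["docs"][0]["doc"]["_source"]["message"])  # "hello world"
```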
+ `