diff --git a/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt b/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt index 9dc315ec68..3685cdee7d 100644 --- a/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt +++ b/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt @@ -26,4 +26,5 @@ Search Relevance plugin Security plugin Security Analytics plugin SQL plugin -Trace Analytics plugin \ No newline at end of file +Trace Analytics plugin +User Behavior Insights \ No newline at end of file diff --git a/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt b/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt index 83e9aee603..9be8da79a9 100644 --- a/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt +++ b/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt @@ -76,7 +76,6 @@ Painless Peer Forwarder Performance Analyzer Piped Processing Language -Point in Time Powershell Python PyTorch diff --git a/.github/workflows/pr_checklist.yml b/.github/workflows/pr_checklist.yml new file mode 100644 index 0000000000..4130f5e2bd --- /dev/null +++ b/.github/workflows/pr_checklist.yml @@ -0,0 +1,43 @@ +name: PR Checklist + +on: + pull_request_target: + types: [opened] + +permissions: + pull-requests: write + +jobs: + add-checklist: + runs-on: ubuntu-latest + + steps: + - name: Comment PR with checklist + uses: peter-evans/create-or-update-comment@v3 + with: + token: ${{ secrets.GITHUB_TOKEN }} + issue-number: ${{ github.event.pull_request.number }} + body: | + Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. + + Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a [maintainer](https://github.com/opensearch-project/documentation-website/blob/main/MAINTAINERS.md). + + **When you're ready for doc review, tag the assignee of this PR**. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review. 
+ + - name: Auto assign PR to repo owner + uses: actions/github-script@v6 + with: + script: | + let assignee = context.payload.pull_request.user.login; + const prOwners = ['Naarcha-AWS', 'kolchfa-aws', 'vagimeli', 'natebower']; + + if (!prOwners.includes(assignee)) { + assignee = 'hdhalter' + } + + github.rest.issues.addAssignees({ + issue_number: context.issue.number, + owner: context.repo.owner, + repo: context.repo.repo, + assignees: [assignee] + }); \ No newline at end of file diff --git a/.gitignore b/.gitignore index ae2249e73f..446d1deda6 100644 --- a/.gitignore +++ b/.gitignore @@ -4,4 +4,5 @@ _site .DS_Store Gemfile.lock .idea +*.iml .jekyll-cache diff --git a/.ruby-version b/.ruby-version new file mode 100644 index 0000000000..4772543317 --- /dev/null +++ b/.ruby-version @@ -0,0 +1 @@ +3.3.2 diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index de44bbe4ee..7afa9d7596 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -100,10 +100,10 @@ Follow these steps to set up your local copy of the repository: #### Troubleshooting -If you encounter an error while trying to build the documentation website, find the error in the following troubleshooting list: +Try the following troubleshooting steps if you encounter an error when trying to build the documentation website: -- When running `rvm install 3.2` if you receive a `Error running '__rvm_make -j10'`, resolve this by running `rvm install 3.2.0 -C --with-openssl-dir=/opt/homebrew/opt/openssl@3.2` instead of `rvm install 3.2`. -- If receive a `bundle install`: `An error occurred while installing posix-spawn (0.3.15), and Bundler cannot continue.` error when trying to run `bundle install`, resolve this by running `gem install posix-spawn -v 0.3.15 -- --with-cflags=\"-Wno-incompatible-function-pointer-types\"`. Then, run `bundle install`. +- If you see the `Error running '__rvm_make -j10'` error when running `rvm install 3.2`, you can resolve it by running `rvm install 3.2.0 -C --with-openssl-dir=/opt/homebrew/opt/openssl@3.2` instead of `rvm install 3.2`. +- If you see the `bundle install`: `An error occurred while installing posix-spawn (0.3.15), and Bundler cannot continue.` error when trying to run `bundle install`, you can resolve it by running `gem install posix-spawn -v 0.3.15 -- --with-cflags=\"-Wno-incompatible-function-pointer-types\"` and then `bundle install`. diff --git a/_about/version-history.md b/_about/version-history.md index 0d6d844951..09f331b235 100644 --- a/_about/version-history.md +++ b/_about/version-history.md @@ -30,6 +30,7 @@ OpenSearch version | Release highlights | Release date [2.0.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.1.md) | Includes bug fixes and maintenance updates for Alerting and Anomaly Detection. | 16 June 2022 [2.0.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.0.md) | Includes document-level monitors for alerting, OpenSearch Notifications plugins, and Geo Map Tiles in OpenSearch Dashboards. Also adds support for Lucene 9 and bug fixes for all OpenSearch plugins. For a full list of release highlights, see the Release Notes. | 26 May 2022 [2.0.0-rc1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.0-rc1.md) | The Release Candidate for 2.0.0. This version allows you to preview the upcoming 2.0.0 release before the GA release. 
The preview release adds document-level alerting, support for Lucene 9, and the ability to use term lookup queries in document level security. | 03 May 2022
+[1.3.18](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.18.md) | Includes maintenance updates for OpenSearch security. | 16 July 2024
[1.3.17](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.17.md) | Includes maintenance updates for OpenSearch security and OpenSearch Dashboards security. | 06 June 2024
[1.3.16](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.16.md) | Includes bug fixes and maintenance updates for OpenSearch security, index management, performance analyzer, and reporting. | 23 April 2024
[1.3.15](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.15.md) | Includes bug fixes and maintenance updates for cross-cluster replication, SQL, OpenSearch Dashboards reporting, and alerting. | 05 March 2024
diff --git a/_aggregations/metric/geocentroid.md b/_aggregations/metric/geocentroid.md
new file mode 100644
index 0000000000..711f49862a
--- /dev/null
+++ b/_aggregations/metric/geocentroid.md
@@ -0,0 +1,256 @@
+---
+layout: default
+title: Geocentroid
+parent: Metric aggregations
+grand_parent: Aggregations
+nav_order: 45
+---
+
+# Geocentroid
+
+The OpenSearch `geo_centroid` aggregation calculates the weighted geographic center, or focal point, of a set of spatial data points. This metric aggregation operates on `geo_point` fields and returns the centroid location as a latitude-longitude pair.
+
+## Using the aggregation
+
+Follow these steps to use the `geo_centroid` aggregation:
+
+**1. Create an index with a `geo_point` field**
+
+First, create an index with a `geo_point` field type. This field stores the geographic coordinates you want to analyze. For example, to create an index called `restaurants` with a `location` field of type `geo_point`, use the following request:
+
+```json
+PUT /restaurants
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "text"
+      },
+      "location": {
+        "type": "geo_point"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+**2. Index documents with spatial data**
+
+Next, index your documents containing the spatial data points you want to analyze. Make sure to include the `geo_point` field with the appropriate latitude-longitude coordinates. For example, index your documents using the following request:
+
+```json
+POST /restaurants/_bulk?refresh
+{"index": {"_id": 1}}
+{"name": "Cafe Delish", "location": "40.7128, -74.0059"}
+{"index": {"_id": 2}}
+{"name": "Tasty Bites", "location": "51.5074, -0.1278"}
+{"index": {"_id": 3}}
+{"name": "Sushi Palace", "location": "48.8566, 2.3522"}
+{"index": {"_id": 4}}
+{"name": "Burger Joint", "location": "34.0522, -118.2437"}
+```
+{% include copy-curl.html %}
+
+**3. Run the `geo_centroid` aggregation**
+
+To calculate the centroid location across all documents, run a search with the `geo_centroid` aggregation on the `geo_point` field. For example, use the following request:
+
+```json
+GET /restaurants/_search
+{
+  "size": 0,
+  "aggs": {
+    "centroid": {
+      "geo_centroid": {
+        "field": "location"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response includes a `centroid` object with `lat` and `lon` properties representing the weighted centroid location of all indexed data points, as shown in the following example:
+
+```json
+"aggregations": {
+  "centroid": {
+    "location": {
+      "lat": 43.78224998130463,
+      "lon": -47.506300045643
+    },
+    "count": 4
+  }
+}
+```
+
+**4. Nest under other aggregations (optional)**
+
+You can also nest the `geo_centroid` aggregation under other bucket aggregations, such as `terms`, to calculate the centroid for subsets of your data. For example, to find the centroid location for each city (assuming your documents also contain a `city` field), use the following request:
+
+```json
+GET /restaurants/_search
+{
+  "size": 0,
+  "aggs": {
+    "cities": {
+      "terms": {
+        "field": "city.keyword"
+      },
+      "aggs": {
+        "centroid": {
+          "geo_centroid": {
+            "field": "location"
+          }
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+This returns a centroid location for each city bucket, allowing you to analyze the geographic center of data points in different cities.
+
+## Using `geo_centroid` with the `geohash_grid` aggregation
+
+The `geohash_grid` aggregation partitions geospatial data into buckets based on geohash prefixes.
+
+When a document contains multiple geopoint values in a field, the `geohash_grid` aggregation assigns the document to multiple buckets, even if one or more of its geopoints are outside the bucket boundaries. This behavior is different from how individual geopoints are treated, where only those within the bucket boundaries are considered.
+
+When you nest the `geo_centroid` aggregation under the `geohash_grid` aggregation, each centroid is calculated using all geopoints in a bucket, including those that may be outside the bucket boundaries. This can result in centroid locations that fall outside the geographic area represented by the bucket.
+
+#### Example
+
+In this example, the `geohash_grid` aggregation with a `precision` of `3` creates buckets based on geohash prefixes of length `3`. Because each document has multiple geopoints, they may be assigned to multiple buckets, even if some of the geopoints fall outside the bucket boundaries.
+
+The `geo_centroid` subaggregation calculates the centroid for each bucket using all geopoints assigned to that bucket, including those outside the bucket boundaries. This means that the resulting centroid locations may not necessarily lie within the geographic area represented by the corresponding geohash bucket.
+
+First, create an index and index documents containing multiple geopoints:
+
+```json
+PUT /locations
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "text"
+      },
+      "coordinates": {
+        "type": "geo_point"
+      }
+    }
+  }
+}
+
+POST /locations/_bulk?refresh
+{"index": {"_id": 1}}
+{"name": "Point A", "coordinates": ["40.7128, -74.0059", "51.5074, -0.1278"]}
+{"index": {"_id": 2}}
+{"name": "Point B", "coordinates": ["48.8566, 2.3522", "34.0522, -118.2437"]}
+```
+
+Then, run `geohash_grid` with the `geo_centroid` subaggregation:
+
+```json
+GET /locations/_search
+{
+  "size": 0,
+  "aggs": {
+    "grid": {
+      "geohash_grid": {
+        "field": "coordinates",
+        "precision": 3
+      },
+      "aggs": {
+        "centroid": {
+          "geo_centroid": {
+            "field": "coordinates"
+          }
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+<details open markdown="block">
+  <summary>
+    Response
+  </summary>
+  {: .text-delta}
+
+```json
+{
+  "took": 26,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 2,
+      "relation": "eq"
+    },
+    "max_score": null,
+    "hits": []
+  },
+  "aggregations": {
+    "grid": {
+      "buckets": [
+        {
+          "key": "u09",
+          "doc_count": 1,
+          "centroid": {
+            "location": {
+              "lat": 41.45439997315407,
+              "lon": -57.945750039070845
+            },
+            "count": 2
+          }
+        },
+        {
+          "key": "gcp",
+          "doc_count": 1,
+          "centroid": {
+            "location": {
+              "lat": 46.11009998945519,
+              "lon": -37.06685005221516
+            },
+            "count": 2
+          }
+        },
+        {
+          "key": "dr5",
+          "doc_count": 1,
+          "centroid": {
+            "location": {
+              "lat": 46.11009998945519,
+              "lon": -37.06685005221516
+            },
+            "count": 2
+          }
+        },
+        {
+          "key": "9q5",
+          "doc_count": 1,
+          "centroid": {
+            "location": {
+              "lat": 41.45439997315407,
+              "lon": -57.945750039070845
+            },
+            "count": 2
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</details>
diff --git a/_aggregations/metric/weighted-avg.md b/_aggregations/metric/weighted-avg.md new file mode 100644 index 0000000000..268f78bfdc --- /dev/null +++ b/_aggregations/metric/weighted-avg.md @@ -0,0 +1,149 @@ +--- +layout: default +title: Weighted average +parent: Metric aggregations +grand_parent: Aggregations +nav_order: 150 +--- + +# Weighted average + +The `weighted_avg` aggregation calculates the weighted average of numeric values across documents. This is useful when you want to calculate an average but weight some data points more heavily than others. + +## Weighted average calculation + +The weighted average is calculated as `(sum of value * weight) / (sum of weights)`. + +## Parameters + +When using the `weighted_avg` aggregation, you must define the following parameters: + +- `value`: The field or script used to obtain the average numeric values +- `weight`: The field or script used to obtain the weight for each value + +Optionally, you can specify the following parameters: + +- `format`: A numeric format to apply to the output value +- `value_type`: A type hint for the values when using scripts or unmapped fields + +For the value or weight, you can specify the following parameters: + +- `field`: The document field to use +- `missing`: A value or weight to use if the field is missing + + +## Using the aggregation + +Follow these steps to use the `weighted_avg` aggregation: + +**1. Create an index and index some documents** + +```json +PUT /products + +POST /products/_doc/1 +{ + "name": "Product A", + "rating": 4, + "num_reviews": 100 +} + +POST /products/_doc/2 +{ + "name": "Product B", + "rating": 5, + "num_reviews": 20 +} + +POST /products/_doc/3 +{ + "name": "Product C", + "rating": 3, + "num_reviews": 50 +} +``` +{% include copy-curl.html %} + +**2. Run the `weighted_avg` aggregation** + +```json +GET /products/_search +{ + "size": 0, + "aggs": { + "weighted_rating": { + "weighted_avg": { + "value": { + "field": "rating" + }, + "weight": { + "field": "num_reviews" + } + } + } + } +} +``` +{% include copy-curl.html %} + +## Handling missing values + +The `missing` parameter allows you to specify default values for documents missing the `value` field or the `weight` field instead of excluding them from the calculation. + +The following is an example of this behavior. First, create an index and add sample documents. 
This example includes five documents with different combinations of missing values for the `rating` and `num_reviews` fields: + +```json +PUT /products +{ + "mappings": { + "properties": { + "name": { + "type": "text" + }, + "rating": { + "type": "double" + }, + "num_reviews": { + "type": "integer" + } + } + } +} + +POST /_bulk +{ "index": { "_index": "products" } } +{ "name": "Product A", "rating": 4.5, "num_reviews": 100 } +{ "index": { "_index": "products" } } +{ "name": "Product B", "rating": 3.8, "num_reviews": 50 } +{ "index": { "_index": "products" } } +{ "name": "Product C", "rating": null, "num_reviews": 20 } +{ "index": { "_index": "products" } } +{ "name": "Product D", "rating": 4.2, "num_reviews": null } +{ "index": { "_index": "products" } } +{ "name": "Product E", "rating": null, "num_reviews": null } +``` +{% include copy-curl.html %} + +Next, run the following `weighted_avg` aggregation: + +```json +GET /products/_search +{ + "size": 0, + "aggs": { + "weighted_rating": { + "weighted_avg": { + "value": { + "field": "rating" + }, + "weight": { + "field": "num_reviews" + } + } + } + } +} +``` +{% include copy-curl.html %} + +In the response, you can see that the missing values for `Product E` were completely ignored in the calculation. diff --git a/_api-reference/cat/cat-aliases.md b/_api-reference/cat/cat-aliases.md index 9e4407dced..b0c2d7184e 100644 --- a/_api-reference/cat/cat-aliases.md +++ b/_api-reference/cat/cat-aliases.md @@ -52,7 +52,7 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the master node. Default is false. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. expand_wildcards | Enum | Expands wildcard expressions to concrete indexes. Combine multiple values with commas. Supported values are `all`, `open`, `closed`, `hidden`, and `none`. Default is `open`. ## Response diff --git a/_api-reference/cat/cat-allocation.md b/_api-reference/cat/cat-allocation.md index 9598c8f3b5..23ebed79ff 100644 --- a/_api-reference/cat/cat-allocation.md +++ b/_api-reference/cat/cat-allocation.md @@ -51,8 +51,8 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- bytes | Byte size | Specify the units for byte size. For example, `7kb` or `6gb`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. -cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster_manager node. Default is 30 seconds. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. ## Response diff --git a/_api-reference/cat/cat-health.md b/_api-reference/cat/cat-health.md index 6077c77e43..7767cfbc46 100644 --- a/_api-reference/cat/cat-health.md +++ b/_api-reference/cat/cat-health.md @@ -36,7 +36,7 @@ All CAT health URL parameters are optional. Parameter | Type | Description :--- | :--- | :--- time | Time | Specify the units for time. For example, `5d` or `7h`. 
For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). -ts | Boolean | If true, returns HH:MM:SS and Unix epoch timestamps. Default is true. +ts | Boolean | If true, returns HH:MM:SS and Unix epoch timestamps. Default is `true`. ## Response diff --git a/_api-reference/cat/cat-indices.md b/_api-reference/cat/cat-indices.md index 3a21e900ff..fe9556899e 100644 --- a/_api-reference/cat/cat-indices.md +++ b/_api-reference/cat/cat-indices.md @@ -52,9 +52,9 @@ Parameter | Type | Description :--- | :--- | :--- bytes | Byte size | Specify the units for byte size. For example, `7kb` or `6gb`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). health | String | Limit indexes based on their health status. Supported values are `green`, `yellow`, and `red`. -include_unloaded_segments | Boolean | Whether to include information from segments not loaded into memory. Default is false. +include_unloaded_segments | Boolean | Whether to include information from segments not loaded into memory. Default is `false`. cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. -pri | Boolean | Whether to return information only from the primary shards. Default is false. +pri | Boolean | Whether to return information only from the primary shards. Default is `false`. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). expand_wildcards | Enum | Expands wildcard expressions to concrete indexes. Combine multiple values with commas. Supported values are `all`, `open`, `closed`, `hidden`, and `none`. Default is `open`. diff --git a/_api-reference/cat/cat-nodeattrs.md b/_api-reference/cat/cat-nodeattrs.md index 95c1e50afc..6b4cc6d92e 100644 --- a/_api-reference/cat/cat-nodeattrs.md +++ b/_api-reference/cat/cat-nodeattrs.md @@ -35,7 +35,7 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. diff --git a/_api-reference/cat/cat-nodes.md b/_api-reference/cat/cat-nodes.md index 149e590536..864e5dfdd5 100644 --- a/_api-reference/cat/cat-nodes.md +++ b/_api-reference/cat/cat-nodes.md @@ -39,10 +39,9 @@ Parameter | Type | Description :--- | :--- | :--- bytes | Byte size | Specify the units for byte size. For example, `7kb` or `6gb`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). full_id | Boolean | If true, return the full node ID. If false, return the shortened node ID. Defaults to false. -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). 
-include_unloaded_segments | Boolean | Whether to include information from segments not loaded into memory. Default is false. +include_unloaded_segments | Boolean | Whether to include information from segments not loaded into memory. Default is `false`. ## Response diff --git a/_api-reference/cat/cat-pending-tasks.md b/_api-reference/cat/cat-pending-tasks.md index c8e1b744e8..748defd06e 100644 --- a/_api-reference/cat/cat-pending-tasks.md +++ b/_api-reference/cat/cat-pending-tasks.md @@ -36,7 +36,7 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). diff --git a/_api-reference/cat/cat-plugins.md b/_api-reference/cat/cat-plugins.md index 3498462236..519c77f27f 100644 --- a/_api-reference/cat/cat-plugins.md +++ b/_api-reference/cat/cat-plugins.md @@ -36,8 +36,8 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. -cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster_manager node. Default is 30 seconds. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. ## Response diff --git a/_api-reference/cat/cat-recovery.md b/_api-reference/cat/cat-recovery.md index 54abac6d99..da66aa7272 100644 --- a/_api-reference/cat/cat-recovery.md +++ b/_api-reference/cat/cat-recovery.md @@ -50,9 +50,9 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -active_only | Boolean | Whether to only include ongoing shard recoveries. Default is false. +active_only | Boolean | Whether to only include ongoing shard recoveries. Default is `false`. bytes | Byte size | Specify the units for byte size. For example, `7kb` or `6gb`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). -detailed | Boolean | Whether to include detailed information about shard recoveries. Default is false. +detailed | Boolean | Whether to include detailed information about shard recoveries. Default is `false`. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). 
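
For example, the following request limits the output to ongoing recoveries, includes detailed information about each recovery, and reports times in seconds. All of these parameters are described in the preceding table; the combination shown here is only illustrative:

```json
GET _cat/recovery?v&active_only=true&detailed=true&time=s
```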
## Response diff --git a/_api-reference/cat/cat-repositories.md b/_api-reference/cat/cat-repositories.md index 94f39b9d15..c6d62c9c62 100644 --- a/_api-reference/cat/cat-repositories.md +++ b/_api-reference/cat/cat-repositories.md @@ -36,8 +36,8 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. -cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster_manager node. Default is 30 seconds. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. ## Response diff --git a/_api-reference/cat/cat-shards.md b/_api-reference/cat/cat-shards.md index e74667b5ac..9a727b5b11 100644 --- a/_api-reference/cat/cat-shards.md +++ b/_api-reference/cat/cat-shards.md @@ -51,8 +51,8 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- bytes | Byte size | Specify the units for byte size. For example, `7kb` or `6gb`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. -cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster_manager node. Default is 30 seconds. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). diff --git a/_api-reference/cat/cat-templates.md b/_api-reference/cat/cat-templates.md index d2aed7b0b8..d7c7aac90f 100644 --- a/_api-reference/cat/cat-templates.md +++ b/_api-reference/cat/cat-templates.md @@ -44,7 +44,7 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is false. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. diff --git a/_api-reference/cat/cat-thread-pool.md b/_api-reference/cat/cat-thread-pool.md index 5d3e341b74..491b523092 100644 --- a/_api-reference/cat/cat-thread-pool.md +++ b/_api-reference/cat/cat-thread-pool.md @@ -49,8 +49,8 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/api-refe Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. -cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster_manager node. Default is 30 seconds. 
+local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. ## Response diff --git a/_api-reference/cat/index.md b/_api-reference/cat/index.md index 0ddaf1e0a7..7454a4cf39 100644 --- a/_api-reference/cat/index.md +++ b/_api-reference/cat/index.md @@ -24,17 +24,54 @@ GET _cat ``` {% include copy-curl.html %} +The response is an ASCII cat (`=^.^=`) and a list of operations: + +``` +=^.^= +/_cat/allocation +/_cat/segment_replication +/_cat/segment_replication/{index} +/_cat/shards +/_cat/shards/{index} +/_cat/cluster_manager +/_cat/nodes +/_cat/tasks +/_cat/indices +/_cat/indices/{index} +/_cat/segments +/_cat/segments/{index} +/_cat/count +/_cat/count/{index} +/_cat/recovery +/_cat/recovery/{index} +/_cat/health +/_cat/pending_tasks +/_cat/aliases +/_cat/aliases/{alias} +/_cat/thread_pool +/_cat/thread_pool/{thread_pools} +/_cat/plugins +/_cat/fielddata +/_cat/fielddata/{fields} +/_cat/nodeattrs +/_cat/repositories +/_cat/snapshots/{repository} +/_cat/templates +/_cat/pit_segments +/_cat/pit_segments/{pit_id} +``` + ## Optional query parameters -You can use the following query parameters with any CAT API to filter your results. +The root `_cat` API does not take any parameters, but individual APIs, such as `/_cat/nodes` accept the following query parameters. Parameter | Description :--- | :--- | `v` | Provides verbose output by adding headers to the columns. It also adds some formatting to help align each of the columns together. All examples in this section include the `v` parameter. `help` | Lists the default and other available headers for a given operation. `h` | Limits the output to specific headers. -`format` | Returns the result in JSON, YAML, or CBOR formats. -`sort` | Sorts the output by the specified columns. +`format` | The format in which to return the result. Valid values are `json`, `yaml`, `cbor`, and `smile`. +`s` | Sorts the output by the specified columns. 
### Query parameter usage examples @@ -59,7 +96,6 @@ sample-alias1 sample-index-1 - - - - Without the verbose parameter, `v`, the response simply returns the alias names: ``` - .kibana .kibana_1 - - - - sample-alias1 sample-index-1 - - - - ``` @@ -72,6 +108,24 @@ To see all the available headers, use the `help` parameter: GET _cat/?help ``` +For example, to see the available headers for the CAT aliases operation, send the following request: + +```json +GET _cat/aliases?help +``` +{% include copy-curl.html %} + +The response contains the available headers: + +``` +alias | a | alias name +index | i,idx | index alias points to +filter | f,fi | filter +routing.index | ri,routingIndex | index routing +routing.search | rs,routingSearch | search routing +is_write_index | w,isWriteIndex | write index +``` + ### Get a subset of headers To limit the output to a subset of headers, use the `h` parameter: @@ -80,7 +134,71 @@ To limit the output to a subset of headers, use the `h` parameter: GET _cat/?h=,&v ``` +For example, to limit aliases to only the alias name and index, send the following request: + +```json +GET _cat/aliases?h=alias,index +``` +{% include copy-curl.html %} + +The response contains the requested information: + +``` +.kibana .kibana_1 +sample-alias1 sample-index-1 +``` + Typically, for any operation you can find out what headers are available using the `help` parameter, and then use the `h` parameter to limit the output to only the headers that you care about. +### Sort by a header + +To sort the output by a header, use the `s` parameter: + +```json +GET _cat/?s=, +``` + +For example, to sort aliases by alias and then index, send the following request: + +```json +GET _cat/aliases?s=i,a +``` +{% include copy-curl.html %} + +The response contains the requested information: + +``` +sample-alias2 sample-index-1 +sample-alias1 sample-index-2 +``` + +### Retrieve data in JSON format + +By default, CAT APIs return data in `text/plain` format. + +To retrieve data in JSON format, use the `format=json` parameter: + +```json +GET _cat/?format=json +``` + +For example, to retrieve aliases in JSON format, send the following request: + +```json +GET _cat/aliases?format=json +``` +{% include copy-curl.html %} + +The response contains data in JSON format: + +```json +[ + {"alias":".kibana","index":".kibana_1","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"}, + {"alias":"sample-alias-1","index":"sample-index-1","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"} +] +``` + +Other supported formats are [YAML](https://yaml.org/), [CBOR](https://cbor.io/), and [Smile](https://github.com/FasterXML/smile-format-specification). + If you use the Security plugin, make sure you have the appropriate permissions. {: .note } diff --git a/_api-reference/cluster-api/cluster-allocation.md b/_api-reference/cluster-api/cluster-allocation.md index da6e3aab05..b1b1c266d6 100644 --- a/_api-reference/cluster-api/cluster-allocation.md +++ b/_api-reference/cluster-api/cluster-allocation.md @@ -43,8 +43,8 @@ All cluster allocation explain parameters are optional. Parameter | Type | Description :--- | :--- | :--- -include_yes_decisions | Boolean | OpenSearch makes a series of yes or no decisions when trying to allocate a shard to a node. If this parameter is true, OpenSearch includes the (generally more numerous) "yes" decisions in its response. Default is false. -include_disk_info | Boolean | Whether to include information about disk usage in the response. 
Default is false. +include_yes_decisions | Boolean | OpenSearch makes a series of yes or no decisions when trying to allocate a shard to a node. If this parameter is true, OpenSearch includes the (generally more numerous) "yes" decisions in its response. Default is `false`. +include_disk_info | Boolean | Whether to include information about disk usage in the response. Default is `false`. ## Request body diff --git a/_api-reference/cluster-api/cluster-health.md b/_api-reference/cluster-api/cluster-health.md index e9e2bb0e47..73c83d5ee6 100644 --- a/_api-reference/cluster-api/cluster-health.md +++ b/_api-reference/cluster-api/cluster-health.md @@ -44,14 +44,14 @@ Parameter | Type | Description expand_wildcards | Enum | Expands wildcard expressions to concrete indexes. Combine multiple values with commas. Supported values are `all`, `open`, `closed`, `hidden`, and `none`. Default is `open`. level | Enum | The level of detail for returned health information. Supported values are `cluster`, `indices`, `shards`, and `awareness_attributes`. Default is `cluster`. awareness_attribute | String | The name of the awareness attribute, for which to return cluster health (for example, `zone`). Applicable only if `level` is set to `awareness_attributes`. -local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is false. +local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is `false`. cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. timeout | Time | The amount of time to wait for a response. If the timeout expires, the request fails. Default is 30 seconds. wait_for_active_shards | String | Wait until the specified number of shards is active before returning a response. `all` for all shards. Default is `0`. wait_for_nodes | String | Wait for N number of nodes. Use `12` for exact match, `>12` and `<12` for range. wait_for_events | Enum | Wait until all currently queued events with the given priority are processed. Supported values are `immediate`, `urgent`, `high`, `normal`, `low`, and `languid`. -wait_for_no_relocating_shards | Boolean | Whether to wait until there are no relocating shards in the cluster. Default is false. -wait_for_no_initializing_shards | Boolean | Whether to wait until there are no initializing shards in the cluster. Default is false. +wait_for_no_relocating_shards | Boolean | Whether to wait until there are no relocating shards in the cluster. Default is `false`. +wait_for_no_initializing_shards | Boolean | Whether to wait until there are no initializing shards in the cluster. Default is `false`. wait_for_status | Enum | Wait until the cluster health reaches the specified status or better. Supported values are `green`, `yellow`, and `red`. weights | JSON object | Assigns weights to attributes within the request body of the PUT request. Weights can be set in any ration, for example, 2:3:5. In a 2:3:5 ratio with three zones, for every 100 requests sent to the cluster, each zone would receive either 20, 30, or 50 search requests in a random order. When assigned a weight of `0`, the zone does not receive any search traffic. 
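
For example, the following request returns health information broken down by index and waits up to 30 seconds for the cluster to reach at least `yellow` status. The parameter values shown are illustrative:

```json
GET _cluster/health?level=indices&wait_for_status=yellow&timeout=30s
```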
diff --git a/_api-reference/common-parameters.md b/_api-reference/common-parameters.md index 347d38a0de..5b536ad992 100644 --- a/_api-reference/common-parameters.md +++ b/_api-reference/common-parameters.md @@ -90,3 +90,37 @@ The following request specifies filters to limit the fields returned in the resp GET _search?filter_path=.*,- ``` + +## Units + +OpenSearch APIs support the following units. + +### Time units + +The following table lists all supported time units. + +Units | Specify as +:--- | :--- +Days | `d` +Hours | `h` +Minutes | `m` +Seconds | `s` +Milliseconds | `ms` +Microseconds | `micros` +Nanoseconds | `nanos` + +### Distance units + +The following table lists all supported distance units. + +Units | Specify as +:--- | :--- +Miles | `mi` or `miles` +Yards | `yd` or `yards` +Feet | `ft` or `feet` +Inches | `in` or `inch` +Kilometers | `km` or `kilometers` +Meters | `m` or `meters` +Centimeters | `cm` or `centimeters` +Millimeters | `mm` or `millimeters` +Nautical miles | `NM`, `nmi`, or `nauticalmiles` \ No newline at end of file diff --git a/_api-reference/count.md b/_api-reference/count.md index 3e777a413e..2ac336eeb0 100644 --- a/_api-reference/count.md +++ b/_api-reference/count.md @@ -79,14 +79,14 @@ All count parameters are optional. Parameter | Type | Description :--- | :--- | :--- -`allow_no_indices` | Boolean | If false, the request returns an error if any wildcard expression or index alias targets any closed or missing indexes. Default is false. +`allow_no_indices` | Boolean | If false, the request returns an error if any wildcard expression or index alias targets any closed or missing indexes. Default is `false`. `analyzer` | String | The analyzer to use in the query string. -`analyze_wildcard` | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is false. -`default_operator` | String | Indicates whether the default operator for a string query should be AND or OR. Default is OR. +`analyze_wildcard` | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is `false`. +`default_operator` | String | Indicates whether the default operator for a string query should be `AND` or `OR`. Default is `OR`. `df` | String | The default field in case a field prefix is not provided in the query string. `expand_wildcards` | String | Specifies the type of index that wildcard expressions can match. Supports comma-separated values. Valid values are `all` (match any index), `open` (match open, non-hidden indexes), `closed` (match closed, non-hidden indexes), `hidden` (match hidden indexes), and `none` (deny wildcard expressions). Default is `open`. -`ignore_unavailable` | Boolean | Specifies whether to include missing or closed indexes in the response. Default is false. -`lenient` | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is false. +`ignore_unavailable` | Boolean | Specifies whether to include missing or closed indexes in the response. Default is `false`. +`lenient` | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is `false`. `min_score` | Float | Include only documents with a minimum `_score` value in the result. `routing` | String | Value used to route the operation to a specific shard. `preference` | String | Specifies which shard or node OpenSearch should perform the count operation on. 
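
For example, the following request counts only the documents matching a query and ignores missing or closed indexes. The index name, field, and value are placeholders:

```json
GET my-index/_count?ignore_unavailable=true
{
  "query": {
    "term": {
      "status": "published"
    }
  }
}
```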
diff --git a/_api-reference/document-apis/delete-by-query.md b/_api-reference/document-apis/delete-by-query.md index ca90ea3484..6f4104c254 100644 --- a/_api-reference/document-apis/delete-by-query.md +++ b/_api-reference/document-apis/delete-by-query.md @@ -42,14 +42,14 @@ Parameter | Type | Description <index> | String | Name or list of the data streams, indexes, or aliases to delete from. Supports wildcards. If left blank, OpenSearch searches all indexes. allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is `true`. analyzer | String | The analyzer to use in the query string. -analyze_wildcard | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is false. +analyze_wildcard | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is `false`. conflicts | String | Indicates to OpenSearch what should happen if the delete by query operation runs into a version conflict. Valid options are `abort` and `proceed`. Default is `abort`. -default_operator | String | Indicates whether the default operator for a string query should be AND or OR. Default is OR. +default_operator | String | Indicates whether the default operator for a string query should be `AND` or `OR`. Default is `OR`. df | String | The default field in case a field prefix is not provided in the query string. expand_wildcards | String | Specifies the type of index that wildcard expressions can match. Supports comma-separated values. Valid values are `all` (match any index), `open` (match open, non-hidden indexes), `closed` (match closed, non-hidden indexes), `hidden` (match hidden indexes), and `none` (deny wildcard expressions). Default is `open`. from | Integer | The starting index to search from. Default is 0. ignore_unavailable | Boolean | Specifies whether to include missing or closed indexes in the response and ignores unavailable shards during the search request. Default is `false`. -lenient | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is false. +lenient | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is `false`. max_docs | Integer | How many documents the delete by query operation should process at most. Default is all documents. preference | String | Specifies which shard or node OpenSearch should perform the delete by query operation on. q | String | Lucene query string's query. diff --git a/_api-reference/document-apis/get-documents.md b/_api-reference/document-apis/get-documents.md index d5c2e52d93..3eaeb507d4 100644 --- a/_api-reference/document-apis/get-documents.md +++ b/_api-reference/document-apis/get-documents.md @@ -38,11 +38,11 @@ All get document URL parameters are optional. Parameter | Type | Description :--- | :--- | :--- preference | String | Specifies a preference of which shard to retrieve results from. Available options are `_local`, which tells the operation to retrieve results from a locally allocated shard replica, and a custom string value assigned to a specific shard replica. By default, OpenSearch executes get document operations on random shards. -realtime | Boolean | Specifies whether the operation should run in realtime. If false, the operation waits for the index to refresh to analyze the source to retrieve data, which makes the operation near-realtime. Default is true. 
+realtime | Boolean | Specifies whether the operation should run in realtime. If false, the operation waits for the index to refresh to analyze the source to retrieve data, which makes the operation near-realtime. Default is `true`. refresh | Boolean | If true, OpenSearch refreshes shards to make the get operation available to search results. Valid options are `true`, `false`, and `wait_for`, which tells OpenSearch to wait for a refresh before executing the operation. Default is `false`. routing | String | A value used to route the operation to a specific shard. -stored_fields | Boolean | Whether the get operation should retrieve fields stored in the index. Default is false. -_source | String | Whether to include the `_source` field in the response body. Default is true. +stored_fields | Boolean | Whether the get operation should retrieve fields stored in the index. Default is `false`. +_source | String | Whether to include the `_source` field in the response body. Default is `true`. _source_excludes | String | A comma-separated list of source fields to exclude in the query response. _source_includes | String | A comma-separated list of source fields to include in the query response. version | Integer | The version of the document to return, which must match the current version of the document. diff --git a/_api-reference/document-apis/index-document.md b/_api-reference/document-apis/index-document.md index 3460fc1d50..d131a2f50e 100644 --- a/_api-reference/document-apis/index-document.md +++ b/_api-reference/document-apis/index-document.md @@ -93,12 +93,12 @@ if_primary_term | Integer | Only perform the index operation if the document has op_type | Enum | Specifies the type of operation to complete with the document. Valid values are `create` (index a document only if it doesn't exist) and `index`. If a document ID is included in the request, then the default is `index`. Otherwise, the default is `create`. | No pipeline | String | Route the index operation to a certain pipeline. | No routing | String | value used to assign the index operation to a specific shard. | No -refresh | Enum | If true, OpenSearch refreshes shards to make the operation visible to searching. Valid options are `true`, `false`, and `wait_for`, which tells OpenSearch to wait for a refresh before executing the operation. Default is false. | No +refresh | Enum | If true, OpenSearch refreshes shards to make the operation visible to searching. Valid options are `true`, `false`, and `wait_for`, which tells OpenSearch to wait for a refresh before executing the operation. Default is `false`. | No timeout | Time | How long to wait for a response from the cluster. Default is `1m`. | No version | Integer | The document's version number. | No version_type | Enum | Assigns a specific type to the document. Valid options are `external` (retrieve the document if the specified version number is greater than the document's current version) and `external_gte` (retrieve the document if the specified version number is greater than or equal to the document's current version). For example, to index version 3 of a document, use `/_doc/1?version=3&version_type=external`. | No wait_for_active_shards | String | The number of active shards that must be available before OpenSearch processes the request. Default is 1 (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. 
For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the operation to succeed. | No -require_alias | Boolean | Specifies whether the target index must be an index alias. Default is false. | No +require_alias | Boolean | Specifies whether the target index must be an index alias. Default is `false`. | No ## Request body diff --git a/_api-reference/document-apis/multi-get.md b/_api-reference/document-apis/multi-get.md index 16e9ceeb95..2d3246fa58 100644 --- a/_api-reference/document-apis/multi-get.md +++ b/_api-reference/document-apis/multi-get.md @@ -29,7 +29,7 @@ All multi-get URL parameters are optional. Parameter | Type | Description :--- | :--- | :--- | :--- <index> | String | Name of the index to retrieve documents from. -preference | String | Specifies the nodes or shards OpenSearch should execute the multi-get operation on. Default is random. +preference | String | Specifies the nodes or shards OpenSearch should execute the multi-get operation on. Default is `random`. realtime | Boolean | Specifies whether the operation should run in realtime. If false, the operation waits for the index to refresh to analyze the source to retrieve data, which makes the operation near-realtime. Default is `true`. refresh | Boolean | If true, OpenSearch refreshes shards to make the multi-get operation available to search results. Valid options are `true`, `false`, and `wait_for`, which tells OpenSearch to wait for a refresh before executing the operation. Default is `false`. routing | String | Value used to route the multi-get operation to a specific shard. diff --git a/_api-reference/document-apis/reindex.md b/_api-reference/document-apis/reindex.md index 2bc3646e68..48f14923f5 100644 --- a/_api-reference/document-apis/reindex.md +++ b/_api-reference/document-apis/reindex.md @@ -46,7 +46,7 @@ timeout | Time | How long to wait for a response from the cluster. Default is `3 wait_for_active_shards | String | The number of active shards that must be available before OpenSearch processes the reindex request. Default is 1 (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the operation to succeed. wait_for_completion | Boolean | Waits for the matching tasks to complete. Default is `false`. requests_per_second | Integer | Specifies the request’s throttling in sub-requests per second. Default is -1, which means no throttling. -require_alias | Boolean | Whether the destination index must be an index alias. Default is false. +require_alias | Boolean | Whether the destination index must be an index alias. Default is `false`. scroll | Time | How long to keep the search context open. Default is `5m`. slices | Integer | Number of sub-tasks OpenSearch should divide this task into. Default is 1, which means OpenSearch should not divide this task. Setting this parameter to `auto` indicates to OpenSearch that it should automatically decide how many slices to split the task into. max_docs | Integer | How many documents the update by query operation should process at most. Default is all documents. @@ -70,7 +70,7 @@ socket_timeout | The wait time for socket reads. Default is 30s. connect_timeout | The wait time for remote connection timeouts. Default is 30s. size | The number of documents to reindex. 
slice | Whether to manually or automatically slice the reindex operation so it executes in parallel. Setting this field to `auto` allows OpenSearch to control the number of slices to use, which is one slice per shard, up to a maximum of 20. If there are multiple sources, the number of slices used are based on the index or backing index with the smallest number of shards. -_source | Whether to reindex source fields. Specify a list of fields to reindex or true to reindex all fields. Default is true. +_source | Whether to reindex source fields. Specify a list of fields to reindex or true to reindex all fields. Default is `true`. id | The ID to associate with manual slicing. max | Maximum number of slices. dest | Information about the destination index. Valid values are `index`, `version_type`, `op_type`, and `pipeline`. diff --git a/_api-reference/document-apis/update-by-query.md b/_api-reference/document-apis/update-by-query.md index 4cd686dcb4..217ae69550 100644 --- a/_api-reference/document-apis/update-by-query.md +++ b/_api-reference/document-apis/update-by-query.md @@ -49,14 +49,14 @@ Parameter | Type | Description <index> | String | Comma-separated list of indexes to update. To update all indexes, use * or omit this parameter. allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is `true`. analyzer | String | Analyzer to use in the query string. -analyze_wildcard | Boolean | Whether the update operation should include wildcard and prefix queries in the analysis. Default is false. +analyze_wildcard | Boolean | Whether the update operation should include wildcard and prefix queries in the analysis. Default is `false`. conflicts | String | Indicates to OpenSearch what should happen if the update by query operation runs into a version conflict. Valid options are `abort` and `proceed`. Default is `abort`. default_operator | String | Indicates whether the default operator for a string query should be `AND` or `OR`. Default is `OR`. df | String | The default field if a field prefix is not provided in the query string. expand_wildcards | String | Specifies the type of index that wildcard expressions can match. Supports comma-separated values. Valid values are `all` (match any index), `open` (match open, non-hidden indexes), `closed` (match closed, non-hidden indexes), `hidden` (match hidden indexes), and `none` (deny wildcard expressions). Default is `open`. from | Integer | The starting index to search from. Default is 0. ignore_unavailable | Boolean | Whether to exclude missing or closed indexes in the response and ignores unavailable shards during the search request. Default is `false`. -lenient | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is false. +lenient | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is `false`. max_docs | Integer | How many documents the update by query operation should process at most. Default is all documents. pipeline | String | ID of the pipeline to use to process documents. preference | String | Specifies which shard or node OpenSearch should perform the update by query operation on. 
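
For example, the following request updates every document matching a query and proceeds past any version conflicts instead of aborting. The index name, field, and script values are placeholders:

```json
POST my-index/_update_by_query?conflicts=proceed
{
  "query": {
    "term": {
      "status": "stale"
    }
  },
  "script": {
    "source": "ctx._source.status = 'archived'",
    "lang": "painless"
  }
}
```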
diff --git a/_api-reference/document-apis/update-document.md b/_api-reference/document-apis/update-document.md index 365cb3aa73..3da7030fa5 100644 --- a/_api-reference/document-apis/update-document.md +++ b/_api-reference/document-apis/update-document.md @@ -53,7 +53,7 @@ Parameter | Type | Description | Required if_seq_no | Integer | Only perform the update operation if the document has the specified sequence number. | No if_primary_term | Integer | Perform the update operation if the document has the specified primary term. | No lang | String | Language of the script. Default is `painless`. | No -require_alias | Boolean | Specifies whether the destination must be an index alias. Default is false. | No +require_alias | Boolean | Specifies whether the destination must be an index alias. Default is `false`. | No refresh | Enum | If true, OpenSearch refreshes shards to make the operation visible to searching. Valid options are `true`, `false`, and `wait_for`, which tells OpenSearch to wait for a refresh before executing the operation. Default is `false`. | No retry_on_conflict | Integer | The amount of times OpenSearch should retry the operation if there's a document conflict. Default is 0. | No routing | String | Value to route the update operation to a specific shard. | No diff --git a/_api-reference/explain.md b/_api-reference/explain.md index 57b7d9fada..8c2b757945 100644 --- a/_api-reference/explain.md +++ b/_api-reference/explain.md @@ -64,15 +64,15 @@ Parameter | Type | Description | Required `` | String | Name of the index. You can only specify a single index. | Yes `<_id>` | String | A unique identifier to attach to the document. | Yes `analyzer` | String | The analyzer to use in the query string. | No -`analyze_wildcard` | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is false. | No +`analyze_wildcard` | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is `false`. | No `default_operator` | String | Indicates whether the default operator for a string query should be AND or OR. Default is OR. | No `df` | String | The default field in case a field prefix is not provided in the query string. | No -`lenient` | Boolean | Specifies whether OpenSearch should ignore format-based query failures (for example, querying a text field for an integer). Default is false. | No +`lenient` | Boolean | Specifies whether OpenSearch should ignore format-based query failures (for example, querying a text field for an integer). Default is `false`. | No `preference` | String | Specifies a preference of which shard to retrieve results from. Available options are `_local`, which tells the operation to retrieve results from a locally allocated shard replica, and a custom string value assigned to a specific shard replica. By default, OpenSearch executes the explain operation on random shards. | No `q` | String | Query in the Lucene query string syntax. | No -`stored_fields` | Boolean | If true, the operation retrieves document fields stored in the index rather than the document’s `_source`. Default is false. | No +`stored_fields` | Boolean | If true, the operation retrieves document fields stored in the index rather than the document’s `_source`. Default is `false`. | No `routing` | String | Value used to route the operation to a specific shard. | No -`_source` | String | Whether to include the `_source` field in the response body. Default is true. | No +`_source` | String | Whether to include the `_source` field in the response body. Default is `true`. 
| No `_source_excludes` | String | A comma-separated list of source fields to exclude in the query response. | No `_source_includes` | String | A comma-separated list of source fields to include in the query response. | No diff --git a/_api-reference/index-apis/close-index.md b/_api-reference/index-apis/close-index.md index e8d2e3e1e2..7e43198d37 100644 --- a/_api-reference/index-apis/close-index.md +++ b/_api-reference/index-apis/close-index.md @@ -33,9 +33,9 @@ All parameters are optional. Parameter | Type | Description :--- | :--- | :--- <index-name> | String | The index to close. Can be a comma-separated list of multiple index names. Use `_all` or * to close all indexes. -allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is true. -expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions). Default is open. -ignore_unavailable | Boolean | If true, OpenSearch does not search for missing or closed indexes. Default is false. +allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is `true`. +expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions). Default is `open`. +ignore_unavailable | Boolean | If true, OpenSearch does not search for missing or closed indexes. Default is `false`. wait_for_active_shards | String | Specifies the number of active shards that must be available before OpenSearch processes the request. Default is 1 (only the primary shard). Set to all or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the request to succeed. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. timeout | Time | How long to wait for a response from the cluster. Default is `30s`. diff --git a/_api-reference/index-apis/delete-index.md b/_api-reference/index-apis/delete-index.md index 7b2be5e83b..20e5c51c93 100644 --- a/_api-reference/index-apis/delete-index.md +++ b/_api-reference/index-apis/delete-index.md @@ -31,8 +31,8 @@ All parameters are optional. Parameter | Type | Description :--- | :--- | :--- -allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is true. -expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions), which must be used with open, closed, or both. Default is open. +allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is `true`. +expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. 
Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions), which must be used with open, closed, or both. Default is `open`. ignore_unavailable | Boolean | If true, OpenSearch does not include missing or closed indexes in the response. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. timeout | Time | How long to wait for the response to return. Default is `30s`. diff --git a/_api-reference/index-apis/exists.md b/_api-reference/index-apis/exists.md index 6d439a96cf..429ac40745 100644 --- a/_api-reference/index-apis/exists.md +++ b/_api-reference/index-apis/exists.md @@ -32,12 +32,12 @@ All parameters are optional. Parameter | Type | Description :--- | :--- | :--- -allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is true. -expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions). Default is open. +allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is `true`. +expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions). Default is `open`. flat_settings | Boolean | Whether to return settings in the flat form, which can improve readability, especially for heavily nested settings. For example, the flat form of "index": { "creation_date": "123456789" } is "index.creation_date": "123456789". include_defaults | Boolean | Whether to include default settings as part of the response. This parameter is useful for identifying the names and current values of settings you want to update. -ignore_unavailable | Boolean | If true, OpenSearch does not search for missing or closed indexes. Default is false. -local | Boolean | Whether to return information from only the local node instead of from the cluster manager node. Default is false. +ignore_unavailable | Boolean | If true, OpenSearch does not search for missing or closed indexes. Default is `false`. +local | Boolean | Whether to return information from only the local node instead of from the cluster manager node. Default is `false`. ## Response diff --git a/_api-reference/index-apis/get-index.md b/_api-reference/index-apis/get-index.md index 899e82e901..733110d63a 100644 --- a/_api-reference/index-apis/get-index.md +++ b/_api-reference/index-apis/get-index.md @@ -32,12 +32,12 @@ All parameters are optional. Parameter | Type | Description :--- | :--- | :--- -allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is true. -expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions), which must be used with open, closed, or both. Default is open. +allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. 
Default is `true`. +expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions), which must be used with open, closed, or both. Default is `open`. flat_settings | Boolean | Whether to return settings in the flat form, which can improve readability, especially for heavily nested settings. For example, the flat form of "index": { "creation_date": "123456789" } is "index.creation_date": "123456789". include_defaults | Boolean | Whether to include default settings as part of the response. This parameter is useful for identifying the names and current values of settings you want to update. ignore_unavailable | Boolean | If true, OpenSearch does not include missing or closed indexes in the response. -local | Boolean | Whether to return information from only the local node instead of from the cluster manager node. Default is false. +local | Boolean | Whether to return information from only the local node instead of from the cluster manager node. Default is `false`. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. diff --git a/_api-reference/index-apis/get-settings.md b/_api-reference/index-apis/get-settings.md index 41eb4ea113..c41b25b4f5 100644 --- a/_api-reference/index-apis/get-settings.md +++ b/_api-reference/index-apis/get-settings.md @@ -40,9 +40,9 @@ Parameter | Data type | Description allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is `true`. expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are `all` (match all indexes), `open` (match open indexes), `closed` (match closed indexes), `hidden` (match hidden indexes), and `none` (do not accept wildcard expressions), which must be used with `open`, `closed`, or both. Default is `open`. flat_settings | Boolean | Whether to return settings in the flat form, which can improve readability, especially for heavily nested settings. For example, the flat form of “index”: { “creation_date”: “123456789” } is “index.creation_date”: “123456789”. -include_defaults | String | Whether to include default settings, including settings used within OpenSearch plugins, in the response. Default is false. +include_defaults | Boolean | Whether to include default settings, including settings used within OpenSearch plugins, in the response. Default is `false`. ignore_unavailable | Boolean | If true, OpenSearch does not include missing or closed indexes in the response. -local | Boolean | Whether to return information from the local node only instead of the cluster manager node. Default is false. +local | Boolean | Whether to return information from the local node only instead of the cluster manager node. Default is `false`. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. ## Response diff --git a/_api-reference/index-apis/open-index.md b/_api-reference/index-apis/open-index.md index 6ca0348695..12381aa8c6 100644 --- a/_api-reference/index-apis/open-index.md +++ b/_api-reference/index-apis/open-index.md @@ -33,9 +33,9 @@ All parameters are optional. Parameter | Type | Description :--- | :--- | :--- <index-name> | String | The index to open. 
Can be a comma-separated list of multiple index names. Use `_all` or * to open all indexes. -allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is true. -expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions). Default is open. -ignore_unavailable | Boolean | If true, OpenSearch does not search for missing or closed indexes. Default is false. +allow_no_indices | Boolean | Whether to ignore wildcards that don't match any indexes. Default is `true`. +expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are all (match all indexes), open (match open indexes), closed (match closed indexes), hidden (match hidden indexes), and none (do not accept wildcard expressions). Default is `open`. +ignore_unavailable | Boolean | If true, OpenSearch does not search for missing or closed indexes. Default is `false`. wait_for_active_shards | String | Specifies the number of active shards that must be available before OpenSearch processes the request. Default is 1 (only the primary shard). Set to all or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the request to succeed. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. timeout | Time | How long to wait for a response from the cluster. Default is `30s`. diff --git a/_api-reference/index-apis/put-mapping.md b/_api-reference/index-apis/put-mapping.md index 47c47fa125..f7d9321d33 100644 --- a/_api-reference/index-apis/put-mapping.md +++ b/_api-reference/index-apis/put-mapping.md @@ -76,7 +76,6 @@ Parameter | Data type | Description allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is `true`. expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are `all` (match all indexes), `open` (match open indexes), `closed` (match closed indexes), `hidden` (match hidden indexes), and `none` (do not accept wildcard expressions), which must be used with `open`, `closed`, or both. Default is `open`. ignore_unavailable | Boolean | If true, OpenSearch does not include missing or closed indexes in the response. -ignore_malformed | Boolean | Use this parameter with the `ip_range` data type to specify that OpenSearch should ignore malformed fields. If `true`, OpenSearch does not include entries that do not match the IP range specified in the index in the response. The default is `false`. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. timeout | Time | How long to wait for the response to return. Default is `30s`. write_index_only | Boolean | Whether OpenSearch should apply mapping updates only to the write index. diff --git a/_api-reference/index-apis/refresh.md b/_api-reference/index-apis/refresh.md index b72a6c7470..4d75060087 100644 --- a/_api-reference/index-apis/refresh.md +++ b/_api-reference/index-apis/refresh.md @@ -45,7 +45,7 @@ The following table lists the available query parameters. 
All query parameters a | :--- | :--- | :--- | | `ignore_unavailable` | Boolean | When `false`, the request returns an error when it targets a missing or closed index. Default is `false`. | `allow_no_indices` | Boolean | When `false`, the Refresh Index API returns an error when a wildcard expression, index alias, or `_all` targets only closed or missing indexes, even when the request is made against open indexes. Default is `true`. | -| `expand_wildcard` | String | The type of index that the wildcard patterns can match. If the request targets data streams, this argument determines whether the wildcard expressions match any hidden data streams. Supports comma-separated values, such as `open,hidden`. Valid values are `all`, `open`, `closed`, `hidden`, and `none`. +| `expand_wildcards` | String | The type of index that the wildcard patterns can match. If the request targets data streams, this argument determines whether the wildcard expressions match any hidden data streams. Supports comma-separated values, such as `open,hidden`. Valid values are `all`, `open`, `closed`, `hidden`, and `none`. diff --git a/_api-reference/index-apis/update-settings.md b/_api-reference/index-apis/update-settings.md index 3f38418ef4..9fc9f01f85 100644 --- a/_api-reference/index-apis/update-settings.md +++ b/_api-reference/index-apis/update-settings.md @@ -43,7 +43,7 @@ Parameter | Data type | Description allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is `true`. expand_wildcards | String | Expands wildcard expressions to different indexes. Combine multiple values with commas. Available values are `all` (match all indexes), `open` (match open indexes), `closed` (match closed indexes), `hidden` (match hidden indexes), and `none` (do not accept wildcard expressions), which must be used with `open`, `closed`, or both. Default is `open`. cluster_manager_timeout | Time | How long to wait for a connection to the cluster manager node. Default is `30s`. -preserve_existing | Boolean | Whether to preserve existing index settings. Default is false. +preserve_existing | Boolean | Whether to preserve existing index settings. Default is `false`. timeout | Time | How long to wait for a connection to return. Default is `30s`. ## Request body diff --git a/_api-reference/nodes-apis/nodes-stats.md b/_api-reference/nodes-apis/nodes-stats.md index 145b3d5b24..ca6810b961 100644 --- a/_api-reference/nodes-apis/nodes-stats.md +++ b/_api-reference/nodes-apis/nodes-stats.md @@ -44,7 +44,7 @@ thread_pool | Statistics about each thread pool for the node. fs | File system statistics, such as read/write statistics, data path, and free disk space. transport | Transport layer statistics about send/receive in cluster communication. http | Statistics about the HTTP layer. -breaker | Statistics about the field data circuit breakers. +breakers | Statistics about the field data circuit breakers. script | Statistics about scripts, such as compilations and cache evictions. discovery | Statistics about cluster states. ingest | Statistics about ingest pipelines. diff --git a/_api-reference/render-template.md b/_api-reference/render-template.md new file mode 100644 index 0000000000..16bada0290 --- /dev/null +++ b/_api-reference/render-template.md @@ -0,0 +1,114 @@ +--- +layout: default +title: Render Template +nav_order: 82 +--- + +# Render Template + +The Render Template API renders a [search template]({{site.url}}{{site.baseurl}}/search-plugins/search-template/) as a search query. 
+ +## Paths and HTTP methods + +``` +GET /_render/template +POST /_render/template +GET /_render/template/<id> +POST /_render/template/<id> +``` + +## Path parameters + +The Render Template API supports the following optional path parameter. + +| Parameter | Type | Description | +| :--- | :--- | :--- | +| `id` | String | The ID of the search template to render. | + +## Request options + +The following options are supported in the request body of the Render Template API. + +| Parameter | Required | Type | Description | +| :--- | :--- | :--- | :--- | +| `id` | Conditional | String | The ID of the search template to render. Is not required if the ID is provided in the path or if an inline template is specified in the `source` field. | +| `params` | No | Object | A list of key-value pairs that replace Mustache variables found in the search template. The key-value pairs must exist in the documents being searched. | +| `source` | Conditional | Object | An inline search template to render if a search template is not specified. Supports the same parameters as a [Search]({{site.url}}{{site.baseurl}}/api-reference/search/) API request and [Mustache](https://mustache.github.io/mustache.5.html) variables. | + +## Example request + +Both of the following request examples use the search template with the template ID `play_search_template`: + +```json +{ + "source": { + "query": { + "match": { + "play_name": "{{play_name}}" + } + } + }, + "params": { + "play_name": "Henry IV" + } +} +``` + +### Render template using template ID + +The following example request validates a search template with the ID `play_search_template`: + +```json +POST _render/template +{ + "id": "play_search_template", + "params": { + "play_name": "Henry IV" + } +} +``` +{% include copy.html %} + +### Render template using `source` + +If you don't want to use a saved template, or want to test a template before saving, you can test a template with the `source` parameter using [Mustache](https://mustache.github.io/mustache.5.html) variables, as shown in the following example: + +``` +{ + "source": { + "from": "{{from}}{{^from}}10{{/from}}", + "size": "{{size}}{{^size}}10{{/size}}", + "query": { + "match": { + "play_name": "{{play_name}}" + } + } + }, + "params": { + "play_name": "Henry IV" + } +} +``` +{% include copy.html %} + +## Example response + +OpenSearch responds with information about the template's output: + +```json +{ + "template_output": { + "from": "0", + "size": "10", + "query": { + "match": { + "play_name": "Henry IV" + } + } + } +} +``` + + + + diff --git a/_api-reference/search.md b/_api-reference/search.md index 46212e0634..777f48354e 100644 --- a/_api-reference/search.md +++ b/_api-reference/search.md @@ -42,29 +42,29 @@ All URL parameters are optional. Parameter | Type | Description :--- | :--- | :--- -allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is true. -allow_partial_search_results | Boolean | Whether to return partial results if the request runs into an error or times out. Default is true. +allow_no_indices | Boolean | Whether to ignore wildcards that don’t match any indexes. Default is `true`. +allow_partial_search_results | Boolean | Whether to return partial results if the request runs into an error or times out. Default is `true`. analyzer | String | Analyzer to use in the query string. -analyze_wildcard | Boolean | Whether the update operation should include wildcard and prefix queries in the analysis. Default is false.
+analyze_wildcard | Boolean | Whether the update operation should include wildcard and prefix queries in the analysis. Default is `false`. batched_reduce_size | Integer | How many shard results to reduce on a node. Default is 512. cancel_after_time_interval | Time | The time after which the search request will be canceled. Request-level parameter takes precedence over cancel_after_time_interval [cluster setting]({{site.url}}{{site.baseurl}}/api-reference/cluster-settings). Default is -1. -ccs_minimize_roundtrips | Boolean | Whether to minimize roundtrips between a node and remote clusters. Default is true. +ccs_minimize_roundtrips | Boolean | Whether to minimize roundtrips between a node and remote clusters. Default is `true`. default_operator | String | Indicates whether the default operator for a string query should be AND or OR. Default is OR. df | String | The default field in case a field prefix is not provided in the query string. docvalue_fields | String | The fields that OpenSearch should return using their docvalue forms. expand_wildcards | String | Specifies the type of index that wildcard expressions can match. Supports comma-separated values. Valid values are all (match any index), open (match open, non-hidden indexes), closed (match closed, non-hidden indexes), hidden (match hidden indexes), and none (deny wildcard expressions). Default is open. -explain | Boolean | Whether to return details about how OpenSearch computed the document's score. Default is false. +explain | Boolean | Whether to return details about how OpenSearch computed the document's score. Default is `false`. from | Integer | The starting index to search from. Default is 0. -ignore_throttled | Boolean | Whether to ignore concrete, expanded, or indexes with aliases if indexes are frozen. Default is true. +ignore_throttled | Boolean | Whether to ignore concrete, expanded, or indexes with aliases if indexes are frozen. Default is `true`. ignore_unavailable | Boolean | Specifies whether to include missing or closed indexes in the response and ignores unavailable shards during the search request. Default is `false`. -lenient | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is false. +lenient | Boolean | Specifies whether OpenSearch should accept requests if queries have format errors (for example, querying a text field for an integer). Default is `false`. max_concurrent_shard_requests | Integer | How many concurrent shard requests this request should execute on each node. Default is 5. -phase_took | Boolean | Whether to return phase-level `took` time values in the response. Default is false. +phase_took | Boolean | Whether to return phase-level `took` time values in the response. Default is `false`. pre_filter_shard_size | Integer | A prefilter size threshold that triggers a prefilter operation if the request exceeds the threshold. Default is 128 shards. preference | String | Specifies the shards or nodes on which OpenSearch should perform the search. For valid values, see [The `preference` query parameter](#the-preference-query-parameter). q | String | Lucene query string’s query. request_cache | Boolean | Specifies whether OpenSearch should use the request cache. Default is whether it’s enabled in the index’s settings. -rest_total_hits_as_int | Boolean | Whether to return `hits.total` as an integer. Returns an object otherwise. Default is false. 
+rest_total_hits_as_int | Boolean | Whether to return `hits.total` as an integer. Returns an object otherwise. Default is `false`. routing | String | Value used to route the update by query operation to a specific shard. scroll | Time | How long to keep the search context open. search_type | String | Whether OpenSearch should use global term and document frequencies when calculating relevance scores. Valid choices are `query_then_fetch` and `dfs_query_then_fetch`. `query_then_fetch` scores documents using local term and document frequencies for the shard. It’s usually faster but less accurate. `dfs_query_then_fetch` scores documents using global term and document frequencies across all shards. It’s usually slower but more accurate. Default is `query_then_fetch`. @@ -75,18 +75,18 @@ _source | String | Whether to include the `_source` field in the response. _source_excludes | List | A comma-separated list of source fields to exclude from the response. _source_includes | List | A comma-separated list of source fields to include in the response. stats | String | Value to associate with the request for additional logging. -stored_fields | Boolean | Whether the get operation should retrieve fields stored in the index. Default is false. +stored_fields | Boolean | Whether the get operation should retrieve fields stored in the index. Default is `false`. suggest_field | String | Fields OpenSearch can use to look for similar terms. suggest_mode | String | The mode to use when searching. Available options are `always` (use suggestions based on the provided terms), `popular` (use suggestions that have more occurrences), and `missing` (use suggestions for terms not in the index). suggest_size | Integer | How many suggestions to return. suggest_text | String | The source that suggestions should be based off of. terminate_after | Integer | The maximum number of documents OpenSearch should process before terminating the request. Default is 0. timeout | Time | How long the operation should wait for a response from active shards. Default is `1m`. -track_scores | Boolean | Whether to return document scores. Default is false. +track_scores | Boolean | Whether to return document scores. Default is `false`. track_total_hits | Boolean or Integer | Whether to return how many documents matched the query. -typed_keys | Boolean | Whether returned aggregations and suggested terms should include their types in the response. Default is true. +typed_keys | Boolean | Whether returned aggregations and suggested terms should include their types in the response. Default is `true`. version | Boolean | Whether to include the document version as a match. -include_named_queries_score | Boolean | Whether to return scores with named queries. Default is false. +include_named_queries_score | Boolean | Whether to return scores with named queries. Default is `false`. ### The `preference` query parameter @@ -111,7 +111,7 @@ Field | Type | Description aggs | Object | In the optional `aggs` parameter, you can define any number of aggregations. Each aggregation is defined by its name and one of the types of aggregations that OpenSearch supports. For more information, see [Aggregations]({{site.url}}{{site.baseurl}}/aggregations/). docvalue_fields | Array of objects | The fields that OpenSearch should return using their docvalue forms. Specify a format to return results in a certain format, such as date and time. fields | Array | The fields to search for in the request. 
Specify a format to return results in a certain format, such as date and time. -explain | String | Whether to return details about how OpenSearch computed the document's score. Default is false. +explain | String | Whether to return details about how OpenSearch computed the document's score. Default is `false`. from | Integer | The starting index to search from. Default is 0. indices_boost | Array of objects | Values used to boost the score of specified indexes. Specify in the format of <index> : <boost-multiplier> min_score | Integer | Specify a score threshold to return only documents above the threshold. diff --git a/_api-reference/snapshots/create-repository.md b/_api-reference/snapshots/create-repository.md index 856332b793..54807b85d1 100644 --- a/_api-reference/snapshots/create-repository.md +++ b/_api-reference/snapshots/create-repository.md @@ -79,7 +79,7 @@ Request field | Description `max_snapshot_bytes_per_sec` | The maximum rate at which snapshots take. Default is 40 MB per second (`40m`). Optional. `readonly` | Whether the repository is read-only. Useful when migrating from one cluster (`"readonly": false` when registering) to another cluster (`"readonly": true` when registering). Optional. `remote_store_index_shallow_copy` | Boolean | Whether the snapshot of the remote store indexes is captured as a shallow copy. Default is `false`. -`server_side_encryption` | Whether to encrypt snapshot files in the S3 bucket. This setting uses AES-256 with S3-managed keys. See [Protecting data using server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html). Default is false. Optional. +`server_side_encryption` | Whether to encrypt snapshot files in the S3 bucket. This setting uses AES-256 with S3-managed keys. See [Protecting data using server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html). Default is `false`. Optional. `storage_class` | Specifies the [S3 storage class](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-class-intro.html) for the snapshots files. Default is `standard`. Do not use the `glacier` and `deep_archive` storage classes. Optional. For the `base_path` parameter, do not enter the `s3://` prefix when entering your S3 bucket details. Only the name of the bucket is required. diff --git a/_api-reference/snapshots/create-snapshot.md b/_api-reference/snapshots/create-snapshot.md index 4f0a6d05cf..6334878d8c 100644 --- a/_api-reference/snapshots/create-snapshot.md +++ b/_api-reference/snapshots/create-snapshot.md @@ -42,9 +42,9 @@ The request body is optional. Field | Data type | Description :--- | :--- | :--- `indices` | String | The indices you want to include in the snapshot. You can use `,` to create a list of indices, `*` to specify an index pattern, and `-` to exclude certain indices. Don't put spaces between items. Default is all indices. -`ignore_unavailable` | Boolean | If an index from the `indices` list doesn't exist, whether to ignore it rather than fail the snapshot. Default is false. -`include_global_state` | Boolean | Whether to include cluster state in the snapshot. Default is true. -`partial` | Boolean | Whether to allow partial snapshots. Default is false, which fails the entire snapshot if one or more shards fails to stor +`ignore_unavailable` | Boolean | If an index from the `indices` list doesn't exist, whether to ignore it rather than fail the snapshot. Default is `false`. +`include_global_state` | Boolean | Whether to include cluster state in the snapshot. 
Default is `true`. +`partial` | Boolean | Whether to allow partial snapshots. Default is `false`, which fails the entire snapshot if one or more shards fails to store. #### Example requests diff --git a/_api-reference/snapshots/get-snapshot-repository.md b/_api-reference/snapshots/get-snapshot-repository.md index e3664e11a8..501d0785dd 100644 --- a/_api-reference/snapshots/get-snapshot-repository.md +++ b/_api-reference/snapshots/get-snapshot-repository.md @@ -27,7 +27,7 @@ You can also get details about a snapshot during and after snapshot creation. Se | Parameter | Data type | Description | :--- | :--- | :--- | local | Boolean | Whether to get information from the local node. Optional, defaults to `false`.| -| cluster_manager_timeout | Time | Amount of time to wait for a connection to the master node. Optional, defaults to 30 seconds. | +| cluster_manager_timeout | Time | Amount of time to wait for a connection to the cluster manager node. Optional, defaults to 30 seconds. | #### Example request diff --git a/_api-reference/snapshots/verify-snapshot-repository.md b/_api-reference/snapshots/verify-snapshot-repository.md index 2929952472..12fada3303 100644 --- a/_api-reference/snapshots/verify-snapshot-repository.md +++ b/_api-reference/snapshots/verify-snapshot-repository.md @@ -29,7 +29,7 @@ Path parameters are optional. | Parameter | Data type | Description | :--- | :--- | :--- -| cluster_manager_timeout | Time | Amount of time to wait for a connection to the master node. Optional, defaults to `30s`. | +| cluster_manager_timeout | Time | Amount of time to wait for a connection to the cluster manager node. Optional, defaults to `30s`. | | timeout | Time | The period of time to wait for a response. If a response is not received before the timeout value, the request fails and returns an error. Defaults to `30s`. | #### Example request diff --git a/_automating-configurations/api/create-workflow.md b/_automating-configurations/api/create-workflow.md index 5c501ce4e8..83c0110ac3 100644 --- a/_automating-configurations/api/create-workflow.md +++ b/_automating-configurations/api/create-workflow.md @@ -20,9 +20,9 @@ You can include placeholder expressions in the value of workflow step fields. Fo Once a workflow is created, provide its `workflow_id` to other APIs. -The `POST` method creates a new workflow. The `PUT` method updates an existing workflow. +The `POST` method creates a new workflow. The `PUT` method updates an existing workflow. You can specify the `update_fields` parameter to update specific fields. -You can only update a workflow if it has not yet been provisioned. +You can only update a complete workflow if it has not yet been provisioned. {: .note} ## Path and HTTP methods @@ -58,11 +58,26 @@ POST /_plugins/_flow_framework/workflow?validation=none ``` {% include copy-curl.html %} +You cannot update a full workflow once it has been provisioned, but you can update fields other than the `workflows` field, such as `name` and `description`: + +```json +PUT /_plugins/_flow_framework/workflow/<workflow_id>?update_fields=true +{ + "name": "new-template-name", + "description": "A new description for the existing template" +} +``` +{% include copy-curl.html %} + +You cannot specify both the `provision` and `update_fields` parameters at the same time. +{: .note} + +The following table lists the available query parameters. All query parameters are optional. User-provided parameters are only allowed if the `provision` parameter is set to `true`.
| Parameter | Data type | Description | | :--- | :--- | :--- | | `provision` | Boolean | Whether to provision the workflow as part of the request. Default is `false`. | +| `update_fields` | Boolean | Whether to update only the fields included in the request body. Default is `false`. | | `validation` | String | Whether to validate the workflow. Valid values are `all` (validate the template) and `none` (do not validate the template). Default is `all`. | | User-provided substitution expressions | String | Parameters matching substitution expressions in the template. Only allowed if `provision` is set to `true`. Optional. If `provision` is set to `false`, you can pass these parameters in the [Provision Workflow API query parameters]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/#query-parameters). | diff --git a/_benchmark/index.md b/_benchmark/index.md index 1a71d57de9..6d343b908a 100644 --- a/_benchmark/index.md +++ b/_benchmark/index.md @@ -24,13 +24,12 @@ The following diagram visualizes how OpenSearch Benchmark works when run against ![Benchmark workflow]({{site.url}}{{site.baseurl}}/images/benchmark/osb-workflow.jpg). -The OpenSearch Benchmark documentation is split into five sections: +The OpenSearch Benchmark documentation is split into four sections: - [Quickstart]({{site.url}}{{site.baseurl}}/benchmark/quickstart/): Learn how to quickly run and install OpenSearch Benchmark. - [User guide]({{site.url}}{{site.baseurl}}/benchmark/user-guide/index/): Dive deep into how OpenSearch Benchmark can help you track the performance of your cluster. - [Tutorials]({{site.url}}{{site.baseurl}}/benchmark/tutorials/index/): Use step-by-step guides for more advanced benchmarking configurations and functionality. -- [Commands]({{site.url}}{{site.baseurl}}/benchmark/commands/index/): A detailed reference of commands and command options supported by OpenSearch. -- [Workloads]({{site.url}}{{site.baseurl}}/benchmark/workloads/index/): A detailed reference of options available for both default and custom workloads. +- [Reference]({{site.url}}{{site.baseurl}}/benchmark/reference/index/): A detailed reference of metrics, commands, telemetry devices, and workloads. diff --git a/_clients/javascript/helpers.md b/_clients/javascript/helpers.md index f88efd8e00..c6cff46be0 100644 --- a/_clients/javascript/helpers.md +++ b/_clients/javascript/helpers.md @@ -62,7 +62,7 @@ When creating a new bulk helper instance, you can use the following configuratio | `flushBytes` | Integer | Optional. Default is 5,000,000. | Maximum bulk body size to send in bytes. | `flushInterval` | Integer | Optional. Default is 30,000. | Time in milliseconds to wait before flushing the body after the last document has been read. | `onDrop` | Function | Optional. Default is `noop`. | A function to be invoked for every document that can’t be indexed after reaching the maximum number of retries. -| `refreshOnCompletion` | Boolean | Optional. Default is false. | Whether or not a refresh should be run on all affected indexes at the end of the bulk operation. +| `refreshOnCompletion` | Boolean | Optional. Default is `false`. | Whether or not a refresh should be run on all affected indexes at the end of the bulk operation. | `retries` | Integer | Optional. Defaults to the client's `maxRetries` value. | The number of times an operation is retried before `onDrop` is called for that document. | `wait` | Integer | Optional. Default is 5,000. | Time in milliseconds to wait before retrying an operation. 
diff --git a/_clients/python-low-level.md b/_clients/python-low-level.md index 894bef0e38..ba40fa3f45 100644 --- a/_clients/python-low-level.md +++ b/_clients/python-low-level.md @@ -8,9 +8,15 @@ redirect_from: # Low-level Python client -The OpenSearch low-level Python client (`opensearch-py`) provides wrapper methods for the OpenSearch REST API so that you can interact with your cluster more naturally in Python. Rather than sending raw HTTP requests to a given URL, you can create an OpenSearch client for your cluster and call the client's built-in functions. For the client's complete API documentation and additional examples, see the [`opensearch-py` API documentation](https://opensearch-project.github.io/opensearch-py/). +The OpenSearch low-level Python client (`opensearch-py`) provides wrapper methods for the OpenSearch REST API so that you can interact with your cluster more naturally in Python. Rather than sending raw HTTP requests to a given URL, you can create an OpenSearch client for your cluster and call the client's built-in functions. -This getting started guide illustrates how to connect to OpenSearch, index documents, and run queries. For the client source code, see the [`opensearch-py` repo](https://github.com/opensearch-project/opensearch-py). +This getting started guide illustrates how to connect to OpenSearch, index documents, and run queries. For additional information, see the following resources: +- [OpenSearch Python repo](https://github.com/opensearch-project/opensearch-py) +- [API reference](https://opensearch-project.github.io/opensearch-py/api-ref.html) +- [User guides](https://github.com/opensearch-project/opensearch-py/tree/main/guides) +- [Samples](https://github.com/opensearch-project/opensearch-py/tree/main/samples) + +If you have any questions or would like to contribute, you can [create an issue](https://github.com/opensearch-project/opensearch-py/issues) to interact with the OpenSearch Python team directly. ## Setup diff --git a/_dashboards/visualize/area.md b/_dashboards/visualize/area.md index 0f3b7863d3..5df59579ec 100644 --- a/_dashboards/visualize/area.md +++ b/_dashboards/visualize/area.md @@ -1,6 +1,6 @@ --- layout: default -title: Using area charts +title: Area charts parent: Building data visualizations nav_order: 5 --- diff --git a/_dashboards/visualize/gantt.md b/_dashboards/visualize/gantt.md index 875e35c127..3a9814465a 100644 --- a/_dashboards/visualize/gantt.md +++ b/_dashboards/visualize/gantt.md @@ -1,6 +1,6 @@ --- layout: default -title: Using Gantt charts +title: Gantt charts parent: Building data visualizations nav_order: 30 redirect_from: @@ -18,7 +18,7 @@ To create a Gantt chart, perform the following steps: 1. In the visualizations menu, choose **Create visualization** and **Gantt Chart**. 1. Choose a source for the chart (e.g. some log data). 1. Under **Metrics**, choose **Event**. For log data, each log is an event. -1. Select the **Start Time** and **Duration** fields from your data set. The start time is the timestamp for the beginning of an event. The duration is the amount of time to add to the start time. +1. Select the **Start Time** and **Duration** fields from your dataset. The start time is the timestamp for the beginning of an event. The duration is the amount of time to add to the start time. 1. Under **Results**, choose the number of events to display on the chart. Gantt charts sequence events from earliest to latest based on start time. 1. Choose **Panel settings** to adjust axis labels, time format, and colors. 1. 
Choose **Update**. diff --git a/_dashboards/visualize/geojson-regionmaps.md b/_dashboards/visualize/geojson-regionmaps.md index 663c4c2f39..aa006e0a24 100644 --- a/_dashboards/visualize/geojson-regionmaps.md +++ b/_dashboards/visualize/geojson-regionmaps.md @@ -1,6 +1,6 @@ --- layout: default -title: Using coordinate and region maps +title: Coordinate and region maps parent: Building data visualizations has_children: true nav_order: 15 @@ -12,7 +12,7 @@ redirect_from: OpenSearch has a standard set of GeoJSON files that provide a vector map with each region map. OpenSearch Dashboards also provides basic map tiles with a standard vector map to create region maps. You can configure the base map tiles using [Web Map Service (WMS)](https://www.ogc.org/standards/wms). For more information, see [Configuring WMS in OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/maptiles/). -For air gapped environments, OpenSearch Dashboards provides a self-host maps server. For more information, see [Using the self-host maps server]({{site.url}}{{site.baseurl}}/dashboards/selfhost-maps-server/) +For air-gapped environments, OpenSearch Dashboards provides a self-hosted maps server. For more information, see [Using self-hosted map servers]({{site.url}}{{site.baseurl}}/dashboards/selfhost-maps-server/). While you can't configure a server to support user-defined vector map layers, you can configure your own GeoJSON file and upload it for this purpose. {: .note} @@ -35,7 +35,7 @@ You can use [geojson.io](https://geojson.io/#map=2/20.0/0.0) to extract GeoJSON To create your own custom vector map, upload a JSON file that contains GEO data for your customized regional maps. The JSON file contains vector layers for visualization. -1. Prepare a JSON file to upload. Make sure the file has either a .geojson or .json extension. +1. Prepare a JSON file to upload. Make sure the file has either a `.geojson` or `.json` extension. A minimal example file is shown after these steps. 1. On the top menu bar, go to **OpenSearch Dashboards > Visualize**. 1. Select the **Create Visualization** button. 1. Select **Region Map**.
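+For reference, the following is a minimal sketch of a custom `.geojson` file containing a single region. The region name, `iso` property, and coordinates are illustrative placeholders, not values taken from an existing dataset:
+
+```json
+{
+  "type": "FeatureCollection",
+  "features": [
+    {
+      "type": "Feature",
+      "properties": { "name": "Example region", "iso": "EX" },
+      "geometry": {
+        "type": "Polygon",
+        "coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]]
+      }
+    }
+  ]
+}
+```
+
+The values under `properties` (such as `name`) are typically what you later match against a field in your index when you configure the region map layer.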
diff --git a/_dashboards/visualize/maps-stats-api.md b/_dashboards/visualize/maps-stats-api.md index 7939a4e732..f81c7e6ac4 100644 --- a/_dashboards/visualize/maps-stats-api.md +++ b/_dashboards/visualize/maps-stats-api.md @@ -3,7 +3,7 @@ layout: default title: Maps Stats API nav_order: 20 grand_parent: Building data visualizations -parent: Using coordinate and region maps +parent: Coordinate and region maps has_children: false --- diff --git a/_dashboards/visualize/maps.md b/_dashboards/visualize/maps.md index 23e14d41c3..5728fd9092 100644 --- a/_dashboards/visualize/maps.md +++ b/_dashboards/visualize/maps.md @@ -2,7 +2,7 @@ layout: default title: Using maps grand_parent: Building data visualizations -parent: Using coordinate and region maps +parent: Coordinate and region maps nav_order: 10 redirect_from: - /dashboards/maps-plugin/ diff --git a/_dashboards/visualize/maptiles.md b/_dashboards/visualize/maptiles.md index 6b8cc06ef3..6c7afc7462 100644 --- a/_dashboards/visualize/maptiles.md +++ b/_dashboards/visualize/maptiles.md @@ -2,7 +2,7 @@ layout: default title: Configuring a Web Map Service (WMS) grand_parent: Building data visualizations -parent: Using coordinate and region maps +parent: Coordinate and region maps nav_order: 30 redirect_from: - /dashboards/maptiles/ diff --git a/_dashboards/visualize/selfhost-maps-server.md b/_dashboards/visualize/selfhost-maps-server.md index 925c5449fe..439f9b634a 100644 --- a/_dashboards/visualize/selfhost-maps-server.md +++ b/_dashboards/visualize/selfhost-maps-server.md @@ -1,14 +1,14 @@ --- layout: default -title: Using the self-host maps server +title: Using self-hosted map servers grand_parent: Building data visualizations -parent: Using coordinate and region maps +parent: Coordinate and region maps nav_order: 40 redirect_from: - /dashboards/selfhost-maps-server/ --- -# Using the self-host maps server +# Using self-hosted map servers The self-host maps server for OpenSearch Dashboards allows users to access the default maps service in air-gapped environments. OpenSearch-compatible map URLs include a map manifest with map tiles and vectors, the map tiles, and the map vectors. diff --git a/_dashboards/visualize/tsvb.md b/_dashboards/visualize/tsvb.md new file mode 100644 index 0000000000..d845dea58a --- /dev/null +++ b/_dashboards/visualize/tsvb.md @@ -0,0 +1,70 @@ +--- +layout: default +title: TSVB +parent: Building data visualizations +nav_order: 45 +--- + +# TSVB + +The Time-Series Visual Builder (TSVB) is a powerful data visualization tool in OpenSearch Dashboards that allows you to create detailed time-series visualizations. One of its key features is the ability to add annotations or markers at specific time points based on index data. This feature is particularly useful for making connections between multiple indexes and building visualizations that display data over time, such as flight status, delays by type, and more. TSVB currently supports the following visualization types: Area, Line, Metric, Gauge, Markdown, and Data Table. 
+ +## Creating TSVB visualizations from multiple data sources +Introduced 2.14 +{: .label .label-purple } + +Before proceeding, ensure that the following configuration settings are enabled in the `config/opensearch_dashboards.yml` file: + +```yaml +data_source.enabled: true +vis_type_timeseries.enabled: true +``` +{% include copy-curl.html %} + +Once you have configured [multiple data sources]({{site.url}}{{site.baseurl}}/dashboards/management/multi-data-sources/) in OpenSearch Dashboards, you can use TSVB to query those data sources. The following GIF shows the process of creating TSVB visualizations in OpenSearch Dashboards. + +![Process of creating TSVB visualizations in OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/images/dashboards/configure-tsvb.gif) + +**Step 1: Set up and connect data sources** + +Open OpenSearch Dashboards and follow these steps: + +1. Select **Dashboards Management** from the main menu on the left. +2. Select **Data sources** and then select the **Create data source** button. +3. On the **Create data source** page, enter the connection details and endpoint URL. +4. On the **Home** page, select **Add sample data** and then select the **Add data** button for the **Sample web logs** dataset. + +The following GIF shows the steps required to set up and connect a data source. + +![Create data source]({{site.url}}{{site.baseurl}}/images/dashboards/create-datasource.gif) + +**Step 2: Create the visualization** + +Follow these steps to create the visualization: + +1. From the menu on the left, select **Visualize**. +2. On the **Visualizations** page, select **Create Visualization** and then select **TSVB** in the pop-up window. + +**Step 3: Specify data sources** + +After creating a TSVB visualization, data may appear based on your default index pattern. To change the index pattern or configure additional settings, follow these steps: + +1. In the **Create** window, select **Panel options**. +2. Under **Data source**, select the OpenSearch cluster from which to pull data. In this case, choose your newly created data source. +3. Under **Index name**, enter `opensearch_dashboards_sample_data_logs`. +4. Under **Time field**, select `@timestamp`. This setting specifies the time range for rendering the visualization. + +**(Optional) Step 4: Add annotations** + +Annotations are markers that can be added to time-series visualizations. Follow these steps to add annotations: + +1. In the upper-left corner of the page, select **Time Series**. +2. Select the **Annotations** tab and then **Add data source**. +3. In the **Index name** field, specify the appropriate index. In this case, continue using the same index from the previous steps, that is, `opensearch_dashboards_sample_data_logs`. +4. Under **Time field**, select `@timestamp`. +5. In the **Fields** field, enter `timestamp`. +6. In the **Row template** field, enter `timestamp`. + +The visualization automatically updates to display your annotations, as shown in the following image.
+ + TSVB visualization with annotations diff --git a/_dashboards/visualize/vega.md b/_dashboards/visualize/vega.md index 7764d583a6..3a9f6aad4f 100644 --- a/_dashboards/visualize/vega.md +++ b/_dashboards/visualize/vega.md @@ -1,192 +1,137 @@ --- layout: default -title: Using Vega +title: Vega parent: Building data visualizations -nav_order: 45 +nav_order: 50 --- -# Using Vega +# Vega -[Vega](https://vega.github.io/vega/) and [Vega-Lite](https://vega.github.io/vega-lite/) are open-source, declarative language visualization tools that you can use to create custom data visualizations with your OpenSearch data and [Vega Data](https://vega.github.io/vega/docs/data/). These tools are ideal for advanced users comfortable with writing OpenSearch queries directly. Enable the `vis_type_vega` plugin in your `opensearch_dashboards.yml` file to write your [Vega specifications](https://vega.github.io/vega/docs/specification/) in either JSON or [HJSON](https://hjson.github.io/) format or to specify one or more OpenSearch queries within your Vega specification. By default, the plugin is set to `true`. The configuration is shown in the following example. For configuration details, refer to the `vis_type_vega` [README](https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/src/plugins/vis_type_vega/README.md). +[Vega](https://vega.github.io/vega/) and [Vega-Lite](https://vega.github.io/vega-lite/) are open-source, declarative language visualization tools that you can use to create custom data visualizations with your OpenSearch data and [Vega data](https://vega.github.io/vega/docs/data/). These tools are ideal for advanced users comfortable with writing OpenSearch queries directly. Enable the `vis_type_vega` plugin in your `opensearch_dashboards.yml` file to write your [Vega specifications](https://vega.github.io/vega/docs/specification/) in either JSON or [HJSON](https://hjson.github.io/) format or to specify one or more OpenSearch queries in your Vega specification. By default, the plugin is set to `true`. + +## Creating Vega visualizations from multiple data sources +Introduced 2.13 +{: .label .label-purple } + +Before proceeding, ensure that the following configuration settings are enabled in the `config/opensearch_dashboards.yml` file. For configuration details, refer to the `vis_type_vega` [README](https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/src/plugins/vis_type_vega/README.md). ``` +data_source.enabled: true vis_type_vega.enabled: true ``` -The following image shows a custom Vega map created in OpenSearch. +After you have configured [multiple data sources]({{site.url}}{{site.baseurl}}/dashboards/management/multi-data-sources/) in OpenSearch Dashboards, you can use Vega to query those data sources. The following GIF shows the process of creating Vega visualizations in OpenSearch Dashboards. -Map created using Vega visualization in OpenSearch Dashboards +![Process of creating Vega visualizations in OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/images/dashboards/configure-vega.gif) -## Querying from multiple data sources +### Step 1: Set up and connect data sources -If you have configured [multiple data sources]({{site.url}}{{site.baseurl}}/dashboards/management/multi-data-sources/) in OpenSearch Dashboards, you can use Vega to query those data sources. Within your Vega specification, add the `data_source_name` field under the `url` property to target a specific data source by name. By default, queries use data from the local cluster.
You can assign individual `data_source_name` values to each OpenSearch query within your Vega specification. This allows you to query multiple indexes across different data sources in a single visualization. +Open OpenSearch Dashboards and follow these steps: -The following is an example Vega specification with `Demo US Cluster` as the specified `data_source_name`: +1. Select **Dashboards Management** from the menu on the left. +2. Select **Data sources** and then select the **Create data source** button. +3. On the **Create data source** page, enter the connection details and endpoint URL, as shown in the following GIF. +4. On the **Home page**, select **Add sample data**. Under **Data source**, select your newly created data source, and then select the **Add data button** for the **Sample web logs** dataset. -``` +The following GIF shows the steps required for setting up and connecting a data source. + +![Setting up and connecting data sources with OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/images/dashboards/Add_datasource.gif) + +### Step 2: Create the visualization + +1. From the menu on the left, select **Visualize**. +2. On the **Visualizations** page, select **Create Visualization** and then select **Vega** in the pop-up window. + +### Step 3: Add the Vega specification + +By default, queries use data from the local cluster. You can assign individual `data_source_name` values to each OpenSearch query in your Vega specification. This allows you to query multiple indexes across different data sources in a single visualization. + +1. Verify that the data source you created is specified under `data_source_name`. Alternatively, in your Vega specification, add the `data_source_name` field under the `url` property to target a specific data source by name. +2. Copy the following Vega specification and then select the **Update** button in the lower-right corner. The visualization should appear. 
+ +```json { - $schema: https://vega.github.io/schema/vega/v5.json - config: { - kibana: {type: "map", latitude: 25, longitude: -70, zoom: 3} - } - data: [ - { - name: table - url: { - index: opensearch_dashboards_sample_data_flights - // This OpenSearchQuery will query from the Demo US Cluster datasource - data_source_name: Demo US Cluster - %context%: true - // Uncomment to enable time filtering - // %timefield%: timestamp - body: { - size: 0 - aggs: { - origins: { - terms: {field: "OriginAirportID", size: 10000} - aggs: { - originLocation: { - top_hits: { - size: 1 - _source: { - includes: ["OriginLocation", "Origin"] - } - } - } - distinations: { - terms: {field: "DestAirportID", size: 10000} - aggs: { - destLocation: { - top_hits: { - size: 1 - _source: { - includes: ["DestLocation"] - } - } - } - } + $schema: https://vega.github.io/schema/vega-lite/v5.json + data: { + url: { + %context%: true + %timefield%: @timestamp + index: opensearch_dashboards_sample_data_logs + data_source_name: YOUR_DATA_SOURCE_TITLE + body: { + aggs: { + 1: { + date_histogram: { + field: @timestamp + fixed_interval: 3h + time_zone: America/Los_Angeles + min_doc_count: 1 + } + aggs: { + 2: { + avg: { + field: bytes } } } } } + size: 0 } - format: {property: "aggregations.origins.buckets"} - transform: [ - { - type: geopoint - projection: projection - fields: [ - originLocation.hits.hits[0]._source.OriginLocation.lon - originLocation.hits.hits[0]._source.OriginLocation.lat - ] - } - ] } - { - name: selectedDatum - on: [ - {trigger: "!selected", remove: true} - {trigger: "selected", insert: "selected"} - ] + format: { + property: aggregations.1.buckets } - ] - signals: [ + } + transform: [ { - name: selected - value: null - on: [ - {events: "@airport:mouseover", update: "datum"} - {events: "@airport:mouseout", update: "null"} - ] + calculate: datum.key + as: timestamp } - ] - scales: [ { - name: airportSize - type: linear - domain: {data: "table", field: "doc_count"} - range: [ - {signal: "zoom*zoom*0.2+1"} - {signal: "zoom*zoom*10+1"} - ] + calculate: datum[2].value + as: bytes } ] - marks: [ + layer: [ { - type: group - from: { - facet: { - name: facetedDatum - data: selectedDatum - field: distinations.buckets - } + mark: { + type: line } - data: [ - { - name: facetDatumElems - source: facetedDatum - transform: [ - { - type: geopoint - projection: projection - fields: [ - destLocation.hits.hits[0]._source.DestLocation.lon - destLocation.hits.hits[0]._source.DestLocation.lat - ] - } - {type: "formula", expr: "{x:parent.x, y:parent.y}", as: "source"} - {type: "formula", expr: "{x:datum.x, y:datum.y}", as: "target"} - {type: "linkpath", shape: "diagonal"} - ] - } - ] - scales: [ - { - name: lineThickness - type: log - clamp: true - range: [1, 8] - } - { - name: lineOpacity - type: log - clamp: true - range: [0.2, 0.8] - } - ] - marks: [ - { - from: {data: "facetDatumElems"} - type: path - interactive: false - encode: { - update: { - path: {field: "path"} - stroke: {value: "black"} - strokeWidth: {scale: "lineThickness", field: "doc_count"} - strokeOpacity: {scale: "lineOpacity", field: "doc_count"} - } - } - } - ] } { - name: airport - type: symbol - from: {data: "table"} - encode: { - update: { - size: {scale: "airportSize", field: "doc_count"} - xc: {signal: "datum.x"} - yc: {signal: "datum.y"} - tooltip: { - signal: "{title: datum.originLocation.hits.hits[0]._source.Origin + ' (' + datum.key + ')', connnections: length(datum.distinations.buckets), flights: datum.doc_count}" - } - } + mark: { + type: circle + 
tooltip: true } } ] + encoding: { + x: { + field: timestamp + type: temporal + axis: { + title: @timestamp + } + } + y: { + field: bytes + type: quantitative + axis: { + title: Average bytes + } + } + color: { + datum: Average bytes + type: nominal + } + } } ``` {% include copy-curl.html %} + +## Additional resources + +The following resources provide additional information about Vega visualizations in OpenSearch Dashboards: + +- [Improving ease of use in OpenSearch Dashboards with Vega visualizations](https://opensearch.org/blog/Improving-Dashboards-usability-with-Vega/) diff --git a/_dashboards/visualize/visbuilder.md b/_dashboards/visualize/visbuilder.md index de4dfb1666..2b6818a00e 100644 --- a/_dashboards/visualize/visbuilder.md +++ b/_dashboards/visualize/visbuilder.md @@ -1,13 +1,13 @@ --- layout: default -title: Using VisBuilder +title: VisBuilder parent: Building data visualizations nav_order: 100 redirect_from: - /dashboards/drag-drop-wizard/ --- -# Using VisBuilder +# VisBuilder You can use the VisBuilder visualization type in OpenSearch Dashboards to create data visualizations by using a drag-and-drop gesture. With VisBuilder you have: @@ -19,7 +19,7 @@ You can use the VisBuilder visualization type in OpenSearch Dashboards to create ## Try VisBuilder in the OpenSearch Dashboards playground -If you'd like to try out VisBuilder without installing OpenSearch locally, you can do so in the [Dashboards playground](https://playground.opensearch.org/app/vis-builder#/). +You can try VisBuilder without installing OpenSearch locally by using [OpenSearch Dashboards Playground](https://playground.opensearch.org/app/vis-builder#/). VisBuilder is enabled by default. ## Try VisBuilder locally @@ -27,7 +27,7 @@ Follow these steps to create a new visualization using VisBuilder in your enviro 1. Open Dashboards: - If you're not running the Security plugin, go to http://localhost:5601. - - If you're running the Security plugin, go to https://localhost:5601 and log in with your username and password (default is admin/admin). + - If you're running the Security plugin, go to https://localhost:5601 and log in with your username and password (default is `admin/admin`). 1. From the top menu, select **Visualize > Create visualization > VisBuilder**. @@ -37,4 +37,4 @@ Follow these steps to create a new visualization using VisBuilder in your enviro Here’s an example visualization. Your visualization will look different depending on your data and the fields you select. -Visualization generated using sample data \ No newline at end of file +Visualization generated using sample data diff --git a/_data-prepper/common-use-cases/metrics-logs.md b/_data-prepper/common-use-cases/metrics-logs.md new file mode 100644 index 0000000000..3fda8597c7 --- /dev/null +++ b/_data-prepper/common-use-cases/metrics-logs.md @@ -0,0 +1,70 @@ +--- +layout: default +title: Deriving metrics from logs +parent: Common use cases +nav_order: 15 +--- + +# Deriving metrics from logs + +You can use Data Prepper to derive metrics from logs. + +The following example pipeline receives incoming logs using the [`http` source plugin]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/http-source) and the [`grok` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/grok/). 
It then uses the [`aggregate` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/aggregate/) to extract the metric bytes aggregated during a 30-second window and derives histograms from the results. + +This pipeline writes data to two different OpenSearch indexes: + +- `logs`: This index stores the original, un-aggregated log events after being processed by the `grok` processor. +- `histogram_metrics`: This index stores the derived histogram metrics extracted from the log events using the `aggregate` processor. + +The pipeline contains two sub-pipelines: + +- `apache-log-pipeline-with-metrics`: Receives logs through an HTTP client like FluentBit, using `grok` to extract important values from the logs by matching the value in the log key against the [Apache Common Log Format](https://httpd.apache.org/docs/2.4/logs.html#accesslog). It then forwards the grokked logs to two destinations: + + - An OpenSearch index named `logs` to store the original log events. + - The `log-to-metrics-pipeline` for further aggregation and metric derivation. + +- `log-to-metrics-pipeline`: Receives the grokked logs from the `apache-log-pipeline-with-metrics` pipeline, aggregates the logs, and derives histogram metrics of bytes based on the values in the `clientip` and `request` keys. Finally, it sends the derived histogram metrics to an OpenSearch index named `histogram_metrics`. + +#### Example pipeline + +```json +apache-log-pipeline-with-metrics: + source: + http: + # Provide the path for ingestion. ${pipelineName} will be replaced with pipeline name configured for this pipeline. + # In this case it would be "/apache-log-pipeline-with-metrics/logs". This will be the FluentBit output URI value. + path: "/${pipelineName}/logs" + processor: + - grok: + match: + log: [ "%{COMMONAPACHELOG_DATATYPED}" ] + sink: + - opensearch: + ... + index: "logs" + - pipeline: + name: "log-to-metrics-pipeline" + +log-to-metrics-pipeline: + source: + pipeline: + name: "apache-log-pipeline-with-metrics" + processor: + - aggregate: + # Specify the required identification keys + identification_keys: ["clientip", "request"] + action: + histogram: + # Specify the appropriate values for each of the following fields + key: "bytes" + record_minmax: true + units: "bytes" + buckets: [0, 25000000, 50000000, 75000000, 100000000] + # Pick the required aggregation period + group_duration: "30s" + sink: + - opensearch: + ... + index: "histogram_metrics" +``` +{% include copy-curl.html %} diff --git a/_data-prepper/common-use-cases/s3-logs.md b/_data-prepper/common-use-cases/s3-logs.md index 7986a7eef8..8d5a9ce967 100644 --- a/_data-prepper/common-use-cases/s3-logs.md +++ b/_data-prepper/common-use-cases/s3-logs.md @@ -9,7 +9,6 @@ nav_order: 40 Data Prepper allows you to load logs from [Amazon Simple Storage Service](https://aws.amazon.com/s3/) (Amazon S3), including traditional logs, JSON documents, and CSV logs. - ## Architecture Data Prepper can read objects from S3 buckets using an [Amazon Simple Queue Service (SQS)](https://aws.amazon.com/sqs/) (Amazon SQS) queue and [Amazon S3 Event Notifications](https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html). @@ -20,7 +19,7 @@ The following diagram shows the overall architecture of the components involved. S3 source architecture{: .img-fluid} -The flow of data is as follows. +The component data flow is as follows: 1. A system produces logs into the S3 bucket. 2. S3 creates an S3 event notification in the SQS queue. 
@@ -28,7 +27,6 @@ The flow of data is as follows.
 4. Data Prepper downloads the content from the S3 object.
 5. Data Prepper sends a document to OpenSearch for the content in the S3 object.
-
 ## Pipeline overview
 Data Prepper supports reading data from S3 using the [`s3` source]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/s3/).
@@ -44,7 +42,6 @@ Before Data Prepper can read log data from S3, you need the following prerequisi
 - An S3 bucket.
 - A log producer that writes logs to S3. The exact log producer will vary depending on your specific use case, but could include writing logs to S3 or a service such as Amazon CloudWatch.
-
 ## Getting started
 Use the following steps to begin loading logs from S3 with Data Prepper.
@@ -57,8 +54,7 @@ Use the following steps to begin loading logs from S3 with Data Prepper.
 ### Setting permissions for Data Prepper
-To view S3 logs, Data Prepper needs access to Amazon SQS and S3.
-Use the following example to set up permissions:
+To view S3 logs, Data Prepper needs access to Amazon SQS and S3. Use the following example to set up permissions:
 ```json
 {
@@ -88,12 +84,13 @@ Use the following example to set up permissions:
   ]
 }
 ```
+{% include copy-curl.html %}
 If your S3 objects or SQS queues do not use KMS, you can remove the `kms:Decrypt` permission.
 ### SQS dead-letter queue
-The are two options for how to handle errors resulting from processing S3 objects.
+The following two options can be used to handle S3 object processing errors:
 - Use an SQS dead-letter queue (DLQ) to track the failure. This is the recommended approach.
 - Delete the message from SQS. You must manually find the S3 object and correct the error.
@@ -104,8 +101,8 @@ The following diagram shows the system architecture when using SQS with DLQ.
 To use an SQS dead-letter queue, perform the following steps:
-1. Create a new SQS standard queue to act as your DLQ.
-2. Configure your SQS's redrive policy [to use your DLQ](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-configure-dead-letter-queue.html). Consider using a low value such as 2 or 3 for the "Maximum Receives" setting.
+1. Create a new SQS standard queue to act as the DLQ.
+2. Configure your SQS redrive policy [to use the DLQ](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-configure-dead-letter-queue.html). Consider using a low value such as 2 or 3 for the **Maximum Receives** setting.
 3. Configure the Data Prepper `s3` source to use `retain_messages` for `on_error`. This is the default behavior.
 ## Pipeline design
@@ -125,6 +122,7 @@ s3-log-pipeline:
         queue_url: "arn:aws:sqs::<123456789012>:"
       visibility_timeout: "2m"
 ```
+{% include copy-curl.html %}
 Configure the following options according to your use case:
@@ -164,10 +162,11 @@ s3-log-pipeline:
       password: "admin"
       index: s3_logs
 ```
+{% include copy-curl.html %}
 ## Multiple Data Prepper pipelines
-We recommend that you have one SQS queue per Data Prepper pipeline. In addition, you can have multiple nodes in the same cluster reading from the same SQS queue, which doesn't require additional configuration with Data Prepper.
+It is recommended that you have one SQS queue per Data Prepper pipeline. In addition, you can have multiple nodes in the same cluster reading from the same SQS queue, which doesn't require additional Data Prepper configuration.
 If you have multiple pipelines, you must create multiple SQS queues for each pipeline, even if both pipelines use the same S3 bucket.
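For illustration, the following minimal sketch shows two pipelines that read from the same S3 bucket while each polls its own dedicated SQS queue. The pipeline names, queue URLs, hosts, and index names are hypothetical placeholders, and the source and sink options are abbreviated; adapt them to your environment:

```yaml
# Sketch only: two Data Prepper pipelines, each consuming from its own SQS queue.
app-log-pipeline:
  source:
    s3:
      notification_type: "sqs"
      codec:
        newline:
      sqs:
        # Queue that receives S3 event notifications for application logs
        queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/app-log-queue"
  sink:
    - opensearch:
        hosts: ["https://localhost:9200"]
        index: app_logs

audit-log-pipeline:
  source:
    s3:
      notification_type: "sqs"
      codec:
        newline:
      sqs:
        # Separate queue for the second pipeline, even though the S3 bucket is the same
        queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/audit-log-queue"
  sink:
    - opensearch:
        hosts: ["https://localhost:9200"]
        index: audit_logs
```
{% include copy.html %}

Both sources can receive notifications originating from the same bucket; the important point is that each pipeline consumes from its own queue so that the two pipelines do not compete for the same messages.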
@@ -175,6 +174,55 @@ If you have multiple pipelines, you must create multiple SQS queues for each pip To meet the scale of logs produced by S3, some users require multiple SQS queues for their logs. You can use [Amazon Simple Notification Service](https://docs.aws.amazon.com/sns/latest/dg/welcome.html) (Amazon SNS) to route event notifications from S3 to an SQS [fanout pattern](https://docs.aws.amazon.com/sns/latest/dg/sns-common-scenarios.html). Using SNS, all S3 event notifications are sent directly to a single SNS topic, where you can subscribe to multiple SQS queues. -To make sure that Data Prepper can directly parse the event from the SNS topic, configure [raw message delivery](https://docs.aws.amazon.com/sns/latest/dg/sns-large-payload-raw-message-delivery.html) on the SNS to SQS subscription. Setting this option will not affect other SQS queues that are subscribed to that SNS topic. +To make sure that Data Prepper can directly parse the event from the SNS topic, configure [raw message delivery](https://docs.aws.amazon.com/sns/latest/dg/sns-large-payload-raw-message-delivery.html) on the SNS-to-SQS subscription. Applying this option does not affect other SQS queues subscribed to the SNS topic. + +## Filtering and retrieving data using Amazon S3 Select + +If a pipeline uses an S3 source, you can use SQL expressions to perform filtering and computations on the contents of S3 objects before ingesting them into the pipeline. +The `s3_select` option supports objects in the [Parquet File Format](https://parquet.apache.org/docs/). It also works with objects that are compressed with GZIP or BZIP2 (for CSV and JSON objects only) and supports columnar compression for the Parquet File Format using GZIP and Snappy. +Refer to [Filtering and retrieving data using Amazon S3 Select](https://docs.aws.amazon.com/AmazonS3/latest/userguide/selecting-content-from-objects.html) and [SQL reference for Amazon S3 Select](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-select-sql-reference.html) for comprehensive information about using Amazon S3 Select. +{: .note} + +The following example pipeline retrieves all data from S3 objects encoded in the Parquet File Format: + +```json +pipeline: + source: + s3: + s3_select: + expression: "select * from s3object s" + input_serialization: parquet + notification_type: "sqs" +... +``` +{% include copy-curl.html %} + +The following example pipeline retrieves only the first 10,000 records in the objects: + +```json +pipeline: + source: + s3: + s3_select: + expression: "select * from s3object s LIMIT 10000" + input_serialization: parquet + notification_type: "sqs" +... +``` +{% include copy-curl.html %} + +The following example pipeline retrieves records from S3 objects that have a `data_value` in the given range of 200--500: + +```json +pipeline: + source: + s3: + s3_select: + expression: "select s.* from s3object s where s.data_value > 200 and s.data_value < 500 " + input_serialization: parquet + notification_type: "sqs" +... +``` +{% include copy-curl.html %} diff --git a/_data-prepper/managing-data-prepper/configuring-data-prepper.md b/_data-prepper/managing-data-prepper/configuring-data-prepper.md index d6750daba4..d890b741cc 100644 --- a/_data-prepper/managing-data-prepper/configuring-data-prepper.md +++ b/_data-prepper/managing-data-prepper/configuring-data-prepper.md @@ -65,9 +65,9 @@ Option | Required | Type | Description ssl | No | Boolean | Enables TLS/SSL. Default is `true`. 
ssl_certificate_file | Conditionally | String | The SSL certificate chain file path or AWS S3 path. S3 path example `s3:///`. Required if `ssl` is true and `use_acm_certificate_for_ssl` is false. Defaults to `config/default_certificate.pem` which is the default certificate file. Read more about how the certificate file is generated [here](https://github.com/opensearch-project/data-prepper/tree/main/examples/certificates).
ssl_key_file | Conditionally | String | The SSL key file path or AWS S3 path. S3 path example `s3:///`. Required if `ssl` is true and `use_acm_certificate_for_ssl` is false. Defaults to `config/default_private_key.pem` which is the default private key file. Read more about how the default private key file is generated [here](https://github.com/opensearch-project/data-prepper/tree/main/examples/certificates).
-ssl_insecure_disable_verification | No | Boolean | Disables the verification of server's TLS certificate chain. Default is false.
-ssl_fingerprint_verification_only | No | Boolean | Disables the verification of server's TLS certificate chain and instead verifies only the certificate fingerprint. Default is false.
-use_acm_certificate_for_ssl | No | Boolean | Enables TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is false.
+ssl_insecure_disable_verification | No | Boolean | Disables the verification of the server's TLS certificate chain. Default is `false`.
+ssl_fingerprint_verification_only | No | Boolean | Disables the verification of the server's TLS certificate chain and instead verifies only the certificate fingerprint. Default is `false`.
+use_acm_certificate_for_ssl | No | Boolean | Enables TLS/SSL using a certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
acm_certificate_arn | Conditionally | String | The ACM certificate ARN. The ACM certificate takes preference over S3 or a local file system certificate. Required if `use_acm_certificate_for_ssl` is set to true.
acm_private_key_password | No | String | The ACM private key password that decrypts the private key. If not provided, Data Prepper generates a random password.
acm_certificate_timeout_millis | No | Integer | The timeout in milliseconds for ACM to get certificates. Default is 120000.
diff --git a/_data-prepper/pipelines/configuration/processors/add-entries.md index d28f2d8f6f..26b95c7b64 100644
--- a/_data-prepper/pipelines/configuration/processors/add-entries.md
+++ b/_data-prepper/pipelines/configuration/processors/add-entries.md
@@ -10,55 +10,215 @@ nav_order: 40
 The `add_entries` processor adds entries to an event.
-### Configuration
+## Configuration
 You can configure the `add_entries` processor with the following options.
 | Option | Required | Description |
 | :--- | :--- | :--- |
 | `entries` | Yes | A list of entries to add to an event. |
-| `key` | Yes | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. |
-| `metadata_key` | Yes | The key for the new metadata attribute. The argument must be a literal string key and not a JSON Pointer. Either one string key or `metadata_key` is required. |
+| `key` | No | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. The key can also be a format expression, for example, `${/key1}` to use the value of the field `key1` as the key. |
+| `metadata_key` | No | The key for the new metadata attribute.
The argument must be a literal string key and not a JSON Pointer. Either one string key or `metadata_key` is required. | +| `value` | No | The value of the new entry to be added, which can be used with any of the following data types: strings, Booleans, numbers, null, nested objects, and arrays. | | `format` | No | A format string to use as the value of the new entry, for example, `${key1}-${key2}`, where `key1` and `key2` are existing keys in the event. Required if neither `value` nor `value_expression` is specified. | | `value_expression` | No | An expression string to use as the value of the new entry. For example, `/key` is an existing key in the event with a type of either a number, a string, or a Boolean. Expressions can also contain functions returning number/string/integer. For example, `length(/key)` will return the length of the key in the event when the key is a string. For more information about keys, see [Expression syntax](https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/). | | `add_when` | No | A [conditional expression](https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/), such as `/some-key == "test"'`, that will be evaluated to determine whether the processor will be run on the event. | -| `value` | Yes | The value of the new entry to be added. You can use the following data types: strings, Booleans, numbers, null, nested objects, and arrays. | | `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. | +| `append_if_key_exists` | No | When set to `true`, the existing value will be appended if a `key` already exists in the event. An array will be created if the existing value is not an array. Default is `false`. | -### Usage -To get started, create the following `pipeline.yaml` file: +## Usage + +The following examples show how the `add_entries` processor can be used in different cases. + +### Example: Add entries with simple values + +The following example shows you how to configure the processor to add entries with simple values: ```yaml -pipeline: - source: - ... - .... +... processor: - add_entries: entries: - - key: "newMessage" - value: 3 - overwrite_if_key_exists: true - - metadata_key: myMetadataKey - value_expression: 'length("newMessage")' - add_when: '/some_key == "test"' - sink: + - key: "name" + value: "John" + - key: "age" + value: 20 +... ``` {% include copy.html %} +When the input event contains the following data: + +```json +{"message": "hello"} +``` -For example, when your source contains the following event record: +The processed event will contain the following data: + +```json +{"message": "hello", "name": "John", "age": 20} +``` + +### Example: Add entries using format strings + +The following example shows you how to configure the processor to add entries with values from other fields: + +```yaml +... + processor: + - add_entries: + entries: + - key: "date" + format: "${month}-${day}" +... +``` +{% include copy.html %} + +When the input event contains the following data: + +```json +{"month": "Dec", "day": 1} +``` + +The processed event will contain the following data: + +```json +{"month": "Dec", "day": 1, "date": "Dec-1"} +``` + +### Example: Add entries using value expressions + +The following example shows you how to configure the processor to use the `value_expression` option: + +```yaml +... + processor: + - add_entries: + entries: + - key: "length" + value_expression: "length(/message)" +... 
+``` +{% include copy.html %} + +When the input event contains the following data: ```json {"message": "hello"} ``` -And then you run the `add_entries` processor using the example pipeline, it adds a new entry, `{"newMessage": 3}`, to the existing event, `{"message": "hello"}`, so that the new event contains two entries in the final output: +The processed event will contain the following data: + +```json +{"message": "hello", "length": 5} +``` + +### Example: Add metadata + +The following example shows you how to configure the processor to add metadata to events: + +```yaml +... + processor: + - add_entries: + entries: + - metadata_key: "length" + value_expression: "length(/message)" +... +``` +{% include copy.html %} + +When the input event contains the following data: ```json -{"message": "hello", "newMessage": 3} +{"message": "hello"} ``` -If `newMessage` already exists, its existing value is overwritten with a value of `3`. +The processed event will have the same data, with the metadata, `{"length": 5}`, attached. You can subsequently use expressions like `getMetadata("length")` in the pipeline. For more information, see the [`getMetadata` function](https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/#getmetadata) documentation. + + +### Example: Add a dynamic key +The following example shows you how to configure the processor to add metadata to events using a dynamic key: + +```yaml +... + processor: + - add_entries: + entries: + - key: "${/param_name}" + value_expression: "/param_value" +... +``` +{% include copy.html %} + +When the input event contains the following data: + +```json +{"param_name": "cpu", "param_value": 50} +``` + +The processed event will contain the following data: + +```json +{"param_name": "cpu", "param_value": 50, "cpu": 50} +``` + +### Example: Overwrite existing entries + +The following example shows you how to configure the processor to overwrite existing entries: + +```yaml +... + processor: + - add_entries: + entries: + - key: "message" + value: "bye" + overwrite_if_key_exists: true +... +``` +{% include copy.html %} + +When the input event contains the following data: + +```json +{"message": "hello"} +``` + +The processed event will contain the following data: + +```json +{"message": "bye"} +``` + +If `overwrite_if_key_exists` is not set to `true`, then the input event will not be changed after processing. + +### Example: Append values to existing entries + +The following example shows you how to configure the processor to append values to existing entries: + +```yaml +... + processor: + - add_entries: + entries: + - key: "message" + value: "world" + append_if_key_exists: true +... +``` +{% include copy.html %} + +When the input event contains the following data: + +```json +{"message": "hello"} +``` + +The processed event will contain the following data: + +```json +{"message": ["hello", "world"]} +``` diff --git a/_data-prepper/pipelines/configuration/processors/date.md b/_data-prepper/pipelines/configuration/processors/date.md index 7ac1040c26..c44a10ba16 100644 --- a/_data-prepper/pipelines/configuration/processors/date.md +++ b/_data-prepper/pipelines/configuration/processors/date.md @@ -15,7 +15,7 @@ The `date` processor adds a default timestamp to an event, parses timestamp fiel The following table describes the options you can use to configure the `date` processor. - + Option | Required | Type | Description :--- | :--- | :--- | :--- `match` | Conditionally | [Match](#Match) | The date match configuration. 
This option cannot be defined at the same time as `from_time_received`. There is no default value. @@ -27,7 +27,7 @@ Option | Required | Type | Description `source_timezone` | No | String | The time zone used to parse dates, including when the zone or offset cannot be extracted from the value. If the zone or offset are part of the value, then the time zone is ignored. A list of all the available time zones is contained in the **TZ database name** column of [the list of database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List). `destination_timezone` | No | String | The time zone used for storing the timestamp in the `destination` field. A list of all the available time zones is contained in the **TZ database name** column of [the list of database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List). `locale` | No | String | The location used for parsing dates. Commonly used for parsing month names (`MMM`). The value can contain language, country, or variant fields in IETF BCP 47, such as `en-US`, or a string representation of the [locale](https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html) object, such as `en_US`. A full list of locale fields, including language, country, and variant, can be found in [the language subtag registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry). Default is `Locale.ROOT`. - + ### Match diff --git a/_data-prepper/pipelines/configuration/processors/key-value.md b/_data-prepper/pipelines/configuration/processors/key-value.md index aedc1f8822..52ecc7719c 100644 --- a/_data-prepper/pipelines/configuration/processors/key-value.md +++ b/_data-prepper/pipelines/configuration/processors/key-value.md @@ -11,40 +11,32 @@ nav_order: 56 You can use the `key_value` processor to parse the specified field into key-value pairs. You can customize the `key_value` processor to parse field information with the following options. The type for each of the following options is `string`. -| Option | Description | Example | -| :--- | :--- | :--- | -| source | The message field to be parsed. Optional. Default value is `message`. | If `source` is `"message1"`, `{"message1": {"key1=value1"}, "message2": {"key2=value2"}}` parses into `{"message1": {"key1=value1"}, "message2": {"key2=value2"}, "parsed_message": {"key1": "value1"}}`. | -| destination | The destination field for the parsed source. The parsed source overwrites the preexisting data for that key. Optional. If `destination` is set to `null`, the parsed fields will be written to the root of the event. Default value is `parsed_message`. | If `destination` is `"parsed_data"`, `{"message": {"key1=value1"}}` parses into `{"message": {"key1=value1"}, "parsed_data": {"key1": "value1"}}`. | -| field_delimiter_regex | A regular expression specifying the delimiter that separates key-value pairs. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be defined at the same time as `field_split_characters`. Optional. If this option is not defined, `field_split_characters` is used. | If `field_delimiter_regex` is `"&\\{2\\}"`, `{"key1=value1&&key2=value2"}` parses into `{"key1": "value1", "key2": "value2"}`. | -| field_split_characters | A string of characters specifying the delimeter that separates key-value pairs. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be defined at the same time as `field_delimiter_regex`. Optional. Default value is `&`. 
| If `field_split_characters` is `"&&"`, `{"key1=value1&&key2=value2"}` parses into `{"key1": "value1", "key2": "value2"}`. | -| key_value_delimiter_regex | A regular expression specifying the delimiter that separates the key and value within a key-value pair. Special regular expression characters such as `[` and `]` must be escaped with `\\`. This option cannot be defined at the same time as `value_split_characters`. Optional. If this option is not defined, `value_split_characters` is used. | If `key_value_delimiter_regex` is `"=\\{2\\}"`, `{"key1==value1"}` parses into `{"key1": "value1"}`. | -| value_split_characters | A string of characters specifying the delimiter that separates the key and value within a key-value pair. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be defined at the same time as `key_value_delimiter_regex`. Optional. Default value is `=`. | If `value_split_characters` is `"=="`, `{"key1==value1"}` parses into `{"key1": "value1"}`. | -| non_match_value | When a key-value pair cannot be successfully split, the key-value pair is placed in the `key` field, and the specified value is placed in the `value` field. Optional. Default value is `null`. | `key1value1&key2=value2` parses into `{"key1value1": null, "key2": "value2"}`. | -| prefix | A prefix to append before all keys. Optional. Default value is an empty string. | If `prefix` is `"custom"`, `{"key1=value1"}` parses into `{"customkey1": "value1"}`.| -| delete_key_regex | A regular expression specifying the characters to delete from the key. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be an empty string. Optional. No default value. | If `delete_key_regex` is `"\s"`, `{"key1 =value1"}` parses into `{"key1": "value1"}`. | -| delete_value_regex | A regular expression specifying the characters to delete from the value. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be an empty string. Optional. No default value. | If `delete_value_regex` is `"\s"`, `{"key1=value1 "}` parses into `{"key1": "value1"}`. | -| include_keys | An array specifying the keys that should be added for parsing. By default, all keys will be added. | If `include_keys` is `["key2"]`,`key1=value1&key2=value2` will parse into `{"key2": "value2"}`. | -| exclude_keys | An array specifying the parsed keys that should not be added to the event. By default, no keys will be excluded. | If `exclude_keys` is `["key2"]`, `key1=value1&key2=value2` will parse into `{"key1": "value1"}`. | -| default_values | A map specifying the default keys and their values that should be added to the event in case these keys do not exist in the source field being parsed. If the default key already exists in the message, the value is not changed. The `include_keys` filter will be applied to the message before `default_values`. | If `default_values` is `{"defaultkey": "defaultvalue"}`, `key1=value1` will parse into `{"key1": "value1", "defaultkey": "defaultvalue"}`.
If `default_values` is `{"key1": "abc"}`, `key1=value1` will parse into `{"key1": "value1"}`.
If `include_keys` is `["key1"]` and `default_values` is `{"key2": "value2"}`, `key1=value1&key2=abc` will parse into `{"key1": "value1", "key2": "value2"}`. | -| transform_key | When to lowercase, uppercase, or capitalize keys. | If `transform_key` is `lowercase`, `{"Key1=value1"}` will parse into `{"key1": "value1"}`.
If `transform_key` is `uppercase`, `{"key1=value1"}` will parse into `{"KEY1": "value1"}`.
If `transform_key` is `capitalize`, `{"key1=value1"}` will parse into `{"Key1": "value1"}`. | -| whitespace | Specifies whether to be lenient or strict with the acceptance of unnecessary white space surrounding the configured value-split sequence. Default is `lenient`. | If `whitespace` is `"lenient"`, `{"key1 = value1"}` will parse into `{"key1 ": " value1"}`. If `whitespace` is `"strict"`, `{"key1 = value1"}` will parse into `{"key1": "value1"}`. | -| skip_duplicate_values | A Boolean option for removing duplicate key-value pairs. When set to `true`, only one unique key-value pair will be preserved. Default is `false`. | If `skip_duplicate_values` is `false`, `{"key1=value1&key1=value1"}` will parse into `{"key1": ["value1", "value1"]}`. If `skip_duplicate_values` is `true`, `{"key1=value1&key1=value1"}` will parse into `{"key1": "value1"}`. | -| remove_brackets | Specifies whether to treat square brackets, angle brackets, and parentheses as value "wrappers" that should be removed from the value. Default is `false`. | If `remove_brackets` is `true`, `{"key1=(value1)"}` will parse into `{"key1": value1}`. If `remove_brackets` is `false`, `{"key1=(value1)"}` will parse into `{"key1": "(value1)"}`. | -| recursive | Specifies whether to recursively obtain additional key-value pairs from values. The extra key-value pairs will be stored as sub-keys of the root key. Default is `false`. The levels of recursive parsing must be defined by different brackets for each level: `[]`, `()`, and `<>`, in this order. Any other configurations specified will only be applied to the outmost keys.
When `recursive` is `true`:
`remove_brackets` cannot also be `true`;
`skip_duplicate_values` will always be `true`;
`whitespace` will always be `"strict"`. | If `recursive` is true, `{"item1=[item1-subitem1=item1-subitem1-value&item1-subitem2=(item1-subitem2-subitem2A=item1-subitem2-subitem2A-value&item1-subitem2-subitem2B=item1-subitem2-subitem2B-value)]&item2=item2-value"}` will parse into `{"item1": {"item1-subitem1": "item1-subitem1-value", "item1-subitem2" {"item1-subitem2-subitem2A": "item1-subitem2-subitem2A-value", "item1-subitem2-subitem2B": "item1-subitem2-subitem2B-value"}}}`. | -| overwrite_if_destination_exists | Specifies whether to overwrite existing fields if there are key conflicts when writing parsed fields to the event. Default is `true`. | If `overwrite_if_destination_exists` is `true` and destination is `null`, `{"key1": "old_value", "message": "key1=new_value"}` will parse into `{"key1": "new_value", "message": "key1=new_value"}`. | -| tags_on_failure | When a `kv` operation causes a runtime exception within the processor, the operation is safely stopped without crashing the processor, and the event is tagged with the provided tags. | If `tags_on_failure` is set to `["keyvalueprocessor_failure"]`, `{"tags": ["keyvalueprocessor_failure"]}` will be added to the event's metadata in the event of a runtime exception. | -| value_grouping | Specifies whether to group values using predefined value grouping delimiters: `{...}`, `[...]', `<...>`, `(...)`, `"..."`, `'...'`, `http://... (space)`, and `https:// (space)`. If this flag is enabled, then the content between the delimiters is considered to be one entity and is not parsed for key-value pairs. Default is `false`. If `value_grouping` is `true`, then `{"key1=[a=b,c=d]&key2=value2"}` parses to `{"key1": "[a=b,c=d]", "key2": "value2"}`. | -| drop_keys_with_no_value | Specifies whether keys should be dropped if they have a null value. Default is `false`. If `drop_keys_with_no_value` is set to `true`, then `{"key1=value1&key2"}` parses to `{"key1": "value1"}`. | -| strict_grouping | Specifies whether strict grouping should be enabled when the `value_grouping` or `string_literal_character` options are used. Default is `false`. | When enabled, groups with unmatched end characters yield errors. The event is ignored after the errors are logged. | -| string_literal_character | Can be set to either a single quotation mark (`'`) or a double quotation mark (`"`). Default is `null`. | When this option is used, any text contained within the specified quotation mark character will be ignored and excluded from key-value parsing. For example, `text1 "key1=value1" text2 key2=value2` would parse to `{"key2": "value2"}`. | -| key_value_when | Allows you to specify a [conditional expression](https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/), such as `/some-key == "test"`, that will be evaluated to determine whether the processor should be applied to the event. | +Option | Description | Example +:--- | :--- | :--- +`source` | The message field to be parsed. Optional. Default value is `message`. | If `source` is `"message1"`, `{"message1": {"key1=value1"}, "message2": {"key2=value2"}}` parses into `{"message1": {"key1=value1"}, "message2": {"key2=value2"}, "parsed_message": {"key1": "value1"}}`. +destination | The destination field for the parsed source. The parsed source overwrites the preexisting data for that key. Optional. If `destination` is set to `null`, the parsed fields will be written to the root of the event. Default value is `parsed_message`. 
| If `destination` is `"parsed_data"`, `{"message": {"key1=value1"}}` parses into `{"message": {"key1=value1"}, "parsed_data": {"key1": "value1"}}`.
+`field_delimiter_regex` | A regular expression specifying the delimiter that separates key-value pairs. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be defined at the same time as `field_split_characters`. Optional. If this option is not defined, `field_split_characters` is used. | If `field_delimiter_regex` is `"&\\{2\\}"`, `{"key1=value1&&key2=value2"}` parses into `{"key1": "value1", "key2": "value2"}`.
+`field_split_characters` | A string of characters specifying the delimiter that separates key-value pairs. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be defined at the same time as `field_delimiter_regex`. Optional. Default value is `&`. | If `field_split_characters` is `"&&"`, `{"key1=value1&&key2=value2"}` parses into `{"key1": "value1", "key2": "value2"}`.
+`key_value_delimiter_regex` | A regular expression specifying the delimiter that separates the key and value within a key-value pair. Special regular expression characters such as `[` and `]` must be escaped with `\\`. This option cannot be defined at the same time as `value_split_characters`. Optional. If this option is not defined, `value_split_characters` is used. | If `key_value_delimiter_regex` is `"=\\{2\\}"`, `{"key1==value1"}` parses into `{"key1": "value1"}`.
+`value_split_characters` | A string of characters specifying the delimiter that separates the key and value within a key-value pair. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be defined at the same time as `key_value_delimiter_regex`. Optional. Default value is `=`. | If `value_split_characters` is `"=="`, `{"key1==value1"}` parses into `{"key1": "value1"}`.
+`non_match_value` | When a key-value pair cannot be successfully split, the key-value pair is placed in the `key` field, and the specified value is placed in the `value` field. Optional. Default value is `null`. | `key1value1&key2=value2` parses into `{"key1value1": null, "key2": "value2"}`.
+`prefix` | A prefix to append before all keys. Optional. Default value is an empty string. | If `prefix` is `"custom"`, `{"key1=value1"}` parses into `{"customkey1": "value1"}`.
+`delete_key_regex` | A regular expression specifying the characters to delete from the key. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be an empty string. Optional. No default value. | If `delete_key_regex` is `"\s"`, `{"key1 =value1"}` parses into `{"key1": "value1"}`.
+`delete_value_regex` | A regular expression specifying the characters to delete from the value. Special regular expression characters such as `[` and `]` must be escaped with `\\`. Cannot be an empty string. Optional. No default value. | If `delete_value_regex` is `"\s"`, `{"key1=value1 "}` parses into `{"key1": "value1"}`.
+`include_keys` | An array specifying the keys that should be added for parsing. By default, all keys will be added. | If `include_keys` is `["key2"]`, `key1=value1&key2=value2` will parse into `{"key2": "value2"}`.
+`exclude_keys` | An array specifying the parsed keys that should not be added to the event. By default, no keys will be excluded. | If `exclude_keys` is `["key2"]`, `key1=value1&key2=value2` will parse into `{"key1": "value1"}`.
+`default_values` | A map specifying the default keys and their values that should be added to the event in case these keys do not exist in the source field being parsed. If the default key already exists in the message, the value is not changed. The `include_keys` filter will be applied to the message before `default_values`. | If `default_values` is `{"defaultkey": "defaultvalue"}`, `key1=value1` will parse into `{"key1": "value1", "defaultkey": "defaultvalue"}`.
If `default_values` is `{"key1": "abc"}`, `key1=value1` will parse into `{"key1": "value1"}`.
If `include_keys` is `["key1"]` and `default_values` is `{"key2": "value2"}`, `key1=value1&key2=abc` will parse into `{"key1": "value1", "key2": "value2"}`. +`transform_key` | When to lowercase, uppercase, or capitalize keys. | If `transform_key` is `lowercase`, `{"Key1=value1"}` will parse into `{"key1": "value1"}`.
If `transform_key` is `uppercase`, `{"key1=value1"}` will parse into `{"KEY1": "value1"}`.
If `transform_key` is `capitalize`, `{"key1=value1"}` will parse into `{"Key1": "value1"}`. +`whitespace` | Specifies whether to be lenient or strict with the acceptance of unnecessary white space surrounding the configured value-split sequence. Default is `lenient`. | If `whitespace` is `"lenient"`, `{"key1 = value1"}` will parse into `{"key1 ": " value1"}`. If `whitespace` is `"strict"`, `{"key1 = value1"}` will parse into `{"key1": "value1"}`. +`skip_duplicate_values` | A Boolean option for removing duplicate key-value pairs. When set to `true`, only one unique key-value pair will be preserved. Default is `false`. | If `skip_duplicate_values` is `false`, `{"key1=value1&key1=value1"}` will parse into `{"key1": ["value1", "value1"]}`. If `skip_duplicate_values` is `true`, `{"key1=value1&key1=value1"}` will parse into `{"key1": "value1"}`. +`remove_brackets` | Specifies whether to treat square brackets, angle brackets, and parentheses as value "wrappers" that should be removed from the value. Default is `false`. | If `remove_brackets` is `true`, `{"key1=(value1)"}` will parse into `{"key1": value1}`. If `remove_brackets` is `false`, `{"key1=(value1)"}` will parse into `{"key1": "(value1)"}`. +`recursive` | Specifies whether to recursively obtain additional key-value pairs from values. The extra key-value pairs will be stored as sub-keys of the root key. Default is `false`. The levels of recursive parsing must be defined by different brackets for each level: `[]`, `()`, and `<>`, in this order. Any other configurations specified will only be applied to the outmost keys.
When `recursive` is `true`:
`remove_brackets` cannot also be `true`;
`skip_duplicate_values` will always be `true`;
`whitespace` will always be `"strict"`. | If `recursive` is true, `{"item1=[item1-subitem1=item1-subitem1-value&item1-subitem2=(item1-subitem2-subitem2A=item1-subitem2-subitem2A-value&item1-subitem2-subitem2B=item1-subitem2-subitem2B-value)]&item2=item2-value"}` will parse into `{"item1": {"item1-subitem1": "item1-subitem1-value", "item1-subitem2" {"item1-subitem2-subitem2A": "item1-subitem2-subitem2A-value", "item1-subitem2-subitem2B": "item1-subitem2-subitem2B-value"}}}`. +`overwrite_if_destination_exists` | Specifies whether to overwrite existing fields if there are key conflicts when writing parsed fields to the event. Default is `true`. | If `overwrite_if_destination_exists` is `true` and destination is `null`, `{"key1": "old_value", "message": "key1=new_value"}` will parse into `{"key1": "new_value", "message": "key1=new_value"}`. +`tags_on_failure` | When a `kv` operation causes a runtime exception within the processor, the operation is safely stopped without crashing the processor, and the event is tagged with the provided tags. | If `tags_on_failure` is set to `["keyvalueprocessor_failure"]`, `{"tags": ["keyvalueprocessor_failure"]}` will be added to the event's metadata in the event of a runtime exception. +`value_grouping` | Specifies whether to group values using predefined value grouping delimiters: `{...}`, `[...]', `<...>`, `(...)`, `"..."`, `'...'`, `http://... (space)`, and `https:// (space)`. If this flag is enabled, then the content between the delimiters is considered to be one entity and is not parsed for key-value pairs. Default is `false`. If `value_grouping` is `true`, then `{"key1=[a=b,c=d]&key2=value2"}` parses to `{"key1": "[a=b,c=d]", "key2": "value2"}`. +`drop_keys_with_no_value` | Specifies whether keys should be dropped if they have a null value. Default is `false`. If `drop_keys_with_no_value` is set to `true`, then `{"key1=value1&key2"}` parses to `{"key1": "value1"}`. +`strict_grouping` | Specifies whether strict grouping should be enabled when the `value_grouping` or `string_literal_character` options are used. Default is `false`. | When enabled, groups with unmatched end characters yield errors. The event is ignored after the errors are logged. +`string_literal_character` | Can be set to either a single quotation mark (`'`) or a double quotation mark (`"`). Default is `null`. | When this option is used, any text contained within the specified quotation mark character will be ignored and excluded from key-value parsing. For example, `text1 "key1=value1" text2 key2=value2` would parse to `{"key2": "value2"}`. +`key_value_when` | Allows you to specify a [conditional expression](https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/), such as `/some-key == "test"`, that will be evaluated to determine whether the processor should be applied to the event. - - diff --git a/_data-prepper/pipelines/configuration/processors/write_json.md b/_data-prepper/pipelines/configuration/processors/write_json.md index 9e94176010..8f1e6851da 100644 --- a/_data-prepper/pipelines/configuration/processors/write_json.md +++ b/_data-prepper/pipelines/configuration/processors/write_json.md @@ -11,8 +11,8 @@ nav_order: 56 The `write_json` processor converts an object in an event into a JSON string. You can customize the processor to choose the source and target field names. -| Option | Description | Example | -| :--- | :--- | :--- | -| source | Mandatory field that specifies the name of the field in the event containing the message or object to be parsed. 
| If `source` is set to `"message"` and the input is `{"message": {"key1":"value1", "key2":{"key3":"value3"}}`, then the `write_json` processor generates `{"message": "{\"key1\":\"value`\", \"key2\":"{\"key3\":\"value3\"}"}"`. -| target | An optional field that specifies the name of the field in which the resulting JSON string should be stored. If `target` is not specified, then the `source` field is used. +Option | Description | Example +:--- | :--- | :--- +source | Mandatory field that specifies the name of the field in the event containing the message or object to be parsed. | If `source` is set to `"message"` and the input is `{"message": {"key1":"value1", "key2":{"key3":"value3"}}}`, then the `write_json` processor outputs the event as `"{\"key1\":\"value1\",\"key2\":{\"key3\":\"value3\"}}"`. +target | An optional field that specifies the name of the field in which the resulting JSON string should be stored. If `target` is not specified, then the `source` field is used. | `key1` diff --git a/_field-types/supported-field-types/index.md b/_field-types/supported-field-types/index.md index dbff7c30f2..7c7b7375f9 100644 --- a/_field-types/supported-field-types/index.md +++ b/_field-types/supported-field-types/index.md @@ -23,7 +23,7 @@ Boolean | [`boolean`]({{site.url}}{{site.baseurl}}/field-types/supported-field-t IP | [`ip`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/ip/): An IP address in IPv4 or IPv6 format. [Range]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/range/) | A range of values (`integer_range`, `long_range`, `double_range`, `float_range`, `date_range`, `ip_range`). [Object]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/object-fields/)| [`object`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/object/): A JSON object.
[`nested`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/nested/): Used when objects in an array need to be indexed independently as separate documents.
[`flat_object`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/flat-object/): A JSON object treated as a string.
[`join`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/join/): Establishes a parent-child relationship between documents in the same index. -[String]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/string/)|[`keyword`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/keyword/): Contains a string that is not analyzed.
[`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/): Contains a string that is analyzed.
[`match_only_text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/match-only-text/): A space-optimized version of a `text` field.
[`token_count`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/token-count/): Stores the number of analyzed tokens in a string.
[`wildcard`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/token-count/): A variation of `keyword` with efficient substring and regular expression matching. +[String]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/string/)|[`keyword`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/keyword/): Contains a string that is not analyzed.
[`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/): Contains a string that is analyzed.
[`match_only_text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/match-only-text/): A space-optimized version of a `text` field.
[`token_count`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/token-count/): Stores the number of analyzed tokens in a string.
[`wildcard`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/wildcard/): A variation of `keyword` with efficient substring and regular expression matching. [Autocomplete]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/autocomplete/) |[`completion`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/completion/): Provides autocomplete functionality through a completion suggester.
[`search_as_you_type`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/search-as-you-type/): Provides search-as-you-type functionality using both prefix and infix completion. [Geographic]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geographic/)| [`geo_point`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geo-point/): A geographic point.
[`geo_shape`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geo-shape/): A geographic shape. [Rank]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/rank/) | Boosts or decreases the relevance score of documents (`rank_feature`, `rank_features`). diff --git a/_getting-started/intro.md b/_getting-started/intro.md index 272d8d6981..edd178a23f 100644 --- a/_getting-started/intro.md +++ b/_getting-started/intro.md @@ -106,7 +106,6 @@ in | 1 the | 1, 2 eye | 1 of | 1 -the | 1 beholder | 1 and | 2 beast | 2 @@ -158,4 +157,4 @@ In OpenSearch, a shard is a Lucene index, which consists of _segments_ (or segme ## Next steps -- Learn how to install OpenSearch within minutes in [Installation quickstart]({{site.url}}{{site.baseurl}}/getting-started/quickstart/). \ No newline at end of file +- Learn how to install OpenSearch within minutes in [Installation quickstart]({{site.url}}{{site.baseurl}}/getting-started/quickstart/). diff --git a/_im-plugin/index-rollups/rollup-api.md b/_im-plugin/index-rollups/rollup-api.md index 61bfdf76d4..5064d2ac49 100644 --- a/_im-plugin/index-rollups/rollup-api.md +++ b/_im-plugin/index-rollups/rollup-api.md @@ -105,8 +105,8 @@ Options | Description | Type | Required `schedule.interval.cron.expression` | Specify a Unix cron expression. | String | Yes `schedule.interval.cron.timezone` | Specify timezones as defined by the IANA Time Zone Database. Defaults to UTC. | String | No `description` | Optionally, describe the rollup job. | String | No -`enabled` | When true, the index rollup job is scheduled. Default is true. | Boolean | Yes -`continuous` | Specify whether or not the index rollup job continuously rolls up data forever or just executes over the current data set once and stops. Default is false. | Boolean | Yes +`enabled` | When true, the index rollup job is scheduled. Default is `true`. | Boolean | Yes +`continuous` | Specify whether or not the index rollup job continuously rolls up data forever or executes over the current dataset once and stops. Default is `false`. | Boolean | Yes `error_notification` | Set up a Mustache message template for error notifications. For example, if an index rollup job fails, the system sends a message to a Slack channel. | Object | No `page_size` | Specify the number of buckets to paginate at a time during rollup. | Number | Yes `delay` | The number of milliseconds to delay execution of the index rollup job. | Long | No diff --git a/_im-plugin/index-transforms/transforms-apis.md b/_im-plugin/index-transforms/transforms-apis.md index df9ff19f8f..37d2c035b5 100644 --- a/_im-plugin/index-transforms/transforms-apis.md +++ b/_im-plugin/index-transforms/transforms-apis.md @@ -39,7 +39,7 @@ You can specify the following options in the HTTP request body: Option | Data Type | Description | Required :--- | :--- | :--- | :--- enabled | Boolean | If true, the transform job is enabled at creation. | No -continuous | Boolean | Specifies whether the transform job should be continuous. Continuous jobs execute every time they are scheduled according to the `schedule` field and run based off of newly transformed buckets as well as any new data added to source indexes. Non-continuous jobs execute only once. Default is false. | No +continuous | Boolean | Specifies whether the transform job should be continuous. Continuous jobs execute every time they are scheduled according to the `schedule` field and run based off of newly transformed buckets as well as any new data added to source indexes. 
Non-continuous jobs execute only once. Default is `false`. | No
schedule | Object | The schedule for the transform job. | Yes
start_time | Integer | The Unix epoch time of the transform job's start time. | Yes
description | String | Describes the transform job. | No
@@ -447,7 +447,7 @@ from | The starting transform to return. Default is 0. | No
size | Specifies the number of transforms to return. Default is 10. | No
search |The search term to use to filter results. | No
sortField | The field to sort results with. | No
-sortDirection | Specifies the direction to sort results in. Can be `ASC` or `DESC`. Default is ASC. | No
+sortDirection | Specifies the direction to sort results in. Can be `ASC` or `DESC`. Default is `ASC`. | No
#### Sample Request
diff --git a/_ingest-pipelines/processors/fingerprint.md new file mode 100644 index 0000000000..4775da98b6
--- /dev/null
+++ b/_ingest-pipelines/processors/fingerprint.md
@@ -0,0 +1,158 @@
+---
+layout: default
+title: Fingerprint
+parent: Ingest processors
+nav_order: 105
+---
+
+# Fingerprint processor
+Introduced 2.16
+{: .label .label-purple }
+
+The `fingerprint` processor is used to generate a hash value for either certain specified fields or all fields in a document. The hash value can be used to deduplicate documents within an index and collapse search results.
+
+For each field, the field name, the length of the field value, and the field value itself are concatenated and separated by the pipe character `|`. For example, if a document contains a field `field1` with the value `value1` and a field `field2` with the value `value2`, then the concatenated string is `|field1|6:value1|field2|6:value2|`, where `6` is the length of each field value. For object fields, the field name is flattened by joining the nested field names with a period `.`. For instance, if the object field is `root_field` with a sub-field `sub_field1` having the value `value1` and another sub-field `sub_field2` with the value `value2`, then the concatenated string is `|root_field.sub_field1|6:value1|root_field.sub_field2|6:value2|`.
+
+The following is the syntax for the `fingerprint` processor:
+
+```json
+{
+  "fingerprint": {
+    "fields": ["foo", "bar"],
+    "target_field": "fingerprint",
+    "hash_method": "SHA-1@2.16.0"
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Configuration parameters
+
+The following table lists the required and optional parameters for the `fingerprint` processor.
+
+Parameter | Required/Optional | Description |
+|-----------|-----------|-----------|
+`fields` | Optional | A list of fields used to generate a hash value. |
+`exclude_fields` | Optional | Specifies the fields to be excluded from hash value generation. It is mutually exclusive with the `fields` parameter; if both `exclude_fields` and `fields` are empty or null, then all fields are included in the hash value calculation. |
+`hash_method` | Optional | Specifies the hashing algorithm to be used, with options being `MD5@2.16.0`, `SHA-1@2.16.0`, `SHA-256@2.16.0`, or `SHA3-256@2.16.0`. Default is `SHA-1@2.16.0`. The version number is appended to ensure consistent hashing across OpenSearch versions, and new versions will support new hash methods. |
+`target_field` | Optional | Specifies the name of the field in which the generated hash value will be stored. If not provided, then the hash value is stored in the `fingerprint` field by default. |
+`ignore_missing` | Optional | Specifies whether the processor should exit quietly if one of the required fields is missing. Default is `false`.
| +`description` | Optional | A brief description of the processor. | +`if` | Optional | A condition for running the processor. | +`ignore_failure` | Optional | If set to `true`, then failures are ignored. Default is `false`. | +`on_failure` | Optional | A list of processors to run if the processor fails. | +`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. | + +## Using the processor + +Follow these steps to use the processor in a pipeline. + +**Step 1: Create a pipeline** + +The following query creates a pipeline named `fingerprint_pipeline` that uses the `fingerprint` processor to generate a hash value for specified fields in the document: + +```json +PUT /_ingest/pipeline/fingerprint_pipeline +{ + "description": "generate hash value for some specified fields the document", + "processors": [ + { + "fingerprint": { + "fields": ["foo", "bar"] + } + } + ] +} +``` +{% include copy-curl.html %} + +**Step 2 (Optional): Test the pipeline** + +It is recommended that you test your pipeline before ingesting documents. +{: .tip} + +To test the pipeline, run the following query: + +```json +POST _ingest/pipeline/fingerprint_pipeline/_simulate +{ + "docs": [ + { + "_index": "testindex1", + "_id": "1", + "_source": { + "foo": "foo", + "bar": "bar" + } + } + ] +} +``` +{% include copy-curl.html %} + +#### Response + +The following example response confirms that the pipeline is working as expected: + +```json +{ + "docs": [ + { + "doc": { + "_index": "testindex1", + "_id": "1", + "_source": { + "foo": "foo", + "bar": "bar", + "fingerprint": "SHA-1@2.16.0:fYeen7hTJ2zs9lpmUnk6nvH54sM=" + }, + "_ingest": { + "timestamp": "2024-03-11T02:17:22.329823Z" + } + } + } + ] +} +``` + +**Step 3: Ingest a document** + +The following query ingests a document into an index named `testindex1`: + +```json +PUT testindex1/_doc/1?pipeline=fingerprint_pipeline +{ + "foo": "foo", + "bar": "bar" +} +``` +{% include copy-curl.html %} + +#### Response + +The request indexes the document into the `testindex1` index: + +```json +{ + "_index": "testindex1", + "_id": "1", + "_version": 1, + "result": "created", + "_shards": { + "total": 2, + "successful": 1, + "failed": 0 + }, + "_seq_no": 0, + "_primary_term": 1 +} +``` + +**Step 4 (Optional): Retrieve the document** + +To retrieve the document, run the following query: + +```json +GET testindex1/_doc/1 +``` +{% include copy-curl.html %} diff --git a/_ingest-pipelines/processors/index-processors.md b/_ingest-pipelines/processors/index-processors.md index 79f30524d6..0e1ee1e114 100644 --- a/_ingest-pipelines/processors/index-processors.md +++ b/_ingest-pipelines/processors/index-processors.md @@ -40,6 +40,7 @@ Processor type | Description `dot_expander` | Expands a field with dots into an object field. `drop` |Drops a document without indexing it or raising any errors. `fail` | Raises an exception and stops the execution of a pipeline. +`fingerprint` | Generates a hash value for either certain specified fields or all fields in a document. `foreach` | Allows for another processor to be applied to each element of an array or an object field in a document. `geoip` | Adds information about the geographical location of an IP address. `geojson-feature` | Indexes GeoJSON data into a geospatial field. @@ -71,3 +72,7 @@ Processor type | Description ## Batch-enabled processors Some processors support batch ingestion---they can process multiple documents at the same time as a batch. 
These batch-enabled processors usually provide better performance when using batch processing. For batch processing, use the [Bulk API]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/) and provide a `batch_size` parameter. All batch-enabled processors have a batch mode and a single-document mode. When you ingest documents using the `PUT` method, the processor functions in single-document mode and processes documents in series. Currently, only the `text_embedding` and `sparse_encoding` processors are batch enabled. All other processors process documents one at a time. + +## Selectively enabling processors + +Processors defined by the [ingest-common module](https://github.com/opensearch-project/OpenSearch/blob/2.x/modules/ingest-common/src/main/java/org/opensearch/ingest/common/IngestCommonPlugin.java) can be selectively enabled by providing the `ingest-common.processors.allowed` cluster setting. If not provided, then all processors are enabled by default. Specifying an empty list disables all processors. If the setting is changed to remove previously enabled processors, then any pipeline using a disabled processor will fail after node restart when the new setting takes effect. diff --git a/_ingest-pipelines/processors/split.md b/_ingest-pipelines/processors/split.md index 2052c3def1..c424ef671c 100644 --- a/_ingest-pipelines/processors/split.md +++ b/_ingest-pipelines/processors/split.md @@ -26,19 +26,18 @@ The following is the syntax for the `split` processor: The following table lists the required and optional parameters for the `split` processor. -Parameter | Required/Optional | Description | -|-----------|-----------|-----------| -`field` | Required | The field containing the string to be split. -`separator` | Required | The delimiter used to split the string. This can be a regular expression pattern. -`preserve_field` | Optional | If set to `true`, preserves empty trailing fields (for example, `''`) in the resulting array. If set to `false`, empty trailing fields are removed from the resulting array. Default is `false`. -`target_field` | Optional | The field where the array of substrings is stored. If not specified, then the field is updated in-place. -`ignore_missing` | Optional | Specifies whether the processor should ignore documents that do not contain the specified -field. If set to `true`, then the processor ignores missing values in the field and leaves the `target_field` unchanged. Default is `false`. -`description` | Optional | A brief description of the processor. -`if` | Optional | A condition for running the processor. -`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, then failures are ignored. Default is `false`. -`on_failure` | Optional | A list of processors to run if the processor fails. -`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. +Parameter | Required/Optional | Description +:--- | :--- | :--- +`field` | Required | The field containing the string to be split. +`separator` | Required | The delimiter used to split the string. This can be a regular expression pattern. +`preserve_field` | Optional | If set to `true`, preserves empty trailing fields (for example, `''`) in the resulting array. If set to `false`, empty trailing fields are removed from the resulting array. Default is `false`. +`target_field` | Optional | The field where the array of substrings is stored. 
If not specified, then the field is updated in-place. +`ignore_missing` | Optional | Specifies whether the processor should ignore documents that do not contain the specified field. If set to `true`, then the processor ignores missing values in the field and leaves the `target_field` unchanged. Default is `false`. +`description` | Optional | A brief description of the processor. +`if` | Optional | A condition for running the processor. +`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, then failures are ignored. Default is `false`. +`on_failure` | Optional | A list of processors to run if the processor fails. +`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. ## Using the processor diff --git a/_install-and-configure/configuring-opensearch/index-settings.md b/_install-and-configure/configuring-opensearch/index-settings.md index 34b1829b78..a1894a0d2c 100644 --- a/_install-and-configure/configuring-opensearch/index-settings.md +++ b/_install-and-configure/configuring-opensearch/index-settings.md @@ -54,6 +54,8 @@ OpenSearch supports the following dynamic cluster-level index settings: - `indices.fielddata.cache.size` (String): The maximum size of the field data cache. May be specified as an absolute value (for example, `8GB`) or a percentage of the node heap (for example, `50%`). This value is static so you must specify it in the `opensearch.yml` file. If you don't specify this setting, the maximum size is unlimited. This value should be smaller than the `indices.breaker.fielddata.limit`. For more information, see [Field data circuit breaker]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/circuit-breaker/#field-data-circuit-breaker-settings). +- `indices.query.bool.max_clause_count` (Integer): Defines the maximum product of fields and terms that are queryable simultaneously. Before OpenSearch 2.16, a cluster restart was required in order to apply this static setting. Now dynamic, existing search thread pools may use the old static value initially, causing `TooManyClauses` exceptions. New thread pools use the updated value. Default is `1024`. + - `cluster.remote_store.index.path.type` (String): The path strategy for the data stored in the remote store. This setting is effective only for remote-store-enabled clusters. This setting supports the following values: - `fixed`: Stores the data in path structure `///`. - `hashed_prefix`: Stores the data in path structure `hash()////`. diff --git a/_install-and-configure/configuring-opensearch/index.md b/_install-and-configure/configuring-opensearch/index.md index ecbce1310d..c2ffbf571b 100755 --- a/_install-and-configure/configuring-opensearch/index.md +++ b/_install-and-configure/configuring-opensearch/index.md @@ -25,6 +25,10 @@ Certain operations are static and require you to modify the `opensearch.yml` [co ## Specifying settings as environment variables +You can specify environment variables in the following ways. 
+ +### Arguments at startup + You can specify environment variables as arguments using `-E` when launching OpenSearch: ```bash @@ -32,6 +36,45 @@ You can specify environment variables as arguments using `-E` when launching Ope ``` {% include copy.html %} +### Directly in the shell environment + +You can configure the environment variables directly in a shell environment before starting OpenSearch, as shown in the following example: + +```bash +export OPENSEARCH_JAVA_OPTS="-Xms2g -Xmx2g" +export OPENSEARCH_PATH_CONF="/etc/opensearch" +./opensearch +``` +{% include copy.html %} + +### Systemd service file + +When running OpenSearch as a service managed by `systemd`, you can specify environment variables in the service file, as shown in the following example: + +```bash +# /etc/systemd/system/opensearch.service.d/override.conf +[Service] +Environment="OPENSEARCH_JAVA_OPTS=-Xms2g -Xmx2g" +Environment="OPENSEARCH_PATH_CONF=/etc/opensearch" +``` +After creating or modifying the file, reload the systemd configuration and restart the service using the following command: + +```bash +sudo systemctl daemon-reload +sudo systemctl restart opensearch +``` +{% include copy.html %} + +### Docker environment variables + +When running OpenSearch in Docker, you can specify environment variables using the `-e` option with `docker run` command, as shown in the following command: + +```bash +docker run -e "OPENSEARCH_JAVA_OPTS=-Xms2g -Xmx2g" -e "OPENSEARCH_PATH_CONF=/usr/share/opensearch/config" opensearchproject/opensearch:latest +``` +{% include copy.html %} + + ## Updating cluster settings using the API The first step in changing a setting is to view the current settings by sending the following request: @@ -113,4 +156,4 @@ If you are working on a client application running against an OpenSearch cluster - http.cors.enabled:true - http.cors.allow-headers:X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization - http.cors.allow-credentials:true -``` \ No newline at end of file +``` diff --git a/_install-and-configure/install-opensearch/index.md b/_install-and-configure/install-opensearch/index.md index 2615383ce1..1afe12f6a5 100644 --- a/_install-and-configure/install-opensearch/index.md +++ b/_install-and-configure/install-opensearch/index.md @@ -97,6 +97,9 @@ The [sample docker-compose.yml]({{site.url}}{{site.baseurl}}/install-and-configu - `OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m` Sets the size of the Java heap (we recommend half of system RAM). + + OpenSearch defaults to `-Xms1g -Xmx1g` for heap memory allocation, which takes precedence over configurations specified using percentage notation (`-XX:MinRAMPercentage`, `-XX:MaxRAMPercentage`). For example, if you set `OPENSEARCH_JAVA_OPTS=-XX:MinRAMPercentage=30 -XX:MaxRAMPercentage=70`, the predefined `-Xms1g -Xmx1g` values will override these settings. When using `OPENSEARCH_JAVA_OPTS` to define memory allocation, make sure you use the `-Xms` and `-Xmx` notation. +{: .note} - `nofile 65536` diff --git a/_ml-commons-plugin/api/model-apis/register-model.md b/_ml-commons-plugin/api/model-apis/register-model.md index 61d821419e..ec830a7821 100644 --- a/_ml-commons-plugin/api/model-apis/register-model.md +++ b/_ml-commons-plugin/api/model-apis/register-model.md @@ -84,7 +84,7 @@ Field | Data type | Required/Optional | Description `name`| String | Required | The model name. | `version` | String | Required | The model version. | `model_format` | String | Required | The portable format of the model file. Valid values are `TORCH_SCRIPT` and `ONNX`. 
| -`function_name` | String | Required | For text embedding models, set this parameter to `TEXT_EMBEDDING`. For sparse encoding models, set this parameter to `SPARSE_ENCODING` or `SPARSE_TOKENIZE`. For cross-encoder models, set this parameter to `TEXT_SIMILARITY`. +`function_name` | String | Required | For text embedding models, set this parameter to `TEXT_EMBEDDING`. For sparse encoding models, set this parameter to `SPARSE_ENCODING` or `SPARSE_TOKENIZE`. For cross-encoder models, set this parameter to `TEXT_SIMILARITY`. For question answering models, set this parameter to `QUESTION_ANSWERING`. `model_content_hash_value` | String | Required | The model content hash generated using the SHA-256 hashing algorithm. `url` | String | Required | The URL that contains the model. | `description` | String | Optional| The model description. | diff --git a/_ml-commons-plugin/cluster-settings.md b/_ml-commons-plugin/cluster-settings.md index 37c7cab9f7..0c1f433bf2 100644 --- a/_ml-commons-plugin/cluster-settings.md +++ b/_ml-commons-plugin/cluster-settings.md @@ -468,7 +468,7 @@ When set to `true`, this setting enables the search processors for retrieval-aug ### Setting ``` -plugins.ml_commons.agent_framework_enabled: true +plugins.ml_commons.rag_pipeline_feature_enabled: true ``` ### Values diff --git a/_ml-commons-plugin/custom-local-models.md b/_ml-commons-plugin/custom-local-models.md index a265d8804a..c2866938f6 100644 --- a/_ml-commons-plugin/custom-local-models.md +++ b/_ml-commons-plugin/custom-local-models.md @@ -109,7 +109,11 @@ To learn more about model groups, see [Model access control]({{site.url}}{{site. ## Step 2: Register a local model -To register a remote model to the model group created in step 1, provide the model group ID from step 1 in the following request: +To register a local model to the model group created in step 1, send a Register Model API request. For descriptions of Register Model API parameters, see [Register a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/). + +The `function_name` corresponds to the model type. For text embedding models, set this parameter to `TEXT_EMBEDDING`. For sparse encoding models, set this parameter to `SPARSE_ENCODING` or `SPARSE_TOKENIZE`. For cross-encoder models, set this parameter to `TEXT_SIMILARITY`. For question answering models, set this parameter to `QUESTION_ANSWERING`. In this example, set `function_name` to `TEXT_EMBEDDING` because you're registering a text embedding model. 
+ +Provide the model group ID from step 1 and send the following request: ```json POST /_plugins/_ml/models/_register @@ -118,7 +122,7 @@ POST /_plugins/_ml/models/_register "version": "1.0.1", "model_group_id": "wlcnb4kBJ1eYAeTMHlV6", "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.", - "model_task_type": "TEXT_EMBEDDING", + "function_name": "TEXT_EMBEDDING", "model_format": "TORCH_SCRIPT", "model_content_size_in_bytes": 266352827, "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413", @@ -143,7 +147,7 @@ POST /_plugins/_ml/models/_register "version": "1.0.1", "model_group_id": "wlcnb4kBJ1eYAeTMHlV6", "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.", - "model_task_type": "TEXT_EMBEDDING", + "function_name": "TEXT_EMBEDDING", "model_format": "TORCH_SCRIPT", "model_content_size_in_bytes": 266352827, "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413", @@ -159,8 +163,6 @@ POST /_plugins/_ml/models/_register ``` {% include copy.html %} -For descriptions of Register API parameters, see [Register a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/). The `model_task_type` corresponds to the model type. For text embedding models, set this parameter to `TEXT_EMBEDDING`. For sparse encoding models, set this parameter to `SPARSE_ENCODING` or `SPARSE_TOKENIZE`. For cross-encoder models, set this parameter to `TEXT_SIMILARITY`. For question answering models, set this parameter to `QUESTION_ANSWERING`. - OpenSearch returns the task ID of the register operation: ```json @@ -183,7 +185,7 @@ When the operation is complete, the state changes to `COMPLETED`: { "model_id": "cleMb4kBJ1eYAeTMFFg4", "task_type": "REGISTER_MODEL", - "function_name": "REMOTE", + "function_name": "TEXT_EMBEDDING", "state": "COMPLETED", "worker_node": [ "XPcXLV7RQoi5m8NI_jEOVQ" @@ -229,7 +231,7 @@ When the operation is complete, the state changes to `COMPLETED`: { "model_id": "cleMb4kBJ1eYAeTMFFg4", "task_type": "DEPLOY_MODEL", - "function_name": "REMOTE", + "function_name": "TEXT_EMBEDDING", "state": "COMPLETED", "worker_node": [ "n-72khvBTBi3bnIIR8FTTw" @@ -379,4 +381,4 @@ The response provides the answer based on the context: } } } -``` \ No newline at end of file +``` diff --git a/_ml-commons-plugin/pretrained-models.md b/_ml-commons-plugin/pretrained-models.md index 8847d36291..30540cfe49 100644 --- a/_ml-commons-plugin/pretrained-models.md +++ b/_ml-commons-plugin/pretrained-models.md @@ -126,7 +126,7 @@ To learn more about model groups, see [Model access control]({{site.url}}{{site. ## Step 2: Register a local OpenSearch-provided model -To register a remote model to the model group created in step 1, provide the model group ID from step 1 in the following request. +To register an OpenSearch-provided model to the model group created in step 1, provide the model group ID from step 1 in the following request. 
Because pretrained models originate from the ML Commons model repository, you only need to provide the `name`, `version`, `model_group_id`, and `model_format` in the register API request: @@ -163,7 +163,7 @@ When the operation is complete, the state changes to `COMPLETED`: { "model_id": "cleMb4kBJ1eYAeTMFFg4", "task_type": "REGISTER_MODEL", - "function_name": "REMOTE", + "function_name": "TEXT_EMBEDDING", "state": "COMPLETED", "worker_node": [ "XPcXLV7RQoi5m8NI_jEOVQ" @@ -209,7 +209,7 @@ When the operation is complete, the state changes to `COMPLETED`: { "model_id": "cleMb4kBJ1eYAeTMFFg4", "task_type": "DEPLOY_MODEL", - "function_name": "REMOTE", + "function_name": "TEXT_EMBEDDING", "state": "COMPLETED", "worker_node": [ "n-72khvBTBi3bnIIR8FTTw" diff --git a/_ml-commons-plugin/tutorials/index.md b/_ml-commons-plugin/tutorials/index.md index 4479d0878f..070da3cae1 100644 --- a/_ml-commons-plugin/tutorials/index.md +++ b/_ml-commons-plugin/tutorials/index.md @@ -19,6 +19,7 @@ Using the OpenSearch machine learning (ML) framework, you can build various appl - **Reranking search results**: - [Reranking search results using the Cohere Rerank model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/reranking-cohere/) + - [Reranking search results using the MS MARCO cross-encoder model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/reranking-cross-encoder/) - **Agents and tools**: - [Retrieval-augmented generation (RAG) chatbot]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-chatbot/) diff --git a/_ml-commons-plugin/tutorials/reranking-cross-encoder.md b/_ml-commons-plugin/tutorials/reranking-cross-encoder.md new file mode 100644 index 0000000000..e46c7eb511 --- /dev/null +++ b/_ml-commons-plugin/tutorials/reranking-cross-encoder.md @@ -0,0 +1,391 @@ +--- +layout: default +title: Reranking with the MS MARCO cross-encoder +parent: Tutorials +nav_order: 35 +--- + +# Reranking search results using the MS MARCO cross-encoder model + +A [reranking pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/) can rerank search results, providing a relevance score for each document in the search results with respect to the search query. The relevance score is calculated by a cross-encoder model. + +This tutorial illustrates how to use the [Hugging Face `ms-marco-MiniLM-L-6-v2` model](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) in a reranking pipeline. + +Replace the placeholders beginning with the prefix `your_` with your own values. +{: .note} + +## Prerequisite + +Before you start, deploy the model on Amazon SageMaker. For better performance, use a GPU. + +Run the following code to deploy the model on [Amazon SageMaker](https://aws.amazon.com/pm/sagemaker): + +```python +import sagemaker +import boto3 +from sagemaker.huggingface import HuggingFaceModel +sess = sagemaker.Session() +role = sagemaker.get_execution_role() + +hub = { + 'HF_MODEL_ID':'cross-encoder/ms-marco-MiniLM-L-6-v2', + 'HF_TASK':'text-classification' +} +huggingface_model = HuggingFaceModel( + transformers_version='4.37.0', + pytorch_version='2.1.0', + py_version='py310', + env=hub, + role=role, +) +predictor = huggingface_model.deploy( + initial_instance_count=1, # number of instances + instance_type='ml.m5.xlarge' # ec2 instance type +) +``` +{% include copy.html %} + +Note the model inference endpoint; you'll use it to create a connector in the next step. 
+ +## Step 1: Create a connector and register the model + +First, create a connector for the model, providing the inference endpoint and your AWS credentials: + +```json +POST /_plugins/_ml/connectors/_create +{ + "name": "Sagemaker cross-encoder model", + "description": "Test connector for Sagemaker cross-encoder model", + "version": 1, + "protocol": "aws_sigv4", + "credential": { + "access_key": "your_access_key", + "secret_key": "your_secret_key", + "session_token": "your_session_token" + }, + "parameters": { + "region": "your_sagemkaer_model_region_like_us-west-2", + "service_name": "sagemaker" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "your_sagemaker_model_inference_endpoint_created_in_last_step", + "headers": { + "content-type": "application/json" + }, + "request_body": "{ \"inputs\": ${parameters.inputs} }", + "pre_process_function": "\n String escape(def input) { \n if (input.contains(\"\\\\\")) {\n input = input.replace(\"\\\\\", \"\\\\\\\\\");\n }\n if (input.contains(\"\\\"\")) {\n input = input.replace(\"\\\"\", \"\\\\\\\"\");\n }\n if (input.contains('\r')) {\n input = input = input.replace('\r', '\\\\r');\n }\n if (input.contains(\"\\\\t\")) {\n input = input.replace(\"\\\\t\", \"\\\\\\\\\\\\t\");\n }\n if (input.contains('\n')) {\n input = input.replace('\n', '\\\\n');\n }\n if (input.contains('\b')) {\n input = input.replace('\b', '\\\\b');\n }\n if (input.contains('\f')) {\n input = input.replace('\f', '\\\\f');\n }\n return input;\n }\n\n String query = params.query_text;\n StringBuilder builder = new StringBuilder('[');\n \n for (int i=0; i +Metrics dashboard To learn about multi-cluster support for data sources, see [Enable OpenSearch Dashboards to support multiple OpenSearch clusters](https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1388). diff --git a/_query-dsl/compound/function-score.md b/_query-dsl/compound/function-score.md index 8180058ae6..98568e0965 100644 --- a/_query-dsl/compound/function-score.md +++ b/_query-dsl/compound/function-score.md @@ -826,4 +826,197 @@ The results contain the three matching blog posts: } } ``` - \ No newline at end of file + + +## Named functions + +When defining a function, you can specify its name using the `_name` parameter at the top level. This name is useful for debugging and understanding the scoring process. Once specified, the function name is included in the score calculation explanation whenever possible (this applies to functions, filters, and queries). You can identify the function by its `_name` in the response. + +### Example + +The following request sets `explain` to `true` for debugging purposes in order to obtain a scoring explanation in the response. Each function contains a `_name` parameter so that you can identify the function unambiguously: + +```json +GET blogs/_search +{ + "explain": true, + "size": 1, + "query": { + "function_score": { + "functions": [ + { + "_name": "likes_function", + "script_score": { + "script": { + "lang": "painless", + "source": "return doc['likes'].value * 2;" + } + }, + "weight": 0.6 + }, + { + "_name": "views_function", + "field_value_factor": { + "field": "views", + "factor": 1.5, + "modifier": "log1p", + "missing": 1 + }, + "weight": 0.3 + }, + { + "_name": "comments_function", + "gauss": { + "comments": { + "origin": 1000, + "scale": 800 + } + }, + "weight": 0.1 + } + ] + } + } +} +``` +{% include copy-curl.html %} + +The response explains the scoring process. 
For each function, the explanation contains the function `_name` in its `description`: + +
+ + Response + + {: .text-delta} + +```json +{ + "took": 14, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 3, + "relation": "eq" + }, + "max_score": 6.1600614, + "hits": [ + { + "_shard": "[blogs][0]", + "_node": "_yndTaZHQWimcDgAfOfRtQ", + "_index": "blogs", + "_id": "1", + "_score": 6.1600614, + "_source": { + "name": "Semantic search in OpenSearch", + "views": 1200, + "likes": 150, + "comments": 16, + "date_posted": "2022-04-17" + }, + "_explanation": { + "value": 6.1600614, + "description": "function score, product of:", + "details": [ + { + "value": 1, + "description": "*:*", + "details": [] + }, + { + "value": 6.1600614, + "description": "min of:", + "details": [ + { + "value": 6.1600614, + "description": "function score, score mode [multiply]", + "details": [ + { + "value": 180, + "description": "product of:", + "details": [ + { + "value": 300, + "description": "script score function(_name: likes_function), computed with script:\"Script{type=inline, lang='painless', idOrCode='return doc['likes'].value * 2;', options={}, params={}}\"", + "details": [ + { + "value": 1, + "description": "_score: ", + "details": [ + { + "value": 1, + "description": "*:*", + "details": [] + } + ] + } + ] + }, + { + "value": 0.6, + "description": "weight", + "details": [] + } + ] + }, + { + "value": 0.9766541, + "description": "product of:", + "details": [ + { + "value": 3.2555137, + "description": "field value function(_name: views_function): log1p(doc['views'].value?:1.0 * factor=1.5)", + "details": [] + }, + { + "value": 0.3, + "description": "weight", + "details": [] + } + ] + }, + { + "value": 0.035040613, + "description": "product of:", + "details": [ + { + "value": 0.35040614, + "description": "Function for field comments:", + "details": [ + { + "value": 0.35040614, + "description": "exp(-0.5*pow(MIN[Math.max(Math.abs(16.0(=doc value) - 1000.0(=origin))) - 0.0(=offset), 0)],2.0)/461662.4130844683, _name: comments_function)", + "details": [] + } + ] + }, + { + "value": 0.1, + "description": "weight", + "details": [] + } + ] + } + ] + }, + { + "value": 3.4028235e+38, + "description": "maxBoost", + "details": [] + } + ] + } + ] + } + } + ] + } +} +``` +
+ diff --git a/_query-dsl/compound/hybrid.md b/_query-dsl/compound/hybrid.md index e573d17676..22b3a17fc1 100644 --- a/_query-dsl/compound/hybrid.md +++ b/_query-dsl/compound/hybrid.md @@ -12,11 +12,7 @@ You can use a hybrid query to combine relevance scores from multiple queries int ## Example -Before using a `hybrid` query, you must set up a machine learning (ML) model, ingest documents, and configure a search pipeline with a [`normalization-processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/). - -To learn how to set up an ML model, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). - -Once you set up an ML model, learn how to use the `hybrid` query by following the steps in [Using hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/#using-hybrid-search). +Learn how to use the `hybrid` query by following the steps in [Using hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/#using-hybrid-search). For a comprehensive example, follow the [Neural search tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search#tutorial). diff --git a/_query-dsl/geo-and-xy/geo-bounding-box.md b/_query-dsl/geo-and-xy/geo-bounding-box.md index a0ee85f093..df697e2ce5 100644 --- a/_query-dsl/geo-and-xy/geo-bounding-box.md +++ b/_query-dsl/geo-and-xy/geo-bounding-box.md @@ -31,6 +31,7 @@ PUT testindex1 } } ``` +{% include copy-curl.html %} Index three geopoints as objects with latitudes and longitudes: @@ -42,7 +43,10 @@ PUT testindex1/_doc/1 "lon": 40.71 } } +``` +{% include copy-curl.html %} +```json PUT testindex1/_doc/2 { "point": { @@ -50,7 +54,10 @@ PUT testindex1/_doc/2 "lon": 22.62 } } +``` +{% include copy-curl.html %} +```json PUT testindex1/_doc/3 { "point": { @@ -59,6 +66,7 @@ PUT testindex1/_doc/3 } } ``` +{% include copy-curl.html %} Search for all documents and filter the documents whose points lie within the rectangle defined in the query: @@ -88,6 +96,7 @@ GET testindex1/_search } } ``` +{% include copy-curl.html %} The response contains the matching document: @@ -163,6 +172,7 @@ GET testindex1/_search } } ``` +{% include copy-curl.html %} ## Request fields @@ -205,6 +215,7 @@ GET testindex1/_search } } ``` +{% include copy-curl.html %} To specify a bounding box that covers the whole area of a geohash, provide that geohash as both `top_left` and `bottom_right` parameters of the bounding box: @@ -227,4 +238,5 @@ GET testindex1/_search } } } -``` \ No newline at end of file +``` +{% include copy-curl.html %} \ No newline at end of file diff --git a/_query-dsl/geo-and-xy/geodistance.md b/_query-dsl/geo-and-xy/geodistance.md new file mode 100644 index 0000000000..7a36b0c933 --- /dev/null +++ b/_query-dsl/geo-and-xy/geodistance.md @@ -0,0 +1,121 @@ +--- +layout: default +title: Geodistance +parent: Geographic and xy queries +grand_parent: Query DSL +nav_order: 20 +--- + +# Geodistance query + +A geodistance query returns documents with geopoints that are within a specified distance from the provided geopoint. A document with multiple geopoints matches the query if at least one geopoint matches the query. + +The searched document field must be mapped as `geo_point`. 
+{: .note} + +## Example + +Create a mapping with the `point` field mapped as `geo_point`: + +```json +PUT testindex1 +{ + "mappings": { + "properties": { + "point": { + "type": "geo_point" + } + } + } +} +``` +{% include copy-curl.html %} + +Index a geopoint, specifying its latitude and longitude: + +```json +PUT testindex1/_doc/1 +{ + "point": { + "lat": 74.00, + "lon": 40.71 + } +} +``` +{% include copy-curl.html %} + +Search for documents whose `point` objects are within the specified `distance` from the specified `point`: + +```json +GET /testindex1/_search +{ + "query": { + "bool": { + "must": { + "match_all": {} + }, + "filter": { + "geo_distance": { + "distance": "50mi", + "point": { + "lat": 73.5, + "lon": 40.5 + } + } + } + } + } +} +``` +{% include copy-curl.html %} + +The response contains the matching document: + +```json +{ + "took": 5, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 1, + "relation": "eq" + }, + "max_score": 1, + "hits": [ + { + "_index": "testindex1", + "_id": "1", + "_score": 1, + "_source": { + "point": { + "lat": 74, + "lon": 40.71 + } + } + } + ] + } +} +``` + +## Request fields + +Geodistance queries accept the following fields. + +Field | Data type | Description +:--- | :--- | :--- +`_name` | String | The name of the filter. Optional. +`distance` | String | The distance within which to match the points. This distance is the radius of a circle centered at the specified point. For supported distance units, see [Distance units]({{site.url}}{{site.baseurl}}/api-reference/common-parameters/#distance-units). Required. +`distance_type` | String | Specifies how to calculate the distance. Valid values are `arc` or `plane` (faster but inaccurate for long distances or points close to the poles). Optional. Default is `arc`. +`validation_method` | String | The validation method. Valid values are `IGNORE_MALFORMED` (accept geopoints with invalid coordinates), `COERCE` (try to coerce coordinates to valid values), and `STRICT` (return an error when coordinates are invalid). Optional. Default is `STRICT`. +`ignore_unmapped` | Boolean | Specifies whether to ignore an unmapped field. If set to `true`, then the query does not return any documents that contain an unmapped field. If set to `false`, then an exception is thrown when the field is unmapped. Optional. Default is `false`. + +## Accepted formats + +You can specify the geopoint coordinates when indexing a document and searching for documents in any [format]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats) accepted by the geopoint field type. \ No newline at end of file diff --git a/_query-dsl/geo-and-xy/geopolygon.md b/_query-dsl/geo-and-xy/geopolygon.md new file mode 100644 index 0000000000..c53b1379cf --- /dev/null +++ b/_query-dsl/geo-and-xy/geopolygon.md @@ -0,0 +1,177 @@ +--- +layout: default +title: Geopolygon +parent: Geographic and xy queries +grand_parent: Query DSL +nav_order: 30 +--- + +# Geopolygon query + +A geopolygon query returns documents containing geopoints that are within the specified polygon. A document containing multiple geopoints matches the query if at least one geopoint matches the query. + +A polygon is specified by a list of vertices in coordinate form. Unlike specifying a polygon for a geoshape field, the polygon does not have to be closed (specifying the first and last points at the same is unnecessary). 
Though points do not have to follow either clockwise or counterclockwise order, it is recommended that you list them in either of these orders. This will ensure that the correct polygon is captured. + +The searched document field must be mapped as `geo_point`. +{: .note} + +## Example + +Create a mapping with the `point` field mapped as `geo_point`: + +```json +PUT /testindex1 +{ + "mappings": { + "properties": { + "point": { + "type": "geo_point" + } + } + } +} +``` +{% include copy-curl.html %} + +Index a geopoint, specifying its latitude and longitude: + +```json +PUT testindex1/_doc/1 +{ + "point": { + "lat": 73.71, + "lon": 41.32 + } +} +``` +{% include copy-curl.html %} + +Search for documents whose `point` objects are within the specified `geo_polygon`: + +```json +GET /testindex1/_search +{ + "query": { + "bool": { + "must": { + "match_all": {} + }, + "filter": { + "geo_polygon": { + "point": { + "points": [ + { "lat": 74.5627, "lon": 41.8645 }, + { "lat": 73.7562, "lon": 42.6526 }, + { "lat": 73.3245, "lon": 41.6189 }, + { "lat": 74.0060, "lon": 40.7128 } + ] + } + } + } + } + } +} +``` +{% include copy-curl.html %} + +The polygon specified in the preceding request is the quadrilateral depicted in the following image. The matching document is within this quadrilateral. The coordinates of the quadrilateral vertices are specified in `(latitude, longitude)` format. + +![Search for points within the specified quadrilateral]({{site.url}}{{site.baseurl}}/images/geopolygon-query.png) + +The response contains the matching document: + +```json +{ + "took": 6, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 1, + "relation": "eq" + }, + "max_score": 1, + "hits": [ + { + "_index": "testindex1", + "_id": "1", + "_score": 1, + "_source": { + "point": { + "lat": 73.71, + "lon": 41.32 + } + } + } + ] + } +} +``` + +In the preceding search request, you specified the polygon vertices in clockwise order: + +```json +"geo_polygon": { + "point": { + "points": [ + { "lat": 74.5627, "lon": 41.8645 }, + { "lat": 73.7562, "lon": 42.6526 }, + { "lat": 73.3245, "lon": 41.6189 }, + { "lat": 74.0060, "lon": 40.7128 } + ] + } +} +``` + +Alternatively, you can specify the vertices in counterclockwise order: + +```json +"geo_polygon": { + "point": { + "points": [ + { "lat": 74.5627, "lon": 41.8645 }, + { "lat": 74.0060, "lon": 40.7128 }, + { "lat": 73.3245, "lon": 41.6189 }, + { "lat": 73.7562, "lon": 42.6526 } + ] + } +} +``` + +The resulting query response contains the same matching document. + +However, if you specify the vertices in the following order: + +```json +"geo_polygon": { + "point": { + "points": [ + { "lat": 74.5627, "lon": 41.8645 }, + { "lat": 74.0060, "lon": 40.7128 }, + { "lat": 73.7562, "lon": 42.6526 }, + { "lat": 73.3245, "lon": 41.6189 } + ] + } +} +``` + +The response returns no results. + +## Request fields + +Geopolygon queries accept the following fields. + +Field | Data type | Description +:--- | :--- | :--- +`_name` | String | The name of the filter. Optional. +`validation_method` | String | The validation method. Valid values are `IGNORE_MALFORMED` (accept geopoints with invalid coordinates), `COERCE` (try to coerce coordinates to valid values), and `STRICT` (return an error when coordinates are invalid). Optional. Default is `STRICT`. +`ignore_unmapped` | Boolean | Specifies whether to ignore an unmapped field. 
If set to `true`, then the query does not return any documents that contain an unmapped field. If set to `false`, then an exception is thrown when the field is unmapped. Optional. Default is `false`. + +## Accepted formats + +You can specify the geopoint coordinates when indexing a document and searching for documents in any [format]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats) accepted by the geopoint field type. \ No newline at end of file diff --git a/_query-dsl/geo-and-xy/index.md b/_query-dsl/geo-and-xy/index.md index 44e2df9b49..83cdbf08d7 100644 --- a/_query-dsl/geo-and-xy/index.md +++ b/_query-dsl/geo-and-xy/index.md @@ -12,7 +12,7 @@ redirect_from: # Geographic and xy queries -Geographic and xy queries let you search fields that contain points and shapes on a map or coordinate plane. Geographic queries work on geospatial data, while xy queries work on two-dimensional coordinate data. Out of all geographic queries, the geoshape query is very similar to the xy query, but the former searches [geographic fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geographic), while the latter searches [Cartesian fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy). +Geographic and xy queries let you search fields that contain points and shapes on a map or coordinate plane. Geographic queries work on geospatial data, while xy queries work on two-dimensional coordinate data. Out of all geographic queries, the geoshape query is very similar to the xy query, but the former searches [geographic fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geographic/), while the latter searches [Cartesian fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy). ## xy queries @@ -24,13 +24,13 @@ xy queries return documents that contain: ## Geographic queries -Geographic queries search for documents that contain geospatial geometries. These geometries can be specified in [`geo_point`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point) fields, which support points on a map, and [`geo_shape`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-shape) fields, which support points, lines, circles, and polygons. +Geographic queries search for documents that contain geospatial geometries. These geometries can be specified in [`geo_point`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point/) fields, which support points on a map, and [`geo_shape`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-shape/) fields, which support points, lines, circles, and polygons. OpenSearch provides the following geographic query types: - [**Geo-bounding box queries**]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/geo-bounding-box/): Return documents with geopoint field values that are within a bounding box. -- **Geodistance queries** return documents with geopoints that are within a specified distance from the provided geopoint. -- **Geopolygon queries** return documents with geopoints that are within a polygon. -- **Geoshape queries** return documents that contain: - - geoshapes and geopoints that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`. - - geopoints that intersect the provided shape. 
\ No newline at end of file +- [**Geodistance queries**]({{site.url}}{{site.baseurl}}/query-dsl/geo-and-xy/geodistance/): Return documents with geopoints that are within a specified distance from the provided geopoint. +- [**Geopolygon queries**]({{site.url}}{{site.baseurl}}/query-dsl/geo-and-xy/geopolygon/): Return documents containing geopoints that are within a polygon. +- **Geoshape queries**: Return documents that contain: + - Geoshapes and geopoints that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`. + - Geopoints that intersect the provided shape. \ No newline at end of file diff --git a/_query-dsl/geo-and-xy/xy.md index 3db05c01f2..88a22448c3 100644 --- a/_query-dsl/geo-and-xy/xy.md +++ b/_query-dsl/geo-and-xy/xy.md @@ -12,13 +12,13 @@ redirect_from: # xy query -To search for documents that contain [xy point]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-point) and [xy shape]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-shape) fields, use an xy query. +To search for documents that contain [xy point]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-point/) or [xy shape]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-shape/) fields, use an xy query. ## Spatial relations When you provide an xy shape to the xy query, the xy fields are matched using the following spatial relations to the provided shape. -Relation | Description | Supporting xy Field Type +Relation | Description | Supporting xy field type :--- | :--- | :--- `INTERSECTS` | (Default) Matches documents whose xy point or xy shape intersects the shape provided in the query. | `xy_point`, `xy_shape` `DISJOINT` | Matches documents whose xy shape does not intersect with the shape provided in the query. | `xy_shape` @@ -51,6 +51,7 @@ PUT testindex } } ``` +{% include copy-curl.html %} Index a document with a point and a document with a polygon: @@ -62,7 +63,10 @@ PUT testindex/_doc/1 "coordinates": [0.5, 3.0] } } +``` +{% include copy-curl.html %} + +```json PUT testindex/_doc/2 { "geometry" : { @@ -77,6 +81,7 @@ PUT testindex/_doc/2 } } ``` +{% include copy-curl.html %} Define an [`envelope`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-shape#envelope)—a bounding rectangle in the `[[minX, maxY], [maxX, minY]]` format. Search for documents with xy points or shapes that intersect that envelope: @@ -88,6 +96,7 @@ GET testindex/_search } } ``` +{% include copy-curl.html %} The following image depicts the example. Both the point and the polygon are within the bounding envelope.
@@ -200,6 +206,7 @@ PUT pre-indexed-shapes } } ``` +{% include copy-curl.html %} Index an envelope that specifies the boundaries and name it `rectangle`: @@ -212,6 +219,7 @@ PUT pre-indexed-shapes/_doc/rectangle } } ``` +{% include copy-curl.html %} Index a document with a point and a document with a polygon into the index `testindex`: @@ -223,7 +231,10 @@ PUT testindex/_doc/1 "coordinates": [0.5, 3.0] } } +``` +{% include copy-curl.html %} +```json PUT testindex/_doc/2 { "geometry" : { @@ -238,6 +249,7 @@ PUT testindex/_doc/2 } } ``` +{% include copy-curl.html %} Search for documents with shapes that intersect `rectangle` in the index `testindex` using a filter: @@ -261,6 +273,7 @@ GET testindex/_search } } ``` +{% include copy-curl.html %} The preceding query uses the default spatial relation `INTERSECTS` and returns both the point and the polygon: @@ -352,6 +365,7 @@ PUT testindex1 } } ``` +{% include copy-curl.html %} Index three points: @@ -360,17 +374,24 @@ PUT testindex1/_doc/1 { "point": "1.0, 1.0" } +``` +{% include copy-curl.html %} +```json PUT testindex1/_doc/2 { "point": "2.0, 0.0" } +``` +{% include copy-curl.html %} +```json PUT testindex1/_doc/3 { "point": "-2.0, 2.0" } ``` +{% include copy-curl.html %} Search for points that lie within the circle with the center at (0, 0) and a radius of 2: @@ -390,6 +411,7 @@ GET testindex1/_search } } ``` +{% include copy-curl.html %} xy point only supports the default `INTERSECTS` spatial relation, so you don't need to provide the `relation` parameter. {: .note} diff --git a/_search-plugins/hybrid-search.md b/_search-plugins/hybrid-search.md index b0fb4d5bef..7f08d63d0f 100644 --- a/_search-plugins/hybrid-search.md +++ b/_search-plugins/hybrid-search.md @@ -12,7 +12,7 @@ Introduced 2.11 Hybrid search combines keyword and neural search to improve search relevance. To implement hybrid search, you need to set up a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline you'll configure intercepts search results at an intermediate stage and applies the [`normalization_processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/) to them. The `normalization_processor` normalizes and combines the document scores from multiple query clauses, rescoring the documents according to the chosen normalization and combination techniques. **PREREQUISITE**
-Before using hybrid search, you must set up a text embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). +To follow this example, you must set up a text embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). If you have already generated text embeddings, ingest the embeddings into an index and skip to [Step 4](#step-4-configure-a-search-pipeline). {: .note} ## Using hybrid search diff --git a/_search-plugins/index.md b/_search-plugins/index.md index fca30667ee..79e0e715d0 100644 --- a/_search-plugins/index.md +++ b/_search-plugins/index.md @@ -70,6 +70,8 @@ OpenSearch provides the following search relevance features: - [Querqy]({{site.url}}{{site.baseurl}}/search-plugins/querqy/): Offers query rewriting capability. +- [User Behavior Insights]({{site.url}}{{site.baseurl}}/search-plugins/ubi/): Links user behavior to user queries to improve search quality. + ## Search results OpenSearch supports the following commonly used operations on search results: diff --git a/_search-plugins/knn/settings.md b/_search-plugins/knn/settings.md index f4ef057cfb..4d84cc80bb 100644 --- a/_search-plugins/knn/settings.md +++ b/_search-plugins/knn/settings.md @@ -12,17 +12,28 @@ The k-NN plugin adds several new cluster settings. To learn more about static an ## Cluster settings +The following table lists all available cluster-level k-NN settings. For more information about cluster settings, see [Configuring OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#updating-cluster-settings-using-the-api) and [Updating cluster settings using the API]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#updating-cluster-settings-using-the-api). + +Setting | Static/Dynamic | Default | Description +:--- | :--- | :--- | :--- +`knn.plugin.enabled`| Dynamic | `true` | Enables or disables the k-NN plugin. +`knn.algo_param.index_thread_qty` | Dynamic | `1` | The number of threads used for native library index creation. Keeping this value low reduces the CPU impact of the k-NN plugin but also reduces indexing performance. +`knn.cache.item.expiry.enabled` | Dynamic | `false` | Whether to remove native library indexes that have not been accessed for a certain duration from memory. +`knn.cache.item.expiry.minutes` | Dynamic | `3h` | If enabled, the amount of idle time before a native library index is removed from memory. +`knn.circuit_breaker.unset.percentage` | Dynamic | `75` | The native memory usage threshold for the circuit breaker. Memory usage must be lower than this percentage of `knn.memory.circuit_breaker.limit` in order for `knn.circuit_breaker.triggered` to remain `false`. +`knn.circuit_breaker.triggered` | Dynamic | `false` | True when memory usage exceeds the `knn.circuit_breaker.unset.percentage` value. +`knn.memory.circuit_breaker.limit` | Dynamic | `50%` | The native memory limit for native library indexes. At the default value, if a machine has 100 GB of memory and the JVM uses 32 GB, then the k-NN plugin uses 50% of the remaining 68 GB (34 GB). If memory usage exceeds this value, then the plugin removes the native library indexes used least recently. +`knn.memory.circuit_breaker.enabled` | Dynamic | `true` | Whether to enable the k-NN memory circuit breaker. 
+`knn.model.index.number_of_shards`| Dynamic | `1` | The number of shards to use for the model system index, which is the OpenSearch index that stores the models used for approximate nearest neighbor (ANN) search. +`knn.model.index.number_of_replicas`| Dynamic | `1` | The number of replica shards to use for the model system index. Generally, in a multi-node cluster, this value should be at least 1 in order to increase stability. +`knn.model.cache.size.limit` | Dynamic | `10%` | The model cache limit cannot exceed 25% of the JVM heap. +`knn.faiss.avx2.disabled` | Static | `false` | A static setting that specifies whether to disable the SIMD-based `libopensearchknn_faiss_avx2.so` library and load the non-optimized `libopensearchknn_faiss.so` library for the Faiss engine on machines with x64 architecture. For more information, see [SIMD optimization for the Faiss engine]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#simd-optimization-for-the-faiss-engine). + +## Index settings + +The following table lists all available index-level k-NN settings. All settings are static. For information about updating static index-level settings, see [Updating a static index setting]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index-settings/#updating-a-static-index-setting). + Setting | Default | Description -:--- | :--- | :--- -`knn.algo_param.index_thread_qty` | 1 | The number of threads used for native library index creation. Keeping this value low reduces the CPU impact of the k-NN plugin, but also reduces indexing performance. -`knn.cache.item.expiry.enabled` | false | Whether to remove native library indexes that have not been accessed for a certain duration from memory. -`knn.cache.item.expiry.minutes` | 3h | If enabled, the idle time before removing a native library index from memory. -`knn.circuit_breaker.unset.percentage` | 75% | The native memory usage threshold for the circuit breaker. Memory usage must be below this percentage of `knn.memory.circuit_breaker.limit` for `knn.circuit_breaker.triggered` to remain false. -`knn.circuit_breaker.triggered` | false | True when memory usage exceeds the `knn.circuit_breaker.unset.percentage` value. -`knn.memory.circuit_breaker.limit` | 50% | The native memory limit for native library indexes. At the default value, if a machine has 100 GB of memory and the JVM uses 32 GB, the k-NN plugin uses 50% of the remaining 68 GB (34 GB). If memory usage exceeds this value, k-NN removes the least recently used native library indexes. -`knn.memory.circuit_breaker.enabled` | true | Whether to enable the k-NN memory circuit breaker. -`knn.plugin.enabled`| true | Enables or disables the k-NN plugin. -`knn.model.index.number_of_shards`| 1 | The number of shards to use for the model system index, the OpenSearch index that stores the models used for Approximate Nearest Neighbor (ANN) search. -`knn.model.index.number_of_replicas`| 1 | The number of replica shards to use for the model system index. Generally, in a multi-node cluster, this should be at least 1 to increase stability. -`knn.advanced.filtered_exact_search_threshold`| null | The threshold value for the filtered IDs that is used to switch to exact search during filtered ANN search. If the number of filtered IDs in a segment is less than this setting's value, exact search will be performed on the filtered IDs. 
-`knn.faiss.avx2.disabled` | False | A static setting that specifies whether to disable the SIMD-based `libopensearchknn_faiss_avx2.so` library and load the non-optimized `libopensearchknn_faiss.so` library for the Faiss engine on machines with x64 architecture. For more information, see [SIMD optimization for the Faiss engine]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#simd-optimization-for-the-faiss-engine). +:--- | :--- | :--- +`index.knn.advanced.filtered_exact_search_threshold`| `null` | The filtered ID threshold value used to switch to exact search during filtered ANN search. If the number of filtered IDs in a segment is lower than this setting's value, then exact search will be performed on the filtered IDs. +`index.knn.algo_param.ef_search` | `100` | `ef` (or `efSearch`) represents the size of the dynamic list for the nearest neighbors used during a search. Higher `ef` values lead to a more accurate but slower search. `ef` cannot be set to a value lower than the number of queried nearest neighbors, `k`. `ef` can take any value between `k` and the size of the dataset. \ No newline at end of file diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index b2b4fc33d6..8aa2ff7dbf 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -16,8 +16,8 @@ Introduced 2.11 When selecting a model, choose one of the following options: -- Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). -- Use a sparse encoding model at ingestion time and a tokenizer at search time for relatively low performance and low latency. The tokenism doesn't conduct model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more consistent experience. +- Use a sparse encoding model at both ingestion time and search time for better search relevance at the expense of relatively high latency. +- Use a sparse encoding model at ingestion time and a tokenizer at search time for lower search latency at the expense of relatively lower search relevance. Tokenization doesn't involve model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more streamlined experience. **PREREQUISITE**
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). diff --git a/_search-plugins/search-pipelines/search-processors.md b/_search-plugins/search-pipelines/search-processors.md index 5e53cf5615..4630ab950c 100644 --- a/_search-plugins/search-pipelines/search-processors.md +++ b/_search-plugins/search-pipelines/search-processors.md @@ -121,3 +121,7 @@ The response contains the `search_pipelines` object that lists the available req In addition to the processors provided by OpenSearch, additional processors may be provided by plugins. {: .note} + +## Selectively enabling processors + +Processors defined by the [search-pipeline-common module](https://github.com/opensearch-project/OpenSearch/blob/2.x/modules/search-pipeline-common/src/main/java/org/opensearch/search/pipeline/common/SearchPipelineCommonModulePlugin.java) are selectively enabled through the following cluster settings: `search.pipeline.common.request.processors.allowed`, `search.pipeline.common.response.processors.allowed`, or `search.pipeline.common.search.phase.results.processors.allowed`. If unspecified, then all processors are enabled. An empty list disables all processors. Removing enabled processors causes pipelines using them to fail after a node restart. \ No newline at end of file diff --git a/_search-plugins/search-relevance/reranking-search-results.md b/_search-plugins/search-relevance/reranking-search-results.md index 14c418020d..4b4deaeb92 100644 --- a/_search-plugins/search-relevance/reranking-search-results.md +++ b/_search-plugins/search-relevance/reranking-search-results.md @@ -115,4 +115,19 @@ POST /my-index/_search ``` {% include copy-curl.html %} -Alternatively, you can provide the full path to the field containing the context. For more information, see [Rerank processor example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#example). \ No newline at end of file +Alternatively, you can provide the full path to the field containing the context. For more information, see [Rerank processor example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#example). + +## Using rerank and normalization processors together + +When you use a rerank processor in conjunction with a [normalization processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/) and a hybrid query, the rerank processor alters the final document scores. This is because the rerank processor operates after the normalization processor in the search pipeline. +{: .note} + +The processing order is as follows: + +- Normalization processor: This processor normalizes the document scores based on the configured normalization method. For more information, see [Normalization processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/). +- Rerank processor: Following normalization, the rerank processor further adjusts the document scores. This adjustment can significantly impact the final ordering of search results. 
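+For example, a search pipeline that chains both processors might look similar to the following sketch. The pipeline name, model ID, and document field are placeholders, and the normalization and combination techniques shown are only one possible configuration:
+
+```json
+PUT /_search/pipeline/normalization-rerank-pipeline
+{
+  "phase_results_processors": [
+    {
+      "normalization-processor": {
+        "normalization": { "technique": "min_max" },
+        "combination": { "technique": "arithmetic_mean" }
+      }
+    }
+  ],
+  "response_processors": [
+    {
+      "rerank": {
+        "ml_opensearch": {
+          "model_id": "your_rerank_model_id"
+        },
+        "context": {
+          "document_fields": ["passage_text"]
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+With this pipeline, the normalization step runs on the phase results of the hybrid query first, and the rerank step then rescores the already normalized results.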
+ +This processing order has the following implications: + +- Score modification: The rerank processor modifies the scores that were initially adjusted by the normalization processor, potentially leading to different ranking results than initially expected. +- Hybrid queries: In the context of hybrid queries, where multiple types of queries and scoring mechanisms are combined, this behavior is particularly noteworthy. The combined scores from the initial query are normalized first and then reranked, resulting in a two-stage scoring modification. \ No newline at end of file diff --git a/_search-plugins/sql/functions.md b/_search-plugins/sql/functions.md index de3b578e1a..9706148d76 100644 --- a/_search-plugins/sql/functions.md +++ b/_search-plugins/sql/functions.md @@ -32,10 +32,10 @@ The SQL plugin supports the following common functions shared across the SQL and | `expm1` | `expm1(number T) -> double` | `SELECT expm1(0.5)` | | `floor` | `floor(number T) -> long` | `SELECT floor(0.5)` | | `ln` | `ln(number T) -> double` | `SELECT ln(10)` | -| `log` | `log(number T) -> double` or `log(number T, number T) -> double` | `SELECT log(10)`, `SELECT log(2, 16)` | +| `log` | `log(number T) -> double` or `log(number T, number T) -> double` | `SELECT log(10) -> 2.3`, `SELECT log(2, 16) -> 4`| | `log2` | `log2(number T) -> double` | `SELECT log2(10)` | -| `log10` | `log10(number T) -> double` | `SELECT log10(10)` | -| `mod` | `mod(number T, number T) -> T` | `SELECT mod(2, 3)` | +| `log10` | `log10(number T) -> double` | `SELECT log10(100)` | +| `mod` | `mod(number T, number T) -> T` | `SELECT mod(10,4) -> 2 ` | | `modulus` | `modulus(number T, number T) -> T` | `SELECT modulus(2, 3)` | | `multiply` | `multiply(number T, number T) -> T` | `SELECT multiply(2, 3)` | | `pi` | `pi() -> double` | `SELECT pi()` | @@ -162,7 +162,7 @@ Functions marked with * are only available in SQL. | `replace` | `replace(string, string, string) -> string` | `SELECT replace('hello', 'l', 'x')` | | `right` | `right(string, integer) -> string` | `SELECT right('hello', 2)` | | `rtrim` | `rtrim(string) -> string` | `SELECT rtrim('hello ')` | -| `substring` | `substring(string, integer, integer) -> string` | `SELECT substring('hello', 2, 4)` | +| `substring` | `substring(string, integer, integer) -> string` | `SELECT substring('hello', 2, 2) -> 'el'` | | `trim` | `trim(string) -> string` | `SELECT trim(' hello')` | | `upper` | `upper(string) -> string` | `SELECT upper('hello world')` | diff --git a/_search-plugins/sql/ppl/functions.md b/_search-plugins/sql/ppl/functions.md index 275030f723..d192799f2e 100644 --- a/_search-plugins/sql/ppl/functions.md +++ b/_search-plugins/sql/ppl/functions.md @@ -11,7 +11,7 @@ redirect_from: # Commands -`PPL` supports all [`SQL` common]({{site.url}}{{site.baseurl}}/search-plugins/sql/functions/) functions, including [relevance search]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text/), but also introduces few more functions (called `commands`) which are available in `PPL` only. +`PPL` supports most [`SQL` common]({{site.url}}{{site.baseurl}}/search-plugins/sql/functions/) functions, including [relevance search]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text/), but also introduces few more functions (called `commands`) which are available in `PPL` only. 
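For example, the following sketch chains several PPL-only commands, assuming an index named `accounts` that contains `firstname` and `age` fields:

```sql
search source=accounts
| where age > 30
| fields firstname, age
| sort - age
| head 5
```
{% include copy.html %}

Each command receives the output of the previous command through the pipe (`|`) operator, so a single query can filter, project, sort, and truncate the result set.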
## dedup diff --git a/_search-plugins/sql/ppl/index.md b/_search-plugins/sql/ppl/index.md index 850a540bc4..602255d126 100644 --- a/_search-plugins/sql/ppl/index.md +++ b/_search-plugins/sql/ppl/index.md @@ -37,7 +37,15 @@ PPL filters, transforms, and aggregates data using a series of commands. See [Co ## Using PPL within OpenSearch -To use PPL, you must have installed OpenSearch Dashboards. PPL is available within the [Query Workbench tool](https://playground.opensearch.org/app/opensearch-query-workbench#/). See the [Query Workbench]({{site.url}}{{site.baseurl}}/dashboards/query-workbench/) documentation for a tutorial on using PPL within OpenSearch. +The SQL plugin is required to run PPL queries in OpenSearch. If you're running a minimal distribution of OpenSearch, you might have to [install the SQL plugin]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/) before using PPL. +{: .note} + +You can run PPL queries interactively in OpenSearch Dashboards or programmatically using the ``_ppl`` endpoint. + +In OpenSearch Dashboards, the [Query Workbench tool](https://playground.opensearch.org/app/opensearch-query-workbench#/) provides an interactive testing environment, documented in [Query Workbench documentation]({{site.url}}{{site.baseurl}}/dashboards/query-workbench/). + +To run a PPL query using the API, see [SQL and PPL API]({{site.url}}{{site.baseurl}}/search-plugins/sql/sql-ppl-api/). + ## Developer documentation diff --git a/_search-plugins/sql/ppl/syntax.md b/_search-plugins/sql/ppl/syntax.md index 45eeb3aed2..22d6beaf26 100644 --- a/_search-plugins/sql/ppl/syntax.md +++ b/_search-plugins/sql/ppl/syntax.md @@ -8,10 +8,12 @@ nav_order: 1 # PPL syntax -Every PPL query starts with the `search` command. It specifies the index to search and retrieve documents from. Subsequent commands can follow in any order. +Every PPL query starts with the `search` command. It specifies the index to search and retrieve documents from. + +`PPL` supports exactly one `search` command per PPL query, and it is always the first command. The word `search` can be omitted. + +Subsequent commands can follow in any order. -Currently, `PPL` supports only one `search` command, which can be omitted to simplify the query. -{ : .note} ## Syntax @@ -22,8 +24,7 @@ source= [boolean-expression] Field | Description | Required :--- | :--- |:--- -`search` | Specifies search keywords. | Yes -`index` | Specifies which index to query from. | No +`index` | Specifies the index to query. | No `bool-expression` | Specifies an expression that evaluates to a Boolean value. | No ## Examples diff --git a/_search-plugins/ubi/data-structures.md b/_search-plugins/ubi/data-structures.md new file mode 100644 index 0000000000..0c64c3254b --- /dev/null +++ b/_search-plugins/ubi/data-structures.md @@ -0,0 +1,204 @@ +--- +layout: default +title: UBI client data structures +parent: User Behavior Insights +has_children: false +nav_order: 10 +--- + +# UBI client data structures + +Data structures are used to create events that follow the [User Behavior Insights (UBI) event schema specification](https://github.com/o19s/ubi). +For more information about the schema, see [UBI index schemas]({{site.url}}{{site.baseurl}}/search-plugins/ubi/schemas/). 
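For example, a serialized event indexed into `ubi_events` might look like the following JSON document. The field values are illustrative only:

```json
{
  "action_name": "item_click",
  "client_id": "a15f1ef3-6bc6-4959-9b83-6699a4d29845",
  "query_id": "7ae52966-4fd4-4ab1-8152-0fd0b52bdadf",
  "timestamp": 1717527765860,
  "message_type": "INFO",
  "message": "Item 0884420136132 was clicked",
  "event_attributes": {
    "object": {
      "object_id": "0884420136132",
      "object_id_field": "primary_ean"
    },
    "position": {
      "ordinal": 2
    }
  }
}
```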
+ + +You must provide an implementation for the following functions: +- `getClientId()` +- `getQueryId()` + +You can also optionally provide an implementation for the following functions: +- `getSessionId()` +- `getPageId()` + + +The following JavaScript structures can be used as a starter implementation to serialize UBI events into schema-compatible JSON: +```js +/********************************************************************************************* + * Ubi Event data structures + * The following structures help ensure adherence to the UBI event schema + *********************************************************************************************/ + + + +export class UbiEventData { + constructor(object_type, id=null, description=null, details=null) { + this.object_id_field = object_type; + this.object_id = id; + this.description = description; + this.object_detail = details; + } +} +export class UbiPosition{ + constructor({ordinal=null, x=null, y=null, trail=null}={}) { + this.ordinal = ordinal; + this.x = x; + this.y = y; + if(trail) + this.trail = trail; + else { + const trail = getTrail(); + if(trail && trail.length > 0) + this.trail = trail; + } + } +} + + +export class UbiEventAttributes { + /** + * Tries to prepopulate common event attributes + * The developer can add an `object` that the user interacted with and + * the site `position` information relevant to the event + * + * Attributes, other than `object` or `position` can be added in the form: + * attributes['item1'] = 1 + * attributes['item2'] = '2' + * + * @param {*} attributes: object with general event attributes + * @param {*} object: the data object the user interacted with + * @param {*} position: the site position information + */ + constructor({attributes={}, object=null, position=null}={}) { + if(attributes != null){ + Object.assign(this, attributes); + } + if(object != null && Object.keys(object).length > 0){ + this.object = object; + } + if(position != null && Object.keys(position).length > 0){ + this.position = position; + } + this.setDefaultValues(); + } + + setDefaultValues(){ + try{ + if(!this.hasOwnProperty('dwell_time') && typeof TimeMe !== 'undefined'){ + this.dwell_time = TimeMe.getTimeOnPageInSeconds(window.location.pathname); + } + + if(!this.hasOwnProperty('browser')){ + this.browser = window.navigator.userAgent; + } + + if(!this.hasOwnProperty('page_id')){ + this.page_id = window.location.pathname; + } + if(!this.hasOwnProperty('session_id')){ + this.session_id = getSessionId(); + } + + if(!this.hasOwnProperty('page_id')){ + this.page_id = getPageId(); + } + + if(!this.hasOwnProperty('position') || this.position == null){ + const trail = getTrail(); + if(trail.length > 0){ + this.position = new UbiPosition({trail:trail}); + } + } + // ToDo: set IP + } + catch(error){ + console.log(error); + } + } +} + + + +export class UbiEvent { + constructor(action_name, {message_type='INFO', message=null, event_attributes={}, data_object={}}={}) { + this.action_name = action_name; + this.client_id = getClientId(); + this.query_id = getQueryId(); + this.timestamp = Date.now(); + + this.message_type = message_type; + if( message ) + this.message = message; + + this.event_attributes = new UbiEventAttributes({attributes:event_attributes, object:data_object}); + } + + /** + * Use to suppress null objects in the json output + * @param key + * @param value + * @returns + */ + static replacer(key, value){ + if(value == null || + (value.constructor == Object && Object.keys(value).length === 0)) { + return undefined; + } 
+ return value; + } + + /** + * + * @returns json string + */ + toJson() { + return JSON.stringify(this, UbiEvent.replacer); + } +} +``` +{% include copy.html %} + +# Sample usage + +```js +export function logUbiMessage(event_type, message_type, message){ + let e = new UbiEvent(event_type, { + message_type:message_type, + message:message + }); + logEvent(e); +} + +export function logDwellTime(action_name, page, seconds){ + console.log(`${page} => ${seconds}`); + let e = new UbiEvent(action_name, { + message:`On page ${page} for ${seconds} seconds`, + event_attributes:{ + session_id: getSessionId()}, + dwell_seconds:seconds + }, + data_object:TimeMe + }); + logEvent(e); +} + +/** + * ordinal is the number within a list of results + * for the item that was clicked + */ +export function logItemClick(item, ordinal){ + let e = new UbiEvent('item_click', { + message:`Item ${item['object_id']} was clicked`, + event_attributes:{session_id: getSessionId()}, + data_object:item, + }); + e.event_attributes.position.ordinal = ordinal; + logEvent(e); +} + +export function logEvent( event ){ + // some configured http client + return client.index( index = 'ubi_events', body = event.toJson()); +} + +``` +{% include copy.html %} diff --git a/_search-plugins/ubi/dsl-queries.md b/_search-plugins/ubi/dsl-queries.md new file mode 100644 index 0000000000..0802680dc9 --- /dev/null +++ b/_search-plugins/ubi/dsl-queries.md @@ -0,0 +1,101 @@ +--- +layout: default +title: Example UBI query DSL queries +parent: User Behavior Insights +has_children: false +nav_order: 15 +--- + +# Example UBI query DSL queries + +You can use the OpenSearch search query language, [query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/), to write User Behavior Insights (UBI) queries. The following example returns the number of times that each `action_name` event occurs. +For more extensive analytic queries, see [Example UBI SQL queries]({{site.url}}{{site.baseurl}}/search-plugins/ubi/sql-queries/). +#### Example request +```json +GET ubi_events/_search +{ + "size":0, + "aggs":{ + "event_types":{ + "terms": { + "field":"action_name", + "size":10 + } + } + } +} +``` +{% include copy.html %} + +#### Example response + +```json +{ + "took": 1, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 10000, + "relation": "gte" + }, + "max_score": null, + "hits": [] + }, + "aggregations": { + "event_types": { + "doc_count_error_upper_bound": 0, + "sum_other_doc_count": 0, + "buckets": [ + { + "key": "brand_filter", + "doc_count": 3084 + }, + { + "key": "product_hover", + "doc_count": 3068 + }, + { + "key": "button_click", + "doc_count": 3054 + }, + { + "key": "product_sort", + "doc_count": 3012 + }, + { + "key": "on_search", + "doc_count": 3010 + }, + { + "key": "type_filter", + "doc_count": 2925 + }, + { + "key": "login", + "doc_count": 2433 + }, + { + "key": "logout", + "doc_count": 1447 + }, + { + "key": "new_user_entry", + "doc_count": 207 + } + ] + } + } +} +``` +{% include copy.html %} + +You can run the preceding queries in the OpenSearch Dashboards [Query Workbench]({{site.url}}{{site.baseurl}}/search-plugins/sql/workbench/). + +A demo workbench with sample data can be found here: +[http://chorus-opensearch-edition.dev.o19s.com:5601/app/OpenSearch-query-workbench](http://chorus-OpenSearch-edition.dev.o19s.com:5601/app/OpenSearch-query-workbench). 
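As another example, the following query retrieves every event recorded for a single search so that you can reconstruct a user's journey. The `query_id` value is a placeholder; replace it with a query ID from your own `ubi_queries` index. This sketch assumes that `query_id` is mapped as a keyword field, as in the default UBI events mapping:

```json
GET ubi_events/_search
{
  "query": {
    "term": {
      "query_id": "7ae52966-4fd4-4ab1-8152-0fd0b52bdadf"
    }
  },
  "sort": [
    { "timestamp": { "order": "asc" } }
  ]
}
```
{% include copy.html %}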
diff --git a/_search-plugins/ubi/index.md b/_search-plugins/ubi/index.md new file mode 100644 index 0000000000..bdf09a632b --- /dev/null +++ b/_search-plugins/ubi/index.md @@ -0,0 +1,50 @@ +--- +layout: default +title: User Behavior Insights +has_children: true +nav_order: 90 +redirect_from: + - /search-plugins/ubi/ +--- +# User Behavior Insights + +**Introduced 2.15** +{: .label .label-purple } + +**References UBI Specification 1.0.0** +{: .label .label-purple } + +User Behavior Insights (UBI) is a plugin that captures client-side events and queries for the purposes of improving search relevance and the user experience. +It is a causal system, linking a user's query to all of their subsequent interactions with your application until they perform another search. + +UBI includes the following elements: +* A machine-readable [schema](https://github.com/o19s/ubi) that faciliates interoperablity of the UBI specification. +* An OpenSearch [plugin](https://github.com/opensearch-project/user-behavior-insights) that facilitates the storage of client-side events and queries. +* A client-side JavaScript [example reference implementation]({{site.url}}{{site.baseurl}}/search-plugins/ubi/data-structures/) that shows how to capture events and send them to the OpenSearch UBI plugin. + + + +The UBI documentation is organized into two categories: *Explanation and reference* and *Tutorials and how-to guides*: + +*Explanation and reference* + +| Link | Description | +| :--------- | :------- | +| [UBI Request/Response Specification](https://github.com/o19s/ubi/) | The industry-standard schema for UBI requests and responses. The current version references UBI Specification 1.0.0. | +| [UBI index schema]({{site.url}}{{site.baseurl}}/search-plugins/ubi/schemas/) | Documentation on the individual OpenSearch query and event stores. | + + +*Tutorials and how-to guides* + +| Link | Description | +| :--------- | :------- | +| [UBI plugin](https://github.com/opensearch-project/user-behavior-insights) | How to install and use the UBI plugin. | +| [UBI client data structures]({{site.url}}{{site.baseurl}}/search-plugins/ubi/data-structures/) | Sample JavaScript structures for populating the event store. | +| [Example UBI query DSL queries]({{site.url}}{{site.baseurl}}/search-plugins/ubi/dsl-queries/) | How to write queries for UBI data in OpenSearch query DSL. | +| [Example UBI SQL queries]({{site.url}}{{site.baseurl}}/search-plugins/ubi/sql-queries/) | How to write analytic queries for UBI data in SQL. | +| [UBI dashboard tutorial]({{site.url}}{{site.baseurl}}/search-plugins/ubi/ubi-dashboard-tutorial/) | How to build a dashboard containing UBI data. | +| [Chorus Opensearch Edition](https://github.com/o19s/chorus-opensearch-edition/?tab=readme-ov-file#structured-learning-using-chorus-opensearch-edition) katas | A series of structured tutorials that teach you how to use UBI with OpenSearch through a demo e-commerce store. | + + +The documentation categories were adapted using concepts based on [Diátaxis](https://diataxis.fr/). 
+{: .tip } diff --git a/_search-plugins/ubi/schemas.md b/_search-plugins/ubi/schemas.md new file mode 100644 index 0000000000..d8398e43bc --- /dev/null +++ b/_search-plugins/ubi/schemas.md @@ -0,0 +1,223 @@ +--- +layout: default +title: UBI index schemas +parent: User Behavior Insights +has_children: false +nav_order: 5 +--- + +# UBI index schemas + +The User Behavior Insights (UBI) data collection process involves tracking and recording the queries submitted by users as well as monitoring and logging their subsequent actions or events after receiving the search results. There are two UBI index schemas involved in the data collection process: +* The [query index](#ubi-queries-index), which stores the searches and results. +* The [event index](#ubi-events-index), which stores all subsequent user actions after the user's query. + +## Key identifiers + +For UBI to function properly, the connections between the following fields must be consistently maintained within an application that has UBI enabled: + +- [`object_id`](#object_id) represents an ID for whatever object the user receives in response to a query. For example, if you search for books, it might be an ISBN code of a book, such as `978-3-16-148410-0`. +- [`query_id`](#query_id) is a unique ID for the raw query language executed and the `object_id` values of the _hits_ returned by the user's query. +- [`client_id`](#client_id) represents a unique query source. This is typically a web browser used by a unique user. +- [`object_id_field`](#object_id_field) specifies the name of the field in your index that provides the `object_id`. For example, if you search for books, the value might be `isbn_code`. +- [`action_name`](#action_name), though not technically an ID, specifies the exact user action (such as `click`, `add_to_cart`, `watch`, `view`, or `purchase`) that was taken (or not taken) for an object with a given `object_id`. + +To summarize, the `query_id` signals the beginning of a unique search for a client tracked through a `client_id`. The search returns various objects, each with a unique `object_id`. The `action_name` specifies what action the user is performing and is connected to the objects, each with a specific `object_id`. You can differentiate between types of objects by inspecting the `object_id_field`. + +Typically, you can infer the user's overall search history by retrieving all the data for the user's `client_id` and inspecting the individual `query_id` data. Each application determines what constitutes a search session by examining the backend data + +## Important UBI roles + +The following diagram illustrates the process by which the **user** interacts with the **Search client** and **UBI client** +and how those, in turn, interact with the **OpenSearch cluster**, which houses the **UBI events** and **UBI queries** indexes. + +Blue arrows illustrate standard search, bold, dashed lines illustrate UBI-specific additions, and +red arrows illustrate the flow of the `query_id` to and from OpenSearch. 
+ + + + +{% comment %} +The mermaid source below is converted into a PNG under +.../images/ubi/ubi-schema-interactions.png + + +```mermaid +graph LR +style L fill:none,stroke-dasharray: 5 5 +subgraph L["`*Legend*`"] + style ss height:150px + subgraph ss["Standard Search"] + direction LR + + style ln1a fill:blue + ln1a[ ]--->ln1b[ ]; + end + subgraph ubi-leg["UBI data flow"] + direction LR + + ln2a[ ].->|"`**UBI interaction**`"|ln2b[ ]; + style ln1c fill:red + ln1c[ ]-->|query_id flow|ln1d[ ]; + end +end +linkStyle 0 stroke-width:2px,stroke:#0A1CCF +linkStyle 2 stroke-width:2px,stroke:red +``` +```mermaid +%%{init: { + "flowchart": {"htmlLabels": false}, + + } +}%% +graph TB + +User--1) raw search string-->Search; +Search--2) search string-->Docs +style OS stroke-width:2px, stroke:#0A1CCF, fill:#62affb, opacity:.5 +subgraph OS[OpenSearch Cluster fa:fa-database] + style E stroke-width:1px,stroke:red + E[( UBI Events )] + style Docs stroke-width:1px,stroke:#0A1CCF + style Q stroke-width:1px,stroke:red + Docs[(Document Index)] -."3) {DSL...} & [object_id's,...]".-> Q[( UBI Queries )]; + Q -.4) query_id.-> Docs ; +end + +Docs -- "5) return both query_id & [objects,...]" --->Search ; +Search-.6) query_id.->U; +Search --7) [results, ...]--> User + +style *client-side* stroke-width:1px, stroke:#D35400 +subgraph "`*client-side*`" + style User stroke-width:4px, stroke:#EC636 + User["`**User**`" fa:fa-user] + App + Search + U + style App fill:#D35400,opacity:.35, stroke:#0A1CCF, stroke-width:2px + subgraph App[       UserApp fa:fa-store] + style Search stroke-width:2px, stroke:#0A1CCF + Search( Search Client ) + style U stroke-width:1px,stroke:red + U( UBI Client ) + end +end + +User -.8) selects object_id:123.->U; +U-."9) index event:{query_id, onClick, object_id:123}".->E; + +linkStyle 1,2,0,6 stroke-width:2px,fill:none,stroke:#0A1CCF +linkStyle 3,4,5,8 stroke-width:2px,fill:none,stroke:red +``` +{% endcomment %} +Here are some key points regarding the roles: +- The **Search client** is in charge of searching and then receiving *objects* from an OpenSearch document index (1, 2, **5**, and 7 in the preceding diagram). + +Step **5** is in bold because it denotes UBI-specific additions, like `query_id`, to standard OpenSearch interactions. + {: .note} +- If activated in the `ext.ubi` stanza of the search request, the **User Behavior Insights** plugin manages the **UBI queries** store in the background, indexing each query, ensuring a unique `query_id` for all returned `object_id` values, and then passing the `query_id` back to the **Search client** so that events can be linked to the query (3, 4, and **5** in preceding diagram). +- **Objects** represent the items that the user searches for using the queries. Activating UBI involves mapping your real-world objects (using their identifiers, such as an `isbn` or `sku`) to the `object_id` fields in the index that is searched. +- The **Search client**, if separate from the **UBI client**, forwards the indexed `query_id` to the **UBI client**. + Even though the *search* and *UBI event indexing* roles are separate in this diagram, many implementations can use the same OpenSearch client instance for both roles (6 in the preceding diagram). +- The **UBI client** then indexes all user events with the specified `query_id` until a new search is performed. At this time, a new `query_id` is generated by the **User Behavior Insights** plugin and passed back to the **UBI client**. 
+- If the **UBI client** interacts with a result **object**, such as during an **add to cart** event, then the `object_id`, `add_to_cart` `action_name`, and `query_id` are all indexed together, signaling the causal link between the *search* and the *object* (8 and 9 in the preceding diagram). + + + +## UBI stores + +There are two separate stores involved in supporting UBI data collection: +* UBI queries +* UBI events + +### UBI queries index + +All underlying query information and results (`object_ids`) are stored in the `ubi_queries` index and remain largely invisible in the background. + + +The `ubi_queries` index [schema](https://github.com/OpenSearch-project/user-behavior-insights/tree/main/src/main/resources/queries-mapping.json) includes the following fields: + +- `timestamp` (events and queries): A UNIX timestamp indicating when the query was received. + +- `query_id` (events and queries): The unique ID of the query provided by the client or generated automatically. Different queries with the same text generate different `query_id` values. + +- `client_id` (events and queries): A user/client ID provided by the client application. + +- `query_response_objects_ids` (queries): An array of object IDs. An ID can have the same value as the `_id`, but it is meant to be the externally valid ID of a document, item, or product. + +Because UBI manages the `ubi_queries` index, you should never have to write directly to this index (except when importing data). + +### UBI events index + +The client side directly indexes events to the `ubi_events` index, linking the event [`action_name`](#action_name), objects (each with an `object_id`), and queries (each with a `query_id`), along with any other important event information. +Because this schema is dynamic, you can add any new fields or structures (such as user information or geolocation information) that are not in the current **UBI events** [schema](https://github.com/opensearch-project/user-behavior-insights/tree/main/src/main/resources/events-mapping.json) at index time. + +Developers may define new fields under [`event_attributes`](#event_attributes). +{: .note} + +The following are the predefined, minimal fields in the `ubi_events` index: + +

+ +- `application` (size 100): The name of the application that tracks UBI events (for example, `amazon-shop` or `ABC-microservice`). + +

+ +- `action_name` (size 100): The name of the action that triggered the event. The UBI specification defines some common action names, but you can use any name. + +

+ +- `query_id` (size 100): The unique identifier of a query, which is typically a UUID but can be any string. + The `query_id` is either provided by the client or generated at index time by the UBI plugin. The `query_id` values in both the **UBI queries** and **UBI events** indexes must be consistent. + +

+ +- `client_id`: The client that issues the query. This is typically a web browser used by a unique user. + The `client_id` in both the **UBI queries** and **UBI events** indexes must be consistent. + +- `timestamp`: When the event occurred, either in UNIX format or formatted as `2018-11-13T20:20:39+00:00`. + +- `message_type` (size 100): A logical bin for grouping actions (each with an `action_name`). For example, `QUERY` or `CONVERSION`. + +- `message` (size 1,024): An optional text message for the log entry. For example, for a `message_type` `QUERY`, the `message` can contain the text related to a user's search. + +

+ +- `event_attributes`: An extensible structure that describes important context about the event. This structure consists of two primary structures: `position` and `object`. The structure is extensible, so you can add custom information about the event, such as the event's timing, user, or session. + + Because the `ubi_events` index is configured to perform dynamic mapping, the index can become bloated with many new fields. + {: .warning} + + - `event_attributes.position`: A structure that contains information about the location of the event origin, such as screen x, y coordinates, or the object's position in the list of results: + + - `event_attributes.position.ordinal`: Tracks the list position that a user can select (for example, selecting the third element can be described as `event{onClick, results[4]}`). + + - `event_attributes.position.{x,y}`: Tracks x and y values defined by the client. + + - `event_attributes.position.page_depth`: Tracks the page depth of the results. + + - `event_attributes.position.scroll_depth`: Tracks the scroll depth of the page results. + + - `event_attributes.position.trail`: A text field that tracks the path/trail that a user took to get to this location. + + - `event_attributes.object`: Contains identifying information about the object returned by the query (for example, a book, product, or post). + The `object` structure can refer to the object by internal ID or object ID. The `object_id` is the ID that links prior queries to this object. This field comprises the following subfields: + + - `event_attributes.object.internal_id`: A unique ID that OpenSearch can use to internally index the object, for example, the `_id` field in the indexes. + +

+ + - `event_attributes.object.object_id`: An ID by which a user can find the object instance in the **document corpus**. Examples include `ssn`, `isbn`, or `ean`. Variants need to be incorporated in the `object_id`, so a red T-shirt's `object_id` should be its SKU. + Initializing UBI requires mapping the document index's primary key to this `object_id`. +

+ +

+ + - `event_attributes.object.object_id_field`: Indicates the type/class of the object and the name of the search index field that contains the `object_id`. + + - `event_attributes.object.description`: An optional description of the object. + + - `event_attributes.object.object_detail`: Optional additional information about the object. + + - *extensible fields*: Be aware that any new indexed fields in the `object` will dynamically expand this schema. diff --git a/_search-plugins/ubi/sql-queries.md b/_search-plugins/ubi/sql-queries.md new file mode 100644 index 0000000000..517c81e1d6 --- /dev/null +++ b/_search-plugins/ubi/sql-queries.md @@ -0,0 +1,388 @@ +--- +layout: default +title: Sample UBI SQL queries +parent: User Behavior Insights +has_children: false +nav_order: 20 +--- + +# Sample UBI SQL queries + +You can run sample User Behavior Insights (UBI) SQL queries in the OpenSearch Dashboards [Query Workbench]({{site.url}}{{site.baseurl}}/dashboards/query-workbench/). + +To query a demo workbench with synthetic data, see +[http://chorus-opensearch-edition.dev.o19s.com:5601/app/opensearch-query-workbench](http://chorus-opensearch-edition.dev.o19s.com:5601/app/opensearch-query-workbench). + +## Queries with zero results + +Queries can be executed on events on either the server (`ubi_queries`) or client (`ubi_events`) side. + +### Server-side queries + +The UBI-activated search server logs the queries and their results, so in order to find all queries with *no* results, search for empty `query_response_hit_ids`: + +```sql +select + count(*) +from ubi_queries +where query_response_hit_ids is null + +``` + +### Client-side events + +Although it's relatively straightforward to find queries with no results on the server side, you can also get the same result by querying the *event attributes* that were logged on the client side. +Both client- and server-side queries return the same results. Use the following query to search for queries with no results: + +```sql +select + count(0) +from ubi_events +where event_attributes.result_count > 0 +``` + + + + +## Trending queries + +Trending queries can be found by using either of the following queries. + +### Server-side + +```sql +select + user_query, count(0) Total +from ubi_queries +group by user_query +order by Total desc +``` + +### Client-side + +```sql +select + message, count(0) Total +from ubi_events +where + action_name='on_search' +group by message +order by Total desc +``` + +Both queries return the distribution of search strings, as shown in the following table. + +Message|Total +|---|---| +User Behavior Insights|127 +best Laptop|78 +camera|21 +backpack|17 +briefcase|14 +camcorder|11 +cabinet|9 +bed|9 +box|8 +bottle|8 +calculator|8 +armchair|7 +bench|7 +blackberry|6 +bathroom|6 +User Behavior Insights Mac|5 +best Laptop Dell|5 +User Behavior Insights VTech|5 +ayoolaolafenwa|5 +User Behavior Insights Dell|4 +best Laptop Vaddio|4 +agrega modelos intuitivas|4 +bеуоnd|4 +abraza metodologías B2C|3 + + + +## Event type distribution counts + +To create a pie chart widget visualizing the most common events, run the following query: + +```sql +select + action_name, count(0) Total +from ubi_events +group by action_name +order by Total desc +``` + +The results include a distribution across actions, as shown in the following table. 
+ +action_name|Total +|---|---| +on_search|5425 +brand_filter|3634 +global_click|3571 +view_search_results|3565 +product_sort|3558 +type_filter|3505 +product_hover|820 +item_click|708 +purchase|407 +declined_product|402 +add_to_cart|373 +page_exit|142 +user_feedback|123 +404_redirect|123 + +The following query shows the distribution of margins across user actions: + + +```sql +select + action_name, + count(0) total, + AVG( event_attributes.object.object_detail.cost ) average_cost, + AVG( event_attributes.object.object_detail.margin ) average_margin +from ubi_events +group by action_name +order by average_cost desc +``` + +The results include actions and the distribution across average costs and margins, as shown in the following table. + +action_name|total|average_cost|average_margin +---|---|---|--- +declined_product|395|8457.12|6190.96 +item_click|690|7789.40|5862.70 +add_to_cart|374|6470.22|4617.09 +purchase|358|5933.83|5110.69 +global_click|3555|| +product_sort|3711|| +product_hover|779|| +page_exit|107|| +on_search|5438|| +brand_filter|3722|| +user_feedback|120|| +404_redirect|110|| +view_search_results|3639|| +type_filter|3691|| + +## Sample search journey + +To find a search in the query log, run the following query: + +```sql +select + client_id, query_id, user_query, query_response_hit_ids, query_response_id, timestamp +from ubi_queries where query_id = '7ae52966-4fd4-4ab1-8152-0fd0b52bdadf' +``` + +The following table shows the results of the preceding query. + +client_id|query_id|user_query|query_response_hit_ids|query_response_id|timestamp +---|---|---|---|---|--- +a15f1ef3-6bc6-4959-9b83-6699a4d29845|7ae52966-4fd4-4ab1-8152-0fd0b52bdadf|notebook|0882780391659|6e92c90c-1eee-4dd6-b820-c522fd4126f3|2024-06-04 19:02:45.728 + +The `query` field in `query_id` has the following nested structure: + +```json +{ + "query": { + "size": 25, + "query": { + "query_string": { + "query": "(title:\"notebook\" OR attr_t_device_type:\"notebook\" OR name:\"notebook\")", + "fields": [], + "type": "best_fields", + "default_operator": "or", + "max_determinized_states": 10000, + "enable_position_increments": true, + "fuzziness": "AUTO", + "fuzzy_prefix_length": 0, + "fuzzy_max_expansions": 50, + "phrase_slop": 0, + "analyze_wildcard": false, + "escape": false, + "auto_generate_synonyms_phrase_query": true, + "fuzzy_transpositions": true, + "boost": 1.0 + } + }, + "ext": { + "query_id": "7ae52966-4fd4-4ab1-8152-0fd0b52bdadf", + "user_query": "notebook", + "client_id": "a15f1ef3-6bc6-4959-9b83-6699a4d29845", + "object_id_field": "primary_ean", + "query_attributes": { + "application": "ubi-demo" + } + } + } +} +``` + +In the event log, `ubi_events`, search for the events that correspond to the preceding query (whose query ID is `7ae52966-4fd4-4ab1-8152-0fd0b52bdadf`): + +```sql +select + application, query_id, action_name, message_type, message, client_id, timestamp +from ubi_events +where query_id = '7ae52966-4fd4-4ab1-8152-0fd0b52bdadf' +order by timestamp +``` + + + +The results include all events associated with the user's query, as shown in the following table. 
+ +application|query_id|action_name|message_type|message|client_id|timestamp +---|---|---|---|---|---|--- +ubi-demo|7ae52966-4fd4-4ab1-8152-0fd0b52bdadf|on_search|QUERY|notebook|a15f1ef3-6bc6-4959-9b83-6699a4d29845|2024-06-04 19:02:45.777 +ubi-demo|7ae52966-4fd4-4ab1-8152-0fd0b52bdadf|product_hover|INFO|orquesta soluciones uno-a-uno|a15f1ef3-6bc6-4959-9b83-6699a4d29845|2024-06-04 19:02:45.816 +ubi-demo|7ae52966-4fd4-4ab1-8152-0fd0b52bdadf|item_click|INFO|innova relaciones centrado al usuario|a15f1ef3-6bc6-4959-9b83-6699a4d29845|2024-06-04 19:02:45.86 +ubi-demo|7ae52966-4fd4-4ab1-8152-0fd0b52bdadf|add_to_cart|CONVERSION|engineer B2B platforms|a15f1ef3-6bc6-4959-9b83-6699a4d29845|2024-06-04 19:02:45.905 +ubi-demo|7ae52966-4fd4-4ab1-8152-0fd0b52bdadf|purchase|CONVERSION|Purchase item 0884420136132|a15f1ef3-6bc6-4959-9b83-6699a4d29845|2024-06-04 19:02:45.913 + + + + +## User sessions + +To find more of the same user's sessions (with the client ID `a15f1ef3-6bc6-4959-9b83-6699a4d29845`), run the following query: + +```sql +select + application, event_attributes.session_id, query_id, + action_name, message_type, event_attributes.dwell_time, + event_attributes.object.object_id, + event_attributes.object.description, + timestamp +from ubi_events +where client_id = 'a15f1ef3-6bc6-4959-9b83-6699a4d29845' +order by query_id, timestamp +``` + +The results are truncated to show a sample of sessions, as shown in the following table. + + +application|event_attributes.session_id|query_id|action_name|message_type|event_attributes.dwell_time|event_attributes.object.object_id|event_attributes.object.description|timestamp +---|---|---|---|---|---|---|---|--- +ubi-demo|00731779-e290-4709-8af7-d495ae42bf48|0254a9b7-1d83-4083-aa46-e12dff86ec98|on_search|QUERY|46.6398|||2024-06-04 19:06:36.239 +ubi-demo|00731779-e290-4709-8af7-d495ae42bf48|0254a9b7-1d83-4083-aa46-e12dff86ec98|product_hover|INFO|53.681877|0065030834155|USB 2.0 S-Video and Composite Video Capture Cable|2024-06-04 19:06:36.284 +ubi-demo|00731779-e290-4709-8af7-d495ae42bf48|0254a9b7-1d83-4083-aa46-e12dff86ec98|item_click|INFO|40.699997|0065030834155|USB 2.0 S-Video and Composite Video Capture Cable|2024-06-04 19:06:36.334 +ubi-demo|00731779-e290-4709-8af7-d495ae42bf48|0254a9b7-1d83-4083-aa46-e12dff86ec98|declined_product|REJECT|5.0539055|0065030834155|USB 2.0 S-Video and Composite Video Capture Cable|2024-06-04 19:06:36.373 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|on_search|QUERY|26.422775|||2024-06-04 19:04:40.832 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|on_search|QUERY|17.1094|||2024-06-04 19:04:40.837 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|brand_filter|FILTER|40.090374|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:04:40.852 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|type_filter|INFO|37.658962|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:04:40.856 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|product_sort|SORT|3.6380951|||2024-06-04 19:04:40.923 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|view_search_results|INFO|46.436115|||2024-06-04 19:04:40.942 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|view_search_results|INFO|46.436115|||2024-06-04 19:04:40.959 
+ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|type_filter|INFO|37.658962|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:04:40.972 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|brand_filter|FILTER|40.090374|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:04:40.997 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|type_filter|INFO|37.658962|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:04:41.006 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|product_sort|SORT|3.6380951|||2024-06-04 19:04:41.031 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|product_sort|SORT|3.6380951|||2024-06-04 19:04:41.091 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|type_filter|INFO|37.658962|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:04:41.164 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|brand_filter|FILTER|40.090374|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:04:41.171 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|view_search_results|INFO|46.436115|||2024-06-04 19:04:41.179 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|global_click|INFO|42.45651|OBJECT-d350cc2d-b979-4aca-bd73-71709832940f|(96, 127)|2024-06-04 19:04:41.224 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|view_search_results|INFO|46.436115|||2024-06-04 19:04:41.24 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|view_search_results|INFO|46.436115|||2024-06-04 19:04:41.285 +ubi-demo|844ca4b5-b6f8-4f7b-a5ec-7f6d95788e0b|0cf185be-91a8-49cf-9401-92ad079ce43b|global_click|INFO|42.45651|OBJECT-d350cc2d-b979-4aca-bd73-71709832940f|(96, 127)|2024-06-04 19:04:41.328 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|on_search|QUERY|52.721157|||2024-06-04 19:03:50.8 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|view_search_results|INFO|26.600422|||2024-06-04 19:03:50.802 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|product_sort|SORT|14.839713|||2024-06-04 19:03:50.875 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|brand_filter|FILTER|20.876852|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:03:50.927 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|type_filter|INFO|15.212905|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:03:50.997 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|view_search_results|INFO|26.600422|||2024-06-04 19:03:51.033 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|global_click|INFO|11.710514|OBJECT-d350cc2d-b979-4aca-bd73-71709832940f|(96, 127)|2024-06-04 19:03:51.108 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|product_sort|SORT|14.839713|||2024-06-04 19:03:51.144 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|global_click|INFO|11.710514|OBJECT-d350cc2d-b979-4aca-bd73-71709832940f|(96, 127)|2024-06-04 19:03:51.17 
+ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|brand_filter|FILTER|20.876852|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:03:51.205 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|type_filter|INFO|15.212905|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:03:51.228 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|product_sort|SORT|14.839713|||2024-06-04 19:03:51.232 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|type_filter|INFO|15.212905|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:03:51.292 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|2071e273-513f-46be-b835-89f452095053|type_filter|INFO|15.212905|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:03:51.301 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|23f0149a-13ae-4977-8dc9-ef61c449c140|on_search|QUERY|16.93674|||2024-06-04 19:03:50.62 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|23f0149a-13ae-4977-8dc9-ef61c449c140|global_click|INFO|25.897957|OBJECT-d350cc2d-b979-4aca-bd73-71709832940f|(96, 127)|2024-06-04 19:03:50.624 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|23f0149a-13ae-4977-8dc9-ef61c449c140|product_sort|SORT|44.345097|||2024-06-04 19:03:50.688 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|23f0149a-13ae-4977-8dc9-ef61c449c140|brand_filter|FILTER|19.54417|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:03:50.696 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|23f0149a-13ae-4977-8dc9-ef61c449c140|type_filter|INFO|48.79312|OBJECT-32d9bb39-b17d-4611-82c1-5aaa14368060|filter_product_type|2024-06-04 19:03:50.74 +ubi-demo|33bd0ee2-60b7-4c25-b62c-1aa1580da73c|23f0149a-13ae-4977-8dc9-ef61c449c140|brand_filter|FILTER|19.54417|OBJECT-6c91da98-387b-45cb-8275-e90d1ea8bc54|supplier_name|2024-06-04 19:03:50.802 + + +## List user sessions for users who logged out without submitting any queries + +The following query searches for users who don't have an associated `query_id`. Note that this may happen if the client side does not pass the returned query to other events. + +```sql +select + client_id, session_id, count(0) EventTotal +from ubi_events +where action_name='logout' and query_id is null +group by client_id, session_id +order by EventTotal desc +``` + + + +The following table shows the client ID, session ID, and that there was 1 event,`logout`. 
+ +client_id|session_id|EventTotal +---|---|--- +100_15c182f2-05db-4f4f-814f-46dc0de6b9ea|1c36712c-44b8-4fdd-8f0d-fdfeab5bd794_1290|1 +175_e5f262f1-0db3-4948-b349-c5b95ff31259|816f94d6-8966-4a8b-8984-a2641d5865b2_2251|1 +175_e5f262f1-0db3-4948-b349-c5b95ff31259|314dc1ff-ef38-4da4-b4b1-061f62dddcbb_2248|1 +175_e5f262f1-0db3-4948-b349-c5b95ff31259|1ce5dc30-31bb-4759-9451-5a99b28ba91b_2255|1 +175_e5f262f1-0db3-4948-b349-c5b95ff31259|10ac0fc0-409e-4ba0-98e9-edb323556b1a_2249|1 +174_ab59e589-1ae4-40be-8b29-8efd9fc15380|dfa8b38a-c451-4190-a391-2e1ec3c8f196_2228|1 +174_ab59e589-1ae4-40be-8b29-8efd9fc15380|68666e11-087a-4978-9ca7-cbac6862273e_2233|1 +174_ab59e589-1ae4-40be-8b29-8efd9fc15380|5ca7a0df-f750-4656-b9a5-5eef1466ba09_2234|1 +174_ab59e589-1ae4-40be-8b29-8efd9fc15380|228c1135-b921-45f4-b087-b3422e7ed437_2236|1 +173_39d4cbfd-0666-4e77-84a9-965ed785db49|f9795e2e-ad92-4f15-8cdd-706aa1a3a17b_2206|1 +173_39d4cbfd-0666-4e77-84a9-965ed785db49|f3c18b61-2c8a-41b3-a023-11eb2dd6c93c_2207|1 +173_39d4cbfd-0666-4e77-84a9-965ed785db49|e12f700c-ffa3-4681-90d9-146022e26a18_2210|1 +173_39d4cbfd-0666-4e77-84a9-965ed785db49|da1ff1f6-26f1-49d4-bd0d-d32d199e270e_2208|1 +173_39d4cbfd-0666-4e77-84a9-965ed785db49|a1674e9d-d2dd-4da9-a4d1-dd12a401e8e7_2216|1 +172_875f04d6-2c35-45f4-a8ac-bc5b675425f6|cc8e6174-5c1a-48c5-8ee8-1226621fe9f7_2203|1 +171_7d810730-d6e9-4079-ab1c-db7f98776985|927fcfed-61d2-4334-91e9-77442b077764_2189|1 +16_581fe410-338e-457b-a790-85af2a642356|83a68f57-0fbb-4414-852b-4c4601bf6cf2_156|1 +16_581fe410-338e-457b-a790-85af2a642356|7881141b-511b-4df9-80e6-5450415af42c_162|1 +16_581fe410-338e-457b-a790-85af2a642356|1d64478e-c3a6-4148-9a64-b6f4a73fc684_158|1 + + + +You may want to identify users who logged out multiple times without submitting a query. The following query lets you see which users do this the most: + +```sql +select + client_id, count(0) EventTotal +from ubi_events +where action_name='logout' and query_id is null +group by client_id +order by EventTotal desc +``` + +The following table shows user client IDs and the number of logouts without any queries. 
+ +client_id|EventTotal +---|--- +87_5a6e1f8c-4936-4184-a24d-beddd05c9274|8 +127_829a4246-930a-4b24-8165-caa07ee3fa47|7 +49_5da537a3-8d94-48d1-a0a4-dcad21c12615|6 +56_6c7c2525-9ca5-4d5d-8ac0-acb43769ac0b|6 +140_61192c8e-c532-4164-ad1b-1afc58c265b7|6 +149_3443895e-6f81-4706-8141-1ebb0c2470ca|6 +196_4359f588-10be-4b2c-9e7f-ee846a75a3f6|6 +173_39d4cbfd-0666-4e77-84a9-965ed785db49|5 +52_778ac7f3-8e60-444e-ad40-d24516bf4ce2|5 +51_6335e0c3-7bea-4698-9f83-25c9fb984e12|5 +175_e5f262f1-0db3-4948-b349-c5b95ff31259|5 +61_feb3a495-c1fb-40ea-8331-81cee53a5eb9|5 +181_f227264f-cabd-4468-bfcc-4801baeebd39|5 +185_435d1c63-4829-45f3-abff-352ef6458f0e|5 +100_15c182f2-05db-4f4f-814f-46dc0de6b9ea|5 +113_df32ed6e-d74a-4956-ac8e-6d43d8d60317|5 +151_0808111d-07ce-4c84-a0fd-7125e4e33020|5 +204_b75e374c-4813-49c4-b111-4bf4fdab6f26|5 +29_ec2133e5-4d9b-4222-aa7c-2a9ae0880ddd|5 +41_f64abc69-56ea-4dd3-a991-7d1fd292a530|5 diff --git a/_search-plugins/ubi/ubi-dashboard-tutorial.md b/_search-plugins/ubi/ubi-dashboard-tutorial.md new file mode 100644 index 0000000000..ac6cd72186 --- /dev/null +++ b/_search-plugins/ubi/ubi-dashboard-tutorial.md @@ -0,0 +1,94 @@ +--- +layout: default +title: UBI dashboard tutorial +parent: User Behavior Insights +has_children: false +nav_order: 25 +--- + + +# UBI dashboard tutorial + +Whether you've been collecting user events and queries for a while or [you've uploaded some sample events](https://github.com/o19s/chorus-OpenSearch-edition/blob/main/katas/003_import_preexisting_event_data.md), you're now ready to visualize the data collected through User Behavior Insights (UBI) in a dashboard in OpenSearch. + +To quickly view a dashboard without completing the full tutorial, do the following: +1. Download and save the [sample UBI dashboard]({{site.url}}{{site.baseurl}}/assets/examples/ubi-dashboard.ndjson). +1. On the top menu, go to **Management > Dashboard Management**. +1. In the **Dashboards** panel, choose **Saved objects**. +1. In the upper-right corner, select **Import**. +1. In the **Select file** panel, choose **Import**. +1. Select the UBI dashboard file that you downloaded and select the **Import** button. + +## 1. Start OpenSearch Dashboards + +Start OpenSearch Dashboards. For example, go to `http://{server}:5601/app/home#/`. For more information, see [OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/). The following image shows the home page. +![Dashboard Home]({{site.url}}{{site.baseurl}}/images/ubi/home.png "Dashboards") + +## 2. Create an index pattern + +In OpenSearch Management, navigate to **Dashboards Management > Index patterns** or navigate using a URL, such as `http://{server}:5601/app/management/OpenSearch-dashboards/indexPatterns`. + +OpenSearch Dashboards accesses your indexes using index patterns. To visualize your users' online search behavior, you must create an index pattern in order to access the indexes that UBI creates. For more information, see [Index patterns]({{site.url}}{{site.baseurl}}/dashboards/management/index-patterns/). + +After you select **Create index pattern**, a list of indexes in your OpenSearch instance is displayed. The UBI stores may be hidden by default, so make sure to select **Include system and hidden indexes**, as shown in the following image. +![Index Patterns]({{site.url}}{{site.baseurl}}/images/ubi/index_pattern2.png "Index Patterns") + +You can group indexes into the same data source for your dashboard using wildcards. For this tutorial you'll combine the query and event stores into the `ubi_*` pattern. 
+ +OpenSearch Dashboards prompts you to filter on any `date` field in your schema so that you can look at things like trending queries over the last 15 minutes. However, for your first dashboard, select **I don't want to use the time filter**, as shown in the following image. +Index Patterns + + +After selecting **Create index pattern**, you're ready to start building a dashboard that displays the UBI store data. + +## 3. Create a new dashboard + +To create a new dashboard, on the top menu, select **OpenSearch Dashboards > Dashboards** and then **Create > Dashboard** > **Create new**. +If you haven't previously created a dashboard, you are presented with the option to create a new dashboard. Otherwise, previously created dashboards are displayed. + + +In the **New Visualization** window, select **Pie** to create a new pie chart. Then select the index pattern you created in step 2. + +Most visualizations require some sort of aggregate function on a bucket/facet/aggregatable field (numeric or keyword). You'll add a `Terms` aggregation to the `action_name` field so that you can view the distribution of event names. Change the **Size** to the number of slices you want to display, as shown in the following image. +![Pie Chart]({{site.url}}{{site.baseurl}}/images/ubi/pie.png "Pie Chart") + +Save the visualization so that it's added to your new dashboard. Now that you have a visualization displayed on your dashboard, you can save the dashboard. + +## 4. Add a tag cloud visualization + +Now you'll add a word cloud for trending searches by creating a new visualization, similarly to the previous step. + +In the **New Visualization** window, select **Tag Cloud**, and then select the index pattern you created in step 2. Choose the tag cloud visualization of the terms in the `message` field where the JavaScript client logs the raw search text. Note: The true query, as processed by OpenSearch with filters, boosting, and so on, resides in the `ubi_queries` index. However, you'll view the `message` field of the `ubi_events` index, where the JavaScript client captures the text that the user actually typed. + +The following image shows the tag cloud visualization on the `message` field. +![Word Cloud]({{site.url}}{{site.baseurl}}/images/ubi/tag_cloud1.png "Word Cloud") + +The underlying queries can be found at [SQL trending queries]({{site.url}}{{site.baseurl}}/search-plugins/ubi/sql-queries/#trending-queries). +{: .note} + + +The resulting visualization may contain different information than you're looking for. The `message` field is updated with every event, and as a result, it can contain error messages, debug messages, click information, and other unwanted data. +To view only search terms for query events, you need to add a filter to your visualization. Because during setup you provided a `message_type` of `QUERY` for each search event, you can filter by that message type to isolate the specific users' searches. To do this, select **Add filter** and then select **QUERY** in the **Edit filter** panel, as shown in the following image. +![Word Cloud]({{site.url}}{{site.baseurl}}/images/ubi/tag_cloud2.png "Word Cloud") + +There should now be two visualizations (the pie chart and the tag cloud) displayed on your dashboard, as shown in the following image. +![UBI Dashboard]({{site.url}}{{site.baseurl}}/images/ubi/dashboard2.png "UBI Dashboard") + +## 5. Add a histogram of item clicks + +Now you'll add a histogram visualization to your dashboard, similarly to the previous step. 
In the **New Visualization** window, select **Vertical Bar**. Then select the index pattern you created in step 2. + +Examine the `event_attributes.position.ordinal` data field. This field contains the position of the item in a list selected by the user. For the histogram visualization, the x-axis represents the ordinal number of the selected item (n). The y-axis represents the number of times that the nth item was clicked, as shown in the following image. + +![Vertical Bar Chart]({{site.url}}{{site.baseurl}}/images/ubi/histogram.png "Vertical Bar Chart") + +## 6) Filter the displayed data + +Now you can further filter the displayed data. For example, you can see how the click position changes when a purchase occurs. Select **Add filter** and then select the `action_name:product_purchase` field, as shown in the following image. +![Product Purchase]({{site.url}}{{site.baseurl}}/images/ubi/product_purchase.png "Product Purchase") + + +You can filter event messages containing the word `*laptop*` by adding wildcards, as shown in the following image. +![Laptop]({{site.url}}{{site.baseurl}}/images/ubi/laptop.png "Laptop"). + + diff --git a/_search-plugins/vector-search.md b/_search-plugins/vector-search.md new file mode 100644 index 0000000000..862b26b375 --- /dev/null +++ b/_search-plugins/vector-search.md @@ -0,0 +1,283 @@ +--- +layout: default +title: Vector search +nav_order: 22 +has_children: false +has_toc: false +--- + +# Vector search + +OpenSearch is a comprehensive search platform that supports a variety of data types, including vectors. OpenSearch vector database functionality is seamlessly integrated with its generic database function. + +In OpenSearch, you can generate vector embeddings, store those embeddings in an index, and use them for vector search. Choose one of the following options: + +- Generate embeddings using a library of your choice before ingesting them into OpenSearch. Once you ingest vectors into an index, you can perform a vector similarity search on the vector space. For more information, see [Working with embeddings generated outside of OpenSearch](#working-with-embeddings-generated-outside-of-opensearch). +- Automatically generate embeddings within OpenSearch. To use embeddings for semantic search, the ingested text (the corpus) and the query need to be embedded using the same model. [Neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) packages this functionality, eliminating the need to manage the internal details. For more information, see [Generating vector embeddings within OpenSearch](#generating-vector-embeddings-in-opensearch). + +## Working with embeddings generated outside of OpenSearch + +After you generate vector embeddings, upload them to an OpenSearch index and search the index using vector search. For a complete example, see [Example](#example). 
+ +### k-NN index + +To build a vector database and use vector search, you must specify your index as a [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/) when creating it by setting `index.knn` to `true`: + +```json +PUT test-index +{ + "settings": { + "index": { + "knn": true, + "knn.algo_param.ef_search": 100 + } + }, + "mappings": { + "properties": { + "my_vector1": { + "type": "knn_vector", + "dimension": 1024, + "method": { + "name": "hnsw", + "space_type": "l2", + "engine": "nmslib", + "parameters": { + "ef_construction": 128, + "m": 24 + } + } + } + } + } +} +``` +{% include copy-curl.html %} + +### k-NN vector + +You must designate the field that will store vectors as a [`knn_vector`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/) field type. OpenSearch supports vectors of up to 16,000 dimensions, each of which is represented as a 32-bit or 16-bit float. + +To save storage space, you can use `byte` vectors. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector). + +### k-NN vector search + +Vector search finds the vectors in your database that are most similar to the query vector. OpenSearch supports the following search methods: + +- [Approximate search](#approximate-search) (approximate k-NN, or ANN): Returns approximate nearest neighbors to the query vector. Usually, approximate search algorithms sacrifice indexing speed and search accuracy in exchange for performance benefits such as lower latency, smaller memory footprints, and more scalable search. For most use cases, approximate search is the best option. + +- Exact search (exact k-NN): A brute-force, exact k-NN search of vector fields. OpenSearch supports the following types of exact search: + - [Exact k-NN with scoring script]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-score-script/): Using the k-NN scoring script, you can apply a filter to an index before executing the nearest neighbor search. + - [Painless extensions]({{site.url}}{{site.baseurl}}/search-plugins/knn/painless-functions/): Adds the distance functions as Painless extensions that you can use in more complex combinations. You can use this method to perform a brute-force, exact k-NN search of an index, which also supports pre-filtering. + +### Approximate search + +OpenSearch supports several algorithms for approximate vector search, each with its own advantages. For complete documentation, see [Approximate search]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/). For more information about the search methods and engines, see [Method definitions]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#method-definitions). For method recommendations, see [Choosing the right method]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#choosing-the-right-method). + +To use approximate vector search, specify one of the following search methods (algorithms) in the `method` parameter: + +- Hierarchical Navigable Small World (HNSW) +- Inverted File System (IVF) + +Additionally, specify the engine (library) that implements this method in the `engine` parameter: + +- [Non-Metric Space Library (NMSLIB)](https://github.com/nmslib/nmslib) +- [Facebook AI Similarity Search (Faiss)](https://github.com/facebookresearch/faiss) +- Lucene + +The following table lists the combinations of search methods and libraries supported by the k-NN engine for approximate vector search. 
+ +Method | Engine +:--- | :--- +HNSW | NMSLIB, Faiss, Lucene +IVF | Faiss + +### Engine recommendations + +In general, select NMSLIB or Faiss for large-scale use cases. Lucene is a good option for smaller deployments and offers benefits like smart filtering, where the optimal filtering strategy—pre-filtering, post-filtering, or exact k-NN—is automatically applied depending on the situation. The following table summarizes the differences between each option. + +| | NMSLIB/HNSW | Faiss/HNSW | Faiss/IVF | Lucene/HNSW | +|:---|:---|:---|:---|:---| +| Max dimensions | 16,000 | 16,000 | 16,000 | 1,024 | +| Filter | Post-filter | Post-filter | Post-filter | Filter during search | +| Training required | No | No | Yes | No | +| Similarity metrics | `l2`, `innerproduct`, `cosinesimil`, `l1`, `linf` | `l2`, `innerproduct` | `l2`, `innerproduct` | `l2`, `cosinesimil` | +| Number of vectors | Tens of billions | Tens of billions | Tens of billions | Less than 10 million | +| Indexing latency | Low | Low | Lowest | Low | +| Query latency and quality | Low latency and high quality | Low latency and high quality | Low latency and low quality | High latency and high quality | +| Vector compression | Flat | Flat
<br>Product quantization | Flat<br>Product quantization | Flat |
+| Memory consumption | High | High<br>Low with PQ | Medium<br>
Low with PQ | High | + +### Example + +In this example, you'll create a k-NN index, add data to the index, and search the data. + +#### Step 1: Create a k-NN index + +First, create an index that will store sample hotel data. Set `index.knn` to `true` and specify the `location` field as a `knn_vector`: + +```json +PUT /hotels-index +{ + "settings": { + "index": { + "knn": true, + "knn.algo_param.ef_search": 100, + "number_of_shards": 1, + "number_of_replicas": 0 + } + }, + "mappings": { + "properties": { + "location": { + "type": "knn_vector", + "dimension": 2, + "method": { + "name": "hnsw", + "space_type": "l2", + "engine": "lucene", + "parameters": { + "ef_construction": 100, + "m": 16 + } + } + } + } + } +} +``` +{% include copy-curl.html %} + +#### Step 2: Add data to your index + +Next, add data to your index. Each document represents a hotel. The `location` field in each document contains a vector specifying the hotel's location: + +```json +POST /_bulk +{ "index": { "_index": "hotels-index", "_id": "1" } } +{ "location": [5.2, 4.4] } +{ "index": { "_index": "hotels-index", "_id": "2" } } +{ "location": [5.2, 3.9] } +{ "index": { "_index": "hotels-index", "_id": "3" } } +{ "location": [4.9, 3.4] } +{ "index": { "_index": "hotels-index", "_id": "4" } } +{ "location": [4.2, 4.6] } +{ "index": { "_index": "hotels-index", "_id": "5" } } +{ "location": [3.3, 4.5] } +``` +{% include copy-curl.html %} + +#### Step 3: Search your data + +Now search for hotels closest to the pin location `[5, 4]`. This location is labeled `Pin` in the following image. Each hotel is labeled with its document number. + +![Hotels on a coordinate plane]({{site.url}}{{site.baseurl}}/images/k-nn-search-hotels.png/) + +To search for the top three closest hotels, set `k` to `3`: + +```json +POST /hotels-index/_search +{ + "size": 3, + "query": { + "knn": { + "location": { + "vector": [ + 5, + 4 + ], + "k": 3 + } + } + } +} +``` +{% include copy-curl.html %} + +The response contains the hotels closest to the specified pin location: + +```json +{ + "took": 1093, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 3, + "relation": "eq" + }, + "max_score": 0.952381, + "hits": [ + { + "_index": "hotels-index", + "_id": "2", + "_score": 0.952381, + "_source": { + "location": [ + 5.2, + 3.9 + ] + } + }, + { + "_index": "hotels-index", + "_id": "1", + "_score": 0.8333333, + "_source": { + "location": [ + 5.2, + 4.4 + ] + } + }, + { + "_index": "hotels-index", + "_id": "3", + "_score": 0.72992706, + "_source": { + "location": [ + 4.9, + 3.4 + ] + } + } + ] + } +} +``` + +### Vector search with filtering + +For information about vector search with filtering, see [k-NN search with filters]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/). + +## Generating vector embeddings in OpenSearch + +[Neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) encapsulates the infrastructure needed to perform semantic vector searches. After you integrate an inference (embedding) service, neural search functions like lexical search, accepting a textual query and returning relevant documents. + +When you index your data, neural search transforms text into vector embeddings and indexes both the text and its vector embeddings in a vector index. When you use a neural query during search, neural search converts the query text into vector embeddings and uses vector search to return the results. 
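+To illustrate the query side of this flow, the following is a minimal sketch of a neural query. It assumes a hypothetical index named `my-nlp-index` with a `passage_embedding` vector field and an already deployed text embedding model; replace `your_model_id` with the ID of your deployed model:
+
+```json
+GET /my-nlp-index/_search
+{
+  "query": {
+    "neural": {
+      "passage_embedding": {
+        "query_text": "wild west",
+        "model_id": "your_model_id",
+        "k": 5
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+At search time, the model converts `query_text` into an embedding, and the `k` documents nearest to it in the vector space are returned.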
+ +### Choosing a model + +The first step in setting up neural search is choosing a model. You can upload a model to your OpenSearch cluster, use one of the pretrained models provided by OpenSearch, or connect to an externally hosted model. For more information, see [Integrating ML models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/). + +### Neural search tutorial + +For a step-by-step tutorial, see [Neural search tutorial]({{site.url}}{{site.baseurl}}/search-plugins/neural-search-tutorial/). + +### Search methods + +Choose one of the following search methods to use your model for neural search: + +- [Semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/): Uses dense retrieval based on text embedding models to search text data. + +- [Hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/): Combines lexical and neural search to improve search relevance. + +- [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/): Uses neural search with multimodal embedding models to search text and image data. + +- [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): Uses neural search with sparse retrieval based on sparse embedding models to search text data. + +- [Conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/): With conversational search, you can ask questions in natural language, receive a text response, and ask additional clarifying questions. diff --git a/_security/access-control/document-level-security.md b/_security/access-control/document-level-security.md index 08de85bbf7..352fe06a61 100644 --- a/_security/access-control/document-level-security.md +++ b/_security/access-control/document-level-security.md @@ -191,6 +191,10 @@ Adaptive | `adaptive-level` | The default setting that allows OpenSearch to auto OpenSearch combines all DLS queries with the logical `OR` operator. However, when a role that uses DLS is combined with another security role that doesn't use DLS, the query results are filtered to display only documents matching the DLS from the first role. This filter rule also applies to roles that do not grant read documents. +### DLS and write permissions + +Make sure that a user that has DLS-configured roles does not have write permissions. If write permissions are added, the user will be able to index documents which they will not be able to retrieve due to DLS filtering. + ### When to enable `plugins.security.dfm_empty_overrides_all` When to enable the `plugins.security.dfm_empty_overrides_all` setting depends on whether you want to restrict user access to documents without DLS. diff --git a/_security/audit-logs/storage-types.md b/_security/audit-logs/storage-types.md index c0707ff424..719287ad7f 100644 --- a/_security/audit-logs/storage-types.md +++ b/_security/audit-logs/storage-types.md @@ -53,8 +53,8 @@ If you use `external_opensearch` and the remote cluster also uses the Security p Name | Data type | Description :--- | :--- | :--- -`plugins.security.audit.config.enable_ssl` | Boolean | If you enabled SSL/TLS on the receiving cluster, set to true. The default is false. -`plugins.security.audit.config.verify_hostnames` | Boolean | Whether to verify the hostname of the SSL/TLS certificate of the receiving cluster. Default is true. +`plugins.security.audit.config.enable_ssl` | Boolean | If you enabled SSL/TLS on the receiving cluster, set to true. The Default is `false`. 
+`plugins.security.audit.config.verify_hostnames` | Boolean | Whether to verify the hostname of the SSL/TLS certificate of the receiving cluster. Default is `true`. `plugins.security.audit.config.pemtrustedcas_filepath` | String | The trusted root certificate of the external OpenSearch cluster, relative to the `config` directory. `plugins.security.audit.config.pemtrustedcas_content` | String | Instead of specifying the path (`plugins.security.audit.config.pemtrustedcas_filepath`), you can configure the Base64-encoded certificate content directly. `plugins.security.audit.config.enable_ssl_client_auth` | Boolean | Whether to enable SSL/TLS client authentication. If you set this to true, the audit log module sends the node's certificate along with the request. The receiving cluster can use this certificate to verify the identity of the caller. diff --git a/_security/authentication-backends/jwt.md b/_security/authentication-backends/jwt.md index b6b08388b5..3f28dfecfd 100644 --- a/_security/authentication-backends/jwt.md +++ b/_security/authentication-backends/jwt.md @@ -122,7 +122,7 @@ Name | Description `jwt_url_parameter` | If the token is not transmitted in the HTTP header but rather as an URL parameter, define the name of the parameter here. `subject_key` | The key in the JSON payload that stores the username. If not set, the [subject](https://tools.ietf.org/html/rfc7519#section-4.1.2) registered claim is used. `roles_key` | The key in the JSON payload that stores the user's roles. The value of this key must be a comma-separated list of roles. -`required_audience` | The name of the audience which the JWT must specify. This corresponds [`aud` claim of the JWT](https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3). +`required_audience` | The name of the audience that the JWT must specify. You can set a single value (for example, `project1`) or multiple comma-separated values (for example, `project1,admin`). If you set multiple values, the JWT must have at least one required audience. This parameter corresponds to the [`aud` claim of the JWT](https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3). `required_issuer` | The target issuer of JWT stored in the JSON payload. This corresponds to the [`iss` claim of the JWT](https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.1). `jwt_clock_skew_tolerance_seconds` | Sets a window of time, in seconds, to compensate for any disparity between the JWT authentication server and OpenSearch node clock times, thereby preventing authentication failures due to the misalignment. Security sets 30 seconds as the default. Use this setting to apply a custom value. diff --git a/_security/authentication-backends/ldap.md b/_security/authentication-backends/ldap.md index 49b01e332b..9f98f7f5b0 100755 --- a/_security/authentication-backends/ldap.md +++ b/_security/authentication-backends/ldap.md @@ -61,8 +61,21 @@ We provide a fully functional example that can help you understand how to use an To enable LDAP authentication and authorization, add the following lines to `config/opensearch-security/config.yml`: +The internal user database authentication should also be enabled because OpenSearch Dashboards connects to OpenSearch using the `kibanaserver` internal user. 
+{: .note} + ```yml authc: + internal_auth: + order: 0 + description: "HTTP basic authentication using the internal user database" + http_enabled: true + transport_enabled: true + http_authenticator: + type: basic + challenge: false + authentication_backend: + type: internal ldap: http_enabled: true transport_enabled: true diff --git a/_security/authentication-backends/openid-connect.md b/_security/authentication-backends/openid-connect.md index 8efb66fbb6..8e785a9e65 100755 --- a/_security/authentication-backends/openid-connect.md +++ b/_security/authentication-backends/openid-connect.md @@ -181,8 +181,8 @@ config: Name | Description :--- | :--- -`enable_ssl` | Whether to use TLS. Default is false. -`verify_hostnames` | Whether to verify the hostnames of the IdP's TLS certificate. Default is true. +`enable_ssl` | Whether to use TLS. Default is `false`. +`verify_hostnames` | Whether to verify the hostnames of the IdP's TLS certificate. Default is `true`. ### Certificate validation @@ -252,7 +252,7 @@ config: Name | Description :--- | :--- -`enable_ssl_client_auth` | Whether to send the client certificate to the IdP server. Default is false. +`enable_ssl_client_auth` | Whether to send the client certificate to the IdP server. Default is `false`. `pemcert_filepath` | Absolute path to the client certificate. `pemcert_content` | The content of the client certificate. Cannot be used when `pemcert_filepath` is set. `pemkey_filepath` | Absolute path to the file containing the private key of the client certificate. diff --git a/_security/authentication-backends/proxy.md b/_security/authentication-backends/proxy.md index bb7d1f0151..7716b1d6d2 100644 --- a/_security/authentication-backends/proxy.md +++ b/_security/authentication-backends/proxy.md @@ -40,7 +40,7 @@ You can configure the following settings: Name | Description :--- | :--- -`enabled` | Enables or disables proxy support. Default is false. +`enabled` | Enables or disables proxy support. Default is `false`. `internalProxies` | A regular expression containing the IP addresses of all trusted proxies. The pattern `.*` trusts all internal proxies. `remoteIpHeader` | Name of the HTTP header field that has the hostname chain. Default is `x-forwarded-for`. diff --git a/_security/authentication-backends/saml.md b/_security/authentication-backends/saml.md index a4511a5325..652345ccdc 100755 --- a/_security/authentication-backends/saml.md +++ b/_security/authentication-backends/saml.md @@ -244,7 +244,7 @@ If you are loading the IdP metadata from a URL, we recommend that you use SSL/TL Name | Description :--- | :--- -`idp.enable_ssl` | Whether to enable the custom TLS configuration. Default is false (JDK settings are used). +`idp.enable_ssl` | Whether to enable the custom TLS configuration. Default is `false` (JDK settings are used). `idp.verify_hostnames` | Whether to verify the hostnames of the server's TLS certificate. Example: @@ -302,7 +302,7 @@ The Security plugin can use TLS client authentication when fetching the IdP meta Name | Description :--- | :--- -`idp.enable_ssl_client_auth` | Whether to send a client certificate to the IdP server. Default is false. +`idp.enable_ssl_client_auth` | Whether to send a client certificate to the IdP server. Default is `false`. `idp.pemcert_filepath` | Path to the PEM file containing the client certificate. The file must be placed under the OpenSearch `config` directory, and the path must be specified relative to the `config` directory. `idp.pemcert_content` | The content of the client certificate. 
Cannot be used when `pemcert_filepath` is set. `idp.pemkey_filepath` | Path to the private key of the client certificate. The file must be placed under the OpenSearch `config` directory, and the path must be specified relative to the `config` directory. diff --git a/_security/configuration/security-admin.md b/_security/configuration/security-admin.md index ed293b7e91..77d3711385 100755 --- a/_security/configuration/security-admin.md +++ b/_security/configuration/security-admin.md @@ -201,7 +201,7 @@ Name | Description `-cn` | Cluster name. Default is `opensearch`. `-icl` | Ignore cluster name. `-sniff` | Sniff cluster nodes. Sniffing detects available nodes using the OpenSearch `_cluster/state` API. -`-arc,--accept-red-cluster` | Execute `securityadmin.sh` even if the cluster state is red. Default is false, which means the script will not execute on a red cluster. +`-arc,--accept-red-cluster` | Execute `securityadmin.sh` even if the cluster state is red. Default is `false`, which means the script will not execute on a red cluster. ### Certificate validation settings @@ -210,7 +210,7 @@ Use the following options to control certificate validation. Name | Description :--- | :--- -`-nhnv` | Do not validate hostname. Default is false. +`-nhnv` | Do not validate hostname. Default is `false`. `-nrhn` | Do not resolve hostname. Only relevant if `-nhnv` is not set. diff --git a/_security/configuration/tls.md b/_security/configuration/tls.md index d06b16a47e..a4115b8c25 100755 --- a/_security/configuration/tls.md +++ b/_security/configuration/tls.md @@ -52,11 +52,11 @@ The following settings configure the location and password of your keystore and Name | Description :--- | :--- -`plugins.security.ssl.transport.keystore_type` | The type of the keystore file, JKS or PKCS12/PFX. Optional. Default is JKS. +`plugins.security.ssl.transport.keystore_type` | The type of the keystore file, `JKS` or `PKCS12/PFX`. Optional. Default is `JKS`. `plugins.security.ssl.transport.keystore_filepath` | Path to the keystore file, which must be under the `config` directory, specified using a relative path. Required. `plugins.security.ssl.transport.keystore_alias` | The alias name of the keystore. Optional. Default is the first alias. `plugins.security.ssl.transport.keystore_password` | Keystore password. Default is `changeit`. -`plugins.security.ssl.transport.truststore_type` | The type of the truststore file, JKS or PKCS12/PFX. Default is JKS. +`plugins.security.ssl.transport.truststore_type` | The type of the truststore file, `JKS` or `PKCS12/PFX`. Default is `JKS`. `plugins.security.ssl.transport.truststore_filepath` | Path to the truststore file, which must be under the `config` directory, specified using a relative path. Required. `plugins.security.ssl.transport.truststore_alias` | The alias name of the truststore. Optional. Default is all certificates. `plugins.security.ssl.transport.truststore_password` | Truststore password. Default is `changeit`. @@ -65,7 +65,7 @@ Name | Description Name | Description :--- | :--- -`plugins.security.ssl.http.enabled` | Whether to enable TLS on the REST layer. If enabled, only HTTPS is allowed. Optional. Default is false. +`plugins.security.ssl.http.enabled` | Whether to enable TLS on the REST layer. If enabled, only HTTPS is allowed. Optional. Default is `false`. `plugins.security.ssl.http.keystore_type` | The type of the keystore file, JKS or PKCS12/PFX. Optional. Default is JKS. 
`plugins.security.ssl.http.keystore_filepath` | Path to the keystore file, which must be under the `config` directory, specified using a relative path. Required. `plugins.security.ssl.http.keystore_alias` | The alias name of the keystore. Optional. Default is the first alias. @@ -137,7 +137,7 @@ plugins.security.authcz.admin_dn: For security reasons, you cannot use wildcards or regular expressions as values for the `admin_dn` setting. -For more information about admin and super admin user roles, see [Admin and super admin roles](https://opensearch.org/docs/latest/security/access-control/users-roles/#admin-and-super-admin-roles) and [Configuring super admin certificates](https://opensearch.org/docs/latest/security/configuration/tls/#configuring-admin-certificates). +For more information about admin and super admin user roles, see [Admin and super admin roles](https://opensearch.org/docs/latest/security/access-control/users-roles/#admin-and-super-admin-roles). ## (Advanced) OpenSSL @@ -150,8 +150,8 @@ If OpenSSL is enabled, but for one reason or another the installation does not w Name | Description :--- | :--- -`plugins.security.ssl.transport.enable_openssl_if_available` | Enable OpenSSL on the transport layer if available. Optional. Default is true. -`plugins.security.ssl.http.enable_openssl_if_available` | Enable OpenSSL on the REST layer if available. Optional. Default is true. +`plugins.security.ssl.transport.enable_openssl_if_available` | Enable OpenSSL on the transport layer if available. Optional. Default is `true`. +`plugins.security.ssl.http.enable_openssl_if_available` | Enable OpenSSL on the REST layer if available. Optional. Default is `true`. {% comment %} 1. Install [OpenSSL 1.1.0](https://www.openssl.org/community/binaries.html) on every node. @@ -179,8 +179,8 @@ In addition, when `resolve_hostname` is enabled, the Security plugin resolves th Name | Description :--- | :--- -`plugins.security.ssl.transport.enforce_hostname_verification` | Whether to verify hostnames on the transport layer. Optional. Default is true. -`plugins.security.ssl.transport.resolve_hostname` | Whether to resolve hostnames against DNS on the transport layer. Optional. Default is true. Only works if hostname verification is also enabled. +`plugins.security.ssl.transport.enforce_hostname_verification` | Whether to verify hostnames on the transport layer. Optional. Default is `true`. +`plugins.security.ssl.transport.resolve_hostname` | Whether to resolve hostnames against DNS on the transport layer. Optional. Default is `true`. Only works if hostname verification is also enabled. ## (Advanced) Client authentication diff --git a/_tuning-your-cluster/availability-and-recovery/snapshots/sm-api.md b/_tuning-your-cluster/availability-and-recovery/snapshots/sm-api.md index cd3a238f9c..5d89a3747b 100644 --- a/_tuning-your-cluster/availability-and-recovery/snapshots/sm-api.md +++ b/_tuning-your-cluster/availability-and-recovery/snapshots/sm-api.md @@ -181,7 +181,7 @@ Parameter | Type | Description `enabled` | Boolean | Should this SM policy be enabled at creation? Optional. `snapshot_config` | Object | The configuration options for snapshot creation. Required. `snapshot_config.date_format` | String | Snapshot names have the format `--`. `date_format` specifies the format for the date in the snapshot name. Supports all date formats supported by OpenSearch. Optional. Default is "yyyy-MM-dd'T'HH:mm:ss". -`snapshot_config.date_format_timezone` | String | Snapshot names have the format `--`. 
`date_format_timezone` specifies the time zone for the date in the snapshot name. Optional. Default is UTC. +`snapshot_config.date_format_timezone` | String | Snapshot names have the format `--`. `date_format_timezone` specifies the time zone for the date in the snapshot name. Optional. Default is `UTC`. `snapshot_config.indices` | String | The names of the indexes in the snapshot. Multiple index names are separated by `,`. Supports wildcards (`*`). Optional. Default is `*` (all indexes). `snapshot_config.repository` | String | The repository in which to store snapshots. Required. `snapshot_config.ignore_unavailable` | Boolean | Do you want to ignore unavailable indexes? Optional. Default is `false`. @@ -197,7 +197,7 @@ Parameter | Type | Description `deletion.delete_condition` | Object | Conditions for snapshot deletion. Optional. `deletion.delete_condition.max_count` | Integer | The maximum number of snapshots to be retained. Optional. `deletion.delete_condition.max_age` | String | The maximum time a snapshot is retained. Optional. -`deletion.delete_condition.min_count` | Integer | The minimum number of snapshots to be retained. Optional. Default is one. +`deletion.delete_condition.min_count` | Integer | The minimum number of snapshots to be retained. Optional. Default is `1`. `notification` | Object | Defines notifications for SM events. Optional. `notification.channel` | Object | Defines a channel for notifications. You must [create and configure a notification channel]({{site.url}}{{site.baseurl}}/notifications-plugin/api) before setting up SM notifications. Required. `notification.channel.id` | String | The channel ID of the channel used for notifications. To get the channel IDs of all created channels, use `GET _plugins/_notifications/configs`. Required. diff --git a/_tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md b/_tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md index f35115c95f..812d5104c7 100644 --- a/_tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md +++ b/_tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md @@ -475,10 +475,10 @@ POST /_snapshot/my-repository/2/_restore Request parameters | Description :--- | :--- `indices` | The indexes you want to restore. You can use `,` to create a list of indexes, `*` to specify an index pattern, and `-` to exclude certain indexes. Don't put spaces between items. Default is all indexes. -`ignore_unavailable` | If an index from the `indices` list doesn't exist, whether to ignore it rather than fail the restore operation. Default is false. -`include_global_state` | Whether to restore the cluster state. Default is false. -`include_aliases` | Whether to restore aliases alongside their associated indexes. Default is true. -`partial` | Whether to allow the restoration of partial snapshots. Default is false. +`ignore_unavailable` | If an index from the `indices` list doesn't exist, whether to ignore it rather than fail the restore operation. Default is `false`. +`include_global_state` | Whether to restore the cluster state. Default is `false`. +`include_aliases` | Whether to restore aliases alongside their associated indexes. Default is `true`. +`partial` | Whether to allow the restoration of partial snapshots. Default is `false`. `rename_pattern` | If you want to rename indexes as you restore them, use this option to specify a regular expression that matches all indexes you want to restore. 
Use capture groups (`()`) to reuse portions of the index name. `rename_replacement` | If you want to rename indexes as you restore them, use this option to specify the replacement pattern. Use `$0` to include the entire matching index name, `$1` to include the content of the first capture group, and so on. `index_settings` | If you want to change [index settings]({{site.url}}{{site.baseurl}}/im-plugin/index-settings/) applied during the restore operation, specify them here. You cannot change `index.number_of_shards`. diff --git a/_tuning-your-cluster/index.md b/_tuning-your-cluster/index.md index dbba404af8..99db78565f 100644 --- a/_tuning-your-cluster/index.md +++ b/_tuning-your-cluster/index.md @@ -20,7 +20,7 @@ To create and deploy an OpenSearch cluster according to your requirements, it’ There are many ways to design a cluster. The following illustration shows a basic architecture that includes a four-node cluster that has one dedicated cluster manager node, one dedicated coordinating node, and two data nodes that are cluster manager eligible and also used for ingesting data. - The nomenclature for the master node is now referred to as the cluster manager node. + The nomenclature for the cluster manager node is now referred to as the cluster manager node. {: .note } ![multi-node cluster architecture diagram]({{site.url}}{{site.baseurl}}/images/cluster.png) diff --git a/assets/examples/ubi-dashboard.ndjson b/assets/examples/ubi-dashboard.ndjson new file mode 100644 index 0000000000..1ae6562f52 --- /dev/null +++ b/assets/examples/ubi-dashboard.ndjson @@ -0,0 +1,14 @@ +{"attributes":{"fields":"[{\"count\":0,\"name\":\"_id\",\"type\":\"string\",\"esTypes\":[\"_id\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_index\",\"type\":\"string\",\"esTypes\":[\"_index\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_score\",\"type\":\"number\",\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_source\",\"type\":\"_source\",\"esTypes\":[\"_source\"],\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_type\",\"type\":\"string\",\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"action_name\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"application\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"client_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.browser\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.browser.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.browser\"}}},{\"count\":0,\"name\":\"event_attributes.comment\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":
false},{\"count\":0,\"name\":\"event_attributes.comment.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.comment\"}}},{\"count\":0,\"name\":\"event_attributes.data.internal_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.data.internal_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.data.internal_id\"}}},{\"count\":0,\"name\":\"event_attributes.data.object_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.data.object_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.data.object_id\"}}},{\"count\":0,\"name\":\"event_attributes.data.object_id_field\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.data.object_id_field.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.data.object_id_field\"}}},{\"count\":0,\"name\":\"event_attributes.dwell_time\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.helpful\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.helpful.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.helpful\"}}},{\"count\":0,\"name\":\"event_attributes.ip\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.ip.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.ip\"}}},{\"count\":0,\"name\":\"event_attributes.object.ancestors\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.ancestors.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.ancestors\"}}},{\"count\":0,\"name\":\"event_attributes.object.checkIdleStateRateMs\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.content\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searcha
ble\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.content.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.content\"}}},{\"count\":0,\"name\":\"event_attributes.object.currentIdleTimeMs\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.currentPageName\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.currentPageName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.currentPageName\"}}},{\"count\":0,\"name\":\"event_attributes.object.description\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.description.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.description\"}}},{\"count\":0,\"name\":\"event_attributes.object.docs_version\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.docs_version.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.docs_version\"}}},{\"count\":0,\"name\":\"event_attributes.object.duration\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.hiddenPropName\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.hiddenPropName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.hiddenPropName\"}}},{\"count\":0,\"name\":\"event_attributes.object.id\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.idleTimeoutMs\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.internal_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.isUserCurrentlyIdle\",\"type\":\"boolean\",\"esTypes\":[\"boolean\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.isUserCurrentlyOnPage\",\"type\":\"boolean\",\"esTypes\":[\
"boolean\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.name\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.cost\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.date_released\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.filter\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.object_detail.filter.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.object_detail.filter\"}}},{\"count\":0,\"name\":\"event_attributes.object.object_detail.isTrusted\",\"type\":\"boolean\",\"esTypes\":[\"boolean\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.margin\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.price\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.supplier\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.object_detail.supplier.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.object_detail.supplier\"}}},{\"count\":0,\"name\":\"event_attributes.object.object_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_id_field\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.results_num\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.search_term\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.search_term.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.search_term\"}}},{\"count\":0,\"name\":\"event_attributes.object.startStopTimes./docs/latest/.startTime\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"se
archable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.startStopTimes./docs/latest/.stopTime\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.startStopTimes.http://137.184.176.129:4000/docs/latest/.startTime\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.title\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.title.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.title\"}}},{\"count\":0,\"name\":\"event_attributes.object.transaction_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.transaction_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.transaction_id\"}}},{\"count\":0,\"name\":\"event_attributes.object.type\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.type.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.type\"}}},{\"count\":0,\"name\":\"event_attributes.object.url\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.url.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.url\"}}},{\"count\":0,\"name\":\"event_attributes.object.version\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.version.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.version\"}}},{\"count\":0,\"name\":\"event_attributes.object.versionLabel\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.versionLabel.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.versionLabel\"}}},{\"count\":0,\"name\":\"event_attributes.object.visibilityChangeEventName\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\"
:0,\"name\":\"event_attributes.object.visibilityChangeEventName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.visibilityChangeEventName\"}}},{\"count\":0,\"name\":\"event_attributes.page_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.page_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.page_id\"}}},{\"count\":0,\"name\":\"event_attributes.position.ordinal\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.page_depth\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.scroll_depth\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.trail\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.position.trail.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.position.trail\"}}},{\"count\":0,\"name\":\"event_attributes.position.x\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.y\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.result_count\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.session_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.session_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.session_id\"}}},{\"count\":0,\"name\":\"event_attributes.user_comment\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.user_comment.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.user_comment\"}}},{\"count\":0,\"name\":\"message\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"message_type\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":fals
e,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"query\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"query_attributes\",\"type\":\"unknown\",\"esTypes\":[\"flat_object\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"query_attributes.\",\"type\":\"unknown\",\"esTypes\":[\"flat_object\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"query_attributes\"}}},{\"count\":0,\"name\":\"query_attributes._value\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false,\"subType\":{\"multi\":{\"parent\":\"query_attributes\"}}},{\"count\":0,\"name\":\"query_attributes._valueAndPath\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false,\"subType\":{\"multi\":{\"parent\":\"query_attributes\"}}},{\"count\":0,\"name\":\"query_id\",\"type\":\"string\",\"esTypes\":[\"text\",\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"count\":0,\"name\":\"query_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"query_id\"}}},{\"count\":0,\"name\":\"query_response_hit_ids\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"query_response_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"timestamp\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"user_query\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"user_query.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"user_query\"}}}]","title":"ubi_*"},"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","migrationVersion":{"index-pattern":"7.6.0"},"references":[],"type":"index-pattern","updated_at":"2024-06-18T18:15:17.795Z","version":"WzEsMV0="} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"basic pie","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"basic 
pie\",\"type\":\"pie\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"count\",\"params\":{},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"action_name\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":25,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\"},\"schema\":\"segment\"}],\"params\":{\"type\":\"pie\",\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"right\",\"isDonut\":true,\"labels\":{\"show\":true,\"values\":true,\"last_level\":true,\"truncate\":100}}}"},"id":"2ddbc001-f5ae-4de9-82f3-30b28408bae6","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:15:17.795Z","version":"WzIsMV0="} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"click position","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"click position\",\"type\":\"histogram\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"count\",\"params\":{},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"histogram\",\"params\":{\"field\":\"event_attributes.position.ordinal\",\"interval\":1,\"min_doc_count\":false,\"has_extended_bounds\":false,\"extended_bounds\":{\"min\":\"\",\"max\":\"\"},\"customLabel\":\"item number out of searched results that were clicked on\"},\"schema\":\"segment\"}],\"params\":{\"type\":\"histogram\",\"grid\":{\"categoryLines\":false},\"categoryAxes\":[{\"id\":\"CategoryAxis-1\",\"type\":\"category\",\"position\":\"bottom\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\"},\"labels\":{\"show\":true,\"filter\":true,\"truncate\":100},\"title\":{}}],\"valueAxes\":[{\"id\":\"ValueAxis-1\",\"name\":\"LeftAxis-1\",\"type\":\"value\",\"position\":\"left\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\",\"mode\":\"normal\"},\"labels\":{\"show\":true,\"rotate\":0,\"filter\":false,\"truncate\":100},\"title\":{\"text\":\"Count\"}}],\"seriesParams\":[{\"show\":true,\"type\":\"histogram\",\"mode\":\"stacked\",\"data\":{\"label\":\"Count\",\"id\":\"1\"},\"valueAxis\":\"ValueAxis-1\",\"drawLinesBetweenPoints\":true,\"lineWidth\":2,\"showCircles\":true}],\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"right\",\"times\":[],\"addTimeMarker\":false,\"labels\":{\"show\":false},\"thresholdLine\":{\"show\":false,\"value\":10,\"width\":1,\"style\":\"full\",\"color\":\"#E7664C\"}}}"},"id":"7234f759-31ab-467a-942e-d6db8c58477e","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:15:17.795Z","version":"WzMsMV0="} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"all ubi messages","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"all ubi 
messages\",\"type\":\"tagcloud\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"count\",\"params\":{},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"message\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":50,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\"},\"schema\":\"segment\"}],\"params\":{\"scale\":\"linear\",\"orientation\":\"single\",\"minFontSize\":18,\"maxFontSize\":72,\"showLabel\":true}}"},"id":"78a48652-7423-4517-a869-9ba167afcc47","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:15:17.795Z","version":"WzcsMV0="} +{"attributes":{"fields":"[{\"count\":0,\"name\":\"_id\",\"type\":\"string\",\"esTypes\":[\"_id\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_index\",\"type\":\"string\",\"esTypes\":[\"_index\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_score\",\"type\":\"number\",\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_source\",\"type\":\"_source\",\"esTypes\":[\"_source\"],\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"_type\",\"type\":\"string\",\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"action_name\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"application\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"client_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.browser\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.browser.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.browser\"}}},{\"count\":0,\"name\":\"event_attributes.comment\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.comment.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.comment\"}}},{\"count\":0,\"name\":\"event_attributes.data.internal_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.data.internal_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.data.internal_id\"}}},{\"count\":0,\"name\":
\"event_attributes.data.object_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.data.object_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.data.object_id\"}}},{\"count\":0,\"name\":\"event_attributes.data.object_id_field\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.data.object_id_field.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.data.object_id_field\"}}},{\"count\":0,\"name\":\"event_attributes.dwell_time\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.helpful\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.helpful.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.helpful\"}}},{\"count\":0,\"name\":\"event_attributes.ip\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.ip.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.ip\"}}},{\"count\":0,\"name\":\"event_attributes.object.ancestors\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.ancestors.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.ancestors\"}}},{\"count\":0,\"name\":\"event_attributes.object.checkIdleStateRateMs\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.content\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.content.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.content\"}}},{\"count\":0,\"name\":\"event_attributes.object.currentIdleTimeMs\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.currentPageName\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"
event_attributes.object.currentPageName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.currentPageName\"}}},{\"count\":0,\"name\":\"event_attributes.object.description\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.description.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.description\"}}},{\"count\":0,\"name\":\"event_attributes.object.docs_version\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.docs_version.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.docs_version\"}}},{\"count\":0,\"name\":\"event_attributes.object.duration\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.hiddenPropName\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.hiddenPropName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.hiddenPropName\"}}},{\"count\":0,\"name\":\"event_attributes.object.id\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.idleTimeoutMs\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.internal_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.isUserCurrentlyIdle\",\"type\":\"boolean\",\"esTypes\":[\"boolean\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.isUserCurrentlyOnPage\",\"type\":\"boolean\",\"esTypes\":[\"boolean\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.name\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.cost\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.date_released\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detai
l.filter\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.object_detail.filter.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.object_detail.filter\"}}},{\"count\":0,\"name\":\"event_attributes.object.object_detail.isTrusted\",\"type\":\"boolean\",\"esTypes\":[\"boolean\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.margin\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.price\",\"type\":\"number\",\"esTypes\":[\"float\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_detail.supplier\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.object_detail.supplier.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.object_detail.supplier\"}}},{\"count\":0,\"name\":\"event_attributes.object.object_id\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.object_id_field\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.results_num\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.search_term\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.search_term.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.search_term\"}}},{\"count\":0,\"name\":\"event_attributes.object.startStopTimes./docs/latest/.startTime\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.startStopTimes./docs/latest/.stopTime\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.startStopTimes.http://137.184.176.129:4000/docs/latest/.startTime\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.object.title\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.
object.title.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.title\"}}},{\"count\":0,\"name\":\"event_attributes.object.transaction_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.transaction_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.transaction_id\"}}},{\"count\":0,\"name\":\"event_attributes.object.type\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.type.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.type\"}}},{\"count\":0,\"name\":\"event_attributes.object.url\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.url.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.url\"}}},{\"count\":0,\"name\":\"event_attributes.object.version\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.version.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.version\"}}},{\"count\":0,\"name\":\"event_attributes.object.versionLabel\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.versionLabel.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.versionLabel\"}}},{\"count\":0,\"name\":\"event_attributes.object.visibilityChangeEventName\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.object.visibilityChangeEventName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.object.visibilityChangeEventName\"}}},{\"count\":0,\"name\":\"event_attributes.page_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.page_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.page_id\"}}},{\"count\":0,\"n
ame\":\"event_attributes.position.ordinal\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.page_depth\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.scroll_depth\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.trail\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.position.trail.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.position.trail\"}}},{\"count\":0,\"name\":\"event_attributes.position.x\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.position.y\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.result_count\",\"type\":\"number\",\"esTypes\":[\"long\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"event_attributes.session_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.session_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.session_id\"}}},{\"count\":0,\"name\":\"event_attributes.user_comment\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"event_attributes.user_comment.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"event_attributes.user_comment\"}}},{\"count\":0,\"name\":\"message\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"message_type\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"count\":0,\"name\":\"query_id\",\"type\":\"string\",\"esTypes\":[\"text\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"count\":0,\"name\":\"query_id.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"query_id\"}}},{\"count\":0,\"name\":\"timestamp\",\"type\":\"date\",\"esTypes\":[\"date\"],\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true}]","title":"ubi_events"},"id":"b8c824c3-6508-4495-a3d0-9f53e6723cdd","migrationVersion":{"index-pattern":"7.6.0"},"references":
[],"type":"index-pattern","updated_at":"2024-06-18T18:15:17.795Z","version":"WzQsMV0="} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"Margin by Vendor (pie chart)","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"Margin by Vendor (pie chart)\",\"type\":\"pie\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"sum\",\"params\":{\"field\":\"event_attributes.object.object_detail.margin\",\"customLabel\":\"Margin\"},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"event_attributes.object.object_detail.supplier.keyword\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":15,\"otherBucket\":true,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\"},\"schema\":\"segment\"}],\"params\":{\"type\":\"pie\",\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"right\",\"isDonut\":true,\"labels\":{\"show\":true,\"values\":true,\"last_level\":true,\"truncate\":100}}}"},"id":"cb78bb39-4437-4347-aade-15bc268c6a26","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"b8c824c3-6508-4495-a3d0-9f53e6723cdd","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:15:17.795Z","version":"WzksMV0="} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[{\"$state\":{\"store\":\"appState\"},\"meta\":{\"alias\":null,\"disabled\":false,\"key\":\"action_name\",\"negate\":false,\"params\":{\"query\":\"on_search\"},\"type\":\"phrase\",\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index\"},\"query\":{\"match_phrase\":{\"action_name\":\"on_search\"}}},{\"$state\":{\"store\":\"appState\"},\"meta\":{\"alias\":null,\"disabled\":false,\"key\":\"event_attributes.result_count\",\"negate\":true,\"params\":{\"query\":\"0\"},\"type\":\"phrase\",\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.filter[1].meta.index\"},\"query\":{\"match_phrase\":{\"event_attributes.result_count\":\"0\"}}}],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"all ubi messages (copy 1)","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"all ubi messages (copy 1)\",\"type\":\"tagcloud\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"count\",\"params\":{},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"message\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":50,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\"},\"schema\":\"segment\"}],\"params\":{\"scale\":\"linear\",\"orientation\":\"single\",\"minFontSize\":18,\"maxFontSize\":72,\"showLabel\":true}}"},"id":"a1840c05-bd0c-4fb2-8a50-7f51395e3d8d","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"},{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index","type":"index-pattern"},{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.filter[1].meta.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:15:17.795Z","version":"WzgsMV0="} 
+{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[{\"$state\":{\"store\":\"appState\"},\"meta\":{\"alias\":null,\"disabled\":false,\"key\":\"action_name\",\"negate\":false,\"params\":{\"query\":\"on_search\"},\"type\":\"phrase\",\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index\"},\"query\":{\"match_phrase\":{\"action_name\":\"on_search\"}}}],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"all ubi messages (copy)","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"all ubi messages (copy)\",\"type\":\"tagcloud\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"count\",\"params\":{},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"message\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":50,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\"},\"schema\":\"segment\"}],\"params\":{\"scale\":\"linear\",\"orientation\":\"single\",\"minFontSize\":18,\"maxFontSize\":72,\"showLabel\":true}}"},"id":"4af4cc80-33fd-4ef4-b4f9-7d67ecd84b20","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"},{"id":"2ef89828-d14a-4f51-a90f-e427d2ef941a","name":"kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:15:17.795Z","version":"WzYsMV0="} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"Longest Session Durations","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"Longest Session Durations\",\"type\":\"histogram\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"sum\",\"params\":{\"field\":\"event_attributes.dwell_time\",\"customLabel\":\"Duration in seconds\"},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"event_attributes.session_id.keyword\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":25,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\",\"customLabel\":\"Session\"},\"schema\":\"group\"}],\"params\":{\"type\":\"histogram\",\"grid\":{\"categoryLines\":false},\"categoryAxes\":[{\"id\":\"CategoryAxis-1\",\"type\":\"category\",\"position\":\"bottom\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\"},\"labels\":{\"show\":true,\"filter\":true,\"truncate\":100},\"title\":{}}],\"valueAxes\":[{\"id\":\"ValueAxis-1\",\"name\":\"LeftAxis-1\",\"type\":\"value\",\"position\":\"left\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\",\"mode\":\"normal\"},\"labels\":{\"show\":true,\"rotate\":0,\"filter\":false,\"truncate\":100},\"title\":{\"text\":\"Duration in seconds\"}}],\"seriesParams\":[{\"show\":true,\"type\":\"histogram\",\"mode\":\"normal\",\"data\":{\"label\":\"Duration in 
seconds\",\"id\":\"1\"},\"valueAxis\":\"ValueAxis-1\",\"drawLinesBetweenPoints\":true,\"lineWidth\":2,\"showCircles\":true}],\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"right\",\"times\":[],\"addTimeMarker\":false,\"labels\":{\"show\":false},\"thresholdLine\":{\"show\":false,\"value\":10,\"width\":1,\"style\":\"full\",\"color\":\"#E7664C\"},\"orderBucketsBySum\":true}}"},"id":"1bdad170-2d9f-11ef-9a92-fbd3515fac70","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"b8c824c3-6508-4495-a3d0-9f53e6723cdd","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:18:06.214Z","version":"WzE0LDFd"} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"Shortest session durations","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"Shortest session durations\",\"type\":\"histogram\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"sum\",\"params\":{\"field\":\"event_attributes.dwell_time\",\"customLabel\":\"Duration in seconds\"},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"event_attributes.session_id.keyword\",\"orderBy\":\"1\",\"order\":\"asc\",\"size\":25,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\",\"customLabel\":\"Session\"},\"schema\":\"group\"}],\"params\":{\"type\":\"histogram\",\"grid\":{\"categoryLines\":false},\"categoryAxes\":[{\"id\":\"CategoryAxis-1\",\"type\":\"category\",\"position\":\"bottom\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\"},\"labels\":{\"show\":true,\"filter\":true,\"truncate\":100},\"title\":{}}],\"valueAxes\":[{\"id\":\"ValueAxis-1\",\"name\":\"LeftAxis-1\",\"type\":\"value\",\"position\":\"left\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\",\"mode\":\"normal\"},\"labels\":{\"show\":true,\"rotate\":0,\"filter\":false,\"truncate\":100},\"title\":{\"text\":\"Duration in seconds\"}}],\"seriesParams\":[{\"show\":true,\"type\":\"histogram\",\"mode\":\"normal\",\"data\":{\"label\":\"Duration in seconds\",\"id\":\"1\"},\"valueAxis\":\"ValueAxis-1\",\"drawLinesBetweenPoints\":true,\"lineWidth\":2,\"showCircles\":true}],\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"right\",\"times\":[],\"addTimeMarker\":false,\"labels\":{\"show\":false},\"thresholdLine\":{\"show\":false,\"value\":10,\"width\":1,\"style\":\"full\",\"color\":\"#E7664C\"},\"orderBucketsBySum\":true}}"},"id":"5ae594e0-2d9f-11ef-9a92-fbd3515fac70","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"b8c824c3-6508-4495-a3d0-9f53e6723cdd","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:22:57.364Z","version":"WzE3LDFd"} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[{\"$state\":{\"store\":\"appState\"},\"meta\":{\"alias\":null,\"disabled\":false,\"key\":\"action_name\",\"negate\":false,\"params\":{\"query\":\"purchase\"},\"type\":\"phrase\",\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index\"},\"query\":{\"match_phrase\":{\"action_name\":\"purchase\"}}}],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"},"title":"Longest Session Durations 
(copy)","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"Longest Session Durations (copy)\",\"type\":\"histogram\",\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"sum\",\"params\":{\"field\":\"event_attributes.object.object_detail.price\",\"customLabel\":\"Price spent\"},\"schema\":\"metric\"},{\"id\":\"2\",\"enabled\":true,\"type\":\"terms\",\"params\":{\"field\":\"client_id\",\"orderBy\":\"1\",\"order\":\"desc\",\"size\":35,\"otherBucket\":false,\"otherBucketLabel\":\"Other\",\"missingBucket\":false,\"missingBucketLabel\":\"Missing\",\"customLabel\":\"Users' client_ids\"},\"schema\":\"segment\"},{\"id\":\"3\",\"enabled\":true,\"type\":\"sum\",\"params\":{\"field\":\"event_attributes.dwell_time\",\"customLabel\":\"dwell time\"},\"schema\":\"radius\"}],\"params\":{\"type\":\"histogram\",\"grid\":{\"categoryLines\":false,\"valueAxis\":\"ValueAxis-1\"},\"categoryAxes\":[{\"id\":\"CategoryAxis-1\",\"type\":\"category\",\"position\":\"bottom\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\"},\"labels\":{\"show\":true,\"filter\":true,\"truncate\":100},\"title\":{}}],\"valueAxes\":[{\"id\":\"ValueAxis-1\",\"name\":\"LeftAxis-1\",\"type\":\"value\",\"position\":\"left\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\",\"mode\":\"normal\"},\"labels\":{\"show\":true,\"rotate\":75,\"filter\":false,\"truncate\":100},\"title\":{\"text\":\"Price spent\"}}],\"seriesParams\":[{\"show\":true,\"type\":\"line\",\"mode\":\"normal\",\"data\":{\"label\":\"Price spent\",\"id\":\"1\"},\"valueAxis\":\"ValueAxis-1\",\"drawLinesBetweenPoints\":true,\"lineWidth\":2,\"showCircles\":true}],\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"left\",\"times\":[],\"addTimeMarker\":false,\"labels\":{\"show\":true},\"thresholdLine\":{\"show\":false,\"value\":10,\"width\":1,\"style\":\"full\",\"color\":\"#E7664C\"},\"orderBucketsBySum\":true,\"radiusRatio\":12}}"},"id":"f1c37d80-2da1-11ef-9a92-fbd3515fac70","migrationVersion":{"visualization":"7.10.0"},"references":[{"id":"b8c824c3-6508-4495-a3d0-9f53e6723cdd","name":"kibanaSavedObjectMeta.searchSourceJSON.index","type":"index-pattern"},{"id":"b8c824c3-6508-4495-a3d0-9f53e6723cdd","name":"kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index","type":"index-pattern"}],"type":"visualization","updated_at":"2024-06-18T18:51:42.469Z","version":"WzI1LDFd"} +{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[]}"},"title":"Label","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"Label\",\"type\":\"markdown\",\"aggs\":[],\"params\":{\"fontSize\":15,\"openLinksInNewTab\":false,\"markdown\":\"*The larger the circle above, the longer the time the user spent on the site*\"}}"},"id":"757f2cd0-2da4-11ef-9a92-fbd3515fac70","migrationVersion":{"visualization":"7.10.0"},"references":[],"type":"visualization","updated_at":"2024-06-18T18:58:51.608Z","version":"WzI5LDFd"} 
+{"attributes":{"description":"","hits":0,"kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"language\":\"kuery\",\"query\":\"\"},\"filter\":[]}"},"optionsJSON":"{\"hidePanelTitles\":false,\"useMargins\":true}","panelsJSON":"[{\"embeddableConfig\":{},\"gridData\":{\"h\":15,\"i\":\"4dcf9d77-3294-49c5-9c2e-25be877b34f9\",\"w\":24,\"x\":0,\"y\":0},\"panelIndex\":\"4dcf9d77-3294-49c5-9c2e-25be877b34f9\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_0\"},{\"embeddableConfig\":{},\"gridData\":{\"h\":15,\"i\":\"f6491630-bcdc-4b6e-aa82-ead6aa4ef88c\",\"w\":24,\"x\":24,\"y\":0},\"panelIndex\":\"f6491630-bcdc-4b6e-aa82-ead6aa4ef88c\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_1\"},{\"embeddableConfig\":{},\"gridData\":{\"h\":15,\"i\":\"42b97dac-d467-4f2d-8a4b-9ef82aa04849\",\"w\":24,\"x\":0,\"y\":15},\"panelIndex\":\"42b97dac-d467-4f2d-8a4b-9ef82aa04849\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_2\"},{\"embeddableConfig\":{},\"gridData\":{\"h\":15,\"i\":\"a4e1b903-8480-4be5-86e7-fc5faeb2e998\",\"w\":24,\"x\":24,\"y\":15},\"panelIndex\":\"a4e1b903-8480-4be5-86e7-fc5faeb2e998\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_3\"},{\"embeddableConfig\":{\"hidePanelTitles\":false},\"gridData\":{\"h\":15,\"i\":\"f16f225c-a71f-46c5-ab58-4aed5d54fe16\",\"w\":24,\"x\":0,\"y\":30},\"panelIndex\":\"f16f225c-a71f-46c5-ab58-4aed5d54fe16\",\"title\":\"all searches with at least 1 result\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_4\"},{\"embeddableConfig\":{\"hidePanelTitles\":false},\"gridData\":{\"h\":15,\"i\":\"17d1c62c-8095-49f4-8675-9fae1e3a7896\",\"w\":24,\"x\":24,\"y\":30},\"panelIndex\":\"17d1c62c-8095-49f4-8675-9fae1e3a7896\",\"title\":\"all searches\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_5\"},{\"embeddableConfig\":{\"hidePanelTitles\":false},\"gridData\":{\"h\":15,\"i\":\"8b6f3024-fb97-4e6f-a21b-df9ebf3527bd\",\"w\":24,\"x\":0,\"y\":45},\"panelIndex\":\"8b6f3024-fb97-4e6f-a21b-df9ebf3527bd\",\"title\":\"Longest session durations\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_6\"},{\"embeddableConfig\":{},\"gridData\":{\"h\":15,\"i\":\"6f603293-afc1-4bf6-9f0f-f15379ac78b6\",\"w\":24,\"x\":24,\"y\":45},\"panelIndex\":\"6f603293-afc1-4bf6-9f0f-f15379ac78b6\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_7\"},{\"embeddableConfig\":{\"hidePanelTitles\":false},\"gridData\":{\"h\":28,\"i\":\"7ec6ac3d-fe0d-463e-81a3-0ac77d7590e4\",\"w\":48,\"x\":0,\"y\":60},\"panelIndex\":\"7ec6ac3d-fe0d-463e-81a3-0ac77d7590e4\",\"title\":\"Biggest spenders by time spent on site\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_8\"},{\"embeddableConfig\":{},\"gridData\":{\"h\":4,\"i\":\"8e997144-9142-43eb-b588-9ec863b8aa81\",\"w\":45,\"x\":1,\"y\":88},\"panelIndex\":\"8e997144-9142-43eb-b588-9ec863b8aa81\",\"version\":\"2.14.0\",\"panelRefName\":\"panel_9\"}]","timeRestore":false,"title":"User Behavior 
Insights","version":1},"id":"084f916a-3f75-4782-8773-4d07fcdbfda4","migrationVersion":{"dashboard":"7.9.3"},"references":[{"id":"2ddbc001-f5ae-4de9-82f3-30b28408bae6","name":"panel_0","type":"visualization"},{"id":"7234f759-31ab-467a-942e-d6db8c58477e","name":"panel_1","type":"visualization"},{"id":"78a48652-7423-4517-a869-9ba167afcc47","name":"panel_2","type":"visualization"},{"id":"cb78bb39-4437-4347-aade-15bc268c6a26","name":"panel_3","type":"visualization"},{"id":"a1840c05-bd0c-4fb2-8a50-7f51395e3d8d","name":"panel_4","type":"visualization"},{"id":"4af4cc80-33fd-4ef4-b4f9-7d67ecd84b20","name":"panel_5","type":"visualization"},{"id":"1bdad170-2d9f-11ef-9a92-fbd3515fac70","name":"panel_6","type":"visualization"},{"id":"5ae594e0-2d9f-11ef-9a92-fbd3515fac70","name":"panel_7","type":"visualization"},{"id":"f1c37d80-2da1-11ef-9a92-fbd3515fac70","name":"panel_8","type":"visualization"},{"id":"757f2cd0-2da4-11ef-9a92-fbd3515fac70","name":"panel_9","type":"visualization"}],"type":"dashboard","updated_at":"2024-06-18T20:01:02.353Z","version":"WzMwLDFd"} +{"exportedCount":13,"missingRefCount":0,"missingReferences":[]} \ No newline at end of file diff --git a/images/dashboards/Add_datasource.gif b/images/dashboards/Add_datasource.gif new file mode 100644 index 0000000000..789e1a2128 Binary files /dev/null and b/images/dashboards/Add_datasource.gif differ diff --git a/images/dashboards/add-sample-data.gif b/images/dashboards/add-sample-data.gif new file mode 100644 index 0000000000..6e569d704d Binary files /dev/null and b/images/dashboards/add-sample-data.gif differ diff --git a/images/dashboards/configure-tsvb.gif b/images/dashboards/configure-tsvb.gif new file mode 100644 index 0000000000..fc91e9e669 Binary files /dev/null and b/images/dashboards/configure-tsvb.gif differ diff --git a/images/dashboards/configure-vega.gif b/images/dashboards/configure-vega.gif new file mode 100644 index 0000000000..290ad51416 Binary files /dev/null and b/images/dashboards/configure-vega.gif differ diff --git a/images/dashboards/create-datasource.gif b/images/dashboards/create-datasource.gif new file mode 100644 index 0000000000..789e1a2128 Binary files /dev/null and b/images/dashboards/create-datasource.gif differ diff --git a/images/dashboards/make_tsvb.gif b/images/dashboards/make_tsvb.gif new file mode 100644 index 0000000000..fc91e9e669 Binary files /dev/null and b/images/dashboards/make_tsvb.gif differ diff --git a/images/dashboards/tsvb-viz.png b/images/dashboards/tsvb-viz.png new file mode 100644 index 0000000000..efdf12484c Binary files /dev/null and b/images/dashboards/tsvb-viz.png differ diff --git a/images/dashboards/tsvb-with-annotations.png b/images/dashboards/tsvb-with-annotations.png new file mode 100644 index 0000000000..6cb35632b8 Binary files /dev/null and b/images/dashboards/tsvb-with-annotations.png differ diff --git a/images/dashboards/tsvb.png b/images/dashboards/tsvb.png new file mode 100644 index 0000000000..85f55cc3ad Binary files /dev/null and b/images/dashboards/tsvb.png differ diff --git a/images/geopolygon-query.png b/images/geopolygon-query.png new file mode 100644 index 0000000000..16d73628de Binary files /dev/null and b/images/geopolygon-query.png differ diff --git a/images/k-nn-search-hotels.png b/images/k-nn-search-hotels.png new file mode 100644 index 0000000000..f17fd171cf Binary files /dev/null and b/images/k-nn-search-hotels.png differ diff --git a/images/make_vega.gif b/images/make_vega.gif new file mode 100644 index 0000000000..290ad51416 Binary files /dev/null and 
b/images/make_vega.gif differ diff --git a/images/ubi/001_screens_side_by_side.png b/images/ubi/001_screens_side_by_side.png new file mode 100644 index 0000000000..b230204b5a Binary files /dev/null and b/images/ubi/001_screens_side_by_side.png differ diff --git a/images/ubi/dashboard2.png b/images/ubi/dashboard2.png new file mode 100644 index 0000000000..9c00297080 Binary files /dev/null and b/images/ubi/dashboard2.png differ diff --git a/images/ubi/first_dashboard.png b/images/ubi/first_dashboard.png new file mode 100644 index 0000000000..7f3d4fabab Binary files /dev/null and b/images/ubi/first_dashboard.png differ diff --git a/images/ubi/histogram.png b/images/ubi/histogram.png new file mode 100644 index 0000000000..799bffc813 Binary files /dev/null and b/images/ubi/histogram.png differ diff --git a/images/ubi/home.png b/images/ubi/home.png new file mode 100644 index 0000000000..7c009496ec Binary files /dev/null and b/images/ubi/home.png differ diff --git a/images/ubi/index_pattern1.png b/images/ubi/index_pattern1.png new file mode 100644 index 0000000000..ca69405c7a Binary files /dev/null and b/images/ubi/index_pattern1.png differ diff --git a/images/ubi/index_pattern2.png b/images/ubi/index_pattern2.png new file mode 100644 index 0000000000..b5127ffcfd Binary files /dev/null and b/images/ubi/index_pattern2.png differ diff --git a/images/ubi/index_pattern3.png b/images/ubi/index_pattern3.png new file mode 100644 index 0000000000..201ec4fbf8 Binary files /dev/null and b/images/ubi/index_pattern3.png differ diff --git a/images/ubi/laptop.png b/images/ubi/laptop.png new file mode 100644 index 0000000000..6407139b36 Binary files /dev/null and b/images/ubi/laptop.png differ diff --git a/images/ubi/new_widget.png b/images/ubi/new_widget.png new file mode 100644 index 0000000000..5ba188c6a2 Binary files /dev/null and b/images/ubi/new_widget.png differ diff --git a/images/ubi/pie.png b/images/ubi/pie.png new file mode 100644 index 0000000000..7602d6a3aa Binary files /dev/null and b/images/ubi/pie.png differ diff --git a/images/ubi/product_purchase.png b/images/ubi/product_purchase.png new file mode 100644 index 0000000000..7121c1fb4e Binary files /dev/null and b/images/ubi/product_purchase.png differ diff --git a/images/ubi/query_id.png b/images/ubi/query_id.png new file mode 100644 index 0000000000..7051c8abb8 Binary files /dev/null and b/images/ubi/query_id.png differ diff --git a/images/ubi/tag_cloud1.png b/images/ubi/tag_cloud1.png new file mode 100644 index 0000000000..32db3ebadc Binary files /dev/null and b/images/ubi/tag_cloud1.png differ diff --git a/images/ubi/tag_cloud2.png b/images/ubi/tag_cloud2.png new file mode 100644 index 0000000000..bdd01d3516 Binary files /dev/null and b/images/ubi/tag_cloud2.png differ diff --git a/images/ubi/ubi-schema-interactions.png b/images/ubi/ubi-schema-interactions.png new file mode 100644 index 0000000000..f2319bb2c1 Binary files /dev/null and b/images/ubi/ubi-schema-interactions.png differ diff --git a/images/ubi/ubi-schema-interactions_legend.png b/images/ubi/ubi-schema-interactions_legend.png new file mode 100644 index 0000000000..91cae04c74 Binary files /dev/null and b/images/ubi/ubi-schema-interactions_legend.png differ diff --git a/images/ubi/ubi.png b/images/ubi/ubi.png new file mode 100644 index 0000000000..c24ff6f4ab Binary files /dev/null and b/images/ubi/ubi.png differ diff --git a/images/ubi/visualizations.png b/images/ubi/visualizations.png new file mode 100644 index 0000000000..9f06148686 Binary files /dev/null and 
b/images/ubi/visualizations.png differ diff --git a/images/ubi/visualizations2.png b/images/ubi/visualizations2.png new file mode 100644 index 0000000000..c54920023f Binary files /dev/null and b/images/ubi/visualizations2.png differ diff --git a/images/vega.png b/images/vega.png new file mode 100644 index 0000000000..ae7ea76c9d Binary files /dev/null and b/images/vega.png differ