From 3ead1ed98866e5f3a85664bb4d5f5fbed680e825 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Fri, 22 Dec 2023 12:07:46 -0700 Subject: [PATCH 1/7] Add foreach processor documentation Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 86 +++++++++++++++++++++++++ 1 file changed, 86 insertions(+) create mode 100644 _ingest-pipelines/processors/foreach.md diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md new file mode 100644 index 0000000000..adf3d3f02e --- /dev/null +++ b/_ingest-pipelines/processors/foreach.md @@ -0,0 +1,86 @@ +--- +layout: default +title: Foreach +parent: Ingest processors +nav_order: 110 +--- + +# Foreach processor + +The `foreach` processor is used to . + +The following is the syntax for the `foreach` processor: + +```json + +``` +{% include copy-curl.html %} + +## Configuration parameters + +The following table lists the required and optional parameters for the `foreach` processor. + +Parameter | Required/Optional | Description | +|-----------|-----------|-----------| + + +## Using the processor + +Follow these steps to use the processor in a pipeline. + +### Step 1: Create a pipeline + +The following query creates a pipeline, named , that uses the `foreach` processor to : + +```json + +``` +{% include copy-curl.html %} + +### Step 2 (Optional): Test the pipeline + +It is recommended that you test your pipeline before you ingest documents. +{: .tip} + +To test the pipeline, run the following query: + +```json + +``` +{% include copy-curl.html %} + +#### Response + +The following example response confirms that the pipeline is working as expected: + +```json + +``` + +### Step 3: Ingest a document + +The following query ingests a document into an index named `testindex1`: + +```json + +``` +{% include copy-curl.html %} + +#### Response + +The request indexes the document into the index and will index all documents with . + +```json + +``` + +### Step 4 (Optional): Retrieve the document + +To retrieve the document, run the following query: + +```json + +``` +{% include copy-curl.html %} + + \ No newline at end of file From c10ed94c6032947611de842d525a0b2811fa9444 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Wed, 22 May 2024 10:39:30 -0600 Subject: [PATCH 2/7] Add pipeline example requests and responses Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 62 +++++++++++++++++++++++-- 1 file changed, 57 insertions(+), 5 deletions(-) diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md index adf3d3f02e..76dad0d7c6 100644 --- a/_ingest-pipelines/processors/foreach.md +++ b/_ingest-pipelines/processors/foreach.md @@ -7,12 +7,21 @@ nav_order: 110 # Foreach processor -The `foreach` processor is used to . +The `foreach` processor is used to iterate over a list of values in an input document and perform some operation on each value. This can be useful for tasks like extracting information from a nested JSON structure or applying transformations to a collection of fields. The following is the syntax for the `foreach` processor: ```json - +{ + "foreach": { + "field": "", + "processor": { + "": { + "": "" + } + } + } +} ``` {% include copy-curl.html %} @@ -22,7 +31,14 @@ The following table lists the required and optional parameters for the `foreach` Parameter | Required/Optional | Description | |-----------|-----------|-----------| - +`field` | Required | The array field to iterate over. +`processor` | Required | The processor to execute against each field. +`ignore_missing` | Optional | If `true` and the specified field does not exist or is +null, the processor will quietly exit without modifying the document. +`if` | Optional | A conditional expression to determine whether to execute this processor. +`on_failure` | Optional | Specifies how to handle failures for this processor. See the documentation on [Handling failures in pipelines]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/). +`ignore_failure` | Optional | If `true`, failures for this processor are ignored. See the documentation on [Handling failures in pipelines]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/). +`tag` | Optional | An identifier for this processor. Useful for debugging and metrics. ## Using the processor @@ -30,10 +46,46 @@ Follow these steps to use the processor in a pipeline. ### Step 1: Create a pipeline -The following query creates a pipeline, named , that uses the `foreach` processor to : +The following query creates a pipeline named `test-foreach` that uses the `foreach` processor to extract information from a nested JSON structure: ```json - +PUT _ingest/pipeline/example-foreach +{ + "version": 2, + "example-foreach": { + "source": { + "http": { + "path": "/data" + } + }, + "processors": [ + { + "foreach": { + "field": "data.orders", + "processor": { + "rename": { + "field": "_ingest._value.order_id", + "target_field": "order_id" + } + }, + "on_failure": [ + { + "set": { + "field": "_index", + "value": "failed-orders" + } + } + ] + } + } + ], + "sink": { + "opensearch": { + "index": "orders" + } + } + } +} ``` {% include copy-curl.html %} From 55fe58f1e093614ad0fd555193479bd64bcf0581 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Wed, 22 May 2024 12:53:48 -0600 Subject: [PATCH 3/7] Add pipeline examples Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 123 ++++++++++++++++-------- 1 file changed, 85 insertions(+), 38 deletions(-) diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md index 76dad0d7c6..d80a1a7033 100644 --- a/_ingest-pipelines/processors/foreach.md +++ b/_ingest-pipelines/processors/foreach.md @@ -49,42 +49,22 @@ Follow these steps to use the processor in a pipeline. The following query creates a pipeline named `test-foreach` that uses the `foreach` processor to extract information from a nested JSON structure: ```json -PUT _ingest/pipeline/example-foreach +PUT _ingest/pipeline/test-foreach { - "version": 2, - "example-foreach": { - "source": { - "http": { - "path": "/data" - } - }, - "processors": [ - { - "foreach": { - "field": "data.orders", - "processor": { - "rename": { - "field": "_ingest._value.order_id", - "target_field": "order_id" - } - }, - "on_failure": [ - { - "set": { - "field": "_index", - "value": "failed-orders" - } - } - ] + "description": "Extracts nested JSON data", + "processors": [ + { + "foreach": { + "field": "users", + "processor": { + "json": { + "field": "_ingest._value", + "target_field": "user_data" + } } } - ], - "sink": { - "opensearch": { - "index": "orders" - } } - } + ] } ``` {% include copy-curl.html %} @@ -106,33 +86,100 @@ To test the pipeline, run the following query: The following example response confirms that the pipeline is working as expected: ```json - +{ + "docs": [ + { + "doc": { + "_index": "_index", + "_id": "_id", + "_source": { + "user_data": { + "name": "Jane Smith", + "age": 28 + }, + "users": [ + """{"name":"John Doe","age":32}""", + """{"name":"Jane Smith","age":28}""" + ] + }, + "_ingest": { + "_value": null, + "timestamp": "2024-05-22T18:27:27.299741001Z" + } + } + } + ] +} ``` +{% include copy-curl.html %} ### Step 3: Ingest a document The following query ingests a document into an index named `testindex1`: ```json - +PUT testindex1/_doc/1?pipeline=test-foreach +{ + "users": [ + "{\"name\":\"John Doe\",\"age\":32}", + "{\"name\":\"Jane Smith\",\"age\":28}" + ] +} ``` {% include copy-curl.html %} #### Response -The request indexes the document into the index and will index all documents with . +The request indexes the document into the index `testindex1` and indexes all documents with the extracted JSON data from the `users` field: ```json - +{ + "_index": "testindex1", + "_id": "1", + "_version": 2, + "result": "updated", + "_shards": { + "total": 2, + "successful": 1, + "failed": 0 + }, + "_seq_no": 1, + "_primary_term": 1 +} ``` +{% include copy-curl.html %} ### Step 4 (Optional): Retrieve the document To retrieve the document, run the following query: ```json - +GET testindex1/_doc/1 ``` {% include copy-curl.html %} - \ No newline at end of file +#### Response + +The response shows the document with the extracted JSON data from the `users` field: + +```json +{ + "_index": "testindex1", + "_id": "1", + "_version": 2, + "_seq_no": 1, + "_primary_term": 1, + "found": true, + "_source": { + "user_data": { + "name": "Jane Smith", + "age": 28 + }, + "users": [ + """{"name":"John Doe","age":32}""", + """{"name":"Jane Smith","age":28}""" + ] + } +} +``` +{% include copy-curl.html %} From eafa2b861c6b0b02f73beae71c125f3e29291944 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Fri, 24 May 2024 11:54:38 -0600 Subject: [PATCH 4/7] Address tech review comments Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 209 +++++++++++++----------- 1 file changed, 117 insertions(+), 92 deletions(-) diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md index d80a1a7033..0ece1c3a32 100644 --- a/_ingest-pipelines/processors/foreach.md +++ b/_ingest-pipelines/processors/foreach.md @@ -1,13 +1,13 @@ --- layout: default -title: Foreach +title: `foreach` parent: Ingest processors nav_order: 110 --- -# Foreach processor +# `foreach` processor -The `foreach` processor is used to iterate over a list of values in an input document and perform some operation on each value. This can be useful for tasks like extracting information from a nested JSON structure or applying transformations to a collection of fields. +The `foreach` processor is used to iterate over a list of values in an input document and apply a transformation to each value. This can be useful for tasks like processing all the elements in an array consistently, such as converting all elements in a string to lowercase or uppercase. The following is the syntax for the `foreach` processor: @@ -33,12 +33,12 @@ Parameter | Required/Optional | Description | |-----------|-----------|-----------| `field` | Required | The array field to iterate over. `processor` | Required | The processor to execute against each field. -`ignore_missing` | Optional | If `true` and the specified field does not exist or is -null, the processor will quietly exit without modifying the document. -`if` | Optional | A conditional expression to determine whether to execute this processor. -`on_failure` | Optional | Specifies how to handle failures for this processor. See the documentation on [Handling failures in pipelines]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/). -`ignore_failure` | Optional | If `true`, failures for this processor are ignored. See the documentation on [Handling failures in pipelines]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/). -`tag` | Optional | An identifier for this processor. Useful for debugging and metrics. +`ignore_missing` | Optional | If `true` and the specified field does not exist or is null, the processor will quietly exit without modifying the document. +`description` | Optional | A brief description of the processor. +`if` | Optional | A condition for running the processor. +`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, failures are ignored. Default is `false`. +`on_failure` | Optional | A list of processors to run if the processor fails. +`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. ## Using the processor @@ -46,26 +46,22 @@ Follow these steps to use the processor in a pipeline. ### Step 1: Create a pipeline -The following query creates a pipeline named `test-foreach` that uses the `foreach` processor to extract information from a nested JSON structure: +The following query creates a pipeline named `test-foreach` that uses the `foreach` processor to iterate over each element in the `protocols` field: ```json -PUT _ingest/pipeline/test-foreach -{ - "description": "Extracts nested JSON data", - "processors": [ - { - "foreach": { - "field": "users", - "processor": { - "json": { - "field": "_ingest._value", - "target_field": "user_data" - } - } - } - } - ] -} +PUT _ingest/pipeline/test-foreach +{ + "description": "Lowercase all the elements in an array", + "processors": [ + { + "foreach": { + "field": "protocols", + "processor": { + "lowercase": { + "field": "_ingest._value" + } + } + } ``` {% include copy-curl.html %} @@ -77,39 +73,49 @@ It is recommended that you test your pipeline before you ingest documents. To test the pipeline, run the following query: ```json - +POST _ingest/pipeline/test-foreach/_simulate +{ + "docs": [ + { + "_index": "testindex1", + "_id": "1", + "_source": { + "protocols": ["HTTP","HTTPS","TCP","UDP"] + } + } + ] +} ``` {% include copy-curl.html %} #### Response -The following example response confirms that the pipeline is working as expected: +The following example response confirms that the pipeline is working as expected, showing the four elements have been lowercased: ```json -{ - "docs": [ - { - "doc": { - "_index": "_index", - "_id": "_id", - "_source": { - "user_data": { - "name": "Jane Smith", - "age": 28 - }, - "users": [ - """{"name":"John Doe","age":32}""", - """{"name":"Jane Smith","age":28}""" - ] - }, - "_ingest": { - "_value": null, - "timestamp": "2024-05-22T18:27:27.299741001Z" - } - } - } - ] -} +{ + "docs": [ + { + "doc": { + "_index": "testindex1", + "_id": "1", + "_source": { + "protocols": [ + "http", + "https", + "tcp", + "udp" + ] + }, + "_ingest": { + "_value": null, + "timestamp": "2024-05-23T02:44:10.8201Z" + } + } + } + ] +} + ``` {% include copy-curl.html %} @@ -118,34 +124,31 @@ The following example response confirms that the pipeline is working as expected The following query ingests a document into an index named `testindex1`: ```json -PUT testindex1/_doc/1?pipeline=test-foreach -{ - "users": [ - "{\"name\":\"John Doe\",\"age\":32}", - "{\"name\":\"Jane Smith\",\"age\":28}" - ] -} +POST testindex1/_doc/1?pipeline=test-foreach +{ + "protocols": ["HTTP","HTTPS","TCP","UDP"] +} ``` {% include copy-curl.html %} #### Response -The request indexes the document into the index `testindex1` and indexes all documents with the extracted JSON data from the `users` field: +The request indexes the document into the index `testindex1` and applies the pipeline before indexing: ```json -{ - "_index": "testindex1", - "_id": "1", - "_version": 2, - "result": "updated", - "_shards": { - "total": 2, - "successful": 1, - "failed": 0 - }, - "_seq_no": 1, - "_primary_term": 1 -} +{ + "_index": "testindex1", + "_id": "1", + "_version": 6, + "result": "created", + "_shards": { + "total": 2, + "successful": 1, + "failed": 0 + }, + "_seq_no": 5, + "_primary_term": 67 +} ``` {% include copy-curl.html %} @@ -163,23 +166,45 @@ GET testindex1/_doc/1 The response shows the document with the extracted JSON data from the `users` field: ```json -{ - "_index": "testindex1", - "_id": "1", - "_version": 2, - "_seq_no": 1, - "_primary_term": 1, - "found": true, - "_source": { - "user_data": { - "name": "Jane Smith", - "age": 28 - }, - "users": [ - """{"name":"John Doe","age":32}""", - """{"name":"Jane Smith","age":28}""" - ] - } -} +{ + "_index": "testindex1", + "_id": "1", + "_version": 6, + "_seq_no": 5, + "_primary_term": 67, + "found": true, + "_source": { + "protocols": [ + "http", + "https", + "tcp", + "udp" + ] + } +} +gaobinlong1 day ago + +{ + "docs": [ + { + "doc": { + "_index": "testindex1", + "_id": "1", + "_source": { + "protocols": [ + "http", + "https", + "tcp", + "udp" + ] + }, + "_ingest": { + "_value": null, + "timestamp": "2024-05-23T02:44:10.8201Z" + } + } + } + ] +} ``` {% include copy-curl.html %} From dda29a041f2c008020b0f5e39f9022e8dd8b40cf Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Tue, 28 May 2024 12:40:10 -0600 Subject: [PATCH 5/7] Update _ingest-pipelines/processors/foreach.md Co-authored-by: Nathan Bower Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md index 0ece1c3a32..d647f2a6da 100644 --- a/_ingest-pipelines/processors/foreach.md +++ b/_ingest-pipelines/processors/foreach.md @@ -33,7 +33,7 @@ Parameter | Required/Optional | Description | |-----------|-----------|-----------| `field` | Required | The array field to iterate over. `processor` | Required | The processor to execute against each field. -`ignore_missing` | Optional | If `true` and the specified field does not exist or is null, the processor will quietly exit without modifying the document. +`ignore_missing` | Optional | If `true` and the specified field does not exist or is null, then the processor will quietly exit without modifying the document. `description` | Optional | A brief description of the processor. `if` | Optional | A condition for running the processor. `ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, failures are ignored. Default is `false`. From a55b31a2ea60d339fb8dd46d53ff1dc7d9447e8a Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Tue, 28 May 2024 12:43:33 -0600 Subject: [PATCH 6/7] Update _ingest-pipelines/processors/foreach.md Co-authored-by: Nathan Bower Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md index d647f2a6da..07b339c578 100644 --- a/_ingest-pipelines/processors/foreach.md +++ b/_ingest-pipelines/processors/foreach.md @@ -36,7 +36,7 @@ Parameter | Required/Optional | Description | `ignore_missing` | Optional | If `true` and the specified field does not exist or is null, then the processor will quietly exit without modifying the document. `description` | Optional | A brief description of the processor. `if` | Optional | A condition for running the processor. -`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, failures are ignored. Default is `false`. +`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, then failures are ignored. Default is `false`. `on_failure` | Optional | A list of processors to run if the processor fails. `tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. From d4bc18f42d6523800b715a1f313ddee6d71790e9 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Tue, 28 May 2024 12:45:58 -0600 Subject: [PATCH 7/7] Address editorial review comments Signed-off-by: Melissa Vagi --- _ingest-pipelines/processors/foreach.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/_ingest-pipelines/processors/foreach.md b/_ingest-pipelines/processors/foreach.md index 07b339c578..72a0ed1420 100644 --- a/_ingest-pipelines/processors/foreach.md +++ b/_ingest-pipelines/processors/foreach.md @@ -90,7 +90,7 @@ POST _ingest/pipeline/test-foreach/_simulate #### Response -The following example response confirms that the pipeline is working as expected, showing the four elements have been lowercased: +The following example response confirms that the pipeline is working as expected, showing that the four elements have been lowercased: ```json { @@ -182,7 +182,6 @@ The response shows the document with the extracted JSON data from the `users` fi ] } } -gaobinlong1 day ago { "docs": [