From d5dbe988fc6fb904895ccd217edce5d8be348aa8 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Fri, 22 Dec 2023 13:36:29 -0700
Subject: [PATCH 01/12] Add join processor documentation

Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 86 ++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)
 create mode 100644 _ingest-pipelines/processors/join.md

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
new file mode 100644
index 0000000000..e1317a14ef
--- /dev/null
+++ b/_ingest-pipelines/processors/join.md
@@ -0,0 +1,86 @@
+---
+layout: default
+title: Join
+parent: Ingest processors
+nav_order: 160
+---
+
+# Join processor
+
+The `join` processor is used to .
+
+The following is the syntax for the `join` processor:
+
+```json
+
+```
+{% include copy-curl.html %}
+
+## Configuration parameters
+
+The following table lists the required and optional parameters for the `join` processor.
+
+Parameter | Required/Optional | Description |
+|-----------|-----------|-----------|
+
+
+## Using the processor
+
+Follow these steps to use the processor in a pipeline.
+
+### Step 1: Create a pipeline
+
+The following query creates a pipeline, named , that uses the `join` processor to :
+
+```json
+
+```
+{% include copy-curl.html %}
+
+### Step 2 (Optional): Test the pipeline
+
+It is recommended that you test your pipeline before you ingest documents.
+{: .tip}
+
+To test the pipeline, run the following query:
+
+```json
+
+```
+{% include copy-curl.html %}
+
+#### Response
+
+The following example response confirms that the pipeline is working as expected:
+
+```json
+
+```
+
+### Step 3: Ingest a document
+
+The following query ingests a document into an index named `testindex1`:
+
+```json
+
+```
+{% include copy-curl.html %}
+
+#### Response
+
+The request indexes the document into the index and will index all documents with .
+
+```json
+
+```
+
+### Step 4 (Optional): Retrieve the document
+
+To retrieve the document, run the following query:
+
+```json
+
+```
+{% include copy-curl.html %}
+
+
\ No newline at end of file

From 5bbecdde328baa232965cc7f3e462ecd487d6da1 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 23 May 2024 13:21:51 -0600
Subject: [PATCH 02/12] Add examples and explanatory text

Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 47 ++++++++++++++++++++++++----
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index e1317a14ef..7193a7b423 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -7,12 +7,17 @@ nav_order: 160
 
 # Join processor
 
-The `join` processor is used to .
+The `join` processor combines fields from different data sources into a single document before indexing. Raises an error if the field is not an array. For example, you could combine the log message, severity level, and timestamp into a single field for better readability and easier querying. Or, you could join data from different sources, such as application logs and system logs, based on a common field like a session ID or user ID to provide a more comprehensive view of related events and help in troubleshooting and root cause analysis.
 
 The following is the syntax for the `join` processor:
 
 ```json
-
+{
+  "join": {
+    "field": "field_name",
+    "separator": "separator_string"
+  }
+}
 ```
 {% include copy-curl.html %}
 
@@ -22,7 +27,14 @@ The following table lists the required and optional parameters for the `join` pr
 
 Parameter | Required/Optional | Description |
 |-----------|-----------|-----------|
-
+`field` | Required | The field name where the join operator is applied.
+`separator` | Optional | A string separator to use when joining field values. If not specified, the values are concatenated without a separator.
+`target_field` | Optional | The field to assign the cleaned value to. If not specified, field is updated in-place.
+`description` | Optional | Description of the processor's purpose or configuration.
+`if` | Optional | Conditionally execute the processor.
+`ignore_failure` | Optional | Ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
+`on_failure` | Optional | Handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
+`tag` | Optional | Identifier for the processor. Useful for debugging and metrics.
 
 ## Using the processor
 
@@ -33,7 +45,19 @@ Follow these steps to use the processor in a pipeline.
 The following query creates a pipeline, named , that uses the `join` processor to :
 
 ```json
-
+PUT _ingest/pipeline/example-join-pipeline
+{
+  "description": "Example pipeline using the join processor",
+  "processors": [
+    {
+      "join": {
+        "field": "message",
+        "separator": " - ",
+        "target_field": "combined_message"
+      }
+    }
+  ]
+}
 ```
 {% include copy-curl.html %}
 
@@ -45,7 +69,18 @@ It is recommended that you test your pipeline before you ingest documents.
 To test the pipeline, run the following query:
 
 ```json
-
+POST _ingest/pipeline/example-join-pipeline/_simulate
+{
+  "docs": [
+    {
+      "_source": {
+        "message": "Server started",
+        "severity": "INFO",
+        "timestamp": "2023-04-20T12:00:00Z"
+      }
+    }
+  ]
+}
 ```
 {% include copy-curl.html %}
 
@@ -54,7 +89,7 @@ To test the pipeline, run the following query:
 The following example response confirms that the pipeline is working as expected:
 
 ```json
-
+
 ```
 
 ### Step 3: Ingest a document

From 62bad3c1c05b13efca0bcbeaf6203d72b0c8035a Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Wed, 29 May 2024 11:09:43 -0600
Subject: [PATCH 03/12] Address tech review comments

Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 94 ++++++++++++++++------------
 1 file changed, 54 insertions(+), 40 deletions(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index 7193a7b423..ae79223d65 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -7,7 +7,7 @@ nav_order: 160
 
 # Join processor
 
-The `join` processor combines fields from different data sources into a single document before indexing. Raises an error if the field is not an array. For example, you could combine the log message, severity level, and timestamp into a single field for better readability and easier querying. Or, you could join data from different sources, such as application logs and system logs, based on a common field like a session ID or user ID to provide a more comprehensive view of related events and help in troubleshooting and root cause analysis.
+The `join` processor concatenates the elements of an array into a single string value, using a specified separator between each element. It throws an exception if the provided input is not an array.
 
 The following is the syntax for the `join` processor:
 
@@ -27,8 +27,8 @@ The following table lists the required and optional parameters for the `join` pr
 
 Parameter | Required/Optional | Description |
 |-----------|-----------|-----------|
-`field` | Required | The field name where the join operator is applied.
-`separator` | Optional | A string separator to use when joining field values. If not specified, the values are concatenated without a separator.
+`field` | Required | The field name where the join operator is applied. Must be an array.
+`separator` | Required | A string separator to use when joining field values. If not specified, the values are concatenated without a separator.
 `target_field` | Optional | The field to assign the cleaned value to. If not specified, field is updated in-place.
 `description` | Optional | Description of the processor's purpose or configuration.
 `if` | Optional | Conditionally execute the processor.
@@ -45,19 +45,18 @@ Follow these steps to use the processor in a pipeline.
 The following query creates a pipeline, named , that uses the `join` processor to :
 
 ```json
-PUT _ingest/pipeline/example-join-pipeline
-{
-  "description": "Example pipeline using the join processor",
-  "processors": [
-    {
-      "join": {
-        "field": "message",
-        "separator": " - ",
-        "target_field": "combined_message"
-      }
-    }
-  ]
-}
+PUT _ingest/pipeline/example-join-pipeline
+{
+  "description": "Example pipeline using the join processor",
+  "processors": [
+    {
+      "join": {
+        "field": "uri",
+        "separator": "/"
+      }
+    }
+  ]
+}
 ```
 {% include copy-curl.html %}
 
@@ -69,17 +68,19 @@ It is recommended that you test your pipeline before you ingest documents.
 To test the pipeline, run the following query:
 
 ```json
-POST _ingest/pipeline/example-join-pipeline/_simulate
-{
-  "docs": [
-    {
-      "_source": {
-        "message": "Server started",
-        "severity": "INFO",
-        "timestamp": "2023-04-20T12:00:00Z"
-      }
-    }
-  ]
+POST _ingest/pipeline/example-join-pipeline/_simulate
+{
+  "docs": [
+    {
+      "_source": {
+        "uri": [
+          "app",
+          "home",
+          "overview"
+        ]
+      }
+    }
+  ]
 }
 ```
 {% include copy-curl.html %}
@@ -89,33 +90,46 @@ POST _ingest/pipeline/example-join-pipeline/_simulate
 The following example response confirms that the pipeline is working as expected:
 
 ```json
-
+{
+  "docs": [
+    {
+      "doc": {
+        "_index": "_index",
+        "_id": "_id",
+        "_source": {
+          "uri": "app/home/overview"
+        },
+        "_ingest": {
+          "timestamp": "2024-05-24T02:16:01.00659117Z"
+        }
+      }
+    }
+  ]
+}
 ```
+{% include copy-curl.html %}
 
 ### Step 3: Ingest a document
 
 The following query ingests a document into an index named `testindex1`:
 
 ```json
-
+POST testindex1/_doc/1?pipeline=example-join-pipeline
+{
+  "uri": [
+    "app",
+    "home",
+    "overview"
+  ]
+}
 ```
 {% include copy-curl.html %}
 
-#### Response
-
-The request indexes the document into the index and will index all documents with .
-
-```json
-
-```
-
 ### Step 4 (Optional): Retrieve the document
 
 To retrieve the document, run the following query:
 
 ```json
-
+GET testindex1/_doc/1
 ```
 {% include copy-curl.html %}
-
-
\ No newline at end of file

From b1783a18053fbaa28cab4986feda5367b082daac Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Wed, 29 May 2024 11:17:39 -0600
Subject: [PATCH 04/12] Update _ingest-pipelines/processors/join.md

Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index ae79223d65..63032bf523 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -42,7 +42,7 @@ Follow these steps to use the processor in a pipeline.
 
 ### Step 1: Create a pipeline
 
-The following query creates a pipeline, named , that uses the `join` processor to :
+The following query creates a pipeline named `example-join-pipeline` that uses the `join` processor to concatenate all the values of the `uri` field, separating them with the specified separator `/`:
 
 ```json
 PUT _ingest/pipeline/example-join-pipeline

From 1613f7febf301de93a305c3ee0af18c856fc98d5 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:28:57 -0600
Subject: [PATCH 05/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index 63032bf523..c4b1592e3f 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -27,7 +27,7 @@ The following table lists the required and optional parameters for the `join` pr
 
 Parameter | Required/Optional | Description |
 |-----------|-----------|-----------|
-`field` | Required | The field name where the join operator is applied. Must be an array.
+`field` | Required | The name of the field to which the join operator is applied. Must be an array.
 `separator` | Required | A string separator to use when joining field values. If not specified, the values are concatenated without a separator.
 `target_field` | Optional | The field to assign the cleaned value to. If not specified, field is updated in-place.
 `description` | Optional | Description of the processor's purpose or configuration.
From 04412660bee709a0196864a6d358723bbad56526 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:29:06 -0600
Subject: [PATCH 06/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index c4b1592e3f..5ab4c3bb50 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -28,7 +28,7 @@ The following table lists the required and optional parameters for the `join` pr
 Parameter | Required/Optional | Description |
 |-----------|-----------|-----------|
 `field` | Required | The name of the field to which the join operator is applied. Must be an array.
-`separator` | Required | A string separator to use when joining field values. If not specified, the values are concatenated without a separator.
+`separator` | Required | A string separator to use when joining field values. If not specified, then the values are concatenated without a separator.
 `target_field` | Optional | The field to assign the cleaned value to. If not specified, field is updated in-place.
 `description` | Optional | Description of the processor's purpose or configuration.
 `if` | Optional | Conditionally execute the processor.
From 6cd64bd7de6af0a55eaa88a9c12d9941cb08b1f7 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:29:20 -0600
Subject: [PATCH 07/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index 5ab4c3bb50..adcbada20b 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -29,7 +29,7 @@ Parameter | Required/Optional | Description |
 |-----------|-----------|-----------|
 `field` | Required | The name of the field to which the join operator is applied. Must be an array.
 `separator` | Required | A string separator to use when joining field values. If not specified, then the values are concatenated without a separator.
-`target_field` | Optional | The field to assign the cleaned value to. If not specified, field is updated in-place.
+`target_field` | Optional | The field to assign the cleaned value to. If not specified, then the field is updated in place.
 `description` | Optional | Description of the processor's purpose or configuration.
 `if` | Optional | Conditionally execute the processor.
 `ignore_failure` | Optional | Ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
From cc946b200b2d5eaa8a35bd786117357af8aa1363 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:29:28 -0600
Subject: [PATCH 08/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index adcbada20b..47517cda52 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -33,7 +33,7 @@ Parameter | Required/Optional | Description |
 `description` | Optional | Description of the processor's purpose or configuration.
 `if` | Optional | Conditionally execute the processor.
 `ignore_failure` | Optional | Ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
-`on_failure` | Optional | Handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
+`on_failure` | Optional | Specifies to handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `tag` | Optional | Identifier for the processor. Useful for debugging and metrics.
 
 ## Using the processor

From 6d8ba0eac4f26818480a6d184382136220707be9 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:29:35 -0600
Subject: [PATCH 09/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index 47517cda52..03674e49f4 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -30,7 +30,7 @@ Parameter | Required/Optional | Description |
 `field` | Required | The name of the field to which the join operator is applied. Must be an array.
 `separator` | Required | A string separator to use when joining field values. If not specified, then the values are concatenated without a separator.
 `target_field` | Optional | The field to assign the cleaned value to. If not specified, then the field is updated in place.
-`description` | Optional | Description of the processor's purpose or configuration.
+`description` | Optional | A description of the processor's purpose or configuration.
 `if` | Optional | Conditionally execute the processor.
 `ignore_failure` | Optional | Ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `on_failure` | Optional | Specifies to handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
From 5b5805af4b281660f27dbc0b631eab718b0498df Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:29:45 -0600
Subject: [PATCH 10/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index 03674e49f4..b32743f275 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -31,7 +31,7 @@ Parameter | Required/Optional | Description |
 `separator` | Required | A string separator to use when joining field values. If not specified, then the values are concatenated without a separator.
 `target_field` | Optional | The field to assign the cleaned value to. If not specified, then the field is updated in place.
 `description` | Optional | A description of the processor's purpose or configuration.
-`if` | Optional | Conditionally execute the processor.
+`if` | Optional | Specifies to conditionally execute the processor.
 `ignore_failure` | Optional | Ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `on_failure` | Optional | Specifies to handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `tag` | Optional | Identifier for the processor. Useful for debugging and metrics.
From ac85946083c1a35cc9cccbc8fabad84ba4c42f40 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:29:53 -0600
Subject: [PATCH 11/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index b32743f275..967c6a6d60 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -32,7 +32,7 @@ Parameter | Required/Optional | Description |
 `target_field` | Optional | The field to assign the cleaned value to. If not specified, then the field is updated in place.
 `description` | Optional | A description of the processor's purpose or configuration.
 `if` | Optional | Specifies to conditionally execute the processor.
-`ignore_failure` | Optional | Ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
+`ignore_failure` | Optional | Specifies to ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `on_failure` | Optional | Specifies to handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `tag` | Optional | Identifier for the processor. Useful for debugging and metrics.
 
From 3f2f9dae6a519067de89527e5d22c08d0b168884 Mon Sep 17 00:00:00 2001
From: Melissa Vagi
Date: Thu, 30 May 2024 08:30:00 -0600
Subject: [PATCH 12/12] Update _ingest-pipelines/processors/join.md

Co-authored-by: Nathan Bower
Signed-off-by: Melissa Vagi
---
 _ingest-pipelines/processors/join.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_ingest-pipelines/processors/join.md b/_ingest-pipelines/processors/join.md
index 967c6a6d60..c2cdcfe4de 100644
--- a/_ingest-pipelines/processors/join.md
+++ b/_ingest-pipelines/processors/join.md
@@ -34,7 +34,7 @@ Parameter | Required/Optional | Description |
 `if` | Optional | Specifies to conditionally execute the processor.
 `ignore_failure` | Optional | Specifies to ignore failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
 `on_failure` | Optional | Specifies to handle failures for the processor. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/).
-`tag` | Optional | Identifier for the processor. Useful for debugging and metrics.
+`tag` | Optional | An identifier for the processor. Useful for debugging and metrics.
 
 ## Using the processor
 
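
The behavior that the final version of `join.md` documents (concatenate the elements of an array field using a separator, write the result to `target_field` or back to `field`, and fail when the input is not an array) can be approximated in a few lines of Python. This is only a sketch for checking the documented example by hand, not the OpenSearch implementation: the function name `apply_join`, the use of `ValueError`, and the string conversion of non-string elements are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def apply_join(
    doc: Dict[str, Any],
    field: str,
    separator: str,
    target_field: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of doc with the array in `field` joined into one string.

    Hypothetical helper: mirrors the documented parameters `field`,
    `separator`, and `target_field`, nothing more.
    """
    values = doc.get(field)
    # The documentation states that a non-array input is an error;
    # the exception type used here is an assumption.
    if not isinstance(values, list):
        raise ValueError(f"field [{field}] is not an array")

    result = dict(doc)
    # When target_field is not given, the source field is updated in place.
    result[target_field or field] = separator.join(str(v) for v in values)
    return result


if __name__ == "__main__":
    # Same input as the simulate request in the patch series.
    print(apply_join({"uri": ["app", "home", "overview"]}, "uri", "/"))
    # prints {'uri': 'app/home/overview'}
```

The printed value matches the `_source` shown in the simulate response added in PATCH 03/12.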