[DOC] Add dissect processor documentation #5510

Merged
merged 67 commits into from
Jan 29, 2024
04ba3d8
Develop new content for dissect processor
vagimeli Nov 3, 2023
9cea9c9
Add new dissect processor documentation
vagimeli Nov 6, 2023
7177f0d
Merge branch 'main' into dissect
vagimeli Nov 6, 2023
c41d3a6
Merge branch 'main' into dissect
vagimeli Nov 8, 2023
a9c0791
Merge branch 'main' into dissect
vagimeli Nov 14, 2023
3a9854b
Merge branch 'main' into dissect
vagimeli Nov 17, 2023
8a33fa7
Update dissect.md
vagimeli Nov 17, 2023
22d135e
Merge branch 'main' into dissect
vagimeli Nov 29, 2023
29cbe58
Merge branch 'main' into dissect
vagimeli Dec 1, 2023
46b495a
Address tech review feedback
vagimeli Dec 8, 2023
0e53648
Merge branch 'main' into dissect
vagimeli Dec 8, 2023
baa299e
Merge branch 'main' into dissect
vagimeli Dec 12, 2023
4eaefa1
Copy edits
vagimeli Dec 12, 2023
8f8e557
Merge branch 'main' into dissect
vagimeli Jan 4, 2024
8b46f7a
Copy edits
vagimeli Jan 4, 2024
eb07020
Copy edits
vagimeli Jan 4, 2024
e6cfa13
Merge branch 'main' into dissect
vagimeli Jan 10, 2024
a1c4a3a
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 18, 2024
444e8d7
Merge branch 'main' into dissect
vagimeli Jan 18, 2024
b003300
Merge branch 'main' into dissect
vagimeli Jan 23, 2024
179f20d
Address Fanit doc review feedback
vagimeli Jan 23, 2024
740af70
Copy edits
vagimeli Jan 23, 2024
da68754
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 26, 2024
f82b3b8
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 26, 2024
0ff6606
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 26, 2024
fdeacb7
Merge branch 'main' into dissect
vagimeli Jan 26, 2024
b44f48b
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
da19be4
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
98b790b
Update dissect.md
vagimeli Jan 29, 2024
5cdd221
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
34a47fd
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
b379bc3
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
9ea4d6f
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
ce9ff07
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
04af086
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
a3e7a8d
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
7b44a06
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
de25752
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
47195d0
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
7d755f6
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
7b03a4b
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
01206c5
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
599afa1
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
3be93a2
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
c7d1e04
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
5b13a02
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
a6d8d42
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
1c95dfe
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
736ca78
Update dissect.md
vagimeli Jan 29, 2024
472d152
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
fe54378
Update dissect.md
vagimeli Jan 29, 2024
91885c4
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
4818868
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
19bb826
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
240f709
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
f31da78
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
fa99869
Update dissect.md
vagimeli Jan 29, 2024
4e0418f
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
0022732
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
b0bf080
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
3eb5330
Update _ingest-pipelines/processors/dissect.md
vagimeli Jan 29, 2024
5d6a747
Update dissect.md
vagimeli Jan 29, 2024
5a4c9bb
Update dissect.md
vagimeli Jan 29, 2024
59f4e82
Update dissect.md
vagimeli Jan 29, 2024
84ff483
Update dissect.md
vagimeli Jan 29, 2024
f17c8d8
Update dissect.md
vagimeli Jan 29, 2024
a95c46d
Merge branch 'main' into dissect
vagimeli Jan 29, 2024
281 changes: 281 additions & 0 deletions _ingest-pipelines/processors/dissect.md
@@ -0,0 +1,281 @@
---
layout: default
title: Dissect
parent: Ingest processors
nav_order: 60
---

# Dissect

The `dissect` processor extracts values from an event and maps them to individual fields based on user-defined dissect patterns. The processor is well-suited for field extractions from log messages with a known structure.

## Syntax

The following is the syntax for the `dissect` processor:

```json
{
  "dissect": {
    "field": "source_field",
    "pattern": "%{dissect_pattern}"
  }
}
```
{% include copy-curl.html %}


## Configuration parameters

Check failure on line 26 in _ingest-pipelines/processors/dissect.md (GitHub Actions / vale)

[OpenSearch.HeadingCapitalization] 'Configuration parameters' is a heading and should be in sentence case.

The following table lists the required and optional parameters for the `dissect` processor.

Parameter | Required/Optional | Description |

Check failure on line 30 in _ingest-pipelines/processors/dissect.md (GitHub Actions / vale)

[OpenSearch.TableHeadings] 'Required/Optional' is a table heading and should be in sentence case.
|-----------|-----------|-----------|
`field` | Required | The name of the field containing the data to be dissected. Supports [template snippets]({{site.url}}{{site.baseurl}}/ingest-pipelines/create-ingest/#template-snippets). |
`pattern` | Required | The dissect pattern used to extract data from the specified field. |
`append_separator` | Optional | The separator character or string between two or more values. Default is `""` (empty string).
Collaborator

Suggested change
`append_separator` | Optional | The separator character or string between two or more values. Default is `""` (empty string).
`append_separator` | Optional | The separator character or string that separates appended fields. Default is `""` (empty string).

`description` | Optional | A brief description of the processor. |
`if` | Optional | A condition for running this processor. |
`ignore_failure` | Optional | If set to `true`, failures are ignored. Default is `false`. |
`ignore_missing` | Optional | If set to `true`, the processor does not modify the document if the field does not exist or is `null`. Default is `false`. |
`on_failure` | Optional | A list of processors to run if the processor fails. |
`tag` | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
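
As a rough illustration of how several of these parameters fit together, the following sketch defines a pipeline whose `dissect` processor sets `append_separator`, `ignore_missing`, `tag`, and `description`. The pipeline name `dissect-params-example`, the `message` field, and the pattern are assumptions chosen for illustration only:

```json
PUT /_ingest/pipeline/dissect-params-example
{
  "description": "Illustrative pipeline showing optional dissect parameters",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{+name} %{+name} %{city}",
        "append_separator": " ",
        "ignore_missing": true,
        "tag": "dissect-name-city",
        "description": "Joins the first two tokens into name and stores the third as city"
      }
    }
  ]
}
```
{% include copy-curl.html %}

In this sketch, `ignore_missing` leaves documents without a `message` field unchanged, and `tag` helps distinguish this processor from other `dissect` processors when debugging.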

## Using the processor

Follow these steps to use the processor in a pipeline.

**Step 1: Create a pipeline.**

The following query creates a pipeline, named `dissect-test`, that uses the `dissect` processor to parse the log line:

```json
PUT /_ingest/pipeline/dissect-test
{
  "description": "Pipeline that dissects web server logs",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{client_ip} - - [%{timestamp}] \"%{http_method} %{url} %{http_version}\" %{response_code} %{response_size}"
      }
    }
  ]
}
```
{% include copy-curl.html %}

**Step 2 (Optional): Test the pipeline.**

It is recommended that you test your pipeline before you ingest documents.
{: .tip}

To test the pipeline, run the following query:

```json
POST _ingest/pipeline/dissect-test/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "message": "192.168.1.10 - - [03/Nov/2023:15:20:45 +0000] \"POST /login HTTP/1.1\" 200 3456"
      }
    }
  ]
}
```
{% include copy-curl.html %}

#### Response

The following example response confirms that the pipeline is working as expected:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "response_code": "200",
          "http_method": "POST",
          "http_version": "HTTP/1.1",
          "client_ip": "192.168.1.10",
          "message": """192.168.1.10 - - [03/Nov/2023:15:20:45 +0000] "POST /login HTTP/1.1" 200 3456""",
          "url": "/login",
          "response_size": "3456",
          "timestamp": "03/Nov/2023:15:20:45 +0000"
        },
        "_ingest": {
          "timestamp": "2023-11-03T22:28:32.830244044Z"
        }
      }
    }
  ]
}
```

**Step 3: Ingest a document.**

The following query ingests a document into an index named `testindex1`:

```json
PUT testindex1/_doc/1?pipeline=dissect-test
{
  "message": "192.168.1.10 - - [03/Nov/2023:15:20:45 +0000] \"POST /login HTTP/1.1\" 200 3456"
}
```
{% include copy-curl.html %}

**Step 4 (Optional): Retrieve the document.**

To retrieve the document, run the following query:

```json
GET testindex1/_doc/1
```
{% include copy-curl.html %}

## Dissect patterns

A dissect pattern is a way to tell `dissect` how to parse a string into a structured format. The pattern is defined by the parts of the string that you want to discard. For example, the following dissect pattern would parse a string like `"192.168.1.10 - - [03/Nov/2023:15:20:45 +0000] \"POST /login HTTP/1.1\" 200 3456"` into the following fields:
Collaborator

Suggested change
A dissect pattern is a way to tell `dissect` how to parse a string into a structured format. The pattern is defined by the parts of the string that you want to discard. For example, the following dissect pattern would parse a string like `"192.168.1.10 - - [03/Nov/2023:15:20:45 +0000] \"POST /login HTTP/1.1\" 200 3456"` into the following fields:
A dissect pattern is a way to tell `dissect` how to parse a string into a structured format. The pattern is defined by the parts of the string that you want to discard. For example, the `%{client_ip} - - [%{timestamp}]` dissect pattern parses the string `"192.168.1.10 - - [03/Nov/2023:15:20:45 +0000] \"POST /login HTTP/1.1\" 200 3456"` into the following fields:


```json
client_ip: "192.168.1.10"
@timestamp: "03/Nov/2023:16:09:05 MDT"
Collaborator

Suggested change
@timestamp: "03/Nov/2023:16:09:05 MDT"
@timestamp: "03/Nov/2023:15:20:45 +0000"

```

The dissect pattern works by matching the string against a set of rules. For example, the first rule is to match a single space. Dissect will find this space and then assign the value of `client_ip` to everything up to that space. The next rule is to match the `[` and `]` characters and then assign the value of `@timestamp` to everything in between.
Collaborator

Suggested change
The dissect pattern works by matching the string against a set of rules. For example, the first rule is to match a single space. Dissect will find this space and then assign the value of `client_ip` to everything up to that space. The next rule is to match the `[` and `]` characters and then assign the value of `@timestamp` to everything in between.
The dissect pattern works by matching the string against a set of rules. For example, the first rule is to discard a single space. Dissect will find this space and then assign the value of `client_ip` to everything up to that space. The next rule is to match the `[` and `]` characters and then assign the value of `@timestamp` to everything in between.


### Building successful dissect patterns

When building a dissect pattern, it is important to pay attention to the parts of the string that you want to discard. If you discard too much of the string, then `dissect` may not be able to successfully parse the remaining data. Conversely, if you do not discard enough of the string, then `dissect` may create unnecessary fields.

If any of the `%{keyname}` defined in the pattern do not have a value, then an exception is thrown. You can handle this exception by using the `on_failure` parameter.
Collaborator

Suggested change
If any of the `%{keyname}` defined in the pattern do not have a value, then an exception is thrown. You can handle this exception by using the `on_failure` parameter.
If any of the `%{keyname}` defined in the pattern do not have a value, then an exception is thrown. You can handle this exception by providing error handling steps in the `on_failure` parameter.
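
The `on_failure` handling described above can be sketched as follows. This is a minimal example under stated assumptions: the pipeline name `dissect-on-failure-example`, the `dissect_error` field, and its static value are hypothetical and chosen only for illustration. If the pattern does not match, the `set` processor listed in `on_failure` runs instead of the request failing:

```json
PUT /_ingest/pipeline/dissect-on-failure-example
{
  "description": "Illustrative pipeline that records dissect failures",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{client_ip} - - [%{timestamp}]",
        "on_failure": [
          {
            "set": {
              "field": "dissect_error",
              "value": "dissect pattern did not match"
            }
          }
        ]
      }
    }
  ]
}
```
{% include copy-curl.html %}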


### Empty and named skip keys

An empty key `%{}` or a named skip key can be used to match values, but exclude the value from the final document. This can be useful if you want to parse a string, but you do not need to store all of the data.
Collaborator

Add a link to the "named skip key" section below

Collaborator

Suggested change
An empty key `%{}` or a named skip key can be used to match values, but exclude the value from the final document. This can be useful if you want to parse a string, but you do not need to store all of the data.
An empty key `%{}` or a named skip key can be used to match values, but exclude the value from the final document. This can be useful if you want to parse a string, but you do not need to store all its parts.


### Matched values as string data types
Collaborator

Suggested change
### Matched values as string data types
### Converting matched values to a non-string data type


By default, all matched values are represented as string data types. If you need to convert a value to a different data type, you can use the [`convert` processor]({{site.url}}{{site.baseurl}}/ingest-pipelines/processors/convert/).
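
As a sketch of this approach, the following pipeline runs a `convert` processor after `dissect` so that `response_code` is stored as an integer rather than a string. The pipeline name `dissect-convert-example` is an assumption for illustration; the pattern is the one used in the earlier web server log example:

```json
PUT /_ingest/pipeline/dissect-convert-example
{
  "description": "Illustrative pipeline that dissects a log line and converts one field",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{client_ip} - - [%{timestamp}] \"%{http_method} %{url} %{http_version}\" %{response_code} %{response_size}"
      }
    },
    {
      "convert": {
        "field": "response_code",
        "type": "integer"
      }
    }
  ]
}
```
{% include copy-curl.html %}

The `convert` processor is placed after `dissect` because the `response_code` field does not exist until the pattern has been applied.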

### Key modifiers

Check failure on line 165 in _ingest-pipelines/processors/dissect.md (GitHub Actions / vale)

[OpenSearch.HeadingCapitalization] 'Key modifiers' is a heading and should be in sentence case.

The `dissect` processor supports key modifiers that can change the dissection's default behavior. These modifiers are always placed to the left or right of the `%{keyname}` and are always enclosed within the `%{}`. For example, the `%{+keyname->}` modifier includes the append and right padding modifiers. Key modifiers are useful for cases such as combining multiple fields into a single line of output, creating formatted lists of data items, or aggregating values from multiple sources.
Collaborator

Suggested change
The `dissect` processor supports key modifiers that can change the dissection's default behavior. These modifiers are always placed to the left or right of the `%{keyname}` and are always enclosed within the `%{}`. For example, the `%{+keyname->}` modifier includes the append and right padding modifiers. Key modifiers are useful for cases such as combining multiple fields into a single line of output, creating formatted lists of data items, or aggregating values from multiple sources.
The `dissect` processor supports key modifiers that can change the default processor behavior. These modifiers are always placed to the left or right of the `%{keyname}` and are always enclosed within the `%{}`. For example, the `%{+keyname->}` modifier includes the append and right padding modifiers. Key modifiers are useful for cases such as combining multiple fields into a single line of output, creating formatted lists of data items, or aggregating values from multiple sources.


The following table lists the key modifiers for the `dissect` processor.

Modifier | Name | Position | Example | Description |
|-----------|-----------|-----------|-----------|-----------|
`->` | Skip right padding | (far) right | `%{keyname->}` | Tells `dissect` to skip over any repeated characters to the right. For example, `%{timestamp->}` could be used to tell `dissect` to skip over any padding characters, such as two spaces or any varying character padding, that follow `timestamp`. |
Collaborator

Last sentence: "two consecutive spaces"?

`+` | Append | left | `%{keyname} %{+keyname}` | Appends two or more fields together. |
`+` with `/n` | Append with order | left and right | `%{+keyname/2} %{+keyname/1}` | Appends two or more fields together in the order specified. |
`?` | Named skip key | left | `%{?skipme}` | Skips the matched value in the output. Same behavior as `%{}`. |
`*` and `&` | Reference keys | left | `%{*r1} %{&r1}` | Uses the value matched by `*` as the output key and the value matched by `&` as the output value. |

Detailed descriptions of each key modifier are in the following sections.

### Right padding modifier (`->`)

The dissection algorithm is precise and requires that every character in the pattern exactly match the source string. For instance, the pattern `%{helloworldkey} %{worldkey}` (one space) will match the string "Hello world" (one space) but not the string "Hello world" (two spaces) because pattern only has one space while the source string has two.

Check failure on line 183 in _ingest-pipelines/processors/dissect.md (GitHub Actions / vale)

[OpenSearch.SpacingWords] There should be once space between words in 'Hello world'.
Collaborator

Suggested change
The dissection algorithm is precise and requires that every character in the pattern exactly match the source string. For instance, the pattern `%{helloworldkey} %{worldkey}` (one space) will match the string "Hello world" (one space) but not the string "Hello world" (two spaces) because pattern only has one space while the source string has two.
The dissection algorithm is precise and requires that every character in the pattern exactly match the source string. For instance, the pattern `%{hellokey} %{worldkey}` (one space) will match the string "Hello world" (one space) but not the string "Hello world" (two spaces) because the pattern only has one space while the source string has two.


The right padding modifier can be used to address this issue. When the modifier is added to the pattern, as in `%{helloworldkey->} %{worldkey}`, the pattern matches `Hello world` (one space), `Hello  world` (two spaces), and even `Hello          world` (ten spaces).
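
One way to check this behavior is to simulate an inline pipeline against both strings. The following is a sketch only; the key names `hellokey` and `worldkey` and the `message` field are chosen for illustration:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{hellokey->} %{worldkey}"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "Hello world" } },
    { "_source": { "message": "Hello  world" } }
  ]
}
```
{% include copy-curl.html %}

With the `->` modifier in place, both documents should be dissected into `hellokey` and `worldkey`. Without it, the second document (two spaces) would be expected to fail.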

The right padding modifier is used to allow for the repetition of characters following a `%{keyname->}`. The right padding modifier can be applied to any key along with any other modifiers. It should always be the rightmost modifier, for example, `%{+keyname/1->}`, `%{}`.
Collaborator

Suggested change
The right padding modifier is used to allow for the repetition of characters following a `%{keyname->}`. The right padding modifier can be applied to any key along with any other modifiers. It should always be the rightmost modifier, for example, `%{+keyname/1->}`, `%{}`.
The right padding modifier is used to allow for the repetition of characters following a `%{keyname->}`. The right padding modifier can be applied to any key along with any other modifiers. It should always be the rightmost modifier, for example, `%{+keyname/1->}` or `%{}`.


#### Example

The following is an example of a right padding modifier and how it is used:

`%{name->} %{city}, %{state} %{zip}`
Collaborator

Suggested change
`%{name->} %{city}, %{state} %{zip}`
`%{city->}, %{state} %{zip}`


In this pattern, the right padding modifier `->` is applied to the `%{name}` key. The modifier does not change what the key itself captures. Instead, it tells `dissect` to skip any repeated occurrences of the delimiter that follows the key (here, a space), so the pattern still matches when `name` is followed by more than one space.

The following is an example of how the right padding would be used to extract information from the following address entries:

```bash
New York, NY 10017
New York City, NY 10017
```

Both entries contain the same kind of information, but the second entry has an extra word, `City`, in the city field. Both entries match because a key captures everything up to the next delimiter in the pattern, so a captured value can itself contain spaces. The right padding modifier additionally tolerates repeated occurrences of the delimiter that follows the key.
Collaborator

I would add an example of using the right padding modifier with an empty key (`%{->}`).

Pipeline:

```json
PUT /_ingest/pipeline/dissect-test
{
  "description": "Pipeline that dissects web server logs",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "[%{client_ip}]%{->}[%{timestamp}]"
      }
    }
  ]
}
```

Test pipeline:

```json
POST _ingest/pipeline/dissect-test/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "message": "[192.168.1.10]   [03/Nov/2023:15:20:45 +0000]"
      }
    }
  ]
}
```

Response:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "client_ip": "192.168.1.10",
          "message": "[192.168.1.10]   [03/Nov/2023:15:20:45 +0000]",
          "timestamp": "03/Nov/2023:15:20:45 +0000"
        },
        "_ingest": {
          "timestamp": "2024-01-22T22:55:42.090569297Z"
        }
      }
    }
  ]
}
```

Contributor Author

thank you for the example :)


### Append modifier (`+`)

The append modifier combines the values of two or more keys into a single output value. The values are appended from left to right. You can also specify an optional separator to be inserted between the values.
Collaborator

Suggested change
The append modifier combines the values of two or more keys into a single output value. The values are appended from left to right. You can also specify an optional separator to be inserted between the values.
The append modifier combines the values of two or more values into a single output value. The values are appended from left to right. You can also specify an optional separator to be inserted between the values.


#### Example

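The following pattern matches two space-separated values and, because both keys use the `+` modifier with the same key name, appends them into a single `key1` field:

`%{+key1} %{+key1}`

For the input `value1 value2`, with `append_separator` set to a single space, the output is:

`value1 value2`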

You can also specify a custom separator using the `append_separator` parameter. For example, the following pattern uses a comma as the separator:
Collaborator

This syntax is incorrect. Remove this part.


`%{+key1} %{+key1}`, with `"append_separator": ", "` set in the processor definition (the separator is a processor parameter, not part of the pattern)

The output is:

`value1, value2`
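
Because `append_separator` is a processor parameter, a complete request makes the relationship clearer. The following sketch simulates an inline pipeline; the key name `name` and the `message` field are assumptions for illustration:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{+name} %{+name}",
          "append_separator": ", "
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "value1 value2" } }
  ]
}
```
{% include copy-curl.html %}

Both keys share the name `name` and carry the `+` modifier, so the two matched values should be joined into one `name` field with `, ` between them.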

### Append with order modifier (`+` and `/n`)

The append with order modifier combines the values of two or more keys into a single output value, in the order specified by the number `n` that follows the `/` in each key (for example, `%{+keyname/1}`). You can customize the separator between the appended values by using the `append_separator` parameter. The append with order modifier is useful for compiling multiple fields into a single formatted output line, constructing structured lists of data items, and consolidating values from various sources.

#### Example

The following pattern matches two space-separated values and appends them into a single `key1` field in reverse order: the value matched second is placed first (`/1`), and the value matched first is placed second (`/2`):

`%{+key1/2} %{+key1/1}`

For the input `value1 value2`, with `append_separator` set to a single space, the output is:

```bash
value2 value1
```

You can also specify an alternative separator using the `append_separator` parameter in the processor definition. For example, with `"append_separator": ", "` and the same pattern, the output is:

```bash
value2, value1
```
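
The ordering behavior can be tried with an inline simulated pipeline. This is a sketch under assumptions: the key name `name` and the `message` field are illustrative only:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{+name/2} %{+name/1}",
          "append_separator": " "
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "value1 value2" } }
  ]
}
```
{% include copy-curl.html %}

The first matched value is assigned position `2` and the second position `1`, so the resulting `name` field should list the second value before the first.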

### Named skip key (`?`)

The named skip key modifier excludes specific matches from the final output by using a key with the `?` prefix, such as `%{?skipme}`, within the pattern. It behaves the same as the empty key, `%{}`, but lets you give the skipped value a descriptive name. The named skip key modifier is useful for excluding irrelevant or unnecessary fields from the output, focusing on specific information, or streamlining the output for further processing or analysis.

#### Example

The following pattern excludes a field (in this case, `ignore`) from the output. Assigning a descriptive name to the skipped key, for example, `%{?ignore}`, clarifies that the corresponding value should be excluded from the final result:

`%{firstName} %{lastName} %{?ignore}`
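
Empty and named skip keys can be combined in one pattern. The following sketch simulates an inline pipeline that skips the two `-` placeholders in an access log prefix; the skip key name `user` is an assumption chosen for readability:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{client_ip} %{} %{?user} [%{timestamp}]"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "192.168.1.10 - - [03/Nov/2023:15:20:45 +0000]" } }
  ]
}
```
{% include copy-curl.html %}

Only `client_ip` and `timestamp` should be added to the document. The values matched by `%{}` and `%{?user}` are consumed by the pattern but not stored.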

### Reference keys (`*` and `&`)

Reference keys use parsed values as key/value pairings for structured content. This can be useful when handling systems that partially log data in key/value pairs. By using reference keys, you can preserve the key/value relationship and maintain the integrity of the extracted information.

#### Example

The following pattern extracts data into a structured format, with `%{*reference_key}` capturing the text to use as the field name and `%{&reference_key}` capturing the text to use as that field's value:

`%{*reference_key} %{&reference_key}`

For example, for the inputs `key1 value1`, `key2 value2`, and `key3 value3`, the output is:

```bash
key1: value1
key2: value2
key3: value3
```
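
For log lines that embed key/value pairs, a request along the following lines may help illustrate reference keys. It is a sketch only: the reference names `k1` and `k2`, the `:` delimiter, and the sample message are assumptions for illustration:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{*k1}:%{&k1} %{*k2}:%{&k2}"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "client_ip:192.168.1.10 status:200" } }
  ]
}
```
{% include copy-curl.html %}

Each `*`/`&` pair shares a reference name, so the text matched by `%{*k1}` should become a field name and the text matched by `%{&k1}` its value, and likewise for `k2`.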