Add metadata fields for mappings (content gap initiative) #6933

Merged
merged 176 commits into from
Aug 29, 2024
Changes from 14 commits
Commits
a7d4567
Add mapping as part of content gap initiative
vagimeli Apr 9, 2024
7b9aa58
Update mapping as part of content gap initiative
vagimeli Apr 9, 2024
3ad5c25
Add content to address feedback
vagimeli Apr 10, 2024
653d2f4
Merge branch 'main' into mapping-content-gap
vagimeli Apr 10, 2024
48d3a6e
Add delete mapping section
vagimeli Apr 15, 2024
8722b15
Update index.md
vagimeli Apr 15, 2024
b2dfc7a
Add metadata fields index page
vagimeli Apr 15, 2024
cf4a80c
Add setting descriptions
vagimeli Apr 17, 2024
061cc80
Delete remove mappings type section
vagimeli Apr 17, 2024
a047b4c
Add individual metadata field docs
vagimeli Apr 17, 2024
0b4bf5e
Added documentation for field names and ignored
vagimeli Apr 17, 2024
bd3b85c
Add id field doc
vagimeli Apr 17, 2024
d6cadd1
Add field docs
vagimeli Apr 17, 2024
4b6f1d1
Add new docs
vagimeli Apr 18, 2024
57be501
Merge branch 'main' into mapping-content-gap
vagimeli Apr 29, 2024
a9f6f4f
Add default and allowed values
vagimeli Apr 29, 2024
5b7ab8b
Add default and allowed values
vagimeli Apr 29, 2024
2b4db59
Update field-names.md
vagimeli Apr 29, 2024
edd60c6
Update index-metadata.md
vagimeli Apr 29, 2024
a593836
Update index-metadata.md
vagimeli Apr 29, 2024
0e41a24
Update meta.md
vagimeli Apr 29, 2024
6c0c5c8
Update meta.md
vagimeli Apr 29, 2024
5e11c4a
Update index-metadata.md
vagimeli Apr 29, 2024
3dcfd50
Update meta.md
vagimeli Apr 29, 2024
9fcc2eb
Update index-metadata.md
vagimeli Apr 29, 2024
3d30dfb
Update routing.md
vagimeli Apr 29, 2024
56991f3
Update routing.md
vagimeli Apr 29, 2024
a7f39c0
Update source.md
vagimeli Apr 29, 2024
c286a28
Merge branch 'main' into mapping-content-gap
vagimeli May 8, 2024
078c49f
Update index.md
vagimeli May 8, 2024
02cc9dd
Merge branch 'main' into mapping-content-gap
vagimeli May 14, 2024
aa6ac4d
Merge branch 'main' into mapping-content-gap
vagimeli May 14, 2024
415ef50
Update _field-types/index.md
vagimeli May 15, 2024
11e1a1c
Update _field-types/metadata-fields/id.md
vagimeli May 15, 2024
ffd96dc
Add dynamic templates section and examples
vagimeli Jun 3, 2024
5c64fd6
Add dynamic templates code snippet
vagimeli Jun 3, 2024
8e7a288
Update _field-types/metadata-fields/source.md
vagimeli Jun 26, 2024
459f3d0
Update _field-types/index.md
vagimeli Jun 26, 2024
0c451a5
Update _field-types/index.md
vagimeli Jun 26, 2024
8eefb2b
Update _field-types/index.md
vagimeli Jun 26, 2024
709d4b3
Update _field-types/index.md
vagimeli Jun 26, 2024
ced8826
Update _field-types/index.md
vagimeli Jun 26, 2024
23943c8
Update _field-types/index.md
vagimeli Jun 26, 2024
91d0f18
Update _field-types/metadata-fields/field-names.md
vagimeli Jun 26, 2024
8485d22
Update field-names.md
vagimeli Jun 26, 2024
a5ccbd9
Update _field-types/metadata-fields/id.md
vagimeli Jun 26, 2024
d5a6655
Merge branch 'main' into mapping-content-gap
vagimeli Jun 26, 2024
f478040
Update _field-types/index.md
vagimeli Jul 11, 2024
f3337a6
Update _field-types/index.md
vagimeli Jul 11, 2024
e21c9af
Writing in progress
vagimeli Jul 16, 2024
f2516c9
Update _field-types/metadata-fields/id.md
vagimeli Jul 17, 2024
365b145
Update _field-types/metadata-fields/field-names.md
vagimeli Jul 17, 2024
2292abc
Update _field-types/metadata-fields/field-names.md
vagimeli Jul 17, 2024
a114b6e
Update _field-types/metadata-fields/field-names.md
vagimeli Jul 17, 2024
47a1bf9
Update _field-types/metadata-fields/index.md
vagimeli Jul 17, 2024
00fc9c1
Update ignored.md
vagimeli Jul 17, 2024
ae99ef0
Merge branch 'main' into mapping-content-gap
vagimeli Jul 17, 2024
61a9f6b
Update index.md
vagimeli Aug 28, 2024
40d9b43
Update index.md
vagimeli Aug 28, 2024
284b73e
Merge branch 'main' into mapping-content-gap
vagimeli Aug 28, 2024
c7d9585
Update _field-types/mappings-use-cases.md
vagimeli Aug 28, 2024
dc2792e
Update mappings-use-cases.md
vagimeli Aug 28, 2024
83bfc25
Update field-names.md
vagimeli Aug 28, 2024
4389093
Update id.md
vagimeli Aug 28, 2024
02af070
Update ignored.md
vagimeli Aug 28, 2024
f83fddf
Update index-metadata.md
vagimeli Aug 28, 2024
9c819eb
Update index.md
vagimeli Aug 28, 2024
20bdec2
Update meta.md
vagimeli Aug 28, 2024
e38ec49
Update _field-types/metadata-fields/routing.md
vagimeli Aug 28, 2024
8f22a9c
Update _field-types/metadata-fields/routing.md
vagimeli Aug 28, 2024
c1a2315
Update _field-types/metadata-fields/meta.md
vagimeli Aug 28, 2024
bd8ac77
Update _field-types/metadata-fields/source.md
vagimeli Aug 28, 2024
cf4222a
Update _field-types/index.md
vagimeli Aug 28, 2024
243357a
Update _field-types/metadata-fields/field-names.md
vagimeli Aug 28, 2024
375488b
Update source.md
vagimeli Aug 28, 2024
74b82c3
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 28, 2024
1262bae
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 28, 2024
2e2c00b
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 28, 2024
1d622df
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 28, 2024
022eb23
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 28, 2024
c23b805
Update _field-types/metadata-fields/id.md
vagimeli Aug 28, 2024
9f7694e
Update id.md
vagimeli Aug 28, 2024
97cb348
Update _field-types/index.md
vagimeli Aug 28, 2024
4524c1a
Update _field-types/index.md
vagimeli Aug 28, 2024
27c40c8
Update _field-types/index.md
vagimeli Aug 28, 2024
ff7c843
Merge branch 'main' into mapping-content-gap
vagimeli Aug 28, 2024
c33c047
Update _field-types/metadata-fields/index.md
vagimeli Aug 28, 2024
596fa72
Update _field-types/metadata-fields/id.md
vagimeli Aug 28, 2024
57cbe5e
Update field-names.md
vagimeli Aug 28, 2024
3d9f7ba
Address doc review comments
vagimeli Aug 28, 2024
60ecace
Address doc review comments
vagimeli Aug 28, 2024
30313fe
Update _field-types/index.md
vagimeli Aug 29, 2024
4f0b8e4
Update _field-types/index.md
vagimeli Aug 29, 2024
fc62bcd
Update _field-types/index.md
vagimeli Aug 29, 2024
3cb42e1
Update _field-types/index.md
vagimeli Aug 29, 2024
53354fe
Update _field-types/index.md
vagimeli Aug 29, 2024
cfc9997
Update _field-types/index.md
vagimeli Aug 29, 2024
31597db
Update _field-types/index.md
vagimeli Aug 29, 2024
95e7ef9
Update _field-types/index.md
vagimeli Aug 29, 2024
d67b974
Update _field-types/index.md
vagimeli Aug 29, 2024
5ad15d9
Update _field-types/index.md
vagimeli Aug 29, 2024
6a30faf
Update _field-types/index.md
vagimeli Aug 29, 2024
28a2923
Update _field-types/index.md
vagimeli Aug 29, 2024
c25d86b
Update _field-types/index.md
vagimeli Aug 29, 2024
01131a4
Update _field-types/index.md
vagimeli Aug 29, 2024
a49b1ac
Update _field-types/index.md
vagimeli Aug 29, 2024
e2d2d0d
Update _field-types/index.md
vagimeli Aug 29, 2024
bfa2952
Update _field-types/index.md
vagimeli Aug 29, 2024
9744ab7
Update _field-types/index.md
vagimeli Aug 29, 2024
4898e09
Update _field-types/index.md
vagimeli Aug 29, 2024
c44a46a
Update _field-types/index.md
vagimeli Aug 29, 2024
e1cef5f
Update _field-types/index.md
vagimeli Aug 29, 2024
7eaefbc
Update _field-types/index.md
vagimeli Aug 29, 2024
7f268dc
Update _field-types/mappings-use-cases.md
vagimeli Aug 29, 2024
71e6ed4
Update _field-types/mappings-use-cases.md
vagimeli Aug 29, 2024
e49c1d3
Update _field-types/mappings-use-cases.md
vagimeli Aug 29, 2024
a708be7
Update _field-types/mappings-use-cases.md
vagimeli Aug 29, 2024
5ee0a4d
Update _field-types/mappings-use-cases.md
vagimeli Aug 29, 2024
8f2c11e
Update _field-types/metadata-fields/field-names.md
vagimeli Aug 29, 2024
e5bc821
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
1c9328d
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
df556d7
Update _field-types/metadata-fields/field-names.md
vagimeli Aug 29, 2024
e1876b6
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
4202902
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
b970b51
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
2aefa3c
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
381f3c9
Update _field-types/metadata-fields/id.md
vagimeli Aug 29, 2024
2b6ada7
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 29, 2024
88d3c1e
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 29, 2024
46c0639
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 29, 2024
9087d12
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 29, 2024
f1a195d
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 29, 2024
e53ec2a
Update _field-types/metadata-fields/ignored.md
vagimeli Aug 29, 2024
e90eb17
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 29, 2024
b700ff7
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 29, 2024
670328f
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 29, 2024
b9ccc47
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 29, 2024
116cecb
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 29, 2024
ce89a83
Update _field-types/metadata-fields/index-metadata.md
vagimeli Aug 29, 2024
7bc5bfd
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
7b051f5
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
7213aa9
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
581b118
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
916a0dc
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
9d8a1e7
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
09e64a5
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
234d6a0
Update _field-types/metadata-fields/index.md
vagimeli Aug 29, 2024
230c830
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
5b24526
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
f360baf
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
c893666
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
5802589
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
3552222
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
ae73ae8
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
d2cb1b5
Update _field-types/metadata-fields/meta.md
vagimeli Aug 29, 2024
b0f0b92
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
cd6b1aa
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
2e8ff8b
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
a5074df
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
9654a1b
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
097f3ef
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
5e263e3
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
421e97f
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
c834b5a
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
044fb6c
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
ba7b39b
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
dca4c32
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
92a949e
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
e3d8d0b
Update _field-types/metadata-fields/source.md
vagimeli Aug 29, 2024
5573bc9
Update _field-types/metadata-fields/source.md
vagimeli Aug 29, 2024
eac468b
Update _field-types/metadata-fields/source.md
vagimeli Aug 29, 2024
1037b6e
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
215a3b7
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
0cb56be
Update routing.md
vagimeli Aug 29, 2024
219b1e3
Update _field-types/metadata-fields/routing.md
vagimeli Aug 29, 2024
2dc0e2a
Merge branch 'main' into mapping-content-gap
vagimeli Aug 29, 2024
76 changes: 65 additions & 11 deletions _field-types/index.md
@@ -14,23 +14,17 @@

You can define how documents and their fields are stored and indexed by creating a _mapping_. The mapping specifies the list of fields for a document. Every field in the document has a _field type_, which defines the type of data the field contains. For example, you may want to specify that the `year` field should be of type `date`. To learn more, see [Supported field types]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/index/).

If you're just starting to build out your cluster and data, you may not know exactly how your data should be stored. In those cases, you can use dynamic mappings, which tell OpenSearch to dynamically add data and its fields. However, if you know exactly what types your data falls under and want to enforce that standard, then you can use explicit mappings.
If you're starting to build out your cluster and data, you may not know exactly how your data should be stored. In those cases, you can use dynamic mappings, which tell OpenSearch to dynamically add data and its fields. However, if you know exactly what types your data falls under and want to enforce that standard, then you can use explicit mappings.
Member

There are certain caveats of using the dynamic mappings (e.g. performance impact). I believe we should highlight the same and recommend to use explicit mappings

Contributor Author

Revised


For example, if you want to indicate that `year` should be of type `text` instead of an `integer`, and `age` should be an `integer`, you can do so with explicit mappings. By using dynamic mapping, OpenSearch might interpret both `year` and `age` as integers.
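
As a minimal sketch of the explicit mapping described above, the following request creates an index in which `year` is mapped as `text` and `age` as `integer`. The index name `sample-index1` is used here only for illustration:

```json
PUT sample-index1
{
  "mappings": {
    "properties": {
      "year": { "type": "text" },
      "age": { "type": "integer" }
    }
  }
}
```
{% include copy-curl.html %}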

This section provides an example for how to create an index mapping and how to add a document to it that will get ip_range validated.
This documentation provides an example of how to create an index mapping and how to add a document to it whose value is validated against the `ip_range` field type.

#### Table of contents
1. TOC
{:toc}


---
## Dynamic mapping

When you index a document, OpenSearch adds fields automatically with dynamic mapping. You can also explicitly add fields to an index mapping.
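
To illustrate dynamic mapping, the following sketch indexes a document into an index that does not yet exist and then retrieves the mapping that OpenSearch generated. The index name `sample-dynamic-index` and the field values are assumptions chosen for illustration. With default settings, a whole number such as `2024` is typically mapped as `long`, and a string is typically mapped as `text` with a `keyword` subfield:

```json
PUT sample-dynamic-index/_doc/1
{
  "year": 2024,
  "title": "Dynamic mapping example"
}

GET sample-dynamic-index/_mapping
```
{% include copy-curl.html %}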

#### Dynamic mapping types
### Dynamic mapping types

Type | Description
:--- | :---
@@ -63,7 +57,7 @@
}
```

### Response
#### Response
```json
{
"acknowledged": true,
@@ -88,6 +82,42 @@
You cannot change the mapping of an existing field; you can only modify the field's mapping parameters.
{: .note}

## Mapping parameters

Mapping parameters are used to configure the behavior of fields in an index. The following table lists commonly used mapping parameters.

Parameter | Description
:--- | :---
`analyzer` | Specifies the analyzer used to analyze string fields.
`boost` | Specifies a field-level boost that is applied at query time.
`coerce` | Tries to convert the value to the specified data type.
`copy_to` | Copies the values of this field to another field.
`doc_values` | Specifies whether the field should be stored on disk to make sorting and aggregation faster.
`dynamic` | Determines whether new fields should be added dynamically.
`enabled` | Specifies whether the field is enabled or disabled.
`format` | Specifies the date format for date fields.
`ignore_above` | Skips indexing values that are longer than the specified length.
`ignore_malformed` | Specifies whether malformed values should be ignored.
`index` | Specifies whether the field should be indexed.
`index_options` | Specifies what information should be stored in the index for scoring purposes.
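
The following sketch shows how several of the parameters in the preceding table might be combined in one mapping. The index name `sample-index2` and all field names are hypothetical, and the values (for example, `256` for `ignore_above`) are illustrative choices rather than recommendations:

```json
PUT sample-index2
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard",
        "copy_to": "all_text"
      },
      "all_text": {
        "type": "text"
      },
      "tag": {
        "type": "keyword",
        "ignore_above": 256
      },
      "created_at": {
        "type": "date",
        "format": "yyyy-MM-dd"
      },
      "length": {
        "type": "long",
        "ignore_malformed": true
      }
    }
  }
}
```
{% include copy-curl.html %}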

## Mapping limit settings

OpenSearch has certain limits or settings related to mappings, such as the settings listed in the following table. Settings can be configured based on your requirements.

| Setting | Default value | Allowed value | Type | Description |
|-|-|-|-|-|
| index.mapping.nested_fields.limit | 50 | [0,) | Dynamic | Limits the maximum number of nested fields that can be defined in an index mapping. |
| index.mapping.nested_objects.limit | 10000 | [0,) | Dynamic | Limits the maximum number of nested objects that can be created within a single document. |
| index.mapping.total_fields.limit | 1000 | [0,) | Dynamic | Limits the maximum number of fields that can be defined in an index mapping. |
| index.mapping.depth.limit | 20 | [1,100] | Dynamic | Limits the maximum depth of nested objects and nested fields that can be defined in an index mapping. |
| index.mapping.field_name_length.limit | 50000 | [1,50000] | Dynamic | Limits the maximum length of field names that can be defined in an index mapping. |
| index.mapper.dynamic | true | {true,false} | Dynamic | Determines whether new fields should be added dynamically to the mapping when they are encountered in a document. |
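
Because the table lists these settings as dynamic, they can presumably be changed on an existing index through the update settings API. The following is a minimal sketch that raises the total field limit; the index name `sample-index1` and the value `2000` are assumptions for illustration:

```json
PUT sample-index1/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```
{% include copy-curl.html %}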

Check failure on line 115 in _field-types/index.md (GitHub Actions / vale):

[OpenSearch.SpacingPunctuation] There should be no space before and one space after the punctuation mark in 'true,false'.

## Runtime fields

You can define fields at query time, rather than at index time, by using runtime fields. This can be useful for creating fields based on the values of other fields, or for performing transformations on data during the query process. Runtime fields are defined in the query itself and do not affect the underlying data in the index.

Member

This is not yet available on OpenSearch
#6943

Contributor Author

Removed

---
## Mapping example usage

@@ -171,7 +201,7 @@
GET <index>/_mapping
```

In the above request, `<index>` may be an index name or a comma-separated list of index names.
In the previous request, `<index>` may be an index name or a comma-separated list of index names.

To get all mappings for all indexes, use the following request:

@@ -220,3 +250,27 @@
}
}
```

## Delete a mapping
Member

I don't think we support DELETE verb on mappings.

Contributor Author

Removed text


The syntax for deleting a mapping depends on whether you want to delete the entire mapping for an index or the mapping for a specific field. The syntax for deleting a mapping is as follows:

```json
DELETE /<index_name>/_mapping
DELETE /<field_name/_mapping>
```

For example, to delete the entire mapping for the `sample-index1` index, you can use the following commands:

```json
<insert command>
```

If you want to delete the mapping for a specific field, you can <insert instructional text> For example, to delete the mapping for the `year` field, use the following command:

```json
<insert command>
```

Deleting a field mapping will remove the mapping definition for that field across all indexes or the specified index. It will not delete the actual data stored in those fields.
{: .note}
18 changes: 18 additions & 0 deletions _field-types/metadata-fields/field-names.md
@@ -0,0 +1,18 @@
---
layout: default
title: Field names
nav_order: 10
has_children: false
parent: Metadata fields
---

# Field names

The `field_names` field indexes the names of fields within a document that contain non-null values. This field supports the `exists` query, which identifies documents with or without non-null values for a specified field.
Contributor Author

@mgodwan Please review this narrative for technical accuracy.

Member

This should be _field_names. I'll share an example as well on the same

Member

Also, term queries on this metadata field are deprecated.

Contributor Author

Revised.

Contributor Author

@vagimeli vagimeli Jun 26, 2024

Suggested change
The `field_names` field indexes the names of fields within a document that contain non-null values. This field supports the `exists` query, which identifies documents with or without non-null values for a specified field.
The `_field_names` setting allows you to enable or disable a metadata field called `_field_names`. This field indexes the names of fields within a document that have non-null values, enabling querying based on the existence or non-existence of specific fields. By default, the `_field_names` setting is enabled for new indexes. However, you can explicitly configure it in the mappings of an index by specifying the `_field_names: { enabled: true }` setting. You can disable this feature by setting `enabled: false` or omitting the `_field_names` configuration entirely.


The `field_names` field indexes field names only when both `doc_values` and `norms` are disabled for those fields. If either `doc_values` or `norms` is enabled, the `exists` query remains functional but does not rely on `field_names`.
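
As a minimal sketch of the `exists` query mentioned above, the following request returns documents that contain a non-null value for a given field. The index name `test-index1` and the field name `text` are assumptions used only for illustration:

```json
GET test-index1/_search
{
  "query": {
    "exists": {
      "field": "text"
    }
  }
}
```
{% include copy-curl.html %}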

## Mapping example

<SME: Please provide a mapping example.>

Check warning on line 17 in _field-types/metadata-fields/field-names.md (GitHub Actions / vale):

[OpenSearch.Please] Using 'Please' is unnecessary. Remove.

84 changes: 84 additions & 0 deletions _field-types/metadata-fields/id.md
@@ -0,0 +1,84 @@
---
layout: default
title: ID
nav_order: 20
has_children: false
parent: Metadata fields
---

# ID

Each document has an `_id` field that uniquely identifies it. This field is indexed, allowing documents to be retrieved either through the `GET` API or the [`ids` query]({{site.url}}{{site.baseurl}}/query-dsl/term/ids/).

Collaborator

Line above: Is "GET" intentionally in code font?

The following example creates an index named `test-index1` and adds two documents with different `_id` values:

```json
PUT test-index1/_doc/1
{
"text": "Document with ID 1"
}

PUT test-index1/_doc/2?refresh=true
{
"text": "Document with ID 2"
}
```
{% include copy-curl.html %}

Now, you can query the documents using the `_id` field:

```json
GET test-index1/_search
{
"query": {
"terms": {
"_id": ["1", "2"]
}
}
}
```
{% include copy-curl.html %}

The following response shows that this query returns both documents with `_id` values of `1` and `2`.

```json
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test-index1",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "Document with ID 1"
        }
      },
      {
        "_index": "test-index1",
        "_id": "2",
        "_score": 1,
        "_source": {
          "text": "Document with ID 2"
        }
      }
    ]
  }
}
```
{% include copy-curl.html %}

## Querying on the `_id` field

While the `_id` field is accessible in various queries, it is restricted from use in aggregations, sorting, and scripting. If you need to sort or aggregate on the `_id` field, it is recommended to duplicate the content of the `_id` field into another field that has `doc_values` enabled.
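
The workaround described above can be sketched as follows. This example assumes a hypothetical index `test-index2` and a hypothetical `keyword` field named `doc_id` (the `keyword` type has `doc_values` enabled by default); the application is responsible for writing the same value to `doc_id` that it uses as the document ID:

```json
PUT test-index2
{
  "mappings": {
    "properties": {
      "doc_id": { "type": "keyword" },
      "text": { "type": "text" }
    }
  }
}

PUT test-index2/_doc/1?refresh=true
{
  "doc_id": "1",
  "text": "Document with ID 1"
}

GET test-index2/_search
{
  "sort": [
    { "doc_id": { "order": "asc" } }
  ]
}
```
{% include copy-curl.html %}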
Member

@sandeshkr419 Would you be able to confirm this from search perspective?

Hey @mgodwan missed the notification for this one somehow.

The explanation seems correct.

@vagimeli I'm wondering if we should link the usage example of querying over _id field as well: https://opensearch.org/docs/2.1/opensearch/query-dsl/term/#ids

Something like:

While the _id field is accessible in various queries (see usage), it is restricted from use in aggregations, sorting, and scripting. If you need to sort or aggregate on the _id field, it is recommended to duplicate the content of the _id field into another field that has doc_values enabled.

Contributor Author

Revised.

69 changes: 69 additions & 0 deletions _field-types/metadata-fields/ignored.md
@@ -0,0 +1,69 @@
---
layout: default
title: Ignored
nav_order: 15
has_children: false
parent: Metadata fields
---

# Ignored

The `_ignored` field indexes and stores the names of fields within a document that were ignored during the indexing process because they were malformed. This functionality is enabled when the `ignore_malformed` setting is turned on in the [index mapping]({{site.url}}{{site.baseurl}}/field-types/#mapping-example-usage).

The `_ignored` field allows you to search for and identify documents that contain ignored fields, as well as the specific field names that were ignored. This can be useful for troubleshooting and understanding issues related to malformed data in your documents.

You can query the `_ignored` field using `term`, `terms`, and `exists` queries. Matching documents are returned in the search hits.

The `_ignored` field is only populated when the `ignore_malformed` setting is enabled in your index mapping. If `ignore_malformed` is set to `false` (the default value), malformed fields will cause the entire document to be rejected, and the `_ignored` field will not be populated.
{: .note}

For example, the following query will retrieve all documents that have at least one field that was ignored during indexing:
Member

Can we add indexing example as well for this?

PUT test-ignored
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "length": {
        "type": "long",
        "ignore_malformed": true
      }
    }
  }
}

POST test-ignored/_doc
{
  "title": "correct text",
  "length": "not a number"
}

GET test-ignored/_search
{
  "query": {
    "exists": {
      "field": "_ignored"
    }
  }
}


```json
GET _search
{
"query": {
"exists": {
"field": "_ignored"
}
}
}
```
{% include copy-curl.html %}

Similarly, you can use a `term` query to find documents in which a specific field, such as `created_at`, was ignored:

```json
GET _search
{
"query": {
"term": {
"_ignored": "created_at"
}
}
}
```
{% include copy-curl.html %}

#### Response

```json
{
"took": 51,
"timed_out": false,
"_shards": {
"total": 45,
"successful": 45,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
```
74 changes: 74 additions & 0 deletions _field-types/metadata-fields/index-metadata.md
@@ -0,0 +1,74 @@
---
layout: default
title: Index
nav_order: 25
has_children: false
parent: Metadata fields
---

# Index

When querying across multiple indexes, you may need to filter results based on the index into which a document was indexed. The `_index` field matches documents based on their index.

The following example creates two indexes, `products` and `customers`, and adds a document to each index:

```json
PUT products/_doc/1
{
"name": "Widget X"
}

PUT customers/_doc/2?refresh=true
Member

We can remove refresh=true param?

{
"name": "John Doe"
}
```
{% include copy-curl.html %}

Now, you can query both indexes and filter the results using the `_index` field:

```json
GET products,customers/_search
{
"query": {
"terms": {
"_index": ["products", "customers"]
}
},
"aggs": {
"index_groups": {
"terms": {
"field": "_index",
"size": 10
}
}
},
"sort": [
{
"_index": {
"order": "desc"
}
}
],
"script_fields": {
"index_name": {
"script": {
"lang": "painless",
"source": "doc['_index'].value"
}
}
}
}
```
{% include copy-curl.html %}

In this example:

- The `query` section uses a `terms` query to match documents from the `products` and `customers` indexes.
- The `aggs` section performs a `terms` aggregation on the `_index` field, grouping the results by index.
- The `sort` section sorts the results by the `_index` field in descending order.
- The `script_fields` section adds a new field `index_name` to the search results that contains the value of the `_index` field for each document.
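
For the simpler case of filtering results to a single index, a `term` query on the `_index` field can be sketched as follows. This example reuses the `products` and `customers` indexes from the preceding request and is an illustration rather than part of the original example:

```json
GET products,customers/_search
{
  "query": {
    "term": {
      "_index": "products"
    }
  }
}
```
{% include copy-curl.html %}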

## Querying on the `_index` field

<SME: Provide the information users need to understand how querying on the `_index` field works.>

21 changes: 21 additions & 0 deletions _field-types/metadata-fields/index.md
@@ -0,0 +1,21 @@
---
layout: default
title: Metadata fields
nav_order: 90
has_children: true
has_toc: false
---

# Metadata fields

OpenSearch provides built-in metadata fields that describe the documents in an index. You can retrieve these fields or use them in queries.

Metadata fields | Description
:--- | :---
`_field_names` | The fields within the document that hold non-empty or non-null values.
`_ignored` | The fields in the document that were ignored during indexing because they contained malformed data, as controlled by the `ignore_malformed` setting.
`_id` | The unique identifier assigned to each document.
`_index` | The index in which the document is stored.
`_meta` | Stores custom metadata or additional information specific to the application or use case.
`_routing` | A custom value that determines the shard to which the document is assigned within the OpenSearch cluster.
`_source` | Contains the original JSON representation of the document's data.
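
As one illustration of using a metadata field in a query, the following sketch looks up a document by its `_id` using an `ids` query. The `products` index and the document ID `1` are assumptions made for this example and are not defined on this page:

```json
GET products/_search
{
  "query": {
    "ids": {
      "values": ["1"]
    }
  }
}
```
{% include copy-curl.html %}

If a matching document exists, each hit in the response should include metadata fields such as `_index`, `_id`, and `_source` alongside the search score.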
8 changes: 8 additions & 0 deletions _field-types/metadata-fields/meta.md
@@ -0,0 +1,8 @@
---
layout: default
title: Meta
nav_order: 30
has_children: false
parent: Metadata fields
---
