From 39de58ba6b6e6bc6aaa78588a9488fb9d27af152 Mon Sep 17 00:00:00 2001
From: "opensearch-trigger-bot[bot]" <98922864+opensearch-trigger-bot[bot]@users.noreply.github.com>
Date: Tue, 24 Sep 2024 15:28:01 +0000
Subject: [PATCH] Add has_child query (#8354) (#8360)
---
_field-types/supported-field-types/join.md | 2 +-
_query-dsl/geo-and-xy/geo-bounding-box.md | 6 +-
_query-dsl/geo-and-xy/geodistance.md | 6 +-
_query-dsl/geo-and-xy/geopolygon.md | 6 +-
_query-dsl/geo-and-xy/geoshape.md | 6 +-
_query-dsl/joining/has-child.md | 259 +++++++++++++++++++++
_query-dsl/joining/index.md | 5 +-
7 files changed, 275 insertions(+), 15 deletions(-)
create mode 100644 _query-dsl/joining/has-child.md
diff --git a/_field-types/supported-field-types/join.md b/_field-types/supported-field-types/join.md
index c707c66774..1c5b0d1322 100644
--- a/_field-types/supported-field-types/join.md
+++ b/_field-types/supported-field-types/join.md
@@ -61,7 +61,7 @@ PUT testindex1/_doc/1
```
{% include copy-curl.html %}
-When indexing child documents, you have to specify the `routing` query parameter because parent and child documents in the same relation have to be indexed on the same shard. Each child document refers to its parent's ID in the `parent` field.
+When indexing child documents, you need to specify the `routing` query parameter because parent and child documents in the same parent/child hierarchy must be indexed on the same shard. For more information, see [Routing]({{site.url}}{{site.baseurl}}/field-types/metadata-fields/routing/). Each child document refers to its parent's ID in the `parent` field.
Index two child documents, one for each parent:
diff --git a/_query-dsl/geo-and-xy/geo-bounding-box.md b/_query-dsl/geo-and-xy/geo-bounding-box.md
index 1112a4278e..66fcc224d6 100644
--- a/_query-dsl/geo-and-xy/geo-bounding-box.md
+++ b/_query-dsl/geo-and-xy/geo-bounding-box.md
@@ -173,11 +173,11 @@ GET testindex1/_search
```
{% include copy-curl.html %}
-## Request fields
+## Parameters
-Geo-bounding box queries accept the following fields.
+Geo-bounding box queries accept the following parameters.
-Field | Data type | Description
+Parameter | Data type | Description
:--- | :--- | :---
`_name` | String | The name of the filter. Optional.
`validation_method` | String | The validation method. Valid values are `IGNORE_MALFORMED` (accept geopoints with invalid coordinates), `COERCE` (try to coerce coordinates to valid values), and `STRICT` (return an error when coordinates are invalid). Default is `STRICT`.
diff --git a/_query-dsl/geo-and-xy/geodistance.md b/_query-dsl/geo-and-xy/geodistance.md
index b272cad81e..3eef58bc69 100644
--- a/_query-dsl/geo-and-xy/geodistance.md
+++ b/_query-dsl/geo-and-xy/geodistance.md
@@ -103,11 +103,11 @@ The response contains the matching document:
}
```
-## Request fields
+## Parameters
-Geodistance queries accept the following fields.
+Geodistance queries accept the following parameters.
-Field | Data type | Description
+Parameter | Data type | Description
:--- | :--- | :---
`_name` | String | The name of the filter. Optional.
`distance` | String | The distance within which to match the points. This distance is the radius of a circle centered at the specified point. For supported distance units, see [Distance units]({{site.url}}{{site.baseurl}}/api-reference/common-parameters/#distance-units). Required.
diff --git a/_query-dsl/geo-and-xy/geopolygon.md b/_query-dsl/geo-and-xy/geopolygon.md
index 980a0c5a63..810e48f2b7 100644
--- a/_query-dsl/geo-and-xy/geopolygon.md
+++ b/_query-dsl/geo-and-xy/geopolygon.md
@@ -161,11 +161,11 @@ However, if you specify the vertices in the following order:
The response returns no results.
-## Request fields
+## Parameters
-Geopolygon queries accept the following fields.
+Geopolygon queries accept the following parameters.
-Field | Data type | Description
+Parameter | Data type | Description
:--- | :--- | :---
`_name` | String | The name of the filter. Optional.
`validation_method` | String | The validation method. Valid values are `IGNORE_MALFORMED` (accept geopoints with invalid coordinates), `COERCE` (try to coerce coordinates to valid values), and `STRICT` (return an error when coordinates are invalid). Optional. Default is `STRICT`.
diff --git a/_query-dsl/geo-and-xy/geoshape.md b/_query-dsl/geo-and-xy/geoshape.md
index 8acc691c3a..5b144b06d6 100644
--- a/_query-dsl/geo-and-xy/geoshape.md
+++ b/_query-dsl/geo-and-xy/geoshape.md
@@ -721,10 +721,10 @@ The response returns document 1:
Note that when you indexed the geopoints, you specified their coordinates in `"latitude, longitude"` format. When you search for matching documents, the coordinate array is in `[longitude, latitude]` format. Thus, document 1 is returned in the results but document 2 is not.
-## Request fields
+## Parameters
-Geoshape queries accept the following fields.
+Geoshape queries accept the following parameters.
-Field | Data type | Description
+Parameter | Data type | Description
:--- | :--- | :---
`ignore_unmapped` | Boolean | Specifies whether to ignore an unmapped field. If set to `true`, then the query does not return any documents that contain an unmapped field. If set to `false`, then an exception is thrown when the field is unmapped. Optional. Default is `false`.
\ No newline at end of file
diff --git a/_query-dsl/joining/has-child.md b/_query-dsl/joining/has-child.md
new file mode 100644
index 0000000000..c1cc7a5423
--- /dev/null
+++ b/_query-dsl/joining/has-child.md
@@ -0,0 +1,259 @@
+---
+layout: default
+title: Has child
+parent: Joining queries
+nav_order: 10
+---
+
+# Has child query
+
+The `has_child` query returns parent documents whose child documents match a specific query. You can establish parent-child relationships between documents in the same index by using a [join]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/join/) field type.
+
+The `has_child` query is slower than other queries because of the join operation it performs. Performance decreases as the number of matching child documents pointing to different parent documents increases. Each `has_child` query in your search may significantly impact query performance. If you prioritize speed, avoid using this query or limit its usage as much as possible.
+{: .warning}
+
+## Example
+
+Before you can run a `has_child` query, your index must contain a [join]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/join/) field in order to establish parent-child relationships. The index mapping request uses the following format:
+
+```json
+PUT /example_index
+{
+  "mappings": {
+    "properties": {
+      "relationship_field": {
+        "type": "join",
+        "relations": {
+          "parent_doc": "child_doc"
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+In this example, you'll configure an index that contains documents representing products and their brands.
+
+First, create the index and establish the parent-child relationship between `brand` and `product`:
+
+```json
+PUT testindex1
+{
+  "mappings": {
+    "properties": {
+      "product_to_brand": {
+        "type": "join",
+        "relations": {
+          "brand": "product"
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+Index two parent (brand) documents:
+
+```json
+PUT testindex1/_doc/1
+{
+  "name": "Luxury brand",
+  "product_to_brand": "brand"
+}
+```
+{% include copy-curl.html %}
+
+```json
+PUT testindex1/_doc/2
+{
+  "name": "Economy brand",
+  "product_to_brand": "brand"
+}
+```
+{% include copy-curl.html %}
+
+Index three child (product) documents:
+
+```json
+PUT testindex1/_doc/3?routing=1
+{
+  "name": "Mechanical watch",
+  "sales_count": 150,
+  "product_to_brand": {
+    "name": "product",
+    "parent": "1"
+  }
+}
+```
+{% include copy-curl.html %}
+
+```json
+PUT testindex1/_doc/4?routing=2
+{
+  "name": "Electronic watch",
+  "sales_count": 300,
+  "product_to_brand": {
+    "name": "product",
+    "parent": "2"
+  }
+}
+```
+{% include copy-curl.html %}
+
+```json
+PUT testindex1/_doc/5?routing=2
+{
+  "name": "Digital watch",
+  "sales_count": 100,
+  "product_to_brand": {
+    "name": "product",
+    "parent": "2"
+  }
+}
+```
+{% include copy-curl.html %}
+
+To search for the parent of a child, use a `has_child` query. The following query returns parent documents (brands) that have at least one child product whose `name` matches `watch`:
+
+```json
+GET testindex1/_search
+{
+  "query": {
+    "has_child": {
+      "type": "product",
+      "query": {
+        "match": {
+          "name": "watch"
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response returns both brands:
+
+```json
+{
+  "took": 15,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 2,
+      "relation": "eq"
+    },
+    "max_score": 1,
+    "hits": [
+      {
+        "_index": "testindex1",
+        "_id": "1",
+        "_score": 1,
+        "_source": {
+          "name": "Luxury brand",
+          "product_to_brand": "brand"
+        }
+      },
+      {
+        "_index": "testindex1",
+        "_id": "2",
+        "_score": 1,
+        "_source": {
+          "name": "Economy brand",
+          "product_to_brand": "brand"
+        }
+      }
+    ]
+  }
+}
+```
+
+## Parameters
+
+The following table lists all top-level parameters supported by `has_child` queries.
+
+| Parameter | Required/Optional | Description |
+|:---|:---|:---|
+| `type` | Required | Specifies the name of the child relationship as defined in the `join` field mapping. |
+| `query` | Required | The query to run on child documents. If a child document matches the query, the parent document is returned. |
+| `ignore_unmapped` | Optional | Specifies whether to ignore an unmapped `type`. If `true`, the query returns no documents for an index in which `type` is not mapped instead of throwing an error. This is useful when querying multiple indexes, some of which may not contain the `type` mapping. Default is `false`. |
+| `max_children` | Optional | The maximum number of matching child documents for a parent document. If exceeded, the parent document is excluded from the search results. |
+| `min_children` | Optional | The minimum number of matching child documents required for a parent document to be included in the results. If not met, the parent is excluded. Default is `1`.|
+| `score_mode` | Optional | Defines how scores of matching child documents influence the parent document's score. Valid values are: <br> - `none`: Ignores the relevance scores of child documents and assigns a score of `0` to the parent document. <br> - `avg`: Uses the average relevance score of all matching child documents. <br> - `max`: Assigns the highest relevance score from the matching child documents to the parent. <br> - `min`: Assigns the lowest relevance score from the matching child documents to the parent. <br> - `sum`: Sums the relevance scores of all matching child documents. <br> Default is `none`. |
+
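+The `min_children` and `max_children` parameters can be tried against the sample data from the preceding example. As a minimal sketch (an illustration, not one of the original examples), the following query asks for brands that have at least two products whose `name` matches `watch`. In the sample data, only brand `2` has two matching products, so it should be the only brand returned:
+
+```json
+GET testindex1/_search
+{
+  "query": {
+    "has_child": {
+      "type": "product",
+      "query": {
+        "match": {
+          "name": "watch"
+        }
+      },
+      "min_children": 2
+    }
+  }
+}
+```
+{% include copy-curl.html %}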
+
+## Sorting limitations
+
+The `has_child` query does not support [sorting results]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/sort/) using standard sorting options. If you need to sort parent documents by fields in their child documents, you can use a [`function_score` query]({{site.url}}{{site.baseurl}}/query-dsl/compound/function-score/) and sort by the parent document's score.
+
+In the preceding example, you can sort parent documents (brands) based on the `sales_count` of their child products. This query multiplies the score by the `sales_count` field of the child documents and assigns the highest relevance score from the matching child documents to the parent:
+
+```json
+GET testindex1/_search
+{
+  "query": {
+    "has_child": {
+      "type": "product",
+      "query": {
+        "function_score": {
+          "script_score": {
+            "script": "_score * doc['sales_count'].value"
+          }
+        }
+      },
+      "score_mode": "max"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response contains the brands sorted by the highest child `sales_count`:
+
+```json
+{
+  "took": 6,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 2,
+      "relation": "eq"
+    },
+    "max_score": 300,
+    "hits": [
+      {
+        "_index": "testindex1",
+        "_id": "2",
+        "_score": 300,
+        "_source": {
+          "name": "Economy brand",
+          "product_to_brand": "brand"
+        }
+      },
+      {
+        "_index": "testindex1",
+        "_id": "1",
+        "_score": 150,
+        "_source": {
+          "name": "Luxury brand",
+          "product_to_brand": "brand"
+        }
+      }
+    ]
+  }
+}
+```
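+
+As a variation on the same technique, and assuming the same sample data, setting `score_mode` to `sum` instead of `max` would rank each brand by the combined `sales_count` of its products rather than by its best-selling product. The following query is an illustrative sketch, not one of the original examples:
+
+```json
+GET testindex1/_search
+{
+  "query": {
+    "has_child": {
+      "type": "product",
+      "query": {
+        "function_score": {
+          "script_score": {
+            "script": "_score * doc['sales_count'].value"
+          }
+        }
+      },
+      "score_mode": "sum"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+With the sample data, brand `2` has two products (`sales_count` values of `300` and `100`) and brand `1` has one (`150`), so brand `2` would still be expected to rank first.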
\ No newline at end of file
diff --git a/_query-dsl/joining/index.md b/_query-dsl/joining/index.md
index 20f48c0b16..4ed46b3e17 100644
--- a/_query-dsl/joining/index.md
+++ b/_query-dsl/joining/index.md
@@ -3,6 +3,7 @@ layout: default
title: Joining queries
has_children: true
nav_order: 55
+has_toc: false
---
# Joining queries
@@ -10,9 +11,9 @@ nav_order: 55
OpenSearch is a distributed system in which data is spread across multiple nodes. Thus, running a SQL-like JOIN operation in OpenSearch is resource intensive. As an alternative, OpenSearch provides the following queries that perform join operations and are optimized for scaling across multiple nodes:
- `nested` queries: Act as wrappers for other queries to search [nested]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/nested/) fields. The nested field objects are searched as though they were indexed as separate documents.
-- `has_child` queries: Search for parent documents whose child documents match the query.
+- [`has_child`]({{site.url}}{{site.baseurl}}/query-dsl/joining/has-child/) queries: Search for parent documents whose child documents match the query.
- `has_parent` queries: Search for child documents whose parent documents match the query.
-- `parent_id` queries: A [join]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/nested/) field type establishes a parent/child relationship between documents in the same index. `parent_id` queries search for child documents that are joined to a specific parent document.
+- `parent_id` queries: A [join]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/join/) field type establishes a parent/child relationship between documents in the same index. `parent_id` queries search for child documents that are joined to a specific parent document.
If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then joining queries are not executed.
{: .important}
\ No newline at end of file