Adding documentation for Pagination in hybrid query #9109
@@ -1213,3 +1213,193 @@

Field | Description
:--- | :---
`explanation` | The `explanation` object has three properties: `value`, `description`, and `details`. The `value` property shows the result of the calculation, `description` explains what type of calculation was performed, and `details` shows any subcalculations performed. For score normalization, the information in the `description` property includes the technique used for normalization or combination and the corresponding score.

## Paginate hybrid query results
**Introduced 2.19**
{: .label .label-purple }

You can paginate hybrid query results by specifying the `pagination_depth` parameter in the hybrid query clause, along with the standard `from` and `size` parameters. The `pagination_depth` parameter defines the maximum number of search results that can be retrieved from each shard per subquery. For example, setting `pagination_depth` to `50` allows up to 50 results per subquery to be maintained in memory from each shard.

To navigate through the results, use the following parameters:

- `from`: The document number from which to start showing results, counted from zero. Default is `0`.
- `size`: The number of results to return on each page. Default is `10`.

For example, to skip the first 20 results and show the next 10 (the 21st through 30th documents), set `from` to `20` and `size` to `10`. For more information about pagination, see [Paginate results]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/paginate/#the-from-and-size-parameters).

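As an illustration of how `from`, `size`, and `pagination_depth` fit together, the following Python sketch builds a request body for a given page. The function name `build_hybrid_page_request`, the 1-based `page` argument, and the sample subqueries are hypothetical and only serve this example. The default depth of `50` is borrowed from the example above and is not a recommended value.

```python
def build_hybrid_page_request(subqueries, page, size=10, pagination_depth=50):
    """Build a search request body for one page of hybrid query results.

    `page` is 1-based here (a convention chosen for this sketch only);
    `from` itself is a zero-based offset into the combined result list.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive integers")
    return {
        "from": (page - 1) * size,  # number of results to skip
        "size": size,               # number of results on this page
        "query": {
            "hybrid": {
                # Keep this value the same for every page of one search.
                "pagination_depth": pagination_depth,
                "queries": list(subqueries),
            }
        },
    }


subqueries = [
    {"term": {"category": "permission"}},
    {"term": {"category": "editor"}},
]
body = build_hybrid_page_request(subqueries, page=3, size=10)
print(body["from"], body["size"])  # 20 10
```

The resulting dictionary can be serialized to JSON and sent as the body of a `_search` request. Here `size` is placed in the body rather than in the URL, which is an assumption of this sketch.
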
### The impact of pagination_depth on hybrid search results

Changing `pagination_depth` also changes the order of the search results that you are paginating through. This is because `pagination_depth` directly determines the number of results retrieved for each subquery from each shard, which can change the result order after normalization. Therefore, we recommend keeping the value of `pagination_depth` consistent while navigating between pages.

I don't think we need the end of the first sentence as it feels a little clunky. How about

Essentially what I want to say here is if user changes pagination_depth then it will change the ground truth on which the user is paginating.

Understood, I still think there is probably a better way to phrase it but I can't think of anything. It's technically sound, so we can leave phrasing to the doc team

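The following toy simulation illustrates why a different `pagination_depth` can reorder results. It is a simplified sketch, not the plugin's implementation: it assumes a single shard, min-max normalization, and an arithmetic mean combination in which a missing subquery score counts as `0`. The document IDs and scores are made up.

```python
def top_k(scores, k):
    """Keep the k highest-scoring documents, as a shard would per subquery."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def min_max(scores):
    """Rescale scores to [0, 1] relative to the retrieved minimum and maximum."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc: 1.0 for doc in scores}
    return {doc: (score - low) / (high - low) for doc, score in scores.items()}


def hybrid_ranking(subquery_scores, pagination_depth):
    """Rank documents by the mean of their normalized subquery scores."""
    normalized = [min_max(top_k(s, pagination_depth)) for s in subquery_scores]
    doc_ids = {doc for scores in normalized for doc in scores}
    combined = {
        doc: sum(scores.get(doc, 0.0) for scores in normalized) / len(normalized)
        for doc in doc_ids
    }
    # Sort by descending score; the document ID breaks ties deterministically.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


# Hypothetical raw scores for two subqueries on one shard.
subquery_a = {"x": 10, "y": 6, "z": 5, "w": 0}
subquery_b = {"y": 10, "x": 7, "z": 6, "w": 5}

print(hybrid_ranking([subquery_a, subquery_b], pagination_depth=3))  # ['x', 'y', 'z']
print(hybrid_ranking([subquery_a, subquery_b], pagination_depth=4))  # ['y', 'x', 'z', 'w']
```

Retrieving one more result per subquery lowers the minimum used for normalization, which raises the normalized scores of mid-ranked documents and, in this example, swaps the top two results.
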
A standard hybrid search without pagination uses the `from + size` formula (where `from` is always `0`) to determine the number of search results to retrieve from each shard per subquery.
{: .note}

To paginate deeper into the results, provide a higher `pagination_depth` value and use the `from` and `size` parameters to navigate to later pages. However, deeper pagination reduces search performance because retrieving more results requires more computation.

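As a rough sketch of why deeper pagination costs more, the following Python snippet computes an upper bound on the number of results collected for one request. It relies only on the definition of `pagination_depth` as a per-shard, per-subquery limit. The function name and the shard and subquery counts are hypothetical, and actual memory and CPU usage depend on other factors as well.

```python
def max_candidates(shards, subqueries, pagination_depth):
    """Upper bound on results collected across all shards and subqueries."""
    return shards * subqueries * pagination_depth


# Hypothetical index with 5 shards and a hybrid query with 2 subqueries.
for depth in (10, 50, 200):
    print(depth, max_candidates(shards=5, subqueries=2, pagination_depth=depth))
# 10 100
# 50 500
# 200 2000
```

The bound grows linearly with `pagination_depth`, so a very large depth multiplies the amount of work done for every page request.
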
The following example shows a search request with `from` set to `0` (the default), `size` set to `5`, and `pagination_depth` set to `10`. With these settings, each shard can return at most 10 search results for each of the two subqueries (the `term` query and the `bool` query):

```json
GET /my-nlp-index/_search?size=5&search_pipeline=nlp-search-pipeline
{
  "query": {
    "hybrid": {
      "pagination_depth": 10,
      "queries": [
        {
          "term": {
            "category": "permission"
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "category": "editor"
                }
              },
              {
                "term": {
                  "category": "statement"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
```
{% include copy-curl.html %}

The response contains the first five results:

```json
{
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0.5,
    "hits": [
      {
        "_index": "my-nlp-index",
        "_id": "d3eXlZQBJkWerFzHv4eV",
        "_score": 0.5,
        "_source": {
          "category": "permission",
          "doc_keyword": "workable",
          "doc_index": 4976,
          "doc_price": 100
        }
      },
      {
        "_index": "my-nlp-index",
        "_id": "eneXlZQBJkWerFzHv4eW",
        "_score": 0.5,
        "_source": {
          "category": "editor",
          "doc_index": 9871,
          "doc_price": 30
        }
      },
      {
        "_index": "my-nlp-index",
        "_id": "e3eXlZQBJkWerFzHv4eW",
        "_score": 0.5,
        "_source": {
          "category": "statement",
          "doc_keyword": "entire",
          "doc_index": 8242,
          "doc_price": 350
        }
      },
      {
        "_index": "my-nlp-index",
        "_id": "fHeXlZQBJkWerFzHv4eW",
        "_score": 0.24999997,
        "_source": {
          "category": "statement",
          "doc_keyword": "idea",
          "doc_index": 5212,
          "doc_price": 200
        }
      },
      {
        "_index": "index-test",
        "_id": "fXeXlZQBJkWerFzHv4eW",
        "_score": 5.0E-4,
        "_source": {
          "category": "editor",
          "doc_keyword": "bubble",
          "doc_index": 1298,
          "doc_price": 130
        }
      }
    ]
  }
}
```

The following search request sets `from` to `5`, `size` to `5`, and `pagination_depth` to `10`. The `pagination_depth` value is unchanged so that both requests paginate over the same set of search results:

```json
GET /my-nlp-index/_search?size=5&search_pipeline=nlp-search-pipeline
{
  "from": 5,
  "query": {
    "hybrid": {
      "pagination_depth": 10,
      "queries": [
        {
          "term": {
            "category": "permission"
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "category": "editor"
                }
              },
              {
                "term": {
                  "category": "statement"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
```
{% include copy-curl.html %}

The response omits the first five results and returns the remaining result:

```json
{
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0.5,
    "hits": [
      {
        "_index": "index-test",
        "_id": "fneXlZQBJkWerFzHv4eW",
        "_score": 5.0E-4,
        "_source": {
          "category": "editor",
          "doc_keyword": "bubble",
          "doc_index": 521,
          "doc_price": 75
        }
      }
    ]
  }
}
```

This paragraph is to let the user know that changing the `pagination_depth` also changes the search results reference.