Adding documentation for Pagination in hybrid query #9109

Merged
190 changes: 190 additions & 0 deletions _search-plugins/hybrid-search.md
@@ -1213,3 +1213,193 @@
Field | Description
:--- | :---
`explanation` | The `explanation` object has three properties: `value`, `description`, and `details`. The `value` property shows the result of the calculation, `description` explains what type of calculation was performed, and `details` shows any subcalculations performed. For score normalization, the information in the `description` property includes the technique used for normalization or combination and the corresponding score.

## Paginate hybrid query results
**Introduced 2.19**
{: .label .label-purple }

You can apply pagination to hybrid query results by using the `pagination_depth` parameter in the hybrid query clause, along with the standard `from` and `size` parameters. The `pagination_depth` parameter defines the maximum number of search results that can be retrieved from each shard per subquery. For example, setting `pagination_depth: 50` allows up to 50 results per subquery to be maintained in memory from each shard.

To navigate through the results, use the following parameters:

- `from`: The number of results to skip before the first returned result (a zero-based offset). Default is `0`.
- `size`: The number of results to return on each page. Default is `10`.

For example, to show the 21st through 30th results, set `from: 20` and `size: 10`. For more information about pagination, see [Paginate results]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/paginate/#the-from-and-size-parameters).
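
The relationship between a page number and the `from` and `size` parameters can be sketched in a few lines of Python. This is only an illustration: the helper name `page_window` and the 1-based page numbering are assumptions and are not part of OpenSearch.

```python
def page_window(page: int, size: int = 10) -> dict:
    """Return the from/size pair for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # `from` is a zero-based offset: page 1 starts at 0, page 2 at `size`, and so on.
    return {"from": (page - 1) * size, "size": size}


print(page_window(3))      # {'from': 20, 'size': 10}
print(page_window(1, 5))   # {'from': 0, 'size': 5}
```

With `size: 10`, the third page starts at offset 20, which matches the `from: 20` example.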

### The impact of pagination_depth on hybrid search results
Member Author

This paragraph lets the user know that changing `pagination_depth` also changes the search results reference.


Changing `pagination_depth` also changes the set of search results over which you are paginating, and therefore their ordering. This is because `pagination_depth` directly determines the number of results retrieved for each subquery per shard, which can change the result ordering after normalization. To keep pagination consistent, use the same `pagination_depth` value while navigating between pages.
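
A toy model can show why the ordering may shift. The following Python sketch is not the plugin's implementation: it assumes a single shard, min-max normalization, and an arithmetic-mean combination, and the function name `hybrid_rank` and all scores are hypothetical.

```python
def hybrid_rank(subqueries, depth):
    """Toy model: keep the top `depth` hits per subquery, min-max normalize, then average."""
    combined = {}
    for scores in subqueries:
        # Keep only the `depth` highest-scoring documents for this subquery.
        top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:depth]
        values = [score for _, score in top]
        lo, hi = min(values), max(values)
        for doc_id, score in top:
            norm = 1.0 if hi == lo else (score - lo) / (hi - lo)
            # Arithmetic mean over all subqueries; a missing document contributes 0.
            combined[doc_id] = combined.get(doc_id, 0.0) + norm / len(subqueries)
    return sorted(combined, key=lambda doc_id: combined[doc_id], reverse=True)


# Hypothetical raw scores for two subqueries over documents a-d.
subqueries = [
    {"a": 9, "b": 7, "c": 5, "d": 1},
    {"b": 9, "c": 8, "d": 5, "a": 1},
]

print(hybrid_rank(subqueries, depth=3))  # ['b', 'a', 'c', 'd']
print(hybrid_rank(subqueries, depth=4))  # ['b', 'c', 'a', 'd']
```

In this sketch, raising the depth from 3 to 4 changes the minimum score in each subquery's result set, so the normalized scores change and documents `a` and `c` swap positions.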
Member

I don't think we need the end of the first sentence because it feels a little clunky. How about "The change in `pagination_depth` also affects the ordering of search results"?

Member Author (@vibrantvarun, Jan 28, 2025)

Essentially, what I want to say here is that if the user changes `pagination_depth`, it changes the ground truth on which the user is paginating.

Member

Understood. I still think there is probably a better way to phrase it, but I can't think of anything. It's technically sound, so we can leave the phrasing to the doc team.


Without pagination, a hybrid search retrieves `from + size` results (where `from` is always `0`) from each shard per subquery.
{: .note}

To enable deeper pagination, increase the `pagination_depth` value. You can then navigate to later pages by using the `from` and `size` parameters. However, deeper pagination reduces search performance because retrieving more results requires more computation.
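
As a rough sketch of this workflow, the following Python snippet builds request bodies for successive pages while holding `pagination_depth` fixed. The helper `hybrid_page_body` and its defaults are hypothetical; only the `from`, `size`, `pagination_depth`, and `queries` fields come from the request format described in this section.

```python
import json


def hybrid_page_body(queries, page, size=5, pagination_depth=10):
    """Build a hybrid search body for a 1-based page (hypothetical helper)."""
    return {
        "from": (page - 1) * size,  # zero-based offset of the first result on the page
        "size": size,
        "query": {
            "hybrid": {
                # Kept constant across pages so every page uses the same result set.
                "pagination_depth": pagination_depth,
                "queries": queries,
            }
        },
    }


queries = [
    {"term": {"category": "permission"}},
    {"term": {"category": "editor"}},
]
first = hybrid_page_body(queries, page=1)
second = hybrid_page_body(queries, page=2)
print(json.dumps(second, indent=2))
```

Only `from` differs between the two bodies. The `pagination_depth` value stays at `10` for both pages.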

The following example shows a search request with `from: 0`, `size: 5`, and `pagination_depth: 10`. At most 10 results can be retrieved from each shard for each of the `bool` and `term` queries.

```json
GET /my-nlp-index/_search?size=5&search_pipeline=nlp-search-pipeline
{
  "query": {
    "hybrid": {
      "pagination_depth": 10,
      "queries": [
        {
          "term": {
            "category": "permission"
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "category": "editor"
                }
              },
              {
                "term": {
                  "category": "statement"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
```
{% include copy-curl.html %}


The response contains the first five results:

```json
{
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0.5,
    "hits": [
      {
        "_index": "my-nlp-index",
        "_id": "d3eXlZQBJkWerFzHv4eV",
        "_score": 0.5,
        "_source": {
          "category": "permission",
          "doc_keyword": "workable",
          "doc_index": 4976,
          "doc_price": 100
        }
      },
      {
        "_index": "my-nlp-index",
        "_id": "eneXlZQBJkWerFzHv4eW",
        "_score": 0.5,
        "_source": {
          "category": "editor",
          "doc_index": 9871,
          "doc_price": 30
        }
      },
      {
        "_index": "my-nlp-index",
        "_id": "e3eXlZQBJkWerFzHv4eW",
        "_score": 0.5,
        "_source": {
          "category": "statement",
          "doc_keyword": "entire",
          "doc_index": 8242,
          "doc_price": 350
        }
      },
      {
        "_index": "my-nlp-index",
        "_id": "fHeXlZQBJkWerFzHv4eW",
        "_score": 0.24999997,
        "_source": {
          "category": "statement",
          "doc_keyword": "idea",
          "doc_index": 5212,
          "doc_price": 200
        }
      },
      {
        "_index": "index-test",
        "_id": "fXeXlZQBJkWerFzHv4eW",
        "_score": 5.0E-4,
        "_source": {
          "category": "editor",
          "doc_keyword": "bubble",
          "doc_index": 1298,
          "doc_price": 130
        }
      }
    ]
  }
}
```

The following search request uses `from: 5`, `size: 5`, and `pagination_depth: 10`.

The `pagination_depth` value is unchanged so that pagination operates on the same set of search results.
{: .note}

```json
GET /my-nlp-index/_search?size=5&search_pipeline=nlp-search-pipeline
{
  "from": 5,
  "query": {
    "hybrid": {
      "pagination_depth": 10,
      "queries": [
        {
          "term": {
            "category": "permission"
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "category": "editor"
                }
              },
              {
                "term": {
                  "category": "statement"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
```
{% include copy-curl.html %}

The response omits the first five results and returns the remaining one:

```json
{
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0.5,
    "hits": [
      {
        "_index": "index-test",
        "_id": "fneXlZQBJkWerFzHv4eW",
        "_score": 5.0E-4,
        "_source": {
          "category": "editor",
          "doc_keyword": "bubble",
          "doc_index": 521,
          "doc_price": 75
        }
      }
    ]
  }
}
```