Hybrid Search Sorting Issue with Size Specified #1067
Comments
@martin-gaievski could you please look into this bug?
I saw that work is in progress on the paging option (from > 0), so this may have been found already. With paging, size would play a key part.
@jdomkline can you please share additional information about the issue:
@martin-gaievski this is a QA cluster, so it is fairly small: 1-AZ with 2 data nodes and "number_of_shards": "5". Luckily, the sort info provided in the result allows a post-processing step to fetch the missing doc and correct the result. For my use case this is an OK workaround. For a much larger size I can see it not working.
Reproduced on our end, thank you for the detailed steps. The team is checking the possible root cause and whether a workaround is possible.
@jdomkline this issue is a known problem in the implementation of the sort function. We have fixed it recently, and the code change will be part of the upcoming 2.19 version. You can find the corresponding code change in this PR: #1043.
Hi @martin-gaievski, thanks for the link. I also see this case when I'm not including a sort at all. If I set the size to 10, a certain document comes back in position 8. When I set the size to 20, the same document comes back in position 11. From what I read, it seems that PR #1043 has addressed this as well. Please share your thoughts. This would need to be fixed if from > 0 is ever allowed (vs searchAfter) for Hybrid queries.
@jdomkline fix in #1043 addresses issue of field values being mismatched between Scenario where a document's position changes depending on the size value can be an intended behavior of hybrid query. To form the final search results list, we take up to
@martin-gaievski since this was the exact same query with both sizes (10 and 20), I'm unclear how paging with Hybrid will ever work without producing duplicates. Once people can use from > 0 they will also be setting sizes. Is searchAfter then the only way paging can work without duplicates (assuming no need for PIT)?
For pagination in hybrid queries, users will need to set the global window size with the new parameter
@martin-gaievski that is awesome! Got it, thanks.
Closing this issue, fix is provided in #1043.
What is the bug?
With from set to 0 and an arbitrary size, Hybrid Search returns incorrectly sorted results. The problem can appear and disappear with changes to the size limit, such as setting size to 8 instead of 10.
The sort values themselves are correct; however, during document retrieval the wrong document is fetched instead of the one indicated by the sort (please see detailed screen caps below).
How can one reproduce the bug?
Run a DSL Hybrid Query with various sizes sorting on a field and the _id (please see detailed screen caps below).
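For anyone trying to confirm the symptom without reading through screenshots, here is a minimal Python sketch of a check on a search response. It assumes `_id` is the last sort key (as in the reproduction above) and that each hit carries the usual `_id` and `sort` entries. The function name `find_sort_id_mismatches` and the sample response are made up for illustration and are not real cluster output.

```python
from typing import Any, Dict, List, Tuple


def find_sort_id_mismatches(
    response: Dict[str, Any], id_sort_position: int = -1
) -> List[Tuple[int, str, str]]:
    """Return (position, returned_id, id_from_sort) for every hit whose _id
    differs from the _id recorded in that hit's own sort values.

    Assumption: the query sorts on some field and then on _id, so the _id
    sits at `id_sort_position` (last by default) in each hit's "sort" array.
    """
    mismatches = []
    hits = response.get("hits", {}).get("hits", [])
    for position, hit in enumerate(hits):
        sort_values = hit.get("sort")
        if not sort_values:
            continue  # no sort values on this hit, nothing to compare
        id_from_sort = str(sort_values[id_sort_position])
        returned_id = str(hit.get("_id"))
        if returned_id != id_from_sort:
            mismatches.append((position, returned_id, id_from_sort))
    return mismatches


# Hand-written sample response (hypothetical data): the second hit's sort
# values point at "doc-b", but the document actually returned is "doc-c".
sample_response = {
    "hits": {
        "hits": [
            {"_id": "doc-a", "sort": [10, "doc-a"]},
            {"_id": "doc-c", "sort": [12, "doc-b"]},
            {"_id": "doc-d", "sort": [15, "doc-d"]},
        ]
    }
}

print(find_sort_id_mismatches(sample_response))
```

Running the same check on the responses for two different size values should show whether the mismatch comes and goes with size. The mismatched entries also give the `_id` values that a post-processing workaround would need to fetch.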
What is the expected behavior?
During document retrieval, the correct document is returned as specified by the sort.
What is your host/environment?
AWS 2.17
{
"name": "0bf1a5d4ef04a73480c61776916c50c3",
"cluster_name": "302642896912:productsearch",
"cluster_uuid": "vghL7JqlRQijNk4yvpJk_Q",
"version": {
"number": "7.10.2",
"build_type": "tar",
"build_hash": "unknown",
"build_date": "2024-11-18T04:22:32.407132088Z",
"build_snapshot": false,
"lucene_version": "9.11.1",
"minimum_wire_compatibility_version": "7.10.0",
"minimum_index_compatibility_version": "7.0.0"
},
"tagline": "The OpenSearch Project: https://opensearch.org/"
}
Do you have any screenshots?
The sort retrieval problem example.
Example of correct results with minor adjustment to the size, from 8 to 10.
Website example
Do you have any additional context?
The case is repeatable: switching the size back and forth makes the problem appear and disappear each time.