[FEATURE] Doing Exact Search and ANN search on a single index with multiple K-NN fields #1078

Closed
navneet1v opened this issue Aug 31, 2023 · 3 comments

@navneet1v
Collaborator

Is your feature request related to a problem?
Currently, if a user wants to perform only exact search on a k-NN index, they need to set knn:false in the index settings. This forces exact search for all the k-NN fields in the index. A better experience would let the user define which fields should support ANN search and which fields should support exact search only.

What solution would you like?

Option 1

One option I was thinking of is to use a feature already present in OpenSearch: the "index": false mapping parameter (it is true by default). The k-NN plugin can use this attribute to determine whether ANN search is required on that field or not.
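A minimal sketch of what an index-creation body for Option 1 might look like, written as a Python dict. The field names `title_vector` and `image_vector` and the dimension are made up for illustration, and honoring `"index": false` on a `knn_vector` field is the behavior proposed in this issue, not something the plugin is confirmed to support.

```python
import json

# Hypothetical index body for Option 1. Field names and dimensions are
# illustrative; "index": False on a knn_vector field is the PROPOSED
# behavior from this issue, not a confirmed plugin feature.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            # ANN field: "index" is omitted, so it keeps its default of true.
            "title_vector": {"type": "knn_vector", "dimension": 4},
            # Exact-search-only field under the proposal: no ANN structure wanted.
            "image_vector": {"type": "knn_vector", "dimension": 4, "index": False},
        }
    },
}

print(json.dumps(index_body, indent=2))
```

Under this sketch both fields live in the same index, and only the mapping of each field says whether ANN search is expected on it.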

Option 2

Instead of using the index attribute, we can define another attribute, such as searchType with the values anns and exact, to indicate what type of search is required on a particular k-NN field.
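A rough sketch of how Option 2 could be modeled, assuming the attribute is called `searchType` and accepts `anns` or `exact` as proposed above. The helper `search_type`, the field names, and the choice of `anns` as the default are assumptions for illustration only; no such mapping parameter is confirmed to exist.

```python
# Hypothetical: "searchType" and its values come from the proposal above.
ALLOWED_SEARCH_TYPES = {"anns", "exact"}


def search_type(field_mapping: dict) -> str:
    """Return the requested search type for a field (default assumed: "anns")."""
    value = field_mapping.get("searchType", "anns")
    if value not in ALLOWED_SEARCH_TYPES:
        raise ValueError(f"unsupported searchType: {value!r}")
    return value


# Illustrative field mappings using the hypothetical attribute.
properties = {
    "title_vector": {"type": "knn_vector", "dimension": 4, "searchType": "anns"},
    "image_vector": {"type": "knn_vector", "dimension": 4, "searchType": "exact"},
}

for name, mapping in properties.items():
    print(name, "->", search_type(mapping))
```

Compared with Option 1, this adds a new parameter to the field type instead of reusing an existing one.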

What alternatives have you considered?
NA

Do you have any additional context?
As of today, knn:false is not required in order to do exact search. I can set knn:true and still do both ANN search and exact search. But because ANN search has a high memory and disk footprint, building ANN structures for a field that only needs exact search is overkill. Also, setting knn:false is not an obvious way to optimize a field for exact search.
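To illustrate the two query styles mentioned above, here is a sketch of the request bodies as Python dicts. The field name `title_vector`, the query vector, and the `l2` space type are placeholders; the ANN body uses the `knn` query and the exact-search body uses `script_score` with the plugin's `knn_score` script.

```python
import json

query_vector = [0.1, 0.2, 0.3, 0.4]  # placeholder vector

# ANN search: relies on the approximate index built for the field.
ann_query = {
    "size": 2,
    "query": {"knn": {"title_vector": {"vector": query_vector, "k": 2}}},
}

# Exact search: scores every matching document with the k-NN scoring script.
exact_query = {
    "size": 2,
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": "knn_score",
                "lang": "knn",
                "params": {
                    "field": "title_vector",
                    "query_value": query_vector,
                    "space_type": "l2",
                },
            },
        }
    },
}

print(json.dumps(ann_query))
print(json.dumps(exact_query))
```

Both bodies target the same field, which is the situation described above: with knn:true the ANN structures are built even if only the second style of query is ever sent.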

@navneet1v navneet1v changed the title [FEATURE] Providing better user experience for doing exact search [FEATURE] Doing Exact Search and ANN search on a single index with multiple K-NN fields Aug 31, 2023
@jmazanec15
Member

One clarification: with this, we still need to give OpenSearch a signal to use our custom codec. Right now, the only way to do this is via the index setting index.knn.

As for reusing the "index" mapping parameter, I like this. I think a similar meaning has been given to it for other data types such as geo. We could introduce some other parameter, but that might clutter our field type more.

@vamshin vamshin moved this from Backlog to Backlog (Hot) in Vector Search RoadMap Oct 5, 2023
@navneet1v
Collaborator Author

navneet1v commented Mar 18, 2024

On thinking further about the problem, I see that OpenSearch exposes the index: false parameter for fields that a user doesn't want to index.

We can use this parameter to decide whether we should create graphs for native engines here: https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/codec/KNN80Codec/KNN80DocValuesConsumer.java#L81 . This will ensure that graphs are not created, and it should take care of both merge and refresh.
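The check described above could be modeled roughly as follows. This is a Python sketch of the decision only; the function name `should_build_graph` and the dict-based field attributes are hypothetical, and the real logic would live in the Java doc-values consumer linked above and may look quite different.

```python
# Hypothetical model of the proposed check, not the plugin's actual code.
def should_build_graph(field_attributes: dict) -> bool:
    """Build a native-engine graph only for knn_vector fields that are indexed."""
    if field_attributes.get("type") != "knn_vector":
        return False
    # A missing "index" key is treated as true, matching the mapping default.
    return bool(field_attributes.get("index", True))


# Illustrative fields: one indexed, one opted out, one non-vector.
fields = {
    "title_vector": {"type": "knn_vector", "dimension": 4},
    "image_vector": {"type": "knn_vector", "dimension": 4, "index": False},
    "title": {"type": "text"},
}

for name, attributes in fields.items():
    action = "build graph" if should_build_graph(attributes) else "skip graph"
    print(f"{name}: {action}")
```

Because a missing `index` key is treated as true in this sketch, fields in existing mappings would keep getting graphs as before.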

We do have to handle BWC for such a solution, but I see this as the most elegant and easiest solution.

For Lucene, I have not given this much thought yet. I will try to think it over more.

@vamshin
Member

vamshin commented Oct 31, 2024

duplicate of #1079
