Adds ef_search support for Lucene kNN queries #1748

Merged

shatejas
Collaborator

Description

Lucene kNN queries use k as ef_search, so we take the max of k and ef_search and pass it to Lucene as the k value. The size parameter dictates the number of hits returned.

curl -XGET "http://localhost:9200/lucene-hnsw-index/_search" -H 'Content-Type: application/json' -d'
{
  "size": 1,
  "query": {
    "knn": {
      "lucene_vector": {
        "vector": [2.5, 3.5, 2.2],
        "k": 3,
        "method_parameters": {
          "ef_search": 5
        }
      }
    }
  }
}'
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "lucene-hnsw-index",
        "_id": "2",
        "_score": 1,
        "_source": {
          "lucene_vector": [
            2.5,
            3.5,
            2.2
          ],
          "price": 7.1
        }
      }
    ]
  }
}
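
The parameter handling from the description can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the class and method names (`EfSearchSketch`, `luceneK`, `topHits`) and the candidate ids are hypothetical, and it assumes that a missing ef_search simply leaves k unchanged.

```java
import java.util.List;

final class EfSearchSketch {

    // Hypothetical helper: Lucene treats k as its ef_search, so the larger of
    // the two values is what gets passed down as Lucene's k.
    // Assumption: a null efSearch means the parameter was not supplied.
    public static int luceneK(int k, Integer efSearch) {
        return efSearch == null ? k : Math.max(k, efSearch);
    }

    // Hypothetical helper: "size" caps how many of the ranked candidates
    // are returned as hits.
    public static <T> List<T> topHits(List<T> rankedCandidates, int size) {
        return rankedCandidates.subList(0, Math.min(size, rankedCandidates.size()));
    }

    public static void main(String[] args) {
        int k = 3;
        int efSearch = 5;
        int size = 1;

        // max(3, 5) = 5 is the k value handed to Lucene.
        System.out.println("k passed to Lucene: " + luceneK(k, efSearch));

        // Made-up document ids standing in for ranked candidates.
        List<String> ranked = List.of("2", "1", "3", "4");
        System.out.println("hits returned: " + topHits(ranked, size));
    }
}
```

With the values from the request above (k = 3, ef_search = 5, size = 1), the sketch passes 5 to Lucene and returns a single hit.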

Issues Resolved

#1537

Check List

  • New functionality includes testing.
    • All tests pass
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -29,7 +29,6 @@
import java.util.Map;
import java.util.function.Function;

- import static org.opensearch.knn.index.IndexUtil.isClusterOnOrAfterMinRequiredVersion;
Member

Why are we removing this?

Collaborator Author

It was a leftover from this PR: https://github.com/opensearch-project/k-NN/pull/1742/files. The import is not needed. Sorry for the confusion.

@@ -85,6 +86,98 @@ public void testCreateLuceneDefaultQuery() {
}
}

+ public void testLuceneFloatVectorQuery() {
+ Query actualQuery1 = KNNQueryFactory.create(
Member

Can you add a function to create the query, since it is duplicated multiple times?

Collaborator Author

It has a different request every time. The create request itself uses a builder, so wrapping it in a function is moot.

- Response response = searchKNNIndex(INDEX_NAME, new KNNQueryBuilder(fieldName, queryVector, k), k);
+ Response response = searchKNNIndex(
+ INDEX_NAME,
+ KNNQueryBuilder.builder().fieldName(fieldName).k(k).vector(queryVector).methodParameters(methodParameters).build(),
Member

Collaborator Author

It's much more convenient to use KNNBuilder. It converts to XContent underneath.

Not opposed to it, but is there any specific reason we should move to XContent for happy-case tests?

Member

Using XContentBuilder better mimics end-user API calls: it builds the JSON content a client would send and so exercises the functionality end to end.

KNNBuilder is an internal method, and relying on it in integration tests can make them fragile: they may break with internal changes even if the public API remains stable.

Collaborator Author

Fair, will change it.
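
To illustrate the reviewer's point, a test can build the request body as the JSON an end user would send rather than going through the internal builder. The sketch below is only an approximation under stated assumptions: it uses plain string formatting from the Java standard library as a stand-in for XContentBuilder so that it runs without OpenSearch dependencies, and the names `KnnRequestBodySketch` and `knnSearchBody` are hypothetical.

```java
import java.util.Arrays;
import java.util.Locale;

final class KnnRequestBodySketch {

    // Hypothetical helper: renders the same JSON body as the curl example in the
    // description. Assumption: fieldName needs no JSON escaping and the vector
    // holds only finite values, so Arrays.toString yields a valid JSON array.
    public static String knnSearchBody(String fieldName, float[] vector, int k, int efSearch, int size) {
        return String.format(
            Locale.ROOT,
            "{\"size\": %d, \"query\": {\"knn\": {\"%s\": {\"vector\": %s, \"k\": %d, "
                + "\"method_parameters\": {\"ef_search\": %d}}}}}",
            size,
            fieldName,
            Arrays.toString(vector),
            k,
            efSearch
        );
    }

    public static void main(String[] args) {
        // Same values as the request in the PR description.
        String body = knnSearchBody("lucene_vector", new float[] { 2.5f, 3.5f, 2.2f }, 3, 5, 1);
        System.out.println(body);
    }
}
```

A test written this way depends only on the public request format, so it keeps passing when internal builder classes change.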

@shatejas shatejas force-pushed the feature/ef-search branch from 8a33cd0 to 1af16ac Compare June 13, 2024 23:36
@junqiu-lei junqiu-lei merged commit f6ab18d into opensearch-project:feature/ef-search Jun 14, 2024
49 of 51 checks passed
4 participants