
[Doc] Lucene inbuilt scalar quantization #7797

Merged
merged 10 commits into from
Jul 26, 2024

Conversation

naveentatikonda
Member

@naveentatikonda naveentatikonda commented Jul 23, 2024

Description

Add documentation for Lucene inbuilt scalar quantization in k-NN plugin.

Issues Resolved

Closes #6496

Version

2.16

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.


Thank you for submitting your PR. PRs move through the following states: In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

@hdhalter hdhalter added 2 - In progress Issue/PR: The issue or PR is in progress. release-notes PR: Include this PR in the automated release notes v2.16.0 labels Jul 23, 2024
@hdhalter hdhalter assigned kolchfa-aws and unassigned hdhalter Jul 23, 2024
@naveentatikonda naveentatikonda force-pushed the lucene_inbuilt_sq branch 4 times, most recently from 332da9b to 75415ce on July 24, 2024 04:33
@naveentatikonda naveentatikonda marked this pull request as ready for review July 24, 2024 04:33
@naveentatikonda
Member Author

@kolchfa-aws Can you please review this PR? Thanks!

@kolchfa-aws
Collaborator

@naveentatikonda Sure, will do!

@hdhalter hdhalter added 4 - Doc review PR: Doc review in progress and removed 2 - In progress Issue/PR: The issue or PR is in progress. labels Jul 24, 2024

Optionally, you can specify the parameters in `method.parameters.encoder`, as shown below:
* `confidence_interval` - used to compute the `minQuantile` and `maxQuantile` parameters, which are used to quantize the vectors. The accepted values are:
- Any value between `0.9` and `1.0`, inclusive. For example, if you set it to `0.9`, the middle 90% of the vector values is used to compute the minimum and maximum quantiles, excluding the minimum 5% and maximum 5% of the values.
Collaborator

So, to make sure, valid values are 0.9--1.0, inclusive (for static computation), or 0 (for dynamic computation)?

Member Author

Yes, that's correct. Also, users can skip this parameter, in which case it is computed as shown under the default case.
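To illustrate the discussion above, here is a minimal sketch of an index mapping that sets `confidence_interval` in `method.parameters.encoder`. The index and field names are hypothetical; the encoder shape follows the documentation text under review in this PR.

```python
# Illustrative OpenSearch index body for a Lucene HNSW field with
# built-in scalar quantization. The field name "my_vector" and the
# dimension are made up for the example; the encoder structure follows
# the k-NN documentation being reviewed here.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "my_vector": {
                "type": "knn_vector",
                "dimension": 128,
                "method": {
                    "name": "hnsw",
                    "engine": "lucene",
                    "parameters": {
                        "encoder": {
                            "name": "sq",
                            "parameters": {
                                # 0.9-1.0 (inclusive): static quantile
                                # computation over the middle fraction of
                                # vector values. 0 selects dynamic
                                # computation. Omit the parameter to use
                                # the default described above.
                                "confidence_interval": 0.9
                            }
                        }
                    },
                },
            }
        }
    },
}
```

This body would be sent with an index-creation request; only the `encoder` block is specific to scalar quantization.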


#### HNSW memory estimation

The memory required for Hierarchical Navigable Small Worlds (HNSW) is estimated to be `1.1 * (dimension + 8 * M)` bytes/vector.
Collaborator

What is M?

Member Author

M is the maximum number of connections per node in the HNSW graph. It isn't a new parameter.
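The estimation formula quoted above can be expressed as a small helper. The function name and example values are illustrative; the formula itself is the one stated in the docs excerpt (1.1 * (dimension + 8 * M) bytes per vector, with M the maximum number of connections).

```python
def hnsw_sq_memory_bytes(num_vectors: int, dimension: int, m: int) -> float:
    """Estimate memory for scalar-quantized HNSW vectors.

    Applies the formula from the documentation under review:
    1.1 * (dimension + 8 * M) bytes per vector, where M is the maximum
    number of connections per node in the HNSW graph.
    """
    return 1.1 * (dimension + 8 * m) * num_vectors

# Example: 1 million 256-dimensional vectors with M = 16
est = hnsw_sq_memory_bytes(1_000_000, 256, 16)  # roughly 0.42 GB
```

This is only an estimate of vector storage for the graph; actual index size also depends on other index structures.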

Signed-off-by: Fanit Kolchina <[email protected]>
Collaborator

@natebower natebower left a comment

@kolchfa-aws @naveentatikonda Please see my comments and changes and let me know if you have any questions. Thanks!


Optionally, you can specify the `confidence_interval` parameter in the `method.parameters.encoder` object.
The `confidence_interval` is used to compute the minimum and maximum quantiles in order to quantize the vectors:
- If you set the `confidence_interval` to a value in the `0.9` to `1.0` range, inclusive, then the quantiles are calculated statically. For example, setting the `confidence_interval` to `0.9` specifies that the minimum and maximum quantiles are computed based on the middle 90% of the vector values, excluding the minimum 5% and maximum 5% of the values.
Collaborator

"middle" => "median"?, "minimum" => "lowest"?, "maximum" => "highest"?

Collaborator

Definitely "middle". I would keep as is. @naveentatikonda WDYT?

Collaborator

This is definitely not my area of expertise, so that's fine with me :-)

Member Author

Yes, we should use "middle".

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Collaborator

@kolchfa-aws kolchfa-aws left a comment

Thank you, @naveentatikonda!

@kolchfa-aws kolchfa-aws merged commit 79a422b into opensearch-project:main Jul 26, 2024
9 checks passed
@hdhalter hdhalter added 3 - Done Issue is done/complete and removed 4 - Doc review PR: Doc review in progress labels Jul 30, 2024
Successfully merging this pull request may close these issues.

[DOC] Lucene Inbuilt Byte Quantization
5 participants