[RFC] Integrating KNNVectorsFormat in Native Vector Search Engine #1853

Closed · navneet1v opened this issue Jul 18, 2024 · 5 comments · Fixed by #1897 or #1952

navneet1v commented Jul 18, 2024

Introduction

This issue provides a detailed design for integrating KNNVectorsFormat in the native vector search engines (like NMSLIB and Faiss). Beyond moving to the new vector format, the document goes a step further to improve the exact-search user experience as well. It closes with an implementation plan that lays out an iterative way to deliver the changes.

Background

The KNNPlugin (aka Vector Engine) was added to OpenSearch back in 2019, when Lucene did not yet support any native vector format. To work around this, the decision was taken to represent vectors as binary doc values and to override the BinaryDocValuesFormat to store vectors and build the vector data structures. Lucene 9.0 added a new format optimized for vectors. Since then the format has evolved and been optimized with features like iterative graph builds, built-in scalar quantization, and optimized support for reading vectors from disk.

Earlier Investigation

In September 2023, we did an investigation (thanks to @heemin32, who did the deep-dive) into what it would take to move from BinaryDocValuesFormat to KNNVectorsFormat. Summarizing the main concerns from that earlier investigation (ref: the Cons section of this):

  1. Strong dependency on Lucene for the native engine implementation.
  2. Increased disk space, since vectors would be stored both as doc values and in KNNVectorsFormat.
  3. Increased indexing latency, because vectors would be added to both doc values and KNNVectorsFormat.
  4. Increased code size: we used the doc values codec to do everything except indexing and searching. After the migration we would need our own codec implementation, which increases the lines of code. However, as the code evolves and we keep feature parity between the native engines and the Lucene engine, the initial increase in code size will pay off.
  5. Migration effort: the effort needed for this migration is not small, as we would need to write our own codec from scratch.
  6. Not enough features that we benefit from out of the box: graph build on the fly during indexing won't happen for native engines even after the migration. The native engines must first support adding nodes to an existing graph, and we would then need to call that method from addField.
  7. With iterative graph builds in place we can also add more validation to throttle requests during indexing if memory goes above the circuit breaker (CB) limit. Thanks to @jmazanec15 for suggesting this.

Benefits of Moving to KNNVectorsFormat

Below are the top benefits of moving to KNNVectorsFormat:

  1. Based on the investigation done, storing vectors in KNNVectorsFormat gives 3.3x better performance at p99 (416 ms for BinaryDocValues vs. 124 ms for KNNFloatVectorsFormat on a 1M 768D dataset with SIMD) when reading the flat vectors from disk. The main reason is that with BinaryDocValues we first read all the bytes and then convert them to floats, and this conversion takes time; with KNNVectorsFormat we can map the off-heap bytes directly as a float array on heap (a small read-path sketch follows this list). This improvement leads to further benefits:
    1. With the improvement in deserialization of vectors we have seen a 5-10% improvement in graph build time.
    2. Faster deserialization will also improve efficient filters when they switch to the exact-search case. Refer to this GitHub issue, which provides details of the time taken in deserialization.
    3. For memory-optimized vector search, where we need to read full-precision vectors from disk, KNNVectorsFormat will keep search latency within the desired SLAs.
  2. With KNNVectorsFormat we can enable building the native indices during in-memory segment creation rather than on refresh (aka Lucene flush). This will provide predictable CPU utilization rather than the intermittent spikes (similar to a sawtooth graph) caused by native index builds kicking off during refreshes.
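To make the read-path difference concrete, below is a minimal sketch (assuming Lucene 9.x-style iteration APIs and a simple big-endian float encoding for the legacy doc values; both are illustrative assumptions, not the plugin's exact code): the BinaryDocValues path decodes a byte buffer into a float[] on every access, while the flat-vector reader behind a KnnVectorsFormat hands the float[] back directly.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.FloatVectorValues;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRef;

public final class VectorReadPaths {

    /** Legacy path: every read pays a byte[] -> float[] decode (encoding assumed for illustration). */
    static float[] readFromBinaryDocValues(BinaryDocValues values, int dimension) throws IOException {
        BytesRef ref = values.binaryValue();
        ByteBuffer buffer = ByteBuffer.wrap(ref.bytes, ref.offset, ref.length).order(ByteOrder.BIG_ENDIAN);
        float[] vector = new float[dimension];
        for (int i = 0; i < dimension; i++) {
            vector[i] = buffer.getFloat();
        }
        return vector;
    }

    /** New path: the flat vectors reader returns the float[] directly, no per-read decode. */
    static float[] readFromVectorValues(FloatVectorValues values) throws IOException {
        return values.vectorValue();
    }

    /** Iterating over all vectors of a segment with the new format (Lucene 9.x iteration style). */
    static void consumeAll(FloatVectorValues values) throws IOException {
        for (int doc = values.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = values.nextDoc()) {
            float[] vector = values.vectorValue();
            // hand the vector to the native index builder, a scorer, etc.
        }
    }
}
```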

What about earlier concerns?

  1. Since we are migrating only the codec's vector format and not the query and field interfaces, we are not creating a strong dependency (refer to the HLD/LLD sections to know more).
  2. With this change we will be turning off doc values. We have already added support in script queries to read vectors from VectorValues when present; for native engines VectorValues will also be present, so we will use that. Hence no additional disk space.
  3. There will be no increase in latency, as we are only adding vectors to VectorValues.
  4. There will be an increase in code size, as we will be implementing a new vector format. Compared to the older investigation, though, the code being added is much smaller because we are only adding the new format.
  5. The integration effort comes as part of the Memory Optimized Vector Search feature, hence it is not a separate effort.
  6. From a features standpoint, KNNVectorsFormat reduces deserialization latency, which is a must to ensure we can rescore more vectors and get better recall with memory-optimized vector search. For on-the-fly graph builds, Faiss already supports the feature.

Solution

High Level Design

Indexing Flow

Below is the high-level indexing flow. In KNNVectorFieldMapper the k-NN plugin will decide (refer to later sections on how we take this decision) which vector field to add to the Lucene document. This field is then used to decide which vector format to use: if we go with the k-NN plugin's VectorField we will use BinaryDocValues, and if we go with Lucene's FloatVectorField/ByteVectorField for native engines we will use NativeEngineKNNVectorsFormat.

[Diagram: KNNVectorsFormat vector formats indexing flow]
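As a rough illustration of the field-mapper decision above, here is a hedged sketch (class and parameter names are mine, not the plugin's) of adding Lucene's knn float vector field when the new format applies, versus falling back to the legacy plugin field:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.KnnFloatVectorField;
import org.apache.lucene.index.VectorSimilarityFunction;

final class VectorFieldSelection {

    /**
     * Sketch only: when the new format applies, add Lucene's knn float vector field so that
     * IndexWriter routes the field to NativeEngineKNNVectorsFormat; otherwise keep the
     * legacy BinaryDocValues-backed field of the k-NN plugin (omitted here).
     */
    static void addVectorField(Document doc, String fieldName, float[] vector, boolean useNewFormat) {
        if (useNewFormat) {
            doc.add(new KnnFloatVectorField(fieldName, vector, VectorSimilarityFunction.EUCLIDEAN));
        } else {
            // legacy path: plugin-specific field backed by BinaryDocValues
        }
    }
}
```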

Components Definition:

  1. NativeIndexBuilderComponent: a completely new component responsible for taking a field (LuceneVectorField/KNNVectorField) along with other segment information and creating the index using the JNI layer (see the sketch after this list). Currently most of this logic lives in the KNN80DocValuesConsumer component, but it is tied to BinaryDocValues. Having this abstraction ensures we can decouple the logic from the codec components. It also ensures that once we make the JNI layer extensible to add external engines, no codec changes and only minimal changes to NativeIndexBuilderComponent should be required.
  2. NativeEngineKNNVectorsFormat: a per-field vectors format component that provides readers and writers for native-engine vector fields. In the initial phases this component will not provide search functionality, to ensure a smoother migration from KNN80DocValuesFormat, but that is something we can explore in the future.
  3. KNNVectorFieldMapper: the k-NN vector field mapper that currently holds the logic for parsing the vector field and creating the right fields for different engines. This component will hold the key logic of adding the right field types so that IndexWriter can invoke the right codec later on. It is also the key to keeping this change backward compatible; read the BWC section to understand how we will ensure backward compatibility.
  4. Conditions for choosing codec and vector field: when the field mapper executes, we will use the index-created version attribute to decide which vector field to use (refer to the BWC section for more details). This logic is responsible for keeping the Vector Engine backward compatible.
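A hedged sketch of the NativeIndexBuilderComponent contract referenced in item 1 above (the interface and method names are illustrative, not the final API):

```java
import java.io.IOException;

import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.FloatVectorValues;
import org.apache.lucene.index.SegmentWriteState;

/**
 * Illustrative sketch: the builder takes a field's vector values plus segment information
 * and delegates native index creation to the JNI layer, independent of which codec
 * component stored the vectors.
 */
interface NativeIndexBuilder {

    /** Build the native (faiss/nmslib) index for one field of one segment via the JNI layer. */
    void buildAndWriteIndex(FieldInfo fieldInfo, FloatVectorValues vectorValues, SegmentWriteState segmentWriteState)
        throws IOException;
}
```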

Search Flow

There will be no major changes in the search flow for either exact search or approximate nearest neighbor search. The only anticipated change is to efficient filters: when an efficient filter switches to exact search, we need to switch from BinaryDocValues to KNNVectorValues based on which values are available for the field. Refer to the next sections to understand how this will be done seamlessly.

Pros

  1. The benefits section above lists the pros of this approach, chiefly the reduction in deserialization time.
  2. Better performance for exact search (on new indices only), since exact search will also use the new KNNVectorValues.

Cons

  1. Even after migrating to the new format, we will have to maintain the old KNN80DocValuesFormat until the next major release of OpenSearch.

Alternatives

Alternative 1: Improve current KNNDocValuesFormat to bridge the feature gap

Improving the KNNDocValuesFormat is another option: invest in the doc values format so that it supports iterative graph builds and reads float values efficiently. I did a deep-dive on both and found that for iterative graph builds there is no support in Lucene for doc values; that support exists only for the vectors format. On reading floats efficiently, I looked into Lucene's MemorySegment-based index input API and the doc values readers, and all of the relevant classes are marked either package-private or final.

User Experience (No Change in ANN Search, improvements for Exact Search)

The user experience for creating an index and doing approximate nearest neighbor search remains unchanged. But to use the full potential of KNNVectorValues for other use cases, the changes below are proposed. With these changes we will also be able to resolve these (ref1, ref2) enhancements.

Exact Search and Training Index Creation

For training indices and exact-search indices, the main thing we used to do was set index.knn to false. Instead of the KNNCodec, the index would then use the default codec, and because the default codec does not override the DocValuesFormat, no graphs were created. Looking closely, we can achieve the same behavior through another parameter already present on every field in OpenSearch: index: true/false. KNNVectorFieldMapper currently does not take advantage of this parameter, but we can start using it to set a new attribute on the field and later read that attribute in the codec to decide whether to create k-NN data structures (a sketch of this decision follows the examples below).

Old

PUT my-knn-index-1
{
  "settings": {
    "index": {
      "knn": false, // default value is false
    }
  },
  "mappings": {
    "properties": {
      "my_vector1": {
        "type": "knn_vector",
        "dimension": 2
      },
      "my_vector2": {
        "type": "knn_vector",
        "dimension": 4
      }
    }
  }
}

New proposed exact-search-optimized interface (the old interface will still be supported). With this new interface, the field-level index parameter controls whether k-NN data structures are built:

PUT my-knn-index-1
{
  "settings": {
    "index": {
      "knn": true, // default value is false
    }
  },
  "mappings": {
    "properties": {
      "my_vector1": {
        "type": "knn_vector",
        "dimension": 2,
        "index": "false" // default for this is true
      },
      "my_vector2": {
        "type": "knn_vector",
        "dimension": 4,
        "index": "false" // default for this is true
      }
    }
  }
}
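The sketch below illustrates, under assumed attribute and class names (both hypothetical), how the field-level index parameter could be translated into a Lucene field attribute by the mapper and read back in the codec to decide whether to build k-NN data structures:

```java
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.FieldInfo;

final class KnnIndexingDecision {

    // Hypothetical attribute key; the real key chosen during implementation may differ.
    static final String BUILD_VECTOR_DATA_STRUCTURE = "knn.build_vector_data_structure";

    /** Field mapper side: record whether k-NN data structures should be built for this field. */
    static void markFieldType(FieldType fieldType, boolean indexEnabledInMapping) {
        fieldType.putAttribute(BUILD_VECTOR_DATA_STRUCTURE, Boolean.toString(indexEnabledInMapping));
    }

    /** Codec side: skip native index creation when the field is meant for exact search only. */
    static boolean shouldBuildNativeIndex(FieldInfo fieldInfo) {
        return Boolean.parseBoolean(
            fieldInfo.attributes().getOrDefault(BUILD_VECTOR_DATA_STRUCTURE, "true"));
    }
}
```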
Exact Search Query Experience

Old

POST my-knn-index-1/_search
{
 "size": 4,
 "query": {
   "script_score": {
     "query": {
       "match_all": {}
     },
     "script": {
       "source": "knn_score",
       "lang": "knn",
       "params": {
         "field": "my_vector2",
         "query_value": [2.0, 3.0, 5.0, 6.0],
         "space_type": "cosinesimil"
       }
     }
   }
 }
}

New Experience
The new experience is similar to the ANN search experience. The difference is that if the customer has specified index: false in the field mapping, the Vector Engine will be intelligent enough to switch to exact-search behavior.

POST my-knn-index-1/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 2
      }
    }
  }
}

Low Level Design

The major Low Level changes are explained below.

New KNNVectorsFormat for Native Engines (aka NativeEngineKNNVectorsFormat)

To use KNNVectorsFormat we will add a new vectors format specifically for the native engines (nmslib and faiss), named NativeEngineKNNVectorsFormat. This format will be used for writing (via NativeEngineKNNVectorsFormatWriter) and reading (via NativeEngineKNNVectorsFormatReader) vector fields when native engines are used. Refer to the class diagram below for more detail and the working POC here.

[Diagram: NativeEngineKNNVectorsFormat codec low-level design]
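For orientation, here is a hedged sketch of what such a per-field format could look like (class name and structure are illustrative; this sketch only delegates flat storage to a Lucene-provided KnnVectorsFormat, whereas the real writer would additionally trigger the native index build via the NativeIndexBuilderComponent):

```java
import java.io.IOException;

import org.apache.lucene.codecs.KnnVectorsFormat;
import org.apache.lucene.codecs.KnnVectorsReader;
import org.apache.lucene.codecs.KnnVectorsWriter;
import org.apache.lucene.index.SegmentReadState;
import org.apache.lucene.index.SegmentWriteState;

/** Illustrative sketch, not the plugin's actual class. */
public class NativeEngineKNNVectorsFormatSketch extends KnnVectorsFormat {

    private final KnnVectorsFormat flatDelegate;

    public NativeEngineKNNVectorsFormatSketch(KnnVectorsFormat flatDelegate) {
        super("NativeEngineKNNVectorsFormatSketch");
        this.flatDelegate = flatDelegate;
    }

    @Override
    public KnnVectorsWriter fieldsWriter(SegmentWriteState state) throws IOException {
        // Sketch: delegate only. A real implementation would wrap this writer so the
        // native (faiss/nmslib) index is built from the same flat vectors on flush/merge.
        return flatDelegate.fieldsWriter(state);
    }

    @Override
    public KnnVectorsReader fieldsReader(SegmentReadState state) throws IOException {
        return flatDelegate.fieldsReader(state);
    }
}
```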

Common Interface for interacting with StoredVectors (aka BinaryDocValues, FloatVectorValues and ByteVectorValues)

A new KNNVectorValues interface will be added as an abstraction layer on top of FloatVectorValues, ByteVectorValues, and BinaryDocValues. KNNVectorValues can then be used in different places, such as the codec and the query (in filters), to iterate over vectors from segments and segment readers. A working POC can be found here.

[Diagram: low-level design of the native index creation layer]
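A hedged sketch of the KNNVectorValues abstraction described above (method names are illustrative and may differ from the final interface):

```java
import java.io.IOException;

/**
 * Illustrative sketch: one iterator-style view over FloatVectorValues, ByteVectorValues
 * and legacy BinaryDocValues, so codec and query code can walk a segment's vectors
 * without caring how they were stored.
 */
interface KNNVectorValues<T> {

    /** Advances to the next document that has a vector and returns its doc id. */
    int nextDoc() throws IOException;

    /** Vector for the current document, e.g. a float[] or byte[]. */
    T getVector() throws IOException;

    /** Number of dimensions of the vectors for this field. */
    int dimension();

    /** Total number of live vectors for this field in the segment. */
    int totalLiveDocs();
}
```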

Backward Compatibility

To maintain backward compatibility, the new KNNVectorsFormat will be enabled only for indices created on or after a specific version of OpenSearch, in this case 2.17 (since we are targeting that release for this feature). Every index in OpenSearch carries the version of OpenSearch it was created with, and we will leverage that attribute here. We already used the same attribute when we changed the default hyperparameter values of the HNSW algorithm, hence we have high confidence that this will work.
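A minimal sketch of that version gate, assuming a 2.17.0 cut-over and illustrative class/method names:

```java
import org.opensearch.Version;

/** Illustrative sketch of the backward-compatibility gate described above. */
final class NativeEngineFormatGate {

    // Assumed cut-over version, per the 2.17 target mentioned in this RFC.
    private static final Version CUT_OVER_VERSION = Version.fromString("2.17.0");

    /** True when the index was created on or after 2.17 and can use NativeEngineKNNVectorsFormat. */
    static boolean useNativeEngineKNNVectorsFormat(Version indexCreatedVersion) {
        return indexCreatedVersion.onOrAfter(CUT_OVER_VERSION);
    }
}
```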

Feasibility Study

I did a small POC with KNNVectorValues and ran all the BWC tests; I saw no failures. Here is the POC1 code for that. The benchmarks below were performed with this POC code, so we can confirm that the new format works, is backward compatible, and is performant.

Benchmarking

We will use our nightly runs to benchmark the performance of this change. No special benchmarking is required beyond a sanity test with a 1M 768D dataset on a configuration similar to that of the nightly runs.

Testing Strategy

Backward Compatibility Testing Plan

We will use the BWC rolling-upgrade and restart-upgrade tests to validate backward compatibility for this change. No other separate changes are required, as these cover both the indexing and search cases.

Integration Testing Plan

  1. Current tests should be good enough to cover the ANN search flow.
  2. For the training-index and exact-search cases, we will add more integration tests using the index: false attribute at the mapping level for the field to ensure good coverage.

Future Improvements/Ideas

Below are some future improvements that I think could be added after this implementation:

  1. With KNNVectorsFormat implemented and FlatVectorsFormat providing support for efficient retrieval of vectors, we can make better decisions such as not building vector data structures for every segment and skipping data-structure creation for smaller segments. This will eliminate a lot of the wasted compute we spend building k-NN data structures only to throw them away during merges. At query time, for these small segments we can use the optimized format to read the vectors and do vector search.
  2. We can look into passing this flat vectors format as an IndexType to Faiss so that the Faiss engine does not generate its own FlatIndex. GitHub issue: [ENHANCEMENT] Partial vector search datastrcuture loading in memory for doing search #1693
  3. We can improve the flat vectors format to store vectors by HNSW layer rather than in increasing doc-id order. The idea mainly comes from the observation that most of the final top-k results come from the bottom layer of HNSW; if we can memory-map that part of the file, we can reduce page-cache thrashing. This idea is at a very early stage and comes with the limitation of how to iterate over VectorValues, but we could build another map that stores the file-pointer offset where each vector is present and read it upfront.
  4. Once we move to the new format we should also look at migrating the k-NN query to use the codec search interface rather than doing everything in the KNNWeight class. Thanks to @jmazanec15 for suggesting this.

FAQ

What is BinaryDocValuesFormat?

This format defines how to read and write a field whose doc values are stored in binary format. Before this change, the k-NN plugin used BinaryDocValuesFormat to index vectors.

What is KNNVectorsFormat?

This is a format introduced in Lucene 9.0 that is tailor-made for indexing and retrieving dense vectors in Lucene.

Appendix

Appendix A

Benchmarks sift-128

Updated Code
Metric Task Value Unit
Cumulative indexing time of primary shards 0.456983 min
Min cumulative indexing time across primary shards 0.000116667 min
Median cumulative indexing time across primary shards 0.228492 min
Max cumulative indexing time across primary shards 0.456867 min
Cumulative indexing throttle time of primary shards 0 min
Min cumulative indexing throttle time across primary shards 0 min
Median cumulative indexing throttle time across primary shards 0 min
Max cumulative indexing throttle time across primary shards 0 min
Cumulative merge time of primary shards 8.14777 min
Cumulative merge count of primary shards 1
Min cumulative merge time across primary shards 0 min
Median cumulative merge time across primary shards 4.07388 min
Max cumulative merge time across primary shards 8.14777 min
Cumulative merge throttle time of primary shards 0 min
Min cumulative merge throttle time across primary shards 0 min
Median cumulative merge throttle time across primary shards 0 min
Max cumulative merge throttle time across primary shards 0 min
Cumulative refresh time of primary shards 6.22283 min
Cumulative refresh count of primary shards 23
Min cumulative refresh time across primary shards 0.00025 min
Median cumulative refresh time across primary shards 3.11142 min
Max cumulative refresh time across primary shards 6.22258 min
Cumulative flush time of primary shards 6.33868 min
Cumulative flush count of primary shards 2
Min cumulative flush time across primary shards 0 min
Median cumulative flush time across primary shards 3.16934 min
Max cumulative flush time across primary shards 6.33868 min
Total Young Gen GC time 0.273 s
Total Young Gen GC count 14
Total Old Gen GC time 0 s
Total Old Gen GC count 0
Store size 1.42833 GB
Translog size 5.79283e-07 GB
Heap used for segments 0 MB
Heap used for doc values 0 MB
Heap used for terms 0 MB
Heap used for norms 0 MB
Heap used for points 0 MB
Heap used for stored fields 0 MB
Segment count 2
Min Throughput custom-vector-bulk 2777.84 docs/s
Mean Throughput custom-vector-bulk 7390.23 docs/s
Median Throughput custom-vector-bulk 7623.53 docs/s
Max Throughput custom-vector-bulk 7689.5 docs/s
50th percentile latency custom-vector-bulk 8.24442 ms
90th percentile latency custom-vector-bulk 9.18986 ms
99th percentile latency custom-vector-bulk 17.153 ms
99.9th percentile latency custom-vector-bulk 33.3493 ms
99.99th percentile latency custom-vector-bulk 152.351 ms
100th percentile latency custom-vector-bulk 182.73 ms
50th percentile service time custom-vector-bulk 8.24371 ms
90th percentile service time custom-vector-bulk 9.18908 ms
99th percentile service time custom-vector-bulk 17.1645 ms
99.9th percentile service time custom-vector-bulk 33.3493 ms
99.99th percentile service time custom-vector-bulk 152.351 ms
100th percentile service time custom-vector-bulk 182.73 ms
error rate custom-vector-bulk 0 %
Min Throughput force-merge-segments 0 ops/s
Mean Throughput force-merge-segments 0 ops/s
Median Throughput force-merge-segments 0 ops/s
Max Throughput force-merge-segments 0 ops/s
100th percentile latency force-merge-segments 490436 ms
100th percentile service time force-merge-segments 490436 ms
error rate force-merge-segments 0 %
----------------------------------
[INFO] SUCCESS (took 1021 seconds)
----------------------------------
Baseline
Metric Task Value Unit
Cumulative indexing time of primary shards 0.453533 min
Min cumulative indexing time across primary shards 0.000133333 min
Median cumulative indexing time across primary shards 0.226767 min
Max cumulative indexing time across primary shards 0.4534 min
Cumulative indexing throttle time of primary shards 0 min
Min cumulative indexing throttle time across primary shards 0 min
Median cumulative indexing throttle time across primary shards 0 min
Max cumulative indexing throttle time across primary shards 0 min
Cumulative merge time of primary shards 8.21317 min
Cumulative merge count of primary shards 1
Min cumulative merge time across primary shards 0 min
Median cumulative merge time across primary shards 4.10658 min
Max cumulative merge time across primary shards 8.21317 min
Cumulative merge throttle time of primary shards 0 min
Min cumulative merge throttle time across primary shards 0 min
Median cumulative merge throttle time across primary shards 0 min
Max cumulative merge throttle time across primary shards 0 min
Cumulative refresh time of primary shards 5.98175 min
Cumulative refresh count of primary shards 24
Min cumulative refresh time across primary shards 0.000266667 min
Median cumulative refresh time across primary shards 2.99087 min
Max cumulative refresh time across primary shards 5.98148 min
Cumulative flush time of primary shards 6.07785 min
Cumulative flush count of primary shards 2
Min cumulative flush time across primary shards 0 min
Median cumulative flush time across primary shards 3.03892 min
Max cumulative flush time across primary shards 6.07785 min
Total Young Gen GC time 0.229 s
Total Young Gen GC count 13
Total Old Gen GC time 0 s
Total Old Gen GC count 0
Store size 1.42834 GB
Translog size 5.79283e-07 GB
Heap used for segments 0 MB
Heap used for doc values 0 MB
Heap used for terms 0 MB
Heap used for norms 0 MB
Heap used for points 0 MB
Heap used for stored fields 0 MB
Segment count 2
Min Throughput custom-vector-bulk 3096.56 docs/s
Mean Throughput custom-vector-bulk 7415.23 docs/s
Median Throughput custom-vector-bulk 7628.75 docs/s
Max Throughput custom-vector-bulk 7717.94 docs/s
50th percentile latency custom-vector-bulk 8.33815 ms
90th percentile latency custom-vector-bulk 9.05288 ms
99th percentile latency custom-vector-bulk 15.8545 ms
99.9th percentile latency custom-vector-bulk 31.2782 ms
99.99th percentile latency custom-vector-bulk 103.492 ms
100th percentile latency custom-vector-bulk 115.203 ms
50th percentile service time custom-vector-bulk 8.3382 ms
90th percentile service time custom-vector-bulk 9.05263 ms
99th percentile service time custom-vector-bulk 15.8392 ms
99.9th percentile service time custom-vector-bulk 31.2782 ms
99.99th percentile service time custom-vector-bulk 103.492 ms
100th percentile service time custom-vector-bulk 115.203 ms
error rate custom-vector-bulk 0 %
Min Throughput force-merge-segments 0 ops/s
Mean Throughput force-merge-segments 0 ops/s
Median Throughput force-merge-segments 0 ops/s
Max Throughput force-merge-segments 0 ops/s
100th percentile latency force-merge-segments 500428 ms
100th percentile service time force-merge-segments 500428 ms
error rate force-merge-segments 0 %
----------------------------------
[INFO] SUCCESS (took 1017 seconds)
----------------------------------

Benchmarks cohere-768

Updated Code
Metric Task Value Unit
Cumulative indexing time of primary shards 20.1748 min
Min cumulative indexing time across primary shards 8.33333e-05 min
Median cumulative indexing time across primary shards 10.0874 min
Max cumulative indexing time across primary shards 20.1748 min
Cumulative indexing throttle time of primary shards 0 min
Min cumulative indexing throttle time across primary shards 0 min
Median cumulative indexing throttle time across primary shards 0 min
Max cumulative indexing throttle time across primary shards 0 min
Cumulative merge time of primary shards 56.7627 min
Cumulative merge count of primary shards 44
Min cumulative merge time across primary shards 0 min
Median cumulative merge time across primary shards 28.3813 min
Max cumulative merge time across primary shards 56.7627 min
Cumulative merge throttle time of primary shards 0.546583 min
Min cumulative merge throttle time across primary shards 0 min
Median cumulative merge throttle time across primary shards 0.273292 min
Max cumulative merge throttle time across primary shards 0.546583 min
Cumulative refresh time of primary shards 1.68887 min
Cumulative refresh count of primary shards 51
Min cumulative refresh time across primary shards 0.000233333 min
Median cumulative refresh time across primary shards 0.844433 min
Max cumulative refresh time across primary shards 1.68863 min
Cumulative flush time of primary shards 4.91008 min
Cumulative flush count of primary shards 23
Min cumulative flush time across primary shards 0 min
Median cumulative flush time across primary shards 2.45504 min
Max cumulative flush time across primary shards 4.91008 min
Total Young Gen GC time 0.368 s
Total Young Gen GC count 18
Total Old Gen GC time 0 s
Total Old Gen GC count 0
Store size 17.0632 GB
Translog size 5.80214e-07 GB
Heap used for segments 0 MB
Heap used for doc values 0 MB
Heap used for terms 0 MB
Heap used for norms 0 MB
Heap used for points 0 MB
Heap used for stored fields 0 MB
Segment count 3
Min Throughput custom-vector-bulk 745.21 docs/s
Mean Throughput custom-vector-bulk 3764.5 docs/s
Median Throughput custom-vector-bulk 3721.14 docs/s
Max Throughput custom-vector-bulk 4675.33 docs/s
50th percentile latency custom-vector-bulk 139.393 ms
90th percentile latency custom-vector-bulk 289.658 ms
99th percentile latency custom-vector-bulk 1820.86 ms
99.9th percentile latency custom-vector-bulk 8638.68 ms
99.99th percentile latency custom-vector-bulk 12307.5 ms
100th percentile latency custom-vector-bulk 12567.2 ms
50th percentile service time custom-vector-bulk 139.419 ms
90th percentile service time custom-vector-bulk 289.821 ms
99th percentile service time custom-vector-bulk 1842.18 ms
99.9th percentile service time custom-vector-bulk 8638.68 ms
99.99th percentile service time custom-vector-bulk 12307.5 ms
100th percentile service time custom-vector-bulk 12567.2 ms
error rate custom-vector-bulk 0 %
Min Throughput force-merge-segments 0 ops/s
Mean Throughput force-merge-segments 0 ops/s
Median Throughput force-merge-segments 0 ops/s
Max Throughput force-merge-segments 0 ops/s
100th percentile latency force-merge-segments 1.87156e+06 ms
100th percentile service time force-merge-segments 1.87156e+06 ms
error rate force-merge-segments 0 %
Baseline code
Metric Task Value Unit
Cumulative indexing time of primary shards 21.6263 min
Min cumulative indexing time across primary shards 0.000166667 min
Median cumulative indexing time across primary shards 10.8132 min
Max cumulative indexing time across primary shards 21.6262 min
Cumulative indexing throttle time of primary shards 0 min
Min cumulative indexing throttle time across primary shards 0 min
Median cumulative indexing throttle time across primary shards 0 min
Max cumulative indexing throttle time across primary shards 0 min
Cumulative merge time of primary shards 59.011 min
Cumulative merge count of primary shards 47
Min cumulative merge time across primary shards 0 min
Median cumulative merge time across primary shards 29.5055 min
Max cumulative merge time across primary shards 59.011 min
Cumulative merge throttle time of primary shards 0.9051 min
Min cumulative merge throttle time across primary shards 0 min
Median cumulative merge throttle time across primary shards 0.45255 min
Max cumulative merge throttle time across primary shards 0.9051 min
Cumulative refresh time of primary shards 1.79087 min
Cumulative refresh count of primary shards 55
Min cumulative refresh time across primary shards 0.000383333 min
Median cumulative refresh time across primary shards 0.895433 min
Max cumulative refresh time across primary shards 1.79048 min
Cumulative flush time of primary shards 5.2001 min
Cumulative flush count of primary shards 25
Min cumulative flush time across primary shards 0.00025 min
Median cumulative flush time across primary shards 2.60005 min
Max cumulative flush time across primary shards 5.19985 min
Total Young Gen GC time 0.578 s
Total Young Gen GC count 17
Total Old Gen GC time 0 s
Total Old Gen GC count 0
Store size 17.0635 GB
Translog size 5.80214e-07 GB
Heap used for segments 0 MB
Heap used for doc values 0 MB
Heap used for terms 0 MB
Heap used for norms 0 MB
Heap used for points 0 MB
Heap used for stored fields 0 MB
Segment count 3
Min Throughput custom-vector-bulk 2992.71 docs/s
Mean Throughput custom-vector-bulk 3781.35 docs/s
Median Throughput custom-vector-bulk 3674.29 docs/s
Max Throughput custom-vector-bulk 4950.5 docs/s
50th percentile latency custom-vector-bulk 142.968 ms
90th percentile latency custom-vector-bulk 333.385 ms
99th percentile latency custom-vector-bulk 1863.63 ms
99.9th percentile latency custom-vector-bulk 7003.43 ms
99.99th percentile latency custom-vector-bulk 12999.2 ms
100th percentile latency custom-vector-bulk 13538.3 ms
50th percentile service time custom-vector-bulk 143.002 ms
90th percentile service time custom-vector-bulk 333.538 ms
99th percentile service time custom-vector-bulk 1863.2 ms
99.9th percentile service time custom-vector-bulk 7003.43 ms
99.99th percentile service time custom-vector-bulk 12999.2 ms
100th percentile service time custom-vector-bulk 13538.3 ms
error rate custom-vector-bulk 0 %
Min Throughput force-merge-segments 0 ops/s
Mean Throughput force-merge-segments 0 ops/s
Median Throughput force-merge-segments 0 ops/s
Max Throughput force-merge-segments 0 ops/s
100th percentile latency force-merge-segments 1.92148e+06 ms
100th percentile service time force-merge-segments 1.92148e+06 ms
error rate force-merge-segments 0 %
----------------------------------
[INFO] SUCCESS (took 2608 seconds)
----------------------------------

Reference

  1. Old Investigation and main issue: Investigate migrating custom codec from BinaryDocValuesFormat to KnnVectorsFormat #1087
navneet1v commented Jul 18, 2024

Adding tasks in this comment as the issue is already quite large.

Tasks

  • Add new KNNVectorFormat for native engines
  • Integrate the new KNNVectorsFormat with PerFieldVectorsFormat
  • Add new interface for iterating on Vector Values
  • Integrate the VectorFormat with KNNFieldMapper
  • Enable exact search experience for indexing [2.18]
  • Enable exact search experience for query. [2.18]
  • Write BWC tests

navneet1v commented

Re-opening the issue; it was resolved automatically as the linked PRs were merged.

navneet1v commented

Reopening this issue. Somehow it keeps on getting closed as the PRs are getting merged.

navneet1v commented

The exact search experience improvement will be taken up in the 2.18 release.

navneet1v commented

I am closing this issue. The exact search experience changes will be taken up in the 2.18/2.19 versions of OpenSearch.
