add tutorial for semantic search with byte quantized vector and Cohere embedding model #2127
…e embedding model Signed-off-by: Yaliang Wu <[email protected]>
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

| Coverage Diff | main | #2127 | +/- |
|---|---|---|---|
| Coverage | 81.86% | 81.86% | -0.01% |
| Complexity | 5644 | 5644 | |
| Files | 543 | 543 | |
| Lines | 22790 | 22800 | +10 |
| Branches | 2333 | 2333 | |
| Hits | 18658 | 18665 | +7 |
| Misses | 3195 | 3198 | +3 |
| Partials | 937 | 937 | |

Flags with carried forward coverage won't be shown.
Some minor comments.
docs/tutorials/semantic_search/conversation_search_with_byte_quantized_vector.md
Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Yaliang Wu <[email protected]>
The Cohere Embed v3 model supports several `embedding_types`. This tutorial uses the `int8` type for byte-quantized vectors.

Note: Replace the placeholders that start with `your_` with your own values.
I suggest emphasizing the `your_xxx` placeholders in the code samples, e.g., in italics or boldface.
Doing this would pollute the REST API sample requests. I think we are fine leaving them as is.
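For context on the `embedding_types` setting in the diff line quoted above, here is a minimal Python sketch of the kind of request body that selects `int8` embeddings. It assumes Cohere's embed request fields (`texts`, `model`, `input_type`, `embedding_types`). The helper name `build_embed_request`, the model name `embed-english-v3.0`, the list of accepted types, and the sample text are illustrative assumptions and are not taken from the tutorial under review.

```python
import json

# Assumption: the embedding types accepted by Cohere Embed v3.
# Check the Cohere API reference before relying on this list.
ASSUMED_EMBEDDING_TYPES = ("float", "int8", "uint8", "binary", "ubinary")


def build_embed_request(texts, embedding_type="int8",
                        model="embed-english-v3.0",
                        input_type="search_document"):
    """Build a request body asking for one embedding type (hypothetical helper)."""
    if embedding_type not in ASSUMED_EMBEDDING_TYPES:
        raise ValueError(f"unsupported embedding type: {embedding_type!r}")
    return {
        "texts": list(texts),
        "model": model,
        "input_type": input_type,
        # A list, because the API can return several types in one call.
        "embedding_types": [embedding_type],
    }


if __name__ == "__main__":
    # Prints the JSON body only; no network request is made.
    print(json.dumps(build_embed_request(["hello world"]), indent=2))
```

The sketch only builds and prints the body, so it can be run without an API key.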
]
},
{
"name": "sentence_embedding",
I think it's worth mentioning in the explanation that even though `inference_results.output.data_type` says `FLOAT32`, it's not representative of the embeddings defined in the connector (`int8`).
Makes sense.
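The point raised in this thread can be checked mechanically. The sketch below assumes an output entry shaped like the fragment quoted above, with `name`, `data_type`, and `data` fields. The sample values and the helper `is_byte_vector` are hypothetical. It ignores the reported `data_type` and tests whether every value is a whole number in the signed-byte range.

```python
def is_byte_vector(values):
    """Return True if every value is a whole number in [-128, 127]."""
    return all(
        float(v).is_integer() and -128 <= v <= 127
        for v in values
    )


# Hypothetical output entry: the label says FLOAT32, but the values
# are what an int8 embedding would contain.
sample_output = {
    "name": "sentence_embedding",
    "data_type": "FLOAT32",
    "data": [-26.0, 31.0, 127.0, -128.0],
}

if __name__ == "__main__":
    print(sample_output["data_type"], is_byte_vector(sample_output["data"]))
```

A check like this inspects the values themselves, so it stays valid regardless of how the response labels the data type.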
Signed-off-by: Yaliang Wu <[email protected]>
…e embedding model (#2127) * add tutorial for semantic search with byte quantized vector and Cohere embedding model Signed-off-by: Yaliang Wu <[email protected]> * Apply suggestions from code review Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Yaliang Wu <[email protected]> * address comments Signed-off-by: Yaliang Wu <[email protected]> --------- Signed-off-by: Yaliang Wu <[email protected]> Co-authored-by: kolchfa-aws <[email protected]> (cherry picked from commit 7b60989)
…e embedding model (#2127) (#2149) * add tutorial for semantic search with byte quantized vector and Cohere embedding model Signed-off-by: Yaliang Wu <[email protected]> * Apply suggestions from code review Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Yaliang Wu <[email protected]> * address comments Signed-off-by: Yaliang Wu <[email protected]> --------- Signed-off-by: Yaliang Wu <[email protected]> Co-authored-by: kolchfa-aws <[email protected]> (cherry picked from commit 7b60989) Co-authored-by: Yaliang Wu <[email protected]>
…e embedding model (opensearch-project#2127) * add tutorial for semantic search with byte quantized vector and Cohere embedding model Signed-off-by: Yaliang Wu <[email protected]> * Apply suggestions from code review Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Yaliang Wu <[email protected]> * address comments Signed-off-by: Yaliang Wu <[email protected]> --------- Signed-off-by: Yaliang Wu <[email protected]> Co-authored-by: kolchfa-aws <[email protected]>
…e embedding model
Description
[Describe what this change achieves]
Issues Resolved
[List any issues this PR will resolve]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.