Add support for asymmetric embedding models #710
Conversation
@br3no can you add an entry to the changelog?
@br3no Thanks for raising the PR. I am wondering whether we need this change. In the MLCommons repository, a generic MLInference processor is being launched that is supposed to run inference for any kind of model, both during ingestion and search. RFC: opensearch-project/ml-commons#2173 That capability is being built right now. Do you think we still need this feature then?
@navneet1v I have been loosely following the discussions in the mentioned RFC. It's a large change that I don't expect to be stable soon; the PR is very much in flux. Also, I don't see the use-case of asymmetric embedding models being addressed. This PR here is much smaller in comparison and is not in any way in conflict with the RFC work. Once the work on the ML Inference Processors is finished and the use-case is addressed there as well, we can deprecate and eventually remove the functionality again. Until then, this PR offers users the chance to use more modern local embeddings. I'm eager to take this for a spin, tbh.
If that is the case, I would recommend posting the same on the RFC to ensure that your use case is handled. On the other hand, I do agree this is an interesting feature. I would like to get some eyes on this change, mainly on whether it should be added given that a more generic processor is around the corner. As far as my opinion is concerned, the main reason for the generic processor was to avoid creating new processors or updating existing ones to support new model types, which is what is happening in this PR. Thoughts? @jmazanec15, @martin-gaievski, @vamshin, @vibrantvarun. Let me add some PMs for opensearch-project too, to know their thoughts. @dylan-tong-aws
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #710 +/- ##
============================================
- Coverage     85.02%   84.41%   -0.61%
+ Complexity      790      785      -5
============================================
  Files            60       59      -1
  Lines          2430     2464     +34
  Branches        410      409      -1
============================================
+ Hits           2066     2080     +14
- Misses          202      215     +13
- Partials        162      169      +7
☔ View full report in Codecov by Sentry.
@navneet1v I have added a comment earlier today to the RFC (cf. opensearch-project/ml-commons#2173 (comment)). Sure, let's open the discussion and get some PMs into it. I really don't mind leaving this out if the support is introduced in another PR in 2.14. I'm concerned opensearch-project/ml-commons#2173 is a much larger effort, that won't be ready that quickly... It's not about my contribution – I need the feature. 🙃 |
I can see the feature is marked for the 2.14 release of OpenSearch. Let me add maintainers from the ML team too. @mingshl, @ylwu-amzn
@mingshl @ylwu-amzn, I'd really like to have this feature in 2.14. Do you think this use-case will be fully supported with opensearch-project/ml-commons#2173? Cf. opensearch-project/ml-commons#2173 (comment) If not, I'd be happy to help this PR get merged as an interim solution! Let me know what you think! |
@br3no The ML inference processor is initially targeting support for remote models only. How do you usually connect this model? Is it local or remote? If remote, can you please provide a SageMaker deployment code snippet so that I can quickly test it in a 2.14 test cluster? Thanks
@mingshl sorry for taking so long to answer! The use-case for now is to use a local, asymmetric model such as https://huggingface.co/intfloat/multilingual-e5-small.
This PR here is the last puzzle piece to allow one to use these kinds of model and should in principle also work with remote models. It makes sure that the neural-search plugin uses the correct inference parameters when embedding passages and queries with asymmetric models. Regardless of whether the model is local or remote, if you are using asymmetric models, you will need to provide this information anyway.
The thing is that asymmetric models need to know at inference time what exactly they are embedding. OpenSearch currently treats embedding models as symmetric, meaning that regardless of whether the text being embedded is a query or a passage, the embedding will always be the same. Asymmetric models require content "hints" to the text being embedded; the model exemplified above uses the string prefixes
In opensearch-project/ml-commons#1799 we have added the concept of asymmetric models into ml-commons, introducing the
I would really be happy to get this merged as an interim solution until the ml inference processor fully supports this use-case.
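To make the idea of content hints concrete, here is a minimal sketch in plain Java. It is not the ml-commons or neural-search implementation: `ContentKind` and `AsymmetricPrefixer` are hypothetical names introduced only for illustration, and the prefix strings are passed in by the caller because they depend on the model (E5-style models document prefixes along the lines of "query: " and "passage: "; check the model card for the model you actually use).

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical: the kind of text being embedded (query vs. passage). */
enum ContentKind { QUERY, PASSAGE }

/** Hypothetical helper that prepends the hint an asymmetric model expects. */
final class AsymmetricPrefixer {
    // Assumption: both prefixes come from the model's own configuration.
    private final String queryPrefix;
    private final String passagePrefix;

    AsymmetricPrefixer(String queryPrefix, String passagePrefix) {
        this.queryPrefix = queryPrefix;
        this.passagePrefix = passagePrefix;
    }

    /** Returns the text with the prefix matching its content kind. */
    String apply(ContentKind kind, String text) {
        String prefix = (kind == ContentKind.QUERY) ? queryPrefix : passagePrefix;
        return prefix + text;
    }

    /** Applies the same content kind to a batch of texts. */
    List<String> applyAll(ContentKind kind, List<String> texts) {
        return texts.stream().map(t -> apply(kind, t)).collect(Collectors.toList());
    }
}
```

With a symmetric model both prefixes would effectively be empty, so a query and a passage with identical text map to the same input; with an asymmetric model the two inputs differ, which is why the caller has to say which kind of text it is embedding.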
I also vote for this PR in need for this functionality. |
@br3no would it be possible for you to contribute local model support back to the MLInference processor? Is that even an option?
@navneet1v you mean making sure this works there as well? Sure, I can commit to that. I'd propose then to merge this PR now and then start the work to eventually replace this once the MLInference processor supports this use case... |
What I meant with this comment is that I don't see the need to implement a different BWC Test class for the change in this PR. The existing BWC tests (e.g. the
BTW, the rolling-upgrade BWC tests are failing on the main branch. I ran
without success. This makes it hard to add new tests.
@vibrantvarun The comment makes sense to me. Can you check whether BWC tests are needed and help him? |
Signed-off-by: br3no <[email protected]>
390d04c to 61347b0
@martin-gaievski I have addressed all your latest comments. Hope this PR can now be approved. |
It looks good to me, thank you
}));

Consumer<Boolean> predictConsumer = isAsymmetricModel -> {
    MLInput mlInput = createMLMultimodalInput(targetResponseFilters, inputObjects, isAsymmetricModel ? mlAlgoParams : null);
There may be a case in the future where we want to pass mlAlgoParams to a symmetric model. Shouldn't we check whether the model is asymmetric before constructing the request?
// Check here if model is symmetric or asymmetric
InferenceRequest.builder().modelId(this.modelId).inputTexts(inferenceList).mlAlgoParams(PASSAGE_PARAMETERS).build(),
Modifying the request internally will cause confusion later, when we need to pass mlAlgoParams to a symmetric model but they are silently omitted before the model is called.
@br3no Please take a look at this comment. Serde is the only risk I see between old nodes and new nodes, and I don't think HybridSearchIT covers the asymmetric case, as it only tests with a text embedding model (a symmetric model). Such models don't have the new configuration introduced for asymmetric models, so there is no serde issue there either. I would suggest you create an old cluster with two nodes (ml-node), replace one node with the latest code, and test two cases:
|
@zane-neo so if I get this right, I should create a new test that uses the asymmetric model feature. This test should only run for OS versions >= 2.19. Is this right? Your concern is about making sure future releases will not break compatibility with this feature. |
Ping. |
@br3no You are right about using the asymmetric model feature for OS >= 2.19, but the situation is a little different. You should test a mixed cluster with OS 2.19 and OS 2.18: since 2.18 has a different neural-search code base, a request sent to a 2.19 node and then dispatched to a 2.18 node could encounter serde issues. My guess is you can test this and, based on the result:
|
btw, if you're working on BWC for 2.19+ and want to check how the results are, better rebase on latest main, we've switched 2.18-snapshot to 2.18/2.19-snapshot. This branch will keep failing on running BWCs |
@br3no the CI is in good shape now, you can work on BWC tests. |
@br3no I tested this and didn't find serde issues for OS versions >= 2.13, so there's no concern here. You can work on creating BWC tests for OS versions >= 2.19, thanks.
Sorry folks, I didn't have time lately to invest here. I'll try to do this in the holiday season. |
Hi @br3no, @brianf-aws is planning on adding a tutorial for using an asymmetric local model with ML inference processors during ingest and search: opensearch-project/ml-commons#3258. You can watch for new updates and post comments on that PR.
@mingshl does this make this PR here obsolete? |
Hey Breno (@br3no), I did some research, and thanks to your contribution (Asymmetric Embedding support) within ML-Commons it's possible to get embeddings on demand using the ML Inference processor (MLInferenceSearchProcessor & MLInferenceIngestProcessor^). I didn't read fully through the comment thread, but I believe what you are going after can be done. Hopefully the tutorial being created (opensearch-project/ml-commons#3258) matches your use case. If not, can you let us know what might be missing so we can understand better? ^There is one caveat: a slight code change is required to allow Asymmetric Embeddings with the MLInferenceIngestProcessor, please see opensearch-project/ml-commons#3281.
Hi @br3no! Can you update the conflicting files? |
Description
This PR adds support for asymmetric embedding models such as https://huggingface.co/intfloat/multilingual-e5-small to the neural-search plugin.
It builds on the work done in opensearch-project/ml-commons#1799.
Asymmetric embedding models behave differently when embedding passages and queries. To that end, the model must "know" at inference time what kind of data it is embedding.
The changes are:
1. src/main/java/org/opensearch/neuralsearch/processor/TextEmbeddingProcessor.java
The processor signals it is embedding passages, by passing the new AsymmetricTextEmbeddingParameters using the content type EmbeddingContentType.PASSAGE.
2. src/main/java/org/opensearch/neuralsearch/query/NeuralQueryBuilder.java
Analogously, the query builder uses EmbeddingContentType.QUERY.
3. src/main/java/org/opensearch/neuralsearch/ml/MLCommonsClientAccessor.java
Here is where most of the work was done. The class has been extended in a backwards-compatible way with inference methods that allow one to pass MLAlgoParams objects. Usage of AsymmetricTextEmbeddingParameters (which implements MLAlgoParams) is mandatory for asymmetric models. At the same time symmetric models do not accept them.
The only way to know whether a model is asymmetric or symmetric is by reading its model configuration (if the models' configuration contains a passage_prefix and/or a query_prefix, they are asymmetric, otherwise they are symmetric).
The src/main/java/org/opensearch/neuralsearch/ml/MLCommonsClientAccessor.java class deals with this, keeping the complexity in one place and not requiring any API change to the neural-search plugin (as proposed in #620). When calling the inference methods, clients (such as the TextEmbeddingProcessor) may pass the AsymmetricTextEmbeddingParameters object without caring if the model they are using is symmetric or asymmetric. The accessor class will first read the model's configuration (by calling the getModel API of the mlClient) and deal appropriately.
To avoid adding this extra roundtrip to every inference call, the asymmetry information is kept in a cache in memory.
Issues Resolved
#620
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.