[FEATURE] Pulling fields content for neural search from user provided URL #475

Closed
martin-gaievski opened this issue Oct 25, 2023 · 2 comments
Labels
enhancement Features Introduces a new unit of functionality that satisfies a requirement

martin-gaievski commented Oct 25, 2023

Is your feature request related to a problem?

Today the text and image content for multimodal semantic search is provided directly in the request as a base64-encoded string. This requires additional pre-processing of the data and may not be optimal in some scenarios, for instance when the data is already available in some other system. Another problem is storage space: in OpenSearch this binary content will be stored as part of the index.

What solution would you like?

Allow URLs as field values, something along these lines:

  "query": {
    "neural": {
      "vector_embeddings": {
        "query": {
          "image": {
            "value": "http://myserver/image1.jpg",
            "type": "url"
          },

What alternatives have you considered?

An alternative is available today: the user downloads the content from the URL and base64-encodes it before sending the request.
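For reference, a minimal client-side sketch of that workaround. The helper names (`fetch_bytes`, `to_base64`, `build_neural_query`) are hypothetical, and the `query_image`, `model_id`, and `k` parameters reflect my understanding of the current neural query; the model ID is a placeholder, so treat the request shape as an assumption and check it against your plugin version.

```python
import base64
import urllib.request


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw content behind a URL on the client side."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def to_base64(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


def build_neural_query(field: str, image_b64: str, model_id: str, k: int = 10) -> dict:
    """Build a neural query body that carries the image inline.

    The parameter names inside the field object are assumptions,
    not taken from this issue.
    """
    return {
        "query": {
            "neural": {
                field: {
                    "query_image": image_b64,
                    "model_id": model_id,
                    "k": k,
                }
            }
        }
    }
```

Usage would be roughly `build_neural_query("vector_embeddings", to_base64(fetch_bytes("http://myserver/image1.jpg")), "<model-id>")`. The download and the encoding both happen before the request is sent, which is the extra pre-processing this feature request would remove.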

Do you have any additional context?

There may be security concerns, since the system will be pulling content from remote URLs. Most probably the syntax of existing requests will have to change to support the url type.
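To make the security concern a bit more concrete, here is a rough sketch of the kind of check a server could run before fetching a user-provided URL. Nothing like this exists in the plugin as far as this issue states; the function name `is_url_allowed` and the host allowlist are assumptions for illustration, and a real implementation would also need to deal with DNS resolution, redirects, size limits, and timeouts.

```python
import ipaddress
from urllib.parse import urlsplit

# Only plain web schemes; this rejects file://, ftp://, and similar.
ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only for http(s) URLs whose host is explicitly allowlisted.

    Hypothetical helper: the allowlist would be set by the cluster admin.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname  # already lowercased by urlsplit
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # host is a name, not an IP literal
    # Refuse IP literals that point at internal address ranges.
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        return False
    return host in allowed_hosts
```

With an allowlist of `{"myserver"}`, the example URL `http://myserver/image1.jpg` passes, while `file:///etc/passwd` and `http://127.0.0.1/a.jpg` are rejected. An allowlist is deliberately restrictive; whether that is acceptable for the use cases in this issue is an open question.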

@martin-gaievski martin-gaievski added Features Introduces a new unit of functionality that satisfies a requirement untriaged enhancement labels Oct 25, 2023
@martin-gaievski martin-gaievski changed the title [FEATURE] Pulling content for neural search from user provided URL [FEATURE] Pulling fields content for neural search from user provided URL Oct 25, 2023
@vamshin vamshin removed the untriaged label Oct 30, 2023
@Sanjana679

I wanted to clarify: would this method with the URL only be for search, and not for ingesting documents?

@martin-gaievski
Member Author

This idea may not be ideal for managed systems, mainly because the URL is provided by the user, which may not be secure and/or can affect system stability. To some degree, the storage overhead can be solved by not storing the field source in the index.

@github-project-automation github-project-automation bot moved this from Backlog to ✅ Done in Vector Search RoadMap Oct 21, 2024
Projects
Status: Done
Development

No branches or pull requests

3 participants