Phased ranking support for streaming mode #33283

Open
Alexander-Mark opened this issue Feb 8, 2025 · 0 comments

Is your feature request related to a problem? Please describe.
Streaming mode currently doesn't support phased ranking. This makes it tricky to run inference efficiently with more expensive models, e.g. ColBERT MaxSim.

Describe the solution you'd like
Streaming mode should support phased ranking in the same way as indexed mode or, if that is not possible within the design, offer an alternative approach that achieves something similar.

Describe alternatives you've considered
Using conditional logic to determine whether to run inference:

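function myFunction() {
    # Assumed intent: only run the expensive expression for hits whose cheap score passes the cutoff.
    expression: if (cheapExpression > cutoff, expensiveExpression, cheapExpression)
}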
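
To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not Vespa code: `phased_rank`, `gated_score`, `rerank_count` and `cutoff` are hypothetical names used only for illustration, and it assumes the intent of the conditional is to run the expensive expression only for hits whose cheap score clears the cutoff.

```python
from typing import Callable, Dict, List

Doc = Dict[str, float]
Scorer = Callable[[Doc], float]


def phased_rank(docs: List[Doc], cheap: Scorer, expensive: Scorer,
                rerank_count: int) -> List[Doc]:
    """Two-phase ranking: order every hit by the cheap score, then
    re-order only the top `rerank_count` hits by the expensive score."""
    first_phase = sorted(docs, key=cheap, reverse=True)
    top, rest = first_phase[:rerank_count], first_phase[rerank_count:]
    # `expensive` is evaluated once per hit in `top`, i.e. at most
    # `rerank_count` times regardless of how the cheap scores are distributed.
    second_phase = sorted(top, key=expensive, reverse=True)
    return second_phase + rest


def gated_score(doc: Doc, cheap: Scorer, expensive: Scorer,
                cutoff: float) -> float:
    """Conditional workaround: a single expression that only evaluates the
    expensive scorer when the cheap score exceeds a fixed cutoff."""
    cheap_score = cheap(doc)
    if cheap_score > cutoff:
        return expensive(doc)
    return cheap_score
```

Two differences follow from this sketch. With the gated variant, the number of expensive evaluations depends on how many hits exceed the cutoff rather than on a fixed rerank count, and the final scores mix two scales, so a hit that passes the cutoff can end up ranked below one that did not.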

Additional context
It's possible I've overlooked some existing features and the use case I'm describing is already doable within the current design.

hmusum added this to the "later" milestone on Feb 12, 2025