
I modified a part of the code to enable parallel inference with multiple num_batch #1113

Open · wants to merge 4 commits into base: main

Conversation

zhugelaozei

Hi! Since I found that SAHI cannot perform parallel inference when using YOLOv11 for sliced inference, I modified part of the code and adapted it to work with the relevant parts of Ultralytics' code. Unfortunately, I have only adapted the Ultralytics portion for now. I hope this is helpful to you.
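
For reference, here is a minimal sketch of how batched sliced inference with this patch might be invoked. The exact name and placement of the `num_batch` argument are assumptions based on the PR title; the weights file and image path are placeholders.

```python
# Hedged usage sketch for this PR: `num_batch` is the parameter this PR
# proposes (exact signature assumed from the title); everything else is
# the existing SAHI API.
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Only the Ultralytics backend is adapted by this PR.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",  # placeholder YOLOv11 weights
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    "large_image.jpg",  # placeholder input image
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    num_batch=8,  # assumed: number of slices run per forward pass
)
```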

@fcakyon
Collaborator

fcakyon commented Jan 4, 2025

Great @zhugelaozei! Can you please fix the formatting by:

  1. Installing the development dependencies:
     `pip install -e ."[dev]"`

  2. Running the code formatter:
     `python -m scripts.run_code_style format`

@tonyreina

Does the num_batch here refer to running bounding box detections on multiple slices of the same image in a batch? Or is it running multiple images at one time in a batch? Could you provide an example of how to use it?

@eVen-gits
Contributor

Hello,
I believe this is a very important feature. What is the current status on this?

@eVen-gits
Contributor

> Does the num_batch here refer to running bounding box detections on multiple slices of the same image in a batch? Or is it running multiple images at one time in a batch? Could you provide an example of how to use it?

@tonyreina hey. I believe it refers to one image. This is also the usual way to use SAHI: a large image, sliced into smaller ones.

On the topic: I tried the modifications today and they work. Here are my observations:

  • The program appears to crash unless perform_standard_pred is set to False (see the sketch below).
  • Even though sliced prediction seems to work, the performance gains are lower than expected. My prediction time for a 12768x9564 (122.1 MP) image, using 640x640 slices, goes from ~13 s to ~8 s (I don't have an accurate metric).
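
To make the first point concrete, here is a sketch of the workaround, reusing the placeholder `detection_model` and image from the sketch in the PR description above. `perform_standard_pred` is an existing `get_sliced_prediction` argument controlling the extra full-image pass; `num_batch` is the parameter proposed by this PR.

```python
from sahi.predict import get_sliced_prediction

# With this patch, the full-image (standard) prediction pass reportedly has
# to be disabled, otherwise the program crashes.
result = get_sliced_prediction(
    "large_image.jpg",  # placeholder input image
    detection_model,    # defined as in the earlier sketch
    slice_height=640,
    slice_width=640,
    perform_standard_pred=False,  # required with this patch, per the observation above
    num_batch=8,                  # parameter proposed by this PR
)
```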

With regard to the latter, I suspect it's similar to what I observed when running multiple instances of inference in separate processes.

One likely reason is the data-loading limitation, but beyond that, I suspect it has to do with some low-level locking of GPU operations. I'm really not an expert in hardware utilization, but perhaps someone with more experience could shed some light on this topic.

Either way, it's a welcome addition and I hope this change is seriously considered.

@dceluis

dceluis commented Feb 13, 2025

Hello. I did quite a bit of work on this a while ago. While I did not submit a PR, I thought I would post the link here for reference, in the hope that some of the implementation helps get this PR merged.

https://github.com/dceluis/sahi_batched
