Each stage (hash, encryption) submits a ton of work to the shared pool executor, one task per tensor. Because the executor runs these tasks FIFO, we unnecessarily delay the ability to write any single tensor: its subsequent chain of tasks is stuck behind the processing of all of the other tensors' tasks.
Create a variant of a ThreadPoolExecutor that uses a priority queue instead of a FIFO queue, and assign priorities per tensor such that all tensor 1 tasks are preferred as a group over all tensor 2 tasks, etc.
This would allow the current 3.0 task queueing strategy to be used with a slightly better version of the execution order from 2.9.
This is also flexible enough that we could prioritize certain stages over others if we wanted to (for example, preferring to finish write-hazard stages in their entirety before finishing subsequent, potentially concurrent read-only steps).
A hacky implementation of this that just swaps out the queue object used by a ThreadPoolExecutor could probably be finished quite quickly.
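To make the proposal concrete, here is a minimal sketch of a priority-ordered pool. It is not the queue-swap hack itself: the queue inside `ThreadPoolExecutor` is a private implementation detail, so this sketch builds a tiny standalone pool on `queue.PriorityQueue` instead. The names `PriorityExecutor` and `demo_priority_order` are hypothetical, priorities are assumed to be plain integers (the tensor index), and dependencies between stages are ignored entirely.

```python
import itertools
import queue
import threading
from concurrent.futures import Future


class PriorityExecutor:
    """Hypothetical minimal pool: lowest priority number runs first."""

    def __init__(self, max_workers=4):
        self._queue = queue.PriorityQueue()
        # Unique, increasing tie-breaker: keeps FIFO order within one
        # priority and ensures the payload itself is never compared.
        self._counter = itertools.count()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(max_workers)
        ]
        for thread in self._threads:
            thread.start()

    def submit(self, priority, fn, *args, **kwargs):
        future = Future()
        item = (future, fn, args, kwargs)
        self._queue.put((priority, next(self._counter), item))
        return future

    def _worker(self):
        while True:
            _, _, item = self._queue.get()
            if item is None:  # shutdown sentinel
                return
            future, fn, args, kwargs = item
            if not future.set_running_or_notify_cancel():
                continue  # cancelled while still queued
            try:
                future.set_result(fn(*args, **kwargs))
            except BaseException as exc:
                future.set_exception(exc)

    def shutdown(self):
        # Sentinels sort after all real (integer-priority) work.
        for _ in self._threads:
            self._queue.put((float("inf"), next(self._counter), None))
        for thread in self._threads:
            thread.join()


def demo_priority_order():
    """Submit stage by stage, as today, and record the execution order."""
    pool = PriorityExecutor(max_workers=1)
    gate = threading.Event()
    order = []
    pool.submit(-1, gate.wait)  # hold the only worker while we enqueue
    futures = []
    for stage in ("hash", "encrypt"):
        for tensor in range(3):
            futures.append(pool.submit(tensor, order.append, (stage, tensor)))
    gate.set()
    for future in futures:
        future.result()
    pool.shutdown()
    return order
```

Even though the work is submitted one stage at a time, the single worker in `demo_priority_order` finishes both of tensor 0's tasks before touching tensor 1, which is the grouping described above.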
Because we don't have a way to perform cooperative multitasking, and can't modify the priorities of items already in a queue, this would prioritize tasks that have unmet dependencies over tasks that are ready to execute, which would make performance worse.
This could hypothetically be worked around by designing an executor subclass that can create futures before scheduling them, and that only schedules them once their dependencies are met, though waiting on that condition would quite possibly be excessively complex.
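One possible shape for that workaround, sketched as a helper function rather than an executor subclass: the caller gets a future immediately, and the task is only handed to the pool once every dependency has finished, using `Future.add_done_callback` instead of blocking a worker. The names `submit_after` and `demo_dependency_chain` are hypothetical, a plain `ThreadPoolExecutor` stands in for whatever pool is actually used, and cancellation of the returned future is not handled.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def submit_after(executor, deps, fn, *args):
    """Return a future now; submit fn only when all futures in deps are done."""
    result = Future()
    lock = threading.Lock()
    state = {"remaining": len(deps), "error": None}

    def copy_outcome(inner):
        # Mirror the real task's result or exception onto the early future.
        try:
            result.set_result(inner.result())
        except BaseException as exc:
            result.set_exception(exc)

    def launch():
        try:
            executor.submit(fn, *args).add_done_callback(copy_outcome)
        except BaseException as exc:  # e.g. the pool was already shut down
            result.set_exception(exc)

    def on_dep_done(dep):
        try:
            dep.result()  # already finished, so this does not block
            failure = None
        except BaseException as exc:
            failure = exc
        with lock:
            if failure is not None and state["error"] is None:
                state["error"] = failure
            state["remaining"] -= 1
            ready = state["remaining"] == 0
        if ready:
            if state["error"] is not None:
                # A dependency failed: propagate instead of running fn.
                result.set_exception(state["error"])
            else:
                launch()

    if not deps:
        launch()
    for dep in deps:
        # Runs immediately if dep is already done, else when it completes.
        dep.add_done_callback(on_dep_done)
    return result


def demo_dependency_chain():
    with ThreadPoolExecutor(max_workers=2) as pool:
        hashed = submit_after(pool, [], lambda: "hashed")
        encrypted = submit_after(
            pool, [hashed], lambda: hashed.result() + "+encrypted"
        )
        return encrypted.result(timeout=5)
```

With this shape, only ready tasks ever reach the pool's queue, so a priority queue would never pick a blocked task ahead of a runnable one. Whether this stays simple enough once it is combined with per-tensor priorities is an open question.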