getting the time job is in waiting queue #82

Open
sakari opened this issue Nov 1, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@sakari

sakari commented Nov 1, 2024

We would like to monitor how long jobs sit ready to be executed, waiting for a worker, so that we can detect congestion issues. As far as I can tell, timestamp is the time the job was created, which would not work well if jobs get put back on the queue from a worker, e.g. with a retry or delay. There is an issue from a year or so ago about something similar. Is the answer still that we cannot get this information?

As an aside, I saw that you are working on telemetry (OpenTelemetry) support, which is great. On a quick look, this seemed to focus on tracing only. Can we expose BullMQ metrics, such as the number of jobs, through OpenTelemetry, or are there plans to do so?

@manast
Contributor

manast commented Nov 2, 2024

We have some additional timestamps that may be useful for you, such as:
processedOn: https://api.docs.bullmq.io/classes/v5.Job.html#processedOn
finishedOn: https://api.docs.bullmq.io/classes/v5.Job.html#finishedOn
It is a bit difficult to model a timestamp for every time a job is retried and moved back to the waiting status, and I am not sure how we could improve this situation. The telemetry will include metrics, but we must first support flows and groups, which is a work in progress right now.

@peplin

peplin commented Nov 22, 2024

We're currently using a combination of job.timestamp and processedOn to try and calculate our queue delays, but ran into this same issue. If a job fails and is retried, our calculation for "queue delay" is falsely inflated because it includes the runtime of all of the job's attempts.
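To make the inflation concrete, here is a minimal sketch of that calculation. It uses a small stand-in type rather than the real BullMQ Job class, and it assumes that processedOn reflects the start of the most recent attempt while timestamp stays fixed at creation time. The name naiveQueueDelay and the numbers are only illustrative.

```typescript
// Stand-in for the two Job fields used in the calculation (not the real BullMQ class).
interface JobTimes {
  timestamp: number;    // creation time, ms since epoch
  processedOn?: number; // start of the most recent attempt (assumption)
}

// "Queue delay" as currently computed: time from creation to the latest pickup.
function naiveQueueDelay(job: JobTimes): number | undefined {
  if (job.processedOn === undefined) {
    return undefined; // the job has not been picked up yet
  }
  return job.processedOn - job.timestamp;
}

// First attempt: created at t=0, picked up at t=200.
const firstAttempt = naiveQueueDelay({ timestamp: 0, processedOn: 200 });

// Retry: the first attempt ran for 5000 ms and failed, the job was re-queued
// at t=5200 and picked up again at t=5300. It only waited 100 ms this time,
// but the naive figure also counts the earlier wait and the failed run.
const afterRetry = naiveQueueDelay({ timestamp: 0, processedOn: 5300 });

console.log(firstAttempt, afterRetry);
```

In the retry case the job spent only 100 ms waiting for a worker the second time, yet the computed delay covers everything since creation.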

What would be most helpful is an additional timestamp property on the job that records the time it was last moved into the waiting status. Just as an example, call it job.waitingSince.

  • The first time a job is enqueued, job.timestamp === job.waitingSince
  • If a job fails and is put back in the queue for a retry:
    • job.timestamp is unmodified
    • job.waitingSince is the time it was re-enqueued

This would be backwards compatible since the meaning of job.timestamp is unchanged.
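The semantics above could be sketched roughly as follows. This is a toy model, not BullMQ code: waitingSince does not exist in the library today, and the helper names (enqueue, startProcessing, requeueForRetry, lastWait) are hypothetical and only there to illustrate the proposal.

```typescript
// Toy model of a job carrying the proposed (hypothetical) waitingSince field.
interface WaitTrackedJob {
  timestamp: number;    // creation time, never modified
  waitingSince: number; // hypothetical: last time the job entered waiting
  processedOn?: number; // start of the most recent attempt
}

// On first enqueue, both timestamps are identical.
function enqueue(now: number): WaitTrackedJob {
  return { timestamp: now, waitingSince: now };
}

// A worker picks the job up.
function startProcessing(job: WaitTrackedJob, now: number): WaitTrackedJob {
  return { ...job, processedOn: now };
}

// A failed job goes back to waiting: only waitingSince moves forward.
function requeueForRetry(job: WaitTrackedJob, now: number): WaitTrackedJob {
  return { timestamp: job.timestamp, waitingSince: now };
}

// Wait time of the most recent stay in the queue only.
function lastWait(job: WaitTrackedJob): number | undefined {
  if (job.processedOn === undefined) {
    return undefined;
  }
  return job.processedOn - job.waitingSince;
}

let job = enqueue(0);
job = startProcessing(job, 200);
const firstWait = lastWait(job);  // waited from t=0 to t=200

job = requeueForRetry(job, 5200); // attempt failed, back in the queue
job = startProcessing(job, 5300);
const retryWait = lastWait(job);  // waited from t=5200 to t=5300

console.log(firstWait, retryWait, job.timestamp);
```

With this model, the runtime of the failed attempt no longer leaks into the wait measurement, and job.timestamp keeps its current meaning.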

@manast
Contributor

manast commented Nov 22, 2024

@peplin ok, so you are interested only in how long the job waited the last time it was in wait, not in the total time it took until the job was processed, right? I wonder, why is it of no interest to know the time it was in wait before it started to be retried?

@peplin

peplin commented Nov 22, 2024

Yes, that's correct! I would say that both cases are interesting, but they serve different purposes. My team maintains the queues, and we would like to offer our job authors guarantees about how long their jobs will wait in the queue. We don't want someone's flaky job (one that requires multiple retries) to impact our measurements of queue wait time.

@manast manast added the enhancement New feature or request label Nov 24, 2024