Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to pause a span when its Fiber yields? #3026

Open
rmosolgo opened this issue Jan 21, 2025 · 2 comments
Open

How to pause a span when its Fiber yields? #3026

rmosolgo opened this issue Jan 21, 2025 · 2 comments
Labels
community To tag external issues and PRs submitted by the community feature request To tag feature request after Hero Triage for PM to disposition

Comments

@rmosolgo
Copy link

Is your feature request related to a problem? Please describe.

Hi! In GraphQL-Ruby, field execution is instrumented. But, the field can call Fiber.yield to make itself wait for some other work to be done. (The other work is GraphQL::Dataloader.) When that work is done, the field's Fiber is resumed and GraphQL execution continues.

In practice, we end up with several Fibers paused while their fields wait for data. Once the data is available, those Fibers are resumed (one-at-a-time) and field execution completes.

The problem is that the field's span in NewRelic includes the time where it was actually paused (because of Fiber.yield). So although clock time was actually passing, Ruby wasn't doing anything with that Fiber.

Then, in the UI, it looks like these GraphQL fields are taking a loooong time (longer than the request duration, actually) -- but it's because the waiting time is counted for each span that's waiting.

Feature Description

I'd like a way to improve the tracing so that I can "pause" the span when a Fiber yields. Or some other way to eliminate this "double-counting" of wait time?

Describe Alternatives

We could do nothing. In that case, whenever we adopt Dataloader, we get a big (nonsensical) spike in segment time:

Image

We'd have to learn to ignore that input 😿

We could not use Fibers in the code. That's also a possibility -- The GraphQL-Batch library doesn't use fibers -- but honestly, it has the same problem with tracing, but it manifests differently because GraphQL-Batch uses Promises instead of Fiber.yield.

Are there existing options in the NewRelic agent that I could use to improve tracing in this case?

Additional context

Here's a simplified example of what GraphQL-Ruby does:

Tracing work with Fiber.yield

require "bundler/inline"

gemfile do
  gem "newrelic_rpm"
end

def do_something(record)
  puts "Doing: #{record[:name]}"
end


manager = Fiber.new do
  ids_to_fetch = []
  records = {}


  subtask_1 = Fiber.new do
    NewRelic::Agent::MethodTracerHelpers.trace_execution_scoped("Jobs/Job") do
      puts "Requesting 1"
      ids_to_fetch << 1
      Fiber.yield
      do_something(records[1])
    end
  end
  subtask_1.resume

  subtask_2 = Fiber.new do
    NewRelic::Agent::MethodTracerHelpers.trace_execution_scoped("Jobs/Job") do
      puts "Requesting 2"
      ids_to_fetch << 2
      Fiber.yield
      do_something(records[2])
    end
  end
  subtask_2.resume

  puts "Loading data"
  NewRelic::Agent::MethodTracerHelpers.trace_execution_scoped("Database/Fetch") do
    ids_to_fetch.each do |id|
      records[id] = { name: "Job ##{id}" }
    end
  end 
  puts "Resuming jobs"

  # Data is loaded, now resume work:
  subtask_1.resume
  subtask_2.resume
end

puts "Starting..."
manager.resume
puts "...Finished"

# Starting...
# Requesting 1
# Requesting 2
# Loading data
# Resuming jobs
# Doing: Job #1
# Doing: Job #2
# ...Finished

Priority

Really Want 😊

@rmosolgo rmosolgo added the feature request To tag feature request after Hero Triage for PM to disposition label Jan 21, 2025
@workato-integration
Copy link

@github-actions github-actions bot added the community To tag external issues and PRs submitted by the community label Jan 21, 2025
@rmosolgo
Copy link
Author

I also noticed that on the Summary view, the "Ruby" portion of our request time is larger than the total request time. It's also growing, perhaps as we adopt this Fiber-based flow:

Image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community To tag external issues and PRs submitted by the community feature request To tag feature request after Hero Triage for PM to disposition
Projects
None yet
Development

No branches or pull requests

1 participant