Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser_json: use JSON as fallback parser instead of Yajl for performance #4813

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Watson1978
Copy link
Contributor

@Watson1978 Watson1978 commented Jan 31, 2025

Which issue(s) this PR fixes:
Fixes #

What this PR does / why we need it:
Recently, Ruby's json has incredible performance improvements.
It might be faster than oj gem.
So, I think json is a suitable as fallback.

This is similar with #4759

Here is easily benchmark. (I used same log file in #4759)

  • Before

    • It spent 90.50963662 sec to handle 10 GB file
  • After

    • It spent 74.624230077 sec to handle 10 GB file
  • config

<source>
  @type tail
  path "#{File.expand_path '~/tmp/access*.log'}"
  pos_file "#{File.expand_path '~/tmp/fluentd/access.log.pos'}"
  tag log
  read_from_head true
  <parse>
    @type json
  </parse>
</source>

<match **>
  @type file
  path "#{File.expand_path '~/tmp/fluentd/log'}"
</match>

FYI)

Docs Changes:
fluent/fluentd-docs-gitbook#560

Release Note:

@kenhys
Copy link
Contributor

kenhys commented Jan 31, 2025

we are happy if we could observe micro benchmark result.

@Watson1978
Copy link
Contributor Author

Micro benchmark

benchmark code

require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'benchmark-ips'
  gem 'oj'
  gem 'json'
  gem 'yajl-ruby', require: 'yajl'
end

Benchmark.ips do |x|
  json = '{"a":"Alpha","b":true,"c":12345,"d":[true,[false,[-123456789,null],3.9676,["Something else.",false],null]],"e":{"zero":null,"one":1,"two":2,"three":[3],"four":[0,1,2,3,4]},"f":null,"h":{"a":{"b":{"c":{"d":{"e":{"f":{"g":null}}}}}}},"i":[[[[[[[null]]]]]]]}'

  x.report('Oj.load') { Oj.load(json) }
  x.report('JSON.load') { JSON.load(json) }
  x.report('Yajl.load') { Yajl.load(json) }

  x.compare!
end

result

ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]
Warming up --------------------------------------
             Oj.load    31.383k i/100ms
           JSON.load    35.513k i/100ms
           Yajl.load    13.839k i/100ms
Calculating -------------------------------------
             Oj.load    319.187k (± 1.5%) i/s    (3.13 μs/i) -      1.601M in   5.015522s
           JSON.load    364.950k (± 0.5%) i/s    (2.74 μs/i) -      1.847M in   5.060214s
           Yajl.load    137.015k (± 2.5%) i/s    (7.30 μs/i) -    691.950k in   5.053679s

Comparison:
           JSON.load:   364950.5 i/s
             Oj.load:   319187.1 i/s - 1.14x  slower
           Yajl.load:   137015.2 i/s - 2.66x  slower

@kenhys
Copy link
Contributor

kenhys commented Jan 31, 2025

By the way, which json gem version did you tested with?
Do we need to use specific version of json gem?

@Watson1978
Copy link
Contributor Author

Watson1978 commented Jan 31, 2025

By the way, which json gem version did you tested with? Do we need to use specific version of json gem?

I tested with json v2.9.1 (latest version), it is default version in Ruby 3.4.1
https://github.com/ruby/ruby/blob/master/doc/NEWS/NEWS-3.4.0.md

If we will use Ruby 3.4 or later in the next fluent-package LTS, hmmm, I think we don't need to specify the version...

@@ -28,12 +28,12 @@ def test_fall_back_oj_to_yajl_if_oj_not_available

result = @parser.instance.configure_json_parser(:oj)

assert_equal [Yajl.method(:load), Yajl::ParseError], result
assert_equal [JSON.method(:load), JSON::ParserError], result
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want JSON.method(:parse): https://bugs.ruby-lang.org/issues/19528

It's also very marginally faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants