Ruby objects not reporting correct file size #19527

Open
nirvdrum opened this issue Dec 5, 2024 · 0 comments
nirvdrum commented Dec 5, 2024

What version of protobuf and what language are you using?
Version: 4.29.0
Language: Ruby

What operating system (Linux, Windows, ...) and version?

  • macOS 15.1.1
  • Ubuntu 24.04

What runtime / compiler are you using (e.g., python version or gcc version)?

ruby -v
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [arm64-darwin24]

protoc --version
libprotoc 29.0

What did you do?
Steps to reproduce the behavior:

  1. Create a message with a repeated recursive field
  2. In code, create a long-held reference of this type (e.g., in an instance variable)
  3. Create new instances that link to the long-lived object
  4. Observe memory grows unbounded

I've pulled together a small reproduction. To see the issue:

  1. git clone https://github.com/nirvdrum/grpc-protobuf-experiments
  2. cd grpc-protobuf-experiments
  3. bundle install
  4. bundle exec rake
  5. VERBOSE=true bundle exec ruby leak-simple.rb

What did you expect to see?

The leak-simple.rb script in the linked repo creates local instances of Proto::Leak::Recursive in a loop, with each object linking back to a global instance of Proto::Leak::Recursive via the repeated data field. After each inner loop finishes, we trigger GC and then print the RSS of the process along with Ruby's view of how much memory is allocated. Increasing the number of inner-loop iterations produces larger RSS values.

I'd expect the output of ObjectSpace.memsize_of_all to be approximately the same value as the RSS output. Likewise, I'd expect the size of an object with a growing arena to be reflected in ObjectSpace.memsize_of. If the memory that's growing is on the Ruby heap, it should be visible through ObjectSpace.
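The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from leak-simple.rb: the helper names rss_bytes and memory_snapshot are hypothetical, and it assumes RSS can be read either from /proc/self/status (Linux) or from `ps -o rss=` (macOS and Linux).

```ruby
require "objspace"

# Hypothetical helper: resident set size of this process in bytes.
# Reads /proc/self/status where available, otherwise falls back to `ps`,
# which reports RSS in kilobytes.
def rss_bytes
  if File.readable?("/proc/self/status")
    kb = File.read("/proc/self/status")[/^VmRSS:\s+(\d+)\s+kB/, 1]
    return Integer(kb) * 1024 if kb
  end
  Integer(`ps -o rss= -p #{Process.pid}`.strip) * 1024
end

# Hypothetical helper: run GC, then sample the OS view (RSS) and Ruby's
# own accounting (the sum of memsize over all live objects).
def memory_snapshot
  GC.start
  { rss: rss_bytes, ruby_memsize: ObjectSpace.memsize_of_all }
end

snap = memory_snapshot
puts format("RSS: %.1f MiB, memsize_of_all: %.1f MiB",
            snap[:rss] / 1024.0 / 1024.0,
            snap[:ruby_memsize] / 1024.0 / 1024.0)
```

The two numbers will never match exactly, since RSS also covers the interpreter itself and allocator overhead, but a gap that keeps widening as the loop runs suggests memory that ObjectSpace is not being told about.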

What did you see instead?

When running the leak-simple.rb script, RSS climbs while the Ruby heap size reported by ObjectSpace stays flat. However, while debugging #19498, we found that the memory growth is due to arena fusing, and arenas are on the Ruby heap. The problem is that the extension is not accurately reporting the size of its arenas.

CRuby keeps track of the size of the Ruby objects that it allocates itself. Native extensions can also allocate memory on the Ruby heap, but it is incumbent on the extension to implement a memsize hook that reports that memory. This extension does so with the Arena_memsize function, but the function reports the wrong size. We believe the problem is in the bookkeeping that happens after a fuse. In particular, dividing the size by the fused count looks like the wrong operation:

memsize /= fused_count;

The total memsize is divided by the number of arenas that have been fused together, presumably on the assumption that each of the other arenas will report its own share. But those arenas are no longer alive and thus have no size to report.
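The arithmetic behind this suspicion can be illustrated with a toy model. This is a sketch under stated assumptions, not the extension's actual code: FusedGroup and its fields are hypothetical, and it assumes the fused count keeps counting arenas whose Ruby wrappers have already been collected.

```ruby
# Toy model of a fused arena group (hypothetical, for illustration only).
# pool_bytes  - total bytes held by the fused group
# fused_count - number of arenas ever fused into the group
# live_arenas - number of those arenas still reachable from Ruby
FusedGroup = Struct.new(:pool_bytes, :fused_count, :live_arenas) do
  # What each live arena would report under `memsize /= fused_count`.
  def reported_per_arena
    pool_bytes / fused_count
  end

  # What ObjectSpace would see in total: only live arenas get asked.
  def total_reported
    reported_per_arena * live_arenas
  end
end

# If every fused arena were still alive, the shares would add up.
all_alive = FusedGroup.new(1_000_000, 100, 100)
puts all_alive.total_reported # 1000000

# With one long-lived arena and 99 collected ones, 99% goes unreported.
one_alive = FusedGroup.new(1_000_000, 100, 1)
puts one_alive.total_reported # 10000
```

In this model the division is only accurate while every fused arena is still alive. With a single long-lived object, as in the reproduction, the reported size shrinks relative to the real size as more short-lived arenas are fused in.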

@nirvdrum nirvdrum added the untriaged auto added to all issues by default when created. label Dec 5, 2024
@shaod2 shaod2 added ruby and removed untriaged auto added to all issues by default when created. labels Dec 9, 2024