
Investigate string decode performance improvements #4

Open
andrewthad opened this issue Aug 13, 2021 · 1 comment
@andrewthad (Member) commented:

In 665c428, I've added a benchmark that focuses on the performance of decoding strings. Here's the performance of decoding 4KB of JSON, most of which is just 30-character strings:

benchmarked json/url/100/decode
time                 8.053 μs   (7.736 μs .. 8.411 μs)
                     0.996 R²   (0.989 R² .. 0.999 R²)
mean                 8.089 μs   (8.051 μs .. 8.139 μs)
std dev              74.41 ns   (48.48 ns .. 107.8 ns)

benchmarked aeson/url/100/decode
time                 22.66 μs   (21.50 μs .. 24.51 μs)
                     0.994 R²   (0.989 R² .. 0.999 R²)
mean                 22.72 μs   (22.49 μs .. 22.84 μs)
std dev              276.1 ns   (159.0 ns .. 414.7 ns)

Not bad. We're ahead of aeson by a factor of three, but can this number be improved further? String decode currently walks the string byte-by-byte until it finds the end. As it walks the string, it keeps track of whether anything will need to be unescaped. I think it should be possible to instead walk the string w64-by-w64 (eight bytes at a time). This could be done by adapting the approach in bytestring-encodings to work with ByteArray instead of Ptr and adding some additional bit-twiddling hacks. The general idea would be:

  • Fail if you encounter a backslash
  • Fail if you encounter a byte less than 0x20
  • Fail if you encounter a byte greater than 0x7E
  • Fail if your read would give you a w64 that straddled the end of the string (simplifies things a little)
  • Succeed if you encounter a "

Failure just means falling back to the existing string decode logic, and success means that we may perform a memcpy (as we do now). This whole thing is a little bit tricky because it's possible to encounter both a " and a failing byte in the same w64, and then the order they appear in matters. But I think the conservative choice of always failing, even if the quote showed up first, is probably the best course. It keeps the fast path simple, and the bytes after a string ends are probably ASCII characters anyway, so this shouldn't cost much performance. A sketch of the word-at-a-time byte classification is below.
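As a rough illustration (nothing in the repo yet), here is a minimal Haskell sketch of the kind of SWAR tests that could classify eight bytes at once. The helper names (anyEq, anyBelow0x20, anyAbove0x7E, classify) are hypothetical, the ByteArray loads and the fallback path are omitted, and each test only answers "does any byte in this word match", which is all the bail-out decision needs:

    import Data.Bits (complement, xor, (.&.), (.|.))
    import Data.Word (Word64)

    lows, highs :: Word64
    lows  = 0x0101010101010101
    highs = 0x8080808080808080

    -- Nonzero iff some byte of w is zero (classic SWAR zero-byte test). A
    -- borrow can leak into the next byte lane, so the position of the set
    -- bit is unreliable, but the yes/no answer is correct.
    anyZero :: Word64 -> Word64
    anyZero w = (w - lows) .&. complement w .&. highs

    -- Nonzero iff some byte of w equals b (pass 0x5C for backslash, 0x22 for ").
    anyEq :: Word64 -> Word64 -> Word64
    anyEq b w = anyZero (w `xor` (b * lows))

    -- Nonzero iff some byte of w is below 0x20 (an unescaped control byte).
    anyBelow0x20 :: Word64 -> Word64
    anyBelow0x20 w = (w - 0x20 * lows) .&. complement w .&. highs

    -- Nonzero iff some byte of w is above 0x7E, i.e. DEL or a non-ASCII byte.
    anyAbove0x7E :: Word64 -> Word64
    anyAbove0x7E w = (w .&. highs) .|. anyEq 0x7F w

    -- Decision for one w64: Nothing = bail out to the byte-by-byte decoder,
    -- Just True = a closing quote is somewhere among these clean bytes,
    -- Just False = eight plain content bytes, keep scanning.
    classify :: Word64 -> Maybe Bool
    classify w
      | anyEq 0x5C w .|. anyBelow0x20 w .|. anyAbove0x7E w /= 0 = Nothing
      | anyEq 0x22 w /= 0 = Just True
      | otherwise = Just False

Note that classify implements the conservative rule described above: it bails out whenever any failing byte is present in the word, even if a quote precedes it.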

This needs to be implemented and benchmarked, but I think it could make this benchmark at least 2x faster.

@andrewthad (Member, Author) commented:

Other thoughts:

  • For the beginning w64, we cannot use the same trick as for the end w64: the beginning w64 includes the opening double quote, and we must not count that quote as the closing one.
  • For the end double quote, when we detect it, we have to do some kind of CLZ-like operation to figure out where it actually was, since we have to know the size of what we are copying (see the sketch below).
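A minimal sketch of that step, assuming the eight bytes are loaded little-endian (as on x86), in which case the operation is a count-trailing-zeros rather than a CLZ; with a big-endian load it really would be CLZ. The names quoteOffset and exactZeroMask are hypothetical:

    import Data.Bits (complement, countTrailingZeros, xor, (.&.), (.|.))
    import Data.Word (Word64)

    lows, sevens :: Word64
    lows   = 0x0101010101010101
    sevens = 0x7f7f7f7f7f7f7f7f

    -- Byte offset (0..7) of the first 0x22 (") in w, or 8 if there is none.
    quoteOffset :: Word64 -> Int
    quoteOffset w =
      countTrailingZeros (exactZeroMask (w `xor` (0x22 * lows))) `div` 8
      where
        -- 0x80 in every byte of x that is zero, 0x00 elsewhere. Unlike the
        -- cheaper subtraction trick, this has no cross-byte carries, so the
        -- reported position is exact.
        exactZeroMask x = complement (((x .&. sevens) + sevens) .|. x .|. sevens)

The number of bytes to memcpy would then be the count accumulated from the earlier full words plus this offset.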
