Use the default float/double parser as fallback #779
@grcevski Have you measured the slowdown when parsing the additional formats with the fast parse routines turned on? I bet that for deep stacks it could be vulnerable to DoS/DoW attacks when untrusted input is used.
I don't have performance numbers for the invalid formats. The exception handling will obviously introduce some overhead, but assuming these are exceptional cases, the overall impact on performance should be minimal. At most 5 frames are on the stack when an exception is hit for these unsupported formats, which isn't deeper than what FloatingDecimal does in OpenJDK to parse the numbers.
@grcevski could you report this issue to https://github.com/wrandelshofer/FastDoubleParser ? My gut feeling is that jackson-core shouldn't have the try-catch. As @plokhotnyuk notes, there are DoS/DoW risks with relying on exceptions. The fast parser support is optional and disabled by default, so users who want to parse JSON from untrusted sources would be better off not enabling it.
Sure thing, I'll open an issue there and link back.
If the first example is fixed for
I understand your concern, but every check in the code will reduce throughput. So far, the issues appear to be with edge cases that are unlikely to appear in real data sets, at least ones that have not been manipulated by a malicious actor. We may still accept this PR or something similar, but I would be interested to see if @wrandelshofer has any ideas on a solution to supporting these edge cases in his parser code.
I do understand the general concern about try/catch, but I'm not sure it's an issue here: the default OpenJDK implementation of parseDouble itself uses try/catch. So I don't think this approach makes things worse in this respect.
@grcevski the use of try/catch in the existing JDK parser code is one of the reasons it is slower than the fast parser support.
I apologize, I wasn't clear in what I meant to say. I meant that the default OpenJDK implementation will invoke the exception handler for bad input, so any concerns about DoS/DoW risks from relying on exceptions are not made worse in any way by wrapping the fast double parser code in an exception handler. The fast double parser doesn't try and catch internally, but the default one already does, not in Jackson's code, but downstream.
@grcevski Please, just measure ;) The problem is not try/catch itself but how frequently exceptions are thrown, because each throw unwinds a whole stack trace. Here is a benchmark that can easily be modified to use hexadecimal mantissas. You can run it with the following command against locally published artifacts from your branch and compare the results with the latest code from the 2.14 branch: sbt 'jsoniter-scala-benchmarkJVM/jmh:run -p size=128 ArrayOf(Doubles|Floats)Reading.jacksonScala' A good workaround would be to use a static exception (see https://shipilev.net/blog/2014/exceptional-performance/) or a special NaN value with the error type encoded in the internal representation.
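For illustration, the static-exception workaround mentioned above could look roughly like the following. This is a minimal sketch, not code from jackson-core or FastDoubleParser: the class names, the `UNSUPPORTED_FORMAT` constant, and the `parseDecimalOnly` method are all hypothetical, and the "fast path" simply delegates to the JDK as a stand-in.

```java
public class StacklessExceptionDemo {

    /** Hypothetical exception type that skips stack-trace capture. */
    static final class StacklessNumberFormatException extends NumberFormatException {
        StacklessNumberFormatException(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Skipping the stack walk is what makes throwing this exception cheap.
            return this;
        }
    }

    /** Preallocated once and reused, so a throw allocates nothing. */
    static final StacklessNumberFormatException UNSUPPORTED_FORMAT =
            new StacklessNumberFormatException("unsupported number format");

    /** Hypothetical fast path: accepts decimal text, rejects hex floats. */
    static double parseDecimalOnly(String s) {
        if (s.startsWith("0x") || s.startsWith("0X")) {
            throw UNSUPPORTED_FORMAT;
        }
        return Double.parseDouble(s); // stand-in for a real fast routine
    }

    public static void main(String[] args) {
        try {
            parseDecimalOnly("0x1.8p1");
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage()
                    + ", frames captured: " + e.getStackTrace().length);
        }
    }
}
```

The trade-off is that a shared, stackless instance carries no per-call diagnostics (no offending input, no throw site), so it suits internal control flow better than errors reported to users.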
@plokhotnyuk To clarify, is the concern related to the additional cost of materialising the stack, with
Based on the comment by @wrandelshofer wrandelshofer/FastDoubleParser#19 (comment) on the upstream issue, this is likely a non-issue for Jackson, since the number parsing differences only arise when format specifiers are used. I'll run the full suite of the JDK float tests without format specifiers, and if it's all good I'll close this PR and the related issue.
I can easily refactor the back-end classes of FastDoubleParser to better suit your needs. The back-end classes are DoubleFromByteArray, DoubleFromCharArray. They internally return a primitive long value that they then convert into a double or a float using Double.longBitsToDouble(long) or Float.intBitsToFloat(int). So, instead of throwing an exception, I can make them encode the error message in the long value, by using some unused NaN bits. Also, it might be possible to provide multiple entry points in the back-end classes. For example, one entry point that accepts the full lexical grammar of java.lang.Double.parseDouble, one that accepts the XML Schema grammar, one for the JSON grammar, and so on.
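As a rough illustration of the idea of encoding errors in unused NaN bits, here is a minimal sketch. It is not FastDoubleParser's actual code: the class, constants, error codes, and the digits-only `parseDigitsToBits` back end are all hypothetical, chosen only to show the bit layout and the check on the long value before it is converted.

```java
public class NanErrorEncodingDemo {
    // Quiet-NaN pattern: exponent bits all ones plus the top mantissa bit.
    static final long QUIET_NAN_BITS = 0x7ff8_0000_0000_0000L;
    // Hypothetical choice: the low 8 mantissa bits carry an error code.
    static final long ERROR_CODE_MASK = 0xffL;

    static final int ERR_EMPTY = 1;
    static final int ERR_SYNTAX = 2;
    static final int ERR_TOO_LONG = 3;

    static long encodeError(int code) {
        return QUIET_NAN_BITS | (code & ERROR_CODE_MASK);
    }

    /** True only for NaN patterns with a non-zero code; Double.NaN itself has code 0. */
    static boolean isError(long bits) {
        return (bits & QUIET_NAN_BITS) == QUIET_NAN_BITS
                && (bits & ERROR_CODE_MASK) != 0;
    }

    static int errorCode(long bits) {
        return (int) (bits & ERROR_CODE_MASK);
    }

    /** Hypothetical back end: parses up to 15 unsigned decimal digits, never throws. */
    static long parseDigitsToBits(CharSequence s) {
        if (s.length() == 0) return encodeError(ERR_EMPTY);
        if (s.length() > 15) return encodeError(ERR_TOO_LONG); // keeps the long-to-double cast exact
        long value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') return encodeError(ERR_SYNTAX);
            value = value * 10 + (c - '0');
        }
        return Double.doubleToRawLongBits((double) value);
    }

    /** Front end: inspects the raw bits first, because a NaN payload may not survive conversion. */
    static double parseDouble(CharSequence s) {
        long bits = parseDigitsToBits(s);
        if (isError(bits)) {
            throw new NumberFormatException(
                    "error code " + errorCode(bits) + " for input: " + s);
        }
        return Double.longBitsToDouble(bits);
    }
}
```

With this shape the back end stays exception-free, and the caller decides whether an error code becomes an exception, a fallback to another parser, or something else.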
@wrandelshofer thanks for the offer to refactor your code. At the moment, all jackson-core needs are parseDouble and parseFloat methods that are as close as possible to being drop-in replacements for the JDK methods. Whatever you choose to focus on will be of great interest to us nonetheless.
I honestly haven't looked at this yet from a vulnerability perspective. In an earlier version of the FastDoubleParser library I had a complete code path for all cases in there, which would have allowed this to be fully assessed and controlled. Then, for the sake of code size (and performance), I decided to fall back to the JDK code for the "slow path". 🤔
I know that the Javadocs in helper methods do not explain the rationale well, but note that these parse methods are only meant to support floating-point values that Jackson parser/decoder ( So, it'd first be necessary to have a problem to fix; I am not sure this PR makes sense at jackson-core level. There may be other related questions about
I'm closing this PR. I ran all OpenJDK Double tests with trailing 'f' and 'd' characters stripped and saw no problems at all, except for other invalid values like '+infinity' and lowercase 'nan'.
The FastDoubleParser introduced via #747 and #766 fails to parse certain esoteric inputs that the OpenJDK Double/Float parsers support.
This PR tries to remedy this by introducing a try/catch around the FastDoubleParser and then using the Java default Double/Float parsers as a fallback.
Relates to #778 and wrandelshofer/FastDoubleParser#19
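The fallback approach described in this PR can be sketched as follows. This is an illustrative sketch only, not the actual diff: `fastParseDouble` is a hypothetical stand-in that rejects a trailing 'd'/'f' type suffix (one of the input classes discussed in this thread) and otherwise delegates to the JDK, so the example stays self-contained.

```java
public class FallbackParseDemo {

    /** Hypothetical stand-in for a fast parser that rejects type suffixes. */
    static double fastParseDouble(String s) {
        if (!s.isEmpty()) {
            char last = s.charAt(s.length() - 1);
            if (last == 'd' || last == 'D' || last == 'f' || last == 'F') {
                throw new NumberFormatException("unsupported suffix: " + s);
            }
        }
        return Double.parseDouble(s); // stand-in for a real fast routine
    }

    /** Tries the fast path first, then falls back to the JDK parser. */
    static double parseDouble(String s) {
        try {
            return fastParseDouble(s);
        } catch (NumberFormatException e) {
            // Truly invalid text fails again here, so callers still see an error.
            return Double.parseDouble(s);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseDouble("2.25")); // fast path
        System.out.println(parseDouble("1.5d")); // JDK fallback
    }
}
```

Note that genuinely invalid input pays for two failed parses and two thrown exceptions in this shape, which is the cost the DoS/DoW concern in the conversation is about.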