-
-
Notifications
You must be signed in to change notification settings - Fork 560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clean up C4244 warnings on MSVC builds #501
Conversation
C4244 appertains to implicit truncation of integer values [1]. Theoretically, these warnings could be detected using `-fsanitize=implicit-unsigned-integer-truncation` and `-fsanitize=implicit-signed-integer-truncation` as well as statically using `-Werror=implicit-int-conversion`. [1] https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-levels-3-and-4-c4244?view=msvc-170
@jgm I'm not sure if this is better or worse really. We could just disable the warning if the truncation is not something worth bothering with. I have two follow up patches that will completely eradicate all warnings when building with MSVC 2022. I'm happy to rework the patches to just disable the warning as well. |
I don't know what is better here. Happy to go with your judgment. |
I personally tend to go with this approach. The reasoning being that this codifies the behaviour (correct or not). This means that its less likely to change with compiler changes but does mean that it would be difficult to later identify what was intentional vs not. The count here is small enough, so, if you could verify that these sites are meant to truncate (which it did appear to me), I would just go with this. |
Note that adding explicit casts makes it impossible to detect actual truncation errors with UBSan's |
In light of that, @nwellnhof , do you think this change should be reverted? |
True, but, at the same time, if there are truncation errors, those should be addressed. This is effectively one way to deal with them. The other way would be to change the signature of the called function. I think that the question to ask is which of the two are preferable. The goal should be to properly express the semantics of the operation, and if there are truncation issues, addressing them with explicit casts to indicate the truncation. |
No, an explicit cast doesn't deal with unexpected truncation, it just makes the cast obvious to readers of the code and helps with some compiler warnings. Regarding the generated code, there's no difference between an implicit and an explicit cast. I'm a big fan of UBSan's integer sanitizers which catch well-defined but mostly unwanted things like unsigned integer overflow or truncations when casting. But some of these checks only work with implicit casts. Adding an explicit cast tells the sanitizer not to report an error. So if someone were to start fuzzing libcmark with these sanitizers, they'd first have to remove most of these explicit casts. That's why I'd prefer to leave the implicit casts and tell the compiler not to warn about them. |
The compiler is free to make assumptions and optimize based upon the truncation. It often has resulted in incorrect code generation (at least within clang). If the truncation is "well-defined" then there should be no warning generated by UBSan and that sounds like a UBSan bug. If there is a proper warning being emitted by UBSan, then there is a undefined behaviour and the desired behaviour should be codified into the code IMO. |
There's no difference between an implicit cast and an explicit cast to the target type. For example, the following code means exactly the same: my_type var = expression;
my_type var = (my_type) expression;
I think you're misunderstanding something. As I said, the generated code will be exactly the same.
Yes, that's why the integer sanitizers are turned off by default. But as I said, these sanitizers are incredibly useful to catch actual bugs. If you have a narrowing conversion that truncates integers, it is most likely a bug. It doesn't matter whether there's an explicit cast or not. |
This is my point. It is a bug, and doing the explicit cast is a valid way to address the bug. Changing the types is a valid way to address it as well. I am not tied to the truncation by casting, but I'd rather get these addressed, either by the cast of changing the types. Is there another way that you would prefer to address them? |
No, whether it's a bug depends on the context. An explicit cast changes nothing and makes it impossible to detect truncation with UBSan. Consider the following code (strlen returns a size_t): int len = strlen(string); If you can prove that the length of the string won't be larger than INT_MAX, this code is perfectly fine. If there's a possibility that the result overflows, it's a bug and |
I think that is where we disagree. I think that is universally a bug. If you can prove that the length is bounded by |
Even if it was, what's the difference if you add an explicit cast?
You could of course add range checks before each cast (and taken to the extreme before every single arithmetic operation) but that's simply not practical. You have to make certain assumptions about the range of inputs all the time. Fuzz testing with sanitizers is helpful to check whether your assumptions are correct. |
I think in that case, I wouldn't add the explicit cast, I would widen the storage to
Sure, the way to deal with this is to audit the sites and determine what needs to be done at each of them. I don't think that the sanitizer is any more useful than the warnings - they both identify the call sides which need to be audited and have the developer make a call on what the right behaviour is. In the case of truncation, explicitly state that so that its clear to others that is the right thing or widen it as appropriate. For what it is worth, that is practical to do - just not with C/C++. Other languages (e.g. Swift) do perform these checks on the arithmetic operations and will ensure that overflows do not go unnoticed. |
In libcmark we deliberately store many buffer sizes and indexes as 32-bit values to save memory on 64-bit platforms. There are checks, for example in Anyway, I'm just asking to stop adding explicit casts which defeat integer sanitizers. If a compiler complains about implicit casts, simply disable these warnings. |
Sure, if there are guaranteed checks, that is reasonable. But I still think that cleaning up those cases is something that should be done. If the checks are there, then it makes sense to explicitly cast the value when assigning. |
@jgm let me know what you would like to do here. Disabling warnings U.S.A. certainly possible but it not make sense to start disabling in larger buckets to deal with implicit conversions and truncations if those are preferred. |
I don't fully understand this issue, but if the tradeoff is making warnings disappear vs allowing sanitizers to be used, then the latter seems more important. |
Can't we just disable C4244 and similar warnings on MSVC? |
Sure, but if we are disabling it for MSVC, I feel like we should also be disabling similarly for other compilers. At that point, disabling chunks of warnings would be easier :) |
gcc and clang only warn about implicit conversion with It's also a good idea to treat warnings as errors when running CI tests. |
@nwellnhof agreed on both counts :) That is in fact, exactly what I did in #512 - dropped the warning back down to |
It should be enough to set the |
C4244 appertains to implicit truncation of integer values [1]. Theoretically, these warnings could be detected using
-fsanitize=implicit-unsigned-integer-truncation
and-fsanitize=implicit-signed-integer-truncation
as well as statically using-Werror=implicit-int-conversion
.[1] https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-levels-3-and-4-c4244?view=msvc-170