fix parsing of noncompliant RFC3339 timestamps missing only a timezone #3346
@gilbsgilbs: There is no 'kind' label on this PR. You need a 'kind' label to generate the release automatically.
Details: I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository.
@gilbsgilbs: There are no area labels on this PR. You can add as many areas as you see fit.
/kind fix
/area agent
Force-pushed from 112f9c0 to 31df237.
This commit fixes the parsing of dates that are almost RFC3339 compliant, except that they are missing a timezone. This format (which is still ISO 8601 compliant, but not RFC3339 compliant) seems to be quite widely used. Some Crowdsec parsers from the hub had to deal with it and ended up consistently appending a "Z" to make the timestamp UTC and RFC3339 compliant again:

- authentik-logs: https://github.com/crowdsecurity/hub/blob/146659cd4ac19abfa87e39b5e5c0ec8bc4313bf8/parsers/s01-parse/firix/authentik-logs.yaml#L24
- redmine-logs: https://github.com/crowdsecurity/hub/blob/146659cd4ac19abfa87e39b5e5c0ec8bc4313bf8/parsers/s01-parse/LePresidente/redmine-logs.yaml#L22
- qbittorent [not upstreamed yet]: https://github.com/crowdsecurity/hub/pull/1179/files#diff-ba102ec88ac5a804fd6acfac54bdae1778b44992ed8b550a011082a32e6f9b9cR32

In the qbittorent parser, I tried to be a bit cautious by checking whether a timezone is already present before appending one, because I suspected that the missing timezone might be due to a system configuration quirk rather than something intended by the developer and stable from one machine to another.

Handling this edge case at the parser level would make things less fragile, and would prevent such dirty workarounds from spreading on the hub. I don't see any downside: it doesn't break any existing parsing, it just adds support for more formats.

Also note that, per the specification, adding a UTC timezone to a timezone-naive timestamp is not actually the 100% accurate thing to do. In theory, we should use the "local" timezone… of the machine that initially emitted the log, which is hard to figure out. But this is a tradeoff that will at least prevent parsing errors, and it sounds like a reasonable default, as downstream can still override this behavior by specifying a timezone explicitly to get rid of any ambiguity.
Force-pushed from 31df237 to d4db2f4.