-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove delete sqs message calls and optimise json collector decorator #534
Remove delete sqs message calls and optimise json collector decorator #534
Conversation
…ent type, or with a events list expander from field)
Minimum allowed coverage is Generated by 🐒 cobertura-action against 670f763 |
…elastic#537) * Add ESF specific User-Agent header in outgoing Elasticsearch requests * Bump minimum supported Elastic stack version to 7.17
@@ -72,9 +72,10 @@ def setUpClass(cls) -> None: | |||
lgc = LogstashContainer(es_container=esc) | |||
cls.logstash = lgc.start() | |||
|
|||
lsc = LocalStackContainer(image="localstack/localstack:1.4.0") | |||
lsc = LocalStackContainer(image="localstack/localstack:3.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
We could use 3.0.1
here instead. We should have a look at how to automate these changes with dependabot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: It would be great, even as separate PR, to have a comment briefly explaining what this code does. The same applies for the overall json processing logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also add a changelog entry as part of v1.11.0
Bumps [boto3](https://github.com/boto/boto3) from 1.28.80 to 1.33.1. - [Release notes](https://github.com/boto/boto3/releases) - [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst) - [Commits](boto/boto3@1.28.80...1.33.1) --- updated-dependencies: - dependency-name: boto3 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [boto3](https://github.com/boto/boto3) from 1.33.1 to 1.33.2. - [Release notes](https://github.com/boto/boto3/releases) - [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst) - [Commits](boto/boto3@1.33.1...1.33.2) --- updated-dependencies: - dependency-name: boto3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
thanks for the PR! i'm still working through, but a couple of more general comments: there's lots of changes here to the JSON logic - it would be really useful to have a detailed description of what has changed and why in the PR description. additionally, in the future, it would be great to split the changes into seperate PRs to keep them logically seperate, and easier to review. here there are 3 changes: sqs delete message, localstack, and json parser. |
👍
as for the "sqs delete message", yes, there's already a separate branch for it: I've based my branch on it , since it's not yet merged to main it shows up in the diff: let me update the base branch |
@@ -100,7 +100,7 @@ def test_collect_buffer_collect( | |||
for i, line in enumerate(lines): | |||
collect_buffer.grow(data=line, newline=newline) | |||
|
|||
content, content_length = collect_buffer.collect_and_reset() | |||
content, content_length, _ = collect_buffer.collect_and_reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we validate the newline
returned from collect_and_reset
is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when we reset newline
could be empty, that's why we ignore it and keep the one from the previous collect
@@ -432,26 +430,24 @@ def feed_iterator(content: bytes) -> Iterator[tuple[bytes, bytes, int]]: | |||
b"\r\n", | |||
b"line1\r\n line1.1\r\n line1.2\r\nline2\r\n line2.1\r\n line2.2\r\n", | |||
[(b"line1", 7), (b" line1.1", 11), (b" line1.2", 11), (b"line2", 7), (b" line2.1", 11), (b" line2.2", 11)], | |||
id="circuit breaker: \n", | |||
id="circuit breaker: \r\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the reason for this change in the input data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just the id
of the test: the description that will be printed when running the test. It was wrong, since here we were testing \r\n
in the content as newline separator, not \n
I will add unit test today for the decorators, then we can merge |
default_max_bytes: int = 10485760 # Default maximum number of bytes to return in one multi-line event | ||
default_max_lines: int = 500 # Default maximum number of lines to return in one multi-line event | ||
default_multiline_timeout: int = 5 # Default timeout in secs to finish a multi-line event. | ||
|
||
timedelta_circuit_breaker: datetime.timedelta = datetime.timedelta(seconds=5) | ||
|
||
# CollectTuple is a tuple representing the multilines bytes content, the length of the content and the newline found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are really useful - thanks
…#544) * Remove sqs delete record calls * Remove delete sqs message calls and optimise json collector decorator (#534) * parse the content as json only when really necessary (not single content type, or with a events list expander from field) * make lint * fix bugs * bump localstack container * add ec2 service to localstack * fix comment * fix typo * reduce more complexity * fix test for localstack 3.0.0 * fix lint * further semplification and better performance * remove debug printing * remove debug printing * clean get_by_lines unit test * fix lint * even simplier * fix lint * increase coverage * Add ESF specific User-Agent header in outgoing Elasticsearch requests (#537) * Add ESF specific User-Agent header in outgoing Elasticsearch requests * Bump minimum supported Elastic stack version to 7.17 * Bump boto3 from 1.28.80 to 1.33.1 (#541) Bumps [boto3](https://github.com/boto/boto3) from 1.28.80 to 1.33.1. - [Release notes](https://github.com/boto/boto3/releases) - [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst) - [Commits](boto/boto3@1.28.80...1.33.1) --- updated-dependencies: - dependency-name: boto3 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump boto3 from 1.33.1 to 1.33.2 (#542) Bumps [boto3](https://github.com/boto/boto3) from 1.33.1 to 1.33.2. - [Release notes](https://github.com/boto/boto3/releases) - [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst) - [Commits](boto/boto3@1.33.1...1.33.2) --- updated-dependencies: - dependency-name: boto3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * inline doc * fix tests * changelog * add back empty newline in get_by_lines_parameters * use type aliases * add comments * fix typo in comment * add more comments * add decorator unit tests * fix decorator according to test expectations --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Davide Girardi <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update CHANGELOG.md * revert not updating ending offset in ExpandEventListFromField * add typing-extensions==4.8.0 to requirements --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: girodav <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Enhancement
What does this PR do?
Remove calls to
sqs.DeleteMessage
that are the most expensive operation in the forwarder, and optimiseJsonCollector
decorator in order to parse as json the storage content only when strictly requiredIt also bumps the version of Localstack used to 3.x
Why is it important?
Checklist
- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration filesCHANGELOG.md
Author's Checklist
How to test this PR locally
Related issues
Use cases
Screenshots
Logs