Bug: Bulk ack truncated by NetworkError / CORS #452

Closed
hhramberg opened this issue Oct 28, 2022 · 4 comments
Labels: backend (needs/has companion issue in backend), bug (something is not working as expected), usability (improves ease of use)

Comments

hhramberg commented Oct 28, 2022

Describe the bug

Bulk acknowledgement of multiple Argus incidents causes CORS and/or NetworkError errors.

catched network error Error: Network Error
    exports createError.js:16
    onerror xhr.js:99
 Error: Failed to post incident ack: Error: Network Error
    value client.ts:611
    value client.ts:106
[apiinterceptor.tsx:43:18](https://secret.argus.com/static/js/components/apiinterceptor.tsx)
    e apiinterceptor.tsx:43
    value client.ts:109
    value client.ts:108
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://secret.argus.com/api/v1/incidents/26915/acks/. (Reason: CORS header ‘Access-Control-Allow-Origin’ missing). Status code: 502.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Argus incident page.
  2. Click on "select all" incidents.
  3. Click on "Ack" to acknowledge all incidents currently selected.
  4. See error reported by Argus.

Expected behavior
All selected Argus incidents should be acknowledged successfully, without any network errors.

Screenshots
(Screenshot: argus_bulk_ack_issue)

Desktop (please complete the following information):

  • OS: Manjaro Linux 22.0.0
  • Browser: Firefox
  • Version: 105.0.3 (64-bit)

Additional context
Argus version: Backend v.1.8.0, API v1(stable), frontend v.1.6.1

hhramberg added the bug label on Oct 28, 2022
@hhramberg
Author

After some further testing it appears that the error occurs when bulk inserting 5 or more incidents.

@johannaengland
Contributor

Fixed by #480

@podliashanyk
Contributor

podliashanyk commented Mar 30, 2023

The bug is still relevant. The error message is the same as the one reported by @hhramberg.

Bug update:

Bulk close and bulk reopen are also affected. In every case where the bug occurs, only 4-6 incidents are updated properly and the rest fail.
Bulk ticket update works fine and without errors, regardless of how many incidents are being updated.

@lunkwill42
Member

I have debugged this in our own infrastructure, and this isn't a front-end bug.

Currently, the problem stems from how notifications are processed when API calls produce new incident events. Notifications are processed synchronously for every updated incident, and each event causes a background process to be forked off.

This essentially means that when acking 100 incidents, 100 background processes are spawned in rapid succession to dispatch notifications for the 100 resulting events. In our production infrastructure this takes about 33 seconds, but our deployment runs on gunicorn with its default timeout of 30 seconds. When the Django process has not responded within those 30 seconds, gunicorn kills it. The K8s load balancer is left with an incomplete response and returns 502 Bad Gateway. That response carries no CORS headers, so the browser appears to log it as a CORS error.

This does not affect bulk ticket operations, as these do not currently generate any events, so no notifications are processed.

The workaround in our production environment is to increase the gunicorn timeout, so the request is allowed to complete.
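For anyone wanting to apply the same workaround, a minimal sketch of a gunicorn config file follows. gunicorn config files are plain Python, and `timeout` is the setting whose default is 30 seconds. The value 120 is an assumption chosen for illustration, not the value used in the deployment described above; pick one that covers your slowest bulk operation.

```python
# gunicorn.conf.py (sketch; the value below is an assumption, not from this issue)

# Seconds a worker may go without responding before gunicorn kills and
# restarts it. The gunicorn default is 30.
timeout = 120
```

The file can be passed with `gunicorn -c gunicorn.conf.py ...`, or the same setting can be given on the command line as `--timeout 120`.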

For the Argus back-end, there is a slight mitigation in progress in Uninett/Argus#616, but the real fix is that we are currently reworking the notification system to process events asynchronously using messaging queues. See Uninett/Argus#121 for the overarching status of the notification system rewrites that are taking place.
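To illustrate the general idea behind asynchronous processing, here is a rough sketch using only the Python standard library. It is not Argus code: every name in it (`event_queue`, `send_notifications`, `handle_bulk_ack`) is hypothetical, and an in-process `queue.Queue` with one worker thread stands in for whatever messaging queue the rewrite ends up using. The point is only that the request handler enqueues events and returns immediately, while the slow notification work happens elsewhere.

```python
import queue
import threading

# Hypothetical names throughout; an in-process queue stands in for a real
# messaging queue.
event_queue = queue.Queue()
sent = []  # records which events were "notified", for demonstration only


def send_notifications(event_id):
    """Stand-in for the slow notification dispatch."""
    sent.append(event_id)


def worker():
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        event_id = event_queue.get()
        if event_id is None:
            break
        send_notifications(event_id)


def handle_bulk_ack(incident_ids):
    """Request handler: enqueue one event per incident and return at once."""
    for incident_id in incident_ids:
        event_queue.put(incident_id)
    return len(incident_ids)


thread = threading.Thread(target=worker, daemon=True)
thread.start()

queued = handle_bulk_ack(list(range(100)))  # returns without waiting on dispatch
event_queue.put(None)  # ask the worker to stop once the backlog is drained
thread.join()
```

With this shape, the time spent in the request no longer grows with the cost of dispatching notifications, so a bulk operation should be much less likely to run into the web server timeout.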
