coco crashes if it can't reach slack.com #202

Closed
nritsche opened this issue May 28, 2020 · 3 comments

@nritsche
Contributor

nritsche commented May 28, 2020

reverse log:

-- Logs begin at Wed 2020-04-29 15:01:56 PDT, end at Thu 2020-05-28 13:30:14 PDT. --
May 28 13:30:08 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:29:08 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:29:08 csBfs cocod[23515]: [2020-05-28 13:29:08 -0700] - (sanic.access)[INFO][10.1.13.1:32990]: POST http://csbfs:54323/rfi-zeroing-toggle  503 23
May 28 13:29:08 csBfs cocod[23515]: NoneType: None
May 28 13:29:08 csBfs cocod[23515]: [2020-05-28 13:29:08 -0700] [23537] [ERROR] Exception occurred while handling uri: 'http://csbfs:54323/rfi-zeroing-toggle'
May 28 13:18:19 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:18:18 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:18:18 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:18:18 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:18:17 csBfs cocod[23515]: aioredis.errors.ConnectionClosedError: Reader at end of file
May 28 13:18:17 csBfs cocod[23515]: raise ConnectionClosedError(msg)
May 28 13:18:17 csBfs cocod[23515]: File "/usr/local/lib/python3.7/site-packages/aioredis/connection.py", line 322, in execute
May 28 13:18:17 csBfs cocod[23515]: await conn.execute("rpush", f"{name}:res", json.dumps(result))
May 28 13:18:17 csBfs cocod[23515]: File "/usr/local/lib/python3.7/site-packages/coco/worker.py", line 168, in go
May 28 13:18:17 csBfs cocod[23515]: File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
May 28 13:18:17 csBfs cocod[23515]: loop.run_until_complete(asyncio.gather(go(), scheduler.start()))
May 28 13:18:17 csBfs cocod[23515]: File "/usr/local/lib/python3.7/site-packages/coco/worker.py", line 187, in main_loop
May 28 13:18:17 csBfs cocod[23515]: self._target(*self._args, **self._kwargs)
May 28 13:18:17 csBfs cocod[23515]: File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
May 28 13:18:17 csBfs cocod[23515]: self.run()
May 28 13:18:17 csBfs cocod[23515]: File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
May 28 13:18:17 csBfs cocod[23515]: Traceback (most recent call last):
May 28 13:18:17 csBfs cocod[23515]: Process Process-1:
May 28 13:18:17 csBfs cocod[23515]: {'attachments': [{'ts': 1590696177.837849, 'text': 'Success!', 'title': 'coco.endpoint.update-gain'}], 'channel': 'receiver_ops'}
May 28 13:18:17 csBfs cocod[23515]: This was the message:
May 28 13:18:17 csBfs cocod[23515]: Sending message to slack server failed: Cannot connect to host slack.com:443 ssl:default [Connection reset by peer]
May 28 13:18:17 csBfs cocod[23515]: [2020-05-28 13:18:17 -0700] [23530] [DEBUG] [coco.endpoint.start-node] Success!
May 28 13:18:04 csBfs cocod[23515]: [2020-05-28 13:18:04 -0700] [23530] [DEBUG] [coco.endpoint.start-node] Request "None"
May 28 13:18:04 csBfs cocod[23515]: [2020-05-28 13:18:04 -0700] [23530] [INFO] [coco.endpoint.start-node] endpoint called
May 28 13:17:59 csBfs cocod[23515]: Connection <RedisConnection [db:0]> has pending commands, closing it.
May 28 13:17:54 csBfs cocod[23515]: [2020-05-28 13:17:54 -0700] [23530] [DEBUG] [coco.endpoint.restart-node] Request "None"
May 28 13:17:54 csBfs cocod[23515]: [2020-05-28 13:17:54 -0700] [23530] [DEBUG] [coco.endpoint.kill-node] Success!
May 28 13:17:51 csBfs cocod[23515]: [2020-05-28 13:17:51 -0700] [23530] [DEBUG] [coco.endpoint.kill-node] Request "None"
May 28 13:17:51 csBfs cocod[23515]: [2020-05-28 13:17:51 -0700] [23530] [INFO] [coco.endpoint.kill-node] endpoint called
May 28 13:17:51 csBfs cocod[23515]: [2020-05-28 13:17:51 -0700] [23530] [DEBUG] [coco.check] Calling kill-node on hosts [cndg3:12048, cscg1:12048, cs0g7:12048, cnbg4:12048, cn0g9:12048, cn4g3:12048, csbg4:12048, cn1g1:12048, csag3:12048, cscg6:12048, cn9g3:12048, cn5g4:12048, cndg8
May 28 13:17:51 csBfs cocod[23515]: [2020-05-28 13:17:51 -0700] [23530] [INFO] [coco.check] /status: Check reply for values failed: ['http://cnDg3:12048/', 'http://csCg1:12048/', 'http://cs0g7:12048/', 'http://cnBg4:12048/', 'http://cn0g9:12048/', 'http://cn4g3:12048/', 'http://csB
May 28 13:17:51 csBfs cocod[23515]: [2020-05-28 13:17:51 -0700] [23530] [DEBUG] [coco.check] Expected False but found True.
@nritsche nritsche added the bug Something isn't working label May 28, 2020
@jrs65
Contributor

jrs65 commented Aug 20, 2020

Here's a similar issue. We seem to have hit a Slack API rate limit and had a similar crash:

Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873093.9287603, 'text': 'endpoint called', 'title': 'coco.endpoint.update-bad-inputs-receiver', 'color': 'good'}]
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873093.9287603, 'text': 'endpoint called', 'title': 'coco.endpoint.update-bad-inputs-receiver', 'color': 'good'}]
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873093.5713255, 'text': 'Request "{\'update_id\': \'flaginput_20200819T213812.963749Z_rms\', \'start_time\': 1597
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873092.9567924, 'text': 'endpoint called', 'title': 'coco.endpoint.update-bad-inputs-cluster', 'color': 'good'}],
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873092.9563758, 'text': 'endpoint called', 'title': 'coco.endpoint.update-bad-inputs', 'color': 'good'}], 'channe
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873080.2039685, 'text': 'endpoint called', 'title': 'coco.endpoint.update-gain', 'color': 'good'}], 'channel': 'c
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873070.362564, 'text': 'endpoint called', 'title': 'coco.endpoint.update-gain', 'color': 'good'}], 'channel': 'co
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873051.6476731, 'text': 'Success!', 'title': 'coco.endpoint.update-gain'}], 'channel': 'coco_messages'}
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873050.7396224, 'text': 'Success!', 'title': 'coco.endpoint.gps-time'}], 'channel': 'coco_messages'}
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: {'attachments': [{'ts': 1597873050.3309178, 'text': 'Request "None"', 'title': 'coco.endpoint.gps-time'}], 'channel': 'coco_messages'}
Aug 20 09:27:37 csBfs cocod[2142]: This was the message:
Aug 20 09:27:37 csBfs cocod[2142]: Sending message to slack server failed with status: Too Many Requests (429).
Aug 20 09:27:37 csBfs cocod[2142]: [2020-08-20 09:27:37 -0700] [2145] [ERROR] [coco.scheduler] Scheduler failed calling gps-time: (500, message='Internal Server Error', url=URL
Aug 20 09:27:37 csBfs cocod[2142]: [2020-08-20 09:27:37 -0700] - (sanic.access)[INFO][127.0.0.1:53908]: GET http://localhost:54323/gps-time  500 250
Aug 20 09:27:37 csBfs cocod[2142]: aioredis.errors.ConnectionClosedError: Reader at end of file
Aug 20 09:27:37 csBfs cocod[2142]: res = await fut
Aug 20 09:27:37 csBfs cocod[2142]: File "/usr/local/lib/python3.7/site-packages/aioredis/util.py", line 52, in wait_ok
Aug 20 09:27:37 csBfs cocod[2142]: now,
Aug 20 09:27:37 csBfs cocod[2142]: File "/usr/local/lib/python3.7/site-packages/coco/master.py", line 456, in external_endpoint
Aug 20 09:27:37 csBfs cocod[2142]: response = await response
Aug 20 09:27:37 csBfs cocod[2142]: File "/usr/local/lib/python3.7/site-packages/sanic/app.py", line 973, in handle_request
Aug 20 09:27:37 csBfs cocod[2142]: Traceback (most recent call last):
Aug 20 09:27:37 csBfs cocod[2142]: [2020-08-20 09:27:37 -0700] [2152] [ERROR] Exception occurred while handling uri: 'http://localhost:54323/gps-time'
Aug 20 09:27:35 csBfs cocod[2142]: [2020-08-20 09:27:35 -0700] - (sanic.access)[INFO][10.1.50.11:39040]: POST http://csBfs:54323/update-gain  200 55

(reversed log)
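
Both logs show Slack delivery failing, once with a connection reset and once with HTTP 429. A minimal sketch of a notifier that treats both as non-fatal is shown below. It is not coco's actual code: `post_with_backoff`, the `send` callable and its `(status, headers)` return value are hypothetical, and the sketch assumes the transport raises `OSError` on connection failures. The only external fact it relies on is that Slack sends a `Retry-After` header (in seconds) with 429 responses.

```python
import asyncio
import logging

log = logging.getLogger("slack_sketch")  # hypothetical logger name


async def post_with_backoff(send, message, max_attempts=3,
                            default_delay=1.0, sleep=asyncio.sleep):
    """Try to deliver `message`. Never raises; returns True on success.

    `send` is a hypothetical coroutine function taking the message dict and
    returning a tuple (HTTP status, response headers).
    """
    for attempt in range(max_attempts):
        try:
            status, headers = await send(message)
        except OSError as exc:
            # Assumed failure mode: connection reset, DNS failure, etc.
            log.warning("Sending message to slack server failed: %s", exc)
            return False
        if status == 200:
            return True
        if status != 429:
            log.warning("Sending message to slack server failed with status %s", status)
            return False
        # Rate limited: wait as long as the server asks before retrying.
        if attempt + 1 < max_attempts:
            await sleep(float(headers.get("Retry-After", default_delay)))
    log.warning("Giving up on slack message after %d attempts", max_attempts)
    return False
```

Returning a boolean instead of raising keeps a notification failure from propagating into the request handler or the worker loop.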

@nritsche
Contributor Author

I'm not sure whether the slack problems (which are different in the two cases) are related to it, but I'd say they are not the source of the crash. That would be aioredis.errors.ConnectionClosedError: Reader at end of file.

Therefore this and #203 seem to be duplicates of #178. Please re-open if you don't think so.
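
For illustration, the first traceback shows the worker process dying when `conn.execute("rpush", ...)` raises `ConnectionClosedError`. One possible way to survive that is to reconnect and retry, sketched below. This is not the fix adopted in #178: `push_result` and the `connect` factory are hypothetical, and the local `ConnectionClosedError` class is only a stand-in for `aioredis.errors.ConnectionClosedError` so the sketch runs without Redis.

```python
import asyncio
import json


class ConnectionClosedError(Exception):
    """Stand-in for aioredis.errors.ConnectionClosedError (assumption)."""


async def push_result(connect, name, result, retries=3, delay=0.5):
    """rpush `result` to `{name}:res`, reconnecting if the connection closed.

    `connect` is a hypothetical coroutine function returning an object with
    an awaitable `execute(command, *args)` method.
    """
    conn = await connect()
    for attempt in range(1, retries + 1):
        try:
            return await conn.execute("rpush", f"{name}:res", json.dumps(result))
        except ConnectionClosedError:
            if attempt == retries:
                raise  # out of retries: let the caller decide what to do
            await asyncio.sleep(delay)
            conn = await connect()  # open a fresh connection and try again
```

Whether a retry is safe depends on whether the original `rpush` may already have been applied before the connection dropped; the sketch does not guard against a duplicated result.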

@jrs65
Contributor

jrs65 commented Aug 31, 2020

Okay. Sounds reasonable to me.
