Recover / Reconnect / Restart Server #37

Open
fire17 opened this issue Jun 21, 2024 · 1 comment


fire17 commented Jun 21, 2024

Hi there,
First of all, thanks a lot.
I've been using this library for a few years and I love it for the most part.

The only thing is that I can't seem to recover from errors.
If something goes wrong during the client/server conversation,
I have to restart all of the servers and all of the clients.

 ::: NEW QUERY REQUEST ON ROUTER SERVER PIPELINE: Local RID: 6 ::: Query: tell a joke :::
Traceback (most recent call last):
  File "/Users/magic/wholesomegarden/magicllight/magicllight/core/airouter/pipelines/xo_benedict/freshServer.py", line 69, in listen
    for payload in listen_for_request:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/zeroless/zeroless.py", line 60, in _recv
    frames = sock.recv_multipart()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/zmq/sugar/socket.py", line 806, in recv_multipart
    parts = [self.recv(flags, copy=copy, track=track)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_zmq.py", line 1137, in zmq.backend.cython._zmq.Socket.recv
  File "_zmq.py", line 1172, in zmq.backend.cython._zmq.Socket.recv
  File "_zmq.py", line 1264, in zmq.backend.cython._zmq._recv_copy
  File "_zmq.py", line 1259, in zmq.backend.cython._zmq._recv_copy
  File "_zmq.py", line 160, in zmq.backend.cython._zmq._check_rc
zmq.error.ZMQError: Operation cannot be accomplished in current state
.............. server crashed, needs recovering.........

But no matter what I do, I can't seem to recover [ALL CLIENTS AND SERVERS MUST DIE TO RESTART].
I want a simple recovery without closing everything.

Again, this library is amazing,
but the server/client connections need to be more robust and handle reconnecting automatically.

Please let me know what you think, and how we can solve this
Thanks a lot and all the best!
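
For context, `Operation cannot be accomplished in current state` is the message ZeroMQ attaches to its EFSM error, which request/reply sockets raise when send and receive calls fall out of their strict alternation. One way a reply socket can end up there is when a request was received but never answered, for example because the handler raised before the reply was sent; whether that is what happened in the traceback above is an assumption. A minimal sketch of a loop that always answers, using plain Python stand-ins for the `reply` callable and the `listen_for_request` generator (the names `serve`, `flaky`, and the `b"FAILED"` marker are hypothetical, not part of zeroless):

```python
import traceback
from typing import Callable, Iterable, List


def serve(requests: Iterable[bytes],
          reply: Callable[[bytes], None],
          handler: Callable[[bytes], bytes]) -> int:
    """Answer every request exactly once, even when the handler raises.

    Returns the number of requests whose handler failed.
    """
    failures = 0
    for payload in requests:
        try:
            result = handler(payload)
        except Exception:
            # Hypothetical failure marker; the client has to know how to read it.
            traceback.print_exc()
            result = b"FAILED"
            failures += 1
        # The reply is sent on both paths, so the loop never asks for the
        # next request while the current one is still unanswered.
        reply(result)
    return failures


# Stand-ins for a real socket: a list of requests and a list collecting replies.
sent: List[bytes] = []


def flaky(payload: bytes) -> bytes:
    if payload == b"boom":
        raise ValueError("handler failed")
    return payload.upper()


failed = serve([b"tell a joke", b"boom", b"again"], sent.append, flaky)
```

This only keeps the send/receive order intact on the server side; it does not cover a reply that fails in transit or a client that has already gone away.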


fire17 commented Jun 21, 2024

The solution I'm looking for:

import traceback

reply, listen_for_request = server.reply()

try:
	for payload in listen_for_request:
		res = process_payload(payload)
		reply(res)
except Exception:
	traceback.print_exc()
	# recover[0] = True
	print(".............. recovering from zmq error .........")

	server.reconnect()           # <-------------- THIS NEEDS TO BE INCLUDED IN THE ZEROLESS LIBRARY
	# This should:
	#	1. Restart the server and avoid the "already exists on port" error.
	#	2. Handle the stuck client, by either:
	#		a. sending the stuck request a FAILED message and letting the client handle it
	#		b. calling the function again (simple recovery)
	#		c. returning what the failed method already processed (advanced recovery, BETTER SOLUTION)
	#			Explanation: the ZMQ error happens at the end (on reply), which means that the server
	#			function already ran and returned results. So right before sending via reply, that data
	#			should be temporarily saved, and if the reply fails, it should be returned to the client
	#			immediately after server recovery.

	print(".............. recovering done .........")
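
Option 2c above can be sketched independently of any socket code. The idea is to store each result under its request ID before replying, so that a retried request gets the stored result instead of running the function again. This assumes requests carry an ID (the log above shows a local RID) and that the client resends the same ID after recovery; `ResultCache`, `handle`, and `acknowledge` are hypothetical names, not zeroless API.

```python
from typing import Callable, Dict, Hashable


class ResultCache:
    """Keep computed results until the reply is known to have been delivered."""

    def __init__(self, handler: Callable[[bytes], bytes]) -> None:
        self._handler = handler
        self._results: Dict[Hashable, bytes] = {}
        self.calls = 0  # how many times the real handler actually ran

    def handle(self, request_id: Hashable, payload: bytes) -> bytes:
        # Run the handler only for IDs that have no stored result yet.
        if request_id not in self._results:
            self.calls += 1
            self._results[request_id] = self._handler(payload)
        return self._results[request_id]

    def acknowledge(self, request_id: Hashable) -> None:
        # Drop the stored result once the reply went out successfully.
        self._results.pop(request_id, None)

    def pending(self) -> int:
        return len(self._results)


cache = ResultCache(lambda payload: payload.upper())

# First attempt: the result is computed and stored before the reply is sent.
first = cache.handle(6, b"tell a joke")

# Suppose the reply failed here. After recovery the client retries with the
# same ID and receives the stored result without the handler running again.
retry = cache.handle(6, b"tell a joke")

# Only after a successful reply is the stored result released.
cache.acknowledge(6)
```

The trade-off is memory: results stay in the cache until they are acknowledged, so a real server would probably also want an expiry policy for requests whose clients never come back.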
