Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor heartbeat to shutdown cleanly #1135

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

halleysfifthinc
Copy link
Contributor

@halleysfifthinc halleysfifthinc commented Dec 10, 2024

With Julia 1.11.2 (also tested 1.10.7), I'm seeing segfaults when shutting down an idle
Julia kernel (with current [email protected]) in Jupyter on Linux.

% jupyter --version
Selected Jupyter core packages...
IPython          : 8.30.0
ipykernel        : 6.29.5
ipywidgets       : 8.1.5
jupyter_client   : 8.6.3
jupyter_core     : 5.7.2
jupyter_server   : 2.14.2
jupyterlab       : 4.3.2
nbclient         : 0.10.1
nbconvert        : 7.16.4
nbformat         : 5.10.4
notebook         : 7.0.7
qtconsole        : not installed
traitlets        : 5.14.3

Reproducible by opening/starting a notebook, and then shutting down after the kernel is
connected and idle.
EDIT: I can no longer reproduce this, but the change still seems like a good idea?

There are several errors in the segfault dump, but the heartbeat thread is notable

The ZMQ docs state "zmq_proxy() runs in the current thread and returns only
if/when the current context is closed."

The heartbeat socket doesn't need to be global, as nothing else touches
it. So, if we create the heartbeat socket in a Context that has a global ref,
we can close the context, which will cause zmq_proxy to return and then
that thread to end/finish.

With this change, I'm not sure whether the jl_gc_safe_enter ccall is necessary?

From ZMQ docs: "zmq_proxy() runs in the current thread and returns only
if/when the current context is closed."

The heartbeat socket doesn't need to be global, as nothing else touches
it. BUT, if we create the heartbeat socket in a `Context` that has a global ref,
we can close the context, which will cause zmq_proxy to return and then
that thread to end/finish.

Doing that before shutting down helps avoid a segfault on shutdown.
src/heartbeat.jl Outdated Show resolved Hide resolved
src/heartbeat.jl Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants