Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys.exit traceback running 8.3+ on MacOS #6291

Closed
hjoliver opened this issue Aug 7, 2024 · 5 comments
Closed

sys.exit traceback running 8.3+ on MacOS #6291

hjoliver opened this issue Aug 7, 2024 · 5 comments
Assignees
Labels
bug Something is wrong :(
Milestone

Comments

@hjoliver
Copy link
Member

hjoliver commented Aug 7, 2024

Traceback appears in terminal on start-up in detaching mode, since 8.3.0 (8.2.7 was fine)
The workflow still runs, but a similar traceback appears in the scheduler log on shutdown

$ cylc play bug/run20

 ▪ ■  Cylc Workflow Engine 8.3.0.dev
 ██   Copyright (C) 2008-2023 NIWA
▝▘    & British Crown (Met Office) & Contributors

2024-08-08T11:48:35+12:00 INFO - Extracting job.sh to /Users/oliver/cylc-
    run/bug/run20/.service/etc/job.sh
Traceback (most recent call last):
  File "/Users/oliver/miniforge3/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/oliver/miniforge3/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/Users/oliver/miniforge3/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/Users/oliver/miniforge3/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/Users/oliver/miniforge3/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/oliver/cylc/cylc-flow/cylc/flow/scheduler_cli.py", line 417, in scheduler_cli
    daemonize(scheduler)
  File "/Users/oliver/cylc/cylc-flow/cylc/flow/daemonize.py", line 121, in daemonize
    sys.exit(0)
SystemExit: 0
@hjoliver hjoliver added the bug Something is wrong :( label Aug 7, 2024
@hjoliver
Copy link
Member Author

hjoliver commented Aug 8, 2024

The offending commit is apparently this one:

$ git bisect bad
8140c6f6a2fa4c7ac44ded0f60bdcdf3bd145bb2 is the first bad commit
commit 8140c6f6a2fa4c7ac44ded0f60bdcdf3bd145bb2 (HEAD)
Author: Mark Dawson <[email protected]>
Date:   Wed Nov 1 17:00:39 2023 +0000

    made reinstall work on multiple workflows

    flake8 clean up

    fixed unit tests

    fixed cylc-combination/01-vr-reload functional test

    fixed cylc-reinstall/00-simple.t functional test

    updated change log

    review amends

 changes.d/5803.feat.md                      |  1 +
 cylc/flow/scheduler_cli.py                  | 46 ++++++++++++++++++++++++++++------------------
 cylc/flow/scripts/reinstall.py              | 35 +++++++++++++++++++++--------------
 cylc/flow/scripts/validate_reinstall.py     | 25 +++++++++++++++----------
 tests/functional/cylc-reinstall/00-simple.t |  1 +
 tests/integration/test_reinstall.py         | 49 +++++++++++++++++++++++++------------------------
 6 files changed, 91 insertions(+), 66 deletions(-)
 create mode 100644 changes.d/5803.feat.md

From PR #5803

@hjoliver
Copy link
Member Author

hjoliver commented Aug 8, 2024

It's something to do with the asyncio changes in scheduler_cli but I haven't got my head around it yet.

@oliver-sanders
Copy link
Member

oliver-sanders commented Aug 8, 2024

Aah.

The asyncio stuff in scheduler_cli is highly abnormal due to the daemonization step.

If the process is being daemonized, then control is passed over to the new fork. However, asyncio is not yet initalised in that fork so we have to open a new loop.

This has resulted in the highly illogical situation where asyncio.run is called from within an async function when detaching (to start the new loop):

# run the workflow
if options.no_detach:
ret = await _run(scheduler)
else:
# Note: The daemonization messes with asyncio so we have to start a
# new event loop if detaching
ret = asyncio.run(
_run(scheduler)
)

Calling asyncio.run from within an async function should not work (the loop is already running and asyncio does not tolerate nesting), however, it works because by this point in the code we are actually running in a new fork.

The solution must work for:

  • play (restart)
  • play (resume)
  • vip
  • vr

@oliver-sanders
Copy link
Member

oliver-sanders commented Aug 12, 2024

The workflow still runs, but a similar traceback appears in the scheduler log on shutdown

Traceback aside, is this symptomless?

If so, is this just Python reporting the sys.exit call?

If that's the case, I wonder if doing something along these lines would help:

# cylc/flow/scheduler_cli

@cli_function(get_option_parser)
def play(parser: COP, options: 'Values', id_: str):
    try:
        return asyncio.run(scheduler_cli(options, id_))
    except SystemExit as exc:
        sys.exit(exc.code)

Totally illogical of course!

@hjoliver
Copy link
Member Author

hjoliver commented Aug 12, 2024

Traceback aside, is this symptomless?

If so, is this just Python reporting the sys.exit call?

Yes, that's right.

But I couldn't avoid the traceback using any variant of your suggestion that I tried. Another way seems to work, see upcoming PR.

oliver-sanders added a commit to oliver-sanders/cylc-flow that referenced this issue Aug 20, 2024
* Closes cylc#6291
* The `cylc play` command would sometimes produce traceback
  when detaching workflows (the default unless `--no-detach` is used).
* This traceback does not appear to have had any ill effects, but may
  have suppressed the normal Python session teardown logic.
* It was only reported on Mac OS, but may potentially occur on other
  systems.
* This PR mitigates the circumstances under which the traceback
  occurred by separating the asyncio event loops that are run before and
  after daemonization.
@oliver-sanders oliver-sanders added this to the 8.3.4 milestone Aug 21, 2024
@oliver-sanders oliver-sanders self-assigned this Aug 21, 2024
@wxtim wxtim closed this as completed in 70aca62 Aug 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :(
Projects
None yet
Development

No branches or pull requests

2 participants