Merge pull request galaxyproject#19594 from sanjaysrikakulam/cleanup_…
…jwd_periodic_task

Add failed jobs working directory cleanup as a celery periodic task
jmchilton authored Feb 13, 2025
2 parents a81c573 + e1e962f commit d572bcc
Showing 5 changed files with 85 additions and 7 deletions.
37 changes: 37 additions & 0 deletions doc/source/admin/galaxy_options.rst
@@ -5747,4 +5747,41 @@
:Type: int


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``enable_failed_jobs_working_directory_cleanup``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
    Enables the cleanup of failed Galaxy jobs' working directories.
    Runs in a Celery task.
:Default: ``false``
:Type: bool


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``failed_jobs_working_directory_cleanup_days``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
    The number of days to keep failed Galaxy jobs' working directories
    before attempting to delete them if
    ``enable_failed_jobs_working_directory_cleanup`` is ``true``. The
    cleanup runs in a Celery task.
:Default: ``5``
:Type: int


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``failed_jobs_working_directory_cleanup_interval``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
    The interval in seconds between attempts to delete all failed
    Galaxy jobs' working directories from the filesystem (every 24
    hours by default) if
    ``enable_failed_jobs_working_directory_cleanup`` is ``true``. The
    cleanup runs in a Celery task.
:Default: ``86400``
:Type: int
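
The relationship between the retention option and job age can be sketched in Python. This is only an illustration, not Galaxy's code: the helper names `cleanup_cutoff` and `is_eligible` are hypothetical, and it assumes that "days to keep" means a failed job's working directory becomes eligible for deletion once the job is older than the current time minus ``failed_jobs_working_directory_cleanup_days``.

```python
from datetime import datetime, timedelta, timezone


def cleanup_cutoff(now: datetime, days: int = 5) -> datetime:
    """Return the timestamp before which failed jobs count as old enough.

    `days` mirrors failed_jobs_working_directory_cleanup_days (default 5).
    """
    return now - timedelta(days=days)


def is_eligible(job_time: datetime, now: datetime, days: int = 5) -> bool:
    """Assumed rule: a failed job is eligible once it is older than the cutoff."""
    return job_time < cleanup_cutoff(now, days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_eligible(now - timedelta(days=6), now))  # older than 5 days
    print(is_eligible(now - timedelta(days=1), now))  # still inside the window
```

The interval option is independent of this rule: it only controls how often the check runs, so with the defaults a directory could survive for up to roughly six days (five days of retention plus one 24-hour interval).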



3 changes: 3 additions & 0 deletions lib/galaxy/celery/__init__.py
@@ -246,6 +246,9 @@ def schedule_task(task, interval):
     if config.object_store_cache_monitor_driver in ["auto", "celery"]:
         schedule_task("clean_object_store_caches", config.object_store_cache_monitor_interval)

+    if config.enable_failed_jobs_working_directory_cleanup:
+        schedule_task("cleanup_jwds", config.failed_jobs_working_directory_cleanup_interval)
+
     if beat_schedule:
         celery_app.conf.beat_schedule = beat_schedule
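
As a rough picture of what this hunk produces, the sketch below builds a Celery beat schedule from a stub configuration. Several parts are assumptions rather than facts from the commit: the body of `schedule_task` is not shown in the diff, so the version here is a guess at its behavior; `StubConfig` and `build_beat_schedule` are hypothetical names; and the dotted task path `galaxy.celery.tasks.cleanup_jwds` is inferred from the file location. The only Celery behavior relied on is that a `beat_schedule` entry is a dict with a `task` name and a `schedule`, which may be a number of seconds.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class StubConfig:
    """Hypothetical stand-in for the Galaxy config object."""

    enable_failed_jobs_working_directory_cleanup: bool = False
    failed_jobs_working_directory_cleanup_interval: int = 86400


def build_beat_schedule(config: StubConfig) -> Dict[str, Dict[str, Any]]:
    beat_schedule: Dict[str, Dict[str, Any]] = {}

    def schedule_task(task: str, interval: int) -> None:
        # Assumed behavior: a non-positive interval disables the task.
        if interval > 0:
            beat_schedule[task] = {
                "task": f"galaxy.celery.tasks.{task}",  # assumed task path
                "schedule": interval,  # seconds between runs
            }

    # Mirrors the lines added in this commit.
    if config.enable_failed_jobs_working_directory_cleanup:
        schedule_task("cleanup_jwds", config.failed_jobs_working_directory_cleanup_interval)

    return beat_schedule


if __name__ == "__main__":
    print(build_beat_schedule(StubConfig(enable_failed_jobs_working_directory_cleanup=True)))
```

With the option left at its default of ``false``, the returned schedule contains no cleanup entry at all, so the task is never queued.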

3 changes: 2 additions & 1 deletion lib/galaxy/celery/tasks.py
@@ -508,7 +508,7 @@ def dispatch_pending_notifications(notification_manager: NotificationManager):


 @galaxy_task(action="clean up job working directories")
-def cleanup_jwds(sa_session: galaxy_scoped_session, object_store: BaseObjectStore, days: int = 5):
+def cleanup_jwds(sa_session: galaxy_scoped_session, object_store: BaseObjectStore, config: GalaxyAppConfiguration):
     """Cleanup job working directories for failed jobs that are older than X days"""

     def get_failed_jobs():
@@ -530,6 +530,7 @@ def delete_jwd(job):
             log.error(f"Error deleting job working directory: {path} : {e.strerror}")

     failed_jobs = get_failed_jobs()
+    days = config.failed_jobs_working_directory_cleanup_days

     if not failed_jobs:
         log.info("No failed jobs found within the last %s days", days)
28 changes: 22 additions & 6 deletions lib/galaxy/config/sample/galaxy.yml.sample
@@ -1,21 +1,21 @@
 # Galaxy is configured by default to be usable in a single-user development
 # environment. To tune the application for a multi-user production
 # environment, see the documentation at:
-#
+#
 # https://docs.galaxyproject.org/en/master/admin/production.html
-#
+#
 # Throughout this sample configuration file, except where stated otherwise,
 # uncommented values override the default if left unset, whereas commented
 # values are set to the default value. Relative paths are relative to the root
 # Galaxy directory.
-#
+#
 # Examples of many of these options are explained in more detail in the Galaxy
 # Community Hub.
-#
+#
 # https://galaxyproject.org/admin/config
-#
+#
 # Config hackers are encouraged to check there before asking for help.
-#
+#
 # Configuration for Gravity process manager.
 # ``uwsgi:`` section will be ignored if Galaxy is started via Gravity commands (e.g ``./run.sh``, ``galaxy`` or ``galaxyctl``).
 gravity:
@@ -3067,3 +3067,19 @@ galaxy:
  # affects s3fs file sources.
  #file_source_listings_expiry_time: 60

  # Enables the cleanup of failed Galaxy jobs' working directories. Runs
  # in a Celery task.
  #enable_failed_jobs_working_directory_cleanup: false

  # The number of days to keep failed Galaxy jobs' working directories
  # before attempting to delete them if
  # enable_failed_jobs_working_directory_cleanup is ``true``. The
  # cleanup runs in a Celery task.
  #failed_jobs_working_directory_cleanup_days: 5

  # The interval in seconds between attempts to delete all failed Galaxy
  # jobs' working directories from the filesystem (every 24 hours by
  # default) if enable_failed_jobs_working_directory_cleanup is
  # ``true``. The cleanup runs in a Celery task.
  #failed_jobs_working_directory_cleanup_interval: 86400

21 changes: 21 additions & 0 deletions lib/galaxy/config/schemas/config_schema.yml
@@ -4240,3 +4240,24 @@ mapping:
          Number of seconds before file source content listings are refreshed. Shorter times will result in more
          queries while browsing a file sources. Longer times will result in fewer requests to file sources but
          outdated contents might be displayed to the user. Currently only affects s3fs file sources.
      enable_failed_jobs_working_directory_cleanup:
        type: bool
        default: false
        required: false
        desc: |
          Enables the cleanup of failed Galaxy jobs' working directories. Runs in a Celery task.
      failed_jobs_working_directory_cleanup_days:
        type: int
        required: false
        default: 5
        desc: |
          The number of days to keep failed Galaxy jobs' working directories before attempting to delete them if enable_failed_jobs_working_directory_cleanup is ``true``. The cleanup runs in a Celery task.
      failed_jobs_working_directory_cleanup_interval:
        type: int
        required: false
        default: 86400
        desc: |
          The interval in seconds between attempts to delete all failed Galaxy jobs' working directories from the filesystem (every 24 hours by default) if enable_failed_jobs_working_directory_cleanup is ``true``. The cleanup runs in a Celery task.
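
What the three schema entries imply for a user-supplied configuration can be illustrated with a small resolver. This is a hypothetical sketch and not Galaxy's schema loader: `SCHEMA` simply restates the types and defaults from the entries above, and `resolve_options` is an invented helper that applies a default when an option is unset and rejects a value of the wrong type.

```python
from typing import Any, Dict, Tuple

# Types and defaults restated from the schema entries above.
SCHEMA: Dict[str, Tuple[type, Any]] = {
    "enable_failed_jobs_working_directory_cleanup": (bool, False),
    "failed_jobs_working_directory_cleanup_days": (int, 5),
    "failed_jobs_working_directory_cleanup_interval": (int, 86400),
}


def resolve_options(user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in defaults for unset options and check each value's type."""
    resolved: Dict[str, Any] = {}
    for name, (expected_type, default) in SCHEMA.items():
        value = user_config.get(name, default)
        # bool is a subclass of int in Python, so reject True/False for int options.
        is_bool_for_int = expected_type is int and isinstance(value, bool)
        if is_bool_for_int or not isinstance(value, expected_type):
            raise TypeError(f"{name} must be of type {expected_type.__name__}")
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    print(resolve_options({"enable_failed_jobs_working_directory_cleanup": True}))
```

Because every entry is marked ``required: false``, an empty configuration resolves to the defaults, which leaves the cleanup disabled.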
