Merge pull request #37 from ACCESS-NRI/main
Update from main
bschroeter authored Jan 7, 2025
2 parents 6bcbebc + 635f1a8 commit aa8edb6
Showing 21 changed files with 294 additions and 161 deletions.
11 changes: 11 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,11 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file

version: 2
updates:
  - package-ecosystem: "" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"
@@ -5,7 +5,7 @@ matrix:
mode: en
dictionary:
wordlists:
- .github/workflows/.wordlist.txt
- .github/wordlist.txt
output: wordlist.dic
encoding: utf-8
pipeline:
2 changes: 0 additions & 2 deletions .github/workflows/.wordlist.txt → .github/wordlist.txt
@@ -1,6 +1,4 @@
HPCPY
HPC
hpcpy
HPCpy
pre
py
4 changes: 2 additions & 2 deletions .github/workflows/ci_spelling.yml
@@ -7,7 +7,7 @@ jobs:
- name: Checkout
uses: actions/checkout@v3
- name: Check Spelling
uses: rojopolis/spellcheck-github-actions@0.23.0
uses: rojopolis/spellcheck-github-actions@0.42.0
with:
config_path: .github/workflows/.spellcheck.yml
config_path: .github/spellcheck.yml
task_name: Markdown
20 changes: 20 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,20 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-20.04
  tools:
    python: "3.9"

mkdocs:
  configuration: mkdocs.yml

# Optionally declare the Python requirements required to build your docs
python:
  install:
    - requirements: .conda/mkdocs-requirements.txt
20 changes: 18 additions & 2 deletions README.md
@@ -1,3 +1,19 @@
# HPCPY
[![CI pytest](https://github.com/ACCESS-NRI/hpcpy/actions/workflows/ci_pytest.yml/badge.svg?branch=main)](https://github.com/ACCESS-NRI/hpcpy/actions/workflows/ci_pytest.yml)
[![Documentation Status](https://readthedocs.org/projects/hpcpy/badge/?version=latest)](https://hpcpy.readthedocs.io/en/latest/?badge=latest)

HPCPY is a prototype Python client for interacting with HPC scheduling systems (i.e PBS).
# HPCpy

HPCpy is a Python package for interacting with HPC scheduling systems. The package provides generalised clients to communicate with HPC schedulers agnostically.

Currently supported scheduling systems:

- PBS
- SLURM*

_* under development_

The full documentation is available at [hpcpy.readthedocs.io](https://hpcpy.readthedocs.io)

## License

HPCpy is distributed under the Apache Software License v2.0. Please see the [LICENSE](https://github.com/ACCESS-NRI/hpcpy/blob/main/LICENSE) file in this repository for further details.
35 changes: 0 additions & 35 deletions docs/advanced_usage.md

This file was deleted.

134 changes: 111 additions & 23 deletions docs/usage.md
@@ -1,6 +1,6 @@
# Usage

The following describes the basic usage of hpcpy.
The following describes the basic usage of HPCpy.

## Getting a client object

@@ -11,28 +11,59 @@ from hpcpy import get_client
client = get_client()
```

This will return the most-likely client object based on the submission commands available on the system.

In the case of the factory being unable to return an appropriate client object (or if you need to be explicit), you may import the client explicitly for your system.

For example:
This returns the most likely client object based on the submission commands available on the system, raising a `NoClientException` if no host scheduler is detected. If the factory cannot return an appropriate client object (or if you need to be explicit), you may import the client for your system directly:

```python
from hpcpy import PBSClient
client = PBSClient()
from hpcpy import PBSClient, SlurmClient
client_pbs = PBSClient()
client_slurm = SlurmClient()
```

You can now use the client object for the remaining examples.
!!! note

## Submitting jobs
When using this approach you are bypassing any auto-detection of the host scheduler.
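For contrast, the auto-detection that this approach bypasses can be sketched roughly as follows. This is a minimal illustration and not HPCpy's implementation: `detect_scheduler`, `NoSchedulerError` and the use of `shutil.which` are assumptions made for the example. Only the `qsub`/`sbatch` command ordering comes from the client factory.

```python
import shutil
from typing import Callable, Dict, Optional


class NoSchedulerError(Exception):
    """Hypothetical stand-in for the library's NoClientException."""


def detect_scheduler(
    commands: Dict[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the scheduler name for the first submission command found on PATH."""
    for cmd, scheduler in commands.items():
        if which(cmd) is not None:
            return scheduler
    raise NoSchedulerError("No supported scheduler detected.")


# Same command ordering as the factory: qsub first, then sbatch.
COMMANDS = {"qsub": "PBS", "sbatch": "SLURM"}


def fake_which(cmd: str) -> Optional[str]:
    """Simulate a host where only sbatch is installed."""
    return "/usr/bin/sbatch" if cmd == "sbatch" else None


print(detect_scheduler(COMMANDS, which=fake_which))  # SLURM
```

Injecting the `which` callable keeps the sketch testable on machines without a scheduler installed.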

The simplest way to submit a pre-written job script is via the following command:
## Submit

```python
job_id = client.submit("/path/to/script.sh")
```
The simplest way to submit a pre-written job script is via the `submit()` command, which executes the appropriate command for the scheduler:

=== "HPCpy (Python)"
```python
job_id = client.submit("/path/to/script.sh")
```

=== "PBS"
```shell
JOB_ID=$(qsub /path/to/script.sh)
```

=== "SLURM"
```shell
JOB_ID=$(sbatch /path/to/script.sh)
```

However, oftentimes it is preferable to use a script template that is rendered with additional variables prior to submission. Depending on how this is written, a single script could be used for multiple scheduling systems.
### Environment Variables

=== "HPCpy (Python)"
```python
job_id = client.submit(
    "/path/to/script.sh",
    variables=dict(a=1, b="test")
)
```

=== "PBS"
```shell
qsub -v a=1,b=test /path/to/script.sh
```

!!! note

All environment variables are passed to the job as strings WITHOUT treatment of commas.
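To see why commas matter, here is a minimal sketch of how a variables dictionary might be flattened into the comma-separated `-v` argument shown above. `format_variables` is a hypothetical helper written for this illustration and is not part of HPCpy.

```python
def format_variables(variables: dict) -> str:
    """Join pairs as KEY=VALUE, comma separated, casting every value to str."""
    return ",".join(f"{key}={value}" for key, value in variables.items())


# Matches the PBS example above.
print(format_variables(dict(a=1, b="test")))  # a=1,b=test

# A value containing a comma is not escaped, so the result is ambiguous:
# the scheduler cannot tell "a=1,2" apart from "a=1" followed by "2".
print(format_variables(dict(a="1,2", b="test")))  # a=1,2,b=test
```

If a value must contain a comma, it is safer to encode it before submission and decode it inside the job script.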

### Script templates

Script templates can be used to generalise a single template script for use in multiple scenarios (e.g. different scheduling systems).

*template.sh*
```shell
@@ -52,6 +83,7 @@ job_id = client.submit(
```

This will do two things:

1. The template will be loaded into memory, rendered, and written to a temporary file at `$HOME/.hpcpy/job_scripts` (these are periodically cleared by hpcpy).
2. The rendered jobscript will be submitted to the scheduler.
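The two steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not HPCpy's implementation: it uses `string.Template` as a stand-in for whatever templating engine the package uses, `render_job_script` is a hypothetical helper, and a throwaway directory replaces `$HOME/.hpcpy/job_scripts`.

```python
import tempfile
from pathlib import Path
from string import Template


def render_job_script(template_text: str, output_dir: Path, **context: str) -> Path:
    """Render a template and write it to a new file in output_dir (hypothetical helper)."""
    # Step 1: render in memory, then write the result to a temporary file.
    rendered = Template(template_text).substitute(**context)
    output_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile(
        "w", dir=output_dir, suffix=".sh", delete=False
    ) as handle:
        handle.write(rendered)
    return Path(handle.name)


TEMPLATE = '#!/bin/bash\necho "${message}"\n'

# A throwaway directory stands in for $HOME/.hpcpy/job_scripts.
script_path = render_job_script(TEMPLATE, Path(tempfile.mkdtemp()), message="hello")
print(script_path.read_text())

# Step 2 would pass the rendered file to the scheduler,
# e.g. client.submit(str(script_path)).
```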

@@ -68,13 +100,19 @@ job_script_filepath = client._render_job_script(
)
```

## Checking job status
## Status

Checking the status of a job that has been submitted requires the `job_id` of the job on the scheduler. Using the `submit()` command as above will return this identifier for use with the client.

```python
status = client.status(job_id)
```
=== "HPCpy (Python)"
```python
status = client.status(job_id)
```
=== "PBS"
```shell
STATUS=$(qstat -f -F json $JOB_ID)
# ... then grepping through to find the job_state attribute
```
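Rather than grepping, the JSON output can be parsed directly. The sketch below assumes the `qstat -f -F json` payload nests each job under a top-level `Jobs` key, as PBS Professional does. The sample payload and the `parse_job_state` helper are illustrative and not taken from HPCpy.

```python
import json


def parse_job_state(qstat_json: str, job_id: str) -> str:
    """Extract the job_state code for job_id from qstat JSON output."""
    payload = json.loads(qstat_json)
    return payload["Jobs"][job_id]["job_state"]


# Illustrative payload only; real output carries many more attributes.
SAMPLE = '{"Jobs": {"1234.pbsserver": {"Job_Name": "script.sh", "job_state": "R"}}}'

print(parse_job_state(SAMPLE, "1234.pbsserver"))  # R
```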

The status will be a character code as listed in `constants.py`, however, certain shortcut methods are available for the most common queries.

@@ -88,12 +126,62 @@ client.is_running(job_id)

More shorthand methods will be made available as required.

Note: all status related commands will poll the underlying scheduler; please be mindful of overloading the scheduling system with repeated, frequent calls.
!!! note
All status-related commands will poll the underlying scheduler; please be mindful of overloading the scheduling system with repeated, frequent calls.

## Deleting jobs
## Delete

Deleting a job on the system requires only the `job_id` of the job on the scheduler.

=== "HPCpy (Python)"
```python
client.delete(job_id)
```
=== "PBS"
```shell
qdel $JOB_ID
```

## Task dependence

HPCpy implements a simple task-dependence strategy at the scheduler level, using scheduler directives to make one job dependent on another.

=== "HPCpy (Python)"
```python
job1 = client.submit("job1.sh")
job2 = client.submit("job2.sh", depends_on=job1)
```
=== "PBS"
```shell
JOB1=$(qsub job1.sh)
JOB2=$(qsub -W depend=afterok:$JOB1 job2.sh)
```

Consider the following snippet:

```python
client.delete(job_id)
```
from hpcpy import get_client
client = get_client()

# Submit the first job
first_id = client.submit("job.sh")

# Submit some interim jobs all requiring the first to finish
job_ids = list()
for x in range(3):
    jobx_id = client.submit("job.sh", depends_on=first_id)
    job_ids.append(jobx_id)

# Submit a final job that requires everything to have finished.
job_last = client.submit("job.sh", depends_on=job_ids)
```

This will create 5 jobs:

- 1 x starting job
- 3 x middle jobs (which depend on the first)
- 1 x finishing job (which depends on the middle jobs to complete)

This is essentially a "fork and join" pattern.

More advanced graphs can be assembled as needed, the complexity of which is determined by your scheduler.
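As a rough illustration of what happens at the scheduler level, the sketch below builds the value passed to PBS's `-W` option from either a single job ID or a list of IDs. PBS accepts several IDs after `afterok`, separated by colons. `build_depend_option` is a hypothetical helper, not HPCpy's API, and HPCpy may construct the directive differently.

```python
from typing import List, Union


def build_depend_option(depends_on: Union[str, List[str]]) -> str:
    """Build a PBS 'depend=afterok:...' value from one or more job IDs."""
    job_ids = [depends_on] if isinstance(depends_on, str) else list(depends_on)
    if not job_ids:
        raise ValueError("At least one job ID is required.")
    return "depend=afterok:" + ":".join(job_ids)


# Single dependency, as in: qsub -W depend=afterok:$JOB1 job2.sh
print(build_depend_option("101.pbs"))  # depend=afterok:101.pbs

# The "join" step, which waits on all three middle jobs.
print(build_depend_option(["102.pbs", "103.pbs", "104.pbs"]))
# depend=afterok:102.pbs:103.pbs:104.pbs
```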
3 changes: 1 addition & 2 deletions hpcpy/__init__.py
@@ -4,13 +4,12 @@
from hpcpy.client.client_factory import ClientFactory
from hpcpy.client.pbs import PBSClient
from hpcpy.client.slurm import SlurmClient
from hpcpy.client.mock import MockClient
from typing import Union

__version__ = _version.get_versions()["version"]


def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient]:
"""Get a client object specific for the current scheduler.
Returns
9 changes: 6 additions & 3 deletions hpcpy/client/base.py
@@ -115,7 +115,6 @@ def status(self, job_id):
job_id : str
Job ID.
"""
# raise NotImplementedError()
cmd = self._tmp_status.format(job_id=job_id)
result = self._shell(cmd)
return result
@@ -171,11 +170,15 @@ def _shell(self, cmd, decode=True):
Command to run.
decode : bool
Automatically decode response with utf-8, defaults to True
Raises
------
hpcpy.exceptions.ShellException :
When the underlying shell call fails.
Returns
-------
_type_
_description_
str
Result from the underlying called command.
"""
result = shell(cmd)

12 changes: 3 additions & 9 deletions hpcpy/client/client_factory.py
@@ -2,16 +2,14 @@

from hpcpy.client.pbs import PBSClient
from hpcpy.client.slurm import SlurmClient
from hpcpy.client.mock import MockClient
import os
import hpcpy.exceptions as hx
from hpcpy.utilities import shell
from typing import Union


class ClientFactory:

def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient]:
"""Get a client object based on what kind of scheduler we are using.
Arguments:
@@ -21,7 +19,7 @@ def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
Returns
-------
Union[PBSClient, SlurmClient, MockClient]
Union[PBSClient, SlurmClient]
Client object suitable for the detected scheduler.
Raises
@@ -30,11 +28,7 @@ def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
When no scheduler can be detected.
"""

clients = dict(ls=MockClient, qsub=PBSClient, sbatch=SlurmClient)

# Remove the MockClient if dev mode is off
if os.getenv("HPCPY_DEV_MODE", "0") != "1":
_ = clients.pop("ls")
clients = dict(qsub=PBSClient, sbatch=SlurmClient)

# Loop through the clients in order, looking for a valid scheduler
for cmd, client in clients.items():