Update from main #37

Merged
merged 22 commits into from
Jan 7, 2025
Commits
7ab554c
Bumped Spellcheck action to contemporary version
jonasbn Oct 4, 2024
d9403a2
Moved configuration files out of workflows directory
jonasbn Oct 4, 2024
9046504
Bad a moving/renaming apparently
jonasbn Oct 4, 2024
aa62e5f
Added client variable passing to PBS command. Fixes #25
bschroeter Oct 6, 2024
5376d99
Added in shlex.
bschroeter Oct 6, 2024
2e230bb
Update hpcpy/client/pbs.py
bschroeter Oct 8, 2024
61ec837
Update utilities.py
bschroeter Oct 8, 2024
dc1cbfa
Merge pull request #27 from ACCESS-NRI/25-pass-environment-variables-…
bschroeter Oct 8, 2024
b7ecd5b
Removed mock client, improved docs. Fixes #29
bschroeter Oct 14, 2024
fb97376
Default to PBS client
bschroeter Oct 14, 2024
d13ef5e
Merge pull request #30 from ACCESS-NRI/29-remove-mock-client
bschroeter Oct 14, 2024
6dd069c
Added readthedocs.yml
bschroeter Oct 15, 2024
fa24ab7
Merge pull request #31 from ACCESS-NRI/9-add-readthedocs
bschroeter Oct 15, 2024
2cb310a
Merge pull request #26 from jonasbn/spellcheck_action_update_due_to_s…
bschroeter Oct 15, 2024
7647920
Create dependabot.yml
bschroeter Oct 15, 2024
a1cb029
Updated readme. Fixes #32
bschroeter Oct 15, 2024
eb901be
Merge pull request #33 from ACCESS-NRI/32-update-readme
bschroeter Oct 15, 2024
7ced15e
Added new exception class to reveal more information to the user. Fix…
bschroeter Oct 17, 2024
d15ee5a
Fixed exception test for general use. Fixes: #34
bschroeter Oct 17, 2024
eda8dc0
Update utilities.py
bschroeter Oct 17, 2024
c7281eb
Update base.py
bschroeter Oct 17, 2024
635f1a8
Merge pull request #35 from ACCESS-NRI/34-pbs-scheduler-errors-are-no…
bschroeter Oct 17, 2024
11 changes: 11 additions & 0 deletions .github/dependabot.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file

version: 2
updates:
  - package-ecosystem: "" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"
Expand Up @@ -5,7 +5,7 @@ matrix:
mode: en
dictionary:
wordlists:
- .github/workflows/.wordlist.txt
- .github/wordlist.txt
output: wordlist.dic
encoding: utf-8
pipeline:
Expand Down
2 changes: 0 additions & 2 deletions .github/workflows/.wordlist.txt → .github/wordlist.txt
@@ -1,6 +1,4 @@
HPCPY
HPC
hpcpy
HPCpy
pre
py
Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/ci_spelling.yml
Expand Up @@ -7,7 +7,7 @@ jobs:
- name: Checkout
uses: actions/checkout@v3
- name: Check Spelling
uses: rojopolis/spellcheck-github-actions@0.23.0
uses: rojopolis/spellcheck-github-actions@0.42.0
with:
config_path: .github/workflows/.spellcheck.yml
config_path: .github/spellcheck.yml
task_name: Markdown
20 changes: 20 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,20 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-20.04
  tools:
    python: "3.9"

mkdocs:
  configuration: mkdocs.yml

# Optionally declare the Python requirements required to build your docs
python:
  install:
    - requirements: .conda/mkdocs-requirements.txt
20 changes: 18 additions & 2 deletions README.md
@@ -1,3 +1,19 @@
# HPCPY
[![CI pytest](https://github.com/ACCESS-NRI/hpcpy/actions/workflows/ci_pytest.yml/badge.svg?branch=main)](https://github.com/ACCESS-NRI/hpcpy/actions/workflows/ci_pytest.yml)
[![Documentation Status](https://readthedocs.org/projects/hpcpy/badge/?version=latest)](https://hpcpy.readthedocs.io/en/latest/?badge=latest)

HPCPY is a prototype Python client for interacting with HPC scheduling systems (i.e PBS).
# HPCpy

HPCpy is a Python package for interacting with HPC scheduling systems. The package provides generalised clients to communicate with HPC schedulers agnostically.

Currently supported scheduling systems:

- PBS
- SLURM*

_* under development_

The full documentation is available at [hpcpy.readthedocs.io](https://hpcpy.readthedocs.io)

## License

HPCpy is distributed under the Apache Software License v2.0. Please see the [LICENSE](https://github.com/ACCESS-NRI/hpcpy/blob/main/LICENSE) file in this repository for further details.
35 changes: 0 additions & 35 deletions docs/advanced_usage.md

This file was deleted.

134 changes: 111 additions & 23 deletions docs/usage.md
@@ -1,6 +1,6 @@
# Usage

The following describes the basic usage of hpcpy.
The following describes the basic usage of HPCpy.

## Getting a client object

Expand All @@ -11,28 +11,59 @@ from hpcpy import get_client
client = get_client()
```

This will return the most-likely client object based on the submission commands available on the system.

In the case of the factory being unable to return an appropriate client object (or if you need to be explicit), you may import the client explicitly for your system.

For example:
This will return the most-likely client object based on the submission commands available on the system, raising a `NoClientException` if no host scheduler is detected. In the case of the factory being unable to return an appropriate client object (or if you need to be explicit), you may import the client explicitly for your system:

```python
from hpcpy import PBSClient
client = PBSClient()
from hpcpy import PBSClient, SlurmClient
client_pbs = PBSClient()
client_slurm = SlurmClient()
```

You can now use the client object for the remaining examples.
!!! note

## Submitting jobs
When using this approach you are bypassing any auto-detection of the host scheduler.

The simplest way to submit a pre-written job script is via the following command:
## Submit

```python
job_id = client.submit("/path/to/script.sh")
```
The simplest way to submit a pre-written job script is via the `submit()` command, which executes the appropriate command for the scheduler:

=== "HPCpy (Python)"
    ```python
    job_id = client.submit("/path/to/script.sh")
    ```

=== "PBS"
    ```shell
    JOB_ID=$(qsub /path/to/script.sh)
    ```

=== "SLURM"
    ```shell
    JOB_ID=$(sbatch /path/to/script.sh)
    ```

However, oftentimes it is preferable to use a script template that is rendered with additional variables prior to submission. Depending on how this is written, a single script could be used for multiple scheduling systems.
### Environment Variables

=== "HPCpy (Python)"
    ```python
    job_id = client.submit(
        "/path/to/script.sh",
        variables=dict(a=1, b="test")
    )
    ```

=== "PBS"
    ```shell
    qsub -v a=1,b=test /path/to/script.sh
    ```

!!! note

    All environment variables are passed to the job as strings WITHOUT treatment of commas.
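
As a rough illustration of the mapping between the `variables` dictionary and the `qsub -v` flag shown above, the following sketch assembles the command line with the standard-library `shlex` module. The helper name `build_qsub_command` is hypothetical and is not taken from HPCpy; the actual implementation in `hpcpy/client/pbs.py` may differ.

```python
import shlex


def build_qsub_command(script_path, variables=None):
    """Assemble a qsub command line (hypothetical helper, not HPCpy's API)."""
    args = ["qsub"]
    if variables:
        # Every value is stringified; commas inside a value are NOT escaped,
        # which mirrors the caveat in the note above.
        pairs = ",".join(f"{key}={value}" for key, value in variables.items())
        args += ["-v", pairs]
    args.append(str(script_path))
    # shlex.join quotes each argument so the result is safe to pass to a shell.
    return shlex.join(args)


command = build_qsub_command("/path/to/script.sh", variables=dict(a=1, b="test"))
print(command)  # qsub -v a=1,b=test /path/to/script.sh
```

Because values are joined with commas, a value that itself contains a comma would be split by the scheduler into separate assignments.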

### Script templates

Script templates can be used to generalise a single template script for use in multiple scenarios (i.e. different scheduling systems).

*template.sh*
```shell
Expand All @@ -52,6 +83,7 @@ job_id = client.submit(
```

This will do two things:

1. The template will be loaded into memory, rendered, and written to a temporary file at `$HOME/.hpcpy/job_scripts` (these are periodically cleared by hpcpy).
2. The rendered jobscript will be submitted to the scheduler.
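
The two steps above can be sketched roughly as follows. This is a minimal illustration only: it uses `string.Template` from the standard library as a stand-in, since the template engine HPCpy actually uses is not shown here, and the function name `render_job_script` and the `${message}` placeholder are assumptions made for the example.

```python
import tempfile
from pathlib import Path
from string import Template


def render_job_script(template_text, output_dir, **variables):
    """Render a template and write it to a new file (hypothetical helper)."""
    # string.Template is a stand-in here; HPCpy's own engine may differ.
    rendered = Template(template_text).substitute(**variables)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # delete=False keeps the file on disk so a scheduler could read it later.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".sh", dir=output_dir, delete=False
    ) as handle:
        handle.write(rendered)
    return Path(handle.name)


template = "#!/bin/bash\necho ${message}\n"
script_path = render_job_script(template, tempfile.mkdtemp(), message="hello")
print(script_path.read_text())
```

The returned path is what would then be handed to the scheduler's submission command.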

Expand All @@ -68,13 +100,19 @@ job_script_filepath = client._render_job_script(
)
```

## Checking job status
## Status

Checking the status of a job that has been submitted requires the `job_id` of the job on the scheduler. Using the `submit()` command as above will return this identifier for use with the client.

```python
status = client.status(job_id)
```
=== "HPCpy (Python)"
    ```python
    status = client.status(job_id)
    ```
=== "PBS"
    ```shell
    STATUS=$(qstat -f -F json $JOB_ID)
    # ... then grepping through to find the job_state attribute
    ```

The status will be a character code as listed in `constants.py`; however, certain shortcut methods are available for the most common queries.

Expand All @@ -88,12 +126,62 @@ client.is_running(job_id)

More shorthand methods will be made available as required.
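
To make the "grepping through to find the `job_state` attribute" step more concrete, here is a small sketch that extracts the state from JSON shaped like `qstat -f -F json` output. The sample payload is abbreviated and written by hand for illustration, and `parse_job_state` is a hypothetical helper rather than part of HPCpy.

```python
import json

# Hand-written, abbreviated sample; real qstat output has many more fields.
SAMPLE_QSTAT_OUTPUT = """
{
  "Jobs": {
    "1234.pbs-server": {
      "Job_Name": "example",
      "job_state": "R"
    }
  }
}
"""


def parse_job_state(qstat_json, job_id):
    """Return the job_state code for job_id (hypothetical helper)."""
    jobs = json.loads(qstat_json).get("Jobs", {})
    wanted = job_id.split(".")[0]
    for key, attributes in jobs.items():
        # Compare the part before the first dot so that "1234" and
        # "1234.pbs-server" both resolve to the same job.
        if key.split(".")[0] == wanted:
            return attributes["job_state"]
    raise KeyError(f"Job {job_id!r} not found in qstat output")


state = parse_job_state(SAMPLE_QSTAT_OUTPUT, "1234.pbs-server")
print(state)
```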

Note: all status related commands will poll the underlying scheduler; please be mindful of overloading the scheduling system with repeated, frequent calls.
!!! note
    All status-related commands will poll the underlying scheduler; please be mindful of overloading the scheduling system with repeated, frequent calls.
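
One way to stay gentle on the scheduler is to poll with an increasing delay. The sketch below is an assumption-laden illustration: `wait_for_status` is a hypothetical helper, the status callable is injected so that something like `client.status` could be passed in, and the `"Q"`/`"R"` codes are used only as example values.

```python
import time


def wait_for_status(get_status, job_id, targets, interval=30.0,
                    max_interval=300.0, timeout=3600.0, sleep=time.sleep):
    """Poll get_status(job_id) until it returns one of targets (hypothetical)."""
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in targets:
            return status
        if waited >= timeout:
            raise TimeoutError(f"Job {job_id} did not reach {targets} in time")
        sleep(interval)
        waited += interval
        # Double the delay after each poll, capped at max_interval.
        interval = min(interval * 2, max_interval)


# Demonstration with a fake status source and a recorded (not real) sleep.
fake_statuses = iter(["Q", "Q", "R"])
delays = []
result = wait_for_status(
    lambda job_id: next(fake_statuses), "1234", targets={"R"}, sleep=delays.append
)
print(result, delays)
```

Injecting `sleep` keeps the example instantaneous when run; in real use the default `time.sleep` would apply.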

## Deleting jobs
## Delete

Deleting a job on the system requires only the `job_id` of the job on the scheduler.

=== "HPCpy (Python)"
    ```python
    client.delete(job_id)
    ```
=== "PBS"
    ```shell
    qdel $JOB_ID
    ```

## Task dependence

HPCpy implements a simple task-dependence strategy at the scheduler level: scheduler directives are used to make one job dependent on another.

=== "HPCpy (Python)"
    ```python
    job1 = client.submit("job1.sh")
    job2 = client.submit("job2.sh", depends_on=job1)
    ```
=== "PBS"
    ```shell
    JOB1=$(qsub job1.sh)
    JOB2=$(qsub -W depend=afterok:$JOB1 job2.sh)
    ```

Consider the following snippet:

```python
from hpcpy import get_client
client = get_client()

# Submit the first job
first_id = client.submit("job.sh")

# Submit some interim jobs all requiring the first to finish
job_ids = list()
for x in range(3):
    jobx_id = client.submit("job.sh", depends_on=first_id)
    job_ids.append(jobx_id)

# Submit a final job that requires everything to have finished.
job_last = client.submit("job.sh", depends_on=job_ids)
```

This will create 5 jobs:

- 1 x starting job
- 3 x middle jobs (which depend on the first)
- 1 x finishing job (which depends on the middle jobs to complete)

This is essentially a "fork and join" pattern.

More advanced graphs can be assembled as needed; their complexity is limited by what your scheduler supports.
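
For the PBS case, a `depends_on` value of either a single job ID or a list of job IDs could plausibly be turned into a `-W depend=afterok:...` argument along these lines. The helper `build_depend_option` is hypothetical; only the `afterok` form used in the PBS tab above is covered.

```python
def build_depend_option(depends_on):
    """Build the value for qsub's -W flag (hypothetical helper)."""
    # Accept a single job ID as well as any iterable of job IDs.
    if isinstance(depends_on, str):
        depends_on = [depends_on]
    job_ids = [str(job_id) for job_id in depends_on]
    if not job_ids:
        raise ValueError("depends_on must contain at least one job ID")
    # PBS separates multiple job IDs for one dependency type with colons.
    return "depend=afterok:" + ":".join(job_ids)


single = build_depend_option("1.pbs")
joined = build_depend_option(["2.pbs", "3.pbs", "4.pbs"])
print(single)
print(joined)
```

The "join" job in the fork-and-join example corresponds to the second call, where every middle job must finish successfully before the final job starts.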
3 changes: 1 addition & 2 deletions hpcpy/__init__.py
Expand Up @@ -4,13 +4,12 @@
from hpcpy.client.client_factory import ClientFactory
from hpcpy.client.pbs import PBSClient
from hpcpy.client.slurm import SlurmClient
from hpcpy.client.mock import MockClient
from typing import Union

__version__ = _version.get_versions()["version"]


def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient]:
"""Get a client object specific for the current scheduler.

Returns
Expand Down
9 changes: 6 additions & 3 deletions hpcpy/client/base.py
Expand Up @@ -115,7 +115,6 @@ def status(self, job_id):
job_id : str
Job ID.
"""
# raise NotImplementedError()
cmd = self._tmp_status.format(job_id=job_id)
result = self._shell(cmd)
return result
Expand Down Expand Up @@ -171,11 +170,15 @@ def _shell(self, cmd, decode=True):
Command to run.
decode : bool
Automatically decode response with utf-8, defaults to True
Raises
------
hpcpy.exceptions.ShellException :
When the underlying shell call fails.

Returns
-------
_type_
_description_
str
Result from the underlying called command.
"""
result = shell(cmd)

Expand Down
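
The docstring above says the shell call raises an exception when the underlying command fails. A minimal sketch of that idea, using only `subprocess` from the standard library, might look like the following. `ShellError` and `run_shell` are hypothetical names chosen for this example and are not the `hpcpy.exceptions.ShellException` class or the `hpcpy.utilities.shell` function themselves.

```python
import shlex
import subprocess
import sys


class ShellError(Exception):
    """Carries the exit code and stderr of a failed command (hypothetical)."""

    def __init__(self, cmd, returncode, stderr):
        super().__init__(f"{cmd!r} exited with {returncode}: {stderr.strip()}")
        self.cmd = cmd
        self.returncode = returncode
        self.stderr = stderr


def run_shell(cmd, decode=True):
    """Run cmd and return stdout, raising ShellError on a non-zero exit."""
    completed = subprocess.run(shlex.split(cmd), capture_output=True)
    if completed.returncode != 0:
        stderr = completed.stderr.decode("utf-8", errors="replace")
        raise ShellError(cmd, completed.returncode, stderr)
    return completed.stdout.decode("utf-8") if decode else completed.stdout


# Demonstrate the failure path with a command that exits with status 3.
failing = f"{shlex.quote(sys.executable)} -c {shlex.quote('import sys; sys.exit(3)')}"
try:
    run_shell(failing)
except ShellError as error:
    caught = error
    print(caught.returncode)
```

Surfacing stderr in the exception message is what lets scheduler errors reach the user instead of being swallowed.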
12 changes: 3 additions & 9 deletions hpcpy/client/client_factory.py
Expand Up @@ -2,16 +2,14 @@

from hpcpy.client.pbs import PBSClient
from hpcpy.client.slurm import SlurmClient
from hpcpy.client.mock import MockClient
import os
import hpcpy.exceptions as hx
from hpcpy.utilities import shell
from typing import Union


class ClientFactory:

def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient]:
"""Get a client object based on what kind of scheduler we are using.

Arguments:
Expand All @@ -21,7 +19,7 @@ def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:

Returns
-------
Union[PBSClient, SlurmClient, MockClient]
Union[PBSClient, SlurmClient]
Client object suitable for the detected scheduler.

Raises
Expand All @@ -30,11 +28,7 @@ def get_client(*args, **kwargs) -> Union[PBSClient, SlurmClient, MockClient]:
When no scheduler can be detected.
"""

clients = dict(ls=MockClient, qsub=PBSClient, sbatch=SlurmClient)

# Remove the MockClient if dev mode is off
if os.getenv("HPCPY_DEV_MODE", "0") != "1":
_ = clients.pop("ls")
clients = dict(qsub=PBSClient, sbatch=SlurmClient)

# Loop through the clients in order, looking for a valid scheduler
for cmd, client in clients.items():
Expand Down
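
The detection loop is cut off in this view. As a hedged illustration of the general approach (mapping submission commands to schedulers and picking the first one that exists), the sketch below uses `shutil.which`. The factory in this diff imports `shell` from `hpcpy.utilities`, so its actual check may work differently; `detect_scheduler` and `NoSchedulerFound` are hypothetical names.

```python
import shutil


class NoSchedulerFound(Exception):
    """Raised when no known submission command is available (hypothetical)."""


# Same command-to-scheduler ordering as the clients dict in the diff above.
SCHEDULER_COMMANDS = {"qsub": "PBS", "sbatch": "SLURM"}


def detect_scheduler(which=shutil.which):
    """Return the name of the first scheduler whose command is on PATH."""
    for command, scheduler in SCHEDULER_COMMANDS.items():
        if which(command) is not None:
            return scheduler
    raise NoSchedulerFound("No supported scheduler command was found on PATH")


# Simulate a host where only sbatch is installed.
detected = detect_scheduler(
    which=lambda command: "/usr/bin/sbatch" if command == "sbatch" else None
)
print(detected)
```

Passing `which` as a parameter makes the lookup easy to test on machines with no scheduler installed.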