Tidying and optimising (#35)
* Pinning to fastapi 0.103, allowing patches

* Remove psycopg2 dependency, move to python 3.10 as 3.11 is not available on cosma

* Remove remaining python 3.11 references in pyproject.toml

* Add gunicorn to dependencies

* Add python-multipart

* Updating tests to handle slimmed down test data

* Change datetime tzinfo from UTC to timezone.utc, as python 3.10 doesn't support datetime.UTC

* Remove docker section from README, add gunicorn section

* Updating testing portion of docs

* Add test dependencies to dev instructions, unpin mypy and pytest

* Stop referencing python 3.11 in the readme

* Bump version number, customise API docs

* Add a more robust reference to original code in swiftsimio
harryjmoss authored Oct 18, 2023
1 parent 2be3c3b commit 9d67b35
Showing 12 changed files with 91 additions and 56 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -40,7 +40,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.11"]
python-version: ["3.10"]
runs-on: [ubuntu-latest]

steps:
43 changes: 21 additions & 22 deletions README.md
@@ -37,7 +37,7 @@ Centre for Advanced Research Computing, University College London

### Prerequisites

- Python 3.11
- Python 3.10 or newer

### Installation

@@ -58,10 +58,12 @@ source ./env/bin/activate
- While in the top-level repository directory (containing this `README.md`)

```bash
pip install "./api[dev]"
pip install "./api[dev,test]"
```

### Running Locally
## Running the API

### Running locally

After installing the package, from the root directory (containing this README)

@@ -71,42 +73,39 @@ uvicorn api.main:app --reload

By default, the API will be served on `localhost:8000`, with OpenAPI documentation available at `localhost:8000/docs`
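
Once the server is up, a quick programmatic check can be useful. Below is a minimal sketch using `httpx` (already a dependency of this project), assuming the default host and port above:

```python
# Minimal sketch: fetch the OpenAPI schema from a locally running instance
# (assumes the default localhost:8000 used by the command above).
import httpx

response = httpx.get("http://localhost:8000/openapi.json")
response.raise_for_status()
info = response.json()["info"]
print(info["title"], info["version"])  # the interactive docs at /docs are rendered from this schema
```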

### Running via Docker Compose

Create a `.env` file in the package root directory, based on the `.env.example`provided and noting the following
### Deploying the API

- Provide an `API_UID` in the file matching the integer returned by `id -u`
- Supply a port to `API_PORT` that you will use to access the API
When deploying the API for use in production, it's recommended to use [Gunicorn](https://docs.gunicorn.org/en/stable/index.html) to serve the FastAPI application and act as a process manager. Gunicorn can start one or more uvicorn worker processes, listening on the port indicated on startup. Request and response handling is taken care of by individual workers.

From the package root directory, bring the API up with
Gunicorn will restart failing workers, but care should be taken to deal with cases where the Gunicorn process itself is killed.
It's important to note that Gunicorn does not provide load balancing capability, but relies on the operating system to perform that role.

```bash
docker compose -p swiftapi up --build
```
The documentation recommends `(2 x $num_cores) + 1` workers, although depending on your deployment environment this may not be suitable.

where `-p swiftapi` sets the docker compose project name to `swiftapi`.

Bring the running container down with
As an example, to start this application under Gunicorn on a `localhost` port with your choice of workers:

```bash
docker compose down
gunicorn src.api.main:app --workers ${n_workers} --worker-class uvicorn.workers.UvicornWorker --bind localhost:${port}
```
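
Gunicorn can also load these settings from a Python configuration file. The following is a hypothetical `gunicorn.conf.py` sketch (not part of this commit) that applies the `(2 x $num_cores) + 1` heuristic mentioned above; the bind address is an assumption and should match your deployment:

```python
# Hypothetical gunicorn.conf.py — illustrative sketch only, not part of this repository.
import multiprocessing

# (2 x num_cores) + 1 workers, per the Gunicorn documentation's suggestion.
workers = 2 * multiprocessing.cpu_count() + 1
worker_class = "uvicorn.workers.UvicornWorker"
bind = "localhost:8000"  # assumed port; adjust to your environment
```

With such a file in place, `gunicorn src.api.main:app -c gunicorn.conf.py` would pick up these settings instead of command-line flags.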

## For developers

### Running Tests

Tests can be run either via `tox` or directly via `pytest`
Tests can be run either via `tox` or directly via `pytest` from the top level directory of the repository

```bash
cd api
tox run
```

or

```bash
python -m pytest -ra . --cov=api/src/api
python -m pytest -ra . --cov=src/api
```

either of which will run all tests and generate a coverage report.

## Contributing

To contribute to the project as a developer, use the following as a guide. These are based on ARC Collaborations [group practices](https://github.com/UCL-ARC/research-software-documentation/blob/main/processes/programming_projects/group_practices.md) and [code review documentation](https://github.com/UCL-ARC/research-software-documentation/blob/main/processes/programming_projects/review.md).
@@ -115,7 +114,7 @@ To contribute to the project as a developer, use the following as a guide. These

To make explicit some of the potentially implicit:

- We will target Python versions `>= 3.11`
- We will target Python versions `>= 3.10`
- We will use [ruff](https://beta.ruff.rs/docs/) for linting and [black](https://github.com/psf/black) for code formatting to standardise code, improve legibility and speed up code reviews
- Function arguments and return types will be annotated, with type checking via [mypy](https://mypy.readthedocs.io/en/stable/)
- We will use [docstrings](https://peps.python.org/pep-0257/) to annotate classes, class methods and functions
@@ -147,7 +146,7 @@ The `main` branch is for ready-to-deploy release quality code

## Project Roadmap

- [x] Initial Research <-- You are Here
- [ ] Minimum viable product
- [x] Initial Research
- [x] Minimum viable product <-- You are Here
- [ ] Alpha Release
- [ ] Feature-Complete Release
24 changes: 13 additions & 11 deletions pyproject.toml
@@ -14,20 +14,21 @@ classifiers = [
"Operating System :: POSIX",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.10",
"Typing :: Typed",
]
dependencies = [
"aiofiles>=23.1.0",
"cloudpickle==2.2.1",
"fastapi>=0.100.0",
"fastapi>=0.103.0,<0.104.0",
"gunicorn>=21.2.0",
"httpx>=0.24.1",
"loguru>=0.7.0",
"psycopg2==2.9.6",
"pydantic-settings~=2.0.2",
"pydantic~=2.1",
"pyjwt>=2.8.0",
"python-dotenv>=1.0.0",
"python-multipart>=0.0.6",
"requests>=2.31.0",
"swiftsimio~=7.0.1",
"uvicorn>=0.22.0",
@@ -39,11 +40,11 @@ name = "dirac-swift-api"
optional-dependencies = {dev = [
"black",
"build",
"mypy==1.4.1",
"mypy",
"pre-commit",
"pytest",
"pytest-cov",
"pytest-mock",
"pytest==7.4.0",
"ruff",
"tox",
"twine",
@@ -57,8 +58,8 @@ optional-dependencies = {dev = [
"tox",
]}
readme = "README.md"
requires-python = ">=3.11"
version = "0.0.1"
requires-python = ">=3.10"
version = "1.0.1"
license.file = "LICENCE.md"
urls.homepage = "https://github.com/UCL-ARC/dirac-swift-api"

@@ -152,7 +153,7 @@ select = [
"W",
"YTT",
]
target-version = "py311"
target-version = "py310"
isort.known-first-party = [
"dirac_swift_api",
]
@@ -176,17 +177,18 @@ overrides."tool.coverage.paths.source".inline_arrays = false
legacy_tox_ini = """
[gh-actions]
python =
3.11: py311
3.10: py310
[testenv]
commands =
pytest -ra tests --cov=src/api
pytest -ra . --cov=api
deps =
freezegun
pytest
pytest-cov
pytest-mock
[tox]
env_list =
py311
py310
"""
21 changes: 20 additions & 1 deletion src/api/main.py
@@ -1,13 +1,32 @@
"""Entry point and main file for the FastAPI backend."""

from importlib import metadata

from fastapi import FastAPI
from loguru import logger

from api.routers import auth, file_processing

logger.info("API starting")

app = FastAPI()
description = """
SWIFTsimIO API provides read-only access to SWIFT data via HTTP requests.
Authenticated users can access
* Masked data
* Unmasked data
* Metadata
* Units
Users must have existing access to [VirgoDB](https://virgodb.dur.ac.uk/)
"""

app = FastAPI(
title="SWIFTsimIO API",
description=description,
version=metadata.version("dirac-swift-api"),
)

app.include_router(file_processing.router)
app.include_router(auth.router)
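
For reference, the customised metadata above can be checked without starting a server. A minimal sketch using FastAPI's `TestClient` (assumes the package is installed so that `importlib.metadata` can resolve the version):

```python
# Minimal sketch: verify the customised OpenAPI metadata via FastAPI's TestClient.
from fastapi.testclient import TestClient

from api.main import app

client = TestClient(app)
info = client.get("/openapi.json").json()["info"]
assert info["title"] == "SWIFTsimIO API"
assert info["version"] == "1.0.1"  # version read from the installed package metadata
```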
4 changes: 3 additions & 1 deletion src/api/processing/data_processing.py
@@ -1,7 +1,9 @@
"""Implements SWIFTsimIO data processing functionality on the server side.
This module calls SWIFTsimIO functions and creates numpy arrays from HDF5 files
read on the server.
read on the server. It uses code available under the GNU General Public License
version 3 from the SWIFTsimIO library https://github.com/SWIFTSIM/swiftsimio/.
"""
import json

4 changes: 2 additions & 2 deletions src/api/virgo_auth.py
@@ -1,6 +1,6 @@
"""Module to handle authentiation against the database server."""
import json
from datetime import UTC, datetime, timedelta
from datetime import datetime, timedelta, timezone
from pathlib import Path

import jwt
@@ -138,7 +138,7 @@ def generate_token(self) -> str:
-------
token (str): Generated JWT token.
"""
expiration = datetime.now(UTC) + timedelta(hours=1)
expiration = datetime.now(timezone.utc) + timedelta(hours=1)
return jwt.encode(
{"exp": expiration, "sub": self.username},
self.jwt_secret,
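
The change above is a compatibility fix: the `datetime.UTC` alias only exists from Python 3.11 onwards, while `datetime.timezone.utc` is equivalent and available on Python 3.10. A minimal sketch of the token round trip, with a hypothetical secret and username:

```python
# Minimal sketch of the JWT round trip used here (hypothetical secret and username).
from datetime import datetime, timedelta, timezone

import jwt

secret = "example-secret"
expiration = datetime.now(timezone.utc) + timedelta(hours=1)  # timezone-aware, works on Python 3.10
token = jwt.encode({"exp": expiration, "sub": "some_user"}, secret, algorithm="HS256")
decoded = jwt.decode(token, secret, algorithms=["HS256"])  # validates "exp" by default
assert decoded["sub"] == "some_user"
```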
Binary file removed tests/data/cosmo_volume_example.hdf5
Binary file not shown.
Binary file added tests/data/test_subset.hdf5
Binary file not shown.
9 changes: 6 additions & 3 deletions tests/fixtures/test_data.py
@@ -11,12 +11,15 @@ def data_path() -> Path:
@pytest.fixture()
def template_swift_data_path(data_path) -> Path:
"""
Return a Path object representing the sample swift data file linked from the SWIFTsimIO docs.
Return a Path object representing sample swift data.
File available http://virgodb.cosma.dur.ac.uk/swift-webstorage/IOExamples/cosmo_volume_example.hdf5
The data is a subset of the file linked from the SWIFTsimIO docs.
File available:
http://virgodb.cosma.dur.ac.uk/swift-webstorage/IOExamples/cosmo_volume_example.hdf5
"""
return data_path / "cosmo_volume_example.hdf5"
return data_path / "test_subset.hdf5"


@pytest.fixture()
4 changes: 2 additions & 2 deletions tests/test_api.py
@@ -345,7 +345,7 @@ def test_get_unmasked_array_data_success_no_columns(
"field": "PartType0/Coordinates",
},
}
expected_array_length = 261992
expected_array_length = 32382
response = mock_auth_client_success_jwt_decode.post(
"/swiftdata/unmasked_dataset",
json=payload,
@@ -365,7 +365,7 @@ def test_get_unmasked_array_data_success_columns(
"columns": 0,
},
}
expected_array_length = 261992
expected_array_length = 32382
response = mock_auth_client_success_jwt_decode.post(
"/swiftdata/unmasked_dataset",
json=payload,
19 changes: 11 additions & 8 deletions tests/test_data_processing.py
@@ -13,10 +13,13 @@ def test_retrieve_filename_failure(template_dataset_alias_map):
processor.retrieve_filename("a_nonexistant_alias")


def test_retrieve_filename_success(template_dataset_alias_map):
def test_retrieve_filename_success(
template_dataset_alias_map,
template_swift_data_path,
):
processor = SWIFTProcessor(template_dataset_alias_map)

expected_filename = "cosmo_volume_example.hdf5"
expected_filename = str(template_swift_data_path).rsplit("/", 1)[-1]
filename = processor.retrieve_filename("test_file")

assert Path(filename).name == expected_filename # type: ignore
@@ -113,9 +116,9 @@ def test_get_array_unmasked_no_columns(
test_columns = None
processor = SWIFTProcessor(template_dataset_alias_map)

expected_shape = (261992,)
expected_shape = (32382,)
expected_first_element_6dp = "6.377697e-06"
expected_final_element_6dp = "1.192093e-06"
expected_20k_element_6dp = "8.940697e-07"

output = processor.get_array_unmasked(
template_swift_data_path,
@@ -125,7 +128,7 @@

assert output.shape == expected_shape
assert f"{output[0]:.6e}" == expected_first_element_6dp
assert f"{output[-1]:.6e}" == expected_final_element_6dp
assert f"{output[20000]:.6e}" == expected_20k_element_6dp


def test_get_array_unmasked_columns(
Expand All @@ -136,11 +139,11 @@ def test_get_array_unmasked_columns(
test_columns = 0
processor = SWIFTProcessor(template_dataset_alias_map)

expected_shape = (261992,)
expected_shape = (32382,)
expected_first_element_col0 = "0.75200003"
expected_final_element_col0 = "0.75200033"
expected_final_element_col0 = "0.75199997"
expected_first_element_col1 = "0.24800000"
expected_final_element_col1 = "0.24800017"
expected_final_element_col1 = "0.24800000"

output = processor.get_array_unmasked(
template_swift_data_path,
17 changes: 12 additions & 5 deletions tests/test_virgo_auth.py
@@ -1,5 +1,5 @@
import json
from datetime import UTC, datetime, timedelta
from datetime import datetime, timedelta, timezone
from pathlib import Path

import jwt
Expand Down Expand Up @@ -159,7 +159,14 @@ def test_generate_token(mock_settings):
generated_token = auth.generate_token()

decoded = jwt.decode(generated_token, expected_test_secret, algorithms=["HS256"])
expected_exp = datetime(2022, 1, 1, 1, 0, tzinfo=UTC) # 1 hour added to utcnow
expected_exp = datetime(
2022,
1,
1,
1,
0,
tzinfo=timezone.utc,
) # 1 hour added to utcnow
expected_exp_unix = expected_exp.timestamp()

assert decoded["exp"] == expected_exp_unix
@@ -170,7 +177,7 @@ def test_verify_jwt_token_valid(mock_settings):
expected_test_secret = mock_settings.jwt_secret_key.get_secret_value()
test_user = "test_user"

expiration = datetime.now(UTC) + timedelta(hours=1)
expiration = datetime.now(timezone.utc) + timedelta(hours=1)
token = jwt.encode(
{"exp": expiration, "sub": test_user},
expected_test_secret,
@@ -185,7 +192,7 @@ def test_verify_jwt_token_expired(mock_settings):
def test_verify_jwt_token_expired(mock_settings):
test_user = "test_user"
expected_test_secret = mock_settings.jwt_secret_key.get_secret_value()
expiration = datetime.now(UTC) - timedelta(hours=1)
expiration = datetime.now(timezone.utc) - timedelta(hours=1)
expired_token = jwt.encode(
{"exp": expiration, "sub": test_user},
expected_test_secret,
@@ -200,7 +207,7 @@ def test_verify_jwt_token_expired(mock_settings):

def test_verify_jwt_token_invalid(mock_settings):
test_user = "test_user"
expiration = datetime.now(UTC) + timedelta(hours=1)
expiration = datetime.now(timezone.utc) + timedelta(hours=1)
invalid_token = jwt.encode(
{"exp": expiration, "sub": test_user},
"wrongsecret",
