Add Spack buildcache to CPU Ubuntu builds (#85)
* Add spack buildcache to CPU builds, trying with just python for now.

* Ubuntu 22.04 uses different gcc version...

* Fix test stage and only build perl.

* Fix test stage

* Disable test stage for now.

* Fix username for action.

* Add back exago builds.

* Trust buildcache and find externals.

* Correct spack commands.

* Move where mirror is set.

* Move where mirror is set.

* Move where mirror is set.

* Build custom base image with fortran libs for use in each stage.

* Fix tags and use static commit for action.

* Build base image prior to matrix jobs.

* Cleanup code and fix base image name.

* Hard code image version.

* Use latest action versions, hard code image tags for push.

* Fix image_name to be lower case.

* Fix image name...

* Force push to registry and try build exago based on PR branch.

* Update actions to get branch name within job step.

* Use spack develop to configure correct re-build.

* Use -e . for spack develop command

* Add libstdc++ and gcc to base image.

* Update concretizer config and give specific version in root spec.

* Reuse dependencies in spack.yaml

* Minimize spack config to fix concretizer

* Get rid of spack git= syntax as it is bugged.

* Attempt to fix glibc issue in base image.

* Attempt to fix glibc issue in base image.

* Attempt to fix glibc issue in base image.

* Attempt to fix glibc issue in base image.

* Attempt to fix glibc issue in base image.

* Add openssh to base image.

* Add documentation including demo video.

* Fix broken links identified in review.
Cameron Rutherford authored Dec 4, 2023
1 parent 1e8ee59 commit ae5c8ac
Showing 6 changed files with 199 additions and 26 deletions.
30 changes: 30 additions & 0 deletions .github/workflows/README.md
@@ -0,0 +1,30 @@
# GitHub Actions Documentation

## `file_naming.yaml`

Runs on push. This action runs our Perl script in `buildsystem/tools/file_naming_conventions.pl` to enforce the naming restrictions defined in our developer guidelines (P003). One action checks out the code, and another installs Perl before the script runs.

## `ornl_ascent_mirror.yaml`

This pushes to a GitLab instance at ORNL and runs CI/CD on Ascent. It can also rebuild modules for testing newer versions of ExaGO and its dependencies without needing to monitor builds by hand.

## `pnnl_mirror.yaml`

Similar to `ornl_ascent_mirror.yaml`, this mirrors to PNNL GitLab, but also supports Incline, Deception and Newell.

## `pre_commit.yaml`

This enforces and runs pre-commit, and automatically commits fixes for any checks that it can. Notably, it most often applies clang and cmake formatting, and requires developers to either install pre-commit locally or rebase to incorporate the changes.

## `spack_cpu_build.yaml`

In order, this workflow:
- Builds a base image with the Linux packages (such as MPI and GCC) that ExaGO needs at runtime
- Builds binaries on an Ubuntu GitHub Actions runner for a matrix of ExaGO configurations, force pushing each time to refresh the binaries
- Pushes those binaries, layered on the custom base image, to the GHCR packages for ExaGO

This also leverages the Spack public mirror binaries as well as ExaGO's GHCR binaries, cutting builds from about 1.5 hours to under 10 minutes. The slowest part is now concretization.

To pull the binaries and run, you can consult some more verbose docs:
- From GitHub https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry
- From Spack https://spack.readthedocs.io/en/latest/binary_caches.html#oci-docker-v2-registries-as-build-cache
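As a rough illustration of pulling these binaries with Spack, the sketch below mirrors the commands the workflow itself runs. The mirror names `exago-ghcr` and `spack-public` and the function name `install_exago_from_cache` are hypothetical, and whether a given spec is actually served from the cache depends on it matching what CI built; treat this as a starting point rather than the supported procedure.

```shell
# Registry and image name as set in the workflow's env block.
REGISTRY="ghcr.io/pnnl"
IMAGE_NAME="exago"
MIRROR_URL="oci://${REGISTRY}/${IMAGE_NAME}"

# Hypothetical helper: register the mirrors, then install a spec from binaries.
# Assumes spack is already on PATH and that the cache is readable by you.
install_exago_from_cache() {
  local spec="${1:-exago@develop}"
  spack mirror add exago-ghcr "${MIRROR_URL}"
  spack mirror add spack-public https://binaries.spack.io/develop
  # Trust the keys published by the public Spack mirror.
  spack buildcache keys --install --trust
  # The workflow pushes unsigned binaries, so signature checks are skipped.
  spack install --no-check-signature "${spec}"
}
```

Calling `install_exago_from_cache` with no argument would try `exago@develop`; pass one of the specs from the workflow matrix to match a cached build more closely.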
138 changes: 117 additions & 21 deletions .github/workflows/spack_cpu_build.yaml
@@ -1,18 +1,86 @@
name: Spack CPU Builds
# https://spack.readthedocs.io/en/latest/binary_caches.html#spack-build-cache-for-github-actions
name: Spack Ubuntu x86_64 Buildcache

env:
SPACK_COLOR: always
REGISTRY: ghcr.io/pnnl
# Our repo name contains upper case characters, so we can't use ${{ github.repository }}
IMAGE_NAME: exago
USERNAME: exago-bot
BASE_VERSION: ubuntu-22.04-fortran

# Until we remove the need to clone submodules to build, this should only be in PRs
on: [pull_request]

jobs:
base_image_build:
runs-on: ubuntu-22.04
permissions:
packages: write
contents: read

name: Build Custom Base Image
steps:
- name: Checkout
uses: actions/checkout@v4
with:
# Once we move submodule deps into spack, we can do some more builds
# Also need to change build script to use spack from base image
submodules: true

# Need to build custom base image with gfortran
- name: Create Dockerfile heredoc
run: |
cat << EOF > Dockerfile
FROM ubuntu:22.04
RUN apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common \
gpg-agent \
openssh-client \
openssh-server \
&& rm -rf /var/lib/apt/lists/*
RUN add-apt-repository ppa:ubuntu-toolchain-r/test && \
apt-get install -y --no-install-recommends \
gfortran \
gcc \
libstdc++6 \
&& rm -rf /var/lib/apt/lists/*
EOF
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images
- name: Log in to the Container registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ env.USERNAME }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
labels: org.opencontainers.image.version=${{ env.BASE_VERSION }}

- name: Build and push Docker base image
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ env.BASE_VERSION }}
labels: ${{ steps.meta.outputs.labels }}

exago_spack_builds:
# 20.04 is a version shared by E4S cache and Spack binaries for x86_64
runs-on: ubuntu-20.04
# This seems redundant if we use a spack submodule?
container: spack/ubuntu-focal:latest
needs: base_image_build
runs-on: ubuntu-22.04
permissions:
packages: write
contents: read

strategy:
matrix:
# Minimal Build(s)
# Need S3 mirror to have these builds speedup
# Minimal Build(s) - GHCR mirror speeds these up a lot!
spack_spec:
# See #39 - ~python~mpi causes issues
# - exago@develop~mpi~ipopt~hiop~python~raja
@@ -30,23 +98,51 @@ jobs:
name: Build ExaGO with Spack
steps:
- name: Checkout
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
# Once we move submodule deps into spack, we can do some more builds
# Also need to change build script to use spack from base image
submodules: true

- name: Setup Spack
run: echo "$PWD/tpl/spack/bin" >> "$GITHUB_PATH"

- name: Create heredoc spack.yaml
run: |
cat << EOF > spack.yaml
spack:
specs:
- ${{ matrix.spack_spec }} target=x86_64_v2
concretizer:
reuse: dependencies
config:
install_tree:
root: /opt/spack
padded_length: 128
mirrors:
local-buildcache: oci://${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
spack: https://binaries.spack.io/develop
- name: Configure GHCR mirror
run: spack -e . mirror set --oci-username ${{ env.USERNAME }} --oci-password "${{ secrets.GITHUB_TOKEN }}" local-buildcache

- name: Trust keys
run: spack -e . buildcache keys --install --trust

- name: Find external packages
run: spack -e . external find --all --exclude python

- name: Spack develop exago
run: spack -e . develop --path=$(pwd) exago@develop

- name: Build Environment
env:
SPACK_SPEC: ${{ matrix.spack_spec }}
- name: Concretize
run: spack -e . concretize

- name: Install
run: spack -e . install --no-check-signature

# Push with force to override existing binaries...
- name: Push binaries to buildcache
run: |
ls && pwd
. ./tpl/spack/share/spack/setup-env.sh
spack debug report
spack env create -d ./spack-env
spack env activate ./spack-env
spack add $SPACK_SPEC target=x86_64
spack develop --path $(pwd) --no-clone exago@develop
spack concretize --reuse
git config --global --add safe.directory $(pwd)
spack --stacktrace install --fail-fast
spack -e . buildcache push --force --base-image ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ env.BASE_VERSION }} --unsigned --update-index local-buildcache
if: ${{ !cancelled() }}
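The workflow hard-codes `IMAGE_NAME: exago` because the repository name contains upper-case characters, which container registries reject in image references. One possible alternative, sketched here under the assumption that a shell step could compute the name instead, is to lower-case the repository slug. The function name `lowercase_image_ref` is hypothetical and not part of the workflow.

```shell
# Hypothetical helper: build a registry-safe image reference from a
# registry host and an "owner/Repo" slug by lower-casing the slug.
lowercase_image_ref() {
  local registry_host="$1" repository="$2"
  printf '%s/%s\n' "$registry_host" \
    "$(printf '%s' "$repository" | tr '[:upper:]' '[:lower:]')"
}

# Example: the slug for this repository.
lowercase_image_ref ghcr.io pnnl/ExaGO
```

In a GitHub Actions step the slug would come from the `GITHUB_REPOSITORY` environment variable; the hard-coded value in the workflow avoids that extra step.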
38 changes: 38 additions & 0 deletions README.md
@@ -32,6 +32,31 @@ Additionally, note that SCOPFLOW and SOPFLOW with HiOp solver use Ipopt to solve

Detailed installation instructions, covering acquiring, building and installing ExaGO, are given in [INSTALL.md](./INSTALL.md).

If you are a developer with access to the project, we also provide public binaries generated through our GitHub Actions workflows, which are documented in [README.md](.github/workflows/README.md); usage is documented in the packages section of our repository. Check out a short (< 60s) demo of pulling down a version of ExaGO:

[![asciicast](https://asciinema.org/a/KCi5TmUXc6zWDj7JYHzfSFxmw.png)](https://asciinema.org/a/KCi5TmUXc6zWDj7JYHzfSFxmw)

## Developer Guide

You can view the following helpful documentation sources:
- [test_add.md](docs/web/test_add.md) markdown file for information on adding tests (outdated)
- [README.md](buildsystem/README.md) for our bash / spack buildsystem used in GitHub/GitLab CI/CD
- [README.md](buildsystem/spack/README.md) for our spack specific build scripts that support CI tcl modules on HPC target platforms
- [README.md](docs/devcontainer/README.md) for our devcontainer configuration information (codespace support coming soon)
- [exago_policy_compatibility](docs/exago_policy_compatibility.md) for xSDK compatibility guidelines, and ways to enforce compliance
- [python_bindings.md](docs/python_bindings.md) for documentation about our Python bindings
- [README.md](performance_analysis/README.md) for information about profiling ExaGO with spack
- [README.md](.github/workflows/README.md) for details about our GitHub actions
- [README.ci_clusters.md](docs/web/README.ci_clusters.md) for CI cluster workflow documentation
- [README.summit.md](docs/web/README.summit.md) for ORNL's Summit specific configuration

## Visualization

Our ChatGrid frontend deployed with React, PSQL and LangChain has documentation in [README.md](viz/README.md) as well as a pdf [README.pdf](viz/README.pdf) in the `viz` subdirectory. Several of our tutorials install this through commands in Jupyter Notebooks as well.


## Usage
Instructions for executing the different ExaGO<sup>TM</sup> applications is given below.
- [OPFLOW](docs/web/opflow.md)
@@ -40,6 +65,19 @@ Instructions for executing the different ExaGO<sup>TM</sup> applications is give
- [SCOPFLOW](docs/web/scopflow.md)
- [PFLOW](docs/web/pflow.md)

We also provide our user manual as a PDF, [manual.pdf](docs/manual/manual.pdf). TODO: update this regularly with CI, or move to Quarto docs.

## Tutorials

- If you are using a devcontainer with VSCode, the following tutorials are provided:
  - [tutorial.ipynb](docs/devcontainer/tutorial.ipynb) for basic configuration information and I/O
  - [mpi4py-tutorial.ipynb](docs/devcontainer/mpi4py-tutorial.ipynb) for mpi4py pointers and best practices
  - [viz-tutorial.ipynb](docs/devcontainer/viz-tutorial.ipynb) for spinning up our frontend visualization with ChatGrid integration
- Otherwise, you can check out our more in-depth application tutorials in the `tutorials` subdirectory:
  - [demo1.ipynb](tutorials/demo1.ipynb) run OPFLOW, SCOPFLOW and visualize your output
  - [demo2.ipynb](tutorials/demo2.ipynb) run SOPFLOW on many ranks using MPI, and visualize output
    - TODO - add fixes from `mpi4py` devcontainer example into this notebook to show working MPI workflow

### Options

Each application has a different set of options that are described in depth in the usage notes. These options can be passed optionally through an options file (`-optionsfile <option_file>`), or directly on the command line.
2 changes: 1 addition & 1 deletion buildsystem/spack/README.md
@@ -24,7 +24,7 @@ any existing installations unusable due to changes in the hashing algorithm.

Spack modules are automatically rebuilt via CI pipelines for a cluster when a commit message includes `[<clustername>-rebuild]` where `<clustername>` is one of the following [newell, deception, ascent].

See the [developer guidelines](./docs/developer_guidelines.md) for a general workflow outline.
See the [developer guidelines](../../docs/developer_guidelines.md) for a general workflow outline.

Once a build is finished, a new commit is pushed to the branch with a commit message with `[<clustername>-test]`, where tests are run only for that platform. If you want all of CI to be re-run after a specific platform test, you may have to push another empty commit, or re-run CI manually.
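As an illustration of the commit-message convention described above, the following sketch builds the rebuild tag for one of the listed clusters. The helper name `rebuild_tag` is hypothetical and is not part of the repository's tooling; only the tag format and the cluster names come from the text.

```shell
# Hypothetical helper: print the commit-message tag that requests a module
# rebuild on one of the clusters listed above, rejecting any other name.
rebuild_tag() {
  case "$1" in
    newell|deception|ascent) printf '[%s-rebuild]\n' "$1" ;;
    *) echo "unknown cluster: $1" >&2; return 1 ;;
  esac
}

# Usage, from inside a clone of the repository:
#   git commit --allow-empty -m "Rebuild spack modules $(rebuild_tag newell)"
#   git push
```

An empty commit created with `git commit --allow-empty` is also one way to push the "another empty commit" mentioned above when all of CI needs to be re-run.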
## General Workflow
9 changes: 9 additions & 0 deletions docs/devcontainer/README.md
@@ -28,3 +28,12 @@ The build info for this container is in `.devcontainer/`. There is a Dockerfile
1. once the container has built and launched, open `docs/devcontainer/tutorial.ipynb`
1. select the (newly built) existing jupyter kernel "ExaGO"
1. run all cells!

## Devcontainer Quickstart

A devcontainer is configured through three things here:
- `create_dockerfile.sh` generates the Dockerfile. Run it from the ExaGO root, with the spack submodule cloned, as `$ .devcontainer/create_dockerfile.sh`. Note that it currently uses `sed` commands that only work in a macOS shell, so on other platforms the generated Dockerfile may need small manual fixes. Ideally we would also have CI for this.
- `Dockerfile` is generated by the above bash script and defines a container with ExaGO. This build currently takes 1.5 hours, but ideally it is cached and pulled from CI builds.
- `devcontainer.json` is a JSON file, so it carries no comments, but it should be self-explanatory. You can define extensions to load here, which is really helpful for common environments, and the local folder is mounted so you can easily work with your ExaGO clone inside the container.
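One common reason `sed` commands work only on macOS is the in-place flag: BSD `sed` requires an explicit backup suffix (`sed -i ''`), while GNU `sed` takes `-i` on its own. Assuming that is the incompatibility in `create_dockerfile.sh` (the script itself is not shown here), a portable wrapper could look like the sketch below. The name `sed_inplace` is hypothetical.

```shell
# Hypothetical wrapper: edit files in place with either GNU or BSD sed.
# GNU sed accepts --version; BSD/macOS sed rejects it, which tells them apart.
sed_inplace() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"       # GNU sed: -i takes no separate suffix argument
  else
    sed -i '' "$@"    # BSD/macOS sed: -i needs an explicit (empty) suffix
  fi
}

# Usage:
#   sed_inplace 's/old/new/' Dockerfile
```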

Note - pushing/pulling from git is not supported in a devcontainer, and should be done independently.
8 changes: 4 additions & 4 deletions docs/exago_policy_compatibility.md
@@ -6,7 +6,7 @@ and should be considered when filling out this form.
Please, provide information on your compatibility status for each mandatory policy, and if possible also for recommended policies.
If you are not compatible, state what is lacking and what are your plans on how to achieve compliance.

**Website:** https://gitlab.pnnl.gov/exasgd/frameworks/exago
**Website:** https://github.com/pnnl/ExaGO

### Mandatory Policies

@@ -39,11 +39,11 @@ M2 details <a id="m2-details"></a>: optional: provide more details about approac

| Policy |Support| Notes |
|------------------------|-------|-------------------------|
|**R1.** Have a public repository. |Full| [Public GitLab repository linked here](https://gitlab.pnnl.gov/exasgd/frameworks/exago/). |
|**R1.** Have a public repository. |Full| [Public GitHub repository linked here](https://github.com/pnnl/ExaGO). |
|**R2.** Possible to run test suite under valgrind in order to test for memory corruption issues. |Full| It is possible to run any of the application drivers and test drivers under Valgrind. This has only been tested with the leakcheck tool, and not any of the other tools from Valgrind. |
|**R3.** Adopt and document consistent system for error conditions/exceptions. |Full| ExaGO makes thorough use of return codes and error checking, particularly the PETSc macros such as `CHKERRQ`. |
|**R4.** Free all system resources acquired as soon as they are no longer needed. |Full| Memory for the model is allocated at the beginning of the program and freed at the end. ExaGO also allows for using an external solver, in which case the memory is used by a third-party library. These libraries (PETSc, Ipopt, and HiOp) also adequately free memory they allocate. |
|**R5.** Provide a mechanism to export ordered list of library dependencies. |Full| ExaGO exposes two arrays, `ExaGODependencyNames` and `ExaGOIsDependencyEnabled`, allowing users to query dependency information. Only key dependencies are tracked in these arrays, such as RAJA and GPU-related dependencies. |
|**R6.** Document versions of packages that it works with or depends upon, preferably in machine-readable form. |Full| Our Spack packages document much of this information. Documentation in [`INSTALL.md`](INSTALL.md) and [`docs/InstallingWithSpack.md`](docs/installing_with_spack.md) contain additional information about dependencies.|
|**R6.** Document versions of packages that it works with or depends upon, preferably in machine-readable form. |Full| Our Spack packages document much of this information. Documentation in [`INSTALL.md`](../INSTALL.md) and [`docs/InstallingWithSpack.md`](./installing_with_spack.md) contains additional information about dependencies.|
|**R7.** Have README, SUPPORT, LICENSE, and CHANGELOG files in top directory. |Full| We currently have README.md, CHANGELOG.md, SUPPORT.md, and LICENSE files in root directory. |
|**R8.** Each xSDK member package should have sufficient documentation to support use and further development. |Full| The directory `docs/manual` contains thorough documentation in LaTeX with a prebuilt user manual PDF [linked here](docs/manual/manual.pdf). The file [`docs/DeveloperGuidelines`](./docs/developer_guidelines.md) contains documentation on software development best practices that contributors are expected to follow. `docs/web` contains markdown documentation on each of the application libraries and further documentation on some dependencies and platforms. `docs/petsc-dependencies` contains further documentation on PETSc usage. |
|**R8.** Each xSDK member package should have sufficient documentation to support use and further development. |Full| The directory `docs/manual` contains thorough documentation in LaTeX with a prebuilt user manual PDF [linked here](./manual/manual.pdf). The file [`docs/DeveloperGuidelines`](./developer_guidelines.md) contains documentation on software development best practices that contributors are expected to follow. `docs/web` contains markdown documentation on each of the application libraries and further documentation on some dependencies and platforms. `docs/petsc-dependencies` contains further documentation on PETSc usage. |
