Add ecTrans documentation site #121

Open
wants to merge 23 commits into
base: develop
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -8,4 +8,5 @@ build/*
install/*
env.sh
*.DS_Store

docs/site
docs/content_processed
41 changes: 41 additions & 0 deletions docs/content/api.md
@@ -0,0 +1,41 @@
---
title: ecTrans API
---

@warning
Page under construction.
@endwarning

## General notes

@note
ecTrans is a legacy code with 30 years of accumulated history. Over this time, certain features
enabled through optional arguments have fallen out of use. We are currently reviewing all options
to identify those that can be safely deleted, but this takes time. In the meantime, we have tagged
below all options we deem to be "potentially deprecatable".
@endnote

### Variable names

ecTrans _in principle_ follows the coding standard and conventions outlined in the [IFS
Documentation - Part VI: Technical and Computational Procedures](https://www.ecmwf.int/en/elibrary/
81372-ifs-documentation-cy48r1-part-vi-technical-and-computational-procedures) section 1.5.
Following these standards, all variable names must begin with a one- or two-character prefix
denoting their scope (module level, dummy argument, local variable, loop index, or parameter) and
type. These are outlined in Table 1.2 of that document. Dummy arguments have the following prefixes:

- `K` - integer
- `P` - real (single or double precision)
- `LD` - logical
- `CD` - character
- `YD` - derived type

### `KIND` parameters

As with the IFS, integer and real variables in ecTrans always have an explicit `KIND` specification.
These are defined in the [`PARKIND1` module](https://github.com/ecmwf-ifs/fiat/blob/main/src/parkind
/parkind1.F90), which is part of the FIAT library (a dependency of ecTrans). To understand the
subroutines described here, only two of these parameters need to be considered:

- `INTEGER, PARAMETER :: JPIM = SELECTED_INT_KIND(9)` (i.e. 4-byte integer)
- `INTEGER, PARAMETER :: JPRD = SELECTED_REAL_KIND(13,300)` (i.e. 8-byte float)
122 changes: 122 additions & 0 deletions docs/content/benchmarking.md
@@ -0,0 +1,122 @@
---
title: Benchmarking ecTrans
---

@warning
Page under construction.
@endwarning

A ["benchmark driver" program](https://sites.ecmwf.int/docs/ectrans/sourcefile/ectrans-benchmark.
f90.html) is bundled with ecTrans. This program repeats a pair of inverse and direct spectral
transforms a specified number of times and collects timing statistics to provide an assessment of
the overall performance of ecTrans. It is designed to mimic the use of ecTrans from within the IFS
atmospheric model, in which inverse and direct spectral transforms are carried out on every model
timestep. The benchmark program also includes a simple error-checking algorithm for verifying that
the transforms are numerically correct. This latter feature is used by the ecTrans CTest suite.

Here we describe how to write a benchmark suite for ecTrans.

## Installing ecTrans

First follow the [instructions for installing ecTrans](installation.html) on your system. Verify
that the benchmark programs (one each for single and double precision) exist in your build's bin
directory. You should see:

```bash
ectrans-benchmark-cpu-sp ectrans-benchmark-cpu-dp
```

Here we assume you have only enabled the `CPU` feature of ecTrans (which is on by default). If you
have also enabled the `GPU` feature, you will see GPU versions of these two programs as well. We
focus on CPUs here.
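
As a quick check, a small loop along these lines can confirm that both executables are present.
This is only a sketch: it assumes the build directory is `ectrans/build`, as in the installation
instructions, so adjust `bin_dir` if yours differs.

```bash
# Sketch: check that both CPU benchmark executables were built.
# Assumption: the build directory is ectrans/build (adjust bin_dir otherwise).
bin_dir=ectrans/build/bin
checked=0
for prec in sp dp; do
  exe="$bin_dir/ectrans-benchmark-cpu-$prec"
  if [ -x "$exe" ]; then
    echo "found:   $exe"
  else
    echo "missing: $exe"
  fi
  checked=$((checked + 1))
done
```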

## Using the benchmark program

The benchmark program has many arguments for running ecTrans in different configurations. You can
see the full set by running one of the benchmark programs with the `--help` option:

```
NAME ectrans-benchmark-cpu-sp

DESCRIPTION
This program tests ecTrans by transforming fields back and forth between spectral
space and grid-point space (single-precision version)

USAGE
ectrans-benchmark-cpu-sp [options]

OPTIONS
-h, --help Print this message
-v Run with verbose output
-t, --truncation T Run with this triangular spectral truncation (default = 79)
-g, --grid GRID Run with this grid. Possible values: O<N>, F<N>
If not specified, O<N> is used with N=truncation+1 (cubic relation)
-n, --niter NITER Run for this many inverse/direct transform iterations (default = 10)
-f, --nfld NFLD Number of scalar fields (default = 1)
-l, --nlev NLEV Number of vertical levels (default = 1)
--vordiv Also transform vorticity-divergence to wind
--scders Compute scalar derivatives (default off)
--uvders Compute uv East-West derivatives (default off). Only when also --vordiv is given
--flt Run with fast Legendre transforms (default off)
--nproma NPROMA Run with NPROMA (default no blocking: NPROMA=ngptot)
--norms Calculate and print spectral norms of transformed fields
The computation of spectral norms will skew overall timings
--meminfo Show diagnostic information from FIAT's ec_meminfo subroutine on memory usage, thread-binding etc.
--nprtrv Size of V set in spectral decomposition
--nprtrw Size of W set in spectral decomposition
-c, --check VALUE The multiplier of the machine epsilon used as a tolerance for correctness checking

DEBUGGING
--dump-values Output gridpoint fields in unformatted binary file
```

Some of these options (e.g. `--nprtrv`) require a detailed understanding of how fields are
distributed across MPI tasks, so we won't describe them in detail here. The most important arguments
are the following:
- `-t, --truncation T`: this sets the overall resolution of the benchmark. The truncation T refers
to the highest zonal and total wavenumber that can be kept in spectral space. By default, a
suitable grid point resolution (i.e. a suitable number of latitudes on the octahedral grid) will
be chosen for grid point space. This single number therefore determines the overall problem size
of the spectral transform. The higher this number, the larger the problem size. As of August 2024,
the "HRES" (high-resolution, deterministic) forecast of ECMWF uses a spectral truncation of 1279,
combined with an octahedral grid of 2560 latitudes, which gives a grid point resolution of
approximately 8 km.
- `-n, --niter NITER`: this determines how many inverse/direct transform iterations to perform.
The more iterations you perform, the more reliable the timing statistics you gather. Note that
two additional iterations are always performed at the start. This is because (at least for the
GPU version of ecTrans) the first two iterations include some initialisation costs which
shouldn't be included in any timing statistics.
- `-l, --nlev NLEV`: this determines the number of vertical levels for three-dimensional fields
such as U and V wind (or vorticity and divergence). ecTrans can operate on a batch of vertical
levels with a single call and this determines the size of this batch (though by default, fields
are distributed across MPI tasks on the vertical dimension at some stages in the spectral
transform).
- `--vordiv --scders --uvders`: these options enable some auxiliary code paths when calling the
inverse transform. `--vordiv` calculates grid point vorticity and divergence, `--scders`
calculates derivatives of scalar fields in grid point space, and `--uvders` calculates gradients
of the U and V wind in grid point space. For testing code changes, it's good to include these
options so as many code paths as possible are verified.
- `--norms`: this option enables error norms, which are printed aggregated over all fields at the
end of the benchmark. The errors are computed in spectral space with respect to the initial
values of the fields. This is a useful check that the benchmark is numerically correct.
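
Putting these options together, a typical invocation might be assembled as follows. This is only a
sketch: the executable path and the chosen values for truncation, iterations and levels are
assumptions for illustration, and the grid name simply spells out the default cubic relation given
in the help text.

```bash
# Sketch: assemble and run a benchmark command from the options described above.
# Assumptions: the executable path and the numeric values are illustrative only.
benchmark=ectrans/build/bin/ectrans-benchmark-cpu-dp
truncation=159   # -t: spectral truncation
niter=20         # -n: timed iterations (two extra warm-up iterations always run first)
nlev=10          # -l: number of vertical levels

# Default grid from the help text: O<N> with N = truncation + 1 (cubic relation).
grid="O$((truncation + 1))"

args="--truncation $truncation --grid $grid --niter $niter --nlev $nlev"
# Exercise the auxiliary code paths and print error norms.
args="$args --vordiv --scders --uvders --norms"

echo "$benchmark $args"
if [ -x "$benchmark" ]; then
  # $args is deliberately unquoted so that it splits into separate arguments.
  $benchmark $args
fi
```

For pure timing runs it is probably better to drop `--norms`, since the help text notes that
computing spectral norms skews the overall timings.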









When inspecting the program, you will notice that it is significantly more complex than, say, the
example program described in our [usage guide](usage.html). This additional complexity comes not
just from the instrumentation code for the timings, but notably also from the infrastructure to
permit transforms of distributed fields. As explained in the [introduction](introduction.html),
ecTrans can operate on fields distributed across MPI tasks, and the dimension across which fields
are split is different for spectral space and grid point space. As such, the benchmark program
includes infrastructure for specifying which elements of the relevant decomposed dimension belong
to which MPI task.


7 changes: 7 additions & 0 deletions docs/content/gpu.md
@@ -0,0 +1,7 @@
---
title: GPU offloading
---

@warning
Page under construction.
@endwarning
Binary file added docs/content/img/spherical_harmonic.png
67 changes: 67 additions & 0 deletions docs/content/index.md
@@ -0,0 +1,67 @@
---
title: User Guide
ordered_subpage: introduction.md
ordered_subpage: installation.md
ordered_subpage: usage.md
ordered_subpage: benchmarking.md
ordered_subpage: api.md
ordered_subpage: transi.md
ordered_subpage: license.md
copy_subdir: img
---

@warning
Page under construction.
@endwarning

## Introduction

ecTrans is a high-performance numerical library for transforming meteorological fields between
global grid-point space representation and a spectral representation based on spherical harmonics.
It is a fundamental part of the
[European Centre for Medium-Range Weather Forecasts'](https://www.ecmwf.int/)
[Integrated Forecasting System (IFS)](https://www.ecmwf.int/en/forecasts/documentation-and-support/
changes-ecmwf-model), a global numerical weather prediction suite. Indeed, ecTrans
was previously part of the IFS source code itself. It therefore benefits from over 30 years of
development and optimisation. In 2022, ecTrans was split out from the IFS source code and released
as its own project, becoming the first open-source component of the IFS.

ecTrans is engineered to run efficiently on many hundreds, or even thousands, of compute nodes.
This is achieved in part through significant optimisation of the constituent compute kernels,
making use of the FFTW library for the Fourier transform in the longitudinal direction and BLAS
GEMMs for the Legendre transforms in the latitudinal direction. In addition, because transformed
fields are distributed across compute tasks, great care has been taken to ensure that parallelism
can be fully exploited at all stages in the algorithm. This is achieved through data exchange steps
interleaved between the Fourier and Legendre transforms, which are implemented using the Message
Passing Interface (MPI).

The result is an algorithm which stresses a high-performance computing system both on the
node level _and_ the network level, serving as an excellent overall benchmark and a target for
optimisation of IFS execution speed.

## The spectral transform

ecTrans transforms a batch of meteorological fields from a grid point space representation
\( X_k(\lambda_i, \phi_j) \), where \( \lambda_i \) is the \( i^{\text{th}} \) longitude,
\( \phi_j \) is the \( j^{\text{th}} \) latitude, and \( k \) is the index which ranges over the
batch of fields, to a spectral space representation \( X_{m,n,k} \), where \( m \) is the zonal
wavenumber, and \( n \) is the total wavenumber. This constitutes a direct spectral transform.
ecTrans can also carry out the inverse spectral transform.

The inverse spectral transform (spectral space to grid point space) is accomplished in two
computational steps. Firstly, an inverse Legendre transform is performed in the latitudinal
direction,

\[
X_{m,k}(\phi_j) = \sum_{n=|m|}^{N} X_{m,n,k} P_{m,n}(\sin(\phi_j)).
\]

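Here \( N \) is the spectral truncation and \( P_{m,n} \) is the associated Legendre polynomial of
order \( m \) and degree \( n \). Then, an inverse Fourier transform is performed in the
longitudinal direction,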

\[
X_k(\lambda_i, \phi_j) = \sum_{m=-N}^{N} X_{m,k}(\phi_j) e^{im\lambda_i}.
\]
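
Substituting the first step into the second gives the full inverse transform as a single double
sum (with the Fourier sum running over the zonal wavenumber \( m \)). This follows directly from
the two definitions above:

\[
X_k(\lambda_i, \phi_j) = \sum_{m=-N}^{N} \sum_{n=|m|}^{N} X_{m,n,k} P_{m,n}(\sin(\phi_j)) e^{im\lambda_i}.
\]

The inner sum over \( n \) does not depend on the longitude \( \lambda_i \), which is why it is
worth computing it once per latitude as a separate Legendre step and reusing the result for every
longitude in the Fourier step.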

## Parallelizing a spectral transform

## Basic usage of ecTrans
131 changes: 131 additions & 0 deletions docs/content/installation.md
@@ -0,0 +1,131 @@
---
title: Installing ecTrans
---

ecTrans relies on CMake for building and for unit testing. It follows standard CMake procedures for
building out-of-source, but below we describe this explicitly for newcomers to CMake.

## Requirements

ecTrans has the following requirements:

- [CMake](https://cmake.org/) version 3.12 or later
- [ecBuild](https://github.com/ecmwf/ecbuild.git) (a collection of ECMWF-specific CMake macros)
- A Fortran compiler with OpenMP support and a C compiler. Officially we support:
  - Classic Intel (i.e. ifort and icc)
  - GNU
  - NVHPC
  - Cray
- [FIAT: the Fortran IFS and Arpege Toolkit](https://github.com/ecmwf-ifs/fiat)
- [FFTW](https://www.fftw.org/)
- A library containing standard BLAS routines such as [LAPACK](https://www.netlib.org/lapack/)

Note that you will also need an MPI library if you want to run ecTrans with distributed-memory
parallelism, but this is not an explicit dependency of ecTrans. Instead, MPI functionality is
provided through a wrapper library in FIAT, and it is FIAT that must be built with MPI support. In
any case, ecTrans can still be tested without MPI, using only shared-memory parallelism.

For all of these except FIAT and ecBuild, we cannot give general instructions, as installation
depends entirely on your system and software environment. Most modern high-performance computing
systems should have them installed already, so we will assume this is the case for you as well.

ecBuild can simply be cloned from GitHub like so:

```bash
git clone https://github.com/ecmwf/ecbuild.git --branch 3.8.2 --single-branch
```

It does not require a build or installation step. Simply export a variable `ecbuild_DIR` pointing to
the cloned repository.

@note
We are always willing to add other compilers to our list, so please
[raise an issue](https://github.com/ecmwf-ifs/ectrans/issues) if you are encountering
difficulties with a particular compiler.
@endnote

## (Optional) Prepare env and toolchain files

Preparing these files is not strictly necessary, but is recommended to ease reproducibility. An env
file (called env.sh below) should contain all of the steps required to make the dependencies of FIAT
and ecTrans available on the path, and otherwise prepare the environment for their building and
execution. For example, the env file on a system that uses modules might load the modules for CMake,
the compiler suite, FFTW, and the BLAS library. The toolchain file (called toolchain.cmake below)
contains definitions of CMake variables which will be needed to build FIAT and ecTrans correctly.
The most important are the compilers. The toolchain file should set the values of
`CMAKE_C_COMPILER`, `CMAKE_CXX_COMPILER`, and `CMAKE_Fortran_COMPILER`. The toolchain file for
NVHPC will look like this, for example:

```cmake
set(CMAKE_C_COMPILER nvc)
set(CMAKE_CXX_COMPILER nvc++)
set(CMAKE_Fortran_COMPILER nvfortran)
```

To get CMake to always detect ecBuild, add this to your env file:

```bash
export ecbuild_DIR=<path to ecBuild>
```
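
For illustration, an env file on a module-based system might look like the sketch below. The module
names and the ecBuild location are hypothetical and will differ from system to system. Only
`ecbuild_DIR` itself comes from the instructions above.

```bash
# env.sh - sketch only; module names and paths are hypothetical, adapt them to your system.
if command -v module >/dev/null 2>&1; then
  module load cmake    # CMake >= 3.12
  module load nvhpc    # compiler suite (matches the NVHPC toolchain example)
  module load fftw     # FFTW
  module load lapack   # a library providing BLAS routines
fi

# Let CMake find the cloned ecBuild repository (assumed here to live in $HOME/ecbuild).
export ecbuild_DIR="$HOME/ecbuild"
```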

We will assume both the env file (env.sh) and the toolchain file (toolchain.cmake) are in the
current directory. Now source the env file:

```bash
source env.sh
```

## Building FIAT

First clone the latest version of the FIAT repository:

```bash
git clone https://github.com/ecmwf-ifs/fiat.git -b 1.4.1
```

Then run the configure step for FIAT (you can leave out `-DCMAKE_TOOLCHAIN_FILE` if you don't want
to bother with toolchain files):

```bash
cmake -S fiat -B fiat/build -DCMAKE_TOOLCHAIN_FILE=$PWD/toolchain.cmake
```

Now run the build step (adjusting the level of multithreading as required with `-j`):

```bash
cmake --build fiat/build -j 4
```

## Building ecTrans

Clone the latest version of the ecTrans repository:

```bash
git clone https://github.com/ecmwf-ifs/ectrans.git -b 1.3.2
```

Then run the configure step, making sure to pass the location of the FIAT build directory to CMake:

```bash
cmake -S ectrans -B ectrans/build -DCMAKE_TOOLCHAIN_FILE=$PWD/toolchain.cmake \
-Dfiat_ROOT=$PWD/fiat/build
```

Now run the build step:

```bash
cmake --build ectrans/build -j 4
```

You should now find in the ecTrans build directory the ecTrans libraries in the lib subdirectory
(single- and double-precision versions) and the standalone ecTrans benchmarking binary in the bin
subdirectory. We strongly recommend you run the full CTest suite to check everything's installed
correctly:

```bash
cd ectrans/build
ctest
```
If any of the tests fail, please let us know by
[raising an issue](https://github.com/ecmwf-ifs/ectrans/issues)!
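
When investigating a failure, a couple of standard CTest options are handy. The sketch below
assumes it is run from the directory containing the `ectrans` clone and that the build directory is
`ectrans/build`, as above. `-N` and `--output-on-failure` are generic CTest options rather than
anything specific to ecTrans.

```bash
# Sketch: inspect and re-run the test suite (assumes the ectrans/build directory from above).
build_dir=ectrans/build
if [ -d "$build_dir" ]; then
  # List the registered tests without running them.
  (cd "$build_dir" && ctest -N)
  # Run all tests, printing the output of any test that fails.
  (cd "$build_dir" && ctest --output-on-failure)
  status=$?
else
  echo "build directory $build_dir not found"
  status=1
fi
```

The captured output of a failing test is useful information to include when raising an issue.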