Library code restructure (2021‐2024)
Status as of 24 Jan 2022
The Wannier function representation provided by Wannier90 is extremely useful for the calculation of diverse materials properties and forms an important part in many research workflows. Wannier90 both uses the outputs of various ab initio codes and provides the inputs for various advanced studies which require an efficient real-space representation of the band structure problem. As part of the UK's Computational Collaborative Project for the Study of the Electronic Structure of Condensed Matter (CCP9), we have been working on restructuring the Wannier90 source with the aim of creating a library interface to Wannier90 functionality that is capable of being invoked by an external calling program in a parallel (MPI) environment and that is able to perform at scale. This objective places some new requirements on the Wannier90 code and has motivated some significant changes to the structure of the code since the last release (v3.1.0).
See also the ideas collected in 2016-2017, during one of the Wannier workshops, which are at the core of the design of the library mode (eventually merged into Wannier90 4.0).
Stage 1: [complete]
- Pull request PR #360: explicit passing of arguments to subroutines rather than "use"-ing module variables.
Stage 2: [complete]
- Pull request PR #394
- Further work to eliminate "common blocks" (shared module data), which previously caused the operation of Wannier90 to be highly stateful. Instead of data members and control parameters being set up in shared module data, with subsequent function calls acting on the stored data, library subroutines operate exclusively upon data passed as formal arguments;
- The large number of variables controlling the action of different functions and representing different physical quantities have been grouped together as new defined types, allowing formal argument lists to be shorter and less susceptible to error. As well as making the meaning of different variables more obvious, type checking provides significant protection against errors in argument lists;
- Only a small number of variables and properties of a fundamental nature, where no ambiguity is possible, have not been encapsulated in new data types;
- Input parameters, type definitions and i/o routines required for wannier90.x and postw90.x have been separated from each other. They are now organised in different files to allow the (almost) completely independent operation of the two executables or corresponding library routines;
File | Purpose |
---|---|
src/wannier90_readwrite.F90 | io routines specific to wannier90.x |
src/wannier90_types.F90 | types specific to operation of wannier90.x |
src/readwrite.F90 | io routines used by both wannier90.x and postw90.x |
src/types.F90 | types used by both wannier90.x and postw90.x |
src/postw90/postw90_readwrite.F90 | io routines specific to postw90.x |
src/postw90/postw90_types.F90 | types specific to operation of postw90.x |
- MPI communicators, originating in the external calling program, are passed as explicit arguments to all parallel routines (therefore don't assume MPI_COMM_WORLD but use the `comm%comm` variable); and
- Print output from the code is directed to a stream that is to be provided by the calling program (no assumption is made about this).
Data should not be added to the various modules; instead, new variables should be added to the existing defined types, if appropriate, or a new type should be created and instances of it passed down the call tree as subroutine arguments. New types should be added to src/postw90/postw90_types.F90, src/wannier90_types.F90 or src/types.F90 as is most fitting (see table above).
The top level executables wannier90.x and postw90.x now consist essentially of wrappers around core library routines, calling mpi_init() and mpi_finalize().
Stage 3 [complete]
- Pull request PR #412
- Redesign of the handling of error conditions to consistently defer program exit to the calling program. Error conditions potentially generated in different parts of the code are caught after each subroutine call, returned up the call graph, and passed to the calling function as an error code together with an informative message. This approach has been in part inspired by the work of Bálint Aradi on ErrorFx.
Stage 4 [in preparation]
- Convenience functions are being developed that expose the core Wannier90 functionality--the set of subroutines and their arguments--with more or less complexity and that provide alternative ways (alongside reading of the control file) to set or reset variables and control Wannier90 behaviour.
Following Stage 3, error handling is now more involved. Because Fortran has no mechanism to throw/catch errors, this behaviour is emulated by setting a variable to denote the error condition and immediately returning from the current subroutine. The example below shows how to set the error when an `allocate` call does not return successfully. The variable storing the error state is of type `type(w90_error_type)`; here the instance is called `error`:

    allocate (cwb(num_wann, num_bands), stat=ierr)
    if (ierr /= 0) then
      call set_error_alloc(error, 'Error allocating cwb in dis_extract', comm)
      return
    endif

`type(w90_error_type)` is defined in the `w90_error` module (source file `src/error.F90`). The MPI communicator `comm` argument to the error setting functions (see below) is required because the error system is designed to be aware of error conditions triggered by other MPI ranks.
Instances of `type(w90_error_type)` are used with the `allocatable` attribute and are allocated when an error is encountered.
Immediately after any call to a subroutine or function that is capable of causing an error condition (where an error variable is passed as an argument), the status of the error variable must be tested. The test for an error is whether or not the variable (here called `error`) is allocated.

    if (lsitesymmetry) then
      call sitesym_symmetrize_u_matrix(sitesym, u_matrix, num_bands, num_wann, num_kpts, num_wann, &
                                       stdout, error, comm)
      if (allocated(error)) return
    endif

If the error variable is not checked and is passed, already allocated, to a subsequent subroutine, the code exits with an `untrapped error` message, which is intended to identify programming mistakes (failure to check the error status). This works because the error variable is always declared as `intent(out)`, so the runtime system deallocates the error variable on entry to any subroutine call where it is already allocated. The deallocation invokes an additional subroutine via Fortran's `final` mechanism.
Summary:
- Always pass the error variable of type `type(w90_error_type)`, from the `w90_error` module, to any subroutines that may encounter an error;
- The error variable must have the `intent(out)` property;
- Include a meaningful message when setting an error condition using a `set_error_` subroutine (see table below);
- Immediately `return` after setting an error condition;
- Immediately after every subroutine call, test whether the error variable has been allocated and, if it has, then `return` immediately; and
- The `error` variable and the communicator `comm` should ideally be kept as the final two arguments to subroutines (unless optional arguments are truly necessary).
The error condition can be set by various routines; they differ only in that each stores a different (integer) code for return to the calling program. The different error kinds are defined in src/error.F90.
Subroutine | Intended use |
---|---|
set_error_fatal(err, mesg, comm) | generic fatal error |
set_error_alloc(err, mesg, comm) | error allocating |
set_error_dealloc(err, mesg, comm) | error deallocating |
set_error_mpi(err, mesg, comm) | error encountered in an mpi call |
set_error_input(err, mesg, comm) | erroneous input |
set_error_file(err, mesg, comm) | error on file open |
set_error_unconv(err, mesg, comm) | non-fatal indication that convergence is not achieved |
set_error_warn(err, mesg, comm) | generic non-fatal error (warning) |
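From the calling program's side, the practical consequence is that the returned integer code identifies the kind of error. The Python sketch below shows how a caller might classify such a code. The numeric values and the names `W90ErrorKind`, `W90Error` and `check` are hypothetical, chosen for illustration only (the actual codes are those defined in src/error.F90), and a code of zero is assumed to mean success.

```python
from enum import IntEnum


class W90ErrorKind(IntEnum):
    """Hypothetical codes; the real values are defined in src/error.F90."""
    FATAL = 1
    ALLOC = 2
    DEALLOC = 3
    MPI = 4
    INPUT = 5
    FILE = 6
    UNCONV = 7
    WARN = 8


# Kinds described as non-fatal in the table above.
NON_FATAL = {W90ErrorKind.UNCONV, W90ErrorKind.WARN}


class W90Error(RuntimeError):
    """Raised by the caller when the library reports a fatal condition."""


def check(code, message=""):
    """Return a list of warnings for non-fatal codes; raise for fatal ones.

    A code of zero is taken to mean success (an assumption of this sketch).
    """
    if code == 0:
        return []
    kind = W90ErrorKind(code)
    if kind in NON_FATAL:
        return [f"{kind.name}: {message}"]
    raise W90Error(f"{kind.name}: {message}")
```

With this arrangement the decision to stop stays entirely with the calling program: a non-converged run can be logged and continued, while an allocation failure surfaces as an exception.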
During the workshops in 2016-2017, we identified four main "tasks" to investigate/work on, which could potentially be started in parallel and which form the core design ideas of the "new" library mode eventually released with Wannier90 4.0.
- Convert the code to be "thread-safe" (i.e., no 'saved'/global variables, but pass them all in a global datastructure)
  - no saved data in the modules: it means we need data in derived types (and perhaps the main holding object is a derived type of derived types). This enables the code to handle multiple instances.
  - Re-entrant -> no saved data. Everything deallocated by the end of the run. Or a function to deallocate?
  - Possible idea (at least as a first step): mirror the current structure of the code; every module with public variables is converted to a derived type whose members are the current public variables. They are all grouped in a global 'w90' derived datatype, that is the one passed around
- Properly take care of the error messages: don't stop, but return an error message back to the caller
  - Remove all `stop` calls in functions
  - Note: it's not sufficient to rewrite io_error: if this is called, some internal variables might be set to keep track of the error, but we need to have some kind of 'exception' management that is propagated up to the caller, when io_error is called deep inside a routine
  - all library functions must give a return code to say what error happened (when different from zero). Do we need to pass other info (human-readable information? Or integers/floats to allow the outer codes to do some automatic recovery?)
- Write down a specification document with what we think should be called from a library, and what probably could stay just in the executable
  - e.g.: does it make sense to make library calls for BoltzWann? Or can this stay in the main executable?
  - To be later discussed with the community
  - Explicitly list the outputs of each 'core' library function (and the dependencies)
  - Rough initial discussion:
    - Initialisation
    - Disentangle
    - Wannierise
    - Interpolation (just core routines to get back arrays; the outer caller will decide to pass e.g. the correct k-points for a band structure, get back an array of energies and then take care externally of plotting etc.). Basically a library interface to geninterp. We also need to allow external codes to interpolate other operators besides H (derivatives, and others?)
    - WF on a real-space grid (again probably only get back arrays)
    - Postw90 routines: needs discussion on which need to be in the library; they would all depend on Wannierise (e.g. Berry, Boltzwann, ...)
- Start prototyping an external caller that would use the library (in Fortran, or probably start directly from Python, also to make sure we never use concepts that are too specific to Fortran).
  - Idea behind it: eventually, the main wannier90 should use the library
  - There is still the possibility that some logic is only in wannier_prog, but the core logic should be in the library
  - Understand how to avoid creation of files: probably the library should only set internal arrays, and the caller can get those back. Main wannier_prog will have the plotting/file writing routines (e.g. XSF, ...). We could still have these callable from the library (optionally), though we would probably have to pass a filename from the outside (e.g. if there are multiple instances of W90 running concurrently, called from the same main code, in different MPI communicators)
  - output files, one option: pass an open unit for standard output; think of a library call to open a file and return a handle (and one to close the file) so the library can be called also from non-Fortran codes.
    - Is this a good idea? Maybe for the global output.
    - Otherwise, pass a filename, and Fortran takes care of producing an output?
  - How to get the data into the code? The higher-level program shouldn't need to know about derived types. We should use a flat interface, where we ask Wannier in a library call to allocate a new derived type and return to the outer code an integer handle, which is then passed in each library call to set the internal values.
  - Function calls to perform actions (wannierise) and to return values, e.g. `w90_get_u`, `w90_get_spreads`, `w90_get_centres`.
    - Maybe we want to have a single function to get back properties, that receives a string telling what to return? Something like (Python "pseudocode"):

          the_size = W90SizeObject()
          w90_get_size('energy', the_size)
          numpy_array_var = w90_allocate(the_size)
          w90_get('energy', numpy_array_var)

    - Note: we might want to access interpolated properties too. Aim: write the core routines (e.g. only to get arrays) and developers will add the functions they are interested in, or eventually plot them using either our routines, or their own routines.
- More general discussions (mainly on input/wannier -pp/nnkp file, Amn vs. projections, or k-points):
  - projections are currently in the win file. Instead, the core library should work with `Amn`. Be careful about guiding centres!!
  - Also, maybe the part generating the nnkp can be a different module, and the main module instead takes care of minimisation of the spread.
  - Amn will need to be optional when automatic minimisation is implemented.
  - k-points: add a function to which you can pass `mp_grid` and which returns the list of k-points. These two pieces of information can then be passed to the main library routine. Probably it's ok to keep the redundancy (both `mp_grid` and the explicit list) because in the future it might be possible to pass only the irreducible BZ?
  - how to deal with b-vectors: input to wannier library mode, but we also provide a utility function to avoid people reimplementing it? (Probably, same thing as for the k-points)
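A utility of the kind suggested for the k-points, which turns `mp_grid` into an explicit list, might look like the following Python sketch. The function name `kpoints_from_mp_grid` is hypothetical, and the convention assumed here (a uniform grid in fractional coordinates that includes the Gamma point, with the third index running fastest) is one common choice, not necessarily the one the library would adopt.

```python
def kpoints_from_mp_grid(mp_grid):
    """Return fractional k-points (i/n1, j/n2, k/n3) on a uniform n1 x n2 x n3 grid."""
    n1, n2, n3 = mp_grid
    if min(n1, n2, n3) < 1:
        raise ValueError("mp_grid entries must be positive integers")
    return [
        (i / n1, j / n2, k / n3)
        for i in range(n1)
        for j in range(n2)
        for k in range(n3)
    ]


kpts = kpoints_from_mp_grid((2, 2, 1))
# kpts == [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.0, 0.0), (0.5, 0.5, 0.0)]
```

If both `mp_grid` and the explicit list are passed to the main routine, the redundancy allows a cheap consistency check for the full-grid case: the list length should equal the product of the three grid dimensions.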
- Parallelisation
  - long term project, but need to scope out before library finalised (but currently - March 2017 - we might have it already in place)
  - parallelisation strategies (these are the two extremes, we can tune the # of CPUs for each of the two tasks):
    - Gamma only (parallelise diagonalization of large matrices)
    - lots of k-points (parallelise on k-points)
  - Question: we need to choose how we set the # of CPUs for the two levels, and whether we freeze only these two levels or want to be able to pass parallelisation options more generally so we can extend this in the future
  - From a library point of view, we need to understand how to pass this info. Probably in a very initial setup function call. Check also from Python (the multiprocessing module).
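For the "lots of k-points" extreme, the basic bookkeeping is how many k-points each process owns. A minimal Python sketch of a contiguous block distribution follows. `split_kpoints` is a hypothetical helper, and the rule that the first `remainder` processes receive one extra k-point is an assumption of this sketch, not a statement about what the library does.

```python
def split_kpoints(num_kpts, num_procs):
    """Return (counts, displs): k-points owned by each process and their offsets."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    base, remainder = divmod(num_kpts, num_procs)
    # The first `remainder` processes take one extra k-point each.
    counts = [base + 1 if rank < remainder else base for rank in range(num_procs)]
    # Offset of each process's first k-point in the global list.
    displs = [sum(counts[:rank]) for rank in range(num_procs)]
    return counts, displs


counts, displs = split_kpoints(10, 4)
# counts == [3, 3, 2, 2], displs == [0, 3, 6, 8]
```

The number of processes is the only parallelisation input here, which fits the idea of fixing it once in an initial setup call.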