This style guide summarizes code conventions used in ProbNum. This is intended as a reference for developers.
ProbNum uses Black's formatting ruleset, which can be viewed as a strict subset of PEP 8, and we recommend Black for automated code formatting.
With respect to code style, the Google Python Style Guide should be applied with some notable additions and exceptions (e.g. docstrings). We summarize and expand on this style guide below.
Use absolute imports over relative imports.
- import x for importing packages and modules.
- from x import y where x is the package prefix and y is the module name with no prefix.
- from x import y as z if two modules named y are to be imported or if y is an inconveniently long name.
- import y as z only when z is a standard abbreviation (e.g. np for numpy).
Use __all__ = [...] in __init__.py files to fix the order in which the methods are visible in the documentation. This also avoids importing unnecessary functions via import statements of the form from ... import *.
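The effect of __all__ on star imports can be illustrated with a small sketch. The module name demo_pkg and its functions are hypothetical stand-ins for a package's __init__.py; they are built in memory only so that the example runs without any files on disk.

```python
import sys
import types

# Hypothetical stand-in for the namespace created by a package's __init__.py.
demo = types.ModuleType("demo_pkg")
demo.solve = lambda: "solve"    # meant to be public
demo.helper = lambda: "helper"  # meant to stay internal
demo.__all__ = ["solve"]        # only these names are exported by `import *`
sys.modules["demo_pkg"] = demo

# Run a star import in a fresh namespace and collect what was imported.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Only solve ends up in the importing namespace; helper remains reachable as demo_pkg.helper but is not pulled in by the star import.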
Many classes and functions are "pulled up" to a higher-level namespace via __init__.py files. Import from there wherever there is no chance of confusion or circular imports. This makes imports more readable. When changing the namespace of a class, make sure to correct the module path in the documentation by adding SuperClass.__module__ = "probnum.module" to the corresponding __init__.py.
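As a minimal sketch of what this assignment does, the following defines a placeholder class and overrides its module path. SuperClass and "probnum.module" are the placeholders used in the text above, not real ProbNum names.

```python
class SuperClass:
    """Placeholder for a class implemented in a private module."""


# Where Python thinks the class lives before the override.
original_module = SuperClass.__module__

# Point documentation tools at the public namespace instead.
SuperClass.__module__ = "probnum.module"

print(original_module, "->", SuperClass.__module__)
```

Documentation generators that read __module__ would then list the class under the public path rather than under the file in which it is implemented.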
If imports are shortened, the following conventions should be used. Full import paths are always acceptable.
import probnum as pn
from probnum import randvars, linalg, diffeq, statespace
Exceptions to these rules are type-related modules, which include typing and probnum.type. Types are always imported directly.
from typing import Optional, Callable
from probnum.type import FloatArgType
Please do not abbreviate import paths unnecessarily. We do not use the following imports:
- import probnum.random_variables as pnrv or import probnum.filtsmooth as pnfs (correct would be from probnum import randvars, filtsmooth)
- from probnum import random_variables as rvs or import probnum.random_variables as rvs (the randvars name is sufficiently short and does not need to be abbreviated)
While all of these rules obey the Google Python Style Guide, we use one import convention that deviates from this guide. If two objects (functions, classes) share the same namespace (e.g. RandomVariable and Normal are both imported via probnum.randvars, but their implementations are in different files, randvars/_randomvariable.py and randvars/_normal.py) and one object needs to be imported into the module of the other object, use relative imports. For instance, in randvars/_normal.py the import reads
from ._randomvariable import RandomVariable
which helps with making the code in Normal more compact and readable.
Many types representing numeric values, shapes, dtypes, random states, etc. have different possible representations. For example, a shape could be specified in the following ways: n, (n,), (n, 1), [n], [n, 1].
For this reason most types should be standardized internally to a core set of types defined in probnum.type, e.g. for numeric types np.generic, np.ndarray. Methods for input argument standardization can be found in probnum.utils.argutils.
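The idea of standardizing arguments can be sketched for the shape example above. The helper normalize_shape and the alias ShapeArg are hypothetical names introduced for illustration; they are not claimed to match the functions in probnum.utils.argutils.

```python
import numbers
from typing import Sequence, Tuple, Union

# Hypothetical alias for everything a caller may pass as a shape.
ShapeArg = Union[int, Sequence[int]]


def normalize_shape(shape: ShapeArg) -> Tuple[int, ...]:
    """Convert an integer or a list/tuple of integers to a tuple of ints."""
    if isinstance(shape, numbers.Integral):
        return (int(shape),)
    if isinstance(shape, (tuple, list)) and all(
        isinstance(dim, numbers.Integral) for dim in shape
    ):
        return tuple(int(dim) for dim in shape)
    raise TypeError(f"Cannot interpret {shape!r} as a shape.")


# All of these user inputs end up in the same internal representation.
print(normalize_shape(3), normalize_shape((3,)), normalize_shape([3, 1]))
```

Downstream code then only has to handle tuples of ints, no matter which of the accepted forms the caller used.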
The package itself is written "ProbNum", except when referred to as a package import, in which case probnum should be used.
- joined_lower for functions, methods, attributes, variables
- joined_lower or ALL_CAPS for constants
- StudlyCaps for classes
- camelCase only to conform to pre-existing conventions, e.g. in unittest
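A short sketch of these casing rules side by side. All names (MAX_NUM_STEPS, StepRule, next_time) are made up for illustration and do not refer to ProbNum objects.

```python
MAX_NUM_STEPS = 100  # ALL_CAPS for a constant


class StepRule:  # StudlyCaps for a class
    def __init__(self, step_size: float):
        self.step_size = step_size  # joined_lower for an attribute

    def next_time(self, current_time: float) -> float:  # joined_lower for a method
        return current_time + self.step_size


step_rule = StepRule(step_size=0.5)  # joined_lower for a variable
print(step_rule.next_time(1.0))
```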
Function names and signatures of PN methods attempt to replicate numpy or scipy naming conventions. For example
- probsolve_ivp(...) (scipy: solve_ivp(...))
- problinsolve(...) (scipy: linalg.solve(...))
Methods with "Bayesian" in the name come with the prefix bayes, e.g. bayesquad (Bayesian quadrature), BayesFilter (Bayesian filter), BayesSmoother (Bayesian smoother).
The way an object is represented in the console or printed is defined by the following functions:
- repr(obj) is defined by obj.__repr__() and should return a developer-friendly representation of obj. If possible, this should be code that can recreate obj.
- str(obj) is defined by obj.__str__() and should return a user-friendly representation of obj. If no .__str__() method is implemented, Python will fall back to the .__repr__() method.
As an example, consider numpy's array representation
array([[1, 0],
       [0, 1]])
versus its output of str.
[[1 0]
 [0 1]]
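The same distinction can be sketched for a user-defined class. The class Interval below is hypothetical and exists only to show both methods.

```python
class Interval:
    """Hypothetical closed interval, used only to illustrate repr vs. str."""

    def __init__(self, lower: float, upper: float):
        self.lower = lower
        self.upper = upper

    def __repr__(self) -> str:
        # Developer-friendly: valid code that recreates the object.
        return f"Interval(lower={self.lower!r}, upper={self.upper!r})"

    def __str__(self) -> str:
        # User-friendly: compact mathematical notation.
        return f"[{self.lower}, {self.upper}]"


unit_interval = Interval(0.0, 1.0)
print(repr(unit_interval))
print(str(unit_interval))
```

If __str__ were removed from this class, str(unit_interval) would fall back to the __repr__ output.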
Stick to the first few letters for abbreviations if they are sufficiently descriptive:
- cov: covariance
- fun: function
- mat: matrix
- vec: vector
- arr: array; wherever applicable, specify vec or mat
Further conventions are
- unit2unit: convert between types or units, e.g. mat2arr: convert matrix to array or s2ms: convert seconds to milliseconds. Can also be used for simple adapter methods, along the lines of filt2odefilt.
- proj: projection (if required: projmat, projvec, projlinop, ...)
- precon: preconditioner
- driftmat: drift-matrix, forcevec: force-vector, dispmat: dispersion-matrix, dynamicsmat: dynamics-matrix, diffmat: diffusion-matrix, plus the respective driftmatfun, driftfun, dispmatfun, etc.
- inv*: for inverse of a matrix; e.g. invprecond, invcovmat, ...
- optional arguments via **kwargs, e.g.: fun(t, x, **kwargs)
- msg: message, e.g. for raising errors and issuing warnings (errmsg, warnmsg)
- rv: random variable; if concatenated with e.g. init, abbreviate to initrv (initial random variable)
- data: data (don't abbreviate that one)
- functions/methods that do something from time t0 to time t1 with step size h use the signature (start, stop, step, **kwargs) or any corresponding subset of that. This is in line with np.arange for instance. Use it like (start=t0, stop=t1, step=h, **kwargs).
- jacob: Jacobian, if necessary use jacobfun. Hessians are hess, respectively hessfun.
- param(s): parameter(s). If abbreviations are necessary (e.g. in inline-function definition), use par(s).
- Indices via idx (either idx1, idx2, ... or idx, jdx, kdx) and not via i, j, k. The former is more readable (and follows PEP 8); the latter may be confused with the imaginary unit, which Python writes with a j suffix (e.g. 1j).
- A function maps from its domain to its range. The range of a random variable is the domain of its distribution.
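A few of the conventions above can be sketched together: a unit2unit converter, the (start, stop, step) signature, and idx as the index name. The function timegrid and its half-open behavior (mirroring np.arange) are assumptions made for this illustration, not part of ProbNum.

```python
import math
from typing import List


def s2ms(seconds: float) -> float:
    """unit2unit convention: convert seconds to milliseconds."""
    return 1000.0 * seconds


def timegrid(start: float, stop: float, step: float) -> List[float]:
    """Hypothetical helper: time points in [start, stop) with spacing step."""
    if step <= 0:
        errmsg = "The step size must be positive."
        raise ValueError(errmsg)
    num_steps = math.ceil((stop - start) / step)
    # Index via idx rather than i, as recommended above.
    return [start + idx * step for idx in range(num_steps)]


t0, t1, h = 0.0, 1.0, 0.25
grid = timegrid(start=t0, stop=t1, step=h)
print(grid, s2ms(h))
```

Calling the function with keyword arguments, as in timegrid(start=t0, stop=t1, step=h), keeps the call site readable even when the local variables use the mathematical names.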
- Stick to the built-in python exceptions (TypeError, NotImplementedError, ...)
- If a dunder method is not implemented for a type, return NotImplemented
- Warnings via warnings.warn(). See https://docs.python.org/2/library/warnings.html or https://docs.python.org/2/library/exceptions.html#exceptions.Warning.
- Recall the difference between TypeError and ValueError
  - TypeError: Passing arguments of the wrong type (e.g. passing a list when an int is expected) should result in a TypeError. Example: float(['5 ']) since a list cannot be converted to float.
  - ValueError: Raised when a built-in operation or function receives an argument that has the right type but an inappropriate value. Example: The float function can take a string, i.e. float('5'), but float('string') fails since 'string' is a non-convertible string.
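The rules above can be combined in one small sketch. The class Meters and the function check_step are hypothetical and only serve to show where each exception, NotImplemented, and warnings.warn() would typically appear.

```python
import warnings


class Meters:
    """Hypothetical non-negative length, used only for illustration."""

    def __init__(self, value):
        if not isinstance(value, (int, float)):
            # Wrong type altogether -> TypeError.
            raise TypeError(f"Expected a number, got {type(value).__name__}.")
        if value < 0:
            # Right type, inappropriate value -> ValueError.
            raise ValueError("A length must be non-negative.")
        self.value = float(value)

    def __add__(self, other):
        if not isinstance(other, Meters):
            # Let Python try the reflected operation or raise TypeError itself.
            return NotImplemented
        return Meters(self.value + other.value)


def check_step(step: float) -> float:
    """Warn, rather than fail, about a questionable but valid argument."""
    if step > 1.0:
        warnmsg = "A large step size may reduce accuracy."
        warnings.warn(warnmsg, UserWarning)
    return step


total = Meters(1.0) + Meters(2.0)
print(total.value)
```

Because __add__ returns NotImplemented instead of raising, an expression such as Meters(1.0) + 2 ends in a TypeError raised by Python once no operand handles the operation.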
- low (shortened lower caps) for modules/folders in the namespace, e.g. probnum.linalg.linops
- lower for modules/folders not in the namespace, e.g. probnum/linalg/linearsolvers. Rule of thumb: the more low-level the module is, the longer (more descriptive) the file name can be, because the chances that access is provided through higher-level namespaces are rather high.
Interfaces to PN methods should be in a separate module, while their implementation (in classes) is in the same folder in other files.
All documentation is written in American English. Every publicly visible class or function must have a docstring. Do not use extensive documentation as a crutch for spaghetti code -- divide and conquer instead!