Benchmarks

Clava comes with an extensible set of pre-packaged benchmark suites, and lets you integrate your own. Using a suite lets you apply code transformations programmatically, one benchmark at a time, and compile and execute each benchmark automatically. This contrasts with the usual Clava workflow, which applies code transformations indiscriminately to every input file and leaves compilation and execution to the developer.

Including a benchmark suite

There are two ways of including a benchmark suite, depending on its origin:

To use the built-in benchmarks: include the suite's URL in the "External dependencies" prompt in the Clava options tab. The built-in suites are available in this repository. For instance, to include the NAS suite, add https://github.com/specs-feup/clava-benchmarks.git?folder=NAS to your external dependencies.
To use your own benchmarks: create a folder structure similar to that of the built-in benchmarks, and write your own JavaScript files with the required logic (e.g., to manage input files/options). Then add your local benchmark folder to the "Includes Folder" prompt in the Clava options tab.
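
Every built-in suite URL in this document follows the same pattern: the repository URL followed by a `?folder=` query naming the suite folder. As a minimal sketch, the helper below builds that string so it can be pasted into the "External dependencies" prompt. Note that `suiteDependencyUrl` and `CLAVA_BENCHMARKS_REPO` are hypothetical names used for illustration; they are not part of Clava.

```javascript
// Hypothetical helper, not part of Clava: builds the external-dependency URL
// for a built-in suite, following the URL pattern used throughout this document.
const CLAVA_BENCHMARKS_REPO = "https://github.com/specs-feup/clava-benchmarks.git";

function suiteDependencyUrl(folder) {
  if (typeof folder !== "string" || folder.length === 0) {
    throw new Error('Expected a non-empty suite folder name, e.g. "NAS"');
  }
  // The folder name is appended as the value of the "folder" query parameter
  return `${CLAVA_BENCHMARKS_REPO}?folder=${encodeURIComponent(folder)}`;
}

console.log(suiteDependencyUrl("NAS"));
// prints https://github.com/specs-feup/clava-benchmarks.git?folder=NAS
```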

Using a benchmark suite

After including a benchmark suite, you can use it as shown in the example below. It covers only basic usage; see the documentation of the BenchmarkSet and BenchmarkInstance classes for more information.

Compilation requires you to have CMake installed on your system, as well as a C/C++ compiler.

// include the benchmark suite(s) you want to use
laraImport("lara.benchmark.RosettaBenchmarkSet");
laraImport("weaver.Query");

function main() {
	//create a BenchmarkSet object
	const benches = new RosettaBenchmarkSet();

	//choose the individual benchmarks within the set
	benches.setBenchmarks(["3d-rendering", "digit-recognition", "face-detection"]);

	//choose the input size(s) to be used during execution
	benches.setInputSizes(["N"]);

	//go through each benchmark (objects of type BenchmarkInstance)
	for (const bench of benches) {
		// loads the benchmark into Clava's AST
		bench.load();

		// now everything is ready for you to do your analysis and transformations
		// in this example, we're just printing the name of every function
		const funNames = [];
		for (const elem of Query.search("function")) {
			if (elem.isImplementation) {
				funNames.push(elem.name);
			}
		}
		println(funNames.join(","));

		// now, we prepare for compilation
		// if you are on Windows, you may wish to choose MinGW instead of the default (MSVC), but
		// this is entirely dependent on your system. Check Clava's CMake docs for more info
		bench.getCMaker().setGenerator("MinGW Makefiles");

		// now we compile the benchmark
		bench.compile();

		// and finally, we execute it
		bench.execute();
	}
}
main();
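
When a set contains many benchmarks, a single failing build should not abort the whole run. The sketch below wraps the loop above in a hypothetical `runAll` helper (not part of Clava) that relies only on the calls already shown (`load`, `compile`, `execute`, and iteration over the set) and collects failures instead of stopping. It assumes that these methods signal failure by throwing, which should be checked against the BenchmarkInstance documentation.

```javascript
// Hypothetical helper, not part of Clava. Assumption: load(), compile() and
// execute() throw on failure; verify this against the BenchmarkInstance docs.
function runAll(benches, transform) {
  const report = { ok: 0, failed: [] };
  let index = 0;

  for (const bench of benches) {
    try {
      bench.load();      // load the benchmark into Clava's AST
      transform(bench);  // user-supplied analysis/transformation step
      bench.compile();   // requires CMake and a C/C++ compiler
      bench.execute();
      report.ok++;
    } catch (error) {
      // Record the position in the set and the reason, then move on
      report.failed.push({ index, message: String(error.message || error) });
    }
    index++;
  }

  return report;
}
```

Inside a Clava script, this would be called as `runAll(benches, (bench) => { /* transformations */ })` after `setBenchmarks` and `setInputSizes`, and the returned report inspected at the end.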

Benchmark Descriptions

In this section, we describe each of our built-in benchmark suites in terms of their possible input sizes, programming language and purpose.

CHStone

CHStone is a set of kernels used to evaluate High-level Synthesis applications, as well as to generate some IP components, such as arithmetic operators. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=CHStone.

Benchmark	Language	Input options	Description
adpcm	C	N	An adaptive differential pulse-code modulation decoder/encoder
aes	C	N	An Advanced Encryption Standard (AES) decoder/encoder
blowfish	C	N	A blowfish encoder/decoder (blowfish is a symmetric-key block cipher)
dfadd	C	N	A double-precision adder implementation for generating a hardware IP
dfdiv	C	N	A double-precision divider implementation for generating a hardware IP
dfmul	C	N	A double-precision multiplier implementation for generating a hardware IP
dfsin	C	N	A double-precision sine function implementation for generating a hardware IP
gsm	C	N	A linear predictive coding analyser for the Global System for Mobile Communications (GSM)
jpeg	C	N	A JPEG image decompressor
mips	C	N	A simplified simulator of a MIPS CPU
motion	C	N	A motion vector decoder for the MPEG-2 video format
sha	C	N	An implementation of the Secure Hash Algorithm (SHA), which produces a hash code of an input

HiFlipVX

HiFlipVX is an object detection library aimed at FPGAs. Since it is a library, we include only the example application it provides, which uses most of the library's functionality to manipulate an input image. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=HiFlipVX.

Benchmark	Language	Input options	Description
v2	C++	N	Object detection library for FPGAs

LSU

LSU is a set of large real-world applications, each distributed as a single source file. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=LSU.

Benchmark	Language	Input options	Description
bzip2	C	SMALL, LARGE	Lossless compression tool
gzip	C	SMALL, LARGE	Lossless compression tool
oggend	C	SMALL	Encoding tool for Ogg Vorbis, a lossy audio compressing scheme
gcc	C	SMALL, LARGE	GNU C compiler

MachSuite

MachSuite is a set of 19 benchmarks designed to mimic low-level kernels suitable for hardware acceleration. The descriptions below are adapted directly from the official documentation. The names of some benchmarks were modified so that they can all exist at the same directory level (e.g., the sort folder has two subfolders with the benchmarks merge and radix; these are flattened into sort-merge and sort-radix). Include with https://github.com/specs-feup/clava-benchmarks.git?folder=MachSuite.

Benchmark	Language	Input options	Description
aes	C	D	The Advanced Encryption Standard, a common block cipher.
backprop	C	D	A simple method for training neural networks.
bfs-bulk	C	D	Data-oriented version of breadth-first search.
bfs-queue	C	D	The “expanding-horizon” version of breadth-first search.
fft-strided	C	D	Recursive formulation of the Fast Fourier Transform.
fft-transpose	C	D	A two-level FFT optimized for a small, fixed-size butterfly.
gemm-blocked	C	D	A blocked version of matrix multiplication, with better locality.
gemm-ncubed	C	D	Naive, O(n^3) algorithm for dense matrix multiplication.
kmp	C	D	The Knuth-Morris-Pratt string matching algorithm.
md-grid	C	D	n-body molecular dynamics, using spatial decomposition to compute only local forces.
md-knn	C	D	n-body molecular dynamics, using k-nearest neighbors to compute only local forces.
nw	C	D	A dynamic programming algorithm for optimal sequence alignment.
sort-merge	C	D	The mergesort algorithm, on an integer array.
sort-radix	C	D	Sorts an integer array by comparing 4-bit blocks at a time.
spmv-crs	C	D	Sparse matrix-vector multiplication, using variable-length neighbor lists.
spmv-ellpack	C	D	Sparse matrix-vector multiplication, using fixed-size neighbor lists.
stencil-2d	C	D	A two-dimensional stencil computation, using a 9-point square stencil.
stencil-3d	C	D	A three-dimensional stencil computation, using a 7-point von Neumann stencil.
viterbi	C	D	A dynamic programming method for computing probabilities on a Hidden Markov model.
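
The flattened names in this table come from joining a MachSuite folder and its subfolder with a hyphen. A minimal sketch of that mapping is shown below; `flattenName` is a hypothetical helper written for illustration and is not part of Clava or MachSuite.

```javascript
// Hypothetical helper: turns a "folder/subfolder" path into the flattened
// benchmark name used in the table (e.g., "sort/merge" becomes "sort-merge").
function flattenName(relativePath) {
  return relativePath
    .split("/")
    .filter((part) => part.length > 0) // ignore leading/trailing slashes
    .join("-");
}

console.log(flattenName("sort/merge")); // prints sort-merge
console.log(flattenName("sort/radix")); // prints sort-radix
```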

NAS

NAS is a set of benchmarks used to evaluate the parallel performance of supercomputers. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=NAS.

Benchmark	Language	Input options	Description
BT	C	S, W, A, B, C, D, E	Block Tri-diagonal solver (application)
CG	C	S, W, A, B, C	Conjugate Gradient, irregular memory access and communication (kernel)
EP	C	S, W, A, B, C, D, E	Embarrassingly Parallel (kernel)
FT	C	S, W, A, B, C, D, E	Discrete 3D Fast Fourier Transform, all-to-all communication (kernel)
IS	C	S, W, A, B, C, D	Integer Sort, random memory access (kernel)
LU	C	S, W, A, B, C, D, E	Lower-Upper Gauss-Seidel solver (application)
MG	C	S, W, A, B, C, D, E	Multi-Grid on a sequence of meshes, long- and short-distance communication, memory intensive (kernel)
SP	C	S, W, A, B, C, D, E	Scalar Penta-diagonal solver (application)
UA	C	S, W, A, B, C, D	Unstructured Adaptive mesh, dynamic and irregular memory access (application)
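
Not every NAS benchmark supports every input class, so it can help to check which benchmarks accept a given size before passing it to `setInputSizes`. The sketch below encodes the table above as a plain object; `NAS_INPUT_SIZES` and `benchmarksSupporting` are hypothetical names used for illustration and are not part of Clava.

```javascript
// Input classes per NAS benchmark, transcribed from the table above
const NAS_INPUT_SIZES = {
  BT: ["S", "W", "A", "B", "C", "D", "E"],
  CG: ["S", "W", "A", "B", "C"],
  EP: ["S", "W", "A", "B", "C", "D", "E"],
  FT: ["S", "W", "A", "B", "C", "D", "E"],
  IS: ["S", "W", "A", "B", "C", "D"],
  LU: ["S", "W", "A", "B", "C", "D", "E"],
  MG: ["S", "W", "A", "B", "C", "D", "E"],
  SP: ["S", "W", "A", "B", "C", "D", "E"],
  UA: ["S", "W", "A", "B", "C", "D"],
};

// Hypothetical helper: returns the names of the benchmarks that accept `size`
function benchmarksSupporting(size) {
  return Object.keys(NAS_INPUT_SIZES).filter((name) =>
    NAS_INPUT_SIZES[name].includes(size)
  );
}

console.log(benchmarksSupporting("E").join(","));
// prints BT,EP,FT,LU,MG,SP
```

The resulting list could then be passed to `setBenchmarks` on a NAS benchmark set, followed by `setInputSizes` with the same size.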

Parboil

Parboil is a suite of complex applications from several fields, such as bioinformatics, physics and mathematics. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=Parboil.

Benchmark	Language	Input options	Description
bfs	C++	1M, NY, SF, UT	A breadth-first search algorithm operating over a graph
cutcp	C++	large, small	Computes the short-range component of Coulombic potential at each grid point over a 3D grid
histo	C++	large, default	Calculates a histogram of 255 bins using data with a Gaussian distribution
lbm	C++	long, short	A fluid dynamics simulation using the Lattice-Boltzmann Method
mri-gridding	C++	small	Converts points from an MR scan into a grid through interpolation, and then applies a Fast Fourier Transform over the grid
mri-q	C++	large, small	MRI calibration matrix using image reconstruction algorithms in non-Cartesian space
sad	C++	large, default	An implementation of a sum of absolute differences kernel, used in MPEG and H.264 decoders
sgemm	C++	medium, small	General purpose dense matrix-matrix multiplication
spmv	C++	large, medium, small	Calculates the product of a sparse matrix and a dense vector
stencil	C++	default, small	An iterative Jacobi stencil operation over a 3D grid
tpacf	C++	large, medium, small	Implementation of a Two Point Angular Correlation Function, used for the statistical analysis of the spatial distribution of astronomical bodies

PolyBench

PolyBench is a set of 30 kernels with static control flow (i.e., no branching). Include with https://github.com/specs-feup/clava-benchmarks.git?folder=Polybench.

Benchmark	Language	Input options	Description
2mm	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	2 Matrix Multiplications (alpha * A * B * C + beta * D)
3mm	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	3 Matrix Multiplications ((A*B)*(C*D))
adi	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Alternating Direction Implicit solver
atax	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Matrix Transpose and Vector Multiplication
bicg	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	BiCG Sub Kernel of BiCGStab Linear Solver
cholesky	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Cholesky Decomposition
correlation	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Correlation Computation
covariance	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Covariance Computation
deriche	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Edge detection filter
doitgen	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Multi-resolution analysis kernel (MADNESS)
durbin	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Toeplitz system solver
fdtd-2d	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	2-D Finite-Difference Time-Domain Kernel
gemm	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Matrix-multiply C=alpha.A.B+beta.C
gemver	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Vector Multiplication and Matrix Addition
gesummv	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Scalar, Vector and Matrix Multiplication
gramschmidt	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Gram-Schmidt decomposition
heat-3d	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Heat equation over 3D data domain
jacobi-1D	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	1-D Jacobi stencil computation
jacobi-2D	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	2-D Jacobi stencil computation
lu	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	LU decomposition
ludcmp	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	LU decomposition followed by Forward Substitution
mvt	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Matrix Vector Product and Transpose
nussinov	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Dynamic programming algorithm for sequence alignment
seidel	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	2-D Seidel stencil computation
symm	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Symmetric matrix-multiply
syr2k	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Symmetric rank-2k update
syrk	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Symmetric rank-k update
trisolv	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Triangular solver
trmm	C	MINI, SMALL, MEDIUM, LARGE, EXTRALARGE	Triangular matrix-multiply

Rosetta

Rosetta is a set of complex image processing and machine learning applications used to evaluate FPGA optimizations. We present a CPU-friendly version in our distribution. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=Rosetta.

Benchmark	Language	Input options	Description
3d-rendering	C++	N	A 3D software renderer
digit-recognition	C++	N	A digit recognition application based on a K-nearest neighbours classifier
face-detection	C++	N	A face detection application based on the Viola-Jones algorithm
optical-flow	C++	current, sintel	An application that calculates the optical flow (i.e., motion vectors) between image frames
spam-filter	C++	N	A Logistic Regression model trained with Stochastic Gradient Descent (SGD)
