Benchmarks

Tiago Lascasas dos Santos edited this page Oct 13, 2022 · 7 revisions

Clava comes with an extensible set of pre-packaged benchmark suites and lets you integrate your own. Using a benchmark suite has two main advantages: you can apply code transformations programmatically, one benchmark at a time, and you can compile and execute each benchmark automatically. This contrasts with the usual Clava workflow, which applies code transformations indiscriminately to every input file and leaves compilation and execution to the developer.

Including a benchmark suite

There are two ways of including a benchmark suite, depending on its origin:

  • Built-in benchmarks: to use one of the built-in benchmark suites (available in this repository), include its URL in the "External dependencies" prompt in the Clava options tab. For instance, to include the NAS suite, add https://github.com/specs-feup/clava-benchmarks.git?folder=NAS to your external dependencies.

  • Your own benchmarks: create a folder structure similar to that of the built-in benchmarks, and write your own JavaScript files with the required logic (e.g., to manage input files/options). Then add the suite to Clava by including your local benchmark folder in the "Includes Folder" prompt in the Clava options tab.
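
The URLs for the built-in suites all follow the same pattern: the repository URL followed by a `folder` query parameter. As a minimal sketch, the helper below builds that URL from a suite folder name. `benchmarkSuiteUrl` is a hypothetical name used for illustration and is not part of Clava; the resulting string is what you would paste into the "External dependencies" prompt.

```javascript
// Repository that hosts the built-in suites (taken from the text above).
const CLAVA_BENCHMARKS_REPO = "https://github.com/specs-feup/clava-benchmarks.git";

// Hypothetical helper (not a Clava API): builds the external-dependency URL
// for a suite, given the name of its folder in the repository.
function benchmarkSuiteUrl(folder) {
  if (typeof folder !== "string" || folder.length === 0) {
    throw new Error("folder must be a non-empty string");
  }
  return `${CLAVA_BENCHMARKS_REPO}?folder=${encodeURIComponent(folder)}`;
}

console.log(benchmarkSuiteUrl("NAS"));
// prints "https://github.com/specs-feup/clava-benchmarks.git?folder=NAS"
```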

Using a benchmark suite

After you include your benchmark suite, you can use it as shown below. This is only an example of basic usage; see the documentation of the BenchmarkSet and BenchmarkInstance classes for more information.

Compilation requires you to have CMake installed on your system, as well as a C/C++ compiler.

// include the benchmark suite(s) you want to use
laraImport("lara.benchmark.RosettaBenchmarkSet");
laraImport("weaver.WeaverJps");

function main() {
	//create a BenchmarkSet object
	const benches = new RosettaBenchmarkSet();

	//choose the individual benchmarks within the set
	benches.setBenchmarks(["3d-rendering", "digit-recognition", "face-detection"]);

	//choose the input size(s) to be used during execution
	benches.setInputSizes(["N"]);

	//go through each benchmark (objects of type BenchmarkInstance)
	for (var bench of benches) {
		// loads the benchmark into Clava's AST
		bench.load();

		// now everything is ready for you to do your analysis and transformations
		// in this example, we're just printing the name of every function
		var funNames = [];
		for (var elem of WeaverJps.search("function")) {
			if (elem.isImplementation) {
				funNames.push(elem.name);
			}
		}
		println(funNames.join(","));

		// now, we prepare for compilation
		// the line below selects MinGW, which you may prefer on Windows over the default (MSVC);
		// remove or change it to match your system. Check Clava's CMake docs for more info
		bench.getCMaker().setGenerator("MinGW Makefiles");

		// now we compile the benchmark
		bench.compile();

		// and finally, we execute it
		bench.execute();
	}
}

main()
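
The per-benchmark steps in the example (load, transform, compile, execute) can be factored into a reusable function. The sketch below is one possible way to do that, using only the methods that appear in the example above. `runSuite` and `makeFakeSet` are hypothetical names, not part of Clava. The fake set exists only so the sketch can run outside Clava, and it assumes that iterating a set yields one instance per benchmark and input size pair, which the real BenchmarkSet may or may not do.

```javascript
// Hypothetical helper: runs load/transform/compile/execute on every
// benchmark instance of a set and returns how many instances were processed.
function runSuite(benches, options) {
  const { benchmarks, inputSizes, generator, transform } = options;

  if (benchmarks !== undefined) benches.setBenchmarks(benchmarks);
  if (inputSizes !== undefined) benches.setInputSizes(inputSizes);

  let count = 0;
  for (const bench of benches) {
    bench.load();
    // User-supplied analysis or transformation, applied after loading
    if (transform !== undefined) transform(bench);
    // Only override the CMake generator when one was requested
    if (generator !== undefined) bench.getCMaker().setGenerator(generator);
    bench.compile();
    bench.execute();
    count++;
  }
  return count;
}

// Stand-in for a real benchmark set, so the sketch runs outside Clava.
// Assumption: one instance per (benchmark, input size) pair.
function makeFakeSet(log) {
  let names = [];
  let sizes = [];
  return {
    setBenchmarks(n) { names = n; },
    setInputSizes(s) { sizes = s; },
    *[Symbol.iterator]() {
      for (const name of names) {
        for (const size of sizes) {
          const id = `${name}-${size}`;
          yield {
            load() { log.push(`load ${id}`); },
            getCMaker() {
              return { setGenerator(g) { log.push(`generator ${g}`); } };
            },
            compile() { log.push(`compile ${id}`); },
            execute() { log.push(`execute ${id}`); },
          };
        }
      }
    },
  };
}

const demoLog = [];
const processed = runSuite(makeFakeSet(demoLog), {
  benchmarks: ["3d-rendering", "face-detection"],
  inputSizes: ["N"],
  transform: () => demoLog.push("transform"),
});
console.log(processed, demoLog);
```

Inside Clava you would pass a real set, such as `new RosettaBenchmarkSet()`, in place of the fake one, and omit `generator` to keep the default CMake generator.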

Benchmark Descriptions

In this section, we describe each of our built-in benchmark suites in terms of its possible input sizes, programming language and purpose.

CHStone

CHStone is a set of kernels used to evaluate High-level Synthesis applications, as well as to generate some IP components, such as arithmetic operators. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=CHStone.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| adpcm | C | N | An adaptive differential pulse-code modulation decoder/encoder |
| aes | C | N | An Advanced Encryption Standard (AES) decoder/encoder |
| blowfish | C | N | A Blowfish encoder/decoder (Blowfish is a symmetric-key block cipher) |
| dfadd | C | N | A double-precision adder implementation for generating a hardware IP |
| dfdiv | C | N | A double-precision divider implementation for generating a hardware IP |
| dfmul | C | N | A double-precision multiplier implementation for generating a hardware IP |
| dfsin | C | N | A double-precision sine function implementation for generating a hardware IP |
| gsm | C | N | A linear predictive coding analyser for the Global System for Mobile Communications (GSM) |
| jpeg | C | N | A JPEG image decompressor |
| mips | C | N | A simplified simulator of a MIPS CPU |
| motion | C | N | A motion vector decoder for the MPEG-2 video format |
| sha | C | N | An implementation of SHA (Secure Hash Algorithm) that produces a hash code of an input |

HiFlipVX

HiFlipVX is an object detection library aimed at FPGAs. Since it is a library, we focus only on the example application it provides, which uses most of the library's functionality to manipulate an input image. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=HiFlipVX.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| v2 | C++ | N | Object detection library for FPGAs |

LSU

LSU is a set of large real-world applications, each distributed as a single source file. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=LSU.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| bzip2 | C | SMALL, LARGE | Lossless compression tool |
| gzip | C | SMALL, LARGE | Lossless compression tool |
| oggend | C | SMALL | Encoding tool for Ogg Vorbis, a lossy audio compression scheme |
| gcc | C | SMALL, LARGE | GNU C compiler |

NAS

NAS is a set of benchmarks used to evaluate the parallel performance of supercomputers. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=NAS.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| BT | C | S, W, A, B, C, D, E | Block Tri-diagonal solver (application) |
| CG | C | S, W, A, B, C | Conjugate Gradient, irregular memory access and communication (kernel) |
| EP | C | S, W, A, B, C, D, E | Embarrassingly Parallel (kernel) |
| FT | C | S, W, A, B, C, D, E | Discrete 3D Fast Fourier Transform, all-to-all communication (kernel) |
| IS | C | S, W, A, B, C, D | Integer Sort, random memory access (kernel) |
| LU | C | S, W, A, B, C, D, E | Lower-Upper Gauss-Seidel solver (application) |
| MG | C | S, W, A, B, C, D, E | Multi-Grid on a sequence of meshes, long- and short-distance communication, memory intensive (kernel) |
| SP | C | S, W, A, B, C, D, E | Scalar Penta-diagonal solver (application) |
| UA | C | S, W, A, B, C, D | Unstructured Adaptive mesh, dynamic and irregular memory access (application) |
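
Because not every NAS benchmark supports every input size, it can help to filter the benchmarks before calling `setBenchmarks` and `setInputSizes`. The sketch below encodes the table above as a plain object and selects the benchmarks that support all requested sizes. `NAS_INPUT_SIZES` and `nasBenchmarksSupporting` are hypothetical names for illustration, not part of Clava.

```javascript
// Input sizes per NAS benchmark, copied from the table above.
const NAS_INPUT_SIZES = {
  BT: ["S", "W", "A", "B", "C", "D", "E"],
  CG: ["S", "W", "A", "B", "C"],
  EP: ["S", "W", "A", "B", "C", "D", "E"],
  FT: ["S", "W", "A", "B", "C", "D", "E"],
  IS: ["S", "W", "A", "B", "C", "D"],
  LU: ["S", "W", "A", "B", "C", "D", "E"],
  MG: ["S", "W", "A", "B", "C", "D", "E"],
  SP: ["S", "W", "A", "B", "C", "D", "E"],
  UA: ["S", "W", "A", "B", "C", "D"],
};

// Hypothetical helper: names of the benchmarks that support every size in `sizes`.
function nasBenchmarksSupporting(sizes) {
  return Object.keys(NAS_INPUT_SIZES).filter((name) =>
    sizes.every((size) => NAS_INPUT_SIZES[name].includes(size))
  );
}

console.log(nasBenchmarksSupporting(["E"]).join(","));
// prints "BT,EP,FT,LU,MG,SP"
```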

Parboil

Parboil is a suite of complex applications from several fields, such as bioinformatics, physics and mathematics. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=Parboil.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| bfs | C++ | 1M, NY, SF, UT | A breadth-first search algorithm operating over a graph |
| cutcp | C++ | large, small | Computes the short-range component of Coulombic potential at each grid point over a 3D grid |
| histo | C++ | large, default | Calculates a histogram of 255 bins using data with a Gaussian distribution |
| lbm | C++ | long, short | A fluid dynamics simulation using the Lattice-Boltzmann Method |
| mri-gridding | C++ | small | Converts points from an MR scan into a grid through interpolation, and then applies a Fast Fourier Transform over the grid |
| mri-q | C++ | large, small | MRI calibration matrix using image reconstruction algorithms in non-Cartesian space |
| sad | C++ | large, default | An implementation of a sum of absolute differences kernel, used in MPEG and H.264 decoders |
| sgemm | C++ | medium, small | General purpose dense matrix-matrix multiplication |
| spmv | C++ | large, medium, small | Calculates the product of a sparse matrix and a dense vector |
| stencil | C++ | default, small | An iterative Jacobi stencil operation over a 3D grid |
| tpacf | C++ | large, medium, small | Implementation of a Two Point Angular Correlation Function, used for the statistical analysis of the spatial distribution of astronomical bodies |

PolyBench

PolyBench is a set of 30 kernels with static control flow (i.e., no branching). Include with https://github.com/specs-feup/clava-benchmarks.git?folder=Polybench.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| 2mm | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | 2 Matrix Multiplications (alpha * A * B * C + beta * D) |
| 3mm | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | 3 Matrix Multiplications ((A * B) * (C * D)) |
| adi | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Alternating Direction Implicit solver |
| atax | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Matrix Transpose and Vector Multiplication |
| bicg | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | BiCG Sub Kernel of BiCGStab Linear Solver |
| cholesky | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Cholesky Decomposition |
| correlation | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Correlation Computation |
| covariance | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Covariance Computation |
| deriche | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Edge detection filter |
| doitgen | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Multi-resolution analysis kernel (MADNESS) |
| durbin | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Toeplitz system solver |
| fdtd-2d | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | 2-D Finite-Difference Time-Domain Kernel |
| gemm | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Matrix-multiply C=alpha.A.B+beta.C |
| gemver | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Vector Multiplication and Matrix Addition |
| gesummv | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Scalar, Vector and Matrix Multiplication |
| gramschmidt | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Gram-Schmidt decomposition |
| heat-3d | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Heat equation over 3D data domain |
| jacobi-1D | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | 1-D Jacobi stencil computation |
| jacobi-2D | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | 2-D Jacobi stencil computation |
| lu | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | LU decomposition |
| ludcmp | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | LU decomposition followed by Forward Substitution |
| mvt | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Matrix Vector Product and Transpose |
| nussinov | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Dynamic programming algorithm for sequence alignment |
| seidel | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | 2-D Seidel stencil computation |
| symm | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Symmetric matrix-multiply |
| syr2k | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Symmetric rank-2k update |
| syrk | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Symmetric rank-k update |
| trisolv | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Triangular solver |
| trmm | C | MINI, SMALL, MEDIUM, LARGE, EXTRALARGE | Triangular matrix-multiply |

Rosetta

Rosetta is a set of complex image processing and machine learning applications used to evaluate FPGA optimizations. We present a CPU-friendly version in our distribution. Include with https://github.com/specs-feup/clava-benchmarks.git?folder=Rosetta.

| Benchmark | Language | Input options | Description |
| --- | --- | --- | --- |
| 3d-rendering | C++ | N | A 3D software renderer |
| digit-recognition | C++ | N | A digit recognition application based on a K-nearest neighbours classifier |
| face-detection | C++ | N | A face detection application based on the Viola-Jones algorithm |
| optical-flow | C++ | current, sintel | An application that calculates the optical flow (i.e., motion vectors) between image frames |
| spam-filter | C++ | N | A Logistic Regression model trained with Stochastic Gradient Descent (SGD) |