Version 0.1.5

benvanwerkhoven released this 21 Jul 13:51
· 1678 commits to master since this release

Version 0.1.5 brings more flexibility: you can now pass code-generating functions, supply your own functions for verifying kernel output correctness, and use your own names for the thread block dimensions.
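As a rough illustration of what a code-generating function might look like, the sketch below builds CUDA source for a vector add from a dictionary of tunable parameters. It assumes the generator receives one dict of parameter values and returns the kernel source as a string; the names `generate_vector_add`, `threads_per_block`, and `tile` are hypothetical and chosen only for this example, with `threads_per_block` standing in for a user-chosen thread block dimension name.

```python
def generate_vector_add(params):
    """Return CUDA source for a vector add specialised for `params`.

    Assumption: `params` maps tunable parameter names to the values
    selected for one configuration. All names here are hypothetical.
    """
    block = params["threads_per_block"]  # custom name for the x block dimension
    tile = params["tile"]                # elements processed per thread

    # Emit one guarded statement per element handled by a thread, so the
    # loop over `tile` is unrolled at generation time.
    body = "\n".join(
        f"    if (i + {k * block} < n) "
        f"c[i + {k * block}] = a[i + {k * block}] + b[i + {k * block}];"
        for k in range(tile)
    )
    return (
        "__global__ void vector_add(float *c, const float *a, "
        "const float *b, int n) {\n"
        f"    int i = blockIdx.x * {block * tile} + threadIdx.x;\n"
        f"{body}\n"
        "}\n"
    )


# Generate the source for one configuration and inspect it.
source = generate_vector_add({"threads_per_block": 64, "tile": 2})
print(source)
```

A function like this would be handed to the tuner in place of a fixed kernel string, with the custom dimension name declared separately; the exact argument names for doing so should be checked against the documentation for this release.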

Internally, quite a lot has changed in this version. The former runners have been split into strategies and runners, and the way options are passed around within Kernel Tuner has changed dramatically.

From the CHANGELOG:

[0.1.5] - 2017-07-21

Changed

  • option to pass a fraction to the sample runner
  • fixed a bug in memset for OpenCL backend
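To make the idea of sampling a fraction of the search space concrete, here is a minimal sketch that draws a random subset of all parameter combinations. It is not Kernel Tuner's implementation; the function name `sample_configurations` and its `seed` argument are hypothetical, and the sketch simply assumes the fraction is applied to the full Cartesian product of the tunable parameter values.

```python
import itertools
import random


def sample_configurations(tune_params, fraction, seed=None):
    """Return a random subset of the parameter space as a list of dicts.

    Assumption: `fraction` (between 0 and 1) is applied to the size of
    the full Cartesian product; at least one configuration is returned.
    """
    names = list(tune_params)
    # Enumerate every combination of tunable parameter values.
    space = list(itertools.product(*(tune_params[name] for name in names)))
    count = max(1, int(len(space) * fraction))
    rng = random.Random(seed)
    # Sample without replacement so no configuration is benchmarked twice.
    return [dict(zip(names, values)) for values in rng.sample(space, count)]


configs = sample_configurations(
    {"block_size_x": [32, 64, 128, 256], "tile": [1, 2]}, fraction=0.25, seed=0
)
print(len(configs))
```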

Added

  • parallel tuning on single node using Noodles runner
  • option to pass new defaults for block dimensions
  • option to pass a Python function as code generator
  • option to pass custom function for output verification
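A custom output verification function could look roughly like the sketch below. It assumes the function is called with the reference results, the results produced by the kernel, and an absolute tolerance, and that it returns `True` or `False`. The name `verify_output`, the plain-list inputs, and the convention that `None` marks arguments that should not be checked are assumptions made for this example, not a statement of the exact signature the tuner expects.

```python
import math


def verify_output(expected, actual, atol=1e-6):
    """Return True when every checked output matches its reference.

    Assumption: `expected` and `actual` are sequences with one entry per
    kernel argument; entries of `expected` that are None are skipped.
    """
    for reference, output in zip(expected, actual):
        if reference is None:
            continue  # this argument is not an output, so nothing to compare
        if len(reference) != len(output):
            return False
        # Compare element-wise using an absolute tolerance only.
        if not all(
            math.isclose(r, o, rel_tol=0.0, abs_tol=atol)
            for r, o in zip(reference, output)
        ):
            return False
    return True


# The second argument is an input and is therefore skipped.
print(verify_output([[1.0, 2.0], None], [[1.0, 2.0 + 1e-9], [5.0]]))
```

Writing the check by hand is useful when the default comparison is too strict or too loose, for instance when only part of an output buffer is meaningful.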