clBLAS-2.10.0 Release for ACL 1.0 GA
This clBLAS release is tagged as v2.10 and is part of AMD Compute Libraries (ACL) 1.0 GA. It is based on a merge from the develop branch into the master branch.
- AutoGemm now contains optimized parameters for Fiji GPUs with HBM (High-Bandwidth Memory), alongside the optimized parameters for non-HBM devices, such as Hawaii, from release 2.8. Which kernel selection logic is used can be chosen in CMake.
- Many bug fixes, including:
  - Restore ability to use multiple different devices (not concurrently) via different contexts.
  - AutoGemm works with Python 2 and 3.
  - Better memory cleanup during teardown.
Thank you to the following contributors for this release: @pavanky, @shehzan10, @hughperkins, @ghisvail, @notorca
- The release binaries are built for online compilation only and assume an OpenCL 2.0 compiler. The ASIC name (Hawaii or Fiji) in each binary's name indicates the kernel selection logic used to generate it: use the Fiji version only on Fiji GPUs (because of HBM), and use the Hawaii version for all other (non-HBM) GPUs.
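
The binary choice described above can be sketched as a small helper. This is only an illustration, not part of clBLAS: the function name `pick_release_binary` is hypothetical, and it assumes the caller already has the OpenCL device name string (for example from `CL_DEVICE_NAME`) and that this string contains the ASIC name.

```python
def pick_release_binary(device_name: str) -> str:
    """Return which release binary flavor ("Fiji" or "Hawaii") suits a device.

    Hypothetical helper: assumes `device_name` is the OpenCL device name
    string and that it contains the ASIC name (e.g. "Fiji", "Hawaii").
    """
    name = device_name.strip().lower()
    # Fiji is the only HBM device covered by this release, so it gets the
    # Fiji binary; every other (non-HBM) GPU uses the Hawaii binary.
    return "Fiji" if "fiji" in name else "Hawaii"


if __name__ == "__main__":
    for dev in ("Fiji", "Hawaii", "Tonga"):
        print(dev, "->", pick_release_binary(dev))
```

Anything that is not recognized as Fiji falls through to the Hawaii binary, which mirrors the rule in the release note rather than trying to enumerate every non-HBM ASIC.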