Merge branch 'develop' into master
Bump version to v2.2

Conflicts:
	README.md
	src/CMakeLists.txt
Kent Knox committed Jun 19, 2014
2 parents ac0cb67 + f2de5e7 commit 54e949e
Showing 140 changed files with 4,417 additions and 1,294 deletions.
44 changes: 44 additions & 0 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
language: cpp

compiler:
- gcc

before_install:
- sudo apt-get update -qq
- sudo apt-get install -qq fglrx opencl-headers libboost-program-options-dev libgtest-dev
# Uncomment below to help verify the installs above work
# - ls -la /usr/lib/libboost*
# - ls -la /usr/include/boost
# - ls -la /usr/src/gtest

install:
- mkdir -p bin/gTest
- cd bin/gTest
- cmake -DCMAKE_BUILD_TYPE=Release /usr/src/gtest
- make
- sudo mv libg* /usr/lib

before_script:
- cd ${TRAVIS_BUILD_DIR}
- mkdir -p bin/clBLAS
- cd bin/clBLAS
- cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_TEST=OFF -DBUILD_CLIENT=ON ../../src

script:
- make install
# - ls -Rla package
# Run a simple test to validate that the build works; CPU device in a VM
- cd package/bin
- export LD_LIBRARY_PATH=${TRAVIS_BUILD_DIR}/bin/clBLAS/package/lib64:${LD_LIBRARY_PATH}
- ./client --cpu

after_success:
- cd ${TRAVIS_BUILD_DIR}/bin/clBLAS
- make package

notifications:
email:
- [email protected]
on_success: change
on_failure: always

31 changes: 0 additions & 31 deletions CHANGELOG
@@ -243,34 +243,3 @@ For example:
./example_sgemm
- Run a simple client; one example is provided for each supported main
BLAS function family.
_______________________________________________________________________________
(C) 2010-2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD
Arrow logo, ATI, the ATI logo, Radeon, FireStream, FireGL, Catalyst, and
combinations thereof are trademarks of Advanced Micro Devices, Inc. Microsoft
(R), Windows, and Windows Vista (R) are registered trademarks of Microsoft
Corporation in the U.S. and/or other jurisdictions. OpenCL and the OpenCL logo
are trademarks of Apple Inc. used by permission by Khronos. Other names are for
informational purposes only and may be trademarks of their respective owners.

The contents of this document are provided in connection with Advanced Micro
Devices, Inc. ("AMD") products. AMD makes no representations or warranties with
respect to the accuracy or completeness of the contents of this publication and
reserves the right to make changes to specifications and product descriptions
at any time without notice. The information contained herein may be of a
preliminary or advance nature and is subject to change without notice. No
license, whether express, implied, arising by estoppel or otherwise, to any
intellectual property rights is granted by this publication. Except as set forth
in AMD's Standard Terms and Conditions of Sale, AMD assumes no liability
whatsoever, and disclaims any express or implied warranty, relating to its
products including, but not limited to, the implied warranty of
merchantability, fitness for a particular purpose, or infringement of any
intellectual property right.

AMD's products are not designed, intended, authorized or warranted for use as
components in systems intended for surgical implant into the body, or in other
applications intended to support or sustain life, or in any other application
in which the failure of AMD's product could create a situation where personal
injury, death, or severe property or environmental damage may occur. AMD
reserves the right to discontinue or make changes to its products at any time
without notice.
_______________________________________________________________________________
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -8,7 +8,7 @@ Firstly, in order to contribute code to this project, a contributor must have a
* After forking, the contributor [clones their repository](https://help.github.com/articles/create-a-repo) locally on their machine
* Code is developed and checked into the contributor's repository. These commits are eventually pushed upstream to their GitHub repository
* The contributor then issues a [pull-request](https://help.github.com/articles/using-pull-requests) against the **develop** branch of this repository, which is the [git flow](http://nvie.com/posts/a-successful-git-branching-model/) workflow which is well suited for working with GitHub
* A [git extention](https://github.com/nvie/gitflow) has been developed to ease the use of the 'git flow' methodology, but requires manual installation by the user. Refer to the projects wiki
* A [git extension](https://github.com/nvie/gitflow) has been developed to ease the use of the 'git flow' methodology, but requires manual installation by the user. Refer to the projects wiki

At this point, the repository maintainers will be notified by GitHub that a 'pull request' exists pending against their repository. A code review should be completed within a few days, depending on the scope of submitted code, and the code will either be accepted, rejected or commented on for extra feedback.

@@ -32,5 +32,5 @@ guidelines over time
Pull requests will be reviewed by the set of collaborators that are assigned for the repository. Pull requests may be accepted, declined or a conversation may start on the pull request thread with feedback. If the pull request is trivial and all the submission guidelines defined above are honored, the pull request may be accepted without delay. If the pull request is good, but the guidelines defined above are not followed, the collaborators may leave feedback on the pull request and engage in a conversation with the contributor with what they can do to improve the pull request. At any time, collaborators may decline a pull request if they decide the contribution is not appropriate for the project, or the feedback from reviewers on a pull request is not being addressed in an appropriate amount of time.

## Is it possible to become an official collaborator of the repository?
Yes, we hope to promote trusted members of the community, who have proven themselves to be competent and request to take on the extra responsibility to be official collaborators of the project. When an individual requests to be an official collaborator, current project collaborators will browse through the history of the requester's prior pull requests and take a vote amongst themselves if the requester should be promoted to collaborator. These individuals will then have the right to approve/decline pull requests and help shape the path that the project goes. It is worth noting, that on GitHub everybody has read-only access to the source and that everybody has the ability to issue a pull request to contribute to the project. The benefit of being a repository collaborator allows you to be able to be able to manage other peoples pull requests.
Yes, we hope to promote trusted members of the community, who have proven themselves to be competent and request to take on the extra responsibility to be official collaborators of the project. When an individual requests to be an official collaborator, current project collaborators will browse through the history of the requester's prior pull requests and take a vote amongst themselves if the requester should be promoted to collaborator. These individuals will then have the right to approve/decline pull requests and help shape the path that the project goes. It is worth noting, that on GitHub everybody has read-only access to the source and that everybody has the ability to issue a pull request to contribute to the project. The benefit of being a repository collaborator allows you to be able to manage other peoples pull requests.

147 changes: 101 additions & 46 deletions README.md
@@ -1,78 +1,110 @@
clBLAS
=====
[![Build Status](https://travis-ci.org/clMathLibraries/clBLAS.png)](https://travis-ci.org/clMathLibraries/clBLAS)


This repository houses the code for the OpenCL™ BLAS portion of clMath.
The complete set of BLAS level 1, 2 & 3 routines is implemented. Please
see Netlib BLAS for the list of supported routines. In addition to GPU
devices, the library also supports running on CPU devices to facilitate
debugging and multicore programming. APPML 1.10 is the most current
generally available pre-packaged binary version of the library available
for download for both Linux and Windows platforms.

The primary goal of clBLAS is to make it easier for developers to
utilize the inherent performance and power efficiency benefits of
heterogeneous computing. clBLAS interfaces do not hide nor wrap OpenCL
interfaces, but rather leaves OpenCL state management to the control of
the user to allow for maximum performance and flexibility. The clBLAS
library does generate and enqueue optimized OpenCL kernels, relieving
the user from the task of writing, optimizing and maintaining kernel
code themselves.

clMATH is a software library containing FFT and BLAS functions written in OpenCL. In addition to GPU devices, the libraries also support running on CPU devices to facilitate debugging and multicore programming.
## clBLAS library user documentation

<a href="http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-math-libraries/">APPML 1.10</a> is the most current generally available version of the library, and pre-built binaries are available for download on both Linux and Windows platforms.
[Library and API documentation][] for developers is available online as
a GitHub Pages website

This repository houses the code for the OpenCL™ BLAS portion of APPML. The complete set of BLAS level 1, 2 & 3 routines has been implemented. Please see <a href="http://www.netlib.org/blas/index.html"> Netlib BLAS </a> for the list of routines. For more information on supported graphics cards, see the <a href="http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/system-requirements-driver-compatibility/">AMD APP System Requirements</a>.
### Google Groups

The primary goal of clBLAS is to make it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing. clBLAS interfaces do not hide nor wrap OpenCL interfaces, but rather leaves OpenCL state management to the control of the user to allow for maximum performance and flexibility. The clBLAS library does generate and enqueue optimized OpenCL kernels, relieving the user from the task of writing, optimizing and maintaining kernel code themselves.
Two mailing lists have been created for the clMath projects:

## clBLAS library user documentation
[Library and API documentation]( http://clmathlibraries.github.io/clBLAS/ ) for developers is available online as a GitHub Pages website
- [[email protected]][] - group whose focus is to answer
questions on using the library or reporting issues

- [[email protected]][] - group whose focus is for
developers interested in contributing to the library code itself

## clBLAS Wiki
The [project wiki](https://github.com/clMathLibraries/clBLAS/wiki) contains helpful documentation, including a [build primer](https://github.com/clMathLibraries/clBLAS/wiki/Build)

The [project wiki][] contains helpful documentation, including a [build
primer][]

## Contributing code
Please refer to and read the [Contributing](CONTRIBUTING.md) document for guidelines on how to contribute code to this open source project

Please refer to and read the [Contributing][] document for guidelines on
how to contribute code to this open source project. The code in the
/master branch is considered to be stable, and all pull-requests should
be made against the /develop branch.

## License
The source for clFFT is licensed under the [Apache License, Version 2.0]( http://www.apache.org/licenses/LICENSE-2.0 )

The source for clBLAS is licensed under the [Apache License, Version
2.0][]

## Example
The simple example below shows how to use clBLAS to compute an OpenCL accelerated SGEMM

```c
#include <sys/types.h>
#include <stdio.h>
The simple example below shows how to use clBLAS to compute an OpenCL
accelerated SGEMM

/* Include the clBLAS header. It includes the appropriate OpenCL headers
#include <sys/types.h>
#include <stdio.h>

/* Include the clBLAS header. It includes the appropriate OpenCL headers
*/
#include <clBLAS.h>
#include <clBLAS.h>

/* This example uses predefined matrices and their characteristics for
/* This example uses predefined matrices and their characteristics for
* simplicity purpose.
*/

#define M 4
#define N 3
#define K 5
#define M 4
#define N 3
#define K 5

static const cl_float alpha = 10;
static const cl_float alpha = 10;

static const cl_float A[M*K] = {
static const cl_float A[M*K] = {
11, 12, 13, 14, 15,
21, 22, 23, 24, 25,
31, 32, 33, 34, 35,
41, 42, 43, 44, 45,
};
static const size_t lda = K; /* i.e. lda = K */
};
static const size_t lda = K; /* i.e. lda = K */

static const cl_float B[K*N] = {
static const cl_float B[K*N] = {
11, 12, 13,
21, 22, 23,
31, 32, 33,
41, 42, 43,
51, 52, 53,
};
static const size_t ldb = N; /* i.e. ldb = N */
};
static const size_t ldb = N; /* i.e. ldb = N */

static const cl_float beta = 20;
static const cl_float beta = 20;

static cl_float C[M*N] = {
static cl_float C[M*N] = {
11, 12, 13,
21, 22, 23,
31, 32, 33,
41, 42, 43,
};
static const size_t ldc = N; /* i.e. ldc = N */
};
static const size_t ldc = N; /* i.e. ldc = N */

static cl_float result[M*N];
static cl_float result[M*N];

int main( void )
{
int main( void )
{
cl_int err;
cl_platform_id platform = 0;
cl_device_id device = 0;
@@ -138,25 +170,48 @@ int main( void )
clReleaseContext( ctx );

return ret;
}
```
}

## Build dependencies

### Library for Windows
* Windows® 7/8
* Visual Studio 2010 SP1
* An OpenCL SDK, such as APP SDK 2.8
* Latest CMake

- Windows® 7/8

- Visual Studio 2010 SP1, 2012

- An OpenCL SDK, such as APP SDK 2.9

- Latest CMake

### Library for Linux
* GCC 4.6 and onwards
* An OpenCL SDK, such as APP SDK 2.8
* Latest CMake

- GCC 4.6 and onwards

- An OpenCL SDK, such as APP SDK 2.9

- Latest CMake

### Library for Mac OSX

- Recommended to generate Unix makefiles with cmake

### Test infrastructure
* Latest Googletest
* Latest ACML
* Latest Boost

- Googletest v1.6

- ACML on windows/linux; Accelerate on Mac OSX

- Latest Boost

### Performance infrastructure
* Python

- Python

[Library and API documentation]: http://clmathlibraries.github.io/clBLAS/
[[email protected]]: https://groups.google.com/forum/#!forum/clmath
[[email protected]]: https://groups.google.com/forum/#!forum/clmath-developers
[project wiki]: https://github.com/clMathLibraries/clBLAS/wiki
[build primer]: https://github.com/clMathLibraries/clBLAS/wiki/Build
[Contributing]: CONTRIBUTING.md
[Apache License, Version 2.0]: http://www.apache.org/licenses/LICENSE-2.0
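
As a cross-check for the SGEMM example in the README diff above, here is a minimal CPU reference sketch of the operation SGEMM computes, C = alpha * A * B + beta * C. It assumes row-major storage with no transposition. That assumption is consistent with `lda = K`, `ldb = N` and `ldc = N` in the example, but it cannot be confirmed from the collapsed part of the diff. The function name `sgemm_ref` is hypothetical and is not part of the clBLAS API.

```c
#include <assert.h>
#include <stddef.h>

/* Reference (CPU-only) SGEMM for row-major, non-transposed operands:
 *     C = alpha * A * B + beta * C
 * A is M x K (leading dimension lda), B is K x N (ldb), C is M x N (ldc).
 * sgemm_ref is a hypothetical helper name; it is not a clBLAS function. */
void sgemm_ref(size_t M, size_t N, size_t K,
               float alpha, const float *A, size_t lda,
               const float *B, size_t ldb,
               float beta, float *C, size_t ldc)
{
    /* In row-major layout each leading dimension must cover a full row. */
    assert(lda >= K && ldb >= N && ldc >= N);

    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (size_t k = 0; k < K; ++k)
                acc += A[i * lda + k] * B[k * ldb + j];
            /* Scale the product and blend in the existing C entry. */
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}
```

With the README's matrices (M = 4, N = 3, K = 5, alpha = 10, beta = 20), calling `sgemm_ref(M, N, K, alpha, A, lda, B, ldb, beta, C, ldc)` gives `C[0] = 10 * 2115 + 20 * 11 = 21370`. Under the same layout assumption, that value can be compared against the `result` buffer read back from the device.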
26 changes: 13 additions & 13 deletions doc/clBLAS.doxy
@@ -52,7 +52,7 @@ PROJECT_LOGO =
# If a relative path is entered, it will be relative to the location
# where doxygen was started. If left blank the current directory will be used.

OUTPUT_DIRECTORY = F:\code\git-svn\clBLAS.head\bin\master\vs10x64.superbuild\docs
OUTPUT_DIRECTORY = ..\..\bin\clBLAS.doxy

# If the CREATE_SUBDIRS tag is set to YES, then doxygen will create
# 4096 sub-directories (in 2 levels) under the output directory of each output
@@ -651,17 +651,17 @@ WARN_LOGFILE =
# directories like "/usr/src/myproject". Separate the files or directories
# with spaces.

INPUT = clBLAS.h \
include/cltypes.h \
include/kerngen.h \
include/solver.h \
include/mempat.h \
src/blas/gens/blas_kgen.h \
src/blas/include/clblas-internal.h \
src/blas/include/kernel_extra.h \
src/blas/include/solution_seq.h \
include/granulation.h \
src/tools/ktest/step.h
INPUT = ../src/clBLAS.h \
../src/include/cltypes.h \
../src/include/kerngen.h \
../src/include/solver.h \
../src/include/mempat.h \
../src/library/gens/blas_kgen.h \
../src/library/include/clblas-internal.h \
../src/library/include/kernel_extra.h \
../src/library/include/solution_seq.h \
../src/include/granulation.h \
../src/library/tools/ktest/step.h

# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding, which is
@@ -721,7 +721,7 @@ EXCLUDE_SYMBOLS =
# directories that contain example code fragments that are included (see
# the \include command).

EXAMPLE_PATH = samples
EXAMPLE_PATH = ../src/samples

# If the value of the EXAMPLE_PATH tag contains directories, you can use the
# EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp