Documentation for rocSPARSE is available at https://rocm.docs.amd.com/projects/rocSPARSE/en/latest/.
- New LRB algorithm to SpMV, supporting CSR format
- rocBLAS is now an optional dependency for SDDMM algorithms
- Additional verbose output for `csrgemm` and `bsrgemm`
- CMake support for documentation
- Triangular solve with multiple rhs (SpSM, csrsm, ...) now calls SpSV, csrsv, etcetera when nrhs equals 1
- Improved user manual section Installation and Building for Linux and Windows
- `rocsparse_inverse_permutation`
- Mixed-precisions for SpVV
- Uniform int8 precision for gather and scatter
- Added new `rocsparse_spmv` routine
- Added new `rocsparse_xbsrmv` routines
- When using host pointer mode, you must now call `hipStreamSynchronize` following `doti`, `dotci`, `spvv`, and `csr2ell`
- `doti` routine
- Improved spin-looping algorithms
- Improved documentation
- Improved verbose output during argument checking on API function calls
- `rocsparse_spmv_ex`
- `rocsparse_xbsrmv_ex`
- Auto stages from `spmv`, `spmm`, `spgemm`, `spsv`, `spsm`, and `spitsv`
- Formerly deprecated `rocsparse_spmv` routines
- Formerly deprecated `rocsparse_xbsrmv` routines
- Formerly deprecated `rocsparse_spmm_ex` routine
- Bug in `rocsparse-bench` where the SpMV algorithm was not taken into account in CSR format
- BSR and GEBSR routines (`bsrmv`, `bsrsv`, `bsrmm`, `bsrgeam`, `gebsrmv`, `gebsrmm`) didn't always show `block_dim==0` as an invalid size
- Passing `nnz = 0` to `doti` or `dotci` wasn't always returning a dot product of 0
- `gpsv` minimum size is now `m >= 3`
- More mixed-precisions for SpMV, (`matrix: float`, `vectors: double`, `calculation: double`) and (`matrix: rocsparse_float_complex`, `vectors: rocsparse_double_complex`, `calculation: rocsparse_double_complex`)
- Support for gfx940, gfx941, and gfx942
- Bug in `csrsm` and `bsrsm`
- In `csritilu0`, the algorithm `rocsparse_itilu0_alg_sync_split_fusion` has some accuracy issues when XNACK is enabled (you can use `rocsparse_itilu0_alg_sync_split` as an alternative)
- Memory leak in `csritsv`
- Bug in `csrsm` and `bsrsm`
- `bsrgemm` and `spgemm` for BSR format
- `bsrgeam`
- Build support for Navi32
- Experimental hipGraph support for some rocSPARSE routines
- `csritsv`, `spitsv` CSR iterative triangular solve
- Mixed-precisions for SpMV
- Batched SpMM for transpose A in COO format with atomic algorithm
- `csr2bsr`
- `csr2csr_compress`
- `csr2coo`
- `gebsr2csr`
- `csr2gebsr`
- Documentation
- Bug in COO SpMV grid size
- Bug in SpMM grid size when using very large matrices
- In `csritilu0`, the algorithm `rocsparse_itilu0_alg_sync_split_fusion` has some accuracy issues when XNACK is enabled (you can use `rocsparse_itilu0_alg_sync_split` as an alternative)
- `rocsparse_spmv_ex` routine
- `rocsparse_bsrmv_ex_analysis` and `rocsparse_bsrmv_ex` routines
- `csritilu0` routine
- Build support for Navi31 and Navi33
- Segmented algorithm for COO SpMV by performing analysis
- Improved performance when generating random matrices
- `bsr2csr` routine
- Integer overflow bugs
- Bug in `ellmv`
- Transpose A for SpMM COO format
- Matrix checker routines for verifying matrix data
- Atomic algorithm for COO SpMV
- `bsrpad` routine
- Bug in `csrilu0` that could cause a deadlock
- Bug where asynchronous `memcpy` would use the wrong stream
- Potential size overflows
- Batched SpMM for CSR, CSC, and COO formats
- Packages for test and benchmark executables on all supported operating systems using CPack
- Clients file importers and exporters
- Clients code size reduction
- Clients error handling
- Clients benchmarking for performance tracking
- Test adjustments due to round-off errors
- Fixing API call compatibility with rocPRIM
- `gtsv_interleaved_batch`
- `gpsv_interleaved_batch`
- `SpGEMM_reuse`
- Allow copying of mat info struct
- Optimization for SDDMM
- Allow unsorted matrices in `csrgemm` multipass algorithm
- `csrmv`, `coomv`, `ellmv`, and `hybmv` for (conjugate) transposed matrices
- `csrmv` for symmetric matrices
- Packages for test and benchmark executables on all supported operating systems using CPack
- `spmm_ex` has been deprecated and will be removed in the next major release
- Optimization for `gtsv`
- Triangular solve for multiple right-hand sides using BSR format
- SpMV for BSRX format
- SpMM in CSR format enhanced to work with transposed A
- Matrix coloring for CSR matrices
- Added batched tridiagonal solve (`gtsv_strided_batch`)
- SpMM for BLOCKED ELL format
- Generic routines for SpSV and SpSM
- Beta support for Windows 10
- Additional atomic-based algorithms for SpMM in COO format
- Extended version of SpMM
- Additional algorithm for SpMM in CSR format
- Added (conjugate) transpose support for CsrMV and SpMV (CSR) routines
- Packaging has been split into a runtime package (`rocsparse`) and a development package (`rocsparse-devel`): The development package depends on the runtime package. When installing the runtime package, the package manager will suggest the installation of the development package to aid users transitioning from the previous version's combined package. This suggestion by the package manager is made on all supported operating systems (except CentOS 7). The `suggestion` feature in the runtime package is introduced as a deprecated feature and will be removed in a future ROCm release.
- Bug with `gemvi` on Navi21
- Bug with adaptive CsrMV
- Optimization for pivot-based `gtsv`
- (batched) Tridiagonal solver with and without pivoting
- Dense matrix sparse vector multiplication (gemvi)
- Support for gfx90a
- Sampled dense-dense matrix multiplication (SDDMM)
- Client matrix download mechanism
- Removed boost dependency in clients
- SpMM (CSR, COO)
- Code coverage analysis
- Install script
- Level 2/3 unit tests
- `rocsparse-bench` no longer depends on boost
- `gebsrmm`
- `gebsrmv`
- `gebsrsv`
- `coo2dense` and `dense2coo`
- Generic APIs, including `axpby`, `gather`, `scatter`, `rot`, `spvv`, `spmv`, `spgemm`, `sparsetodense`, `densetosparse`
- Support for mixed indexing types in matrix formats
- Changelog
- `csr2gebsr`
- `gebsr2gebsc`
- `gebsr2gebsr`
- Treating filename as regular expression for YAML-based testing generation
- Documentation for `gebsr2csr`
- `bsric0`
- gfx1030 has been adjusted to the latest compiler
- Replace old XNACK 'off' compiler flag with new version
- Updated Debian package name
- `prune_csr2csr`, `prune_dense2csr_percentage` and `prune_csr2csr_percentage` added
- `bsrilu0` added
- `csrilu0_numeric_boost` functionality added
- `bsric0`
- No changes for this ROCm release
- Fortran bindings
- CentOS 6 support
- `bsrmv`
- Default compiler switched to HIP-Clang
- `csr2dense`, `csc2dense`, `csr2csr_compress`, `nnz_compress`, `bsr2csr`, `csr2bsr`, `bsrmv`, and `csrgeam`
- Triangular solve for BSR format (`bsrsv`)
- Options for static build
- Examples
- `dense2csr` and `dense2csc`
- Installation process