This repository has been archived by the owner on Mar 12, 2021. It is now read-only.
Releases: JuliaGPU/CuArrays.jl
v1.6.0 (2019-12-21)
Closed issues:
- cu(x) always returns Float32 (#540)
- CUDNN Batchnorm fails (#531)
- Overhead of memory copies (#528)
- CUFFT_EXEC_FAILED when using Zygote (#524)
- Tutorial initial benchmark times/text don't make sense (#519)
- SplittingPool: assertion violated (#516)
- Split sublibraries into pure library wrappers and higher-level wrappers (#287)
- Require CUDA libraries unconditionally (#221)
- Intermittent CI failure: CUSPARSE switch2csr (#215)
Merged pull requests:
- Rework library handles for multithreading. (#544) (maleadt)
- Overhaul documentation and README. (#541) (maleadt)
- CompatHelper: bump compat for "Requires" to "1.0" (#535) (github-actions[bot])
- reenable complex tensor contraction tests (#533) (Jutho)
- CI improvements: ARM, and test several CUDA versions (#532) (maleadt)
- Clean-up copy constructors. (#530) (maleadt)
- Use 128 bits to represent blocks in the splitting memory pool. (#527) (maleadt)
- Take available memory into account when selecting a device. (#526) (maleadt)
- RFC: Allow cu to operate on Array types (#522) (baggepinnen)
- Upgrade Literate and Documenter. (#511) (fredrikekre)
v1.5.0 (2019-11-29)
Closed issues:
- Accessing CuArrays and CUDAnative through Flux gives "WARNING: replacing module CUDA." and breaks precompilation (#509)
- 64-bit windows LoadError: Could not find libcublas (#507)
- GC preserve in generated ccall (#503)
- Conditional use of CUDA on Windows (#488)
- Memory allocator: @external_alloc (#426)
- setindex! fails with Array source CuArray target with different ndims or when broadcasting (#403)
Merged pull requests:
- Allocate workspace repeatedly right before calling into CUDNN. (#517) (maleadt)
- Use at-runtime_ccall's ability to delay the library lookup. (#514) (maleadt)
- Demote CUTENSOR version check to a warning. (#513) (maleadt)
- Don't always time allocations. (#512) (maleadt)
- API for manually reclaiming memory (#504) (maleadt)
- Updates for CUTENSOR 1.0 (#502) (maleadt)
- Add ND support for find functions. (#500) (maleadt)
v1.4.7
v1.4.6
v1.4.5
v1.4.4 (2019-11-08)
Merged pull requests:
v1.4.3
Improve conditional use of the package.
v1.4.2
v1.4.1
v1.4.0 (2019-11-04)
Merged pull requests:
- Use released GPUArrays. (#470) (maleadt)
- Init improvements (#468) (maleadt)
- Works towards precompilation (#466) (maleadt)
- Fix #463 (#464) (xukai92)
- Rename CuArray.own to .pool to avoid confusion with unsafe_wrap. (#462) (maleadt)
- Add a strided batched method for getrf (#461) (maleadt)
- Improve type constraint for \ (#460) (xukai92)
- Implement matrix division using CUSOLVER (#459) (xukai92)
- CI and wrapper updates (#458) (maleadt)
- Adapt to CUDAdrv.jl#167 (#457) (maleadt)
- Prevent CPU to GPU copy in mapreduce. (#455) (maleadt)
- CI improvements. (#454) (maleadt)
- Upgrade to new CI templates. (#452) (maleadt)
- Accumulate improvements (#448) (maleadt)
- Optimize accumulate (#447) (maleadt)
- Implement findall (#446) (maleadt)
- More memory allocator improvements (#441) (maleadt)
- remove extraneous where and add some tests (#439) (kshyatt)
- Use a separate allocation timer. (#437) (maleadt)
- Use the memory pool for CUFFT workarea allocation. (#436) (maleadt)
- Update .gitlab-ci.yml (#435) (MikeInnes)
- CUDNN improvements (#404) (maleadt)