Releases: EnzymeAD/Enzyme.jl
v0.11.8
Enzyme v0.11.8
Merged pull requests:
- Add EnzymeTestUtils package for testing Enzyme rules (#782) (@sethaxen)
- Add tests for BLAS.dot, BLAS.dotc, and BLAS.dotu (#842) (@sethaxen)
- CompatHelper: bump compat for GPUCompiler to 0.22, (keep existing compat) (#1001) (@github-actions[bot])
- support wide ints in tape (#1002) (@motabbara)
- add 1.10 cpu features string (#1003) (@wsmoses)
- Fix poison value handling (#1006) (@wsmoses)
- Abstract state is default duplicated (#1007) (@wsmoses)
- Reduce amount of printing during 'cannot find shadow' error of an arg… (#1012) (@wsmoses)
- Attempt fix of nicer error handler for jl_f__apply_iterate (#1013) (@wsmoses)
- Fix 0 arg wait (#1014) (@wsmoses)
- Fix display of formula in documentation (#1015) (@metab0t)
- Fix fallback apply iterate in vector mode (#1016) (@wsmoses)
- Fix order of args to occursin to reduce printing on store error (#1017) (@wsmoses)
- Fix `UndefVarError: ST not defined` (#1019) (@devmotion)
- Mark custom rule tape loads as needing caching (#1024) (@wsmoses)
- Fix batched arraycopy (reverse mode) (#1033) (@wsmoses)
- CompatHelper: bump compat for GPUCompiler to 0.23, (keep existing compat) (#1034) (@github-actions[bot])
- Add getri nice backtrace (#1040) (@wsmoses)
- Add missing julia compat to EnzymeTestUtils (#1041) (@sethaxen)
- Unmark reverse mutating test as broken (#1042) (@sethaxen)
- Add nicer nc_sync error (#1044) (@wsmoses)
- Fix print for abi wrapper error (#1049) (@wsmoses)
- Fix array type cast (#1050) (@wsmoses)
- CompatHelper: bump compat for GPUCompiler to 0.24, (keep existing compat) (#1051) (@github-actions[bot])
- Set active_reg for mutable structs to be duplicated rather than active (#1052) (@wsmoses)
- Fix error when returning undef from function (#1054) (@wsmoses)
- Fix shadow segfault on custom rule (#1056) (@wsmoses)
- Add guess activity of ref of int (#1060) (@wsmoses)
- Fix ordering for vector fallback of apply iterate (#1062) (@wsmoses)
- Adapt to jll bump and use nice default error for unknown c functions (#1064) (@wsmoses)
- Better Store Activity Error Messages (#1066) (@wsmoses)
Closed issues:
- Adding testing utilities (#780)
- JLJit duplicate symbol (#991)
- tape width issue using InlineStrings (#998)
- Incorrect Jacobian when indexing an Array using `begin` & `end` (#1008)
- Catching assertions / debugging mode? (#1009)
- Assertion failed error (#1011)
- Illegal type analysis error with gemv call (#1020)
- Incorrect tape provided to custom reverse rule in for loop (#1022)
- Cannot deduce type error (#1023)
- mixed activity for jl_new_struct (#1026)
- dval of mutated vector not zeroed when other activities are Const (#1028)
- LLVM failed verification error with BatchDuplicated return type on reverse autodiff_thunk (#1032)
- No augmented forward pass found for dgemqrt_64_ (#1037)
- Precompilation failed; `has_orc_v1` not defined (#1043)
- BLAS.scal! error with autodiff_thunk in batch reverse-mode (#1048)
- Crash with BLAS.scal! as argument in calling method with complex inputs (#1053)
- AssertionError: Tuple{Vector{Float64}, Float64} has mixed internal activity types (#1055)
- Invalid unwrap (#1058)
v0.11.7
Enzyme v0.11.7
Closed issues:
- Better error for `Active{Vector{Float64}}` (#318)
- Differentiating Gridap code (#447)
- getting a function call location from a stacktrace (#854)
- improve debugging (#864)
- CUDA sin is broken (#924)
- runtime activity changes AD result (#947)
- `quantile` gives wrong gradient on Julia 1.6 (#973)
- `hypot` error on Julia 1.7 (#974)
- `Set` error with x86 (#975)
- Incorrect gradients when using views (#979)
Merged pull requests:
- Make bitcode replacement a preference (#960) (@gaurav-arya)
- Reduce warnings (and have as runtime errors) (#961) (@wsmoses)
- Fix phi of addrspace 13 (#962) (@wsmoses)
- Additional julia runtime function from int (#964) (@wsmoses)
- Fix hit c symbol map (#965) (@wsmoses)
- Extend make zero to handle generic structs (#966) (@wsmoses)
- Fix cuda implementedby (#967) (@wsmoses)
- Mark memhash as inactive (#968) (@wsmoses)
- Unreachable dce workaround (#972) (@wsmoses)
- Update GPU CI (#976) (@vchuravy)
- Better backtrace for some lapack functions (#980) (@wsmoses)
- Bump jll to 79 (#981) (@wsmoses)
- Fix unhandled handler (#982) (@wsmoses)
- nicer error fallback for getrf (#983) (@wsmoses)
- Don't mixed activity error if storing constant int (#984) (@wsmoses)
- Add structural check for view (#985) (@wsmoses)
- Disable attributor on known failing llvm versions (#986) (@wsmoses)
- Update compiler.jl (#987) (@wsmoses)
- Fix internal printing segfault on 1.6 where mustprogress attr not def… (#988) (@wsmoses)
- Update Project.toml (#989) (@wsmoses)
- Test `hypot` error with CI (#990) (@jgreener64)
- Applytup (#992) (@wsmoses)
- Fix active reg inner with vector (#994) (@wsmoses)
- Adapt to abi bump (#995) (@wsmoses)
v0.11.6
Enzyme v0.11.6
Closed issues:
- `autodiff_deferred` failing on Metal.jl kernel (#925)
- 1.10 Forward mode segfault (#948)
- Compilation error for custom reverse rule (#952)
Merged pull requests:
- Demonstrate backtraces using the juliaojit (#863) (@gbaraldi)
- More stdlib and type tests (#872) (@jgreener64)
- Add `memmove` to `nofreefns` (#949) (@devmotion)
- Fix union return issue (#950) (@wsmoses)
- [CI] Update TagBot setup (#951) (@giordano)
- Fix runtime activity in generic (#953) (@wsmoses)
- Jll 76 (#955) (@wsmoses)
- fix newstruct (#957) (@wsmoses)
- Erase with placeholder (#958) (@wsmoses)
- Bump jll to 78 (#959) (@wsmoses)
v0.11.5
Enzyme v0.11.5
Closed issues:
- Gradient computation "corrupted" on ARM-based M2 CPU (#611)
- Error with missing value (#801)
- Derivative leaks into `Const` input `view` (#804)
- Segfaults in LLVM.jl on Julia 1.9.0 (#860)
- Enzyme + KernelAbstractions: KA syntax changes and differentiating multiple kernels (#896)
- Handle the new calling convention that julia emits for cheaper ptls (#909)
- Enzyme compilation failed when using `logpdf` (#910)
- Enzyme troubles with ElectrochemicalKinetics (#931)
- Asserts on upcoming Julia 1.10 (#933)
- Regression on differentiating Bessels.jl (#941)
Merged pull requests:
- enable LLVM6 (#917) (@motabbara)
- Fix argbox issue (#932) (@wsmoses)
- Wrap gradient utils in julia_error (#938) (@wsmoses)
- Improve hessian docs (#939) (@wsmoses)
- Add noinl fix (#940) (@wsmoses)
- Add line break in hessian docs (#942) (@Vaibhavdixit02)
- Make more errors occur at runtime (#943) (@wsmoses)
- Use world in precomputing activity (#944) (@wsmoses)
- Missing closing backticks (#945) (@Vaibhavdixit02)
- Jll bump 75 (#946) (@wsmoses)
v0.11.4
Enzyme v0.11.4
Merged pull requests:
v0.11.3
Enzyme v0.11.3
Closed issues:
- Erroneous `autodiff` results when multi-threading enabled on various CPU arch (#903)
- `Enzyme execution failed` with `Functors.jl` (#916)
- GC invariant error (#919)
Merged pull requests:
- fix error handler return (#914) (@wsmoses)
- Enable later versions of GPUCompiler (#915) (@motabbara)
- Misc GC, ABI, callconv fixes (#920) (@wsmoses)
- Add batch duplicated func (#921) (@wsmoses)
- Bump objectfile (#922) (@wsmoses)
- Fix abisret (#923) (@wsmoses)
- Add fic phi error handler (#926) (@wsmoses)
- Start making helper for gutils (#927) (@wsmoses)
- Bump jll (#928) (@wsmoses)
- Tape type in enzymecore (#929) (@wsmoses)
- Fix triple (#930) (@wsmoses)
- Fix UndefVarError (#934) (@devmotion)
- Handle swiftself internally (#935) (@wsmoses)
v0.11.2
Enzyme v0.11.2
Closed issues:
- Define custom adjoint for `jl_f_getfield` (#176)
- Jit dangling reference (#208)
- Numerically Incorrect `sinpi` Derivatives (#443)
- Add `jl_nthfield` in reverse mode (#645)
- CUDA.jl kernel errors with Julia 1.9.0-rc3 (#746)
- Incorrect return type for const return type forward-mode rule (#774)
- Function failed verification error for BatchDuplicated with complex (#778)
- Enzyme 0.11.1 segfaults (#779)
- Cannot deduce type after recent commit (#784)
- Composing with an `inactive` function (#786)
- Illegal type analysis error for BLAS.nrm2 rule (#789)
- Error using autodiff_thunk (#791)
- Error differentiating 2-arg BLAS.dot with defined 5-arg rule (#793)
- Performance regression (#796)
- Support for `jl_eqtable_get` (#803)
- Error reverse-mode differentiating 2-arg BLAS.dot with defined 5-arg rule (#811)
- Crash when trying to differentiate DynamicExpressions.jl (#816)
- Expected not LegalFullUnwrap for potentially last-value phi node (#817)
- Add reshape test (#819)
- Terribly slow compile time for to_tape_type (#823)
- Incorrect forward-mode result with Const argument (#830)
- Segfault differentiating sincos in forward-mode (#834)
- LLVM error: function failed verification (4) (#840)
- Warning: TypeAnalysisDepthLimit, incorrect gradient (#841)
- void GradientUtils::eraseFictiousPHIs(): Assertion `pp->getNumUses() == 0' failed. (#848)
- GC segfault (no MWE, private repo) (#853)
- sincos errors for ComplexF64 forward-mode (#855)
- Compilation failed for `sum(sin, x)` for `complex` x (#856)
- ERROR: AssertionError: p isa LLVM.Instruction (#859)
- sincos segfaults for ComplexF64 batch forward-mode (#867)
- Clarification on mutable objects (#869)
- Complex cos fails verification in reverse-mode (#875)
- `middle` from `Statistics` errors in reverse mode (#876)
- `empty!` gives error with reverse mode (#882)
- Julia nightly broken due to attribute mismatch (#884)
- Error from untaken branch with Float32 (#886)
Merged pull requests:
- Sizehint augfwd fix -- fixed up test (#626) (@motabbara)
- Ensure return on constant (#775) (@wsmoses)
- Update box.jl (#781) (@swilliamson7)
- Fix function type getter to be opaque-pointer-invariant (#787) (@wsmoses)
- Additional opq fixes (#788) (@wsmoses)
- Fix batch byref duplicated (#790) (@wsmoses)
- Canonicalize TT (#792) (@wsmoses)
- add blas attrs on the julia side (#794) (@ZuseZ4)
- Bump jll (#797) (@wsmoses)
- Update Project.toml (#798) (@wsmoses)
- Returned attribute performance fixes (#799) (@wsmoses)
- Add Aqua to tests (#800) (@jgreener64)
- Fix some typos in docs/src/index.md (#802) (@st--)
- Get-field reverse mode (#806) (@wsmoses)
- Mixed activity fix (#812) (@wsmoses)
- ABI fixes (#813) (@wsmoses)
- Fix inactivity on prior Julia (#814) (@wsmoses)
- Check rooting of generated functions (#815) (@wsmoses)
- Add reshape assertion (#818) (@wsmoses)
- Add better return info type rule (#820) (@wsmoses)
- Bump jll and add reshape test (#821) (@wsmoses)
- Improve error print info (#822) (@wsmoses)
- Iterate to_tuple_type (#824) (@wsmoses)
- No dot fb (#825) (@wsmoses)
- Remove jobref (#826) (@wsmoses)
- Add set undef value (#827) (@wsmoses)
- Enable calling conv fix (#828) (@wsmoses)
- fix returnprimal bug (#829) (@wsmoses)
- Prep phi node return handling (#831) (@wsmoses)
- add needsprim orig (#832) (@wsmoses)
- Jllbump (#835) (@wsmoses)
- Fix sincos call directly (#836) (@wsmoses)
- Fwdfix (#837) (@wsmoses)
- fix activity throw (#838) (@wsmoses)
- Fix runtime test (#843) (@wsmoses)
- better active reg (#845) (@wsmoses)
- Add optional sanitization, strong zero, and fast math flags (#846) (@wsmoses)
- Fix return mismatch error (#847) (@wsmoses)
- Update optimize.jl (#849) (@motabbara)
- Add inactive jl_gf_invoke_lookup (#850) (@wsmoses)
- Add sinpi (#861) (@wsmoses)
- Add print before assert (#862) (@wsmoses)
- Fix instruction assertion (#865) (@wsmoses)
- Add cuda fixes (#873) (@wsmoses)
- Bump jll (#878) (@wsmoses)
- Fix order of LLVM.dispose calls to fix SymbolStringPool assertion (#879) (@vchuravy)
- Remove LLVM IR introspection from init (#881) (@vchuravy)
- Use Ubuntu 20.04 and renable assert builds (#883) (@vchuravy)
- Fix type tree str on error (#887) (@wsmoses)
- Allow support for noinline on inner function (#888) (@wsmoses)
- Fix tup any_type (#889) (@wsmoses)
- More noinl fixes (#890) (@wsmoses)
- Avoid recursive warning on recursive types (#891) (@MilesCranmer)
- Add Intel&GDB event listeners (#894) (@vchuravy)
- Adapt to jll change (#898) (@wsmoses)
- Mark safepoint inactive and nofree (#899) (@wsmoses)
- Mark module_parent inactive (#900) (@wsmoses)
- reinsert gc marker before lowering (#907) (@wsmoses)
- Fix active reg push issue (#911) (@wsmoses)
- Don't use primal in gc preserve if it would be deleted (#913) (@wsmoses)
v0.11.1
Enzyme v0.11.1
Closed issues:
- Differentiating Gridap code (#447)
- Garbage Collection (Oceananigans) (#480)
- Enzyme segfaults on Turing model (#650)
- [EnzymeRules]: Original function implementation determines types of `augemented_primal()` and `reverse()` (#695)
- Incomplete doc sentence for autodiff forward mode (#698)
- gc seg fault (#727)
- Uninterpretable error with simple mistake (#730)
- cispi gradient is incorrect (#735)
- Segfault when rerunning gradient computation multiple times (#737)
- Help with gemm! (#738)
- Error when using closure over type (#741)
- Help writing complex rules (#744)
- Error with LoopVectorization.jl (#745)
- Julia Nightly custom interpreter (#749)
- Incomplete sentence in docstring of `autodiff(::ForwardMode, ...)` (#752)
- LLVM error: function failed verification for reverse rule with complex inputs (#758)
- Incorrect zero primal returned for ComplexF64 inputs (#761)
- Extra 32-byte allocation per-input when using forward-mode rule (#763)
- 32bit probably get's wrong floating-point mode. (#765)
Merged pull requests:
- Activate gradient tests (#710) (@jgreener64)
- Add logo to docs (#723) (@gaurav-arya)
- Fix docs links in README (#724) (@gaurav-arya)
- Fix typo in custom rule docs (#725) (@gaurav-arya)
- Ensure keyword arg overwritten length (#726) (@wsmoses)
- Change `::Int64` to `::Int` (#728) (@devmotion)
- Add forward mode thunk (#729) (@wsmoses)
- Fix llvm callconv (#731) (@wsmoses)
- Fix return reverse calling conv (#733) (@wsmoses)
- add sincospi (#736) (@wsmoses)
- Updated the code for the box model example (#740) (@swilliamson7)
- Fix tape by reference (#743) (@wsmoses)
- Fix llvm vector (#747) (@wsmoses)
- Fix example in README.md (#750) (@giordano)
- Fix small typo in docs (#753) (@giordano)
- Fix rrules for pass-by-reference active vars (like complex) (#754) (@wsmoses)
- Add some more forward docs (#755) (@wsmoses)
- Add unionall to inactive (#756) (@wsmoses)
- Fix rule offset (#759) (@wsmoses)
- zero_type attrs (#762) (@wsmoses)
- Permit inlining (#764) (@wsmoses)
- Disable nightly test (#766) (@wsmoses)
- Mandate SSE/SSE2 for x86 (#767) (@vchuravy)
- Add fastmath log (#768) (@wsmoses)
- Add maybe-failing test (#769) (@sethaxen)
- Fix primal byref (#770) (@wsmoses)
- Force inlining of rule (#771) (@wsmoses)
- Bump jll (#772) (@wsmoses)
- 32-bit marktype (#773) (@wsmoses)
v0.11.0
Enzyme v0.11.0
Closed issues:
- EnzymeRules (#172)
- Stable docs missing from scripts deploy (#354)
- Missing support for erf (and related I guess) (#364)
- Error differentiating past FFT (#369)
- Merge fwddiff_deferred and autodiff_deferred (#483)
- Segmentation Fault (#514)
- Incorrect gradient returned when vector splatting is used (#545)
- lowerGCAllocBytes LLVM error running tests (#553)
- GC Segfault [private repo] (#555)
- Julia 1.9 GC segfault (#597)
- Linking two modules of different target triples: 'bcloader' is 'arm64-apple-macosx11.0.0' whereas 'text' is 'arm64-apple-darwin21.5.0' (#607)
- Test faulting in `specialfunctions` on ARM-based Apple M2 CPU (#609)
- Supporting rules on functions with keyword arguments (#617)
- Segmentation Fault (#630)
- Segment violation when trying to run an optimization problem with DiffEqFlux.jl (#634)
- Wrong gradients when modifying a vector in a struct (#639)
- Enzyme is modifying variables in a struct that is not part of active data (#640)
- getfield calls not supported (#644)
- Type insertion error (#646)
- cannot handle (forward) unknown intrinsic llvm.rint and pretty nasty segfault (#647)
- Cannot handle instrinsic @llvm.trunc.f64 (#648)
- Computing hessian of unnamed functions throws error (#649)
- Calling convention mismatch `autodiff(Forward, rosenbrock_inp, Duplicated, BatchDuplicated(x, (dx_1, dx_2)))` causes segmentation fault (#652)
- LLVM bug encountered in Turing.jl (#658)
- 50% correctness with certain Turing models (#659)
- Insufficiently aggressive activity analysis (#660)
- Primal returned instead of shadow? (#667)
- incorrect value when comparing with ForwardDiff (Forward mode) (#672)
- Duplicated of Ref value does not accumulate gradient in GPU kernel (#674)
- Active variables passed by value to jl_threadsfor (#675)
- Parameter unpacking yields `ERROR`s (#677)
- Forward over reverse of a simple spring energy (#684)
- Forward over Reverse Example broken on 0.11-dev (#685)
- Unhandled binary operator (#688)
- Inconsistent results with ParameterHandling.jl (#691)
- Realloc error on Reverse over Forward (#693)
- Custom rule not detected if defined after call to `autodiff` (#696)
- running autodiff twice leads to bad result when result vector is preallocated (#699)
- Missing docstrings for types used in custom reverse rules (#704)
- Calling convention mismatch error with custom reverse rule (#706)
- Gradient of matrix determinant errors (#709)
- seg fault in legalRecompute (#716)
- CUDA test fails - no method matching LLVM.Value (#718)
Merged pull requests:
- Add EnzymeCore changes for split mode (#334) (@vchuravy)
- Add split mode to orcv2 support (#534) (@vchuravy)
- Add support for user-defined rules: Take 4 (#589) (@wsmoses)
- Cleanup deferred (#604) (@wsmoses)
- Add EnzymeRules to autodocs (#605) (@vchuravy)
- Test and improve custom forward rule shadow handling (#608) (@wsmoses)
- Fix rrule api (#612) (@wsmoses)
- Move and simplify activity rules (#614) (@wsmoses)
- declare autodiff in EnzymeCore (#618) (@vchuravy)
- KA rules (#619) (@vchuravy)
- Add flexshadow option (#620) (@wsmoses)
- Add svec_ref null handler (#621) (@wsmoses)
- Improve error message (#622) (@wsmoses)
- Use tuple of modification status rather than single bool (#625) (@wsmoses)
- Bump workaround (#627) (@wsmoses)
- Ensure mustwrap preserves sret (#628) (@wsmoses)
- Convert task errors to runtime (#629) (@wsmoses)
- Implement blocking inlining for kwcall with rule (#638) (@vchuravy)
- Add rosenbrock example (#641) (@vchuravy)
- Fix calling conv promotion (#653) (@wsmoses)
- Fix activity bug on broadcast (#656) (@wsmoses)
- Bump jll (#657) (@wsmoses)
- Handle copy of inactive input (#661) (@wsmoses)
- Attempt fixes in runtime generic (#663) (@wsmoses)
- Initial split mode [and related ABI change] (#664) (@wsmoses)
- Add view splat test (#665) (@wsmoses)
- Try enabling more tests (#666) (@wsmoses)
- Fallback runtime generic fix (#668) (@wsmoses)
- Bump jll (#670) (@wsmoses)
- Add const activity test (#671) (@wsmoses)
- Adapt to GPUCompiler 0.18 (#673) (@vchuravy)
- Change function in thunk to be type rather than instance (#678) (@wsmoses)
- Fix 1.9 via inactive call latest (#679) (@wsmoses)
- Fix indirection and improve errors (#680) (@wsmoses)
- Fix runtime exception (#682) (@wsmoses)
- Handle emulated fma (#689) (@wsmoses)
- Adapt to llvm5 (#694) (@wsmoses)
- Custom rule doc example (#697) (@gaurav-arya)
- support invalidation for user-defined rules (#702) (@aviatesk)
- Inactive box char fn (#703) (@wsmoses)
- Handle active parallel thread loops (#705) (@wsmoses)
- Bump jll (#707) (@wsmoses)
- Enable optimization for addr13 (#711) (@wsmoses)
- Fix custom reverse rule ordering bug (#713) (@wsmoses)
- Fix gpucompiler kernel arg (#719) (@wsmoses)
- Adapt to new c abi (#720) (@wsmoses)
- Remove user level parent job option (#721) (@wsmoses)
- Bump EnzymeCore version (#722) (@wsmoses)