Most of the work is done now, but cudaAssign() still needs overloads for every expression to support partial evaluation properly. Each overload should follow the same implementation pattern as in DMatDMatAddExpr.h:
- Defined externally to the original expression templates
- Implement the same functionality as their CPU counterparts
- Follow the same enable condition as their CPU counterparts
- Call cudaAssign() instead of assign()
cuBLAS will be used as much as possible to implement them.
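
To make the pattern above concrete, here is a rough, CPU-only sketch of what such an overload could look like. It is not the actual code from DMatDMatAddExpr.h and it uses neither Blaze nor CUDA: ToyMatrix, ToyAddExpr, add() and IsCudaAssignable are hypothetical stand-ins invented for illustration, and the element-wise loop only marks the spot where a device kernel or cuBLAS call would go.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a dense matrix type (NOT a Blaze type).
struct ToyMatrix {
    std::size_t rows, cols;
    std::vector<double> data;
    ToyMatrix(std::size_t r, std::size_t c, double v = 0.0)
        : rows(r), cols(c), data(r * c, v) {}
};

// Hypothetical expression node for "lhs + rhs"; it only stores references.
template <typename L, typename R>
struct ToyAddExpr {
    const L& lhs;
    const R& rhs;
};

template <typename L, typename R>
ToyAddExpr<L, R> add(const L& l, const R& r) { return {l, r}; }

// Hypothetical enable condition, standing in for whatever condition the
// CPU assign() overload of the same expression uses.
template <typename T>
struct IsCudaAssignable : std::false_type {};
template <>
struct IsCudaAssignable<ToyMatrix> : std::true_type {};
template <typename L, typename R>
struct IsCudaAssignable<ToyAddExpr<L, R>>
    : std::integral_constant<bool, IsCudaAssignable<L>::value &&
                                   IsCudaAssignable<R>::value> {};

// Base case: plain matrix to plain matrix (a device copy in a real backend).
inline void cudaAssign(ToyMatrix& dst, const ToyMatrix& src) { dst = src; }

// Overload for the addition expression. It is a free function defined
// outside ToyAddExpr, guarded by the enable condition, and it evaluates
// its operands through cudaAssign() rather than assign().
template <typename L, typename R>
std::enable_if_t<IsCudaAssignable<L>::value && IsCudaAssignable<R>::value>
cudaAssign(ToyMatrix& dst, const ToyAddExpr<L, R>& expr) {
    ToyMatrix tmp(dst.rows, dst.cols);
    cudaAssign(tmp, expr.rhs);  // partial evaluation of the right operand
    cudaAssign(dst, expr.lhs);  // partial evaluation of the left operand
    // Placeholder for the device-side addition (kernel or cuBLAS call).
    for (std::size_t i = 0; i < dst.data.size(); ++i) {
        dst.data[i] += tmp.data[i];
    }
}
```

With this sketch, something like `cudaAssign(D, add(add(A, B), C))` recurses through the nested expression, so each sub-expression gets evaluated by its own overload. That recursion is the reason every expression needs one.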
And for starters, here's a list of expressions to implement:
- DMatDVecMultExpr: Requires a bit of work on the cuBLAS part
- DVecDVecInnerExpr: Only plain vectors, requires modifications on Blaze to work seamlessly with views & CUDA-compatible expressions