-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End to end support for bfp16 scl2vec intrinsics #278
Conversation
@@ -13,3 +13,4 @@ | |||
|
|||
include "AIEBaseRegisterBanks.td" | |||
def AccRegBank : RegisterBank<"AccRegBank", [ACC512, ACC1024, ACC2048]>; | |||
def GPRRegBank : RegisterBank<"GPRRegBank", [eR, eL, eE, EXPVEC64]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, include a new line in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not be needed after rebase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, few nits
@@ -572,4 +572,10 @@ def int_aie2p_sqrtf : ClangBuiltin<"__builtin_aie2p_sqrtf">, AIE2PNLF; | |||
// DIVS | |||
def int_aie2p_divs : AIE2PDIVS; | |||
|
|||
// BFP16 MAC MUL | |||
class AIE2PSHUFFLEBFP16 | |||
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use DefaultAttrsIntrinsic
instead of Intrinsic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add .ll
test from IR to assembly?
0f04536
to
487aede
Compare
clang/lib/Headers/aie2p_scl2vec.h
Outdated
} | ||
|
||
INTRINSIC(v64bfp16ebs8) shuffle(v64bfp16ebs8 a, unsigned mode) { | ||
v64bfp16ebs8 unDef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Undef
clang/lib/Headers/aie2p_scl2vec.h
Outdated
} | ||
|
||
INTRINSIC(v64bfp16ebs16) shuffle(v64bfp16ebs16 a, unsigned mode) { | ||
v64bfp16ebs16 unDef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit Undef
%struct.v64bfp16ebs16 = type <{ <64 x i8>, <8 x i8> }> | ||
|
||
; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) | ||
define dso_local noundef <16 x i32> @_Z18test_broadcast_s64Dv2_j(<2 x i32> noundef %b) local_unnamed_addr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated. Didn't we have tests for these broadcasts?
; CHECK-NEXT: mova r1, #1 // Delay Slot 5 | ||
; CHECK-NEXT: lshl r29, r0, r1 // Delay Slot 4 | ||
; CHECK-NEXT: or r29, r29, r1; vinsert.32 x0, x2, r29, r2 // Delay Slot 3 | ||
; CHECK-NEXT: vinsert.32 x0, x0, r29, r3 // Delay Slot 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add these tests in a separate commit? as they are unrelated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides a few minor comments, it looks good!
487aede
to
2eec92d
Compare
2eec92d
to
8bc61ad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
8bc61ad
to
39006d5
Compare
TODO:
v128bfp16ebs8 shuffle(v128bfp16ebs8 , unsigned int );
This depends on intrinsics from bfp16
upd_ext