-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update rvv-intrinsic-generator to define new RVV C intrinsic API for bf16 type #229
Conversation
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
@@ -331,7 +331,7 @@ def gen(g): | |||
"Vector Widening Floating-Point Fused Multiply-Add Functions", | |||
REF_DOC_URL + | |||
"#147-vector-widening-floating-point-fused-multiply-add-operations", | |||
["wmacc", "wnmacc", "wmsac", "wnmsac"], FTYPES, WFSEWS, WLMULS, | |||
["wmacc", "wnmacc", "wmsac", "wnmsac"], BFTYPES, WFSEWS, WLMULS, | |||
decorators.has_masking_no_maskedoff_policy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like bfloat has a separated section and mention its still on the draft status
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Hi! It looks like the commits you provided are generating instructions like |
Signed-off-by: joshua-arch1 <[email protected]>
The PR for __bf16 ABI has been approved and is going to be merged (riscv-non-isa/riscv-elf-psabi-doc#367). Maybe we can now accelerate finalizing our RVV bf16 intrinsic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pushing this and sorry I haven't been able to review this in the past 2 weeks.
Please separate the vfwmaccbf16
, vfwcvtbf16.f.f.v
and vfncvtbf16.f.f.w
in a separate section. Please merge your commits in #224 to here too.
Regarding the type descriptions in the document, I think you should make it explicit that the bfloat16 types and the intrinsics will not be available when zvbfmin
and zvbfwma
is not specified in the architecture.
On the other hand, I agree with Rich's latest comment that it would be convenient to have a bfloat16 load/store that represents a (load + fncvt) and (fwcvt + store).
I have replied on my thought regarding this and the planned v1.0 release in the mailing list [0].
[0] https://lists.riscv.org/g/tech-rvv-intrinsics/message/57
@@ -334,6 +334,13 @@ def gen(g): | |||
["wmacc", "wnmacc", "wmsac", "wnmsac"], FTYPES, WFSEWS, WLMULS, | |||
decorators.has_masking_no_maskedoff_policy) | |||
|
|||
g.function_group( | |||
mac_template, | |||
"Vector BFloat16 Widening Multiply-Add Functions (draft)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add Zvfbfwma
in the subtitle.
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
Signed-off-by: joshua-arch1 <[email protected]>
I have updated my PR according to your comments, except the load/store intrinsics. Maybe you have a different understanding with Rich's. He meant an int16 load/store followed by a reinterpret cast rather than fncvt/fwcvt. Which definition is better? @eopXD |
Hi, May you rebase upon the latest main? Thank you. |
On top of rebasing, if possible, I think we can also have the effort of adding the intrinsics for Zvfbfwma and Zvfbfmin here. I imagine we would also need load/store intrinsics for the bfloat type. The narrow conversions and multiply-add instruction intrinsics will need a rounding mode variant. |
I believe this should covered by this and #293 merged into main branch, so close this :) |
As is duscussed in #223, the BF16 Extension has recently been proposed and we need to define new intrinsics for convert instruction (bf16-to-fp32/fp32-to-bf16), reinterpret function as well as vfwmaccbf16 in Zvfbfwma Extenstion.