-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ControlCodeToTransaction] Use air-rt's serializer to generate transaction #1002
Conversation
@@ -15,7 +15,7 @@ module { | |||
// ----- | |||
|
|||
// CHECK: 0x06030100 | |||
// CHECK: 0x00000105 | |||
// CHECK: 0x00000104 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was previously hardcoded as 0x00000105
, which was wrong as the last byte stands for the number of columns, and we are using npu1_4col
.
// CHECK: 0x00000001 | ||
// CHECK: 0x00000028 | ||
// CHECK: 0x00000000 | ||
// CHECK: 0x00140000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a workaround to accommodate aie-rt
.
I believe the following implementation of _XAie_GetRowfromRegOff
is actually wrong, as it behaves incorrectly by always preserving the lowest 8 bits.
https://github.com/Xilinx/aie-rt/blob/1470f6c1d0e765a73c741ad51bc761515d74457e/driver/src/common/xaie_helper.c#L1024-L1027
In contrast, _XAie_GetColfromRegOff
appears to be correctly implemented and performs as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where exactly do you perform this workaround?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The correct value here should be 0x00000000
by definition, as it operates on row 0, column 0, with the opcode set to 0. However, due to the current behavior of the aie-rt implementation, the CHECK value has been updated to align with its output.
This discrepancy does not currently cause any hardware errors, as this part of the value is discarded later in the process. The relevant row and column information is retrieved from elsewhere. Nonetheless, I will report this issue to the aie-rt repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see, thanks for the info, would indeed be good to report this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Just a few nits.
@@ -162,14 +164,19 @@ LogicalResult configureDMABD( | |||
TRY_XAIE_API_LOGICAL_RESULT(XAie_DmaSetPadding, &dmaDesc, &dmaPadTensor); | |||
} | |||
|
|||
if (maybeIter.has_value()) { | |||
auto iter = maybeIter.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto iter = maybeIter.value(); | |
BDIterLayout iter = maybeIter.value(); |
Use explicit type.
for (auto &stride : stridesNew) | ||
stride = std::max(int64_t(stride), int64_t(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (auto &stride : stridesNew) | |
stride = std::max(int64_t(stride), int64_t(1)); | |
std::for_each(stridesNew.begin(), stridesNew.end(), [](int32_t &stride) { stride = std::max(stride, (int32_t)1)}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a suggestion, if you prefer the style you have, that works as well.
std::max((int64_t)op.getIterationSize() - 1, (int64_t)0); | ||
// Strides and iteration_size will be encoded as `actual - 1`, so we need to | ||
// ensure they are at least 1. | ||
SmallVector<int32_t> stridesNew(strides.begin(), strides.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could perform the copy on strides
directly to avoid having both strides
and stridesNew
around, for example:
SmallVector<int32_t> strides(op.getStrides());
// CHECK: 0x00000001 | ||
// CHECK: 0x00000028 | ||
// CHECK: 0x00000000 | ||
// CHECK: 0x00140000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where exactly do you perform this workaround?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.