-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add InterBlock loop/epilogue analysis #51
Conversation
Benchmark evaluation: With option First results are without the change and the second are with.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing wrong. Just a few nits you may want to consider.
62899cc
to
d74b615
Compare
AddRegionToEdges(LoopBS.getBottom()); | ||
Edges.markBoundary(); | ||
// Second part is the epilogue itself | ||
AddRegionToEdges(EpilogueBS.getTop()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe also add the ExitSU
node of EpilogueBS
to the map of distances so we can avoid the if (Succ->isBoundaryNode())
corner case in the loop below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation, this mapping maps instructions to depths, in this way we need to change the mapping logic as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant is something like: DistancesFromLoopEntry[EpilogueBS.getTop().getExitSU()] = DistFromLoopEntry;
, this way there is no if (Succ->isBoundaryNode())
special casing in the loop below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I simplified a bit more the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, I think the last remaining thing would be to fix the assertion that now triggers in the RegionEndEdges
mutator for some benchmarks, and add a small reproducer test for it.
On top of that, please look at the Nit comments and address up to taste :) Generally, I'd be happy if we can get rid of the if (Succ->isBoundaryNode())
special casing.
0f3a798
to
7c6d248
Compare
Hi @martien-de-jong and @gbossu, I addressed all the comments and suggestions. Benchmarks are running again with no problems. I you can take a look, it would be nice. |
7c6d248
to
038a23a
Compare
This change is necessary to run correctly mutations after bundling.
038a23a
to
50d5ed6
Compare
Here the most recent results:
|
while (Bottom.conflict(Top, Depth)) { | ||
Bottom.advance(); | ||
NopCounter++; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: That looks a lot like AIEPostRASchedStrategy::handleRegionConflicts
, maybe there's code we can share.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @gbossu, I think it could be a bit complicated, because handleRegionConflicts
uses two hazard recognizers, while we use just one, plus scoreboard comparison. I am afraid that a refactor could create a more confusing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is different, but ultimately the goal is the same, we want to insert NOPs until there is no resource hazard. We can do that in a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think it's great work!
Nit: I'd be even happier if we could remove the if (Succ->isBoundaryNode())
special case in getCyclesToRespectTiming()
, maybe by extending the map of depths, maybe by using a conditional assignment instead.
50d5ed6
to
e42ec66
Compare
Thank you for this suggestion, I addressed this special case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
continue; | ||
const MachineInstr *PostBoundaryMI = Succ->getInstr(); | ||
|
||
int PostBondOrExitDist = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit PostBoundOrExitDist
while (Bottom.conflict(Top, Depth)) { | ||
Bottom.advance(); | ||
NopCounter++; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is different, but ultimately the goal is the same, we want to insert NOPs until there is no resource hazard. We can do that in a follow-up PR.
e42ec66
to
ab33106
Compare
Implement loop/epilog analysis to reduce pessimism related to the propagation of loop latencies and resource conflicts to the epilogue. This commit adds: * Latency analysis using a DDG. * Conflict analysis using two scoreboards. * Code refactoring. * Test additions/updates.
Now, this mutation can be applied to already bundled instructions without misclassification. It means that it is safe to run this mutation with Epilogue scheduling, for example.
If cast is necessary, MaxLatencyFinder will do it anyway.
ab33106
to
831ac27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Implement loop/epilogue analysis to reduce pessimism related to the propagation of loop latencies and resource conflicts to epilogue.
This PR adds: