-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable #67812
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-vectorizers ChangesConsider the following loop:
We can vectorize this loop if This patch added new RecurKind enums - IFindLastIV and FFindLastIV. Patch is 230.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67812.diff 11 Files Affected:
diff --git a/llvm/include/llvm/Analysis/IVDescriptors.h b/llvm/include/llvm/Analysis/IVDescriptors.h
index 0ee3f4fed8c976d..622060df0d8cd6c 100644
--- a/llvm/include/llvm/Analysis/IVDescriptors.h
+++ b/llvm/include/llvm/Analysis/IVDescriptors.h
@@ -52,9 +52,16 @@ enum class RecurKind {
FMulAdd, ///< Sum of float products with llvm.fmuladd(a * b + sum).
IAnyOf, ///< Any_of reduction with select(icmp(),x,y) where one of (x,y) is
///< loop invariant, and both x and y are integer type.
- FAnyOf ///< Any_of reduction with select(fcmp(),x,y) where one of (x,y) is
+ FAnyOf, ///< Any_of reduction with select(fcmp(),x,y) where one of (x,y) is
///< loop invariant, and both x and y are integer type.
- // TODO: Any_of reduction need not be restricted to integer type only.
+ IFindLastIV, ///< FindLast reduction with select(icmp(),x,y) where one of
+ ///< (x,y) is increasing loop induction PHI, and both x and y are
+ ///< integer type.
+ FFindLastIV ///< FindLast reduction with select(fcmp(),x,y) where one of (x,y)
+ ///< is increasing loop induction PHI, and both x and y are
+ ///< integer type.
+ // TODO: Any_of and FindLast reduction need not be restricted to integer type
+ // only.
};
/// The RecurrenceDescriptor is used to identify recurrences variables in a
@@ -126,7 +133,7 @@ class RecurrenceDescriptor {
/// the returned struct.
static InstDesc isRecurrenceInstr(Loop *L, PHINode *Phi, Instruction *I,
RecurKind Kind, InstDesc &Prev,
- FastMathFlags FuncFMF);
+ FastMathFlags FuncFMF, ScalarEvolution *SE);
/// Returns true if instruction I has multiple uses in Insts
static bool hasMultipleUsesOf(Instruction *I,
@@ -153,6 +160,16 @@ class RecurrenceDescriptor {
static InstDesc isAnyOfPattern(Loop *Loop, PHINode *OrigPhi, Instruction *I,
InstDesc &Prev);
+ /// Returns a struct describing whether the instruction is either a
+ /// Select(ICmp(A, B), X, Y), or
+ /// Select(FCmp(A, B), X, Y)
+ /// where one of (X, Y) is an increasing loop induction variable, and the
+ /// other is a PHI value.
+ // TODO: FindLast does not need be restricted to increasing loop induction
+ // variables.
+ static InstDesc isFindLastIVPattern(Loop *Loop, PHINode *OrigPhi,
+ Instruction *I, ScalarEvolution *SE);
+
/// Returns a struct describing if the instruction is a
/// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.
static InstDesc isConditionalRdxPattern(RecurKind Kind, Instruction *I);
@@ -241,6 +258,12 @@ class RecurrenceDescriptor {
return Kind == RecurKind::IAnyOf || Kind == RecurKind::FAnyOf;
}
+ /// Returns true if the recurrence kind is of the form
+ /// select(cmp(),x,y) where one of (x,y) is increasing loop induction.
+ static bool isFindLastIVRecurrenceKind(RecurKind Kind) {
+ return Kind == RecurKind::IFindLastIV || Kind == RecurKind::FFindLastIV;
+ }
+
/// Returns the type of the recurrence. This type can be narrower than the
/// actual type of the Phi if the recurrence has been type-promoted.
Type *getRecurrenceType() const { return RecurrenceType; }
diff --git a/llvm/include/llvm/Transforms/Utils/LoopUtils.h b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
index 0d99249be413762..b83cfbe07ec59d8 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -372,6 +372,12 @@ CmpInst::Predicate getMinMaxReductionPredicate(RecurKind RK);
Value *createAnyOfOp(IRBuilderBase &Builder, Value *StartVal, RecurKind RK,
Value *Left, Value *Right);
+/// See RecurrenceDescriptor::isFindLastIVPattern for a description of the
+/// pattern we are trying to match. In this pattern, since the selected set of
+/// values forms an increasing sequence, we are selecting the maximum value from
+/// \p Left and \p Right.
+Value *createFindLastIVOp(IRBuilderBase &Builder, Value *Left, Value *Right);
+
/// Returns a Min/Max operation corresponding to MinMaxRecurrenceKind.
/// The Builder's fast-math-flags must be set to propagate the expected values.
Value *createMinMaxOp(IRBuilderBase &Builder, RecurKind RK, Value *Left,
@@ -402,6 +408,12 @@ Value *createAnyOfTargetReduction(IRBuilderBase &B, Value *Src,
const RecurrenceDescriptor &Desc,
PHINode *OrigPhi);
+/// Create a target reduction of the given vector \p Src for a reduction of the
+/// kind RecurKind::IFindLastIV or RecurKind::FFindLastIV. The reduction
+/// operation is described by \p Desc.
+Value *createFindLastIVTargetReduction(IRBuilderBase &B, Value *Src,
+ const RecurrenceDescriptor &Desc);
+
/// Create a generic target reduction using a recurrence descriptor \p Desc
/// The target is queried to determine if intrinsics or shuffle sequences are
/// required to implement the reduction.
@@ -415,6 +427,15 @@ Value *createOrderedReduction(IRBuilderBase &B,
const RecurrenceDescriptor &Desc, Value *Src,
Value *Start);
+/// Returns a set of cmp and select instructions as shown below:
+/// Select(Cmp(NE, Rdx, Iden), Rdx, InitVal)
+/// where \p Rdx is a scalar value generated by target reduction, Iden is the
+/// sentinel value of the recurrence descriptor \p Desc, and InitVal is the
+/// start value of the recurrence descriptor \p Desc.
+Value *createSentinelValueHandling(IRBuilderBase &Builder,
+ const RecurrenceDescriptor &Desc,
+ Value *Rdx);
+
/// Get the intersection (logical and) of all of the potential IR flags
/// of each scalar operation (VL) that will be converted into a vector (I).
/// If OpValue is non-null, we only consider operations similar to OpValue
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index 46629e381bc3665..19f594d0453c574 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -54,6 +54,8 @@ bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) {
case RecurKind::UMin:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
return true;
}
return false;
@@ -375,7 +377,7 @@ bool RecurrenceDescriptor::AddReductionVar(
// type-promoted).
if (Cur != Start) {
ReduxDesc =
- isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF);
+ isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF, SE);
ExactFPMathInst = ExactFPMathInst == nullptr
? ReduxDesc.getExactFPMathInst()
: ExactFPMathInst;
@@ -662,6 +664,96 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi,
: RecurKind::FAnyOf);
}
+// We are looking for loops that do something like this:
+// int r = 0;
+// for (int i = 0; i < n; i++) {
+// if (src[i] > 3)
+// r = i;
+// }
+// The reduction value (r) is derived from either the values of an increasing
+// induction variable (i) sequence, or from the start value (0).
+// The LLVM IR generated for such loops would be as follows:
+// for.body:
+// %r = phi i32 [ %spec.select, %for.body ], [ 0, %entry ]
+// %i = phi i32 [ %inc, %for.body ], [ 0, %entry ]
+// ...
+// %cmp = icmp sgt i32 %5, 3
+// %spec.select = select i1 %cmp, i32 %i, i32 %r
+// %inc = add nsw i32 %i, 1
+// ...
+// Since 'i' is an increasing induction variable, the reduction value after the
+// loop will be the maximum value of 'i' that the condition (src[i] > 3) is
+// satisfied, or the start value (0 in the example above). When the start value
+// of the increasing induction variable 'i' is greater than the minimum value of
+// the data type, we can use the minimum value of the data type as a sentinel
+// value to replace the start value. This allows us to perform a single
+// reduction max operation to obtain the final reduction result.
+// TODO: It is possible to solve the case where the start value is the minimum
+// value of the data type or a non-constant value by using mask and multiple
+// reduction operations.
+RecurrenceDescriptor::InstDesc
+RecurrenceDescriptor::isFindLastIVPattern(Loop *Loop, PHINode *OrigPhi,
+ Instruction *I, ScalarEvolution *SE) {
+ // Only match select with single use cmp condition.
+ // TODO: Only handle single use for now.
+ CmpInst::Predicate Pred;
+ if (!match(I, m_Select(m_OneUse(m_Cmp(Pred, m_Value(), m_Value())), m_Value(),
+ m_Value())))
+ return InstDesc(false, I);
+
+ SelectInst *SI = cast<SelectInst>(I);
+ Value *NonRdxPhi = nullptr;
+
+ if (OrigPhi == dyn_cast<PHINode>(SI->getTrueValue()))
+ NonRdxPhi = SI->getFalseValue();
+ else if (OrigPhi == dyn_cast<PHINode>(SI->getFalseValue()))
+ NonRdxPhi = SI->getTrueValue();
+ else
+ return InstDesc(false, I);
+
+ auto IsIncreasingLoopInduction = [&SE, &Loop](Value *V) {
+ if (!SE)
+ return false;
+
+ Type *Ty = V->getType();
+ if (!SE->isSCEVable(Ty))
+ return false;
+
+ auto *AR = dyn_cast<SCEVAddRecExpr>(SE->getSCEV(V));
+ if (!AR)
+ return false;
+
+ const SCEV *Step = AR->getStepRecurrence(*SE);
+ if (!SE->isKnownPositive(Step))
+ return false;
+
+ const ConstantRange IVRange = SE->getSignedRange(AR);
+ unsigned NumBits = Ty->getIntegerBitWidth();
+ // Keep the minmum value of the recurrence type as the sentinel value.
+ // The maximum acceptable range for the increasing induction variable,
+ // called the valid range, will be defined as
+ // [<sentinel value> + 1, SignedMin(<recurrence type>))
+ // TODO: This range restriction can be lifted by adding an additional
+ // virtual OR reduction.
+ const APInt Sentinel = APInt::getSignedMinValue(NumBits);
+ const ConstantRange ValidRange = ConstantRange::getNonEmpty(
+ Sentinel + 1, APInt::getSignedMinValue(NumBits));
+ LLVM_DEBUG(dbgs() << "LV: FindLastIV valid range is " << ValidRange
+ << ", and the signed range of " << *AR << " is "
+ << IVRange << "\n");
+ return ValidRange.contains(IVRange);
+ };
+
+ // We are looking for selects of the form:
+ // select(cmp(), phi, loop_induction) or
+ // select(cmp(), loop_induction, phi)
+ if (!IsIncreasingLoopInduction(NonRdxPhi))
+ return InstDesc(false, I);
+
+ return InstDesc(I, isa<ICmpInst>(I->getOperand(0)) ? RecurKind::IFindLastIV
+ : RecurKind::FFindLastIV);
+}
+
RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isMinMaxPattern(Instruction *I, RecurKind Kind,
const InstDesc &Prev) {
@@ -765,10 +857,9 @@ RecurrenceDescriptor::isConditionalRdxPattern(RecurKind Kind, Instruction *I) {
return InstDesc(true, SI);
}
-RecurrenceDescriptor::InstDesc
-RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,
- Instruction *I, RecurKind Kind,
- InstDesc &Prev, FastMathFlags FuncFMF) {
+RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
+ Loop *L, PHINode *OrigPhi, Instruction *I, RecurKind Kind, InstDesc &Prev,
+ FastMathFlags FuncFMF, ScalarEvolution *SE) {
assert(Prev.getRecKind() == RecurKind::None || Prev.getRecKind() == Kind);
switch (I->getOpcode()) {
default:
@@ -798,6 +889,8 @@ RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,
if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul ||
Kind == RecurKind::Add || Kind == RecurKind::Mul)
return isConditionalRdxPattern(Kind, I);
+ if (isFindLastIVRecurrenceKind(Kind))
+ return isFindLastIVPattern(L, OrigPhi, I, SE);
[[fallthrough]];
case Instruction::FCmp:
case Instruction::ICmp:
@@ -902,6 +995,11 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop,
<< *Phi << "\n");
return true;
}
+ if (AddReductionVar(Phi, RecurKind::IFindLastIV, TheLoop, FMF, RedDes, DB, AC,
+ DT, SE)) {
+ LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n");
+ return true;
+ }
if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {
LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
@@ -1091,6 +1189,9 @@ Value *RecurrenceDescriptor::getRecurrenceIdentity(RecurKind K, Type *Tp,
case RecurKind::FAnyOf:
return getRecurrenceStartValue();
break;
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
+ return getRecurrenceIdentity(RecurKind::SMax, Tp, FMF);
default:
llvm_unreachable("Unknown recurrence kind");
}
@@ -1118,12 +1219,14 @@ unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
case RecurKind::UMax:
case RecurKind::UMin:
case RecurKind::IAnyOf:
+ case RecurKind::IFindLastIV:
return Instruction::ICmp;
case RecurKind::FMax:
case RecurKind::FMin:
case RecurKind::FMaximum:
case RecurKind::FMinimum:
case RecurKind::FAnyOf:
+ case RecurKind::FFindLastIV:
return Instruction::FCmp;
default:
llvm_unreachable("Unknown recurrence operation");
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 21affe7bdce406e..192e93a2f9b455d 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -942,6 +942,11 @@ Value *llvm::createAnyOfOp(IRBuilderBase &Builder, Value *StartVal,
return Builder.CreateSelect(Cmp, Left, Right, "rdx.select");
}
+Value *llvm::createFindLastIVOp(IRBuilderBase &Builder, Value *Left,
+ Value *Right) {
+ return createMinMaxOp(Builder, RecurKind::SMax, Left, Right);
+}
+
Value *llvm::createMinMaxOp(IRBuilderBase &Builder, RecurKind RK, Value *Left,
Value *Right) {
Type *Ty = Left->getType();
@@ -1062,6 +1067,14 @@ Value *llvm::createAnyOfTargetReduction(IRBuilderBase &Builder, Value *Src,
return Builder.CreateSelect(Cmp, NewVal, InitVal, "rdx.select");
}
+Value *llvm::createFindLastIVTargetReduction(IRBuilderBase &Builder, Value *Src,
+ const RecurrenceDescriptor &Desc) {
+ assert(RecurrenceDescriptor::isFindLastIVRecurrenceKind(
+ Desc.getRecurrenceKind()) &&
+ "Unexpected reduction kind");
+ return Builder.CreateIntMaxReduce(Src, true);
+}
+
Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder, Value *Src,
RecurKind RdxKind) {
auto *SrcVecEltTy = cast<VectorType>(Src->getType())->getElementType();
@@ -1115,6 +1128,8 @@ Value *llvm::createTargetReduction(IRBuilderBase &B,
RecurKind RK = Desc.getRecurrenceKind();
if (RecurrenceDescriptor::isAnyOfRecurrenceKind(RK))
return createAnyOfTargetReduction(B, Src, Desc, OrigPhi);
+ if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
+ return createFindLastIVTargetReduction(B, Src, Desc);
return createSimpleTargetReduction(B, Src, RK);
}
@@ -1131,6 +1146,16 @@ Value *llvm::createOrderedReduction(IRBuilderBase &B,
return B.CreateFAddReduce(Start, Src);
}
+Value *llvm::createSentinelValueHandling(IRBuilderBase &Builder,
+ const RecurrenceDescriptor &Desc,
+ Value *Rdx) {
+ Value *InitVal = Desc.getRecurrenceStartValue();
+ Value *Iden = Desc.getRecurrenceIdentity(
+ Desc.getRecurrenceKind(), Rdx->getType(), Desc.getFastMathFlags());
+ Value *Cmp = Builder.CreateCmp(CmpInst::ICMP_NE, Rdx, Iden, "rdx.select.cmp");
+ return Builder.CreateSelect(Cmp, Rdx, InitVal, "rdx.select");
+}
+
void llvm::propagateIRFlags(Value *I, ArrayRef<Value *> VL, Value *OpValue,
bool IncludeWrapFlags) {
auto *VecOp = dyn_cast<Instruction>(I);
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index cc17d91d4f43727..35be8deeb542c56 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3901,6 +3901,8 @@ void InnerLoopVectorizer::fixReduction(VPReductionPHIRecipe *PhiR,
else if (RecurrenceDescriptor::isAnyOfRecurrenceKind(RK))
ReducedPartRdx = createAnyOfOp(Builder, ReductionStartValue, RK,
ReducedPartRdx, RdxPart);
+ else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
+ ReducedPartRdx = createFindLastIVOp(Builder, ReducedPartRdx, RdxPart);
else
ReducedPartRdx = createMinMaxOp(Builder, RK, ReducedPartRdx, RdxPart);
}
@@ -3919,6 +3921,10 @@ void InnerLoopVectorizer::fixReduction(VPReductionPHIRecipe *PhiR,
: Builder.CreateZExt(ReducedPartRdx, PhiTy);
}
+ if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
+ ReducedPartRdx =
+ createSentinelValueHandling(Builder, RdxDesc, ReducedPartRdx);
+
PHINode *ResumePhi =
dyn_cast<PHINode>(PhiR->getStartValue()->getUnderlyingValue());
@@ -5822,8 +5828,9 @@ LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
HasReductions &&
any_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {
const RecurrenceDescriptor &RdxDesc = Reduction.second;
- return RecurrenceDescriptor::isAnyOfRecurrenceKind(
- RdxDesc.getRecurrenceKind());
+ RecurKind RK = RdxDesc.getRecurrenceKind();
+ return RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
+ RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK);
});
if (HasSelectCmpReductions) {
LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
@@ -8973,8 +8980,10 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
for (VPReductionPHIRecipe *PhiR : InLoopReductionPhis) {
const RecurrenceDescriptor &RdxDesc = PhiR->getRecurrenceDescriptor();
RecurKind Kind = RdxDesc.getRecurrenceKind();
- assert(!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
- "AnyOf reductions are not allowed for in-loop reductions");
+ assert(
+ (!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
+ !RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind)) &&
+ "AnyOf and FindLast reductions are not allowed for in-loop reductions");
// Collect the chain of "link" recipes for the reduction starting at PhiR.
SetVector<VPRecipeBase *> Worklist;
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 572c4399b8b55cd..3ffba0113cbca03 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -14352,6 +14352,8 @@ class HorizontalReduction {
case RecurKind::FMulAdd:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
case RecurKind::None:
llvm_unreachable("Unexpected reduction kind for repeated scalar.");
}
@@ -14441,6 +14443,8 @@ class HorizontalReduction {
case RecurKind::FMulAdd:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
case RecurKind::None:
llvm_unreachable("Unexpected reduction kind for reused scalars.");
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 2a1213a98095907..86e93f909bf747c 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1589,6 +1589,22 @@ void VPReductionPHIRecipe::execute(VPTransformState &State) {
StartV = Iden =
Builder.CreateVectorSplat(State.VF, StartV, "minmax.ident");
}
+ } ...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, apart from minor nits.
Extend the idea in llvm#67812 to support vectorizion of decreasing IV in select-cmp patterns. llvm#67812 enabled vectorization of the following example: long src[20000] = {4, 5, 2}; long r = 331; for (long i = 0; i < 20000; i++) { if (src[i] > 3) r = i; } return r; This patch extends the above idea to also vectorize: long src[20000] = {4, 5, 2}; long r = 331; for (long i = 20000 - 1; i >= 0; i--) { if (src[i] > 3) r = i; } return r;
✅ With the latest revision this PR passed the C/C++ code formatter. |
Although IFindLastIV can be fixed to FFindLastIV when parsing instructions, could we reject IFindLastIV with FCmpInst and AddReductionVar for FFindLastIV explicitily? |
12801ba
to
ed0e0a1
Compare
This is possible, but it may cause going through the pattern identification process one more time. |
@ayalz @fhahn @david-arm Ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this patch - it looks like a useful addition. I'm still looking through all the tests, but I thought I'd leave some comments I have so far!
@@ -900,6 +993,11 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop, | |||
<< *Phi << "\n"); | |||
return true; | |||
} | |||
if (AddReductionVar(Phi, RecurKind::IFindLastIV, TheLoop, FMF, RedDes, DB, AC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not also need one for FFindLastIV?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. We don't need to call AddReductionVar
again for FFindLastIV.
This is because at the end of function isFindLastIVPattern
, IFindLastIV can be transformed into FFindLastIV if the predicate is an fcmp instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AnyOf recurrence kinds do the same thing, although there is a separate check for FAnyOf
below which wouldn't be reached if IAnyOf
matched it first, and is just redundant work if there wasn't a match.
We would only see that when looking at iv-descriptors debug output, and there's no tests for that. This code at least reports the correct type for an fcmp.
I figure it's fine with just the single case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like #118393 is attempting to improve this issue.
This reverts commit 4bd1df2.
As the following case, the complex semantics of FindLastIV vectorization are not yet supported: ``` for.header: %indvars = phi 0, %indvars.next %rdx.phi = phi 0, %rdx.phi.next br %pred, label %bb0, label %bb1 bb0: ... %select.bb0 = select %cmp0, %rdx.phi, %indvars br label %for.inc bb1: ... %select.bb1 = select %cmp1, %rdx.phi, %indvars br label %for.inc for.inc: ... %rdx.phi.next = phi %select.bb0, %select.bb1 %indvars.next = add nuw nsw i64 %indvars, 1 br %exitcond, label %for.end, label %for.body for.end: ; external use of %rdx.phi.next ``` This patch bails out unsupported idiom during the reduction identification phase, to prevent internal compiler error.
f30b2e9
to
5ebec5d
Compare
Thanks @fhahn @huntergr-arm :) |
It looks like the patch didn't correctly handle epilogue vectorization and caused a mis-compile in SPEC2017 on AArch64. I pushed 0e528ac to fix the issue and hopefully avoid a revert |
Thank you! This is something I didn't notice before. |
This patch adds runtime test case for vectorization of FindLastIV reduction idiom: int32_t Rdx = -1; for (int32_t I = 0; I < TC; I++) { Rdx = A[I] > B[I] ? I : Rdx; } return Rdx; Improving test coverage for llvm/llvm-project#67812 and llvm/llvm-project#120395.
Consider the following loop:
We can vectorize this loop if
i
is an increasing induction variable.The final reduced value will be the maximum of
i
that the conditiona[i] > b[i]
is satisfied, or the start valueinit
.This patch added new RecurKind enums - IFindLastIV and FFindLastIV.