[LV] Fix FindLastIV reduction for epilogue vectorization. #120395

Mel-Chen · 2024-12-18T09:49:43Z

Following 0e528ac, this patch adjusts the resume value of VPReductionPHIRecipe for FindLastIV reductions. Replacing the resume value with:

ResumeValue = ResumeValue == StartValue ? SentinelValue : ResumeValue;

This addressed the correctness issue when the start value might not be less than the minimum value of a monotonically increasing induction variable.

Thanks Florian Hahn for the help.

llvmbot · 2024-12-18T09:50:20Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: Mel Chen (Mel-Chen)

Changes

Following 0e528ac, this patch uses the start value of VPReductionPHIRecipe to fix the FindLastIV reduction result instead of the start value of RecurrenceDescriptor, addressing the issue where the FindLastIV idiom did not correctly use the resume value during epilogue vectorization.

Thanks Florian Hahn for the help.

Full diff: https://github.com/llvm/llvm-project/pull/120395.diff

5 Files Affected:

(modified) llvm/include/llvm/Transforms/Utils/LoopUtils.h (+6-4)
(modified) llvm/lib/Transforms/Utils/LoopUtils.cpp (+5-5)
(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (-9)
(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+6-7)
(modified) llvm/test/Transforms/LoopVectorize/epilog-iv-select-cmp.ll (+12-16)

diff --git a/llvm/include/llvm/Transforms/Utils/LoopUtils.h b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
index b4cd52fef70fd2..4e5ec962e606d0 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -420,15 +420,17 @@ Value *createAnyOfReduction(IRBuilderBase &B, Value *Src,
                             PHINode *OrigPhi);
 
 /// Create a reduction of the given vector \p Src for a reduction of the
-/// kind RecurKind::IFindLastIV or RecurKind::FFindLastIV. The reduction
-/// operation is described by \p Desc.
-Value *createFindLastIVReduction(IRBuilderBase &B, Value *Src,
+/// kind RecurKind::IFindLastIV or RecurKind::FFindLastIV. The scalar \p
+/// StartVal is the incoming value of reduction phi from outside the loop. The
+/// reduction operation is described by \p Desc.
+Value *createFindLastIVReduction(IRBuilderBase &B, Value *Src, Value *StartVal,
                                  const RecurrenceDescriptor &Desc);
 
 /// Create a generic reduction using a recurrence descriptor \p Desc
 /// Fast-math-flags are propagated using the RecurrenceDescriptor.
 Value *createReduction(IRBuilderBase &B, const RecurrenceDescriptor &Desc,
-                       Value *Src, PHINode *OrigPhi = nullptr);
+                       Value *Src, Value *StartVal = nullptr,
+                       PHINode *OrigPhi = nullptr);
 
 /// Create an ordered reduction intrinsic using the given recurrence
 /// descriptor \p Desc.
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 45915c10107b2e..91b31c793b0f53 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -1209,11 +1209,12 @@ Value *llvm::createAnyOfReduction(IRBuilderBase &Builder, Value *Src,
 }
 
 Value *llvm::createFindLastIVReduction(IRBuilderBase &Builder, Value *Src,
+                                       Value *StartVal,
                                        const RecurrenceDescriptor &Desc) {
   assert(RecurrenceDescriptor::isFindLastIVRecurrenceKind(
              Desc.getRecurrenceKind()) &&
          "Unexpected reduction kind");
-  Value *StartVal = Desc.getRecurrenceStartValue();
+  assert(StartVal && "Null start value");
   Value *Sentinel = Desc.getSentinelValue();
   Value *MaxRdx = Src->getType()->isVectorTy()
                       ? Builder.CreateIntMaxReduce(Src, true)
@@ -1320,9 +1321,8 @@ Value *llvm::createSimpleReduction(VectorBuilder &VBuilder, Value *Src,
   return VBuilder.createSimpleReduction(Id, SrcTy, Ops);
 }
 
-Value *llvm::createReduction(IRBuilderBase &B,
-                             const RecurrenceDescriptor &Desc, Value *Src,
-                             PHINode *OrigPhi) {
+Value *llvm::createReduction(IRBuilderBase &B, const RecurrenceDescriptor &Desc,
+                             Value *Src, Value *StartVal, PHINode *OrigPhi) {
   // TODO: Support in-order reductions based on the recurrence descriptor.
   // All ops in the reduction inherit fast-math-flags from the recurrence
   // descriptor.
@@ -1333,7 +1333,7 @@ Value *llvm::createReduction(IRBuilderBase &B,
   if (RecurrenceDescriptor::isAnyOfRecurrenceKind(RK))
     return createAnyOfReduction(B, Src, Desc, OrigPhi);
   if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
-    return createFindLastIVReduction(B, Src, Desc);
+    return createFindLastIVReduction(B, Src, StartVal, Desc);
 
   return createSimpleReduction(B, Src, RK);
 }
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 749336939b7109..0a87ce8ee3d4bd 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9706,15 +9706,6 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
       // Convert the reduction phi to operate on bools.
       PhiR->setOperand(0, Plan->getOrAddLiveIn(ConstantInt::getFalse(
                               OrigLoop->getHeader()->getContext())));
-      continue;
-    }
-
-    if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(
-            RdxDesc.getRecurrenceKind())) {
-      // Adjust the start value for FindLastIV recurrences to use the sentinel
-      // value after generating the ResumePhi recipe, which uses the original
-      // start value.
-      PhiR->setOperand(0, Plan->getOrAddLiveIn(RdxDesc.getSentinelValue()));
     }
   }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 7f8c560270bc0c..6f59df225e75ca 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -581,8 +581,9 @@ Value *VPInstruction::generate(VPTransformState &State) {
          RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
          RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK)) &&
         !PhiR->isInLoop()) {
+      Value *StartVal = PhiR->getStartValue()->getLiveInIRValue();
       ReducedPartRdx =
-          createReduction(Builder, RdxDesc, ReducedPartRdx, OrigPhi);
+          createReduction(Builder, RdxDesc, ReducedPartRdx, StartVal, OrigPhi);
       // If the reduction can be performed in a smaller type, we need to extend
       // the reduction to the wider type before we branch to the original loop.
       if (PhiTy != RdxDesc.getRecurrenceType())
@@ -3405,15 +3406,13 @@ void VPReductionPHIRecipe::execute(VPTransformState &State) {
     }
   } else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK)) {
     // [I|F]FindLastIV will use a sentinel value to initialize the reduction
-    // phi or the resume value from the main vector loop when vectorizing the
-    // epilogue loop. In the exit block, ComputeReductionResult will generate
-    // checks to verify if the reduction result is the sentinel value. If the
-    // result is the sentinel value, it will be corrected back to the start
-    // value.
+    // phi. In the exit block, ComputeReductionResult will generate checks to
+    // verify if the reduction result is the sentinel value. If the result is
+    // the sentinel value, it will be corrected back to the start value.
     // TODO: The sentinel value is not always necessary. When the start value is
     // a constant, and smaller than the start value of the induction variable,
     // the start value can be directly used to initialize the reduction phi.
-    Iden = StartV;
+    StartV = Iden = RdxDesc.getSentinelValue();
     if (!ScalarPHI) {
       IRBuilderBase::InsertPointGuard IPBuilder(Builder);
       Builder.SetInsertPoint(VectorPH->getTerminator());
diff --git a/llvm/test/Transforms/LoopVectorize/epilog-iv-select-cmp.ll b/llvm/test/Transforms/LoopVectorize/epilog-iv-select-cmp.ll
index 052b4a10e9c8d5..23ab84e46e1ff6 100644
--- a/llvm/test/Transforms/LoopVectorize/epilog-iv-select-cmp.ll
+++ b/llvm/test/Transforms/LoopVectorize/epilog-iv-select-cmp.ll
@@ -46,13 +46,11 @@ define i64 @select_icmp_const(ptr %a, i64 %n) {
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[BC_RESUME_VAL]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i64> [[DOTSPLAT]], <i64 0, i64 1, i64 2, i64 3>
-; CHECK-NEXT:    [[DOTSPLATINSERT8:%.*]] = insertelement <4 x i64> poison, i64 [[BC_MERGE_RDX]], i64 0
-; CHECK-NEXT:    [[DOTSPLAT9:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT8]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    br label %[[VEC_EPILOG_VECTOR_BODY:.*]]
 ; CHECK:       [[VEC_EPILOG_VECTOR_BODY]]:
 ; CHECK-NEXT:    [[INDEX4:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT9:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[VEC_IND5:%.*]] = phi <4 x i64> [ [[INDUCTION]], %[[VEC_EPILOG_PH]] ], [ [[VEC_IND_NEXT6:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
-; CHECK-NEXT:    [[VEC_PHI7:%.*]] = phi <4 x i64> [ [[DOTSPLAT9]], %[[VEC_EPILOG_PH]] ], [ [[TMP11:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
+; CHECK-NEXT:    [[VEC_PHI7:%.*]] = phi <4 x i64> [ splat (i64 -9223372036854775808), %[[VEC_EPILOG_PH]] ], [ [[TMP11:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[TMP7:%.*]] = add i64 [[INDEX4]], 0
 ; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i64, ptr [[A]], i64 [[TMP7]]
 ; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i64, ptr [[TMP8]], i32 0
@@ -66,16 +64,16 @@ define i64 @select_icmp_const(ptr %a, i64 %n) {
 ; CHECK:       [[VEC_EPILOG_MIDDLE_BLOCK]]:
 ; CHECK-NEXT:    [[TMP13:%.*]] = call i64 @llvm.vector.reduce.smax.v4i64(<4 x i64> [[TMP11]])
 ; CHECK-NEXT:    [[RDX_SELECT_CMP10:%.*]] = icmp ne i64 [[TMP13]], -9223372036854775808
-; CHECK-NEXT:    [[RDX_SELECT11:%.*]] = select i1 [[RDX_SELECT_CMP10]], i64 [[TMP13]], i64 3
+; CHECK-NEXT:    [[RDX_SELECT11:%.*]] = select i1 [[RDX_SELECT_CMP10]], i64 [[TMP13]], i64 [[BC_MERGE_RDX]]
 ; CHECK-NEXT:    [[CMP_N12:%.*]] = icmp eq i64 [[N]], [[N_VEC3]]
 ; CHECK-NEXT:    br i1 [[CMP_N12]], label %[[EXIT]], label %[[VEC_EPILOG_SCALAR_PH]]
 ; CHECK:       [[VEC_EPILOG_SCALAR_PH]]:
-; CHECK-NEXT:    [[BC_RESUME_VAL15:%.*]] = phi i64 [ [[N_VEC3]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 0, %[[ITER_CHECK]] ], [ [[N_VEC]], %[[VEC_EPILOG_ITER_CHECK]] ]
-; CHECK-NEXT:    [[BC_MERGE_RDX16:%.*]] = phi i64 [ [[RDX_SELECT11]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 3, %[[ITER_CHECK]] ], [ [[RDX_SELECT]], %[[VEC_EPILOG_ITER_CHECK]] ]
+; CHECK-NEXT:    [[BC_RESUME_VAL13:%.*]] = phi i64 [ [[N_VEC3]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 0, %[[ITER_CHECK]] ], [ [[N_VEC]], %[[VEC_EPILOG_ITER_CHECK]] ]
+; CHECK-NEXT:    [[BC_MERGE_RDX14:%.*]] = phi i64 [ [[RDX_SELECT11]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 3, %[[ITER_CHECK]] ], [ [[RDX_SELECT]], %[[VEC_EPILOG_ITER_CHECK]] ]
 ; CHECK-NEXT:    br label %[[LOOP:.*]]
 ; CHECK:       [[LOOP]]:
-; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL15]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
-; CHECK-NEXT:    [[RDX:%.*]] = phi i64 [ [[BC_MERGE_RDX16]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[SEL:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL13]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[RDX:%.*]] = phi i64 [ [[BC_MERGE_RDX14]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[SEL:%.*]], %[[LOOP]] ]
 ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i64, ptr [[A]], i64 [[IV]]
 ; CHECK-NEXT:    [[L:%.*]] = load i64, ptr [[GEP]], align 8
 ; CHECK-NEXT:    [[C:%.*]] = icmp eq i64 [[L]], 3
@@ -150,13 +148,11 @@ define i64 @select_fcmp_const_fast(ptr %a, i64 %n) {
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[BC_RESUME_VAL]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i64> [[DOTSPLAT]], <i64 0, i64 1, i64 2, i64 3>
-; CHECK-NEXT:    [[DOTSPLATINSERT8:%.*]] = insertelement <4 x i64> poison, i64 [[BC_MERGE_RDX]], i64 0
-; CHECK-NEXT:    [[DOTSPLAT9:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT8]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    br label %[[VEC_EPILOG_VECTOR_BODY:.*]]
 ; CHECK:       [[VEC_EPILOG_VECTOR_BODY]]:
 ; CHECK-NEXT:    [[INDEX4:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT9:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[VEC_IND5:%.*]] = phi <4 x i64> [ [[INDUCTION]], %[[VEC_EPILOG_PH]] ], [ [[VEC_IND_NEXT6:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
-; CHECK-NEXT:    [[VEC_PHI7:%.*]] = phi <4 x i64> [ [[DOTSPLAT9]], %[[VEC_EPILOG_PH]] ], [ [[TMP11:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
+; CHECK-NEXT:    [[VEC_PHI7:%.*]] = phi <4 x i64> [ splat (i64 -9223372036854775808), %[[VEC_EPILOG_PH]] ], [ [[TMP11:%.*]], %[[VEC_EPILOG_VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[TMP7:%.*]] = add i64 [[INDEX4]], 0
 ; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[TMP7]]
 ; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds float, ptr [[TMP8]], i32 0
@@ -170,16 +166,16 @@ define i64 @select_fcmp_const_fast(ptr %a, i64 %n) {
 ; CHECK:       [[VEC_EPILOG_MIDDLE_BLOCK]]:
 ; CHECK-NEXT:    [[TMP13:%.*]] = call i64 @llvm.vector.reduce.smax.v4i64(<4 x i64> [[TMP11]])
 ; CHECK-NEXT:    [[RDX_SELECT_CMP10:%.*]] = icmp ne i64 [[TMP13]], -9223372036854775808
-; CHECK-NEXT:    [[RDX_SELECT11:%.*]] = select i1 [[RDX_SELECT_CMP10]], i64 [[TMP13]], i64 2
+; CHECK-NEXT:    [[RDX_SELECT11:%.*]] = select i1 [[RDX_SELECT_CMP10]], i64 [[TMP13]], i64 [[BC_MERGE_RDX]]
 ; CHECK-NEXT:    [[CMP_N12:%.*]] = icmp eq i64 [[N]], [[N_VEC3]]
 ; CHECK-NEXT:    br i1 [[CMP_N12]], label %[[EXIT]], label %[[VEC_EPILOG_SCALAR_PH]]
 ; CHECK:       [[VEC_EPILOG_SCALAR_PH]]:
-; CHECK-NEXT:    [[BC_RESUME_VAL15:%.*]] = phi i64 [ [[N_VEC3]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 0, %[[ITER_CHECK]] ], [ [[N_VEC]], %[[VEC_EPILOG_ITER_CHECK]] ]
-; CHECK-NEXT:    [[BC_MERGE_RDX16:%.*]] = phi i64 [ [[RDX_SELECT11]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 2, %[[ITER_CHECK]] ], [ [[RDX_SELECT]], %[[VEC_EPILOG_ITER_CHECK]] ]
+; CHECK-NEXT:    [[BC_RESUME_VAL13:%.*]] = phi i64 [ [[N_VEC3]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 0, %[[ITER_CHECK]] ], [ [[N_VEC]], %[[VEC_EPILOG_ITER_CHECK]] ]
+; CHECK-NEXT:    [[BC_MERGE_RDX14:%.*]] = phi i64 [ [[RDX_SELECT11]], %[[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 2, %[[ITER_CHECK]] ], [ [[RDX_SELECT]], %[[VEC_EPILOG_ITER_CHECK]] ]
 ; CHECK-NEXT:    br label %[[LOOP:.*]]
 ; CHECK:       [[LOOP]]:
-; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL15]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
-; CHECK-NEXT:    [[RDX:%.*]] = phi i64 [ [[BC_MERGE_RDX16]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[SEL:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL13]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[RDX:%.*]] = phi i64 [ [[BC_MERGE_RDX14]], %[[VEC_EPILOG_SCALAR_PH]] ], [ [[SEL:%.*]], %[[LOOP]] ]
 ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[IV]]
 ; CHECK-NEXT:    [[L:%.*]] = load float, ptr [[GEP]], align 4
 ; CHECK-NEXT:    [[C:%.*]] = fcmp fast ueq float [[L]], 3.000000e+00

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

fhahn · 2024-12-18T10:06:19Z

It would be good to have more runtime test coverage for the FindLastIV logic, could you add some end-to-end test to llvm-test-suite, possibly similar to https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/UnitTests/Vectorizer/index-select.cpp

Mel-Chen · 2024-12-20T10:45:12Z

It would be good to have more runtime test coverage for the FindLastIV logic, could you add some end-to-end test to llvm-test-suite, possibly similar to https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/UnitTests/Vectorizer/index-select.cpp

Sure, I should. llvm/llvm-test-suite#193

Mel-Chen · 2025-01-10T07:43:33Z

The llvm test-suite result of current implementation:

Checking findlast_true_update
Checking findlast_false_update
Checking findlast_start_TC
Miscompare

The fail can be fixed after this patch:

Checking findlast_true_update
Checking findlast_false_update
Checking findlast_start_TC
Checking findlast_inc_2
Checking findlast_start_decreasing_induction
Checking findlast_iv_start_3
Checking findlast_start_3_iv_start_3
Checking findlast_start_2_iv_start_3
Checking findlast_start_4_iv_start_3

The test config is -mcpu=sifive-x280 -O3 -mllvm -epilogue-vectorization-force-VF=2.

fhahn

LGTM, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

github-actions · 2025-01-13T08:44:04Z

✅ With the latest revision this PR passed the C/C++ code formatter.

github-actions · 2025-01-13T09:01:35Z

✅ With the latest revision this PR passed the undef deprecator.

This reverts commit be9d783.

…ion."" This reverts commit 895a134.

This reverts commit 7e233ba.

Use ResumeV = ResumeV == StartValue ? Sentinel : ResumeV to adjust the resume value.

Avoid warning on release build Co-authored-by: Florian Hahn <[email protected]>

Following 0e528ac, this patch adjusts the resume value of VPReductionPHIRecipe for FindLastIV reductions. Replacing the resume value with: ResumeValue = ResumeValue == StartValue ? SentinelValue : ResumeValue; This addressed the correctness issue when the start value might not be less than the minimum value of a monotonically increasing induction variable. Thanks Florian Hahn for the help. --------- Co-authored-by: Florian Hahn <[email protected]>

This patch adds runtime test case for vectorization of FindLastIV reduction idiom: int32_t Rdx = -1; for (int32_t I = 0; I < TC; I++) { Rdx = A[I] > B[I] ? I : Rdx; } return Rdx; Improving test coverage for llvm/llvm-project#67812 and llvm/llvm-project#120395.

Mel-Chen requested review from fhahn, alexey-bataev, ayalz and david-arm December 18, 2024 09:49

llvmbot added vectorizers llvm:transforms labels Dec 18, 2024

Mel-Chen mentioned this pull request Dec 18, 2024

[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable #67812

Merged

fhahn reviewed Dec 18, 2024

View reviewed changes

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp Outdated Show resolved Hide resolved

Mel-Chen mentioned this pull request Dec 20, 2024

[SingleSource/Vectorizer] Add unit tests for FindLastIV pattern. llvm/llvm-test-suite#193

Merged

Mel-Chen requested a review from fhahn December 20, 2024 10:56

Mel-Chen force-pushed the fix-epilog-vec branch from be9d783 to 7185ef2 Compare January 10, 2025 07:34

fhahn approved these changes Jan 10, 2025

View reviewed changes

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp Show resolved Hide resolved

Mel-Chen requested review from DeinAlptraum, rupprecht, keith, grypp, antiagainst, kuhar, ftynse, nicolasvasilache, dcaballe, hanhanW, banach-space, rengolin and matthias-springer as code owners January 13, 2025 08:56

Mel-Chen force-pushed the fix-epilog-vec branch from 41a42d6 to 4d44a1f Compare January 13, 2025 09:11

Mel-Chen removed request for a team, nikic, keith, antiagainst, rupprecht, JDevlieghere, rengolin, ftynse, matthias-springer, banach-space, grypp, kuhar, hanhanW, nicolasvasilache, dcaballe and DeinAlptraum January 13, 2025 09:12

Mel-Chen added 5 commits January 13, 2025 01:13

[LV] Fix FindLastIV reduction for epilogue vectorization.

840b295

Revert "[LV] Fix FindLastIV reduction for epilogue vectorization."

98ab179

This reverts commit be9d783.

Revert "Revert "[LV] Fix FindLastIV reduction for epilogue vectorizat…

b3b436b

…ion."" This reverts commit 895a134.

Revert "[LV] Fix FindLastIV reduction for epilogue vectorization."

1cf21f0

This reverts commit 7e233ba.

Adjust resume value of FindLastIV

c6cf63f

Use ResumeV = ResumeV == StartValue ? Sentinel : ResumeV to adjust the resume value.

Mel-Chen force-pushed the fix-epilog-vec branch from 4d44a1f to c6cf63f Compare January 13, 2025 09:14

Mel-Chen and others added 2 commits January 13, 2025 17:14

Update llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

3e5d819

Avoid warning on release build Co-authored-by: Florian Hahn <[email protected]>

Format

ca464d0

Mel-Chen merged commit 3397950 into llvm:main Jan 13, 2025
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LV] Fix FindLastIV reduction for epilogue vectorization. #120395

[LV] Fix FindLastIV reduction for epilogue vectorization. #120395

Mel-Chen commented Dec 18, 2024 •

edited

Loading

llvmbot commented Dec 18, 2024 •

edited

Loading

fhahn commented Dec 18, 2024

Mel-Chen commented Dec 20, 2024

Mel-Chen commented Jan 10, 2025

fhahn left a comment

github-actions bot commented Jan 13, 2025 •

edited

Loading

github-actions bot commented Jan 13, 2025 •

edited

Loading

[LV] Fix FindLastIV reduction for epilogue vectorization. #120395

[LV] Fix FindLastIV reduction for epilogue vectorization. #120395

Conversation

Mel-Chen commented Dec 18, 2024 • edited Loading

llvmbot commented Dec 18, 2024 • edited Loading

fhahn commented Dec 18, 2024

Mel-Chen commented Dec 20, 2024

Mel-Chen commented Jan 10, 2025

fhahn left a comment

Choose a reason for hiding this comment

github-actions bot commented Jan 13, 2025 • edited Loading

github-actions bot commented Jan 13, 2025 • edited Loading

Mel-Chen commented Dec 18, 2024 •

edited

Loading

llvmbot commented Dec 18, 2024 •

edited

Loading

github-actions bot commented Jan 13, 2025 •

edited

Loading

github-actions bot commented Jan 13, 2025 •

edited

Loading