-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AsmPrinter][ELF] Support profile-guided section prefix for jump tables' (read-only) data sections #122215
[AsmPrinter][ELF] Support profile-guided section prefix for jump tables' (read-only) data sections #122215
Conversation
cf1b613
to
8d3a985
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
Co-authored-by: Ellis Hoag <[email protected]>
@llvm/pr-subscribers-backend-x86 Author: Mingming Liu (mingmingl-llvm) Changes#122183 adds a codegen pass to infer machine jump table entry's hotness from the MBB hotness. This is a follow-up PR to produce When this patch is enabled, linker will see {
Patch is 27.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122215.diff 12 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h
index c9a88d7b1c015c..4a7de22b844a65 100644
--- a/llvm/include/llvm/CodeGen/AsmPrinter.h
+++ b/llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -453,6 +453,10 @@ class AsmPrinter : public MachineFunctionPass {
/// function to the current output stream.
virtual void emitJumpTableInfo();
+ virtual void emitJumpTables(ArrayRef<unsigned> JumpTableIndices,
+ MCSection *JumpTableSection, bool JTInDiffSection,
+ const MachineJumpTableInfo &MJTI);
+
/// Emit the specified global variable to the .s file.
virtual void emitGlobalVariable(const GlobalVariable *GV);
@@ -892,10 +896,10 @@ class AsmPrinter : public MachineFunctionPass {
// Internal Implementation Details
//===------------------------------------------------------------------===//
- void emitJumpTableEntry(const MachineJumpTableInfo *MJTI,
+ void emitJumpTableEntry(const MachineJumpTableInfo &MJTI,
const MachineBasicBlock *MBB, unsigned uid) const;
- void emitJumpTableSizesSection(const MachineJumpTableInfo *MJTI,
+ void emitJumpTableSizesSection(const MachineJumpTableInfo &MJTI,
const Function &F) const;
void emitLLVMUsedList(const ConstantArray *InitList);
diff --git a/llvm/include/llvm/CodeGen/CommandFlags.h b/llvm/include/llvm/CodeGen/CommandFlags.h
index d5448d781363d4..000aed782a8057 100644
--- a/llvm/include/llvm/CodeGen/CommandFlags.h
+++ b/llvm/include/llvm/CodeGen/CommandFlags.h
@@ -136,6 +136,8 @@ bool getEmitCallSiteInfo();
bool getEnableMachineFunctionSplitter();
+bool getEnableStaticDataPartitioning();
+
bool getEnableDebugEntryValues();
bool getValueTrackingVariableLocations();
diff --git a/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h b/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
index a2a9e5d499e527..3d48d380fcb245 100644
--- a/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
+++ b/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
@@ -74,6 +74,9 @@ class TargetLoweringObjectFileELF : public TargetLoweringObjectFile {
MCSection *getSectionForJumpTable(const Function &F,
const TargetMachine &TM) const override;
+ MCSection *
+ getSectionForJumpTable(const Function &F, const TargetMachine &TM,
+ const MachineJumpTableEntry *JTE) const override;
MCSection *getSectionForLSDA(const Function &F, const MCSymbol &FnSym,
const TargetMachine &TM) const override;
diff --git a/llvm/include/llvm/Target/TargetLoweringObjectFile.h b/llvm/include/llvm/Target/TargetLoweringObjectFile.h
index 4864ba843f4886..577adc458fcbf1 100644
--- a/llvm/include/llvm/Target/TargetLoweringObjectFile.h
+++ b/llvm/include/llvm/Target/TargetLoweringObjectFile.h
@@ -27,6 +27,7 @@ class Function;
class GlobalObject;
class GlobalValue;
class MachineBasicBlock;
+class MachineJumpTableEntry;
class MachineModuleInfo;
class Mangler;
class MCContext;
@@ -132,6 +133,10 @@ class TargetLoweringObjectFile : public MCObjectFileInfo {
virtual MCSection *getSectionForJumpTable(const Function &F,
const TargetMachine &TM) const;
+ virtual MCSection *
+ getSectionForJumpTable(const Function &F, const TargetMachine &TM,
+ const MachineJumpTableEntry *JTE) const;
+
virtual MCSection *getSectionForLSDA(const Function &, const MCSymbol &,
const TargetMachine &) const {
return LSDASection;
diff --git a/llvm/include/llvm/Target/TargetMachine.h b/llvm/include/llvm/Target/TargetMachine.h
index 9bdb110bd36839..4a54c706c0cb6a 100644
--- a/llvm/include/llvm/Target/TargetMachine.h
+++ b/llvm/include/llvm/Target/TargetMachine.h
@@ -305,6 +305,10 @@ class TargetMachine {
return Options.FunctionSections;
}
+ bool getEnableStaticDataPartitioning() const {
+ return Options.EnableStaticDataPartitioning;
+ }
+
/// Return true if visibility attribute should not be emitted in XCOFF,
/// corresponding to -mignore-xcoff-visibility.
bool getIgnoreXCOFFVisibility() const {
diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index 88f253805ca99c..1ddee265effa73 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -312,6 +312,9 @@ namespace llvm {
/// Enables the MachineFunctionSplitter pass.
unsigned EnableMachineFunctionSplitter : 1;
+ /// Enables the StaticDataSplitter pass.
+ unsigned EnableStaticDataPartitioning : 1;
+
/// Set if the target supports default outlining behaviour.
unsigned SupportsDefaultOutlining : 1;
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index d34fe0e86c7495..4730c33d748832 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -2861,7 +2861,6 @@ void AsmPrinter::emitConstantPool() {
// Print assembly representations of the jump tables used by the current
// function.
void AsmPrinter::emitJumpTableInfo() {
- const DataLayout &DL = MF->getDataLayout();
const MachineJumpTableInfo *MJTI = MF->getJumpTableInfo();
if (!MJTI) return;
if (MJTI->getEntryKind() == MachineJumpTableInfo::EK_Inline) return;
@@ -2876,42 +2875,101 @@ void AsmPrinter::emitJumpTableInfo() {
MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 ||
MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference64,
F);
+
+ std::vector<unsigned> JumpTableIndices;
+ if (!TM.Options.EnableStaticDataPartitioning) {
+ for (unsigned JTI = 0, JTSize = JT.size(); JTI < JTSize; ++JTI)
+ JumpTableIndices.push_back(JTI);
+ emitJumpTables(JumpTableIndices, TLOF.getSectionForJumpTable(F, TM),
+ JTInDiffSection, *MJTI);
+ return;
+ }
+
+ // When static data partitioning is enabled, collect jump table entries that
+ // go into the same section together to reduce the amount of section switch
+ // statements.
+ //
+ // Iterate all jump tables, put hot jump table indices towards the beginning
+ // of the vector, and cold jump table indices towards the end. Meanwhile
+ // retain the relative orders of original jump tables within a hot or unlikely
+ // section by reversing the cold jump table indices.
+ int NextHotJumpTableIndex = 0, NextColdJumpTableIndex = JT.size() - 1;
+ JumpTableIndices.resize(JT.size());
+ for (unsigned JTI = 0, JTSize = JT.size(); JTI < JTSize; ++JTI) {
+ if (JT[JTI].Hotness == MachineFunctionDataHotness::Cold)
+ JumpTableIndices[NextColdJumpTableIndex--] = JTI;
+ else
+ JumpTableIndices[NextHotJumpTableIndex++] = JTI;
+ }
+
+ if (NextHotJumpTableIndex != 0) {
+ emitJumpTables(
+ ArrayRef<unsigned>(JumpTableIndices).take_front(NextHotJumpTableIndex),
+ TLOF.getSectionForJumpTable(F, TM, &JT[0]), JTInDiffSection, *MJTI);
+ }
+
+ if (NextHotJumpTableIndex < (int)JT.size()) {
+ // Reverse the order of cold jump tables indices.
+ for (int L = NextHotJumpTableIndex, R = JT.size() - 1; L < R; ++L, --R)
+ std::swap(JumpTableIndices[L], JumpTableIndices[R]);
+
+ emitJumpTables(
+ ArrayRef<unsigned>(JumpTableIndices)
+ .take_back(JT.size() - NextHotJumpTableIndex),
+ TLOF.getSectionForJumpTable(
+ F, TM, &JT[JumpTableIndices[NextHotJumpTableIndex]]),
+ JTInDiffSection, *MJTI);
+ }
+
+ return;
+}
+
+void AsmPrinter::emitJumpTables(ArrayRef<unsigned> JumpTableIndices,
+ MCSection *JumpTableSection,
+ bool JTInDiffSection,
+ const MachineJumpTableInfo &MJTI) {
+ if (JumpTableIndices.empty())
+ return;
+
+ const DataLayout &DL = MF->getDataLayout();
if (JTInDiffSection) {
- // Drop it in the readonly section.
- MCSection *ReadOnlySection = TLOF.getSectionForJumpTable(F, TM);
- OutStreamer->switchSection(ReadOnlySection);
+ OutStreamer->switchSection(JumpTableSection);
}
- emitAlignment(Align(MJTI->getEntryAlignment(DL)));
+ emitAlignment(Align(MJTI.getEntryAlignment(MF->getDataLayout())));
// Jump tables in code sections are marked with a data_region directive
// where that's supported.
if (!JTInDiffSection)
OutStreamer->emitDataRegion(MCDR_DataRegionJT32);
- for (unsigned JTI = 0, e = JT.size(); JTI != e; ++JTI) {
- const std::vector<MachineBasicBlock*> &JTBBs = JT[JTI].MBBs;
+ const auto &JT = MJTI.getJumpTables();
+ for (unsigned Index = 0, e = JumpTableIndices.size(); Index != e; ++Index) {
+ const std::vector<MachineBasicBlock *> &JTBBs =
+ JT[JumpTableIndices[Index]].MBBs;
// If this jump table was deleted, ignore it.
- if (JTBBs.empty()) continue;
+ if (JTBBs.empty())
+ continue;
// For the EK_LabelDifference32 entry, if using .set avoids a relocation,
/// emit a .set directive for each unique entry.
- if (MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 &&
+ if (MJTI.getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 &&
MAI->doesSetDirectiveSuppressReloc()) {
- SmallPtrSet<const MachineBasicBlock*, 16> EmittedSets;
+ SmallPtrSet<const MachineBasicBlock *, 16> EmittedSets;
const TargetLowering *TLI = MF->getSubtarget().getTargetLowering();
- const MCExpr *Base = TLI->getPICJumpTableRelocBaseExpr(MF,JTI,OutContext);
+ const MCExpr *Base = TLI->getPICJumpTableRelocBaseExpr(
+ MF, JumpTableIndices[Index], OutContext);
for (const MachineBasicBlock *MBB : JTBBs) {
if (!EmittedSets.insert(MBB).second)
continue;
// .set LJTSet, LBB32-base
const MCExpr *LHS =
- MCSymbolRefExpr::create(MBB->getSymbol(), OutContext);
- OutStreamer->emitAssignment(GetJTSetSymbol(JTI, MBB->getNumber()),
- MCBinaryExpr::createSub(LHS, Base,
- OutContext));
+ MCSymbolRefExpr::create(MBB->getSymbol(), OutContext);
+ OutStreamer->emitAssignment(
+ GetJTSetSymbol(JumpTableIndices[Index], MBB->getNumber()),
+ MCBinaryExpr::createSub(LHS, Base, OutContext));
}
}
@@ -2923,27 +2981,27 @@ void AsmPrinter::emitJumpTableInfo() {
// FIXME: This doesn't have to have any specific name, just any randomly
// named and numbered local label started with 'l' would work. Simplify
// GetJTISymbol.
- OutStreamer->emitLabel(GetJTISymbol(JTI, true));
+ OutStreamer->emitLabel(GetJTISymbol(JumpTableIndices[Index], true));
- MCSymbol* JTISymbol = GetJTISymbol(JTI);
+ MCSymbol *JTISymbol = GetJTISymbol(JumpTableIndices[Index]);
OutStreamer->emitLabel(JTISymbol);
// Defer MCAssembler based constant folding due to a performance issue. The
// label differences will be evaluated at write time.
for (const MachineBasicBlock *MBB : JTBBs)
- emitJumpTableEntry(MJTI, MBB, JTI);
+ emitJumpTableEntry(MJTI, MBB, JumpTableIndices[Index]);
}
if (EmitJumpTableSizesSection)
- emitJumpTableSizesSection(MJTI, F);
+ emitJumpTableSizesSection(MJTI, MF->getFunction());
if (!JTInDiffSection)
OutStreamer->emitDataRegion(MCDR_DataRegionEnd);
}
-void AsmPrinter::emitJumpTableSizesSection(const MachineJumpTableInfo *MJTI,
+void AsmPrinter::emitJumpTableSizesSection(const MachineJumpTableInfo &MJTI,
const Function &F) const {
- const std::vector<MachineJumpTableEntry> &JT = MJTI->getJumpTables();
+ const std::vector<MachineJumpTableEntry> &JT = MJTI.getJumpTables();
if (JT.empty())
return;
@@ -2991,17 +3049,17 @@ void AsmPrinter::emitJumpTableSizesSection(const MachineJumpTableInfo *MJTI,
/// EmitJumpTableEntry - Emit a jump table entry for the specified MBB to the
/// current stream.
-void AsmPrinter::emitJumpTableEntry(const MachineJumpTableInfo *MJTI,
+void AsmPrinter::emitJumpTableEntry(const MachineJumpTableInfo &MJTI,
const MachineBasicBlock *MBB,
unsigned UID) const {
assert(MBB && MBB->getNumber() >= 0 && "Invalid basic block");
const MCExpr *Value = nullptr;
- switch (MJTI->getEntryKind()) {
+ switch (MJTI.getEntryKind()) {
case MachineJumpTableInfo::EK_Inline:
llvm_unreachable("Cannot emit EK_Inline jump table entry");
case MachineJumpTableInfo::EK_Custom32:
Value = MF->getSubtarget().getTargetLowering()->LowerCustomJumpTableEntry(
- MJTI, MBB, UID, OutContext);
+ &MJTI, MBB, UID, OutContext);
break;
case MachineJumpTableInfo::EK_BlockAddress:
// EK_BlockAddress - Each entry is a plain address of block, e.g.:
@@ -3035,7 +3093,7 @@ void AsmPrinter::emitJumpTableEntry(const MachineJumpTableInfo *MJTI,
// If the .set directive avoids relocations, this is emitted as:
// .set L4_5_set_123, LBB123 - LJTI1_2
// .word L4_5_set_123
- if (MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 &&
+ if (MJTI.getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 &&
MAI->doesSetDirectiveSuppressReloc()) {
Value = MCSymbolRefExpr::create(GetJTSetSymbol(UID, MBB->getNumber()),
OutContext);
@@ -3051,7 +3109,7 @@ void AsmPrinter::emitJumpTableEntry(const MachineJumpTableInfo *MJTI,
assert(Value && "Unknown entry kind!");
- unsigned EntrySize = MJTI->getEntrySize(getDataLayout());
+ unsigned EntrySize = MJTI.getEntrySize(getDataLayout());
OutStreamer->emitValue(Value, EntrySize);
}
diff --git a/llvm/lib/CodeGen/CommandFlags.cpp b/llvm/lib/CodeGen/CommandFlags.cpp
index d180cfcea658c2..023656cde0089e 100644
--- a/llvm/lib/CodeGen/CommandFlags.cpp
+++ b/llvm/lib/CodeGen/CommandFlags.cpp
@@ -103,6 +103,7 @@ CGOPT(bool, EnableStackSizeSection)
CGOPT(bool, EnableAddrsig)
CGOPT(bool, EmitCallSiteInfo)
CGOPT(bool, EnableMachineFunctionSplitter)
+CGOPT(bool, EnableStaticDataPartitioning)
CGOPT(bool, EnableDebugEntryValues)
CGOPT(bool, ForceDwarfFrameSection)
CGOPT(bool, XRayFunctionIndex)
@@ -480,6 +481,12 @@ codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
cl::init(false));
CGBINDOPT(EnableMachineFunctionSplitter);
+ static cl::opt<bool> EnableStaticDataPartitioning(
+ "partition-static-data-sections",
+ cl::desc("Partition data sections using profile information."),
+ cl::init(false));
+ CGBINDOPT(EnableStaticDataPartitioning);
+
static cl::opt<bool> ForceDwarfFrameSection(
"force-dwarf-frame-section",
cl::desc("Always emit a debug frame section."), cl::init(false));
@@ -586,6 +593,7 @@ codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {
Options.ExceptionModel = getExceptionModel();
Options.EmitStackSizeSection = getEnableStackSizeSection();
Options.EnableMachineFunctionSplitter = getEnableMachineFunctionSplitter();
+ Options.EnableStaticDataPartitioning = getEnableStaticDataPartitioning();
Options.EmitAddrsig = getEnableAddrsig();
Options.EmitCallSiteInfo = getEmitCallSiteInfo();
Options.EnableDebugEntryValues = getEnableDebugEntryValues();
diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index be243c0e74e9db..2a00ecf80ac1e2 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -24,6 +24,7 @@
#include "llvm/CodeGen/BasicBlockSectionUtils.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineJumpTableInfo.h"
#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/MachineModuleInfoImpls.h"
#include "llvm/IR/Comdat.h"
@@ -642,9 +643,11 @@ static StringRef getSectionPrefixForGlobal(SectionKind Kind, bool IsLarge) {
static SmallString<128>
getELFSectionNameForGlobal(const GlobalObject *GO, SectionKind Kind,
Mangler &Mang, const TargetMachine &TM,
- unsigned EntrySize, bool UniqueSectionName) {
+ unsigned EntrySize, bool UniqueSectionName,
+ const MachineJumpTableEntry *JTE) {
SmallString<128> Name =
getSectionPrefixForGlobal(Kind, TM.isLargeGlobalValue(GO));
+
if (Kind.isMergeableCString()) {
// We also need alignment here.
// FIXME: this is getting the alignment of the character, not the
@@ -663,7 +666,14 @@ getELFSectionNameForGlobal(const GlobalObject *GO, SectionKind Kind,
bool HasPrefix = false;
if (const auto *F = dyn_cast<Function>(GO)) {
- if (std::optional<StringRef> Prefix = F->getSectionPrefix()) {
+ // Jump table hotness takes precedence over its enclosing function's hotness
+ // if both are available.
+ if (JTE) {
+ if (JTE->Hotness == MachineFunctionDataHotness::Hot)
+ raw_svector_ostream(Name) << ".hot";
+ else if (JTE->Hotness == MachineFunctionDataHotness::Cold)
+ raw_svector_ostream(Name) << ".unlikely";
+ } else if (std::optional<StringRef> Prefix = F->getSectionPrefix()) {
raw_svector_ostream(Name) << '.' << *Prefix;
HasPrefix = true;
}
@@ -761,8 +771,8 @@ calcUniqueIDUpdateFlagsAndSize(const GlobalObject *GO, StringRef SectionName,
// implicitly for this symbol e.g. .rodata.str1.1, then we don't need
// to unique the section as the entry size for this symbol will be
// compatible with implicitly created sections.
- SmallString<128> ImplicitSectionNameStem =
- getELFSectionNameForGlobal(GO, Kind, Mang, TM, EntrySize, false);
+ SmallString<128> ImplicitSectionNameStem = getELFSectionNameForGlobal(
+ GO, Kind, Mang, TM, EntrySize, false, /*MJTE=*/nullptr);
if (SymbolMergeable &&
Ctx.isELFImplicitMergeableSectionNamePrefix(SectionName) &&
SectionName.starts_with(ImplicitSectionNameStem))
@@ -862,7 +872,8 @@ MCSection *TargetLoweringObjectFileELF::getExplicitSectionGlobal(
static MCSectionELF *selectELFSectionForGlobal(
MCContext &Ctx, const GlobalObject *GO, SectionKind Kind, Mangler &Mang,
const TargetMachine &TM, bool EmitUniqueSection, unsigned Flags,
- unsigned *NextUniqueID, const MCSymbolELF *AssociatedSymbol) {
+ unsigned *NextUniqueID, const MCSymbolELF *AssociatedSymbol,
+ const MachineJumpTableEntry *MJTE = nullptr) {
auto [Group, IsComdat, ExtraFlags] = getGlobalObjectInfo(GO, TM);
Flags |= ExtraFlags;
@@ -881,7 +892,7 @@ static MCSectionELF *selectELFSectionForGlobal(
}
}
SmallString<128> Name = getELFSectionNameForGlobal(
- GO, Kind, Mang, TM, EntrySize, UniqueSectionName);
+ GO, Kind, Mang, TM, EntrySize, UniqueSectionName, MJTE);
// Use 0 as the unique ID for execute-only text.
if (Kind.isExecuteOnly())
@@ -955,17 +966,23 @@ MCSection *TargetLoweringObjectFileELF::getUniqueSectionForFunction(
MCSection *TargetLoweringObjectFileELF::getSectionForJumpTable(
const Function &F, const TargetMachine &TM) const {
+ return getSectionForJumpTable(F, TM, nullptr);
+}
+
+MCSection *TargetLoweringObjectFileELF::getSectionForJumpTable(
+ const Function &F, const TargetMachine &TM,
+ const MachineJumpTableEntry *JTE) const {
// If the function can be removed, produce a unique section so that
// the table doesn't prevent the removal.
const Comdat *C = F.getComdat();
bool EmitUniqueSection = TM.getFunctionSections() || C;
- if (!EmitUniqueSection)
+ if (!EmitUniqueSection && !TM.getEnableStaticDataPartitioning())
return ReadOnlySection;
return selectELFSectionForGlobal(getContext(), &F, SectionKind::getReadOnly(),
getMangler(), TM, EmitUniqueSection,
ELF::SHF_ALLOC, &NextUniqueID,
- /* AssociatedSymbol */ nullptr);
+ /* AssociatedSymbol */ nullptr, JTE);
}
MCSection *TargetLoweringObjectFileELF::getSectionForLSDA(
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.c...
[truncated]
|
Co-authored-by: Ellis Hoag <[email protected]>
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition static data sections. This patch introduces a codegen pass. This patch produces jump table hotness in the in-memory states (machine jump table info and entries). Target-lowering and asm-printer consume the states and produce `.hot` section suffix. The follow up PR #122215 implements such changes. --------- Co-authored-by: Ellis Hoag <[email protected]>
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition static data sections. This patch introduces a codegen pass. This patch produces jump table hotness in the in-memory states (machine jump table info and entries). Target-lowering and asm-printer consume the states and produce `.hot` section suffix. The follow up PR llvm/llvm-project#122215 implements such changes. --------- Co-authored-by: Ellis Hoag <[email protected]>
I think .rodata.hot and .rodata.unlikely will be mapped to .rodata by default. llvm-project/lld/ELF/LinkerScript.cpp Line 108 in 9bb3c62
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rebase past the latest merged changes to make it easier to review?
@@ -0,0 +1,213 @@ | |||
; -stats requires asserts | |||
; requires: asserts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capital REQUIRES?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
void emitJumpTableEntry(const MachineJumpTableInfo *MJTI, | ||
template <typename Iterator> | ||
void emitJumpTableImpl(const MachineJumpTableInfo &MJTI, | ||
const llvm::iterator_range<Iterator> &JumpTableIndices, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you can take an iterator range by value since it's a pair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -2876,42 +2875,111 @@ void AsmPrinter::emitJumpTableInfo() { | |||
MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 || | |||
MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference64, | |||
F); | |||
|
|||
std::vector<unsigned> JumpTableIndices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the attempt to reuse this vector for both cases - with and without partitioning makes the code more complex than it needs to be. Consider the following refactoring:
- Sink the vector into the if condition below.
- Use separate vectors for hot and cold when partitioning is enabled.
- Use an arrayref as param for emitJumpTableImpl instead of iterator range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
JumpTableIndices.rbegin() + NumColdJumpTables), | ||
JTInDiffSection); | ||
|
||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need this return?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it.
|
||
// If this jump table was deleted, ignore it. | ||
if (JTBBs.empty()) continue; | ||
if (JTBBs.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some formatting changes here (and below) that make it hard to reason about what actually changed. Any way to separate them out into a separate PR easily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. #124645 does the refactor.
|
||
if (!JTInDiffSection) | ||
OutStreamer->emitDataRegion(MCDR_DataRegionEnd); | ||
} | ||
|
||
void AsmPrinter::emitJumpTableSizesSection(const MachineJumpTableInfo *MJTI, | ||
void AsmPrinter::emitJumpTableSizesSection(const MachineJumpTableInfo &MJTI, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to split out the pointer to ref changes as a separate PR and commit it as NFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a6aa936 should be the pointer-to-ref change. I rebased the PR as suggested. After the rebase this diff will not show up.
"Number of jump tables with unknown hotness. Option " | ||
"-static-data-default-hotness specifies the hotness."); | ||
|
||
static cl::opt<MachineFunctionDataHotness> StaticDataDefaultHotness( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this option if we plan to keep unknown hotness in rodata
? Then we can defer the decision to the linker about which sections to merge and we don't need the flexibility (and option) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This option is un-necessary now. I removed it.
Add method `AsmPrinter::emitJumpTableImpl`. It takes an array-ref of jump table indices. This splits refactor of PR #122215
…645) Add method `AsmPrinter::emitJumpTableImpl`. It takes an array-ref of jump table indices. This splits refactor of PR llvm/llvm-project#122215
True. To do the mapping with this change, we need to extend the mapping (like 059e7cb), or use an equivalent linker script.
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still going through the rest, thanks for rebasing and splitting out the NFC changes.
SmallVector<unsigned> JumpTableIndices; | ||
for (unsigned JTI = 0, JTSize = JT.size(); JTI < JTSize; ++JTI) | ||
JumpTableIndices.push_back(JTI); | ||
emitJumpTableImpl(*MJTI, JumpTableIndices, JTInDiffSection); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be written as emitJumpTableImpl(*MJTI, llvm::seq(0, JT.size()), JTInDiffSection);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! TIL about llvm::seq
. llvm::to_vector(llvm::seq(<range>))
works.
MCSection *JumpTableSection = nullptr; | ||
if (TM.Options.EnableStaticDataPartitioning) { | ||
JumpTableSection = | ||
TLOF.getSectionForJumpTable(F, TM, &JT[*JumpTableIndices.begin()]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe s/*JumpTableIndices.begin()/JumpTableIndices.front()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm modulo the handling of unknown hotness jumptables.
raw_svector_ostream(Name) << ".hot"; | ||
} else if (JTE->Hotness == MachineFunctionDataHotness::Cold) { | ||
raw_svector_ostream(Name) << ".unlikely"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen to the case where JTE->Hotness == MachineFunctionDataHotness::Unknown
? I think the behaviour we want is to fall back to the section prefix. Can you add a test if you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. Done.
The updated PR has @bar
for test coverage.
@@ -967,17 +978,23 @@ MCSection *TargetLoweringObjectFileELF::getUniqueSectionForFunction( | |||
|
|||
MCSection *TargetLoweringObjectFileELF::getSectionForJumpTable( | |||
const Function &F, const TargetMachine &TM) const { | |||
return getSectionForJumpTable(F, TM, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: document the param name for nullptr like you have elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, and did the same for nullptr
in TargetLoweringObjectFile.cpp at line 351.
; @func_without_profile doesn't have profiles. It's jump table hotness is unknown. | ||
; Hot jump tables are in the `.rodata.hot`-prefixed section, and cold ones in | ||
; the `.rodata.unlikely`-prefixed section. In the function without profile | ||
; information, jump table section is `rodata` without hot or unlikely prefix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the unknown case, it might be better to preserve the original behaviour with function prefix (though it should end up in the rodata section ultimately). Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The updated PR did that and added @bar
for test coverage as suggested.
I also grouped CHECK lines by functions, and moved comments closer to the CHECK lines.
… used for linker's disambiguation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The updated PR also generates .rodata.<suffix>.
(with a trailing dot after <suffix>
) when -ffunction-sections
is false, for consistency with existing code (which sets HasPrefix
for linker's disambiguation between .text.<section-prefix>.
and .text.<function>
).
PTAL, thanks!
@@ -967,17 +978,23 @@ MCSection *TargetLoweringObjectFileELF::getUniqueSectionForFunction( | |||
|
|||
MCSection *TargetLoweringObjectFileELF::getSectionForJumpTable( | |||
const Function &F, const TargetMachine &TM) const { | |||
return getSectionForJumpTable(F, TM, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, and did the same for nullptr
in TargetLoweringObjectFile.cpp at line 351.
raw_svector_ostream(Name) << ".hot"; | ||
} else if (JTE->Hotness == MachineFunctionDataHotness::Cold) { | ||
raw_svector_ostream(Name) << ".unlikely"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. Done.
The updated PR has @bar
for test coverage.
; @func_without_profile doesn't have profiles. It's jump table hotness is unknown. | ||
; Hot jump tables are in the `.rodata.hot`-prefixed section, and cold ones in | ||
; the `.rodata.unlikely`-prefixed section. In the function without profile | ||
; information, jump table section is `rodata` without hot or unlikely prefix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The updated PR did that and added @bar
for test coverage as suggested.
I also grouped CHECK lines by functions, and moved comments closer to the CHECK lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/50/builds/9557 Here is the relevant piece of the build log for the reference
|
#122183 adds a codegen pass to infer machine jump table entry's hotness from the MBB hotness. This is a follow-up PR to produce
.hot
and or.unlikely
section prefix for jump table's (read-only) data sections in the relocatable.o
files.When this patch is enabled, linker will see {
.rodata
,.rodata.hot
,.rodata.unlikely
} in input sections. It can map.rodata.hot
and.rodata
in the input sections to.rodata.hot
in the executable, and map.rodata.unlikely
into.rodata
with a pending extension to--keep-text-section-prefix
like 059e7cb, or with a linker script.To partition hot and jump tables, the AsmPrinter pass slices a function's jump table indices into two groups, one for hot and the other for cold jump tables. It then emits hot jump tables into a
.hot
-prefixed data section and cold ones into a.unlikely
-prefixed data section, retaining the relative order ofLJT<N>
labels within each group.[ELF only] To have data sections with dynamic names (e.g.,
.rodata.hot[.func]
), we implementTargetLoweringObjectFile::getSectionForJumpTable
method that accepts aMachineJumpTableEntry
parameter, and updateselectELFSectionForGlobal
to generate.hot
or.unlikely
based on MJTE's hotness.-ffunction-section=true
or-funique-section-names=true
, even though it leverages the similar underlying mechanism to have a MCSection with on-demand name as-ffunction-section
does.The new code path is off by default.
TargetOptions
conveys clang or LLVM tools' options to code generation passes. To follow the pattern, add optionEnableStaticDataPartitioning
bit inTargetOptions
and make it readable throughTargetMachine
.llc
,partition-static-data-sections
option is introduced inCodeGen/CommandFlags.h/cpp
.