Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Clear cached unwind plans when adding symbols #125839

Merged
merged 2 commits into from
Feb 7, 2025

Conversation

labath
Copy link
Collaborator

@labath labath commented Feb 5, 2025

PR #86603 broke unwinding in for unwind info added via "target symbols add". #86770 attempted to fix this, but the fix was only partial -- it accepted new sources of unwind information, but didn't take into account that the symbol file can alter what lldb percieves as function boundaries.

A stripped file will not contain information about private (non-exported) symbols, which will make the public symbols appear very large. If lldb tries to unwind from such a function before symbols are added, then the cached unwind plan will prevent new (correct) unwind plans from being created.

target-symbols-add-unwind.test might have caught this, were it not for the fact that the "image show-unwind" command does not use cached unwind information (it recomputes it from scratch).

The changes in this patch come in three pieces:

  • Clear cached unwind plans when adding symbols. Since the symbol boundaries can change, we cannot trust anything we've computed previously.
  • Add a flag to "image show-unwind" to display the cached unwind information (mainly for the use in the test, but I think it's also generally useful).
  • Rewrite the test to better and more reliably simulate the real-world scenario: I've swapped the running process for a core (minidump) file so it can run anywhere; used the caching version of the show-unwind command; and swapped C for assembly to better control the placement of symbols

PR llvm#86603 broke unwinding in for unwind info added via "target symbols
add". llvm#86770 attempted to fix this, but the fix was only partial -- it
accepted new sources of unwind information, but didn't take into account
that the symbol file can alter what lldb percieves as function
boundaries.

A stripped file will not contain information about private
(non-exported) symbols, which will make the public symbols appear very
large. If lldb tries to unwind from such a function before symbols are
added, then the cached unwind plan will prevent new (correct) unwind
plans from being created.

target-symbols-add-unwind.test might have caught this, were it not for
the fact that the "image show-unwind" command does *not* use cached
unwind information (it recomputes it from scratch).

The changes in this patch come in three pieces:
- Clear cached unwind plans when adding symbols. Since the symbol
  boundaries can change, we cannot trust anything we've computed
  previously.
- Add a flag to "image show-unwind" to display the cached unwind
  information (mainly for the use in the test, but I think it's also
  generally useful).
- Rewrite the test to better and more reliably simulate the real-world
  scenario: I've swapped the running process for a core (minidump) file
  so it can run anywhere; used the caching version of the show-unwind
  command; and swapped C for assembly to better control the placement of
  symbols
@labath labath requested a review from jasonmolenda February 5, 2025 11:31
@labath labath requested a review from JDevlieghere as a code owner February 5, 2025 11:31
@llvmbot llvmbot added the lldb label Feb 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 5, 2025

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

PR #86603 broke unwinding in for unwind info added via "target symbols add". #86770 attempted to fix this, but the fix was only partial -- it accepted new sources of unwind information, but didn't take into account that the symbol file can alter what lldb percieves as function boundaries.

A stripped file will not contain information about private (non-exported) symbols, which will make the public symbols appear very large. If lldb tries to unwind from such a function before symbols are added, then the cached unwind plan will prevent new (correct) unwind plans from being created.

target-symbols-add-unwind.test might have caught this, were it not for the fact that the "image show-unwind" command does not use cached unwind information (it recomputes it from scratch).

The changes in this patch come in three pieces:

  • Clear cached unwind plans when adding symbols. Since the symbol boundaries can change, we cannot trust anything we've computed previously.
  • Add a flag to "image show-unwind" to display the cached unwind information (mainly for the use in the test, but I think it's also generally useful).
  • Rewrite the test to better and more reliably simulate the real-world scenario: I've swapped the running process for a core (minidump) file so it can run anywhere; used the caching version of the show-unwind command; and swapped C for assembly to better control the placement of symbols

Full diff: https://github.com/llvm/llvm-project/pull/125839.diff

6 Files Affected:

  • (modified) lldb/include/lldb/Symbol/UnwindTable.h (+3-2)
  • (modified) lldb/source/Commands/CommandObjectTarget.cpp (+19-3)
  • (modified) lldb/source/Commands/Options.td (+2)
  • (modified) lldb/source/Symbol/UnwindTable.cpp (+2-1)
  • (removed) lldb/test/Shell/SymbolFile/Inputs/target-symbols-add-unwind.c (-1)
  • (modified) lldb/test/Shell/SymbolFile/target-symbols-add-unwind.test (+99-19)
diff --git a/lldb/include/lldb/Symbol/UnwindTable.h b/lldb/include/lldb/Symbol/UnwindTable.h
index 29b7fa61b4849e0..3166fdec6ebaa77 100644
--- a/lldb/include/lldb/Symbol/UnwindTable.h
+++ b/lldb/include/lldb/Symbol/UnwindTable.h
@@ -38,8 +38,9 @@ class UnwindTable {
   ArmUnwindInfo *GetArmUnwindInfo();
   SymbolFile *GetSymbolFile();
 
-  lldb::FuncUnwindersSP GetFuncUnwindersContainingAddress(const Address &addr,
-                                                          SymbolContext &sc);
+  lldb::FuncUnwindersSP
+  GetFuncUnwindersContainingAddress(const Address &addr,
+                                    const SymbolContext &sc);
 
   bool GetAllowAssemblyEmulationUnwindPlans();
 
diff --git a/lldb/source/Commands/CommandObjectTarget.cpp b/lldb/source/Commands/CommandObjectTarget.cpp
index d8265e41a7384eb..db142e8f0eaee77 100644
--- a/lldb/source/Commands/CommandObjectTarget.cpp
+++ b/lldb/source/Commands/CommandObjectTarget.cpp
@@ -3471,6 +3471,17 @@ class CommandObjectTargetModulesShowUnwind : public CommandObjectParsed {
         m_type = eLookupTypeFunctionOrSymbol;
         break;
 
+      case 'c':
+        bool value, success;
+        value = OptionArgParser::ToBoolean(option_arg, false, &success);
+        if (success) {
+          m_cached = value;
+        } else {
+          return Status::FromErrorStringWithFormatv(
+              "invalid boolean value '%s' passed for -G option", option_arg);
+        }
+        break;
+
       default:
         llvm_unreachable("Unimplemented option");
       }
@@ -3482,6 +3493,7 @@ class CommandObjectTargetModulesShowUnwind : public CommandObjectParsed {
       m_type = eLookupTypeInvalid;
       m_str.clear();
       m_addr = LLDB_INVALID_ADDRESS;
+      m_cached = false;
     }
 
     llvm::ArrayRef<OptionDefinition> GetDefinitions() override {
@@ -3494,6 +3506,7 @@ class CommandObjectTargetModulesShowUnwind : public CommandObjectParsed {
                                      // parsing options
     std::string m_str; // Holds name lookup
     lldb::addr_t m_addr = LLDB_INVALID_ADDRESS; // Holds the address to lookup
+    bool m_cached = false;
   };
 
   CommandObjectTargetModulesShowUnwind(CommandInterpreter &interpreter)
@@ -3583,9 +3596,12 @@ class CommandObjectTargetModulesShowUnwind : public CommandObjectParsed {
       if (abi)
         start_addr = abi->FixCodeAddress(start_addr);
 
-      FuncUnwindersSP func_unwinders_sp(
-          sc.module_sp->GetUnwindTable()
-              .GetUncachedFuncUnwindersContainingAddress(start_addr, sc));
+      UnwindTable &uw_table = sc.module_sp->GetUnwindTable();
+      FuncUnwindersSP func_unwinders_sp =
+          m_options.m_cached
+              ? uw_table.GetFuncUnwindersContainingAddress(start_addr, sc)
+              : uw_table.GetUncachedFuncUnwindersContainingAddress(start_addr,
+                                                                   sc);
       if (!func_unwinders_sp)
         continue;
 
diff --git a/lldb/source/Commands/Options.td b/lldb/source/Commands/Options.td
index 777f8c36c4916ca..8831fed38435b54 100644
--- a/lldb/source/Commands/Options.td
+++ b/lldb/source/Commands/Options.td
@@ -965,6 +965,8 @@ let Command = "target modules show unwind" in {
   def target_modules_show_unwind_address : Option<"address", "a">, Group<2>,
     Arg<"AddressOrExpression">, Desc<"Show unwind instructions for a function "
     "or symbol containing an address">;
+  def target_modules_show_unwind_cached : Option<"cached", "c">,
+    Arg<"Boolean">, Desc<"Show cached unwind information">;
 }
 
 let Command = "target modules lookup" in {
diff --git a/lldb/source/Symbol/UnwindTable.cpp b/lldb/source/Symbol/UnwindTable.cpp
index da88b0c9c4ea191..61d51192bf3d1e4 100644
--- a/lldb/source/Symbol/UnwindTable.cpp
+++ b/lldb/source/Symbol/UnwindTable.cpp
@@ -86,6 +86,7 @@ void UnwindTable::Initialize() {
 void UnwindTable::ModuleWasUpdated() {
   std::lock_guard<std::mutex> guard(m_mutex);
   m_scanned_all_unwind_sources = false;
+  m_unwinds.clear();
 }
 
 UnwindTable::~UnwindTable() = default;
@@ -118,7 +119,7 @@ UnwindTable::GetAddressRange(const Address &addr, const SymbolContext &sc) {
 
 FuncUnwindersSP
 UnwindTable::GetFuncUnwindersContainingAddress(const Address &addr,
-                                               SymbolContext &sc) {
+                                               const SymbolContext &sc) {
   Initialize();
 
   std::lock_guard<std::mutex> guard(m_mutex);
diff --git a/lldb/test/Shell/SymbolFile/Inputs/target-symbols-add-unwind.c b/lldb/test/Shell/SymbolFile/Inputs/target-symbols-add-unwind.c
deleted file mode 100644
index 237c8ce181774d9..000000000000000
--- a/lldb/test/Shell/SymbolFile/Inputs/target-symbols-add-unwind.c
+++ /dev/null
@@ -1 +0,0 @@
-int main() {}
diff --git a/lldb/test/Shell/SymbolFile/target-symbols-add-unwind.test b/lldb/test/Shell/SymbolFile/target-symbols-add-unwind.test
index 5420213d405e868..249d2ef79a891c4 100644
--- a/lldb/test/Shell/SymbolFile/target-symbols-add-unwind.test
+++ b/lldb/test/Shell/SymbolFile/target-symbols-add-unwind.test
@@ -1,27 +1,107 @@
-# TODO: When it's possible to run "image show-unwind" without a running
-# process, we can remove the unsupported line below, and hard-code an ELF
-# triple in the test.
-# UNSUPPORTED: system-windows, system-darwin
-
-# RUN: cd %T
-# RUN: %clang_host %S/Inputs/target-symbols-add-unwind.c -g \
-# RUN:   -fno-unwind-tables -fno-asynchronous-unwind-tables \
-# RUN:   -o target-symbols-add-unwind.debug
-# RUN: llvm-objcopy --strip-debug target-symbols-add-unwind.debug \
-# RUN:   target-symbols-add-unwind.stripped
-# RUN: %lldb target-symbols-add-unwind.stripped -s %s -o quit | FileCheck %s
-
-process launch --stop-at-entry
-image show-unwind -n main
-# CHECK-LABEL: image show-unwind -n main
+# NB: The minidump core file exists only because "image show-unwind" currently
+# requires a process to exist. If that changes, it can be removed.
+
+# REQUIRES: x86, lld
+
+# RUN: split-file %s %t
+# RUN: yaml2obj %t/a.core.yaml -o %t/a.core
+# RUN: %clang -c --target=x86_64-pc-linux %t/a.s -o %t/a.o
+# RUN: ld.lld --shared %t/a.o -o %t/a.debug --build-id=0xdeadbeef \
+# RUN:   --image-base=0x10000
+# RUN: llvm-objcopy --strip-all %t/a.debug %t/a.stripped
+# RUN: cd %t
+# RUN: %lldb -c %t/a.core \
+# RUN:   -o "settings set interpreter.stop-command-source-on-error false" \
+# RUN:   -s %t/commands -o quit | FileCheck %s
+
+#--- commands
+
+image add a.stripped
+image load --file a.stripped --slide 0
+image list
+# CHECK-LABEL: image list
+# CHECK: [  0] DEADBEEF 0x0000000000010000 a.stripped
+
+## Due to missing symbol information this (incorrectly) prints the unwind
+## information for public_fn1
+image show-unwind -n public_fn1 --cached true
+# CHECK-LABEL: image show-unwind -n public_fn1
+# CHECK-NEXT: UNWIND PLANS for a.stripped`public_fn1 (start addr 0x12000)
 # CHECK-NOT: debug_frame UnwindPlan:
 
-target symbols add -s target-symbols-add-unwind.stripped target-symbols-add-unwind.debug
+target symbols add -s a.stripped a.debug
 # CHECK-LABEL: target symbols add
 # CHECK: symbol file {{.*}} has been added to {{.*}}
 
-image show-unwind -n main
-# CHECK-LABEL: image show-unwind -n main
+image show-unwind -n private_fn --cached true
+# CHECK-LABEL: image show-unwind -n private_fn
+# CHECK-NEXT: UNWIND PLANS for a.stripped`private_fn (start addr 0x12010)
 # CHECK: debug_frame UnwindPlan:
 # CHECK-NEXT: This UnwindPlan originally sourced from DWARF CFI
 # CHECK-NEXT: This UnwindPlan is sourced from the compiler: yes.
+# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: no.
+# CHECK-NEXT: This UnwindPlan is for a trap handler function: no.
+# CHECK-NEXT: Address range of this UnwindPlan: [a.stripped.PT_LOAD[1]..text + 16-0x0000000000000013)
+
+
+#--- a.s
+
+        .text
+        .cfi_sections .debug_frame
+        .globl  public_fn1, public_fn2
+
+        .p2align 12
+public_fn1:
+        .cfi_startproc
+        pushq   %rbp
+        .cfi_def_cfa_offset 16
+        .cfi_offset %rbp, -16
+        popq    %rbp
+        .cfi_def_cfa %rsp, 8
+        retq
+        .cfi_endproc
+
+        .p2align 4
+private_fn:
+        .cfi_startproc
+        pushq   %rbp
+        .cfi_def_cfa_offset 16
+        .cfi_offset %rbp, -16
+        popq    %rbp
+        .cfi_def_cfa %rsp, 8
+        retq
+        .cfi_endproc
+
+        .p2align 4
+public_fn2:
+        .cfi_startproc
+        pushq   %rbp
+        .cfi_def_cfa_offset 16
+        .cfi_offset %rbp, -16
+        popq    %rbp
+        .cfi_def_cfa %rsp, 8
+        retq
+        .cfi_endproc
+
+#--- a.core.yaml
+--- !minidump
+Streams:
+  - Type:            SystemInfo
+    Processor Arch:  AMD64
+    Platform ID:     Linux
+    CPU:
+      Vendor ID:       GenuineIntel
+      Version Info:    0x00000000
+      Feature Info:    0x00000000
+  - Type:            ThreadList
+    Threads:
+      - Thread Id:       0x000074F3
+        Context:         0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000B001000000000006CAE000000006B7FC05A0000C81D415A0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000A2BF9E5A6B7F0000000000000000000000000000000000008850C14BFD7F00009850C14BFD7F00000100000000000000B04AC14BFD7F0000000000000000000060812D01000000000800000000000000B065E05A6B7F00008004400000000000E050C14BFD7F00000000000000000000000000000000000004400000000000007F03FFFF0000FFFFFFFFFFFF000000000000000000000000801F00006B7F00000400000000000000B84CC14BFD7F0000304D405A6B7F0000C84DC14BFD7F0000C0AA405A6B7F00004F033D0000000000B84DC14BFD7F0000E84DC14BFD7F0000000000000000000000000000000000000070E05A6B7F000078629E5A6B7F0000C81D415A6B7F0000804F9E5A6B7F00000000000001000000E603000001000000E093115A6B7F0000804EC14BFD7F0000584EC14BFD7F000099ADC05A6B7F00000100000000000000AAAAD77D0000000002000000000000000800000000000000B065E05A6B7F0000E6B7C05A6B7F0000010000006B7F0000884DC14BFD7F0000106F7C5A6B7F0000984EC14BFD7F0000488B7C5A6B7F0000C4A71CB90000000001000000000000000800000000000000B065E05A6B7F000048B6C05A6B7F0000702AE25A6B7F0000D84DC14BFD7F000030489E5A6B7F0000E84EC14BFD7F0000E05E9E5A6B7F00000991F0460000000001000000000000000800000000000000B065E05A6B7F000048B6C05A6B7F00000100000000000000284EC14BFD7F00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+        Stack:
+          Start of Memory Range: 0x00007FFD4BC15080
+          Content:         30044000000000000000000000000000
+  - Type:            MemoryList
+    Memory Ranges:
+      - Start of Memory Range: 0x00007FFD4BC15080
+        Content:         30044000000000000000000000000000
+...

@JDevlieghere
Copy link
Member

JDevlieghere commented Feb 5, 2025

Rather than a top-level show-unwind under image (aka target modules), it seems like this would fit nicely under the dump subcommand: image dump unwind?

Syntax: target modules dump [objfile|symtab|sections|ast|symfile|line-table|pcm-info|separate-debug-info] [<file1> <file2> ...]

The following subcommands are supported:

      ast                 -- Dump the clang ast for a given module's symbol file.
      line-table          -- Dump the line table for one or more compilation units.
      objfile             -- Dump the object file headers from one or more target modules.
      pcm-info            -- Dump information about the given clang module (pcm).
      sections            -- Dump the sections from one or more target modules.
      separate-debug-info -- List the separate debug info symbol files for one or more target modules.
      symfile             -- Dump the debug symbol file for one or more target modules.
      symtab              -- Dump the symbol table from one or more target modules.

@labath
Copy link
Collaborator Author

labath commented Feb 5, 2025

If I was adding a new command -- sure, but this is adding only a slightly different way to print something we're already printing (*). I think it would be confusing for this to live in a separate command. Or are you proposing to move the existing command.

(*) I conservatively didn't make it the default, though I think it would actually be a better default, as it shows what's actually being used by lldb. Based on this comment the main reason it's printing uncached plans is to log the workings of the instruction emulator.

@JDevlieghere
Copy link
Member

My bad, I misread "add a flag" as "add a command". I think moving the command and making it an alias would be a nice-to-have (in your copious amounts of spare time ;-), but obviously that's outside the scope of this PR.

@jasonmolenda
Copy link
Collaborator

target-symbols-add-unwind.test might have caught this, were it not for the fact that the "image show-unwind" command does not use cached unwind information (it recomputes it from scratch).

FWIW this behavior was mostly to aid in debugging unwind plan creators - you could attach to an lldb, put a breakpoint in the unwind plan creator methods, and re-do image show-unwind -n ... repeatedly as you debugged the issue. Or turn on logging that the unwind plan creators might generate and re-do it.

Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'm fine with changing the default to show the current cached unwind plans, this command is only used by lldb maintainers (mostly, me), normal users won't notice a change here. Jonas' suggestion of moving this command under target modules dump would make sense to me too, but you shouldn't need to do that in this PR unless you want to, I know there's a lot of mechanical code moving to implement a change like that. Again this is not a normal user command, I don't mind changing its behavior or name if anyone thinks that is better.

m_cached = value;
} else {
return Status::FromErrorStringWithFormatv(
"invalid boolean value '%s' passed for -G option", option_arg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-c option?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha, you caught me.

@@ -3494,6 +3506,7 @@ class CommandObjectTargetModulesShowUnwind : public CommandObjectParsed {
// parsing options
std::string m_str; // Holds name lookup
lldb::addr_t m_addr = LLDB_INVALID_ADDRESS; // Holds the address to lookup
bool m_cached = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with changing the behavior of the existing command if you'd like to. This command is very much only used by lldb maintainers and tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be biased because this is the second time that the command has lied to me (the first time being when the propeller thingy caused a huge unwind plan covering the whole file to be inserted into the cache), but I think this would be a better default. I'll flip the flag here. Thanks.

@labath labath merged commit 83ba374 into llvm:main Feb 7, 2025
7 checks passed
@labath labath deleted the symbols-add branch February 7, 2025 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants