Add Optional Symbol Cache for Fast Symbol Access in ELF Files #577

Open
Sababoni opened this issue Nov 8, 2024 · 1 comment

Comments

@Sababoni

Sababoni commented Nov 8, 2024

I would like to propose adding an optional symbol caching feature to pyelftools that allows users to quickly access symbols and Debugging Information Entries (DIEs) from ELF files. This would be particularly useful for users working with large ELF files or those who need frequent access to specific symbols.

Currently, finding a DIE by name requires iterating through Compilation Units (CUs) and their DIEs, which can be time-consuming for ELF files containing thousands of symbols or extensive debugging data. With an optional caching mechanism, pyelftools would pay for that scan once and answer each repeated lookup from memory, which should help most in use cases that query the same file many times.
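For reference, a minimal sketch of the lookup as it has to be done today, assuming the "symbol" is identified by its DW_AT_name attribute. The helper names `find_dies_by_name` and `find_in_file` are made up for illustration and are not part of pyelftools; only `ELFFile`, `has_dwarf_info()`, `get_dwarf_info()`, `iter_CUs()`, `iter_DIEs()` and `die.attributes` are existing library calls.

```python
def find_dies_by_name(dwarfinfo, name):
    """Linear scan: return every DIE whose DW_AT_name equals `name`.

    `dwarfinfo` is expected to behave like the object returned by
    ELFFile.get_dwarf_info(). Each call walks every CU and every DIE.
    """
    wanted = name.encode("utf-8")  # pyelftools exposes DW_AT_name as bytes
    matches = []
    for cu in dwarfinfo.iter_CUs():
        for die in cu.iter_DIEs():
            attr = die.attributes.get("DW_AT_name")
            if attr is not None and attr.value == wanted:
                matches.append(die)
    return matches


def find_in_file(path, name):
    """Hypothetical convenience wrapper around the scan above."""
    # Imported here so the scan helper can be reused without pyelftools.
    from elftools.elf.elffile import ELFFile

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return []
        return find_dies_by_name(elf.get_dwarf_info(), name)
```

Every call to `find_dies_by_name` repeats the full walk, which is the cost the proposed cache is meant to avoid.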

Feature Details:

Caching Symbol Information:

Add an optional parameter (index_symbols=True) to ELFFile that, when enabled, builds an in-memory cache of all relevant symbols and their corresponding DIEs.
The cache could be implemented as a dictionary ({symbol_name: DIE_info}) to enable constant-time (O(1)) lookups of symbols such as functions, variables, and types.

Opt-In for Flexibility:

This feature would be opt-in to provide flexibility and maintain backward compatibility.
By default, pyelftools would behave as it currently does, iterating through each CU and DIE on demand, without caching.
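The two points above could be prototyped outside the library roughly as follows. This is a sketch under assumptions, not a proposed implementation: `build_die_index` and `IndexedDwarf` are hypothetical names, the index is keyed by DW_AT_name only, and each name maps to a list because several DIEs can share a name (for example a declaration and a definition, or static functions in different CUs).

```python
from collections import defaultdict


def build_die_index(dwarfinfo):
    """Walk all CUs once and group named DIEs by their DW_AT_name."""
    index = defaultdict(list)
    for cu in dwarfinfo.iter_CUs():
        for die in cu.iter_DIEs():
            attr = die.attributes.get("DW_AT_name")
            if attr is not None:
                # DW_AT_name is bytes in pyelftools; decode for str keys.
                key = attr.value.decode("utf-8", errors="replace")
                index[key].append(die)
    return dict(index)


class IndexedDwarf:
    """Hypothetical opt-in wrapper; not part of pyelftools."""

    def __init__(self, dwarfinfo, index_symbols=False):
        self._dwarfinfo = dwarfinfo
        # Opt-in: the index is only built when explicitly requested.
        self._index = build_die_index(dwarfinfo) if index_symbols else None

    def get_dies(self, name):
        """Return all DIEs named `name` (empty list if none)."""
        if self._index is not None:
            return self._index.get(name, [])
        # Default behaviour: scan on demand, nothing is cached.
        wanted = name.encode("utf-8")
        found = []
        for cu in self._dwarfinfo.iter_CUs():
            for die in cu.iter_DIEs():
                attr = die.attributes.get("DW_AT_name")
                if attr is not None and attr.value == wanted:
                    found.append(die)
        return found
```

With `index_symbols=True` the full walk happens once in the constructor and later lookups are dictionary accesses; the trade-off is that every named DIE stays referenced in memory for the lifetime of the wrapper.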

Usage Example:

When opening an ELF file, users could enable the symbol cache:

    from elftools.elf.elffile import ELFFile

    with open('large.elf', 'rb') as f:
        elf = ELFFile(f, index_symbols=True)

        die = elf.get_symbol('main')
        if die:
            print(f"Found symbol 'main' at address: 0x{die.attributes['DW_AT_low_pc'].value:x}")
        else:
            print("Symbol 'main' not found.")

This API would allow users to easily access symbols after the initial caching step, making repeated queries much more efficient.

@sevaa
Contributor

sevaa commented Nov 8, 2024

By symbols, do you mean exported symbols?
