[Discovery] Improve execution time of discovery #71

Open
felix-20 opened this issue Jul 13, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@felix-20
Contributor

Problem statement

Discovery can take a very long time, especially for large applications and when discovering multiple patterns.
This is most likely caused by loading the CPG: at the moment, the CPG is loaded for every pattern individually.

How is it possible to reduce the execution time for discovery?

There are a few ideas for improving the execution time of discovery:

Single file / or subdirectory discovery

Potentially the easiest solution, as proposed by @pr0me, is to create multiple CPGs for one project. Discovery can then be run individually for each directory, or even each file, in the project. This might decrease the total execution time because the CPGs themselves are smaller and the discovery runs for the subdirectories / files could be executed in parallel.
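A minimal sketch of the per-subdirectory idea, just to make it concrete. `run_discovery` is a hypothetical callable (not part of the current code base) that would build a CPG for one directory and run the discovery rules on it; the function names and the findings format are assumptions as well. Threads should be enough here, assuming the heavy lifting happens in external joern processes.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List


def list_subdirectories(project_root: Path) -> List[Path]:
    """Return the immediate subdirectories of the project, sorted by name."""
    return sorted(p for p in project_root.iterdir() if p.is_dir())


def discover_per_subdirectory(
    project_root: Path,
    run_discovery: Callable[[Path], List[str]],  # hypothetical: CPG + discovery for one dir
    max_workers: int = 4,
) -> Dict[str, List[str]]:
    """Run discovery on every subdirectory in parallel; return findings per directory."""
    subdirs = list_subdirectories(project_root)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input directories.
        results = list(pool.map(run_discovery, subdirs))
    return {subdir.name: found for subdir, found in zip(subdirs, results)}
```

One thing to keep in mind: a CPG built from a single subdirectory does not contain the code of the other directories, so findings that depend on code in several directories might be missed.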

Keep CPG in memory

If loading the CPG is what takes so much time, it might be convenient to keep the CPG in memory so that multiple queries can be executed on it without reloading.
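Roughly, the structure could look like the sketch below. `load_cpg` and the per-pattern query callables are hypothetical stand-ins; in practice the loaded graph would probably live in a long-running joern process rather than in Python, so this only illustrates the "load once, query many times" shape.

```python
from typing import Any, Callable, Dict, List


class CpgCache:
    """Loads each CPG at most once and hands out the cached object afterwards."""

    def __init__(self, load_cpg: Callable[[str], Any]) -> None:
        self._load_cpg = load_cpg  # hypothetical loader, e.g. a wrapper around joern
        self._cache: Dict[str, Any] = {}
        self.load_count = 0  # number of actual loads, useful for checking the effect

    def get(self, cpg_path: str) -> Any:
        if cpg_path not in self._cache:
            self._cache[cpg_path] = self._load_cpg(cpg_path)
            self.load_count += 1
        return self._cache[cpg_path]


def run_patterns(
    cache: CpgCache,
    cpg_path: str,
    queries: Dict[str, Callable[[Any], List[Any]]],
) -> Dict[str, List[Any]]:
    """Run every pattern query against the same cached CPG."""
    cpg = cache.get(cpg_path)
    return {pattern: query(cpg) for pattern, query in queries.items()}
```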

Interactive joern shell

With the interactive shell provided by joern, it should be possible to load the CPG once and execute multiple rules on it. This requires a script that can interact with the joern shell.
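The simplest variant of such a script might not be interactive at all: start one shell process and pipe all commands into it at once. The sketch below assumes that the shell reads commands from stdin, which I have not verified for joern; the helper name `run_in_one_session` is made up for illustration.

```python
import subprocess
from typing import List, Sequence


def run_in_one_session(
    shell_cmd: Sequence[str],
    commands: List[str],
    timeout: float = 600.0,
) -> str:
    """Start one shell process, feed it all commands via stdin, return its stdout."""
    script = "\n".join(commands) + "\n"
    completed = subprocess.run(
        list(shell_cmd),
        input=script,         # all commands are sent in a single batch
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Hypothetical usage; that the joern shell accepts piped commands is an assumption:
# output = run_in_one_session(
#     ["joern"],
#     ['importCpg("app.cpg.bin")', "cpg.method.name.l"],
# )
```

The output of all commands comes back as one string, so the commands would need to print some delimiter if the results have to be split per rule afterwards.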

One large rule

Another possibility is to merge all discovery rules that the user wants to execute into one big discovery rule and execute that. The CPG would only be loaded once, and all queries could be run on the loaded CPG.
The problems with this solution might be:

  • if one discovery rule is broken, it might break the whole large discovery rule
  • how do you separate the different results and assign a result to each pattern and each instance?
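Both problems could perhaps be handled by how the merged rule is structured: run each sub-rule in its own error handler and key the results by pattern and instance. The sketch below shows that structure in Python only for illustration; the real rules are joern scripts, and `RuleKey`, `run_merged_rules` and the result format are assumptions.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical key: (pattern id, instance id).
RuleKey = Tuple[int, int]


def run_merged_rules(
    cpg: Any,
    rules: Dict[RuleKey, Callable[[Any], List[Any]]],
) -> Dict[RuleKey, Dict[str, Any]]:
    """Run all rules on one loaded CPG, isolating failures and tagging results."""
    outcome: Dict[RuleKey, Dict[str, Any]] = {}
    for key, rule in rules.items():
        try:
            outcome[key] = {"ok": True, "findings": rule(cpg), "error": None}
        except Exception as exc:
            # A broken rule is recorded but does not stop the remaining rules.
            outcome[key] = {"ok": False, "findings": [], "error": repr(exc)}
    return outcome
```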
@felix-20 felix-20 added the enhancement New feature or request label Jul 13, 2023