[Discovery] Improve execution time of `discovery` #71

felix-20 · 2023-07-13T12:08:26Z

Problem statement

For large applications in particular, `discovery` can take very long when it is run for multiple patterns.
This is most likely caused by loading the CPG (code property graph): at the moment, the CPG is loaded separately for every pattern.

How is it possible to reduce the execution time for `discovery`?

There are a few ideas for improving the execution time of `discovery`:

Single file / or subdirectory discovery

Potentially the easiest solution, as proposed by @pr0me, is to create multiple CPGs for one project. Discovery can then be run individually for each directory, or even each file, in the project. This might decrease the total execution time, because the CPGs themselves are smaller and the discovery runs for the subdirectories / files could be executed in parallel.
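A minimal sketch of the parallel part, assuming discovery for a single subdirectory can be wrapped in a callable. `discover_one` is a hypothetical placeholder for that step (building the smaller CPG and running the rules); it is not an existing function of the framework.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def list_subdirectories(project_root):
    """Return the direct subdirectories of the project, sorted by name."""
    return sorted(p for p in Path(project_root).iterdir() if p.is_dir())


def discover_in_parallel(project_root, discover_one, max_workers=4):
    """Run `discover_one` on every subdirectory and collect the results.

    `discover_one` is a hypothetical callable that takes a directory path
    and returns the findings for that directory.
    """
    subdirs = list_subdirectories(project_root)
    # Threads are enough here if the heavy work runs in an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(discover_one, subdirs))
    return dict(zip(subdirs, results))
```

One thing to keep in mind with this split: a pattern instance that spans several directories would probably not be found in any of the smaller CPGs.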

Keep CPG in memory

If loading the CPG takes this much time, it might be convenient to keep the CPG in memory and execute multiple queries on it.

Interactive `joern` shell

With the interactive shell provided by joern, it should be possible to load the CPG once and execute multiple rules on it. This requires a script that can interact with the joern shell.
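Such a script could look roughly like the following sketch. It keeps one interactive process alive, sends it a command, and reads the output until a sentinel line appears. Everything here is an assumption: the class name `ShellSession` is made up, and whether the joern shell behaves well with piped stdin/stdout has not been checked.

```python
import subprocess


class ShellSession:
    """Keep one interactive process alive and send it several commands."""

    def __init__(self, argv, sentinel_command, sentinel="__DONE__"):
        # `sentinel_command` must make the shell print `sentinel` on its own
        # line; this is how the end of one command's output is detected.
        self.sentinel_command = sentinel_command
        self.sentinel = sentinel
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def run(self, command):
        """Send one command and return its output lines."""
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.write(self.sentinel_command + "\n")
        self.proc.stdin.flush()
        lines = []
        for line in self.proc.stdout:
            line = line.rstrip("\n")
            if line == self.sentinel:
                break
            lines.append(line)
        return lines

    def close(self):
        """Close stdin so the shell exits, then wait for it."""
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

The first `run` call would load the CPG, and each following call would execute one rule on it, so the loading cost is paid only once per session.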

One large rule

Another possibility is to merge all discovery rules that the user wants to execute into one big discovery rule and execute that. The CPG would only be loaded once, and all queries could be run on the loaded CPG.
The problems with this solution might be:

- If one discovery rule is broken, it might break the whole large discovery rule.
- How do you separate the different results and assign a result to each pattern and each instance?
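Both problems could be approached along the lines of the sketch below, assuming the merged rule is driven from Python and every single rule can be called on its own. The `(pattern_id, instance_id)` keys and the function name `run_merged_rules` are hypothetical and only serve the illustration.

```python
def run_merged_rules(cpg, rules):
    """Run all rules on one already loaded CPG.

    `rules` maps a (pattern_id, instance_id) key to a callable that takes
    the CPG and returns an iterable of findings. Results and errors are
    kept per key, so one broken rule does not affect the others.
    """
    results = {}
    errors = {}
    for key, rule in rules.items():
        try:
            results[key] = list(rule(cpg))
        except Exception as exc:  # isolate a broken rule from the rest
            errors[key] = repr(exc)
    return results, errors
```

Because every finding stays attached to its key, the results can be assigned back to the pattern and instance they belong to, and a failing rule only shows up in `errors`.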

felix-20 added the `enhancement` (New feature or request) label on Jul 13, 2023
