performance: parallelize diffgraph application #288

bbrehm · 2025-01-09T11:58:36Z

Can you @mpollmeier test that on the same thing you use to evaluate ArrayList vs ArrayBuffer?
Can you test this on a big cpg creation?

(The issue with CPG creation is temporary memory during diffgraph application. If we parallelize, more objects are alive at the same time. I also had to remove some early clearing of memory, because that would require more synchronization to figure out when an object is truly dead.)

mpollmeier · 2025-01-09T12:04:23Z

thank you, will do 👍🏻

bbrehm · 2025-01-09T13:50:51Z

Ok, it wasn't so bad to restore the early cleanup logic (so that the GC can reclaim structures that are not needed anymore, even while the diffgraph application is still running).

Main things to consider are:

- The atomic refcount rigamarole
- The Callable closure is kept alive by the java.util.concurrent.ForkJoinTask.AdaptedCallable after it finished, until we do submissions.clear(). So we need to be careful about what we capture in the closure!
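
A minimal sketch of the two points above, not the code in this PR: `SharedScratch` and `RefcountSketch` are hypothetical names, and the four-way split is arbitrary. Each task releases the shared buffer when it finishes, and the last release drops the reference so the GC can reclaim the buffer even though the finished tasks (and their closures) are still reachable from `submissions`.

```scala
import java.util.concurrent.{Callable, ForkJoinPool, ForkJoinTask}
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable

// Hypothetical holder: a scratch buffer shared by several tasks and dropped by the last user.
final class SharedScratch(initial: Array[Int], users: Int) {
  @volatile private var data: Array[Int] = initial
  private val remaining = new AtomicInteger(users)

  def get: Array[Int] = data

  // Each task calls this exactly once, after its last read of the buffer.
  def release(): Unit =
    if (remaining.decrementAndGet() == 0) data = null // last user clears the reference

  def isReleased: Boolean = data == null
}

object RefcountSketch {
  def run(): (Int, Boolean) = {
    val pool        = new ForkJoinPool(4)
    val scratch     = new SharedScratch(Array.tabulate(1000)(identity), users = 4)
    val submissions = mutable.ArrayBuffer.empty[ForkJoinTask[Int]]

    for (part <- 0 until 4) {
      // The closure captures only `scratch` and `part`. It stays reachable through
      // the task stored in `submissions`, so it must not capture the raw array itself.
      submissions += pool.submit(new Callable[Int] {
        def call(): Int = {
          val arr = scratch.get
          var sum = 0
          var i   = part
          while (i < arr.length) { sum += arr(i); i += 4 }
          scratch.release()
          sum
        }
      })
    }

    val total = submissions.map(_.join()).sum
    submissions.clear() // only now do the task objects and their closures become unreachable
    pool.shutdown()
    (total, scratch.isReleased)
  }
}
```

Had the closure captured the array directly (for example via a local `val arr` defined outside the `Callable`), clearing the field inside `SharedScratch` would not have helped, because the task object would still hold the array until `submissions.clear()`.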

mpollmeier · 2025-01-09T15:55:50Z

import timing results: #286 (comment)

mpollmeier · 2025-01-13T07:14:49Z

Won't merge for now, because the performance gains currently don't justify the added complexity. But this might come in handy in the future, so let's keep the PR open.

Context:
#286

core/src/main/scala/flatgraph/DiffGraphApplier.scala

README.md

mpollmeier · 2025-01-20T16:48:55Z

core/src/main/scala/flatgraph/Misc.scala

+        java.lang.Thread.currentThread() match {
+          case fjt: concurrent.ForkJoinWorkerThread => fjt.getPool


why not import the classes rather than the concurrent namespace?
that would make this more readable IMO, especially given the mix of 'current' and 'concurrent'

I renamed the current, good catch, that looked ugly.

Otherwise, I guess that's a matter of personal coding style -- I prefer if names are at least somewhat qualified, such that the import lists are small and mostly the same across all files, and the qualified names are mostly the same across all files. Like the ubiquitous import scala.collection.mutable, or import io.shiftleft.codepropertygraph.generated.nodes.

Personal style is fine to a degree, but in general it's best to stick to community guidelines, so that more than just one person has a good chance of understanding the code. Scala's guidance on that front is essentially "import the class unless there's a good reason not to".

import scala.collection.mutable has a good reason, because it deviates from the norm, namely not being immutable.

Our import io.shiftleft.codepropertygraph.generated.nodes is a historical decision that we changed some time ago - in joern it's not used any longer.
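
For comparison, a sketch of the class-import style applied to the snippet under review. Only the `ForkJoinWorkerThread` case appears in the diff above; the object name `PoolLookup`, the method name, and the fallback to the common pool are assumptions made for illustration.

```scala
import java.util.concurrent.{ForkJoinPool, ForkJoinWorkerThread}

object PoolLookup {
  // Returns the pool that owns the calling thread, if it is a fork-join worker.
  def currentPool(): ForkJoinPool =
    Thread.currentThread() match {
      case worker: ForkJoinWorkerThread => worker.getPool
      // Assumption: fall back to the common pool when called from any other thread.
      case _ => ForkJoinPool.commonPool()
    }
}
```

The qualified variant would instead use `import java.util.concurrent` and write `concurrent.ForkJoinWorkerThread` at the use site. Both compile to the same thing; the difference is purely in how the names read.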

core/src/main/scala/flatgraph/Misc.scala

core/src/main/scala/flatgraph/storage/Deserialization.scala


core/src/main/scala/flatgraph/Misc.scala

bbrehm · 2025-01-27T11:32:15Z

@mpollmeier @ml86 ready to merge?

For my test-CPG, I observe: 1100 ms load / 2200 ms store with this PR, compared to 2600 ms load / 4100 ms store on master.

If we decide against this, then the Zstd directBuffer thing must be backported in some way -- the long JNI critical section in zstd-jni was an unacceptable bug (and the only reason it didn't bite us so far was that the surrounding code was single-threaded and we already moved from G1 parallel garbage collector for many testsets).
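
A rough sketch of the direct-buffer idea using only java.nio, not the actual ZstdWrapper code: direct buffers live outside the Java heap, so native code can read and write them without holding a heap array in a JNI critical section. The `compress` parameter is a hypothetical stand-in for the native call, and `DirectBufferSketch` and `viaDirectBuffers` are made-up names.

```scala
import java.nio.ByteBuffer

object DirectBufferSketch {
  // Stages `input` in a direct buffer, lets `compress(dst, src)` fill a direct
  // output buffer, then copies the written bytes back onto the heap.
  def viaDirectBuffers(input: Array[Byte], maxOut: Int)(
    compress: (ByteBuffer, ByteBuffer) => Unit
  ): Array[Byte] = {
    val src = ByteBuffer.allocateDirect(input.length)
    src.put(input)
    src.flip() // switch src from writing to reading

    val dst = ByteBuffer.allocateDirect(maxOut)
    compress(dst, src) // stand-in for the native compressor; must not overrun dst
    dst.flip()

    val out = new Array[Byte](dst.remaining())
    dst.get(out)
    out
  }
}
```

The price is the extra copy into and out of off-heap memory, which is presumably why the commit message for this change ends in ":(".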

Using a custom replacement for mutable.LinkedHashMap for the stringpool / deduplication brings the store down to 1700 ms. I'll put that into a separate PR that we can then decide to merge or not to merge.
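
To illustrate what the string pool needs from its map, here is a minimal sketch with the same observable behaviour as an insertion-ordered map: each distinct string gets a stable index in first-seen order. It is not the custom replacement mentioned above (that lives in a separate PR), and `StringPoolSketch` is a hypothetical name.

```scala
import scala.collection.mutable

final class StringPoolSketch {
  private val indexOf = mutable.HashMap.empty[String, Int]
  private val ordered = mutable.ArrayBuffer.empty[String]

  // Returns the index of `s`, assigning the next free index on first sight.
  def intern(s: String): Int =
    indexOf.getOrElseUpdate(s, { ordered += s; ordered.length - 1 })

  // All distinct strings, in the order they were first interned.
  def strings: IndexedSeq[String] = ordered.toIndexedSeq
}
```

Splitting the lookup (hash map) from the ordering (plain buffer) avoids the per-entry linked-list nodes that `mutable.LinkedHashMap` maintains; whether that accounts for the measured difference is not something this sketch can show.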

mpollmeier · 2025-01-27T11:53:23Z

will re-review in a bit

core/src/main/scala/flatgraph/storage/ZstdWrapper.scala

mpollmeier

(note that I just pushed two minor code formatting changes)

bbrehm added 2 commits January 9, 2025 12:52

parallelize diffgraph application

00e7965

oopsie formatting / debug code

fdd2d56

bbrehm assigned mpollmeier Jan 9, 2025

restore the early cleanup

6015548

mpollmeier mentioned this pull request Jan 9, 2025

performance: use ArrayList instead of ArrayBuffer #286

Closed

mpollmeier self-requested a review January 10, 2025 12:36

bbrehm added 2 commits January 17, 2025 10:24

no need to build arraybuffer for addNodes

8598fae

add some notes, parallelize everything

1853d43

mpollmeier reviewed Jan 20, 2025

core/src/main/scala/flatgraph/DiffGraphApplier.scala

mpollmeier reviewed Jan 20, 2025

README.md

mpollmeier reviewed Jan 20, 2025

address suggestions re option-wrapping; also fix bug about executor shutdown in converter

09086d7

mpollmeier self-requested a review January 22, 2025 07:59

mpollmeier approved these changes Jan 22, 2025

core/src/main/scala/flatgraph/Misc.scala Outdated

mpollmeier and others added 6 commits January 22, 2025 09:01

Update core/src/main/scala/flatgraph/Misc.scala

b24d9a1

fix overflow bug

b449073

use direct buffers for compression :(

c8fda7e

minor stuff

29a565a

final flourishes

83686a4

address suggestion: some more source-code comment

759996c

ml86 self-requested a review January 27, 2025 11:33

ml86 approved these changes Jan 27, 2025

mpollmeier reviewed Jan 27, 2025

core/src/main/scala/flatgraph/storage/ZstdWrapper.scala Outdated

Update core/src/main/scala/flatgraph/storage/ZstdWrapper.scala

0c56477

mpollmeier reviewed Jan 27, 2025

core/src/main/scala/flatgraph/storage/ZstdWrapper.scala Outdated

Update core/src/main/scala/flatgraph/storage/ZstdWrapper.scala

57341e8

mpollmeier approved these changes Jan 27, 2025

sbt scalafmt

542c590

bbrehm merged commit edabc69 into master Jan 27, 2025
1 check passed
