perf: Fix performance issue when assigning previous version #78

chme · 2024-03-19T19:25:04Z

I still encounter severe performance problems on a commit graph with many merge commits (each feature is developed in its own feature branch, so a version can have more than 20 merge commits). Sorry for not testing this as part of #76.

The problematic code path is now in build._assign_previous_versions. It has the same issue that #76 fixed for grouping commits into versions: the commit graph is traversed along every possible path, so each branching or merge commit causes the same commits to be checked multiple times.

With large commit graphs, I think it is best to traverse the graph only once and gather all the necessary information along the way. This PR finds the previous version while the commits are being grouped into versions.
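To illustrate why repeated traversal blows up, here is a minimal sketch on a toy graph, not the actual git-changelog code. The names `build_merge_chain`, `count_visits_naive`, and `count_visits_once` are hypothetical. The graph mimics one-commit feature branches merged into main one after another; the naive walk follows every path, while the second walk remembers which commits it has already seen.

```python
from typing import Dict, List


def build_merge_chain(n_merges: int) -> Dict[str, List[str]]:
    """Map each commit to its parents for n merges of one-commit branches into main."""
    parents: Dict[str, List[str]] = {"m0": []}
    for i in range(1, n_merges + 1):
        parents[f"f{i}"] = [f"m{i - 1}"]            # feature commit branched off main
        parents[f"m{i}"] = [f"m{i - 1}", f"f{i}"]   # merge commit with two parents
    return parents


def count_visits_naive(parents: Dict[str, List[str]], start: str) -> int:
    """Walk every path to the root; shared ancestors are visited again and again."""
    visits = 0
    stack = [start]
    while stack:
        commit = stack.pop()
        visits += 1
        stack.extend(parents[commit])
    return visits


def count_visits_once(parents: Dict[str, List[str]], start: str) -> int:
    """Walk the graph while skipping commits that were already seen."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return len(seen)


graph = build_merge_chain(10)
naive_visits = count_visits_naive(graph, "m10")
single_pass_visits = count_visits_once(graph, "m10")
```

On this toy graph the naive walk roughly doubles its work with every additional merge, while the single pass grows linearly with the number of commits.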

I will do some more tests tomorrow, but I would appreciate it if you could take a look at the proposed solution.

Assigning the previous version checked commits multiple times whenever a merge commit was encountered. This led to severe performance issues on commit graphs with many merge commits. Instead of traversing the commit graph again, finding the previous version is now part of the commit-to-version grouping.

pawamoy

I'll trust you again on this one. It would be nice to add fuzzing to the test suite: a test that runs the code on a big temp repo, with hundreds of commits, dozens of branches, merge commits, etc., all randomly generated, just to see if the code finishes running in a small amount of time.
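As a rough sketch of that idea, assuming an in-memory stand-in instead of a real temporary Git repository: the helpers `random_commit_graph`, `reachable`, and `timed` below are hypothetical and not part of git-changelog. The graph is generated from a seed so that a failing run can be reproduced, and the elapsed time is compared against a budget.

```python
import random
import time
from typing import Callable, Dict, List, Set, Tuple


def random_commit_graph(n_commits: int, merge_ratio: float, seed: int) -> Dict[int, List[int]]:
    """Build a random parent map: a main line plus randomly placed merge commits."""
    rng = random.Random(seed)
    parents: Dict[int, List[int]] = {0: []}
    for commit in range(1, n_commits):
        commit_parents = [commit - 1]  # first parent: previous commit on the main line
        if commit >= 2 and rng.random() < merge_ratio:
            # second parent: any older commit other than the first parent
            commit_parents.append(rng.randrange(commit - 1))
        parents[commit] = commit_parents
    return parents


def reachable(parents: Dict[int, List[int]], head: int) -> Set[int]:
    """Collect all ancestors of head (including head), visiting each commit once."""
    seen: Set[int] = set()
    stack = [head]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def timed(func: Callable, *args) -> Tuple[object, float]:
    """Run func and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


graph = random_commit_graph(500, merge_ratio=0.3, seed=42)
ancestors, elapsed = timed(reachable, graph, 499)
```

A real fuzz test would create actual commits and branches in a temporary repository and run the changelog build on it; the time budget check would stay the same.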

chme · 2024-03-20T19:48:46Z

I added a "test" that generates profiling data with cProfile. It has no assertions, not sure if it should have any ...
The git repo is just a bunch of merges of one commit into main, but it illustrates the current issue well (see below).
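For readers unfamiliar with cProfile, collecting such data could look roughly like the sketch below. It is not the test added in this PR; `workload` is a hypothetical stand-in for building the changelog, and `profile_to_text` is an illustrative helper that uses only the standard library.

```python
import cProfile
import io
import pstats
from typing import Callable


def workload(n: int) -> int:
    """Hypothetical stand-in for the code under measurement."""
    return sum(i * i for i in range(n))


def profile_to_text(func: Callable, *args) -> str:
    """Profile one call of func and return the report as plain text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


report = profile_to_text(workload, 10_000)
```

The header of the report states the total number of function calls and the total time in seconds, which are the two figures worth comparing between versions of the code.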

I ran the test on different versions of the code:

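| No. of merge commits | 2.4.1 f calls | 2.4.1 seconds | PR 78 f calls | PR 78 seconds | 2.4.0 f calls | 2.4.0 seconds |
| --- | --- | --- | --- | --- | --- | --- |
| 15 | 394,968 | 0.361 | 1,764 | 0.009 | 1,544 | 0.007 |
| 17 | 1,574,774 | 4.413 | 1,922 | 0.008 | 1,674 | 0.008 |
| 18 | 3,147,717 | 17.444 | 2,001 | 0.008 | 1,739 | 0.008 |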

Note: the "2.4.0" version was not the real 2.4.0; I just reverted the changes to build.py. Likewise, "2.4.1" was in reality the current main branch.

Running the test with 20 merges was unbearable on current main (I aborted it before it finished).

My tests on a "real" git repo were successful with this PR applied. git-changelog was fast and it felt as if the response was instant.

For reference, here are the generated files:

perf_stats_2.4.0_15.txt
perf_stats_2.4.0_17.txt
perf_stats_2.4.0_18.txt
perf_stats_now_15.txt
perf_stats_now_17.txt
perf_stats_now_18.txt
perf_stats_pr78_15.txt
perf_stats_pr78_17.txt
perf_stats_pr78_18.txt

pawamoy · 2024-03-20T23:52:46Z

Amazing, thanks a lot!

Since it's not really a test, I'd move this profiling code to a task, in duties.py. Let me do that tomorrow 🙂

pawamoy · 2024-03-23T11:47:43Z

Actually it's using Pytest fixtures so it could be hard to move that outside of a test run. So instead I think we can simply add a timeout to the test to make sure that it runs fast. WDYT?

pawamoy · 2024-03-23T19:58:46Z

I rewrote it as a task. Let's merge it as is, and we can refine it later if needed.

pawamoy

Thanks a lot for your hard work! 🚀

pawamoy reviewed Mar 19, 2024

perf: Add test creating prfiling data

99496aa

dummy added 3 commits March 23, 2024 21:01

fixup! perf: Add test creating prfiling data

9e63757

fixup! perf: Add test creating prfiling data

5fa631d

fixup! perf: Add test creating prfiling data

dffb4e8

pawamoy approved these changes Mar 23, 2024

pawamoy merged commit f35c88b into pawamoy:main Mar 23, 2024
16 checks passed
