Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Fix performance issue when assigning previous version #78

Merged
merged 5 commits into from
Mar 23, 2024

Conversation

chme
Copy link
Contributor

@chme chme commented Mar 19, 2024

I still encounter severe performance problems on a commit graph that has a lot of merge commits (each feature is developed in a feature branch; a version can have more than 20 merge commits - sorry for not testing this as part of #76).

The problematic code path is now in build._assign_previous_versions. It has the same issue that was fixed with #76 for grouping commits into versions.
The commit graph is traversed through all possible branches. Branching/merge commits result in the same commit being checked multiple times.

With large commit graphs, I think, it is best to optimize it, to only follow the commit graph once and gather all information necessary. This PR will find the previous version information during the grouping of the commits.

I will do some more tests tomorrow, but would appreciate, if you could take a look at the proposed solution.

Assigning the previous version was checking commits multiple times, when ever a merge commit was encountered.
This leads to severe performance issues on a commit graph with lots of merge commits.

Instead of traversing the commit graph again, finding the previous version is now part of the commit to version grouping.
Copy link
Owner

@pawamoy pawamoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trust you again on this one. What would be nice is that we add fuzzing to the test suite: a test that runs the code on a big temp repo, with hundreds of commits, dozens of branches, merge commits, etc., all randomly generated, just to see if the code finishes running in a small amount of time.

@chme
Copy link
Contributor Author

chme commented Mar 20, 2024

I added a "test" that generates profiling data with cProfile. It has no assertions, not sure if it should have any ...
The git repo is just a bunch of merges of one commit into main, but it illustrates the current issue well (see below).

I did run the test on different versions of the code:

No. merge commit 2.4.1 PR 78 2.4.0
f calls seconds f calls seconds f calls seconds
15 394,968 0.361 1,764 0.009 1,544 0.007
17 1,574,774 4.413 1,922 0.008 1,674 0.008
18 3,147,717 17.444 2,001 0.008 1,739 0.008

Note: the 2.4.0 version was not the "real" 2.4.0, but I just reverted the changes to build.py. And 2.4.1 was in reality the current main branch.

Running the test with 20 merges was unbearable with current main (I did not wait until it finished and aborted it ...).

My tests on a "real" git repo were successful with this PR applied. git-changelog was fast and it felt as if the response was instant.

For reference, here the generated files:

perf_stats_2.4.0_15.txt
perf_stats_2.4.0_17.txt
perf_stats_2.4.0_18.txt
perf_stats_now_15.txt
perf_stats_now_17.txt
perf_stats_now_18.txt
perf_stats_pr78_15.txt
perf_stats_pr78_17.txt
perf_stats_pr78_18.txt

@pawamoy
Copy link
Owner

pawamoy commented Mar 20, 2024

Amazing, thanks a lot!

Since it's not really a test, I'd move this profiling code to a task, in duties.py. Let me do that tomorrow 🙂

@pawamoy
Copy link
Owner

pawamoy commented Mar 23, 2024

Actually it's using Pytest fixtures so it could be hard to move that outside of a test run. So instead I think we can simply add a timeout to the test to make sure that it runs fast. WDYT?

@pawamoy
Copy link
Owner

pawamoy commented Mar 23, 2024

I rewrote it as a task. Lets merge it as is, and we can refine it later if needed.

Copy link
Owner

@pawamoy pawamoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your hard work! 🚀

@pawamoy pawamoy merged commit f35c88b into pawamoy:main Mar 23, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants