
Allow for the configuration of the size of the fetch queue and workers #120

Merged
3 commits merged into master from configurable-parallelism on Dec 17, 2024

Conversation

Owner

@kgaughan kgaughan commented Dec 17, 2024

The defaults should be fine in 99% of cases, but some may want to tune them down on more powerful machines.
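For illustration, the two options would be tuned via the configuration file. The key names below follow the job_queue_depth and parallelism TOML option names mentioned in the Sourcery review; the values are arbitrary examples, not recommendations.

```toml
# Hypothetical tuning for a machine where the CPU-derived defaults are too high
parallelism = 4
job_queue_depth = 8
```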

Summary by Sourcery

Add configuration options for job queue depth and parallelism to allow tuning of fetch operations, with a cap on CPU usage to enhance performance.

New Features:

  • Introduce configuration options for job queue depth and parallelism in the fetch process.

Enhancements:

  • Cap the number of CPUs/cores used to a maximum of 32 to optimize performance.


sourcery-ai bot commented Dec 17, 2024

Reviewer's Guide by Sourcery

This PR implements configuration options for controlling feed fetching parallelism by introducing two new settings: JobQueueDepth and Parallelism. The implementation adds these fields to the Config struct, sets sensible defaults based on CPU cores, and modifies the manifest priming function to use these configurable values instead of hardcoded ones.

Sequence diagram for manifest priming with configurable parallelism

sequenceDiagram
    participant Main
    participant Manifest
    participant Worker
    Main->>Manifest: Prime(config.Cache, config.Timeout.Duration, config.Parallelism, config.JobQueueDepth)
    loop parallelism times
        Manifest->>Worker: Start worker
        Worker-->>Manifest: Fetch job
    end
    Manifest->>Main: Complete priming

Class diagram for updated Config struct

classDiagram
    class Config {
        string Name
        string URL
        string Owner
        string Email
        string FeedID
        string Cache
        utils.Duration Timeout
        string themePath
        fs.FS Theme
        string Output
        manifest.Feed[] Feeds
        int ItemsPerPage
        int MaxPages
        int JobQueueDepth
        int Parallelism
    }

File-Level Changes

Add new configuration options for controlling feed fetching behavior (internal/config.go)
  • Add JobQueueDepth and Parallelism fields to the Config struct
  • Set CPU-derived defaults for both new fields
  • Add a cpuLimit constant (32) to cap maximum CPU core usage

Implement bounds checking and validation for the new configuration values (internal/config.go)
  • Enforce a minimum Parallelism of 1 and a maximum of min(cpuLimit, available CPUs)
  • Set the JobQueueDepth minimum to twice the Parallelism value

Modify manifest priming to use the configurable parallelism values (internal/manifest/manifest.go, cmd/mercury/main.go)
  • Update the Prime function signature to accept parallelism and jobQueueDepth parameters
  • Remove the hardcoded runtime.NumCPU() usage
  • Configure the job channel with a dynamic queue depth



@sourcery-ai sourcery-ai bot left a comment


Hey @kgaughan - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Please add documentation for the new job_queue_depth and parallelism TOML configuration options explaining their purpose and typical usage scenarios.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


@@ -37,6 +40,9 @@ func (c *Config) Load(path string) error {
c.Output = "./output"
c.ItemsPerPage = 10
c.MaxPages = 5
// These are both somewhat arbitrary
c.JobQueueDepth = 2 * runtime.NumCPU()

suggestion (performance): Consider adding upper bounds to JobQueueDepth and Parallelism to prevent resource exhaustion on many-CPU systems

While lower bounds are enforced, unbounded upper values on many-CPU systems could lead to excessive resource usage. Consider adding a reasonable upper limit, such as capping Parallelism at min(runtime.NumCPU(), 32).

Suggested implementation:

	// Cap values to prevent resource exhaustion on many-CPU systems
	const maxCPUs = 32 // maximum reasonable number of CPUs to utilize
	c.Parallelism = min(runtime.NumCPU(), maxCPUs) // integer min builtin (Go 1.21+)
	c.JobQueueDepth = 2 * c.Parallelism

@kgaughan
Owner Author

@sourcery-ai review


@sourcery-ai sourcery-ai bot left a comment


Hey @kgaughan - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding a comment explaining the reasoning behind the cpuLimit=32 constant. Is this based on empirical testing or specific hardware considerations?
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


@kgaughan kgaughan merged commit aa47283 into master Dec 17, 2024
4 checks passed
@kgaughan kgaughan deleted the configurable-parallelism branch December 17, 2024 21:48