bundesanzeiger: allow fetching multiple pages for one company #143
The current Bundesanzeiger implementation only fetches one page of results with up to 20 reports, but sometimes it might be interesting to get older reports as well.
This adds a named parameter `page_limit` to `Bundesanzeiger.get_reports`. The default value is 1, which preserves the current behavior of fetching only one page. If a higher value is set, the client searches the returned HTML for a "next page" link and keeps generating reports until `page_limit` pages have been parsed or there is no further "next page" link. `float('inf')` can be passed to fetch all available pages.

This commit adds one unit test for the method that finds the "next page" link and another that checks that more than 20 reports are actually generated.
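
The pagination loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `find_next_page_url`, `iter_pages`, the `class="next"` markup, and the in-memory `FAKE_SITE` are all hypothetical stand-ins, and the real Bundesanzeiger HTML and client methods may look different.

```python
import re
from typing import Callable, Iterator, Optional

# Assumed markup for the "next page" link; the real site may differ.
NEXT_LINK = re.compile(r'<a[^>]*class="next"[^>]*href="([^"]+)"')


def find_next_page_url(html: str) -> Optional[str]:
    """Return the href of the "next page" link, or None if there is none."""
    match = NEXT_LINK.search(html)
    return match.group(1) if match else None


def iter_pages(
    fetch: Callable[[str], str], start_url: str, page_limit: float = 1
) -> Iterator[str]:
    """Yield page HTML until page_limit pages were parsed or no next link exists."""
    url: Optional[str] = start_url
    pages_parsed = 0
    while url is not None and pages_parsed < page_limit:
        html = fetch(url)
        yield html
        pages_parsed += 1
        url = find_next_page_url(html)


# Hypothetical three-page result set used instead of real HTTP requests.
FAKE_SITE = {
    "/p1": '<a class="next" href="/p2">next</a>',
    "/p2": '<a class="next" href="/p3">next</a>',
    "/p3": "<p>last page</p>",
}

one_page = list(iter_pages(FAKE_SITE.__getitem__, "/p1"))
all_pages = list(iter_pages(FAKE_SITE.__getitem__, "/p1", page_limit=float("inf")))
```

Comparing a page counter against `float('inf')` is always true, so the same loop covers both the bounded case and the "fetch everything" case without a separate code path.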
This also encodes the company name in the URL so that search terms like `"Saxony Minerals & Exploration - SME AG"` work correctly.
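
To illustrate why the encoding matters, here is a small sketch using only the standard library. The base URL and the query parameter name `q` are placeholders chosen for the example, not the ones the Bundesanzeiger client actually uses.

```python
from urllib.parse import parse_qs, urlencode

company = "Saxony Minerals & Exploration - SME AG"

# Placeholder endpoint and parameter name, for illustration only.
BASE_URL = "https://example.invalid/search"

# Without encoding, the "&" is read as a parameter separator,
# so the server only sees part of the company name.
naive_query = "q=" + company
naive_parsed = parse_qs(naive_query)

# urlencode escapes "&" as %26 and spaces as "+", keeping the name intact.
safe_query = urlencode({"q": company})
safe_url = BASE_URL + "?" + safe_query
safe_parsed = parse_qs(safe_query)
```

Here `safe_parsed["q"][0]` round-trips to the full company name, while the naively built query loses everything after the ampersand.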