Issues with circle-ci job for harvesting geoserver’s DCAT-AP #66

Open
pietercolpaert opened this issue Feb 13, 2024 · 0 comments


On the one hand, there are several issues with the Geoserver’s DCAT-AP export: it uses non-existent Hydra IRIs for the pagination, it leaves some prefixes undefined (leading to broken IRIs), the pages take very long to respond and sometimes even time out, and the distributions are blank nodes instead of being given an IRI. Can we get in contact with Geoserver to solve that? Is there a reference to the code that generates this export, so that we could maybe open a pull request there?

On the other hand, the scripts provided in this repository at https://github.com/Informatievlaanderen/vodap/blob/master/.circleci/config.yml and https://github.com/Informatievlaanderen/vodap/blob/master/scripts/download.sh also contain an important mistake: the parser restarts numbering blank nodes on every page, so the same blank node labels are reused on different pages and will conflict.

I’ve written a small Node.js script that does number blank nodes correctly over here: https://github.com/pietercolpaert/DCAT-AP-Dumps-To-Feeds/blob/main/bin/helperFlanders.ts - feel free to reuse it here.
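
To illustrate the blank node conflict, here is a minimal sketch of one way to keep labels apart. It is not the code from the script linked above. It assumes each page has already been serialised to N-Triples text, and it prefixes every blank node label with the index of the page it came from. The names `relabelPage` and `mergePages` and the `p<index>_` prefix are made up for this example, and the label pattern is simplified to ASCII letters, digits and underscores.

```typescript
// Matches, in order: an IRI, a string literal, or a blank node label.
// IRIs and literals are matched only so that they can be skipped untouched.
// Simplification: only the ASCII part of a blank node label is captured.
const TOKEN = /<[^>]*>|"(?:[^"\\]|\\.)*"|_:([A-Za-z0-9_]+)/g;

// Hypothetical helper: prefix every blank node label on one page with the
// page index, so that "_:b0" on page 0 and "_:b0" on page 1 stay distinct.
function relabelPage(ntriples: string, pageIndex: number): string {
  return ntriples.replace(TOKEN, (match: string, label: string | undefined) =>
    label === undefined ? match : `_:p${pageIndex}_${label}`
  );
}

// Hypothetical helper: relabel each page, then concatenate the results.
function mergePages(pages: string[]): string {
  return pages.map((page, i) => relabelPage(page.trim(), i)).join("\n");
}
```

Calling `mergePages([page0, page1])` on two pages that both use `_:b0` for a distribution yields one document in which the two distributions have different labels. Blank node labels are only scoped to the document they appear in, so treating equal labels from different pages as different nodes is the intended behaviour.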

Needs input from @bertvannuffelen
