This repository has been archived by the owner on Apr 24, 2024. It is now read-only.
Commit
Feature/1176 separate fetching of german common names; Data overrides (#1186)

- [X] Separate fetching of german common names and merging of datasets, fixes #1176
- [X] Add apply overrides functionality, fixes #726
- [x] Create PR with new data, overrides and README in [scraper-data repository](https://github.com/ElektraInitiative/scraper-data/)

Corresponding scraper-data PR: ElektraInitiative/scraper-data#2

This PR supersedes #799

<!-- Check relevant points but **please do not remove entries**. -->

## Basics

<!-- These points need to be fulfilled for every PR. -->

- [x] The PR is rebased with current master
- [x] I added a line to [changelog.md](/doc/changelog.md)
- [x] Details of what I changed are in the commit messages
- [x] References to issues, e.g. `close #X`, are in the commit messages and changelog
- [ ] The buildserver is happy

<!-- If you have any troubles fulfilling these criteria, please write about the trouble as comment in the PR. We will help you, but we cannot accept PRs that do not fulfill the basics. -->

## Checklist

<!-- For documentation fixes, spell checking, and similar none of these points below need to be checked. Otherwise please check these points when getting a PR done: -->

- [x] I fully described what my PR does in the documentation
- [x] I fixed all affected documentation
- [ ] I fixed the introduction tour
- [ ] I wrote migrations in a way that they are compatible with already present data
- [ ] I fixed all affected decisions
- [ ] I added automated tests or a [manual test protocol](../doc/tests/manual/protocol.md)
- [x] I added code comments, logging, and assertions as appropriate
- [ ] I translated all strings visible to the user
- [ ] I mentioned [every code or binary](https://github.com/ElektraInitiative/PermaplanT/blob/master/.reuse/dep5) not directly written or done by me in [reuse syntax](https://reuse.software/)
- [ ] I created left-over issues for things that are still to be done
- [ ] Code is conforming to [our Architecture](/doc/architecture)
- [ ] Code is conforming to [our Guidelines](/doc/guidelines)
- [ ] Code is consistent to [our Design Decisions](/doc/decisions)
- [ ] Exceptions to any guidelines are documented

## First Time Checklist

<!-- These points are only relevant when creating a PR the first time. -->

- [ ] I have installed and I am using [pre-commit hooks](../doc/contrib/README.md#Hooks)
- [ ] I am using [Tailwind CSS Linting](https://tailwindcss.com/blog/introducing-linting-for-tailwindcss-intellisense)

## Review

<!-- Reviewers can copy&check the following to their review. Also the checklist above can be used. But also the PR creator should check these points when getting a PR done: -->

- [ ] I've tested the code
- [ ] I've read through the whole code
- [ ] I've read through the whole documentation
- [ ] I've checked conformity to guidelines
- [ ] I've checked conformity to requirements
- [ ] I've checked that the requirements are tested
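The "apply overrides" step named in the commit message can be pictured with a small sketch. This is not the repository's `applyOverride` helper, whose implementation is not shown in this commit view; it only assumes that override rows identify a plant by `unique_name` and that an empty cell means "keep the existing value". The function name `applyOverrideRows` and the columns `common_name_de` and `height` are hypothetical.

```javascript
// Hypothetical sketch, NOT the repository's applyOverride implementation.
// Assumption: override rows are keyed by `unique_name`, and an empty cell
// means "leave the existing value unchanged".
function applyOverrideRows(plants, overrideRows) {
  // Index plants by unique_name so each override row is a constant-time lookup.
  const byName = new Map(plants.map((plant) => [plant.unique_name, plant]));

  for (const row of overrideRows) {
    const plant = byName.get(row.unique_name);
    if (!plant) {
      console.log(`[INFO] No plant named '${row.unique_name}', skipping.`);
      continue;
    }
    for (const [column, value] of Object.entries(row)) {
      // Skip the key column and empty cells.
      if (column === "unique_name" || value === "") continue;
      plant[column] = value;
    }
  }
  return plants;
}

// Example data; the column names are made up for illustration.
const samplePlants = [
  { unique_name: "Abies alba", common_name_de: "", height: "50" },
  { unique_name: "Acer campestre", common_name_de: "Feldahorn", height: "15" },
];
const sampleOverrides = [
  { unique_name: "Abies alba", common_name_de: "Weißtanne", height: "" },
  { unique_name: "Unknown plant", common_name_de: "x", height: "" },
];
applyOverrideRows(samplePlants, sampleOverrides);
```

In this sketch the first plant receives the new German common name while its `height` stays untouched, and the override for the unknown plant is logged and skipped.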
Showing 12 changed files with 504 additions and 84 deletions.
@@ -0,0 +1,76 @@

```js
import fs from "fs";
import path from "path";
import { parse as json2csv } from "json2csv";
import csv from "csvtojson";
import { cleanUpJsonForCsv } from "./helpers/helpers.js";
import { applyOverride } from "./helpers/override.js";

const deletionsFile = "00_DELETIONS.csv";

async function loadMergedDataset() {
  return csv().fromFile("data/mergedDatasets.csv");
}

async function applyDeletions(plants) {
  console.log(`[INFO] Deleting plants from data/overrides/${deletionsFile}`);

  const deletePlants = await csv().fromFile(`data/overrides/${deletionsFile}`);

  deletePlants.forEach((overridePlant) => {
    // find the plant
    const index = plants.findIndex(
      (plant) => plant.unique_name === overridePlant.unique_name
    );

    if (index === -1) {
      console.log(
        `[INFO] Could not find plant with unique_name '${overridePlant.unique_name}' in merged dataset.`
      );
      return;
    }

    // delete the plant
    plants.splice(index, 1);
  });

  return plants;
}

async function applyAllOverrides(plants) {
  let overridesDir = "data/overrides";
  if (!fs.existsSync(overridesDir)) {
    fs.mkdirSync(overridesDir);
  }

  // list all csv files in data/overrides
  const overrideFiles = fs.readdirSync(overridesDir);
  overrideFiles.sort();

  // apply all overrides
  for (const file of overrideFiles) {
    // deletions were handled separately
    if (path.extname(file) !== ".csv" || file === deletionsFile) {
      continue;
    }
    await applyOverride(plants, `${overridesDir}/${file}`);
  }

  return plants;
}

async function writePlantsToOverwriteCsv(plants) {
  console.log(
    `[INFO] Writing ${plants.length} plants to csv data/finalDataset.csv`
  );
  cleanUpJsonForCsv(plants);
  const csvFile = json2csv(plants);
  fs.writeFileSync("data/finalDataset.csv", csvFile);

  return plants;
}

loadMergedDataset()
  .then((plants) => applyDeletions(plants))
  .then((plants) => applyAllOverrides(plants))
  .then((plants) => writePlantsToOverwriteCsv(plants))
  .catch((error) => console.error(error));
```
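The deletion step in `applyDeletions` looks up each entry with `findIndex` and removes it with `splice`, which scans the plant list once per deletion row. As a possible alternative, the sketch below collects the names to delete in a `Set` and filters the list in one pass. The function name `removeDeleted` and its return shape are assumptions for illustration, not part of this commit. One behavioral difference to keep in mind: the loop in the commit removes only the first plant matching each deletion row, whereas this filter removes every plant with a matching `unique_name`.

```javascript
// Sketch of a Set-based alternative to the findIndex/splice loop.
// Hypothetical helper; it returns new arrays instead of mutating `plants`.
function removeDeleted(plants, deletePlants) {
  const namesToDelete = new Set(deletePlants.map((row) => row.unique_name));

  // Keep every plant whose unique_name is not scheduled for deletion.
  const remaining = plants.filter(
    (plant) => !namesToDelete.has(plant.unique_name)
  );

  // Collect deletion entries that matched no plant, so they can be logged
  // in the same spirit as the "[INFO] Could not find plant" message.
  const present = new Set(plants.map((plant) => plant.unique_name));
  const missing = [...namesToDelete].filter((name) => !present.has(name));

  return { remaining, missing };
}

const deletionResult = removeDeleted(
  [{ unique_name: "a" }, { unique_name: "b" }, { unique_name: "c" }],
  [{ unique_name: "b" }, { unique_name: "z" }]
);
console.log(deletionResult.remaining.map((p) => p.unique_name)); // [ 'a', 'c' ]
console.log(deletionResult.missing); // [ 'z' ]
```

Returning the unmatched names separately leaves the logging decision to the caller, which may be convenient when the step is tested in isolation.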