GBIF_occurence_download

A pipeline for downloading GBIF occurrences for a list of scientific names with known conservation status, including taxonomic name matching and resolution of matching problems.

Purpose

The main purpose of this pipeline is to obtain georeferenced occurrences unambiguously associated with the conservation status of the taxa they belong to. Existing tools, such as rgbif, work well at retrieving occurrences for most species but fail when the GBIF species matching tool (which relies on the GBIF Backbone Taxonomy) has difficulty matching a name.

Another issue with existing tools is that, even when you provide a list of scientific names as input, the occurrences you receive often carry scientific names that differ from the input ones because of synonymy. This makes it difficult to link attributes assigned to the input names with the retrieved occurrences.
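
To illustrate the synonymy problem, here is a minimal sketch of one way to keep occurrences linked to the input names: carry a taxon key alongside each input name and join on that key rather than on the name string. This is not necessarily how the scripts in this repository do it. All names, keys, statuses, and the columns `input_name` and `taxon_key` are made-up placeholders.

```r
# Toy checklist: names as supplied by the user, with their status.
# Names and statuses are placeholders, not real taxa.
checklist <- data.frame(
  input_name = c("Aus bus", "Aus cus"),
  status     = c("status_A", "status_B"),
  stringsAsFactors = FALSE
)

# Hypothetical result of name matching: input name -> accepted taxon key.
matched <- data.frame(
  input_name = c("Aus bus", "Aus cus"),
  taxon_key  = c(1001L, 1002L),
  stringsAsFactors = FALSE
)

# Occurrences as a download might return them: scientificName can be a
# synonym or carry authorship, so it does not equal the input name.
occurrences <- data.frame(
  taxon_key        = c(1001L, 1001L, 1002L),
  scientificName   = c("Aus bus L.", "Xus bus (L.) Smith", "Aus cus Mill."),
  decimalLatitude  = c(50.45, 49.84, 48.62),
  decimalLongitude = c(30.52, 24.03, 22.30),
  stringsAsFactors = FALSE
)

# Join on the key, not on the name string, so synonyms stay attached
# to the input name and its conservation status.
lookup     <- merge(checklist, matched, by = "input_name")
attributed <- merge(occurrences, lookup, by = "taxon_key")

print(attributed[, c("input_name", "scientificName", "status")])
```

With this layout, the record named "Xus bus (L.) Smith" still ends up under the input name "Aus bus", which a join on name strings would have missed.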

The pipeline was developed within the project GBIF Viewer: an open web-based biodiversity conservation decision-making tool for policy and governance (The Habitat Foundation and Ukrainian Nature Conservation Group), funded by NLBIF: The Netherlands Biodiversity Information Facility, nlbif2022.014.

Records of species possessing conservation status in Ukraine

How to run the pipeline

This pipeline is about 80% automated but still requires manual steps to resolve some difficult taxonomic issues. The reason is that the GBIF Backbone Taxonomy, which is generally used for name matching, may fail for some names, especially poorly known or ambiguous taxa. For such exceptions, the user needs to manually edit the higherrank.csv file generated by 1_data_preparation.R before running 2_get_gbif_data.R.
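
As a rough sketch of how names needing manual attention could be detected, the helper below flags every name whose backbone match is not exact. It assumes the matcher returns a one-row data frame with a `matchType` column (values such as EXACT, FUZZY, HIGHERRANK, NONE), which is what `rgbif::name_backbone()` is expected to return. The helper `flag_problem_names()`, the stub matcher, and the "anything but EXACT needs review" rule are illustrative assumptions, not the logic of 1_data_preparation.R.

```r
# Flag names whose backbone match may need manual review.
# `matcher` takes one name and returns a one-row data frame with a
# `matchType` column. The default, rgbif::name_backbone(), needs network
# access; a stub is used below so the example runs offline.
flag_problem_names <- function(sci_names, matcher = rgbif::name_backbone) {
  match_type <- vapply(sci_names, function(nm) {
    res <- matcher(nm)
    if (is.null(res) || nrow(res) == 0 || !"matchType" %in% names(res)) {
      "NONE"
    } else {
      as.character(res$matchType[1])
    }
  }, character(1), USE.NAMES = FALSE)

  data.frame(
    input_name   = sci_names,
    match_type   = match_type,
    needs_review = match_type != "EXACT",  # assumed review rule
    stringsAsFactors = FALSE
  )
}

# Offline stand-in for the real matcher (placeholder names and results).
stub_matcher <- function(nm) {
  types <- c("Aus bus" = "EXACT", "Aus cus" = "HIGHERRANK")
  data.frame(
    matchType = if (nm %in% names(types)) types[[nm]] else "NONE",
    stringsAsFactors = FALSE
  )
}

review <- flag_problem_names(c("Aus bus", "Aus cus", "Aus dus"),
                             matcher = stub_matcher)
print(review)
```

Passing the matcher as an argument keeps the flagging logic testable without contacting GBIF; in a live session the default would be used instead of the stub.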

The GBIF Backbone Taxonomy is updated periodically (once every couple of months). After each update, the list of names that cannot be matched automatically changes slightly, so the same higherrank.csv cannot be reused for long. To make revising higherrank.csv easier, run the script 1a_update_higherrank.R. It automatically retrieves the data from the previous version of the file (named higherrank_nameVariants_prev.csv) and provides a handy GUI for manually revising only those names whose matching issues appeared for the first time, without leaving your R session.
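
The idea of reusing earlier decisions can be sketched as follows: names already present in the previous file keep their resolution, and only new names are handed to a human. This is an illustration under assumptions, not the code of 1a_update_higherrank.R; the function `split_for_revision()` and the columns `input_name` and `resolved_name` are hypothetical.

```r
# Split the current problem list into names already resolved in the
# previous file and names that appear for the first time.
split_for_revision <- function(current, previous) {
  known <- current$input_name %in% previous$input_name
  carried <- merge(current[known, "input_name", drop = FALSE],
                   previous, by = "input_name")
  fresh <- data.frame(
    input_name    = current$input_name[!known],
    resolved_name = NA_character_,
    stringsAsFactors = FALSE
  )
  list(carried = carried, fresh = fresh)
}

# In real use `previous` would be read from disk, e.g.
# read.csv("higherrank_nameVariants_prev.csv"); toy data are used here.
previous <- data.frame(
  input_name    = c("Aus bus", "Aus cus"),
  resolved_name = c("Xus bus", "Aus cus"),
  stringsAsFactors = FALSE
)
current <- data.frame(input_name = c("Aus cus", "Aus dus"),
                      stringsAsFactors = FALSE)

parts <- split_for_revision(current, previous)

# Only parts$fresh needs manual work; in an interactive session it could
# be opened in the GUI with DataEditR::data_edit(parts$fresh).
updated <- rbind(parts$carried, parts$fresh)
print(updated)
```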

Technical details

Input:

A CSV file with scientific names, their conservation status, and higher classification.

Output:

A simple features (sf) spatial data frame containing georeferenced occurrences unambiguously associated with the conservation status of the taxa they belong to.
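
For readers unfamiliar with simple features, the sketch below shows how a plain occurrence table could be turned into such an object with `sf::st_as_sf()`. It assumes the Darwin Core coordinate columns `decimalLongitude` and `decimalLatitude` in WGS 84 (EPSG:4326); the helper `occ_to_sf()` and the toy records are hypothetical, and the pipeline's actual output may carry different columns.

```r
# Convert an occurrence table to an sf point layer.
# Rows without coordinates are dropped first, because st_as_sf()
# does not accept missing values in coordinate columns.
occ_to_sf <- function(occ) {
  coord_cols <- c("decimalLongitude", "decimalLatitude")
  has_coords <- stats::complete.cases(occ[, coord_cols])
  sf::st_as_sf(
    occ[has_coords, , drop = FALSE],
    coords = coord_cols,
    crs    = 4326,   # WGS 84
    remove = FALSE   # keep the original coordinate columns as attributes
  )
}

# Toy records with placeholder names and statuses.
occ <- data.frame(
  input_name       = c("Aus bus", "Aus bus", "Aus cus"),
  status           = c("status_A", "status_A", "status_B"),
  decimalLongitude = c(30.52, NA, 22.30),
  decimalLatitude  = c(50.45, 49.84, 48.62),
  stringsAsFactors = FALSE
)

# Run the conversion only when the sf package is installed.
if (requireNamespace("sf", quietly = TRUE)) {
  pts <- occ_to_sf(occ)
  print(pts)
}
```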

Dependencies

rgbif
dplyr, tidyr, and stringr for data manipulation
sf for preparing and working with spatial data
DataEditR for GUI for data frame revision
ggplot2 for visualisation (optional)

Schematic workflow

Scalable diagram

