Help
gnlist-resolver-gui, or Scientific Names List Resolver, is an app that lets you upload a file of scientific names and match them against the scientific names in a data set (for example Catalogue of Life, IPNI, or ZooBank).
- It compares large lists of names (up to 100,000 names) in a single run and returns the result in either CSV or Excel-compatible XLSX format.
- It returns important statistics about each match: whether it was exact or fuzzy, the edit distance for fuzzy matches, a confidence score for the match, and the classification and ID from the other resource (if they are available).
- For fuzzy matches in XLSX format, it highlights the differences between the matched names.
- Prepare your file by saving it in CSV format with UTF-8 encoding. Several file formats are supported, and there is a good chance that you will only need to change the headers to match the list of supported terms.
- When names are given as a single string, adjust the capitalization of the words to follow nomenclatural rules (for example, convert `ECHINISCOIDES Sigismundi Groenlandicus KRISTENSEN and HALLAS, 1980` to `Echiniscoides sigismundi groenlandicus KRISTENSEN and HALLAS, 1980`). Note that the authors' names may stay capitalized.
- Upload the file.
- Check the headers. Headers recognized by the app appear with a green background. All other headers will be ignored by the matching process. You can delete erroneous matches or add a new match. Note that there are two possible workflows:
  - The name is given as a single string (the `scientificName` term is present).
  - The name is split into parts (the `genus` and `specificEpithet` terms are present).
- Pick the source that you want to use for name matching and select other settings, if available.
- Take a break and watch the statistics of your match update dynamically.
- When all is done (or after pushing the `Cancel` button), download the results of the match in CSV or Excel format.
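The capitalization fix described in the preparation step can be scripted before upload. The following is a minimal sketch, not part of the app: the function name `normalize_name_case` and its `name_words` parameter are hypothetical, and it assumes you know how many leading words belong to the name itself (three for a trinomial without rank markers). Everything after those words is treated as authorship and left untouched.

```python
def normalize_name_case(name: str, name_words: int = 3) -> str:
    """Capitalize the first word, lowercase the remaining name words,
    and leave the trailing words (assumed to be authorship) unchanged.

    name_words is an assumption supplied by the caller: the number of
    leading words that make up the name (genus plus epithets).
    """
    words = name.split()
    fixed = []
    for i, word in enumerate(words):
        if i == 0:
            fixed.append(word.capitalize())  # genus: first letter upper, rest lower
        elif i < name_words:
            fixed.append(word.lower())       # epithets: all lowercase
        else:
            fixed.append(word)               # authorship and year: untouched
    return " ".join(fixed)


print(normalize_name_case(
    "ECHINISCOIDES Sigismundi Groenlandicus KRISTENSEN and HALLAS, 1980"
))
```

Names that contain rank markers such as `subsp.` have more leading words, so `name_words` would need adjusting for them.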
subKingdom
subPhylum
superClass
subClass
cohort
superOrder
subOrder
infraOrder
superFamily
subFamily
tribe
subTribe
subGenus
section
subSpecificEpithet
variety
form
- CSV (comma-separated values) file with the field names in the first row.
- Columns can be separated by tabs, commas, or semicolons.
- At least some columns should have recognizable field names; unused fields won't hurt the process.
- Comma- or semicolon-separated values must be enclosed in double quotes if the value itself contains commas or semicolons.
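The quoting rule above is what Python's standard `csv` module applies with `QUOTE_MINIMAL`. The sketch below is only an illustration of preparing such a file; the helper names `to_delimited_text` and `save_utf8` and the file path passed to them are hypothetical.

```python
import csv
import io


def to_delimited_text(rows, delimiter=","):
    """Serialize rows, quoting only the values that contain the delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote a value only when it needs it
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


def save_utf8(path, rows, delimiter=","):
    """Write the rows to a UTF-8 encoded file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(to_delimited_text(rows, delimiter))


rows = [
    ["taxonID", "scientificName"],
    ["1", "Macrobiotus echinogenitus subsp. areolatus Murray, 1907"],
]
print(to_delimited_text(rows, delimiter=","))  # name is quoted: it contains a comma
print(to_delimited_text(rows, delimiter=";"))  # no quoting needed
```

With a semicolon delimiter the comma in the authorship needs no quoting, which is why semicolons are a convenient choice for names that carry authorship.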
taxonID
kingdom
phylum
class
order
family
genus
species
subspecies
variety
form
scientificNameAuthorship
scientificName
taxonRank
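Before uploading, it can help to check which of your headers appear in the term lists above. This is a rough sketch rather than the app's own logic: `RECOGNIZED_TERMS` is transcribed from this document and may be incomplete, `split_headers` is a hypothetical helper, and the case-insensitive comparison is an assumption based on the example header rows here, which use spellings such as `TaxonId` and `subkingdom`.

```python
# Terms transcribed from the lists in this document; not an authoritative list.
RECOGNIZED_TERMS = {
    "taxonID", "kingdom", "subKingdom", "phylum", "subPhylum",
    "superClass", "class", "subClass", "cohort", "superOrder", "order",
    "subOrder", "infraOrder", "superFamily", "family", "subFamily",
    "tribe", "subTribe", "genus", "subGenus", "section", "species",
    "specificEpithet", "subspecies", "subSpecificEpithet", "variety",
    "form", "scientificNameAuthorship", "scientificName", "taxonRank",
}

# Assumption: headers are compared without regard to case.
_LOOKUP = {term.lower() for term in RECOGNIZED_TERMS}


def split_headers(headers):
    """Return (recognized, ignored) headers, preserving the input order."""
    recognized = [h for h in headers if h.strip().lower() in _LOOKUP]
    ignored = [h for h in headers if h.strip().lower() not in _LOOKUP]
    return recognized, ignored


recognized, ignored = split_headers(["TaxonId", "scientificName", "notes"])
print("recognized:", recognized)
print("ignored:", ignored)
```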
scientificName |
---|
Animalia |
Macrobiotus echinogenitus subsp. areolatus Murray, 1907 |
taxonID;scientificName
1;Macrobiotus echinogenitus subsp. areolatus Murray, 1907
...
taxonID | scientificName |
---|---|
1 | Animalia |
2 | Macrobiotus echinogenitus subsp. areolatus Murray, 1907 |
taxonID;scientificName;taxonRank
1;Macrobiotus echinogenitus f. areolatus Murray, 1907;form
...
taxonID | scientificName | taxonRank |
---|---|---|
1 | Animalia | kingdom |
2 | Macrobiotus echinogenitus subsp. areolatus Murray, 1907 | subspecies |
taxonID;family;scientificName;scientificNameAuthorship
1;Macrobiotidae;Macrobiotus echinogenitus subsp. areolatus;Murray, 1907
...
taxonID | family | scientificName | scientificNameAuthorship |
---|---|---|---|
1 | | Animalia | |
2 | Macrobiotidae | Macrobiotus echinogenitus | Murray |
TaxonId;kingdom;subkingdom;phylum;subphylum;superclass;class;subclass;cohort;superorder;order;suborder;infraorder;superfamily;family;subfamily;tribe;subtribe;genus;subgenus;section;species;subspecies;variety;form;ScientificNameAuthorship
1;Animalia;;Tardigrada;;;Eutardigrada;;;;Parachela;;;Macrobiotoidea;Macrobiotidae;;;;Macrobiotus;;;harmsworthi;obscurus;;;Dastych, 1985
TaxonId | kingdom | subkingdom | phylum | subphylum | superclass | class | subclass | cohort | superorder | order | suborder | infraorder | superfamily | family | subfamily | tribe | subtribe | genus | subgenus | section | species | subspecies | variety | form | ScientificNameAuthorship |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
136021 | Animalia | Pogonophora | |||||||||||||||||||||||
136022 | Animalia | Pogonophora | Frenulata | Webb, 1969 | |||||||||||||||||||||
565443 | Animalia | Tardigrada | Eutardigrada | Parachela | Macrobiotoidea | Macrobiotidae | Macrobiotus | harmsworthi | obscurus | Dastych, 1985 |
You can take the example files and modify them to suit your needs.
Output includes the following fields:
Field | Description |
---|---|
taxonID | original ID attached to a name in the checklist |
scientificName | name from the checklist |
matchedScientificName | name matched from the GN Resolver data source |
inputCanonicalForm | canonical form of the input name |
matchedCanonicalForm | canonical form of the matched name |
editDistance | for fuzzy-matching -- how many characters differ between checklist and data source name |
rank | rank from the source (if it was given/inferred) |
matchedRank | corresponding rank from the data source |
matchType | what kind of match it is |
score | heuristic score from 0 to 1, where 1 is a good match and a match scored 0.5 requires further human investigation |
matchTaxonID | the ID of matched name |
classification | a hierarchy path for the matched name |
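Once the results are downloaded, the `score` field can be used to pull out the rows that need a second look. The sketch below assumes a comma-separated output file with the column names from the table above. The sample rows and their values are made up for illustration, `rows_needing_review` is a hypothetical helper, and treating rows with an empty score as unmatched is an assumption.

```python
import csv
import io


def rows_needing_review(rows, score_threshold=0.5):
    """Return rows whose score is missing or at or below the threshold."""
    flagged = []
    for row in rows:
        raw = (row.get("score") or "").strip()
        if not raw:
            # Assumption: an empty score means no match was found.
            flagged.append(row)
        elif float(raw) <= score_threshold:
            flagged.append(row)
    return flagged


# Made-up sample data that only mimics the documented column names.
SAMPLE = """taxonID,scientificName,matchedScientificName,editDistance,score
1,Macrobiotus echinogenitus,Macrobiotus echinogenitus,0,0.99
2,Macrobiotus harmsworti,Macrobiotus harmsworthi,1,0.5
3,Unknownus nameus,,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows_needing_review(rows):
    print(row["taxonID"], row["scientificName"])
```

For a real download, replace the in-memory sample with `open(path, encoding="utf-8", newline="")` and adjust the threshold to suit your data.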