error_analysis
Gold = Gold Standard (or Dataset, e.g., Reuters128, News100, RBB150, etc.)
System = Tool (e.g., Recognyze, DBpedia Spotlight, AIDA, Babelnet, etc.)
In general the scorer should return the same results when it is fed the same data. In practice there seems to be an ordering problem (e.g., sorted and unsorted results can differ slightly). Nevertheless, the scorer is quite reliable if the tool output is sorted.
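A minimal sketch of the kind of sorting that helps here, assuming annotations are kept as (document, start, end, link) tuples; this layout is an assumption made for the example, not the scorer's actual input format:

```python
# Sort system annotations by document and offset before handing them to the
# scorer, so repeated runs compare identical input.
# The (doc, start, end, link) tuple layout is illustrative only.

def sort_annotations(annotations):
    """Return annotations ordered by document id, start and end offset."""
    return sorted(annotations, key=lambda a: (a[0], a[1], a[2]))

system_output = [
    ("doc_17", 44, 51, "http://dbpedia.org/resource/Reuters"),
    ("doc_03", 10, 16, "http://dbpedia.org/resource/Zurich"),
]
print(sort_annotations(system_output))
```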
In order to interpret the results, TAC-KBP classifies the test outcomes as follows:
Case | TAC label | Outcome | Scope |
---|---|---|---|
Missing in Gold | extra | FP | mention, type, link |
Missing in System | missing | FN | mention, type, link |
Gold is None and System is None | correct nil | TP | mention, type, link |
Gold == System | correct link | TP | mention, type, link
Gold is None | nil-as-link | FP | link |
System is None | link-as-nil | FN | link |
Rest of cases | wrong-link | FP | link |
According to the TAC-KBP neleval rules, each named entity candidate discovered in a text is described through its mention, type and link. The mention includes the document number and the position (start; end) within the document.
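The link-scope decisions from the table above can be sketched roughly as follows. The function and the NIL-as-`None` convention are illustrative only, and the sketch assumes the mention itself was matched on both sides (the extra/missing cases cover unmatched mentions):

```python
# Illustrative mapping of (gold, system) link pairs to the TAC cases above.
# NIL is represented as None; neleval itself works on its own annotation format.

def classify_link(gold_link, system_link):
    """Map a (gold, system) link pair to the TAC case and its outcome."""
    if gold_link is None and system_link is None:
        return "correct nil", "TP"
    if gold_link is not None and gold_link == system_link:
        return "correct link", "TP"
    if gold_link is None:
        return "nil-as-link", "FP"   # system linked a NIL mention
    if system_link is None:
        return "link-as-nil", "FN"   # system returned NIL for a linked mention
    return "wrong-link", "FP"        # both linked, but to different entities

print(classify_link("dbr:Zurich", "dbr:Zurich"))   # ('correct link', 'TP')
print(classify_link("dbr:Zurich", None))           # ('link-as-nil', 'FN')
```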
By combining the output of several TAC-KBP evaluations performed with different tools on the same dataset(s), one can easily discover multiple error types. In the subsequent sections we will refer to all the evaluated tools as annotators.
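One possible way to combine such per-tool results, assuming each annotator's classified output has been reduced to a dict from a mention key to its TAC case (a layout chosen purely for this sketch):

```python
# Group per-annotator cases by mention, so that mentions handled differently
# by different tools become easy to spot. The mention key (file, start, end)
# and the input layout are assumptions made for this example.
from collections import defaultdict

def group_cases_by_mention(results_per_tool):
    """Collect, for every mention, which case each annotator produced."""
    grouped = defaultdict(dict)
    for tool, cases in results_per_tool.items():
        for mention, case in cases.items():
            grouped[mention][tool] = case
    return grouped

results = {
    "Recognyze": {("news_01.txt", 44, 51): "wrong-link"},
    "DBpedia Spotlight": {("news_01.txt", 44, 51): "correct link"},
}
for mention, per_tool in group_cases_by_mention(results).items():
    print(mention, per_tool)  # disagreements between annotators stand out here
```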
In order to describe the errors found in the various documents, we use a description format that is somewhat similar to the one used for design patterns.
**Error Name**
**Scope**: mention (aka surfaceForm) OR type OR link OR mention,type,link OR mention,type
**SimilarTo**: This field should only appear if the error actually has some degree of similarity to another one, but the context in which this new error appears is entirely different.
**Description**: A general description of the error.
**Examples**: Where possible, an example taken from a published corpus should be presented.
**Comments**: If needed, additional comments can be added.
If there are currently no examples for a certain error, it is fine to keep only the Scope and Description fields.
The current format we use to report the errors is the following:
File, retrievedURI, retrievedSurfaceForm, retrievedType, correctURI, correctSurfaceForm, correctType
File simply represents the name of the file where the occurrence/mention was detected. The first set of attributes (retrievedURI, retrievedSurfaceForm, retrievedType) comes from the tool, while the second set (correctURI, correctSurfaceForm, correctType) is what is (or should be) in the dataset or corpus.
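A minimal sketch of reading such report rows, assuming they are stored as plain comma-separated lines with exactly the columns listed above (the `ErrorRow` name and the CSV assumption are illustrative, not part of any tool):

```python
# Read error report rows into named records for easier inspection.
# Assumes one comma-separated row per error with exactly the seven columns
# described above; no header line or quoting rules are implied.
import csv
from collections import namedtuple

ErrorRow = namedtuple("ErrorRow", [
    "file", "retrievedURI", "retrievedSurfaceForm", "retrievedType",
    "correctURI", "correctSurfaceForm", "correctType",
])

def read_error_report(path):
    with open(path, newline="", encoding="utf-8") as handle:
        return [ErrorRow(*row) for row in csv.reader(handle)]
```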
Getting the correct version of a mention depends heavily on the Annotation Guidelines that were used for the respective dataset (e.g., partial mentions and split mentions may or may not be allowed).
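As a rough illustration of this guideline dependency, the following sketch contrasts exact span matching with a setting that accepts partial (overlapping) mentions; the `allow_partial` flag and the span layout are assumptions made for the example, and real guidelines cover many more cases (split mentions, nested mentions, etc.):

```python
# Compare a gold and a system mention span under two guideline settings:
# exact equality versus allowing partial (overlapping) mentions.

def mentions_match(gold_span, system_span, allow_partial=False):
    """Spans are (start, end) offsets within the same document."""
    if gold_span == system_span:
        return True
    if allow_partial:
        g_start, g_end = gold_span
        s_start, s_end = system_span
        return g_start < s_end and s_start < g_end  # any overlap counts
    return False

print(mentions_match((44, 51), (44, 51)))                      # True
print(mentions_match((44, 51), (44, 60), allow_partial=True))  # True
```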
Several large error classes emerge when examining the various errors that can appear during named entity linking evaluations. These classes are listed in the following table.
Class | Shortcut | Examples |
---|---|---|
Knowledge Base | KB | Bad Mapping, Missing Entity, Unpopulated Entity, Redirects and Multiple URIs, KB Change, Wrong Type, No Type |
Dataset (Gold) | DS | Missing Annotation, Wrong Annotation, Different Language, Redirect and Multiple URIs, Wrong Type, Generic Term, UTF-8 Issue |
Annotator | AN | Abbreviation Conflict, Cross-Type Disambiguation, Same Type, Partial Match, No Entity, Generic Term, Anomaly |
NIL Clustering | NC | Wrong Cluster |
Scorer | SC | Redirect and Multiple URIs |
If you need more details about the various errors, check our Error Annotation Guideline.