
error_analysis


TAC-KBP neleval interpretation

Gold = Gold Standard (or Dataset, e.g., Reuters128, News100, RBB150, etc.)

System = Tool (e.g., Recognyze, DBpedia Spotlight, AIDA, Babelnet, etc.)

In general, the scorer should return the same results when fed the same data. In practice there seems to be a problem with ordering (e.g., sorted and unsorted results might differ slightly). Nevertheless, the scorer is quite reliable if the tool output is sorted.
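
For reference, here is a minimal pre-sorting sketch in Python, assuming a tab-separated annotation file (no header row) whose first three columns are the document id and the start/end offsets; the file names and column indices are illustrative and may need adjusting for your export:

```python
# Sort a system annotation file by document id and mention offsets so the
# scorer always sees the same ordering. Assumes tab-separated lines whose
# first three columns are document id, start offset and end offset.

def sort_annotations(in_path: str, out_path: str) -> None:
    with open(in_path, encoding="utf-8") as f:
        rows = [line.rstrip("\n").split("\t") for line in f if line.strip()]
    rows.sort(key=lambda r: (r[0], int(r[1]), int(r[2])))
    with open(out_path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write("\t".join(row) + "\n")

sort_annotations("system_output.tsv", "system_output.sorted.tsv")
```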

In order to interpret the results, TAC-KBP categorises the test outcomes as follows:

| Case | TAC | Test | Scope |
| --- | --- | --- | --- |
| Missing in Gold | extra | FP | mention, type, link |
| Missing in System | missing | FN | mention, type, link |
| Gold is None and System is None | correct nil | TP | mention, type, link |
| Gold == System | correct link | TP | mention, type, link |
| Gold is None | nil-as-link | FP | link |
| System is None | link-as-nil | FN | link |
| Rest of cases | wrong-link | FP | link |
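
Read as a decision procedure, the table can be sketched in a few lines of Python for a single aligned mention; the function name and the in_gold/in_system flags below are illustrative and not part of neleval:

```python
def classify_case(gold_link, system_link, in_gold=True, in_system=True):
    """Return the TAC-KBP case label for one aligned mention.

    gold_link / system_link: KB links as strings, or None for NIL.
    in_gold / in_system: whether the mention exists in the gold / system output.
    """
    if not in_gold:
        return "extra"         # FP on mention, type, link
    if not in_system:
        return "missing"       # FN on mention, type, link
    if gold_link is None and system_link is None:
        return "correct nil"   # TP on mention, type, link
    if gold_link == system_link:
        return "correct link"  # TP on mention, type, link
    if gold_link is None:
        return "nil-as-link"   # FP on link
    if system_link is None:
        return "link-as-nil"   # FN on link
    return "wrong-link"        # FP on link
```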

According to the TAC-KBP neleval rules, each named entity candidate discovered in a text is described through its mention, type and link. The mention includes the document identifier and the position (start, end) within the document.
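
As an illustration, such a candidate could be represented as follows; the class and field names are ours, not prescribed by neleval:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Annotation:
    """One named entity candidate: mention (document + span), type and link."""
    doc_id: str            # document identifier
    start: int             # start position of the mention within the document
    end: int               # end position of the mention within the document
    entity_type: str       # e.g. Person, Organization, Location
    link: Optional[str]    # KB URI, or None for a NIL entity

    @property
    def mention(self):
        return (self.doc_id, self.start, self.end)
```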

ERROR ANALYSIS INTRODUCTION

By combining the output of several TAC-KBP evaluations performed with different tools on the same dataset(s), one can easily discover multiple error types. In the subsequent sections we refer to all evaluated tools as annotators.

In order to describe the errors found in the various documents, we use a description format somewhat similar to the one used for design patterns.

**Error Name**

**Scope**: mention (aka surfaceForm) OR type OR link OR mention,type,link OR mention,type

**SimilarTo**: This field should only appear if the error actually has some degree of similarity to another one, but the context in which this new error appears is totally different.

**Description**: A general description of the error.

**Examples**: Where possible, an example taken from a published corpus should be presented.

**Comments**: Additional comments can be added here if needed.

If there are currently no examples for a certain error, it is OK to keep only the Scope and Description fields.

The current format we use to report the errors is the following:

File, retrievedURI, retrievedSurfaceForm, retrievedType, correctURI, correctSurfaceForm, correctType

File simply represents the name of the file where the occurrence/mention was detected. The first set of attributes (retrievedURI, retrievedSurfaceForm, retrievedType) comes from the tool, while the second set (correctURI, correctSurfaceForm, correctType) is the one that is (or should be) in the dataset or corpus.
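
A minimal sketch for writing such a report as a CSV file is shown below; the function name and the sample row (file name, URIs, surface forms, types) are purely illustrative:

```python
import csv

REPORT_FIELDS = ["File", "retrievedURI", "retrievedSurfaceForm", "retrievedType",
                 "correctURI", "correctSurfaceForm", "correctType"]

def write_error_report(rows, out_path="error_report.csv"):
    """rows: an iterable of dicts keyed by REPORT_FIELDS."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=REPORT_FIELDS)
        writer.writeheader()
        writer.writerows(rows)

# Hypothetical example row.
write_error_report([{
    "File": "doc_001.txt",
    "retrievedURI": "http://dbpedia.org/resource/Example",
    "retrievedSurfaceForm": "Example",
    "retrievedType": "ORG",
    "correctURI": "http://dbpedia.org/resource/Example_Corp",
    "correctSurfaceForm": "Example Corp",
    "correctType": "ORG",
}])
```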

Getting the correct version of a mention is highly dependent upon the Annotation Guidelines that were used for the respective dataset (e.g., partial mentions might be allowed or not, split mentions might be allowed or not).

LARGE ERROR CLASSES

Several large error classes emerge when examining the errors that can appear during named entity linking evaluations. These classes are listed in the following table.

| Class | Shortcut | Examples |
| --- | --- | --- |
| Knowledge Base | KB | Bad Mapping, Missing Entity, Unpopulated Entity, Redirects and Multiple URIs, KB Change, Wrong Type, No Type |
| Dataset (Gold) | DS | Missing Annotation, Wrong Annotation, Different Language, Redirect and Multiple URIs, Wrong Type, Generic Term, UTF-8 Issue |
| Annotator | AN | Abbreviation Conflict, Cross-Type Disambiguation, Same Type, Partial Match, No Entity, Generic Term, Anomaly |
| NIL Clustering | NC | Wrong Cluster |
| Scorer | SC | Redirect and Multiple URIs |
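
For convenience, the shortcuts from the table can be kept in a small lookup (a sketch of ours, not part of any tool), e.g. for tagging rows of the error report with a coarse class:

```python
# Class shortcuts from the table above mapped to their full names.
ERROR_CLASSES = {
    "KB": "Knowledge Base",
    "DS": "Dataset (Gold)",
    "AN": "Annotator",
    "NC": "NIL Clustering",
    "SC": "Scorer",
}
```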

If you need more details about the various errors, check our Error Annotation Guideline.
