error_analysis
Gold = Gold Standard (or Dataset, e.g., Reuters128, News100, RBB150, etc.)
System = Tool (e.g., Recognyze, DBpedia Spotlight, AIDA, Babelnet, etc.)
In general the scorer should return the same results when it is fed the same data. In practice there seems to be an ordering problem (e.g., sorted and unsorted results can differ slightly). Nevertheless, the scorer is quite reliable if the tool output is sorted.
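A minimal sketch of the kind of sorting that helps here, assuming annotations are kept as (document, start, end, link) tuples; this layout is an assumption made for the example, not the scorer's actual input format:

```python
# Sort system annotations by document and offset before handing them to the
# scorer, so repeated runs compare identical input.
# The (doc, start, end, link) tuple layout is illustrative only.

def sort_annotations(annotations):
    """Return annotations ordered by document id, start and end offset."""
    return sorted(annotations, key=lambda a: (a[0], a[1], a[2]))

system_output = [
    ("doc_17", 44, 51, "http://dbpedia.org/resource/Reuters"),
    ("doc_03", 10, 16, "http://dbpedia.org/resource/Zurich"),
]
print(sort_annotations(system_output))
```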
In order to interpret the results, TAC-KBP classifies the test outcomes as follows:
Case | TAC label | Outcome | Scope |
---|---|---|---|
Missing in Gold | extra | FP | mention, type, link |
Missing in System | missing | FN | mention, type, link |
Gold is None and System is None | correct nil | TP | mention, type, link |
Gold == System | correct link | TP | mention, type, link
Gold is None | nil-as-link | FP | link |
System is None | link-as-nil | FN | link |
Rest of cases | wrong-link | FP | link |
According to the TAC-KBP neleval rules, each named entity candidate discovered in a text is described through its mention, type and link. The mention includes the document number and the position (start; end) within the document.
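The link-scope decisions from the table above can be sketched roughly as follows. The function and the NIL-as-`None` convention are illustrative only, and the sketch assumes the mention itself was matched on both sides (the extra/missing cases cover unmatched mentions):

```python
# Illustrative mapping of (gold, system) link pairs to the TAC cases above.
# NIL is represented as None; neleval itself works on its own annotation format.

def classify_link(gold_link, system_link):
    """Map a (gold, system) link pair to the TAC case and its outcome."""
    if gold_link is None and system_link is None:
        return "correct nil", "TP"
    if gold_link is not None and gold_link == system_link:
        return "correct link", "TP"
    if gold_link is None:
        return "nil-as-link", "FP"   # system linked a NIL mention
    if system_link is None:
        return "link-as-nil", "FN"   # system returned NIL for a linked mention
    return "wrong-link", "FP"        # both linked, but to different entities

print(classify_link("dbr:Zurich", "dbr:Zurich"))   # ('correct link', 'TP')
print(classify_link("dbr:Zurich", None))           # ('link-as-nil', 'FN')
```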
By combining the output of several TAC-KBP evaluations performed with different tools on the same dataset(s), one can easily discover multiple error types. In the subsequent sections we will refer to all the evaluated tools as annotators.
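One possible way to combine such per-tool results, assuming each annotator's classified output has been reduced to a dict from a mention key to its TAC case (a layout chosen purely for this sketch):

```python
# Group per-annotator cases by mention, so that mentions handled differently
# by different tools become easy to spot. The mention key (file, start, end)
# and the input layout are assumptions made for this example.
from collections import defaultdict

def group_cases_by_mention(results_per_tool):
    """Collect, for every mention, which case each annotator produced."""
    grouped = defaultdict(dict)
    for tool, cases in results_per_tool.items():
        for mention, case in cases.items():
            grouped[mention][tool] = case
    return grouped

results = {
    "Recognyze": {("news_01.txt", 44, 51): "wrong-link"},
    "DBpedia Spotlight": {("news_01.txt", 44, 51): "correct link"},
}
for mention, per_tool in group_cases_by_mention(results).items():
    print(mention, per_tool)  # disagreements between annotators stand out here
```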
In order to describe the errors found in the various documents, we use a description format that is somewhat similar to the one used for design patterns.
**Error Name**
**Scope**: mention (aka surfaceForm) OR type OR link OR mention,type,link OR mention,type
**SimilarTo**: This field should only appear if the error actually has some degree of similarity to another one, but the context in which this new error appears is entirely different.
**Description**: A general description of the error.
**Examples**: Where possible, an example taken from a published corpus should be presented.
**Comments**: If needed, additional comments can be added.
If there are currently no examples for a certain error, it is fine to keep only the Scope and Description fields.
The current format we use to report the errors is the following:
File, retrievedURI, retrievedSurfaceForm, retrievedType, correctURI, correctSurfaceForm, correctType
File simply represents the name of the file where the occurrence/mention was detected. The first set of attributes (retrievedURI, retrievedSurfaceForm, retrievedType) comes from the tool, while the second set (correctURI, correctSurfaceForm, correctType) is what is (or should be) in the dataset or corpus.
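A minimal sketch of reading such report rows, assuming they are stored as plain comma-separated lines with exactly the columns listed above (the `ErrorRow` name and the CSV assumption are illustrative, not part of any tool):

```python
# Read error report rows into named records for easier inspection.
# Assumes one comma-separated row per error with exactly the seven columns
# described above; no header line or quoting rules are implied.
import csv
from collections import namedtuple

ErrorRow = namedtuple("ErrorRow", [
    "file", "retrievedURI", "retrievedSurfaceForm", "retrievedType",
    "correctURI", "correctSurfaceForm", "correctType",
])

def read_error_report(path):
    with open(path, newline="", encoding="utf-8") as handle:
        return [ErrorRow(*row) for row in csv.reader(handle)]
```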
Getting the correct version of a mention depends heavily on the Annotation Guidelines that were used for the respective dataset (e.g., partial mentions and split mentions may or may not be allowed).
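As a rough illustration of this guideline dependency, the following sketch contrasts exact span matching with a setting that accepts partial (overlapping) mentions; the `allow_partial` flag and the span layout are assumptions made for the example, and real guidelines cover many more cases (split mentions, nested mentions, etc.):

```python
# Compare a gold and a system mention span under two guideline settings:
# exact equality versus allowing partial (overlapping) mentions.

def mentions_match(gold_span, system_span, allow_partial=False):
    """Spans are (start, end) offsets within the same document."""
    if gold_span == system_span:
        return True
    if allow_partial:
        g_start, g_end = gold_span
        s_start, s_end = system_span
        return g_start < s_end and s_start < g_end  # any overlap counts
    return False

print(mentions_match((44, 51), (44, 51)))                      # True
print(mentions_match((44, 51), (44, 60), allow_partial=True))  # True
```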
Several large error classes emerge when examining the various errors that can appear during named entity linking evaluations. These classes are listed in the following table.
Class | Shortcut | Examples |
---|---|---|
Knowledge Base | KB | Bad Mapping, Missing Entity, Unpopulated Entity, Redirects and Multiple URIs, KB Change, Wrong Type, No Type |
Dataset (Gold) | DS | Missing Annotation, Wrong Annotation, Different Language, Redirect and Multiple URIs, Wrong Type, Generic Term, UTF-8 Issue |
Annotator | AN | Abbreviation Conflict, Cross-Type Disambiguation, Same Type, Partial Match, No Entity, Generic Term, Anomaly |
NIL Clustering | NC | Wrong Cluster |
Scorer | SC | Redirect and Multiple URIs |
If you need more details about the various errors, check our Error Annotation Guideline.