USE CASE Messy Person Data BY ORGANIZATION DFKI

Context

When persons are mentioned in texts by their first name, last name and/or middle names, there can be considerable variation in which names are used, how they are ordered, and whether they are abbreviated. For example, "John Fitzgerald Kennedy", "John", "Kennedy, J F" and "J. Kennedy" are variations that can all refer to the same person. If multiple persons are mentioned consecutively in very different ways, short texts in particular can be perceived as "messy". In addition, when names are ambiguous, mentions may not be associated with the correct person.
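To make the variation concrete, the following is a minimal sketch of how a mention could be checked against a known person. It is an illustration only, not the approach of any particular tool: the `Person` type and the functions `tokens`, `matches_token` and `could_refer_to` are hypothetical names, and the matching rule (each token must equal a name part or be its one-letter initial, in any order) is an assumption.

```python
import re
from typing import List, NamedTuple


class Person(NamedTuple):
    """Hypothetical record for a known person."""
    first: str
    middle: List[str]
    last: str


def tokens(mention: str) -> List[str]:
    """Split a mention into lowercase letter-only tokens (dots and commas are dropped)."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", mention)]


def matches_token(token: str, name: str) -> bool:
    """Assumed rule: a token matches a name part if it equals it or is its initial."""
    name = name.lower()
    return token == name or (len(token) == 1 and name.startswith(token))


def could_refer_to(mention: str, person: Person) -> bool:
    """True if every token can be assigned to a distinct name part, in any order."""
    parts = [person.first, *person.middle, person.last]
    toks = tokens(mention)
    if not toks or len(toks) > len(parts):
        return False

    def assign(i: int, used: frozenset) -> bool:
        # Backtracking: try each unused name part for token i.
        if i == len(toks):
            return True
        for j, part in enumerate(parts):
            if j not in used and matches_token(toks[i], part):
                if assign(i + 1, used | {j}):
                    return True
        return False

    return assign(0, frozenset())


jfk = Person("John", ["Fitzgerald"], "Kennedy")
for mention in ["John Fitzgerald Kennedy", "John", "Kennedy, J F", "J. Kennedy"]:
    print(mention, "->", could_refer_to(mention, jfk))
```

Under these assumed rules, all four example mentions are accepted for the same person. The sketch deliberately ignores hyphenated names, nicknames and spelling errors, which would each need additional handling.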

Challenges

optional (middle) names
name variations
ambiguities
short texts
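
The challenges listed above can be reproduced on small synthetic examples. The sketch below enumerates surface forms of a name and shows how two persons can end up sharing a form. It is a hypothetical illustration and not the method of the generator linked under Resources; the function `name_variations`, the chosen set of forms, and the second person "Jane Kennedy" are assumptions made for the example.

```python
from itertools import product
from typing import List, Set


def name_variations(first: str, middle: List[str], last: str) -> Set[str]:
    """Enumerate some surface forms of one person's name (assumed, not exhaustive)."""

    def forms(name: str) -> List[str]:
        # Full name, initial with dot, bare initial.
        return [name, name[0] + ".", name[0]]

    # A mention may consist of the first or last name alone.
    variations = {first, last}
    # Middle names are optional: keep all of them or drop all of them.
    for keep_middle in (True, False):
        given = [first] + (middle if keep_middle else [])
        for choice in product(*(forms(n) for n in given)):
            given_str = " ".join(choice)
            variations.add(f"{given_str} {last}")   # given names first
            variations.add(f"{last}, {given_str}")  # last name first
    return variations


john = name_variations("John", ["Fitzgerald"], "Kennedy")
jane = name_variations("Jane", [], "Kennedy")  # hypothetical second person

# Forms shared by both persons are ambiguous without further context.
ambiguous = john & jane
print(sorted(ambiguous))
```

Any form in `ambiguous`, such as "J. Kennedy", cannot be resolved from the name alone. In a short text there is little surrounding context to help, which is why the combination of abbreviations and short texts is difficult.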

Resources

Website / Data (Generator) / Tool / Evaluator: https://github.com/mschroeder-github/person-index
