Skip to content

Relation extraction through distal supervision with CNNs. Final project for CS 224U.

Notifications You must be signed in to change notification settings

nikhilxb/cs224u-project

Repository files navigation

CS 224U Final Project: Relation Extraction

Relation extraction through distal supervision with CNNs. Reproducing Zeng et al's work here: http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP203.pdf

Data

Download the NYT10 relation extraction dataset from: https://github.com/thunlp/NRE/blob/master/data.zip and extract to the top-most level of the repository. Also download the Wikipedia 2014 + Gigaword 5 distribution of the pretrained GloVe vectors from here: http://nlp.stanford.edu/data/glove.6B.zip. Structure should be:

└── data/
      ├── entity2id.txt
      ├── relation2id.txt
      ├── test.txt
      ├── train.txt
      └── glove.6B.50d.txt

The dataset is in the following format:

  • train.txt: training file, format (fb_mid_e1, fb_mid_e2, e1_name, e2_name, relation, sentence).
  • test.txt: test file, same format as train.txt.
  • entity2id.txt: all entities and corresponding ids, one per line.
  • relation2id.txt: all relations and corresponding ids, one per line.
  • glove.6B.50d.txt: the pre-trained GloVe word embedding file hosted by Stanford NLP

About

Relation extraction through distal supervision with CNNs. Final project for CS 224U.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •