
Download the dataset from https://cs.nyu.edu/~kcho/DMQA/. Once the files have been split, use Stanford CoreNLP to produce the XML representation:

./corenlp.sh -annotators tokenize,ssplit,pos,lemma,ner,parse,depparse,coref -coref.algorithm neural -filelist path/to/filelist.txt -outputFormat xml -outputDirectory /path/to/output/xml
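
The `-filelist` option expects a text file with one input path per line. A minimal sketch for generating it, assuming the split files all sit in a single directory; the function name `write_filelist`, the example paths, and the glob pattern are hypothetical and not part of this repository:

```python
from pathlib import Path


def write_filelist(input_dir, filelist_path, pattern="*"):
    """Write one absolute file path per line and return the number of files listed.

    `pattern` is a glob applied to `input_dir` (non-recursive); adjust it to
    match however the split files are named.
    """
    # Collect regular files only, sorted so the list is reproducible.
    files = sorted(p.resolve() for p in Path(input_dir).glob(pattern) if p.is_file())
    # One path per line, which is the layout CoreNLP reads from -filelist.
    Path(filelist_path).write_text("".join(f"{p}\n" for p in files), encoding="utf-8")
    return len(files)


# Example call (paths are placeholders):
# write_filelist("path/to/split/files", "path/to/filelist.txt")
```

Keeping the file list outside the input directory avoids it being picked up as an input on a later run.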

Then, to process the data, run:

python convert2graph.py /path/to/output/xml /path/to/summaries /path/to/output