- Sequence to Sequence Learning with Neural Networks, https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf, 2014.
- Neural Machine Translation by Jointly Learning to Align and Translate, https://arxiv.org/pdf/1409.0473.pdf, 2015 (Additive Attention).
- Effective Approaches to Attention-based Neural Machine Translation, https://arxiv.org/pdf/1508.04025.pdf, 2015 (Multiplicative Attention).
- A Structured Self-Attentive Sentence Embedding, https://arxiv.org/pdf/1703.03130.pdf, ICLR 2017 (Self-attention).
- Long Short-Term Memory-Networks for Machine Reading, https://arxiv.org/pdf/1601.06733.pdf, EMNLP 2016 (Self-attention).
- A Decomposable Attention Model for Natural Language Inference, https://arxiv.org/pdf/1606.01933.pdf, EMNLP 2016 (Self-attention).
- A Deep Reinforced Model for Abstractive Summarization, https://arxiv.org/pdf/1705.04304.pdf, 2017 (Self-attention).
- Frustratingly Short Attention Spans in Neural Language Modeling, https://arxiv.org/pdf/1702.04521.pdf, ICLR 2017 (Key-value attention).
- Attention Is All You Need, https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf, NIPS 2017.
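The scaled dot-product attention at the core of "Attention Is All You Need" fits in a few lines. A minimal NumPy sketch (the shapes and variable names are illustrative, not taken from any of the papers above):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # (n_queries, n_keys) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax over the keys
    return weights @ V                             # weighted sum of the values

# Toy example: 3 queries attending over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```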
- Efficient Estimation of Word Representations in Vector Space, https://arxiv.org/pdf/1301.3781.pdf, 2013.
- Distributed Representations of Words and Phrases and their Compositionality, http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf, 2013.
- Distributed Representations of Sentences and Documents, https://arxiv.org/pdf/1405.4053.pdf, 2014.
- GloVe: Global Vectors for Word Representation, https://www.aclweb.org/anthology/D14-1162, 2014.
- Semi-supervised Sequence Learning, https://papers.nips.cc/paper/5949-semi-supervised-sequence-learning.pdf, 2015.
- Deep contextualized word representations, https://aclweb.org/anthology/N18-1202, 2018.
- Universal Language Model Fine-tuning for Text Classification, https://aclweb.org/anthology/P18-1031, 2018.
- Improving Language Understanding by Generative Pre-Training, https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf, 2018.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, https://arxiv.org/pdf/1810.04805.pdf, 2018.
- Language Models are Unsupervised Multitask Learners, https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf, 2019.
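Pre-trained GloVe vectors are distributed as plain text, one word per line followed by its vector. A small sketch of loading them and comparing words by cosine similarity (the file name below is an assumption; substitute whichever release you download):

```python
import numpy as np

def load_glove(path):
    """Parse a GloVe text file: each line is a word followed by its float vector."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# vectors = load_glove("glove.6B.100d.txt")         # path is a hypothetical example
# print(cosine(vectors["king"], vectors["queen"]))  # related words score close to 1
```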
- Bidirectional LSTM-CRF Models for Sequence Tagging, https://arxiv.org/pdf/1508.01991.pdf, 2015.
- Named Entity Recognition with Bidirectional LSTM-CNNs, https://www.aclweb.org/anthology/Q16-1026, 2016.
- Neural Architectures for Named Entity Recognition, https://arxiv.org/pdf/1603.01360.pdf, 2016.
- End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, https://arxiv.org/pdf/1603.01354.pdf, 2016.
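The tagging papers above share a common skeleton: a bidirectional LSTM producing per-token tag scores, optionally decoded with a CRF. A minimal PyTorch sketch of that skeleton without the CRF layer (hyperparameters are illustrative):

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Bidirectional LSTM emitting per-token tag logits. A CRF layer, as in
    Huang et al. (2015), would replace the independent per-token softmax
    with transition-aware Viterbi decoding."""
    def __init__(self, vocab_size, tagset_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, tagset_size)  # forward + backward states

    def forward(self, token_ids):                # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, 2 * hidden_dim)
        return self.out(h)                       # per-token tag logits

model = BiLSTMTagger(vocab_size=10_000, tagset_size=9)  # e.g. 9 BIO tags
logits = model(torch.randint(0, 10_000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 9])
```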
- A Convolutional Neural Network for Modelling Sentences, https://arxiv.org/pdf/1404.2188.pdf, ACL 2014.
- Convolutional Neural Networks for Sentence Classification, https://arxiv.org/pdf/1408.5882.pdf, EMNLP 2014.
- Character-level Convolutional Networks for Text Classification, https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf, NIPS 2015.
- Very Deep Convolutional Networks for Text Classification, https://www.aclweb.org/anthology/E17-1104, EACL 2017.
- Deep Pyramid Convolutional Neural Networks for Text Categorization, https://aclweb.org/anthology/P17-1052, ACL 2017.
- A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification, https://www.aclweb.org/anthology/I17-1026, IJCNLP 2017.
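The Kim (2014) classifier cited above is compact enough to sketch: parallel 1-D convolutions over word embeddings, max-over-time pooling, then a linear classifier. A PyTorch sketch (filter sizes and dimensions are illustrative):

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Kim (2014)-style sentence classifier: parallel 1-D convolutions over
    the embedded sequence, max-over-time pooling per filter, then a linear
    layer over the concatenated pooled features."""
    def __init__(self, vocab_size, num_classes, embed_dim=100,
                 num_filters=100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))   # (batch, num_classes)

model = TextCNN(vocab_size=10_000, num_classes=2)
print(model(torch.randint(0, 10_000, (4, 50))).shape)  # torch.Size([4, 2])
```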