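1. Good methods - pioneering or breakthrough approaches
2. Good applications - offering a fresh perspective on solving a problem
(Application: product title compression, news headline rewriting, push message rewriting...)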
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
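To make "lexicon and rule-based" concrete, here is a minimal sketch of the general idea: look up word valences in a lexicon, then adjust them with a few hand-written rules. The lexicon entries, booster weights, and two-token negation window are invented for illustration and are not VADER's actual resources or rules.

```python
# Toy lexicon-and-rule sentiment scorer.
# All values below are illustrative assumptions, NOT VADER's lexicon or rules.
TOY_LEXICON = {"good": 1.9, "great": 3.1, "bad": -2.5, "terrible": -3.4, "love": 3.2}
BOOSTERS = {"very": 0.3, "extremely": 0.6}
NEGATIONS = {"not", "never", "no"}


def toy_sentiment(text):
    """Sum word valences, applying simple booster and negation rules."""
    for mark in "!.,?":
        text = text.replace(mark, " ")
    tokens = text.lower().split()
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in TOY_LEXICON:
            continue
        valence = TOY_LEXICON[tok]
        # Rule 1: a booster directly before the word strengthens its valence.
        if i > 0 and tokens[i - 1] in BOOSTERS:
            boost = BOOSTERS[tokens[i - 1]]
            valence += boost if valence > 0 else -boost
        # Rule 2: a negation within the two preceding tokens flips the sign.
        if any(t in NEGATIONS for t in tokens[max(0, i - 2):i]):
            valence = -valence
        total += valence
    return total
```

With this toy setup, `toy_sentiment("very good")` scores higher than `toy_sentiment("good")`, and `toy_sentiment("not good")` comes out negative.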
Bias Statement Detector (BSD) computationally detects and quantifies the degree of bias in sentence-level text of news stories.
Word Embedding
Machine Translation
Fluency and Coherency
Deep Reinforcement Learning for NLP, ACL 2018 ppt
2017 - ...
2018 - neural networks, attention mechanisms, representation learning, semantics and knowledge...
The progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
The CoNLL shared tasks are led mainly by academia, so they tend to focus on fundamental research problems in natural language processing.
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
result | paper
CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
result | paper
Uppsala (Uppsala) - Universal Word Segmentation: Implementation and Interpretation code
What is the bottleneck of deep learning in NLP? (deep parsing?)
How can we explain the good generalization of deep learning in NLP?
Is there an implicit or explicit way to incorporate linguistic knowledge into deep learning?
(Deep learning is task-oriented, so is there no need to represent linguistic knowledge explicitly?)
What is the linguistic hypothesis behind deep learning? (generative (formal) linguistics, functional linguistics, or cognitive linguistics)
What is the difference between deep learning and statistical methods? (shallow parsing?)
Deep learning favors end-to-end training, while linguistics favors representation: do we have to choose, or are they better together?
Do corpus-based lexicography methods scale up?
Are they too manually intensive?
If so, could we use machine learning methods
- to speed up manual methods?
Just as statistical parsers learn syntactic rules: S -> NP VP
- Can we learn valency?
- Collocations?
- Typical predicate-argument relations?
refer: Minsky, Chomsky & Deep Nets
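As one illustration of the collocation question above, a classic corpus-based baseline ranks adjacent word pairs by pointwise mutual information (PMI). This is a minimal sketch under simple assumptions (whitespace tokenization, adjacent bigrams only, a hypothetical `min_count` cutoff), not a method taken from the referenced material.

```python
import math
from collections import Counter


def pmi_bigrams(sentences, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = sent.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with what
        # independence of the two words would predict.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


corpus = [
    "she drinks strong tea",
    "he likes strong tea",
    "the tea is hot",
    "the man is strong",
]
ranked = pmi_bigrams(corpus)
```

On this toy corpus, only the pair ("strong", "tea") passes the frequency cutoff, and it receives a positive PMI score.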
Linguistic Knowledge and Transferability of Contextual Representations. Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), June 2019.
Dissecting Contextual Word Embeddings: Architecture and Representation, Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih, EMNLP 2018 arxiv
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning, Jose Camacho-Collados, Mohammad Taher Pilehvar, JAIR 2018 arxiv
Uncovering Divergent Linguistic Information in Word Embeddings with Lessons for Intrinsic and Extrinsic Evaluation, Mikel Artetxe, Gorka Labaka, Iñigo Lopez-Gazpio, Eneko Agirre, CoNLL 2018, Best Paper Award, arxiv | code
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context, Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky, ACL 2018, arxiv | code
LSTMs Exploit Linguistic Attributes of Data, ACL 2018, code
Do sentence embeddings learned from RNNs capture syntactic information?
MSc project: Inferring Sentence Features from Sentence Embeddings, code
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems, Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young, EMNLP 2015 arxiv | code
- semantically
- natural language generation
Linguistically Regularized LSTM for Sentiment Classification, Qiao Qian, Minlie Huang, Jinhao Lei, Xiaoyan Zhu, ACL 2017 paper | code | Review
Linguistically-Informed Self-Attention for Semantic Role Labeling, Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum, Google, EMNLP 2018 - Best long paper 1/2. arxiv | code
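Abstract: Current state-of-the-art semantic role labeling (SRL) uses deep neural networks with no explicit linguistic features. However, prior work has shown that syntax trees can significantly improve SRL decoding, which suggests that explicit modeling of syntax could increase accuracy. In this work, we present linguistically-informed self-attention (LISA): a neural network model that combines multi-head self-attention with multi-task learning across dependency parsing, part-of-speech tagging, predicate detection, and semantic role labeling. Unlike previous models, which require extensive preprocessing to prepare linguistic features, LISA takes only raw tokens as input and encodes the sequence once to perform several prediction tasks simultaneously. Syntax is incorporated by training one attention head to attend to each token's syntactic parent. If a high-quality syntactic parse is already available, it can be beneficially injected at test time without retraining the SRL model. On the CoNLL-2005 SRL dataset, LISA, using predicted predicates and standard word embeddings, scores 2.5 F1 higher than the previous state of the art on newswire data and more than 3.5 F1 higher on out-of-domain data, a reduction in error of about 10%. On the CoNLL-2012 English SRL task, it also gains 2.5 F1. LISA also outperforms the state of the art with contextual word representations (ELMo) by 1.0 F1 on newswire data and more than 2.0 F1 on out-of-domain data.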
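The abstract's central idea, supervising one attention head so that each token attends to its syntactic parent, can be sketched as a cross-entropy loss over that head's attention distribution. This is a simplified illustration, not the authors' implementation: how the scores are produced is left out, and the names `attn_scores` and `parents` (plus the convention that the root points to itself) are assumptions of this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def parent_attention_loss(attn_scores, parents):
    """Mean cross-entropy between one head's attention and the gold parents.

    attn_scores[i][j]: unnormalized score of token i attending to token j.
    parents[i]: index of token i's syntactic parent (assumption: the root
    token points to itself).
    """
    total = 0.0
    for row, parent in zip(attn_scores, parents):
        probs = softmax(row)
        total += -math.log(probs[parent])
    return total / len(parents)


# "dogs bark loudly": "bark" (index 1) is the root and the parent of both others.
parents = [1, 1, 1]
uniform_scores = [[0.0, 0.0, 0.0] for _ in range(3)]
focused_scores = [[0.0, 5.0, 0.0] for _ in range(3)]
loss_uniform = parent_attention_loss(uniform_scores, parents)
loss_focused = parent_attention_loss(focused_scores, parents)
```

A head that already concentrates its attention on the parent column gets a lower loss than one that attends uniformly, which is the signal that pushes the head toward syntax during training.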
Recent Trends in Deep Learning Based Natural Language Processing, Tom Young, Devamanyu Hazarika, Soujanya Poria, Erik Cambria, last revised 25 Nov 2018. arxiv
Analysis Methods in Neural Language Processing: A Survey, Yonatan Belinkov, James Glass, 2019 TACL. arxiv | code
- Lexical and Neural Networks Combined
- Adversarial Learning
COLING - Europe - focuses on linguistic regularities and model analysis (interpretability research)
ACL - North America
NAACL - North America
2018 - Highlights
Why are you telling me this? Relevance & informativity in language processing. slides
Practical Parsing for Downstream Applications. tutorial
ACL 2018 Highlights
NAACL-HLT 2018 Highlights
EMNLP 2018 Highlights
- Shallow Semantic Parsing of Chinese, Honglin Sun, Daniel Jurafsky, HLT-NAACL 2004 paper
- Chinese Word Segmentation: Another Decade Review (2007-2017), arxiv
- Analogical Reasoning on Chinese Morphological and Semantic Relations, ACL 2018. paper | code
Data Programming: Creating Large Training Sets, Quickly. Alexander Ratner, Christopher De Sa, Sen Wu, Daniel Selsam, Christopher Ré, 2016 NIPS, arxiv
We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict.
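A minimal sketch of what labeling functions might look like in practice. The spam/ham task, the three functions, and the majority-vote combiner are all illustrative assumptions; the paper itself models the noisy, conflicting functions with a generative model rather than a plain vote.

```python
from collections import Counter

SPAM, HAM, ABSTAIN = 1, 0, -1


# Each labeling function encodes one heuristic and labels only the
# examples it applies to; otherwise it abstains.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_greeting(text):
    words = text.lower().split()
    if words and len(words) <= 4 and words[0] in {"hi", "hello"}:
        return HAM
    return ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_greeting]


def weak_label(text):
    """Combine votes by simple majority; ties and all-abstain yield ABSTAIN."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting votes with no majority
    return ranked[0][0]
```

For example, `weak_label("Claim your prize at http://example.com")` returns `SPAM` because two functions agree, while `weak_label("hi http://x.y")` abstains because one function votes `HAM` and another votes `SPAM`.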
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks, Jason W. Wei, Kai Zou, ICLR 2019, arxiv | code
EDA is the following operations:
1. Synonym Replacement (SR): Randomly choose n words from the sentence that are not stop words. Replace each of these words with one of its synonyms chosen at random.
2. Random Insertion (RI): Find a random synonym of a random word in the sentence that is not a stop word. Insert that synonym into a random position in the sentence. Do this n times.
3. Random Swap (RS): Randomly choose two words in the sentence and swap their positions. Do this n times.
4. Random Deletion (RD): Randomly remove each word in the sentence with probability p.
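The four operations above can be sketched in a few lines. This is not the authors' released code: the stop-word set and synonym table are toy stand-ins for real resources, and keeping one word when random deletion would remove everything is a choice made for this sketch.

```python
import random

# Toy resources (assumptions for illustration only).
STOP_WORDS = {"the", "a", "is", "on", "of"}
TOY_SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}


def _replaceable(words):
    """Indices of non-stop words that have an entry in the synonym table."""
    return [i for i, w in enumerate(words)
            if w not in STOP_WORDS and w in TOY_SYNONYMS]


def synonym_replacement(words, n, rng):
    out = list(words)
    candidates = _replaceable(out)
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(TOY_SYNONYMS[out[i]])
    return out


def random_insertion(words, n, rng):
    out = list(words)
    for _ in range(n):
        candidates = _replaceable(out)
        if not candidates:
            break
        synonym = rng.choice(TOY_SYNONYMS[out[rng.choice(candidates)]])
        out.insert(rng.randint(0, len(out)), synonym)  # any position, incl. end
    return out


def random_swap(words, n, rng):
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    # Keep one word so the sentence never becomes empty.
    return kept if kept else [rng.choice(words)]


rng = random.Random(0)  # seeded for reproducibility
words = "the quick dog is happy".split()
augmented = [
    synonym_replacement(words, 1, rng),
    random_insertion(words, 2, rng),
    random_swap(words, 3, rng),
    random_deletion(words, 0.2, rng),
]
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing runs with and without augmentation.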
Conditional BERT Contextual Augmentation, Xing Wu, Shangwen Lv, Liangjun Zang, Jizhong Han, Songlin Hu, December 2018, arxiv
Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations. Sosuke Kobayashi, NAACL-HLT, 2018. arxiv | code
Contextual augmentation is a domain-independent data augmentation method for text classification tasks. Texts in a supervised dataset are augmented by replacing words with other words predicted by a label-conditioned bidirectional language model.
Data Noising as Smoothing in Neural Network Language Models, Ziang Xie, Sida I. Wang, Jiwei Li, Daniel Lévy, Aiming Nie, Dan Jurafsky, Andrew Y. Ng, ICLR 2017 arxiv
Improving language understanding with unsupervised learning, OpenAI 2018
- language model
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Google 2018
- language model
Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution, Ting Liu, Yiming Cui, et al., ACL 2017 arxiv | slides | Yiming Cui
- data scarcity
- transfer learning
- Unsupervised Named-Entity Extraction from the Web: An Experimental Study, 2005 paper
- Unsupervised Models for Named Entity Classification, Michael Collins and Yoram Singer, 1999 paper ⭐⭐⭐⭐
Language as a Latent Variable: Discrete Generative Models for Sentence Compression, Yishu Miao, Phil Blunsom, EMNLP 2016 arxiv
Sentence Compression as Tree Transduction, Trevor Anthony Cohn, Mirella Lapata, 2009 arxiv
Global Inference for Sentence Compression: An Integer Linear Programming Approach, James Clarke, Mirella Lapata, 2008 code
Sentence Reduction for Automatic Text Summarization, Hongyan Jing, 2000 paper
A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data, Jingang Wang, Junfeng Tian, Long Qiu, Sheng Li, Jun Lang, Luo Si, Man Lan, AAAI 2018 arxiv
- A Neural Attention Model for Abstractive Sentence Summarization, Alexander M. Rush, Sumit Chopra, Jason Weston, Facebook, EMNLP 2015 code
Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm, Bjarke Felbo, et al., EMNLP 2017 arxiv | code
- data scarcity
- transfer learning
Chen, T.; Xu, R.; He, Y.; Wang, X. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst. Appl. 2017 ⭐⭐
Attention-based LSTM for Aspect-level Sentiment Classification, Yequan Wang, Minlie Huang, Li Zhao, Xiaoyan Zhu, EMNLP 2016, paper ⭐⭐⭐
Aspect-level? Attending to specific words? Could this be used to rank candidate entities in entity linking?
VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text, C.J. Hutto, Eric Gilbert. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
supplement: data set code (github) python package
- Enhanced LSTM for Natural Language Inference, Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen, ACL 2017 arxiv | code
- A Decomposable Attention Model for Natural Language Inference, Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit, EMNLP 2016 arxiv
- Learning Natural Language Inference with LSTM, Shuohang Wang, Jing Jiang, 2016, arxiv | code
- Convolutional Neural Network Architectures for Matching Natural Language Sentences, Baotian Hu, Zhengdong Lu, Hang Li, Qingcai Chen, NIPS 2014 paper
- Defending Against Neural Fake News, Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi, arxiv | code
- Combating Fake News: A Survey on Identification and Mitigation Techniques, Karishma Sharma, Feng Qian, He Jiang, Natali Ruchansky, Ming Zhang, Yan Liu, ACM 2019, arxiv
- False News On Social Media: A Data-Driven Survey, Francesco Pierri, Stefano Ceri, 2019, arxiv
- DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning, Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum, EMNLP 2018 arxiv
- Hutto, C.J. (2015). Computationally Detecting and Quantifying the Degree of Bias in Sentence-Level Text of News Stories. Second International Conference on Human and Social Analytics (HUSO-15). Barcelona, Spain 2015. code