- Philipp Koehn's site
- China Workshop on Machine Translation (CWMT)
Bridging the Gap between Training and Inference for Neural Machine Translation. Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu. ACL 2019 Best Long Paper. arxiv video
Inspired by Data as Demonstrator, a meta-learning algorithm that improves the multi-step predictive capability of a learned time-series (e.g. dynamical system) model. code
At training time, the model predicts with the ground-truth words as context, while at inference it has to generate the entire sequence from its own previous outputs. This discrepancy in the fed context leads to error accumulation along the way.
Word-level training requires strict matching between the generated sequence and the ground-truth sequence, which leads to overcorrection of translations that differ from the reference but are nonetheless reasonable.
We address these issues by sampling context words not only from the ground-truth sequence but also from the sequence predicted by the model during training, where the predicted sequence is selected with a sentence-level optimum.
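The mixing idea above can be sketched in a few lines: at each training step, every context word is drawn from the ground truth with some probability and from the model's prediction otherwise, with that probability decaying as training progresses. This is a minimal illustration, not the paper's exact implementation; the inverse-sigmoid decay shape and the function names (`gold_selection_prob`, `mix_context`) are assumptions for the sketch.

```python
import math
import random

def gold_selection_prob(epoch, mu=12.0):
    # Inverse-sigmoid decay (an assumed schedule): starts near 1,
    # so early training uses mostly ground-truth context, and decays
    # toward 0, so later training uses mostly model predictions.
    return mu / (mu + math.exp(epoch / mu))

def mix_context(gold_tokens, predicted_tokens, p_gold, rng=random):
    # Build the decoder's training context position by position:
    # take the gold word with probability p_gold, else the word the
    # model itself predicted at that position.
    return [g if rng.random() < p_gold else y
            for g, y in zip(gold_tokens, predicted_tokens)]

# Example: early in training almost all context comes from the gold
# sequence; late in training almost all comes from the predictions.
gold = ["the", "cat", "sat", "down"]
pred = ["a", "cat", "sits", "down"]
early = mix_context(gold, pred, gold_selection_prob(epoch=0))
late = mix_context(gold, pred, gold_selection_prob(epoch=100))
```

The decay schedule is what bridges the gap: the model is weaned off gold context gradually, so it learns to recover from its own mistakes instead of only ever seeing perfect prefixes.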