This repo contains code and data for reproducing the LSTM results in Self-Training for Compositional Neural NLG in Task-Oriented Dialogue. It was originally used for pull request #5 of facebookresearch/TreeNLG, and was later reborn as a branch based on commit e66e012 of facebookresearch/TreeNLG for further research. The BART version is at znculee/TreeNLG-BART.
TBD
In addition to the weather and enriched E2E challenge datasets from our paper, we release another dataset, weather_challenge, which contains harder weather scenarios in its train/val/test files. Each response was collected by providing annotators, who are native English speakers, with a user query and a compositional meaning representation (with discourse relations and dialog acts). All of these are made available in our dataset. See our linked paper for more details.
Dataset | Train | Val | Test | Disc_Test |
---|---|---|---|---|
Weather | 25390 | 3078 | 3121 | 454 |
Weather_Challenge | 32684 | 3397 | 3382 | - |
E2E | 42061 | 4672 | 4693 | 230 |
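To sanity-check these split sizes against the released files, a loop like the one below works, assuming one example per line. The directory layout and file extension (`data/weather/<split>.tsv`) are assumptions, so adjust them to the actual release.

```bash
# Count examples per split; paths and extension are assumptions, adjust as needed.
for split in train val test disc_test; do
  printf '%-10s %s\n' "$split" "$(wc -l < data/weather/${split}.tsv)"
done
```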
`Disc_Test` is a more challenging subset of our test set that contains discourse relations; it is also the subset for which we report results in the `Disc` column of Table 7 in our paper. Note that the statistics above differ slightly from those in our paper; please use the statistics above.
Note: some responses in the `Weather` dataset come without a user query (141/17/18/4 for train/val/test/disc_test, respectively). We simply use a "placeholder" token for these missing user queries.
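To locate those responses, one can grep for the token; the pattern and path below are assumptions about the file layout, not part of the release.

```bash
# Count lines containing the "placeholder" token (path and format assumed)
grep -c 'placeholder' data/weather/train.tsv
```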
fairseq must be installed first, following the Requirements and Installation instructions of fairseq. The code has been tested against commit `e9014fb` of fairseq.
```bash
# Create and activate a conda environment
conda create -n treenlg python=3.7 pip
conda activate treenlg
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

# Clone this repo and install fairseq at the tested commit
git clone https://github.com/znculee/TreeNLG.git
cd TreeNLG
git clone https://github.com/pytorch/fairseq.git
cd fairseq
git checkout -b treenlg e9014fb
pip install -e .
cd ..
```
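A quick way to confirm the environment is wired up (these checks are ours, not part of the original scripts):

```bash
# Verify the editable fairseq install and that PyTorch can see the GPU
python -c "import fairseq; print(fairseq.__version__)"
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```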
```bash
# Preprocess the weather dataset, then train and decode the LSTM model
bash scripts/prepare.weather.sh
bash scripts/train.weather.lstm.sh
bash scripts/generate.weather.lstm.sh
```
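For orientation, the three scripts roughly wrap the standard fairseq pipeline. The sketch below is illustrative only; the actual arguments, architectures, and paths are defined inside `scripts/*.sh` and will differ.

```bash
# Illustrative sketch of the underlying fairseq pipeline; the flags and
# paths here are assumptions, not the repo's exact configuration.
fairseq-preprocess --source-lang src --target-lang tgt \
  --trainpref data/weather/train --validpref data/weather/val \
  --testpref data/weather/test --destdir data-bin/weather

fairseq-train data-bin/weather --arch lstm \
  --optimizer adam --lr 1e-3 --max-tokens 4096 \
  --save-dir checkpoints/weather.lstm

fairseq-generate data-bin/weather \
  --path checkpoints/weather.lstm/checkpoint_best.pt \
  --beam 5 --batch-size 32
```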
The BLEU score is calculated on the output text only, without any of the tree information. `+replfail` denotes evaluating the constrained-decoding generations after replacing failure cases with the corresponding unconstrained-decoding generations. We use the BLEU evaluation script provided for the E2E challenge here.
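As a concrete illustration of scoring the surface text only, one can strip the bracketed tree tokens before computing BLEU. The token pattern and file names below are assumptions, and `measure_scores.py` refers to the E2E challenge metrics repository:

```bash
# Strip tree tokens (assumed to look like "[LABEL ... ]") from the
# hypotheses, then score with the E2E challenge script (file names assumed)
sed -E 's/\[[A-Z_0-9]+ //g; s/ ?\]//g; s/ +/ /g' hyp.tree.txt > hyp.txt
python e2e-metrics/measure_scores.py ref.txt hyp.txt
```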
Dataset | Method | BLEU (disc) | TreeAcc (disc) | BLEU (no-disc) | TreeAcc (no-disc) | TreeAcc (whole)
--- | --- | --- | --- | --- | --- | ---
Weather | S2S-Tree | 74.51 | 89.65 | 76.34 | 94.17 | 93.59
| +constr | 75.41 | 100.0 | 76.88 | 99.84 | 99.86
| +replfail | 75.41 | 100.0 | 77.38 | 99.84 | 99.86
Weather_Challenge | S2S-Tree | N/A | N/A | 77.79 | 94.09 | N/A
| +constr | N/A | N/A | 78.52 | 99.91 | N/A
| +replfail | N/A | N/A | 79.02 | 99.91 | N/A
E2E | S2S-Tree | 66.70 | 62.17 | 77.37 | 96.72 | 95.10
| +constr | 64.32 | 99.13 | 77.44 | 99.89 | 99.86
| +replfail | 65.38 | 99.13 | 77.43 | 99.89 | 99.86
Please refer to self_training/README.md to reproduce the results of the self-training experiments in the paper.