
Partial Execution

Table Question Answering involves both understanding the natural language query and grounding it in the input table to extract the relevant information. In this context, many methods have highlighted the benefits of intermediate pre-training on SQL queries. However, while most approaches aim at generating final answers directly from the inputs, we claim that more can be done with SQL queries during training. By learning to imitate a restricted set of SQL-like algebraic operations, we show that their execution flow provides intermediate supervision steps that allow increased generalization and structural reasoning compared with classical approaches in the field. Our study bridges the gap between semantic parsing and direct answering methods and provides useful insights into which types of operations should be predicted by a generative architecture and which are better executed by an external algorithm.
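
To make this concrete, below is a minimal, hypothetical sketch of partial execution: the model predicts a small algebraic plan, and an external interpreter executes the Projection, Comparison, and Selection (PCS) nodes over the table. The operator names and plan format are illustrative only, not the repository's actual representation.

import pandas as pd

table = pd.DataFrame({"city": ["Paris", "Lyon", "Nice"],
                      "population": [2100000, 510000, 340000]})

# Plan for: "which cities have a population above 500,000?"
# Selection with an embedded Comparison, then Projection.
plan = [("select", "population", ">", 500000),
        ("project", "city")]

def execute(table, plan):
    # Execute the PCS nodes externally instead of asking the model to do so.
    for op in plan:
        if op[0] == "select":
            _, col, cmp, value = op
            mask = table[col] > value if cmp == ">" else table[col] == value
            table = table[mask]
        elif op[0] == "project":
            table = table[[op[1]]]
    return table

print(execute(table, plan))  # -> the "city" column for Paris and Lyon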

Installation

Data

Install the SQUALL data:

git clone https://github.com/tzshi/squall.git && \
mkdir -p ~/partial_execution && \
mv squall/data ~/partial_execution && \
mv squall/tables ~/partial_execution/data && \
rm -rf squall
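
As a quick sanity check, the snippet below verifies that the files landed where the commands above put them (paths assumed from those commands; the scripts may expect additional files):

from pathlib import Path

root = Path.home() / "partial_execution" / "data"
print("data dir exists:", root.exists())                  # SQUALL json files
print("tables dir exists:", (root / "tables").exists())   # SQUALL tables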

Create the logical forms for the 'preorder' flattening method:

cd scripts
python create_logicalforms.py --flatten_mode preorder
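
For intuition, here is a minimal sketch of what 'preorder' flattening means: a SQL-like operator tree is serialized by visiting each node before its children. The node structure below is hypothetical; create_logicalforms.py defines the actual logical-form format.

from dataclasses import dataclass, field

@dataclass
class Node:
    op: str
    children: list = field(default_factory=list)

def preorder(node):
    # Visit the node itself before its children (preorder traversal).
    tokens = [node.op]
    for child in node.children:
        tokens.extend(preorder(child))
    return tokens

# project(city, select(population > 500000))
tree = Node("project", [Node("city"),
                        Node("select", [Node(">", [Node("population"),
                                                   Node("500000")])])])
print(" ".join(preorder(tree)))
# -> project city select > population 500000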

Train Your Own Model

finetune_model.py

This script fine-tunes and evaluates a pre-trained model on the WikiTableQuestions dataset, mapping Question + Table inputs to the targets determined by the --Omega_include setting.

For example, here we fine-tune starting from the intermediately pre-trained "tapex-base" checkpoint with the "preorder" flattening method, while executing only the Projection, Comparison, and Selection (PCS) nodes. This fine-tuning is designed to run on a GPU, with an effective batch size of about 128 (per-device batch size 6 × 21 gradient-accumulation steps = 126).

python finetune_model.py \
  --do_train \
  --do_eval \
  --Omega_include "pcs"  \
  --dataset_name data/wtq_lf_preorder \
  --output_dir "models/tapex-base_pcs" \
  --config_name "microsoft/tapex-base" \
  --tokenizer_name "microsoft/tapex-base" \
  --model_name_or_path "microsoft/tapex-base"  \
  --overwrite_output_dir \
  --per_device_train_batch_size 6 \
  --gradient_accumulation_steps 21 \
  --per_device_eval_batch_size 6 \
  --learning_rate 3e-5 \
  --logging_steps 10 \
  --eval_steps 1000 \
  --save_steps 1000 \
  --warmup_steps 2000 \
  --evaluation_strategy steps \
  --predict_with_generate \
  --num_beams 5 \
  --weight_decay 1e-2 \
  --label_smoothing_factor 0.1 \
  --max_steps 20000
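
Once fine-tuning finishes, the checkpoint in --output_dir is a standard TAPEX/BART sequence-to-sequence model and can be loaded with the Hugging Face transformers API, as in the sketch below. Note that when training with --Omega_include, the decoded sequence is a flattened logical form rather than necessarily a plain answer, so this is only a loading example.

import pandas as pd
from transformers import TapexTokenizer, BartForConditionalGeneration

tokenizer = TapexTokenizer.from_pretrained("models/tapex-base_pcs")
model = BartForConditionalGeneration.from_pretrained("models/tapex-base_pcs")

table = pd.DataFrame({"city": ["Paris", "Lyon"],
                      "population": ["2100000", "510000"]})
inputs = tokenizer(table=table, query="which city is the largest?",
                   return_tensors="pt")
outputs = model.generate(**inputs, num_beams=5)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))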

perf_model.py

After fine-tuning, you can evaluate the model's performance on the test set with the following command. This process identifies the best checkpoint on the validation set and then applies that checkpoint to the test set for the final performance assessment.

python perf_model.py \
 --path_model_folder "tapex-base_pcs" \
 --flatten_mode "preorder"
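
As a rough illustration of the checkpoint-selection step, the sketch below scans the Trainer's saved log history and keeps the checkpoint with the best validation score. The metric key is an assumption; perf_model.py may select checkpoints differently.

import json
from pathlib import Path

best_step, best_score = None, float("-inf")
for state_file in Path("models/tapex-base_pcs").glob("checkpoint-*/trainer_state.json"):
    state = json.loads(state_file.read_text())
    for entry in state["log_history"]:
        score = entry.get("eval_denotation_accuracy")  # assumed metric key
        if score is not None and score > best_score:
            best_step, best_score = entry["step"], score
print(f"best checkpoint: step {best_step} ({best_score:.4f})")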

Test the Best Model

Not available yet.

python perf_model.py \
 --path_model_folder "tapex-large_pcs" \
 --flatten_mode "preorder"

Test performance on permuted tables

Create the two datasets: non-permuted and permuted. This produces two Hugging Face datasets of 506 examples each: one built from 316 permuted tables, and the other from the original tables of the official validation set.

cd scripts
python create_permuted_data.py 
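
For intuition, a table permutation can be as simple as shuffling row and column order without touching cell contents, as in the hedged sketch below; create_permuted_data.py defines the actual permutation scheme used here.

import numpy as np
import pandas as pd

def permute_table(table, seed=0):
    # Shuffle row and column order without changing any cell contents.
    rng = np.random.default_rng(seed)
    rows = rng.permutation(len(table))
    cols = rng.permutation(table.columns)
    return table.iloc[rows][cols].reset_index(drop=True)

table = pd.DataFrame({"city": ["Paris", "Lyon", "Nice"],
                      "population": [2100000, 510000, 340000]})
print(permute_table(table))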

Example of running the TAPEX baseline on the permuted data:

python analysis/baselines.py --model tapex --permuted 1

Cite Us

If you find our work useful, please consider citing our paper:

@article{mouravieff2024training,
  title={Training Table Question Answering via SQL Query Decomposition},
  author={Mouravieff, Rapha{\"e}l and Piwowarski, Benjamin and Lamprier, Sylvain},
  journal={arXiv preprint arXiv:2402.13288},
  year={2024}
}
