This repository contains our source code of both Task 1 and Task 2 in the DocILE Competition.
DocILE is a large-scale research benchmark for cross-evaluation of machine learning methods for Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) from semi-structured business documents such as invoices, orders etc. Such large-scale benchmark was previously missing (Skalický et al., 2022), hindering comparative evaluation.
- This repository this repository contain our source code to use LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
We're using Python 3.9 to run this repo: Requirements packages:
- poppler-utils
- tensorboard
- jsonargparse[signatures]
- tensorrt==8.5.1.7
- timm
- transformers==4.26.0
- datasets==2.11.0
Run this command to install requirements:
pip install -r requirements.txt
- Change config in ./config/train.cfg to suit your enviroments.
- Change path to config file in ./run_training.sh and ./run_inference.sh to suit your enviroments.
We decided to randomly resplit dataset, 80% for train and 20% for validation to have better score. To randomly resplit dataset, change dataset_path in ./data_split.py and run:
python3 data_split.py
Run ./run_training.sh
bash run_training.sh
The models will be save at OUPUT_DIR (config)
To run inference, follow below instruction.
- On test set, call:
bash run_inference.sh test
- On validation set, call:
bash run_inference.sh val
The result will be save at PREDICTION_DIR (con)