This study explores Task 2 of NumEval-2024, i.e., SemEval-2024 (Semantic Evaluation) Task 7, which focuses on reading comprehension of numerals in Chinese text. The dataset used is the Numeral-related Question Answering Dataset (NQuAD), and the model employed is BERT. Before training, the data undergoes preprocessing that applies Numerals Augmentation and Feature Enhancement to numerical entities; the model is then fine-tuned. The result is an accuracy of 77.09%, a 7.14% improvement over the original NQuAD baseline model, the Numeracy-Enhanced Model (NEMo).
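The Numerals Augmentation idea can be illustrated with a minimal sketch: numerals in the input text are wrapped with marker tokens so the model can attend to them explicitly. The `[NUM]`/`[/NUM]` markers and the regex below are illustrative assumptions, not the exact scheme used in the notebooks:

```python
import re

# Matches integers and simple decimals; a hypothetical pattern for illustration
NUM_RE = re.compile(r"\d+(?:\.\d+)?")

def augment_numerals(text: str) -> str:
    """Wrap every numeral with marker tokens (assumed [NUM]/[/NUM] scheme)."""
    return NUM_RE.sub(lambda m: f"[NUM]{m.group(0)}[/NUM]", text)

print(augment_numerals("Revenue rose 3.5 billion in 2023."))
```

In practice such marker tokens would be added to the tokenizer's vocabulary before fine-tuning so they are not split into subwords.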
The file "pre-processing.ipynb" generates the data for model training and evaluation.
The file "BERT-NAFE.ipynb" fine-tunes the model.