Team Members: Kriti Asija, Parth Rohilla, Nishthavan Dahiya, Darsh Patel, Soham Khade
Paraphrase identification is increasingly valuable, with applications ranging from enhancing academic integrity to refining legal document analysis and boosting content originality in digital publishing. Many existing paraphrase detection models focus on word-level context and can therefore miss sentence-level subtleties. This project presents “ParaBERT”, a novel approach to paraphrase identification that combines a Siamese BERT network with handcrafted features to capture a more nuanced understanding of semantics. To evaluate its efficacy, ParaBERT is compared against several classical baseline models. The final results demonstrate the model’s robust performance, with high accuracy and F1-scores on the datasets used.