RAGs for Open Domain Complex QA

As question answering (QA) systems increasingly rely on large language models (LLMs), integrating retrieval mechanisms to provide context is essential for handling complex queries, which may require information missing from the models' training data. This work explores the impact of different contexts on Retrieval-Augmented Generation (RAG) systems for complex QA by comparing the performance of several LLMs given relevant, negative, and random contexts. Our results indicate that injecting negative contexts and using different prompting techniques can affect QA performance differently depending on the LLM.
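As a rough illustration of the setup described above, the sketch below shows how a retrieved passage (relevant, negative, or random) could be injected into a QA prompt before querying an LLM. The prompt template and function names are illustrative assumptions, not the exact ones used in the experiments.

```python
# Minimal sketch (assumed names): build a RAG prompt with a chosen context type.
import random

def build_prompt(question: str, context: str) -> str:
    # Simple prompt template; the templates used in the experiments may differ.
    return (
        "Answer the question using the context below.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

def pick_context(relevant: list[str], corpus: list[str], mode: str) -> str:
    if mode == "relevant":        # a passage retrieved for this question
        return relevant[0]
    if mode == "negative":        # a passage that is not relevant to the question
        return random.choice([p for p in corpus if p not in relevant])
    return random.choice(corpus)  # "random": any passage from the corpus

# Example usage with placeholder passages
question = "Who directed the film that won Best Picture in 1994?"
prompt = build_prompt(question, pick_context(["<relevant passage>"], ["<corpus passage>"], mode="relevant"))
print(prompt)
```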

Repository structure

  • llm-responses contains QA responses from the experiments in json format.
  • llm-inference contains notebooks used for running inference with tested LLMs.
  • llm-evaluation-metrics.ipynb can be used to compute evaluation metrics from the LLMs' responses (a rough sketch of such metrics appears after this list).
  • adore-notebooks contains code used for preprocessing, training, and inference of the ADORE Dense Retrieval model (see GitHub and Paper).
  • qualitative-analysis contains qualitative results for 100 queries and the code used to produce them (TBD: cleaning).
  • rag contains miscellaneous code used for data processing and evaluation of the experiments.
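The evaluation notebook above computes scores from the stored responses. As a rough illustration, the sketch below computes exact match and token-level F1 from a JSON file of responses; the field names "prediction" and "answer" are assumptions about the file layout, not the repository's actual schema.

```python
# Hypothetical sketch: exact match and token-level F1 over a JSON responses file.
# The fields "prediction" and "answer" are assumed; adapt them to the files in llm-responses.
import json
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    # Lowercase, strip punctuation, and collapse whitespace before comparing.
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return re.sub(r"\s+", " ", text).strip()

def f1(prediction: str, reference: str) -> float:
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

def evaluate(path: str) -> dict:
    with open(path) as fh:
        records = json.load(fh)
    em = sum(normalize(r["prediction"]) == normalize(r["answer"]) for r in records) / len(records)
    avg_f1 = sum(f1(r["prediction"], r["answer"]) for r in records) / len(records)
    return {"exact_match": em, "f1": avg_f1}
```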

Finally, a detailed project report is available in Report.pdf, covering the aim, description, and results of all the experiments we conducted.