This project analyzes online reviews from British Airways customers on Skytrax using Natural Language Processing (NLP) and topic modeling. The goal is to discover recurring themes and sentiments to inform decision-making in customer service and product development.
To set up the project, clone the repository and install the required packages:
git clone https://github.com/hrbn/british-airways-reviews.git
cd british-airways-reviews
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
The dataset is scraped from Skytrax. To run the scraper:
python scraper/scrape_reviews.py
To pseudonymize customer names in the dataset:
python scraper/pseudonymize.py
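For readers unfamiliar with pseudonymization, here is a minimal sketch of one common approach: replacing each name with a stable token derived from a keyed hash. It does not show the contents of scraper/pseudonymize.py, which may work differently. The function name `pseudonymize_name`, the `customer_` prefix, and the key value are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize_name(name: str, secret_key: bytes, length: int = 10) -> str:
    """Map a name to a stable pseudonym using HMAC-SHA256 (illustrative only)."""
    # Normalize case and whitespace so "Jane Doe" and " jane  doe " match.
    normalized = " ".join(name.lower().split())
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep a short prefix of the digest as a readable identifier.
    return f"customer_{digest[:length]}"


if __name__ == "__main__":
    # Hypothetical key; a real key should be kept out of version control.
    key = b"replace-with-a-private-key"
    for reviewer in ["Jane Doe", " jane  doe ", "John Smith"]:
        print(repr(reviewer), "->", pseudonymize_name(reviewer, key))
```

A keyed hash gives the same reviewer the same pseudonym across runs, so reviews by one person remain linkable, while the original name cannot be recovered without the key. Truncating the digest keeps identifiers short at the cost of a small collision risk.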
Run the following Jupyter notebooks sequentially to perform the analysis:
british_airways_1_clean.ipynb: Data cleaning and preprocessing
british_airways_2_nlp.ipynb: NLP text analysis and feature extraction
british_airways_3_topic_model.ipynb: Topic modeling using BERTopic
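The three notebooks can also be executed in order from a script rather than opened by hand. The sketch below is one possible way to do that with the `jupyter nbconvert` command-line tool. It assumes that nbconvert is installed in the active environment and that the notebooks sit in the current working directory; neither is confirmed by this README. The helper names `build_command` and `run_all` are hypothetical.

```python
import subprocess
from typing import List

# Execution order matters: each notebook is assumed to build on the previous one.
NOTEBOOKS = [
    "british_airways_1_clean.ipynb",
    "british_airways_2_nlp.ipynb",
    "british_airways_3_topic_model.ipynb",
]


def build_command(notebook: str) -> List[str]:
    """Return the nbconvert command that executes a notebook and saves it in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_all(notebooks: List[str]) -> None:
    """Run each notebook in order, stopping at the first failure."""
    for notebook in notebooks:
        print(f"Executing {notebook} ...")
        # check=True raises CalledProcessError if a notebook fails to execute.
        subprocess.run(build_command(notebook), check=True)


if __name__ == "__main__":
    run_all(NOTEBOOKS)
```

Stopping at the first failure prevents a later notebook from running on stale or missing intermediate output.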