Sevilla-Interview-Analysis

Angela Krak

December 15, 2021

Purpose

This is Angela Krak's final project repo for Data Science (LING 2340), Fall 2021. The data consist of transcriptions of sociolinguistic interviews with speakers from Seville, Spain. The project uses frequency analyses to explore common themes in the data, both across the full set and by interview question.
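
As a rough illustration of what a frequency analysis "as a whole set and by interview question" can look like, here is a minimal Python sketch. It is not the project's actual code: the function names `tokenize` and `frequency_tables`, the `(speaker, question)` keys, and the sample sentences are all hypothetical and invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens.

    The pattern matches runs of Unicode letters, so accented Spanish
    characters (á, é, ñ, ...) stay inside their words, while digits
    and punctuation such as "¿" and "?" are dropped.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def frequency_tables(transcripts):
    """Count word frequencies overall and per interview question.

    `transcripts` maps (speaker, question_number) to transcript text.
    This layout is an assumption made for the sketch.
    """
    overall = Counter()
    by_question = {}
    for (speaker, question), text in transcripts.items():
        tokens = tokenize(text)
        overall.update(tokens)
        by_question.setdefault(question, Counter()).update(tokens)
    return overall, by_question


# Invented placeholder text, not drawn from the interviews.
sample = {
    ("speaker01", 1): "Sevilla es mi ciudad",
    ("speaker01", 2): "mi barrio es Triana",
    ("speaker02", 1): "mi ciudad es Sevilla",
}

overall, by_question = frequency_tables(sample)
print(overall.most_common(3))
print(by_question[1].most_common(3))
```

Keeping one `Counter` for the whole set and one per question makes it straightforward to compare which words dominate a given question against the overall distribution. A real analysis would likely also filter out function words such as "es" and "mi" before looking for themes.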

I began the project with 24 sound files, which I collected personally in the summer of 2019. For this project, I transcribed 22 of the 24 files and divided each transcript into .txt files by interview question. There were 5 questions in total, so each speaker is associated with 5 files. While I cannot share the full data set, I have uploaded a sample of .txt files from one speaker, which can be viewed here: