Sevilla-Interview-Analysis

Angela Krak

December 15, 2021

[email protected]

Purpose

This is Angela Krak's final project repo for Data Science (LING 2340), Fall 2021. The data consist of transcriptions of sociolinguistic interviews with speakers from Seville, Spain. The purpose of this project is to run frequency analyses that explore common themes in the data, both across the whole set and by interview question.

I began the project with 24 sound files, which I collected personally in the summer of 2019. For this project, I transcribed 22 of the 24 files and divided the speech into .txt files by interview question. There were 5 questions in total, so each speaker is associated with 5 files. While I cannot share the full data set, I have uploaded a sample of .txt files from one speaker, which can be viewed here:

Sample Interview Q1

Sample Interview Q2

Sample Interview Q3

Sample Interview Q4

Sample Interview Q5
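
For readers who want a feel for the kind of frequency analysis described above, here is a minimal sketch. It is written in Python for illustration only; the project's actual code is in the R Markdown file and may work quite differently. The transcript strings, the speaker IDs, and the function names (`tokenize`, `overall_frequencies`, `frequencies_by_question`) are all assumptions invented for this example and are not taken from the data set.

```python
import re
from collections import Counter

# Placeholder transcripts keyed by (speaker_id, question_number).
# These strings are invented for illustration; they are NOT from the data set.
transcripts = {
    ("speaker01", 1): "me gusta la feria y me gusta la ciudad",
    ("speaker01", 2): "la familia vive en el barrio",
    ("speaker02", 1): "la feria es muy bonita",
    ("speaker02", 2): "mi barrio es tranquilo",
}


def tokenize(text):
    """Lowercase the text and return its word tokens (letters only, accents kept)."""
    # [^\W\d_]+ matches runs of Unicode letters, so "qué" stays one token
    # while punctuation and digits are dropped.
    return re.findall(r"[^\W\d_]+", text.lower())


def overall_frequencies(data):
    """Count word frequencies across every transcript in the set."""
    counts = Counter()
    for text in data.values():
        counts.update(tokenize(text))
    return counts


def frequencies_by_question(data):
    """Count word frequencies separately for each interview question."""
    by_question = {}
    for (_speaker, question), text in data.items():
        by_question.setdefault(question, Counter()).update(tokenize(text))
    return by_question


overall = overall_frequencies(transcripts)
by_question = frequencies_by_question(transcripts)

# Show the most frequent words overall and for question 1.
print(overall.most_common(3))
print(by_question[1].most_common(3))
```

A real analysis would likely also remove stop words such as "la" and "me" before counting, since those dominate raw frequency lists without saying much about themes.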

Organization

I have linked the most important parts of the repo below.

The final_report.md will contain a summary of the project and the most important findings from the frequency analyses.

The code used in this project can be accessed by viewing Sevilla Transcription Rmd or Sevilla Transcription md.

The images of the frequency graphs can be found in the folder titled images.

Thanks for visiting and feel free to contact me with any questions!