This repository contains the research and Python code from my master's thesis (2023-2024).
Title: "Combining Multi-Scale Kernels and Transformer Encoder for ECG classification"
The research utilizes electrocardiogram (ECG) data from the Physionet 2021 challenge: https://physionet.org/content/challenge-2021/1.0.3/.
You can download the challenge data directly from the Physionet 2021 homepage: https://physionet.org/content/challenge-2021/1.0.3/#files. Alternatively, my Google Drive provides all ECG files wrapped in a single .h5 (key-value) file: https://drive.google.com/drive/folders/1e0LygPzn5tM9i2m2leXFwSs5J5og97W5. From there, download physionet2021.h5, physionet2021_references.csv (labels) and codes_SNOMED.csv.

Each ECG in physionet2021.h5 is stored with the key (str) "PATIENT-ID" and a 12-lead ECG (list[12 lists]) as the value. All ECGs were recorded at a sampling rate of 500 Hz. Most, but not all, are 10 seconds long, which gives 5000 sample points per lead. Physionet 2021 and the provided .h5 dataset also include some recordings that are shorter than 10 seconds or as long as 15 minutes; these can be preprocessed (padded/truncated) to 10 seconds. An ECG can have multiple labels, see physionet2021_references.csv. The corresponding arrhythmia label names are listed in codes_SNOMED.csv.

Note that my Google Drive also contains the Physionet 2017 challenge data. The directory "prepared" holds further prepared datasets, since this thesis focuses on the arrhythmia types Sinus Rhythm (SR), Atrial Fibrillation (AF), Atrial Flutter (AFL), Premature Atrial Contraction (PAC) and Premature Ventricular Contraction (PVC). The Physionet 2021 labels distinguish between more than 100 known arrhythmia types.
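The padding/truncation to 10 seconds mentioned above could look roughly like the following minimal sketch. It is not the preprocessing code of this repository: the function name `pad_or_truncate`, the choice to keep the first 5000 samples, and zero-padding at the end are assumptions made for illustration. It uses plain Python lists to match the list[12 lists] layout described above.

```python
SAMPLING_RATE_HZ = 500   # sampling rate stated for the dataset
TARGET_SECONDS = 10      # target recording length
TARGET_LENGTH = SAMPLING_RATE_HZ * TARGET_SECONDS  # 5000 samples per lead


def pad_or_truncate(ecg, target_length=TARGET_LENGTH, pad_value=0.0):
    """Return a copy of a multi-lead ECG where every lead has target_length samples.

    Assumptions (hypothetical, for illustration only):
    - longer leads are cut after the first target_length samples
    - shorter leads are padded at the end with pad_value
    """
    fixed = []
    for lead in ecg:
        lead = list(lead)[:target_length]                      # truncate if too long
        lead = lead + [pad_value] * (target_length - len(lead))  # pad if too short
        fixed.append(lead)
    return fixed


# Synthetic 12-lead examples: 6 seconds and 15 seconds at 500 Hz
short_ecg = [[0.1] * 3000 for _ in range(12)]
long_ecg = [[0.2] * 7500 for _ in range(12)]

print(len(pad_or_truncate(short_ecg)[0]))  # 5000
print(len(pad_or_truncate(long_ecg)[0]))   # 5000
```

With array data, the same idea would typically be written with NumPy slicing and padding instead of list operations.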
Reyna MA, Sadr N, Perez Alday EA, Gu A, Shah AJ, Robichaux C, Rad AB, Elola A, Seyedi S, Ansari S, Ghanbari H, Li Q, Sharma A, Clifford GD. Will Two Do? Varying Dimensions in Electrocardiography: The PhysioNet/Computing in Cardiology Challenge 2021. 2021 Computing in Cardiology (CinC), Brno, Czech Republic, 2021 (pp. 1-4). doi: 10.23919/CinC53138.2021.9662687
Reyna, M., Sadr, N., Gu, A., Perez Alday, E. A., Liu, C., Seyedi, S., Shah, A., & Clifford, G. (2022). Will Two Do? Varying Dimensions in Electrocardiography: The PhysioNet/Computing in Cardiology Challenge 2021 (version 1.0.3). PhysioNet. https://doi.org/10.13026/34va-7q14.
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23):e215-e220 [Circulation Electronic Pages; http://circ.ahajournals.org/content/101/23/e215.full]; 2000 (June 13).
Please go to section Setup for an installation guide to run this repository.
![https://en.wikipedia.org/wiki/Electrocardiography](https://private-user-images.githubusercontent.com/57774167/305729785-1a2b2533-3aae-4876-8f32-2c24ce4cc90e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzY3NzUsIm5iZiI6MTczOTM3NjQ3NSwicGF0aCI6Ii81Nzc3NDE2Ny8zMDU3Mjk3ODUtMWEyYjI1MzMtM2FhZS00ODc2LThmMzItMmMyNGNlNGNjOTBlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE2MDc1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTMxM2VhMjM4MTRmMzQyMGRhNjFmZGJkYjcwZTA1Mzg0ODNhNjFkOGJiMDIyN2NkZDgyMDVkZGYwNDllZDFkN2QmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.l37ZDPGhrAjVYMdwF5BCX2UmA6EcLKiEH-szO3Nhs68)
Source: https://en.wikipedia.org/wiki/Electrocardiography
![Transformer model](https://private-user-images.githubusercontent.com/57774167/339169590-3c0b185b-b43b-49ef-bd72-5369edcab3a4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzY3NzUsIm5iZiI6MTczOTM3NjQ3NSwicGF0aCI6Ii81Nzc3NDE2Ny8zMzkxNjk1OTAtM2MwYjE4NWItYjQzYi00OWVmLWJkNzItNTM2OWVkY2FiM2E0LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE2MDc1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWQwMWMxYjM0OGRjZWVkMmVkNDJkZmQ5ZjYwNmJmODUyNGI3Mjc0N2FjYzkyYjE5ZTk3MjM0NjA4Yjg5YTQ5ODYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.wSXR8b26QPC4E1molVQxjbiSQHEnAYW-x0qUiwi4zTk)
Source: Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin. "Attention Is All You Need", 2017
![](https://private-user-images.githubusercontent.com/57774167/339168072-66c8ec15-edfb-4a55-ae57-db5118aa18f2.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzY3NzUsIm5iZiI6MTczOTM3NjQ3NSwicGF0aCI6Ii81Nzc3NDE2Ny8zMzkxNjgwNzItNjZjOGVjMTUtZWRmYi00YTU1LWFlNTctZGI1MTE4YWExOGYyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE2MDc1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWU0NDFiNzM2NzVlMWMzN2ViNTBiNjIxZGUwM2YyOGQ2YzE4OWMwMjI0MzdlODFmYjVmZDA3YWRiYTBhYTdlNjYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.qv2dfUPMElRJudIyfsLRuYn20kiLfKx8L_r01Ev6qNk)
![](https://private-user-images.githubusercontent.com/57774167/339167839-e1c2090c-f9e5-4027-9432-6cd6b535a130.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzY3NzUsIm5iZiI6MTczOTM3NjQ3NSwicGF0aCI6Ii81Nzc3NDE2Ny8zMzkxNjc4MzktZTFjMjA5MGMtZjllNS00MDI3LTk0MzItNmNkNmI1MzVhMTMwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE2MDc1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdmY2RlM2I0NjViYjIwYThhNjM4MzU5MWNiOTU4MDJlN2EwZmEyZDhhMjY0MDkzZGE1NTY1NzMxMWM3OTAzOGMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.bxPrB6qA5S7hhrWy7znooQ7MnQ8JgPTXiIeUM_IDpZQ)
![](https://private-user-images.githubusercontent.com/57774167/342243815-94422599-de4d-4001-9ae0-e068863220c6.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzY3NzUsIm5iZiI6MTczOTM3NjQ3NSwicGF0aCI6Ii81Nzc3NDE2Ny8zNDIyNDM4MTUtOTQ0MjI1OTktZGU0ZC00MDAxLTlhZTAtZTA2ODg2MzIyMGM2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE2MDc1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTgzZDZkNTE0ODJhM2ZkMWU3YWJiODQ0MWVlMzdhZTYzZjU2MTA2NTllODY0ZWZlMjdkNWQ2MzA3OTU0NTAxZDgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.vMy2z5dgZIv1N0ak2svolAI0T6F-_13vusSePB4uzlg)
The repository is still under development, so for now just take a look at Jupiter_notebook_Physionet2021_categoricalCrossentropy.ipynb and ignore the following steps.
If not done yet, install:
- Git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
- Python 3.11: https://www.python.org/downloads/
- Pip (Package Installer for Python): https://pip.pypa.io/en/stable/installation/

Then follow these steps:
- Open a folder or directory on your computer where you want to save the project.
- Open a terminal on Mac OS/Linux or cmd (Command Prompt) on Windows.
- Clone repository
git clone https://github.com/KIlian42/Atrial-Fibrillation-Classification-Using-Ensembled-Models.git
- Change directory
cd Atrial-Fibrillation-Classification-Using-Ensembled-Models
- Install library to create virtual environments (optional, since the venv module used in the next step is part of the Python standard library)
pip install virtualenv
- Create Python 3.11 environment
Mac OS/Linux:
python3.11 -m venv .venv
Windows:
py -3.11 -m venv .venv
- Activate environment
Mac OS/Linux:
source .venv/bin/activate
Windows:
.venv\Scripts\activate
- Install requirements
pip install -r requirements.txt
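
After the steps above, a quick way to confirm that the interpreter is Python 3.11 and that the virtual environment is active is a small check like the one below. This is only a sketch; the helper names `python_matches` and `in_virtualenv` are hypothetical and not part of this repository.

```python
import sys


def python_matches(required=(3, 11), version=None):
    """Return True if the (major, minor) version equals `required`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) == tuple(required)


def in_virtualenv():
    """Return True if running inside a venv-style virtual environment.

    Inside a virtual environment, sys.prefix points to the environment,
    while sys.base_prefix points to the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python 3.11:", python_matches())
    print("Virtual environment active:", in_virtualenv())
```

Run it with the interpreter from the activated environment; both lines should report True if the setup worked.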