Paper Link: https://arxiv.org/abs/2401.03069
This repository is the replication package for the project "Enhancing the Reproducibility of Deep Learning Bugs: An Empirical Study". This study is conducted under the supervision and guidance of Dr. Masud Rahman and Dr. Foutse Khomh.
Context: Deep learning has achieved remarkable progress in various domains. However, like any software system, deep learning systems contain bugs, some of which can have severe impacts, as evidenced by crashes involving autonomous vehicles. Despite substantial advancements in deep learning techniques, little research has focused on reproducing deep learning bugs, which hinders their resolution. Existing literature suggests that only 3% of machine/deep learning bugs are reproducible, underscoring the need for further research.
Objective: This paper examines the reproducibility of deep learning bugs. We identify edit actions and useful information that could improve deep learning bug reproducibility.
Method: First, we construct a dataset of 668 deep learning bugs from Stack Overflow and Defects4ML across three frameworks and 22 architectures. Second, out of the 668 bugs, we select 165 bugs using stratified sampling and attempt to determine their reproducibility. While reproducing these bugs, we identify the edit actions and useful information necessary for their reproduction. Third, we use the Apriori algorithm to identify the useful information and edit actions required to reproduce specific bug types. Finally, we conduct a user study with 22 developers to assess the effectiveness of our findings in real-life settings.
Results: We successfully reproduced 148 of the 165 bugs attempted. We identified ten edit actions and five categories of useful information that can help reproduce deep learning bugs. Based on a user study with 22 developers, our findings improved bug reproducibility by 22.92% and reduced reproduction time by 24.35%.
Conclusions: Our research addresses the critical issue of deep learning bug reproducibility. Practitioners and researchers can leverage our findings to improve deep learning bug reproducibility.
- Analysis Folder: This folder contains Jupyter notebooks for dataset analysis. The notebooks include the implementation of the Apriori algorithm, which is used to identify the critical edit actions and useful information needed to reproduce bugs.
- Dataset Folder: This directory contains the datasets for PyTorch, TensorFlow (TF), and Keras posts, along with the queries used to filter the data and retrieve specific posts. It also includes the reproducibility results with the corresponding edit actions and key bug report details, as well as the results of the user study conducted for the third research question (RQ3).
- Bugs Folder: This folder contains the collection of bugs, organized alongside the original code snippets sourced from Stack Overflow, as well as the completed code snippets for each bug. Each bug folder contains the following elements:
  - The finalized code snippet after applying the edit actions
  - The original code snippet from Stack Overflow/GitHub
  - A requirements.txt file generated to facilitate the installation of the necessary dependencies for that specific bug
- User-Study-RQ3: Contains the forms, the hint formulation technique, and the results for RQ3.
- Statistical-Tests-RQ3: Contains the GLM dataset and the code for the statistical analysis of the impact of hints on bug reproducibility.
- LLAMA3-Experiments: Contains the results of the experiment conducted in Appendix B.
- Cohen-Kappa: Contains the data for the bug reproduction and agreement analysis between the first author and an independent collaborator.
- Operating System: Windows 10 or higher
- Python Version: 3.10
- Disk Space: ~3.5 GB
- Development Environment: Visual Studio Code (VS Code)
- RAM: 16GB
- GPU: N/A
To replicate the work, follow these steps:
- Create a virtual environment using the following command:
python -m venv venv
- After creating the virtual environment, activate it (on Windows: venv\Scripts\activate; on macOS/Linux: source venv/bin/activate).
- Install the necessary dependencies for the required bug:
cd Bugs/<BugID>
pip install -r requirements.txt
- After installing the dependencies, run the following command to reproduce the bug:
python main.py &> output.txt
This runs the code that reproduces the bug and stores the results in output.txt. To check the original bug report, open Dataset_Manual_Reproduction.csv and find the Stack Overflow post for the corresponding Bug ID. This lets you verify the output of the reproduced bug against the original error message.
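If you want to automate this check, a small helper along the following lines can scan output.txt for the error reported in the original post. This is a hypothetical sketch, not a script shipped with this package; the bug folder layout follows the steps above, and the expected error string must be taken from the original bug report.

import pathlib
import sys

# Hypothetical helper (not part of this package): checks whether the
# reproduced output contains the error reported in the original post.
bug_id, expected_error = sys.argv[1], sys.argv[2]  # e.g. a folder name under Bugs/ and "InvalidArgumentError"
output = pathlib.Path("Bugs") / bug_id / "output.txt"

text = output.read_text(errors="ignore")
if expected_error in text:
    print(f"Bug {bug_id}: reproduced (found '{expected_error}')")
else:
    print(f"Bug {bug_id}: expected error not found; inspect {output} manually")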
To reproduce all the bugs, download the data and dependencies for the bugs, and run the following commands:
cd Bugs
python script.py
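Conceptually, the batch run visits each bug folder and captures the output of its main.py, mirroring the single-bug command above. The sketch below illustrates such a loop; it is an illustration only, and script.py in this repository is the actual implementation, whose internals may differ.

import pathlib
import subprocess

# Illustrative batch runner (hypothetical; see script.py for the actual one).
# Run from inside the Bugs folder.
for bug_dir in sorted(p for p in pathlib.Path(".").iterdir() if p.is_dir()):
    if not (bug_dir / "main.py").exists():
        continue
    with (bug_dir / "output.txt").open("w") as out:
        # Capture stdout and stderr together, like `python main.py &> output.txt`.
        subprocess.run(["python", "main.py"], cwd=bug_dir, stdout=out, stderr=subprocess.STDOUT)
    print(f"{bug_dir.name}: output written to output.txt")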
To analyze the results and run the Apriori implementation, open the respective Jupyter notebook in the Analysis folder and run its cells sequentially.
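For readers unfamiliar with the technique, the sketch below shows how Apriori-style association rule mining links items such as bug types and edit actions. It assumes the mlxtend library and uses made-up transactions rather than the study's data; the notebooks in the Analysis folder contain the actual implementation.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Made-up transactions: each row pairs a bug type with the edit actions
# applied while reproducing it (illustrative only, not the study's data).
transactions = [
    ["API Bug", "Version Migration", "Import Addition"],
    ["API Bug", "Version Migration"],
    ["Training Bug", "Input Data Generation", "Hyperparameter Initialization"],
    ["Training Bug", "Input Data Generation"],
]

encoder = TransactionEncoder()
onehot = encoder.fit(transactions).transform(transactions)
df = pd.DataFrame(onehot, columns=encoder.columns_)

# Mine frequent itemsets, then derive rules such as {API Bug} -> {Version Migration}.
itemsets = apriori(df, min_support=0.4, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.8)
print(rules[["antecedents", "consequents", "support", "confidence"]])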
This project is licensed under the MIT License, a permissive open-source license that allows others to use, modify, and distribute the project's code with very few restrictions. Researchers can therefore build on the code to create new tools, experiments, or projects, and adapt it to their specific needs without worrying about legal implications. This promotes collaboration and the sharing of ideas, and can increase the visibility and adoption of the project, attracting more researchers to use and contribute to it.