joansaurina/Protein_Structure_Prediction

Benchmarking Diffusion Models for Monomeric Protein Structure Prediction

This repository contains the code and datasets used in our study on systematically benchmarking state-of-the-art AI-powered diffusion models for monomeric protein structure prediction. Our analysis focuses on three leading models—AlphaFold 3, Protenix, and Chai-1—evaluating their accuracy, robustness, and ability to detect subtle 3D structural variations in unseen protein structures.

[Figure: overview of the project]

Overview

Protein structure prediction is crucial for advancing computational biology, with applications ranging from drug design to understanding biological mechanisms. Recent advancements in diffusion models have revolutionized the field, enabling faster and more accurate predictions. This project explores:

  • The predictive performance of AlphaFold 3, Protenix, and Chai-1 on test sets of unseen proteins.
  • Sensitivity of AlphaFold 2 to single amino acid mutations, analyzing local structural deformation.
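  • Comparative performance across difficulty levels using pLDDT (predicted local distance difference test), PAE (predicted aligned error), RMSD (root-mean-square deviation), and pTM (predicted template modeling score).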
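As an illustration of the structural-deviation metric named above, here is a minimal sketch of RMSD between a predicted and a reference structure. It assumes the two coordinate lists are already superposed and matched atom by atom; it does not perform the optimal alignment step (for example, the Kabsch algorithm) that a full pipeline would normally run first. The function name `rmsd` is hypothetical and is not taken from this repository's code.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) tuples.

    Assumption: both structures are already superposed and the atoms are in
    the same order. No alignment is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

For example, `rmsd(reference_ca, predicted_ca)` with two lists of C-alpha coordinates gives a single value in the same units as the input (ångströms for PDB files).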

Key Features

  • Dataset: Includes newly released protein structures from CAMEO (2024) and experimental datasets (https://zenodo.org/records/10013253) with mutation-specific comparisons.
  • Metrics: Comprehensive evaluation using confidence scores, alignment errors, and structural deviations.
  • Code: Modular implementation for benchmarking diffusion-based prediction models.
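One of the confidence scores mentioned above, pLDDT, can be summarized per structure. The sketch below assumes the prediction is available as PDB-format text with per-residue pLDDT stored in the B-factor column, as AlphaFold-style PDB files do. Models that write mmCIF output would need a different parser. The function name `mean_plddt_from_pdb` is hypothetical and not part of this repository.

```python
def mean_plddt_from_pdb(pdb_text):
    """Average the B-factor column over C-alpha atoms in PDB-format text.

    Assumption: the B-factor field (columns 61-66) holds pLDDT, which is the
    convention in AlphaFold-style PDB output.
    """
    values = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB fields: atom name in columns 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no C-alpha ATOM records found")
    return sum(values) / len(values)
```

Restricting the average to C-alpha atoms gives one value per residue, so long side chains do not weigh more than short ones.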

Metrics Analysis

This repository is designed for detailed analysis of metrics using the available data and tools. Follow the steps below to install dependencies and run the Jupyter notebook.

Requirements

To install the necessary dependencies, make sure pip is available in your environment (a conda environment works as well), then run the following command:

With pip

pip install -r requirements.txt

Running the Notebook

  1. Download the dataset from this link, unzip it, and place it in the data folder.

  2. Download the dataset from this link, extract the "PDB" folder, and place it in the data folder.

  3. Install the required dependencies. Make sure you have all the prerequisites set up, as outlined in the requirements.txt file.

  4. Open the playground.ipynb notebook in the Jupyter web interface.

  5. Run the cells in the notebook to analyze the metrics and explore the results. Each cell is designed to guide you through the analysis step by step.
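Before opening the notebook, it may help to confirm that the files and folders from the steps above are in place. The following is a small sketch, not part of the repository. It assumes that `requirements.txt` and `playground.ipynb` sit in the repository root and that the extracted "PDB" folder ends up at `data/PDB`. The function name `missing_inputs` is hypothetical.

```python
from pathlib import Path


def missing_inputs(repo_root="."):
    """Return the names of expected files and folders that are not present.

    Assumption: the layout below mirrors the setup steps in this README;
    adjust the paths if your checkout is organized differently.
    """
    root = Path(repo_root)
    expected = {
        "requirements.txt": root / "requirements.txt",
        "playground.ipynb": root / "playground.ipynb",
        "data": root / "data",
        "data/PDB": root / "data" / "PDB",
    }
    return [name for name, path in expected.items() if not path.exists()]
```

Calling `missing_inputs()` from the repository root returns an empty list when everything is in place, and otherwise lists what still needs to be downloaded or moved.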
