nicwulab/SARS2_FP_DMS

Studying the mutational fitness effects of SARS-CoV-2 spike fusion peptide

Dependencies

Input files

Primer design for DMS library construction

  1. Generating forward (NNK + internal barcode) and reverse primers (constant)
    python3 script/lib_primer_design.py

  2. Generating barcode file
    python3 script/check_barcode.py
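
As a rough illustration of the NNK scanning idea in step 1, the sketch below builds one forward primer per codon by replacing that codon with `NNK`. It is not the logic of `script/lib_primer_design.py`: the function name `design_nnk_primers` and the flanking and coding sequences are hypothetical placeholders, and the internal barcode is omitted.

```python
def design_nnk_primers(upstream, coding, downstream):
    """Return one forward primer per codon, with that codon replaced by NNK.

    Hypothetical helper for illustration only; the internal barcode used in
    the actual library design is not modeled here.
    """
    if len(coding) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    primers = []
    for i in range(0, len(coding), 3):
        # Swap the codon starting at offset i for the degenerate codon NNK
        mutated = coding[:i] + "NNK" + coding[i + 3:]
        primers.append(upstream + mutated + downstream)
    return primers


# Placeholder sequences, not the sequences used in the study
primers = design_nnk_primers("GGATCC", "TCTTTTATTGAA", "CTCGAG")
for primer in primers:
    print(primer)
```

NNK (N = A/C/G/T, K = G/T) encodes all 20 amino acids while allowing only one stop codon (TAG), which is why it is a common choice for saturation mutagenesis.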

Calculating fitness from DMS data

  1. Merge overlapping paired-end reads using PEAR
    pear -f [FASTQ FILE FOR FORWARD READ] -r [FASTQ FILE FOR REVERSE READ] -o [OUTPUT FASTQ FILE]

  2. Counting variants based on nucleotide sequences
    python3 script/FP_fastq2count.py

    • Input files:
      • Merged read files in fastq_merged/ folder
    • Output files:
      • result/FP_DMS_count_nuc.tsv
  3. Convert nucleotide sequences to amino acid mutations
    python3 script/FP_count_nuc2aa.py

  4. Convert nucleotide sequences to codon variants
    python3 script/FP_count_nuc2codon.py

  5. Compute fitness
    python3 script/FP_count2fit.py

  6. Replace the B factor in the PDB file with the mean fitness value
    python3 script/convert_Bfactor_to_fit.py
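
The steps above (nucleotide counts, amino acid mutations, fitness) can be sketched roughly as follows. This is a minimal illustration, not the code in `script/FP_count_nuc2aa.py` or `script/FP_count2fit.py`. It assumes fitness is a log10 enrichment ratio normalized to wild type with a pseudocount of 1, which is a common DMS convention but is not stated in this README. All function names, sequences, positions, and counts are hypothetical.

```python
import math

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nuc):
    """Translate an in-frame nucleotide sequence into amino acids."""
    return "".join(CODON_TABLE[nuc[i:i + 3]] for i in range(0, len(nuc), 3))


def call_mutations(wt_aa, variant_aa, start_pos=1):
    """List amino acid differences as e.g. 'F2A' (start_pos is an assumed numbering offset)."""
    return [
        f"{w}{start_pos + i}{v}"
        for i, (w, v) in enumerate(zip(wt_aa, variant_aa))
        if w != v
    ]


def fitness(in_mut, out_mut, in_wt, out_wt, pseudocount=1):
    """Assumed definition: log10 of the variant enrichment relative to wild type."""
    enrich_mut = (out_mut + pseudocount) / (in_mut + pseudocount)
    enrich_wt = (out_wt + pseudocount) / (in_wt + pseudocount)
    return math.log10(enrich_mut / enrich_wt)


# Placeholder sequences and counts for illustration only
wt_nuc = "TCTTTTATTGAA"
variant_nuc = "TCTGCTATTGAA"
mutations = call_mutations(translate(wt_nuc), translate(variant_nuc))
variant_fitness = fitness(in_mut=100, out_mut=50, in_wt=1000, out_wt=2000)
print(mutations, round(variant_fitness, 3))
```

Under this definition wild type has a fitness of 0, and variants depleted relative to wild type have negative values.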

Plotting

  1. Plot correlation between replicates and compare silent/missense/nonsense
    Rscript script/plot_QC.R

  2. Plot correlation between fitness measurements in this study and those in previous studies
    python3 script/plot_cor_measures.py

  3. Plot heatmap for the fitness of individual mutations
    Rscript script/plot_heatmap_fit.R

  4. Plot mean fitness (i.e. mutational tolerance) of individual residue positions
    Rscript script/plot_mean_fit.R

  5. Plot mean fitness on structure
    pymol script/plot_Bfactor_as_fit.pml

  6. Plot heatmap for the antibody escape of individual mutations
    Rscript script/plot_heatmap_escape.R

  7. Plot heatmap for the codon variants
    Rscript script/plot_heatmap_codon_freq.R
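
The mean fitness per residue position plotted in step 4 can be thought of as an average over the missense mutations at each position. The sketch below is an assumption about that aggregation, not the logic of `script/plot_mean_fit.R`: excluding nonsense and silent variants is a choice made here for illustration, and the mutation labels and fitness values are made up.

```python
from collections import defaultdict
from statistics import mean


def mean_fitness_by_position(fitness_by_mutation):
    """Average fitness over missense mutations at each position.

    Keys look like 'F2A' (wild type, position, mutant). Nonsense ('*') and
    silent entries are skipped; that exclusion is an assumption.
    """
    per_position = defaultdict(list)
    for mutation, fit in fitness_by_mutation.items():
        wt, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
        if alt == "*" or alt == wt:
            continue
        per_position[pos].append(fit)
    return {pos: mean(values) for pos, values in sorted(per_position.items())}


# Made-up fitness values for illustration only
example = {"F2A": -0.5, "F2L": -0.1, "F2*": -2.0, "I3V": 0.1}
tolerance = mean_fitness_by_position(example)
print(tolerance)
```

Positions with a mean close to the wild-type level tolerate substitutions well, while strongly negative means indicate mutationally constrained positions.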

Analyze mutation rate

  1. Identify the number of amino acid mutations in each merged read
    python3 script/analyze_mut_rate.py

  2. Plot mutation rate
    Rscript script/plot_lib_mut_count.R
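
Counting amino acid mutations per read amounts to comparing each translated read with the wild-type sequence position by position and tallying the results. The sketch below illustrates that idea with made-up sequences. It is not the code in `script/analyze_mut_rate.py`, and the function name `count_aa_mutations` is hypothetical.

```python
from collections import Counter


def count_aa_mutations(wt_aa, read_aa):
    """Number of positions where the translated read differs from wild type."""
    if len(wt_aa) != len(read_aa):
        raise ValueError("read and wild type must have the same length")
    return sum(1 for w, r in zip(wt_aa, read_aa) if w != r)


# Made-up translated reads for illustration only
wild_type = "SFIE"
reads = ["SFIE", "SAIE", "SAIK", "SFIE"]

# Map from number of mutations to number of reads carrying that many
distribution = Counter(count_aa_mutations(wild_type, read) for read in reads)
for n_mut, n_reads in sorted(distribution.items()):
    print(n_mut, n_reads)
```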