Commit

Select the less significant call in case of duplicate HIV calls
krassowski committed Feb 6, 2022
1 parent cf7ee22 commit ba7507f
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions website/imports/sites/infections.py
@@ -441,6 +441,15 @@ def load_sites(self, file_path='data/sites/2016_Greenwood/elife-18296-fig6-data1
            'Log2 (fold change) HIV WT vs Mock': 'effect_size'
        }, inplace=True)

        # some combinations of PTMs were called multiple times for a single peptide;
        # as we have no way to select the most likely call, we conservatively choose
        # the one which has the least significant results (to avoid inflating FDR)
        sites = (
            sites
            .sort_values('adj_p_val', ascending=False)
            .drop_duplicates(subset=['protein_accession', 'residue', 'position'], keep='first')
        )

        mapped_sites = self.process_event_associated_sites(
            sites,
            canonical=CANONICAL_PHOSPHOSITE_RESIDUES
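
The deduplication step added in this commit can be illustrated on a toy table. The sketch below is only an illustration: the accessions, positions, and p-values are made up and are not taken from the Greenwood et al. data, and it assumes `adj_p_val` is numeric. Sorting by `adj_p_val` in descending order and then keeping the first row per (protein, residue, position) retains the least significant call for each site.

```python
import pandas as pd

# Hypothetical toy data: two calls for the same site on P12345, one call for Q67890.
sites = pd.DataFrame({
    'protein_accession': ['P12345', 'P12345', 'Q67890'],
    'residue': ['S', 'S', 'T'],
    'position': [10, 10, 25],
    'adj_p_val': [0.001, 0.20, 0.03],
    'effect_size': [1.8, 0.4, -1.1],
})

# Largest adjusted p-value first, so keep='first' retains the least
# significant call among duplicates of the same site.
deduplicated = (
    sites
    .sort_values('adj_p_val', ascending=False)
    .drop_duplicates(subset=['protein_accession', 'residue', 'position'], keep='first')
)

print(deduplicated)
```

Because `sort_values` places missing values last by default (`na_position='last'`), a call with a missing `adj_p_val` would be retained only when no other call exists for that site.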
