Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for DSSP version and add rsa node features to residues with insertion codes #355

Merged
merged 30 commits into from
Mar 31, 2024

Conversation

biochunan
Copy link
Contributor

@biochunan biochunan commented Nov 3, 2023

Reference Issues/PRs

Fixes Issue #353
Fixes Issue #354

What does this implement/fix? Explain your changes

For Issue #353, this PR added a line to determine dssp version, which is passed to the function Bio.PDB.DSSP.dssp_dict_from_pdb_file otherwise, biopython will use default version number 3.9.9 and for users with dssp version >= 4.0.0, this will lead to an empty dssp DataFrame

For Issue #354, this modification fixes errors caused by missing rsa feature from nodes with insertion code. This is caused by skipping insertions when creating node_id. This modifications adds insertion codes to node_id if insertions is set to True in ProteinGraphConfiguration.

What testing did you do to verify the changes in this PR?

Added a script test_dssp.py to tests/features
The example input pdb file (with cryst1 line and insertion codes) is attached here: input_pdb_cryst1.pdb.gz

File with suffix .pdb is not supported by GitHub, thus the compressed version. Need to uncompress it first before running the test.

Pull Request Checklist

  • Added a note about the modification or contribution to the ./CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./graphein/tests/* directories (if applicable)
  • Modify documentation in the corresponding Jupyter Notebook under ./notebooks/ (if applicable)
  • Ran python -m py.test tests/ and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., python -m py.test tests/protein/test_graphs.py)
  • Checked for style issues by running black . and isort .

@a-r-j
Copy link
Owner

a-r-j commented Nov 6, 2023

Thanks @biochunan LGTM! Only comment re: pdb files - I'm quite sure these can be uploaded uncompressed. See: https://github.com/a-r-j/graphein/tree/master/tests/protein/test_data

Copy link
Owner

@a-r-j a-r-j left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will merge once tests pass. I think uploading the PDB (or using an existing file in test_data/ should work).

@a-r-j
Copy link
Owner

a-r-j commented Nov 6, 2023

On second thought, it's likely the .gitignore that's preventing you from adding the .pdb file. Try: git add input_pdb_cryst1.pdb --force.

@biochunan
Copy link
Contributor Author

Hi @a-r-j thanks for going through the PR.
For the pdb file, I was trying to upload it through the PR window and GitHub doesn't support files with the suffix pdb.
Will use existing pdb files in ./test_data in the future.

@biochunan
Copy link
Contributor Author

I had a look at the three pdb files in tests/protein/test_data, and they do not have insertion code. Maybe it's a good idea to add a PDB file with insertion codes for insertions related testing.

Copy link

sonarqubecloud bot commented Nov 8, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 2 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

Copy link

Quality Gate Passed Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 90.00000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 45.07%. Comparing base (8123f42) to head (26cfdbb).
Report is 166 commits behind head on master.

Files Patch % Lines
graphein/protein/features/nodes/dssp.py 90.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #355      +/-   ##
==========================================
+ Coverage   40.27%   45.07%   +4.80%     
==========================================
  Files          48      113      +65     
  Lines        2811     7916    +5105     
==========================================
+ Hits         1132     3568    +2436     
- Misses       1679     4348    +2669     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@a-r-j a-r-j merged commit f463464 into a-r-j:master Mar 31, 2024
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants