HW4_Gorbarenko #14
Changes from all commits
3c63aff
a82ffed
445fde1
d13d869
96321d8
ef1e6fd
6182c31
f6a435a
dbad46c
c937377
f13125a
9efbeda
42d6fa4
787d5ca
d241035
eedebb3
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,174 @@ | ||
# Welcome to amino_analyzer tool | ||
|
||
## Overview | ||
The amino_analyzer is an easy-to-use Python tool designed to facilitate the comprehensive analysis of protein sequences. It provides a broad functionality from basic checks for valid amino acid sequences to more complicated computations like molecular weights, hydrophobicity analysis, and cleavage site identification. | ||
|
||
## :green_heart: Key features | ||
|
||
### 1. Protein molecular weight calculation | ||
The amino_analyzer offers the capability to calculate the molecular weight of a protein sequence. Users can choose between average and monoisotopic weights. | ||
### 2. Hydrophobicity analysis | ||
This function counts the quantity of hydrophobic and hydrophilic amino acids within a protein sequence. | ||
### 3. Cleavage site identification | ||
Researchers can identify cleavage sites in a given peptide sequence using a specified enzyme. The tool currently supports two commonly used enzymes, trypsin and chymotrypsin. | ||
### 4. One-letter to three-Letter code conversion | ||
The amino_analyzer provides a function to convert a protein sequence from the standard one-letter amino acid code to the three-letter code. | ||
### 5. Sulphur-containing amino acid counting | ||
The tool quickly determines the number of sulphur-containing amino acids, namely Cysteine (C) and Methionine (M), within a protein sequence. | ||
|
||
## Usage | ||
|
||
To run amino_analyzer tool you need to use the function ***run_amino_analyzer*** with the following arguments: | ||
|
||
```python | ||
from amino_analyzer import run_amino_analyzer | ||
run_amino_analyzer(sequence, procedure, *, weight_type: str = 'average', enzyme: str = 'trypsin') | ||
``` | ||
|
||
- `sequence (str):` The input protein sequence in one-letter code. | ||
- `procedure (str):` The procedure to perform over your protein sequence. | ||
- `weight_type: str = 'average':` default argument for `aa_weight` function. `weight_type = 'monoisotopic'` can be used as another option. | ||
- `enzyme: str = 'trypsin':` default argument for `peptide_cutter` function. `enzyme = 'chymotrypsin'` can be used as another option. | ||
|
||
|
||
**Available procedures list** | ||
- `aa_weight` - calculates the amino acids weight in a protein sequence. | ||
- `count_hydroaffinity` - counts the quantity of hydrophobic and hydrophilic amino acids in a protein sequence. | ||
- `peptide_cutter` - identifies cleavage sites in a given peptide sequence using a specified enzyme (trypsin or chymotrypsin). | ||
- `one_to_three_letter_code` - converts a protein sequence from one-letter amino acid code to three-letter code. | ||
- `sulphur_containing_aa_counter` - counts sulphur-containing amino acids in a protein sequence. | ||
|
||
You can also use each function separately by importing them in advance. Below are the available functions and their respective purposes: | ||
|
||
#### 1. **aa_weight** function calculates the weight of amino acids in a protein sequence: | ||
The `weight` argument sets the type of weight to use, either `average` or `monoisotopic`. Default is `average`. | ||
```python | ||
from amino_analyzer import aa_weight | ||
aa_weight(seq: str, weight: str = 'average') -> float | ||
``` | ||
```python | ||
sequence = "VLDQRKSTMA" | ||
result = aa_weight(sequence, weight='monoisotopic') | ||
print(result) # Output: 1129.591 | ||
``` | ||
|
||
#### 2. **count_hydroaffinity** counts the quantity of hydrophobic and hydrophilic amino acids in a protein sequence: | ||
```python | ||
from amino_analyzer import count_hydroaffinity | ||
count_hydroaffinity(seq: str) -> tuple | ||
``` | ||
```python | ||
sequence = "VLDQRKSTMA" | ||
result = count_hydroaffinity(sequence) | ||
print(result) # Output: (4, 6) | ||
``` | ||
#### 3. **peptide_cutter** function identifies cleavage sites in a given peptide sequence using a specified enzyme, trypsin or chymotrypsin: | ||
```python | ||
from amino_analyzer import peptide_cutter | ||
peptide_cutter(sequence: str, enzyme: str = "trypsin") -> str | ||
``` | ||
```python | ||
sequence = "VLDQRKSTMA" | ||
result = peptide_cutter(sequence, enzyme="trypsin") | ||
print(result) # Output: Found 2 trypsin cleavage sites at positions 5, 6 | ||
``` | ||
#### 4. **one_to_three_letter_code** converts a protein sequence from one-letter amino acid code to three-letter code. | ||
```python | ||
from amino_analyzer import one_to_three_letter_code | ||
one_to_three_letter_code(sequence: str) -> str | ||
``` | ||
|
||
```python | ||
sequence = "VLDQRKSTMA" | ||
result = one_to_three_letter_code(sequence) | ||
print(result) # Output: ValLeuAspGlnArgLysSerThrMetAla | ||
``` | ||
|
||
#### 5. **sulphur_containing_aa_counter** counts sulphur-containing amino acids in a protein sequence | ||
```python | ||
from amino_analyzer import sulphur_containing_aa_counter | ||
sulphur_containing_aa_counter(sequence: str) -> str | ||
``` | ||
```python | ||
sequence = "VLDQRKSTMA" | ||
result = sulphur_containing_aa_counter(sequence) | ||
print(result) # Output: The number of sulphur-containing amino acids in the sequence is equal to 1 | ||
``` | ||
|
||
## Examples | ||
To calculate protein molecular weight: | ||
```python | ||
run_amino_analyzer("VLSPADKTNVKAAW", "aa_weight") # Output: 1481.715 | ||
|
||
run_amino_analyzer("VLSPADKTNVKAAW", "aa_weight", weight_type = 'monoisotopic') # Output: 1480.804 | ||
``` | ||
|
||
To count hydroaffinity: | ||
```python | ||
run_amino_analyzer("VLSPADKTNVKAAW", "count_hydroaffinity") # Output: (8, 6) | ||
``` | ||
|
||
To find trypsin/chymotrypsin cleavage sites: | ||
```python | ||
run_amino_analyzer("VLSPADKTNVKAAW", "peptide_cutter") # Output: 'Found 2 trypsin cleavage sites at positions 7, 11' | ||
|
||
run_amino_analyzer("VLSPADKTNVKAAWW", "peptide_cutter", enzyme = 'chymotrypsin') # Output: 'Found 1 chymotrypsin cleavage sites at positions 14' | ||
``` | ||
|
||
To convert to three-letter code and count sulphur-containing amino acids: | ||
```python | ||
run_amino_analyzer("VLSPADKTNVKAAW", "one_to_three_letter_code") # Output: 'ValLeuSerProAlaAspLysThrAsnValLysAlaAlaTrp' | ||
|
||
run_amino_analyzer("VLSPADKTNVKAAWM", "sulphur_containing_aa_counter") # Output: The number of sulphur-containing amino acids in the sequence is equal to 1 | ||
``` | ||
|
||
## Troubleshooting | ||
Here are some common issues you can come across while using the amino-analyzer tool and their possible solutions: | ||
|
||
1. **ValueError: Incorrect procedure** | ||
If you receive this error, it means that you provided an incorrect procedure when calling `run_amino_analyzer`. Make sure you choose one of the following procedures: `aa_weight`, `count_hydroaffinity`, `peptide_cutter`, `one_to_three_letter_code`, or `sulphur_containing_aa_counter`. | ||
|
||
Example: | ||
```python | ||
run_amino_analyzer("VLSPADKTNVKAAW", "incorrect_procedure") | ||
# Output: ValueError: Incorrect procedure. Acceptable procedures: aa_weight, count_hydroaffinity, peptide_cutter, one_to_three_letter_code, sulphur_containing_aa_counter | ||
``` | ||
|
||
2. **ValueError: Incorrect sequence** | ||
This error occurs if the input sequence provided to run_amino_analyzer contains characters that are not valid amino acids. Make sure your sequence only contains valid amino acid characters (V, I, L, E, Q, D, N, H, W, F, Y, R, K, S, T, M, A, G, P, C, v, i, l, e, q, d, n, h, w, f, y, r, k, s, t, m, a, g, p, c). | ||
|
||
Example: | ||
```python | ||
run_amino_analyzer("VLSPADKTNVKAAW!", "aa_weight") | ||
# Output: ValueError: Incorrect sequence. Only amino acids are allowed (V, I, L, E, Q, D, N, H, W, F, Y, R, K, S, T, M, A, G, P, C, v, i, l, e, q, d, n, h, w, f, y, r, k, s, t, m, a, g, p, c). | ||
``` | ||
|
||
3. **You have chosen an enzyme that is not provided** | ||
`peptide_cutter` returns this message (it is a returned string, not a raised exception) if you provide an enzyme other than "trypsin" or "chymotrypsin". Make sure to use one of the specified enzymes. | ||
|
||
Example: | ||
```python | ||
peptide_cutter("VLSPADKTNVKAAW", "unknown_enzyme") | ||
# Output: You have chosen an enzyme that is not provided. Please choose between trypsin and chymotrypsin. | ||
``` | ||
4. **AttributeError: 'int' object has no attribute 'upper'** | ||
If you encounter this error, it means that you passed a non-string value (for example, an integer) as the sequence. Ensure that you're using the correct function and passing the correct arguments. | ||
|
||
Example: | ||
```python | ||
result = count_hydroaffinity(123) | ||
# Output: AttributeError: 'int' object has no attribute 'upper' | ||
``` | ||
## Development team: | ||
![image](https://github.com/CaptnClementine/HW4_Gorbarenko/assets/131146976/ad89e427-5b2a-4b32-b65f-519d284fcaa7) | ||
|
||
**Anastasia Gorbarenko** - team leader, author of aa_weight and count_hydroaffinity functions | ||
**Anna Ogurtsova** - author of peptide_cutter and one_to_three_letter_code functions | ||
**Ilya Popov** - author of main and sulphur_containing_aa_counter functions | ||
|
||
## Contacts | ||
If you have any questions, suggestions, or encounter any issues while using the amino-analyzer tool, feel free to reach out: | ||
|
||
|
||
- **GitHub**: [Cucumberan](https://github.com/YourGitHubUsername), [CaptnClementine](https://github.com/YourGitHubUsername), [iliapopov17](https://github.com/YourGitHubUsername) | ||
|
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,196 @@ | ||||||||||
from typing import List | ||||||||||
|
||||||||||
def is_aa(seq: str) -> bool: | ||||||||||
""" | ||||||||||
Check if a sequence contains only amino acids. | ||||||||||
|
||||||||||
Args: | ||||||||||
seq (str): The input sequence to be checked. | ||||||||||
|
||||||||||
|
||||||||||
Returns: | ||||||||||
bool: True if the sequence contains only amino acids, False otherwise. | ||||||||||
""" | ||||||||||
aa_list = ['V', 'I', 'L', 'E', 'Q', 'D', 'N', 'H', 'W', 'F', 'Y', 'R', 'K', 'S', 'T', 'M', 'A', 'G', 'P', 'C', | ||||||||||
'v', 'i', 'l', 'e', 'q', 'd', 'n', 'h', 'w', 'f', 'y', 'r', 'k', 's', 't', 'm', 'a', 'g', 'p', 'c'] | ||||||||||
Review comment on lines +13 to +14: It would be better to make this a constant.
||||||||||
unique_chars = set(seq) | ||||||||||
amino_acids = set(aa_list) | ||||||||||
Review comment: In that case it is better to define a set right away, since you create it yourselves.
||||||||||
return unique_chars <= amino_acids | ||||||||||
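The two review comments on `is_aa` (use a constant, and store it as a set from the start) could be combined roughly as follows. This is only a sketch of the suggestion, not the code merged in the PR; the constant name `AMINO_ACIDS` is an assumption.

```python
# Hypothetical module-level constant (the name is an assumption).
# Both cases are kept so the behaviour matches the original aa_list.
AMINO_ACIDS = frozenset("VILEQDNHWFYRKSTMAGPC" + "vileqdnhwfyrkstmagpc")


def is_aa(seq: str) -> bool:
    """Return True if every character of seq is a one-letter amino acid code."""
    # set(seq) collapses repeated characters; <= is the subset test.
    return set(seq) <= AMINO_ACIDS
```

Building the set once at import time avoids recreating the list and the set on every call.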
|
||||||||||
|
||||||||||
|
||||||||||
def choose_weight(weight: str) -> List[float]: | ||||||||||
""" | ||||||||||
Choose the weight type of amino acids - average or monoisotopic. | ||||||||||
|
||||||||||
Args: | ||||||||||
weight (str): The type of weight to choose, either 'average' or 'monoisotopic'. | ||||||||||
|
||||||||||
Returns: | ||||||||||
List[float]: A list of amino acid weights based on the chosen type. | ||||||||||
""" | ||||||||||
if weight == 'average': | ||||||||||
average_weights = [71.0788, 156.1875, 114.1038, 115.0886, 103.1388, 129.1155, 128.1307, 57.0519, 137.1411, 113.1594, | ||||||||||
113.1594, 128.1741, 131.1926, 147.1766, 97.1167, 87.0782, 101.1051, 186.2132, 163.1760, 99.1326] | ||||||||||
Review comment on lines +32 to +33:
Author reply: The simplest way is just to build the lists first and then the dictionary :) (lists in the right order can be found online, but dictionaries cannot :D)
Reviewer reply: Look, I googled a dictionary of weights and found the right one in the very first link: https://stackoverflow.com/questions/27361877/how-i-can-add-the-molecular-weight-of-my-new-list-of-strings-using-a-mw-dictiona . With lists it is harder to check later that there is no mistake anywhere. The second comment was about the fact that, for example, in the `if` branch you first create the list `average_weights` and on the next line create the variable `weights_aa = average_weights`. There is simply no need to create the separate variables `average_weights` and `monoisotopic_weights`.
||||||||||
weights_aa = average_weights | ||||||||||
elif weight == 'monoisotopic': | ||||||||||
monoisotopic_weights = [71.03711, 156.10111, 114.04293, 115.02694, 103.00919, 129.04259, 128.05858, 57.02146, 137.05891, 113.08406, | ||||||||||
113.08406, 128.09496, 131.04049, 147.06841, 97.05276, 87.03203, 101.04768, 186.07931, 163.06333, 99.06841] | ||||||||||
weights_aa = monoisotopic_weights | ||||||||||
else: | ||||||||||
raise ValueError(f"I do not know what '{weight}' is :( \n Read help or just do not write anything except your sequence") | ||||||||||
|
||||||||||
return weights_aa | ||||||||||
|
||||||||||
|
||||||||||
def aa_weight(seq: str, weight: str = 'average') -> float: | ||||||||||
""" | ||||||||||
Calculate the amino acids weight in a protein sequence. | ||||||||||
|
||||||||||
Args: | ||||||||||
seq (str): The amino acid sequence to calculate the weight for. | ||||||||||
weight (str, optional): The type of weight to use, either 'average' or 'monoisotopic'. Default is 'average'. | ||||||||||
|
||||||||||
Returns: | ||||||||||
float: The calculated weight of the amino acid sequence. | ||||||||||
""" | ||||||||||
aa_list = str('A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V').split(', ') | ||||||||||
Review comment: Why write it this way? You take a string, convert it to a string again, and then turn the string into a list. It is better to create the list directly and declare it as a constant (and move it out of the function).
|
||||||||||
weights_aa = choose_weight(weight) | ||||||||||
aa_to_weight = dict(zip(aa_list, weights_aa)) | ||||||||||
final_weight = 0 | ||||||||||
for i in seq.upper(): | ||||||||||
final_weight += aa_to_weight.get(i, 0) | ||||||||||
Review comment: The `get` method is not needed here. It is better to simply look the value up in the dictionary.
|
||||||||||
return round(final_weight, 3) | ||||||||||
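The reviewer's points about `choose_weight` and `aa_weight` (store the weights as dictionaries, drop the intermediate variables, index the dictionary directly) might look like the sketch below. The constant names are assumptions; the values are the ones from the diff, paired with the residue order A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V used in `aa_weight`.

```python
# Hypothetical constants (names are assumptions); values copied from the diff.
AVERAGE_WEIGHTS = {
    'A': 71.0788, 'R': 156.1875, 'N': 114.1038, 'D': 115.0886, 'C': 103.1388,
    'E': 129.1155, 'Q': 128.1307, 'G': 57.0519, 'H': 137.1411, 'I': 113.1594,
    'L': 113.1594, 'K': 128.1741, 'M': 131.1926, 'F': 147.1766, 'P': 97.1167,
    'S': 87.0782, 'T': 101.1051, 'W': 186.2132, 'Y': 163.1760, 'V': 99.1326,
}
MONOISOTOPIC_WEIGHTS = {
    'A': 71.03711, 'R': 156.10111, 'N': 114.04293, 'D': 115.02694, 'C': 103.00919,
    'E': 129.04259, 'Q': 128.05858, 'G': 57.02146, 'H': 137.05891, 'I': 113.08406,
    'L': 113.08406, 'K': 128.09496, 'M': 131.04049, 'F': 147.06841, 'P': 97.05276,
    'S': 87.03203, 'T': 101.04768, 'W': 186.07931, 'Y': 163.06333, 'V': 99.06841,
}
WEIGHTS_BY_TYPE = {'average': AVERAGE_WEIGHTS, 'monoisotopic': MONOISOTOPIC_WEIGHTS}


def aa_weight(seq: str, weight: str = 'average') -> float:
    """Sum the residue weights of seq for the chosen weight type."""
    if weight not in WEIGHTS_BY_TYPE:
        raise ValueError(f"Unknown weight type '{weight}'. Use 'average' or 'monoisotopic'.")
    weights = WEIGHTS_BY_TYPE[weight]
    # Direct indexing: an unexpected character raises KeyError instead of
    # silently contributing 0, as aa_to_weight.get(i, 0) did.
    return round(sum(weights[aa] for aa in seq.upper()), 3)


print(aa_weight("VLSPADKTNVKAAW"))                  # 1481.715
print(aa_weight("VLSPADKTNVKAAW", "monoisotopic"))  # 1480.804
```

Keeping each residue next to its weight makes a misplaced value much easier to spot than in two parallel lists.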
|
||||||||||
|
||||||||||
def count_hydroaffinity(seq: str) -> tuple: | ||||||||||
""" | ||||||||||
Count the quantity of hydrophobic and hydrophilic amino acids in a protein sequence. | ||||||||||
|
||||||||||
Args: | ||||||||||
seq (str): The protein sequence for which to count hydrophobic and hydrophilic amino acids. | ||||||||||
|
||||||||||
Returns: | ||||||||||
tuple: A tuple containing the count of hydrophobic and hydrophilic amino acids, respectively. | ||||||||||
""" | ||||||||||
hydrophobic_aa = ['A', 'V', 'L', 'I', 'P', 'F', 'W', 'M'] | ||||||||||
hydrophilic_aa = ['R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'K', 'S', 'T', 'Y'] | ||||||||||
Review comment on lines +75 to +76: It would be better to make these constants.
||||||||||
|
||||||||||
hydrophobic_count = 0 | ||||||||||
hydrophilic_count = 0 | ||||||||||
|
||||||||||
seq = seq.upper() | ||||||||||
|
||||||||||
for aa in seq: | ||||||||||
if aa in hydrophobic_aa: | ||||||||||
hydrophobic_count += 1 | ||||||||||
elif aa in hydrophilic_aa: | ||||||||||
hydrophilic_count += 1 | ||||||||||
|
||||||||||
return hydrophobic_count, hydrophilic_count | ||||||||||
|
||||||||||
|
||||||||||
def peptide_cutter(sequence: str, enzyme: str = "trypsin") -> str: | ||||||||||
""" | ||||||||||
This function identifies cleavage sites in a given peptide sequence using a specified enzyme. | ||||||||||
|
||||||||||
Args: | ||||||||||
sequence (str): The input peptide sequence. | ||||||||||
enzyme (str): The enzyme to be used for cleavage. Choose between "trypsin" and "chymotrypsin". Default is "trypsin". | ||||||||||
|
||||||||||
Returns: | ||||||||||
str: A message indicating the number and positions of cleavage sites, or an error message if an invalid enzyme is provided. | ||||||||||
""" | ||||||||||
cleavage_sites = [] | ||||||||||
if enzyme not in ("trypsin", "chymotrypsin"): | ||||||||||
return "You have chosen an enzyme that is not provided. Please choose between trypsin and chymotrypsin." | ||||||||||
|
||||||||||
if enzyme == "trypsin": # Trypsin cuts peptide chains mainly at the carboxyl side of the amino acids lysine or arginine. | ||||||||||
for i in range(len(sequence)-1): | ||||||||||
Review comment: Pay more attention to naming. `i` is a poor name here.
|
||||||||||
if sequence[i] in ['K', 'R', 'k', 'r'] and sequence[i+1] not in ['P','p']: | ||||||||||
cleavage_sites.append(i+1) | ||||||||||
|
||||||||||
|
||||||||||
if enzyme == "chymotrypsin": # Chymotrypsin preferentially cleaves at Trp, Tyr and Phe in position P1(high specificity) | ||||||||||
for i in range(len(sequence)-1): | ||||||||||
if sequence[i] in ['W', 'Y', 'F', 'w', 'y', 'f'] and sequence[i+1] not in ['P','p']: | ||||||||||
cleavage_sites.append(i+1) | ||||||||||
|
||||||||||
if cleavage_sites: | ||||||||||
return f"Found {len(cleavage_sites)} {enzyme} cleavage sites at positions {', '.join(map(str, cleavage_sites))}" | ||||||||||
else: | ||||||||||
return f"No {enzyme} cleavage sites were found." | ||||||||||
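One way to address the naming comment and remove the duplicated trypsin/chymotrypsin loops is to keep the cleavage rules in a lookup table. The sketch below is an illustration under assumptions: `CLEAVAGE_RESIDUES` and `find_cleavage_sites` are hypothetical names, the function returns a list of positions rather than the formatted message, and it raises `ValueError` for an unknown enzyme instead of returning a string.

```python
from typing import List

# Hypothetical constant: residues after which each enzyme cuts (rules from the diff).
CLEAVAGE_RESIDUES = {
    "trypsin": frozenset("KR"),
    "chymotrypsin": frozenset("WYF"),
}


def find_cleavage_sites(sequence: str, enzyme: str = "trypsin") -> List[int]:
    """Return 1-based positions after which the enzyme cuts the peptide."""
    if enzyme not in CLEAVAGE_RESIDUES:
        raise ValueError(
            "You have chosen an enzyme that is not provided. "
            "Please choose between trypsin and chymotrypsin."
        )
    target_residues = CLEAVAGE_RESIDUES[enzyme]
    upper_sequence = sequence.upper()
    cleavage_sites = []
    # Pair each residue with its successor; the last residue has no successor,
    # so it is never reported, exactly as with range(len(sequence) - 1).
    residue_pairs = zip(upper_sequence, upper_sequence[1:])
    for position, (residue, next_residue) in enumerate(residue_pairs, start=1):
        if residue in target_residues and next_residue != "P":
            cleavage_sites.append(position)
    return cleavage_sites


print(find_cleavage_sites("VLSPADKTNVKAAW"))                   # [7, 11]
print(find_cleavage_sites("VLSPADKTNVKAAWW", "chymotrypsin"))  # [14]
```

Adding another enzyme then only requires a new entry in the table.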
|
||||||||||
|
||||||||||
def one_to_three_letter_code(sequence: str) -> str: | ||||||||||
""" | ||||||||||
This function converts a protein sequence from one-letter amino acid code to three-letter code. | ||||||||||
|
||||||||||
Args: | ||||||||||
sequence (str): The input protein sequence in one-letter code. | ||||||||||
|
||||||||||
Returns: | ||||||||||
str: The converted protein sequence in three-letter code. | ||||||||||
""" | ||||||||||
amino_acids = { | ||||||||||
'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu', 'F': 'Phe', | ||||||||||
'G': 'Gly', 'H': 'His', 'I': 'Ile', 'K': 'Lys', 'L': 'Leu', | ||||||||||
'M': 'Met', 'N': 'Asn', 'P': 'Pro', 'Q': 'Gln', 'R': 'Arg', | ||||||||||
'S': 'Ser', 'T': 'Thr', 'V': 'Val', 'W': 'Trp', 'Y': 'Tyr' | ||||||||||
} | ||||||||||
Review comment on lines +132 to +137: It would be better to make this a constant and move it outside the functions.
||||||||||
|
||||||||||
three_letter_code = [amino_acids.get(aa.upper()) for aa in sequence] | ||||||||||
return ''.join(three_letter_code) | ||||||||||
Review comment: I would rather use '-'.join(three_letter_code); the result is more readable (and, if memory serves, this is how three-letter amino acid residues are usually written). For example, Val-Ile-Leu-Glu (instead of ValIleLeuGlu).
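A possible shape for the hyphen-separated output, with the mapping moved to module level as suggested earlier in the review. The name `THREE_LETTER_CODES` and the `separator` parameter are assumptions added for illustration; they are not part of the PR.

```python
# Hypothetical module-level constant (the name is an assumption).
THREE_LETTER_CODES = {
    'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu', 'F': 'Phe',
    'G': 'Gly', 'H': 'His', 'I': 'Ile', 'K': 'Lys', 'L': 'Leu',
    'M': 'Met', 'N': 'Asn', 'P': 'Pro', 'Q': 'Gln', 'R': 'Arg',
    'S': 'Ser', 'T': 'Thr', 'V': 'Val', 'W': 'Trp', 'Y': 'Tyr',
}


def one_to_three_letter_code(sequence: str, separator: str = "-") -> str:
    """Convert a one-letter sequence to three-letter codes joined by separator."""
    return separator.join(THREE_LETTER_CODES[aa] for aa in sequence.upper())


print(one_to_three_letter_code("VILE"))      # Val-Ile-Leu-Glu
print(one_to_three_letter_code("VILE", ""))  # ValIleLeuGlu
```

Passing `separator=""` reproduces the current output, so existing callers could keep their format.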
|
||||||||||
|
||||||||||
Review comment: A blank line is missing here.
|
||||||||||
def sulphur_containing_aa_counter(sequence: str) -> str: | ||||||||||
""" | ||||||||||
This function counts sulphur-containing amino acids (Cysteine and Methionine) in a protein sequence. | ||||||||||
|
||||||||||
Args: | ||||||||||
sequence (str): The input protein sequence in one-letter code. | ||||||||||
|
||||||||||
Returns: | ||||||||||
str: The number of sulphur-containing amino acids in a protein sequence. | ||||||||||
""" | ||||||||||
counter = 0 | ||||||||||
for i in sequence: | ||||||||||
Review comment: Poor naming. It would have been better to call it `aminoacid`, or at least `aa`, rather than `i`.
||||||||||
if i == 'C' or i == 'M': | ||||||||||
Review comment: This logic causes a bug. You only check uppercase letters, while the error raised for the amino acid alphabet in your main function says "Only amino acids are allowed (V, I, L, E, Q, D, N, H, W, F, Y, R, K, S, T, M, A, G, P, C, v, i, l, e, q, d, n, h, w, f, y, r, k, s, t, m, a, g, p, c". So I may want to pass the sequence 'mmmmm', and the result will be 0.
||||||||||
counter += 1 | ||||||||||
answer = str(counter) | ||||||||||
return 'The number of sulphur-containing amino acids in the sequence is equal to ' + answer | ||||||||||
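The case-sensitivity bug described in the review can be avoided by normalising the sequence before counting. A minimal sketch, assuming a hypothetical helper `count_sulphur_containing` that returns the count as an integer instead of the formatted message:

```python
# Hypothetical constant: one-letter codes of Cysteine and Methionine.
SULPHUR_CONTAINING = frozenset("CM")


def count_sulphur_containing(sequence: str) -> int:
    """Count Cysteine and Methionine residues, ignoring letter case."""
    return sum(1 for aa in sequence.upper() if aa in SULPHUR_CONTAINING)


print(count_sulphur_containing("mmmmm"))       # 5
print(count_sulphur_containing("VLDQRKSTMA"))  # 1
```

Returning a number rather than a sentence also lets callers reuse the result in further calculations.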
|
||||||||||
|
||||||||||
def run_amino_analyzer(sequence: str, procedure: str, *, weight_type: str = 'average', enzyme: str = 'trypsin'): | ||||||||||
""" | ||||||||||
This is the main function to run the amino-analyzer.py tool. | ||||||||||
|
||||||||||
Args: | ||||||||||
sequence (str): The input protein sequence in one-letter code. | ||||||||||
procedure (str): amino-analyzer.py tool has 5 functions in total: | ||||||||||
1. aa_weight - Calculate the amino acids weight in a protein sequence. | ||||||||||
2. count_hydroaffinity - Count the quantity of hydrophobic and hydrophilic amino acids in a protein sequence. | ||||||||||
3. peptide_cutter - This function identifies cleavage sites in a given peptide sequence using a specified enzyme. | ||||||||||
4. one_to_three_letter_code - This function converts a protein sequence from one-letter amino acid code to three-letter code. | ||||||||||
5. sulphur_containing_aa_counter - This function counts sulphur-containing amino acids in a protein sequence. | ||||||||||
weight_type = 'average': default argument for 'aa_weight' function. weight_type = 'monoisotopic' can be used as a second option. | ||||||||||
enzyme = 'trypsin': default argument for 'peptide_cutter' function. enzyme = 'chymotrypsin' can be used as a second option. | ||||||||||
|
||||||||||
Returns: | ||||||||||
The result of the procedure. | ||||||||||
""" | ||||||||||
|
||||||||||
procedures = ['aa_weight', 'count_hydroaffinity', 'peptide_cutter', 'one_to_three_letter_code', 'sulphur_containing_aa_counter'] | ||||||||||
if procedure not in procedures: | ||||||||||
raise ValueError(f"Incorrect procedure. Acceptable procedures: {', '.join(procedures)}") | ||||||||||
|
||||||||||
for i in sequence: | ||||||||||
if not is_aa(sequence): | ||||||||||
Review comment on lines +182 to +183: The `for` loop is not needed here. The function ...
|
||||||||||
raise ValueError("Incorrect sequence. Only amino acids are allowed (V, I, L, E, Q, D, N, H, W, F, Y, R, K, S, T, M, A, G, P, C, v, i, l, e, q, d, n, h, w, f, y, r, k, s, t, m, a, g, p, c).") | ||||||||||
|
||||||||||
if procedure == 'aa_weight': | ||||||||||
result = aa_weight(sequence, weight_type) | ||||||||||
elif procedure == 'count_hydroaffinity': | ||||||||||
result = count_hydroaffinity(sequence) | ||||||||||
elif procedure == 'peptide_cutter': | ||||||||||
result = peptide_cutter(sequence, enzyme) | ||||||||||
elif procedure == 'one_to_three_letter_code': | ||||||||||
result = one_to_three_letter_code(sequence) | ||||||||||
elif procedure == 'sulphur_containing_aa_counter': | ||||||||||
result = sulphur_containing_aa_counter(sequence) | ||||||||||
return result |
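Since `is_aa` already inspects the whole sequence, the main function can validate once and then dispatch through a dictionary instead of an `if`/`elif` chain. The sketch below is a simplified illustration, not the PR's code: it registers only `count_hydroaffinity` to stay short, leaves out the `weight_type` and `enzyme` keyword arguments, and uses the hypothetical names `AMINO_ACIDS`, `HYDROPHOBIC`, `HYDROPHILIC`, and `PROCEDURES`.

```python
from typing import Callable, Dict, Tuple

# Hypothetical module-level constants (names are assumptions).
AMINO_ACIDS = frozenset("VILEQDNHWFYRKSTMAGPC" + "vileqdnhwfyrkstmagpc")
HYDROPHOBIC = frozenset("AVLIPFWM")
HYDROPHILIC = frozenset("RNDCQEGHKSTY")


def is_aa(seq: str) -> bool:
    """Return True if seq consists only of one-letter amino acid codes."""
    return set(seq) <= AMINO_ACIDS


def count_hydroaffinity(seq: str) -> Tuple[int, int]:
    """Return (hydrophobic count, hydrophilic count) for seq."""
    seq = seq.upper()
    hydrophobic_count = sum(aa in HYDROPHOBIC for aa in seq)
    hydrophilic_count = sum(aa in HYDROPHILIC for aa in seq)
    return hydrophobic_count, hydrophilic_count


# The other procedures would be registered in the same way.
PROCEDURES: Dict[str, Callable] = {"count_hydroaffinity": count_hydroaffinity}


def run_amino_analyzer(sequence: str, procedure: str):
    """Validate the inputs once, then call the requested procedure."""
    if procedure not in PROCEDURES:
        raise ValueError(f"Incorrect procedure. Acceptable procedures: {', '.join(PROCEDURES)}")
    if not is_aa(sequence):  # a single call checks every character
        raise ValueError("Incorrect sequence. Only amino acids are allowed.")
    return PROCEDURES[procedure](sequence)


print(run_amino_analyzer("VLSPADKTNVKAAW", "count_hydroaffinity"))  # (8, 6)
```

With the loop removed, validation costs one pass over the sequence instead of one pass per character.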
Reviewer reply: No, no library is needed; I added a blank line. As far as I remember, PEP 8 allows a single blank line after the imports, but in almost all the code I have looked at I saw exactly two blank lines. I did not deduct points for this.