HW4 Shtompel #23
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# Protein_tools | ||
### Overview | ||
**Protein_tools** is a tool for basic analysis of protein and polypeptide sequences. With it you can estimate the length, charge, amino acid composition and mass of a protein, calculate its aliphatic index and check whether the protein can be cleaved by trypsin. | ||
|
||
### Usage | ||
To use **Protein_tools**, clone this repo with `git clone`. To run the tool, use this command: | ||
`run_protein_tools('<sequence>', '<procedure>')`, where `<sequence>` is the protein sequence (or several sequences) to analyse and `<procedure>` is the name of the option to apply to the sequence(s). Write the option name and the sequences in quotes, separated by commas; use only one option per call, and make sure that your sequences use the one-letter amino acid code (the case is not important). | ||
|
||
### Options | ||
1. `count_seq_length`: counts the length of the protein sequence and outputs the number of amino acids. | ||
2. `classify_aminoacids`: classifies all amino acids of the input sequence according to the 'AA_ALPHABET' classification. An amino acid that is not included in this list is classified as 'Unusual'. | ||
|
||
AA_ALPHABET classification: | ||
| Class | Aminoacids | | ||
|----------|-----------| | ||
| Nonpolar | G, A, V, I, L, P| | ||
| Polar uncharged | S, T, C, M, N, Q | | ||
| Aromatic | F, W, Y | | ||
| Polar with negative charge | D, E | | ||
| Polar with positive charge | K, R, H | | ||
|
||
3. `check_unusual_aminoacids`: checks the amino acid composition and returns the list of unusual amino acids if any are present in the sequence. An amino acid is called unusual when it does not belong to the list of proteinogenic amino acids (see the AA_ALPHABET classification). | ||
4. `count_charge`: calculates the charge of the protein as the number of positively charged amino acids minus the number of negatively charged ones. | ||
5. `count_protein_mass`: calculates the total mass of all amino acids of the input sequence in g/mol. | ||
6. `count_aliphatic_index`: calculates the aliphatic index, the relative proportion of aliphatic amino acids in the input peptide. The higher the aliphatic index, the higher the thermostability of the peptide. | ||
7. `count_trypsin_sites`: counts the number of valid trypsin-cleavable sites: Arginine/any amino acid and Lysine/any amino acid (except Proline). If the peptide has no trypsin-cleavable sites, it returns zero. | ||
|
||
### Examples | ||
An illustration of the capabilities of **Protein_tools** using a random protein sequence is presented below: | ||
*sequence:* CVWGWAMGEACPNPIKINISAYAKTWYQNGPIGRCCCWVGYTAIRFPHQEMQQNTRFNKP | ||
|
||
| Option | Output | | ||
|--------|---------| | ||
| count_seq_length | 60 | | ||
| classify_aminoacids | 'Nonpolar': 22, 'Polar uncharged': 20, 'Aromatic': 9, 'Polar with negative charge': 2, 'Polar with positive charge': 7, 'Unusual': 0 | | ||
| check_unusual_aminoacids | This sequence contains only proteinogenic aminoacids. | | ||
| count_charge | 5 | | ||
| count_protein_mass | 6918.99 | | ||
| count_aliphatic_index | 0.5049999999999999 | | ||
| count_trypsin_sites | 5 | | ||
|
||
### Limitations and troubleshooting | ||
**Protein_tools** has several limitations that can cause errors. Here are some of them: | ||
1. **Protein_tools** works only with protein sequences that contain letters of the Latin alphabet (the case is not important); every amino acid must also be coded by one letter. If there are other symbols in the sequence, the tool raises `ValueError` *"One of these sequences is not protein sequence or does not match the rools of input. Please select another sequence."*. In this case, check whether your sequence contains punctuation marks, spaces or other symbols. | ||
2. Take care to use only sequences in which each amino acid is coded with one letter. If your sequence is "SerMetAlaGly", **Protein_tools** reads it as "SERMETALAGLY". | ||
Review comment: A very good remark! It would be great to add support for the three-letter code in version 2.0 (half-joking).
||
3. The available functions are listed in the "Options" section. If you see `ValueError` *"This procedure is not available. Please choose another procedure."*, the function name is probably misspelled. Check the name of the chosen procedure and make sure that it is available in **Protein_tools**. | ||
|
||
Review comment: 🔥
||
### Contribution and contacts | ||
- Shtompel Anastasia (Telegram: @Aenye) — teamlead, developer (options 'count_protein_mass', 'count_aliphatic_index', 'count_trypsin_sites') | ||
- Chevokina Elizaveta (Telegram: @lzchv) — developer (options 'count_seq_length', 'classify_aminoacids', 'check_unusual_aminoacids', 'count_charge'), author of README file |
Review comment: This is probably not needed here.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
#!/bin/bash | ||
|
||
echo "Hi! I'm your pre-commit code checker." | ||
|
||
FILE="dna_rna_tools.py" | ||
TESTS="${FILE%.py}_test.py" | ||
|
||
if [ -f $FILE ]; then | ||
|
||
if [ ! -f hooks_env/bin/activate ]; then | ||
echo "For the first time I need to prepare an environment, give me a minute..." | ||
python3 -m venv hooks_env | ||
source hooks_env/bin/activate | ||
python3 -m pip install --upgrade pip --quiet | ||
pip install pytest flake8 flake8-bugbear pep8-naming flake8-builtins flake8-functions-names flake8-variables-names pep8-naming pylint mypy --quiet | ||
echo "hooks_env" >> .gitignore | ||
echo ".gitignore" >> .gitignore | ||
else | ||
source hooks_env/bin/activate | ||
fi | ||
|
||
echo "$(tput setab 7 setaf 1)>>>> Code quality checks <<<<$(tput sgr 0)" | ||
echo ">>>> flake8 check" | ||
flake8 $FILE | ||
echo ">>>> pylint check" | ||
pylint $FILE | ||
echo ">>>> mypy check" | ||
mypy $FILE | ||
|
||
deactivate | ||
|
||
else | ||
|
||
echo "Seems there is no Python code to check. You can configure me in .git/hooks/pre-commit" | ||
|
||
fi |
Original file line number | Diff line number | Diff line change | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,200 @@ | ||||||||||||||||
""" | ||||||||||||||||
Global variables: | ||||||||||||||||
Review comment: These are constants rather than globals. I have never seen this kind of comment for constants before, but if the module already has a docstring, adding this information seems reasonable.
||||||||||||||||
- AA_ALPHABET — a dictionary variable that contains a list of proteinogenic aminoacids classes. | ||||||||||||||||
- ALL_AMINOACIDS — a set variable that contains a list of all proteinogenic aminoacids. | ||||||||||||||||
- AMINO_ACIDS_MASSES — a dictionary variable that contains masses of all proteinogenic aminoacids. | ||||||||||||||||
""" | ||||||||||||||||
|
||||||||||||||||
AA_ALPHABET = {'Nonpolar': ['G', 'A', 'V', 'I', 'L', 'P'], | ||||||||||||||||
Review comment: Great that you formatted this properly!
Review comment: Although this is not really an alphabet :)
Suggested change
|
||||||||||||||||
'Polar uncharged': ['S', 'T', 'C', 'M', 'N', 'Q'], | ||||||||||||||||
'Aromatic': ['F', 'W', 'Y'], | ||||||||||||||||
'Polar with negative charge': ['D', 'E'], | ||||||||||||||||
'Polar with positive charge': ['K', 'R', 'H'] | ||||||||||||||||
} | ||||||||||||||||
|
||||||||||||||||
ALL_AMINOACIDS = set(('G', 'A', 'V', 'I', 'L', 'P', 'S', 'T', 'C', 'M', 'N', 'Q', 'F', 'W', 'Y', 'D', 'E', 'K', 'R', 'H')) | ||||||||||||||||
Suggested change
|
||||||||||||||||
|
||||||||||||||||
AMINO_ACIDS_MASSES = { | ||||||||||||||||
'G': 57.05, 'A': 71.08, 'S': 87.08, 'P': 97.12, 'V': 99.13, | ||||||||||||||||
'T': 101.1, 'C': 103.1, 'L': 113.2, 'I': 113.2, 'N': 114.1, | ||||||||||||||||
'D': 115.1, 'Q': 128.1, 'K': 128.2, 'E': 129.1, 'M': 131.2, | ||||||||||||||||
'H': 137.1, 'F': 147.2, 'R': 156.2, 'Y': 163.2, 'W': 186.2 | ||||||||||||||||
} | ||||||||||||||||
|
||||||||||||||||
|
||||||||||||||||
def is_protein(seq: str) -> bool: | ||||||||||||||||
""" | ||||||||||||||||
Input: a protein sequence (a str type). | ||||||||||||||||
Output: boolean value. | ||||||||||||||||
'is_protein' function checks if the sequence contains only letters in the upper case. | ||||||||||||||||
Review comment: This should be the first line of the docstring, followed by the arguments and the result.
||||||||||||||||
""" | ||||||||||||||||
if seq.isalpha() and seq.isupper(): | ||||||||||||||||
return True | ||||||||||||||||
Review comment on lines +31 to +32: What about the else case? The function returns None, although logically it should return False. You could keep it even simpler here:
Suggested change
|
||||||||||||||||
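A minimal sketch of the simplification this comment points at. The body below is an assumption, not the reviewer's actual suggested change: returning the boolean expression directly makes a failed check yield `False` instead of `None`.

```python
def is_protein(seq: str) -> bool:
    """Check that the sequence consists only of upper-case letters."""
    # The `and` expression already evaluates to True or False,
    # so no if/else branch is needed.
    return seq.isalpha() and seq.isupper()


print(is_protein('MKV'))   # True
print(is_protein('mkv'))   # False
print(is_protein('MK V'))  # False
```

Like the original check, this version rejects lower-case input.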
|
||||||||||||||||
|
||||||||||||||||
def count_seq_length(seq: str) -> int: | ||||||||||||||||
""" | ||||||||||||||||
Input: a protein sequence (a str type). | ||||||||||||||||
Output: length of protein sequence (an int type). | ||||||||||||||||
'count_seq_length' function counts the length of protein sequence. | ||||||||||||||||
""" | ||||||||||||||||
return len(seq) | ||||||||||||||||
Review comment on lines +35 to +41: Well, this is a really primitive function :)
||||||||||||||||
|
||||||||||||||||
|
||||||||||||||||
def classify_aminoacids(seq: str) -> dict: | ||||||||||||||||
Review comment: Cool idea, 👍
||||||||||||||||
""" | ||||||||||||||||
Input: a protein sequence (a str type). | ||||||||||||||||
Output: a classification of all aminoacids from the sequence (a dict type — 'all_aminoacids_classes' variable). | ||||||||||||||||
'classify_aminoacids' function classifies all aminoacids from the input sequence in accordance with the 'AA_ALPHABET' classification. If aminoacid is not included in this list, | ||||||||||||||||
it should be classified as 'Unusual'. | ||||||||||||||||
""" | ||||||||||||||||
all_aminoacids_classes = dict.fromkeys(['Nonpolar', 'Polar uncharged', 'Aromatic', 'Polar with negative charge', 'Polar with positive charge', 'Unusual'], 0) | ||||||||||||||||
for aminoacid in seq: | ||||||||||||||||
aminoacid = aminoacid.upper() | ||||||||||||||||
if aminoacid not in ALL_AMINOACIDS: | ||||||||||||||||
all_aminoacids_classes['Unusual'] += 1 | ||||||||||||||||
for aa_key, aa_value in AA_ALPHABET.items(): | ||||||||||||||||
if aminoacid in aa_value: | ||||||||||||||||
all_aminoacids_classes[aa_key] += 1 | ||||||||||||||||
return all_aminoacids_classes | ||||||||||||||||
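The loop above scans every class list for every residue. As a hedged alternative sketch, the class table can be inverted once into a residue-to-class lookup; `AA_TO_CLASS` is a hypothetical helper name that does not appear in the pull request.

```python
AA_ALPHABET = {'Nonpolar': ['G', 'A', 'V', 'I', 'L', 'P'],
               'Polar uncharged': ['S', 'T', 'C', 'M', 'N', 'Q'],
               'Aromatic': ['F', 'W', 'Y'],
               'Polar with negative charge': ['D', 'E'],
               'Polar with positive charge': ['K', 'R', 'H']}

# Hypothetical helper: map each residue directly to its class name.
AA_TO_CLASS = {aa: cls for cls, residues in AA_ALPHABET.items() for aa in residues}


def classify_aminoacids(seq: str) -> dict:
    """Count residues per class; unknown letters are counted as 'Unusual'."""
    classes = dict.fromkeys([*AA_ALPHABET, 'Unusual'], 0)
    for aminoacid in seq.upper():
        # One dictionary lookup per residue instead of a scan over all classes.
        classes[AA_TO_CLASS.get(aminoacid, 'Unusual')] += 1
    return classes


print(classify_aminoacids('GAX'))
```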
|
||||||||||||||||
|
||||||||||||||||
def check_unusual_aminoacids(seq: str) -> str: | ||||||||||||||||
""" | ||||||||||||||||
Input: a protein sequence (a str type). | ||||||||||||||||
Output: an answer whether the sequence contains unusual aminoacids (a str type). | ||||||||||||||||
'check_unusual_aminoacids' function checks the composition of aminoacids and returns the list of unusual aminoacids if they are present in the sequence. We call the aminoacid | ||||||||||||||||
unusual when it does not belong to the list of proteinogenic aminoacids (see 'ALL_AMINOACIDS' global variable). | ||||||||||||||||
""" | ||||||||||||||||
seq_aminoacids = set() | ||||||||||||||||
for aminoacid in seq: | ||||||||||||||||
aminoacid = aminoacid.upper() | ||||||||||||||||
seq_aminoacids.add(aminoacid) | ||||||||||||||||
if seq_aminoacids <= ALL_AMINOACIDS: | ||||||||||||||||
return 'This sequence contains only proteinogenic aminoacids.' | ||||||||||||||||
else: | ||||||||||||||||
unusual_aminoacids = seq_aminoacids - ALL_AMINOACIDS | ||||||||||||||||
unusual_aminoacids_str = '' | ||||||||||||||||
for elem in unusual_aminoacids: | ||||||||||||||||
unusual_aminoacids_str += elem | ||||||||||||||||
unusual_aminoacids_str += ', ' | ||||||||||||||||
return f'This protein contains unusual aminoacids: {unusual_aminoacids_str[:-2]}.' | ||||||||||||||||
Review comment on lines +62 to +81: Overall, this is also a cool idea. Two comments
|
||||||||||||||||
|
||||||||||||||||
|
||||||||||||||||
def count_charge(seq: str) -> int: | ||||||||||||||||
""" | ||||||||||||||||
Input: a protein sequence (a str type). | ||||||||||||||||
Output: a charge of the sequence (an int type). | ||||||||||||||||
'count_charge' function counts the charge of the protein by the subtraction between the number of positively and negatively charged aminoacids. | ||||||||||||||||
""" | ||||||||||||||||
seq_classes = classify_aminoacids(seq) | ||||||||||||||||
positive_charge = seq_classes['Polar with positive charge'] | ||||||||||||||||
negative_charge = seq_classes['Polar with negative charge'] | ||||||||||||||||
sum_charge = positive_charge - negative_charge | ||||||||||||||||
return sum_charge | ||||||||||||||||
Review comment on lines +84 to +94: Heh, this is great!
Review comment: Also, I think `seq` is not a great name here and throughout this work. It reads more as DNA/RNA than as proteins.
||||||||||||||||
|
||||||||||||||||
|
||||||||||||||||
def count_protein_mass(seq: str) -> float: | ||||||||||||||||
""" | ||||||||||||||||
Calculates mass of all aminoacids of input peptide in g/mol scale. | ||||||||||||||||
Arguments: | ||||||||||||||||
- seq (str): one-letter code peptide sequence, case is not important; | ||||||||||||||||
Output: | ||||||||||||||||
Returns mass of peptide (float). | ||||||||||||||||
""" | ||||||||||||||||
aa_mass = 0 | ||||||||||||||||
for aminoacid in seq.upper(): | ||||||||||||||||
if aminoacid in AMINO_ACIDS_MASSES: | ||||||||||||||||
aa_mass += AMINO_ACIDS_MASSES[aminoacid] | ||||||||||||||||
Review comment on lines +107 to +108: What about the else case? Do we simply add nothing? Either you already validate the input, in which case the check is not needed here, or you do not, and then it has to be added.
||||||||||||||||
return aa_mass | ||||||||||||||||
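One way to address the missing else branch, sketched under the assumption that the function should validate its own input: raise an error for an unknown residue instead of silently skipping it. The error message text is hypothetical.

```python
AMINO_ACIDS_MASSES = {
    'G': 57.05, 'A': 71.08, 'S': 87.08, 'P': 97.12, 'V': 99.13,
    'T': 101.1, 'C': 103.1, 'L': 113.2, 'I': 113.2, 'N': 114.1,
    'D': 115.1, 'Q': 128.1, 'K': 128.2, 'E': 129.1, 'M': 131.2,
    'H': 137.1, 'F': 147.2, 'R': 156.2, 'Y': 163.2, 'W': 186.2
}


def count_protein_mass(seq: str) -> float:
    """Sum the residue masses; fail loudly on letters missing from the table."""
    aa_mass = 0.0
    for aminoacid in seq.upper():
        if aminoacid not in AMINO_ACIDS_MASSES:
            # Hypothetical message; the original code simply ignores such letters.
            raise ValueError(f'Unknown amino acid: {aminoacid!r}')
        aa_mass += AMINO_ACIDS_MASSES[aminoacid]
    return aa_mass
```

If the caller already validates every sequence before dispatching, the membership test can be dropped instead, as the comment notes.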
|
||||||||||||||||
|
||||||||||||||||
def count_aliphatic_index(seq: str) -> float: | ||||||||||||||||
Review comment: Cool :)
||||||||||||||||
""" | ||||||||||||||||
Calculates aliphatic index - relative proportion of aliphatic aminoacids in input peptide. | ||||||||||||||||
The higher aliphatic index the higher thermostability of peptide. | ||||||||||||||||
Argument: | ||||||||||||||||
- seq (str): one-letter code peptide sequence, letter case is not important. | ||||||||||||||||
Output: | ||||||||||||||||
Returns aliphatic index (float). | ||||||||||||||||
""" | ||||||||||||||||
ala_count = seq.count('A') / len(seq) | ||||||||||||||||
val_count = seq.count('V') / len(seq) | ||||||||||||||||
lei_count = seq.count('L') / len(seq) | ||||||||||||||||
izlei_count = seq.count('I') / len(seq) | ||||||||||||||||
Review comment on lines +121 to +124: So you traverse the protein four times and, on top of that, compute its length four times. You could divide by the length just once in the final formula :)
||||||||||||||||
aliph_index = ala_count + 2.9 * val_count + 3.9 * lei_count + 3.9 * izlei_count | ||||||||||||||||
return aliph_index | ||||||||||||||||
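A possible single-pass variant, offered as a sketch rather than the reviewer's own code: `collections.Counter` tallies all residues in one traversal, and the division by the length happens once. Unlike the original, this version also upper-cases the input, which is an added assumption.

```python
from collections import Counter


def count_aliphatic_index(seq: str) -> float:
    """Aliphatic index with the same weights as the original (A: 1, V: 2.9, L and I: 3.9)."""
    counts = Counter(seq.upper())  # one traversal of the sequence
    weighted_sum = counts['A'] + 2.9 * counts['V'] + 3.9 * (counts['L'] + counts['I'])
    return weighted_sum / len(seq)  # divide by the length once


print(count_aliphatic_index('AVLI'))
```

An empty sequence still raises `ZeroDivisionError`, as in the original.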
|
||||||||||||||||
|
||||||||||||||||
def not_trypsin_cleaved(seq: str) -> int: | ||||||||||||||||
Review comment: The name is not very informative about what the function does and returns :)
||||||||||||||||
""" | ||||||||||||||||
Counts non-cleavable sites of trypsin: Arginine/Proline (RP) and Lysine/Proline (KP) pairs. | ||||||||||||||||
Argument: | ||||||||||||||||
- seq (str): one-letter code peptide sequence, case is not important. | ||||||||||||||||
Output: | ||||||||||||||||
Returns number of exception sites that cannot be cleaved by trypsin (int). | ||||||||||||||||
""" | ||||||||||||||||
not_cleavage_count = 0 | ||||||||||||||||
not_cleavage_count += seq.upper().count('RP') | ||||||||||||||||
not_cleavage_count += seq.upper().count('KP') | ||||||||||||||||
Review comment on lines +138 to +139: Again, this could be counted in a single pass :)
Review comment: But it is cool that you use it.
Review comment: You also call upper() many times; the upper-casing could be moved to the start of the function.
||||||||||||||||
return not_cleavage_count | ||||||||||||||||
|
||||||||||||||||
|
||||||||||||||||
def count_trypsin_sites(seq: str) -> int: | ||||||||||||||||
Review comment: Cool! This is actually non-obvious and useful for things like mass spectrometry. Extra kudos for taking the non-cleavable sites into account. Really awesome.
||||||||||||||||
""" | ||||||||||||||||
Counts number of valid trypsin cleavable sites: | ||||||||||||||||
Arginine/any aminoacid and Lysine/any aminoacid (except Proline). | ||||||||||||||||
Argument: | ||||||||||||||||
- seq (str): one-letter code peptide sequence, case is not important. | ||||||||||||||||
Output: | ||||||||||||||||
Returns number of valid trypsin cleavable sites (int). | ||||||||||||||||
If peptide has not any trypsin cleavable sites, it will return zero. | ||||||||||||||||
""" | ||||||||||||||||
arginine_value = seq.upper().count('R') | ||||||||||||||||
lysine_value = seq.upper().count('K') | ||||||||||||||||
count_cleavage = arginine_value + lysine_value - not_trypsin_cleaved(seq) | ||||||||||||||||
Suggested change
Review comment: When `count` comes first, it reads more like the name of a function :)
||||||||||||||||
return count_cleavage | ||||||||||||||||
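The single-pass idea from the comments above can be sketched as follows. This is an illustrative rewrite, not code from the pull request: the sequence is upper-cased once, and each K or R is counted unless the next residue is P, which gives the same result as counting all K/R and subtracting the KP/RP pairs.

```python
def count_trypsin_sites(seq: str) -> int:
    """Count K/R residues that are not immediately followed by proline."""
    seq = seq.upper()  # upper-case once, at the start
    sites = 0
    for i, residue in enumerate(seq):
        # seq[i + 1:i + 2] is '' for the last residue, so a C-terminal K/R
        # is counted, exactly as in the original subtraction-based version.
        if residue in 'KR' and seq[i + 1:i + 2] != 'P':
            sites += 1
    return sites


print(count_trypsin_sites('KP'))  # 0
print(count_trypsin_sites('RK'))  # 2
```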
|
||||||||||||||||
|
||||||||||||||||
OPERATIONS = {'count_protein_mass':count_protein_mass, | ||||||||||||||||
'count_aliphatic_index': count_aliphatic_index, | ||||||||||||||||
'count_trypsin_sites': count_trypsin_sites, | ||||||||||||||||
'count_seq_length': count_seq_length, | ||||||||||||||||
'classify_aminoacids': classify_aminoacids, | ||||||||||||||||
'check_unusual_aminoacids': check_unusual_aminoacids, | ||||||||||||||||
'count_charge': count_charge} | ||||||||||||||||
|
||||||||||||||||
|
||||||||||||||||
def protein_tools(*args: str) -> list: | ||||||||||||||||
Suggested change
|
||||||||||||||||
""" | ||||||||||||||||
Review comment: Wow, that is a really powerful docstring!
||||||||||||||||
Calculates protein physical properties: mass, charge, length, aliphatic index; | ||||||||||||||||
as well as defines biological features: aminoacid composition, trypsin cleavable sites. | ||||||||||||||||
|
||||||||||||||||
Input: a list of protein sequences and one procedure that should be done with these sequences (str type, several values). | ||||||||||||||||
|
||||||||||||||||
Valid operations: | ||||||||||||||||
Protein_tools include several operations: | ||||||||||||||||
- count_seq_length: returns length of protein (int); | ||||||||||||||||
- classify_aminoacids: returns collection of classified aminoacids, included in the protein (dict); | ||||||||||||||||
- check_unusual_aminoacids: informs whether the protein includes unusual aminoacids (str); | ||||||||||||||||
- count_charge: returns charge value of protein (int); | ||||||||||||||||
- count_protein_mass: calculates mass of all aminoacids of input peptide in g/mol scale (float); | ||||||||||||||||
- count_aliphatic_index: calculates relative proportion of aliphatic aminoacids in input peptide (float); | ||||||||||||||||
- count_trypsin_sites: counts number of valid trypsin cleavable sites. | ||||||||||||||||
|
||||||||||||||||
Output: a list of outputs from the chosen procedure (list type). | ||||||||||||||||
'run_protein_tools' function takes the protein sequences and the name of the procedure that the user gives and applies this procedure by one of the available functions | ||||||||||||||||
to all the given sequences. Also this function checks the availability of the procedure and raises the ValueError when the procedure is not in the list of available | ||||||||||||||||
functions (see 'OPERATIONS' global variable). | ||||||||||||||||
""" | ||||||||||||||||
operation = args[-1] | ||||||||||||||||
parsed_seq_list = [] | ||||||||||||||||
for seq in args[0:-1]: | ||||||||||||||||
Review comment on lines +190 to +192: It would be better to design the arguments differently. The way it was done in HW 3 is actually not very convenient. Let the proteins be passed either positionally or even as a single list, and make the operation a keyword argument, so that they are kept separate rather than all in one pile.
||||||||||||||||
if not is_protein(seq): | ||||||||||||||||
raise ValueError("One of these sequences is not protein sequence or does not match the rools of input. Please select another sequence.") | ||||||||||||||||
Review comment on lines +193 to +194: This looks nice :)
||||||||||||||||
else: | ||||||||||||||||
if operation in OPERATIONS: | ||||||||||||||||
parsed_seq_list.append(OPERATIONS[operation](seq)) | ||||||||||||||||
else: | ||||||||||||||||
raise ValueError("This procedure is not available. Please choose another procedure.") | ||||||||||||||||
return parsed_seq_list | ||||||||||||||||
Review comment: By the way, you could return a dictionary here instead of a list: the key is the input protein and the value is the result of the program. This is just an idea; it is fine as it is.
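The two interface suggestions (proteins passed positionally with a keyword-only operation, and a dictionary as the return value) could be combined roughly as below. This is a sketch under assumptions: `OPERATIONS` is reduced to a one-entry stand-in so the block runs on its own, and the wording of the first error message is hypothetical.

```python
from typing import Callable, Dict

# Stand-in table for this sketch; the real module registers all seven functions.
OPERATIONS: Dict[str, Callable[[str], object]] = {
    'count_seq_length': len,
}


def run_protein_tools(*proteins: str, operation: str) -> dict:
    """Apply one operation to every protein and map each input to its result."""
    # Validate the operation once, before looping over the proteins.
    if operation not in OPERATIONS:
        raise ValueError('This procedure is not available. Please choose another procedure.')
    results = {}
    for protein in proteins:
        if not (protein.isalpha() and protein.isupper()):
            # Hypothetical message, shortened for the sketch.
            raise ValueError(f'{protein!r} is not a valid protein sequence.')
        results[protein] = OPERATIONS[operation](protein)
    return results


print(run_protein_tools('MKV', 'GA', operation='count_seq_length'))
# {'MKV': 3, 'GA': 2}
```

With a dictionary, a sequence passed twice collapses into a single key, which may or may not be desirable.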
Review comment: A nice, solid README! Overall you covered everything that is needed. One thing: it is good to have an actual usage example, a line of code that can be copied, pasted and run right away to see the output (and compare it with what is written in the README). That is slightly missing :)