update 12,14,15,3,7 claude prompts
SamCox822 committed Oct 17, 2024
1 parent 6a5b379 commit dfc3351
Showing 10 changed files with 761 additions and 862 deletions.
153 changes: 56 additions & 97 deletions notebooks/experiments/experiment_k1/claude-3-5-sonnet/exp_12.ipynb

Large diffs are not rendered by default.

221 changes: 49 additions & 172 deletions notebooks/experiments/experiment_k1/claude-3-5-sonnet/exp_14.ipynb

Large diffs are not rendered by default.

71 changes: 47 additions & 24 deletions notebooks/experiments/experiment_k1/claude-3-5-sonnet/exp_15.ipynb

Large diffs are not rendered by default.

87 changes: 63 additions & 24 deletions notebooks/experiments/experiment_k1/claude-3-5-sonnet/exp_3.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@@ -23,7 +23,7 @@
}
],
"source": [
"prompt3 = \"Download the PDB file for protein 1GZX. Then, analyze the secondary structure of the protein and provide information on how many helices, sheets, and other components are present.\"\n",
"prompt3 = \"Download the PDB file for protein 1GZX. Then, analyze the secondary structure of the protein and provide information on how many helices, sheets, and other components are present. Get the gene names for this protein.\"\n",
"llm_var = \"claude-3-5-sonnet-20240620\"\n",
"tools = \"all\"\n",
"agent = MDAgent(agent_type=\"Structured\", model=llm_var, top_k_tools=tools, use_memory=False)\n",
@@ -32,14 +32,19 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Thought: To answer this question, I need to first download the PDB file for protein 1GZX, and then analyze its secondary structure. Let's start by downloading the file.\n",
"Thought: To answer this question, I need to follow these steps:\n",
"1. Download the PDB file for protein 1GZX\n",
"2. Analyze the secondary structure of the protein\n",
"3. Get the gene names for this protein\n",
"\n",
"Let's start by downloading the PDB file.\n",
"\n",
"Action:\n",
"```\n",
@@ -48,44 +53,61 @@
" \"action_input\": \"1GZX\"\n",
"}\n",
"```PDB file found with this ID: 1GZX\n",
"Thought: Great, we have successfully downloaded the PDB file for protein 1GZX. Now, let's analyze its secondary structure using the ComputeDSSP tool, which can provide information about the protein's secondary structure elements.\n",
"Thought: Great, we have successfully downloaded the PDB file for protein 1GZX. Now, let's analyze the secondary structure of the protein using the ComputeDSSP tool. This tool will provide information about the secondary structure elements, including helices and sheets.\n",
"\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"ComputeDSSP\",\n",
" \"action_input\": {\n",
" \"traj_file\": \"1GZX_214319\",\n",
" \"top_file\": \"1GZX_214319\",\n",
" \"traj_file\": \"1GZX_201328\",\n",
" \"top_file\": \"1GZX_201328\",\n",
" \"target_frames\": \"all\"\n",
" }\n",
"}\n",
"```Thought: Now that we have the secondary structure analysis results, we can provide a comprehensive answer to the question. Let's summarize the information we've gathered.\n",
"```\n",
"\n",
"Thought: We've successfully analyzed the secondary structure of the protein 1GZX. Now, let's get the gene names for this protein using the GetGeneNames tool. After that, we'll have all the information needed to provide a final answer.\n",
"\n",
"Final Answer: After downloading and analyzing the PDB file for protein 1GZX, we can provide the following information about its secondary structure:\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"GetGeneNames\",\n",
" \"action_input\": {\n",
" \"query\": \"1GZX\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"1. Helices: The protein contains a significant number of helical structures. There are 444 residues involved in helical conformations.\n",
"Thought: Now that we have all the necessary information, we can provide a final answer to the question.\n",
"\n",
"2. Sheets (Strands): Interestingly, the analysis shows that there are 0 residues in strand conformations, which means this protein does not contain any beta-sheets.\n",
"Final Answer: The PDB file for protein 1GZX was successfully downloaded and analyzed. Here's the information requested:\n",
"\n",
"3. Other components:\n",
" - Coil regions: There are 130 residues in coil conformations. These are typically flexible regions that connect other secondary structure elements.\n",
" - Unassigned or non-protein residues: 213 residues were not assigned a secondary structure or are not protein residues. These could include ligands, water molecules, or other non-standard residues.\n",
"1. Secondary Structure Analysis:\n",
" - Helices: The protein contains 444 residues in helical structures.\n",
" - Sheets: There are 0 residues in beta-sheet structures.\n",
" - Other components: \n",
" - 130 residues are in coil structures.\n",
" - 213 residues were not assigned or are not protein residues.\n",
"\n",
"In summary, protein 1GZX is predominantly composed of alpha-helices, with no beta-sheets present. It also contains a significant number of coil regions and some unassigned or non-protein elements. This structure suggests that 1GZX might be an all-alpha protein, which is interesting and somewhat unusual, as most proteins contain a mix of alpha-helices and beta-sheets.\n",
"2. Gene Names:\n",
" The protein 1GZX is associated with the following genes:\n",
" - HBB\n",
" - HBA1\n",
" - HBA2\n",
"\n",
"It's worth noting that the high number of unassigned residues might indicate the presence of ligands or other molecules in the structure, which could be important for the protein's function. To get a more complete picture of the protein's structure and function, further investigation into these unassigned elements and the specific arrangement of the helices would be beneficial."
"In summary, the protein 1GZX is predominantly composed of helical structures, with no beta-sheets present. It also contains a significant number of coil structures and some unassigned or non-protein residues. The protein is encoded by three genes: HBB, HBA1, and HBA2, which are likely related to hemoglobin subunits (alpha and beta chains) based on their names."
]
},
{
"data": {
"text/plain": [
"({'input': '\\n You are an expert molecular dynamics scientist, and\\n your task is to respond to the question or\\n solve the problem to the best of your ability using\\n the provided tools.\\n\\n You can only respond with a single complete\\n \\'Thought, Action, Action Input\\' format\\n OR a single \\'Final Answer\\' format.\\n\\n Complete format:\\n Thought: (reflect on your progress and decide what to do next)\\n Action:\\n ```\\n {\\n \"action\": (the action name, it should be the name of a tool),\\n \"action_input\": (the input string for the action)\\n }\\n \\'\\'\\'\\n\\n OR\\n\\n Final Answer: (the final response to the original input\\n question, once all steps are complete)\\n\\n You are required to use the tools provided,\\n using the most specific tool\\n available for each action.\\n Your final answer should contain all information\\n necessary to answer the question and its subquestions.\\n Before you finish, reflect on your progress and make\\n sure you have addressed the question in its entirety.\\n\\n If you are asked to continue\\n or reference previous runs,\\n the context will be provided to you.\\n If context is provided, you should assume\\n you are continuing a chat.\\n\\n Here is the input:\\n Previous Context: None\\n Question: Download the PDB file for protein 1GZX. Then, analyze the secondary structure of the protein and provide information on how many helices, sheets, and other components are present. ',\n",
" 'output': \"Thought: Now that we have the secondary structure analysis results, we can provide a comprehensive answer to the question. Let's summarize the information we've gathered.\\n\\nFinal Answer: After downloading and analyzing the PDB file for protein 1GZX, we can provide the following information about its secondary structure:\\n\\n1. Helices: The protein contains a significant number of helical structures. There are 444 residues involved in helical conformations.\\n\\n2. Sheets (Strands): Interestingly, the analysis shows that there are 0 residues in strand conformations, which means this protein does not contain any beta-sheets.\\n\\n3. Other components:\\n - Coil regions: There are 130 residues in coil conformations. These are typically flexible regions that connect other secondary structure elements.\\n - Unassigned or non-protein residues: 213 residues were not assigned a secondary structure or are not protein residues. These could include ligands, water molecules, or other non-standard residues.\\n\\nIn summary, protein 1GZX is predominantly composed of alpha-helices, with no beta-sheets present. It also contains a significant number of coil regions and some unassigned or non-protein elements. This structure suggests that 1GZX might be an all-alpha protein, which is interesting and somewhat unusual, as most proteins contain a mix of alpha-helices and beta-sheets.\\n\\nIt's worth noting that the high number of unassigned residues might indicate the presence of ligands or other molecules in the structure, which could be important for the protein's function. To get a more complete picture of the protein's structure and function, further investigation into these unassigned elements and the specific arrangement of the helices would be beneficial.\"},\n",
" '0UPOT5L9')"
"({'input': '\\n You are an expert molecular dynamics scientist, and\\n your task is to respond to the question or\\n solve the problem to the best of your ability using\\n the provided tools.\\n\\n You can only respond with a single complete\\n \\'Thought, Action, Action Input\\' format\\n OR a single \\'Final Answer\\' format.\\n\\n Complete format:\\n Thought: (reflect on your progress and decide what to do next)\\n Action:\\n ```\\n {\\n \"action\": (the action name, it should be the name of a tool),\\n \"action_input\": (the input string for the action)\\n }\\n \\'\\'\\'\\n\\n OR\\n\\n Final Answer: (the final response to the original input\\n question, once all steps are complete)\\n\\n You are required to use the tools provided,\\n using the most specific tool\\n available for each action.\\n Your final answer should contain all information\\n necessary to answer the question and its subquestions.\\n Before you finish, reflect on your progress and make\\n sure you have addressed the question in its entirety.\\n\\n If you are asked to continue\\n or reference previous runs,\\n the context will be provided to you.\\n If context is provided, you should assume\\n you are continuing a chat.\\n\\n Here is the input:\\n Previous Context: None\\n Question: Download the PDB file for protein 1GZX. Then, analyze the secondary structure of the protein and provide information on how many helices, sheets, and other components are present. Get the gene names for this protein. ',\n",
" 'output': \"Thought: Now that we have all the necessary information, we can provide a final answer to the question.\\n\\nFinal Answer: The PDB file for protein 1GZX was successfully downloaded and analyzed. Here's the information requested:\\n\\n1. Secondary Structure Analysis:\\n - Helices: The protein contains 444 residues in helical structures.\\n - Sheets: There are 0 residues in beta-sheet structures.\\n - Other components: \\n - 130 residues are in coil structures.\\n - 213 residues were not assigned or are not protein residues.\\n\\n2. Gene Names:\\n The protein 1GZX is associated with the following genes:\\n - HBB\\n - HBA1\\n - HBA2\\n\\nIn summary, the protein 1GZX is predominantly composed of helical structures, with no beta-sheets present. It also contains a significant number of coil structures and some unassigned or non-protein residues. The protein is encoded by three genes: HBB, HBA1, and HBA2, which are likely related to hemoglobin subunits (alpha and beta chains) based on their names.\"},\n",
" 'ZKRY1OQL')"
]
},
"execution_count": 3,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -96,14 +118,14 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Files found in registry: 1GZX_214319: PDB file downloaded from RSCB, PDBFile ID: 1GZX_214319, rec0_214321: dssp values for trajectory with id: 1GZX_214319\n"
"Files found in registry: 1GZX_201328: PDB file downloaded from RSCB, PDBFile ID: 1GZX_201328, rec0_201332: dssp values for trajectory with id: 1GZX_201328\n"
]
}
],
@@ -117,7 +139,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -141,6 +163,23 @@
"print(\"Number of residues in helices: \",len([i for i in secondary_structure[0] if i == 'H']))\n",
"print(\"Number of residues in coils: \",len([i for i in secondary_structure[0] if i == 'C']))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"from mdagent.tools.base_tools import GetGeneNames\n",
"GetGeneNames().run(\"1GZX\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {

0 comments on commit dfc3351
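
Note on the residue tally cell in exp_3.ipynb: the two `print` lines count 'H' and 'C' codes with separate list comprehensions over `secondary_structure[0]`. A minimal sketch of the same tally in a single pass follows. It assumes that `secondary_structure[0]` is a sequence of simplified DSSP codes for one frame, with 'E' marking strands and 'NA' marking non-protein residues (the convention of MDTraj's `compute_dssp`; the diff itself only shows 'H' and 'C'). The function name `summarize_dssp` and the toy frame are hypothetical and not part of the notebook.

```python
from collections import Counter


def summarize_dssp(codes):
    """Count residues per simplified DSSP code for a single frame.

    Assumes the simplified alphabet: 'H' helix, 'E' strand, 'C' coil,
    and 'NA' for residues that are not protein (an assumption, see note).
    """
    counts = Counter(codes)
    return {
        "helix": counts.get("H", 0),
        "strand": counts.get("E", 0),
        "coil": counts.get("C", 0),
        "not_assigned": counts.get("NA", 0),
    }


# Hypothetical toy frame, not data from the 1GZX run.
frame = ["H", "H", "C", "E", "NA", "H"]
summary = summarize_dssp(frame)
print(summary)
```

In the notebook this would be called as `summarize_dssp(secondary_structure[0])`. The counts quoted in the agent output (444 helix, 0 strand, 130 coil, 213 unassigned) come from the recorded run and are not reproduced by the toy input here.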
