-
Notifications
You must be signed in to change notification settings - Fork 8
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'experiments' of https://github.com/ur-whitelab/md-agent …
…into experiments
- Loading branch information
Showing
13 changed files
with
5,436 additions
and
0 deletions.
There are no files selected for viewing
546 changes: 546 additions & 0 deletions
546
notebooks/experiments_new_prompts/experiment_k1/gpt-3.5-turbo-0125/exp_14.ipynb
Large diffs are not rendered by default.
Oops, something went wrong.
450 changes: 450 additions & 0 deletions
450
notebooks/experiments_new_prompts/experiment_k1/gpt-3.5-turbo-0125/exp_15.ipynb
Large diffs are not rendered by default.
Oops, something went wrong.
398 changes: 398 additions & 0 deletions
398
notebooks/experiments_new_prompts/experiment_k1/gpt-3.5-turbo-0125/exp_25.ipynb
Large diffs are not rendered by default.
Oops, something went wrong.
255 changes: 255 additions & 0 deletions
255
notebooks/experiments_new_prompts/experiment_k1/gpt-3.5-turbo-0125/exp_3.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,255 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import datetime\n", | ||
"import os\n", | ||
"from mdagent import MDAgent\n", | ||
"import matplotlib.pyplot as plt" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"date and time: 2024-09-27\n", | ||
"time: 10:10:13\n", | ||
"LLM: gpt-3.5-turbo-0125 \n", | ||
"Temperature: 0.1\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"prompt3 = \"Download the PDB file for protein 1GZX. Then, analyze the secondary structure of \\\n", | ||
" the protein and provide information on how many helices, sheets, and other components are present. Get the gene names for this protein.\"\n", | ||
"llm_var = \"gpt-3.5-turbo-0125\"\n", | ||
"tools = \"all\"\n", | ||
"agent = MDAgent(agent_type=\"Structured\", model=llm_var, top_k_tools=tools,ckpt_dir='ckpt_70')\n", | ||
"now = datetime.datetime.now()\n", | ||
"date = now.strftime(\"%Y-%m-%d\")\n", | ||
"print(\"date and time:\",date)\n", | ||
"time = now.strftime(\"%H:%M:%S\")\n", | ||
"print(\"time:\",time)\n", | ||
"print(\"LLM: \",agent.llm.model_name,\"\\nTemperature: \",agent.llm.temperature)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Thought: To address the question, I need to download the PDB file for protein 1GZX, analyze its secondary structure to determine the number of helices, sheets, and other components present, and retrieve the gene names associated with this protein.\n", | ||
"\n", | ||
"Action:\n", | ||
"```\n", | ||
"{\n", | ||
" \"action\": \"PDBFileDownloader\",\n", | ||
" \"action_input\": {\n", | ||
" \"query\": \"1GZX\"\n", | ||
" }\n", | ||
"}\n", | ||
"```PDB file found with this ID: 1GZX\n", | ||
"Thought: Now that I have successfully downloaded the PDB file for protein 1GZX, the next step is to analyze its secondary structure to determine the number of helices, sheets, and other components present. After that, I will retrieve the gene names associated with this protein.\n", | ||
"\n", | ||
"Action:\n", | ||
"```\n", | ||
"{\n", | ||
" \"action\": \"ComputeDSSP\",\n", | ||
" \"action_input\": {\n", | ||
" \"traj_file\": \"1GZX_101015\",\n", | ||
" \"top_file\": null,\n", | ||
" \"target_frames\": \"all\"\n", | ||
" }\n", | ||
"}\n", | ||
"```Action:\n", | ||
"```\n", | ||
"{\n", | ||
" \"action\": \"GetGeneNames\",\n", | ||
" \"action_input\": {\n", | ||
" \"query\": \"1GZX\"\n", | ||
" }\n", | ||
"}\n", | ||
"```Final Answer: \n", | ||
"The analysis of protein 1GZX reveals the following secondary structure components:\n", | ||
"- Helices: 444 residues\n", | ||
"- Strands: 0 residues\n", | ||
"- Coils: 130 residues\n", | ||
"- Unassigned residues or non-protein residues: 213 residues\n", | ||
"\n", | ||
"The gene names associated with protein 1GZX are:\n", | ||
"- HBB\n", | ||
"- HBA1\n", | ||
"- HBA2\n", | ||
"\n", | ||
"These are the results based on the analysis of the PDB file and gene names retrieval for protein 1GZX." | ||
] | ||
}, | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"({'input': '\\n You are an expert molecular dynamics scientist, and\\n your task is to respond to the question or\\n solve the problem to the best of your ability using\\n the provided tools.\\n\\n You can only respond with a single complete\\n \\'Thought, Action, Action Input\\' format\\n OR a single \\'Final Answer\\' format.\\n\\n Complete format:\\n Thought: (reflect on your progress and decide what to do next)\\n Action:\\n ```\\n {\\n \"action\": (the action name, it should be the name of a tool),\\n \"action_input\": (the input string for the action)\\n }\\n \\'\\'\\'\\n\\n OR\\n\\n Final Answer: (the final response to the original input\\n question, once all steps are complete)\\n\\n You are required to use the tools provided,\\n using the most specific tool\\n available for each action.\\n Your final answer should contain all information\\n necessary to answer the question and its subquestions.\\n Before you finish, reflect on your progress and make\\n sure you have addressed the question in its entirety.\\n\\n If you are asked to continue\\n or reference previous runs,\\n the context will be provided to you.\\n If context is provided, you should assume\\n you are continuing a chat.\\n\\n Here is the input:\\n Previous Context: None\\n Question: Download the PDB file for protein 1GZX. Then, analyze the secondary structure of the protein and provide information on how many helices, sheets, and other components are present. Get the gene names for this protein. ',\n", | ||
" 'output': 'Final Answer: \\nThe analysis of protein 1GZX reveals the following secondary structure components:\\n- Helices: 444 residues\\n- Strands: 0 residues\\n- Coils: 130 residues\\n- Unassigned residues or non-protein residues: 213 residues\\n\\nThe gene names associated with protein 1GZX are:\\n- HBB\\n- HBA1\\n- HBA2\\n\\nThese are the results based on the analysis of the PDB file and gene names retrieval for protein 1GZX.'},\n", | ||
" '8J6BS6JZ')" | ||
] | ||
}, | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"agent.run(prompt3)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Final Answer\n", | ||
"\n", | ||
"Action:\n", | ||
"{\n", | ||
" \"action\": \"Final Answer\",\n", | ||
" \"action_input\": \"The secondary structure analysis of protein 1GZX reveals the following components: \\n- Helices: 444 residues \\n- Strands: 0 residues \\n- Coils: 130 residues \\n- Unassigned or non-protein residues: 213 residues\"\n", | ||
"}\n", | ||
"\n", | ||
"\n", | ||
"Checkpint directory: /gpfs/fs2/scratch/jmedina9/mdagent/md-agent/ckpt/ckpt_70" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"date and time: 2024-09-10\n", | ||
"time: 10:13:19\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"now = datetime.datetime.now()\n", | ||
"date = now.strftime(\"%Y-%m-%d\")\n", | ||
"print(\"date and time:\",date)\n", | ||
"time = now.strftime(\"%H:%M:%S\")\n", | ||
"print(\"time:\",time)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Files found in registry: 1GZX_173238: PDB file downloaded from RSCB\n", | ||
" PDBFile ID: 1GZX_173238\n", | ||
" rec0_173240: dssp values for trajectory with id: 1GZX_173238\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"registry = agent.path_registry\n", | ||
"print(('\\n').join(registry.list_path_names_and_descriptions().split(',')))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Number of chains: 12\n", | ||
"Number of sheets: 0\n", | ||
"Number of helices: 444\n", | ||
"Number of coils: 130\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"import mdtraj as md\n", | ||
"path = registry.get_mapped_path(\"1GZX_173238\")\n", | ||
"path = registry.ckpt_dir + path.split('ckpt_70')[-1]\n", | ||
"traj = md.load(path)\n", | ||
"top = traj.topology\n", | ||
"number_of_chains = top.n_chains\n", | ||
"secondary_structure = md.compute_dssp(traj,simplified=True)\n", | ||
"print(\"Number of chains: \",number_of_chains)\n", | ||
"print(\"Number of sheets: \",len([i for i in secondary_structure[0] if i == 'E']))\n", | ||
"print(\"Number of helices: \",len([i for i in secondary_structure[0] if i == 'H']))\n", | ||
"print(\"Number of coils: \",len([i for i in secondary_structure[0] if i == 'C']))\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Experiment Result:\n", | ||
"### Completed without Exception or TimeOut Errors ✅\n", | ||
"### Attempted all necessary steps ✅\n", | ||
"### Logic make sense ✅\n", | ||
"### No hallucination ✅\n", | ||
"### Correct Answer ✅\n", | ||
"### Number of steps completed: 2\n", | ||
"### % steps completed: 100%\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "mdagent2", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.9" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
Oops, something went wrong.