Benchmarks (#62)
marcosfelt authored Aug 3, 2020
1 parent 5e4539d commit a838ae6
Showing 96 changed files with 14,304 additions and 8,255 deletions.
13 changes: 0 additions & 13 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,19 +7,6 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v1
with:
python-version: 3.7.8
- name: Cache poetry environment
uses: actions/cache@v1
env:
cache-name: cache-poetry
with:
path: /github/home/.cache/pypoetry/virtualenvs
key: ${{ runner.os }}-poetry-${{ hashFiles('**/pyproject.toml') }}
restore-keys: |
${{ runner.os }}-poetry-
- name: Install
uses: abatilo/[email protected]
with:
3 changes: 2 additions & 1 deletion .gitignore
Expand Up @@ -117,5 +117,6 @@ venv.bak/
tmp_files
Pytest*

# Snar benchmark
# Benchmark temporary files
.snar_benchmark
.cn_benchmark
8 changes: 4 additions & 4 deletions Dockerfile
@@ -1,10 +1,10 @@
FROM python:3.7

WORKDIR /summit_user
COPY setup.py requirements.txt ./
COPY requirements.txt ./
# Have to install numpy first due to Gryffin
RUN pip install numpy==1.18.0 && pip install -r requirements.txt
RUN pip install numpy==1.18.0 && pip install -r requirements.txt
COPY setup.py ./
COPY summit summit/
RUN pip install .
ENTRYPOINT ["python"]

ENTRYPOINT ["python"]
4 changes: 2 additions & 2 deletions README.md
Expand Up @@ -62,8 +62,8 @@ You can change the tag from `latest` to whatever is most appropriate (e.g., the
Then, to run a container, here is an example with the SnAr experiment code. The home directory of the container is called `summit_user`, so we mount the current working directory into that folder. We remove the container upon finishing using `--rm` and make it interactive using `-it` (remove this if you just want the container to run in the background). [Neptune.ai](https://neptune.ai/) is used to track the experiments, so the API token is passed in. Finally, I specify the image name and tag before naming the Python file I want to run.
```
export token= #place your neptune token here
sudo docker run -v `pwd`/:/summit_user --rm -it --env NEPTUNE_API_TOKEN=$token summit:snar_benchmark snar_experiment_2.py
export NEPTUNE_API_TOKEN= #place your neptune token here
sudo docker run -v `pwd`/:/summit_user --rm -it --env NEPTUNE_API_TOKEN=$NEPTUNE_API_TOKEN summit:snar_benchmark snar_experiment_2.py
```
Singularity (for running Docker containers on the HPC):
16 changes: 12 additions & 4 deletions README.rst
Expand Up @@ -69,20 +69,28 @@ Docker

Sometimes, it is easier to run tests using a Docker container (e.g., on compute clusters). Here are the commands to build and run the Docker containers using the included Dockerfile. The container entrypoint is python, so you just need to specify the file name.

To build the container:
To build the container and upload it to Docker Hub:

.. code-block::
docker build . -t summit:latest
docker build . -t marcosfelt/summit:latest
docker push marcosfelt/summit:latest
You can change the tag from ``latest`` to whatever is most appropriate (e.g., the branch name).
You can change the tag from ``latest`` to whatever is most appropriate (e.g., the branch name). I have found that this takes up a lot of space on disk, so I have been running the commands on our private servers.

Then, to run a container, here is an example with the SnAr experiment code. The home directory of the container is called ``summit_user``, so we mount the current working directory into that folder. We remove the container upon finishing using ``--rm`` and make it interactive using ``-it`` (remove this if you just want the container to run in the background). `Neptune.ai <https://neptune.ai/>`_ is used to track the experiments, so the API token is passed in. Finally, I specify the image name and tag before naming the Python file I want to run.

.. code-block::
export token= #place your neptune token here
sudo docker run-v `pwd`/:/summit_user --rm -it --env NEPTUNE_API_TOKEN=$token summit:snar_benchmark snar_experiment.py
sudo docker run -v `pwd`/:/summit_user --rm -it --env NEPTUNE_API_TOKEN=$token summit:snar_benchmark snar_experiment_2.py
Singularity (for running Docker containers on the HPC):

.. code-block::
export NEPTUNE_API_TOKEN=
singularity exec -B `pwd`/:/summit_user docker://marcosfelt/summit:snar_benchmark snar_experiment.py
Releases
^^^^^^^^
Expand Up @@ -1643,7 +1643,8 @@
{
"cell_type": "markdown",
"metadata": {
"toc-hr-collapsed": true
"toc-hr-collapsed": true,
"toc-nb-collapsed": true
},
"source": [
"## 1. Tuning Kinetic Model"
Expand Up @@ -4325,7 +4326,8 @@
{
"cell_type": "markdown",
"metadata": {
"toc-hr-collapsed": true
"toc-hr-collapsed": true,
"toc-nb-collapsed": true
},
"source": [
"## 4. Stopping Criteria Investigation"
18 changes: 18 additions & 0 deletions data/README.md
@@ -0,0 +1,18 @@
# Descriptors Calculation

## Solvent Descriptors

The descriptors are from the paper "[Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis](https://pubs.rsc.org/en/content/articlelanding/2019/sc/c9sc01844a#!divAbstract)" by Amar et al.

## Ligand and Base Descriptors

Descriptors are calculated using [COSMOquick](https://www.3ds.com/products-services/biovia/products/molecular-modeling-simulation/solvation-chemistry/cosmoquick/). The QSPR & ADME option is used to calculate the sigma moment descriptors which are named as follows:

- 'area' for the zeroth sigma moment
- 'M2' for the second sigma moment
- 'M3' for the third sigma moment
- 'Macc3' for the hydrogen bond acceptor strength
- 'Mdon3' for the hydrogen bond donor strength

Additionally, the solubility in 2-MeTHF is predicted. All calculations are done at 25 °C.
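
As a minimal sketch of how these columns might be pulled out of a descriptor table such as `data/bases.csv`, the snippet below reads a CSV with Python's standard `csv` module and keeps only the five sigma-moment descriptors named above. The helper name `select_sigma_moments` and the inline two-row sample (a column subset of the TEA and TMG rows) are illustrative assumptions, not code from the repository.

```python
import csv
import io

# Column names as listed in this README.
SIGMA_MOMENT_COLUMNS = ["area", "M2", "M3", "Macc3", "Mdon3"]

# Illustrative sample: a column subset of two rows from data/bases.csv.
SAMPLE_CSV = """name,cas_number,area,M3,M2,Macc3,Mdon3
TEA,121-44-8,162.2992,40.9469,25.8165,3.0278,0
TMG,80-70-6,165.5447,107.0287,81.4847,10.215,0.0169
"""


def select_sigma_moments(csv_file):
    """Map each compound name to its sigma-moment descriptors as floats."""
    reader = csv.DictReader(csv_file)
    return {
        row["name"]: {col: float(row[col]) for col in SIGMA_MOMENT_COLUMNS}
        for row in reader
    }


descriptors = select_sigma_moments(io.StringIO(SAMPLE_CSV))
print(descriptors["TEA"])
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` sample.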

File renamed without changes.
8 changes: 8 additions & 0 deletions data/bases.csv
@@ -0,0 +1,8 @@
name,cas_number,molweight,ringbonds,alkylatoms,alkylgroups,rotatable_bonds,internal_hbonds,conjugated_bonds,rotbsdmod,tmult,nbr11,rbwring,fragments,zwitterion_in_water,N_total,massprotbond,SIMILARITY_CFDB,natoms,hydrogens,bonds,largering,molrefraction,alkane,area,M3,M4,M2,e_dielec,mu_gas,M5,M6,volume,Macc1,Macc2,Macc3,Macc4,Mdon1,Mdon2,Mdon3,Mdon4,avratio,ovality,mu_self,h_hb,h_int,h_vdw,E_Ring,mu_vacuo,mu_water,N_ring,ringatombondratio,N_amino,solubility_2methf
TEA,121-44-8,101.19,0,6,3,3,0,0,2.25,1.03,0,0,0,0,1,25.2975,9,22,15,6,0,33.33,-1,162.2992,40.9469,117.8566,25.8165,-2.4981,9.3499,325.962,916.7185,160.0425,0.0341,3.4096,3.0278,2.6461,0,0,0,0,1.01410063,1.1385236,0.54547,0,-6.46894,-8.33131,1.14099,3.66065141,5.68137656,0,0,1,642.2973283
TMG,80-70-6,115.18,0,4,4,2,0,3,1,1.11,0,0,4,0,3,38.39333333,3.375,21,13,7,0,34.71,-1,165.5447,107.0287,247.8036,81.4847,-17.395,-1.4938,500.1264,1111.256,164.7257,0.135,13.6049,10.215,7.3272,-0.0015,0.268,0.0169,0,1.0049719,1.13917448,5.17492,-0.87177,-3.04676,-7.79853,1.14099,9.85317291,4.3980757,0,0,0,534.01544123
BTMG,29166-72-1,171.28,0,8,5,2,0,3,1,1.07,0,0,4,0,3,57.09333333,3.25,33,21,11,0,53.23,-1,227.3523,14.3676,27.7996,30.554,-10.1045,6.1173,30.5514,48.7875,247.2517,0.0223,2.24,1.1196,0.3475,-0.0007,0.1004,0.0127,0,0.91951764,1.19340825,-1.41366,-0.01563,-9.62175,-11.54402,0.94887,2.24025962,6.80687972,1,0,0,839.81215
DBU,6674-22-2,152.24,12,0,0,0,0,2,2.31,0,0,9.75,0,0,2,152.24,9,27,16,12,7,45,-1,192.4693,82.0661,176.2169,59.8367,-9.7156,6.8437,367.0938,803.3328,200.9029,0.0968,9.7478,7.42,5.3763,0,0,0,0,0.95802151,1.16025435,0.67427,0,-6.9543,-9.46913,-0.97234,4.6460236,2.4873413,11,1.09090909,0,1055.82799
MTBD,84030-20-6,153.23,11,1,1,0,0,3,1.34,0,0,8.25,0,0,3,153.23,9,26,15,12,6,43.98,-1,191.3241,65.4592,125.0217,59.0952,-9.6144,6.7217,224.3194,433.4696,197.3613,0.0816,8.3041,5.606,3.6104,0,0,0,0,0.96941042,1.16710758,0.48121,0,-7.21452,-9.39497,-0.78022,4.49579927,3.19792659,10,1.1,0,1073.99144
BTTP,161118-67-8,312.44,15,4,1,3,0,4,5.88,1.57,0,13.5,1,0,4,78.11,9,54,33,23,5,91.61,-1,316.128,7.053,11.022,24.9502,-3.868,17.1392,12.589,20.9471,410.3591,0.0028,0.2992,0.1964,0.1335,0,0,0,0,0.77036917,1.18377581,-7.57579,0,-17.41933,-16.32376,-1.74083,-3.96144201,4.11196719,15,1,0,460.51781315
P2Et,165535-45-5,339.4,0,12,11,7,0,8,2.75,1.36,0,0,2,0,7,42.425,5.47619048,56,35,20,0,98.44,-1,331.1344,94.9924,216.1193,70.8635,-13.477,8.0112,474.2956,1090.958,442.3334,0.1022,10.2818,8.2312,6.3617,0,0,0,0,0.74860818,1.17947033,-1.92835,0,-12.1581,-16.83079,1.14099,2.08497464,4.26013516,0,0,0,948.44841
Binary file added data/baumgartner/catalyst_prices.xlsx
Binary file not shown.
Binary file added data/baumgartner/descriptors_ligands_bases.xls
Binary file not shown.
4 changes: 4 additions & 0 deletions data/baumgartner/ligands.csv
@@ -0,0 +1,4 @@
name,cas_number,molweight,ringbonds,alkylatoms,alkylgroups,rotatable_bonds,internal_hbonds,conjugated_bonds,rotbsdmod,tmult,nbr11,rbwring,fragments,zwitterion_in_water,N_total,massprotbond,SIMILARITY_CFDB,natoms,hydrogens,bonds,largering,molrefraction,alkane,area,M3,M4,M2,e_dielec,mu_gas,M5,M6,volume,Macc1,Macc2,Macc3,Macc4,Mdon1,Mdon2,Mdon3,Mdon4,avratio,ovality,mu_self,h_hb,h_int,h_vdw,E_Ring,mu_vacuo,mu_water,N_ring,ringatombondratio,N_amino,solubility_2_methf
tBuXPhos,564483-19-8,424.64,12,17,5,5,0,12,2.5,1.2,0,0,2,0,0,70.77333333,6.16666667,75,45,31,6,138.05,-1,460.7543,30.8413,63.064,67.2057,-8.626,18.6978,89.9089,165.8158,591.7531,0.0366,3.7558,2.3043,1.3457,0,0,0,0,0.77862592,1.35173506,-9.43171,0,-22.11317,-23.86729,-1.16447,-3.94219353,4.91779634,12,1,0,421.25040226
tBuBrettPhos,1160861-53-9,484.7,12,19,7,7,0,12,4,1.14,0,0,5,0,0,60.5875,4.79411765,83,49,35,6,149.47,-1,518.8408,39.4424,79.085,89.8738,-12.666,18.0428,95.3095,160.9703,658.5981,0.0491,5.0578,2.5548,1.182,0,0.0035,0,0,0.78779577,1.41732576,-9.81725,-0.00523,-23.72584,-26.64862,-1.16447,-3.39162618,5.67691025,12,1,0,781.11247064
AlPhos,1805783-60-1,815.06,42,14,5,10,0,18,13.76,1.27,1,16.5,13,0,0,74.09636364,3.55172414,125,67,66,6,230.27,-1,819.933,83.2017,127.8013,129.0808,-0.395,38.355,186.6409,333.96,991.3742,0.0665,6.7723,4.2959,2.5919,0,0.0018,0,0,0.82706712,1.70530671,-16.89674,-0.00614,-36.21253,-40.27856,-5.77539,-8.82368958,-0.2661399,36,1.16666667,0,880.74916884
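
Note that the solubility column is spelled `solubility_2methf` in `data/bases.csv` but `solubility_2_methf` in `ligands.csv`, so the two tables cannot be stacked by header without a rename. A minimal sketch of one way to handle this with the standard `csv` module is shown below; the alias table, the helper name `read_normalized`, and the single-row samples are assumptions made for illustration.

```python
import csv
import io

# Hypothetical alias table: maps a header variant to one canonical name.
COLUMN_ALIASES = {"solubility_2methf": "solubility_2_methf"}

# Illustrative one-row samples using the header spelling of each file.
BASES_SAMPLE = "name,solubility_2methf\nTEA,642.2973283\n"
LIGANDS_SAMPLE = "name,solubility_2_methf\ntBuXPhos,421.25040226\n"


def read_normalized(csv_file):
    """Read rows as dicts, renaming any aliased column to its canonical name."""
    reader = csv.DictReader(csv_file)
    return [
        {COLUMN_ALIASES.get(key, key): value for key, value in row.items()}
        for row in reader
    ]


combined = read_normalized(io.StringIO(BASES_SAMPLE)) + read_normalized(
    io.StringIO(LIGANDS_SAMPLE)
)
print([row["name"] for row in combined])
```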
11 changes: 11 additions & 0 deletions data/baumgartner/ligands_bases.csv
@@ -0,0 +1,11 @@
cas_number,name,type
564483-19-8,tBuXPhos,ligand
1160861-53-9,tBuBrettPhos,ligand
1805783-60-1,AlPhos,ligand
121-44-8,TEA,base
80-70-6,TMG,base
29166-72-1,BTMG,base
6674-22-2,DBU,base
84030-20-6,MTBD,base
161118-67-8,BTTP,base
165535-45-5,P2Et,base
11 changes: 11 additions & 0 deletions data/baumgartner/ligands_bases.smi
@@ -0,0 +1,11 @@
CC(C)C1=CC(=C(C(=C1)C(C)C)C2=CC=CC=C2P(C(C)(C)C)C(C)(C)C)C(C)C 564483-19-8
CC(C)C1=CC(=C(C(=C1)C(C)C)C2=C(C=CC(=C2P(C(C)(C)C)C(C)(C)C)OC)OC)C(C)C 1160861-53-9
CCCCC1=C(C(=C(C(=C1F)F)C2=C(C=C(C(=C2C(C)C)C3=C(C(=CC=C3)OC)P(C45CC6CC(C4)CC(C6)C5)C78CC9CC(C7)CC(C9)C8)C(C)C)C(C)C)F)F 1805783-60-1
CCN(CC)CC 121-44-8
CN(C)C(=N)N(C)C 80-70-6
CC(C)(C)N=C(N(C)C)N(C)C 29166-72-1
C1CCC2=NCCCN2CC1 6674-22-2
CN1CCCN2C1=NCCC2 84030-20-6
CC(C)(C)N=P(N1CCCC1)(N2CCCC2)N3CCCC3 161118-67-8
CCN=P(N=P(N(C)C)(N(C)C)N(C)C)(N(C)C)N(C)C 165535-45-5
CC1CCCO1 2-methyltetrahydrofuran
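
Each line of this `.smi` file holds a SMILES string followed by an identifier, which is a CAS number for the ligands and bases and a plain name for the final solvent entry. The following is a minimal parsing sketch in plain Python; the function name `parse_smi` and the three-line sample are illustrative assumptions, and no cheminformatics library is involved.

```python
# Illustrative sample: three lines copied from ligands_bases.smi.
SAMPLE_SMI = """CCN(CC)CC 121-44-8
C1CCC2=NCCCN2CC1 6674-22-2
CC1CCCO1 2-methyltetrahydrofuran
"""


def parse_smi(text):
    """Return {identifier: SMILES} from whitespace-separated .smi lines."""
    molecules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The SMILES string contains no spaces, so split once on whitespace.
        smiles, identifier = line.split(maxsplit=1)
        molecules[identifier] = smiles
    return molecules


molecules = parse_smi(SAMPLE_SMI)
print(molecules["121-44-8"])
```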
Binary file added data/baumgartner/solubility_ligands_bases.xls
Binary file not shown.
File renamed without changes.
30 changes: 30 additions & 0 deletions experiments/README.md
@@ -0,0 +1,30 @@
# Experiments

This is the code used to run and analyze the experiments in the paper. Each folder has code to run experiments, some code for visualization and a Jupyter notebook that contains all the plots used in the paper.

## Steps to run on the HPC

1. Commit changes and push to GitHub.
2. Build the Docker container and push it to Docker Hub. I do this on our private server since it requires quite a bit of disk space.

```
docker build . -t marcosfelt/summit:tag
docker push marcosfelt/summit:tag
```
Replace `tag` with the name of the branch.
3. Log into the HPC and pull the container using singularity. It's important to do this, so each experiment doesn't have to pull the container.
```
singularity run docker://marcosfelt/summit:tag
```
Replace `tag` with the tag you used in step 2.
4. Run the test script. For the C-N benchmark for example:
```
export SSH_USER= # put your HPC login username here
export SSH_PASSWORD= # put your HPC login password here
export NEPTUNE_API_TOKEN= # put your Neptune API Token here
poetry run pytest test_cn_experiment_MO.py
```
The scripts automatically log in to the HPC, submit the jobs to Slurm and make sure Neptune is set up for experiment tracking.
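
Because step 4 depends on three environment variables, a check before launching can catch an unset value early. The sketch below is an assumption about how such a check could look, not code from the test scripts; the function name `missing_variables` is hypothetical, and only the variable names come from the step above.

```python
import os

# Variable names taken from step 4 above.
REQUIRED_VARIABLES = ("SSH_USER", "SSH_PASSWORD", "NEPTUNE_API_TOKEN")


def missing_variables(environ, required=REQUIRED_VARIABLES):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


# Example with an explicit mapping; pass os.environ to check the real shell.
example = {"SSH_USER": "alice", "NEPTUNE_API_TOKEN": ""}
print(missing_variables(example))
print(missing_variables(os.environ))
```

An empty value counts as missing here, since `export NAME=` with nothing after the equals sign sets the variable to an empty string.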
6,928 changes: 6,928 additions & 0 deletions experiments/cn_benchmark/cn_benchmark_visualization.ipynb

Large diffs are not rendered by default.
