diff --git a/CHANGES b/CHANGES index 9fb6f078..6147a57a 100644 --- a/CHANGES +++ b/CHANGES @@ -27,6 +27,8 @@ Enhancements simulations (issue #168, PR #179) * new EnsembleAnalysis framework for collecting data from MDPOW simulations (issue #168, PR #179) +* new SolvationAnalysis for quantifing the number of solvent molecules + within a given distance (#195) Fixes diff --git a/doc/sphinx/source/analysis.txt b/doc/sphinx/source/analysis.txt index 797d1f89..ff547e70 100644 --- a/doc/sphinx/source/analysis.txt +++ b/doc/sphinx/source/analysis.txt @@ -17,4 +17,5 @@ are for users who wish to construct their own analyses. analysis/ensemble analysis/ensemble_analysis - analysis/dihedral + analysis/solvation + analysis/dihedral \ No newline at end of file diff --git a/doc/sphinx/source/analysis/solvation.txt b/doc/sphinx/source/analysis/solvation.txt new file mode 100644 index 00000000..5e7a2127 --- /dev/null +++ b/doc/sphinx/source/analysis/solvation.txt @@ -0,0 +1,12 @@ +======================== +Solvation Shell Analysis +======================== + +Analyzes the number of solvent molecules within given distances of the solute. + +.. versionadded:: 0.8.0 + +.. autoclass:: mdpow.analysis.solvation.SolvationAnalysis + :members: + + .. automethod:: run \ No newline at end of file diff --git a/mdpow/analysis/solvation.py b/mdpow/analysis/solvation.py new file mode 100644 index 00000000..6ed6137a --- /dev/null +++ b/mdpow/analysis/solvation.py @@ -0,0 +1,79 @@ +# MDPOW: solvation.py +# 2021 Alia Lescoulie + +from typing import List + +import numpy as np +import pandas as pd + +import MDAnalysis as mda +from MDAnalysis.lib.distances import capped_distance + +from .ensemble import EnsembleAnalysis, Ensemble, EnsembleAtomGroup + +import logging + +logger = logging.getLogger('mdpow.dihedral') + + +class SolvationAnalysis(EnsembleAnalysis): + """Measures the number of solvent molecules withing the given distances + in an :class:`~mdpow.analysis.ensemble.Ensemble` . + + :Parameters: + + *solute* + An :class:`~mdpow.analysis.ensemble.EnsembleAtom` containing the solute + used to measure distance. + + *solvent* + An :class:`~mdpow.analysis.ensemble.EnsembleAtom` containing the solvents + counted in by the distance measurement. Each solvent atom is counted by the + distance calculation. + + *distances* + Array like of the cutoff distances around the solute measured in Angstroms. + + The data is returned in a :class:`pandas.DataFrame` with observations sorted by + distance, solvent, interaction, lambda, time. + + .. rubric:: Example + + Typical Workflow:: + + ens = Ensemble(dirname='Mol') + solvent = ens.select_atoms('resname SOL and name OW') + solute = ens.select_atoms('resname UNK') + + solv_dist = SolvationAnalysis(solute, solvent, [1.2, 2.4]).run(stop=10) + + """ + def __init__(self, solute: EnsembleAtomGroup, solvent: EnsembleAtomGroup, distances: List[float]): + self.check_groups_from_common_ensemble([solute, solvent]) + super(SolvationAnalysis, self).__init__(solute.ensemble) + self._solute = solute + self._solvent = solvent + self._dists = distances + + def _prepare_ensemble(self): + self._col = ['distance', 'solvent', 'interaction', + 'lambda', 'time', 'N_solvent'] + self.results = pd.DataFrame(columns=self._col) + self._res_dict = {key: [] for key in self._col} + + def _single_frame(self): + solute = self._solute[self._key] + solvent = self._solvent[self._key] + pairs, distances = capped_distance(solute.positions, solvent.positions, + max(self._dists), box=self._ts.dimensions) + solute_i, solvent_j = np.transpose(pairs) + for d in self._dists: + close_solv_atoms = solvent[solvent_j[distances < d]] + result = [d, self._key[0], self._key[1],self._key[2], + self._ts.time, close_solv_atoms.n_atoms] + for i in range(len(self._col)): + self._res_dict[self._col[i]].append(result[i]) + + def _conclude_ensemble(self): + for k in self._col: + self.results[k] = self._res_dict[k] diff --git a/mdpow/tests/test_solv_shell.py b/mdpow/tests/test_solv_shell.py new file mode 100644 index 00000000..259b3247 --- /dev/null +++ b/mdpow/tests/test_solv_shell.py @@ -0,0 +1,59 @@ +import numpy as np +import pandas as pd + +from . import tempdir as td + +import py.path + +import pybol +import pytest + +from numpy.testing import assert_almost_equal + +from ..analysis.ensemble import Ensemble + +from ..analysis.solvation import SolvationAnalysis + +from pkg_resources import resource_filename + +RESOURCES = py.path.local(resource_filename(__name__, 'testing_resources')) +MANIFEST = RESOURCES.join("manifest.yml") + + +class TestSolvShell(object): + def setup(self): + self.tmpdir = td.TempDir() + self.m = pybol.Manifest(str(RESOURCES / 'manifest.yml')) + self.m.assemble('example_FEP', self.tmpdir.name) + self.ens = Ensemble(dirname=self.tmpdir.name, solvents=['water']) + self.solute = self.ens.select_atoms('not resname SOL') + self.solvent = self.ens.select_atoms('resname SOL and name OW') + + def teardown(self): + self.tmpdir.dissolve() + + def test_dataframe(self): + solv = SolvationAnalysis(self.solute, self.solvent, [1.2]).run(start=0, stop=4, step=1) + assert isinstance(solv.results, pd.DataFrame) + + for d in solv.results['distance']: + assert d == 1.2 + for s in solv.results['solvent']: + assert s == 'water' + for i in solv.results['interaction'][:12]: + assert i == 'Coulomb' + + @pytest.fixture(scope='class') + def solvation_analysis_list_results(self): + self.setup() # Won't have solute and solvent without this + return SolvationAnalysis(self.solute, self.solvent, [2, 10]).run(start=0, stop=4, step=1) + + @pytest.mark.parametrize("d,ref_mean,ref_std", [(2, 1.10714285,2.07604166), + (10, 5306.89285714, 129.16720594)]) + def test_selection(self, solvation_analysis_list_results, d, ref_mean, ref_std): + results = solvation_analysis_list_results.results + mean = np.mean(results.loc[results['distance'] == d]['N_solvent']) + std = np.std(results.loc[results['distance'] == d]['N_solvent']) + + assert mean == pytest.approx(ref_mean) + assert std == pytest.approx(ref_std)