Merge branch 'enhancement/add_decriptor_tutorial' into 'dev'
Enhancement/add decriptor tutorial

See merge request cdd/QSPRpred!171
HellevdM committed Feb 9, 2024
2 parents 9904fc4 + 5614c70 commit 597f9c0
Showing 8 changed files with 1,806 additions and 59 deletions.
5 changes: 1 addition & 4 deletions docs/cli_usage.rst
@@ -14,10 +14,7 @@ e.g. the help message for the :code:`QSPRpred.data_CLI` script can be shown as f
A simple command-line workflow to prepare your dataset and train QSPR models is given below (see :ref:`CLI Example`).

If you want more control over the inputs and outputs or want to customize QSPRpred for your purpose,
you can also use the Python API directly (see `source code <https://github.com/CDDLeiden/QSPRpred/tree/main/tutorials>`_).
Here you can find a tutorial with a Jupyter notebook illustrating some common use cases in the project source code.
Make sure to download the tutorial folder to follow the examples in this CLI tutorial.

you can also use the Python API directly (see `tutorials <https://github.com/CDDLeiden/QSPRpred/tree/main/tutorials>`_).

CLI Example
***********
55 changes: 51 additions & 4 deletions docs/features.rst
@@ -52,7 +52,7 @@ Overview of available features
* :class:`~qsprpred.data.descriptors.fingerprints.AtomPairFP`: AtomPairFP
* :class:`~qsprpred.data.descriptors.fingerprints.AvalonFP`: AvalonFP
* :class:`~qsprpred.data.descriptors.fingerprints.LayeredFP`: LayeredFP
* :class:`~qsprpred.data.descriptors.fingerprints.MACCsFP`: MACCsFP
* :class:`~qsprpred.data.descriptors.fingerprints.MaccsFP`: MaccsFP
* :class:`~qsprpred.data.descriptors.fingerprints.MorganFP`: MorganFP
* :class:`~qsprpred.data.descriptors.fingerprints.PatternFP`: PatternFP
* :class:`~qsprpred.data.descriptors.fingerprints.RDKitFP`: RDKitFP
@@ -161,7 +161,7 @@ Overview of available features

.. dropdown:: Model Assessors

:class:`~qsprpred.models.assessment_methods.ModelAssessor`: Base class for model assessors.
:class:`~qsprpred.models.assessment.methods.ModelAssessor`: Base class for model assessors.

Model assessors are used to assess the performance of models.
More information can be found in the `model assessment tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/modelling/model_assessment.ipynb>`_.
@@ -170,8 +170,8 @@ Overview of available features

.. tab-item:: Core

* :class:`~qsprpred.models.assessment_methods.CrossValAssessor`: CrossValAssessor
* :class:`~qsprpred.models.assessment_methods.TestSetAssessor`: TestSetAssessor
* :class:`~qsprpred.models.assessment.methods.CrossValAssessor`: CrossValAssessor
* :class:`~qsprpred.models.assessment.methods.TestSetAssessor`: TestSetAssessor

.. dropdown:: Hyperparameter Optimizers

@@ -209,3 +209,50 @@ Overview of available features
* :class:`~qsprpred.plotting.classification.MetricsPlot`: MetricsPlot
* :class:`~qsprpred.plotting.classification.ConfusionMatrixPlot`: ConfusionMatrixPlot

.. dropdown:: Monitors

* :class:`~qsprpred.models.monitors.FitMonitor`: Base class for monitoring model fitting
* :class:`~qsprpred.models.monitors.AssessorMonitor`: Base class for monitoring model assessment (subclass of :class:`~qsprpred.models.monitors.FitMonitor`)
* :class:`~qsprpred.models.monitors.HyperparameterOptimizationMonitor`: Base class for monitoring hyperparameter optimization (subclass of :class:`~qsprpred.models.monitors.AssessorMonitor`)

Monitors are used to monitor the training of models.
More information can be found in the `model monitoring tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/advanced/modelling/monitoring.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.models.monitors.NullMonitor`: NullMonitor
* :class:`~qsprpred.models.monitors.ListMonitor`: ListMonitor
* :class:`~qsprpred.models.monitors.BaseMonitor`: BaseMonitor
* :class:`~qsprpred.models.monitors.FileMonitor`: FileMonitor
* :class:`~qsprpred.models.monitors.WandBMonitor`: WandBMonitor

.. dropdown:: Scaffolds

:class:`~qsprpred.data.chem.scaffolds.Scaffold`: Base class for scaffolds.

Class for calculating molecular scaffolds of different kinds

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.chem.scaffolds.Murcko`: Murcko
* :class:`~qsprpred.data.chem.scaffolds.BemisMurcko`: BemisMurcko

.. dropdown:: Clustering

:class:`~qsprpred.data.chem.clustering.MoleculeClusters`: Base class for clustering molecules.

Classes for clustering molecules

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.chem.clustering.RandomClusters`: RandomClusters
* :class:`~qsprpred.data.chem.clustering.ScaffoldClusters`: ScaffoldClusters
* :class:`~qsprpred.data.chem.clustering.FPSimilarityClusters`: FPSimilarityClusters
* :class:`~qsprpred.data.chem.clustering.FPSimilarityMaxMinClusters`: FPSimilarityMaxMinClusters
* :class:`~qsprpred.data.chem.clustering.FPSimilarityLeaderPickerClusters`: FPSimilarityLeaderPickerClusters
14 changes: 7 additions & 7 deletions qsprpred/data/descriptors/fingerprints.py
@@ -144,7 +144,7 @@ def __str__(self):

class MaccsFP(Fingerprint):
def __init__(self, nBits=167, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.nBits = nBits
self.kwargs = kwargs

@@ -169,7 +169,7 @@ def __str__(self):

class AvalonFP(Fingerprint):
def __init__(self, nBits=1024, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.nBits = nBits
self.kwargs = kwargs

@@ -194,7 +194,7 @@ def __str__(self):

class TopologicalFP(Fingerprint):
def __init__(self, nBits=2048, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.nBits = nBits
self.kwargs = kwargs

@@ -221,7 +221,7 @@ def __str__(self):

class AtomPairFP(Fingerprint):
def __init__(self, nBits=2048, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.nBits = nBits
self.kwargs = kwargs

@@ -248,7 +248,7 @@ def __str__(self):

class RDKitFP(Fingerprint):
def __init__(self, minPath=1, maxPath=7, nBits=2048, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.minPath = minPath
self.maxPath = maxPath
self.nBits = nBits
@@ -281,7 +281,7 @@ def __str__(self):

class PatternFP(Fingerprint):
def __init__(self, nBits=2048, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.nBits = nBits
self.kwargs = kwargs

@@ -308,7 +308,7 @@ def __str__(self):

class LayeredFP(Fingerprint):
def __init__(self, minPath=1, maxPath=7, nBits=2048, **kwargs):
super().__init__()
super().__init__(used_bits=list(range(nBits)))
self.minPath = minPath
self.maxPath = maxPath
self.nBits = nBits
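The change repeated across the hunks above is that each fingerprint constructor now passes the full range of bit indices to the base class instead of calling `super().__init__()` with no arguments. A minimal sketch of that pattern follows, assuming a hypothetical stand-in base class (`SketchFingerprint`) and a hypothetical `descriptors` property; neither is taken from QSPRpred, and only the `used_bits=list(range(nBits))` call mirrors the diff.

```python
class SketchFingerprint:
    """Hypothetical stand-in for the Fingerprint base class (not QSPRpred code)."""

    def __init__(self, used_bits=None):
        # Assumption: the base class records which bit indices are reported.
        self.used_bits = used_bits

    @property
    def descriptors(self):
        # Assumption: one descriptor name per tracked bit index.
        return [f"bit_{i}" for i in (self.used_bits or [])]


class SketchMaccsFP(SketchFingerprint):
    def __init__(self, nBits=167, **kwargs):
        # Mirrors the changed line in the diff: hand every bit index to the base class.
        super().__init__(used_bits=list(range(nBits)))
        self.nBits = nBits
        self.kwargs = kwargs


fp = SketchMaccsFP()
print(len(fp.descriptors))
```

In this sketch, a subclass that kept the old argument-free `super().__init__()` call would leave `used_bits` unset and report no bits, which is the kind of mismatch the explicit argument avoids.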
2 changes: 1 addition & 1 deletion qsprpred/data/descriptors/sets.py
@@ -446,7 +446,7 @@ def __init__(self, model: Type["QSPRModel"] | str):
Initialize the descriptorset with a `QSPRModel` object.
Args:
model: a fitted model instance or a path to the model's meta file
model (QSPRModel): a fitted model instance or a path to the model's meta file
"""
super().__init__()
if isinstance(model, str):
43 changes: 8 additions & 35 deletions qsprpred/data/descriptors/tests.py
@@ -163,41 +163,14 @@ def setUp(self):
super().setUp()
self.setUpPaths()

@parameterized.expand(
[
(
f"{desc_set}_{TargetTasks.MULTICLASS}",
desc_set,
[
{
"name": "CL",
"task": TargetTasks.MULTICLASS,
"th": [0, 1, 10, 1200],
}
],
)
for desc_set in DataSetsPathMixIn.getAllDescriptors()
]
+ [
(
f"{desc_set}_{TargetTasks.REGRESSION}",
desc_set,
[{"name": "CL", "task": TargetTasks.REGRESSION}],
)
for desc_set in DataSetsPathMixIn.getAllDescriptors()
]
+ [
(
f"{desc_set}_Multitask",
desc_set,
[
{"name": "CL", "task": TargetTasks.REGRESSION},
{"name": "fu", "task": TargetTasks.SINGLECLASS, "th": [0.3]},
],
)
for desc_set in DataSetsPathMixIn.getAllDescriptors()
]
)
@parameterized.expand([
(
f"{desc_set}_{TargetTasks.REGRESSION}",
desc_set,
[{"name": "CL", "task": TargetTasks.REGRESSION}],
)
for desc_set in DataSetsPathMixIn.getAllDescriptors()
])
def testDescriptorsAll(self, _, desc_set, target_props):
"""Tests all available descriptor sets.
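The reduced `parameterized.expand` argument in this hunk is a single list comprehension that produces one `(name, descriptor set, target properties)` tuple per descriptor set. A minimal sketch of that construction follows, assuming hypothetical string stand-ins (`desc_sets`, `REGRESSION`) in place of `DataSetsPathMixIn.getAllDescriptors()` and `TargetTasks.REGRESSION`.

```python
# Hypothetical stand-ins; the real test uses descriptor set objects and the TargetTasks enum.
REGRESSION = "REGRESSION"
desc_sets = ["MorganFP", "MaccsFP", "AvalonFP"]

# One regression-only test case per descriptor set, as in the new decorator argument.
cases = [
    (
        f"{desc_set}_{REGRESSION}",
        desc_set,
        [{"name": "CL", "task": REGRESSION}],
    )
    for desc_set in desc_sets
]

for name, desc_set, target_props in cases:
    print(name, desc_set, target_props)
```

With three descriptor sets this yields three cases, whereas the removed version built three cases per descriptor set (multiclass, regression, and multitask).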
5 changes: 1 addition & 4 deletions qsprpred/extra/data/descriptors/tests.py
@@ -267,10 +267,7 @@ def setUp(self):
(
f"{desc_set}",
desc_set,
[
{"name": "CL", "task": TargetTasks.REGRESSION},
{"name": "fu", "task": TargetTasks.SINGLECLASS, "th": [0.3]},
],
[{"name": "CL", "task": TargetTasks.REGRESSION}],
)
for desc_set in DataSetsMixInExtras.getAllDescriptors()
]
21 changes: 19 additions & 2 deletions qsprpred/utils/testing/path_mixins.py
@@ -10,12 +10,21 @@
from sklearn.preprocessing import StandardScaler

from ...data import RandomSplit, QSPRDataset
from ...data.descriptors.fingerprints import MorganFP
from ...data.descriptors.fingerprints import (
AtomPairFP,
AvalonFP,
LayeredFP,
MaccsFP,
MorganFP,
PatternFP,
RDKitFP,
RDKitMACCSFP,
TopologicalFP,
)
from ...data.descriptors.sets import (
RDKitDescs,
DrugExPhyschem,
PredictorDesc,
RDKitDescs,
TanimotoDistances,
)
from ...data.processing.data_filters import RepeatsFilter
@@ -101,7 +110,15 @@ def getAllDescriptors(cls):
radius=2,
nBits=128,
),
AtomPairFP(nBits=128),
AvalonFP(nBits=128),
LayeredFP(nBits=128),
MaccsFP(),
MorganFP(radius=2, nBits=128),
PatternFP(nBits=128),
RDKitFP(nBits=128),
RDKitMACCSFP(),
TopologicalFP(nBits=128),
]

return descriptor_sets