Merge branch 'enhancement/documentation/feature_overview' into 'dev'
Enhancement/documentation/feature overview

See merge request cdd/QSPRpred!166
HellevdM committed Feb 7, 2024
2 parents ec966c0 + a0d06ca commit d887003
Showing 14 changed files with 725 additions and 374 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -17,6 +17,7 @@
/**/tutorials/tutorial_data/papyrus/
/**/tutorials/tutorial_data/A2A_LIGANDS.tsv
/**/tutorials/tutorial_data/AR_LIGANDS.tsv
/**/tutorials/tutorial_data/AR_LIGANDS_pivot.tsv
/**/tutorials/tutorial_output

### Python template
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -25,6 +25,7 @@ From v2.1.1 to v3.0.0
or equal to `v2.1.0`.
- Fixes to the `fromMolTable` method in various data set implementations, in particular
in copying of the feature standardizer and other settings.
- Fixed the non-functional `cluster` split and `--imputation` option in `data_CLI.py`.

## Changes

8 changes: 0 additions & 8 deletions README.md
@@ -95,14 +95,6 @@ document the use of the Python API to build different types of models. The tutor
well as the documentation are still work in progress, and we will be happy for any
contributions where it is still lacking.

To use the commandline to train the same QSAR model as in the tutorial use (run from
tutorial folder):

```bash
python -m qsprpred.data_CLI -i ./data/parkinsons_pivot.tsv -o qspr/data -pr GABAAalpha -pr NMDA -r true -sp random -sf 0.15 -fe Morgan
python -m qsprpred.model_CLI -dp ./qspr/data/GABAAalpha_REGRESSION_df.pkl -o ./qspr/models -m PLS -o bayes -nt 5 -me -s
```

Contributions
=============

418 changes: 418 additions & 0 deletions docs/cli_usage.rst

Large diffs are not rendered by default.

5 changes: 4 additions & 1 deletion docs/conf.py
@@ -27,7 +27,10 @@
'sphinx.ext.viewcode',
'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode'
'sphinx.ext.viewcode',
'sphinx.ext.autosectionlabel',
"sphinx_design",
"sphinx_design_elements",
]

intersphinx_mapping = {'python': ('https://docs.python.org/3.10', None)}
211 changes: 211 additions & 0 deletions docs/features.rst
@@ -0,0 +1,211 @@
.. _features:

Overview of available features
==============================

.. div:: dropdown-group

.. dropdown:: Data Sources

:class:`~qsprpred.data.sources.data_source.DataSource`: Base class for data sources.

Data sources are used to load data from a source programmatically.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.sources.papyrus.papyrus_class.Papyrus`: Papyrus (See `data collection with Papyrus tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/data/data_collection_with_papyrus.ipynb>`_.)

.. dropdown:: Data Filters

:class:`~qsprpred.data.processing.data_filters.DataFilter`: Base class for data filters.

Data filters are used to filter data based on some criteria.
Examples can be found in the `data preparation tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/data/data_preparation.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.processing.data_filters.CategoryFilter`: CategoryFilter
* :class:`~qsprpred.data.processing.data_filters.RepeatsFilter`: RepeatsFilter

.. dropdown:: Descriptor Sets

:class:`~qsprpred.data.descriptors.sets.DescriptorSet`: Base class for descriptor sets.

Descriptor sets are used to calculate molecular descriptors for a set of molecules.
Examples can be found in the `descriptor calculation tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/data/descriptors.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.descriptors.sets.DrugExPhyschem`: DrugExPhyschem
* :class:`~qsprpred.data.descriptors.sets.PredictorDesc`: PredictorDesc
* :class:`~qsprpred.data.descriptors.sets.RDKitDescs`: RDKitDescs
* :class:`~qsprpred.data.descriptors.sets.SmilesDescs`: SmilesDescs
* :class:`~qsprpred.data.descriptors.sets.TanimotoDistances`: TanimotoDistances
* :class:`~qsprpred.data.descriptors.sets.DataFrameDescriptorSet`: DataFrameDescriptorSet
* :class:`~qsprpred.data.descriptors.fingerprints.Fingerprint`: Fingerprint
* :class:`~qsprpred.data.descriptors.fingerprints.AtomPairFP`: AtomPairFP
* :class:`~qsprpred.data.descriptors.fingerprints.AvalonFP`: AvalonFP
* :class:`~qsprpred.data.descriptors.fingerprints.LayeredFP`: LayeredFP
* :class:`~qsprpred.data.descriptors.fingerprints.MACCsFP`: MACCsFP
* :class:`~qsprpred.data.descriptors.fingerprints.MorganFP`: MorganFP
* :class:`~qsprpred.data.descriptors.fingerprints.PatternFP`: PatternFP
* :class:`~qsprpred.data.descriptors.fingerprints.RDKitFP`: RDKitFP
* :class:`~qsprpred.data.descriptors.fingerprints.RDKitMACCSFP`: RDKitMACCSFP
* :class:`~qsprpred.data.descriptors.fingerprints.TopologicalFP`: TopologicalFP

.. tab-item:: Extra

* :class:`~qsprpred.extra.data.descriptors.sets.ExtendedValenceSignature`: ExtendedValenceSignature
* :class:`~qsprpred.extra.data.descriptors.sets.Mold2`: Mold2
* :class:`~qsprpred.extra.data.descriptors.sets.Mordred`: Mordred
* :class:`~qsprpred.extra.data.descriptors.sets.PaDEL`: PaDEL
* :class:`~qsprpred.extra.data.descriptors.sets.ProteinDescriptorSet`: ProteinDescriptorSet
* :class:`~qsprpred.extra.data.descriptors.sets.ProDec`: ProDec
* :class:`~qsprpred.data.descriptors.fingerprints.Fingerprint`: Fingerprint
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKAtomPairs2DFP`: CDKAtomPairs2DFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKEStateFP`: CDKEStateFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKExtendedFP`: CDKExtendedFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKFP`: CDKFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKGraphOnlyFP`: CDKGraphOnlyFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKKlekotaRothFP`: CDKKlekotaRothFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKMACCSFP`: CDKMACCSFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKPubchemFP`: CDKPubchemFP
* :class:`~qsprpred.extra.data.descriptors.fingerprints.CDKSubstructureFP`: CDKSubstructureFP

.. dropdown:: Data Splitters

:class:`~qsprpred.data.sampling.splits.DataSplit`: Base class for data splitters.

Data splitters are used to split data into training and test sets.
Examples can be found in the `data splitting tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/data/data_splitting.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.sampling.splits.RandomSplit`: RandomSplit
* :class:`~qsprpred.data.sampling.splits.ScaffoldSplit`: ScaffoldSplit
* :class:`~qsprpred.data.sampling.splits.TemporalSplit`: TemporalSplit
* :class:`~qsprpred.data.sampling.splits.ManualSplit`: ManualSplit
* :class:`~qsprpred.data.sampling.splits.BootstrapSplit`: BootstrapSplit
* :class:`~qsprpred.data.sampling.splits.GBMTDataSplit`: GBMTDataSplit
* :class:`~qsprpred.data.sampling.splits.GBMTRandomSplit`: GBMTRandomSplit
* :class:`~qsprpred.data.sampling.splits.ClusterSplit`: ClusterSplit

.. tab-item:: Extra

* :class:`~qsprpred.extra.data.sampling.splits.LeaveTargetsOut`: LeaveTargetsOut
* :class:`~qsprpred.extra.data.sampling.splits.PCMSplit`: PCMSplit
* :class:`~qsprpred.extra.data.sampling.splits.TemporalPerTarget`: TemporalPerTarget
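
To illustrate what a splitter does conceptually, here is a minimal sketch of a seeded random split in plain Python. It is not the QSPRpred implementation: ``random_split``, ``test_fraction`` and ``seed`` are hypothetical names chosen for this example, and the real ``RandomSplit`` class may behave differently.

```python
import random


def random_split(items, test_fraction=0.15, seed=42):
    """Shuffle a copy of `items` and split off a test set (illustrative only)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled items form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


train, test = random_split(range(100), test_fraction=0.15)
print(len(train), len(test))
```

Scaffold, cluster and temporal splits follow the same train/test contract but group or order the molecules before assigning them, so that related compounds do not end up on both sides.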


.. dropdown:: Feature Filters

:class:`~qsprpred.data.processing.feature_filters.FeatureFilter`: Base class for feature filters.

Feature filters are used to filter features based on some criteria.
Examples can be found in the `data preparation tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/data/data_preparation.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.data.processing.feature_filters.HighCorrelationFilter`: HighCorrelationFilter
* :class:`~qsprpred.data.processing.feature_filters.LowVarianceFilter`: LowVarianceFilter
* :class:`~qsprpred.data.processing.feature_filters.BorutaFilter`: BorutaFilter
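
As a rough idea of what a feature filter computes, the sketch below flags columns whose variance does not exceed a threshold. It is a plain-Python illustration under stated assumptions, not the ``LowVarianceFilter`` code: the function name, the dictionary-of-columns input and the ``<=`` comparison are all hypothetical.

```python
from statistics import pvariance


def low_variance_columns(table, threshold=0.0):
    """Return the names of columns whose population variance is <= threshold.

    `table` maps a feature name to its list of values (assumed layout).
    """
    return [
        name for name, values in table.items()
        if pvariance(values) <= threshold
    ]


features = {"constant": [1, 1, 1, 1], "varying": [0, 1, 2, 3]}
print(low_variance_columns(features))
```

Constant or near-constant descriptors carry little information for a model, which is why such columns are commonly dropped before training.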

.. dropdown:: Models

:class:`~qsprpred.models.models.QSPRModel`: Base class for models.

Models are used to predict properties of molecules.
A general example can be found in the `quick start tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/quick_start.ipynb>`_.
More detailed information can be found throughout the basic and advanced modelling tutorials.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.models.scikit_learn.SklearnModel`: SklearnModel

.. tab-item:: Extra

* :class:`~qsprpred.extra.models.pcm.PCMModel`: PCMModel (See `PCM tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/advanced/modelling/PCM_modelling.ipynb>`_.)

.. tab-item:: GPU

More information can be found in the `deep learning tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/advanced/modelling/deep_learning_models.ipynb>`_.

* :class:`~qsprpred.extra.gpu.models.dnn.DNNModel`: DNNModel
* :class:`~qsprpred.extra.gpu.models.chemprop.ChempropModel`: ChempropModel (See `Chemprop tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/advanced/modelling/chemprop_models.ipynb>`_.)
* :class:`~qsprpred.extra.gpu.models.pyboost.PyBoostModel`: PyBoostModel

.. dropdown:: Metrics

:class:`~qsprpred.models.metrics.Metric`: Base class for metrics.

Metrics are used to evaluate the performance of models.
More information can be found in the `model assessment tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/modelling/model_assessment.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.models.metrics.SklearnMetrics`: SklearnMetrics

.. dropdown:: Model Assessors

:class:`~qsprpred.models.assessment_methods.ModelAssessor`: Base class for model assessors.

Model assessors are used to assess the performance of models.
More information can be found in the `model assessment tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/basics/modelling/model_assessment.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.models.assessment_methods.CrossValAssessor`: CrossValAssessor
* :class:`~qsprpred.models.assessment_methods.TestSetAssessor`: TestSetAssessor
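
The idea behind cross-validation assessment can be sketched with a small index generator. This is a generic k-fold illustration in plain Python; ``k_fold_indices`` is a hypothetical helper and says nothing about how ``CrossValAssessor`` is implemented or configured.

```python
def k_fold_indices(n_samples, n_folds=5):
    """Yield (train_indices, test_indices) pairs for contiguous k-fold CV."""
    if n_folds < 2 or n_folds > n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(10, n_folds=3):
    print(len(train), test)
```

Each sample appears in exactly one test fold, so averaging the per-fold scores gives an estimate that uses all of the training data. A test-set assessment, by contrast, scores the model once on the held-out split.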

.. dropdown:: Hyperparameter Optimizers

:class:`~qsprpred.models.hyperparam_optimization.HyperparameterOptimization`: Base class for hyperparameter optimizers.

Hyperparameter optimizers are used to optimize the hyperparameters of models.
More information can be found in the `hyperparameter optimization tutorial <https://github.com/CDDLeiden/QSPRpred/blob/main/tutorials/advanced/modelling/hyperparameter_optimization.ipynb>`_.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.models.hyperparam_optimization.GridSearchOptimization`: GridSearchOptimization
* :class:`~qsprpred.models.hyperparam_optimization.OptunaOptimization`: OptunaOptimization
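
Grid search, in its simplest form, evaluates every combination of candidate values and keeps the best one. The sketch below shows that loop in plain Python; ``grid_search`` and its ``score`` callback are hypothetical and do not reflect the ``GridSearchOptimization`` interface.

```python
from itertools import product


def grid_search(param_grid, score):
    """Return (best_params, best_score) over the Cartesian product of the grid.

    `param_grid` maps a parameter name to a list of candidate values;
    `score` takes a parameter dict and returns a number (higher is better).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Toy objective with its maximum at x=2, y=0.
grid = {"x": [0, 1, 2, 3], "y": [-1, 0, 1]}
print(grid_search(grid, lambda p: -(p["x"] - 2) ** 2 - p["y"] ** 2))
```

The cost grows with the product of the grid sizes, which is the usual motivation for sampling-based optimizers such as Optuna when the search space is large.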


.. dropdown:: Model Plots

:class:`~qsprpred.plotting.base_plot.ModelPlot`: Base class for model plots.

Model plots are used to visualize the performance of models.
Examples can be found throughout the basic and advanced modelling tutorials.

.. tab-set::

.. tab-item:: Core

* :class:`~qsprpred.plotting.regression.RegressionPlot`: RegressionPlot
* :class:`~qsprpred.plotting.regression.CorrelationPlot`: CorrelationPlot
* :class:`~qsprpred.plotting.regression.WilliamsPlot`: WilliamsPlot
* :class:`~qsprpred.plotting.classification.ClassifierPlot`: ClassifierPlot
* :class:`~qsprpred.plotting.classification.ROCPlot`: ROCPlot
* :class:`~qsprpred.plotting.classification.PRCPlot`: PRCPlot
* :class:`~qsprpred.plotting.classification.CalibrationPlot`: CalibrationPlot
* :class:`~qsprpred.plotting.classification.MetricsPlot`: MetricsPlot
* :class:`~qsprpred.plotting.classification.ConfusionMatrixPlot`: ConfusionMatrixPlot

8 changes: 6 additions & 2 deletions docs/index.rst
@@ -6,15 +6,19 @@
Welcome to QSPRpred's documentation!
====================================

QSPRpred provides functionality to assist with building Quantitative Structure Property relationship models. Model built with QSPRpred can also be used as the enviroment reward function in DrugEx. Here you will find the installation guide (:ref:`installation`), usage examples (:ref:`usage`) and API documentation.
QSPRpred is an open-source software library for building **Quantitative Structure Property Relationship (QSPR)** models, developed by Gerard van Westen's Computational Drug
Discovery group. It provides a unified interface for building QSPR models based on different types of descriptors and machine learning algorithms.
Here you will find the installation guide (:ref:`installation-guide`), an overview of available features (:ref:`features`), usage examples (:ref:`cli-usage`) and API documentation (:ref:`Python API`).
For tutorials and examples of the Python API, please visit the `QSPRpred GitHub repository <https://github.com/CDDLeiden/QSPRpred>`_.

.. toctree::
:maxdepth: 2
:caption: Contents:

self
install
use
cli_usage
features
api/modules


25 changes: 22 additions & 3 deletions docs/install.rst
@@ -1,15 +1,34 @@
.. _installation:
.. _installation-guide:

Installation
============

You do not need anything special to install the package. Just run the following to get the latest version and all dependencies:
You do not need anything special to install the package. Just run the following (with Python >= 3.10) to get the latest version and all basic dependencies:

.. code-block::
pip install git+https://github.com/CDDLeiden/QSPRpred.git@main
You can also get tags and development snapshots by varying the :code:`@main` part (i.e. :code:`@1.0.0`). After that you can start building models (see :ref:`usage`).
You can also get tags and development snapshots by varying the :code:`@main` part (e.g. :code:`@1.0.0`). After that you can start building models (see :ref:`cli-usage`).

Note that this will install the basic dependencies, but not the optional dependencies.
If you want to use the optional dependencies, you can install the package with an
option:

.. code-block::
pip install git+https://github.com/CDDLeiden/QSPRpred.git@main#egg=qsprpred[<option>]
The following options are available:

- extra : include extra dependencies for PCM models and extra descriptor sets from
packages other than RDKit
- deep : include deep learning models (torch and chemprop)
- pyboost : include pyboost model (requires cupy, ``pip install cupy-cudaX``, replace X
with your `CUDA version <https://docs.cupy.dev/en/stable/install.html>`_, you can obtain
the CUDA toolkit from Anaconda as well: ``conda install cudatoolkit``)
- full : include all optional dependencies (requires cupy, ``pip install cupy-cudaX``,
replace X with your `CUDA version <https://docs.cupy.dev/en/stable/install.html>`_)
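
The install target for a given option can be assembled as in the following sketch. It only reproduces the URL pattern and option names shown above; ``install_target`` and ``OPTIONS`` are hypothetical helpers for illustration and are not part of QSPRpred.

```python
# Option names as listed in the installation guide above.
OPTIONS = ("extra", "deep", "pyboost", "full")


def install_target(option=None, ref="main"):
    """Build the pip target string for a branch or tag and an optional extra."""
    base = f"git+https://github.com/CDDLeiden/QSPRpred.git@{ref}"
    if option is None:
        return base
    if option not in OPTIONS:
        raise ValueError(f"unknown option: {option!r}")
    return f"{base}#egg=qsprpred[{option}]"


print(install_target("extra"))
```

When passing such a target to ``pip install`` on the command line, quoting it is advisable, since some shells treat square brackets as glob patterns.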

You can test the installation by running the unit test suite:

4 changes: 3 additions & 1 deletion docs/requirements.txt
@@ -1,2 +1,4 @@
sphinx
sphinx_rtd_theme
sphinx_rtd_theme
sphinx_design
sphinx_design_elements
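
Since the new entries in ``docs/requirements.txt`` correspond to extensions added in ``docs/conf.py``, a quick way to check that a documentation environment is complete is to test whether each extension module can be found. The helper below is a sketch using only the standard library; ``missing_extensions`` is a hypothetical name and this check is not part of the commit.

```python
from importlib.util import find_spec


def missing_extensions(extensions):
    """Return the extension module names that cannot be located for import."""
    missing = []
    for name in extensions:
        try:
            if find_spec(name) is None:
                missing.append(name)
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            missing.append(name)
    return missing


# Extension names taken from the conf.py diff above.
print(missing_extensions([
    "sphinx.ext.autosectionlabel",
    "sphinx_design",
    "sphinx_design_elements",
]))
```

An empty list means every listed extension is installed; any names that are printed still need to be installed before the documentation can be built.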