Commit

DOCS
straussmaximilian committed Jul 8, 2022
1 parent 1726931 commit b362745
Showing 31 changed files with 546 additions and 12 deletions.
8 changes: 4 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
- <p align="center"> <img src="https://user-images.githubusercontent.com/49681382/101802266-48204a00-3b20-11eb-85ec-08c123fca79e.png" height="270" width="277" /> </p>
+ <p align="center"> <img src="omiclearn.png" height="270" width="277" /> </p>
<h2 align="center"> 📰 Manual and Documentation is available at: <a href="https://github.com/MannLabs/OmicLearn/wiki" target="_blank">OmicLearn Wiki Page </a> </h2>

![OmicLearn Tests](https://github.com/MannLabs/OmicLearn/workflows/OmicLearn%20Tests/badge.svg)
@@ -10,7 +10,7 @@
---
## OmicLearn

- Transparent exploration of machine learning for biomarker discovery from proteomics and omics data. This is a maintained fork from https://github.com/OmicEra/OmicLearn.
+ Transparent exploration of machine learning for biomarker discovery from proteomics and omics data. This is a maintained fork from [OmicEra](https://github.com/OmicEra/OmicLearn).


## Manuscript
@@ -57,7 +57,7 @@ Click on one of the links below to download the latest release for:

The following image displays the main steps of OmicLearn:

- ![OmicLearn Workflow](https://user-images.githubusercontent.com/49681382/91734594-cb421380-ebb3-11ea-91fa-8acc8826ae7b.png)
+ ![OmicLearn Workflow](workflow.png)

Detailed instructions on how to get started with OmicLearn can be found **[here.](https://github.com/MannLabs/OmicLearn/wiki/HOW-TO:-Using)**

@@ -70,4 +70,4 @@ All contributions are welcome. 👍

When contributing to **OmicLearn**, please **[open a new issue](https://github.com/MannLabs/OmicLearn/issues/new/choose)** to report the bug or discuss the changes you plan before sending a PR (pull request).

- We appreciate community contributions to the repository., we ensure that the community is free to use your contributions. 🤝
+ We appreciate community contributions to the repository.
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
195 changes: 195 additions & 0 deletions docs/source/METHODS.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/source/README.rst
@@ -0,0 +1 @@
.. include:: ../../../README.md
147 changes: 147 additions & 0 deletions docs/source/USING.md
@@ -0,0 +1,147 @@
## Using OmicLearn
**OmicLearn** enables researchers and scientists to explore the latest machine learning (ML) algorithms and apply them to proteomics and transcriptomics data.

The core steps of the pipeline are `Preprocessing`, `Missing Value Imputation`, `Feature Selection`, `Classification`, and `Validation` of the selected methods and algorithms. They are shown in the flowchart below:

![OmicLearn Workflow](workflow.png)

_**Figure 1:** Main steps for the workflow of OmicLearn at a glance_

## Uploading data

You can upload your own data by dragging and dropping a file onto the file menu or by clicking the link.
The data should be formatted according to the following conventions:

> - The file format should be `.xlsx (Excel)`, `.csv (Comma-separated values)` or `.tsv (tab-separated values)`. For `.csv`, the separator should be either `comma (,)` or `semicolon (;)`.
>
> - Maximum file size is 200 MB.
>
> - 'Identifiers' such as protein IDs, gene names, lipids or miRNA IDs should be uppercase.
>
> - Each row corresponds to a sample, each column to a feature.
>
> - Additional features should be marked with a leading underscore (`_`).

![DATA_UPLOAD/SELECTION](upload.png)

_**Figure 2:** Uploading a dataset or selecting a sample file_
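
The naming conventions above can be checked before uploading a file. The following is a minimal sketch using only the Python standard library. It is not OmicLearn's own validation code; the function name `classify_columns` and the example column names are assumptions made for illustration.

```python
import csv
import io


def classify_columns(header):
    """Group column names by the naming conventions described above."""
    groups = {"identifiers": [], "additional": [], "other": []}
    for name in header:
        if name.startswith("_"):
            # Leading underscore marks an additional feature.
            groups["additional"].append(name)
        elif name.isupper():
            # Uppercase names are treated as identifiers (e.g. protein IDs).
            groups["identifiers"].append(name)
        else:
            # Anything else, e.g. a candidate target column.
            groups["other"].append(name)
    return groups


# Hypothetical file: one row per sample, one column per feature.
example_csv = "APOE,TREM2,_age,diagnosis\n1.2,0.4,71,AD\n0.9,,65,control\n"
header = next(csv.reader(io.StringIO(example_csv)))
groups = classify_columns(header)
print(groups)
```

Columns that end up in `other` are worth a second look, since they follow neither convention.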

The data will be checked for consistency, and if your dataset contains missing values (`NaNs`), a notification will appear.
In that case, consider using one of the imputation methods listed in the left sidebar.

![NAN_WARNING](nan_warning.png)

_**Figure 3:** Missing value warning_
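
To see in advance whether a file would trigger this warning, missing values can be counted per column. This is a rough sketch under stated assumptions: the set of tokens treated as missing (`""`, `nan`, `na`) and the helper name `count_missing` are illustrative choices, not OmicLearn's actual rules.

```python
import csv
import io


def count_missing(rows, missing_tokens=("", "nan", "na")):
    """Count cells per column that are empty or contain a missing-value token."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip().lower() in missing_tokens:
                counts[column] = counts.get(column, 0) + 1
    return counts


# Hypothetical data with three missing cells.
example_csv = "APOE,TREM2,_age\n1.2,0.4,71\n0.9,,65\nNaN,0.7,\n"
rows = list(csv.DictReader(io.StringIO(example_csv)))
missing = count_missing(rows)
print(missing)
```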


### Sample Datasets

OmicLearn includes several sample [datasets](https://github.com/MannLabs/OmicLearn/tree/master/data) for exploring the analysis. They can be selected from the dropdown menu.

Here is the list of sample datasets available:

**`1. Alzheimer Dataset`**
> 📁 **File Name:** Alzheimer.xlsx
>
> 📖 **Description:** Proteome profiling in cerebrospinal fluid reveals novel biomarkers of Alzheimer's disease
>
> 🔗 **Source:** Bader, J., Geyer, P., Müller, J., Strauss, M., Koch, M., & Leypoldt, F. et al. (2020). Proteome profiling in cerebrospinal fluid reveals novel biomarkers of Alzheimer's disease. Molecular Systems Biology, 16(6). doi: [10.15252/msb.20199356](http://doi.org/10.15252/msb.20199356).

**`2. Sample Dataset`**
> 📁 **File Name:** Sample.xlsx
>
> 📖 **Description:** Sample dataset for testing the tool
>
> 🔗 **Source:** -

## Sidebar: Selecting Parameters

OmicLearn offers a wide range of options, which are detailed in the [methods wiki](https://github.com/MannLabs/OmicLearn/wiki/METHODS). The parameters can be selected in the sidebar.

Moreover, after changing the parameters, you are asked to re-run the analysis. Each analysis result will be stored in the [`Session History` section](#checking-the-session-history).

![OmicLearn SideBar](sidebar.png)

_**Figure 4:** OmicLearn sidebar options_

## Main Window: Selecting data, defining the workflow, and exploring results

### Data Selection

After the upload, the data is displayed within the OmicLearn window and can be explored. The dropdown menu `Subset` allows you to specify a subset of data based on values within a column. This way, you can exclude data that should not be used at all. For example, if you collected data from different sites, you can exclude one of them.

![Subset](subset.png)

_**Figure 5:** Example usage for `Subset` section_
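
Conceptually, a subset keeps only the rows whose value in the chosen column is among the selected values. A minimal sketch of that idea follows; the helper `select_subset`, the column name `_site` and the site names are hypothetical and do not reflect OmicLearn's internals.

```python
def select_subset(rows, column, keep_values):
    """Keep only rows whose value in `column` is one of `keep_values`."""
    keep = set(keep_values)
    return [row for row in rows if row[column] in keep]


# Hypothetical samples collected at three sites.
samples = [
    {"sample": "s1", "_site": "Munich"},
    {"sample": "s2", "_site": "Berlin"},
    {"sample": "s3", "_site": "Kiel"},
    {"sample": "s4", "_site": "Berlin"},
]

# Exclude the Berlin site by keeping only the other two.
subset = select_subset(samples, "_site", ["Munich", "Kiel"])
print([row["sample"] for row in subset])
```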

Within `Features`, you should select the target column. This refers to the variable that the classifier should be able to distinguish. As we are performing a binary classification task, there are only two options for the outcome of the classifier. By assigning multiple values to a class, multiple combinations of classifications can be tested.

![Classification target](target.png)

_**Figure 6:** `Classification target` section for selecting the target columns and `Define classes` section for assigning the classes_
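
Assigning several values of the target column to one class amounts to mapping each raw value to one of two labels. The sketch below illustrates this with plain Python. The value names are invented, and skipping samples whose value belongs to neither class is an assumption of this example rather than documented OmicLearn behavior.

```python
def assign_classes(values, class_0, class_1):
    """Return (row_index, label) pairs; values in neither class are skipped."""
    class_0, class_1 = set(class_0), set(class_1)
    if class_0 & class_1:
        raise ValueError("a value cannot belong to both classes")
    labeled = []
    for index, value in enumerate(values):
        if value in class_0:
            labeled.append((index, 0))
        elif value in class_1:
            labeled.append((index, 1))
    return labeled


# Hypothetical target column with more than two distinct values.
target_column = ["control", "AD", "MCI", "AD", "other"]

# Group "AD" and "MCI" into one class and compare against "control".
labels = assign_classes(target_column, class_0=["control"], class_1=["AD", "MCI"])
print(labels)
```

Changing which values go into `class_0` and `class_1` yields a different binary comparison on the same data.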

Furthermore, `Additional Features` can be selected. This refers to columns that are not your identifiers such as protein IDs, gene names, lipids or miRNA IDs (i.e., columns that are not uppercase and have a leading underscore (`_`)).

![Add Features](additional.png)

_**Figure 7:** Sample case for `Additional Features` option_

The section `Exclude identifiers` enables users to exclude selected features manually. This can be useful, e.g., to assess performance without a top feature. There is also an option to upload a file with multiple features that should be excluded.

> To utilize this option, you should upload a CSV (comma `,` separated) file where each row corresponds to a feature to be excluded. Also, the file should include a header (title row).

![exclude_identifiers](exclude.png)

_**Figure 8:** Selections on the dataset_
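
An exclusion file in the format described above (a header row followed by one feature per row) could be produced as follows. This is a sketch only: the header text `feature` and the feature names are placeholders, since the text only requires that a header row is present.

```python
import csv
import io


def write_exclusion_file(handle, features, header="feature"):
    """Write a one-column CSV: a header row, then one feature per row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow([header])
    for feature in features:
        writer.writerow([feature])


# Written to an in-memory buffer here; use open("exclude.csv", "w", newline="")
# to create a real file instead.
buffer = io.StringIO()
write_exclusion_file(buffer, ["APOE", "TREM2"])
content = buffer.getvalue()
print(content)
```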

The option `Cohort comparison` allows comparing results over different cohorts (i.e., train on one cohort and predict on another).

![dataselections](selection.png)

_**Figure 9:** Selections on the dataset_
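
The idea behind a cohort comparison can be sketched in a few lines: fit on the samples of one cohort, then predict the samples of another. The example below uses a deliberately simple midpoint-threshold rule as a stand-in for a real classifier. All names (`_cohort`, `APOE`, `target`) and values are hypothetical, and none of this is OmicLearn's implementation.

```python
def split_by_cohort(rows, cohort_column, train_cohort, test_cohort):
    """Separate rows into a training cohort and a held-out test cohort."""
    train = [r for r in rows if r[cohort_column] == train_cohort]
    test = [r for r in rows if r[cohort_column] == test_cohort]
    return train, test


def fit_midpoint_threshold(rows, feature, target):
    """Toy classifier: threshold halfway between the two class means."""
    means = {}
    for label in (0, 1):
        values = [r[feature] for r in rows if r[target] == label]
        means[label] = sum(values) / len(values)
    return (means[0] + means[1]) / 2, means[1] > means[0]


def predict(rows, feature, threshold, higher_is_positive):
    """Label a row 1 if it falls on the positive side of the threshold."""
    return [int((r[feature] > threshold) == higher_is_positive) for r in rows]


samples = [
    {"_cohort": "A", "APOE": 1.0, "target": 0},
    {"_cohort": "A", "APOE": 3.0, "target": 1},
    {"_cohort": "A", "APOE": 1.4, "target": 0},
    {"_cohort": "A", "APOE": 2.6, "target": 1},
    {"_cohort": "B", "APOE": 0.8, "target": 0},
    {"_cohort": "B", "APOE": 2.9, "target": 1},
]

train, test = split_by_cohort(samples, "_cohort", "A", "B")
threshold, higher_is_positive = fit_midpoint_threshold(train, "APOE", "target")
predictions = predict(test, "APOE", threshold, higher_is_positive)
print(predictions)
```

Because cohort B is never seen during fitting, the predictions on it give a rough sense of how well a model transfers between cohorts.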

### Running the Workflow
After selecting all parameters, you can run the workflow by clicking the `Run Analysis` button.

### Analysis results and plots
Once the analysis is completed, OmicLearn automatically generates the plots together with a table showing the results of each validation run. The plots can be downloaded in `.pdf` and `.svg` formats, in addition to the `.png` format provided by Plotly.

![FeatAtt_Chart](feature_importance.png)

![FeatAtt_Table](feature_importance_table.png)

_**Figure 10:** Bar chart for feature importance values received from the classifier after all cross-validation runs, its table containing links to NCBI search and download options_

![ROC Curve](roc_curve.png)

![PR Curve](pr_curve.png)

_**Figure 11:** Receiver operating characteristic (ROC) Curve, Precision-Recall (PR) Curve and download options_

![CONF-MATRIX](confusion.png)

_**Figure 12:** Confusion matrix, slider for looking at the other matrix tables and download options_

OmicLearn generates a `Summary` to describe the method. This can be used for a method section in a publication.

![Results table](summary.png)

![summary text](summary_text.png)

_**Figure 13:** Results table of the analysis, its download option, and auto-generated `Summary` text_

### Checking the Session History

Each analysis run will be appended to the `Session History` so that you can investigate the different results for different parameter sets.

![session](session_history.png)

_**Figure 14:** Session history table and download option_

## Cite us & Report bugs

At the end of the analysis, you will find a footnote for reporting bugs. There is also information on how to cite OmicLearn in your work if you find it useful.

![bug_report](bugs.png)

_**Figure 15:** Tabs for Citation and Bug Reporting_
58 changes: 58 additions & 0 deletions docs/source/VERSION-HISTORY.md
@@ -0,0 +1,58 @@
## Version History

This page lists the previous releases of **OmicLearn**, along with the notes and significant changes made in each version.

### - `v1.1.3`

> 📅 June 2022
>
> This is the latest release of **OmicLearn**.
>
> **Updates in this release:**
>
> - [x] Several packages are upgraded to their latest version!
> - [x] A code prettifier and formatter are now used to make the code more readable!
> - [x] One-click installers
>
### - `v1.1.2`

> 📅 February 2022
>
>
> **Updates in this release:**
>
> - [x] TSV (Tab [\t] Separated Value) File support has been added!
>
### - `v1.1.1`

> 📅 October 2021
>
>
> **Updates in this release:**
>
> - [x] The dependencies are upgraded to the latest versions if possible.
> - [x] The codebase has been updated for UI changes in Streamlit.
>

### - `v1.1.0`

> 📅 July 2021
>
>
> **Updates in this release:**
> - [X] As Exploratory Data Analysis (EDA) is crucial to gain insight into the data, PCA and Hierarchical clustering options are provided.
> - [X] Improvements on UI & UX have been made.
> - [X] The user interface has been updated to make it more easy-to-follow.
> - [X] The Streamlit and other packages/libraries are upgraded to their new versions.
>
<br>

### - `v1.0.0`

> 📅 March 2021
>
> This is the initial release of **OmicLearn**.
Binary file added docs/source/additional.png
Binary file added docs/source/bugs.png
54 changes: 54 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,54 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))


# -- Project information -----------------------------------------------------

project = 'OmicLearn'
copyright = '2022, Furkan Torun, Maximilian Strauss'
author = 'Furkan Torun, Maximilian Strauss'

# The full version, including alpha/beta/rc tags
release = '1.1.3'


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['myst_parser']

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
Binary file added docs/source/confusion.png
Binary file added docs/source/exclude.png
Binary file added docs/source/feature_importance.png
Binary file added docs/source/feature_importance_table.png
24 changes: 24 additions & 0 deletions docs/source/index.rst
@@ -0,0 +1,24 @@
.. OmicLearn documentation master file, created by
   sphinx-quickstart on Fri Jul 8 23:02:10 2022.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to OmicLearn's documentation!
=====================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   README.rst
   USING.md
   METHODS.md
   VERSION-HISTORY.md


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
Binary file added docs/source/nan_warning.png
Binary file added docs/source/pr_curve.png
Binary file added docs/source/roc_curve.png
Binary file added docs/source/selection.png
Binary file added docs/source/session_history.png
Binary file added docs/source/sidebar.png
Binary file added docs/source/subset.png
Binary file added docs/source/summary.png
Binary file added docs/source/summary_text.png
Binary file added docs/source/target.png
Binary file added docs/source/upload.png
Binary file added docs/source/workflow.png
Binary file added docs/source/workflow0.png
Binary file added omiclearn.png
16 changes: 8 additions & 8 deletions reqs.txt
@@ -1,11 +1,11 @@
- streamlit==1.8.1
- pandas==1.4.2
- numpy==1.22.3
- scikit-learn==1.0.2
- openpyxl==3.0.9
- plotly==5.7.0
+ streamlit==1.10.0
+ pandas==1.4.3
+ numpy==1.23.0
+ scikit-learn==1.1.1
+ openpyxl==3.0.10
+ plotly==5.9.0
kaleido==0.2.1
- XlsxWriter==3.0.1
- watchdog==2.1.7
+ XlsxWriter==3.0.3
+ watchdog==2.1.9
xgboost
protobuf==3.20
Binary file added workflow.png