TLDR-783 Add notebooks page to docs

ispras · Aug 30, 2024 · 89eb57c
1 parent ad2bce8
Showing 8 changed files with 33 additions and 0 deletions.
diff --git a/docs/source/_static/notebooks_data/doc_example.jpeg b/docs/source/_static/notebooks_data/doc_example.jpeg
diff --git a/docs/source/_static/notebooks_data/doc_tables.pdf b/docs/source/_static/notebooks_data/doc_tables.pdf
diff --git a/docs/source/_static/notebooks_data/doc_tables_1.jpeg b/docs/source/_static/notebooks_data/doc_tables_1.jpeg
diff --git a/docs/source/_static/notebooks_data/doc_tables_2.jpeg b/docs/source/_static/notebooks_data/doc_tables_2.jpeg
diff --git a/docs/source/_static/notebooks_data/table_1.png b/docs/source/_static/notebooks_data/table_1.png
diff --git a/docs/source/_static/notebooks_data/table_2.png b/docs/source/_static/notebooks_data/table_2.png
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -227,6 +227,7 @@ For a document of unknown or unsupported domain there is an option to use defaul
    :maxdepth: 1
    :caption: Tutorials
 
+   tutorials/notebooks
    tutorials/add_new_doc_format
    tutorials/add_new_structure_type
    tutorials/creating_document_classes

diff --git a/docs/source/tutorials/notebooks.rst b/docs/source/tutorials/notebooks.rst
@@ -0,0 +1,32 @@
+Notebooks with examples of Dedoc usage
+======================================
+
+.. _table_notebooks:
+
+.. flat-table:: Notebooks with Dedoc usage examples
+    :widths: 70 30
+    :header-rows: 1
+    :class: tight-table
+
+    * - Task description
+      - Link to the notebook
+
+    * - Document text preprocessing for subsequent document classification:
+            * automatic detection of document format: DOC, DOCX, PDF or any image format;
+            * text extraction and structuring;
+            * saving the result to a JSON file.
+      - `Notebook 1 <https://colab.research.google.com/drive/1shZwKu-RMf5RgZIHvufO1XKk7Lb7I7xI?usp=sharing>`_
+
+    * - Table text and structure extraction from images of scanned documents:
+            * automatic detection of document format: PDF or any image format;
+            * table extraction, including multi-page tables;
+            * grouping tables by the document page on which they are located;
+            * saving each page to a CSV file.
+      - `Notebook 2 <https://colab.research.google.com/drive/10dCQAtg9tmiANNnLF0PEyJ3hy0hRXTxK?usp=sharing>`_
+
+    * - ADVANCED: Text extraction from scanned documents with text location on the document image:
+            * automatic detection of image format;
+            * text extraction from the image;
+            * text location visualization;
+            * text recognition confidence visualization.
+      - `Notebook 3 <https://colab.research.google.com/drive/1W3WK7L7qRUXLU8X1mM_wUqGovEzBILer?usp=sharing>`_
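
The last two steps listed for Notebook 2 (grouping tables by page, then saving each page to a CSV file) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the notebook's code: the `Table` dataclass, its `page_id` and `cells` fields, and the `page_<n>.csv` naming are hypothetical stand-ins for whatever the parser actually returns.

```python
import csv
import tempfile
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List


@dataclass
class Table:
    """Hypothetical stand-in for an extracted table (not Dedoc's own class)."""
    page_id: int            # assumed: zero-based page where the table starts
    cells: List[List[str]]  # assumed: rows of cell texts


def group_tables_by_page(tables: List[Table]) -> Dict[int, List[Table]]:
    """Collect tables under the page they belong to, keeping input order."""
    grouped: Dict[int, List[Table]] = defaultdict(list)
    for table in tables:
        grouped[table.page_id].append(table)
    return dict(grouped)


def save_pages_to_csv(grouped: Dict[int, List[Table]], out_dir: str) -> List[Path]:
    """Write one CSV file per page; tables on a page are separated by a blank row."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for page_id in sorted(grouped):
        path = target / f"page_{page_id}.csv"  # hypothetical naming scheme
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            for index, table in enumerate(grouped[page_id]):
                if index > 0:
                    writer.writerow([])  # blank row between consecutive tables
                writer.writerows(table.cells)
        written.append(path)
    return written
```

Writing one file per page keeps the output easy to compare against the page images; a table that spans several pages would be filed under the page assigned to it by whatever produces `page_id`.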