From 156cd51fd779fbff5a9e5da928c5b3624114b185 Mon Sep 17 00:00:00 2001
From: Arun Jose <40291569+arunjose696@users.noreply.github.com>
Date: Fri, 6 Sep 2024 15:33:08 +0200
Subject: [PATCH] DOCS-#7382: Add documentation on how to use Modin Native
 query compiler (#7386)

Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru>
Signed-off-by: arunjose696 <arunjose696@gmail.com>
---
 docs/usage_guide/optimization_notes/index.rst | 31 +++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/docs/usage_guide/optimization_notes/index.rst b/docs/usage_guide/optimization_notes/index.rst
index 0dcbe5a25d7..6e9d1ca7d63 100644
--- a/docs/usage_guide/optimization_notes/index.rst
+++ b/docs/usage_guide/optimization_notes/index.rst
@@ -314,6 +314,37 @@ Copy-pastable example, showing how mixing pandas and Modin DataFrames in a singl
   # Possible output: TypeError
 
 
+Execute DataFrame operations using NativeQueryCompiler
+""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+By default, Modin distributes data across partitions and performs operations
+using the ``PandasQueryCompiler``. In some scenarios, such as handling small or empty DataFrames,
+distribution introduces unnecessary overhead, and it is more efficient to default
+to pandas at the query compiler layer. To do so, set the ``cfg.NativeDataframeMode``
+:doc:`configuration variable </flow/modin/config>` to ``Pandas``. With this setting, all operations in Modin default to pandas and DataFrames are not distributed,
+which avoids the distribution overhead. The setting can be switched between ``Pandas`` and ``Default``
+depending on whether DataFrame distribution is required.
+
+DataFrames created while ``NativeDataframeMode`` is set to ``Pandas`` continue to use the ``NativeQueryCompiler``
+even after the setting is reverted. Modin supports interoperability between distributed Modin DataFrames and
+those using the ``NativeQueryCompiler``.
+
+.. code-block:: python
+
+  import modin.pandas as pd
+  import modin.config as cfg
+
+  # This dataframe will be distributed and use `PandasQueryCompiler` by default
+  df_distributed = pd.DataFrame(...)
+
+  # Set mode to "Pandas" to avoid distribution and use `NativeQueryCompiler`
+  cfg.NativeDataframeMode.put("Pandas")
+  df_native_qc = pd.DataFrame(...)
+
+  # Revert to "Default"; dataframes created from here on are distributed again
+  cfg.NativeDataframeMode.put("Default")
+  df_distributed = pd.DataFrame(...)
+
 Operation-specific optimizations
 """"""""""""""""""""""""""""""""