documentation updates #48

Merged
merged 19 commits into from
Dec 16, 2024
89 changes: 89 additions & 0 deletions docs/approximation_cascades.rst
@@ -0,0 +1,89 @@
Efficient Processing with Approximations
=========================================

Overview
---------------

LOTUS offers approximations for semantic operators to let you balance speed and accuracy.
You can set accuracy targets according to the requirements of your application, and LOTUS
will use approximations to optimize the implementation for lower computational overhead, while providing probabilistic accuracy guarantees.
One core technique for providing these approximations is the use of cascades.
Cascades provide a way to optimize certain semantic operators (join cascades and filter cascades) by blending
a less costly but potentially inaccurate proxy model with a high-quality oracle model. The method seeks to achieve
preset precision and recall targets with a given probability while controlling computational overhead.

Cascades work by initially using a cheap proxy to score the tuples being filtered or joined. Using statistically
supported thresholds derived from a prior sampling phase, each tuple is then assigned one of three actions based on the
proxy's score: accept, reject, or seek clarification from the oracle model.

When the proxy is accurate, most of the data is resolved quickly and inexpensively, and only the unresolved tuples are
sent to the larger LM.
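
The routing rule itself is simple. Here is a minimal illustrative sketch of the decision logic; the function and threshold names are hypothetical (they mirror the output statistics described below) and are not part of the LOTUS API.

.. code-block:: python

def route(proxy_score: float, pos_threshold: float, neg_threshold: float) -> str:
    """Assign a tuple an action based on the proxy model's score."""
    if proxy_score >= pos_threshold:
        return "accept"      # resolved cheaply by the helper model
    if proxy_score <= neg_threshold:
        return "reject"      # resolved cheaply by the helper model
    return "escalate"        # ambiguous: defer to the oracle LM

print(route(0.80, pos_threshold=0.62, neg_threshold=0.52))  # accept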

Using Cascades
----------------
To use the approximation cascade-based operators, begin by configuring both the main and helper LMs using
LOTUS's configuration settings:

.. code-block:: python

import lotus
from lotus.models import LM
from lotus.types import CascadeArgs


gpt_4o_mini = LM("gpt-4o-mini")
gpt_4o = LM("gpt-4o")

lotus.settings.configure(lm=gpt_4o, helper_lm=gpt_4o_mini)


Once the LMs are set up, specify the cascade parameters, such as the recall and precision targets, the sampling
percentage, and the acceptable failure probability, using the CascadeArgs object.

.. code-block:: python

cascade_args = CascadeArgs(recall_target=0.9, precision_target=0.9, sampling_percentage=0.5, failure_probability=0.2)

After preparing the arguments, call the semantic operator method on the DataFrame:

.. code-block:: python

df, stats = df.sem_filter(user_instruction=user_instruction, cascade_args=cascade_args, return_stats=True)

Note that these parameters guide the trade-off between speed and accuracy when applying the cascade operators.
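
Join cascades are used the same way. Here is a sketch, under the assumption that sem_join accepts the same cascade_args keyword; the DataFrames and column names are illustrative.

.. code-block:: python

joined_df, stats = courses_df.sem_join(
    skills_df,
    "Taking {Course Description} will help me learn {Skill}",
    cascade_args=cascade_args,
    return_stats=True,
)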

Interpreting Output Statistics
-------------------------------
For cascade operators, the output statistics contain key performance metrics.

An example output:

.. code-block:: text

{'pos_cascade_threshold': 0.62,
'neg_cascade_threshold': 0.52,
'filters_resolved_by_helper_model': 95,
'filters_resolved_by_large_model': 8,
'num_routed_to_helper_model': 95}

Here is a detailed explanation of each metric:

1. **pos_cascade_threshold**
The minimum score above which tuples are automatically accepted by the helper model. In the above example, any tuple with a
score above 0.62 is accepted without the need for the oracle LM.

2. **neg_cascade_threshold**
The maximum score below which tuples are automatically rejected by the helper model.
Any tuple scoring below 0.52 is rejected without involving the oracle LM.

3. **filters_resolved_by_helper_model**
The number of tuples conclusively classified by the helper model.
A value of 95 indicates that the majority of items were efficiently handled at this stage.

4. **filters_resolved_by_large_model**
The count of tuples requiring the oracle model’s intervention.
Here, only 8 items needed escalation, suggesting that the chosen thresholds are effective.

5. **num_routed_to_helper_model**
The total number of items initially processed by the helper model.
Since 95 items were routed, and only 8 required the oracle, this shows a favorable balance between cost and accuracy.
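
As a quick sanity check, the helper-resolution rate can be computed directly from the returned stats. A minimal sketch using the example values above:

.. code-block:: python

stats = {
    "filters_resolved_by_helper_model": 95,
    "filters_resolved_by_large_model": 8,
}
total = (stats["filters_resolved_by_helper_model"]
         + stats["filters_resolved_by_large_model"])
helper_rate = stats["filters_resolved_by_helper_model"] / total
print(f"{helper_rate:.1%} of tuples resolved without the oracle LM")  # 92.2%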
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -14,7 +14,7 @@
project = "LOTUS"
copyright = "2024, Liana Patel, Siddharth Jha, Carlos Guestrin, Matei Zaharia"
author = "Liana Patel, Siddharth Jha, Carlos Guestrin, Matei Zaharia"
release = "0.3.0"
release = "1.0.1"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
46 changes: 46 additions & 0 deletions docs/configurations.rst
@@ -0,0 +1,46 @@
Setting Configurations
=======================

The Settings module is a central configuration system for managing application-wide settings.
It ensures consistent and thread-safe access to configurations, allowing settings to be dynamically
adjusted and temporarily overridden within specific contexts. In most of the examples shown, we have
used the settings to configure our LM.

Using the Settings module
--------------------------
.. code-block:: python

import lotus
from lotus.models import LM

lm = LM(model="gpt-4o-mini")
lotus.settings.configure(lm=lm)

Configurable Parameters
--------------------------

1. enable_cache:
* Description: Enables or disables caching mechanisms
* Default: False
.. code-block:: python

lotus.settings.configure(enable_cache=True)

2. rm:
* Description: Configures the retrieval model
* Default: None
.. code-block:: python

from lotus.models import SentenceTransformersRM

rm = SentenceTransformersRM(model="intfloat/e5-base-v2")
lotus.settings.configure(rm=rm)

3. helper_lm:
* Description: Configures a secondary helper LM, often set along with the primary LM
* Default: None
.. code-block:: python

gpt_4o_mini = LM("gpt-4o-mini")
gpt_4o = LM("gpt-4o")

lotus.settings.configure(lm=gpt_4o, helper_lm=gpt_4o_mini)

41 changes: 41 additions & 0 deletions docs/core_concepts.rst
@@ -0,0 +1,41 @@
Core Concepts
==================

LOTUS implements the semantic operator programming model. Semantic operators are declarative transformations over one or more
datasets, parameterized by a natural language expression (*langex*), that can be implemented by a variety of AI-based algorithms.
Semantic operators seamlessly extend the relational model, operating over datasets that may contain traditional structured data
as well as unstructured fields, such as free-form text or images. Because semantic operators are composable, modular, and declarative, they allow you to write
AI-based pipelines with intuitive, high-level logic, leaving the rest of the work to the query engine! Each operator can be implemented and
optimized in multiple ways, opening a rich space of execution plans, similar to relational operators. Here is a quick example of semantic operators in action:

.. code-block:: python

langex = "The {abstract} suggests that LLMs efficeintly utilize long context"
filtered_df = papers_df.sem_filter(langex)


With LOTUS, applications can be built by chaining together different operators. Much like relational operators can be used to
transform tables in SQL, LOTUS operators can be used to semantically transform Pandas DataFrames.
Here are some key semantic operators (see the composition sketch after the table):


+--------------+--------------------------------------------------------------+
| Operator     | Description                                                  |
+==============+==============================================================+
| sem_map      | Map each record using a natural language projection          |
+--------------+--------------------------------------------------------------+
| sem_extract  | Extract one or more attributes from each row                 |
+--------------+--------------------------------------------------------------+
| sem_filter   | Keep records that match the natural language predicate       |
+--------------+--------------------------------------------------------------+
| sem_agg      | Aggregate across all records (e.g. for summarization)        |
+--------------+--------------------------------------------------------------+
| sem_topk     | Order the records by some natural language sorting criteria  |
+--------------+--------------------------------------------------------------+
| sem_join     | Join two datasets based on a natural language predicate      |
+--------------+--------------------------------------------------------------+
| sem_sim_join | Join two DataFrames based on semantic similarity             |
+--------------+--------------------------------------------------------------+
| sem_search   | Perform semantic search over a text column                   |
+--------------+--------------------------------------------------------------+
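
Because these operators all act on DataFrames, they compose directly. Below is a short sketch chaining a filter with a map; the DataFrame contents and langexes are illustrative.

.. code-block:: python

import pandas as pd
import lotus
from lotus.models import LM

lotus.settings.configure(lm=LM("gpt-4o-mini"))

papers_df = pd.DataFrame({"abstract": ["LLMs attend to million-token contexts...",
                                       "We revisit convolutional architectures..."]})
relevant_df = papers_df.sem_filter("The {abstract} is about language models")
summaries_df = relevant_df.sem_map("Summarize the key claim of {abstract} in one sentence")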

44 changes: 3 additions & 41 deletions docs/quickstart.rst → docs/examples.rst
@@ -1,44 +1,6 @@
Quickstart
============

LOTUS can be used to easily build LLM applications in a couple steps.

LOTUS Operators and Data Model
----------------------------------

With LOTUS, applications can be built by chaining together operators. Much like relational operators can be used to transform tables in SQL, LOTUS operators can be used to *semantically* transform Pandas dataframes. Here are some key operators:

+------------+-------------------------------------------+
| Operator   | Description                               |
+============+===========================================+
| Sem_Map    | Map each row of the dataframe             |
+------------+-------------------------------------------+
| Sem_Filter | Keep rows that match a predicate          |
+------------+-------------------------------------------+
| Sem_Agg    | Aggregate information across all rows     |
+------------+-------------------------------------------+
| Sem_TopK   | Order the dataframe by some criteria      |
+------------+-------------------------------------------+
| Sem_Join   | Join two dataframes based on a predicate  |
+------------+-------------------------------------------+
| Sem_Index  | Create a semantic index over a column     |
+------------+-------------------------------------------+
| Sem_Search | Search the dataframe for relevant rows    |
+------------+-------------------------------------------+


A core principle of LOTUS is to provide users with a declarative interface that separates the user-specified, logical query plan from its underlying implementation.
As such, users program with LOTUS's semantic operators by writing parameterized language expressions (*langex*), rather than directly prompting an underlying LM.
For example, to filter a dataframe of research papers via its abstract column, a LOTUS user may write

.. code-block:: python

langex = "The {abstract} suggests that LLMs efficeintly utilize long context"
filtered_df = papers_df.sem_filter(langex)


Examples
-------------------------
==================

Let's walk through some use cases of LOTUS.
First let's configure LOTUS to use GPT-3.5-Turbo for the LLM and E5 as the embedding model.
Then let's define a dataset of courses and their descriptions/workloads.
@@ -129,4 +91,4 @@ Additionally, let's provide some examples to the model that can be used for demo
Respond with just the topic name and nothing else.", examples=examples_df, suffix="Next Topics"
)

Now you've seen how to use LOTUS to build LLM applications in a couple steps!
Now you've seen how to use semantic operators in LOTUS to implement LLM-powered transformations in just a couple of steps!
50 changes: 46 additions & 4 deletions docs/index.rst
@@ -3,6 +3,11 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.

.. image:: logo_with_text.png
:width: 300px
:height: 170px
:align: center

LOTUS Makes LLM-Powered Data Processing Fast and Easy
=================================================================================

@@ -14,12 +19,49 @@ LOTUS implements the semantic operator programming model and provides an optimiz
:caption: Getting Started

installation
quickstart
core_concepts
examples

.. toctree::
:hidden:
:maxdepth: 1
:caption: Module Documentation
:caption: Semantic Operators

sem_map
sem_extract
sem_filter
sem_agg
sem_topk
sem_join
sem_search
sem_sim_join
sem_cluster

.. toctree::
:hidden:
:maxdepth: 1
:caption: Utility Operators

sem_partition
sem_index
sem_dedup

.. toctree::
:hidden:
:maxdepth: 1
:caption: Models

llm
retriever_models
reranker_models
multimodal_models

.. toctree::
:hidden:
:maxdepth: 1
:caption: Advanced Usage

approximation_cascades
prompt_strategies
configurations

models_module
sem_ops_module
18 changes: 15 additions & 3 deletions docs/installation.rst
@@ -1,7 +1,7 @@
Installation
============

Lotus can be installed as a Python library through pip.
LOTUS can be installed as a Python library through pip.

Requirements
------------
@@ -12,10 +12,22 @@ Requirements
Install with pip
----------------

You can install Lotus using pip:
You can install LOTUS using pip:

.. code-block:: console

$ conda create -n lotus python=3.10 -y
$ conda activate lotus
$ pip install lotus-ai
$ pip install lotus-ai

If you are running on macOS, please install FAISS via conda:

.. code-block:: console

# CPU-only version
$ conda install -c pytorch faiss-cpu=1.9.0

# GPU(+CPU) version
$ conda install -c pytorch -c nvidia faiss-gpu=1.9.0

For more details, see `Installing FAISS via Conda <https://github.com/facebookresearch/faiss/blob/main/INSTALL.md#installing-faiss-via-conda>`_.
38 changes: 38 additions & 0 deletions docs/llm.rst
@@ -0,0 +1,38 @@
LLM
=======

The LM class is built on top of the LiteLLM library and supports any model that LiteLLM supports.
Example providers include, but are not limited to: OpenAI, Ollama, and vLLM.

.. automodule:: lotus.models.lm
:members:
:show-inheritance:

Example
---------
To run a model, you can use the LM class. We use the LiteLLM library to interface with the model, which allows
you to use any model provider that is supported by LiteLLM.

Creating an LM object for gpt-4o:

.. code-block:: python

from lotus.models import LM
lm = LM(model="gpt-4o")

Creating an LM object to use llama3.2 on Ollama:

.. code-block:: python

from lotus.models import LM
lm = LM(model="ollama/llama3.2")

Creating an LM object to use Meta-Llama-3-8B-Instruct on vLLM:

.. code-block:: python

from lotus.models import LM
lm = LM(
    model="hosted_vllm/meta-llama/Meta-Llama-3-8B-Instruct",
    api_base="http://localhost:8000/v1",
    max_ctx_len=8000,
    max_tokens=1000,
)
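
Once created, register the LM with LOTUS's settings so the semantic operators can use it, as shown in the configuration examples:

.. code-block:: python

import lotus
lotus.settings.configure(lm=lm)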
Binary file added docs/logo_with_text.png
7 changes: 0 additions & 7 deletions docs/models_module.rst

This file was deleted.
