Merge branch 'class6-aigw' into dev

f5devcentral · Dec 1, 2024 · a9e2061 · a9e2061
2 parents 9c53b4d + b014778
commit a9e2061
Show file tree

Hide file tree

Showing 25 changed files with 493 additions and 172 deletions.
diff --git a/docs/_static/js/data.js b/docs/_static/js/data.js
@@ -163,7 +163,7 @@ function displayJSON(jsonInput, statusText) {
 
 document.addEventListener("DOMContentLoaded", function() {
     const data = JSON.parse(localStorage.getItem('data'))
-    const { makeId, hostArcadia, ceArcadia, namespace, ceOnPrem, awsSiteName, vk8sName, kubeconfig, cek8s } = data;
+    const { makeId, hostArcadia, ceArcadia, namespace, ceOnPrem, awsSiteName, vk8sName, kubeconfig, cek8s, ollama } = data;
     replacePlaceholderWithValue('makeId', makeId);
     replacePlaceholderWithValue('hostArcadia', hostArcadia);
     replacePlaceholderWithValue('ceArcadia', ceArcadia);
@@ -173,6 +173,7 @@ document.addEventListener("DOMContentLoaded", function() {
     replacePlaceholderWithValue('vk8sName', vk8sName);
     replacePlaceholderWithValue('kubeconfig', kubeconfig);
     replacePlaceholderWithValue('cek8s', cek8s);
+    replacePlaceholderWithValue('ollama_public_ip',ollama.ollama_public_ip.value)
 
 
   });

diff --git a/docs/class6/class6.rst b/docs/class6/class6.rst
@@ -22,7 +22,7 @@ The following components are used within the application:
 
 AI Components
 
-* **AI Orchestrator** - is in charge of getting the user prompt, storing the conversation and orchastrating all other AI components
+* **LLM Orchestrator** - is in charge of getting the user prompt, storing the conversation and orchastrating all other AI components
 * **RAG** - contains Arcadia Crypto specific knowledge which isn't available to the LLM 
 * **Ollama** - is hosting the LLM. In our case we are using LLama 3.1 8B with Q4 quantization
 

diff --git a/docs/class6/module2/lab1/lab1.rst b/docs/class6/module2/lab1/lab1.rst
@@ -13,7 +13,7 @@ The AI Orchestrator acts as the central hub of the entire AI system, managing th
 * **State Management**: The Orchestrator  maintains the state of the conversation, ensuring continuity across multiple user interactions.
 * **Error Handling**: It manages any errors or exceptions that occur during the process, ensuring graceful failure modes.
 
-**Ollama**
+**Ollama ( Inference Services )**
 
 Ollama is an advanced AI tool that facilitates the local execution of large language models (LLMs), such as Llama 2, Mistral, and in our case LLama 3.1 8B.
 The key Features of Ollama:
@@ -36,12 +36,12 @@ Go to the **AI Assistant** start a new conversation and ask him the bellow quest
     How should I approch investing in crypto ?
 
 
-.. image:: ../pictures/C6Slide6.PNG
+.. image:: ../pictures/Slide1.PNG
    :align: center
 
-1. **User** sends question to **AI Orchestrator**
-2. **AI Orchestrator** forwards the user prompt to the **LLM**
-3. **LLM** returns response to **AI Orchestrator**
-4. **AI Orchestrator** sends the **LLM** response back to the **user**
+1. **User** sends question to **LLM Orchestrator**
+2. **LLM Orchestrator** forwards the user prompt to the **LLM**
+3. **LLM** returns response to **LLM Orchestrator**
+4. **ALLM Orchestrator** sends the **LLM** response back to the **user**
 
 This is the most **basic interaction** with the **LLM**. The **LLM** response is generated based only from the **training data**.
diff --git a/docs/class6/module2/lab2/lab2.rst b/docs/class6/module2/lab2/lab2.rst
@@ -15,7 +15,7 @@ RAG is a crucial component that enhances the AI's ability to provide accurate an
 * **Dynamic Updates**: The RAG system can be updated regularly to include new information, ensuring that the AI always has access to the latest data about Arcadia Crypto and the cryptocurrency market.
 
 
-**AI Orchestrator**
+**LLM Orchestrator**
 
 The AI Orchestrator has additional roles when using the RAG system:
 
@@ -47,15 +47,15 @@ Now, let's add data to the RAG system.
 
 
 
-.. image:: ../pictures/Slide8.PNG
+.. image:: ../pictures/Slide2.PNG
    :align: center
 
 1. **User** sends question to **AI Orchestrator**
-2. **AI Orchestrator** queries the **RAG** with the user prompt to get **contextual data**
+2. **LLM Orchestrator** queries the **RAG** with the user prompt to get **contextual data**
 3. **RAG** responds with up to 5 chunks of **contextual data**
-4. **AI Orchestrator** combines the **prompt + contextual data** and sends it to the **LLM** 
-5. **LLM** returns response to **AI Orchestrator**
-6. **AI Orchestrator** sends the **LLM** response back to the **user**
+4. **LLM Orchestrator** combines the **prompt + contextual data** and sends it to the **LLM** 
+5. **LLM** returns response to **LLM Orchestrator**
+6. **LLM Orchestrator** sends the **LLM** response back to the **user**
 
 
 When using RAG systems, we can enhance the overall knowledge of the LLM with specific information.
diff --git a/docs/class6/module2/lab3/lab3.rst b/docs/class6/module2/lab3/lab3.rst
@@ -4,9 +4,9 @@ Function calling
 
 Let's start by explaining the different functions.
 
-**AI Orchestrator**
+**LLM Orchestrator**
 
-The AI Orchestrator has additional roles when using it with LLM function calling:
+The LLM Orchestrator has additional roles when using it with LLM function calling:
 
 * **Function Calling**: When the LLM indicates a need for additional information, the Orchestrator manages API calls to relevant microservices (e.g., Users, Stocks) to fetch real-time data.
 
@@ -55,21 +55,19 @@ Go to the **AI Assistant** start a new conversation and ask him the bellow quest
 
 ::
 
-    How much Bitcoin can I buy with all my cash?
+    How many bitcoins to I have?
 
-.. image:: ../pictures/Slide10.PNG
+.. image:: ../pictures/Slide3.PNG
    :align: center
 
 1. **User** sends question to **AI Orchestrator**
-2. **AI Orchestrator** combines the prompt + contextual data ( not shown in the diagram ) and sends it to the **LLM**
-3. **LLM** decides that it needs to know how much cash the users has and responses by asking the **AI Orchestrator** to run **get_user_data** with the relevant account ID
-4. **AI Orchestrator** runs the **get_user_data** which is an API call to the **users** microservice and gets the user balance
-5. **AI Orchestrator**  sends the retrieved balance to the **LLM**
-6. **LLM** decides that it still has not enough information and it needs to know the current Bitcoin price and responses by asking the **AI Orchestrator** to run **get_all_stock_prices** 
-7. **AI Orchestrator** runs the **get_all_stock_prices** which is an API call to the **Stocks** microservice and gets current Crypto prices
-8. **AI Orchestrator**  sends the retrieved prices to the **LLM** for final processing
-9. Based on all the information provided so far the **LLM** returns the response to **AI Orchestrator**
-10. **AI Orchestrator** sends the **LLM** response back to the **user**
+2. **LLM Orchestrator** combines the prompt + contextual data ( not shown in the diagram ) and sends it to the **LLM**
+3. **LLM** decides that it needs to know how much cash the users has and responses by asking the **LLM Orchestrator** to run **get_user_data** with the relevant account ID
+4. **LLM Orchestrator** runs the **get_user_data** which is an API call to the **users** microservice and gets the user balance
+5. The internal app microservice respond to the API call with the relevant **user** data
+6. **LLM Orchestrator**  sends the retrieved user data to the **LLM** for final processing
+7. Based on all the information provided so far the **LLM** returns the response to **LLM Orchestrator**
+8. **LLM Orchestrator** sends the **LLM** response back to the **user**
 
 
 Bot joke to make things clear :)

diff --git a/docs/class6/module2/pictures/Slide1.PNG b/docs/class6/module2/pictures/Slide1.PNG
diff --git a/docs/class6/module2/pictures/Slide2.PNG b/docs/class6/module2/pictures/Slide2.PNG
diff --git a/docs/class6/module2/pictures/Slide3.PNG b/docs/class6/module2/pictures/Slide3.PNG
diff --git a/docs/class6/module3/module3.rst b/docs/class6/module3/module3.rst
@@ -6,16 +6,17 @@ Know that we know how the ChatBot works we need to understand what are the attac
 
 Let's go brifely over the attacks defined in **OWASP Top 10 GenAi**:
 
-1. **LLM01: Prompt Injection** occurs when an attacker manipulates a large language model (LLM) through crafted inputs, either by “jailbreaking” the system prompt or embedding prompts in external content. This manipulation can lead to data exfiltration, social engineering, unauthorized plugin use, and the LLM unknowingly executing harmful actions.
-2. **LLM02: Insecure Output Handling** involves inadequate validation and sanitization of outputs from large language models (LLMs) before passing them downstream. This can lead to vulnerabilities like XSS, CSRF, SSRF, and remote code execution, especially if LLMs have elevated privileges or are vulnerable to indirect prompt injections.
-3. **LLM03: Training Data Poisoning** involves tampering with the data used to train large language models (LLMs), introducing vulnerabilities, biases, or backdoors that compromise the model’s security and effectiveness. This can lead to incorrect or harmful outputs, reputational damage, and degraded performance, impacting both developers and users.
-4. **LLM04: Model Denial of Service** attack involves an attacker overwhelming an LLM with resource-intensive queries, leading to degraded service quality and increased costs. Techniques include exceeding the context window, recursive context expansion, and flooding with variable-length inputs, potentially causing the system to become unresponsive.
-5. **LLM05: Supply Chain Vulnerabilities** arise from tampering and attacks on training data, ML models, and deployment platforms. Risks include outdated components, poisoned data, and insecure plugins, leading to biased outcomes and security breaches. Attack scenarios involve compromised libraries, datasets, and malicious LLM plugins.
-6. **LLM06: Sensitive Information Disclosure**. LLM applications may expose sensitive information, proprietary algorithms, or confidential details through their output, leading to unauthorized access and security breaches. Mitigation includes data sanitization, proper Terms of Use, and restrictions on data types returned. However, unpredictable LLM behavior may still pose risks.
-7. **LLM07: Insecure Plugin Design** highlights the risks of insecure plugin design in LLMs. Plugins may accept unchecked, free-text inputs, leading to vulnerabilities like remote code execution, data exfiltration, and privilege escalation. Common issues include lack of input validation, inadequate access control, and treating all content as user-generated without additional authorization.
-8. **LLM08: Excessive Agency** is a vulnerability in LLM-based systems that allows harmful actions due to excessive functionality, permissions, or autonomy. This vulnerability can arise when LLM agents interact with other systems, leading to unintended actions from ambiguous outputs, such as executing unnecessary commands or accessing excessive privileges.
-9. **LLM09: Overreliance** can lead to the spread of misinformation, security vulnerabilities, and reputational damage due to their potential to produce authoritative but erroneous content. To mitigate risks, rigorous oversight, continuous validation, and disclaimers on risk are essential when using LLMs, especially in sensitive contexts.
-10. **LLM10: Model Theft** involves the unauthorized access and exfiltration of proprietary language models, leading to economic and reputational damage, loss of competitive advantage, and unauthorized usage. Attack vectors include exploiting vulnerabilities, insider threats, prompt injections, and functional model replication. Robust security measures are essential to mitigate these risks.
+* **LLM01: Prompt Injection** occurs when user prompts alter the LLM's behavior in unintended ways through direct or indirect inputs, potentially causing the model to violate guidelines, generate harmful content, enable unauthorized access, or influence critical decisions, even when the manipulated content is imperceptible to humans.  
+* **LLM02: Sensitive Information Disclosure** happens when LLMs expose confidential data, including personal information, financial details, health records, proprietary algorithms, and security credentials through their outputs, potentially leading to unauthorized data access, privacy violations, and intellectual property breaches.  
+* **LLM03: Supply Chain** vulnerabilities arise from dependencies on external components, training data, models, and deployment platforms, where risks include outdated components, licensing issues, vulnerable pre-trained models, and weak model provenance, potentially compromising the integrity and security of LLM applications.  
+* **LLM04: Data and Model Poisoning** occurs when pre-training, fine-tuning, or embedding data is manipulated to introduce vulnerabilities, backdoors, or biases, compromising model security, performance, and ethical behavior, leading to harmful outputs or impaired capabilities.  
+* **LLM05: Improper Output Handling** refers to insufficient validation, sanitization, and handling of LLM-generated outputs before they are passed downstream, potentially resulting in XSS, CSRF, SSRF, privilege escalation, or remote code execution in backend systems.  
+* **LLM06: Excessive Agency** occurs when an LLM-based system is granted too much capability to call functions or interface with other systems, leading to damaging actions from unexpected, ambiguous, or manipulated outputs due to excessive functionality, permissions, or autonomy.  
+* **LLM07: System Prompt Leakage** involves the risk of exposing system prompts or instructions that guide the model's behavior, potentially containing sensitive information that was not intended to be discovered and can be used to facilitate other attacks.
+* **LLM08: Vector and Embedding Weaknesses** present security risks in systems using Retrieval Augmented Generation (RAG), where vulnerabilities in vector generation, storage, or retrieval can be exploited to inject harmful content, manipulate model outputs, or access sensitive information.  
+* **LLM09: Misinformation** occurs when LLMs produce false or misleading information that appears credible, potentially leading to security breaches, reputational damage, and legal liability, often caused by hallucinations, biases, or incomplete information.  
+* **LLM10: Unbounded Consumption** refers to vulnerabilities allowing excessive and uncontrolled inferences in LLM applications, leading to denial of service, economic losses, model theft, and service degradation through resource exploitation and unauthorized usage.  
+
 
 
 

diff --git a/docs/class6/module4/lab1/lab1.rst b/docs/class6/module4/lab1/lab1.rst
@@ -1,44 +1,39 @@
-Prompt Security
-###############
+AI Gateway
+##########
 
-Prompt Security is a platform designed to protect organizations from the various risks associated with Generative AI (GenAI). It addresses several critical security concerns that arise from the use of AI technologies, particularly those involving large language models (LLMs).
+F5 **AI Gateway** is a specialized platform designed to route, protect, and manage generative AI traffic between clients and Large Language Model (LLM) backends. It addresses the unique challenges posed by AI applications, particularly their non-deterministic nature and the need for bidirectional traffic monitoring.
 
-Key Functions of Prompt Security:
+The main AI Gateway functions are:
 
-* **Protection Against Prompt Injection**: Prompt injection is a technique where attackers manipulate AI inputs to produce unintended or harmful outputs. Prompt Security helps prevent this by inspecting prompts and model responses to block harmful content and secure against GenAI-specific attacks
-* **Data Privacy and Intellectual Property Protection**: The platform aims to prevent data leaks and the unauthorized disclosure of proprietary information embedded in system prompts. This is crucial in maintaining data privacy and protecting intellectual property.
-* **Denial of Wallet/Service Mitigation**: These attacks involve excessive engagement with LLM-based applications, leading to resource overuse and potential financial costs. Prompt Security helps mitigate these risks by monitoring and managing resource consumption.
-* **Privilege Escalation Prevention**: By monitoring for and blocking prompts that could lead to unauthorized access, Prompt Security helps prevent privilege escalation, ensuring that AI systems do not grant more access than intended.
-* **Comprehensive Visibility and Governance**: The platform provides enterprise leaders with visibility and governance over AI tools used within their organizations, ensuring that AI adoption is secure and compliant with internal policies and regulations.
+* Implementing traffic steering policies
+* Inspects and filters client requests and LLM responses
+* Prevents malicious inputs from reaching LLM backends
+* Ensures safe LLM responses to clients
+* Protects against sensitive information leaks
+* Providing comprehensive logging of all requests and responses
+* Generating observability data through OpenTelemetry
 
-Accessing the **Prompt Security** UI
-------------------------------------
+Core
+""""
 
-1. Browse to https://prompt-security.workshop.emea.f5se.com/ and login into the system
+The AI Gateway core handles HTTP(S) requests destined for an LLM backend. It performs the following tasks:
 
-   .. table:: 
-      :widths: auto
+* Performs Authn/Authz checks, such as validating JWTs and inspecting request headers.
+* Parses and performs basic validation on client requests.
+* Applies processors to incoming requests, which may modify or reject the request.
+* Selects and routes each request to an appropriate LLM backend, transforming requests/responses to match the LLM/client schema.
+* Applies processors to the response from the LLM backend, which may modify or reject the response.
+* Optionally, stores an auditable record of every request/response and the specific activity of each processor. These records can be exported to AWS S3 or S3-compatible storage.
+* Generates and exports observability data via OpenTelemetry.
+* Provides a configuration interface (via API and a config file).
 
-      ====================    ========================================================================================
-      Object                  Value
-      ====================    ========================================================================================
-      **Username**            [email protected]
+Processors
+""""""""""
 
-      **Password**            Can be found in the documentation of the UDF
-      ====================    ========================================================================================
+A processor runs separately from the core and can perform one or more of the following actions on a request or response:
 
-2. Click on the **gear** icon in the top right corner → **Create homegrown applications connector**
+* **Modify**: A processor may rewrite a request or response. For example, by redacting credit card numbers.
+* **Reject**: A processor may reject a request or response, causing the core to halt processing of the given request/response.
+* **Annotate**: A processor may add tags or metadata to a request/response, providing additional information to the administrator. The core can also select the LLM backend based on these tags.
 
-3. Give the connector a name, this will represent your AI Security Policy config. When viewing or making changes allways make sure that you are using this connector.
-
-4. The policy has been created with best practices configuration. In order for us to explore the configuration and capabilities we will uncheck all the boxes, **do that now** and click **Save**
-
-5. Go to the **Deployment** tab and copy the **API key** when traffic will be sent to Prompt Security for inspection you will use this API Key to enable the policy you just created.
-
-6. Replace the **api-key** in the bellow curl command and run it
-
-   .. code-block:: none
-
-      curl -s -k -X POST https://$$hostArcadia$$/v1/ai/security-config \
-        -H "Content-Type: application/json" \
-        -d '{"llmSecurityHost":"prompt-security.workshop.emea.f5se.com", "llmSecurityAppId":"api-key"}'
+Each processor provides specific protection or transformation capabilities to AI Gateway. For example, a processor can detect and remove Personally Identifiable Information (PII) from the input or output of the AI model.