Skip to content

Commit

Permalink
code and readme updated
Browse files Browse the repository at this point in the history
  • Loading branch information
jeffxtang committed Dec 11, 2024
1 parent 1a7fe9c commit fa06774
Show file tree
Hide file tree
Showing 5 changed files with 26 additions and 26 deletions.
38 changes: 19 additions & 19 deletions recipes/use_cases/email_agent/README.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
# Emagent - A Llama Powered Email Agent
# Building A Llama Powered Email Agent

This Emagent app shows how to build a Email agent app powered by Llama 3.1 8B running locally via Ollama (for privacy concern since Emagent is about your email). We'll start with building from scratch a basic agent with custom tool calling natively supported in Llama 3.1. The end goal is to cover all components of a production-ready agent app, acting as an assistant to your email, with great user experience: intuitive, engaging, efficient and reliable. We'll use Gmail as an example but any email client API's can be used instead.
This app shows how to build an email agent powered by Llama 3.1 8B running locally via Ollama. We'll start with building from scratch a basic agent with custom tool calling natively supported in Llama 3.1. The end goal is to cover all components of a production-ready agent app, acting as an assistant to your email, with great user experience: intuitive, engaging, efficient and reliable. We'll use Gmail as an example but any email client API's can be used instead.

Currently implemented features of Emagent include:
Currently implemented features include:
* search for emails and attachments
* get email detail
* reply to a specific email
* forward an email
* get summary of a PDF attachment
* draft and send an email

![](gmagent.png)
![](email_agent.png)

# Overview

Expand Down Expand Up @@ -61,20 +61,20 @@ It's time to see an agent app in action and enjoy some coding. Below is a previe
* do i have emails with attachment larger than 1mb?
* what kind of attachments for the email with subject papers to read?
* give me a summary of the pdf thinking_llm.pdf
* Draft an email to gmagent_test1@gmail.com saying working on it and will keep you updated. thanks for your patience.
* Draft an email to xxx@gmail.com saying working on it and will keep you updated. thanks for your patience.
* send the draft
* do i have any emails with attachment larger than 10mb?
* how about 5mb
* reply to the email saying thanks for sharing!
* forward the email to gmagent_test2@gmail.com
* forward the email to xxx@gmail.com
* how many emails do i have from [email protected]?
* how about from [email protected]?

[Here](./examples_log.txt) is an example interaction log with Emagent.

# Setup and Installation

If you feel intimated by the steps of the following Enable Gmail API section, you may want to check again the example asks (to see what you can ask to the agent) and the example log (to see the whole conversation with gmagent) - the devil's in the detail and all the glorious description of a powerful trendy agent may not mention the little details one has to deal with to build it.
If you feel intimated by the steps of the following Enable Gmail API section, you may want to check again the example asks (to see what you can ask to the agent) and the example log (to see the whole conversation with the agent) - the devil's in the detail and all the glorious description of a powerful trendy agent may not mention the little details one has to deal with to build it.

## Enable Gmail API
1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
Expand Down Expand Up @@ -133,15 +133,15 @@ You need to copy the URL above and open it in a browser - if you Sign in with Go

In the latter case, go to APIs & Services > OAuth consent screen > Test users, and click the + ADD USERS button, and you'll see this message: While publishing status is set to "Testing", only test users are able to access the app. Allowed user cap prior to app verification is 100, and is counted over the entire lifetime of the app.

After clicking Continue, check the Select all checkbox to enable both settings required for running Gmagent:
After clicking Continue, check the Select all checkbox to enable both settings required for running the agent:
```
View your email messages and settings.
Manage drafts and send emails.
```

Finally, copy the Authorization code and paste it to the Terminal, hit Enter and you'll see Gmagent's initial greeting (which will likely differ because the default temperature value 0.8 is used here - see [Ollama's model file](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values) for detail) such as:
Finally, copy the Authorization code and paste it to the Terminal, hit Enter and you'll see the agent's initial greeting (which will likely differ because the default temperature value 0.8 is used here - see [Ollama's model file](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values) for detail) such as:
```
Hello! I'm Gmagent, here to help you manage your Gmail account with ease.
Hello! I'm Email Agent, here to help you manage your email account with ease.
What would you like to do today? Do you want me to:
Expand All @@ -156,7 +156,7 @@ Let me know how I can assist you!
Your ask:
```

If you cancel here and run the command `python main.py --user_email <your_gmail_address>` again you should see the Gmagent greeting right away without the need to enter an authorization code, unless you enter a different Gmail address for the first time - in fact, for each authorized (added as a test user) Gmail address, a file `[email protected]` will be created which contains the authorized token.
If you cancel here and run the command `python main.py --email <your_gmail_address>` again you should see the agent greeting right away without the need to enter an authorization code, unless you enter a different Gmail address for the first time - in fact, for each authorized (added as a test user) Gmail address, a file `[email protected]` will be created which contains the authorized token.

See the example asks and interaction log above for the types of asks you may enter.

Expand Down Expand Up @@ -215,7 +215,7 @@ In fact, even though many hours of pre-processing work has been done to cover so

## Actual Function Call Implementation

For each defined custom function call, its implementation using the Gmail API is present in `gmagent.py`. For example, the `list_emails` is defined as follows:
For each defined custom function call, its implementation using the Gmail API is present in `email_agent.py`. For example, the `list_emails` is defined as follows:

```
def list_emails(query='', max_results=100):
Expand Down Expand Up @@ -256,7 +256,7 @@ The function will be called by our agent after a user ask such as "do i have ema
```

## The Agent class
Implemented also in `gmagent.py`, the Agent class uses 3 instance members to allow for contextual aware asks to Gmagent, making it have short-term memory:
Implemented also in `email_agent.py`, the Agent class uses 3 instance members to allow for contextual aware asks to the agent, making it have short-term memory:
1. `messages`: this list holds all the previous user asks and the function call results based on Llama's response to the user asks, making Llama able to answer follow-up questions such as "how about 5mb" (after initial ask "attachments larger than 10mb") or "how about from [email protected]" (after ask "any emails from [email protected]).
2. `emails`: this list holds a list of emails that matches the user query, so follow-up questions such as "what kind of attachments for the email with subject xxx" can be answered.
3. `draft_id`: this is used to handle the ask "send the draft" after an initial ask such as "draft an email to xxx".
Expand Down Expand Up @@ -284,11 +284,11 @@ result = func(**parameters)
... <post-processing>
```

When you try out Gmagent, you'll likely find that further pre- and post-processing still needed to make it production ready. In a great video on [Vertical LLM Agents](https://www.youtube.com/watch?v=eBVi_sLaYsc), Jake Heller said "after passes frankly even like 100 tests the odds that it will do on any random distribution of user inputs of the next 100,000, 100% accurately is very high" and "by the time you've dealt with like all the edge cases... there might be dozens of things you build into your application to actually make it work well and then you get to the prompting piece and writing out tests and very specific prompts and the strategy for how you break down a big problem into step by step by step thinking and how you feed in the information how you format that information the right way". That's what all the business logic is about. We'll cover decomposing a complicated ask and multi-step reasoning in a future version of Gmagent, and continue to explore the best possible way to streamline the pre- and post-processing.
When you try out the app, you'll likely find that further pre- and post-processing still needed to make it production ready. In a great video on [Vertical LLM Agents](https://www.youtube.com/watch?v=eBVi_sLaYsc), Jake Heller said "after passes frankly even like 100 tests the odds that it will do on any random distribution of user inputs of the next 100,000, 100% accurately is very high" and "by the time you've dealt with like all the edge cases... there might be dozens of things you build into your application to actually make it work well and then you get to the prompting piece and writing out tests and very specific prompts and the strategy for how you break down a big problem into step by step by step thinking and how you feed in the information how you format that information the right way". That's what all the business logic is about. We'll cover decomposing a complicated ask and multi-step reasoning in a future version of the app, and continue to explore the best possible way to streamline the pre- and post-processing.

## Debugging output

When running Gmagent, the detailed Llama returns, pre-processed tool call specs and the actual tool calling results are inside the `-------------------------` block, e.g.:
When running the app, the detailed Llama returns, pre-processed tool call specs and the actual tool calling results are inside the `-------------------------` block, e.g.:

-------------------------
Calling Llama...
Expand All @@ -297,7 +297,7 @@ Llama returned: {'function_name': 'list_emails', 'parameters': {'query': 'subjec

Calling tool to access Gmail API: list_emails, {'query': 'subject:papers to read has:attachment'}...

Tool calling returned: [{'message_id': '1936ef72ad3f30e8', 'sender': 'gmagent_tester1@gmail.com', 'subject': 'Fwd: papers to read', 'received_time': '2024-11-27 10:51:51 PST'}, {'message_id': '1936b819706a4923', 'sender': 'Jeff Tang <gmagent_tester2@gmail.com>', 'subject': 'papers to read', 'received_time': '2024-11-26 18:44:19 PST'}]
Tool calling returned: [{'message_id': '1936ef72ad3f30e8', 'sender': 'xxx@gmail.com', 'subject': 'Fwd: papers to read', 'received_time': '2024-11-27 10:51:51 PST'}, {'message_id': '1936b819706a4923', 'sender': 'Jeff Tang <xxx@gmail.com>', 'subject': 'papers to read', 'received_time': '2024-11-26 18:44:19 PST'}]

-------------------------

Expand All @@ -308,14 +308,14 @@ Tool calling returned: [{'message_id': '1936ef72ad3f30e8', 'sender': 'gmagent_te
2. Improve the search, reply, forward, create email draft, and query about types of attachments.
3. Improve the fallback and error handling mechanism when the user asks don't lead to a correct function calling spec or the function calling fails.
4. Improve the user experience by showing progress when some Gmail search API calls take long (minutes) to complete.
5. Implement the async behavior of Gmagent - schedule an email to be sent later.
5. Implement the async behavior of the agent - schedule an email to be sent later.
6. Implement the agent planning - decomposing a complicated ask into sub-tasks, using ReAct and other methods.
7. Implement the agent long-term memory - longer context and memory across sessions (consider using Llama Stack/MemGPT/Letta)
8. Implement reflection - on the tool calling spec and results.
9. Introduce multiple-agent collaboration.
10. Implement the agent observability.
11. Compare different agent frameworks using Gmagent as the case study.
12. Add and implement a test plan and productionize Gmagent.
11. Compare different agent frameworks using the app as the case study.
12. Add and implement a test plan and productionize the app.


# Resources
Expand Down
Binary file added recipes/use_cases/email_agent/email_agent.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Original file line number Diff line number Diff line change
Expand Up @@ -503,7 +503,7 @@ def __init__(self, system_prompt=""):
self.system_prompt = system_prompt
self.messages = []

# Gmagent-specific short term memory, used to answer follow up questions AFTER a list of emails is found matching user's query
# agent-specific short term memory, used to answer follow up questions AFTER a list of emails is found matching user's query
self.emails = []
self.draft_id = None

Expand Down Expand Up @@ -580,7 +580,7 @@ def __call__(self, user_prompt_or_tool_result, is_tool_call=False):
elif function_name == "send_draft":
output = result

print(f"\n-------------------------\n\nGmagent: {output}\n")
print(f"\n-------------------------\n\nAgent: {output}\n")
else:
output = result # direct text, not JSON, response by Llama

Expand Down
Binary file removed recipes/use_cases/email_agent/gmagent.png
Binary file not shown.
10 changes: 5 additions & 5 deletions recipes/use_cases/email_agent/main.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,17 @@
import argparse
import gmagent
from gmagent import *
import email_agent
from email_agent import *
from functions_prompt import system_prompt


def main():
parser = argparse.ArgumentParser(description="Set email address")
parser.add_argument("--gmail", type=str, required=True, help="Your Gmail address")
parser.add_argument("--email", type=str, required=True, help="Your Gmail address")
args = parser.parse_args()

gmagent.set_email_service(args.gmail)
email_agent.set_email_service(args.email)

greeting = llama31("hello", "Your name is Gmagent, an assistant that can perform all Gmail related tasks for your user.")
greeting = llama31("hello", "Your name is Email Agent, an assistant that can perform all email related tasks for your user.")
agent_response = f"{greeting}\n\nYour ask: "
agent = Agent(system_prompt)

Expand Down

0 comments on commit fa06774

Please sign in to comment.