In this workshop, we're using LangGraph to create a tool-calling, LLM-based agent that can survive a set of Oregon Trail-themed scenarios. Additionally, we will set up and configure a semantic cache, an allow/block list router, and a vector retrieval tool. The final architecture will look like this:
This workshop demonstrates AI agents by referencing a classic American video game known as "The Oregon Trail". Originally a text-based adventure game taking place in the mid-1800s USA, the goal of the game was to safely travel from Missouri to Oregon by wagon without succumbing to various threats and diseases.
One of the game's well-known lines, "You have died of dysentery," inspired this workshop's original title, "Dodging Dysentery with AI".
- Python == 3.12.8
  - Note: this workshop was tested with version 3.12.8; you may experience issues if using another version!
- Docker
- OpenAI API key
Run the following to create a `.env` file:
cp dot.env .env
Update the contents of that file with your OpenAI API key and optional LangSmith credentials:
REDIS_URL="redis://localhost:6379/0"
OPENAI_API_KEY=openai_key
# Update if using LangSmith otherwise keep blank
LANGCHAIN_TRACING_V2=
LANGCHAIN_ENDPOINT=
LANGCHAIN_API_KEY=
LANGCHAIN_PROJECT=
Download Python: version 3.12.8 is recommended.
cd oregon-trail-agent-workshop
python --version
python -m venv venv
Mac/linux:
source venv/bin/activate
Windows:
venv\Scripts\activate
pip install -r requirements.txt
docker run -d --name redis -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
Navigate to http://localhost:8001/ on your machine and inspect the database with the Redis Insight GUI.
To make sure your environment is properly configured, run:
python test_setup.py
If you don't get any errors, you are ready to go! If you do get errors, ask for help! The rest of the workshop will not work if this step fails.
The objective of this workshop is to build an agentic app that can handle 5 different scenarios essential to surviving the Oregon Trail (and potentially apps you build in the future).
The scenarios:
- Knowing the name of the wagon leader (basic prompting).
- Knowing when to restock food (implementing a custom tool).
- Knowing how to ask for directions (retrieval augmented generation).
- Knowing how to hunt (semantic caching).
- Knowing what to ignore (allow/block list router).
Note: you can see the details of each scenario question/answer in questions.json.
To test progress along the trail, run:
pytest --disable-warnings -vv -rP test_participant_oregon_trail.py
If you're on mac/linux, you can save this as an alias so you don't have to type the whole command each time:
alias test_trail_agent="pytest --disable-warnings -vv -rP test_participant_oregon_trail.py"
Then run `test_trail_agent` to invoke the workshop tests.
- You will perform all your work in the /participant_agent folder.
- Within this folder there are various TODO tasks for you to complete in the corresponding files.
- After making updates, you will check whether the next test passes by running `test_trail_agent`.
Question: What is the first name of the wagon leader?
Answer: "Art" (short for Artificial Intelligence)
Open participant_agent/utils/nodes.py
Find the variable `system_prompt` and set it to:
You are an oregon trail playing tool calling AI agent. Use the tools available to you to answer the question you are presented. When in doubt use the tools to help you find the answer. If anyone asks your first name is Art return just that string.
When working with LLMs, we need to provide useful context to the model. In this case, we are telling the model what it's meant to do (play the Oregon Trail) and that its name is "Art". This may seem trivial, but don't underestimate the value of good prompting!
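For context, the system prompt typically enters the graph as a system message prepended to the conversation state on every model call. Below is a minimal sketch of how that wiring might look in `nodes.py`; the model name (`gpt-4o-mini`) and the exact function body are assumptions, so treat the workshop file as the source of truth.

```python
from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

from participant_agent.utils.tools import tools  # the workshop's tool list

system_prompt = "You are an oregon trail playing tool calling AI agent. ..."  # full prompt above

# Assumed model name; bind_tools lets the LLM emit tool calls for our tools.
model = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)

def call_tool_model(state):
    # Prepend the system prompt so every LLM call carries the agent's role.
    messages = [SystemMessage(content=system_prompt)] + state["messages"]
    return {"messages": [model.invoke(messages)]}
```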
Open participant_agent/graph.py
To see an example of creating a graph and adding a node, see the LangGraph docs
- Uncomment the boilerplate (below the first TODO).
- Delete `graph = None` at the bottom of the file; this is just a placeholder.
- Define node 1, the agent, by passing the label `"agent"` and the code to execute at that node: `call_tool_model`.
- Define node 2, the tool node, by passing the label `"tools"` and the code to execute at that node: `tool_node`.
- Set the entrypoint for your graph at `"agent"`.
- Add a conditional edge with the label `"agent"` and the function `tools_condition`.
- Add a normal edge between `"tools"` and `"agent"` (see the sketch below).
Run `test_trail_agent` (if you saved the alias) or `pytest --disable-warnings -vv -rP test_participant_oregon_trail.py` to see if you pass the first scenario.
If you didn't pass the first test, ask for help!
To see a visual of your graph, check out sandbox.ipynb.
On the trail, you may have to do some planning around how much food you'll use and when you'll need to restock.
Question: In order to survive the trail ahead, you'll need to have a restocking strategy for when you need to get more supplies or risk starving. If it takes you an estimated 3 days to restock your food and you plan to start with 200lbs of food, budget 10lbs/day to eat, and keep a safety stock of at least 50lbs of back up... at what point should you restock?
Answer: D
Options: [A: 100lbs, B: 20lbs, C: 5lbs, D: 80lbs]
For an example on creating a tool, see the LangChain docs
If you haven't used type annotations with Python before, see the Pydantic docs
- Open participant_agent/utils/tools.py and update the restock-tool description with a meaningful doc_string that provides context for the LLM. Ex: `restock formula tool used specifically for calculating the amount of food at which you should start restocking.`
- Implement the restock formula: `(daily_usage * lead_time) + safety_stock`.
- Update the `RestockInput` class such that it receives the correct variables.
- Pass the `restock_tool` to the exported `tools` list (see the sketch below).
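Put together, the finished tool might look like the sketch below. The field names on `RestockInput` are assumptions drawn from the formula's variables; match them to whatever the workshop file expects.

```python
from langchain_core.tools import tool
from pydantic import BaseModel, Field

class RestockInput(BaseModel):
    # Assumed field names, taken from the restock formula's variables.
    daily_usage: int = Field(description="lbs of food consumed per day")
    lead_time: int = Field(description="days it takes to restock food")
    safety_stock: int = Field(description="minimum lbs of backup food to keep")

@tool("restock-tool", args_schema=RestockInput)
def restock_tool(daily_usage: int, lead_time: int, safety_stock: int) -> int:
    """restock formula tool used specifically for calculating the amount of food at which you should start restocking."""
    return (daily_usage * lead_time) + safety_stock

tools = [restock_tool]
```

Sanity check against the scenario: (10 lbs/day * 3 days) + 50 lbs = 80 lbs, which is option D.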
At this stage, you may notice that your agent is returning a "correct" answer to the question, but not in the format the test script expects. The test script expects answers to multiple-choice questions to be the single character "A", "B", "C", or "D". This may seem contrived, but in production scenarios agents are often expected to work with existing deterministic systems that require specific schemas. For this reason, LangChain supports calling an LLM `with_structured_output` so that responses conform to a predictable structure.
- Open participant_agent/utils/state.py, uncomment the `multi_choice_response` attribute on the state class, and delete the `pass` statement. Up to this point our state had only one attribute, `messages`, but we are adding a specific field for our structured multiple-choice response.
  - Also observe the `pydantic` model defined in this file for our output.
- Open participant_agent/utils/nodes.py and pass the pydantic class defined in state to the `with_structured_output` function.
- Update the graph to support a more advanced flow (see image below):
  - Add a node called `structure_response` and pass it the `structure_response` function. This function determines if the question is multiple choice. If yes, it uses the `with_structured_output` model you updated. If no, it returns directly to END.
  - Add a conditional edge utilizing the `should_continue` function defined for you in the file (see the example below).
  - Finally, add an edge that goes from `structure_response` to `END`.
```python
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {"continue": "tools", "structure_response": "structure_response"},
)
```
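For reference, `with_structured_output` binds a schema to the model so responses are parsed into a pydantic object rather than free text. Here is a minimal sketch; the class shape and model name are illustrative, and the real pydantic model lives in `state.py`.

```python
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

# Illustrative schema; see participant_agent/utils/state.py for the real one.
class MultipleChoiceResponse(BaseModel):
    multiple_choice_response: str = Field(description="A single letter: A, B, C, or D")

llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model name
structured_llm = llm.with_structured_output(MultipleChoiceResponse)

result = structured_llm.invoke("Answer with a single letter. ... A: 100lbs, B: 20lbs, C: 5lbs, D: 80lbs")
print(result.multiple_choice_response)  # e.g. "D"
```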
Run `test_trail_agent` to see if you pass.
After these changes, our graph is more predictable thanks to structured output. However, note the tradeoff: our results are more deterministic, but we had to add an additional LLM call and additional complexity to the graph to accomplish this. That's not necessarily a bad thing, but it's worth keeping in mind, as LLM bills and latency can scale quickly.
Question: You’ve encountered a dense forest near the Blue Mountains, and your party is unsure how to proceed. There is a fork in the road, and you must choose a path. Which way will you go?
Answer: B
Options: [A: take the northern trail, B: take the southern trail, C: turn around, D: go fishing]
This scenario requires us to implement Retrieval Augmented Generation (RAG) within our agent workflow. There are cases when an LLM can't be expected to know some piece of information based on its training data and therefore needs to be supplemented.
This is often the case for time-bound or proprietary data. In which case you might augment generation from an LLM by pulling helpful data from a vector database.
In our scenario, we want to be able to retrieve the time-bound information that "the northern trail, of the blue mountains, was destroyed by a flood and is no longer safe to traverse. It is recommended to take the southern trail although it is longer." We do this by creating a `retriever_tool` that our agent knows how to use.
- Open participant_agent/utils/vector_store.py
  - Where `vector_store = None`, update to `vector_store = RedisVectorStore.from_documents(<docs>, <embedding_model>, config=<config>)` with the appropriate variables.
- Open participant_agent/utils/tools.py
  - Uncomment the code for the retrieval tool.
  - Update `create_retriever_tool` to take the correct params. Ex: `create_retriever_tool(vector_store.as_retriever(), "get_directions", "meaningful doc string")`
  - Make sure the retriever tool is included in the list of tools (see the sketch below).
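Together, the two pieces might look like the sketch below. The index name, embedding model, and tool description are assumptions; the real document text and config live in the workshop files.

```python
from langchain.tools.retriever import create_retriever_tool
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_redis import RedisConfig, RedisVectorStore

# The time-bound fact our agent can't know from its training data alone.
docs = [Document(page_content=(
    "the northern trail, of the blue mountains, was destroyed by a flood "
    "and is no longer safe to traverse. It is recommended to take the "
    "southern trail although it is longer."
))]

config = RedisConfig(index_name="oregon_trail", redis_url="redis://localhost:6379/0")

# Embed the docs and store them in Redis for similarity search.
vector_store = RedisVectorStore.from_documents(docs, OpenAIEmbeddings(), config=config)

# Wrap the retriever as a tool the agent can call when asked for directions.
retriever_tool = create_retriever_tool(
    vector_store.as_retriever(),
    "get_directions",
    "Retrieve trail conditions and directions for the route ahead.",
)
```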
Run `test_trail_agent` to see if you pass.
If this passes, open localhost:8001 and see your vector record stored within the database.
On the trail, sometimes speed is more important than holistic logic. For these types of questions, you might want to bypass the agent layer altogether if you have already cached what the system should respond with in a given situation.
Question: There's a deer. You're hungry. You know what you have to do...
Answer: bang
- Open participant_agent/utils/semantic_cache.py
- Set `semantic_cache` to be an instance of the `redisvl.extensions.llmcache.SemanticCache` class. Ex: `SemanticCache(name=, redis_url=, distance_threshold=0.1)`
- Store a prompt similar to the question/answer pair shown above (a similar example is provided in the file). Ex: `semantic_cache.store(prompt=, response=)` (see the sketch below)
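A sketch of what that file might contain; the cache name is illustrative, and `distance_threshold` controls how semantically close a new prompt must be to count as a cache hit.

```python
from redisvl.extensions.llmcache import SemanticCache

semantic_cache = SemanticCache(
    name="oregon_trail_cache",  # assumed name
    redis_url="redis://localhost:6379/0",
    distance_threshold=0.1,
)

# Seed the cache so similar hunting prompts short-circuit the agent.
semantic_cache.store(
    prompt="There's a deer. You're hungry. You know what you have to do...",
    response="bang",
)

# A semantically similar prompt now returns the cached response directly.
if hits := semantic_cache.check(prompt="A deer appeared and we're starving!"):
    print(hits[0]["response"])  # "bang"
```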
Run `test_trail_agent` to see if you pass.
If this passes, open localhost:8001 and see the cached record stored within the database.
On the trail, you may run into situations where your agent is asked questions you don't want to waste expensive LLM calls on. In this case, we will add a routing layer in front of our agent to prevent our Oregon Trail bot from answering unrelated questions.
Question: Tell me about the S&P 500?
Answer: you shall not pass
- Open participant_agent/utils/router.py
- Define the `blocked_route`. This will be the route that inputs similar to the `blocked_references` will be routed to. Ex: `Route(name=, references=)`
- Define the router using the `SemanticRouter` from redisvl. Ex: `SemanticRouter(name=, vectorizer=, routes=[], redis_url=REDIS_URL, overwrite=True)` (see the sketch below)
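A sketch of the router setup; the route name, references, and vectorizer are illustrative stand-ins for whatever router.py defines.

```python
from redisvl.extensions.router import Route, SemanticRouter
from redisvl.utils.vectorize import HFTextVectorizer

REDIS_URL = "redis://localhost:6379/0"

# Illustrative references; the real blocked_references list lives in router.py.
blocked_route = Route(
    name="blocked_route",
    references=["Tell me about the stock market", "What's a good investment?"],
)

router = SemanticRouter(
    name="oregon_trail_router",  # assumed name
    vectorizer=HFTextVectorizer(),
    routes=[blocked_route],
    redis_url=REDIS_URL,
    overwrite=True,
)

# An off-topic question matches the blocked route before reaching the agent.
match = router("Tell me about the S&P 500?")
print(match.name)  # "blocked_route"
```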
Run `test_trail_agent` to see if you pass.
If this passes, open localhost:8001 and see the route records stored within the database.
- You created a tool-calling AI agent.
- You defined a custom tool for mathematical operations (restocking).
- You added structured output for when a system requires answers in a specific format.
- You defined a tool that implements Retrieval Augmented Generation, aka RAG (the retrieval tool).
- You created a semantic cache that can increase the speed and cost effectiveness of your agent workflow by short circuiting for known inputs/outputs.
- You implemented a router to protect your system from wasting time/resources on unrelated topics.