Commit d5804ec (1 parent: e3590eb)
Showing 1 changed file with 1,776 additions and 0 deletions.
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# How to get the most out of this notebook\n", | ||
"## Main points to understand first, Tl;Dr\n", | ||
"* Don't worry about artificial distinctions or paradigms. Don't be afraid of new things.\n", | ||
"* Instead, use specific sets of tools like type checkers, design patterns, and language features to your advantage.\n", | ||
"* Use those tools judiciously to achieve your goals (like program efficiency, user/programmer experience, and correctness). \n", | ||
"* Correctness is _the single most important_ property of a program.\n", | ||
"\n", | ||
"## Steps\n", | ||
"* Open in VSCode with Pylance enabled. \n", | ||
"* Make sure your pyrightconfig.json has \"typeCheckingMode\": \"strict\"\n", | ||
"* Hover over code to get inferences from Pyright. \n", | ||
" * This includes both inferred types and errors.\n", | ||
"* Go through the cells in order, reading the markdown text and code in order. It's like a story book.\n", | ||
"\n", | ||
"### Programming Concepts (Feel free to skip if you're in a hurry)\n", | ||
"* There are no \"paradigms\". OO or functional programming are not categories or \"paradigms\" of languages, but instead (fuzzy) sets of PL _features_. \n", | ||
" * E.g. is Python OO or FP? Well, it has classes, objects, and methods. It also has map and filter, as well as very Pythonic syntactic sugar for complex compositions of those higher-order functions (it's called a list comprehension).\n", | ||
"* There is also no well-defined idea of \"strong\" or \"weak\" typing. These terms are used loosely to refer to certain sets of PL typing features. Usually, \"strongly typed\" means there are certain obligatory type annotations (see below). \n", | ||
"* Static typing vs. dynamic typing. \n", | ||
" * It is NOT an either-or. It is NOT a spectrum. It is at least two independent dimensions. \n", | ||
" * Case in point: C is statically but not dynamically typed. Java is both statically and dynamically typed. Python2 is dynamically but not statically typed. Assembly is neither.\n", | ||
" * Typing is a set of _features_ of a PL (programming language), not a type of PL. \n", | ||
"* Type _inference_ vs. type _annotation_.\n", | ||
" * Type inference is the process of outputting type information given source code. Since it is done only from source code and not from program execution, by definition the type information is _static_. This process is done in different ways by different static analysis tools like ghc, tsc, javac or Pyright. Essentially, this is where the _computer_ tells _you_ (and itself) the static type.\n", | ||
" * Type inference is not the same thing as manual type annotation or type hinting. Type annotation is the inclusion of explicit type information inside source code for use by both static analysis tools and human readers. Sometimes it is obligatory (e.g. Java) and sometimes not (e.g. Python3). Essentially, this is where _you_ tell the _computer_ the static type.\n", | ||
"* Type safety and proof of correctness\n", | ||
" * Correctness is the most important property of a program. If it outputs the wrong answer (or crashes, or hangs forever) then it is useless no matter how efficiently it runs, how nice the user/programmer interface is, or how maintainable the code is. EXAMPLES:\n", | ||
" * Consider a web app for buying plane tickets. Imagine it takes 1 minute to get a confirmation. Now imagine it runs in 1 second, but you show up to the airport and the flight is actually sold out and they don't have your confirmation.\n", | ||
" * Some institutions (like banks) run extremely old Cobol programs that no one can reliably change anymore. It still works, and it's better than no working code. \n", | ||
" * Type inference and annotation/hinting are each features of PLs that can be used in different ways by static analysis tools to reduce program errors.\n", | ||
" * For example, you can ask Pyright to infer all of your types with no explicit annotations. You can further constrain your program's types by adding annotations.\n", | ||
" * When considering program correctness, there is no important difference between an explicit, annotated type and an inferred type. They are different kinds of _constraints_ on the static types of a program.\n", | ||
" * Using a combination of constraints from type hints and inference procedures, tools like Pyright can statically detect errors in your program if they exist. \n", | ||
" * This is also what compilers like javac (or gcc) do. If you get the types wrong, javac will fail and not output any Java bytecode. If it _does_ compile (or if Pyright reports no errors), then you are guaranteed a certain degree of correctness during program execution, depending on the exact PL and static analysis specifications. \n", | ||
" * Notice that the goal of type safety is the same as the goal of test cases: proving correctness. The more you lean on your static analysis tools, the less you have to rely on tests. Put another way, static analysis will save you time to write _better_ tests that can prevent even more errors. \n", | ||
" * Also notice that tests cannot run statically. You have to actually run some piece of your code to get a proof of correctness from a test case.\n", | ||
"* Functions as data\n", | ||
" * Functions are already data no matter what PL you are working in. Your PL may be hiding this fact via its own higher-level abstractions. At the lowest level (i.e. in the von Neumann architecture), programs, which includes functions, are data.\n", | ||
" * Some PLs introduce high-level abstractions that maintain the treatment of functions as data. Many of these are known as \"functional PLs\", but that term is misleading just like \"strongly typed\". \n", | ||
" * For example, like Haskell, Python and Java have high-level abstractions for functions as data. Namely, anonymous functions (lambda expressions)." | ||
] | ||
}, | ||
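The intro's points that functions are ordinary values and that a list comprehension is Pythonic sugar for map/filter compositions can be sketched as follows (a minimal illustration, not from the notebook):

```python
# A function is an ordinary value: it can be stored, passed, and returned.
double = lambda x: x * 2

nums = [1, 2, 3, 4]

# Higher-order composition: keep the evens, then map double over them.
via_hof = list(map(double, filter(lambda x: x % 2 == 0, nums)))

# The same composition as a list comprehension (the "syntactic sugar").
via_comprehension = [double(x) for x in nums if x % 2 == 0]

assert via_hof == via_comprehension == [4, 8]
```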
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Part 0\n", | ||
"\n", | ||
"Please install lm utils 0.0.21:\n", | ||
"\n", | ||
"`pip install lastmile-utils==0.0.21 --force`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Part 1: Basic error handling\n", | ||
"Let's say we're building a local editor that allows you to load an AIConfig\n", | ||
"from a local file and then run methods on it.\n", | ||
"\n", | ||
"In the (simplified) code below, we do just that." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 16, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Loaded AIConfig: NYC Trip Planner\n", | ||
"\n", | ||
"\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"import json\n", | ||
"from typing import Any\n", | ||
"\n", | ||
"\n", | ||
"def read_json_from_file(path: str) -> dict[str, Any]:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return json.loads(f.read())\n", | ||
" \n", | ||
"\n", | ||
"def start_app(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
" aiconfig = read_json_from_file(path)\n", | ||
" print(f\"Loaded AIConfig: {aiconfig['name']}\\n\")\n", | ||
"\n", | ||
"\n", | ||
"start_app(\"cookbooks/Getting-Started/travel.aiconfig.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Cool, LGTM, ship it!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# A few hours later..." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Issue #9000 Editor crashes on new file path\n", | ||
"### opened 2 hours ago by lastmile-biggest-fan\n", | ||
"\n", | ||
"Dear LastMile team,\n", | ||
"I really like the editor, but when I give it a new file path, it crashes!\n", | ||
"I was hoping it would create a new AIConfig for me and write it to the file..." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# OK, what happened?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 17, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"ename": "FileNotFoundError", | ||
"evalue": "[Errno 2] No such file or directory: 'i-dont-exist-yet-please-create-me.json'", | ||
"output_type": "error", | ||
"traceback": [ | ||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | ||
"\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)", | ||
"Cell \u001b[0;32mIn[17], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mstart_app\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mi-dont-exist-yet-please-create-me.json\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", | ||
"Cell \u001b[0;32mIn[16], line 11\u001b[0m, in \u001b[0;36mstart_app\u001b[0;34m(path)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mstart_app\u001b[39m(path: \u001b[38;5;28mstr\u001b[39m):\n\u001b[1;32m 10\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\u001b[39;00m\n\u001b[0;32m---> 11\u001b[0m aiconfig \u001b[38;5;241m=\u001b[39m json\u001b[38;5;241m.\u001b[39mloads(\u001b[43mread_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43mpath\u001b[49m\u001b[43m)\u001b[49m)\n\u001b[1;32m 12\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mLoaded AIConfig: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00maiconfig[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mname\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 13\u001b[0m \u001b[38;5;28mprint\u001b[39m()\n", | ||
"Cell \u001b[0;32mIn[16], line 5\u001b[0m, in \u001b[0;36mread_file\u001b[0;34m(path)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mread_file\u001b[39m(path: \u001b[38;5;28mstr\u001b[39m) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28mstr\u001b[39m:\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m \u001b[38;5;28;43mopen\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mpath\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mr\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m \u001b[38;5;28;01mas\u001b[39;00m f:\n\u001b[1;32m 6\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m f\u001b[38;5;241m.\u001b[39mread()\n", | ||
"File \u001b[0;32m/opt/homebrew/Caskroom/miniconda/base/envs/aiconfig/lib/python3.10/site-packages/IPython/core/interactiveshell.py:310\u001b[0m, in \u001b[0;36m_modified_open\u001b[0;34m(file, *args, **kwargs)\u001b[0m\n\u001b[1;32m 303\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m file \u001b[38;5;129;01min\u001b[39;00m {\u001b[38;5;241m0\u001b[39m, \u001b[38;5;241m1\u001b[39m, \u001b[38;5;241m2\u001b[39m}:\n\u001b[1;32m 304\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m 305\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mIPython won\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mt let you open fd=\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mfile\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m by default \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 306\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mas it is likely to crash IPython. If you know what you are doing, \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 307\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124myou can use builtins\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m open.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 308\u001b[0m )\n\u001b[0;32m--> 310\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mio_open\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfile\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", | ||
"\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: 'i-dont-exist-yet-please-create-me.json'" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"start_app(\"i-dont-exist-yet-please-create-me.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Oops" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Ok, let's diagnose the problem here. We forgot to handle the case where the path doesn't exist.\n", | ||
"\n", | ||
"That's understandable. As programmers, we don't always write perfect code.\n", | ||
"Sometimes it's helpful to bring new tools into the workflow to prevent this kind of problem in the future.\n", | ||
"\n", | ||
"\n", | ||
"Hmm, ok. Wouldn't it be nice if we had a static analyzer that could have caught this problem immediately? That way we could have fixed it before the initial PR was merged.\n", | ||
"\n", | ||
"Let's analyze some tools." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## V2: Optional\n", | ||
"\n", | ||
"First, let's fix the root cause and catch exceptions. Now, what do we do in the `except` block? \n", | ||
"\n", | ||
"Well, we can reraise, but that brings us right back to the previous case and doesn't achieve anything helpful. \n", | ||
"\n", | ||
"Instead, notice what happens if we return None and type hint the function accordingly (Optional[...])." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"\n", | ||
"[Pyright] Object of type \"None\" is not subscriptable\n", | ||
"PylancereportOptionalSubscript\n", | ||
"(variable) aiconfig: dict[str, Any] | None\n", | ||
"\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"from typing import Any, Optional\n", | ||
"\n", | ||
"\n", | ||
"def read_json_from_file(path: str) -> Optional[dict[str, Any]]:\n", | ||
" try:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return json.loads(f.read())\n", | ||
" except Exception as e:\n", | ||
" return None\n", | ||
" \n", | ||
"\n", | ||
"def start_app(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
" aiconfig = read_json_from_file(path)\n", | ||
" print(f\"Loaded AIConfig: {aiconfig['name']}\\n\")\n", | ||
"\n", | ||
"print(\"\"\"\n", | ||
"[Pyright] Object of type \"None\" is not subscriptable\n", | ||
"PylancereportOptionalSubscript\n", | ||
"(variable) aiconfig: dict[str, Any] | None\n", | ||
"\"\"\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Aha!\n", | ||
"\n", | ||
"Now, Pyright immediately tells us that `None` is a possibility, and we have to handle this case. Let's do that.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 31, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Loaded AIConfig: NYC Trip Planner\n", | ||
"\n", | ||
"Loaded AIConfig: \n", | ||
"\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"from typing import Optional\n", | ||
"from aiconfig.Config import AIConfigRuntime\n", | ||
"\n", | ||
"\n", | ||
"\n", | ||
"def read_json_from_file(path: str) -> Optional[dict[str, Any]]:\n", | ||
" try:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return json.loads(f.read())\n", | ||
" except Exception:\n", | ||
" return None\n", | ||
"\n", | ||
"def start_app(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
" aiconfig = read_json_from_file(path)\n", | ||
" if aiconfig is None:\n", | ||
" print(f\"Could not load AIConfig from path: {path}. Creating and saving.\")\n", | ||
" aiconfig = json.dumps(AIConfigRuntime.create())\n", | ||
" # [save the aiconfig to the path] \n", | ||
" print(f\"Loaded and saved new AIConfig\\n\")\n", | ||
" else:\n", | ||
" print(f\"Loaded AIConfig: {aiconfig}\\n\")\n", | ||
"\n", | ||
"start_app(\"cookbooks/Getting-Started/travel.aiconfig.json\")\n", | ||
"start_app(\"i-dont-exist-yet-please-create-me.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Ok, cool, much better. But wait, it would be nice to retain some information about what went wrong. My `None` value doesn't tell me anything about why the AIConfig couldn't be loaded. Does the file not exist? Was it a permission problem, networked filesystem problem? etc." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# V3: Result" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The result library (https://github.com/rustedpy/result) provides a neat type\n", | ||
"called `Result`, which is a bit like Optional. It's parametrized by the value type just like optional, but also by a second type for the error case.\n", | ||
"\n", | ||
"We can use it like optional, but store an arbitrary value with information about what went wrong. Result also has more nice features we'll get to later." | ||
] | ||
}, | ||
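A minimal self-contained sketch of the two-type-parameter idea (these tiny `Ok`/`Err` classes are stand-ins for the result library's, defined here only for illustration):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")

# The success type and the error type are independent type parameters.
@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T

@dataclass(frozen=True)
class Err(Generic[E]):
    error: E

Result = Union[Ok[T], Err[E]]

def parse_int(s: str) -> "Result[int, str]":
    # On failure we return a descriptive message, not just None.
    try:
        return Ok(int(s))
    except ValueError:
        return Err(f"not an integer: {s!r}")

assert parse_int("42") == Ok(42)
assert parse_int("oops") == Err("not an integer: 'oops'")
```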
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Loaded AIConfig: NYC Trip Planner\n", | ||
"\n", | ||
"Could not load AIConfig from path: i-dont-exist-yet-please-create-me.json (File not found at path: i-dont-exist-yet-please-create-me.json). Creating and saving.\n", | ||
"Created and saved new AIConfig: \n", | ||
"\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"from aiconfig.Config import AIConfigRuntime\n", | ||
"from result import Result, Ok, Err\n", | ||
"from typing import Any\n", | ||
"import json\n", | ||
"from json import JSONDecodeError\n", | ||
"\n", | ||
"\n", | ||
"def read_json_from_file(path: str) -> Result[dict[str, Any], str]: \n", | ||
" \"\"\"\n", | ||
" The idea of this function is to quarantine the exceptions we are stuck with when using\n", | ||
" external code. We can't stop json.loads from raising, and we can't check it statically, \n", | ||
" but we can immediately catch anything raised at the lower boundary of our application.\n", | ||
" This allows us to type check everything _above_ this function, enabling type-safe reuse.\n", | ||
"\n", | ||
" Specifically, we use a string in the error case to contain a helpful error message.\n", | ||
" \"\"\"\n", | ||
" try:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return Ok(json.loads(f.read()))\n", | ||
" except FileNotFoundError:\n", | ||
" return Err(f\"File not found at path: {path}\")\n", | ||
" except OSError as e:\n", | ||
" return Err(f\"Could not read file at path: {path}: {e}\")\n", | ||
" except JSONDecodeError as e:\n", | ||
" return Err(f\"Could not parse JSON at path: {path}: {e}\")\n", | ||
"\n", | ||
"def start_app(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
" file_contents = read_json_from_file(path)\n", | ||
" match file_contents:\n", | ||
" case Ok(aiconfig_ok):\n", | ||
" print(f\"Loaded AIConfig: {aiconfig_ok['name']}\\n\")\n", | ||
" case Err(e):\n", | ||
" print(f\"Could not load AIConfig from path: {path} ({e}). Creating and saving.\")\n", | ||
" aiconfig = AIConfigRuntime.create().model_dump(exclude=\"callback_manager\")\n", | ||
" # [Save to file path]\n", | ||
" # aiconfig.save(path)\n", | ||
" print(f\"Created and saved new AIConfig: {aiconfig['name']}\\n\")\n", | ||
"\n", | ||
"start_app(\"cookbooks/Getting-Started/travel.aiconfig.json\")\n", | ||
"start_app(\"i-dont-exist-yet-please-create-me.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"There are several nice things about this pattern:\n", | ||
"* If you fail to check for the error case, you get static errors similar to the `None` Optional case\n", | ||
"* You also get specific, useful error information unlike Optional\n", | ||
"* Structural pattern matching: When matching the cases, you can elegantly and safely unbox the data inside the result.\n", | ||
"* Because of pyright's ability to check for exhaustive pattern matching, it will yell at you if you don't handle the Err case. Try it! Comment out the Err case." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Part 2: Composition" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Cool, so we have a very basic example of better error handling. What about a more realistic level of complexity involving a sequence of chained operations? Consider this variant of the previous app example:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"ename": "KeyError", | ||
"evalue": "'text'", | ||
"output_type": "error", | ||
"traceback": [ | ||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | ||
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", | ||
"Cell \u001b[0;32mIn[5], line 37\u001b[0m\n\u001b[1;32m 32\u001b[0m prompts \u001b[38;5;241m=\u001b[39m get_prompt_text_list(aiconfig)\n\u001b[1;32m 33\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m HTTPResponse(\u001b[38;5;241m200\u001b[39m, JSON({\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprompts\u001b[39m\u001b[38;5;124m\"\u001b[39m: prompts}))\n\u001b[0;32m---> 37\u001b[0m \u001b[43mendpoint\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcookbooks/Getting-Started/travel.aiconfig.json\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", | ||
"Cell \u001b[0;32mIn[5], line 32\u001b[0m, in \u001b[0;36mendpoint\u001b[0;34m(path)\u001b[0m\n\u001b[1;32m 30\u001b[0m contents \u001b[38;5;241m=\u001b[39m read_file(path)\n\u001b[1;32m 31\u001b[0m aiconfig \u001b[38;5;241m=\u001b[39m parse_json(contents)\n\u001b[0;32m---> 32\u001b[0m prompts \u001b[38;5;241m=\u001b[39m \u001b[43mget_prompt_text_list\u001b[49m\u001b[43m(\u001b[49m\u001b[43maiconfig\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 33\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m HTTPResponse(\u001b[38;5;241m200\u001b[39m, JSON({\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprompts\u001b[39m\u001b[38;5;124m\"\u001b[39m: prompts}))\n", | ||
"Cell \u001b[0;32mIn[5], line 26\u001b[0m, in \u001b[0;36mget_prompt_text_list\u001b[0;34m(aiconfig)\u001b[0m\n\u001b[1;32m 25\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mget_prompt_text_list\u001b[39m(aiconfig: JSON) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28mlist\u001b[39m[\u001b[38;5;28mstr\u001b[39m]:\n\u001b[0;32m---> 26\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m [prompt[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtext\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;28;01mfor\u001b[39;00m prompt \u001b[38;5;129;01min\u001b[39;00m aiconfig[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprompts\u001b[39m\u001b[38;5;124m\"\u001b[39m]]\n", | ||
"Cell \u001b[0;32mIn[5], line 26\u001b[0m, in \u001b[0;36m<listcomp>\u001b[0;34m(.0)\u001b[0m\n\u001b[1;32m 25\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mget_prompt_text_list\u001b[39m(aiconfig: JSON) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28mlist\u001b[39m[\u001b[38;5;28mstr\u001b[39m]:\n\u001b[0;32m---> 26\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m [\u001b[43mprompt\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mtext\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;28;01mfor\u001b[39;00m prompt \u001b[38;5;129;01min\u001b[39;00m aiconfig[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprompts\u001b[39m\u001b[38;5;124m\"\u001b[39m]]\n", | ||
"\u001b[0;31mKeyError\u001b[0m: 'text'" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"import json\n", | ||
"from typing import NewType\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"@dataclass\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
"\n", | ||
"def read_file(path: str) -> str:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return f.read()\n", | ||
" \n", | ||
"\n", | ||
"def parse_json(json_str: str) -> JSON:\n", | ||
" return json.loads(json_str)\n", | ||
"\n", | ||
"\n", | ||
"def read_json_from_file(path: str) -> JSON:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return json.loads(f.read())\n", | ||
"\n", | ||
"def get_prompt_text_list(aiconfig: JSON) -> list[str]:\n", | ||
" return [prompt[\"text\"] for prompt in aiconfig[\"prompts\"]]\n", | ||
"\n", | ||
"\n", | ||
"def endpoint(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
" contents = read_file(path)\n", | ||
" aiconfig = parse_json(contents)\n", | ||
" prompts = get_prompt_text_list(aiconfig)\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"\n", | ||
"\n", | ||
"endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This is obviously bad. But it's not fatal, right? The server framework will handle this with catch-all exception handling. But that's not great either: the front-end can't do anything useful with a default error response.\n", | ||
"\n", | ||
"It's better if we return specific codes as part of an explicit contract. Let's create an application protocol." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"HTTPResponse(status_code=567, body={'error': \"Could not find key: 'text'\"})" | ||
] | ||
}, | ||
"execution_count": 7, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"import json\n", | ||
"from typing import NewType\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
"\n", | ||
"def read_file(path: str) -> str:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return f.read()\n", | ||
" \n", | ||
"\n", | ||
"def parse_json(json_str: str) -> JSON:\n", | ||
" return json.loads(json_str)\n", | ||
"\n", | ||
"\n", | ||
"def get_prompt_text_list(aiconfig: JSON) -> list[str]:\n", | ||
" return [prompt[\"text\"] for prompt in aiconfig[\"prompts\"]]\n", | ||
"\n", | ||
"\n", | ||
"def prompt_list_to_http_response(prompts: list[str]) -> HTTPResponse:\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"def endpoint(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
"\n", | ||
" try:\n", | ||
" contents = read_file(path)\n", | ||
" aiconfig = parse_json(contents)\n", | ||
" prompts = get_prompt_text_list(aiconfig)\n", | ||
" response = prompt_list_to_http_response(prompts)\n", | ||
" return response\n", | ||
" except FileNotFoundError:\n", | ||
" return HTTPResponse(404, JSON({\"error\": \"File not found\"}))\n", | ||
" except OSError as e:\n", | ||
" return HTTPResponse(502, JSON({\"error\": f\"Could not read file: {e}\"}))\n", | ||
" except json.JSONDecodeError as e:\n", | ||
" return HTTPResponse(555, JSON({\"error\": f\"Could not parse JSON: {e}\"}))\n", | ||
" except KeyError as e:\n", | ||
" return HTTPResponse(567, JSON({\"error\": f\"Could not find key: {e}\"}))\n", | ||
"\n", | ||
"\n", | ||
"endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This is better, but we can't reuse any of our helper functions! Every endpoint will have to repeat this error handling. \n", | ||
"\n", | ||
"Also, remember that exceptions are not statically checkable. Even if Pyright passes, we can accidentally raise (now or in the future) some other exception, and it will bubble up above `endpoint()` and break our protocol.\n", | ||
"\n", | ||
"OK, new version:\n" | ||
] | ||
}, | ||
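The claim that raised exceptions are statically invisible can be demonstrated in isolation with a tiny hypothetical helper (not part of the app code): nothing in a Python signature declares what a function can raise, so the checker cannot force callers to handle it.

```python
def halve(n: int) -> int:
    # The signature says nothing about raising; Python has no checked
    # exceptions, so this ValueError is invisible to Pyright's callers.
    if n % 2 != 0:
        raise ValueError(f"{n} is odd")
    return n // 2

# The call site type-checks cleanly, yet can still blow up at runtime.
assert halve(10) == 5
try:
    halve(3)
except ValueError:
    pass  # only discoverable by running the code, not by static analysis
```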
{ | ||
"cell_type": "code", | ||
"execution_count": 24, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"HTTPResponse(status_code=567, body={'response_type': 'error', 'data': {'message': \"'text'\", 'category': 'GetPromptError', 'exception': \"'text'\"}})" | ||
] | ||
}, | ||
"execution_count": 24, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"from enum import Enum\n", | ||
"import json\n", | ||
"from typing import NewType\n", | ||
"from result import Result, Ok, Err\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class InternalFailure:\n", | ||
" \"\"\"\n", | ||
" This type defines the server-client contract for server-side failures.\n", | ||
" It contains metadata about the failure, including a message, category, and the exception.\n", | ||
" Use the `to_json()` method to construct a corresponding HTTP response.\n", | ||
" \"\"\"\n", | ||
" class Category(Enum):\n", | ||
" FileSystemError = \"FileSystemError\"\n", | ||
" ParseJSONError = \"ParseJSONError\"\n", | ||
" GetPromptError = \"GetPromptError\"\n", | ||
"\n", | ||
" message: str\n", | ||
" category: Category\n", | ||
" exception: Exception\n", | ||
"\n", | ||
" @staticmethod\n", | ||
" def from_exception(\n", | ||
" exception: Exception,\n", | ||
" category: Category \n", | ||
" ) -> \"InternalFailure\":\n", | ||
" return InternalFailure(\n", | ||
" message=str(exception),\n", | ||
" category=category,\n", | ||
" exception=exception,\n", | ||
" ) \n", | ||
"\n", | ||
" def to_json(self) -> JSON:\n", | ||
" return JSON({\n", | ||
" \"message\": self.message,\n", | ||
" \"category\": self.category.value,\n", | ||
" \"exception\": str(self.exception),\n", | ||
" })\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
"\n", | ||
"def read_file(path: str) -> Result[str, InternalFailure]:\n", | ||
" try:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return Ok(f.read())\n", | ||
" except Exception as e:\n", | ||
" return Err(InternalFailure.from_exception(e, InternalFailure.Category.FileSystemError))\n", | ||
" \n", | ||
"\n", | ||
"def parse_json(json_str: str) -> Result[JSON, InternalFailure]:\n", | ||
" try: \n", | ||
" return Ok(json.loads(json_str))\n", | ||
" except json.JSONDecodeError as e:\n", | ||
" return Err(InternalFailure.from_exception(e, InternalFailure.Category.ParseJSONError))\n", | ||
"\n", | ||
"\n", | ||
"\n", | ||
"def get_prompt_text_list(aiconfig: JSON) -> Result[list[str], InternalFailure]:\n", | ||
" try:\n", | ||
" return Ok([prompt[\"text\"] for prompt in aiconfig[\"prompts\"]])\n", | ||
" except KeyError as e:\n", | ||
" return Err(InternalFailure.from_exception(e, InternalFailure.Category.GetPromptError))\n", | ||
"\n", | ||
"\n", | ||
"def prompt_list_to_http_response(prompts: list[str]) -> HTTPResponse:\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"\n", | ||
"def internal_failure_to_http_response(failure: InternalFailure) -> HTTPResponse:\n", | ||
" \"\"\"\n", | ||
" Statically guarantee exhaustive matching on InternalFailure.Type.\n", | ||
" Try commenting out one of the cases to see what happens!\n", | ||
" Try adding a new case that isn't one of the enum values to see what happens!\n", | ||
" \"\"\"\n", | ||
" def _get_code(category: InternalFailure.Category) -> int:\n", | ||
" match category:\n", | ||
" case InternalFailure.Category.FileSystemError:\n", | ||
" return 502\n", | ||
" case InternalFailure.Category.ParseJSONError:\n", | ||
" return 555\n", | ||
" case InternalFailure.Category.GetPromptError:\n", | ||
" return 567\n", | ||
" \n", | ||
" code = _get_code(failure.category)\n", | ||
" return HTTPResponse(code, JSON({\"response_type\": \"error\", \"data\": failure.to_json()}))\n", | ||
"\n", | ||
"\n", | ||
"def endpoint(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
"\n", | ||
" file_contents = read_file(path)\n", | ||
" match file_contents:\n", | ||
" # Try removing one of the Err cases to see what happens!\n", | ||
" case Ok(file_contents_ok):\n", | ||
" parsed_json = parse_json(file_contents_ok)\n", | ||
" match parsed_json:\n", | ||
" case Ok(parsed_json_ok):\n", | ||
" prompts = get_prompt_text_list(parsed_json_ok)\n", | ||
" match prompts:\n", | ||
" case Ok(prompts_ok):\n", | ||
" return prompt_list_to_http_response(prompts_ok)\n", | ||
" case Err(e):\n", | ||
" return internal_failure_to_http_response(e)\n", | ||
" case Err(e):\n", | ||
" return internal_failure_to_http_response(e)\n", | ||
" case Err(e):\n", | ||
" return internal_failure_to_http_response(e)\n", | ||
"\n", | ||
"\n", | ||
"endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Cool, so we have an internal failure type that allows us to easily respect our client-server contract, our functions are type-safe and reusable, and our endpoints are guaranteed to be correct (i.e. respect the contract).\n", | ||
"\n", | ||
"But now we have to do this annoying nested error checking! \n", | ||
"\n", | ||
"(Note that Optional would have the same problem, plus no way to distinguish between different errors. They all get converted to None.)\n", | ||
"\n", | ||
"Luckily, Result has a really nice way to deal with this situation." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 12, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"HTTPResponse(status_code=567, body={'error': \"Could not find key: 'text'\"})" | ||
] | ||
}, | ||
"execution_count": 12, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"from enum import Enum\n", | ||
"import json\n", | ||
"from typing import NewType\n", | ||
"from result import Result, Ok, Err\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class InternalFailure:\n", | ||
" \"\"\"\n", | ||
" This type defines the server-client contract for server-side failures.\n", | ||
" It contains metadata about the failure, including a message, category, and the exception.\n", | ||
" Use the `to_json()` method to construct a corresponding HTTP response.\n", | ||
" \"\"\" \n", | ||
" class Category(Enum):\n", | ||
" FileSystemError = \"FileSystemError\"\n", | ||
" ParseJSONError = \"ParseJSONError\"\n", | ||
" GetPromptError = \"GetPromptError\"\n", | ||
"\n", | ||
" message: str\n", | ||
" category: Category\n", | ||
" exception: Exception\n", | ||
"\n", | ||
" @staticmethod\n", | ||
" def from_exception(\n", | ||
" exception: Exception,\n", | ||
" category: Category \n", | ||
" ) -> \"InternalFailure\":\n", | ||
" return InternalFailure(\n", | ||
" message=str(exception),\n", | ||
" category=category,\n", | ||
" exception=exception,\n", | ||
" ) \n", | ||
"\n", | ||
" def to_json(self) -> JSON:\n", | ||
" return JSON({\n", | ||
" \"message\": self.message,\n", | ||
" \"category\": self.category.value,\n", | ||
" \"exception\": str(self.exception),\n", | ||
" })\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
" \n", | ||
"\n", | ||
"def read_file(path: str) -> Result[str, InternalFailure]:\n", | ||
" try:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return Ok(f.read())\n", | ||
" except Exception as e:\n", | ||
" return Err(InternalFailure.from_exception(e, InternalFailure.Category.FileSystemError))\n", | ||
" \n", | ||
"\n", | ||
"def parse_json(json_str: str) -> Result[JSON, InternalFailure]:\n", | ||
" try: \n", | ||
" return Ok(json.loads(json_str))\n", | ||
" except json.JSONDecodeError as e:\n", | ||
" return Err(InternalFailure.from_exception(e, InternalFailure.Category.ParseJSONError))\n", | ||
"\n", | ||
"\n", | ||
"def get_prompt_text_list(aiconfig: JSON) -> Result[list[str], InternalFailure]:\n", | ||
" try:\n", | ||
" return Ok([prompt[\"text\"] for prompt in aiconfig[\"prompts\"]])\n", | ||
" except KeyError as e:\n", | ||
" return Err(InternalFailure.from_exception(e, InternalFailure.Category.GetPromptError))\n", | ||
"\n", | ||
"\n", | ||
"def prompt_list_to_http_response(prompts: list[str]) -> HTTPResponse:\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"\n", | ||
"def internal_failure_to_http_response(failure: InternalFailure) -> HTTPResponse:\n", | ||
" \"\"\"\n", | ||
" Statically guarantee exhaustive matching on InternalFailure.Type.\n", | ||
" Try commenting out one of the cases to see what happens!\n", | ||
" Try adding a new case that isn't one of the enum values to see what happens!\n", | ||
" \"\"\"\n", | ||
" def _get_code(category: InternalFailure.Category) -> int:\n", | ||
" match category:\n", | ||
" case InternalFailure.Category.FileSystemError:\n", | ||
" return 502\n", | ||
" case InternalFailure.Category.ParseJSONError:\n", | ||
" return 555\n", | ||
" case InternalFailure.Category.GetPromptError:\n", | ||
" return 567\n", | ||
" \n", | ||
" code = _get_code(failure.category)\n", | ||
" return HTTPResponse(code, JSON({\"response_type\": \"error\", \"data\": failure.to_json()}))\n", | ||
"\n", | ||
"\n", | ||
"def endpoint(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
"\n", | ||
" response = (\n", | ||
" read_file(path)\n", | ||
" .and_then(parse_json)\n", | ||
" .and_then(get_prompt_text_list)\n", | ||
" .map(prompt_list_to_http_response)\n", | ||
" .unwrap_or_else(internal_failure_to_http_response)\n", | ||
" )\n", | ||
" return response\n", | ||
"\n", | ||
"\n", | ||
"endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Neat. To recap, this code now has the following very nice properties:\n", | ||
"* Unlike exceptions, errors are statically checked, eliminating a whole class of bugs.\n", | ||
"* Modular and highly reusable in a type-safe way.\n", | ||
"* Error information is retained, unlike Optional\n", | ||
"* Concise syntax for chaining operations that can fail." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Part 3: Advanced topics\n", | ||
"* Higher-order functions\n", | ||
"* do-notation for complex Result composition" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Higher-order functions\n", | ||
"As our application grows, we will inevitably add new external dependencies which can raise exceptions. Let's ease the conversion of those exceptions\n", | ||
"into statically-checkable Err values by abstracting out try-except. \n", | ||
"\n", | ||
"We can do this by leveraging the `core_utils.exception_handled` parametrized decorator. Note that a decorator is just a higher-order function,\n", | ||
"specifically a (Function -> Function). It is a function that takes a function and returns a function. \n", | ||
"Python has a neat \"@\" syntax that is often used to apply a decorator.\n", | ||
"\n", | ||
"Here we use a slight generalization of a decorator that I call a parametrized decorator. \n", | ||
"\n", | ||
"A decorator may only take one input, which is a function. In contrast, a parametrized decorator also takes more inputs which control the decoration at the point where the decorator is used.\n", | ||
"\n", | ||
"In this case, it allows us to safely and concisely associate each \"dangerous\" function with specific metadata for internal error tracking. \n", | ||
"\n", | ||
"See the new code below:" | ||
] | ||
}, | ||
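{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before diving into the real code, here is a tiny standalone sketch (hypothetical toy functions, not part of the server code) contrasting a plain decorator with a parametrized one:\n",
"\n",
"```python\n",
"from typing import Callable\n",
"\n",
"def shout(func: Callable[[], str]) -> Callable[[], str]:\n",
"    # Plain decorator: takes a function, returns a function.\n",
"    def wrapper() -> str:\n",
"        return func().upper()\n",
"    return wrapper\n",
"\n",
"def suffix(text: str):\n",
"    # Parametrized decorator: the extra argument `text` controls the decoration.\n",
"    def decorator(func: Callable[[], str]) -> Callable[[], str]:\n",
"        def wrapper() -> str:\n",
"            return func() + text\n",
"        return wrapper\n",
"    return decorator\n",
"\n",
"@shout\n",
"def greet() -> str:\n",
"    return \"hi\"\n",
"\n",
"@suffix(\"!\")\n",
"def greet2() -> str:\n",
"    return \"hi\"\n",
"\n",
"print(greet(), greet2())  # HI hi!\n",
"```\n"
]
},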
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"HTTPResponse(status_code=567, body={'response_type': 'error', 'data': {'message': \"'text'\", 'category': 'GetPromptError', 'exception': \"'text'\"}})" | ||
] | ||
}, | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"from enum import Enum\n", | ||
"from functools import partial\n", | ||
"import json\n", | ||
"from typing import Any, NewType, ParamSpec, TypeVar\n", | ||
"from result import Err\n", | ||
"\n", | ||
"import lastmile_utils.lib.core.api as core_utils\n", | ||
"P = ParamSpec(\"P\")\n", | ||
"T = TypeVar(\"T\")\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class InternalFailure:\n", | ||
" \"\"\"\n", | ||
" This type defines the server-client contract for server-side failures.\n", | ||
" It contains metadata about the failure, including a message, category, and the exception.\n", | ||
" Use the `to_json()` method to construct a corresponding HTTP response.\n", | ||
" \"\"\" \n", | ||
" class Category(Enum):\n", | ||
" FileSystemError = \"FileSystemError\"\n", | ||
" ParseJSONError = \"ParseJSONError\"\n", | ||
" GetPromptError = \"GetPromptError\"\n", | ||
"\n", | ||
" message: str\n", | ||
" category: Category\n", | ||
" exception: Exception\n", | ||
"\n", | ||
" @staticmethod\n", | ||
" def from_exception(\n", | ||
" exception: Exception,\n", | ||
" category: Category \n", | ||
" ) -> \"InternalFailure\":\n", | ||
" return InternalFailure(\n", | ||
" message=str(exception),\n", | ||
" category=category,\n", | ||
" exception=exception,\n", | ||
" ) \n", | ||
"\n", | ||
" def to_json(self) -> JSON:\n", | ||
" return JSON({\n", | ||
" \"message\": self.message,\n", | ||
" \"category\": self.category.value,\n", | ||
" \"exception\": str(self.exception),\n", | ||
" })\n", | ||
" \n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
"\n", | ||
"\n", | ||
"def _handle_exception(exception: Exception, category: InternalFailure.Category):\n", | ||
" \"\"\"\n", | ||
" This is a straight-forward helper function. \n", | ||
" Convert the exception and user-specified category into a Result[T, InternalFailure]\n", | ||
" \"\"\"\n", | ||
" return Err(InternalFailure.from_exception(exception, category))\n", | ||
"\n", | ||
"\n", | ||
"def convert_exceptions(category: InternalFailure.Category):\n", | ||
" \"\"\"\n", | ||
" This is a higher-order function that returns a parametrized decorator.\n", | ||
" The code is extremely short, but powerful.\n", | ||
"\n", | ||
" Here's the idea: we want to output a parametrized decorator that can be used\n", | ||
" to convert exceptions to specific InternalFailure instances (with specific categories).\n", | ||
"\n", | ||
" We can leverage core_utils.exception_handled to do the core work, namely actually running try...except for us.\n", | ||
"\n", | ||
" `core_utils.exception_handled` is a parametrized decorator: it allows you to give an extra argument,\n", | ||
" the exception handler (Exception -> Result).\n", | ||
"\n", | ||
" In our case, we don't want to give an exception handler directly at every decorated function definition.\n", | ||
" This function wraps `core_utils.exception_handled` and returns a new parametrized decorator, which accepts just \n", | ||
" the argument we care about: the value of InternalFailure.Category that we want to associate with a decorated function.\n", | ||
"\n", | ||
" \"\"\"\n", | ||
"\n", | ||
" # This is where we create the handler that conforms to the `exception_handled` signature.\n", | ||
" # Specifically, exception_handled expects a function that takes one argument, an Exception.\n", | ||
" # Our function _handle_exception takes an Exception, but also a second argument, category.\n", | ||
" # Partial is exactly the tool to convert a function that takes some arguments into another function\n", | ||
" # that takes fewer arguments. \n", | ||
" # In this case, we are binding the category passed into this function (convert_exceptions(category)) \n", | ||
" # to the category argument of _handle_exception.\n", | ||
" # That creates a new function that just takes one argument, Exception, \n", | ||
" # which is exactly what we need to pass into exception_handled!\n", | ||
" handler = partial(_handle_exception, category=category)\n", | ||
"\n", | ||
" # Here's another way to do this, but it doesn't play as well with type checking.\n", | ||
" # `handler = lambda e: _handle_exception(e, category)``\n", | ||
"\n", | ||
" # Now that we have created a handler that conforms to the `exception_handled` signature, we can pass it into\n", | ||
" # `exception_handled` and get a new decorator.\n", | ||
" # Specifically, we return a new parametrized decorator that takes a category as its extra input.\n", | ||
" return core_utils.exception_handled(handler)\n", | ||
"\n", | ||
"\n", | ||
"\"\"\"\n", | ||
"The decorated functions below use our decorator above to handle exceptions \n", | ||
"and convert them into InternalFailure instances with specific categories.\n", | ||
"\n", | ||
"The decorator changes the function signature, which you can see as follows:\n", | ||
"\n", | ||
"```\n", | ||
"# Type `read_file on a new line and over over it. This is the output of the decorated `read_file`.\n", | ||
"You'll see the inferred type is different from the signature of the nondecorated function.\n", | ||
"read_file\n", | ||
"\n", | ||
"# on-hover. Note that this `Ok | Err` union is nothing but a Result[str, InternalFailure].\n", | ||
"(function) def read_file(path: str) -> (Ok[str] | Err[InternalFailure])\n", | ||
"```\n", | ||
"\"\"\"\n", | ||
"@convert_exceptions(InternalFailure.Category.FileSystemError)\n", | ||
"def read_file(path: str) -> str:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return f.read()\n", | ||
" \n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.ParseJSONError)\n", | ||
"def parse_json(json_str: str) -> JSON:\n", | ||
" return JSON(json.loads(json_str))\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.GetPromptError)\n", | ||
"def get_prompt_text_list(aiconfig: JSON) -> list[str]:\n", | ||
" return [prompt[\"text\"] for prompt in aiconfig[\"prompts\"]]\n", | ||
"\n", | ||
"\n", | ||
"def prompt_list_to_http_response(prompts: list[str]) -> HTTPResponse:\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"\n", | ||
"def internal_failure_to_http_response(failure: InternalFailure) -> HTTPResponse:\n", | ||
" \"\"\"\n", | ||
" Statically guarantee exhaustive matching on InternalFailure.Type.\n", | ||
" Try commenting out one of the cases to see what happens!\n", | ||
" Try adding a new case that isn't one of the enum values to see what happens!\n", | ||
" \"\"\"\n", | ||
" def _get_code(category: InternalFailure.Category) -> int:\n", | ||
" match category:\n", | ||
" case InternalFailure.Category.FileSystemError:\n", | ||
" return 502\n", | ||
" case InternalFailure.Category.ParseJSONError:\n", | ||
" return 555\n", | ||
" case InternalFailure.Category.GetPromptError:\n", | ||
" return 567\n", | ||
" \n", | ||
" code = _get_code(failure.category)\n", | ||
" return HTTPResponse(code, JSON({\"response_type\": \"error\", \"data\": failure.to_json()}))\n", | ||
"\n", | ||
"\n", | ||
"def endpoint(path: str):\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
"\n", | ||
" response = (\n", | ||
" read_file(path)\n", | ||
" .and_then(parse_json)\n", | ||
" .and_then(get_prompt_text_list)\n", | ||
" .map(prompt_list_to_http_response)\n", | ||
" .unwrap_or_else(internal_failure_to_http_response)\n", | ||
" )\n", | ||
" return response\n", | ||
"\n", | ||
"endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Notice how powerful this program is per SLOC. It is provably free of unhandled exceptions except \n", | ||
"inside the tiny wrapper functions that call across the boundaries of our code into 3rd party libraries.\n", | ||
"\n", | ||
"All of our logic is separated into small and reusable components.\n", | ||
"\n", | ||
"As we work, Pyright gives immediate feedback about large classes of potential errors.\n", | ||
"We can work with Pyright and Copilot to quickly write obviously- and provably-correct code just based on making the static types work.\n", | ||
"\n", | ||
"Notice a few things we did not need to use or worry about at all:\n", | ||
"* Inheritance or class-based polymorphism (we implicitly used parametric polymorphism when calling `exception_handled`, but I bet you didn't notice or care.)\n", | ||
"* State mutation or \"encapsulation\"\n", | ||
"* Special cases\n", | ||
"* Unconstrained types like `Any`, untyped `*args/**kwargs`, or incomplete generic types like `list`.\n", | ||
"* `None`, `Optional`. We eliminated the Python equivalent of `NullpointerException`! \n", | ||
"* `try...except`, `raise`, or re-`raise`\n", | ||
"* Any of a large body of design patterns like Singleton, Factory, or Proxy\n", | ||
"\n", | ||
"\n", | ||
"\n", | ||
"It's just data and functions, that's it. And as we noted earlier, functions are data, so really it's just data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Do-notation" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Anyway, cool. Now let's add a requirement to our server to make it a little more realistic.\n", | ||
"Let's suppose that we want to allow the endpoint caller to read from a local CSV, extract a text filter, \n", | ||
"and apply that filter inside `get_prompt_text_list()`.\n", | ||
"\n", | ||
"We make a few small, straight-forward modifications to the code above, but run into a problem\n", | ||
"inside `endpoint()`.\n", | ||
"\n", | ||
"We still want to do `.and_then(get_prompt_text_list)` but that no longer type checks, because `and_then()` only accepts a function with one argument. Since `get_prompt_text_list` now takes another argument, this doesn't work.\n", | ||
"\n", | ||
"(Actually, we run into a similar problem inside `get_prompt_filter()` but in that case we have an easy fix using `partial`. See if you can figure out why that won't work for `get_prompt_text_list()`.)\n", | ||
"\n", | ||
"\n", | ||
"Skip down to the new `endpoint()` to see what's going on:\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 9, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"from enum import Enum\n", | ||
"from functools import partial\n", | ||
"import csv\n", | ||
"from io import StringIO\n", | ||
"\n", | ||
"import json\n", | ||
"from typing import Any, NewType, ParamSpec, TypeVar\n", | ||
"from result import Err\n", | ||
"\n", | ||
"import lastmile_utils.lib.core.api as core_utils\n", | ||
"P = ParamSpec(\"P\")\n", | ||
"T = TypeVar(\"T\")\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class InternalFailure:\n", | ||
" \"\"\"\n", | ||
" This type defines the server-client contract for server-side failures.\n", | ||
" It contains metadata about the failure, including a message, category, and the exception.\n", | ||
" Use the `to_json()` method to construct a corresponding HTTP response.\n", | ||
" \"\"\" \n", | ||
" class Category(Enum):\n", | ||
" FileSystemError = \"FileSystemError\"\n", | ||
" ParseJSONError = \"ParseJSONError\"\n", | ||
" GetPromptError = \"GetPromptError\"\n", | ||
" ParseCSVError = \"ParseCSVError\"\n", | ||
"\n", | ||
" message: str\n", | ||
" category: Category\n", | ||
" exception: Exception\n", | ||
"\n", | ||
" @staticmethod\n", | ||
" def from_exception(\n", | ||
" exception: Exception,\n", | ||
" category: Category \n", | ||
" ) -> \"InternalFailure\":\n", | ||
" return InternalFailure(\n", | ||
" message=str(exception),\n", | ||
" category=category,\n", | ||
" exception=exception,\n", | ||
" ) \n", | ||
"\n", | ||
" def to_json(self) -> JSON:\n", | ||
" return JSON({\n", | ||
" \"message\": self.message,\n", | ||
" \"category\": self.category.value,\n", | ||
" \"exception\": str(self.exception),\n", | ||
" })\n", | ||
" \n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
"\n", | ||
"\n", | ||
"def _handle_exception(exception: Exception, category: InternalFailure.Category):\n", | ||
" return Err(InternalFailure.from_exception(exception, category))\n", | ||
"\n", | ||
"\n", | ||
"def convert_exceptions(category: InternalFailure.Category): \n", | ||
" handler = partial(_handle_exception, category=category)\n", | ||
" return core_utils.exception_handled(handler)\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.FileSystemError)\n", | ||
"def read_file(path: str) -> str:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return f.read()\n", | ||
" \n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.ParseJSONError)\n", | ||
"def parse_json(json_str: str) -> JSON:\n", | ||
" return JSON(json.loads(json_str))\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.GetPromptError)\n", | ||
"def lookup_filter(filter_key: str, filter_mapping: dict[str, str]) -> str:\n", | ||
" return filter_mapping[filter_key]\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.GetPromptError)\n", | ||
"def get_prompt_text_list(aiconfig: JSON, prompt_filter: str) -> list[str]:\n", | ||
" return [prompt[\"text\"] for prompt in aiconfig[\"prompts\"] if prompt_filter in prompt[\"text\"]]\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.ParseCSVError)\n", | ||
"def parse_csv_to_mapping(csv_contents: str) -> dict[str, str]:\n", | ||
" \"\"\"Input: like \"key1,value1\\nkey2,value2\\nkey3,value3\\n\"\"\"\n", | ||
" f = StringIO(csv_contents)\n", | ||
" reader = csv.reader(f, delimiter=',')\n", | ||
" return {row[0]: row[1] for row in reader}\n", | ||
"\n", | ||
"\n", | ||
"def prompt_list_to_http_response(prompts: list[str]) -> HTTPResponse:\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"\n", | ||
"def internal_failure_to_http_response(failure: InternalFailure) -> HTTPResponse:\n", | ||
" \"\"\"\n", | ||
" Statically guarantee exhaustive matching on InternalFailure.Type.\n", | ||
" Try commenting out one of the cases to see what happens!\n", | ||
" Try adding a new case that isn't one of the enum values to see what happens!\n", | ||
" \"\"\"\n", | ||
" def _get_code(category: InternalFailure.Category) -> int:\n", | ||
" match category:\n", | ||
" case InternalFailure.Category.FileSystemError:\n", | ||
" return 502\n", | ||
" case InternalFailure.Category.ParseJSONError:\n", | ||
" return 555\n", | ||
" case InternalFailure.Category.GetPromptError:\n", | ||
" return 567\n", | ||
" case InternalFailure.Category.ParseCSVError:\n", | ||
" return 593\n", | ||
" \n", | ||
" code = _get_code(failure.category)\n", | ||
" return HTTPResponse(code, JSON({\"response_type\": \"error\", \"data\": failure.to_json()}))\n", | ||
"\n", | ||
"\n", | ||
"def get_prompt_filter(csv_path: str, filter_key: str) -> Result[str, InternalFailure]:\n", | ||
" return (\n", | ||
" read_file(csv_path)\n", | ||
" .and_then(parse_csv_to_mapping)\n", | ||
" .and_then(partial(lookup_filter, filter_key=filter_key))\n", | ||
" )\n", | ||
"\n", | ||
"def endpoint(path: str, filter_key: str):\n", | ||
" # TODO: finish implementing this function.\n", | ||
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
"\n", | ||
" CSV_PATH = \"my_filters.csv\"\n", | ||
"\n", | ||
" prompt_filter = get_prompt_filter(CSV_PATH, filter_key)\n", | ||
" response = (\n", | ||
" read_file(path)\n", | ||
" .and_then(parse_json)\n", | ||
" # .and_then... what?\n", | ||
"\n", | ||
" )\n", | ||
" return response\n", | ||
"\n", | ||
"# TODO: call this.\n", | ||
"# endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\", \"some_key\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Detour: List comprehensions" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Python has an interesting and fundamental syntactical construct called a comprehension. Let's look at how this works with the list type.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 16, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"[4, 16]\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"my_list = [1, 2, 3, 4]\n", | ||
"my_new_list = [x ** 2 for x in my_list if x % 2 == 0]\n", | ||
"print(my_new_list)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Ok, that's interesting. In this concise expression, are we doing a complex operation that can be described as follows:\n", | ||
"* Construct a new list with values of the form `x ** 2` where `x` will take on different values.\n", | ||
"* Bind `x` to the values in `my_list`, in sequence (preserving order), and use each of those values to construct the new list.\n", | ||
"* Only include even values of `x`, values where `x % 2 == 0`.\n", | ||
"* The output list is therefore a new list `[4, 16] == [2 ** 2, 4 ** 2]`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This can be expressed in more traditional functional terms using higher-order functions. \n", | ||
"\n", | ||
"This code is equivalent:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 18, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"[4, 16]\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"my_list = [1, 2, 3, 4]\n", | ||
"my_new_list = list(\n", | ||
" filter(\n", | ||
" lambda x: x % 2 == 0,\n", | ||
" map(lambda x: x ** 2, my_list)\n", | ||
" )\n", | ||
")\n", | ||
"\n", | ||
"print(my_new_list)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Interesting. Let's put that aside and consider a new function. This also uses a list comprehension, but in this cases we need two `for` expressions to flatten the input list." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 11, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"[1, 2, 3, 4, 5, 6, 7]\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"def flatten(my_list: list[list[int]]) -> list[int]:\n", | ||
" return [\n", | ||
" item \n", | ||
" for sublist in my_list \n", | ||
" for item in sublist\n", | ||
" ]\n", | ||
"\n", | ||
"input_list = [\n", | ||
" [1, 2, 3],\n", | ||
" [4, 5, 6], \n", | ||
" [7]\n", | ||
"]\n", | ||
"\n", | ||
"print(flatten(input_list))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This syntax quickly becomes pretty perplexing, but it is fundamental to Python - it's been around since at least Python 2 (https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions).\n", | ||
"\n", | ||
"Here is the correct way to read the implementation of `flatten()`:\n", | ||
"- Start at the first `for`. Unpack all the elements of `my_list` and assign each one in turn to `sublist`.\n", | ||
"- Now proceed to the second `for`. Unpack all the elements of `sublist` and assign each one in turn to `item`.\n", | ||
"- Now we are done evaluating the `for`s, so we go back to the first line of the comprehension.\n", | ||
"- Construct your new list using each value of `item` in the order in which they were unpacked." | ||
] | ||
}, | ||
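{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why that reading is right, here is a sketch of the same `flatten()` written with explicit nested loops; the double comprehension is syntactic sugar for exactly this:\n",
"\n",
"```python\n",
"def flatten_loops(my_list: list[list[int]]) -> list[int]:\n",
"    out: list[int] = []\n",
"    for sublist in my_list:   # the first `for` in the comprehension\n",
"        for item in sublist:  # the second `for`\n",
"            out.append(item)  # the expression at the top\n",
"    return out\n",
"\n",
"print(flatten_loops([[1, 2, 3], [4, 5, 6], [7]]))  # [1, 2, 3, 4, 5, 6, 7]\n",
"```\n"
]
},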
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Back to the problem at hand" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Keep that double list comprehension thing in mind.\n", | ||
"\n", | ||
"\n", | ||
"Now remember our problem with `and_then()`. Luckily, Result has a very handy solution to this, and it reads sort of like a list comprehension. \n", | ||
"You can think of it kind of like a Result comprehension. \n", | ||
"\n", | ||
"To understand this example well, make sure to hover over the identifiers and look at their inferred types, especially inside `endpoint()`." | ||
] | ||
}, | ||
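{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the idea (assuming the `result` library's `do()` helper, which consumes a generator of Results), compare this to the double list comprehension above:\n",
"\n",
"```python\n",
"import result\n",
"from result import Ok\n",
"\n",
"# Bind x to the Ok value of Ok(1), then y to the Ok value of Ok(2),\n",
"# then wrap x + y back up in Ok. An Err anywhere short-circuits the whole expression.\n",
"total = result.do(\n",
"    Ok(x + y)\n",
"    for x in Ok(1)\n",
"    for y in Ok(2)\n",
")\n",
"print(total)\n",
"```\n"
]
},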
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"HTTPResponse(status_code=502, body={'response_type': 'error', 'data': {'message': \"[Errno 2] No such file or directory: 'my_filters.csv'\", 'category': 'FileSystemError', 'exception': \"[Errno 2] No such file or directory: 'my_filters.csv'\"}})" | ||
] | ||
}, | ||
"execution_count": 8, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"from dataclasses import dataclass\n", | ||
"from enum import Enum\n", | ||
"from functools import partial\n", | ||
"import csv\n", | ||
"from io import StringIO\n", | ||
"\n", | ||
"import json\n", | ||
"from typing import Any, NewType, ParamSpec, TypeVar\n", | ||
"from result import Err\n", | ||
"\n", | ||
"import lastmile_utils.lib.core.api as core_utils\n", | ||
"import result\n", | ||
"P = ParamSpec(\"P\")\n", | ||
"T = TypeVar(\"T\")\n", | ||
"\n", | ||
"JSON = NewType(\"JSON\", dict[str, Any])\n", | ||
"\n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class InternalFailure:\n", | ||
" \"\"\"\n", | ||
" This type defines the server-client contract for server-side failures.\n", | ||
" It contains metadata about the failure, including a message, category, and the exception.\n", | ||
" Use the `to_json()` method to construct a corresponding HTTP response.\n", | ||
" \"\"\" \n", | ||
" class Category(Enum):\n", | ||
" FileSystemError = \"FileSystemError\"\n", | ||
" ParseJSONError = \"ParseJSONError\"\n", | ||
" GetPromptError = \"GetPromptError\"\n", | ||
" ParseCSVError = \"ParseCSVError\"\n", | ||
"\n", | ||
" message: str\n", | ||
" category: Category\n", | ||
" exception: Exception\n", | ||
"\n", | ||
" @staticmethod\n", | ||
" def from_exception(\n", | ||
" exception: Exception,\n", | ||
" category: Category \n", | ||
" ) -> \"InternalFailure\":\n", | ||
" return InternalFailure(\n", | ||
" message=str(exception),\n", | ||
" category=category,\n", | ||
" exception=exception,\n", | ||
" ) \n", | ||
"\n", | ||
" def to_json(self) -> JSON:\n", | ||
" return JSON({\n", | ||
" \"message\": self.message,\n", | ||
" \"category\": self.category.value,\n", | ||
" \"exception\": str(self.exception),\n", | ||
" })\n", | ||
" \n", | ||
"\n", | ||
"@dataclass(frozen=True)\n", | ||
"class HTTPResponse:\n", | ||
" status_code: int\n", | ||
" body: JSON\n", | ||
"\n", | ||
"\n", | ||
"def _handle_exception(exception: Exception, category: InternalFailure.Category):\n", | ||
" return Err(InternalFailure.from_exception(exception, category))\n", | ||
"\n", | ||
"\n", | ||
"def convert_exceptions(category: InternalFailure.Category): \n", | ||
" handler = partial(_handle_exception, category=category)\n", | ||
" return core_utils.exception_handled(handler)\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.FileSystemError)\n", | ||
"def read_file(path: str) -> str:\n", | ||
" with open(path, \"r\") as f:\n", | ||
" return f.read()\n", | ||
" \n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.ParseJSONError)\n", | ||
"def parse_json(json_str: str) -> JSON:\n", | ||
" return JSON(json.loads(json_str))\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.GetPromptError)\n", | ||
"def lookup_filter(filter_key: str, filter_mapping: dict[str, str]) -> str:\n", | ||
" return filter_mapping[filter_key]\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.GetPromptError)\n", | ||
"def get_prompt_text_list(aiconfig: JSON, prompt_filter: str) -> list[str]:\n", | ||
" return [prompt[\"text\"] for prompt in aiconfig[\"prompts\"] if prompt_filter in prompt[\"text\"]]\n", | ||
"\n", | ||
"\n", | ||
"@convert_exceptions(InternalFailure.Category.ParseCSVError)\n", | ||
"def parse_csv_to_mapping(csv_contents: str) -> dict[str, str]:\n", | ||
    "    \"\"\"Input: like 'key1,value1\\\\nkey2,value2\\\\nkey3,value3\\\\n'\"\"\"\n",
" f = StringIO(csv_contents)\n", | ||
" reader = csv.reader(f, delimiter=',')\n", | ||
" return {row[0]: row[1] for row in reader}\n", | ||
"\n", | ||
"\n", | ||
"def prompt_list_to_http_response(prompts: list[str]) -> HTTPResponse:\n", | ||
" return HTTPResponse(200, JSON({\"prompts\": prompts}))\n", | ||
"\n", | ||
"\n", | ||
"def internal_failure_to_http_response(failure: InternalFailure) -> HTTPResponse:\n", | ||
" \"\"\"\n", | ||
    "    Statically guarantee exhaustive matching on InternalFailure.Category.\n",
" Try commenting out one of the cases to see what happens!\n", | ||
" Try adding a new case that isn't one of the enum values to see what happens!\n", | ||
" \"\"\"\n", | ||
" def _get_code(category: InternalFailure.Category) -> int:\n", | ||
" match category:\n", | ||
" case InternalFailure.Category.FileSystemError:\n", | ||
" return 502\n", | ||
" case InternalFailure.Category.ParseJSONError:\n", | ||
" return 555\n", | ||
" case InternalFailure.Category.GetPromptError:\n", | ||
" return 567\n", | ||
" case InternalFailure.Category.ParseCSVError:\n", | ||
" return 593\n", | ||
" \n", | ||
" code = _get_code(failure.category)\n", | ||
" return HTTPResponse(code, JSON({\"response_type\": \"error\", \"data\": failure.to_json()}))\n", | ||
"\n", | ||
"\n", | ||
"def get_prompt_filter(csv_path: str, filter_key: str) -> Result[str, InternalFailure]:\n", | ||
" return (\n", | ||
" read_file(csv_path)\n", | ||
" .and_then(parse_csv_to_mapping)\n", | ||
" .and_then(partial(lookup_filter, filter_key=filter_key))\n", | ||
" )\n", | ||
"\n", | ||
    "def endpoint(path: str, filter_key: str) -> HTTPResponse:\n",
" \"\"\"Load an AIConfig from a local path and do something with it.\"\"\"\n", | ||
"\n", | ||
" CSV_PATH = \"my_filters.csv\"\n", | ||
" \n", | ||
" # Much like a list comprehension, the correct way to read this is:\n", | ||
" # - Start at the first `for`. Run get_prompt_filter(), check if it returns an Ok,\n", | ||
" # and if so, unpack the value and assign it to prompt_filter_ok.\n", | ||
    "    # - If it's an Err, short-circuit: the whole `do()` evaluates to that Err.\n",
" # - If it's Ok, continue to the next `for`.\n", | ||
" # - Now run read_file(), check if it returns an Ok, \n", | ||
" # and if so, unpack the value and assign it to aiconfig_path_contents_ok.\n", | ||
    "    # - If it's an Err, short-circuit: the whole `do()` evaluates to that Err.\n",
" # - If it's Ok, continue to the next `for`.\n", | ||
" # - Now run parse_json(), check if it returns an Ok,\n", | ||
" # and if so, unpack the value and assign it to aiconfig_ok.\n", | ||
    "    # - If it's an Err, short-circuit: the whole `do()` evaluates to that Err.\n",
" # - If it's an Ok, then everything we ran returned Ok values, and \n", | ||
" # we are done evaluating the `for` expressions.\n", | ||
" # Proceed to the first line of the `do()` block, which is what \n", | ||
" # the entire `do()` will evaluate to.\n", | ||
" # Take our aiconfig_ok and prompt_filter_ok values, \n", | ||
" # run get_prompt_text_list() on them, and get the return value. \n", | ||
" # Now the `do()` block is done and evaluates to that return value, \n", | ||
" # which is a Result[list[str], InternalFailure].\n", | ||
" prompt_list = result.do(\n", | ||
" get_prompt_text_list(aiconfig_ok, prompt_filter_ok)\n", | ||
" for prompt_filter_ok in get_prompt_filter(CSV_PATH, filter_key)\n", | ||
" for aiconfig_path_contents_ok in read_file(path)\n", | ||
" for aiconfig_ok in parse_json(aiconfig_path_contents_ok)\n", | ||
" )\n", | ||
"\n", | ||
" response = (\n", | ||
" prompt_list\n", | ||
" .map(prompt_list_to_http_response)\n", | ||
" .unwrap_or_else(internal_failure_to_http_response)\n", | ||
" )\n", | ||
"\n", | ||
" return response\n", | ||
"\n", | ||
"endpoint(\"cookbooks/Getting-Started/travel.aiconfig.json\", \"some_key\")" | ||
] | ||
}, | ||
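   {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before analyzing the example above, here is a minimal, self-contained sketch of the short-circuiting behavior described in its comments, using only `Ok`, `Err`, and `result.do`. (The `half` function is a made-up toy, not part of the endpoint example.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from result import Ok, Err, Result\n",
    "import result\n",
    "\n",
    "\n",
    "def half(x: int) -> Result[int, str]:\n",
    "    # Ok for even inputs, Err for odd ones.\n",
    "    return Ok(x // 2) if x % 2 == 0 else Err(f\"{x} is odd\")\n",
    "\n",
    "\n",
    "# Every `for` yields an Ok, so the do-block evaluates to Ok(4 + 2), i.e. Ok(6).\n",
    "print(result.do(a + b for a in half(8) for b in half(4)))\n",
    "\n",
    "# half(3) is an Err, so the whole do-block short-circuits to that Err.\n",
    "print(result.do(a + b for a in half(8) for b in half(3)))"
   ]
  },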
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's analyze the similarities and differences between `result.do()` and the double-for list comprehension used in `flatten()`. \n", | ||
"\n", | ||
"The two expressions have the same form, in the sense that they construct values of a specific container type by evaluating a sequence of `for` expressions. In both cases, you start at the first `for`, go in order until you're done with the last `for`, then come back up to the top and evaluate an expression.\n", | ||
"\n", | ||
"Interesting.\n", | ||
"\n", | ||
    "Given those similarities, why do the two behave so differently? In `result.do()`, we have `Ok` and `Err` values, `Err` values short-circuit, and there's no actual iteration over a sequence: every `Result` is exactly one `Ok` or exactly one `Err`. With list comprehensions, we iterate over an arbitrary number of elements.\n",
"\n", | ||
"The answer is that this notation abstracts over a class of different concrete \"chainable\" types. Result is one such type; list is another. They are different by construction, and _defined_ by their distinct chaining rules. \n", | ||
"\n", | ||
"In the case of `list`, \"chaining\" is actually _defined_ to involve flattening. Actually, list chaining involves a little more than just flatten: it maps a function over each element, and then flattens. It's called `flatMap`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 23, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"[1, 2, 2, 4, 3, 6, 4, 8]\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
    "from typing import Callable, TypeVar\n",
"\n", | ||
"T = TypeVar(\"T\")\n", | ||
"U = TypeVar(\"U\")\n", | ||
"\n", | ||
"# If Python had a flat_map function like the equivalents in \n", | ||
"# Javascript, Scala, or other languages, it would work like this.\n", | ||
"def flatMap(my_list: list[T], f: Callable[[T], list[U]]) -> list[U]:\n", | ||
" return [\n", | ||
" output_item\n", | ||
" for input_item in my_list\n", | ||
" for output_item in f(input_item)\n", | ||
" ]\n", | ||
"\n", | ||
"my_list = [1, 2, 3, 4]\n", | ||
"my_new_list = flatMap(my_list, lambda x: [x, x * 2])\n", | ||
"\n", | ||
"print(my_new_list)\n", | ||
"\n", | ||
"\n", | ||
    "# It could also be implemented in terms of the `flatten` defined earlier:\n",
"def flat_map_v2(my_list: list[int], f: Callable[[int], list[int]]) -> list[int]:\n", | ||
" return flatten(list(map(f, my_list)))\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now hopefully you can see the generality of `list` comprehensions, which is like the do-notation of the `list` type. Notice how easy it was to implement both `flatten` and `flatMap` (not to mention `filter`) using list comprehensions.\n", | ||
"\n", | ||
"### Summary\n", | ||
"\n", | ||
    "So, Result is just another container type that happens to be chainable. There is `dict` (another container type which happens not to be chainable in any obvious way), there is `list`, and there is `Result`.\n",
"\n", | ||
    "The essence of chaining list operations is iterating over the elements, applying a list-producing function to each, and flattening the result. \n",
"The essence of chaining Result operations is either short-circuiting an Err, or applying a Result-producing function to the `Ok` and then \"flattening\" the nested `Ok`.\n", | ||
"\n", | ||
"Exercises:\n", | ||
"\n", | ||
"* A natural question would be, \"what is the `flatMap` of `Result`?\" See if you can figure it out.\n", | ||
"* Look at how we used `.map()` in our Result chain. `Result.map` applies a function inside a Result if it's an Ok, otherwise it just leaves the Err alone. What is the equivalent operation for lists? " | ||
] | ||
}, | ||
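  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an aside, here is the `filter` point made concrete: a comprehension with an `if` clause does exactly what the `filter` builtin does."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nums = [1, 2, 3, 4, 5, 6]\n",
    "\n",
    "# The builtin higher-order function...\n",
    "evens_builtin = list(filter(lambda x: x % 2 == 0, nums))\n",
    "\n",
    "# ...and its comprehension equivalent.\n",
    "evens_comprehension = [x for x in nums if x % 2 == 0]\n",
    "\n",
    "assert evens_builtin == evens_comprehension\n",
    "print(evens_comprehension)  # [2, 4, 6]"
   ]
  },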
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Further reading\n", | ||
"* Javascript Array flatMap: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/flatMap\n", | ||
"* \"Async/await just is the do-notation of the Promise Monad\": https://gist.github.com/peter-leonov/c86720d1517235a1f28cd453a9d39bb4\n", | ||
"* Returns library: https://returns.readthedocs.io/en/latest/\n", | ||
" * Although more featureful and instructive to look at, I prefer the Result library for its known Pyright compatibility.\n", | ||
"* Haskell monad typeclass: https://wiki.haskell.org/Monad \n", | ||
"* Programming and Programming Languages (PAPL), Brown CS: https://papl.cs.brown.edu/2020/\n", | ||
    "* \"Programming Languages: Application and Interpretation\": https://www.plai.org/3/2/PLAI%20Version%203.2.2%20electronic.pdf\n",
"* F# for Fun and Profit: https://fsharpforfunandprofit.com/" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Part 4: Methods are just functions" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Did you notice something funny in the previous section? There's an unexplained syntactic difference between `Result`'s `map` and `list`'s `map`, even though they are doing the same kind of operation over their respective data structures.\n", | ||
"\n", | ||
"This really is just a (superficial) syntactic difference. If you look at `Result`'s source code, you'll see it uses classes and methods. Despite this, `Result` is closely analogous to Haskell's `Either` type even though Haskell has no classes or objects in the sense that Python does." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"What gives? How is this FP type implemented using OOP structures? Well, neither type of programming is a paradigm. They're just collections of PL features.\n", | ||
"\n", | ||
"Funnily enough, Python's method syntax clarifies this." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 26, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
    "# A snippet from Result's source code: the implementation of `Ok.and_then()`\n",
"def and_then(self, op: Callable[[T], Result[U, E]]) -> Result[U, E]:\n", | ||
" \"\"\"\n", | ||
" The contained result is `Ok`, so return the result of `op` with the\n", | ||
" original value passed in\n", | ||
" \"\"\"\n", | ||
" return op(self._value)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "A method is literally nothing but a function whose first argument is the object containing the method. In Python, this is explicit and obligatory: every method's first parameter, conventionally named `self`, is the object itself. \n",
" \n", | ||
    "Similarly, the syntax for calling a method is isomorphic to an ordinary function call: move the first argument in front of the function name and separate the two with a `.`."
] | ||
}, | ||
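  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick sketch of this isomorphism (the `Greeter` class is a made-up example): `obj.method(args)` is just sugar for `type(obj).method(obj, args)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Builtins work this way too: a bound-method call equals a call\n",
    "# through the class with the object as the first argument.\n",
    "s = \"hello\"\n",
    "assert s.upper() == str.upper(s)\n",
    "\n",
    "\n",
    "class Greeter:\n",
    "    def __init__(self, name: str):\n",
    "        self.name = name\n",
    "\n",
    "    def greet(self) -> str:\n",
    "        return f\"Hello, {self.name}!\"\n",
    "\n",
    "\n",
    "g = Greeter(\"world\")\n",
    "assert g.greet() == Greeter.greet(g)\n",
    "print(g.greet())  # Hello, world!"
   ]
  },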
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "A consequence of this fact is that an object is nothing but a struct bundled with a set of functions whose first argument is of that object's type. \n",
"\n", | ||
"Stripping away these superficial differences reveals the substantive differences between FP and OOP styles. Each style (or \"paradigm\" if you insist) can be analyzed into not much more than its distinct set of features. \n", | ||
"\n", | ||
"The most instructive way to understand the difference is not just \"objects vs. functions\". After all, objects have functions (they're called \"methods\") and FP languages have objects (they're called \"structs\", \"records\", etc.)\n", | ||
"\n", | ||
"The main substantive differences are:\n", | ||
"- The use of _nested function types_, namely higher-order functions in FP, vs. the lack of this in OOP.\n", | ||
    "- Referential transparency in FP. What you see is what you get: a function's output depends only on its inputs, never on hidden state. This is also called purity, and it rules out mutation and other side effects. OOP, by contrast, is built around encapsulated mutable state: objects hide their internals behind methods that read and mutate them. FP favors immutability and purity, which imply referential transparency.\n",
"- Code organization: OOP tends to organize around bundles of mutable state and functions that do that mutation, namely classes/objects. FP sometimes bundles related functions into modules, but does not organize around mutating a piece of state using those functions. \n", | ||
    "- Different kinds of polymorphism: OOP-compatible languages like Python or Java bundle together code reuse (inheritance) and subtype polymorphism (dynamic dispatch) through subclassing. FP often achieves the same objectives (at a high level) through higher-order functions, but there aren't obvious 1:1 feature mappings.\n",
    "- One thing OOP and FP do share is the other major kind of polymorphism, parametric polymorphism, which is essentially generic types. This is distinct from both subtype polymorphism (dynamic dispatch via subclassing, the usual OOP mechanism) and ad-hoc polymorphism, which both FP and OOP styles use in the form of overloading."
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "aiconfig", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.13" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |