meta-llama · subramen · Nov 13, 2024 · Nov 13, 2024
diff --git a/recipes/use_cases/github_triage/walkthrough.ipynb b/recipes/use_cases/github_triage/walkthrough.ipynb
@@ -1,15 +1,5 @@
 {
   "cells": [
-    {
-      "cell_type": "code",
-      "execution_count": 1,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "%load_ext autoreload\n",
-        "%autoreload 2"
-      ]
-    },
     {
       "cell_type": "markdown",
       "metadata": {},
@@ -29,13 +19,13 @@
       ]
     },
     {
-      "cell_type": "code",
-      "execution_count": null,
+      "cell_type": "markdown",
       "metadata": {},
-      "outputs": [],
       "source": [
         "!git clone https://github.com/meta-llama/llama-recipes\n",
+        "\n",
         "%cd recipes/use_cases/github_triage\n",
+        "\n",
         "!pip install -r requirements.txt"
       ]
     },
@@ -355,7 +345,6 @@
         }
       ],
       "source": [
-        "\n",
         "print(f\"Additional metadata generated by LLM: {set(annotated_issues.columns).difference(set(issues_df.columns))}\\n\\n\")\n",
         "print(f\"[Showing 5 out of {annotated_issues.shape[0]} rows]\\n\")\n",
         "print(annotated_issues.head())\n"
@@ -413,8 +402,6 @@
         }
       ],
       "source": [
-        "\n",
-        "\n",
         "# Generate high-level analysis\n",
         "challenges, overview = generate_executive_reports(annotated_issues, theme_counts, repo_name, start_date, end_date)\n",
         "# Validate and save generated data\n",
@@ -586,8 +573,6 @@
         }
       ],
       "source": [
-        "\n",
-        "\n",
         "# Create graphs and charts\n",
         "plot_folder = out_folder + \"/plots\"\n",
         "os.makedirs(plot_folder, exist_ok=True)\n",

diff --git a/recipes/use_cases/github_triage/walkthrough.nbconvert.ipynb b/recipes/use_cases/github_triage/walkthrough.nbconvert.ipynb
@@ -0,0 +1,337 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "7d133814",
+   "metadata": {},
+   "source": [
+    "# Automatic Issues Triaging with Llama\n",
+    "\n",
+    "We utilize an off-the-shelf Llama model to analyze, generate insights, and create a report for better understanding of the state of a repository. \n",
+    "\n",
+    "This notebook walks you through the tool's working. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22b5a0a2",
+   "metadata": {},
+   "source": [
+    "## Setup"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c34ebfe0",
+   "metadata": {},
+   "source": [
+    "!git clone https://github.com/meta-llama/llama-recipes\n",
+    "\n",
+    "%cd recipes/use_cases/github_triage\n",
+    "\n",
+    "!pip install -r requirements.txt"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fd091a0b",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cc67ca5d",
+   "metadata": {},
+   "source": [
+    "### Set access keys and tokens\n",
+    "\n",
+    "Set your GitHub token for API calls. Some privileged information may not be available if you don't have push-access to the target repository.\n",
+    "\n",
+    "Set your groq token for inference. Get one at https://console.groq.com/keys"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "82898b66",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "08dcb408",
+   "metadata": {},
+   "source": [
+    "### Set target repo and period to analyze"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "69103c52",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "978c67ab",
+   "metadata": {},
+   "source": [
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "46238d8e",
+   "metadata": {},
+   "source": [
+    "## Fetch issues from the repository\n",
+    "\n",
+    "Use the github API to retrieve issues (including the full discussion on them) and store it in a dataframe."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "683079b5",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "546fea86",
+   "metadata": {},
+   "source": [
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "516f9b45",
+   "metadata": {},
+   "source": [
+    "## Use Llama to generate the annotations for this data\n",
+    "\n",
+    "We use 2 prompts defined in `config.yaml` to annotate the issues with additional information that can help triagers and repo maintainers:\n",
+    "1. `parse_issues`: generate annotations and other metadata basd on the contents in the issue thread.\n",
+    "   \n",
+    "2. `assign_category` tags each issue with the most relevant category (from a list of categories specified in the prompt's output schema)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5b146534",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1e3ab18e",
+   "metadata": {},
+   "source": [
+    "We run inference on these prompts along with the issues data in `issues_df`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e737f0fc",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "af4c9c26",
+   "metadata": {},
+   "source": [
+    "* The annotations include new metadata like `summary`, `possible_causes`, `remediations` that can help triagers quickly understand and diagnose the issue. \n",
+    "\n",
+    "* Annotations like `issue_type`, `component`, `themes` can help identify the right POC / maintainer to address the issue.\n",
+    "\n",
+    "* Annotations like `severity`, `op_expertise` and `sentiment` can help gauge the general pulse of developers."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "71242552",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1affecb1",
+   "metadata": {},
+   "source": [
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "085f4ada",
+   "metadata": {},
+   "source": [
+    "## Use Llama to generate high-level insights\n",
+    "\n",
+    "The above data is good for OSS maintainers and developers to quickly address any issues. The next section will synthesize this data into high-level insights about this repository."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c528b4ed",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8f48b925",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3ef74361",
+   "metadata": {},
+   "source": [
+    "### Key Challenges data\n",
+    "\n",
+    "We identify key areas that users are challenged by along with the relevant issues."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62fd5aaa",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "32294e11",
+   "metadata": {},
+   "source": [
+    "### Overview Data\n",
+    "\n",
+    "As the name suggests, the `overview` dataframe contains columns that provide information about the overall activity in the repository during this period, including:\n",
+    "* an executive summary of all the issues seen during this period\n",
+    "* how many issues were created, discussed and closed\n",
+    "* what are some open questions that the maintainers should address\n",
+    "* how many issues were seen for each theme etc."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ce9c0a25",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2cb92256",
+   "metadata": {},
+   "source": [
+    "### Visualizing the data\n",
+    "\n",
+    "Based on this data we can easily create some plots to graphically understand the activity in the repo.\n",
+    "\n",
+    "Some additional data can be accessed via the github API, but this requires you to have push-access to this repo.\n",
+    "\n",
+    "The generated plots are saved as images in `plot_folder`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c6a178b7",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f711ba59",
+   "metadata": {},
+   "source": [
+    "## Putting it together\n",
+    "\n",
+    "Now that we have all the data and insights, we can create a PDF report"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a9214952",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "00bf5bc7",
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.19"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}