Browser use with Llama 3.2 Vision Quickstart #799

miguelg719 · 2024-11-21T21:03:44Z

Browser Use Llama-Recipe

This example notebook shows how to build a Llama 3.2 Vision-powered agent that interacts with web browsers on your behalf. It explains each section in detail and includes example use cases.

Features

Visual understanding of web pages through screenshots
Autonomous navigation and interaction
Natural language instructions for web tasks
Persistent browser session management

For example, you can ask the agent to:

Search for a product on Amazon
Find the cheapest flight to Tokyo
Buy tickets for the next Warriors game

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Thanks for contributing 🎉!

HamidShojanazeri

Awesome! Thanks @miguelg719 for the PR! It would be great if you could add a short video demoing it as well.

HamidShojanazeri · 2024-11-24T20:53:59Z

recipes/use_cases/browser_use/agent/browser-use-quickstart.ipynb

+   "outputs": [],
+   "source": [
+    "few_shot_example_1 = \"\"\"\n",
+    "User Input: \"How much did Nvidia stock gain today?\"\n",


Can we please change the query to a more neutral example?

HamidShojanazeri · 2024-11-24T21:08:38Z

recipes/use_cases/browser_use/agent/browser-use-quickstart.ipynb

+   "source": [
+    "import base64\n",
+    "from IPython.display import Markdown\n",
+    "imagePath= \"screenshot.png\"\n",


Do we need to add this screenshot to this folder as well? Right now it's missing.
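
For illustration, here is a minimal sketch of how the cell could fail with a clear message when the screenshot is absent instead of erroring later in the run. `encode_image` is a hypothetical helper, not code from the notebook in this PR; only `base64` and the `screenshot.png` file name come from the diff above.

```python
import base64
from pathlib import Path


def encode_image(image_path: str) -> str:
    """Return the file's bytes as a base64-encoded ASCII string.

    Hypothetical helper for illustration; the notebook may structure this differently.
    """
    path = Path(image_path)
    # Fail early with an actionable message if the screenshot was never added.
    if not path.is_file():
        raise FileNotFoundError(
            f"{image_path} not found; capture or add the screenshot first"
        )
    return base64.b64encode(path.read_bytes()).decode("ascii")
```

Calling `encode_image("screenshot.png")` would then either return the encoded image or raise a `FileNotFoundError` that names the missing file.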

miguelg719 · 2024-12-03T18:32:20Z

Demo video added! @HamidShojanazeri

HamidShojanazeri · 2024-12-06T00:23:19Z

Thanks @miguelg719, great PR!

aidando73 · 2024-12-16T08:54:48Z

That's cool

miguelg719 and others added 3 commits November 21, 2024 12:03

Quickstart Jupyter notebook for browser use with Llama 3.2 Vision

b06832c

Update browser-use-quickstart.ipynb

21647d4

ready for PR merge

e5b7fa8

facebook-github-bot added the cla signed label Nov 21, 2024

HamidShojanazeri self-assigned this Nov 24, 2024

HamidShojanazeri reviewed Nov 24, 2024

miguelg719 added 3 commits November 24, 2024 16:10

updated examples and added sample screenshot

94a7912

prompt enhancement update

18c2d65

demo video added

df920c7

updated link to blog and demo

ad72330

HamidShojanazeri approved these changes Dec 6, 2024

HamidShojanazeri merged commit 03c61ae into meta-llama:main Dec 6, 2024
4 checks passed
