feat: add bridge mode for extension #228

Merged
merged 45 commits into from
Jan 7, 2025
Changes from 35 commits
d0e1af6
feat: add bridge files
yuyutaotao Dec 27, 2024
15afbdc
feat: add bridge files
yuyutaotao Dec 27, 2024
1eb8e14
feat: add bridge files
yuyutaotao Dec 27, 2024
445b348
feat: bridge mode for local connect
yuyutaotao Dec 29, 2024
0a9449f
Merge branch 'main' into feat/extension-bridge-mode
yuyutaotao Dec 29, 2024
6a8a6ed
feat: add bridge for remote page
yuyutaotao Dec 30, 2024
b75005e
chore: global polyfill
yuyutaotao Dec 30, 2024
43f9d17
feat: bridge panel
yuyutaotao Dec 30, 2024
ce22469
feat: update bridge style
yuyutaotao Dec 31, 2024
82ea466
feat: add bridge files
yuyutaotao Dec 31, 2024
a4756d6
feat: add bridge files
yuyutaotao Dec 31, 2024
044cdc6
fix: bridge io
yuyutaotao Dec 31, 2024
518e5bd
chore: merge main
yuyutaotao Jan 2, 2025
db7ed5f
feat: update bridge implementation
yuyutaotao Jan 2, 2025
6f3c7c4
doc: update doc for yaml scripts
yuyutaotao Jan 2, 2025
e8368ff
feat: update folder and build script
yuyutaotao Jan 2, 2025
1e00708
feat: add bridge files
yuyutaotao Jan 2, 2025
449c396
fix: lint
yuyutaotao Jan 2, 2025
aaca87b
feat: update tips
yuyutaotao Jan 3, 2025
785aca6
feat: allow connect to current tab
yuyutaotao Jan 3, 2025
a8f1187
feat: disable cache for nx build
yuyutaotao Jan 3, 2025
df28466
fix: add the missing deps
yuyutaotao Jan 3, 2025
b2a9b44
chore: update nx config
yuyutaotao Jan 3, 2025
79326a3
doc: add doc for bridge mode
yuyutaotao Jan 3, 2025
520430d
chore: merge main
yuyutaotao Jan 3, 2025
d716acc
doc: update doc for bridge mode
yuyutaotao Jan 3, 2025
93c983f
Merge branch 'main' into feat/extension-bridge-mode
yuyutaotao Jan 3, 2025
00c5ff4
fix: lint
yuyutaotao Jan 3, 2025
40cb9cb
fix: bridge style
yuyutaotao Jan 3, 2025
225df69
feat: update doc
yuyutaotao Jan 3, 2025
aa26c5d
doc: update doc for bridge mode
yuyutaotao Jan 3, 2025
7440a06
Merge branch 'main' into feat/extension-bridge-mode
yuyutaotao Jan 3, 2025
af72e50
doc: add image explaination for bridge mode
yuyutaotao Jan 5, 2025
84d14ce
chore: move agent test
yuyutaotao Jan 6, 2025
3364450
chore: move test case
yuyutaotao Jan 6, 2025
ed603ae
feat: print version when connected
yuyutaotao Jan 6, 2025
578556b
feat: print version when connected
yuyutaotao Jan 6, 2025
3232814
feat: print version when connected
yuyutaotao Jan 6, 2025
98724c6
fix: update export path
yuyutaotao Jan 6, 2025
9e86ad7
fix: exports of web integration
yuyutaotao Jan 6, 2025
39959bf
fix: build
yuyutaotao Jan 6, 2025
b8b6686
fix: lint
yuyutaotao Jan 6, 2025
99c7abc
fix: update export
yuyutaotao Jan 7, 2025
03df48d
fix: use enum for mouse and keyboard events
yuyutaotao Jan 7, 2025
6ea827a
fix: mouse and keyboard event
yuyutaotao Jan 7, 2025
1 change: 0 additions & 1 deletion README.md
@@ -2,7 +2,6 @@
<img alt="Midscene.js" width="260" src="https://github.com/user-attachments/assets/f60de3c1-dd6f-4213-97a1-85bf7c6e79e4">
</p>


<h1 align="center">Midscene.js</h1>
<div align="center">

94 changes: 94 additions & 0 deletions apps/site/docs/en/bridge-mode-by-chrome-extension.mdx
@@ -0,0 +1,94 @@
# Bridge Mode by Chrome Extension

import { PackageManagerTabs } from '@theme';

The bridge mode in the Midscene Chrome extension is a tool that allows you to use local scripts to control the desktop version of Chrome. Your scripts can connect to either a new tab or the currently active tab.

Using the desktop version of Chrome lets you reuse its cookies, plugins, page state, and anything else already set up in the browser. You can work alongside your automation scripts to complete a task. In the context of automation, this mode is commonly referred to as 'man-in-the-loop'.

![bridge mode](/midscene-bridge-mode.jpg)

:::info Demo Project
You can check out the demo project for bridge mode here: [https://github.com/web-infra-dev/midscene-example/blob/main/bridge-mode-demo](https://github.com/web-infra-dev/midscene-example/blob/main/bridge-mode-demo)
:::

## Preparation

Install the [Midscene extension from the Chrome Web Store](https://chromewebstore.google.com/detail/midscene/gbldofcpkknbggpkmbdaefngejllnief). We will use it later.

## Step 1. Install dependencies

<PackageManagerTabs command="install @midscene/web tsx --save-dev" />

## Step 2. Write the script

Write the following code and save it as `./demo-new-tab.ts`.

```typescript
import { AgentOverChromeBridge } from "@midscene/web/bridge-mode";

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
Promise.resolve(
  (async () => {
    const agent = new AgentOverChromeBridge();

    // This connects to a new tab in your desktop Chrome.
    // Remember to start the Chrome extension and click the 'Allow connection' button;
    // otherwise you will get a timeout error.
    await agent.connectNewTabWithUrl("https://www.bing.com");

    // These calls are the same as with a normal Midscene agent.
    await agent.ai('type "AI 101" and hit Enter');
    await sleep(3000);

    await agent.aiAssert("there are some search results");
    await agent.destroy();
  })()
);
```

## Step 3. Run

Launch your desktop Chrome. Start the Midscene extension and switch to the 'Bridge Mode' tab. Click 'Allow connection'.

Run your script:

```bash
tsx demo-new-tab.ts
```

After running the script, you should see the status in the Chrome extension switch to 'connected' and a new tab open. This tab is now controlled by your script.

:::info
It does not matter whether you run the script before or after clicking 'Allow connection' in the browser.
:::

## API

In addition to [the normal agent interface](./api), `AgentOverChromeBridge` provides a few extra methods for controlling the desktop Chrome.

:::info
You should always call `connectCurrentTab` or `connectNewTabWithUrl` before performing any other action.

Each agent instance can connect to only one tab instance, and it cannot be reconnected after `destroy`.
:::
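
The two rules above (connect first, no reuse after `destroy`) can be made explicit in calling code. The following is a minimal sketch and not part of Midscene: `BridgeLifecycle` and its method names are hypothetical, and it only tracks state on the caller's side, assuming every call to the agent is routed past it.

```typescript
// Hypothetical helper, not part of @midscene/web: tracks the lifecycle
// rules described above so that misuse fails fast with a clear message.
type LifecycleState = "idle" | "connected" | "destroyed";

class BridgeLifecycle {
  private state: LifecycleState = "idle";

  // Call right after connectCurrentTab / connectNewTabWithUrl succeeds.
  markConnected(): void {
    if (this.state === "destroyed") {
      throw new Error("agent was destroyed; create a new agent instead of reconnecting");
    }
    if (this.state === "connected") {
      throw new Error("agent is already connected to a tab");
    }
    this.state = "connected";
  }

  // Call before any further action such as ai() or aiAssert().
  assertReady(): void {
    if (this.state !== "connected") {
      throw new Error(`agent is not connected (state: ${this.state})`);
    }
  }

  // Call right after destroy().
  markDestroyed(): void {
    this.state = "destroyed";
  }

  get current(): LifecycleState {
    return this.state;
  }
}
```

Keeping the state machine separate from the agent means it makes no assumptions about how the real class reports these errors.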

### `connectCurrentTab`

Connect to the currently active tab in Chrome.

### `connectNewTabWithUrl(url: string)`

Create a new tab with the given URL and connect to it immediately.

### `destroy`

Destroy the connection.
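
As a sketch of how these three calls fit together, the helper below connects to the current tab, runs a task, and calls `destroy` once connected, even if the task throws. `BridgeAgentLike` and `runOnCurrentTab` are hypothetical names introduced for illustration. The interface lists only methods mentioned on this page, and it is an assumption that `connectCurrentTab` and `destroy` can be called without arguments.

```typescript
// Structural type covering only the methods named in this document.
// Assumption: connectCurrentTab() and destroy() can be called with no arguments.
interface BridgeAgentLike {
  connectCurrentTab(): Promise<unknown>;
  aiAssert(assertion: string): Promise<unknown>;
  destroy(): Promise<unknown>;
}

// Hypothetical helper: connect, run the task, and release the connection
// once connected, whether the task succeeds or throws.
async function runOnCurrentTab<T>(
  agent: BridgeAgentLike,
  task: (agent: BridgeAgentLike) => Promise<T>,
): Promise<T> {
  await agent.connectCurrentTab();
  try {
    return await task(agent);
  } finally {
    // An agent cannot be reconnected after destroy, so use a fresh one per run.
    await agent.destroy();
  }
}
```

With the real package, this would presumably be called as `runOnCurrentTab(new AgentOverChromeBridge(), (a) => a.aiAssert("..."))`, with the extension's 'Allow connection' button clicked as described in Step 3.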

## Use bridge mode in yaml-script

We are still building this, and it will be ready soon.





5 changes: 3 additions & 2 deletions apps/site/docs/en/index.mdx
@@ -46,11 +46,12 @@ await aiAssert("There is a category filter on the left");

## Multiple ways to integrate

To start experiencing the core feature of Midscene, we recommend you use [The Chrome Extension](./quick-experience). You can call Action / Query / Assert by natural language on any webpage, without needing to set up a code project.
To start experiencing the core feature of Midscene, we recommend you use [the Chrome Extension](./quick-experience). You can call Action / Query / Assert by natural language on any webpage, without needing to set up a code project.

Also, there are several ways to integrate Midscene into your code project:

* [Automate with Scripts in YAML](./automate-with-scripts-in-yaml)
* [Automate with Scripts in YAML](./automate-with-scripts-in-yaml): use this if you prefer writing YAML files instead of code
* [Bridge Mode by Chrome Extension](./bridge-mode-by-chrome-extension): use this to control the desktop Chrome with scripts
* [Integrate with Puppeteer](./integrate-with-puppeteer)
* [Integrate with Playwright](./integrate-with-playwright)

11 changes: 11 additions & 0 deletions apps/site/docs/en/model-provider.md
@@ -38,7 +38,9 @@ export OPENAI_MAX_TOKENS=2048
Use ADT token provider

```bash
# set this to 1 when using Azure OpenAI Service
export MIDSCENE_USE_AZURE_OPENAI=1

export MIDSCENE_AZURE_OPENAI_SCOPE="https://cognitiveservices.azure.com/.default"
export AZURE_OPENAI_ENDPOINT="..."
export AZURE_OPENAI_API_VERSION="2024-05-01-preview"
@@ -110,6 +112,15 @@ export OPENAI_API_KEY="..."
export MIDSCENE_MODEL_NAME="ep-202....."
```

## Example: configure request headers (e.g. for OpenRouter)

```bash
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="..."
export MIDSCENE_MODEL_NAME="..."
export MIDSCENE_OPENAI_INIT_CONFIG_JSON='{"defaultHeaders":{"HTTP-Referer":"...","X-Title":"..."}}'
```
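
To illustrate what this variable carries: its value is a JSON object, and the `defaultHeaders` key matches the option of the same name accepted by the OpenAI Node SDK client constructor. The sketch below shows one way such a value could be parsed and merged into client options. `buildClientOptions` and `ClientOptions` are hypothetical names, and how Midscene consumes the variable internally is an assumption here.

```typescript
// Hypothetical helper: merge a JSON string (as stored in
// MIDSCENE_OPENAI_INIT_CONFIG_JSON) into base client options.
interface ClientOptions {
  baseURL?: string;
  apiKey?: string;
  defaultHeaders?: Record<string, string>;
  [key: string]: unknown;
}

function buildClientOptions(
  base: ClientOptions,
  initConfigJson: string | undefined,
): ClientOptions {
  if (!initConfigJson) return { ...base };
  const extra: unknown = JSON.parse(initConfigJson); // throws on malformed JSON
  if (typeof extra !== "object" || extra === null || Array.isArray(extra)) {
    throw new Error("init config must be a JSON object");
  }
  // Keys from the JSON override the base options.
  return { ...base, ...(extra as ClientOptions) };
}

// Placeholder values ("...") are kept exactly as in the example above.
const options = buildClientOptions(
  { baseURL: "https://openrouter.ai/api/v1", apiKey: "..." },
  '{"defaultHeaders":{"HTTP-Referer":"...","X-Title":"..."}}',
);
console.log(Object.keys(options.defaultHeaders ?? {}));
```

Rejecting non-object JSON up front keeps a misquoted shell value from silently producing an empty configuration.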

## Troubleshooting LLM Service Connectivity Issues

If you want to troubleshoot connectivity issues, you can use the 'connectivity-test' folder in our example project: [https://github.com/web-infra-dev/midscene-example/tree/main/connectivity-test](https://github.com/web-infra-dev/midscene-example/tree/main/connectivity-test)
Binary file added apps/site/docs/public/midscene-bridge-mode.jpg
94 changes: 94 additions & 0 deletions apps/site/docs/zh/bridge-mode-by-chrome-extension.mdx
@@ -0,0 +1,94 @@
# Bridge Mode by Chrome Extension

import { PackageManagerTabs } from '@theme';

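With the bridge mode of the Midscene Chrome extension, you can use local scripts to control the desktop version of Chrome. Your scripts can connect to either a new tab or the currently active tab.

Using the desktop version of Chrome lets you reuse existing cookies, plugins, page state, and so on. You can have automation scripts interact with a human operator to complete your tasks.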

![bridge mode](/midscene-bridge-mode.jpg)

:::info Demo Project
You can check out the demo project for bridge mode here: [https://github.com/web-infra-dev/midscene-example/blob/main/bridge-mode-demo](https://github.com/web-infra-dev/midscene-example/blob/main/bridge-mode-demo)
:::

## Preparation

Install the [Midscene extension](https://chromewebstore.google.com/detail/midscene/gbldofcpkknbggpkmbdaefngejllnief).

## Step 1: Install dependencies

<PackageManagerTabs command="install @midscene/web tsx --save-dev" />

## Step 2: Write the script

Write the following code and save it as `./demo-new-tab.ts`.

```typescript
import { AgentOverChromeBridge } from "@midscene/web/bridge-mode";

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
Promise.resolve(
  (async () => {
    const agent = new AgentOverChromeBridge();

    // This method connects to a new tab in your desktop Chrome.
    // Remember to start the Chrome extension and click the 'Allow connection' button;
    // otherwise you will get a timeout error.
    await agent.connectNewTabWithUrl("https://www.bing.com");

    // These methods are the same as with a normal Midscene agent.
    await agent.ai('type "AI 101" and hit Enter');
    await sleep(3000);

    await agent.aiAssert("there are some search results");
    await agent.destroy();
  })()
);
```

## Step 3: Run the script

Launch your desktop Chrome. Start the Midscene extension and switch to the 'Bridge Mode' tab. Click 'Allow connection'.

Run your script:

```bash
tsx demo-new-tab.ts
```

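After running the script, you should see the status shown in the Chrome extension switch to 'connected' and a new tab open. This tab is now controlled by your script.

:::info
There is no required order between running the script and clicking the 'Allow connection' button in the extension.
:::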

## API

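In addition to [the normal agent interface](./api), `AgentOverChromeBridge` provides some extra interfaces for controlling the desktop Chrome.

:::info
You should call `connectCurrentTab` or `connectNewTabWithUrl` before performing any other action.

Each agent instance can connect to only one tab instance, and once destroyed it cannot be reconnected.
:::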

### `connectCurrentTab`

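Connect to the currently active tab.

### `connectNewTabWithUrl(url: string)`

Create a new tab and connect to it immediately.

### `destroy`

Destroy the connection.

## Use bridge mode in YAML scripts

This feature is under development and will be available soon.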





3 changes: 2 additions & 1 deletion apps/site/docs/zh/index.mdx
@@ -37,7 +37,8 @@ console.log("headphones in stock", items);

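In addition, there are several ways to integrate Midscene into your code:

* [Automate with Scripts in YAML](./automate-with-scripts-in-yaml)
* [Automate with Scripts in YAML](./automate-with-scripts-in-yaml): use this if you prefer writing YAML files instead of code
* [Bridge Mode by Chrome Extension](./bridge-mode-by-chrome-extension): use this to control the desktop Chrome with scripts
* [Integrate with Puppeteer](./integrate-with-puppeteer)
* [Integrate with Playwright](./integrate-with-playwright)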

2 changes: 2 additions & 0 deletions apps/site/docs/zh/model-provider.md
@@ -35,7 +35,9 @@ export OPENAI_MAX_TOKENS=2048
Use ADT token provider

```bash
# set this to 1 when using Azure OpenAI Service
export MIDSCENE_USE_AZURE_OPENAI=1

export MIDSCENE_AZURE_OPENAI_SCOPE="https://cognitiveservices.azure.com/.default"
export AZURE_OPENAI_ENDPOINT="..."
export AZURE_OPENAI_API_VERSION="2024-05-01-preview"
18 changes: 8 additions & 10 deletions apps/site/rspress.config.ts
@@ -26,16 +26,6 @@ export default defineConfig({
'https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=291q2b25-e913-411a-8c51-191e59aab14d',
},
],
// footer: {
// message: `
// <footer class="footer">
// <div class="footer-content">
// <img src="/midscene-icon.png" alt="Midscene.js Logo" class="footer-logo" />
// <p class="footer-text">&copy; 2024 Midscene.js. All Rights Reserved.</p>
// </div>
// </footer>
// `,
// },
locales: [
{
lang: 'en',
@@ -70,6 +60,10 @@ export default defineConfig({
text: 'Automate with Scripts in YAML',
link: '/automate-with-scripts-in-yaml',
},
{
text: 'Bridge Mode by Chrome Extension',
link: '/bridge-mode-by-chrome-extension',
},
{
text: 'Integrate with Playwright',
link: '/integrate-with-playwright',
@@ -127,6 +121,10 @@ export default defineConfig({
text: '使用 YAML 格式的自动化脚本',
link: '/zh/automate-with-scripts-in-yaml',
},
{
text: '使用 Chrome 插件的桥接模式(Bridge Mode)',
link: '/zh/bridge-mode-by-chrome-extension',
},
{
text: '集成到 Playwright',
link: '/zh/integrate-with-playwright',
4 changes: 1 addition & 3 deletions nx.json
@@ -5,14 +5,12 @@
"dependsOn": ["^build"]
},
"build": {
"dependsOn": ["^build"],
"cache": true
"dependsOn": ["^build"]
},
"build:watch": {
"dependsOn": ["^build"]
},
"test": {
"dependsOn": ["^build"],
"cache": false
},
"e2e": {
12 changes: 5 additions & 7 deletions packages/midscene/package.json
@@ -38,20 +38,18 @@
"prepublishOnly": "npm run build"
},
"dependencies": {
"@anthropic-ai/sdk": "0.33.1",
"@azure/identity": "4.5.0",
"@langchain/core": "0.3.26",
"@anthropic-ai/sdk": "0.33.1",
"@midscene/shared": "workspace:*",
"dirty-json": "0.9.2",
"langchain": "0.3.8",
"openai": "4.57.1",
"optional": "0.1.4",
"socks-proxy-agent": "8.0.4"
"@langchain/core": "0.3.26",
"socks-proxy-agent": "8.0.4",
"openai": "4.57.1"
},
"devDependencies": {
"@modern-js/module-tools": "2.60.6",
"@types/node": "^18.0.0",
"@types/node-fetch": "2.6.11",
"dirty-json": "0.9.2",
"dotenv": "16.4.5",
"langsmith": "0.1.36",
"typescript": "~5.0.4",
3 changes: 2 additions & 1 deletion packages/midscene/src/action/executor.ts
@@ -143,7 +143,8 @@ export class Executor {
taskIndex++;
} catch (e: any) {
successfullyCompleted = false;
task.error = e?.message || 'error-without-message';
task.error =
e?.message || (typeof e === 'string' ? e : 'error-without-message');
task.errorStack = e.stack;

task.status = 'failed';
1 change: 1 addition & 0 deletions packages/midscene/src/ai-model/openai/index.ts
@@ -118,6 +118,7 @@ async function createChatClient({
endpoint: getAIConfig(AZURE_OPENAI_ENDPOINT),
apiVersion: getAIConfig(AZURE_OPENAI_API_VERSION),
deployment: getAIConfig(AZURE_OPENAI_DEPLOYMENT),
dangerouslyAllowBrowser: true,
...extraConfig,
...extraAzureConfig,
});
3 changes: 1 addition & 2 deletions packages/visualizer/modern.config.ts
@@ -2,7 +2,7 @@ import path from 'node:path';
import { defineConfig, moduleTools } from '@modern-js/module-tools';
import { modulePluginNodePolyfill } from '@modern-js/plugin-module-node-polyfill';
import { version } from './package.json';
const externals = ['playwright'];
const externals = ['playwright', 'bufferutil', 'utf-8-validate'];

const commonConfig = {
asset: {
@@ -18,7 +18,6 @@ const commonConfig = {
: undefined,
define: {
__VERSION__: JSON.stringify(version),
global: 'globalThis',
},
};

3 changes: 3 additions & 0 deletions packages/visualizer/package.json
@@ -56,5 +56,8 @@
"sideEffects": ["**/*.css", "**/*.less", "**/*.sass", "**/*.scss"],
"publishConfig": {
"access": "public"
},
"dependencies": {
"buffer": "6.0.3"
}
}