2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -193,7 +193,7 @@ Each desktop adapter has its own detailed documentation with commands reference,
| **Cursor** | Control Cursor IDE — Composer, chat, code extraction | [Doc](./docs/adapters/desktop/cursor.md) |
| **Codex** | Drive OpenAI Codex CLI agent headlessly | [Doc](./docs/adapters/desktop/codex.md) |
| **Antigravity** | Control Antigravity Ultra from terminal | [Doc](./docs/adapters/desktop/antigravity.md) |
| **ChatGPT** | Automate ChatGPT macOS desktop app | [Doc](./docs/adapters/desktop/chatgpt.md) |
| **ChatGPT** | Automate ChatGPT desktop app (default macOS native + explicit CDP surfaces) | [Doc](./docs/adapters/desktop/chatgpt.md) |
| **ChatWise** | Multi-LLM client (GPT-4, Claude, Gemini) | [Doc](./docs/adapters/desktop/chatwise.md) |
| **Notion** | Search, read, write Notion pages | [Doc](./docs/adapters/desktop/notion.md) |
| **Discord** | Discord Desktop — messages, channels, servers | [Doc](./docs/adapters/desktop/discord.md) |
97 changes: 79 additions & 18 deletions docs/adapters/desktop/chatgpt.md
@@ -1,43 +1,104 @@
# ChatGPT

Control the **ChatGPT macOS Desktop App** directly from the terminal. OpenCLI supports two automation approaches for ChatGPT.
Control the **ChatGPT Desktop App** from the terminal.

## Approach 1: AppleScript (Default, No Setup)
OpenCLI now splits ChatGPT automation by **target surface**, so Windows support is purely additive and the long-standing macOS behavior is unchanged.

The current built-in commands use native AppleScript automation — no extra launch flags needed.
## Surface 1: `macos-native` (default)

This is the original built-in path. If you run `opencli chatgpt ...` with no `--surface` flag, OpenCLI keeps using native macOS automation via AppleScript + Accessibility.

### Prerequisites
1. Install the official [ChatGPT Desktop App](https://openai.com/chatgpt/mac/) from OpenAI.
1. Install the official [ChatGPT desktop app](https://openai.com/chatgpt/download/).
2. Grant **Accessibility permissions** to your terminal app in **System Settings → Privacy & Security → Accessibility**.

### Commands
- `opencli chatgpt status`: Check if the ChatGPT app is currently running.
- `opencli chatgpt new`: Activate ChatGPT and press `Cmd+N` to start a new conversation.
- `opencli chatgpt send "message"`: Copy your message to clipboard, activate ChatGPT, paste, and submit.
- `opencli chatgpt read`: Read the last visible message from the focused ChatGPT window via the Accessibility tree.
### Commands on `macos-native`
- `opencli chatgpt status`
- `opencli chatgpt new`
- `opencli chatgpt send "message"`
- `opencli chatgpt read`
- `opencli chatgpt ask "message"`

### Notes
- `read` returns the **last visible message** from the focused ChatGPT window via the macOS Accessibility tree.
- `ask` remains the original **send + wait + read** macOS-only flow.

## Surface 2: `macos-cdp` (experimental)

This keeps the previously documented **ChatGPT macOS CDP mode**, but makes it an explicit opt-in instead of an automatic switch.

Use it only on the commands that currently support the narrow CDP path:

- `opencli chatgpt status --surface macos-cdp`
- `opencli chatgpt read --surface macos-cdp`
- `opencli chatgpt send --surface macos-cdp "message"`

## Surface 3: `windows-cdp` (experimental)

This is the new additive surface for the **Windows ChatGPT desktop app**, including WSL workflows that control the Windows app over a local CDP endpoint.

Use it on the same narrow command subset:

- `opencli chatgpt status --surface windows-cdp`
- `opencli chatgpt read --surface windows-cdp`
- `opencli chatgpt send --surface windows-cdp "message"`

> **Important:** OpenCLI does **not** switch ChatGPT into CDP mode automatically just because `OPENCLI_CDP_ENDPOINT` is set. You must opt in per command with `--surface macos-cdp` or `--surface windows-cdp`.
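
The opt-in rule above could be sketched roughly as follows. This is an illustration only, not OpenCLI's actual implementation: `resolveSurface` and the `Surface` type are hypothetical names, and the check that a CDP surface needs an endpoint is an assumption. Only the `--surface` values and the `OPENCLI_CDP_ENDPOINT` variable come from this document.

```typescript
// Hypothetical sketch: names below are illustrative, not OpenCLI's real API.
type Surface = 'macos-native' | 'macos-cdp' | 'windows-cdp';

const SURFACES: readonly Surface[] = ['macos-native', 'macos-cdp', 'windows-cdp'];

function resolveSurface(
  flag: string | undefined,
  env: Record<string, string | undefined>,
): Surface {
  // No flag: always the native default, even if a CDP endpoint is configured.
  if (flag === undefined) return 'macos-native';

  const surface = SURFACES.find((s) => s === flag);
  if (surface === undefined) {
    throw new Error(`Unknown surface "${flag}". Expected one of: ${SURFACES.join(', ')}`);
  }

  // Assumption: a CDP surface is unusable without an endpoint to attach to.
  // The endpoint on its own never selects a CDP surface.
  if (surface !== 'macos-native' && !env.OPENCLI_CDP_ENDPOINT) {
    throw new Error(`--surface ${surface} requires OPENCLI_CDP_ENDPOINT to be set`);
  }
  return surface;
}
```

Passing the environment in as a parameter keeps the sketch free of global state, which makes the "endpoint alone does nothing" rule easy to check in isolation.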

## Approach 2: CDP (Advanced, Electron Debug Mode)
## CDP setup

ChatGPT Desktop is also an Electron app and can be launched with a remote debugging port:
### macOS example

```bash
/Applications/ChatGPT.app/Contents/MacOS/ChatGPT \
--remote-debugging-port=9224

export OPENCLI_CDP_ENDPOINT="http://127.0.0.1:9224"
# Optional but recommended when multiple targets exist:
export OPENCLI_CDP_TARGET="chatgpt"
```

### Windows / WSL example

Fully quit ChatGPT first, then launch the real Windows app with a debugging port:

```powershell
ChatGPT.exe --remote-debugging-port=9224 --remote-debugging-address=127.0.0.1
```

Then from WSL or the same Windows machine:

```bash
export OPENCLI_CDP_ENDPOINT="http://127.0.0.1:9224"
export OPENCLI_CDP_TARGET="chatgpt" # optional but recommended
```

> The CDP approach enables future advanced commands like DOM inspection, model switching, and code extraction.
> On Windows, a **true cold launch matters**. If ChatGPT is already running, relaunching with debug flags may leave you with no usable `/json` target list.
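
One way to tell whether the launch produced a usable target list is to inspect the JSON array served at the endpoint's `/json` path. The helper below is a hedged sketch: `usableTargets` and `TargetInfo` are hypothetical names, and the field names follow the target objects that Chromium-based apps typically return from that path.

```typescript
// Hypothetical helper: filters a parsed `/json` response down to attachable pages.
interface TargetInfo {
  type: string;
  title: string;
  url: string;
  webSocketDebuggerUrl?: string;
}

function usableTargets(payload: unknown): TargetInfo[] {
  // Anything other than an array (error object, empty body) means no targets.
  if (!Array.isArray(payload)) return [];
  return payload.filter(
    (t): t is TargetInfo =>
      typeof t === 'object' &&
      t !== null &&
      (t as TargetInfo).type === 'page' &&
      // Without a WebSocket URL there is nothing to attach to.
      typeof (t as TargetInfo).webSocketDebuggerUrl === 'string',
  );
}
```

If the result is empty after relaunching with the debug flags, the app was probably not cold-started; fully quitting it and launching again is the first thing to try.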

## Command support matrix

| Command | `macos-native` | `macos-cdp` | `windows-cdp` |
|---------|-----------------|-------------|---------------|
| `status` | ✅ | ✅ | ✅ |
| `new` | ✅ | — | — |
| `send` | ✅ | ✅ | ✅ |
| `read` | ✅ | ✅ | ✅ |
| `ask` | ✅ | — | — |
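
The matrix above can also be expressed as data, which makes an "unsupported command on this surface" check a one-line lookup. This is a sketch under assumptions: `SUPPORT` and `isSupported` are hypothetical names, and only the command and surface names are taken from the table.

```typescript
// Hypothetical encoding of the support matrix; not OpenCLI's real data structure.
type Command = 'status' | 'new' | 'send' | 'read' | 'ask';
type Surface = 'macos-native' | 'macos-cdp' | 'windows-cdp';

const ALL_SURFACES: readonly Surface[] = ['macos-native', 'macos-cdp', 'windows-cdp'];
const NATIVE_ONLY: readonly Surface[] = ['macos-native'];

const SUPPORT: Record<Command, readonly Surface[]> = {
  status: ALL_SURFACES,
  new: NATIVE_ONLY,
  send: ALL_SURFACES,
  read: ALL_SURFACES,
  ask: NATIVE_ONLY,
};

function isSupported(command: Command, surface: Surface): boolean {
  return SUPPORT[command].includes(surface);
}
```

For example, `isSupported('ask', 'windows-cdp')` is `false`, matching the dash in the last row of the table.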

## How the CDP path behaves today

## How It Works
The current CDP implementation is intentionally narrow:

- **AppleScript mode**: Uses `osascript` to control ChatGPT, `pbcopy`/`pbpaste` to paste prompts, and the macOS Accessibility tree to read visible chat messages.
- **CDP mode**: Connects via Chrome DevTools Protocol to the Electron renderer process.
- `status` attaches to the selected ChatGPT target and reports connection state
- `read` returns the **last visible conversation turn** from the current ChatGPT window
- `send` injects the prompt into the active composer and submits it
- the CDP `send` path returns after submission; use `read` later if you want the latest visible output
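
Because the CDP `send` path returns right after submission, a caller that wants the finished reply has to poll `read` itself. The following is a minimal sketch of one possible approach, not something OpenCLI ships: `isStable` and `waitForStableRead` are hypothetical names, and `readLatest` stands in for whatever runs `opencli chatgpt read --surface ...` and returns its text.

```typescript
// Hypothetical sketch: treat the output as finished once it stops changing.
function isStable(samples: readonly string[], required: number): boolean {
  if (required < 1 || samples.length < required) return false;
  const tail = samples.slice(-required);
  // Empty reads never count as a finished answer.
  return tail[0] !== '' && tail.every((s) => s === tail[0]);
}

async function waitForStableRead(
  readLatest: () => Promise<string>, // injected reader (assumption)
  opts: { intervalMs: number; required: number; maxPolls: number },
): Promise<string | null> {
  const samples: string[] = [];
  for (let i = 0; i < opts.maxPolls; i++) {
    samples.push(await readLatest());
    if (isStable(samples, opts.required)) return samples[samples.length - 1];
    await new Promise((resolve) => setTimeout(resolve, opts.intervalMs));
  }
  return null; // gave up: output kept changing or stayed empty
}
```

A stable read can still be the user's own prompt if the assistant has not started answering, so a real caller would likely also check the role of the returned turn.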

## Limitations

- macOS only (AppleScript dependency)
- AppleScript mode requires Accessibility permissions
- `read` returns the last visible message in the focused ChatGPT window — scroll first if the message you want is not visible
- `new` and `ask` remain **macOS-native only**.
- CDP support is intentionally limited to `status`, `read`, and `send`.
- If multiple inspectable targets exist, set `OPENCLI_CDP_TARGET=chatgpt`.
- `send` in CDP mode refuses to overwrite an existing draft already sitting in the composer.
- `read` only returns the **last visible** conversation turn, not a full export.
- DOM selectors may drift as ChatGPT desktop changes.
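
The draft-protection rule for CDP `send` could be sketched as a small guard. This is an assumption-laden illustration: `assertSafeToSend` is a hypothetical name, and the `composerFound`, `composerEmpty`, and `draftLength` fields are borrowed from the status snapshot used in this PR's `cdp.test.ts`, without any claim that the real `send` path consumes them this way.

```typescript
// Hypothetical guard; field names mirror the status snapshot in cdp.test.ts.
interface ComposerState {
  composerFound: boolean;
  composerEmpty: boolean;
  draftLength: number;
}

function assertSafeToSend(state: ComposerState): void {
  if (!state.composerFound) {
    // Most likely cause per the limitations list: DOM selectors have drifted.
    throw new Error('ChatGPT composer not found; DOM selectors may have drifted');
  }
  if (!state.composerEmpty || state.draftLength > 0) {
    throw new Error(
      `Refusing to overwrite an existing draft (${state.draftLength} characters) in the composer`,
    );
  }
}
```

Failing loudly here is a deliberate choice in the sketch: silently replacing a half-written prompt would be harder to notice than an error.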
2 changes: 1 addition & 1 deletion docs/adapters/index.md
@@ -53,7 +53,7 @@ Run `opencli list` for the live registry.
| **[Cursor](/adapters/desktop/cursor)** | Control Cursor IDE | `status` `send` `read` `new` `dump` `composer` `model` `extract-code` `ask` `screenshot` `history` `export` |
| **[Codex](/adapters/desktop/codex)** | Drive OpenAI Codex CLI agent | `status` `send` `read` `new` `extract-diff` `model` `ask` `screenshot` `history` `export` |
| **[Antigravity](/adapters/desktop/antigravity)** | Control Antigravity Ultra | `status` `send` `read` `new` `dump` `extract-code` `model` `watch` |
| **[ChatGPT](/adapters/desktop/chatgpt)** | Automate ChatGPT macOS app | `status` `new` `send` `read` `ask` |
| **[ChatGPT](/adapters/desktop/chatgpt)** | Automate ChatGPT desktop app (default macOS native + explicit CDP surfaces) | `status` `new` `send` `read` `ask` |
| **[ChatWise](/adapters/desktop/chatwise)** | Multi-LLM client | `status` `new` `send` `read` `ask` `model` `history` `export` `screenshot` |
| **[Notion](/adapters/desktop/notion)** | Search, read, write pages | `status` `search` `read` `new` `write` `sidebar` `favorites` `export` |
| **[Discord](/adapters/desktop/discord)** | Desktop messages & channels | `status` `send` `read` `channels` `servers` `search` `members` |
19 changes: 19 additions & 0 deletions src/browser.test.ts
@@ -90,6 +90,25 @@ describe('browser helpers', () => {

expect(target?.webSocketDebuggerUrl).toBe('ws://127.0.0.1:9226/codex');
});

it('boosts ChatGPT targets during generic CDP target scoring', () => {
const target = __test__.selectCDPTarget([
{
type: 'page',
title: 'Session Overview',
url: 'https://example.com/dashboard',
webSocketDebuggerUrl: 'ws://127.0.0.1:9224/other',
},
{
type: 'page',
title: 'ChatGPT',
url: 'https://chatgpt.com/?window_style=main_view',
webSocketDebuggerUrl: 'ws://127.0.0.1:9224/chatgpt',
},
]);

expect(target?.webSocketDebuggerUrl).toBe('ws://127.0.0.1:9224/chatgpt');
});
});

describe('BrowserBridge state', () => {
2 changes: 2 additions & 0 deletions src/browser/cdp.ts
@@ -341,6 +341,7 @@ function scoreCDPTarget(target: CDPTarget, preferredPattern?: RegExp): number {
if (title.includes('codex')) score += 120;
if (title.includes('cursor')) score += 120;
if (title.includes('chatwise')) score += 120;
if (title.includes('chatgpt')) score += 120;
if (title.includes('notion')) score += 120;
if (title.includes('discord')) score += 120;
if (title.includes('netease')) score += 120;
@@ -349,6 +350,7 @@ function scoreCDPTarget(target: CDPTarget, preferredPattern?: RegExp): number {
if (url.includes('codex')) score += 100;
if (url.includes('cursor')) score += 100;
if (url.includes('chatwise')) score += 100;
if (url.includes('chatgpt') || url.includes('chat.openai')) score += 100;
if (url.includes('notion')) score += 100;
if (url.includes('discord')) score += 100;
if (url.includes('netease')) score += 100;
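
To see how the two added rules interact, here is a reduced, self-contained sketch of additive scoring restricted to the ChatGPT rules. It is not the real `scoreCDPTarget`: `scoreTarget` and `pickBest` are hypothetical names, and lowercasing the title and URL before matching is an assumption inferred from the `'ChatGPT'` title in the new test.

```typescript
// Hypothetical, reduced model of the scoring shown in this hunk.
interface Target {
  title: string;
  url: string;
}

function scoreTarget(target: Target): number {
  // Assumption: matching is case-insensitive, as the test title 'ChatGPT' suggests.
  const title = target.title.toLowerCase();
  const url = target.url.toLowerCase();
  let score = 0;
  if (title.includes('chatgpt')) score += 120;
  if (url.includes('chatgpt') || url.includes('chat.openai')) score += 100;
  return score;
}

function pickBest(targets: Target[]): Target | undefined {
  let best: Target | undefined;
  let bestScore = -Infinity;
  for (const target of targets) {
    const score = scoreTarget(target);
    if (score > bestScore) {
      best = target;
      bestScore = score;
    }
  }
  return best;
}
```

With the two targets from the new test in `src/browser.test.ts`, the ChatGPT page scores 220 under these two rules and the dashboard page scores 0, so the ChatGPT page is selected.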
5 changes: 4 additions & 1 deletion src/clis/chatgpt/README.md
@@ -1,5 +1,8 @@
# ChatGPT Adapter

Control the **ChatGPT macOS Desktop App** from the terminal via AppleScript or CDP.
Control the **ChatGPT Desktop App** from the terminal.

- Default surface: **macOS native** (AppleScript + Accessibility)
- Experimental opt-in surfaces on `status`, `read`, and `send`: `--surface macos-cdp` or `--surface windows-cdp`

📖 **Full documentation**: [docs/adapters/desktop/chatgpt](../../../docs/adapters/desktop/chatgpt.md)
44 changes: 4 additions & 40 deletions src/clis/chatgpt/README.zh-CN.md
@@ -1,44 +1,8 @@
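# ChatGPT Desktop Adapter

Control the **ChatGPT macOS desktop app** directly from the terminal. OpenCLI supports two automation approaches
Control the **ChatGPT Desktop App** from the terminal

## Approach 1: AppleScript (default, no setup)
- Default surface: **macos-native** (AppleScript + Accessibility)
- `status` / `read` / `send` can explicitly switch to an experimental CDP surface: `--surface macos-cdp` or `--surface windows-cdp`

The built-in commands use native AppleScript automation; no extra launch flags are needed.

### Prerequisites
1. Install the official [ChatGPT Desktop App](https://openai.com/chatgpt/mac/).
2. Grant permission to your terminal app in **System Settings → Privacy & Security → Accessibility**.

### Commands
- `opencli chatgpt status`: Check whether the ChatGPT app is running.
- `opencli chatgpt new`: Activate ChatGPT and press `Cmd+N` to start a new conversation.
- `opencli chatgpt send "message"`: Copy the message to the clipboard, activate ChatGPT, paste, and submit.
- `opencli chatgpt read`: Read the last visible message from the focused ChatGPT window via the Accessibility tree and return its text.

## Approach 2: CDP (advanced, Electron debug mode)

ChatGPT Desktop is also an Electron app and can be launched with a remote debugging port for deeper automation:

```bash
/Applications/ChatGPT.app/Contents/MacOS/ChatGPT \
--remote-debugging-port=9224
```

Then set the environment variable:
```bash
export OPENCLI_CDP_ENDPOINT="http://127.0.0.1:9224"
```

> **Note**: CDP mode enables future advanced commands (such as DOM inspection, model switching, and code extraction), similar to the Cursor and Codex adapters.

## How it works

- **AppleScript mode**: Uses `osascript` to control ChatGPT, pastes text via `pbcopy`/`pbpaste` when sending messages, and reads the currently visible chat content through the macOS Accessibility tree.
- **CDP mode**: Connects to the Electron renderer process via the Chrome DevTools Protocol and operates on the DOM directly.

## Limitations

- macOS only (AppleScript dependency)
- AppleScript mode requires Accessibility permissions
- `read` returns the last visible message in the focused ChatGPT window; if the target message is not in the visible area, scroll manually first
📖 **Full documentation**: [docs/adapters/desktop/chatgpt](../../../docs/adapters/desktop/chatgpt.md)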
5 changes: 4 additions & 1 deletion src/clis/chatgpt/ask.ts
@@ -2,6 +2,7 @@ import { execSync, spawnSync } from 'node:child_process';
import { cli, Strategy } from '../../registry.js';
import type { IPage } from '../../types.js';
import { getVisibleChatMessages } from './ax.js';
import { requireMacOSHost } from './surface.js';

export const askCommand = cli({
site: 'chatgpt',
@@ -15,7 +16,9 @@ export const askCommand = cli({
{ name: 'timeout', required: false, help: 'Max seconds to wait for response (default: 30)', default: '30' },
],
columns: ['Role', 'Text'],
func: async (page: IPage | null, kwargs: any) => {
func: async (_page: IPage | null, kwargs: any) => {
requireMacOSHost('ask');

const text = kwargs.text as string;
const timeout = parseInt(kwargs.timeout as string, 10) || 30;
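
The body of `requireMacOSHost` is not part of this hunk. As a rough idea of what such a guard might do, here is a hypothetical stand-in: `assertMacOSHost` and `parseTimeoutSeconds` are illustrative names, and the platform is passed in as a parameter so the sketch does not depend on the host it runs on.

```typescript
// Hypothetical sketch of a host guard; the real helper lives in ./surface.js.
function assertMacOSHost(command: string, platform: string): void {
  // 'darwin' is the value Node.js reports in process.platform on macOS.
  if (platform !== 'darwin') {
    throw new Error(
      `chatgpt ${command} only supports the macos-native surface and must run on macOS (detected: ${platform})`,
    );
  }
}

// Mirrors the timeout parsing in the hunk above: non-numeric or zero input falls back to 30.
function parseTimeoutSeconds(raw: string): number {
  return parseInt(raw, 10) || 30;
}
```

Calling the guard at the top of the command, before any `osascript` work, means a Windows or Linux user gets a clear message instead of a failed child process.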

59 changes: 59 additions & 0 deletions src/clis/chatgpt/cdp.test.ts
@@ -0,0 +1,59 @@
import { describe, expect, it } from 'vitest';
import { __test__ } from './cdp.js';

describe('chatgpt cdp helpers', () => {
it('formats a ready ChatGPT CDP status row with explicit surface metadata', () => {
expect(__test__.formatChatGPTStatusRow({
title: 'ChatGPT',
url: 'https://chatgpt.com/?window_style=main_view',
readyState: 'complete',
likelyChatGPT: true,
turnCount: 6,
composerFound: true,
composerTag: 'DIV',
composerEmpty: true,
draftLength: 0,
sendButtonEnabled: true,
busy: false,
}, 'windows-cdp')).toEqual({
Status: 'Connected',
Surface: 'windows-cdp',
Url: 'https://chatgpt.com/?window_style=main_view',
Title: 'ChatGPT',
Turns: 6,
Composer: 'Ready',
Busy: 'No',
});
});

it('formats send results as successful submissions while keeping table compatibility narrow', () => {
expect(__test__.formatChatGPTSendResultRow({
surface: 'windows-cdp',
submitMethod: 'button',
injectedText: 'Research this carefully',
})).toEqual({
Status: 'Success',
Surface: 'windows-cdp',
Submit: 'button',
InjectedText: 'Research this carefully',
});
});

it('normalizes raw turns and strips repeated UI chrome lines', () => {
expect(__test__.normalizeChatGPTTurns([
{ role: 'user', text: 'Hello there' },
{ role: 'assistant', text: 'Sure\nCopy\nShare' },
{ role: 'assistant', text: 'Sure\nCopy\nShare' },
{ role: 'assistant', text: ' ' },
])).toEqual([
{ Role: 'User', Text: 'Hello there' },
{ Role: 'Assistant', Text: 'Sure' },
]);
});

it('strips localized reasoning chrome and timing-only lines from readback text', () => {
expect(__test__.normalizeChatGPTText('立即回答')).toBe('');
expect(__test__.normalizeChatGPTText('Thought for 10s')).toBe('');
expect(__test__.normalizeChatGPTText('ChatGPT 说:\n已完成推理\n立即回答\n\nOK')).toBe('OK');
});
});
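
The implementation in `cdp.ts` is not shown in this diff. The sketch below is one hypothetical way to satisfy the normalization tests above; `normalizeText` and `normalizeTurns` are illustrative names rather than the PR's actual helpers, and the chrome-line lists contain only the strings that appear in those tests.

```typescript
// Hypothetical normalization that matches the expectations in the tests above.
interface RawTurn {
  role: string;
  text: string;
}
interface TurnRow {
  Role: string;
  Text: string;
}

// UI chrome seen in the tests: button labels and localized reasoning banners.
const CHROME_LINES = new Set(['Copy', 'Share', '立即回答', '已完成推理']);
const CHROME_PATTERNS = [/^Thought for \d+s$/, /^ChatGPT 说[::]$/];

function normalizeText(text: string): string {
  // Simplification: blank lines are dropped too, so paragraph breaks collapse.
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter(
      (line) =>
        line !== '' && !CHROME_LINES.has(line) && !CHROME_PATTERNS.some((p) => p.test(line)),
    )
    .join('\n');
}

function normalizeTurns(turns: RawTurn[]): TurnRow[] {
  const rows: TurnRow[] = [];
  for (const turn of turns) {
    const Text = normalizeText(turn.text);
    if (Text === '') continue; // whitespace-only or chrome-only turn
    const Role = turn.role.charAt(0).toUpperCase() + turn.role.slice(1);
    const prev = rows[rows.length - 1];
    // Drop a turn that exactly repeats the previous one.
    if (prev && prev.Role === Role && prev.Text === Text) continue;
    rows.push({ Role, Text });
  }
  return rows;
}
```

Keeping the chrome strings in plain lists makes it cheap to add another locale's labels when selectors or UI copy drift.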