Essay · 8 min read
Hands on with OpenCLI
A CLI hub your coding agent learns through one skill file. The commands you’d run, and the output you’d see come back.
TL;DR
- OpenCLI (jackwener/opencli) wraps existing CLIs and websites into one unified hub your coding agent invokes through a single skill file. Apache 2.0, npm-installed, ~6 weeks old.
- The pitch: a stateless, deterministic alternative to MCP servers. Discovery happens at runtime via opencli list -f json instead of dumping tool schemas into the model’s context window.
- Four use cases below: wrap an existing CLI like gh for unified discovery, use your already-logged-in Chrome session as an auth layer, scaffold a custom adapter for a site OpenCLI doesn’t know yet, and pull deterministic structured data without paying tokens for HTML interpretation.
- Caveats up front: the project is six weeks old, v1.7 was a breaking rewrite, and the “AGENT.md integration” line in the README is aspirational. The actual agent contract is a SKILL.md file. Treat the rough-edges section near the end as load-bearing.
Why this matters
If you’ve wired more than a couple of MCP servers into a coding agent, you’ve felt the cost. Every server dumps its full tool schema into the model’s context the moment the agent starts. Independent benchmarks have shown the MCP path consuming anywhere from four to thirty-two times more tokens than calling the equivalent CLI directly, just for discovery. That’s a real bill on every turn, even before the agent does any work.
OpenCLI is the other end of that tradeoff. Instead of pre-loading schemas, the agent learns one rule from a skill file: “if you need a tool, run opencli list -f json and pick from what’s there.” Discovery happens on demand. The agent thinks for one extra step; the context stays small. For Claude Code or Cursor, that’s a fair trade. For weaker agents that can’t reason about tool selection, MCP is still the safer path.
The other interesting move is the browser layer. OpenCLI ships a Chrome extension called Browser Bridge that hooks into your already-logged-in session. Sites that don’t have public APIs (or that rate-limit the ones they do) become tool calls. Your agent can read your Twitter feed, browse Bilibili, post to Reddit. No OAuth dance, no app keys, no headless browser auth gymnastics.
Below are four concrete situations where I’d reach for it, framed from the developer’s side: what you type, what comes back, and what you do with it.
Use case 1
Wrap an existing CLI under unified discovery
The situation. Your agent already knows gh. Your agent already knows docker. Each one of these is a tool you’d otherwise wire up as its own MCP server, schema-dump and all. You’d rather your agent reach for any of them through one consistent surface — and auto-install the binary if it isn’t there.
What you do. Install OpenCLI globally, then call any wrapped tool through the opencli entrypoint. The wrapper is a passthrough: native flags, native output, native exit codes. The win is that opencli list now knows about every tool registered with the hub.
What you see. If gh isn’t installed on your machine, OpenCLI runs brew install gh (or the equivalent for your platform) before passing the command through. The agent doesn’t have to know about your platform; the hub does.
What the agent sees. One discovery call returns the entire registry as structured JSON:
What you do next. Install the OpenCLI usage skill once: npx skills add jackwener/opencli. The skill teaches your coding agent the rules — always request -f json, hit opencli list before guessing, retry with --trace retain-on-failure on errors. From that point on the agent uses the hub the same way it’d use the shell.
Use case 2
Borrow your already-logged-in Chrome session
The situation. You want your agent to read your X timeline, drop a comment on a Reddit thread, or pull from a site that doesn’t expose a clean API. The traditional path is OAuth, app keys, refresh-token plumbing, and a per-vendor SDK. None of that is interesting work. You’d rather just have the agent use the session you’re already logged into.
What you do. Install the Browser Bridge Chrome extension once. From then on, OpenCLI’s browser adapter pipes commands through your live Chrome — same cookies, same session, same login. Each pre-built site adapter (Twitter, Reddit, Bilibili, HN, Spotify, Amazon, GitHub, NotebookLM, and a few dozen more) wraps the underlying browser calls and returns structured JSON.
What you see. Structured JSON. No HTML to interpret. No “the model decides if this DOM element looks like a tweet.” The adapter knows the page layout and returns clean records. If the site ships a redesign, the adapter breaks; the rest of the hub keeps working.
What you do next. Read-only is the safe starting point. Write commands (post, comment, like) exist for several adapters but use them with the same caution you’d use any session-borrowing tool. If you’re letting an agent post on your behalf, log every call.
Use case 3
Author a custom adapter for a site OpenCLI doesn’t know
The situation. Your team uses an internal tool, or a niche site, that OpenCLI doesn’t ship an adapter for. You don’t want to write a Playwright script that lives in some corner of your repo. You want the new tool to show up in opencli list like everything else.
What you do. Three commands. analyze probes the site and produces a manifest. init scaffolds the adapter file. You fill in the selectors. verify runs the fixture set you wrote and tells you what passes.
What you see. A new entry in opencli list -f json. The adapter lives in ~/.opencli/clis/mysite/ as plain JavaScript. You can ship it to your team by checking that directory into a shared dotfile repo or by writing a small installer.
What the agent does. If you also install the opencli-adapter-author skill, your coding agent can walk through this loop on its own — analyze, init, fill in the selectors, run verify, debug failures. About thirty minutes from “what does this site look like” to “agent can search it.” The first time I tried this, it felt unreasonable.
Use case 4
Pull deterministic structured data without paying tokens
The situation. Your agent needs the top five Hacker News stories, or a Spotify playlist’s tracks, or the trending repos on GitHub. The naive way: WebFetch the page, dump the HTML into context, ask the model to extract. That works, costs tokens every time, and is non-deterministic. The model will “round” the title sometimes.
What you do. One command, JSON out, no LLM in the loop:
What you see. The same shape every time. The agent doesn’t have to interpret HTML; the adapter already did. You can pipe the JSON into jq, into a file, into another tool, or into the agent’s prompt. None of those paths cost more tokens than the JSON itself.
What you do next. Pair this with the agent skill. The skill tells the model: “if the user asks about HN/Reddit/Spotify/GitHub trending, hit the OpenCLI adapter first. Don’t WebFetch. Don’t guess.” Cost goes from “every call has an HTML interpretation step” to “every call returns clean JSON in milliseconds.”
What to know before you reach for it
OpenCLI is genuinely interesting and genuinely young. A few things to flag before you put it in your daily flow:
- The project is six weeks old. First release was around March 20, 2026. v1.7 (April 11) was a breaking architecture rewrite: YAML adapters dropped, operate renamed to browser, several flags renamed. Pin a version if you depend on the API surface.
- The “AGENT.md integration” line in the README is aspirational. There is no AGENT.md spec or template in the repo. The actual agent contract lives in SKILL.md files installed via npx skills add jackwener/opencli. If you read AGENT.md anywhere in the marketing, mentally substitute SKILL.md.
- Browser Bridge has rough edges. Issue #1341 reports persistent Chrome extension errors on macOS in v1.7.12. #1376 reports the YouTube transcript adapter as currently broken. Check the issue tracker before betting on a specific adapter.
- Adapters break when sites change. This is the eternal scraper tax. The hub has tooling (verify) to surface failures fast, but you should still expect to do maintenance on adapters you depend on.
- Node.js ≥ 21 required. That’s an aggressive choice; most teams are still on 20 LTS. Plan accordingly.
- Adapter set is heavy on China-internet. Bilibili, Xiaohongshu, Boss直聘, 1688, Doubao, DingTalk, Lark. If you’re outside that ecosystem you’ll use maybe a third of what’s shipped — but the third you do use is high-leverage (HN, Reddit, Twitter, GitHub, Spotify).
None of these are dealbreakers for the use cases above. They are reasons to read the issue tracker before you bet a production workflow on the project.
Where it sits next to MCP
The interesting question isn’t “is OpenCLI better than MCP.” It’s “which tradeoff fits your agent.” Two paths, two costs:
- MCP servers push tool schemas into the model’s context window upfront. Discovery is free at runtime — the model already knows the menu. The tax is paid in tokens on every turn. For agents that can’t reliably reason about which tool to invoke without seeing all options, this is the safer path.
- OpenCLI keeps the menu out of the context. The agent learns one rule: before you reach for a tool, run
opencli list -f json. Discovery costs one extra reasoning step. The tax is paid in cognition, not tokens. For Claude Code, Cursor, and other strong agents, the cognition is cheap and the token savings compound.
It’s also worth noting that the two compose. You can run an MCP server that wraps OpenCLI itself — the agent gets one MCP tool that proxies into the entire OpenCLI hub. That’s the “best of both” pattern if your harness is MCP-only.
If you’ve been following the Prompt → Context → Harness series, OpenCLI is a tool in the harness layer with an unusual cost model. OpenShell sat one layer below as the runtime. Different layers, different problems, different tradeoffs. The interesting work in 2026 is putting them together.

About the author
Archit Dwivedi builds AI agents and ships them into production. More at /about · what I’m working on · say hi.
Archit · May 2026