# Give your AI agent eyes and hands
One command. Your coding agent can see your screen and control your desktop.
`npx juno-cua` starts an MCP (Model Context Protocol) server that exposes desktop automation tools to any compatible AI agent. Your agent can take screenshots, click buttons, type text, read the accessibility tree, and navigate between apps.
The use case: you're working with Claude Code and it needs to check if a UI change looks right, or verify something in a browser, or interact with a desktop app. Instead of describing what you see, the agent just looks.
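As a concrete starting point, most MCP clients register a stdio server with a JSON entry like the one below. The exact config file and location vary by client, so treat this `mcpServers` fragment as a sketch of the common shape rather than any one client's documented format:

```json
{
  "mcpServers": {
    "juno-cua": {
      "command": "npx",
      "args": ["juno-cua"]
    }
  }
}
```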
## Install

### npm (no install needed)

```sh
npx juno-cua
```

### Homebrew

```sh
brew install lacymorrow/tap/juno-cua
```

## Available tools
| Tool | Description |
|---|---|
| screenshot | Capture the current screen state |
| click | Click at specific coordinates or UI elements |
| type | Type text into any field or application |
| scroll | Scroll in any direction |
| accessibility_tree | Read the full UI element hierarchy |
| open_url | Open URLs in the default browser |
| open_app | Launch any installed application |
| drag | Drag from one point to another |
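Under the hood, an MCP client invokes these tools with standard JSON-RPC 2.0 `tools/call` requests over stdio. Here is a minimal sketch of the message a client would send to the `click` tool; the `x`/`y` argument names are illustrative assumptions, not the server's documented schema (a client would discover the real schema via `tools/list`):

```python
import json

def tool_call_request(req_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call message as a JSON-RPC 2.0 request string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Example: ask the server to click at screen coordinates (100, 200).
# (Argument names are hypothetical; check the tool's actual input schema.)
msg = tool_call_request(1, "click", {"x": 100, "y": 200})
print(msg)
```

In practice your agent's MCP client library builds and sends these messages for you; the sketch only shows the wire format the tools table above maps onto.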
## Compatible agents
| Agent | Integration |
|---|---|
| Claude Code | Native MCP support |
| Cursor | MCP tool integration |
| OpenAI Codex | MCP-compatible |
| Gemini CLI | MCP-compatible |
| Custom agents | Any MCP client |
## What developers use it for
Visual verification — "Does the button look right after my CSS change?" The agent screenshots and checks.
Cross-app workflows — Agent reads a design in Figma, writes the code, then opens the browser to compare.
Testing — Agent fills forms, clicks through flows, and verifies the app works end-to-end.
Desktop automation — Anything beyond the terminal. Agent manages windows, interacts with native apps, controls the OS.
## Open source
juno-cua is part of the Juno project. Source code, issues, and contributions at github.com/lacymorrow/juno. Licensed FSL-1.1-MIT (converts to MIT after 2 years).
Give your agent desktop superpowers.