A phone-use agent in your pocket — always close.
ClosePaw is an open-source agent harness for Android. Give it a natural-language task ("book a table for two at the ramen place near me", "summarize the new Slack threads, then mute the noisy channel") and it operates your phone like you would — via Android's accessibility service, or in the background on a virtual display (with Shizuku).
- 🗣️ Just say what you need. Type or speak — "find the cheapest AirPods Pro", "summarize unread Slack threads and mute the noisy channel." ClosePaw operates the app like you would.
- 🫧 Smart Capsule. A floating overlay that follows the agent across apps while it works. Watch every step, pause, take over, or send a quick note — without leaving whatever app you're in. Voice dictation built in.
- 👀 Watch every step, pause anytime. Tap circles and swipe lines show exactly what the agent is doing. Pause, take over, or stop in one tap.
- 📱 On your phone, with your real accounts. No laptop tethered over ADB, no cloud emulator with empty logins — ClosePaw runs locally against the apps you're already signed into.
- 🪟 Doesn't take your phone hostage. Optional background mode lets the agent work on a virtual screen while you keep scrolling, texting, or watching video. (Needs Shizuku.)
- 🔓 Use any AI. Bring your own — OpenAI key, sign in with ChatGPT/Codex, OpenRouter, or any OpenAI-compatible endpoint. No vendor lock-in.
- 🛡️ Safe by default. Banking, authenticator, and crypto-wallet apps are hard-blocked — no setting can override. Unfamiliar apps prompt for per-app approval (always-allow / session-only / deny). Screens marked
FLAG_SECUREare invisible to the agent's perception by design. - 🔐 Private by design. No telemetry, no third-party analytics. Traces stay on device. Apache 2.0.
Natural-language input · Smart Capsule overlay · Action visualizer · Bring-your-own LLM
Tip
Why ClosePaw, when there are already "phone-use agents" out there? Most open-source phone-use agents today either need a computer tethered over ADB to drive a phone, or run inside a cloud virtual phone that doesn't have your accounts logged in. ClosePaw runs on your actual phone, against your actual apps — Gmail, Slack, your shopping app, your group chats — with your real sessions. No laptop. No cloud sandbox. No re-logging-in.
- 🧠 A full on-device agent harness, in the making. Built in Kotlin, native to Android. ReAct loop, no external orchestrator. The pieces:
- 🔩 Primitive toolset —
mobile_action(tap, type, swipe/scroll),open_app+system_buttonfor navigation, andtodo+scratchpadas in-session working memory for long-horizon tasks. - 🌿 Subagents via
delegate_task— spin off a subagent for delegation in long-range complicated tasks, clean handoff with summary message. - 💾 Long-term memory (preliminary) — markdown files at user / device / per-app scope; the agent appends via
remember_experience. - 📚 Skills (preliminary, two kinds):
- agent-skills — agentskills.io-format skills, progressively loaded on-demand by the agent. Today bundled with the app; a discovery engine is in progress.
- app-skills — ClosePaw-unique design. Per-package
SKILL.mdfiles that teach the agent how to operate specific apps. Auto-loaded whenever that app is in the foreground.
- 🔩 Primitive toolset —
- 🛠️ Advanced agent-first tools. Programmatic escapes from tap-and-swipe:
- 🐚
shell— Android toybox file commands (ls/cat/grep/head/mv/cp). One command per call; no pipes / redirects / command substitution. No setup. - 🐧
termux_shell— full Linux toolchain on the device:python/git/curl/jq, plus anything youpkg install. Needs Termux. - 🌐
browser_script— JS automation against real Chrome via Chrome DevTools Protocol; loops, branches, and retries happen inside one tool call. Needs Chrome + Shizuku.
- 🐚
- 🪟 Virtual display platform. Hybrid background sessions via Shizuku — the agent operates a parallel Android display so the foreground stays yours.
- 🔌 Pluggable LLM layer. OpenAI-compatible API is the contract. ChatGPT/Codex OAuth flow built in. First-class: OpenAI, OpenRouter; or any OpenAI-compatible endpoint via the Other slot (Anthropic, Groq, Together, your own proxy).
- 👁️ Pluggable perception. Accessibility tree by default; optional point-in-time screenshots in screenshot/hybrid modes.
- 🔍 Inspectable traces. Every session writes LLM calls, tool calls, and perception snapshots to on-device storage; pull with
adbfor inspection. - 🔁 Eval-driven agent-harness autotune loop. Run an AndroidWorld task suite (
eval/) against the agent; an autotune harness analyzes failures, proposes prompt / tool / skill fixes, and re-runs.
- An Android device or emulator running API 31+ (Android 12 or later)
- An OpenAI or OpenRouter API key — or sign in with your ChatGPT account via OAuth — or any OpenAI-compatible endpoint (base URL + key) configured under Other
- (Optional, for Power Tools) Shizuku and/or Termux from F-Droid
Recommended — signed release APK. Download the latest APK from GitHub Releases and open it on your device to install.
Tip
Help us ship to the Play Store sooner. Google requires 12 closed testers active for 14 days before we can promote to production. Become an authorized tester by joining the closed-testing Google Group, then join/install on Android or join on the web. Please stay opted in for the full 14 days.
Or build from source. Useful if you want to hack on it or run the latest unreleased changes.
git clone https://github.com/imoonkey/closepaw.git
cd closepaw
./gradlew assembleDebug
adb install app/build/outputs/apk/debug/app-debug.apkRequires JDK 17 and the Android SDK.
On first launch, an onboarding wizard walks you through everything in order — recommended. It covers:
- Enable the Accessibility service so ClosePaw can read screens and dispatch taps
- Grant Display over other apps for the Smart Capsule overlay
- Disable Battery optimization so long-running tasks don't get killed
- Configure your LLM — paste an API key, Sign in with ChatGPT/Codex, or set up an OpenAI-compatible endpoint
- Run a quick demo task to confirm everything works end-to-end
Then type a task on the home screen. The Smart Capsule overlay will follow the agent across apps so you can watch every step, pause, take over, or chime in from wherever you are.
Skipped onboarding, or want to change something later? All the same controls live under Settings — accessibility / overlay / battery toggles, LLM credentials and provider switcher, and the Power Tools opt-ins below.
ClosePaw gets noticeably more capable when you opt in to two optional integrations. Neither is required.
Note
Shizuku — unlocks the virtual display platform (the agent works in the background while you keep using your phone) and the browser_script tool. Follow the Shizuku setup guide, then re-open ClosePaw → Settings → enable Virtual display.
Note
Termux (install from F-Droid, not the Play Store version — it's outdated) — unlocks the termux_shell tool. After install, open Termux once, run pkg install termux-api, then enable the bridge in ClosePaw Settings. Details: doc/main/app/termux_shell.md.
High-level layers:
- Agent loop — ReAct turn engine, optional
delegate_tasksubagent delegation, todo + scratchpad state, cross-session memory - Tools — UI primitives (
mobile_action,open_app,system_button); working memory & control (todo,scratchpad,remember_experience,delegate_task,activate_skill); advanced (shell,termux_shellneeds Termux,browser_scriptneeds Shizuku) - Platforms —
AccessibilityPlatformfor normal use,VirtualDisplayPlatform(Shizuku) for hybrid background sessions - LLM — pluggable clients (OpenAI / OpenRouter / OpenAI-compatible "Other"), OAuth flow for ChatGPT sign-in, model catalog, retry infrastructure
Full design docs live under doc/main/. Start there for the agent loop, tool protocol contracts, and platform abstraction.
Note
On-device inference: the codebase ships an LFMLLMClient (Liquid AI Leap SDK) path, but it's not exposed in the UI yet — on-device models are still too slow to be practically useful for a multi-turn agent loop. See doc/main/infra/llm.md.
The Android accessibility service is genuinely powerful access — it lets ClosePaw read on-screen content and dispatch taps and gestures on your behalf. Please understand what you're granting before enabling it.
A formal Privacy Policy will be linked here. In the meantime:
- The accessibility service is used only to perceive on-screen content and execute the actions required by the task you typed.
- LLM requests go directly to whichever provider you configured — ClosePaw has no server in the loop.
- The microphone is only active while you're actively dictating via the Smart Capsule.
- No third-party analytics or telemetry.
- Session traces and debug logs (which may include screenshots and the text you typed) are written only to on-device storage and can be cleared from Settings at any time.
A CONTRIBUTING.md is on the way. Until then: open an issue to discuss non-trivial changes, follow Conventional Commits (feat:, fix:, refactor:, docs:, test:), and run ./gradlew clean assembleDebug lint test before opening a PR.
Good first contributions: new tools (look at how termux_shell and browser_script are wired up), additional LLM providers, perception improvements, and Smart Capsule UX polish.
-
doc/— docs hub.doc/main/for architecture (start at the README, then drill intoagent/,infra/,ui/);doc/dev/for build / debug / test workflow;doc/release/for signing, Play Store, and privacy materials. -
eval/andinspection_tool/— Python eval harness (AndroidWorld bridge) and FastAPI replay viewer fordebug-output/traces. See each folder's README. -
Project agent skills in
.claude/skills/— ClosePaw-specific workflows for AI coding agents. The improvement pipeline nests three layers by scope of evidence:/cog-tune— one session. Analyze a single trace, classify the root cause as cognition or execution, propose fixes./autotune— one batch. Run a curated AndroidWorld task set, apply the same diagnose-and-fix across all failures in the batch./autotune-loop— many batches. Orchestrate/autotunerounds unattended until convergence.
Two fix paths fork off the diagnosis:
/prompt-tuneapplies cognition-class fixes across prompts / tool descriptions / app-skills (respecting layer ownership);/action-debugisolates execution-class failures at the action layer (baseline vs accessibility-service path)./ux-visual-debugis orthogonal — end-to-end UX QA via ADB, when the question is interaction quality rather than agent reasoning.Both
CLAUDE.mdand.claude/are symlinked to theirAGENTS.md/GEMINI.md/.cursorrules/.codex//.agents/counterparts — the same project-local skills and conventions work for most AI coding agents.
Found a vulnerability? Please do not open a public issue. See SECURITY.md for the private disclosure process.
ClosePaw is an autonomous AI agent that takes real actions on your phone — taps, swipes, typing, sending messages, completing purchases. AI agents make mistakes. They misread screens, misinterpret instructions, send things to the wrong person, or persist past the intended goal. ClosePaw ships guardrails (per-app approval, hard-blocked sensitive apps, pause / takeover from the Smart Capsule), but no guardrail is perfect. Watch what the agent does on anything that touches money, communication, or anything irreversible — and take over the moment something looks off.
This is open-source software provided as-is under the Apache 2.0 License (Sections 7–8: no warranties, no liability). You assume all risk and responsibility for actions the agent takes on your behalf.
Licensed under the Apache License, Version 2.0. See NOTICE for attribution and the bundled open-source license inventory for third-party components.




