Peekaboo — OpenClaw Skill
Full macOS UI automation: screenshots, clicks, typing, windows, apps, and menus.
What This Skill Does
Peekaboo is a comprehensive macOS UI automation CLI that gives your OpenClaw agent the ability to see and interact with any application on your Mac. It can capture annotated screenshots with labeled UI elements, click buttons and links by element ID, type text into fields, scroll and swipe, drag and drop, manage windows and applications, interact with menus and the Dock, handle system dialogs, read and write the clipboard, and perform live capture with motion detection.
The most reliable workflow is "see then click": first, peekaboo see --annotate captures a screenshot with every interactive element labeled (B1, B2, T1, etc.). Your agent reads these labels and can then click any element with peekaboo click --on B3. This approach works with any application regardless of whether it supports accessibility APIs, making it universally applicable for web apps, native apps, and complex UIs alike.
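As a rough illustration, the two steps can be wrapped in small shell helpers. This is a minimal sketch, not part of Peekaboo itself: the function names, the PEEKABOO override variable, and the default screenshot path are hypothetical, and the label check assumes labels always look like the B1, B2, T1 examples above. Only the see --annotate, --path, and click --on options mentioned on this page are used.

```shell
# PEEKABOO is a hypothetical override: set PEEKABOO=echo for a dry run
# that prints each command instead of executing it.
PEEKABOO="${PEEKABOO:-peekaboo}"

# Step 1: capture an annotated screenshot to the given path.
capture_annotated() {
    shot="${1:-/tmp/peekaboo-see.png}"   # hypothetical default path
    $PEEKABOO see --annotate --path "$shot"
}

# Step 2: click one labeled element, e.g. B3.
# Assumes labels are uppercase letters followed by digits (B1, B2, T1).
click_label() {
    label="$1"
    if ! printf '%s\n' "$label" | grep -Eq '^[A-Z]+[0-9]+$'; then
        echo "not an element label: $label" >&2
        return 1
    fi
    $PEEKABOO click --on "$label"
}
```

In practice the agent would call capture_annotated, read the labels from the resulting image, and only then call click_label with the chosen ID. Keeping the two steps separate reflects that the label is not known until the screenshot has been inspected.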
Peekaboo also supports advanced features like live capture with configurable FPS for idle and active periods, window management (move, resize, minimize, maximize), application lifecycle (launch, quit, relaunch, switch), and human-like input profiles for typing and mouse movement. For browser-specific automation, see the Coding Agent skill. For smart home camera capture, see CamSnap.
Example Prompts
Take an annotated screenshot of Safari showing the login page so I can see the UI elements
Click the "Submit" button in the Safari window and then type my email address into the form
Launch Slack, resize its window to 1200x800, and take a screenshot of the current channel
Open the Format menu in TextEdit and click "Show Fonts"
List all open windows and close everything except VS Code and Terminal
Start a 30-second live capture of the dashboard at 8fps and highlight any changes
Check the macOS permissions status for Screen Recording and Accessibility
Scroll down 6 ticks smoothly in the frontmost window
Requirements
Binary dependency: peekaboo
- macOS: brew install steipete/tap/peekaboo
- Permissions: Screen Recording + Accessibility (System Settings → Privacy & Security)
- Platform: macOS only
Setup on KiwiClaw
Peekaboo is pre-installed on KiwiClaw macOS tenant machines. Screen Recording and Accessibility permissions must be granted for the agent process. Configure permissions in the KiwiClaw dashboard. Once set up, your agent can see and interact with the entire macOS desktop.
Setup Self-Hosted
- Install Peekaboo: brew install steipete/tap/peekaboo
- Grant Screen Recording permission: System Settings → Privacy & Security → Screen Recording
- Grant Accessibility permission: System Settings → Privacy & Security → Accessibility
- Verify: peekaboo permissions
- Test: peekaboo see --annotate --path /tmp/test.png
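The verify and test steps above could be combined into one preflight check, sketched below. The preflight function, its messages, and its return codes are illustrative and not part of Peekaboo. The sketch also assumes that peekaboo exits with a non-zero status when a command fails, which this page does not confirm.

```shell
# Preflight sketch: confirm the binary exists, print permission status,
# then take one test capture and check that a file was written.
preflight() {
    bin="${1:-peekaboo}"        # binary to check
    out="${2:-/tmp/test.png}"   # where the test capture is written

    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "missing: $bin (install with: brew install steipete/tap/peekaboo)"
        return 1
    fi

    # Show the permission status reported by the Verify step.
    "$bin" permissions || return 2

    # Remove any stale file so the check below reflects this run only.
    rm -f "$out"
    "$bin" see --annotate --path "$out" || return 3

    if [ -s "$out" ]; then
        echo "ok: wrote $out"
    else
        echo "capture produced no file at $out"
        return 4
    fi
}
```

Running preflight with no arguments checks the default peekaboo binary and writes to /tmp/test.png. A non-zero return code indicates which step to revisit: 1 for installation, 2 for permissions, 3 or 4 for the capture itself.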
Related Skills
- CamSnap -- capture from physical cameras alongside screen capture
- OpenHue -- control smart lights alongside desktop automation
- iMsg -- send screenshots via iMessage to collaborators
- GifGrep -- extract stills from screen recordings for documentation
FAQ
What can the Peekaboo skill do in OpenClaw?
Peekaboo provides full macOS UI automation: capture annotated screenshots, click UI elements by ID, type text, scroll, drag and drop, manage windows and apps, interact with menus and the Dock, handle system dialogs, and perform live capture with motion detection.
Does Peekaboo require special macOS permissions?
Yes. Peekaboo requires Screen Recording permission (for screenshots and UI inspection) and Accessibility permission (for clicking, typing, and other interactions). Grant both in System Settings → Privacy & Security. Learn more in our guide to running OpenClaw without a Mac Mini.
How does the see-then-click workflow work?
First, peekaboo see --annotate captures an annotated screenshot with labeled UI elements (B1, B2, T1, etc.). Then you can click any element by its label: peekaboo click --on B3. This is the most reliable way to interact with any app's UI.
Is the Peekaboo skill safe to use?
The Peekaboo skill has been security-vetted by KiwiClaw. It runs locally on your Mac and never sends screenshots or UI data to external servers. All interactions are logged and require the explicit permissions you've granted.