Visual context for agents
Grounding equips agents with keyboard shortcuts, UI feature metadata, and surface context through an MCP server and training data.
pip install grounding
pipx install grounding
uv tool install grounding
uvx grounding

MCP SERVER
The glasses for agents
Serve real UI affordances to your MCP-compatible agents.
SHORTCUT INTELLIGENCE
Keyboard knowledge your agent can reason over
Agents query normalized shortcuts for hundreds of desktop and web apps, mapped to OS variants and fuzzy-matched actions, to increase navigation speed and minimize token usage.
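As a rough illustration of what fuzzy-matched, OS-aware shortcut lookup can look like, here is a minimal sketch built on Python's standard `difflib`. It is not the Grounding API: the `SHORTCUTS` table, the `find_shortcut` function, and the `cutoff` value are all hypothetical names and sample data chosen for this example.

```python
from difflib import get_close_matches

# Hypothetical sample data for illustration only; not the Grounding dataset.
# Each action maps an OS name to a normalized key combination.
SHORTCUTS = {
    "gmail": {
        "compose new message": {"mac": "c", "windows": "c", "linux": "c"},
        "archive conversation": {"mac": "e", "windows": "e", "linux": "e"},
        "send message": {
            "mac": "cmd+enter",
            "windows": "ctrl+enter",
            "linux": "ctrl+enter",
        },
    },
}


def find_shortcut(app, query, os_name, cutoff=0.5):
    """Return (matched_action, keys) for the closest action name, or None."""
    actions = SHORTCUTS.get(app.lower(), {})
    # Fuzzy-match the free-text query against the known action names.
    matches = get_close_matches(query.lower(), list(actions), n=1, cutoff=cutoff)
    if not matches:
        return None
    action = matches[0]
    # Pick the key combination for the requested operating system.
    keys = actions[action].get(os_name.lower())
    if keys is None:
        return None
    return action, keys


if __name__ == "__main__":
    # A misspelled query still resolves to the intended action.
    print(find_shortcut("gmail", "send mesage", "windows"))
```

Returning a single short key combination instead of a long UI description is one way a lookup like this can keep an agent's context small.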
FEATURE GROUNDING
Feature-level context for web and desktop applications
Agents choose among features using rich contextual data: the feature name, a plain-language description of the feature, and pixel coordinates specific to the screen, operating system, and browser.
TRAINING DATA
Structured, reliable & comprehensive.
Production-ready workflow training data with multi-tier action fallbacks. Choose from prebuilt datasets or on-demand generation.
UI states and feature groups for Gmail
Browse Prebuilt Workflows
Access thousands of ready-made workflows across popular applications. Each includes keyboard shortcuts, text grounding, and coordinate fallbacks.
Generate Custom Workflows
Provide a natural language task and your target applications. We generate custom workflow training data on-demand.
Multi-Tier Action Fallbacks
Every step includes three grounding tiers (keyboard shortcut, text grounding, and coordinates), so your agents keep working across UI variations.
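One way an agent might consume a multi-tier step is to try each tier in turn and stop at the first that succeeds. The sketch below assumes the order shortcut, then text, then coordinates, which the page does not specify. The `Step` dataclass, the `run_step` function, and the three executor callbacks are hypothetical and do not reflect the actual Grounding data schema.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Step:
    """Hypothetical workflow step carrying up to three grounding tiers."""
    shortcut: Optional[str] = None                 # e.g. "ctrl+enter"
    text_target: Optional[str] = None              # e.g. visible label "Send"
    coordinates: Optional[Tuple[int, int]] = None  # e.g. (1184, 642)


def run_step(
    step: Step,
    press_keys: Callable[[str], bool],
    click_text: Callable[[str], bool],
    click_at: Callable[[Tuple[int, int]], bool],
) -> Optional[str]:
    """Try each tier in order; return the name of the tier that worked."""
    # Assumed priority: shortcut first, coordinates last.
    tiers = [
        ("shortcut", step.shortcut, press_keys),
        ("text", step.text_target, click_text),
        ("coordinates", step.coordinates, click_at),
    ]
    for name, target, executor in tiers:
        if target is None:
            continue  # This tier is not available for the step.
        if executor(target):
            return name  # Stop at the first tier that succeeds.
    return None  # Every available tier failed.


if __name__ == "__main__":
    step = Step(shortcut="ctrl+enter", text_target="Send", coordinates=(1184, 642))
    # Simulate a UI where the shortcut is disabled but the label is clickable.
    used = run_step(step, lambda keys: False, lambda label: True, lambda xy: True)
    print(used)
```

Each executor reports success as a boolean, so the caller supplies its own automation backend and this function only decides which tier to attempt next.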
FAQs
The Answers You’re Looking For
Quick answers about the Grounding MCP Server.