Visual context for agents

Grounding improves agent performance by supplying keyboard shortcuts, UI feature metadata, and surface context through an MCP server and training data.

pip install grounding
pipx install grounding
uv tool install grounding
uvx grounding

MCP SERVER

The glasses for agents

Serve real UI affordances to your MCP-compatible agents.

SHORTCUT INTELLIGENCE

Keyboard knowledge your agent can reason over

Agents query normalized shortcuts for hundreds of desktop and web apps, mapped to OS variants and fuzzy-matched actions, to increase navigation speed and minimize token usage.
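As a rough illustration of what a fuzzy-matched, OS-aware shortcut lookup involves, here is a minimal standalone sketch. It does not use the Grounding API: the `SHORTCUTS` table, the `find_shortcut` function, and the `cutoff` value are all hypothetical, and the matching relies on Python's standard `difflib` module.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical shortcut table for illustration only: action -> per-OS keys.
SHORTCUTS = {
    "copy": {"macos": "Cmd+C", "windows": "Ctrl+C", "linux": "Ctrl+C"},
    "paste": {"macos": "Cmd+V", "windows": "Ctrl+V", "linux": "Ctrl+V"},
    "select all": {"macos": "Cmd+A", "windows": "Ctrl+A", "linux": "Ctrl+A"},
    "undo": {"macos": "Cmd+Z", "windows": "Ctrl+Z", "linux": "Ctrl+Z"},
}


def find_shortcut(query: str, os_name: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the key combination for the action closest to `query`.

    Returns None when no action is similar enough or the OS is unknown.
    """
    # Normalize the query, then pick the single best fuzzy match.
    matches = get_close_matches(query.strip().lower(), list(SHORTCUTS), n=1, cutoff=cutoff)
    if not matches:
        return None
    return SHORTCUTS[matches[0]].get(os_name)


if __name__ == "__main__":
    print(find_shortcut("select al", "windows"))  # misspelled query still resolves
    print(find_shortcut("copy", "macos"))
```

Returning a short key combination instead of a long UI description is one way such a lookup could keep an agent's prompt small.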

FEATURE GROUNDING

Feature-level context for web and desktop applications

Agents select features using rich contextual data: the feature name, a plain-language description of the feature, and pixel coordinates specific to the screen, operating system, and browser.
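One way to picture a feature record like this is as a small data structure keyed by environment. The sketch below is an assumption about shape only, not the schema Grounding actually returns: the `Environment` and `Feature` classes, their field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Environment:
    """Hypothetical key describing where the UI is rendered."""
    screen: Tuple[int, int]  # width, height in pixels
    os: str
    browser: str


@dataclass
class Feature:
    """Hypothetical feature record: name, description, per-environment coordinates."""
    name: str
    description: str
    coordinates: Dict[Environment, Tuple[int, int]] = field(default_factory=dict)

    def locate(self, env: Environment) -> Optional[Tuple[int, int]]:
        # Return the pixel position for this exact environment, if recorded.
        return self.coordinates.get(env)


if __name__ == "__main__":
    env = Environment(screen=(1920, 1080), os="macos", browser="chrome")
    compose = Feature(
        name="Compose",
        description="Opens a new message window.",
        coordinates={env: (96, 180)},  # illustrative values, not real data
    )
    print(compose.locate(env))
```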

TRAINING DATA

Structured, reliable & comprehensive.

Production-ready workflow training data with multi-tier action fallbacks. Choose from prebuilt datasets or on-demand generation.

UI states and feature groups for Gmail

Browse Prebuilt Workflows

Access thousands of ready-made workflows across popular applications. Each includes keyboard shortcuts, text grounding, and coordinate fallbacks.

Generate Custom Workflows

Provide a natural-language task and your target applications. We generate custom workflow training data on demand.

Multi-Tier Action Fallbacks

Every step includes three grounding tiers (keyboard shortcut, text grounding, and coordinate fallback), so your agents keep working across UI variations.
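The fallback idea can be sketched as a loop that tries each tier in turn and stops at the first one that succeeds. This is a minimal illustration under stated assumptions, not Grounding's implementation: the tier order (shortcut, then text, then coordinates), the `step` dictionary keys, and the `execute_with_fallback` function are hypothetical, and the executors are stand-in callables supplied by the caller.

```python
from typing import Callable, Dict, Optional

# Assumed tier order: cheapest and most stable first, raw pixels last.
TIER_ORDER = ("shortcut", "text", "coordinates")


def execute_with_fallback(
    step: Dict[str, object],
    executors: Dict[str, Callable[[object], bool]],
) -> Optional[str]:
    """Try each grounding tier in order; return the tier that succeeded, else None."""
    for tier in TIER_ORDER:
        target = step.get(tier)
        if target is None:
            continue  # this step has no data for the tier
        if executors[tier](target):
            return tier
    return None


if __name__ == "__main__":
    # Hypothetical step record with all three tiers populated.
    step = {"shortcut": "Ctrl+Enter", "text": "Send", "coordinates": (640, 480)}
    executors = {
        "shortcut": lambda keys: False,  # pretend the shortcut is disabled
        "text": lambda label: True,      # pretend the label was found on screen
        "coordinates": lambda xy: True,
    }
    print(execute_with_fallback(step, executors))
```

Ordering the tiers this way means the coordinate click, which is the most sensitive to layout changes, is only used when the other two fail.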

FAQs

The Answers You’re Looking For

Quick answers about the Grounding MCP Server.

Maximize your agent performance