Appium MCP XSS lets a test app hijack your AI agent


By Noam Alum · Jun 20, 2026 · 07:10 UTC · Security news · 4 min read

Appium is one of the most widely used tools for automating tests on phones, and its official server for the Model Context Protocol (MCP) lets an AI agent drive those tests in plain language. A flaw disclosed on June 19 turns that convenience into a foothold: the mobile app being tested can inject code that runs inside the agent's interface and then calls the agent's own tools. The data the agent was sent to inspect becomes the code that controls it.

The bug, tracked as GHSA-x975-rgx4-5fh4, carries a CVSS score of 8.2 and affects appium-mcp versions 1.85.9 and earlier. The fix landed in 1.85.10, and the current release is 1.86.1. No CVE has been assigned yet. The research group EQSTLab reported it, and it sits in the open vulnerability database as a high-severity cross-site scripting (XSS) issue.
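Checking an installed version against the fixed one is easy to get wrong with plain string comparison, because "1.85.9" sorts after "1.85.10" as text. Here is a minimal sketch of a numeric comparison. The names isAffected, compareVersions, and FIXED_VERSION are hypothetical and not part of appium-mcp, and the code assumes plain major.minor.patch strings with no pre-release suffix.

```typescript
// Hypothetical helper for illustration; not part of appium-mcp or Appium.
// Assumes "major.minor.patch" strings with no pre-release or build suffix.
const FIXED_VERSION = "1.85.10";

function parseVersion(version: string): number[] {
  return version.split(".").map((part) => Number.parseInt(part, 10));
}

// Returns -1, 0, or 1, comparing each numeric component in turn.
function compareVersions(a: string, b: string): number {
  const left = parseVersion(a);
  const right = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = (left[i] ?? 0) - (right[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// True when the installed version predates the release that carries the fix.
function isAffected(installed: string): boolean {
  return compareVersions(installed, FIXED_VERSION) < 0;
}
```

With this sketch, isAffected("1.85.9") is true, while isAffected("1.85.10") and isAffected("1.86.1") are false.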

How can a tested app run code in the agent?

The app under test controls its own on-screen text and element attributes. Appium reads those values from the page source, and the server pastes them unescaped into the small interface the AI client renders. A crafted attribute, such as an image tag carrying an error handler, executes the moment the client draws the panel. No exploit toolkit is required, only control of what the app shows.

The weak spot is a function called createLocatorGeneratorUI. When an agent asks Appium to suggest selectors for on-screen elements, that function builds a panel listing each element's text, content description, resource ID, and the generated selector. In affected versions it dropped those values straight into an HTML template without escaping any of them.
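The pattern is easier to see side by side. The sketch below is not the appium-mcp source: ElementInfo, renderRowUnsafe, renderRowSafe, and escapeHtml are hypothetical names, and the row layout is assumed. It only illustrates how app-controlled strings turn into markup when interpolated raw, and how escaping keeps them as text.

```typescript
// Hypothetical shape of one row in a locator panel; field names are assumed.
interface ElementInfo {
  text: string;
  contentDesc: string;
  resourceId: string;
  selector: string;
}

// Vulnerable pattern: app-controlled strings are interpolated as markup.
function renderRowUnsafe(el: ElementInfo): string {
  return `<tr><td>${el.text}</td><td>${el.contentDesc}</td><td>${el.resourceId}</td><td>${el.selector}</td></tr>`;
}

// Escapes the characters that are significant in HTML text and quoted attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Same row, with every app-controlled value escaped before it reaches the template.
function renderRowSafe(el: ElementInfo): string {
  const cells = [el.text, el.contentDesc, el.resourceId, el.selector]
    .map((value) => `<td>${escapeHtml(value)}</td>`)
    .join("");
  return `<tr>${cells}</tr>`;
}

// An element whose visible text is an image tag carrying an error handler.
const hostile: ElementInfo = {
  text: '<img src=x onerror="alert(1)">',
  contentDesc: "Login",
  resourceId: "com.example:id/login",
  selector: '//android.widget.Button[@text="Login"]',
};
```

Passing hostile to renderRowUnsafe yields a live img element in the panel, while renderRowSafe emits the same characters as inert text beginning with &lt;img.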

The postMessage bridge is the real payload

Injected script alone would be a content problem. What makes this serious is where the script can reach. MCP UI resources render inside the client and are allowed to talk back to it with window.parent.postMessage. That channel is how a legitimate panel asks the client to run a tool. Borrowed by injected script, it becomes a way to invoke any tool the agent has registered: take a screenshot, read the page source, or anything else the host exposes. The call goes through without a human approving it.
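A toy model shows why the host cannot tell the two callers apart. Everything here is an assumption made for illustration: the ToolRequest shape, the createNaiveHost function, and the get_page_source tool name are placeholders, and real MCP clients may structure the bridge differently. The handler stands in for the host's message listener.

```typescript
// Assumed message shape for a panel asking its host to run a tool.
// Real MCP clients may use different field names.
interface ToolRequest {
  type: "tool";
  payload: { toolName: string; params: Record<string, unknown> };
}

type Tool = (params: Record<string, unknown>) => string;

// Hypothetical host that runs whatever registered tool a rendered panel requests.
function createNaiveHost(tools: Map<string, Tool>) {
  const calls: string[] = [];

  // Stand-in for the host's "message" event listener.
  function onPanelMessage(message: ToolRequest): string | undefined {
    const tool = tools.get(message.payload.toolName);
    if (tool === undefined) return undefined;
    calls.push(message.payload.toolName);
    return tool(message.payload.params);
  }

  return { calls, onPanelMessage };
}

const host = createNaiveHost(
  new Map<string, Tool>([
    ["generate_locators", () => "locators"],
    ["get_page_source", () => "<hierarchy/>"],
  ]),
);

// Sent by the panel's own code, for example when a user clicks a button.
const fromPanel: ToolRequest = {
  type: "tool",
  payload: { toolName: "generate_locators", params: {} },
};

// Sent by script injected through an unescaped value: same shape, same channel.
const fromInjected: ToolRequest = {
  type: "tool",
  payload: { toolName: "get_page_source", params: {} },
};

host.onPanelMessage(fromPanel);
host.onPanelMessage(fromInjected);
```

Both requests are dispatched, because nothing in the message distinguishes code the server shipped from code the tested app smuggled in.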

If the agent's host also exposes file or shell utilities, the blast radius grows from data theft to lateral movement on the developer's machine. In a continuous integration pipeline, where these agents increasingly run unattended, a single hostile test target could reach whatever that runner can reach.

Why the usual XSS math undersells this

The score lists user interaction as required. For a person clicking around a page, that caveat is real and it lowers the risk. For an agent, the interaction is just the agent doing the job it was told to do: calling generate_locators on whatever app it was pointed at. An autonomous testing loop trips the trigger on its own, with no one watching. The human-in-the-loop assumption baked into most cross-site scripting ratings does not survive contact with software that acts by itself.

There is a second lesson in the patch. A neighboring function in the same file, createPageSourceInspectorUI, already escaped its input. One function was hardened and the one beside it was not. Manual escaping is a coin flip at scale: it holds only if every author remembers it at every call site, and here one call site was missed. The durable fixes are escaping that is applied automatically by context, or UI resources that carry no executable markup at all.
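One way to make escaping the default is a tagged template that treats literal parts as trusted markup and escapes every interpolated value. The sketch below is a narrow illustration, not the patch that shipped: html and escapeHtml are hypothetical names, and the escaping covers only HTML text and quoted attribute values, not script, style, or URL contexts that a full context-aware templating engine would handle.

```typescript
// Escapes the characters that are significant in HTML text and quoted attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Tagged template: literal segments pass through, every interpolation is escaped.
// Authors cannot forget to escape, because there is no unescaped path.
function html(strings: TemplateStringsArray, ...values: unknown[]): string {
  let out = strings[0] ?? "";
  for (let i = 0; i < values.length; i++) {
    out += escapeHtml(String(values[i])) + (strings[i + 1] ?? "");
  }
  return out;
}

// Usage: the attacker-influenced value is neutralized without an explicit call.
const elementText = '<img src=x onerror="alert(1)">';
const cell = html`<td>${elementText}</td>`;
```

The design choice is that safety no longer depends on each function's author: a template written with html is escaped by construction.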

The trust boundary that moved

In ordinary Appium use, the app under test is the target. It has no special standing and no path to the machine driving it. Wire the same setup to an AI agent and the relationship inverts. The app's strings are now input to code the agent renders and to tools the agent can fire. Anyone who can influence what appears on the screen, whether through a shared test build, a third-party SDK, or a web view loading remote content, gets a say in what the agent does next.

What to do now

Update appium-mcp to 1.85.10 or later; 1.86.1 is current. If you cannot update right away, do not point the MCP server at apps you do not fully control, and avoid running the locator generator against untrusted screens. Teams standing up MCP servers more broadly should treat every rendered resource as attacker-influenced input: sandbox the renderer, escape output by context, and limit which tools a UI resource may call rather than trusting the panel's origin.
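Limiting which tools a UI resource may call can be sketched as a small gate in front of the host's tool dispatcher. The names here (PanelPolicy, authorize, locatorPanelPolicy) and the tool names highlight_element and run_shell are hypothetical, and the approve callback is a stand-in for whatever confirmation prompt a real client would show.

```typescript
interface ToolRequest {
  toolName: string;
  params: Record<string, unknown>;
}

// Per-panel policy: which tools this UI resource may request, and who confirms.
interface PanelPolicy {
  allowedTools: ReadonlySet<string>;
  approve: (request: ToolRequest) => boolean;
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// Deny unless the tool is on the panel's allowlist and the request is approved.
function authorize(request: ToolRequest, policy: PanelPolicy): Decision {
  if (!policy.allowedTools.has(request.toolName)) {
    return {
      allowed: false,
      reason: `tool "${request.toolName}" is not on this panel's allowlist`,
    };
  }
  if (!policy.approve(request)) {
    return { allowed: false, reason: "approval was declined" };
  }
  return { allowed: true };
}

// Hypothetical policy for a locator panel: one low-risk tool, nothing else.
const locatorPanelPolicy: PanelPolicy = {
  allowedTools: new Set(["highlight_element"]),
  approve: () => true, // stand-in for a real confirmation prompt
};

const shellAttempt = authorize({ toolName: "run_shell", params: {} }, locatorPanelPolicy);
const highlight = authorize({ toolName: "highlight_element", params: {} }, locatorPanelPolicy);
```

In this sketch shellAttempt is denied and highlight is allowed, so injected script inherits only the narrow authority the panel was granted rather than every tool the agent holds.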

This is the same shape as other recent agent bugs. A rigged document drove Langflow into full server takeover, and a poisoned search result steered Microsoft 365 Copilot into leaking data. AI development tooling is a target in its own right now, from plugins that quietly steal AI API keys to test harnesses that hand control to whatever app is on the screen. For an AI agent, the content it reads is an attack surface, and the tools it holds are the payload.

Frequently asked questions

What is GHSA-x975-rgx4-5fh4?

GHSA-x975-rgx4-5fh4 is a high-severity cross-site scripting flaw in appium-mcp, the official Appium server for the Model Context Protocol. It scores 8.2 on CVSS and lets a malicious app under test inject code that runs inside an AI agent's interface and calls the agent's tools.

Which versions of appium-mcp are affected?

Versions 1.85.9 and earlier are vulnerable. The fix shipped in 1.85.10, and 1.86.1 is the current release. There is no CVE assigned yet. Upgrading to any version at or above 1.85.10 removes the flaw.

How is the vulnerability exploited?

An attacker who controls an app's on-screen text or element attributes plants an HTML or script payload. When the agent calls the locator generator, the server renders those values without escaping them, the script runs in the client, and it invokes the agent's tools through postMessage.

Is this being exploited in the wild?

No public reports describe active exploitation as of June 20, 2026. The research group EQSTLab disclosed the flaw, and it was patched the same day it became public. The risk is real for anyone pointing the MCP server at apps they do not control.

What should teams running MCP servers do?

Update appium-mcp to 1.85.10 or later first. More broadly, treat every resource an MCP server renders as attacker-influenced input: sandbox the renderer, escape output by context, and restrict which tools a UI resource is allowed to call rather than trusting its origin.
