Home/ Blog/ Security news/ Article
Blog · Security news

A rigged puzzle talked six AI browsers into leaking a developer's SSH keys. One patch won't save you.

A game-themed web page talked six AI browsers into leaking a developer's SSH keys. Why patching one vendor is not the fix, and what to watch instead.

AI browser window as a puzzle board leaking a stream of small keys

Give an AI browser your logged-in sessions and you have not added a helper. You have added a second identity that any web page it reads can try to talk into acting for someone else. That is the real lesson of BioShocking, a technique the security firm LayerX published on June 29, 2026. LayerX did not find a bug in one product. It found a way to argue six different AI browsers out of their own safety rules using nothing but a web page dressed as a game, then walk one of them into copying a developer's private keys.

An AI browser here means one you can flip into agent mode, where it clicks and types on your behalf and operates inside whatever accounts you happen to be logged in to. That last part is the whole story.

The trick is an argument, not an exploit

The malicious page is a puzzle themed on the video game BioShock, and it rewards deliberately wrong answers. It insists 2 + 2 = 5 and treats that as the winning move. Once the agent accepts that the rules of this little world are inverted, it stops applying normal safety judgment and starts following the game. LayerX frames the root cause plainly: the page's content and the user's own instructions reach the model as one stream of text, so the agent has no reliable way to separate a trusted command from attacker text sitting in the page it is reading.

There is no memory corruption here, no unpatched CVE, no binary payload. The exploit is a few paragraphs of English. That is what makes it hard to reason about with the usual patch-and-move-on reflex, and it is the same failure we wrote up when a crafted link turned Microsoft 365 Copilot into a one-click data thief in SearchLeak.

What it actually stole

After the agent won the rigged puzzle, LayerX told it to open a page named /code and copy the contents of a text box. That page redirected to the victim's authenticated work GitHub repository, and the agent pulled the SSH login credentials straight out and handed them back. It treated the theft as one more step in the game and celebrated finishing.

Sit with that for a second. The credentials were reachable only because a human had already signed the browser into GitHub. The agent inherited that session and none of the human's suspicion. An agent in this mode is a confused deputy with your privileges and no instinct that anything is wrong. Anything the human authenticated to, the agent can reach, and it will do so the instant a rendered page tells it to. The attack surface is not the model's reasoning. It is the standing access you handed the model.

Six browsers, one class of problem, no clean fix

LayerX ran its disclosures to the affected vendors across late 2025 into January 2026, and the responses are the part defenders should read closely.

How each vendor responded to LayerX's BioShocking disclosure. Source: LayerX.

Only OpenAI shipped a real fix, in ChatGPT Atlas. Perplexity marked the Comet report resolved and changed nothing. Three of the smaller names, Fellou, Genspark, and Sigma, never replied at all. Anthropic did push a patch to its Claude extension, but by LayerX's account it broke again the moment they retested it.

Read that as a defender, not a scorecard. Indirect prompt injection is a class of problem, not a single bug. The 2-plus-2-equals-5 framing is one way to invert the rules, and there are unlimited others. Every vendor filter that blocks the last phrasing is a guess at the next one, which is exactly why a patched extension can fail on the second try. You cannot patch a tool out of trusting the text it was built to read, which is the same wall the industry hit with AI coding agents that ran the commands they had just refused.

This is the third time we have watched this exact failure

We have covered this shape enough now to name it. In AutoJack, a single web page turned a local AI agent's trust in its own machine into a takeover. In a clean repository, the code was fine and the AI coding agent was the thing that handed an attacker a shell. In JetBrains plugins, the AI tooling itself carried the credential theft.

The through-line does not change. The AI tool holds legitimate access, and content it merely reads becomes the exfiltration channel. BioShocking is the browser version. The novelty is not the puzzle. It is that a tool holding your live sessions will act on a stranger's instructions and believe it is being helpful.

Keep agent mode away from anything it can spend

Treat an agentic browser as an unmanaged identity, not as a browser. If it can reach corporate GitHub, a cloud console, email, or a secrets manager on a saved session, assume any site it visits can reach those too. The first control is to not sign it into them.

Because there is no CVE and no patch cadence to track, detection has to be behavioral. Watch egress for an AI-browser process auto-navigating to credential-bearing destinations it has no task reason to touch, and for outbound requests that carry secret-shaped data right after the agent visited an untrusted page. The victim endpoint shows no malware, so the signal lives in the network, not on disk.

Scope the session. Where a vendor offers per-action confirmation or hard access limits, turn them on, and revoke the browser's access to a sensitive account the moment the task that needed it is done. And do not read "our vendor patched it" as remediation, because LayerX's retest of a patched product still worked. The durable control is architectural: keep the agent's standing access small enough that winning an argument with it is not worth an attacker's time.

LayerX found no sign anyone is using BioShocking in the wild yet, and there is no CVE to assign. That is the window, not the all-clear. The proof of concept is public, the technique is a paragraph of text, and the tools it targets are being handed real corporate sessions right now.

AI browserVendorResponse to LayerX
ChatGPT AtlasOpenAIFixed
CometPerplexityClosed the report, no fix
Claude browser extensionAnthropicPatched, but the fix broke on LayerX's retest
FellouFellouNo response
GensparkGensparkNo response
SigmaSigmaNo response
Topics

Frequently asked questions

What is the BioShocking attack?

BioShocking is a technique from security firm LayerX that uses a puzzle-styled web page to talk AI browsers out of their safety rules.

The page rewards deliberately wrong answers, and once the agent accepts the inverted logic, it follows attacker instructions embedded in the page rather than its normal guardrails.

Which AI browsers are affected?

LayerX reproduced the attack against six agentic browsers and assistants: OpenAI's ChatGPT Atlas, Perplexity's Comet, Anthropic's Claude browser extension, Fellou, Genspark, and Sigma.

Only ChatGPT Atlas was confirmed fixed at the time of disclosure.

Is BioShocking being exploited in the wild?

No. LayerX published BioShocking on June 29, 2026 as a proof of concept, and no CVE has been assigned.

There is no evidence of real-world use yet, but the technique is public and needs no special tooling to reproduce.

How did the attack steal a developer's credentials?

After the agent solved the rigged puzzle, LayerX instructed it to open a page named /code and copy a text box.

That page redirected to the victim's logged-in GitHub repository, and the agent extracted the SSH login credentials and returned them to the attacker.

Did the vendors fix BioShocking?

The response was uneven.

OpenAI fixed it in ChatGPT Atlas. Perplexity closed the report on Comet without acting, and Fellou, Genspark, and Sigma did not respond. Anthropic patched its Claude extension, but LayerX says the fix did not hold on retest.

How can a security team defend against agentic-browser prompt injection?

Treat the AI browser as an unmanaged identity.

Keep it signed out of corporate GitHub, cloud consoles, email, and secrets managers, since any page it visits can reach whatever it can. With no CVE to patch, rely on egress monitoring and per-action limits.

Ready to meet the Guardians?

Deploys fast - agentless for monitoring and cloud, a lightweight agent for deep endpoint security. Just Suriq, standing watch.