General Tech

Claude Code is steganographically marking requests

By kirushik

hacker-news5h ago · Jun 30, 20264 min read

Image courtesy of hacker-news

Article URL: https://thereallo.dev/blog/claude-code-prompt-steganography Comments URL: https://news.ycombinator.com/item?id=48734373 Points: 676 # Comments: 202

I was inspecting Claude Code for privacy reasons.

Most devs give their harnesses ridiculous access. FS, shell, git, browser access, even computer use nowadays. That is the whole point. They need enough context to do useful work.

That also means the client itself deserves scrutiny. If a coding agent can read your repo and run commands, the binary that ships it should be boring (ƒor example, pi harness)

So I took a look at my local Claude Code (2.1.196) install.

Inside the Claude Code binary, there is a function that changes the current date string inserted into the system prompt.

The normal string looks like this:

Claude Code can silently change two things:

The apostrophe in Today's
The date separator, from - to /

Here is the relevant code, cleaned up from the minified bundle:

This is prompt steganography, a technique used to hide data in plain sight.

The visible sentence still reads like a normal date. The model and the user see something boring. The raw request contains a marker.

The trigger is ANTHROPIC_BASE_URL, Claude Code's API base URL override.

Then it checks if:

the system timezone is Asia/Shanghai or Asia/Urumqi
the API base URL hostname matches a decoded domain list
the hostname contains specific AI lab keywords

The timezone check changes:

into:

The hostname check changes the apostrophe:

Condition	Apostrophe
Normal	`'`
Known domain	`\u2019`
Lab keyword	`\u02BC`
Known domain and lab keyword	`\u02B9`

These are visually tiny changes you would never notice in most mono fonts.

The domain and keyword lists are stored as base64 strings and XOR-decoded with key 91.

The decoded lab keyword list is:

The decoded domain list is much larger. It contains Chinese corporate domains, AI company domains, and a lot of proxy / reseller / gateway domains.

Some examples:

The date function is used when building the agent context:

So the marker becomes part of the system context sent to the model. (Where Anthropic probably parses in their backend)

My installed binary is signed by Anthropic:

My current shell had ANTHROPIC_BASE_URL unset, and my timezone was:

So on my machine, under my current environment, this path would produce the normal apostrophe and the normal YYYY-MM-DD date string.

Anthropic probably wants to detect API resellers, unauthorized Claude Code gateways, and model "distillation attack" pipelines. A custom ANTHROPIC_BASE_URL pointing at a known reseller domain is a useful signal. A hostname containing deepseek or zhipu is also a useful signal.

That part makes sense, but the implementation is weird.

CC silently alters the system prompt using invisible-ish Unicode markers. It encodes proxy / gateway classification into a sentence that looks like plain English. It hides the domain list behind XOR and base64. This is not a malicious feature, but it is a weird choice for a developer tool that asks for trust.

Coding agents already live on the wrong side of a scary boundary. They can inspect code, summarize secrets by accident, run commands, install packages, edit files, and push commits on your local machine. Most developers accept that because the productivity gain is worth the risk.

Trust from real developers depends on the boring behavior.

If the client wants to detect custom API gateways, it can say so plainly. It can send an explicit telemetry field with documentation. It can make the policy visible. It can put the behavior in release notes.

Hiding the signal in the system prompt makes every other privacy claim harder to believe.

For most users, this path probably stays inactive.

If you are using the official Anthropic API endpoint, Crt() returns early. If ANTHROPIC_BASE_URL is unset, Crt() returns early. If you are using a normal setup, the date prompt stays "boring".

The interesting case is people routing CC through a custom base URL. That includes:

Internal gateways
Local proxies
Model routers
Resellers
Research setups

In that case, Claude Code classifies the hostname and encodes the result into the prompt.

The bypass is also trivial. Change hostname, change timezone, patch the binary, wrap the process. Any serious adversary can make this signal useless.

So the feature mostly punishes the exact people who are easier to fingerprint: normal developers doing weird but legitimate things.

I think this could have been explicit.

Developer tools can enforce terms. API providers can detect abuse. Companies can protect their models.

When a tool with filesystem and shell access starts hiding classification bits inside invisible prompt punctuation, the correct reaction is scrutiny.

Trust is earned in the boring parts.

Claude Code is steganographically marking requests

Claude Sonnet 5

Claude Science

Trump's plan to redesign every .gov website leads to AI-designed horrors

Keep reading

Claude Sonnet 5

Claude Science

Trump's plan to redesign every .gov website leads to AI-designed horrors