Observability is not surveillance

How we architected HeimWall so the manager sees signal, not content, and why that choice is structural, not cosmetic.

The HeimWall team · April 20, 2026 · 6 min read

Every AI coding governance product has to answer the same question on its first slide: what exactly does the manager get to see? The honest answers cluster into three buckets, and the bucket you choose determines everything about the product that comes after.

The first bucket is full content, full retention. Every prompt and every response gets captured and stored. The manager can browse them. This is what classical SIEM vendors sell, with "AI coding" retrofitted. It is also what engineers immediately route around. If your team believes every keystroke is archived and searchable, one of two things happens: they stop using the tool for anything interesting, or they move that work to a personal machine.

The second bucket is no content, no capture. The agent is a metrics emitter. You get counts, timestamps, latencies. This is what some privacy-first tools ship, and it has the opposite problem. When a real incident happens (an API key goes out, customer data leaks), you cannot investigate, because you captured nothing to investigate. You are compliant and blind.

The third bucket is what we chose: signal, not content, by default, with a gated break-glass for incidents. It is the harder architecture. It is also the only one that lets engineers keep trusting the tool and lets managers keep owning the outcome. Here's how the three pillars of that architecture actually work.

Pillar 1: Detection at the source

Every prompt gets inspected on the engineer's own machine, before anything leaves the process. The HeimWall agent is a small native Rust + Tauri app; the shipping macOS build is about 7 megabytes. It runs locally alongside Cursor, Claude Code, Copilot, and Windsurf. When a prompt is composed, the agent runs a deterministic rule engine: dozens of hand-written, individually tested rules that combine regular expressions with context heuristics to catch things like AWS keys, database connection strings, and JWTs. The architecture also includes a second, feature-gated semantic tier (a small ONNX-quantized classifier for fuzzier categories like proprietary business logic), which is on the roadmap; the deterministic tier is what ships today.
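To make "regex + context heuristics" concrete, here is a minimal Python sketch of a deterministic rule tier. It is an illustration, not the shipping Rust agent: the rule names, the three patterns, the severity assignments, and the `scan` function are all assumptions chosen for the example, and real rules would be far more thorough.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, NamedTuple, Optional


class Finding(NamedTuple):
    rule: str
    category: str
    severity: str


@dataclass(frozen=True)
class Rule:
    name: str
    category: str
    severity: str
    pattern: re.Pattern
    # Optional context heuristic: return False to suppress a regex hit.
    context_ok: Optional[Callable[[str, re.Match], bool]] = None


def not_docs_placeholder(prompt: str, match: re.Match) -> bool:
    # AWS documentation uses a sample key ending in EXAMPLE; treat it as harmless.
    return not match.group(0).endswith("EXAMPLE")


# Hypothetical rules, illustrative only.
RULES = [
    Rule("aws_access_key_id", "Secret", "Critical",
         re.compile(r"\bAKIA[0-9A-Z]{16}\b"), not_docs_placeholder),
    Rule("db_connection_string", "Secret", "High",
         re.compile(r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:@/]+:[^\s@/]+@[^\s/]+")),
    Rule("jwt", "Secret", "High",
         re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
]


def scan(prompt: str) -> List[Finding]:
    """Run every rule over the prompt; same input always gives same output."""
    findings = []
    for rule in RULES:
        for match in rule.pattern.finditer(prompt):
            if rule.context_ok is None or rule.context_ok(prompt, match):
                findings.append(Finding(rule.name, rule.category, rule.severity))
                break  # one hit per rule is enough to classify the prompt
    return findings
```

Note that `scan` returns only rule names, categories, and severities. The matched text itself never appears in a finding.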

The whole pipeline is engineered against a hard 50-millisecond p95 budget on Apple Silicon. We obsess over that number because it's the only one that matters for engineer trust. Over 50ms, engineers feel the lag and the tool becomes something to work around. Under 50ms, they stop noticing.
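For readers unfamiliar with the metric, a p95 budget means 95 percent of measured runs must finish within the limit. A small sketch of that check, assuming the nearest-rank percentile definition and a list of latency samples in milliseconds (the function names are hypothetical, and how HeimWall actually collects its samples is not shown here):

```python
BUDGET_MS = 50.0  # the budget quoted above


def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples_ms, budget_ms=BUDGET_MS):
    return p95(samples_ms) <= budget_ms
```

With 20 samples, one slow outlier still passes, because the 19th-ranked sample decides the result. Two slow outliers fail.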

What the agent produces is a classification record: category (Secret / PII / ProprietaryCode / CustomerData / Other), severity (Info → Low → Medium → High → Critical), a SHA-256 hash of the prompt for deduplication, a character count, and the originating tool. What it does not produce is the prompt body. That stays in process memory, and when the user hits send, it goes where they intended (to Cursor's cloud, to Anthropic, to GitHub) exactly as it would have without HeimWall installed. The agent is an observer, not a proxy.
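The record described above can be sketched as a small immutable type. The field and function names below are assumptions made for illustration; only the five fields themselves, the category list, and the severity ladder come from the description.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum, IntEnum


class Category(Enum):
    SECRET = "Secret"
    PII = "PII"
    PROPRIETARY_CODE = "ProprietaryCode"
    CUSTOMER_DATA = "CustomerData"
    OTHER = "Other"


class Severity(IntEnum):
    # IntEnum so severities compare in ladder order.
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class ClassificationRecord:
    category: Category
    severity: Severity
    prompt_sha256: str  # hex digest, used for deduplication
    char_count: int
    tool: str
    # Deliberately no field for the prompt body.


def build_record(prompt: str, category: Category, severity: Severity,
                 tool: str) -> ClassificationRecord:
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return ClassificationRecord(category, severity, digest, len(prompt), tool)
```

Two identical prompts yield the same `prompt_sha256`, which is what makes deduplication possible without storing the text.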

Pillar 2: Redaction before transport

The classification record travels to our cloud over TLS 1.3. What travels is metadata. What doesn't travel is content.

We went further than "don't send the body." We built the pipeline so that sending the body is structurally difficult. The on-device agent has no code path that transmits raw prompt text to our API; the API route that would accept it doesn't exist. If we wanted to ship a build that exfiltrates prompt content, we'd have to add a new network capability, a new backend endpoint, and a new storage schema. It's not a configuration away. It's a version-bump-and-design-doc away.
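One way to picture a closed wire format is a serializer that knows exactly five fields and refuses anything else. This is a hedged Python sketch, not HeimWall's transport code; `ALLOWED_FIELDS` and `encode_for_transport` are hypothetical names, and a runtime guard like this is weaker than simply having no endpoint that accepts a body.

```python
import json

# The only fields the (hypothetical) wire schema knows about.
ALLOWED_FIELDS = frozenset(
    {"category", "severity", "prompt_sha256", "char_count", "tool"}
)


def encode_for_transport(record: dict) -> str:
    """Serialize a classification record, rejecting any unknown field."""
    extra = set(record) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"refusing to send unexpected fields: {sorted(extra)}")
    missing = ALLOWED_FIELDS - set(record)
    if missing:
        raise ValueError(f"record is missing fields: {sorted(missing)}")
    return json.dumps(record, sort_keys=True)
```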

We made this choice because "trust us, we only send metadata" is not credible to a paying security buyer. "Here's the architecture. There is no code path from the prompt body to our cloud" is credible.

Pillar 3: Investigation Mode, audit-logged

Sometimes a real incident happens. A customer calls and says an engineer accidentally pasted the source for their crown-jewel algorithm into an unverified AI tool. You need to see the text to know what leaked. That's what Investigation Mode exists for, and that's the only thing it exists for.

Triggering Investigation Mode on a specific engineer requires four things, all of them mandatory. First, a second-factor step-up from the requesting admin. Second, a written justification of at least 50 characters: "Investigating potential IP leak per ticket INC-4412" is fine; "looking around" is not. Third, automatic notification to the affected engineer within one hour, by email and dashboard alert, no exceptions. Fourth, a 24-hour time-box, at the end of which access expires and has to be renewed with a new request.
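The four gates can be sketched as a single function that either returns a time-boxed grant or refuses. This is a minimal illustration under stated assumptions: `open_investigation`, `InvestigationGrant`, and the `step_up_verified` flag are hypothetical names, and the actual second-factor check and notification delivery are out of scope.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_JUSTIFICATION_CHARS = 50
NOTIFY_WITHIN = timedelta(hours=1)
ACCESS_WINDOW = timedelta(hours=24)


@dataclass(frozen=True)
class InvestigationGrant:
    engineer_id: str
    justification: str
    granted_at: datetime
    notify_engineer_by: datetime  # deadline for email + dashboard alert
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        return self.granted_at <= now < self.expires_at


def open_investigation(engineer_id: str, justification: str,
                       step_up_verified: bool, now: datetime) -> InvestigationGrant:
    # Gate 1: the requesting admin must have passed a second-factor step-up.
    if not step_up_verified:
        raise PermissionError("second-factor step-up required")
    # Gate 2: a written justification of at least 50 characters.
    if len(justification.strip()) < MIN_JUSTIFICATION_CHARS:
        raise ValueError("justification must be at least 50 characters")
    # Gates 3 and 4: notification deadline and 24-hour expiry are set
    # unconditionally; the caller cannot opt out of either.
    return InvestigationGrant(
        engineer_id=engineer_id,
        justification=justification,
        granted_at=now,
        notify_engineer_by=now + NOTIFY_WITHIN,
        expires_at=now + ACCESS_WINDOW,
    )
```

The design point is that the notification deadline and the expiry are not parameters. A caller can supply a better justification, but cannot ask for a longer window or a quieter investigation.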

Every single action is audit-logged and exportable. The audit log is append-only and signed. If a customer's legal team ever needs to prove a specific investigation was conducted with cause, the log proves it. If a customer's engineering team ever needs to prove they weren't secretly surveilled, the log proves that too.
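One common way to get "append-only and signed" is a hash chain: each entry's signature covers the previous entry's signature, so editing or dropping any entry breaks verification. The sketch below uses an HMAC from Python's standard library as a stand-in for whatever signature scheme is actually used, which this post does not specify. The class and field names are hypothetical, and a deployment that needs third-party verifiability would more likely use an asymmetric signature.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous signature" for the first entry


def _sign(key: bytes, body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


class AuditLog:
    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self._entries = []  # only ever appended to

    def append(self, actor: str, action: str, detail: str) -> dict:
        prev = self._entries[-1]["signature"] if self._entries else GENESIS
        body = {"seq": len(self._entries), "actor": actor,
                "action": action, "detail": detail, "prev": prev}
        entry = dict(body, signature=_sign(self._key, body))
        self._entries.append(entry)
        return dict(entry)

    def export(self) -> list:
        # Return copies so callers cannot mutate the log in place.
        return [dict(e) for e in self._entries]


def verify(entries: list, key: bytes) -> bool:
    """Check sequence numbers, chain links, and every signature."""
    prev = GENESIS
    for i, entry in enumerate(entries):
        body = {k: v for k, v in entry.items() if k != "signature"}
        if body.get("seq") != i or body.get("prev") != prev:
            return False
        if not hmac.compare_digest(_sign(key, body), str(entry.get("signature", ""))):
            return False
        prev = entry["signature"]
    return True
```

An exported log verifies as a whole or not at all, which is the property that lets the same artifact serve both the legal team and the engineers.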

The non-dismissible banner

One more choice deserves a paragraph.

The manager dashboard we are building carries a yellow banner at the top of every page that reads: "Safety scores and flag history are not to be used in performance reviews, promotion decisions, or compensation decisions. This is a contractual term." The banner is not dismissible. It is not a toast that disappears. It is rendered server-side into every route, and there is no admin setting to turn it off.

We are doing this for two reasons. One, it keeps the sales conversation honest. Every VP Eng who demos HeimWall will see that banner in the first three seconds, and they can ask the question on the spot: "Is this in the contract?" The answer is yes. Two, it keeps the product honest. We cannot wake up in six months and decide surveillance features would 3x our ARR. The banner is a promise to our customers' engineers, rendered in the same pixels as the data those customers pay to see. Removing it would be a product regression we'd have to explain.

Signal, not content

The word "observability" is borrowed from distributed systems. The deal there is: you don't read every log line, you watch metrics and traces, and you drill into the individual record only when a metric spikes. Stable systems are summarized; anomalies are inspected.

We think the same deal works for AI coding. You don't need to read every prompt your team sends to Cursor this week. You need to know that secret-category flags are trending up in the data-platform squad, that the on-call rotation has been pasting customer records into Claude Code twice a day since Monday, and that one specific engineer had a single Critical-severity event yesterday worth a coaching conversation.

That's signal. That's what HeimWall shows you. Everything else we architected away, on purpose.