The Guardrails

How to design for a system that will inevitably lie to you.

Design Systems · December 2025 · 9 min read

Your AI is going to fail. It’s going to lie.

It's going to hallucinate. It's going to try to sell a Chevy Tahoe for $1. It's going to tell a customer to leave their husband. (These are all real examples.)

In the deterministic era, we designed for "happy paths." In the probabilistic era, we have to design for containment. We have to build Guardrails.

Guardrails aren't about limiting your AI. They're about earning trust. Users who trust your AI will use it more. And they'll only trust it if they know it can't do catastrophic things.

Types of Design Guardrails

1. The Input Guardrail (The Bouncer)

Stop the bad request before it hits the model.

Pattern: Intent Classification. Before the user's prompt goes to the LLM, run it through a tiny, fast classifier that checks: "Is this safe? Is this on-topic?"

UX Implication: If the user asks a finance bot for medical advice, the UI immediately says, "I can only help with finance questions," without even waking up the big model.
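A minimal sketch of this gate, assuming a keyword heuristic stands in for the small classifier (a real system would use a fast fine-tuned model; the topic list and wording are illustrative):

```python
# Input guardrail: classify intent before the prompt ever reaches the LLM.
ALLOWED_TOPICS = {"finance"}  # assumed scope for a finance bot

def classify_intent(prompt: str) -> str:
    """Stand-in classifier: keyword heuristic.
    A production system would use a small, fast model here."""
    finance_terms = ("stock", "budget", "invest", "loan", "savings")
    if any(term in prompt.lower() for term in finance_terms):
        return "finance"
    return "off_topic"

def handle(prompt: str, call_llm) -> str:
    """Only wake the big model when the bouncer lets the request in."""
    if classify_intent(prompt) not in ALLOWED_TOPICS:
        return "I can only help with finance questions."
    return call_llm(prompt)
```

The key property is ordering: the rejection path returns before `call_llm` is ever invoked, so off-topic requests cost nothing.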

2. The Process Guardrail (The Chaperone)

Don't let the AI work alone on high-stakes decisions.

Pattern: Human-in-the-Loop. If the AI's confidence score is below a threshold, or if the transaction value is above a limit, the UI changes from "Execute" to "Review."

UX Implication: A "Draft Mode" where the AI pre-fills the form but forces the human to click "Submit." The AI proposes; the human disposes.
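The routing logic behind that UI switch can be sketched as a small function; the threshold values here are assumptions, not recommendations:

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.85  # assumed: below this, a human reviews
VALUE_CEILING = 1000.0   # assumed: above this amount, a human reviews

@dataclass
class Proposal:
    action: str
    amount: float
    confidence: float

def ui_mode(p: Proposal) -> str:
    """Return 'execute' only when the model is confident AND the
    stakes are low; otherwise the UI falls back to 'review' and the
    AI's output becomes a pre-filled draft the human must submit."""
    if p.confidence < CONFIDENCE_FLOOR or p.amount > VALUE_CEILING:
        return "review"
    return "execute"
```

Note the asymmetry: either condition alone is enough to force review. Failing safe means the default path is the human one.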

3. The Output Guardrail (The Editor)

Check the work before showing it to the user.

Pattern: Verification Layer. Run AI output through validation before display.

  • If the AI generates code, run a linter. If it fails, regenerate or flag the error.
  • If the AI cites a source, verify the source exists before showing the citation.
  • If the AI provides numbers, cross-check against the actual database.
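The third check above, cross-checking numbers, might look like this sketch; the `key=value` answer format and the dict-as-database are hypothetical stand-ins for whatever structure your output and source of truth actually have:

```python
import re

def verify_numbers(answer: str, database: dict) -> list:
    """Output guardrail: flag any figure in the AI's answer that
    contradicts the source-of-truth database. Returns a list of
    (field, claimed, actual) mismatches; empty means safe to show."""
    mismatches = []
    for key, claimed in re.findall(r"(\w+)=(\d+)", answer):
        actual = database.get(key)
        if actual is not None and actual != int(claimed):
            mismatches.append((key, int(claimed), actual))
    return mismatches
```

If the list is non-empty, the UI regenerates or flags the error instead of displaying the answer, exactly as with the failed linter case.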

4. The Scope Guardrail (The Fence)

Limit what the AI can access and affect.

  • The AI can read your calendar but can't modify it without approval.
  • The AI can draft emails but can't send them.
  • The AI can query the database but can't write to it.
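These fences amount to a capability table with deny-by-default lookup. A minimal sketch, with the policy entries mirroring the bullets above (the resource names and escalation message are illustrative):

```python
READ, WRITE = "read", "write"

# Scope guardrail: what the AI may do without a human (assumed policy).
POLICY = {
    ("calendar", READ): True,
    ("calendar", WRITE): False,  # modifications need approval
    ("email", READ): True,
    ("email", WRITE): False,     # draft only; a human sends
    ("database", READ): True,
    ("database", WRITE): False,
}

def allowed(resource: str, action: str) -> bool:
    # Anything not explicitly granted is denied.
    return POLICY.get((resource, action), False)

def run_tool(resource: str, action: str, fn, *args):
    """Gate every tool call through the policy before executing it."""
    if not allowed(resource, action):
        raise PermissionError(f"AI may not {action} {resource}; escalating to human")
    return fn(*args)
```

Deny-by-default is the design choice that matters: a new tool added without a policy entry is fenced out until someone deliberately opens the gate.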

5. The Rollback Guardrail (The Time Machine)

Make every AI action reversible.

  • "Undo last AI action" button prominently displayed.
  • Automatic snapshots before any AI modification.
  • 24-hour grace period before AI changes become permanent.
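The snapshot mechanic is the core of all three bullets. A minimal in-memory sketch, assuming application state fits in a dict (real systems would snapshot to durable storage and attach timestamps for the grace period):

```python
import copy

class SnapshotStore:
    """Rollback guardrail: deep-copy the state before every AI
    modification so any action can be undone."""

    def __init__(self, state: dict):
        self.state = state
        self.history = []  # stack of pre-modification snapshots

    def ai_modify(self, mutate) -> None:
        # Automatic snapshot before the AI touches anything.
        self.history.append(copy.deepcopy(self.state))
        mutate(self.state)

    def undo(self) -> dict:
        # Powers the "Undo last AI action" button.
        if self.history:
            self.state = self.history.pop()
        return self.state
```

Because the snapshot happens inside `ai_modify`, the AI cannot take an action that skips it; undo is guaranteed by construction, not by convention.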

The Mindset Shift

Engineers optimize for capabilities: "Look what it can do!"

Designers must optimize for safety: "Look what I prevented it from doing."

Guardrails aren't exciting. Neither are seatbelts. But you notice when they're missing.
