In 2015, if you didn't know the difference between a JPG and an SVG, you weren't a serious UI designer. You didn't need to write the compression algorithm, but you needed to know which one to use for a logo and which one for a photo.
In 2025, the material has changed. We aren't sculpting pixels; we're sculpting probability. And if you don't understand the physics of the material, you will design things that break.
The four properties of "AI Clay" you must understand are: Context, Temperature, Latency, and Retrieval.
1. The Context Window (Short-Term Memory)
The most common bad design pattern I see is the "Infinite Memory" fallacy. Designers assume the AI remembers everything. It doesn't.
The Context Window is the amount of information the model can "see" at any one moment. Think of it as the AI's working memory. It's finite—and while it's growing, it's still a hard constraint.
Design Implications:
- Explicit Context: Don't rely on implicit memory. Build UI that lets users "Pin" key facts the AI should always remember.
- Summarization: If a chat gets too long, design a "Summarize & Reset" interaction where the AI compresses the history.
- The "What do you know?" Pattern: A sidebar or panel that shows exactly what documents and context are currently loaded.
- Memory Management UI: Let users add and remove context explicitly.
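The budget math behind these patterns can be sketched in a few lines. This is a minimal illustration, not a real tokenizer or API: `count_tokens` is a crude word-count stand-in, and the function names are my own. The point is the policy: pinned facts always survive, and history is trimmed oldest-first to fit the window.

```python
def count_tokens(text: str) -> int:
    # Stand-in for a real tokenizer; word count is a rough proxy.
    return len(text.split())

def build_context(pinned: list[str], history: list[str], budget: int) -> list[str]:
    """Assemble a prompt: pinned facts first, then as much recent history as fits."""
    context = list(pinned)  # pinned facts are always included
    used = sum(count_tokens(p) for p in pinned)
    kept: list[str] = []
    # Walk history newest-first, keep what fits, then restore chronological order.
    for msg in reversed(history):
        cost = count_tokens(msg)
        if used + cost > budget:
            break  # oldest messages silently fall out of the window
        kept.append(msg)
        used += cost
    return context + list(reversed(kept))
```

Notice that the drop is silent: nothing in the return value tells the user what was forgotten. That is exactly the gap the "What do you know?" panel is meant to close.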
2. Temperature (The Creativity Knob)
Every LLM exposes a setting called Temperature, typically ranging from 0 to 1 (some APIs allow values up to 2).
- Low Temp (0.1 - 0.3): Deterministic, focused, predictable. Good for code generation, data extraction, and following strict rules.
- High Temp (0.7 - 0.9): Creative, varied, unpredictable. Good for brainstorming, writing drafts, and generating options.
The Constraint: Wrong temperature = wrong behavior. High temperature on a data extraction task means the AI might invent data.
Design Implications:
- Map Components to Temperature: a "Brainstorm" button and an "Extract" button should not share the same setting.
- Expose the Knob (Sometimes): power users may want control; most users just need sane defaults per task.
- Set Expectations: label high-temperature output as a draft or an option, not as an answer.
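Mapping components to temperature can be as simple as a preset table. The task names and values below are illustrative assumptions, not a standard; the design idea is that the UI picks the preset, and unknown tasks fail closed to the safest setting.

```python
# Illustrative presets; real values should be tuned per model and task.
TEMPERATURE_PRESETS = {
    "extract": 0.1,     # structured data: be deterministic
    "code": 0.2,        # follow strict rules
    "draft": 0.7,       # prose with some variety
    "brainstorm": 0.9,  # maximize divergence
}

def temperature_for(task: str) -> float:
    # Fail closed: an unknown task gets the lowest (safest) temperature,
    # so a wiring mistake never makes an extraction task inventive.
    return TEMPERATURE_PRESETS.get(task, 0.1)
```

The failure mode this prevents is the one named above: a high-temperature setting silently leaking into a data-extraction path.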
3. Latency (The Speed of Thought)
LLMs are slow relative to traditional software. Generating a full response can take several seconds or more.
The Constraint: The "Spinner" is death. If a user waits 5 seconds for a response with a generic spinner, they perceive the system as broken.
Design Implications:
- Streaming: Don't wait for the whole message. Stream it token by token.
- Skeleton Screens: If you're generating a chart, show the axes and labels immediately while the data loads.
- Thought Bubbles: While the big model is thinking, use a tiny, fast model to show intermediate status.
- Cancel Button: Always. Prominently. If it's taking too long, let them bail.
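The streaming pattern above is easy to show in miniature. This is a sketch under stated assumptions: `fake_stream` stands in for a real model API's streaming endpoint, and `on_token` stands in for whatever repaints the UI. The structural point is that the render callback fires on every token, not once at the end.

```python
from typing import Callable, Iterable, Iterator

def fake_stream(tokens: Iterable[str]) -> Iterator[str]:
    # Stand-in for a streaming model API; a real client yields
    # tokens as they arrive over the network.
    for token in tokens:
        yield token

def render_streaming(stream: Iterator[str], on_token: Callable[[str], None]) -> str:
    """Repaint the UI with partial text on every token instead of waiting."""
    buffer: list[str] = []
    for token in stream:
        buffer.append(token)
        on_token("".join(buffer))  # user sees progress immediately
    return "".join(buffer)
```

A cancel button maps naturally onto this loop: abandoning the iterator mid-stream stops generation, which is much harder to offer when you only render the finished message.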
4. Retrieval (RAG)
The model doesn't know your business. It knows its training data, frozen at a cutoff date. To give it your business knowledge, we use RAG: Retrieval-Augmented Generation.
The Constraint: Retrieval is imperfect. It misses things. It finds the wrong things.
Design Implications:
- Citation UI: Every claim that comes from retrieved data must link to the source document.
- The Source Inspector: Let users see what the AI read before it wrote the answer.
- Negative Feedback Loops: If the AI cites the wrong document, give users a button to say "Wrong Source."
- Retrieval Scope Controls: Let users narrow or expand what the AI searches.
- Empty State Design: What happens when retrieval finds nothing? Don't let the AI make something up.
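Two of these implications, citations and the empty state, can be sketched together. Everything here is a toy assumption: the corpus is a dict, relevance is naive word overlap, and the function names are hypothetical. What matters is the shape of the result: every answer carries its sources, and a retrieval miss is surfaced explicitly rather than papered over.

```python
def retrieve(query: str, corpus: dict[str, str], min_overlap: int = 1) -> list[dict]:
    """Toy retrieval: score documents by word overlap with the query."""
    q_words = set(query.lower().split())
    hits = []
    for doc_id, text in corpus.items():
        overlap = len(q_words & set(text.lower().split()))
        if overlap >= min_overlap:
            hits.append({"source": doc_id, "score": overlap})
    return sorted(hits, key=lambda h: -h["score"])

def answer_with_citations(query: str, corpus: dict[str, str]) -> dict:
    hits = retrieve(query, corpus)
    if not hits:
        # Empty state: report the miss instead of letting the model improvise.
        return {"answer": None, "sources": []}
    sources = [h["source"] for h in hits]
    return {"answer": f"Grounded draft using {sources[0]}", "sources": sources}
```

The `sources` list is what feeds a Citation UI and a Source Inspector, and the `None` answer is the hook for a designed empty state ("I couldn't find anything about that in your documents").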
The Material Shapes the Design
You don't need a PhD in machine learning. But you do need to understand:
- Context is finite. Design for memory management.
- Temperature affects creativity vs. accuracy. Match it to the task.
- Latency is real. Design for progressive disclosure and streaming.
- Retrieval is imperfect. Design for transparency and verification.
Stop treating AI like magic. It's a probabilistic engine with specific constraints. Design with the grain, not against it.
Let's talk about your product, team, or idea.
Whether you're a company looking for design consultation, a team wanting to improve craft, or just want to collaborate—I'm interested.