System Prompts Are Design: Persona, Tone, Constraints
A system prompt is an interface. Architect it like one — labeled anatomy, important instructions first and last, parameterized templates, and testable constraints.
Most system prompts are a wish list typed into a box. A pile of instructions, no order, no structure, the important rule buried in paragraph four where the model half-ignores it. Then the team wonders why the assistant's persona drifts, why its tone is wrong in support but fine in onboarding, why a constraint holds nine times and breaks on the tenth. The prompt isn't broken — it was never designed. It was accumulated.
A system prompt is an interface. It's the surface where you define how the product thinks, sounds, and refuses — and like any interface, it can be architected or it can be a mess. The same series thesis applies: the model ships median behavior, and a kitchen-sink prompt does nothing to overwrite it. Two real skills cover this layer. prompt-architecture handles the structure — anatomy, instruction order, parameterization, testable constraints. system-behavior-shaping handles the character — a consistent persona, per-context tone dials, an emotional response map. Together they turn a wish list into an interface.
Key Takeaways
- A system prompt is an interface, not a wish list. It has anatomy — identity, context, rules, output spec, examples — and it should be laid out as deliberately as a UI.
- Order is load-bearing. Put the most important instructions first and last; the middle is where rules go to be forgotten.
- Parameterize, don't hardcode. A template with named slots beats a fresh hand-written prompt every time, and it makes the prompt reusable and testable.
- Two skills, two jobs. prompt-architecture structures the prompt; system-behavior-shaping gives it a consistent character, tone dials, and an emotional response map. Install both with npx skills add https://github.com/Owl-Listener/ai-design-skills --skill prompt-architecture and --skill system-behavior-shaping.
- Constraints must be testable. "Be concise" is a hope; "answers under 60 words unless asked to expand" is a constraint you can check.
The Anatomy of an Architected Prompt
An architected prompt has labeled sections, each doing one job. When the parts are named and ordered, the model follows them more reliably and you can edit one part without breaking the rest — exactly like a well-structured component.
| Section | Job | What goes wrong without it |
|---|---|---|
| Identity | Who the assistant is, its persona | Voice drifts; no consistent character |
| Context | What it knows, what situation it's in | Generic answers detached from the product |
| Rules | Hard constraints and prohibitions | Inconsistent enforcement; random failures |
| Output spec | Exact format of the response | Unpredictable, unparseable output |
| Examples | Few-shot demos of edge cases | Right behavior on easy cases, wrong on hard ones |
The discipline is to write each section on purpose and label it. prompt-architecture gives you this skeleton so every prompt you write has the same readable anatomy instead of being a fresh blob each time. This is the prompt-side complement to designing AI product behavior — the behavior is the spec, the prompt is where you encode it.
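One way to keep that anatomy honest is to hold each section as a named field and assemble the prompt in a fixed order. The sketch below is only an illustration and is not taken from the prompt-architecture skill: the `PromptSections` class, its field names, the bracketed labels, and the sample Acme text are all assumptions chosen to mirror the table above.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptSections:
    """Hypothetical container: one field per section of the anatomy table."""
    identity: str
    context: str
    rules: str
    output_spec: str
    examples: str

    def render(self) -> str:
        """Join the sections in declaration order, each under a bracketed label."""
        parts = []
        for f in fields(self):
            text = getattr(self, f.name).strip()
            if not text:
                # An empty section is a design gap, so fail loudly instead of shipping it.
                raise ValueError(f"section '{f.name}' is empty")
            label = f.name.replace("_", " ").upper()
            parts.append(f"[{label}]\n{text}")
        return "\n\n".join(parts)


# Illustrative values only; none of this text comes from a real product.
prompt = PromptSections(
    identity="You are the support assistant for Acme.",
    context="Users are customers asking about existing orders.",
    rules="Never promise refunds. Always offer escalation.",
    output_spec="Plain text, under 60 words unless asked to expand.",
    examples="User: Can I get my money back?\nAssistant: I can't promise that, but I can escalate this for you.",
).render()
print(prompt)
```

Because every section is a separate field, editing the rules cannot accidentally rewrite the identity, and a missing section raises an error before the prompt is sent anywhere.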
Kitchen-Sink vs Architected
Put the two approaches side by side and the difference is obvious. The kitchen-sink prompt accumulates; the architected prompt is built.
| Property | Kitchen-sink prompt | Architected prompt |
|---|---|---|
| Structure | One run-on blob | Labeled sections, each with a job |
| Instruction order | Random | Important rules first AND last |
| Reuse | Rewritten each time | Parameterized template with named slots |
| Examples | None, or random | Few-shot targeting known failure modes |
| Constraints | "Be helpful and concise" | "Under 60 words unless asked; cite sources" |
| Testing | Vibes | Check each constraint against cases |
| Persona | Drifts | Consistent character with tone dials |
The right column isn't more work once you have the skills — it's a template you fill in. That's the whole point of treating the prompt as an interface: you design the structure once and reuse it, instead of free-handing hope into a text box every time.
Order, Parameters, and Few-Shot
Three structural moves do most of the heavy lifting.
Order is load-bearing. Models tend to follow instructions at the start and end of a long prompt more reliably than instructions in the middle, which is where rules quietly die. So put your most critical rules first and repeat the non-negotiables at the end.
[IDENTITY] You are the support assistant for Acme. Critical rules:
never promise refunds, never invent policy, always offer escalation.
[CONTEXT] ...
[RULES] ...
[OUTPUT SPEC] ...
[EXAMPLES] ...
[REMINDER] Before responding, re-check: no refund promises, no invented
policy, escalation offered when unresolved.
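If the prompt is assembled in code, the opening rules and the closing reminder can be generated from one list so they never drift apart. This is a minimal sketch under that assumption: `build_prompt` and its parameters are hypothetical names, and the section bodies are left as "..." exactly as in the skeleton above.

```python
def build_prompt(identity: str, critical_rules: list[str], body: dict[str, str]) -> str:
    """Open with the critical rules and repeat them in a closing reminder."""
    if not critical_rules:
        raise ValueError("at least one critical rule is required")
    # Single source of truth: the same string feeds the first and last lines.
    rules_text = ", ".join(critical_rules)
    opening = f"[IDENTITY] {identity} Critical rules: {rules_text}."
    # Middle sections keep the order in which the caller supplied them.
    middle = [f"[{label}] {text}" for label, text in body.items()]
    closing = f"[REMINDER] Before responding, re-check: {rules_text}."
    return "\n".join([opening, *middle, closing])


prompt = build_prompt(
    identity="You are the support assistant for Acme.",
    critical_rules=["never promise refunds", "never invent policy", "always offer escalation"],
    body={"CONTEXT": "...", "RULES": "...", "OUTPUT SPEC": "...", "EXAMPLES": "..."},
)
print(prompt)
```

Adding or removing a non-negotiable then means editing one list, and both ends of the prompt update together.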
Parameterize. A prompt is a template with named slots, not a one-off. Pull the variable parts out so the same architecture serves every context:
You are {{persona}} for {{product}}. Tone: {{tone_for_context}}.
Constraints: {{hard_rules}}. Output: {{format_spec}}.
On failure modes {{known_failures}}, respond per the examples below.
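A template like this is only safer than a hand-written prompt if unfilled slots are caught before the prompt ships. The following sketch shows one possible renderer for the `{{slot}}` syntax used above. The `render` function and the sample values are assumptions for illustration; a real project might use an existing templating library instead.

```python
import re

# Matches {{slot_name}} where the name is letters, digits, or underscores.
SLOT = re.compile(r"\{\{(\w+)\}\}")

TEMPLATE = (
    "You are {{persona}} for {{product}}. Tone: {{tone_for_context}}.\n"
    "Constraints: {{hard_rules}}. Output: {{format_spec}}.\n"
    "On failure modes {{known_failures}}, respond per the examples below."
)


def render(template: str, values: dict[str, str]) -> str:
    """Fill every slot, refusing to render if any slot is missing or unknown."""
    slots = set(SLOT.findall(template))
    missing = slots - set(values)
    if missing:
        raise KeyError(f"unfilled slots: {sorted(missing)}")
    unknown = set(values) - slots
    if unknown:
        raise KeyError(f"values with no matching slot: {sorted(unknown)}")
    return SLOT.sub(lambda m: values[m.group(1)], template)


# Hypothetical values for one context.
filled = render(TEMPLATE, {
    "persona": "a calm support assistant",
    "product": "Acme",
    "tone_for_context": "reassuring",
    "hard_rules": "never promise refunds",
    "format_spec": "plain text under 60 words",
    "known_failures": "refund demands",
})
print(filled)
```

Rejecting unknown keys as well as missing ones catches typos in slot names, which is the usual way a parameterized prompt silently goes wrong.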
Few-shot the failures, not the easy cases. Every example competes for the model's limited attention, so don't spend examples demonstrating what the model already does. Spend them on the edge cases it gets wrong: the ambiguous request, the adversarial input, the moment it should refuse. That's where a testable constraint and a targeted example together actually move the behavior.
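One way to enforce that discipline is to tag every example with the failure mode it targets and to refuse to build the examples section when a known failure has no example. The sketch below assumes this tagging scheme. The `FewShot` class, `render_examples`, and the sample failure modes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FewShot:
    """Hypothetical record: one example tied to the failure mode it targets."""
    failure_mode: str
    user: str
    assistant: str


def render_examples(known_failures: list[str], examples: list[FewShot]) -> str:
    """Render only failure-targeted examples; error if a known failure has none."""
    covered = {ex.failure_mode for ex in examples}
    uncovered = [f for f in known_failures if f not in covered]
    if uncovered:
        raise ValueError(f"no example for failure modes: {uncovered}")
    # Examples that target no listed failure (easy-path demos) are dropped.
    targeted = [ex for ex in examples if ex.failure_mode in known_failures]
    return "\n\n".join(
        f"# {ex.failure_mode}\nUser: {ex.user}\nAssistant: {ex.assistant}"
        for ex in targeted
    )


examples = [
    FewShot("refund demand", "Refund me now.", "I can't promise a refund, but I can escalate this today."),
    FewShot("ambiguous request", "It's broken.", "Which product is this about, and what happens when you use it?"),
    FewShot("easy greeting", "Hi", "Hello! How can I help?"),
]
block = render_examples(["refund demand", "ambiguous request"], examples)
print(block)
```

The easy greeting never reaches the prompt, and adding a new failure mode to the list without writing its example fails fast.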
Shaping Character and Tone
Structure makes the prompt reliable; system-behavior-shaping makes it feel like one product. A consistent character means the assistant sounds the same across every surface — not warm in onboarding and robotic in error states by accident. Tone dials let you shift register per context without changing the persona: more concise in a power-user setting, more reassuring in a billing dispute. And an emotional response map decides how the product reacts to the user's state — what it does when someone is confused, frustrated, or in a hurry.
PERSONA: calm, competent, never sycophantic. Same voice everywhere.
TONE DIALS:
- onboarding: encouraging, a little more verbose
- power-user: terse, assume expertise
- error/billing: reassuring, take responsibility, offer next step
EMOTIONAL RESPONSE MAP:
- confused -> slow down, one step at a time, check understanding
- frustrated -> acknowledge, simplify, offer escalation
- rushed -> lead with the answer, skip the preamble
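Because the persona, dials, and map are plain data, they can live in code and be looked up per request. The sketch below copies the values from the block above into dictionaries. The `behavior_block` function and its output labels are assumptions, and how the user's state gets detected is outside its scope.

```python
from typing import Optional

PERSONA = "calm, competent, never sycophantic. Same voice everywhere."

TONE_DIALS = {
    "onboarding": "encouraging, a little more verbose",
    "power-user": "terse, assume expertise",
    "error/billing": "reassuring, take responsibility, offer next step",
}

EMOTIONAL_RESPONSE_MAP = {
    "confused": "slow down, one step at a time, check understanding",
    "frustrated": "acknowledge, simplify, offer escalation",
    "rushed": "lead with the answer, skip the preamble",
}


def behavior_block(context: str, user_state: Optional[str] = None) -> str:
    """Build the behavior text for one context; the persona line never changes."""
    # Unknown contexts or states raise KeyError, so a typo cannot pass silently.
    lines = [f"PERSONA: {PERSONA}", f"TONE ({context}): {TONE_DIALS[context]}"]
    if user_state is not None:
        lines.append(f"USER SEEMS {user_state.upper()}: {EMOTIONAL_RESPONSE_MAP[user_state]}")
    return "\n".join(lines)


print(behavior_block("onboarding"))
print(behavior_block("error/billing", "frustrated"))
```

Only the tone and response lines vary between calls, which is the point of a dial: the register shifts while the character stays fixed.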
That map is a design artifact. It turns "the AI's personality" from an emergent accident into something you specify and test — which is what designing trust looks like at the prompt layer. The next piece, trust is a design material, picks up exactly there.
Frequently Asked Questions
Why two skills instead of one prompt skill?
Because structure and character are different problems. prompt-architecture is about how the prompt is built — anatomy, order, parameters, testable rules. system-behavior-shaping is about who the assistant is — persona, tone, emotional responses. You can have a beautifully structured prompt with a drifting personality, or a vivid character with no testable constraints. You want both.
Does instruction order really matter that much?
Yes, in most cases. Models tend to follow the beginning and end of a long prompt more reliably than the middle, so critical rules buried there get followed inconsistently. Putting non-negotiables first and repeating them at the end is one of the cheapest reliability wins available: it costs a few tokens, and you can confirm the effect by running your constraint checks before and after the change.
What makes a constraint "testable"?
A testable constraint can be checked against an output without judgment calls. "Be concise" can't — concise is subjective. "Under 60 words unless the user asks to expand" can — you count the words. Rewrite every fuzzy instruction into something you could verify with a script or a quick eval, which connects directly to the evaluation layer.
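As a concrete instance, the 60-word rule can be checked in a few lines. This is a rough sketch: `within_word_limit` is a hypothetical helper, words are counted by splitting on whitespace, and the keyword list used to detect a request to expand is a crude assumption that a real eval would likely replace.

```python
# Assumed phrases that signal the user asked for a longer answer.
EXPAND_PHRASES = ("expand", "elaborate", "more detail")


def within_word_limit(answer: str, user_message: str, limit: int = 60) -> bool:
    """Return True if the answer is under `limit` words or the user asked to expand."""
    asked_to_expand = any(phrase in user_message.lower() for phrase in EXPAND_PHRASES)
    if asked_to_expand:
        return True
    # "Under 60 words" means a strict less-than comparison.
    return len(answer.split()) < limit


short_answer = "word " * 59
long_answer = "word " * 60
print(within_word_limit(short_answer, "Where is my order?"))
print(within_word_limit(long_answer, "Where is my order?"))
print(within_word_limit(long_answer, "Can you elaborate on that?"))
```

Running a check like this over a set of saved conversations turns the constraint from a hope into a pass rate.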
How do few-shot examples fit in?
Examples cost attention, so target them at failure modes. Don't demonstrate the easy path the model already handles; demonstrate the ambiguous request, the refusal case, the format it keeps getting wrong. A handful of well-chosen edge-case examples beats a dozen generic ones.
Where does this sit in the anti-slop stack?
The system prompt is the layer that encodes behavior into the model's actual responses — it sits between behavior design and trust in the six-layer anti-slop stack. The pillar argument holds: a kitchen-sink prompt ships the median; an architected one overwrites it.