Structured Output Design for Predictable Results
Design AI skill outputs that are consistent, parseable, and useful. Master output format specification for reliable, actionable results.
Unpredictable outputs kill productivity. One day your code review comes as bullet points, the next as prose, the next as a table. Each response requires different parsing, different interpretation, different action.
Structured output design solves this. By specifying exactly how Claude should format responses, you get consistent, predictable outputs every time. These outputs are easier to read, easier to parse, and easier to act on.
This guide covers output format design: what to specify, how to specify it, and common patterns that work well.
Why Structure Matters
The Consistency Problem
Without structure:
Request 1: "Review this code"
Response: "The code looks fine. Maybe add some error handling."
Request 2: "Review this code"
Response:
## Issues Found
1. Missing null check on line 23
2. Potential memory leak in useEffect
## Recommendations
...
Same request, completely different formats. Users cannot develop expectations.
Benefits of Structure
Predictability: Users know what to expect. They can scan to the section they need.
Parseability: Structured output can be extracted programmatically when needed.
Actionability: Clear structure → clear actions. Each section has a purpose.
Consistency: Same format every time builds trust and efficiency.
Output Format Specification
Explicit Structure Definition
Tell Claude exactly what format to use:
## Output Format
Structure all code review responses as:
### Summary
One paragraph (2-4 sentences) overall assessment.
Include: overall quality, main concerns, recommendation to merge or not.
### Critical Issues
Must-fix items that block merge.
Format:
- [FILE:LINE] Issue description
  Current: `problematic code`
  Suggested: `fixed code`
### Improvements
Should-fix items that improve code quality.
Same format as Critical Issues.
### Positive Notes
What was done well (2-3 items minimum).
- Good pattern/practice noted
### Verdict
One of: APPROVE | REQUEST_CHANGES | NEEDS_DISCUSSION
Plus one sentence explanation.
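Because the section names and verdict values are fixed, conformance can be checked mechanically. A minimal TypeScript sketch (the section names and verdict list come from the spec above; the function itself is illustrative):

```typescript
// Illustrative compliance check for the review format above.
const REQUIRED_SECTIONS = ["Summary", "Critical Issues", "Improvements", "Positive Notes", "Verdict"];
const VERDICTS = ["APPROVE", "REQUEST_CHANGES", "NEEDS_DISCUSSION"];

function validateReview(response: string): string[] {
  const problems: string[] = [];
  for (const section of REQUIRED_SECTIONS) {
    if (!response.includes(`### ${section}`)) {
      problems.push(`Missing section: ${section}`);
    }
  }
  if (!VERDICTS.some((verdict) => response.includes(verdict))) {
    problems.push("Verdict is not one of the allowed values");
  }
  return problems; // empty array means the response follows the format
}
```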
Template Approach
Provide a fill-in template:
## Response Template
Use this exact template:
Analysis of [FILE/TOPIC]
Quick Summary: [1-2 sentences]
Key Findings:
- [Finding]
- [Finding]
- [Finding]
Detailed Analysis:
[Category 1]
[Analysis paragraph]
[Category 2]
[Analysis paragraph]
Recommendations:
- [Action item]
- [Action item]
Confidence: [High/Medium/Low] - [Brief reason]
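A fill-in template maps naturally onto typed fields, which keeps every response carrying the same sections in the same order. A trimmed TypeScript sketch (the `Analysis` shape and `renderAnalysis` are hypothetical; the Detailed Analysis sections are omitted for brevity):

```typescript
// Hypothetical: render the fill-in template from typed fields.
interface Analysis {
  topic: string;
  summary: string;
  findings: string[];
  recommendations: string[];
  confidence: "High" | "Medium" | "Low";
  confidenceReason: string;
}

function renderAnalysis(a: Analysis): string {
  return [
    `Analysis of ${a.topic}`,
    `Quick Summary: ${a.summary}`,
    "Key Findings:",
    ...a.findings.map((f) => `- ${f}`),
    "Recommendations:",
    ...a.recommendations.map((r) => `- ${r}`),
    `Confidence: ${a.confidence} - ${a.confidenceReason}`,
  ].join("\n");
}
```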
Field-Level Specifications
Define each field precisely:
## Error Report Fields
### Error Code
- Format: `ERR_[CATEGORY]_[NUMBER]`
- Categories: AUTH, VALIDATION, DATABASE, NETWORK, UNKNOWN
- Numbers: 3-digit, increment per category
- Example: `ERR_AUTH_001`
### Severity
- One of: CRITICAL, HIGH, MEDIUM, LOW, INFO
- CRITICAL: System down, data loss risk
- HIGH: Feature broken, workaround difficult
- MEDIUM: Feature degraded, workaround exists
- LOW: Minor issue, cosmetic
- INFO: For awareness only
### Message
- User-friendly (no technical jargon)
- Actionable (tells user what to do)
- Concise (max 100 characters)
- Example: "Session expired. Please sign in again."
### Details
- Technical context for developers
- Include: stack trace excerpt, request ID, timestamp
- Sanitized (no PII, credentials, or internal paths)
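Field-level definitions like these translate almost directly into types and validation. A TypeScript sketch (the formats come from the spec above; the type and function names are illustrative):

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW" | "INFO";

interface ErrorReport {
  code: string;    // ERR_[CATEGORY]_[NUMBER], e.g. ERR_AUTH_001
  severity: Severity;
  message: string; // user-friendly, max 100 characters
  details: string; // technical context, sanitized
}

const CODE_PATTERN = /^ERR_(AUTH|VALIDATION|DATABASE|NETWORK|UNKNOWN)_\d{3}$/;

function validateErrorReport(report: ErrorReport): string[] {
  const problems: string[] = [];
  if (!CODE_PATTERN.test(report.code)) problems.push(`Bad error code: ${report.code}`);
  if (report.message.length > 100) problems.push("Message exceeds 100 characters");
  return problems;
}
```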
Common Output Patterns
The Report Pattern
For analytical outputs:
## Report Structure
### Executive Summary
- Bullet points of key findings
- Maximum 5 points
- Each point standalone (no "as mentioned above")
### Analysis
#### Section 1: [Topic]
Paragraph(s) of analysis.
Support with data/examples.
#### Section 2: [Topic]
...
### Data Tables (if applicable)
| Metric | Value | Benchmark | Status |
|--------|-------|-----------|--------|
| ... | ... | ... | ... |
### Recommendations
Numbered list, ordered by priority.
1. Most important action
2. Second priority
...
### Appendix (if needed)
Supporting detail, raw data, extended examples.
The Checklist Pattern
For verification outputs:
## Checklist Output
Format verification results as:
### [Check Category] Results
✅ PASS: [Check name] - [Optional: what was verified]
❌ FAIL: [Check name] - [What is wrong] - [How to fix]
⚠️ WARN: [Check name] - [Concern] - [Recommendation]
⏭️ SKIP: [Check name] - [Reason skipped]
### Summary
- Passed: X/Y
- Failed: N (list)
- Warnings: M (list)
### Next Steps
- [If failed, what to do]
- [Address warnings]
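Because every result line starts with a fixed status marker, checklist output is trivially machine-countable. A TypeScript sketch (the regex and names are illustrative):

```typescript
// Tally checklist results by their leading status marker.
const LINE_PATTERN = /^(✅ PASS|❌ FAIL|⚠️ WARN|⏭️ SKIP): (.+)$/;

function tallyChecklist(output: string) {
  const counts = { pass: 0, fail: 0, warn: 0, skip: 0 };
  for (const line of output.split("\n")) {
    const match = LINE_PATTERN.exec(line.trim());
    if (!match) continue;
    if (match[1].includes("PASS")) counts.pass++;
    else if (match[1].includes("FAIL")) counts.fail++;
    else if (match[1].includes("WARN")) counts.warn++;
    else counts.skip++;
  }
  return counts;
}
```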
The Diff Pattern
For change recommendations:
## Change Output Format
Show changes as diffs:
```diff
--- Current
+++ Proposed
@@ File: path/to/file.ts, Line: 45 @@
- const data = fetchData()
+ const data = await fetchData()
@@ File: path/to/file.ts, Line: 50-52 @@
- if (data) {
- process(data)
- }
+ if (data != null) {
+ await process(data)
+ }
```
Include after each change:
- Reason: Why this change
- Impact: What this affects
- Risk: Low/Medium/High
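The `@@ File: ..., Line: ... @@` headers exist so changes can be located mechanically. A TypeScript sketch of extracting them (the header format is the one defined above; the function is hypothetical):

```typescript
// Pull (file, line range) pairs out of the custom diff headers.
const HEADER = /^@@ File: (.+), Line: (\d+)(?:-(\d+))? @@$/;

function listChanges(diff: string): { file: string; start: number; end: number }[] {
  const changes: { file: string; start: number; end: number }[] = [];
  for (const line of diff.split("\n")) {
    const m = HEADER.exec(line.trim());
    if (!m) continue;
    const start = Number(m[2]);
    changes.push({ file: m[1], start, end: m[3] ? Number(m[3]) : start });
  }
  return changes;
}
```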
The Decision Pattern
For recommendation outputs:
## Decision Output
Structure decisions as:
### Question
[Restate the decision being made]
### Options Considered
#### Option A: [Name]
- **Pros:** [List]
- **Cons:** [List]
- **Effort:** Low/Medium/High
- **Risk:** Low/Medium/High
#### Option B: [Name]
...
### Recommendation
**Choose: [Option]**
Rationale:
[2-3 sentences explaining why]
Mitigations:
[How to address the cons of chosen option]
### Alternatives if Constraints Change
- If [condition], consider [Option X] instead
The Teaching Pattern
For explanatory outputs:
## Explanation Structure
### The Problem
What we are solving, in plain language.
### The Solution
High-level approach.
### How It Works
Step-by-step breakdown:
1. **Step One**
What happens and why.
`code example`
2. **Step Two**
...
### Key Concepts
| Term | Meaning |
|------|---------|
| ... | ... |
### Common Mistakes
- Mistake: [What people do wrong]
Fix: [How to do it right]
### Try It Yourself
[Exercise or example to practice]
Format for Different Audiences
For Humans (Reading)
Emphasize:
- Clear headers for navigation
- Prose for context and explanation
- Bullet points for scanning
- Tables for comparison
## Code Review: UserService.ts
### Overview
This service handles user authentication and profile management.
The implementation is solid overall, with a few areas for improvement.
### Key Findings
**Security Concern:**
The password comparison on line 45 uses `==` instead of a
timing-safe comparison, which could enable timing attacks.
**Recommendation:**
Replace with `crypto.timingSafeEqual()` after converting strings
to buffers.
### Detailed Review
...
For Machines (Parsing)
Emphasize:
- Consistent delimiters
- Parseable formats (JSON, YAML)
- Predictable structure
## Output Format (Machine Readable)
Return results as JSON:
```json
{
"status": "success" | "error",
"findings": [
{
"type": "security" | "performance" | "style",
"severity": "critical" | "high" | "medium" | "low",
"file": "path/to/file.ts",
"line": 45,
"message": "Description of issue",
"suggestion": "How to fix"
}
],
"summary": {
"total": 5,
"critical": 1,
"high": 2,
"medium": 1,
"low": 1
}
}
```
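The `"success" | "error"` notation maps directly onto TypeScript union types, so the same specification can double as a type definition for downstream tooling. A sketch (the `ReviewResult` name is illustrative):

```typescript
// Type-level mirror of the JSON output format above.
interface ReviewResult {
  status: "success" | "error";
  findings: {
    type: "security" | "performance" | "style";
    severity: "critical" | "high" | "medium" | "low";
    file: string;
    line: number;
    message: string;
    suggestion: string;
  }[];
  summary: {
    total: number;
    critical: number;
    high: number;
    medium: number;
    low: number;
  };
}

// With a typed result, downstream code can act on findings directly:
function blockers(result: ReviewResult) {
  return result.findings.filter((f) => f.severity === "critical");
}
```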
For Both (Hybrid)
Structure for humans, but parseable sections:
## Review Output
### Human Summary
[Prose paragraph for human reading]
### Machine Summary
```json
{"pass": false, "criticalCount": 2, "blockers": ["AUTH_001", "PERF_003"]}
```
### Detailed Findings
[Structured list format that is also parseable]
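A hybrid response stays readable for humans while the machine summary remains extractable. A TypeScript sketch of pulling out the JSON block (the extraction logic is illustrative):

```typescript
// Extract the first json-fenced block from a hybrid markdown response.
const FENCE = "`".repeat(3); // built at runtime so this snippet avoids a literal fence

function machineSummary(markdown: string): unknown {
  const pattern = new RegExp(FENCE + "json\\s*\\n([\\s\\S]*?)\\n" + FENCE);
  const match = pattern.exec(markdown);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // block present but not valid JSON
  }
}
```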
Handling Variable Output
Conditional Sections
When output varies based on content:
## Output Sections
### Always Include
- Summary
- Verdict
### Include If Applicable
- Critical Issues (only if found)
- Security Concerns (only if found)
- Performance Notes (only if relevant)
### Section Absence
If a section has no content, include header with "None found":
### Critical Issues
None found.
Do not omit sections silently.
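In code terms: render the header unconditionally and substitute a placeholder when the body is empty. A TypeScript sketch (the helper is hypothetical):

```typescript
// Always emit the section header; substitute a placeholder when empty.
function renderSection(title: string, items: string[]): string {
  const body = items.length > 0
    ? items.map((item) => `- ${item}`).join("\n")
    : "None found.";
  return `### ${title}\n${body}`;
}
```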
Scaling Output Length
Match output to input scope:
## Output Scaling
### Small Changes (1-10 lines)
- Brief summary (1-2 sentences)
- Issues as bullet points
- Verdict
### Medium Changes (11-100 lines)
- Summary paragraph
- Issues organized by category
- Detailed recommendations
- Verdict with explanation
### Large Changes (100+ lines)
- Executive summary
- Issues organized by file/module
- Separate sections for different concern types
- Prioritized recommendations
- Detailed verdict with conditions
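These tiers can be selected mechanically before generation, for example by counting changed lines. A TypeScript sketch (the thresholds come from the scaling rules above; the function is illustrative):

```typescript
type OutputScale = "small" | "medium" | "large";

// Map a changed-line count onto the output tiers defined above.
function outputScale(changedLines: number): OutputScale {
  if (changedLines <= 10) return "small";
  if (changedLines <= 100) return "medium";
  return "large";
}
```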
Empty/Error Cases
Define output for edge cases:
## Edge Case Outputs
### No Issues Found
## Review Complete
No issues found. Code looks good!
### Positive Notes
- Clean implementation of [pattern]
- Good test coverage
### Verdict
APPROVE - Ready to merge.
### Cannot Complete Review
## Review Incomplete
I could not complete this review because:
- [Reason: missing context, unclear scope, etc.]
To proceed, please:
- [What is needed]
What I was able to review:
- [Partial findings if any]
Output Quality Guidelines
Be Complete
Every response should feel complete:
## Completeness Checklist
Every response must include:
- [ ] Answers the original question
- [ ] Provides actionable next steps
- [ ] Indicates confidence level
- [ ] Notes any limitations or caveats
Never leave the reader wondering "okay, but what now?"
Be Consistent
Same structure across responses:
## Consistency Rules
### Header Levels
- H2 for major sections
- H3 for subsections
- Never skip levels (H2 → H4)
### List Style
- Bullets for unordered items
- Numbers for sequences/priorities
- Checkboxes for actionable tasks
### Code Blocks
- Always specify language
- Include file path if relevant
- Keep examples minimal but complete
Be Scannable
Optimize for quick comprehension:
## Scannability Guidelines
### Lead with Key Information
Start sections with the most important point.
### Use Visual Hierarchy
- Bold for emphasis
- Headers for navigation
- Lists for multiple items
### Front-Load Sentences
Put the important part first:
- Good: "Line 45 has a null pointer bug."
- Bad: "There is an issue that might occur in certain conditions on line 45."
### Include Section Summaries
For long sections, start with a one-line summary.
Testing Output Format
Verification Checklist
Test your format specification:
- Completeness: Does it cover all scenarios?
- Clarity: Can someone unfamiliar understand it?
- Consistency: Does Claude follow it reliably?
- Parseability: If needed, can it be extracted programmatically?
- Utility: Does the format help users take action?
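Reliability in particular can be tested like any other contract: run the skill on fixed inputs and assert the structure, not the wording. A minimal TypeScript sketch (the harness, input, and required sections are hypothetical):

```typescript
// Illustrative format-compliance test. `runSkill` stands in for however
// your skill is invoked.
async function testReviewFormat(runSkill: (input: string) => Promise<string>): Promise<void> {
  const response = await runSkill("review: sample diff with one known issue");
  for (const section of ["### Summary", "### Verdict"]) {
    if (!response.includes(section)) {
      throw new Error(`Format violation: missing ${section}`);
    }
  }
}
```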
Example Responses
Include example responses in your skill:
## Example Outputs
### Example: Simple Issue
Input: Simple function with one type error
Output:
## Review: utils/format.ts
### Summary
Minor type issue found. Quick fix, approve after addressing.
### Issues
- [Line 12] Parameter `date` should be typed as `Date | string`, not `any`

```diff
- function formatDate(date: any): string
+ function formatDate(date: Date | string): string
```

### Verdict
REQUEST_CHANGES - One minor fix needed.
### Example: No Issues
...
### Example: Multiple Issues
...
Summary
Structured output design is about predictability and usability. Effective structure:
- Specifies format explicitly - Templates, field definitions, examples
- Matches the use case - Human reading, machine parsing, or both
- Handles variations - Conditional sections, scaling, edge cases
- Ensures quality - Complete, consistent, scannable
- Includes examples - Show what correct output looks like
When users know exactly what to expect, they can act faster and with more confidence. Consistent structure is not about rigid rules; it is about creating reliable, useful outputs that respect the reader's time.
Ready to connect skills to external tools? Continue to Tool Integration in Skills for guidance on when and how skills should use tools.