Red-Teaming Skills in Hermes: A Tour of the Bundled Skill Library

Hermes Agent ships with 100+ bundled skills spread across categories like software-development, devops, mlops, writing-communication, and — the one this post is about — red-teaming. The red-teaming category is for authorized security work: testing your own systems, running approved engagements, building defensive evals. It is not a toolkit for attacking things you do not own.

This post is a tour: what is in the category, what each kind of skill does, the responsible-use norms Nous Research documents alongside them, and a realistic example of chaining several skills across a single engagement.

Key Takeaways

Hermes bundles a red-teaming/ skill category alongside devops, mlops, software-development, and others.
The skills are framed for defenders: prompt-injection probing, jailbreak analysis, safety evals, attack-surface mapping.
Responsible-use rules apply: authorization, scope, non-destructive by default, no exfiltration.
Skills are markdown files (SKILL.md) with front-matter; they are readable and auditable before you run them.
A typical engagement chains four or five skills: map the surface, probe, analyze, evaluate, report.
Persistence matters here: Hermes's memory keeps engagement context across days, which a per-session CLI cannot do.

What Ships in `red-teaming/`

Subcategories inside red-teaming/ (at time of writing; the exact catalog evolves):

prompt-injection-probing/ — systematic prompting patterns for surfacing instruction override, context hijack, and tool confusion issues against LLM-backed apps.
jailbreak-analysis/ — analysis-oriented skills for classifying and documenting jailbreak attempts against a target model or app.
safety-evals/ — running structured evaluations against a model or agent to produce a scored report on refusal, misuse potential, etc.
attack-surface-mapping/ — enumerating the public surface of an AI-backed product (endpoints, tools, MCP servers, prompts) as a prelude to testing.
tool-abuse-testing/ — skills focused on agent tool abuse (asking an agent to use its tools in harmful or unintended ways).
report-writing/ — templates for producing client-ready findings with severity ratings and remediations.

Because skills are just markdown files (SKILL.md) with front-matter, you can read any of them before execution. That matters in red-team work, where you need to know exactly what a skill will cause the agent to do.
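
To make the "read before you run" step concrete, here is a minimal sketch of a pre-run audit helper. It assumes a SKILL.md opens with a block delimited by `---` lines containing flat `key: value` pairs; the function names `split_front_matter` and `audit_skill` are hypothetical, and real front-matter with nested YAML would need a proper YAML parser instead.

```python
from pathlib import Path


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md document into (front-matter fields, markdown body).

    Assumption: the file opens with a '---' line, followed by flat
    'key: value' lines, followed by a closing '---' line.
    Nested YAML is not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front-matter block found
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[i + 1:])
            return fields, body
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}, text  # opening '---' was never closed


def audit_skill(path: Path) -> None:
    """Print a skill's metadata and body so it can be reviewed before use."""
    fields, body = split_front_matter(path.read_text(encoding="utf-8"))
    for key, value in fields.items():
        print(f"{key}: {value}")
    print("-" * 40)
    print(body)
```

Calling `audit_skill(Path("path/to/SKILL.md"))` prints the metadata first and the instructions second, which is the order you would want to review them in before letting an agent act on them.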

Responsible-Use Norms

The Hermes docs frame the red-teaming category tightly: these skills are for authorized security work. Concretely, the norms documented alongside the skills:

Authorization: you have written permission to test the target.
Scope: the skills respect an explicit scope (URLs, endpoints, accounts); they do not expand it.
Non-destructive by default: no data mutation, no persistent changes, no denial-of-service patterns.
No exfiltration: findings stay in the engagement workspace; nothing leaves without operator confirmation.
Logging: every action goes to the Hermes memory store in ~/.hermes/ so you can produce a clean audit trail.

This is ordinary pentesting discipline, but it matters more with agents because a sloppy agent can do a lot of damage quickly. The skills encode the norms; they do not replace human judgment.
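
The scope norm is the easiest one to express in code. The sketch below is not how the bundled skills enforce scope (the source does not describe that); it is an illustration of a deny-by-default check you could run in your own tooling. The `ALLOWED` mapping, the `/chat` path, and the `in_scope` function are all hypothetical, with the host borrowed from the example engagement later in this post.

```python
from urllib.parse import urlsplit

# Hypothetical scope definition: authorized hosts mapped to the path
# prefixes that may be tested on each. Anything not listed is out of scope.
ALLOWED = {
    "support-agent.example.internal": ("/chat",),
}


def in_scope(url: str, allowed: dict[str, tuple[str, ...]] = ALLOWED) -> bool:
    """Return True only if the URL's scheme, host, and path are authorized."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    prefixes = allowed.get(parts.hostname or "")
    if prefixes is None:
        return False  # host not listed: out of scope by default
    path = parts.path or "/"
    # Match the prefix exactly or as a parent directory, so "/chat" covers
    # "/chat/stream" but not "/chatter".
    return any(
        path == p or path.startswith(p.rstrip("/") + "/") for p in prefixes
    )
```

The design choice worth copying is the default: an unknown host or path returns `False`, so a typo in a target narrows the test rather than widening it.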

Example Engagement: Chaining Five Skills

Here is a realistic flow for a one-week engagement testing an internal RAG-backed support agent that a customer has authorized you to assess.

Day 1: Attack-Surface Mapping

Start a Hermes session with the attack-surface-mapping skill:

hermes chat --skill red-teaming/attack-surface-mapping

Feed the scope:

"Target is https://support-agent.example.internal. Authorized accounts are in ~/engagements/acme/creds.yaml. Scope is the chat endpoint and the two MCP servers it exposes (docs-search and ticket-lookup). Out of scope: anything else on the domain."

Output: a structured map of endpoints, tool surfaces, and observed system-prompt boundaries. Hermes stores this in memory under the engagement tag.

Day 2: Prompt-Injection Probing

Load the probing skill and reference the previous day's map:

hermes chat --skill red-teaming/prompt-injection-probing

"Using the attack-surface map from engagement acme-2026-04, generate and run 30 injection probes against the chat endpoint. Respect the authorized accounts. Log every probe and response."

The probes run, and the results are logged to ~/.hermes/engagements/acme-2026-04/probes.md. Hermes's FTS5 memory makes every probe searchable for the rest of the week.

Day 3: Jailbreak Analysis

Take the raw probe results and run them through the analysis skill:

hermes chat --skill red-teaming/jailbreak-analysis

"Classify yesterday's probes by mechanism (context override, persona swap, encoded instruction, tool confusion). Flag any that succeeded and estimate severity."

Output: a classified table with severity, which feeds the final report. Because memory persists, you do not have to re-paste probe data — Hermes pulls it from storage.
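
If you want to sanity-check the classified table outside the agent, the shape of the data is simple enough to model directly. This is a sketch under assumptions: the `ProbeResult` fields, the severity labels, and the `summarize` function are hypothetical and do not reflect the skill's actual output format.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical severity scale; lower number sorts first (worst first).
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class ProbeResult:
    probe_id: str
    mechanism: str   # e.g. "context override", "persona swap"
    succeeded: bool
    severity: str    # one of SEVERITY_ORDER's keys


def summarize(results: list[ProbeResult]) -> tuple[Counter, list[ProbeResult]]:
    """Count probes per mechanism and list successful ones, worst first."""
    by_mechanism = Counter(r.mechanism for r in results)
    successes = sorted(
        (r for r in results if r.succeeded),
        key=lambda r: (SEVERITY_ORDER[r.severity], r.probe_id),
    )
    return by_mechanism, successes
```

The per-mechanism counts show where the probing effort went, and the sorted list of successes is the ordering a findings section would need.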

Day 4: Safety Evals

Run the safety-evals skill against a known eval suite the customer cares about:

hermes chat --skill red-teaming/safety-evals --eval-suite ./evals/acme-specific.yaml

This produces scored outputs, stored alongside the rest of the engagement artifacts.

Day 5: Report Writing

Finally, the report-writing skill pulls everything together:

hermes chat --skill red-teaming/report-writing

"Write the engagement report for acme-2026-04. Include executive summary, methodology, findings ranked by severity, remediations, and appendix with probe logs."

Because every prior day's artifacts are in Hermes memory, the report draft is grounded in real data, not the model's imagination.

Why Persistence Matters for Red Team Work

An interactive CLI agent loses context between sessions. A week-long engagement needs:

Continuity across days (yesterday's findings inform today's probes).
Searchable history of every probe and response.
Reproducibility — the ability to re-run a probe and compare.

Hermes's FTS5-indexed memory handles all three. Claude Code can do this manually if you discipline yourself with CLAUDE.md and notes, but Hermes is the first runtime I have used where the memory model was clearly designed for multi-day work.
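
For readers who have not used FTS5, the "searchable history" requirement looks roughly like this in plain SQLite. This is a generic illustration, not Hermes's actual schema or storage layout: the `probes` table, its columns, and both function names are assumptions, and it requires a Python build whose SQLite includes the FTS5 extension (most do).

```python
import sqlite3


def build_index(records: list[tuple[str, str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table of (probe_id, prompt, response) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE probes USING fts5(probe_id, prompt, response)"
    )
    conn.executemany("INSERT INTO probes VALUES (?, ?, ?)", records)
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return probe ids whose text matches the FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT probe_id FROM probes WHERE probes MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]
```

With an index like this, a query such as `search(conn, "system prompt")` returns every logged probe mentioning both terms, which is the kind of recall that makes day-five report writing depend on records rather than memory.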

For how the memory system works, see Hermes memory deep dive: FTS5 markdown recall. For how these skills relate to the broader ecosystem, see the agentskills.io standard in Hermes and Claude Code.

Responsible Disclosure and Scope Discipline

One more norm worth calling out: if your probes surface a serious vulnerability, the report-writing skill's default template includes a responsible-disclosure section. Fill it out with the customer before sharing externally. Hermes does not call external services on its own during red-team engagements unless you wire it up — the defaults stay local to your engagement directory. That is a feature, not a limitation.

What This Category Is Not

The red-teaming/ category is not:

A penetration-testing framework against arbitrary targets.
A collection of exploits or malware.
A toolkit for bypassing Anthropic's safety training at scale.

If you are looking for the first, use Metasploit or Burp Suite. If you are looking for the second or third, Hermes is not for you; the skills are structured to keep red-team work scoped, documented, and lawful.

Sources

GitHub: NousResearch/hermes-agent — see skills/red-teaming/
Hermes docs: hermes-agent.nousresearch.com/docs/
Related: Hermes memory deep dive: FTS5 markdown recall
Related: The agentskills.io standard across Hermes, Claude Code, and Cursor
Related: Hermes for DevOps: bundled skills with Claude
