Hermes for DevOps: Using the Bundled DevOps and MLOps Skills with Claude
Hermes ships devops and mlops skill categories. Tour representative skills and watch them chain through a typical incident — alert to rollback — with Claude in the loop.
DevOps work is the kind of agent task that benefits most from a persistent runtime. Alerts fire when you are asleep. Log correlations take minutes of reading, not seconds of clever prompting. Rollback decisions want traceability. Hermes Agent ships devops/ and mlops/ skill categories designed for this, and because Hermes is a long-lived daemon with memory, it fits the problem better than a per-session CLI does.
This post tours representative skills in both categories and walks through a realistic incident chain: an alert fires, Hermes reads the log, correlates with recent deploys, proposes a rollback, and delegates the rollback command to Claude Code as a subagent.
Key Takeaways
- Hermes bundles devops/ and mlops/ skill categories alongside software-development and others.
- DevOps skills cover CI debugging, log triage, alert handling, incident response, and rollback workflows.
- MLOps skills cover model deployment, drift detection, eval scheduling, and training-job triage.
- Hermes's persistent daemon is a natural fit for on-call responsibilities — it is always up and always listening.
- Skills chain: alert-triage → log-correlation → deploy-history → rollback-proposal → hand off to Claude Code.
- The claude-code subagent skill lets Hermes delegate heavy filesystem or repo work to a Claude Code process without ever leaving the daemon.
What Ships in devops/
Representative skills (exact names evolve; this is the shape of the category):
- alert-triage/: parse an alert payload (PagerDuty, Grafana, Prometheus Alertmanager) and extract the affected service, severity, and suspected cause.
- log-correlation/: pull logs across services around the alert timestamp and surface the likely signal.
- ci-debug/: read CI failure output, identify the failing step, propose a fix.
- deploy-history/: query recent deploys (git, Vercel, Fly, Kubernetes) and highlight the diffs that correlate with the incident window.
- rollback-proposal/: given a suspect deploy, draft a rollback plan with the exact command and the rollback-safety checks to run first.
- incident-report/: post-incident writeup template with timeline, blast radius, root cause, and follow-ups.
- runbook-executor/: step through a YAML-defined runbook with operator confirmation at each step.
What Ships in mlops/
- model-deploy/: promote a model version from staging to production with the canary pattern configured per-environment.
- drift-detection/: run a drift check against a baseline dataset and flag features out of tolerance.
- eval-scheduler/: schedule evals to run against production traffic snapshots (pairs naturally with Hermes cron).
- training-job-triage/: given a failed or slow training job, pull the logs and propose a diagnosis.
- artifact-audit/: verify model artifacts (hashes, sizes, metadata) before a deploy.
Why Persistent Runtime Is the Right Shape for This Work
A session-based CLI agent is wrong for on-call. You do not want to start a new session for every alert; you want an agent that has been watching. The Hermes daemon:
- Listens for webhooks (PagerDuty, Alertmanager, GitHub Actions).
- Runs cron jobs (hourly drift checks, nightly backups).
- Keeps memory across the whole week (so yesterday's incident context informs today's triage).
- Calls Claude Sonnet 4.6 for reasoning-heavy steps.
- Delegates to Claude Code when it needs to actually change files in a repo.
Claude Code is still in the loop — Hermes is not trying to replace it. Hermes orchestrates, Claude Code edits.
Incident Chain: Alert → Rollback
Here is the full chain for a typical "latest deploy broke production" incident. Trigger: a Grafana alert hits the Hermes webhook endpoint.
Step 1: Alert Triage
Hermes receives the webhook payload and runs the alert-triage skill:
Alert payload parsed. Service: checkout-api. Severity: high. Signal: p99 latency 2.4s (baseline 120ms). Time window: last 8 minutes. Likely cause category: recent deploy.
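To make the triage step concrete, here is a minimal Python sketch of extracting service and severity from a webhook body. The payload shape (a top-level "alerts" list whose items carry "status", "labels", and "annotations") is an assumption loosely modeled on Alertmanager-style webhooks, and the names TriageResult, SEVERITY_ORDER, and triage are hypothetical. This is not the bundled skill's implementation.

```python
from dataclasses import dataclass

# Hypothetical severity ranking used only to pick the worst firing alert.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class TriageResult:
    service: str
    severity: str
    summary: str
    firing: int  # number of alerts currently firing in the payload


def triage(payload: dict) -> TriageResult:
    """Summarize the most severe firing alert in an assumed webhook payload."""
    alerts = [a for a in payload.get("alerts", []) if a.get("status") == "firing"]
    if not alerts:
        raise ValueError("no firing alerts in payload")

    def rank(alert: dict) -> int:
        severity = alert.get("labels", {}).get("severity", "low")
        return SEVERITY_ORDER.get(severity, 0)

    worst = max(alerts, key=rank)
    labels = worst.get("labels", {})
    annotations = worst.get("annotations", {})
    return TriageResult(
        service=labels.get("service", "unknown"),
        severity=labels.get("severity", "unknown"),
        summary=annotations.get("summary", ""),
        firing=len(alerts),
    )
```

Picking the single worst alert keeps the summary short; a real triage step would likely also group alerts by service before handing off to log correlation.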
Step 2: Log Correlation
The log-correlation skill reads logs from the last 30 minutes:
# example config ~/.hermes/skills/devops/log-correlation/SKILL.md
services:
  checkout-api:
    logs:
      - source: loki
        query: '{service="checkout-api"} | json'
        window_minutes: 30
Hermes pulls logs, finds a surge of timeout errors from a downstream dependency that started exactly at the deploy time.
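The "started exactly at the deploy time" observation can be sketched as a small bucketing routine. The following Python is an illustrative assumption, not the skill's code: it takes already-fetched (timestamp, message) pairs, finds the first minute with a burst of matching lines, and checks whether that onset falls at or just after the deploy. The function names, the "timeout" needle, and the thresholds are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def error_onset(records, needle="timeout", min_count=3):
    """Return the first minute bucket with at least min_count matching lines, or None.

    records is an iterable of (datetime, message) pairs.
    """
    buckets = Counter()
    for ts, message in records:
        if needle in message.lower():
            buckets[ts.replace(second=0, microsecond=0)] += 1
    hits = sorted(minute for minute, n in buckets.items() if n >= min_count)
    return hits[0] if hits else None


def correlates_with_deploy(onset, deploy_time, tolerance=timedelta(minutes=2)):
    """True if the error onset falls at or shortly after the deploy minute."""
    if onset is None:
        return False
    deploy_minute = deploy_time.replace(second=0, microsecond=0)
    return timedelta(0) <= onset - deploy_minute <= tolerance
```

Minute-level buckets are coarse on purpose: they tolerate clock skew between services while still separating "began with the deploy" from "was already happening".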
Step 3: Deploy History
The deploy-history skill runs:
Last 3 deploys to checkout-api: 14 min ago (commit abc123), 2 days ago (commit def456), 5 days ago. The 14-min-ago deploy is suspect. Diff touches the payment-client timeout configuration.
Hermes has now narrowed the scope. It has the suspect commit, the changed files, and the correlating log signature.
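The narrowing logic amounts to a time-window filter over the deploy list. A minimal sketch, assuming deploys are available as (commit, deployed_at) pairs; the function name and the 30-minute lookback are hypothetical choices, not values taken from the skill:

```python
from datetime import datetime, timedelta


def suspect_deploys(deploys, incident_start, lookback=timedelta(minutes=30)):
    """Return deploys that landed within `lookback` before the incident, newest first.

    deploys is a list of (commit, deployed_at) pairs.
    """
    window_start = incident_start - lookback
    hits = [(commit, at) for commit, at in deploys if window_start <= at <= incident_start]
    return sorted(hits, key=lambda d: d[1], reverse=True)
```

With the deploys from the example above, only the 14-minute-old commit abc123 falls inside the window, so it is the one whose diff gets inspected.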
Step 4: Rollback Proposal
The rollback-proposal skill produces:
Proposed action: revert commit abc123 and redeploy. Safety checks: ensure no DB migration in the diff (confirmed: none), ensure staging is healthy on prior version (confirmed), ensure rollback target matches deploy manifest (confirmed). Command draft: gh workflow run deploy.yml --ref v2026.04.21.
Hermes does not execute this. The skill posts a message to the on-call Slack channel via the Slack messaging gateway, tagging the on-call engineer, with the proposal and a one-click confirmation link.
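The propose-but-do-not-execute pattern can be sketched as a function that only ever returns a proposal record and never runs the command. This is an illustrative assumption about how such a gate might look; SafetyCheck, build_proposal, and the status strings are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SafetyCheck:
    name: str
    passed: bool


def build_proposal(commit: str, command: str, checks: list) -> dict:
    """Build a rollback proposal; it is blocked unless every safety check passed.

    Nothing here executes `command`. Execution waits for operator confirmation.
    """
    failed = [check.name for check in checks if not check.passed]
    return {
        "commit": commit,
        "command": command,
        "status": "blocked" if failed else "awaiting_confirmation",
        "failed_checks": failed,
    }
```

Keeping the command as inert data until a human confirms means a failed safety check blocks the rollback without any extra code path.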
Step 5: Handoff to Claude Code
Once the operator confirms, Hermes delegates the actual repo work to Claude Code using the bundled claude-code subagent skill:
# pseudocode of the subagent call
spawn: claude-code
task: |
  Revert commit abc123 on main, open a PR with title
  "revert: payment-client timeout regression (incident 2026-04-22-01)",
  link the incident memory file, and wait for CI.
Claude Code runs in a sandboxed workspace, does the revert, opens the PR, and reports back. Hermes watches the PR, waits for CI to pass, and then triggers the deploy via the confirmed workflow run.
Throughout all of this, the memory store in ~/.hermes/ accumulates the incident file: who paged, when, what logs were pulled, what decisions were made. That becomes the raw material for the post-incident report via the incident-report skill.
MLOps Chain: Drift Detected, Retrain Proposed
A shorter MLOps example. The drift-detection skill runs hourly on cron:
# ~/.hermes/cron.yaml
- name: drift-hourly
  schedule: "0 * * * *"
  skill: mlops/drift-detection
  args:
    baseline: s3://models/fraud-v12/baseline-features.parquet
    current_window: 1h
When a feature drifts, the skill:
- Logs the drift to the engagement memory.
- If the drift exceeds a configured threshold, opens a Slack thread with the drift plot and proposes running mlops/training-job-triage to check recent training data freshness.
- Optionally schedules a retrain via the model-deploy skill's staging pathway (never production without explicit approval).
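The post does not say which drift statistic the skill uses, so the following is only one plausible choice: a Population Stability Index (PSI) over equal-width bins derived from the baseline. The function name, bin count, and epsilon floor are assumptions. A PSI above roughly 0.25 is a commonly cited rule of thumb for a significant shift, not a Hermes default.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index of `current` against `baseline` for one feature.

    Bin edges come from the baseline range; out-of-range values are clamped
    into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread; cannot build bins")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    expected = proportions(baseline)
    actual = proportions(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))
```

For example, comparing a feature against itself yields 0, while shifting every value upward by half the baseline range pushes the score well past 0.25.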
Why This Stack Is Appealing
The combination that makes this compelling:
- Hermes is always on (daemon), so alerts are not missed.
- The bundled skills are auditable markdown, so you know what the agent will do.
- Claude Sonnet 4.6 is the reasoning engine; it is good at log triage and correlation.
- Claude Code handles the actual code changes, which is what it is best at.
- Memory persists, so the post-incident report starts from a recorded timeline instead of from recollection.
For more on the Claude Code subagent pattern, see spawning Claude Code as a Hermes subagent. For scheduled work specifically, see scheduling Claude agents: Hermes cron for daily reports. For the messaging gateways used in the handoff, see Hermes messaging gateway: Telegram and Discord.
Guardrails Worth Setting
A few defaults I would not run production on-call without:
- Max-turns cap per incident session, to prevent runaway loops.
- Budget cap per day via the Hermes cost-control config.
- Required operator confirmation before any kubectl delete, gh workflow run ... --ref prod, or rollback that touches a paid-tier service.
- Read-only credentials for the log sources wherever possible.
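Two of the guardrails in the list above, the confirmation gate and the max-turns cap, can be sketched in a few lines of Python. This is a hypothetical illustration, not the Hermes cost-control config: CONFIRM_PATTERNS, needs_confirmation, and TurnBudget are invented names, and the regexes only cover the two command examples mentioned in the list.

```python
import re

# Hypothetical patterns for commands that must never run without a human.
CONFIRM_PATTERNS = [
    re.compile(r"^kubectl\s+delete\b"),
    re.compile(r"^gh\s+workflow\s+run\b.*--ref\s+prod\b"),
]


def needs_confirmation(command: str) -> bool:
    """True if the command matches any pattern that requires operator sign-off."""
    stripped = command.strip()
    return any(pattern.search(stripped) for pattern in CONFIRM_PATTERNS)


class TurnBudget:
    """Caps the number of agent turns spent on one incident session."""

    def __init__(self, max_turns: int) -> None:
        self.max_turns = max_turns
        self.used = 0

    def spend(self) -> int:
        """Consume one turn and return how many remain; raise once the cap is hit."""
        if self.used >= self.max_turns:
            raise RuntimeError(f"max turns ({self.max_turns}) reached; escalate to a human")
        self.used += 1
        return self.max_turns - self.used
```

A deny-by-pattern gate like this is easy to audit but also easy to bypass with aliases or wrapper scripts, so it works best alongside the read-only credentials mentioned above.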
See cost control: Hermes max-turns, budget, fallback for the concrete knobs.
Sources
- GitHub: NousResearch/hermes-agent (see skills/devops/ and skills/mlops/)
- Related: Spawning Claude Code as a Hermes subagent
- Related: Scheduling Claude agents: Hermes cron for daily reports
- Related: Cost control: Hermes max-turns, budget, fallback