Deploying Hermes on Daytona: Ephemeral Dev Environments for Claude Agents
Using Hermes's Daytona backend to give every agent task a fresh workspace. Walkthrough, contrast with Docker, and when clean rooms are worth the slower start.
The most awkward failure mode for a coding agent is side-effect leakage. The agent installs a package, edits a config, spawns a background process, and two tasks later that stale state contaminates a completely unrelated run. A pip install that silently bumped a transitive dependency. A leftover environment variable. A PID file nobody cleaned up.
The answer is ephemeral environments. Give every agent task a fresh workspace, let it run to completion, destroy the workspace. Hermes supports this natively through its daytona terminal backend. Daytona is a purpose-built ephemeral-workspace platform; Hermes provisions a new Daytona workspace per task and tears it down when done.
Keep the max_budget_usd and max_turns caps in place: ephemeral environments do not protect you from runaway token spend.

If your agent only reads files and talks to APIs, you can probably get away with a long-lived local or Docker environment. The moment your agent starts modifying things (installing packages, generating code, running tests, writing temporary files), state accumulation becomes a real problem.
Three concrete scenarios that push you toward ephemeral:
Daytona is an open-source platform for ephemeral dev environments. You define a workspace spec (base image, tools, any bootstrap scripts), point a Daytona server at a compute backend (Docker, Kubernetes, cloud), and on demand it spins up isolated workspaces.
The Hermes integration treats Daytona as a terminal backend. When the agent issues a command, Hermes routes that command to the Daytona workspace provisioned for the current task. When the task ends, the workspace is destroyed.
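The lifecycle described above (provision, route commands, destroy) can be illustrated with a small local stand-in. This is only a sketch of the pattern, not the Hermes or Daytona API: a temporary directory plays the role of the workspace, and the names `ephemeral_workspace`, `run_in_workspace`, and `run_task` are hypothetical.

```python
import subprocess
import sys
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace(prefix="task-"):
    """Create a fresh directory for one task and delete it afterwards."""
    with tempfile.TemporaryDirectory(prefix=prefix) as tmp:
        # The directory is removed when the block exits, even on error.
        yield Path(tmp)


def run_in_workspace(workspace, argv):
    """Route one command to the task's workspace (here: a working directory)."""
    return subprocess.run(
        argv, cwd=workspace, capture_output=True, text=True, check=False
    )


def run_task(commands):
    """Run all commands of one task inside a single throwaway workspace."""
    with ephemeral_workspace() as ws:
        outputs = [run_in_workspace(ws, argv).stdout for argv in commands]
        path = ws
    # By this point the workspace directory no longer exists.
    return outputs, path


if __name__ == "__main__":
    write = [sys.executable, "-c", "open('state.txt', 'w').write('x')"]
    listing = [sys.executable, "-c", "import os; print(sorted(os.listdir('.')))"]
    print(run_task([write, listing])[0][1])  # first task sees its own file
    print(run_task([listing])[0][0])         # second task starts empty
```

A directory is far weaker isolation than a real workspace (processes, packages, and the network are all shared with the host), so treat this purely as a picture of the create-use-destroy shape.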
A rough shape (see docs for exact field names):
terminal:
  backend: daytona
  daytona:
    api_url: "https://your-daytona-server.example.com"
    api_token: "${DAYTONA_API_TOKEN}"
    workspace_template: "hermes-default"
    per_task_workspace: true
    auto_destroy: true
    auto_destroy_after_seconds: 3600
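Before handing a config like this to any tool, it can help to sanity-check it. The sketch below works on a plain dict that mirrors the daytona settings above; it is not part of Hermes. The helper names are hypothetical, the `${VAR}` expansion is done with Python's `string.Template` (how Hermes itself resolves placeholders is not covered here), and treating a missing flag as "off" is an assumption.

```python
from string import Template


def resolve_placeholders(value, env):
    """Expand ${VAR} references; an unset variable raises KeyError."""
    return Template(value).substitute(env)


def check_daytona_settings(settings):
    """Return warnings for settings that undermine the clean-room pattern."""
    warnings = []
    # Assumption: a missing flag is treated as disabled.
    if not settings.get("per_task_workspace", False):
        warnings.append("per_task_workspace is off: tasks will share state")
    if not settings.get("auto_destroy", False):
        warnings.append("auto_destroy is off: finished workspaces will linger")
    return warnings


if __name__ == "__main__":
    settings = {
        "api_token": "${DAYTONA_API_TOKEN}",
        "per_task_workspace": True,
        "auto_destroy": False,
    }
    fake_env = {"DAYTONA_API_TOKEN": "example-token"}
    print(resolve_placeholders(settings["api_token"], fake_env))
    for warning in check_daytona_settings(settings):
        print("warning:", warning)
```

Failing loudly on an unset token variable is a deliberate choice here: a silently empty token tends to surface later as a confusing authentication error.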
Key fields:
per_task_workspace: true means each agent task gets its own workspace. The alternative (one shared workspace) is cheaper but defeats the purpose.
workspace_template: a Daytona workspace spec. Pre-bake your common tooling here (Python, Node, ripgrep, git, whatever your skills assume) so tasks do not reinstall from scratch.
auto_destroy: clean up workspaces when tasks complete. Without this, you accumulate orphaned workspaces and pay for compute that is not doing anything.

The canonical use case. Say your agent accepts a Python snippet from an external source and needs to run it to verify behavior.
task:
  name: "verify-snippet"
  prompt: |
    Here is a Python snippet. Run it in a clean environment,
    report on what it does, flag anything that looks suspicious,
    and produce a diff of what files it created or modified.
    [snippet attached]
  terminal:
    backend: daytona
    workspace_template: "sandbox-minimal"
  max_turns: 10
  max_budget_usd: 0.50
Every task gets a fresh sandbox-minimal workspace. The snippet can do whatever it wants — install packages, write files, try to escape — and at the end the workspace is torn down. Your host is untouched. Your other tasks see no residue.
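The prompt above asks for a diff of the files the snippet created or modified. One simple way to get that is to hash every file before and after the run and compare the two snapshots. The sketch below uses only the standard library; `snapshot` and `diff_snapshots` are hypothetical helper names, not something Hermes or Daytona provides.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def diff_snapshots(before, after):
    """Classify paths as created, modified, or deleted between two snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "modified": sorted(
            name for name in set(before) & set(after) if before[name] != after[name]
        ),
        "deleted": sorted(set(before) - set(after)),
    }
```

Typical use is `before = snapshot(workdir)`, run the snippet, then `diff_snapshots(before, snapshot(workdir))`. This only sees files under the directory you snapshot and reads each file fully into memory, so it suits small sandboxes rather than large trees.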
For a deeper safety posture, combine this with Modal's gVisor-backed containers (see hermes-on-modal-serverless-claude-agents) if you need stronger isolation than Daytona provides by default.
The Docker backend is faster and cheaper. Containers start in fractions of a second, image layers cache, and you can reuse a container across tasks if you choose.
The Daytona backend is slower but more opinionated about isolation. Each workspace is a fuller environment with a deterministic lifecycle. You can attach an IDE to it, rehydrate it in a debugger, keep it around for inspection if something goes wrong.
Rule of thumb: Docker for speed and cheap isolation, Daytona for fully managed lifecycle and per-task clean rooms.
For most agent workloads, the Docker backend is fine. Reach for Daytona when:
The ephemeral pattern is especially useful for scheduled work. A daily job that ingests external data, runs transformations, and emits a report should not accumulate state between runs. Wire the scheduler to the Daytona backend and every morning's job starts from a known-clean baseline.
schedules:
  - name: "daily-ingestion"
    cron: "0 2 * * *"
    prompt: "Ingest yesterday's third-party data, validate, write report..."
    terminal:
      backend: daytona
      workspace_template: "ingestion-pipeline"
    max_turns: 30
    max_budget_usd: 1.00
See scheduling-claude-agents-hermes-cron-daily-reports for the full scheduler picture.
The template is where your common tooling lives. Invest in getting this right — every task pays the provisioning cost, so making that workspace actually ready to work saves real time.
A sensible hermes-default template might include:
ripgrep, jq, gh, curl, git.

Avoid baking in secrets. Pass those in per-task via environment variables.
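Passing secrets per task, rather than baking them into the template, might look like the following on the launching side. This is a generic sketch using `subprocess`, not the Hermes mechanism: `build_task_env` and `run_with_secrets` are hypothetical names, and the passthrough list assumes a POSIX-like host.

```python
import os
import subprocess
import sys


def build_task_env(secrets, base=None, passthrough=("PATH", "HOME", "LANG")):
    """Build a minimal environment: a few inherited variables plus task secrets."""
    base = os.environ if base is None else base
    env = {name: base[name] for name in passthrough if name in base}
    env.update(secrets)  # secrets exist only in this task's environment
    return env


def run_with_secrets(argv, secrets):
    """Run one command with the per-task environment instead of the full host one."""
    return subprocess.run(
        argv, env=build_task_env(secrets), capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    show_key = [sys.executable, "-c", "import os; print('API_KEY' in os.environ)"]
    print(run_with_secrets(show_key, {"API_KEY": "example-value"}).stdout.strip())
    print(run_with_secrets(show_key, {}).stdout.strip())
```

Starting from an allowlist instead of copying the whole host environment keeps unrelated credentials from leaking into a task by accident.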
Two anti-patterns.
Ephemeral environments are a discipline, not a feature. The feature (Daytona backend) makes the discipline cheap to follow. Start using it for the things that genuinely should be clean-room — untrusted code, external-data pipelines, experiments — and you will never worry about environment rot for that class of tasks again.