MCP in Hermes vs MCP in Claude Code: Same Servers, Different Runtime
Model Context Protocol is identical in both runtimes. What differs is the host lifecycle. Compare filesystem, GitHub, and Postgres MCP servers side by side.
Model Context Protocol is a transport-and-schema spec. A filesystem MCP server does not know whether the thing talking to it is Claude Code or Hermes Agent. That part is genuinely portable.
What is not portable is the host. Claude Code runs MCP clients per invocation, inside a short-lived CLI process. Hermes runs MCP clients inside a long-lived Python daemon with 47 built-in tools sitting alongside them. The protocol is the same. The lifecycle, connection behavior, and failure modes are different, and those differences matter more than the protocol similarity once you are running anything in production.
Key Takeaways
- MCP servers themselves are portable between Claude Code and Hermes; the same binaries and configs work.
- Claude Code spawns MCP clients per CLI invocation; servers start and stop with each session.
- Hermes keeps MCP clients alive inside its long-running daemon, which means persistent connections and warmer state.
- Long-lived hosts improve connection pooling (Postgres) but require explicit reconnect logic when servers restart.
- Configuration location differs: ~/.claude.json or .mcp.json for Claude Code; ~/.hermes/config.yaml for Hermes.
- Same auth flows (token files, env vars) work in both; Hermes can also read Claude Code's credential store.
- Pick the host based on how long you need the MCP session open, not based on which MCP servers you want to use.
The Same Spec, Two Very Different Hosts
MCP specifies a client-server protocol: the agent host spawns (or connects to) an MCP server, negotiates capabilities, then calls tools or reads resources over stdio, HTTP, or SSE. Neither the server nor the model knows who is hosting.
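Under the hood this is plain JSON-RPC 2.0. A minimal sketch of the two messages every host sends, whether it is Claude Code or Hermes (the clientInfo values and tool arguments below are illustrative, not tied to either host):

```python
import json

# JSON-RPC 2.0 "initialize" request -- the first message an MCP host sends
# after spawning or connecting to a server, to negotiate capabilities.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # spec revision the client speaks
        "capabilities": {},               # client-side capabilities on offer
        "clientInfo": {"name": "example-host", "version": "0.1"},  # illustrative
    },
}

# After initialization, tool invocations are ordinary requests too.
# Tool name and arguments here are made up for the sketch.
call_tool = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "read_file", "arguments": {"path": "README.md"}},
}

# Over the stdio transport, each message travels as one line of JSON.
wire = json.dumps(initialize)
```

The server answering these messages cannot tell, and does not need to know, which host produced them. That is the portable part.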
Claude Code hosts MCP in the CLI. When you run claude, the CLI reads your MCP config, spawns each enabled server as a subprocess (or opens the configured HTTP connection), and tears everything down when the session ends. Next invocation, everything is cold again.
Hermes hosts MCP in the hermes-agent daemon. That daemon may run for days on a VPS. MCP servers launched by Hermes persist across chat sessions, cron runs, incoming messages from the Telegram gateway, and ACP calls from VS Code. One server instance can service many interactions.
This is the split that shapes everything else.
Filesystem MCP Server: Identical Config, Different Lifecycle
The reference filesystem MCP server from the modelcontextprotocol project works unchanged in either host. The configs are structurally the same.
Claude Code, in ~/.claude.json or project-level .mcp.json:
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-filesystem",
"/Users/me/projects"
]
}
}
}
Hermes, in ~/.hermes/config.yaml:
mcp_servers:
filesystem:
command: npx
args:
- "-y"
- "@modelcontextprotocol/server-filesystem"
- "/home/ubuntu/projects"
Same binary. Same args. Different host behavior:
- In Claude Code, the filesystem server starts when you run claude and exits when you quit. Directory caches inside the server are always cold.
- In Hermes, the filesystem server starts when the daemon starts and stays up. If it caches directory listings internally, those caches warm over time.
For a filesystem server this is mostly fine either way. It matters more for the next two.
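The lifecycle split can be sketched with plain subprocesses. Here `cat` stands in for a stdio MCP server (it likewise echoes newline-framed messages over stdin/stdout); a real host would run the npx command from the configs above:

```python
import subprocess

def spawn_server():
    # Stand-in for spawning a stdio MCP server; a real host would launch
    # `npx -y @modelcontextprotocol/server-filesystem <root>` here.
    return subprocess.Popen(
        ["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )

# Claude Code pattern: one server per invocation, torn down at session end.
proc = spawn_server()
proc.stdin.write('{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n')
proc.stdin.flush()
reply = proc.stdout.readline()  # cold process serves exactly one session
proc.terminate()
proc.wait()

# Hermes pattern: the daemon spawns once and keeps the handle for days,
# routing every chat session, cron run, and webhook through the same process.
daemon_server = spawn_server()
for i in range(3):  # three "sessions" share one warm process
    daemon_server.stdin.write(f'{{"id": {i}}}\n')
    daemon_server.stdin.flush()
    daemon_server.stdout.readline()
daemon_server.terminate()
daemon_server.wait()
```

Any state the server builds up (caches, open handles) dies with the process in the first pattern and accumulates in the second.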
GitHub MCP Server: Token Reuse and Rate Limit Headroom
GitHub's MCP server authenticates with a personal access token. In Claude Code, that means re-authenticating (or at least re-reading the token from env) every session. In Hermes, the daemon reads the token once at startup and the server holds a warm HTTP connection to the GitHub API.
Hermes config:
mcp_servers:
github:
command: docker
args:
- run
- "-i"
- "--rm"
- "-e"
- "GITHUB_PERSONAL_ACCESS_TOKEN"
- "ghcr.io/github/github-mcp-server"
env:
GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_TOKEN}
The implication for rate limits is not huge (GitHub tokens share a rate limit regardless of connection count), but pooled HTTPS connections and avoided cold-starts do shave latency off every call. This is more visible when a cron job fires every 10 minutes and the GitHub server has been running the whole time versus being spawned fresh for each claude -p invocation.
Postgres MCP Server: Connection Pooling Actually Matters Here
Postgres is where the runtime difference shows up most. Database connections are expensive to establish, and Postgres servers have real connection limits. A Claude Code session that opens a new connection on every invocation is wasteful. A Hermes daemon can keep a single connection (or small pool) open for the lifetime of the daemon.
Hermes config for a Postgres MCP server:
mcp_servers:
postgres:
command: npx
args:
- "-y"
- "@modelcontextprotocol/server-postgres"
- "postgresql://app:pass@db.internal:5432/production"
If you are running a Hermes agent that answers Slack questions about production data, the Postgres MCP server stays warm and one connection serves every question. In Claude Code, every claude -p "what was yesterday's signup count" opens a new connection. Not a deal-breaker, but noticeable at scale.
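The cost difference is easy to model. A toy sketch that counts connections only, with no real Postgres driver (names here are hypothetical):

```python
class CountingDB:
    """Stand-in for a Postgres server; counts how many connections it accepts."""
    def __init__(self):
        self.connections_opened = 0

    def connect(self):
        self.connections_opened += 1
        return object()  # opaque connection handle; queries elided

# Claude Code pattern: every `claude -p` invocation connects fresh.
db = CountingDB()
for _ in range(10):
    conn = db.connect()
    # ... run one query, process exits, connection is dropped ...
per_invocation_cost = db.connections_opened  # 10 connections for 10 questions

# Hermes pattern: the daemon connects once and reuses the handle.
db2 = CountingDB()
pooled = db2.connect()
for _ in range(10):
    pass  # ... each question reuses `pooled` ...
daemon_cost = db2.connections_opened  # 1 connection for 10 questions
```

Each extra connection also counts against the Postgres max_connections limit, so at scale the per-invocation pattern costs server capacity, not just latency.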
Error Recovery and the Long-Lived-Daemon Problem
The trade-off for warm MCP connections is that Hermes has to think about reconnection. If the Postgres server restarts, the cached MCP connection inside the Hermes daemon is dead until Hermes reconnects. Hermes handles this with a supervisor that restarts MCP subprocesses on exit, but HTTP-based MCP servers need the daemon to detect broken connections and reconnect.
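A minimal sketch of that supervisor loop, assuming a spawn() callable that starts the MCP subprocess. The real Hermes supervisor is internal to the daemon; everything named here is illustrative:

```python
import time

def supervise(spawn, max_restarts=5, backoff=1.0):
    """Restart an MCP server subprocess whenever it exits abnormally.

    `spawn` returns an object with .wait() -> exit code, mimicking
    subprocess.Popen. Illustrative sketch only; a real supervisor would
    also need active health checks for HTTP-based MCP servers, where
    there is no child process to wait on.
    """
    restarts = 0
    while restarts < max_restarts:
        proc = spawn()
        code = proc.wait()      # blocks until the server dies
        if code == 0:
            return restarts     # clean shutdown: stop supervising
        restarts += 1
        time.sleep(backoff * restarts)  # linear backoff between restarts
    return restarts             # gave up after too many crashes

class FakeProc:
    """Test double standing in for a subprocess handle."""
    def __init__(self, code):
        self._code = code
    def wait(self):
        return self._code

# Simulated server: crashes twice, then shuts down cleanly.
exit_codes = iter([1, 1, 0])
restarts = supervise(lambda: FakeProc(next(exit_codes)), backoff=0)
```

The same loop structure applies to HTTP servers if you replace .wait() with a periodic health probe that treats a failed probe as a nonzero exit.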
Claude Code does not have this problem because every session starts fresh. It also does not get the benefits.
Practical implication: when you add an MCP server to Hermes, check the daemon logs after a week of running. If you see repeated reconnect messages, configure more aggressive health checks. If a specific server flakes under long-lived hosting, run it behind a supervisor (Docker with --restart=always) rather than letting Hermes spawn it directly.
Which Runtime for Which MCP Workload
A rough decision rule based on how you use the server:
- One-shot, interactive use (code review, quick query, debugging): Claude Code. The cold-start cost does not matter and you get the tight CLI feedback loop.
- Always-on background work (cron jobs, Slack bot responses, incoming webhooks): Hermes. The daemon lifecycle matches the workload lifecycle.
- Both: run MCP servers that support multiple clients (HTTP-based servers, not stdio) and point both hosts at the same server.
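For the last case, both hosts point at the same HTTP endpoint. A hedged sketch with a made-up URL; the Claude Code fields follow its documented remote-server shape, and the Hermes url field is an assumption about its config schema:

Claude Code, in .mcp.json:

```
{
  "mcpServers": {
    "shared-db": {
      "type": "http",
      "url": "https://mcp.internal:8443/mcp"
    }
  }
}
```

Hermes, in ~/.hermes/config.yaml:

```
mcp_servers:
  shared-db:
    url: https://mcp.internal:8443/mcp
```

One server process, two hosts, and the server's internal state (caches, pooled connections) is shared across both.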
For more on how Hermes talks to Claude specifically, see running Hermes on Claude Sonnet 4.6. For the broader comparison of where each runtime fits, see Hermes vs Claude Code: when to use which. For MCP over editor transports, see running Hermes in VS Code and Zed via the ACP adapter.
Sources
- GitHub: NousResearch/hermes-agent
- Hermes docs: hermes-agent.nousresearch.com/docs/
- Model Context Protocol spec: modelcontextprotocol.io
- Related: Hermes vs Claude Code: when to use which
- Related: Running Hermes on Claude Sonnet 4.6
- Related: Running Hermes in VS Code and Zed via the ACP adapter