# Intelligent Model Router
Intelligent model routing for sub-agent task delegation. Choose the optimal model based on task complexity, cost, and capability requirements. Reduces costs...
CORE SKILL: This skill is infrastructure, not guidance. Installation = enforcement. Run `bash skills/intelligent-router/install.sh` to activate.
Automatically classifies any task into a tier (SIMPLE/MEDIUM/COMPLEX/REASONING/CRITICAL) and recommends the cheapest model that can handle it well.
The problem it solves: Without routing, every cron job and sub-agent defaults to Sonnet (expensive). With routing, monitoring tasks use free local models, saving 80-95% on cost.
```shell
# Classify a task into a tier
python3 skills/intelligent-router/scripts/router.py classify "task description"

# Output the exact model ID and payload snippet to use
python3 skills/intelligent-router/scripts/spawn_helper.py "task description"

# Validate that a cron payload has a model set
python3 skills/intelligent-router/scripts/spawn_helper.py --validate '{"kind":"agentTurn","message":"..."}'
```
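The `--validate` check can be approximated in plain Python. This is an illustrative sketch of the documented rules (an explicit `model` is required, and blocked Ollama prefixes are rejected), not the actual `spawn_helper.py` code; `BLOCKED_PREFIXES` is a hypothetical stand-in for the real blocklist:

```python
import json

# Sketch of the documented validation rules, not the real spawn_helper.py.
# BLOCKED_PREFIXES mirrors the blocklist described later in this README.
BLOCKED_PREFIXES = ("ollama-gpu-server/", "ollama/")

def validate_payload(raw: str) -> list[str]:
    """Return a list of violations; an empty list means the payload passes."""
    problems = []
    payload = json.loads(raw)
    model = payload.get("model")
    if not model:
        problems.append("missing 'model' (would silently default to Sonnet)")
    elif model.startswith(BLOCKED_PREFIXES):
        problems.append(f"blocked model '{model}'")
    return problems

print(validate_payload('{"kind":"agentTurn","message":"check server..."}'))
# → ["missing 'model' (would silently default to Sonnet)"]
```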
```
# Cron job without model = Sonnet default = expensive waste
{"kind": "agentTurn", "message": "check server..."}   # ← WRONG

# Always specify the model from the router recommendation
{"kind": "agentTurn", "message": "check server...", "model": "anthropic-proxy-6/glm-4.7"}
```
| Tier | Use For | Primary Model | Cost |
|---|---|---|---|
| 🟢 SIMPLE | Monitoring, heartbeat, checks, summaries | anthropic-proxy-6/glm-4.7 (alt: anthropic-proxy-4/glm-4.7) | $0.50/M |
| 🟡 MEDIUM | Code fixes, patches, research, data analysis | | $0.40/M |
| 🟠 COMPLEX | Features, architecture, multi-file, debug | | $3/M |
| 🔵 REASONING | Proofs, formal logic, deep analysis | | $1/M |
| 🔴 CRITICAL | Security, production, high-stakes | | $5/M |
SIMPLE fallback chain:
anthropic-proxy-4/glm-4.7 → nvidia-nim/qwen/qwen2.5-7b-instruct ($0.15/M)
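Resolution over a fallback chain can be sketched as a first-available walk. The SIMPLE chain below is assembled from the primary and fallback models named in this document; `is_available` is a hypothetical availability check, not part of the router's real API:

```python
# Hypothetical sketch of tier -> fallback-chain resolution. The SIMPLE chain
# is taken from this README; other chains and is_available() are placeholders.
FALLBACK_CHAINS = {
    "SIMPLE": [
        "anthropic-proxy-6/glm-4.7",
        "anthropic-proxy-4/glm-4.7",
        "nvidia-nim/qwen/qwen2.5-7b-instruct",
    ],
}

def pick_model(tier: str, is_available) -> str:
    """Walk the tier's chain and return the first model that responds."""
    for model in FALLBACK_CHAINS.get(tier, []):
        if is_available(model):
            return model
    raise RuntimeError(f"no available model for tier {tier}")

# If the primary proxy key is down, the alternate key is used:
print(pick_model("SIMPLE", lambda m: m != "anthropic-proxy-6/glm-4.7"))
# → anthropic-proxy-4/glm-4.7
```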
⚠️ `ollama-gpu-server` is BLOCKED for cron/spawn use. Ollama binds to `127.0.0.1` by default — unreachable over LAN from the OpenClaw host. The `router_policy.py` enforcer will reject any payload referencing it.
Tier classification uses 4 capability signals (not cost alone):
- `effective_params` (50%) — extracted from the model ID, or from `known-model-params.json` for closed-source models
- `context_window` (20%) — larger = more capable
- `cost_input` (20%) — price as a quality proxy (weak signal, last resort for unknown sizes)
- `reasoning_flag` (10%) — bonus for dedicated thinking specialists (R1, QwQ, Kimi-K2)

`router_policy.py` catches bad model assignments before they are created, not after they fail.
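The four signals can be combined as a simple weighted sum. A minimal sketch: the weights come from the text above, but the normalization curves (saturation points and scaling) are assumptions, not `tier_classifier.py`'s actual math:

```python
# Illustrative weighted capability score from the four documented signals.
# Weights are from the README; normalization choices are invented here.
def capability_score(effective_params_b: float, context_window: int,
                     cost_input: float, reasoning_flag: bool) -> float:
    params_signal = min(effective_params_b / 100.0, 1.0)   # saturate at ~100B params
    context_signal = min(context_window / 200_000, 1.0)    # saturate at 200k tokens
    cost_signal = min(cost_input / 5.0, 1.0)               # $5/M ≈ top of the tier table
    return (0.5 * params_signal
            + 0.2 * context_signal
            + 0.2 * cost_signal
            + 0.1 * (1.0 if reasoning_flag else 0.0))

# A 7B, 32k-context, $0.15/M non-reasoning model scores far below
# a 120B, 200k-context, $3/M thinking model:
print(round(capability_score(7, 32_000, 0.15, False), 3))   # → 0.073
print(round(capability_score(120, 200_000, 3.0, True), 3))  # → 0.92
```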
```shell
python3 skills/intelligent-router/scripts/router_policy.py check \
  '{"kind":"agentTurn","model":"ollama-gpu-server/glm-4.7-flash","message":"check server"}'
# Output: VIOLATION: Blocked model 'ollama-gpu-server/glm-4.7-flash'. Recommended: anthropic-proxy-6/glm-4.7
```
```shell
python3 skills/intelligent-router/scripts/router_policy.py recommend "monitor alphastrike service"
# Output: Tier: SIMPLE  Model: anthropic-proxy-6/glm-4.7

python3 skills/intelligent-router/scripts/router_policy.py recommend "monitor alphastrike service" --alt
# Output: Tier: SIMPLE  Model: anthropic-proxy-4/glm-4.7  ← alternate key for load distribution
```
```shell
# Scans all crons, reports any with blocked or missing models
python3 skills/intelligent-router/scripts/router_policy.py audit

# Lists the blocked model patterns
python3 skills/intelligent-router/scripts/router_policy.py blocklist
```
`ollama-gpu-server/*` and bare `ollama/*` are rejected for cron use.

Run once to self-integrate into AGENTS.md:

```shell
bash skills/intelligent-router/install.sh
```

This patches AGENTS.md with the mandatory protocol so it is always in context.
```shell
# ── Policy enforcer (run before creating any cron/spawn) ──
python3 skills/intelligent-router/scripts/router_policy.py check '{"kind":"agentTurn","model":"...","message":"..."}'
python3 skills/intelligent-router/scripts/router_policy.py recommend "task description"
python3 skills/intelligent-router/scripts/router_policy.py recommend "task" --alt   # alternate proxy key
python3 skills/intelligent-router/scripts/router_policy.py audit                    # scan all crons
python3 skills/intelligent-router/scripts/router_policy.py blocklist

# ── Core router ──
# Classify + recommend model
python3 skills/intelligent-router/scripts/router.py classify "task"

# Get model ID only (for scripting)
python3 skills/intelligent-router/scripts/spawn_helper.py --model-only "task"

# Show spawn command
python3 skills/intelligent-router/scripts/spawn_helper.py "task"

# Validate that a cron payload has a model set
python3 skills/intelligent-router/scripts/spawn_helper.py --validate '{"kind":"agentTurn","message":"..."}'

# List all models by tier
python3 skills/intelligent-router/scripts/router.py models

# Detailed scoring breakdown
python3 skills/intelligent-router/scripts/router.py score "task"

# Config health check
python3 skills/intelligent-router/scripts/router.py health

# Auto-discover working models (NEW)
python3 skills/intelligent-router/scripts/discover_models.py

# Auto-discover + update config
python3 skills/intelligent-router/scripts/discover_models.py --auto-update

# Test a specific tier only
python3 skills/intelligent-router/scripts/discover_models.py --tier COMPLEX
```
Classification uses 15-dimension weighted scoring (not just keyword matching). Confidence is computed as a logistic function of the score:

```
confidence = 1 / (1 + exp(-8 × (score - 0.5)))
```
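The confidence curve is a standard logistic squash centered at 0.5 with steepness 8, so mid-range scores map to uncertain confidence while extreme scores saturate toward 0 or 1. It can be reproduced directly:

```python
import math

# The documented confidence formula: a logistic squash centered at 0.5
# with steepness 8.
def confidence(score: float) -> float:
    return 1.0 / (1.0 + math.exp(-8.0 * (score - 0.5)))

print(round(confidence(0.5), 2))   # → 0.5  (maximally ambiguous)
print(round(confidence(0.9), 2))   # → 0.96 (clear classification)
```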
Models are defined in `config.json`. Add new models there and the router picks them up automatically.
Local Ollama models have zero cost; prefer them for interactive SIMPLE tasks. (They are still blocked for cron/spawn use, where the OpenClaw host cannot reach them.)
The intelligent-router can automatically discover working models from all configured providers via real live inference tests (not config-existence checks).
How discovery works:

- Reads `~/.openclaw/openclaw.json` → finds all configured models
- Sends "hi" to each model and checks that it actually responds (catches auth failures, quota exhaustion, 404s, timeouts)
- `sk-ant-oat01-*` tokens (Anthropic OAuth) are skipped in raw HTTP — OpenClaw refreshes these transparently, so they are always marked available
- Thinking models that return `content=None` plus `reasoning_content` (GLM-4.7, Kimi-K2, Qwen3-thinking) are correctly detected as available
- Tiers are assigned by `tier_classifier.py` using the 4 capability signals
- A refresh cron (`a8992c1f`) keeps the model list current and alerts if availability changes by >2

```shell
# One-time discovery
python3 skills/intelligent-router/scripts/discover_models.py
```
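The live test can be sketched as a minimal HTTP probe. Everything below is a hedged approximation: the `/chat/completions` path, payload shape, and error handling are assumptions about an OpenAI-compatible endpoint, not `discover_models.py`'s actual code:

```python
import json
import urllib.error
import urllib.request

# Hedged sketch of a live-inference probe: send a one-token "hi" and treat
# a usable reply as "available". Endpoint shape is an assumption.

def response_available(body: dict) -> bool:
    """Thinking models may return content=None with reasoning_content set;
    either field being present counts as a successful response."""
    msg = body["choices"][0]["message"]
    return msg.get("content") is not None or msg.get("reasoning_content") is not None

def probe(base_url: str, model: str, api_key: str, timeout: float = 10.0) -> bool:
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": "hi"}],
            "max_tokens": 1,
        }).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return response_available(json.load(resp))
    except (urllib.error.URLError, TimeoutError, KeyError,
            json.JSONDecodeError, OSError):
        # auth failures, quota exhaustion, 404s, timeouts → unavailable
        return False
```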
```shell
# Auto-update config with working models only
python3 skills/intelligent-router/scripts/discover_models.py --auto-update

# Set up the hourly refresh cron (3600000 ms = 1 hour)
openclaw cron add --job '{
  "name": "Model Discovery Refresh",
  "schedule": {"kind": "every", "everyMs": 3600000},
  "payload": {
    "kind": "systemEvent",
    "text": "Run: bash skills/intelligent-router/scripts/auto_refresh_models.sh",
    "model": "anthropic-proxy-6/glm-4.7"
  }
}'
```
- ✅ Self-healing: automatically removes broken models (e.g., expired OAuth)
- ✅ Zero maintenance: no manual model list updates
- ✅ New models: auto-adds newly released models
- ✅ Cost optimization: always uses the cheapest working model per tier
Results are saved to `skills/intelligent-router/discovered-models.json`:

```json
{
  "scan_timestamp": "2026-02-19T21:00:00",
  "total_models": 25,
  "available_models": 23,
  "unavailable_models": 2,
  "providers": {
    "anthropic": {"available": 2, "unavailable": 0, "models": [...]}
  }
}
```
To preserve a model even if it fails discovery, mark it as pinned; pinned entries are never removed during auto-update:

```json
{
  "id": "special-model",
  "tier": "COMPLEX",
  "pinned": true
}
```
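One way `--auto-update` could honor the flag, as a sketch: field names follow the example above, and `prune_models` is a hypothetical helper, not the script's real function:

```python
# Hypothetical pruning step: drop models that failed discovery unless pinned.
def prune_models(models: list[dict], available_ids: set[str]) -> list[dict]:
    return [m for m in models
            if m["id"] in available_ids or m.get("pinned", False)]

models = [
    {"id": "special-model", "tier": "COMPLEX", "pinned": True},
    {"id": "anthropic-proxy-6/glm-4.7", "tier": "SIMPLE"},
    {"id": "dead-model", "tier": "MEDIUM"},
]
kept = prune_models(models, {"anthropic-proxy-6/glm-4.7"})
print([m["id"] for m in kept])
# → ['special-model', 'anthropic-proxy-6/glm-4.7']
```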
The current router is reactive, not proactive. Needed improvements:
- A `router.get_best_available(n_concurrent=2)` API