17 Search Engines in One Skill: How Multi Search Engine Works
A deep dive into the multi-search-engine ClawHub skill. Learn how it queries 17 search engines simultaneously, aggregates results, and why multi-source search dramatically improves AI research quality.
When your AI agent searches the web, it searches Google. Maybe Bing. That's one or two lenses on the world's information, and everything outside those lenses is a blind spot.
Different search engines index different content, use different ranking algorithms, and surface different results for the same query. Technical documentation appears prominently on DuckDuckGo. Academic content ranks differently on Semantic Scholar. Real-time information indexes differently on Brave. Developer content surfaces differently on You.com.
The multi-search-engine skill queries 17 search engines simultaneously and synthesizes the results. This tutorial explains how it works, how to install and configure it, and when multi-source search makes a meaningful difference.
Why Multiple Search Engines?
The case for multi-source search is concrete, not theoretical.
For any given query, Google returns results optimized for commercial intent, recency, and domain authority. That's useful but narrow. Here's what you get from other sources:
| Engine | Strength |
|---|---|
| Google | General web, commercial, news |
| Bing | Alternative general web, Microsoft content |
| DuckDuckGo | Privacy-focused, technical documentation |
| Brave Search | Independent index, privacy-respecting |
| You.com | Developer-focused, technical content |
| Semantic Scholar | Academic papers, citations |
| arXiv Search | Preprints, cutting-edge research |
| Hacker News | Developer discussion, tech community |
| Reddit Search | Community knowledge, practical experience |
| GitHub Search | Code examples, open-source projects |
| Stack Overflow | Q&A, code solutions |
| Wikipedia | Encyclopedic, structured knowledge |
| Wolfram Alpha | Factual, mathematical, scientific data |
| Perplexity | AI-synthesized answers |
| Kagi | Premium independent web index |
| Ecosia | European results, different ranking |
| Startpage | Google results with privacy layer |
For a query like "best approach to handling JWT refresh tokens in Next.js", Google surfaces blog posts and tutorials. GitHub Search surfaces real implementations. Stack Overflow surfaces the battle-tested answers from developers who hit the same problem. Hacker News surfaces the discussions about trade-offs. You need all of them.
Installation
clawhub install multi-search-engine
Verify installation:
clawhub inspect multi-search-engine
# multi-search-engine v3.1.0
# Type: command
# Invocation: /search
Basic Usage
/search JWT refresh token best practices Next.js 2026
The skill queries all configured search engines in parallel, deduplicates results, ranks by relevance, and returns a synthesized summary with source attribution.
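The parallel fan-out step can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: `query_engine` is a hypothetical stand-in for the per-engine adapters, which aren't public.

```python
import concurrent.futures

# Hypothetical per-engine query function -- a real implementation would
# call each engine's API and parse its result format.
def query_engine(engine: str, query: str) -> list[dict]:
    return [{"url": f"https://example.com/{engine}", "engine": engine}]

def fan_out(engines: list[str], query: str) -> list[dict]:
    """Query every engine in parallel and collect all results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(query_engine, e, query) for e in engines]
        results = []
        for f in concurrent.futures.as_completed(futures):
            results.extend(f.result())
    return results

results = fan_out(["google", "bing", "duckduckgo"], "JWT refresh tokens")
print(len(results))  # one placeholder result per engine -> 3
```

Because the slowest engine bounds the total wall-clock time, adding more engines barely increases latency, which is what makes the 17-engine fan-out practical.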
What You Get Back
Query: "JWT refresh token best practices Next.js 2026"
Sources queried: 17 | Results collected: 94 | Unique results: 67 | Synthesis time: 3.2s
## Summary
Based on 67 unique results from 17 sources:
The current consensus (2026) favors **short-lived access tokens (15 min) + long-lived refresh tokens (30 days)** stored in HttpOnly cookies. Key points:
1. Never store refresh tokens in localStorage — XSS vulnerability
2. Implement refresh token rotation — each use issues a new refresh token
3. In Next.js, handle token refresh in middleware (middleware.ts) to catch all routes
4. Use the `jose` library for JWT operations — it's edge-compatible
**Diverging opinions:**
- Redis for refresh token storage vs database: Redis wins for scale, DB is simpler for small apps
- Refresh token expiry length: security-focused sources say 7 days, UX-focused say 30 days
## Top Sources
1. [Next.js Auth Best Practices 2026](https://...) — Stack Overflow (2.4k upvotes)
2. [JWT Security in 2026](https://...) — GitHub (auth library README)
3. [The Refresh Token Problem](https://...) — Hacker News (342 points)
4. [IETF OAuth 2.0 Security Best Practices](https://...) — Semantic Scholar
...
This is research that would take 45 minutes to do manually, delivered in 3 seconds.
Configuration
Select Your Engines
Not all 17 engines are right for every project. Configure the active set:
// .claude/multi-search.config.json
{
"engines": {
"google": { "enabled": true, "weight": 1.0 },
"bing": { "enabled": true, "weight": 0.8 },
"duckduckgo": { "enabled": true, "weight": 0.9 },
"brave": { "enabled": true, "weight": 0.9 },
"you_com": { "enabled": true, "weight": 0.8 },
"semantic_scholar": { "enabled": false, "weight": 1.0 },
"arxiv": { "enabled": false, "weight": 1.0 },
"hackernews": { "enabled": true, "weight": 0.7 },
"reddit": { "enabled": true, "weight": 0.6 },
"github": { "enabled": true, "weight": 0.9 },
"stackoverflow": { "enabled": true, "weight": 1.0 },
"wikipedia": { "enabled": true, "weight": 0.8 },
"wolframalpha": { "enabled": false, "weight": 1.0 },
"perplexity": { "enabled": false, "weight": 0.9 },
"kagi": { "enabled": false, "weight": 1.0 },
"ecosia": { "enabled": false, "weight": 0.5 },
"startpage": { "enabled": false, "weight": 0.7 }
}
}
`weight` affects result ranking in the final synthesis. A weight of 1.0 means an engine's results count at full value; 0.5 means they count at half value relative to other sources.
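The weighting can be pictured as a simple multiplier on each result's raw relevance. The formula below is an illustration of the documented semantics (1.0 = full value, 0.5 = half value), not the skill's actual scoring code.

```python
# Engine weights as in the config example; the multiplier semantics are
# from the docs, but the exact scoring formula is an assumption.
ENGINE_WEIGHTS = {"google": 1.0, "ecosia": 0.5}

def weighted_score(raw_relevance: float, engine: str) -> float:
    """Scale a result's raw relevance by its engine's configured weight."""
    return raw_relevance * ENGINE_WEIGHTS.get(engine, 1.0)

print(weighted_score(0.8, "google"))  # 0.8 -- full value
print(weighted_score(0.8, "ecosia"))  # 0.4 -- half value vs. google
```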
Create Search Profiles
Different tasks need different engine sets. Create profiles:
{
"profiles": {
"development": {
"engines": ["stackoverflow", "github", "you_com", "hackernews", "duckduckgo"],
"description": "Technical development research"
},
"research": {
"engines": ["semantic_scholar", "arxiv", "google", "wikipedia", "perplexity"],
"description": "Academic and scientific research"
},
"competitive": {
"engines": ["google", "bing", "brave", "kagi", "duckduckgo"],
"description": "Competitive analysis and market research"
}
},
"default_profile": "development"
}
Use a profile in your search:
/search --profile research "transformer architecture attention mechanism improvements 2026"
/search --profile competitive "claude code competitors pricing features"
Result Count and Depth
{
"results_per_engine": 5,
"max_total_results": 50,
"synthesis_depth": "standard"
}
`synthesis_depth` options:
- `"quick"`: top results only, fast summary
- `"standard"`: balanced depth and speed (default)
- `"deep"`: all results, full analysis, slower
For time-sensitive queries: quick. For research that feeds into a major decision: deep.
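One plausible reading of how the two caps interact: first trim each engine's list to `results_per_engine`, then trim the merged list to `max_total_results`. This is an assumption about ordering, shown here only to make the settings concrete.

```python
def cap_results(per_engine: dict[str, list[str]],
                results_per_engine: int = 5,
                max_total_results: int = 50) -> list[str]:
    """Cap each engine's results, then cap the merged total (assumed order)."""
    merged = []
    for engine, items in per_engine.items():
        merged.extend(items[:results_per_engine])
    return merged[:max_total_results]

raw = {"google": [f"g{i}" for i in range(10)],
       "bing": [f"b{i}" for i in range(10)]}
capped = cap_results(raw, results_per_engine=3, max_total_results=5)
print(capped)  # ['g0', 'g1', 'g2', 'b0', 'b1']
```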
Advanced Usage
Domain-Filtered Search
Restrict results to trusted domains:
/search "PostgreSQL connection pooling" --domains "postgresql.org,supabase.com,pganalyze.com"
Recency Filter
For rapidly evolving topics:
/search "Next.js 15 performance improvements" --since "2026-01-01"
Format for Code Generation
Get results optimized for feeding into code generation:
/search "Zod schema validation patterns" --output code-context
Instead of a prose summary, this returns a structured list of patterns and examples extracted from results — ready to paste into a code generation prompt.
Search + Save to Memory
Combine with the elite-longterm-memory skill:
/search "Supabase Row Level Security patterns" --save-to-memory
The synthesis is automatically stored as a memory entry. Future sessions load it without re-searching.
Understanding Result Deduplication
When 17 engines all return the same Stack Overflow answer, you don't want it listed 17 times. The skill uses URL normalization + content fingerprinting to deduplicate:
- URL normalization — Strips tracking parameters, normalizes trailing slashes
- Content fingerprinting — Detects near-duplicate content even from different URLs
- Source attribution — When a result appears from multiple engines, all source engines are noted
A result that appears in 12 out of 17 engine results gets a high consensus score. That's surfaced prominently in the synthesis.
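The normalization-plus-consensus idea can be sketched like this. The tracking-parameter list and the exact fingerprinting scheme are assumptions (real content fingerprinting would hash page text, not just URLs); the sketch only shows how duplicates collapse and how consensus is counted.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode
from collections import defaultdict

# Assumed tracking parameters; the skill's real strip list is unknown.
TRACKING = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}

def normalize_url(url: str) -> str:
    """Strip tracking parameters and normalize the trailing slash."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ""))

def consensus(results: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Map each normalized URL to the set of engines that returned it."""
    seen = defaultdict(set)
    for engine, url in results:
        seen[normalize_url(url)].add(engine)
    return seen

hits = [
    ("google", "https://stackoverflow.com/q/123?utm_source=x"),
    ("bing", "https://stackoverflow.com/q/123/"),
]
print(consensus(hits))  # both URLs collapse to one key, credited to both engines
```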
Result Quality Signals
The synthesis engine scores each result on:
- Consensus — How many engines returned this result
- Source authority — Domain authority and topic relevance
- Recency — Publication/update date
- Content depth — Length and specificity of the content
- Engagement signals — Votes, shares, citations (where available)
These signals combine into a final relevance score. You see the top results — not the raw dump.
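A weighted sum is the simplest way these signals could combine. The weights below are invented for illustration; the skill's actual formula is not documented here.

```python
def relevance(consensus_count: int, total_engines: int, authority: float,
              recency: float, depth: float, engagement: float) -> float:
    """Combine normalized quality signals into one score (illustrative weights)."""
    signals = {
        "consensus": consensus_count / total_engines,  # fraction of engines agreeing
        "authority": authority,
        "recency": recency,
        "depth": depth,
        "engagement": engagement,
    }
    weights = {"consensus": 0.35, "authority": 0.25, "recency": 0.15,
               "depth": 0.15, "engagement": 0.10}
    return sum(weights[k] * signals[k] for k in weights)

# A result returned by 12 of 17 engines, from a strong domain:
score = relevance(consensus_count=12, total_engines=17, authority=0.9,
                  recency=0.6, depth=0.7, engagement=0.8)
print(round(score, 3))  # roughly 0.747
```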
When Multi-Search Makes the Biggest Difference
The skill has the highest ROI for these query types:
Emerging topics — When something is new, Google may not have indexed the best content yet. Other engines pick up different early adopter content.
Technical decisions — Stack Overflow, GitHub, and Hacker News provide practitioner knowledge that doesn't rank well on general search.
Controversial topics — Different engines reflect different community perspectives. For "Should I use X or Y?", you want the full range of opinion.
Current events in tech — Hacker News and Reddit surface real-time developer community reactions that take days to appear in standard search rankings.
Security research — Academic sources (Semantic Scholar, arXiv) surface peer-reviewed security research that doesn't rank on general web search.
Performance
Querying 17 engines sounds slow. It isn't, because the requests run in parallel:
- Median synthesis time: 2.8 seconds
- P95 synthesis time: 5.4 seconds
- Single-engine baseline: ~1.2 seconds
You get 17x the information for roughly 2-3x the latency. For any research task that would otherwise take 10+ minutes manually, that's an overwhelming win.
Next Steps
- Install multi-search-engine from the marketplace
- Combine with elite-longterm-memory to cache research results
- Read the proactive agent guide to trigger searches automatically based on context