AnyCrawl-API
Perform high-performance web scraping, crawling, and Google search with multi-engine support and structured data extraction via AnyCrawl API.
AnyCrawl API integration for OpenClaw - Scrape, Crawl, and Search web content with high-performance multi-threaded crawling.
Set your API key as an environment variable:

```bash
export ANYCRAWL_API_KEY="your-api-key"
```

Make it permanent by adding it to ~/.bashrc or ~/.zshrc:

```bash
echo 'export ANYCRAWL_API_KEY="your-api-key"' >> ~/.bashrc
source ~/.bashrc
```

Get your API key at: https://anycrawl.dev

Alternatively, set the key via the OpenClaw config:

```bash
openclaw config.patch --set ANYCRAWL_API_KEY="your-api-key"
```
anycrawl_scrape: Scrape a single URL and convert it to LLM-ready structured data.
Parameters:
- url (string, required): URL to scrape
- engine (string, optional): Scraping engine - "cheerio" (default), "playwright", "puppeteer"
- formats (array, optional): Output formats - ["markdown"], ["html"], ["text"], ["json"], ["screenshot"]
- timeout (number, optional): Timeout in milliseconds (default: 30000)
- wait_for (number, optional): Delay before extraction in ms (browser engines only)
- wait_for_selector (string/object/array, optional): Wait for CSS selectors
- include_tags (array, optional): Include only these HTML tags (e.g., ["h1", "p", "article"])
- exclude_tags (array, optional): Exclude these HTML tags
- proxy (string, optional): Proxy URL (e.g., "http://proxy:port")
- json_options (object, optional): JSON extraction with schema/prompt
- extract_source (string, optional): "markdown" (default) or "html"

Examples:
```javascript
// Basic scrape with default cheerio
anycrawl_scrape({ url: "https://example.com" })

// Scrape SPA with Playwright
anycrawl_scrape({
  url: "https://spa-example.com",
  engine: "playwright",
  formats: ["markdown", "screenshot"]
})
```
```javascript
// Extract structured JSON
anycrawl_scrape({
  url: "https://product-page.com",
  engine: "cheerio",
  json_options: {
    schema: {
      type: "object",
      properties: {
        product_name: { type: "string" },
        price: { type: "number" },
        description: { type: "string" }
      },
      required: ["product_name", "price"]
    },
    user_prompt: "Extract product details from this page"
  }
})
```
anycrawl_search: Search Google and return structured results.
Parameters:
- query (string, required): Search query
- engine (string, optional): Search engine - "google" (default)
- limit (number, optional): Max results per page (default: 10)
- offset (number, optional): Number of results to skip (default: 0)
- pages (number, optional): Number of pages to retrieve (default: 1, max: 20)
- lang (string, optional): Language locale (e.g., "en", "zh", "vi")
- safe_search (number, optional): 0 (off), 1 (medium), 2 (high)
- scrape_options (object, optional): Scrape each result URL with these options

Examples:
```javascript
// Basic search
anycrawl_search({ query: "OpenAI ChatGPT" })

// Multi-page search in Vietnamese
anycrawl_search({ query: "hướng dẫn Node.js", pages: 3, lang: "vi" })
```
```javascript
// Search and auto-scrape results
anycrawl_search({
  query: "best AI tools 2026",
  limit: 5,
  scrape_options: { engine: "cheerio", formats: ["markdown"] }
})
```
anycrawl_crawl_start: Start crawling an entire website (async job).
Parameters:
- url (string, required): Seed URL to start crawling
- engine (string, optional): "cheerio" (default), "playwright", "puppeteer"
- strategy (string, optional): "all", "same-domain" (default), "same-hostname", "same-origin"
- max_depth (number, optional): Max depth from seed URL (default: 10)
- limit (number, optional): Max pages to crawl (default: 100)
- include_paths (array, optional): Path patterns to include (e.g., ["/blog/*"])
- exclude_paths (array, optional): Path patterns to exclude (e.g., ["/admin/*"])
- scrape_paths (array, optional): Only scrape URLs matching these patterns
- scrape_options (object, optional): Per-page scrape options

Examples:
```javascript
// Crawl entire website
anycrawl_crawl_start({
  url: "https://docs.example.com",
  engine: "cheerio",
  max_depth: 5,
  limit: 50
})

// Crawl only blog posts
anycrawl_crawl_start({
  url: "https://example.com",
  strategy: "same-domain",
  include_paths: ["/blog/"],
  exclude_paths: ["/blog/tags/"],
  scrape_options: { formats: ["markdown"] }
})
```
```javascript
// Crawl product pages only
anycrawl_crawl_start({
  url: "https://shop.example.com",
  strategy: "same-domain",
  scrape_paths: ["/products/*"],
  limit: 200
})
```
anycrawl_crawl_status: Check the status of a crawl job.
Parameters:
- job_id (string, required): Crawl job ID

Example:
```javascript
anycrawl_crawl_status({ job_id: "7a2e165d-8f81-4be6-9ef7-23222330a396" })
```
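Because crawling runs as an async job, callers typically poll the status until the job finishes. A minimal sketch, assuming the tools are available as async functions and that the status payload exposes a `status` field with values like "completed" and "failed" (field name and values are assumptions, not documented above):

```javascript
// Poll crawl status until the job finishes (sketch only).
// Assumes data.status reaches "completed" or "failed" — adjust to the
// actual response shape returned by anycrawl_crawl_status.
async function waitForCrawl(job_id, intervalMs = 5000) {
  while (true) {
    const res = await anycrawl_crawl_status({ job_id });
    if (res.data.status === "completed") return res.data;
    if (res.data.status === "failed") throw new Error("Crawl failed: " + job_id);
    await new Promise((resolve) => setTimeout(resolve, intervalMs)); // wait before polling again
  }
}
```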
anycrawl_crawl_results: Get crawl results (paginated).
Parameters:
- job_id (string, required): Crawl job ID
- skip (number, optional): Number of results to skip (default: 0)

Example:
```javascript
// Get first 100 results
anycrawl_crawl_results({ job_id: "xxx", skip: 0 })

// Get next 100 results
anycrawl_crawl_results({ job_id: "xxx", skip: 100 })
```
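To collect every page of a large crawl, advance `skip` until a short batch comes back. A minimal sketch, assuming results arrive in batches of up to 100 under a `data.results` field (the field name is an assumption):

```javascript
// Page through crawl results by advancing `skip` (sketch only).
async function getAllCrawlResults(job_id) {
  const all = [];
  for (let skip = 0; ; skip += 100) {
    const res = await anycrawl_crawl_results({ job_id, skip });
    const batch = res.data.results ?? []; // assumed field name
    all.push(...batch);
    if (batch.length < 100) break; // short batch => last page
  }
  return all;
}
```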
Cancel a running crawl job.
Parameters:
- job_id (string, required): Crawl job ID

anycrawl_search_and_scrape: Quick helper that searches Google and then scrapes the top results.
Parameters:
- query (string, required): Search query
- max_results (number, optional): Max results to scrape (default: 3)
- scrape_engine (string, optional): Engine for scraping (default: "cheerio")
- formats (array, optional): Output formats (default: ["markdown"])
- lang (string, optional): Search language

Example:
```javascript
anycrawl_search_and_scrape({ query: "latest AI news", max_results: 5, formats: ["markdown"] })
```
| Engine | Best For | Speed | JS Rendering |
|---|---|---|---|
| Cheerio | Static HTML, news, blogs | ⚡ Fastest | ❌ No |
| Playwright | SPAs, complex web apps | 🐢 Slower | ✅ Yes |
| Puppeteer | Chrome-specific, metrics | 🐢 Slower | ✅ Yes |
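When in doubt, a common pattern is to try the fast static engine first and fall back to a browser engine only if the page appears to be JS-rendered. A minimal sketch (the `data.markdown` field name and the empty-content check are assumptions, not confirmed above):

```javascript
// Try cheerio first, retry with playwright if no content came back (sketch only).
async function scrapeWithFallback(url) {
  const fast = await anycrawl_scrape({ url, engine: "cheerio", formats: ["markdown"] });
  if (fast.success && fast.data.markdown && fast.data.markdown.trim()) {
    return fast; // static HTML was enough
  }
  return anycrawl_scrape({ url, engine: "playwright", formats: ["markdown"] });
}
```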
All responses follow this structure:
{ "success": true, "data": { ... }, "message": "Optional message" }
Error response:
{ "success": false, "error": "Error type", "message": "Human-readable message" }
- 400 - Bad Request (validation errors)
- 401 - Unauthorized (invalid API key)
- 402 - Payment Required (insufficient credits)
- 404 - Not Found
- 429 - Rate limit exceeded
- 500 - Internal server error

No automatic installation available. Please visit the source repository for installation instructions.