Playwright Scraper Skill
Playwright-based web scraping OpenClaw Skill with anti-bot protection. Successfully tested on complex sites like Discuss.com.hk.
A Playwright-based web scraping OpenClaw Skill with anti-bot protection. Choose the best approach based on the target website's anti-bot level.
| Target Website | Anti-Bot Level | Recommended Method | Script |
|---|---|---|---|
| Regular Sites | Low | web_fetch tool | N/A (built-in) |
| Dynamic Sites | Medium | Playwright Simple | scripts/playwright-simple.js |
| Cloudflare Protected | High | Playwright Stealth | scripts/playwright-stealth.js |
| YouTube | Special | deep-scraper | Install separately |
| Reddit | Special | reddit-scraper | Install separately |
    cd playwright-scraper-skill
    npm install
    npx playwright install chromium
Use OpenClaw's built-in web_fetch tool:
    # Invoke directly in OpenClaw
    Hey, fetch me the content from https://example.com
Use Playwright Simple:
node scripts/playwright-simple.js "https://example.com"
Example output:
{ "url": "https://example.com", "title": "Example Domain", "content": "...", "elapsedSeconds": "3.45" }
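To make the output shape above concrete, here is a minimal sketch of what a "simple" Playwright scrape could look like. It is not the skill's actual source: the names `buildResult` and `scrapeSimple`, the 3-second default wait, and the use of `document.body.innerText` for `content` are all assumptions made for illustration.

```javascript
// Hypothetical sketch; not the actual contents of scripts/playwright-simple.js.

// Pure helper: builds a result object in the shape shown in the example output.
function buildResult(url, title, content, startMs, endMs) {
  return {
    url,
    title,
    content,
    // Two-decimal string, matching the "3.45" style of the example output.
    elapsedSeconds: ((endMs - startMs) / 1000).toFixed(2),
  };
}

async function scrapeSimple(url, waitMs = 3000) {
  // Required lazily so buildResult can be used without Playwright installed.
  const { chromium } = require('playwright');
  const startMs = Date.now();
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    // Fixed pause so client-side JavaScript can render (assumed strategy).
    await page.waitForTimeout(waitMs);
    const title = await page.title();
    const content = await page.evaluate(() => document.body.innerText);
    return buildResult(url, title, content, startMs, Date.now());
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

With Playwright installed, calling `scrapeSimple("https://example.com")` and printing the resolved value with `JSON.stringify` should give an object of the same shape as the example output.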
Use Playwright Stealth:
node scripts/playwright-stealth.js "https://m.discuss.com.hk/#hot"
Features:
Use deep-scraper (install separately):
    # Install deep-scraper skill
    npx clawhub install deep-scraper

    # Use it
    cd skills/deep-scraper
    node assets/youtube_handler.js "https://www.youtube.com/watch?v=VIDEO_ID"
If the site doesn't have dynamic loading, use OpenClaw's web_fetch tool; it's the fastest option.
If you need to wait for JavaScript rendering, use playwright-simple.js.
If you encounter 403 or Cloudflare challenges, use playwright-stealth.js.
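The three rules above can be written down as a small decision helper. This is only an illustrative sketch: `chooseMethod` and its input fields (`dynamic`, `httpStatus`, `cloudflareChallenge`) are hypothetical names, not part of the skill.

```javascript
// Hypothetical helper that encodes the three selection rules.
// Field names are illustrative; the skill itself exposes no such function.
function chooseMethod({ dynamic = false, httpStatus = 200, cloudflareChallenge = false } = {}) {
  // A 403 or a Cloudflare challenge means the site is blocking automation.
  if (httpStatus === 403 || cloudflareChallenge) {
    return 'scripts/playwright-stealth.js';
  }
  // Content rendered by JavaScript needs a real browser.
  if (dynamic) {
    return 'scripts/playwright-simple.js';
  }
  // Static pages: the built-in tool is the fastest option.
  return 'web_fetch';
}
```

The blocked case is checked first because a site that returns 403 needs the stealth script whether or not it also renders content dynamically.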
All scripts support environment variables:
    # Set screenshot path
    SCREENSHOT_PATH=/path/to/screenshot.png node scripts/playwright-stealth.js URL

    # Set wait time (milliseconds)
    WAIT_TIME=10000 node scripts/playwright-simple.js URL

    # Enable headful mode (show browser)
    HEADLESS=false node scripts/playwright-stealth.js URL

    # Save HTML
    SAVE_HTML=true node scripts/playwright-stealth.js URL

    # Custom User-Agent
    USER_AGENT="Mozilla/5.0 ..." node scripts/playwright-stealth.js URL
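A script might read these environment variables roughly as follows. This is a sketch under assumptions: the function name `readConfig`, the returned field names, and every default (5000 ms wait, headless on, no screenshot, no HTML dump) are illustrative and may differ from what the skill's scripts actually do.

```javascript
// Hypothetical config reader; all defaults here are assumptions.
function readConfig(env = process.env) {
  const waitTime = Number.parseInt(env.WAIT_TIME ?? '', 10);
  return {
    // Where to write a screenshot, or null to skip it.
    screenshotPath: env.SCREENSHOT_PATH || null,
    // Wait time in milliseconds; fall back to an assumed 5000 if unset or invalid.
    waitTimeMs: Number.isNaN(waitTime) ? 5000 : waitTime,
    // Headless unless HEADLESS is explicitly the string "false".
    headless: env.HEADLESS !== 'false',
    // Save the page HTML only when SAVE_HTML is explicitly "true".
    saveHtml: env.SAVE_HTML === 'true',
    // Custom User-Agent, or null to keep the browser default.
    userAgent: env.USER_AGENT || null,
  };
}
```

Passing the environment as a parameter keeps the function easy to exercise, for example `readConfig({ WAIT_TIME: "10000", HEADLESS: "false" })`.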
| Method | Speed | Anti-Bot | Success Rate (Discuss.com.hk) |
|---|---|---|---|
| web_fetch | Fastest | None | 0% |
| Playwright Simple | Fast | Low | 20% |
| Playwright Stealth | Medium | Medium | 100% |
| Puppeteer Stealth | Medium | Medium-High | ~80% |
| Crawlee (deep-scraper) | Slow | Detected | 0% |
| Chaser (Rust) | Medium | Detected | 0% |
Lessons learned from our testing:
Solution: Use playwright-stealth.js
Solution:
Solution:
Best Solution: Pure Playwright + anti-bot techniques (framework-independent)
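For readers unfamiliar with what "pure Playwright plus anti-bot techniques" can mean in practice, the sketch below shows a few commonly used measures: a realistic User-Agent, a Chromium flag that disables the automation-controlled Blink feature, and an init script that hides `navigator.webdriver`. Whether playwright-stealth.js uses these exact measures is an assumption, the names `stealthOptions`, `openStealthPage`, and `DEFAULT_UA` are hypothetical, and none of this guarantees passing any particular site's checks.

```javascript
// Hypothetical sketch of common anti-detection settings; not the skill's actual code.

// Example desktop Chrome User-Agent (illustrative value only).
const DEFAULT_UA =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

// Pure helper: collects the launch and context options in one place.
function stealthOptions(userAgent) {
  return {
    launch: {
      headless: true,
      // Stops Chromium from advertising that it is automation-controlled.
      args: ['--disable-blink-features=AutomationControlled'],
    },
    context: {
      userAgent: userAgent || DEFAULT_UA,
      viewport: { width: 1280, height: 800 },
      locale: 'en-US',
    },
  };
}

async function openStealthPage(url, userAgent) {
  // Required lazily so stealthOptions can be used without Playwright installed.
  const { chromium } = require('playwright');
  const opts = stealthOptions(userAgent);
  const browser = await chromium.launch(opts.launch);
  const context = await browser.newContext(opts.context);
  // Runs before any page script, so the page sees navigator.webdriver as undefined.
  await context.addInitScript(() => {
    Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
  });
  const page = await context.newPage();
  await page.goto(url, { waitUntil: 'domcontentloaded' });
  // The caller is responsible for calling browser.close() when done.
  return { browser, page };
}
```

Because this relies only on Playwright's own launch, context, and init-script APIs, it stays independent of any scraping framework.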