# Smart Web Scraper
Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a we...
Extract structured data from web pages into clean JSON or CSV.
```bash
# Scrape a page, extract all text content
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py extract "https://example.com"

# Extract specific elements with a CSS selector
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py extract "https://example.com/products" -s ".product-card"

# Auto-detect and extract tables
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py tables "https://example.com/pricing"

# Extract all links from a page
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py links "https://example.com"

# Extract structured data (title, meta, headings, links)
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py structure "https://example.com"

# Output as JSON
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py extract "https://example.com" -s ".item" -f json

# Output as CSV
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py extract "https://example.com" -s "table tr" -f csv

# Save to file
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py extract "https://example.com" -s ".product" -f json -o products.json

# Multi-page scrape (follow pagination)
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py crawl "https://example.com/page/1" --pages 5 -s ".article"
```
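Under the hood, selector-based extraction like the `extract` subcommand above is a thin layer over BeautifulSoup's `select()`. A minimal sketch, assuming a record shape of `text`/`tag`/`class` (the actual script's internals and output keys may differ):

```python
from bs4 import BeautifulSoup


def extract_elements(html: str, selector: str) -> list[dict]:
    """Return one record per element matching a CSS selector.

    Illustrative sketch of an `extract -s SELECTOR` step, not the
    actual scraper.py implementation.
    """
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for el in soup.select(selector):
        records.append({
            "text": el.get_text(" ", strip=True),  # visible text, whitespace-normalized
            "tag": el.name,
            "class": " ".join(el.get("class", [])),
        })
    return records
```

Given `<div class="product-card">Widget Pro - $29.99</div>`, `extract_elements(html, ".product-card")` yields a single record with that text.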
| Command | Args | Description |
|---|---|---|
| `extract` | `URL [-s SELECTOR] [-f FORMAT] [-o FILE]` | Extract content, optionally filtered by CSS selector |
| `tables` | `URL [-f FORMAT]` | Auto-detect and extract all HTML tables |
| `links` | `URL [--external]` | Extract all links (href + text) |
| `structure` | `URL` | Extract page structure: title, meta, headings, images, links |
| `crawl` | `URL --pages N [-s SELECTOR]` | Follow pagination links, extract from multiple pages |
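The `tables` auto-detection needs no selector because HTML tables are self-describing: walk each `<table>`, then its `<tr>` rows and `<th>`/`<td>` cells. A sketch of that idea (the real subcommand may handle colspans and nested tables differently):

```python
from bs4 import BeautifulSoup


def extract_tables(html: str) -> list[list[list[str]]]:
    """Collect every <table> as a list of rows of cell strings.

    Hypothetical sketch of `tables`-style auto-detection.
    """
    soup = BeautifulSoup(html, "html.parser")
    out = []
    for table in soup.find_all("table"):
        rows = []
        for tr in table.find_all("tr"):
            cells = [c.get_text(strip=True) for c in tr.find_all(["th", "td"])]
            if cells:  # skip structural rows with no cells
                rows.append(cells)
        out.append(rows)
    return out
```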
| Format | Flag | Description |
|---|---|---|
| Text | `-f text` | Plain text (default) |
| JSON | `-f json` | Structured JSON array |
| CSV | `-f csv` | Comma-separated values |
| Markdown | `-f markdown` | Markdown-formatted |
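Converting one list of records into these formats is a small dispatch over the stdlib `json` and `csv` modules. A sketch, assuming records are flat dicts sharing the same keys (the function name and record shape are illustrative, not the script's API):

```python
import csv
import io
import json


def format_records(records: list[dict], fmt: str = "text") -> str:
    """Render records as text, JSON, or CSV.

    Hypothetical helper mirroring the -f flag's behavior.
    """
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)
        return buf.getvalue()
    # default: one line of extracted text per record
    return "\n".join(r["text"] for r in records)
```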
```bash
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py extract "https://shop.example.com" -s ".product" -f json
```

Output:

```json
[
  {"text": "Widget Pro - $29.99", "tag": "div", "class": "product"},
  {"text": "Widget Max - $49.99", "tag": "div", "class": "product"}
]
```
```bash
# Export pricing tables as CSV
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py tables "https://example.com/pricing" -f csv

# List only links pointing off-site
uv run --with beautifulsoup4 --with lxml python scripts/scraper.py links "https://example.com" --external
```
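An `--external` filter like the one above reduces to comparing each link's hostname against the page's own. A sketch, assuming "external" means a different `netloc` after resolving relative URLs (the real flag's semantics may differ):

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def extract_links(html: str, base_url: str, external_only: bool = False) -> list[tuple[str, str]]:
    """Collect (absolute_href, anchor_text) pairs from a page.

    Illustrative sketch of a `links --external` step.
    """
    base_host = urlparse(base_url).netloc
    links = []
    soup = BeautifulSoup(html, "html.parser")
    for a in soup.find_all("a", href=True):
        href = urljoin(base_url, a["href"])  # resolve relative links
        if external_only and urlparse(href).netloc == base_host:
            continue  # same host: not external
        links.append((href, a.get_text(strip=True)))
    return links
```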
- Rate limiting: `--delay 0.5` (seconds between requests)
- Respects `robots.txt` by default (override with `--ignore-robots`)
- Dependencies: `beautifulsoup4` and `lxml` (auto-installed by `uv run --with`)
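The robots.txt check and request delay can both be handled with the stdlib: `urllib.robotparser` answers "may I fetch this URL?", and a sleep between requests enforces the delay. A sketch of that politeness loop, assuming the robots.txt body has already been fetched (the function and its parameters are illustrative, not the script's actual interface):

```python
import time
from urllib.robotparser import RobotFileParser


def polite_fetch_plan(urls: list[str], robots_lines: list[str],
                      delay: float = 0.5, user_agent: str = "*") -> list[str]:
    """Filter URLs through robots.txt rules, pausing between allowed ones.

    robots_lines stands in for the fetched robots.txt body; a real
    crawler would fetch it from <site>/robots.txt first.
    """
    rp = RobotFileParser()
    rp.parse(robots_lines)
    allowed = []
    for url in urls:
        if rp.can_fetch(user_agent, url):
            allowed.append(url)
            time.sleep(delay)  # space out requests (--delay)
    return allowed
```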