Nanonets OCR
Document extraction API by Nanonets. Convert PDFs and images to markdown, JSON, or CSV with confidence scoring. Use when you need to OCR documents, extract invoice fields, parse receipts, or convert t
Document extraction API — convert PDFs, images, and documents to markdown, JSON, or CSV with per-field confidence scoring.
Get your API key: https://docstrange.nanonets.com/app
curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY" \
  -F "file=@document.pdf" \
  -F "output_format=markdown"
Response:
{
  "success": true,
  "record_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "result": {
    "markdown": {
      "content": "# Invoice\n\n**Invoice Number:** INV-2024-001..."
    }
  }
}
# Visit the dashboard https://docstrange.nanonets.com/app
Save your API key:
export DOCSTRANGE_API_KEY="your_api_key_here"
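For scripts that call the API from Python rather than curl, the key can be read from the same environment variable. The following is a minimal sketch, not part of the skill itself: `load_api_key` and `auth_header` are hypothetical helper names, and rejecting the placeholder value is an assumption about what counts as "unset".

```python
import os
from typing import Dict, Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from DOCSTRANGE_API_KEY, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("DOCSTRANGE_API_KEY", "").strip()
    # Assumption: the documentation placeholder is treated the same as "unset".
    if not key or key == "your_api_key_here":
        raise RuntimeError("DOCSTRANGE_API_KEY is not set to a real key")
    return key


def auth_header(api_key: str) -> Dict[str, str]:
    """Build the Authorization header used by the curl examples."""
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early here gives a clearer error than an authentication failure returned by the API later.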
Recommended: Use environment variables (most secure):
{
  skills: {
    entries: {
      "docstrange": {
        enabled: true,
        // API key loaded from environment variable DOCSTRANGE_API_KEY
      },
    },
  },
}
Alternative: Store in config file (use with caution):
{
  skills: {
    entries: {
      "docstrange": {
        enabled: true,
        env: {
          DOCSTRANGE_API_KEY: "your_api_key_here",
        },
      },
    },
  },
}
Security Note: If storing API keys in ~/.openclaw/openclaw.json, restrict the file permissions:
chmod 600 ~/.openclaw/openclaw.json

Convert a document to markdown:
curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY" \
  -F "file=@document.pdf" \
  -F "output_format=markdown"
Access content:
response["result"]["markdown"]["content"]
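The access path above assumes the request succeeded. A small sketch of a more defensive accessor follows; `markdown_content` is a hypothetical helper, and the check on `status` assumes that only `"completed"` responses carry a usable `result`, as in the sample response.

```python
from typing import Any, Mapping


def markdown_content(response: Mapping[str, Any]) -> str:
    """Return result.markdown.content from a parsed API response."""
    status = response.get("status")
    # Assumption: a result is only present once status is "completed".
    if status != "completed":
        raise ValueError(f"extraction not completed (status={status!r})")
    try:
        return response["result"]["markdown"]["content"]
    except KeyError as missing:
        raise ValueError(f"response has no markdown content: missing {missing}") from None


sample = {
    "success": True,
    "record_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "result": {"markdown": {"content": "# Invoice\n\n**Invoice Number:** INV-2024-001..."}},
}
text = markdown_content(sample)
```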
Simple field list:
curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "output_format=json" \
  -F 'json_options=["invoice_number", "date", "total_amount", "vendor"]' \
  -F "include_metadata=confidence_score"
With JSON schema:
curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "output_format=json" \
  -F 'json_options={"type": "object", "properties": {"invoice_number": {"type": "string"}, "total_amount": {"type": "number"}}}'
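The two variants above differ only in the JSON passed as `json_options`: a list of field names or a JSON schema object. When building the request in code, serializing with `json.dumps` avoids hand-quoting mistakes. This is a sketch under that assumption; `json_extraction_fields` is a hypothetical helper that only prepares the form fields and does not send anything.

```python
import json
from typing import Any, Dict, List, Union


def json_extraction_fields(
    spec: Union[List[str], Dict[str, Any]],
    with_confidence: bool = False,
) -> Dict[str, str]:
    """Build the form fields for a JSON extraction request.

    `spec` is either a field list or a JSON schema, as in the curl examples.
    """
    fields = {"output_format": "json", "json_options": json.dumps(spec)}
    if with_confidence:
        fields["include_metadata"] = "confidence_score"
    return fields


field_list = json_extraction_fields(["invoice_number", "date"], with_confidence=True)
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
    },
}
schema_fields = json_extraction_fields(schema)
```

The resulting dictionary can be passed as the non-file form fields of a multipart POST in whichever HTTP client is in use.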
Response with confidence scores:
{
  "result": {
    "json": {
      "content": {
        "invoice_number": "INV-2024-001",
        "total_amount": 500.00
      },
      "metadata": {
        "confidence_score": {
          "invoice_number": 98,
          "total_amount": 99
        }
      }
    }
  }
}
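One way to use the per-field scores is to flag values that fall below a review threshold. The sketch below assumes scores are on the 0 to 100 scale shown in the sample response; `low_confidence_fields` and the default threshold of 90 are illustrative choices, not part of the API.

```python
from typing import Any, Dict, Mapping


def low_confidence_fields(response: Mapping[str, Any], threshold: float = 90) -> Dict[str, Any]:
    """Return {field: extracted value} for fields scoring below `threshold`."""
    block = response["result"]["json"]
    content = block["content"]
    # Scores are only present when include_metadata=confidence_score was requested.
    scores = block.get("metadata", {}).get("confidence_score", {})
    return {name: content.get(name) for name, score in scores.items() if score < threshold}


sample = {
    "result": {
        "json": {
            "content": {"invoice_number": "INV-2024-001", "total_amount": 500.00},
            "metadata": {"confidence_score": {"invoice_number": 98, "total_amount": 99}},
        }
    }
}
needs_review = low_confidence_fields(sample, threshold=99)
```

With the sample response and a threshold of 99, only `invoice_number` (score 98) is flagged.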
Extract tables as CSV:
curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY" \
  -F "file=@table.pdf" \
  -F "output_format=csv" \
  -F "csv_options=table"
For documents >5 pages, use async and poll:
Queue the document:
curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/async" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY" \
  -F "file=@large-document.pdf" \
  -F "output_format=markdown"
Returns: {"record_id": "12345", "status": "processing"}
Poll for results:
curl -X GET "https://extraction-api.nanonets.com/api/v1/extract/results/12345" \
  -H "Authorization: Bearer $DOCSTRANGE_API_KEY"
Returns: {"status": "completed", "result": {...}}
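The queue-then-poll flow above can be wrapped in a loop with a timeout. The following is a sketch, not the official client: `poll_until_done` is a hypothetical helper, `fetch` stands in for whatever function performs the GET on the results endpoint and returns the parsed JSON, and the interval and timeout defaults are arbitrary. Only the `"processing"` and `"completed"` statuses appear in the examples, so treating any other status as an error is an assumption.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch: Callable[[str], Dict[str, Any]],
    record_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch(record_id) until the status is "completed" or the timeout passes."""
    deadline = clock() + timeout
    while True:
        payload = fetch(record_id)
        status = payload.get("status")
        if status == "completed":
            return payload
        # Assumption: anything other than "processing" means the job failed.
        if status != "processing":
            raise RuntimeError(f"unexpected status {status!r} for record {record_id}")
        if clock() >= deadline:
            raise TimeoutError(f"record {record_id} still processing after {timeout}s")
        sleep(interval)


# Usage with a stand-in fetcher that completes on the third call.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "result": {"markdown": {"content": "# Done"}}},
])
final = poll_until_done(lambda rid: next(_responses), "12345", sleep=lambda s: None)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without real waiting.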
Get element coordinates for layout analysis:
-F "include_metadata=bounding_boxes"
Extract document structure (sections, tables, key-value pairs):
-F "json_options=hierarchy_output"
Enhanced table and number formatting:
-F "markdown_options=financial-docs"
Guide extraction with prompts:
-F "custom_instructions=Focus on financial data. Ignore headers." -F "prompt_mode=append"
Request multiple formats in one call:
-F "output_format=markdown,json"
| Document Size | Endpoint | Notes |
|---|---|---|
| <=5 pages | /api/v1/extract/sync | Immediate response |
| >5 pages | /api/v1/extract/async | Poll for results |
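The page-count rule can be expressed as a small selector. This is a sketch that takes the 5-page cutoff and the sync and async endpoint URLs from the examples in this document; `choose_endpoint` is a hypothetical name, and determining the page count of a file is left to the caller.

```python
BASE_URL = "https://extraction-api.nanonets.com/api/v1/extract"
SYNC_MAX_PAGES = 5  # cutoff stated in this document


def choose_endpoint(page_count: int) -> str:
    """Return the sync endpoint for small documents, the async one otherwise."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    mode = "sync" if page_count <= SYNC_MAX_PAGES else "async"
    return f"{BASE_URL}/{mode}"


small = choose_endpoint(5)
large = choose_endpoint(6)
```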
JSON Extraction:
- ["field1", "field2"]: quick extractions
- {"type": "object", ...}: strict typing, nested data

Confidence Scores:
- include_metadata=confidence_score

{
  "type": "object",
  "properties": {
    "invoice_number": {"type": "string"},
    "date": {"type": "string"},
    "vendor": {"type": "string"},
    "total": {"type": "number"},
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "price": {"type": "number"}
        }
      }
    }
  }
}

{
  "type": "object",
  "properties": {
    "merchant": {"type": "string"},
    "date": {"type": "string"},
    "total": {"type": "number"},
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"}
        }
      }
    }
  }
}
Important: Documents uploaded to DocStrange are transmitted to
https://extraction-api.nanonets.com and processed on external servers.
Before uploading sensitive documents:
Best practices:
- Use file_url with publicly accessible URLs instead of uploading large files directly
- Use placeholders such as "your_api_key_here" in examples

400 Bad Request:
- Provide an input: file, file_url, or file_base64

Sync Timeout:
- Use async mode and poll /extract/results/{record_id}

Missing Confidence Scores:
- Set json_options (field list or schema)
- Add include_metadata=confidence_score

Authentication Errors:
- Check that the DOCSTRANGE_API_KEY environment variable is set

Before publishing or updating this skill, verify:
- package.json declares requiredEnv and primaryEnv for DOCSTRANGE_API_KEY
- package.json lists API endpoints in endpoints array
- Examples use placeholders ("your_api_key_here") not real keys
- No real keys in SKILL.md or package.json

No automatic installation available. Please visit the source repository for installation instructions.