Image Vision
Analyze and interpret images by describing content, extracting text, answering questions, comparing visuals, and extracting structured data from JPG, PNG, GI...
Analyze images using the built-in vision capabilities of multimodal AI models.
Describe what's in an image:
# The agent will automatically use vision when you provide an image path
image("/path/to/image.jpg", prompt="Describe what's in this image")
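The `image(path, prompt=...)` call is a tool invocation handled by the agent, and this page does not document what happens underneath it. As a rough sketch, assuming a backend that accepts base64-encoded image data alongside a text prompt, a helper could package the two as follows. The function name `build_image_request`, the payload field names, and the allow-list of formats are all hypothetical and not part of the skill.

```python
import base64
import mimetypes
from pathlib import Path

# Hypothetical allow-list: the skill description names JPG and PNG,
# and the rest of its format list is truncated, so only those two appear here.
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}


def build_image_request(path: str, prompt: str) -> dict:
    """Package an image file and a prompt into a generic request payload.

    The payload shape is an assumption for illustration, not a real API.
    """
    # Guess the MIME type from the file extension (e.g. ".jpg" -> "image/jpeg").
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported or unknown image type: {path}")

    # Read the raw bytes and encode them as ASCII-safe base64 text.
    data = Path(path).read_bytes()
    return {
        "prompt": prompt,
        "image": {
            "mime_type": mime_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
```

Rejecting unknown file types before reading the file keeps the failure close to its cause, instead of surfacing later as an opaque model error.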
Extract text from images:
image("/path/to/document.png", prompt="Extract all text from this image")
Compare or analyze multiple images:
images(["/path/to/image1.jpg", "/path/to/image2.jpg"], prompt="Compare these two images and describe the differences")
Ask specific questions about image content:
image("menu.jpg", prompt="What are the prices of the main courses?")
image("chart.png", prompt="What trend does this graph show?")
image("screenshot.png", prompt="What error message is displayed?")
Check image content:
image("upload.jpg", prompt="Is this image appropriate for a professional setting?")
Extract structured data from visual content:
image("receipt.jpg", prompt="Extract the date, total amount, and items purchased")
image("business_card.png", prompt="Extract name, phone, email, and company")
image("form.jpg", prompt="Extract all filled fields as key-value pairs")
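Structured extraction is easier to use downstream when the reply is machine-readable. The sketch below assumes the prompt asks the model to answer in JSON and that the reply arrives as a plain string. The helper `parse_receipt_reply`, the field names `date`, `total`, and `items`, and the sample reply are all made up for illustration and are not real model output.

```python
import json

# Hypothetical field names matching the receipt prompt (date, total, items).
REQUIRED_FIELDS = ("date", "total", "items")


def parse_receipt_reply(reply: str) -> dict:
    """Extract and validate a JSON object from a model's text reply."""
    # Models sometimes wrap JSON in prose, so keep only the outermost {...} span.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in reply")

    data = json.loads(reply[start:end + 1])

    # Fail loudly if the model omitted a field the caller depends on.
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    return data


# Illustrative reply string, not actual output from the skill.
sample_reply = (
    'Here is the data: '
    '{"date": "2024-01-15", "total": 23.5, "items": ["coffee", "bagel"]}'
)
receipt = parse_receipt_reply(sample_reply)
print(receipt["items"])
```

Checking for required fields up front means a partially read receipt is caught immediately, before missing values spread into later processing.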
Compare images:
images(["before.jpg", "after.jpg"], prompt="What changes were made between these two images?")
No automatic installation available. Please visit the source repository for installation instructions.