AI Vision: Share and analyze images in chat

Bring visuals into every chat with Vision Image Processing and Image Citations. Analyze, understand, and display images in text conversations.

Trusted by teams at 10,000+ organizations

Why visual understanding matters

Some of your most important information is stored in images, not text. Unlock all of your knowledge, not just the words, with AI that can identify and analyze diagrams, charts, and illustrations much as a human expert would.

Smarter results, every time

Understands complex visuals like charts, schematics, and handwritten notes

Displays relevant images directly alongside agent responses

Enhances comprehension with clear visuals and text together

Support teams

Education & training teams

Product & documentation teams

Business impact of AI Vision

Core features of AI Vision

Visual understanding

Analyzes diagrams, charts, schematics, and photos to extract both meaning and context.

Contextual interpretation

Goes beyond text recognition to understand relationships between visual elements.

Seamless integration

Works automatically through the existing file upload system with a simple toggle.

Smart display

Shows referenced images directly alongside responses for richer, more intuitive conversations.

Why organizations choose CustomGPT.ai

"AI Vision was exactly what we needed, a way to share our visual data with ease."

Awarded Top 7 emerging leader in GenAI business solutions by GAI Insights

Enterprise-grade data security

Answers you trust

Plans & pricing

AI Vision is available across CustomGPT.ai plans.

Standard: 500 images/month

Premium: 2,500 images/month

Enterprise: Starting at 5,000 images/month

Frequently asked questions

What is Vision Image Processing?

Vision Image Processing is a new feature that enables CustomGPT.ai agents to understand and process images uploaded to the platform. Unlike traditional OCR, which only extracts text, this feature uses AI vision capabilities to comprehend the full context of images, including diagrams, charts, schematics, and other visual content.

How does Vision Image Processing work?

When you upload images to your CustomGPT.ai agent, you can enable Vision Image Processing with a simple toggle. The system then analyzes the images using OpenAI’s vision capabilities, interpreting both the visual elements and any text within them. The processed information becomes part of your agent’s knowledge base, so the agent can draw on it when responding to user queries.

What types of images can be processed?

Vision Image Processing can handle virtually any type of visual content, including:

Technical diagrams and schematics
Charts and graphs
Photographs
Illustrations
Handwritten text
Screenshots
Product images
And more

What file formats are supported?

Currently, the feature supports JPEG, PNG, WEBP, and non-animated GIF formats.
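
As a rough illustration of how files could be pre-checked against these formats before upload, the sketch below inspects a file's leading bytes. The function name detect_format is hypothetical and is not part of any CustomGPT.ai API; the byte signatures are the standard ones for each format. It does not detect whether a GIF is animated.

```python
from typing import Optional

# Standard file signatures ("magic numbers") for the supported formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def detect_format(header: bytes) -> Optional[str]:
    """Return 'JPEG', 'PNG', 'GIF', or 'WEBP' from the first 12 bytes, else None.

    Hypothetical helper for illustration only; not a CustomGPT.ai function.
    """
    if header.startswith(JPEG_SIGNATURE):
        return "JPEG"
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header.startswith(GIF_SIGNATURES):
        return "GIF"  # animated vs. non-animated is NOT checked here
    # WebP is a RIFF container: bytes 0-3 are "RIFF", bytes 8-11 are "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WEBP"
    return None
```

In practice, the first 12 bytes of a file would be read in binary mode and passed to the function; a result of None suggests the file should be converted before upload.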

Are there any limitations on image size?

Yes. Standard and Premium tier users can process images up to 1024×1024 pixels. Enterprise customers can request custom size limits to suit their specific needs.
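
The page does not say how images larger than the limit are handled, so one cautious option is to compute a reduced size before uploading. The sketch below is a minimal example of that arithmetic, assuming the 1024-pixel limit applies to each side; fit_within and MAX_SIDE are hypothetical names, not platform settings.

```python
from typing import Tuple

MAX_SIDE = 1024  # assumed per-side limit for Standard and Premium tiers


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Return dimensions scaled down to fit a max_side x max_side box.

    The aspect ratio is preserved and images already within the limit
    are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= max_side and height <= max_side:
        return width, height
    scale = max_side / max(width, height)
    # Round to whole pixels, keeping each side at least 1 pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 2048×1024 image would be reduced to 1024×512, while an 800×600 image would be left as it is. The resulting dimensions can then be passed to whichever image tool performs the actual resize.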

Is Vision Image Processing available on all subscription tiers?

Yes, the feature is available on all subscription tiers. However, the number of images you can process per month depends on your subscription level.

How does this differ from the existing OCR feature?

Vision Image Processing is an entirely new capability that goes beyond traditional OCR. While OCR only extracts text from images, Vision Image Processing understands the full context of the image, including text, visual elements, relationships between objects, and the overall meaning of the visual content.

Can I edit images after they've been processed?

No. Currently, processed images cannot be edited; they can only be deleted.

How long are processed images stored?

Processed images follow the same data retention policies as your other agent data on CustomGPT.ai.

What are Image Citations?

Image Citations is a feature that automatically displays relevant images alongside your CustomGPT.ai agent’s responses. When your agent references information that was derived from an image in its knowledge base, that image will be displayed next to the relevant text, enhancing user understanding.

How do Image Citations work?

When your agent generates a response that references information from an image in its knowledge base, the system automatically identifies the relevant image and displays it alongside the text. This creates a more comprehensive and intuitive experience for users, particularly when dealing with technical or complex information.

Can I control which images appear as citations?

Not directly. The system automatically determines which images to display based on the information being referenced in the response.

Do Image Citations affect agent performance or speed?

Image Citations are designed to work seamlessly with minimal impact on performance. The system is optimized to display images efficiently without significantly affecting response times.

How do Image Citations enhance the user experience?

Image Citations significantly improve comprehension by providing visual context alongside text explanations. This is especially valuable for:

Technical troubleshooting where seeing a diagram is essential
Educational content where visual aids enhance learning
Product documentation where images clarify features or assembly
Any scenario where “a picture is worth a thousand words”

Will Image Citations work with all types of agents?

Yes, but they’re particularly valuable for agents trained on technical documentation, manuals, educational content, or any knowledge base where visual information enhances understanding.

Is there an additional cost for these features?

No, both Vision Image Processing and Image Citations are included in your existing subscription plan at no additional cost. Usage limits vary by subscription tier.

What are some ideal use cases for these features?

These features are particularly valuable for:

Technical support agents that need to understand and reference diagrams
Educational agents that benefit from visual aids
Documentation agents for products with visual components
Research assistants working with charts and graphs
Any agent where visual information enhances understanding

How will these features integrate with the current data management system?

Initially, Vision Image Processing will integrate with the file upload system. You’ll see a new toggle option similar to the current OCR option when uploading files. Future updates may expand integration to other data sources.

Bring visual intelligence to every conversation
