How AI Agent uses Vision to understand images
Updated a month ago

Vision is a built-in capability that AI Agent uses to read and understand images sent by customers in email conversations.

Vision helps AI Agent to:

Identify customer problems faster
Reduce unnecessary escalations
Improve customer satisfaction by acknowledging and responding to images
Increase automation of straightforward support cases that include visual content, without needing to change your workflows

Vision is automatically enabled when you use AI Agent. There are no extra steps required.

Requirements

You must have an active AI Agent subscription
You must have AI Agent enabled on Email
- Image recognition is not currently available on Chat channels

How Vision works

AI Agent’s Vision uses a large language model (LLM) capable of analyzing visual content to understand images. When a customer sends an image, AI Agent uses the LLM to extract information like text, product details and other context-based insights.

Details from the image analysis and the customer’s message are then given to AI Agent as context to generate a response. With this information, AI Agent can resolve tickets that depend on visual information, such as:

Product issues: a customer shares a photo of a package, and AI Agent can identify missing items or product damage.
Return/refund validation: a customer uploads an image of a product, and AI Agent evaluates eligibility based on your Guidance instructions.
Order processing: a customer shares an image of a receipt or confirmation email, and AI Agent extracts the order number to provide a tracking status update.
Product suggestions: a shopper attaches an image of a black hat, and AI Agent identifies and shares an appropriate matching product from your Shopify catalog.

Note: AI Agent's Vision only applies to images that customers attach to email conversations. AI Agent cannot use Vision to:

read images from your knowledge sources, including your help center articles, public URLs or uploaded documents. Make sure any important information conveyed in images is also written as text in these sources.
generate or send images to customers when responding with answers
read alt text on images

Optimizing AI Vision with Guidance

You can use Guidance to instruct AI Agent on how it uses images. You can tell AI Agent when to proactively ask for an image, or what to look for when a customer shares images in different scenarios to determine next steps.

Here are some examples:

Damaged product handling: “If the image clearly shows a damaged item and the shopper is requesting help, offer to send a replacement or issue a refund based on our refund policy”
- Why this works: Shoppers often include photos of broken or defective products. The AI can now verify that the product is indeed damaged and act according to your policy without escalating to a human.
Missing items: “If a shopper sends an image of the package contents and a product is missing, acknowledge the issue and offer to ship the missing item”
- Why this works: Sometimes customers show their opened package to prove something’s missing. The AI can visually confirm and follow your preset resolution process.
Return label acknowledgement: “If the shopper shares an image of a return label or proof of shipment, confirm receipt and notify them that the return is being processed”
- Why this works: Customers frequently send photos of return labels or tracking slips. The AI Agent can detect this and close the loop without requiring a human to interpret the image.

FAQ

What image formats does AI Vision support?

AI Vision supports common image formats shared by customers over email, including:

PNG (.png)
JPEG (.jpeg or .jpg)
WEBP (.webp)
Non-animated GIFs (.gif)

Vision is not compatible with video files or with documents such as DOCX or PDF files, even if they include images.
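
If you want to check in advance which attachments fall into the supported formats, the list above can be expressed as a simple extension check. This is a minimal sketch for illustration only: the helper name `is_supported_image` and the sample file names are hypothetical, it is not part of AI Agent or any Gorgias API, and it looks only at the file extension, so it cannot tell an animated GIF from a non-animated one.

```python
from pathlib import PurePosixPath

# Extensions taken from the supported-format list in this article.
# Anything else (video, DOCX, PDF, ...) is treated as unsupported.
SUPPORTED_EXTENSIONS = {".png", ".jpeg", ".jpg", ".webp", ".gif"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension matches a listed image format.

    Hypothetical helper: it checks the extension only, so an animated
    GIF would still pass even though only non-animated GIFs are supported.
    """
    return PurePosixPath(filename).suffix.lower() in SUPPORTED_EXTENSIONS


# Sample attachment names (made up for this example).
attachments = ["damaged-box.JPG", "receipt.pdf", "label.webp", "unboxing.mp4"]
supported = [name for name in attachments if is_supported_image(name)]
print(supported)  # ['damaged-box.JPG', 'label.webp']
```

The check is case-insensitive because phone cameras and email clients often produce upper-case extensions such as `.JPG`.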

How does AI Agent handle sensitive or private information in images?

We designed AI Agent with significant privacy controls. If an image contains sensitive information (like a credit card number or other identification details), AI Agent only uses text from the image as part of its conversation with the shopper.

All information is handled via a secure link between Gorgias and our LLMs, after which the data is deleted according to our zero retention policy.

Read our security and privacy FAQ to learn more.

Do customers need to send images in a certain way?

Yes. Currently, AI Vision only supports image attachments on email. An image attachment is any file that has been properly attached to an email (using the 📎 icon), or that has been copy-pasted as an inline attachment.

AI Vision is not currently supported on Chat.

Does AI Vision support multiple languages?

Yes, AI Agent can read text from images in any language that our large language models (LLMs) support, which is more than 80 languages.
