How AI Agent uses Vision to understand images
Vision is a built-in capability that AI Agent uses to read and understand images sent by customers in email conversations.
Vision helps AI Agent to:
- Identify customer problems faster
- Reduce unnecessary escalations
- Improve customer satisfaction by acknowledging and responding to images
- Increase automation of straightforward support cases that include visual content, without needing to change your workflows
Vision is automatically enabled when you use AI Agent. There are no extra steps required.
Requirements
- You must have an active AI Agent subscription
- You must have AI Agent enabled on Email
- Image recognition is not currently available on Chat channels
How Vision works
AI Agent’s Vision uses a large language model (LLM) capable of analyzing visual content to understand images. When a customer sends an image, AI Agent uses the LLM to extract information like text, product details and other context-based insights.
Details from the image analysis and the customer’s message are then given to AI Agent as context to generate a response. With this information, AI Agent can resolve tickets that depend on visual information, such as:
- Product issues: a customer shares a photo of a package. AI Agent can identify missing items or product damage.
- Return/refund validation: a customer uploads an image of a product. AI Agent evaluates eligibility based on your Guidance instructions.
- Order processing: a customer shares an image of a receipt or confirmation email. AI Agent extracts the order number to provide a tracking status update.
- Product suggestions: a shopper attaches an image of a black hat. AI Agent identifies and shares an appropriate matching product from your Shopify catalog.
Note: AI Agent's Vision only applies to images that customers attach to email conversations. AI Agent cannot use Vision to:
- read images from your knowledge sources, including your help center articles, public URLs, or uploaded documents. Make sure any important information conveyed in images is also written as text in these sources.
- generate or send images to customers when responding with answers
- read alt text on images
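For readers who think in code, the flow described in this section can be pictured with a short Python sketch. It is illustrative only and is not how AI Agent is implemented: the `EmailMessage` class, the `build_agent_context` function, and the field names are all hypothetical, and the image analysis step is represented by plain text descriptions that stand in for whatever the LLM extracts.

```python
from dataclasses import dataclass, field


@dataclass
class EmailMessage:
    """Hypothetical stand-in for an incoming customer email."""
    body: str
    # Placeholder for what the image analysis extracted from each attached image.
    image_descriptions: list[str] = field(default_factory=list)


def build_agent_context(message: EmailMessage, guidance: list[str]) -> str:
    """Combine the customer's text, the image findings, and Guidance into one context string."""
    parts = ["Customer message: " + message.body]
    for i, description in enumerate(message.image_descriptions, start=1):
        parts.append(f"Image {i} analysis: {description}")
    for rule in guidance:
        parts.append("Guidance: " + rule)
    return "\n".join(parts)


if __name__ == "__main__":
    msg = EmailMessage(
        body="My order arrived broken.",
        image_descriptions=["cracked mug in an open box"],
    )
    print(build_agent_context(msg, ["Offer a replacement for damaged items."]))
```

The point of the sketch is only that the image findings and the customer's message arrive together as context, which is why Guidance can refer to what an image shows.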
Optimizing AI Vision with Guidance
You can use Guidance to instruct AI Agent on how it uses images. You can tell AI Agent when to proactively ask for an image, or what to look for when a customer shares images in different scenarios to determine next steps.
Here are some examples:
- Damaged product handling: “If the image clearly shows a damaged item and the shopper is requesting help, offer to send a replacement or issue a refund based on our refund policy.”
  - Why this works: Shoppers often include photos of broken or defective products. The AI can now verify that the product is indeed damaged and act according to your policy without escalating to a human.
- Missing items: “If a shopper sends an image of the package contents and a product is missing, acknowledge the issue and offer to ship the missing item.”
  - Why this works: Sometimes customers show their opened package to prove something’s missing. The AI can visually confirm and follow your preset resolution process.
- Return label acknowledgement: “If the shopper shares an image of a return label or proof of shipment, confirm receipt and notify them that the return is being processed.”
  - Why this works: Customers frequently send photos of return labels or tracking slips. The AI Agent can detect this and close the loop without requiring a human to interpret the image.
FAQ
What image formats does AI Vision support?
AI Vision supports common image formats shared by customers over email, including JPG and PNG files.
Vision is not compatible with video files or with documents such as DOCX or PDF files, even if they include images.
Do customers need to send images in a certain way?
Yes. Currently, AI Vision only supports image attachments sent by email. An image attachment is any file that has been attached to an email (using the 📎 icon) or copy-pasted into it as an inline attachment.
AI Vision is not currently supported on Chat channels.
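If you want to estimate which customer attachments Vision could read, the rules in this FAQ can be approximated with a small Python check. This is a rough sketch under stated assumptions, not an official tool: the function name `is_vision_eligible` and the channel labels are hypothetical, and the extension list contains only the formats named above (JPG and PNG, plus `.jpeg` as the common alternate JPG extension). Other image formats may also be supported.

```python
from pathlib import PurePath

# Assumption: only the formats named in this article are listed here.
# ".jpeg" is included as the common alternate extension for JPG files.
SUPPORTED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_vision_eligible(filename: str, channel: str) -> bool:
    """Return True if the attachment looks like an image Vision could read."""
    # Vision only applies to images that customers attach to email conversations.
    if channel.lower() != "email":
        return False
    # Videos and documents (DOCX, PDF) are excluded because they are not in the set.
    return PurePath(filename).suffix.lower() in SUPPORTED_IMAGE_EXTENSIONS


if __name__ == "__main__":
    for name, channel in [("photo.JPG", "email"), ("manual.pdf", "email"), ("photo.png", "chat")]:
        print(name, channel, is_vision_eligible(name, channel))
```

The check looks only at the file extension and channel, so it cannot tell whether an image is clear enough to be understood.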