Configuring Vision Operations

Configure the [Image] Read by (Url or Base64) operation.

Configure the [Image] Read by (Url or Base64) Operation

The [Image] Read by (Url or Base64) operation reads and interprets an image based on a prompt.

Apply the [Image] Read by (Url or Base64) operation in various scenarios, such as:

- Image Analysis: Analyze images in business reports, presentations, or customer service scenarios.
- Content Generation: Describe images for blog posts, articles, or social media.
- Visual Insights: Extract insights from images in research or design projects.

To configure the [Image] Read by (Url or Base64) operation:

1. Select the operation on the Anypoint Code Builder or Studio canvas.
2. In the General properties tab for the operation, enter these values:
   - Prompt: Enter the prompt for the operation.
   - Image: Enter the URL or Base64 string of the image file to read.

This is the XML for this operation:

<ms-inference:read-image
    doc:id="dfbd1a61-6e98-4b5b-b77a-bfe031e70d45"
    config-ref="OpenAIConfig"
    doc:name="Read image">
  <ms-inference:prompt>
    <![CDATA[Describe what you see in this image in detail]]>
  </ms-inference:prompt>
  <ms-inference:image-url>
    <![CDATA[https://example.com/image.png]]>
  </ms-inference:image-url>
</ms-inference:read-image>
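If you pass the image as a Base64 string instead of a URL, the file's bytes must be Base64-encoded first. A minimal Python sketch of that encoding step, outside of Mule, for illustration only (the helper name and file path are assumptions, not part of the connector):

```python
import base64

def image_to_base64(path: str) -> str:
    """Read an image file and return its Base64 string,
    suitable for an operation's Base64 image parameter."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

# Round-trip sanity check on raw bytes (no real image file needed):
sample = b"\x89PNG\r\n"
encoded = base64.b64encode(sample).decode("ascii")
assert base64.b64decode(encoded) == sample
```

Inside a Mule flow you would typically produce the same string with DataWeave rather than external code.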
Output Configuration
This operation responds with a JSON payload containing the main LLM response. This is an example response:
{
  "payload": {
    "response": "The image depicts the Eiffel Tower in Paris during a snowy day. The tower is partially covered in snow, and the surrounding trees and ground are also blanketed in snow. There is a pathway leading towards the Eiffel Tower, with a lamppost and some fencing along the sides. The overall scene has a serene and picturesque winter atmosphere."
  }
}
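Downstream components usually only need the response string itself. A minimal Python sketch of extracting it from a payload with the documented shape (the sample text below is abbreviated and hypothetical):

```python
import json

# Example payload mirroring the documented {"payload": {"response": ...}} shape.
payload_json = '{"payload": {"response": "The image depicts the Eiffel Tower in Paris."}}'

def extract_response(raw: str) -> str:
    """Pull the main LLM response string out of the JSON payload."""
    return json.loads(raw)["payload"]["response"]

print(extract_response(payload_json))
```

In a Mule flow the equivalent would be a DataWeave expression selecting `payload.response`.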
The operation also returns attributes that aren't part of the main JSON payload. These include information about token usage, for example:
{
  "attributes": {
    "tokenUsage": {
      "inputCount": 267,
      "outputCount": 68,
      "totalCount": 335
    },
    "additionalAttributes": {
      "finish_reason": "stop",
      "model": "gpt-4o-mini",
      "id": "604ae573-8265-4dc0-b06e-457422f2fbd8"
    }
  }
}
- tokenUsage: Token usage metadata returned as attributes
  - inputCount: Number of tokens used to process the input
  - outputCount: Number of tokens used to generate the output
  - totalCount: Total number of tokens used for input and output
- additionalAttributes: Additional metadata from the LLM provider
  - finish_reason: The finish reason for the LLM response
  - model: The ID of the model used
  - id: The ID of the request
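The attribute fields above can be read back for logging or cost tracking. A minimal Python sketch, assuming the attributes arrive as the JSON shown earlier, that extracts the token usage and checks that totalCount equals inputCount plus outputCount (the helper name is an assumption for illustration):

```python
import json

# Attributes JSON mirroring the documented example.
attributes_json = """{
  "attributes": {
    "tokenUsage": {"inputCount": 267, "outputCount": 68, "totalCount": 335},
    "additionalAttributes": {"finish_reason": "stop", "model": "gpt-4o-mini",
                             "id": "604ae573-8265-4dc0-b06e-457422f2fbd8"}
  }
}"""

def token_usage(raw: str) -> dict:
    """Extract the tokenUsage block and sanity-check its totals."""
    usage = json.loads(raw)["attributes"]["tokenUsage"]
    # Total tokens should be the sum of input and output tokens.
    assert usage["totalCount"] == usage["inputCount"] + usage["outputCount"]
    return usage

print(token_usage(attributes_json))
```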