Configuring Vision Operations

Configure the [Image] Read by (Url or Base64) operation.

Configure the Image Read by (Url or Base64) Operation

The [Image] Read by (Url or Base64) operation reads and interprets an image based on a prompt.

Apply the [Image] Read by (Url or Base64) operation in various scenarios, such as for:

  • Image Analysis

    Analyze images in business reports, presentations, or customer service scenarios.

  • Content Generation

    Describe images for blog posts, articles, or social media.

  • Visual Insights

    Extract insights from images in research or design projects.

To configure the [Image] Read by (Url or Base64) operation:

  1. Select the operation on the Anypoint Code Builder or Studio canvas.

  2. In the General properties tab for the operation, enter these values:

    • Prompt

      Enter the prompt for the operation.

    • Image

      Enter the URL or Base64 string of the image file to read.

This is the XML for this operation:

<ms-inference:read-image
  doc:id="dfbd1a61-6e98-4b5b-b77a-bfe031e70d45"
  config-ref="OpenAIConfig"
  doc:name="Read image">
    <ms-inference:prompt>
      <![CDATA[Describe what you see in this image in detail]]>
    </ms-inference:prompt>
    <ms-inference:image-url>
      <![CDATA[https://example.com/image.png]]>
    </ms-inference:image-url>
</ms-inference:read-image>
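Because the Image field also accepts a Base64 string, the image can be supplied inline instead of by URL. A minimal Python sketch (the file path is a placeholder, and the inline bytes are only a stand-in for real image data) showing how such a string can be produced before it is handed to the flow:

```python
import base64

def image_to_base64(path):
    """Read an image file and return its contents as a Base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

# Demonstration with in-memory bytes (a PNG signature fragment)
# instead of a real file:
encoded = base64.b64encode(b"\x89PNG\r\n").decode("ascii")
print(encoded)  # iVBORw0K
```

The resulting string is what the operation expects in the Image field when you choose Base64 over a URL.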

Output Configuration

This operation responds with a JSON payload containing the main LLM response. This is an example response:

{
    "payload": {
        "response": "The image depicts the Eiffel Tower in Paris during a snowy day. The tower is partially covered in snow, and the surrounding trees and ground are also blanketed in snow. There is a pathway leading towards the Eiffel Tower, with a lamppost and some fencing along the sides. The overall scene has a serene and picturesque winter atmosphere."
    }
}

The operation also returns attributes that aren’t part of the main JSON payload. These attributes include information about token usage, for example:

{
  "attributes": {
      "tokenUsage": {
          "inputCount": 267,
          "outputCount": 68,
          "totalCount": 335
      },
      "additionalAttributes": {
          "finish_reason": "stop",
          "model": "gpt-4o-mini",
          "id": "604ae573-8265-4dc0-b06e-457422f2fbd8"
      }
  }
}

  • tokenUsage: Token usage metadata returned as attributes

    • inputCount: Number of tokens used to process the input

    • outputCount: Number of tokens used to generate the output

    • totalCount: Total number of tokens used for input and output

  • additionalAttributes: Additional metadata from the LLM provider

    • finish_reason: The finish reason for the LLM response

    • model: The ID of the model used

    • id: The ID of the request
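As a sketch of how these attributes might be consumed downstream (field names taken from the example above; the values are the example's, not guaranteed), token usage can be read and sanity-checked like this:

```python
import json

# Copy of the example attributes returned alongside the payload
attributes = json.loads("""
{
  "tokenUsage": {"inputCount": 267, "outputCount": 68, "totalCount": 335},
  "additionalAttributes": {"finish_reason": "stop", "model": "gpt-4o-mini"}
}
""")

usage = attributes["tokenUsage"]
# totalCount should be the sum of input and output tokens
assert usage["inputCount"] + usage["outputCount"] == usage["totalCount"]
print(f"{usage['totalCount']} tokens ({attributes['additionalAttributes']['model']})")
```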
