Configuring Chat Operations

Configure the CHAT Answer Prompt and CHAT Answer Prompt Streaming operations.

Configure the CHAT Answer Prompt Operation

The CHAT Answer Prompt operation is a simple prompt request operation to the configured LLM. It uses a plain text prompt as input and responds with a plain text answer.

To configure the CHAT Answer Prompt operation:

  1. Select the operation on the Anypoint Code Builder or Studio canvas.

  2. In the General properties tab for the operation, enter these values:

    • Prompt

      The prompt as plain text for the operation.

    • Model name

      The name of the LLM. You can select any model from the supported LLM Providers.

    • Region

      The AWS region.

    • Temperature

      A value between 0 and 1 that regulates the creativity of the LLM's responses. Use a lower temperature for more deterministic responses and a higher temperature for more creative or varied responses to the same prompt from LLMs on Amazon Bedrock. The default value is 0.7.

    • Top p

      The percentage of most-likely candidates that the model considers for the next token. It typically ranges between 0.9 and 0.95. Refer to the model provider documentation to confirm whether this parameter is supported and to verify the acceptable range.

    • Top k

      The number of most-likely candidates that the model considers for the next token. It typically ranges between 0.4 and 0.6. Refer to the model provider documentation to confirm whether this parameter is supported and to verify the acceptable range.

    • Max token count

      The maximum number of tokens to consume during output generation. For consistent and predictable responses, explicitly set the maximum token count based on the model’s supported limits and expected output size.

This is the XML for this operation:

<ms-bedrock:chat-answer-prompt
  doc:name="Chat answer prompt"
  doc:id="23456789-2345-2345-2345-234567890abc"
  config-ref="AWS"
  prompt="#[payload.prompt]"
  modelName="amazon.titan-text-premier-v1:0"
/>
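
The tuning properties from step 2 can also be set directly on the operation. The sketch below adds them to the same example; the attribute names (temperature, topP, maxTokenCount) are assumptions based on the property labels, so confirm the exact names in the XML that Anypoint Code Builder or Studio generates for your configuration:

<!-- Sketch only: the temperature, topP, and maxTokenCount attribute names are assumptions -->
<ms-bedrock:chat-answer-prompt
  doc:name="Chat answer prompt"
  doc:id="23456789-2345-2345-2345-234567890abc"
  config-ref="AWS"
  prompt="#[payload.prompt]"
  modelName="amazon.titan-text-premier-v1:0"
  temperature="0.7"
  topP="0.9"
  maxTokenCount="512"
/>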

Output Configuration

This operation responds with a JSON payload. This is an example response:

{
    "inputTextTokenCount": 6,
    "results": [
        {
            "tokenCount": 69,
            "outputText": "Bern is the capital of Switzerland.\n\nBern is the capital of the Swiss Confederation. The municipality is located at the confluence of the Aare River into the river of the same name, and is the eighth-most populous city in Switzerland, with a population of around 134,200.",
            "completionReason": "FINISH"
        }
    ]
}
  • inputTextTokenCount

    The number of tokens used to process the input.

  • results

    • tokenCount: The number of tokens used to generate the output.

    • outputText: The response from the LLM to the prompt that was sent.

    • completionReason: The reason the response finished being generated. Possible values:

      • FINISH – The response was fully generated.

      • LENGTH – The response was truncated because of the response length you set.

      • STOP_CRITERIA_MET – The response was truncated because the stop criteria were met.

      • RAG_QUERY_WHEN_RAG_DISABLED – The feature is disabled and cannot complete the query.

      • CONTENT_FILTERED – The contents were filtered or removed by the content filter applied.
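
As a sketch of how a flow might consume this response, the following snippet branches on completionReason and extracts outputText from the first result. The Choice router, Logger, and Set Payload components are standard Mule components; the routing logic itself is illustrative and not part of the connector:

<!-- Illustrative handling of the JSON response shown above -->
<choice doc:name="Check completion reason">
  <when expression="#[payload.results[0].completionReason == 'LENGTH']">
    <!-- The model stopped at the token limit; warn and keep the partial text -->
    <logger level="WARN" message="Response truncated; consider increasing Max token count." doc:name="Warn truncated" />
    <set-payload value="#[payload.results[0].outputText]" doc:name="Keep partial text" />
  </when>
  <otherwise>
    <!-- Normal completion; keep only the generated text -->
    <set-payload value="#[payload.results[0].outputText]" doc:name="Extract output text" />
  </otherwise>
</choice>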

Configure the CHAT Answer Prompt Streaming Operation

The CHAT Answer Prompt Streaming operation is similar to the CHAT Answer Prompt operation but streams the response incrementally, allowing for real-time display of the response as it is generated.

To configure the CHAT Answer Prompt Streaming operation:

  1. Select the operation on the Anypoint Code Builder or Studio canvas.

  2. In the General properties tab for the operation, enter these values:

    • Prompt

      The prompt as plain text for the operation.

    • Model name

      The name of the LLM. You can select any model from the supported LLM Providers.

    • Region

      The AWS region.

    • Temperature

      A value between 0 and 1 that regulates the creativity of the LLM's responses. Use a lower temperature for more deterministic responses and a higher temperature for more creative or varied responses to the same prompt from LLMs on Amazon Bedrock. The default value is 0.7.

    • Top p

      The percentage of most-likely candidates that the model considers for the next token. It typically ranges between 0.9 and 0.95. Refer to the model provider documentation to confirm whether this parameter is supported and to verify the acceptable range.

    • Top k

      The number of most-likely candidates that the model considers for the next token. It typically ranges between 0.4 and 0.6. Refer to the model provider documentation to confirm whether this parameter is supported and to verify the acceptable range.

    • Max token count

      The maximum number of tokens to consume during output generation. For consistent and predictable responses, explicitly set the maximum token count based on the model’s supported limits and expected output size.

This is the XML for this operation:

<ms-bedrock:chat-answer-prompt-streaming
  doc:name="Chat answer prompt streaming"
  doc:id="34567890-3456-3456-3456-345678901234"
  config-ref="AWS"
  prompt="#[payload.prompt]"
  modelName="amazon.titan-text-express-v1"
/>
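
One way to exercise this operation end to end is behind an HTTP listener, so the streamed response is relayed to the caller as it is generated. The flow below is an illustrative sketch: the listener configuration name, the path, and the assumption that the request body is JSON with a prompt field are placeholders, not part of the connector:

<!-- Illustrative flow; the listener config name and path are placeholders -->
<flow name="bedrock-chat-streaming-flow">
  <http:listener config-ref="HTTP_Listener_config" path="/chat/stream" doc:name="Listener" />
  <!-- Assumes the incoming request body is JSON with a "prompt" field -->
  <ms-bedrock:chat-answer-prompt-streaming
    doc:name="Chat answer prompt streaming"
    config-ref="AWS"
    prompt="#[payload.prompt]"
    modelName="amazon.titan-text-express-v1"
  />
</flow>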

Output Configuration

This operation streams the response using Server-Sent Events (SSE). The response is delivered incrementally as the model generates it, allowing for real-time display of the response in chat applications.
