Configuring Moderation Operations

Configure the [Toxicity] Detection by Text operation.

Configure the Toxicity Detection by Text Operation

The [Toxicity] Detection by Text operation classifies and scores harmful content in text produced by either the user or the LLM.

Apply the [Toxicity] Detection by Text operation in various scenarios, such as:

- Toxic Inputs Detection: Detect and block toxic user input before it is sent to the LLM (see the flow sketch after the XML example below).
- Harmful Responses Detection: Filter out LLM responses that users could consider toxic or offensive.

To configure the [Toxicity] Detection by Text operation:

1. Select the operation on the Anypoint Code Builder or Studio canvas.
2. In the General properties tab for the operation, enter these values:
   - Text: The text to check for harmful content.

This is the XML for this operation:

<ms-inference:toxicity-detection-text
    doc:name="Toxicity detection text"
    doc:id="b5770a5b-d3f9-47ba-acec-ab0bd41e4188"
    config-ref="OpenAIConfig">
    <ms-inference:text>
        <![CDATA[You are fat]]>
    </ms-inference:text>
</ms-inference:toxicity-detection-text>
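For the toxic-input scenario, you could wrap this operation in a flow that checks the prompt before it reaches the LLM and rejects flagged requests. The following configuration is a minimal sketch, not a documented pattern: the flow name block-toxic-input-flow and the error type APP:TOXIC_INPUT are illustrative, the Text parameter is assumed to accept a DataWeave expression such as #[payload], and the response is assumed to arrive as application/json with a top-level flagged field.

<!-- Illustrative sketch: reject a toxic prompt before it is sent to the LLM. -->
<flow name="block-toxic-input-flow">
    <!-- Check the incoming payload; assumes the Text parameter accepts a DataWeave expression. -->
    <ms-inference:toxicity-detection-text
        doc:name="Toxicity detection text"
        config-ref="OpenAIConfig">
        <ms-inference:text><![CDATA[#[payload]]]></ms-inference:text>
    </ms-inference:toxicity-detection-text>
    <choice doc:name="Flagged?">
        <!-- Assumes the response is application/json with a top-level "flagged" boolean. -->
        <when expression="#[payload.flagged == true]">
            <raise-error type="APP:TOXIC_INPUT"
                description="Request rejected: the input was flagged as toxic."/>
        </when>
        <otherwise>
            <logger level="INFO" message="Input passed the toxicity check."/>
        </otherwise>
    </choice>
</flow>

Raising an error here lets a flow-level error handler map the rejection to a friendly response instead of forwarding the toxic text to the LLM.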
Output Configuration
This operation responds with a JSON payload that contains the toxicity flag and a score for each category. This is an example response:
{
"payload": {
"flagged": true,
"categories": [
{
"illicit/violent": 0.0000025466403947055455,
"self-harm/instructions": 0.00023480495744356635,
"harassment": 0.9798945372458964,
"violence/graphic": 0.000005920916517463734,
"illicit": 0.000013552078562406772,
"self-harm/intent": 0.0002233150331012493,
"hate/threatening": 0.0000012029639084557005,
"sexual/minors": 0.0000024300240743279605,
"harassment/threatening": 0.0007499928075102617,
"hate": 0.00720390551996062,
"self-harm": 0.0004822186797755494,
"sexual": 0.00012644219446392274,
"violence": 0.0004960569708019355
}
]
}
}
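Downstream components can act on this payload, for example to log or return only the most relevant category. The following transformation is a sketch under the assumption that flagged and categories are readable directly from the message payload as application/json (adjust the selectors if your response nests them under a payload key, as in the example above); the doc:name value is illustrative.

<!-- Illustrative sketch: summarize the result as the flag plus the highest-scoring category. -->
<!-- Requires the Mule ee namespace for ee:transform. -->
<ee:transform doc:name="Summarize toxicity scores">
    <ee:message>
        <ee:set-payload><![CDATA[%dw 2.0
output application/json
// Category scores are keyed by name inside the first element of "categories".
var scores = payload.categories[0] default {}
var ranked = scores pluck ((score, category) -> { category: category as String, score: score })
---
{
    flagged: payload.flagged,
    // Highest-scoring entry, for example "harassment" in the response shown above.
    topCategory: ranked maxBy ((entry) -> entry.score)
}]]></ee:set-payload>
    </ee:message>
</ee:transform>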