Configuring Transform Operations
Configure the [Transform] Parse document and [Transform] Chunk text operations.
Configure the Transform Parse Document Operation
The [Transform] Parse document operation parses a document from raw binary or Base64-encoded content.
To configure the [Transform] Parse document operation:
- Select the operation on the Anypoint Code Builder or Studio canvas.
- In the General properties tab for the operation, enter these values:
  - Document binary
    Enter the raw binary or Base64-encoded content of the document to parse.
  - Document parser
    Enter the document parser to use.
This is the XML for this operation:
<ms-vectors:transform-parse-document
    doc:name="[Transform] Parse document"
    doc:id="a1b2c3d4-e5f6-7890-abcd-ef1234567890"
    config-ref="MuleSoft_Vectors_Connector_Document_config"
    documentBinary="#[payload.documentPath]"
    documentParser="text">
</ms-vectors:transform-parse-document>
Output Configuration
This operation responds with a JSON payload. This is an example response:
{
In the modern world, technological advancements have become essential for businesses to remain competitive. E-commerce giants have redefined the retail landscape through innovative use of technology and data analytics.
}
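The Document binary field accepts Base64-encoded content as well as raw binary. As a minimal sketch of how such content could be prepared outside a Mule flow, the following Python uses only the standard library. The function names `encode_document`, `decode_document`, and `encode_file` are hypothetical helpers for illustration and are not part of the connector.

```python
import base64
from pathlib import Path


def encode_document(raw: bytes) -> str:
    """Return the Base64 text form of raw document bytes."""
    return base64.b64encode(raw).decode("ascii")


def decode_document(encoded: str) -> bytes:
    """Reverse encode_document, recovering the original bytes."""
    return base64.b64decode(encoded)


def encode_file(path: str) -> str:
    """Read a file from disk and return its Base64-encoded content.

    The path is supplied by the caller; no particular location is assumed.
    """
    return encode_document(Path(path).read_bytes())
```

The encoded string is plain ASCII, so it can be carried in a JSON or XML payload and passed to the operation as the document content.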
Configure the Transform Chunk Text Operation
The [Transform] Chunk text operation splits the input text into smaller segments according to the maximum segment size and maximum overlap size that you specify. The result is returned as a JSON document containing the chunked text segments and associated metadata.
To configure the [Transform] Chunk text operation:
- Select the operation on the Anypoint Code Builder or Studio canvas.
- In the General properties tab for the operation, enter these values:
  - Text
    Enter the text content to chunk.
  - Max Segment Size (Characters)
    Enter the maximum size of a segment in characters.
  - Max Overlap Size (Characters)
    Enter the maximum overlap between segments in characters.
This is the XML for this operation:
<ms-vectors:transform-chunk-text
    doc:name="[Transform] Chunk text"
    doc:id="b2c3d4e5-f6a7-8901-bcde-f23456789012"
    config-ref="MuleSoft_Vectors_Connector_Document_config"
    text="In the modern world, technological advancements have become essential for businesses to remain competitive. E-commerce giants have redefined the retail landscape through innovative use of technology and data analytics."
    maxSegmentSize="1000"
    maxOverlapSize="100">
</ms-vectors:transform-chunk-text>
Output Configuration
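This operation responds with a JSON payload. This is an illustrative response showing the shape of the output; the actual segment boundaries depend on the maximum segment size and maximum overlap size that you configure: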
["In the modern world, technological advancements have become essential for businesses to", "remain competitive. E-commerce giants have redefined the retail landscape through innovative use of technology and data analytics"]
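To make the roles of the maximum segment size and maximum overlap size concrete, here is a minimal Python sketch of fixed-window chunking with overlap. It is an assumption-laden illustration, not the connector's implementation: the function name `chunk_text` and its parameters are hypothetical, and the connector's actual splitting strategy is not described here and may prefer word or sentence boundaries rather than cutting at exact character offsets.

```python
def chunk_text(text: str, max_segment_size: int, max_overlap_size: int) -> list[str]:
    """Split text into segments of at most max_segment_size characters.

    Each segment after the first repeats the last max_overlap_size
    characters of the previous segment. This is a naive character-window
    sketch, not the connector's algorithm.
    """
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")
    if not 0 <= max_overlap_size < max_segment_size:
        raise ValueError("max_overlap_size must be >= 0 and < max_segment_size")

    # Each new window starts this many characters after the previous one.
    step = max_segment_size - max_overlap_size
    segments = []
    start = 0
    while start < len(text):
        segments.append(text[start:start + max_segment_size])
        # Stop once the current window has reached the end of the text.
        if start + max_segment_size >= len(text):
            break
        start += step
    return segments
```

In this sketch, `chunk_text("abcdefghij", 4, 1)` returns `["abcd", "defg", "ghij"]`: every segment is at most four characters long and shares one character with its predecessor. A text shorter than the maximum segment size comes back as a single segment.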