IDP Overview

MuleSoft Intelligent Document Processing (IDP) enables you to read invoices, purchase orders, and other unstructured or semi-structured documents and then analyze and refine the extracted content using AI capabilities to create a structured response.

With the simple IDP interface, you can create and publish document actions as APIs to use for further integration with RPA, Mule applications, and other systems, without subscribing to external services.

A document action is a multi-step process that uses multiple AI engines to scan a document, filter out fields, and return a structured response as a JSON object. Each document action defines the types of documents it expects as input, the fields to extract, and the fields to filter out from the response.

Configure prompts to extract specific data from a document using questions in natural language, for example:

  • What is the subtotal amount?

  • What is the grand total?

  • When is the due date?

  • What is the highest price?

The confidence score represents the probability that IDP has correctly extracted a value from a document. For example, a confidence score of 100% means that IDP is fully confident in the extracted value, while a confidence score of 75% means that there’s a 25% chance that the extracted value is incorrect.

Each processed document shows a confidence score for each extracted field. When this value is lower than the defined threshold, IDP sends the document for review by a human to verify the accuracy of the extracted values. You can add single reviewers or teams to each document action.
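The threshold check described above can be sketched in a few lines of Python. This is an illustrative sketch only, not IDP's implementation or API: the `ExtractedField` class, the function names, the sample values, and the representation of confidence as a fraction between 0.0 and 1.0 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractedField:
    """Hypothetical container for one extracted field (not an IDP type)."""
    name: str
    value: str
    confidence: float  # assumed to be a probability between 0.0 and 1.0


def fields_below_threshold(fields: List[ExtractedField], threshold: float) -> List[str]:
    """Return the names of fields whose confidence is lower than the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


def needs_review(fields: List[ExtractedField], threshold: float) -> bool:
    """A document goes to human review if any field falls below the threshold."""
    return len(fields_below_threshold(fields, threshold)) > 0


# Sample data invented for illustration.
fields = [
    ExtractedField("subtotal", "120.00", 0.98),
    ExtractedField("due_date", "2024-09-30", 0.75),
]

low = fields_below_threshold(fields, 0.80)
print(low)                          # names of fields that would trigger a review
print(needs_review(fields, 0.80))   # whether the document needs a reviewer
```

With a threshold of 0.80, only the `due_date` field (75% confidence) falls below the threshold, so this sample document would be routed to a reviewer. Lowering the threshold to 0.70 would let it pass without review.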

For an introduction to IDP, see our Trailhead badge, MuleSoft IDP Basics (login required). Sign up if you don’t have a Trailhead account.

Analyze Documents With Custom User-Defined Schemas

Analyze any type of document and fully customize the output structure by creating a Generic document action and enabling Customize Schema. You can define fields and tables in the output structure and configure instructions for Einstein to analyze the document and extract the data for each field.
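To make the idea of a custom output structure concrete, the following Python sketch models fields and tables, each with its own extraction instruction. The class names, the field names, and the resulting JSON layout are hypothetical and chosen only for illustration; they do not represent the schema format that IDP uses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SchemaField:
    """Hypothetical field definition: a name plus a natural-language instruction."""
    name: str
    instruction: str


@dataclass
class SchemaTable:
    """Hypothetical table definition: a name plus one SchemaField per column."""
    name: str
    columns: List[SchemaField] = field(default_factory=list)


@dataclass
class OutputSchema:
    """Hypothetical output structure combining top-level fields and tables."""
    fields: List[SchemaField] = field(default_factory=list)
    tables: List[SchemaTable] = field(default_factory=list)


# Example schema for a driver's license; all names are invented for illustration.
schema = OutputSchema(
    fields=[
        SchemaField("license_number", "Return the license number exactly as printed."),
        SchemaField("expiration_date", "Return the expiration date in YYYY-MM-DD format."),
    ],
    tables=[
        SchemaTable(
            "restrictions",
            columns=[
                SchemaField("code", "Return the restriction code."),
                SchemaField("description", "Return the restriction description."),
            ],
        )
    ],
)

# Serialize the nested dataclasses to JSON for inspection.
schema_json = json.dumps(asdict(schema), indent=2)
print(schema_json)
```

Pairing each field with its own instruction mirrors the description above, where every field in the output structure has an instruction that tells Einstein how to extract its value.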

Einstein supports these predictive models:

  • OpenAI’s GPT-4o (gpt-4o-2024-08-06) LLM

  • OpenAI’s GPT-4o Mini (gpt-4o-mini-2024-07-18) LLM

Einstein accesses these models through the Salesforce Einstein Trust layer, which is part of the Salesforce Einstein platform.

Select the model to use during document analysis by configuring Settings in the document action editor.

Document actions created before February 5th support only OpenAI’s GPT-4o (gpt-4o-2024-05-13). To enable model selection, create a new document action.

Enhance Data Extraction With Einstein

When you add prompts to your Invoice or Purchase Order document actions, you can select whether to use the default natural language processing model (IDP NLP) or Einstein to generate an answer for each of your prompts. Einstein can answer complex questions that require further analysis of the document instead of just searching and extracting a field. For example, you can ask Einstein for the total amount due on an invoice after deducting taxes and other values from the document.

Use Einstein to analyze documents that don’t use a standard format or are difficult to read without performing a complex analysis of the extracted data, such as a driver’s license or a certificate of medical leave.

Einstein doesn’t use customer data to train any models for document analysis in IDP.