Main concepts

Extracting values
The reciTAL Extract platform allows you to extract data from unstructured documents, such as PDF, Word or PowerPoint files.

Extracting data from a document is a four-step process:

  • Authenticate on the platform
  • Determine the type of document
  • Upload the document
  • Download the results
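The steps listed above can be sketched in code. This is only an illustration of the order of operations: the `ExtractClient` interface, its method names, and the `FakeExtractClient` stand-in are all hypothetical and do not describe the actual reciTAL Extract API. The fake client works in memory so the sketch runs without network access.

```python
from typing import Dict, Protocol


class ExtractClient(Protocol):
    """Hypothetical interface, NOT the real reciTAL Extract API."""

    def authenticate(self, username: str, password: str) -> None: ...
    def upload(self, doc_type: str, filename: str, content: bytes) -> str: ...
    def download_results(self, document_id: str) -> Dict[str, str]: ...


class FakeExtractClient:
    """In-memory stand-in used only to make the sketch runnable."""

    def __init__(self) -> None:
        self._authenticated = False
        self._documents: Dict[str, str] = {}

    def authenticate(self, username: str, password: str) -> None:
        # A real client would exchange credentials for a session or token.
        self._authenticated = True

    def upload(self, doc_type: str, filename: str, content: bytes) -> str:
        if not self._authenticated:
            raise PermissionError("authenticate before uploading")
        document_id = f"doc-{len(self._documents) + 1}"
        self._documents[document_id] = doc_type
        return document_id

    def download_results(self, document_id: str) -> Dict[str, str]:
        if not self._authenticated:
            raise PermissionError("authenticate before downloading")
        # Placeholder payload: no extracted values are invented here.
        return {"document_id": document_id, "doc_type": self._documents[document_id]}


def extract(client: ExtractClient, username: str, password: str,
            doc_type: str, filename: str, content: bytes) -> Dict[str, str]:
    client.authenticate(username, password)                   # 1. authenticate
    # 2. the DocType is determined by the caller and passed in as doc_type
    document_id = client.upload(doc_type, filename, content)  # 3. upload
    return client.download_results(document_id)               # 4. download


result = extract(FakeExtractClient(), "user", "secret",
                 "Invoice", "invoice.pdf", b"%PDF-placeholder")
print(result)
```

Writing the workflow against an interface keeps the four steps visible and lets a real client be swapped in for the fake one.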

The type of a document is called its DocType. It can be, for example, an Invoice, a Quote, a Receipt, a Contract or a Government form.
For each DocType, you define the DataPoints you want to extract, such as the name of the supplier, the address of the supplier, the amount excluding tax or the amount including tax.
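The relationship between a DocType and its DataPoints can be pictured as a small data model. The sketch below is an assumption made for illustration: the `DocType` and `DataPoint` classes and their fields are hypothetical and are not the platform's actual schema. The DataPoint names are taken from the invoice example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataPoint:
    """One value to extract from a document (hypothetical model)."""
    name: str


@dataclass
class DocType:
    """A document type and the DataPoints expected in it (hypothetical model)."""
    name: str
    data_points: List[DataPoint] = field(default_factory=list)

    def data_point_names(self) -> List[str]:
        # Return the names in the order they were declared.
        return [dp.name for dp in self.data_points]


invoice = DocType(
    name="Invoice",
    data_points=[
        DataPoint("Name of the supplier"),
        DataPoint("Address of the supplier"),
        DataPoint("Amount tax excluded"),
        DataPoint("Amount tax included"),
    ],
)

print(invoice.data_point_names())
```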

When using reciTAL Extract, it is assumed that the DocType of the files to be processed is known. If not, you can use the reciTAL Classify solution to automatically identify the type of your documents (link).

DocTypes must be defined before data can be extracted. You can define a DocType through the Extract User Interface on
