A format defines what you want to extract from a certain type of document. It determines the structure of the output and is clearly visible in the data view of the Data Entry Companion.

...

  • A file is a container in which data is stored. It can be a .pdf, .jpg, .eml, .zip, and so on. The illustration above shows one file.

  • A document is a representation of information that can be understood by a human. Examples are invoices, receipts, ID cards, and so on. Each file contains at least one document but can contain more than one. In the illustration above, the one file contains 3 documents: a contract, an invoice, and an ID card.

  • A page is one side of a document. In the review pane, you can view one page at a time. Page splitting enables the classification of different formats within one file. Each file contains at least one page, and the same goes for documents. The illustration below has 4 pages (1 page of a contract, 2 pages of invoices, and 1 page of an identity card).

  • A section is a part of a bigger item: a file can be split into different documents (sections), and a page can be split into different subpages (sub-sections). Subpage splitting enables the detection of different sections within one page. Sections usually come up when smaller receipts or ID cards are processed. For example, the front and back sides of an ID card are usually saved on a single page, and several receipts are often grouped together on a single page. The illustration below shows the fourth page divided into 2 sections.
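
The hierarchy described above (file, document, page, section) can be sketched as a small data model. This is only an illustration of the concepts, not the product's API: the class names, field names, the file name "scan.pdf", and the "front"/"back" section labels are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    # A sub-part of a page (a subpage). Labels are hypothetical.
    label: str


@dataclass
class Page:
    number: int
    # Empty when the page is not split into subpages.
    sections: List[Section] = field(default_factory=list)


@dataclass
class Document:
    kind: str
    # Each document has at least one page.
    pages: List[Page] = field(default_factory=list)


@dataclass
class File:
    name: str
    # Each file contains at least one document.
    documents: List[Document] = field(default_factory=list)

    def all_pages(self) -> List[Page]:
        """Flatten the pages of every document in file order."""
        return [page for doc in self.documents for page in doc.pages]


# One file with 3 documents and 4 pages; the fourth page has 2 sections.
example = File(
    name="scan.pdf",  # hypothetical file name
    documents=[
        Document("contract", [Page(1)]),
        Document("invoice", [Page(2), Page(3)]),
        Document("id_card", [Page(4, [Section("front"), Section("back")])]),
    ],
)

print(len(example.documents))                 # 3
print(len(example.all_pages()))               # 4
print(len(example.all_pages()[3].sections))   # 2
```

Modeling sections as an optional list on each page reflects that subpage splitting only applies to some pages; a page without sections is simply treated as a whole.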

For fields in tables, only one scope can be chosen per table.

Multiple and the importance of page/subpage splitting

...