| INDEX | FORMAT NAME | DESCRIPTION |
|---|---|---|
| 1 | invoice | recognises invoices |
| 2 | receipt | recognises receipts |
| 3 | e-mail | recognises e-mails |
| 4 | bank_card | recognises bank cards |
| 5 | ID | recognises IDs |
| 6 | salary_slip | recognises salary slips, is mainly trained on Belgian ones |
| 7 | secci-format | does not work well yet for classification, only extraction for supported fields |
| 8 | delivery_note | does not work well yet for classification, only extraction for supported fields |
| 9 | credit_note | does not work well yet for classification, only extraction for supported fields |
| 10 | purchase_order | does not work well yet for classification, only extraction for supported fields |
| 11 | quote | work in progress |
| 12 | tax_return | work in progress |

As of now, the documents that can be properly classified are invoices, receipts, e-mails, bank cards, IDs and salary slips. Classification of SECCI documents, delivery notes, credit notes, purchase orders, sales quotes and tax returns does not work properly yet.

Document types marked with a ⭐ should be part of the “out-of-the-box configuration”:

  • Document [would lead to a split of the file]

    • Photo ⭐

    • Identity document

      • International passport ⭐

      • National ID ⭐

    • Communication [has sender/receiver]

      • Transaction

        • Invoice ⭐

          • Hospital invoice

        • Receipt ⭐

          • Pharmacy receipt (BVAC)

        • Purchase order ⭐

        • Quote ⭐

        • Delivery note ⭐

        • Credit card statement ⭐

      • Email ⭐

      • Proof of income

        • Salary slip ⭐

        • Tax return

        • Proof of pension

    • European Accident Form ⭐

    • SECCI document

The above is the document hierarchy from a customer perspective. Internally, I would use the is_a field to indicate that a classifier's vote does not contradict another one.

E.g., if Athar builds a visual card classifier and a “cutter” to separate multiple cards on one page, we will know that something is a card, but not what type of card. The customer will probably never be interested in the document_type card, but if we know this and we also see the MRZ of an identity document, then we know it is a national identity card and not an international passport.
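A minimal sketch of how such a refinement could work, assuming is_a is stored as a list of broader types per document type. The IS_A table, the type names and the refine helper are hypothetical illustrations, not the actual implementation:

```python
# Hypothetical is_a table: each type lists the broader types it belongs to.
# "card" is an internal type that customers never see.
IS_A = {
    "national_id": ["identity_document", "card"],
    "international_passport": ["identity_document"],
    "bank_card": ["card"],
    "identity_document": ["document"],
    "card": ["document"],
    "document": [],
}


def ancestors(doc_type):
    """Return every broader type reachable through is_a links."""
    seen = set()
    stack = list(IS_A.get(doc_type, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A.get(parent, []))
    return seen


def refine(votes):
    """Return the most specific types that are consistent with every vote."""
    candidates = []
    for doc_type in IS_A:
        # A leaf is a type that no other type names as a parent.
        is_leaf = not any(doc_type in parents for parents in IS_A.values())
        covered = ancestors(doc_type) | {doc_type}
        if is_leaf and all(vote in covered for vote in votes):
            candidates.append(doc_type)
    return sorted(candidates)


# The card classifier votes "card"; the MRZ detector votes "identity_document".
print(refine(["card", "identity_document"]))  # ['national_id']
```

With only the "card" vote, both bank_card and national_id remain candidates; the MRZ vote is what narrows the result down.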

Similarly, if something is an email and another classifier votes for “invoice”, we should be able to understand that this is not a contradiction. An invoice can be an email, but a European Accident Form cannot be an invoice.
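One way to sketch this contradiction check, assuming a loose reading of is_a in which a type lists every broader type it may also be (so invoice lists email as well as transaction). The table contents and function names below are hypothetical and would need to be aligned with the real internal model:

```python
from itertools import combinations

# Hypothetical is_a links. Listing "email" under "invoice" is an assumption
# that encodes "an invoice can be an email".
IS_A = {
    "invoice": ["transaction", "email"],
    "transaction": ["communication"],
    "email": ["communication"],
    "communication": ["document"],
    "european_accident_form": ["document"],
    "document": [],
}


def is_a(doc_type, broader):
    """True if doc_type equals broader or reaches it through is_a links."""
    if doc_type == broader:
        return True
    return any(is_a(parent, broader) for parent in IS_A.get(doc_type, []))


def compatible(a, b):
    """Two votes agree when one is a more specific form of the other."""
    return is_a(a, b) or is_a(b, a)


def contradictions(votes):
    """Return every pair of votes that cannot both be true."""
    return [(a, b) for a, b in combinations(votes, 2) if not compatible(a, b)]


print(contradictions(["email", "invoice"]))
print(contradictions(["european_accident_form", "invoice"]))
```

Under these assumptions the first call reports no conflict, while the second reports the European Accident Form and invoice votes as contradictory.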
