As of now, the document types that can be properly classified are invoices, receipts, e-mails, bank cards, IDs and salary slips. Classification of delivery notes, credit notes, purchase orders, sales quotes and tax returns does not work properly yet.

Document types marked with a ⭐ should be part of the “out-of-the-box configuration”.

  • Document [would lead to a split of the file]

    • Photo ⭐

    • Identity document

      • International passport ⭐

      • National ID ⭐

    • Communication [has sender/receiver]

      • Transaction

        • Invoice ⭐

          • Hospital invoice

        • Receipt ⭐

          • Pharmacy receipt (BVAC)

        • Purchase order ⭐

        • Quote ⭐

        • Delivery note ⭐

        • Credit card statement ⭐

      • Email ⭐

      • Proof of income

        • Salary slip ⭐

        • Tax return

        • Proof of pension

    • European Accident Form ⭐

    • SECCI document

The hierarchy above shows the document types from a customer perspective. Internally, I would use the is_a field to indicate that one classifier's vote does not contradict another one.

E.g., suppose Athar builds a visual card classifier and “cutter” that separates the cards when there are several on one page. It will then be known that something is a card, but not what type of card. The customer will probably never be interested in document_type card, but if we know this and we also see the MRZ of an identity document, then we know it is a national identity card and not an international passport.
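A minimal sketch of how the card and MRZ votes could be combined, assuming that is_a may list more than one parent per type. The type names, the `IS_A` table and the `resolve` helper are hypothetical and only illustrate the idea; they are not an existing API.

```python
# Hypothetical is_a table: each type maps to the set of types it is a kind of.
# "card" is an internal type that is never exposed to the customer.
IS_A = {
    "identity_document": {"document"},
    "card": {"document"},
    "national_id": {"identity_document", "card"},
    "international_passport": {"identity_document"},
    "bank_card": {"card"},
}


def ancestors(doc_type):
    """Return doc_type itself plus every type reachable through is_a."""
    seen = set()
    stack = [doc_type]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, ()))
    return seen


def resolve(votes):
    """Return the most specific types that are consistent with every vote."""
    all_types = set(IS_A) | {p for parents in IS_A.values() for p in parents}
    # A type is a candidate if each voted type is the type itself or an ancestor.
    candidates = {t for t in all_types if set(votes) <= ancestors(t)}
    # Drop candidates that are merely ancestors of another candidate.
    return {
        t
        for t in candidates
        if not any(t in ancestors(other) for other in candidates if other != t)
    }


# Visual classifier says "card", MRZ detector says "identity_document".
print(resolve({"card", "identity_document"}))
```

With these assumed entries, the two votes together leave only `national_id`, while a "card" vote alone stays ambiguous between `national_id` and `bank_card`.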

Similarly, if something is an email and another classifier votes for “invoice”, we should be able to understand that this is not a contradiction. An invoice can be an email, but a European Accident Form cannot be an invoice.
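One possible way to express this check is sketched below. In the customer-facing tree, Email and Invoice sit on different branches under Communication, so a plain ancestor test would flag them as contradictory. The sketch therefore assumes an extra, hypothetical `MAY_COOCCUR` table next to is_a; the table, the type names and `contradicts` are illustrative assumptions, not part of the current design.

```python
# Hypothetical excerpt of the is_a hierarchy (type -> set of parent types).
IS_A = {
    "communication": {"document"},
    "transaction": {"communication"},
    "invoice": {"transaction"},
    "email": {"communication"},
    "european_accident_form": {"document"},
}

# Assumption: pairs of types that may describe the same file even though
# neither is_a the other. Listing "transaction" covers all of its subtypes.
MAY_COOCCUR = {frozenset({"email", "transaction"})}


def ancestors(doc_type):
    """Return doc_type itself plus every type reachable through is_a."""
    seen = set()
    stack = [doc_type]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, ()))
    return seen


def contradicts(vote_a, vote_b):
    """Return True if the two votes cannot both hold for the same file."""
    up_a, up_b = ancestors(vote_a), ancestors(vote_b)
    # One vote is just a more general form of the other.
    if vote_a in up_b or vote_b in up_a:
        return False
    # The votes (or any of their ancestors) are declared compatible.
    if any(frozenset({x, y}) in MAY_COOCCUR for x in up_a for y in up_b):
        return False
    return True


print(contradicts("email", "invoice"))
print(contradicts("european_accident_form", "invoice"))
```

Under these assumptions the email and invoice votes are compatible, while a European Accident Form vote and an invoice vote are reported as a contradiction.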