...

Code Block
languagejson
{
  "key_value_pairs": {
    "rule_config": {
      "email_coming_from": {
        "no_reply": {
          "rules": [
            {
              "confidence": 97,
              "+rule": ["L:no-reply@contract.fit"]
            }
          ]
        },
        "info": {
          "rules": [
            {
              "confidence": 97,
              "+rule": ["L:info@contract.fit", "L:"]                
            }
          ]
        }
      }
    }
  }
}

For each tag field (the level of no_reply and info), you can specify a list of rules, each with its own confidence. Of the rules that match, the one with the highest confidence is presented in the prediction. For example, if no-reply@contract.fit appears anywhere in the uploaded document, a prediction with confidence 97 for the tag_option no_reply is returned.
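
The selection of the highest-confidence match can be illustrated with a minimal Python sketch. This is not the product's actual matching engine. It assumes that an "L:" prefix marks a literal string to look for, and that a rule matches when any pattern in its +rule list is found. The function names and the second rule are hypothetical.

```python
# Illustrative sketch only; the real engine may work differently.

def pattern_matches(pattern, text):
    # Assumption: "L:" marks a literal string to search for.
    literal = pattern[2:] if pattern.startswith("L:") else pattern
    return literal in text


def best_confidence(rules, text):
    """Return the highest confidence among the rules that match, or None."""
    matched = [
        rule["confidence"]
        for rule in rules
        # Assumption: one matching pattern is enough for the rule to match.
        if any(pattern_matches(p, text) for p in rule["+rule"])
    ]
    return max(matched) if matched else None


no_reply_rules = [
    {"confidence": 97, "+rule": ["L:no-reply@contract.fit"]},
    {"confidence": 80, "+rule": ["L:contract.fit"]},  # hypothetical extra rule
]

document_text = "From: no-reply@contract.fit\nSubject: Your contract"
print(best_confidence(no_reply_rules, document_text))
```

Both rules match this text, so the confidence of the stronger rule is the one that would be reported for no_reply.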

...

The rule we have specified so far searches the whole document, which is far too broad. We only want to search the email field that specifies the sender (the from field). For this, we can use the where_to_search option, specified on the same level as +rule.

...


To specify where and how to search there are 4 options. All of these are optional:

  1. preprocess_text: A boolean specifying whether the text needs to be cleaned.

  2. search_in: This field specifies which parts of the text we look in. There are 5 options, which can be combined. When left empty, we look in all of them:

    1. email_from

    2. email_to

    3. email_subject

    4. email_body

    5. attachment

  3. limits: This field specifies which limits we apply to our search_space (see Limits below).

  4. granularity: This field specifies what the granularity for a match should be (see Granularity below).
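
As a rough illustration of the options above, the following Python sketch checks a where_to_search dictionary and works out which parts would be searched. The function name and the error handling are hypothetical, and the formats of limits and granularity are not checked because they are not described here.

```python
# Illustrative sketch only; names and validation behaviour are assumptions.

ALLOWED_KEYS = {"preprocess_text", "search_in", "limits", "granularity"}
SEARCH_PARTS = ("email_from", "email_to", "email_subject", "email_body", "attachment")


def check_where_to_search(options):
    """Validate a where_to_search dict and return the parts to search."""
    unknown_keys = set(options) - ALLOWED_KEYS
    if unknown_keys:
        raise ValueError(f"unknown options: {sorted(unknown_keys)}")
    if "preprocess_text" in options and not isinstance(options["preprocess_text"], bool):
        raise ValueError("preprocess_text must be a bool")
    search_in = options.get("search_in") or []
    bad_parts = [part for part in search_in if part not in SEARCH_PARTS]
    if bad_parts:
        raise ValueError(f"unknown search_in values: {bad_parts}")
    # An empty or missing search_in means every part is searched.
    return list(search_in) if search_in else list(SEARCH_PARTS)


print(check_where_to_search({"search_in": ["email_from", "email_subject"]}))
print(check_where_to_search({"preprocess_text": True}))
```

The first call returns only the two requested parts. The second call leaves search_in out, so all 5 parts are returned.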

In the case of our example the rule would now look like this:

Code Block
languagejson
{
  "confidence": 97,
  "+rule": ["L:no-reply@contract.fit"],
  "where_to_search": {"search_in": ["email_from"]}
}
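
To show the effect of restricting the search to email_from, here is a minimal Python sketch. It assumes that "L:" marks a literal string and that an email is available as a dictionary keyed by the part names listed earlier. The helper names and the sample emails are hypothetical, and this is not the actual implementation.

```python
# Illustrative sketch only; the real engine may work differently.

def literal_of(pattern):
    # Assumption: "L:" marks a literal string to search for.
    return pattern[2:] if pattern.startswith("L:") else pattern


def rule_matches(rule, email_parts):
    """Check a rule against only the parts named in where_to_search.search_in."""
    # An empty or missing search_in falls back to all available parts.
    search_in = rule.get("where_to_search", {}).get("search_in") or list(email_parts)
    texts = [email_parts.get(part, "") for part in search_in]
    return any(literal_of(p) in text for p in rule["+rule"] for text in texts)


rule = {
    "confidence": 97,
    "+rule": ["L:no-reply@contract.fit"],
    "where_to_search": {"search_in": ["email_from"]},
}

# Hypothetical emails, split into parts.
sent_by_no_reply = {"email_from": "no-reply@contract.fit", "email_body": "Hello"}
only_mentions_it = {
    "email_from": "jane@example.com",
    "email_body": "Write to no-reply@contract.fit",
}

print(rule_matches(rule, sent_by_no_reply))
print(rule_matches(rule, only_mentions_it))
```

The second email mentions the address only in its body, so the restricted rule does not match it. Without where_to_search, the same rule would match both emails.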

Limits

You can apply different limits

Granularity

And-or-not logic

In some cases, matching normal regexes in specified places no longer accommodates your needs.

...