Ever wanted to add a field that does not exist in our out-of-the-box models, and thought: we don't need a smart model for this, we just need to look for certain words? For this use case we provide a predictor that you control through the predictor settings.

First, add a tag field to your format. We chose the name email_coming_from with the tag options no_reply and info.

Next, add your rule as a regex in the predictor settings. On the swagger page of your server you can find the endpoint /predictor_settings/{scope}. The scope is the inbox or project for which you want this predictor to run. Inside key_value_pairs::rule_config you can specify per tag_field which regexes you want to match for a given field_name. Note that the field_name and the tag options have to match exactly (case sensitive); otherwise the prediction will be empty. We created these rules to match no-reply@contract.fit and info@contract.fit respectively:

Code Block

language	json

{
  "key_value_pairs": {
    "rule_config": {
      "email_coming_from": {
        "no_reply": {
          "rules": [
            {
              "confidence": 97,
              "+rule": ["L:no-reply@contract.fit"]
            }
          ]
        },
        "info": {
          "rules": [
            {
              "confidence": 97,
              "+rule": ["L:info@contract.fit"]
            }
          ]
        }
      }
    }
  }
}
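
When many tag options follow the same pattern, the payload above can also be generated from code. The sketch below is only an illustration and not a client for the platform: `build_rule_config` is a hypothetical helper, and only the JSON layout is taken from the example. How the result is sent to /predictor_settings/{scope} (HTTP method, authentication) is not shown here and should be taken from the swagger page of your server.

```python
import json


def build_rule_config(field_name, patterns, confidence=97):
    """Build the key_value_pairs::rule_config payload for one tag field.

    `patterns` maps each tag option to the list of rule strings for it.
    Hypothetical helper: only the JSON layout comes from the example above.
    """
    options = {
        tag_option: {"rules": [{"confidence": confidence, "+rule": list(rules)}]}
        for tag_option, rules in patterns.items()
    }
    return {"key_value_pairs": {"rule_config": {field_name: options}}}


payload = build_rule_config(
    "email_coming_from",
    {
        "no_reply": ["L:no-reply@contract.fit"],
        "info": ["L:info@contract.fit"],
    },
)
print(json.dumps(payload, indent=2))
```

Generating the structure this way keeps the field name and tag options in one place, which helps with the exact (case-sensitive) matching requirement.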

Per tag option (the level of no_reply and info) you can specify a list of rules, each with its own confidence. The matching rule with the highest confidence is presented in the prediction. If no-reply@contract.fit appears anywhere in the uploaded document, a prediction with confidence 97 for the tag option no_reply is returned.
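
To make the selection concrete, here is a minimal simulation of "the matching rule with the highest confidence wins". It is not the predictor's implementation: `predict_tag` is a hypothetical function, and it assumes that the text after the `L:` prefix can be treated as a Python regular expression and that a rule matches when any entry of its `+rule` list matches.

```python
import re

RULE_CONFIG = {
    "no_reply": {"rules": [{"confidence": 97, "+rule": ["L:no-reply@contract.fit"]}]},
    "info": {"rules": [{"confidence": 97, "+rule": ["L:info@contract.fit"]}]},
}


def predict_tag(text, rule_config):
    """Return (tag_option, confidence) of the best matching rule, or None."""
    best = None
    for tag_option, option_config in rule_config.items():
        for rule in option_config["rules"]:
            # Assumption: drop the prefix ("L:") and treat the rest as a regex.
            patterns = [entry.split(":", 1)[1] for entry in rule["+rule"]]
            # Assumption: one matching entry is enough for the rule to match.
            if any(re.search(pattern, text) for pattern in patterns):
                if best is None or rule["confidence"] > best[1]:
                    best = (tag_option, rule["confidence"])
    return best


print(predict_tag("From: no-reply@contract.fit", RULE_CONFIG))
print(predict_tag("From: someone@example.com", RULE_CONFIG))
```

The first call finds the no_reply rule; the second matches nothing, which corresponds to an empty prediction.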

Specifying the search space

The rule we have specified so far searches the whole document, which is far too broad. We only want to search inside the email field that holds the sender (the from address). For this we can use the where_to_search option, specified on the same level as +rule.


Code Block

language	json

{
  "where_to_search": {
    "search_in": [
      "email_from",
      "email_to",
      "email_subject",
      "email_body",
      "attachment"
    ],
    "preprocess_text": bool,
    "limits": {
      "$ref": "#/definitions/Limits"
    },
    "granularity": {
      "$ref": "#/definitions/Granularity"
    }
  }
}
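
As a sketch of how this option combines with a rule, the snippet below attaches a `where_to_search` block that restricts the search to the sender field. `with_search_scope` is a hypothetical helper. The allowed `search_in` values are taken from the schema above; `preprocess_text`, `limits` and `granularity` are left out because their definitions are not shown here.

```python
ALLOWED_SEARCH_IN = {
    "email_from",
    "email_to",
    "email_subject",
    "email_body",
    "attachment",
}


def with_search_scope(rule, search_in):
    """Return a copy of `rule` that only searches the given parts."""
    unknown = set(search_in) - ALLOWED_SEARCH_IN
    if unknown:
        raise ValueError(f"unknown search_in values: {sorted(unknown)}")
    scoped = dict(rule)  # shallow copy, the original rule stays untouched
    scoped["where_to_search"] = {"search_in": list(search_in)}
    return scoped


rule = {"confidence": 97, "+rule": ["L:no-reply@contract.fit"]}
scoped_rule = with_search_scope(rule, ["email_from"])
print(scoped_rule)
```

Validating the values up front catches typos such as `email_header` before the settings reach the server.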

And-or-not logic

In some cases, matching plain regexes in specified places doesn't accommodate your needs anymore.

Lemmatisation

FAQ

How can I only look in the email subject for my regex?

By adding the option where_to_search::search_in to your rule. An example rule would look like this:

Code Block

language	json

"rules": [
  {
    "confidence": 97,
    "+rule": ["L:no-reply@contract.fit"],
    "where_to_search": {
      "search_in": ["email_subject"]
    }
  }
]

How can I test if my regex has the right format?

In the front-end you can open a page and select the text view (see attached image). This lets you copy the text you are searching in. You can then test the regex you created on https://regex101.com/. PS: Don't forget to set the flavor (language) to Python on the left-hand side of the screen.
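
If you prefer to check a pattern locally, Python's built-in `re` module gives a quick equivalent of the regex101 test. The sample text below is made up for illustration; paste the text you copied from the text view instead.

```python
import re

# Made-up sample; replace it with the text copied from the text view.
sample_text = "From: no-reply@contract.fit\nSubject: Your contract"

# The dot is escaped so that it only matches a literal ".".
pattern = re.compile(r"no-reply@contract\.fit")

match = pattern.search(sample_text)
print(match.group(0) if match else "no match")
```

An unescaped `.` matches any character, so escaping it makes the pattern stricter than the unescaped form used in the rules above.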

How can I add a default option for the case that rules did not match anything?

By adding a default option (other in the example below) to the tag field and creating a new rule for it in the predictor settings. Note that its confidence should be slightly lower than the lowest confidence of the other rules, since the . matches any character and this rule will therefore match anything. An example rule would look as follows:

Code Block

language	json

{
  "email_coming_from": {
    "other": {
      "rules": [
        {
          "confidence": 96,
          "+rule": [
            "L:."
          ]
        }
      ]
    }
  }
}
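
Because the default only works as a fallback while its confidence stays below every other rule, a small check can catch mistakes before the settings are uploaded. This is a hedged sketch: `default_is_lowest` is a hypothetical helper, and `other` is simply the option name used in the example above.

```python
def default_is_lowest(field_config, default_option="other"):
    """Check that every default rule scores below all non-default rules."""
    default_scores = [
        rule["confidence"] for rule in field_config[default_option]["rules"]
    ]
    other_scores = [
        rule["confidence"]
        for option, config in field_config.items()
        if option != default_option
        for rule in config["rules"]
    ]
    if not other_scores:
        return True  # no other rules to compete with
    return max(default_scores) < min(other_scores)


field_config = {
    "no_reply": {"rules": [{"confidence": 97, "+rule": ["L:no-reply@contract.fit"]}]},
    "info": {"rules": [{"confidence": 97, "+rule": ["L:info@contract.fit"]}]},
    "other": {"rules": [{"confidence": 96, "+rule": ["L:."]}]},
}
print(default_is_lowest(field_config))
```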

Version	Old Version 1	New Version 2
Changes made by	Sebastiaan Verplancke	Sebastiaan Verplancke
Saved on	Sept 26, 2022	Sept 26, 2022
