...

D stands for definition, which prefixes a variable.

Specifying the search space

...

  1. preprocess_text: A boolean specifying whether the text should be cleaned first, i.e. whether noise such as unexpected characters (e.g. multiple dashes) is removed.

  2. search_in (for email): This field specifies which parts of the email we look in. There are 5 options, which can be combined. When left empty, we look in all of them:

    1. email_from

    2. email_to

    3. email_subject

    4. email_body

    5. attachment

  3. limits: This field specifies which limits we apply to the search space. See the Limits section for more information.

  4. granularity: This field specifies what the granularity for a match should be. See the Granularity section for more information.

In our example, we only want to look in email_from, so the rule looks like this:

Code Block (JSON)
{
  "confidence": 97,
  "+rule": ["L:no-reply@contract.fit"],
  "where_to_search": {"search_in": ["email_from"]}
}
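
To make the effect of search_in concrete, here is a minimal Python sketch of how the search space could be selected. The helper select_search_space and the dictionary layout of the email are assumptions made for illustration; they are not part of the product. Only the five part names and the "empty means all" behaviour come from the description above.

```python
# Hypothetical helper for illustration only; not the product's implementation.
EMAIL_PARTS = ["email_from", "email_to", "email_subject", "email_body", "attachment"]


def select_search_space(email, where_to_search=None):
    """Return the email parts a rule would search, keyed by part name."""
    # An empty or missing search_in falls back to all five parts.
    search_in = (where_to_search or {}).get("search_in") or EMAIL_PARTS
    return {part: email.get(part, "") for part in search_in}


# Assumed email layout: one string per part.
email = {
    "email_from": "no-reply@contract.fit",
    "email_to": "someone@example.com",
    "email_subject": "Your invoice",
    "email_body": "Please find the invoice attached.",
    "attachment": "",
}

only_sender = select_search_space(email, {"search_in": ["email_from"]})
everything = select_search_space(email, {})
```

With the rule above, only the sender address ends up in the search space. With an empty where_to_search, all five parts do.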

Limits

You can apply different limits to the search space of your query.

...

Code Block (JSON)
{
  "confidence": 97,
  "+rule": ["L:no-reply@contract.fit"],
  "where_to_search": {
    "search_in": ["email_from"],
    "limits": {
      "characters": [[0,10], [-20]]
    }
  }
}
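
As a rough illustration of the characters limit, the sketch below treats each entry as Python-style slice bounds: [0,10] keeps the first 10 characters and [-20] keeps the last 20. This reading is an assumption and may differ from the actual behaviour. The helper apply_character_limits is hypothetical.

```python
# Hypothetical helper; the slice interpretation of the bounds is an assumption.
def apply_character_limits(text, limits):
    """Return one text window per entry in limits."""
    windows = []
    for bounds in limits:
        if len(bounds) == 2:
            start, end = bounds          # e.g. [0, 10] -> text[0:10]
        else:
            start, end = bounds[0], None  # e.g. [-20] -> text[-20:]
        windows.append(text[start:end])
    return windows


sender = "no-reply@contract.fit <notifications>"
windows = apply_character_limits(sender, [[0, 10], [-20]])
```

Under this assumption the rule would only be evaluated against the two resulting windows, not against the full text.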

Granularity

The granularity allows you to specify in which blocks of text we want to search.

...

  • +rule: a list of strings, each starting with L: or D:. When the rule is evaluated, they are joined into a single regex.

  • -rule: the negated version of +rule
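
The way the entries of a rule are joined into one regex can be sketched as follows. This is a minimal sketch under two assumptions: text after L: is used verbatim as a regex fragment, and text after D: is the name of a variable whose value is substituted. The function build_regex and the variable sender_domain are hypothetical names.

```python
import re


# Hypothetical helper; the handling of the L: and D: prefixes is an assumption.
def build_regex(rule, variables=None):
    """Join the entries of a rule into one compiled regex."""
    variables = variables or {}
    fragments = []
    for entry in rule:
        prefix, _, value = entry.partition(":")  # split on the first colon only
        if prefix == "L":
            fragments.append(value)              # assumed: used as-is
        elif prefix == "D":
            fragments.append(variables[value])   # assumed: variable lookup by name
        else:
            raise ValueError(f"unknown prefix in rule entry: {entry!r}")
    return re.compile("".join(fragments))


variables = {"sender_domain": r"contract\.fit"}  # hypothetical variable
pattern = build_regex(["L:no-reply@", "D:sender_domain"], variables)
matched = pattern.search("From: no-reply@contract.fit") is not None
```

A -rule could then be evaluated with the same helper by inverting the result of the match.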

...

Note

IMPORTANT: Regexes are considerably more efficient than the and/or operators. Use regexes wherever possible.

Info

Note that when operators are nested, the where_to_search is passed down to the lower levels. If a lower level defines its own where_to_search, that one is used instead.

This way you can:

  • Specify a granularity that applies to different and/or rules

  • Limit the search space for different and/or rules without having to define the where_to_search multiple times
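
The inheritance described above can be sketched in a few lines of Python. The nesting keys "and" and "or", the path labels, and the function resolve_where_to_search are assumptions made for illustration. The sketch also assumes that a lower-level where_to_search replaces the inherited one as a whole, with no merging.

```python
# Hypothetical helper; key names "and"/"or" and the replace-not-merge
# behaviour are assumptions.
def resolve_where_to_search(node, inherited=None, path="root"):
    """Map each rule in a nested structure to its effective where_to_search."""
    # A where_to_search on this level wins over the inherited one.
    effective = node.get("where_to_search", inherited)
    resolved = {}
    if "+rule" in node or "-rule" in node:
        resolved[path] = effective
    for operator in ("and", "or"):
        for index, child in enumerate(node.get(operator, [])):
            child_path = f"{path}.{operator}[{index}]"
            resolved.update(resolve_where_to_search(child, effective, child_path))
    return resolved


rule = {
    "where_to_search": {"search_in": ["email_body"]},
    "and": [
        {"+rule": ["L:invoice"]},  # inherits email_body
        {
            "+rule": ["L:no-reply@contract.fit"],
            "where_to_search": {"search_in": ["email_from"]},  # overrides
        },
    ],
}
resolved = resolve_where_to_search(rule)
```

Here the first rule searches the email body, which it inherits from the top level, while the second rule uses its own where_to_search.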

Variables

When writing rules, you may find that you have added the same regex to many rules. Variables avoid this duplication: specify them in a dictionary at the highest level and reference them in your rules with the prefix D:. Re-using the telephone number, our full dictionary would now look like this:

...