...

The search space is the area of the document that the rules apply to, i.e. where they look for a matching result. In our first example above, the rules applied to the whole document. This is too broad for this use case (“is the sender no-reply@contract.fit or info@contract.fit?”). Here we only want to search inside the email field that specifies the sender.

For this we can use the where_to_search option.

It contains four fields, all of which are optional:

  1. preprocess_text: A boolean specifying whether the text should be cleaned first (i.e. noise such as unexpected characters, e.g. multiple dashes, is removed).

  2. search_in (for email): Specifies which parts of the email we look in. There are five options, which can be combined. When left empty, we look in all of them:

    1. email_from

    2. email_to

    3. email_subject

    4. email_body

    5. attachment

  3. limits: This field specifies which limits we apply to our search_space. See below.

  4. Granularity
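As a rough illustration of how these fields fit together, the sketch below builds a where_to_search value in Python and rejects search_in names that are not in the list above. The helper build_where_to_search is hypothetical and not part of the product; only the field names, the confidence value and the rule string come from this page.

```python
import json

# The five search_in values listed above. Everything else in this block is
# an illustrative assumption, not an official API.
SEARCH_IN_OPTIONS = ("email_from", "email_to", "email_subject", "email_body", "attachment")


def build_where_to_search(search_in=None, preprocess_text=None):
    """Return a where_to_search dict, leaving out any field that was not set."""
    where = {}
    if preprocess_text is not None:
        where["preprocess_text"] = bool(preprocess_text)
    if search_in:
        unknown = [name for name in search_in if name not in SEARCH_IN_OPTIONS]
        if unknown:
            raise ValueError(f"unknown search_in value(s): {unknown}")
        where["search_in"] = list(search_in)
    return where


# Combine two search_in options and switch text cleaning on.
rule = {
    "confidence": 97,
    "+rule": ["L:no-reply@contract.fit"],
    "where_to_search": build_where_to_search(
        ["email_from", "email_subject"], preprocess_text=True
    ),
}
print(json.dumps(rule, indent=2))
```

Leaving search_in out of the call yields an empty where_to_search, which matches the documented default of looking in all five parts.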

Search_in

In the example given above (“is the sender no-reply@contract.fit or info@contract.fit?”), our rules applied to the entire document. This is too broad for this use case. Instead, we want to limit the search to the email field that specifies the sender, the email_from field. We can do this by adding the following:

Code Block
languagejson
{
  "confidence": 97,
  "+rule": ["L:no-reply@contract.fit"],
  "where_to_search": {"search_in": ["email_from"]}
}
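To see why narrowing the search space matters, here is a minimal Python sketch of the idea. It assumes the rule behaves like a plain substring match, which this page does not confirm. The email dictionary and the functions select_search_space and literal_match are hypothetical stand-ins, not the product's implementation.

```python
# The five parts of an email named on this page.
ALL_FIELDS = ("email_from", "email_to", "email_subject", "email_body", "attachment")


def select_search_space(email, search_in=None):
    """Keep only the requested parts; an empty search_in means all parts."""
    fields = search_in or ALL_FIELDS
    return {name: email.get(name, "") for name in fields}


def literal_match(email, literal, search_in=None):
    """Assumed rule semantics: true if the literal occurs in any selected part."""
    space = select_search_space(email, search_in)
    return any(literal in text for text in space.values())


# Made-up email: the address appears in the body, but it is not the sender.
email = {
    "email_from": "info@contract.fit",
    "email_to": "support@example.com",
    "email_subject": "Re: your request",
    "email_body": "Forwarded from no-reply@contract.fit",
    "attachment": "",
}

whole_document = literal_match(email, "no-reply@contract.fit")
sender_only = literal_match(email, "no-reply@contract.fit", ["email_from"])
print(whole_document, sender_only)
```

Searching the whole document matches on the body text even though the sender is a different address, whereas restricting the search to email_from does not.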

...