...
preprocess_text: This is a boolean specifying if the text needs to be cleaned (= removal of noise such as unexpected characters eg. multiple dashes).
search_in (for email): This field specifies which part of the email we look in. There are 5 options which can be combined. When left empty we will look in all these options:
email_from
email_to
email_subject
email_body
attachment
limits: This field specifies which limits we apply to our search_space . See below. (more info here).
Granularity
Search_in
In the example given above (“is the sender no-reply@contract.fit
or info@contract.fit
?”) our rules applied to the entire document. This is too broad for this use-case. Instead we would like to limit our search to the email field which specifies the sender, the email_from field. We can do this by adding the following:
...
Code Block |
---|
[start, stop] # items from start through stop-1 [-start:-stop] # items from start (counting from end) through stop-1 (counting from end) [start] # items from start through end (only allowed for the last slice) [-start] # items frpmfrom start (counting from end) through end (only allowed for the last slice) |
...