...
For options 2, 3, 4 and 5 we use Python slice notation.
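As a reminder of the slice semantics referred to here, a minimal Python sketch. The token list is made up for illustration and is not tied to any particular option.

```python
# Illustrative only: this token list is invented for the sketch.
tokens = ["Dear", "customer", "please", "do", "not", "reply"]

first_two = tokens[:2]      # start omitted: begin at index 0
last_two = tokens[-2:]      # negative start: count from the end
middle = tokens[1:4]        # the stop index is exclusive
every_other = tokens[::2]   # the third value is the step
```

A slice never raises an error when it runs past the end of the list; it simply returns the items that exist.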
...
Lemmatisation is the process of reducing all inflected forms of a word to a single base form (the lemma). More information can be found here: https://en.wikipedia.org/wiki/Lemmatisation. This is a very powerful way to improve the matching capabilities of regexes. After "+lemma" you can add a list of lemmatised words that can be in the file. If one of them matches, your rule will match. Because of the improved matching capabilities, it is advised to reduce the search space by only looking for strings that occur within the same granularity. Note that the search runs on the lemmatised version of the text, not on the original text.
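To illustrate the idea, here is a minimal Python sketch of lemma matching. It is not the product's implementation: the lemma table `TOY_LEMMAS` and the helpers `lemmatise` and `lemma_matches` are hypothetical and written by hand for this example, whereas a real lemmatiser covers the whole vocabulary.

```python
import re

# Hypothetical, hand-written lemma table for this sketch only.
TOY_LEMMAS = {
    "answering": "answer",
    "answered": "answer",
    "replies": "reply",
    "replied": "reply",
    "n't": "not",
}

def lemmatise(text):
    """Lower-case the text, split it into tokens and map each token to its lemma."""
    # Split contractions such as "don't" into "do" + "n't" before tokenising.
    text = re.sub(r"n't\b", " n't", text.lower())
    tokens = re.findall(r"[a-z']+", text)
    return [TOY_LEMMAS.get(token, token) for token in tokens]

def lemma_matches(text, lemmas):
    """Return True if any of the listed lemmas occurs in the lemmatised text."""
    return not set(lemmas).isdisjoint(lemmatise(text))
```

With this sketch, `lemma_matches("I do not want you answering this email", ["reply", "answer"])` is true because "answering" is reduced to "answer" before the comparison.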
Note |
---|
IMPORTANT: Regex is quite a bit more efficient than the and/or operators. Try to use regexes as much as possible. |
Note that when operators are nested, the where_to_search is passed down to the lower levels. If a lower level defines its own where_to_search, that one is used instead.
This way you can:
Specify a granularity that applies to several and/or rules
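The passing-down behaviour can be sketched in Python as follows. This is an illustration under assumptions, not the actual engine: the function `resolve_where_to_search` is hypothetical, the value "email_subject" is invented for the example, and the sketch assumes that a lower-level where_to_search replaces the inherited one as a whole.

```python
def resolve_where_to_search(node, inherited=None):
    """Yield (leaf, effective where_to_search) pairs for a nested rule."""
    # A where_to_search defined on this level takes precedence over the inherited one.
    current = node.get("where_to_search", inherited)
    children = node.get("+and") or node.get("+or")
    if children is None:
        # No nested operator: this node is a leaf such as {"+lemma": [...]}.
        yield node, current
        return
    for child in children:
        yield from resolve_where_to_search(child, current)

rule = {
    "+and": [
        {"+lemma": ["not"]},
        # Hypothetical override on a lower level.
        {"+lemma": ["reply", "answer"],
         "where_to_search": {"search_in": ["email_subject"]}},
    ],
    "where_to_search": {"search_in": ["email_body"], "granularity": "sentence"},
}

resolved = list(resolve_where_to_search(rule))
```

Here the first leaf inherits the sentence granularity from the top level, while the second leaf uses its own setting.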
Let’s say we want to find out whether our email is coming from noreply by looking at what is written in the email body. For our rule we specify a granularity of sentence, meaning that all matches must be in the same sentence. Our rule would look like this:
Code Block |
---|
{
"no_reply": {
"rules": [
{
"+and": [
{"+lemma": ["not"]},
{"+lemma": ["reply", "answer"]}
],
"confidence": 80,
"where_to_search": {"search_in": ["email_body"], "granularity": "sentence"}
}
]
}
} |
If someone now writes “I don’t expect you to reply.” or “I do not want you answering this email” in the email body, the rule will match.
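A rough Python sketch of what the sentence granularity means for this rule. It assumes the text has already been split into sentences and lemmatised; the function name `and_matches_in_one_sentence` is hypothetical.

```python
def and_matches_in_one_sentence(lemmatised_sentences, lemma_groups):
    """Return True if a single sentence contains a lemma from every group."""
    return any(
        all(not set(group).isdisjoint(sentence) for group in lemma_groups)
        for sentence in lemmatised_sentences
    )

# The two "+lemma" lists from the rule above.
groups = [["not"], ["reply", "answer"]]

# Both groups occur in the same sentence.
same_sentence = [["i", "do", "not", "expect", "you", "to", "reply"]]
# The groups occur in the text, but in different sentences.
split_over_two = [["i", "do", "not", "know"], ["please", "reply", "soon"]]
```

The first input satisfies the rule. The second does not, because "not" and "reply" appear in different sentences.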
Tag example
Code Block |
---|
Extraction example
Code Block |
---|
Document type example
Code Block |
---|
{
"type": "document_type",
"rules": [
{
"gen_id": "Bat&BallHotel",
"confidence": 100,
"rule_type": ["first"],
"+and": [
{"+rule": ["L:(?i)The Bat & Ball Hotel"]},
{"+rule": ["L:(?i)Order"]}
]
}
]
} |
FAQ
By adding the option where_to_search::search_in to your rule. An example field would look like this:
...