...
Lemmatisation is the process of reducing each word to its base form (its lemma), so that inflected variants such as "replies" and "replied" are all treated as "reply". More information can be found here: https://en.wikipedia.org/wiki/Lemmatisation. This is a very powerful way to improve the matching capabilities of regexes. After "+lemma"
you can add a list of lemmatised words that may occur in the file. If one of them matches, your rule will match. Because lemmatisation widens what can match, it is advised to reduce the search space by requiring all matches to fall within the same granularity (for example, the same sentence). Note that the search runs against the lemmatised version of the text, not the original text.
Let’s say we want to find out whether an email comes from a noreply sender by looking at what is written in the email body. For our rule we specify a granularity of sentence, meaning that all matches must be in the same sentence. Our rule would look like this:
Code Block |
---|
{
  "no_reply": {
    "rules": [
      {
        "+and": [
          {"+lemma": ["not"]},
          {"+lemma": ["reply", "answer"]}
        ],
        "confidence": 80,
        "where_to_search": {"search_in": ["email_body"], "granularity": "sentence"}
      }
    ]
  }
} |
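To make the effect of lemmatisation and sentence granularity concrete, here is a minimal Python sketch of how a rule like the one above could be evaluated. It is not the engine's actual implementation: the `LEMMAS` lookup table, the sentence splitter, and the function names are all hypothetical stand-ins, and a real system would use a proper lemmatiser.

```python
import re

# Hypothetical toy lemma table (assumption); a real system would use a full lemmatiser.
LEMMAS = {
    "replies": "reply", "replied": "reply", "replying": "reply",
    "answers": "answer", "answered": "answer", "answering": "answer",
    "n't": "not",
}

def lemmatise(sentence):
    """Lower-case and tokenise a sentence, returning the set of its lemmas."""
    # Split contractions such as "don't" into "do" + "n't" so the negation is a token.
    text = re.sub(r"n't\b", " n't", sentence.lower())
    tokens = re.findall(r"n't|[a-z]+", text)
    return {LEMMAS.get(tok, tok) for tok in tokens}

def split_sentences(body):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", body.strip()) if s]

def matches_no_reply(body):
    """Mimic the rule: lemma 'not' AND one of 'reply'/'answer' in the SAME sentence."""
    for sentence in split_sentences(body):
        lemmas = lemmatise(sentence)
        if "not" in lemmas and lemmas & {"reply", "answer"}:
            return True
    return False

# Both conditions occur in one sentence, so the rule matches.
print(matches_no_reply("Please do not reply to this email."))
# The two lemmas appear in different sentences, so the rule does not match.
print(matches_no_reply("We replied yesterday. This is not a problem."))
```

The second call shows why the granularity matters: both lemmas occur in the body, but never together in a single sentence.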
If someone now writes in the email body
Note |
---|
IMPORTANT: Regexes are considerably more efficient than the and/or operators. Use regexes wherever possible. |
Info |
---|
Note that when operators are nested, the where_to_search is passed down to the lower levels. If a lower level defines its own where_to_search, that one is used instead. This way you can:
|
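The pass-down behaviour described above can be sketched in Python as follows. This only illustrates the inheritance logic and is not the engine's real code: the function name `resolve_where_to_search`, the placement of `where_to_search` next to a `+lemma` entry, and the `"paragraph"` granularity value are assumptions made for the example.

```python
def resolve_where_to_search(node, inherited=None):
    """Return (operator, words, effective where_to_search) for every leaf of a rule."""
    # A where_to_search defined on this level overrides the one passed down.
    effective = node.get("where_to_search", inherited)
    results = []
    for key, value in node.items():
        if not key.startswith("+"):
            continue  # skip non-operator keys such as "confidence"
        if value and all(isinstance(v, dict) for v in value):
            # Nested operator (e.g. "+and"): pass the effective setting down.
            for child in value:
                results.extend(resolve_where_to_search(child, effective))
        else:
            # Leaf operator (e.g. "+lemma" with a list of words).
            results.append((key, value, effective))
    return results

rule = {
    "+and": [
        {"+lemma": ["not"]},  # inherits the rule-level where_to_search
        {
            "+lemma": ["reply", "answer"],
            # Hypothetical lower-level override; "paragraph" is an assumed value.
            "where_to_search": {"search_in": ["email_body"], "granularity": "paragraph"},
        },
    ],
    "confidence": 80,
    "where_to_search": {"search_in": ["email_body"], "granularity": "sentence"},
}

for operator, words, where in resolve_where_to_search(rule):
    print(operator, words, where["granularity"])
```

In this sketch the first `+lemma` falls back to the rule-level setting, while the second uses its own.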
...