...
Per tag field (at the level of no_reply and info) you can specify a list of rules, each with its own confidence. Of all rules that match, the one with the highest confidence is presented in the prediction. For example, if no-reply@contract.fit appears somewhere in the uploaded document, a prediction with confidence 97 for the tag_option no_reply is returned.
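The selection behaviour described above can be sketched in Python. This is only an illustration, not the product's implementation: the rule structure, the `pattern` and `confidence` keys, the second rule, and the `best_match` helper are all assumptions made for the example.

```python
import re

# Hypothetical rules for the tag_option no_reply (structure is illustrative only).
rules = [
    {"pattern": r"no-reply@contract\.fit", "confidence": 97},
    {"pattern": r"no-?reply", "confidence": 60},
]

def best_match(text, rules):
    """Return the highest confidence among the rules that match, or None."""
    matched = [r["confidence"] for r in rules if re.search(r["pattern"], text)]
    return max(matched) if matched else None

document = "Please do not answer. Sent from no-reply@contract.fit"
print(best_match(document, rules))  # 97
```

Both hypothetical rules match this document, but only the confidence of the strongest one (97) is reported.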
“+rule” incl. L & D clarification
The simplest version of a rule is specified with +rule. The value for this key is a list of strings, each prefixed with L: or D:. The list is concatenated into one regex.
L stands for literal and prefixes a normal regex. Note that the regex must be double-escaped, so the regex for a digit becomes \\d instead of \d.
D stands for definition and prefixes a variable.
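As a rough sketch of how such a list could be turned into one regex, the Python snippet below resolves each L: and D: entry and joins the results. The `definitions` key, the `digits` variable, the example rule, and the `build_regex` helper are hypothetical names chosen for illustration. The snippet also assumes the configuration is stored as JSON, which would explain the double escaping: `\\d` in the file becomes `\d` once the JSON string is parsed.

```python
import json
import re

# Hypothetical configuration as it might be written in a JSON file.
# Inside the file, backslashes are doubled: \\d and \\s.
raw = r'''
{
  "definitions": {"digits": "\\d+"},
  "+rule": ["L:invoice\\s+", "D:digits"]
}
'''

def build_regex(parts, definitions):
    """Concatenate L: (literal regex) and D: (variable lookup) parts into one regex."""
    pieces = []
    for part in parts:
        prefix, _, body = part.partition(":")
        if prefix == "L":
            pieces.append(body)               # literal regex, used as written
        elif prefix == "D":
            pieces.append(definitions[body])  # variable, replaced by its regex
        else:
            raise ValueError(f"unknown prefix in {part!r}")
    return "".join(pieces)

config = json.loads(raw)
pattern = build_regex(config["+rule"], config["definitions"])
print(pattern)                                      # invoice\s+\d+
print(bool(re.search(pattern, "invoice   12345")))  # True
```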
Specifying the search space
...
+lemma: list of strings that must be contained in the lemma (add link here)
-lemma: negated version of +lemma
TODO: add example
Note |
---|
IMPORTANT: Regex is quite a bit more efficient than the and/or operators. Try to use regexes as much as possible. |
Info |
---|
Note that when operators are combined, where_to_search is passed down to the lower levels. If a lower level specifies its own where_to_search, that one is used instead. This way you can:
|
Variables
When writing rules you may find that you have added the same regex to many rules. Variables avoid this duplication: specify them at the highest level in a dictionary and reference them with the prefix D: in your rules. Re-using the telephone number, our full dictionary would now look like this:
...