X-Git-Url: https://git.openstreetmap.org./nominatim.git/blobdiff_plain/affe1300d9941c87b21c2bcadfdf0803247d5531..02d357d29e40a8dfe5bc8eb4eac35c3ad3cc0958:/docs/admin/Tokenizers.md

diff --git a/docs/admin/Tokenizers.md b/docs/admin/Tokenizers.md
index 782d50b8..6f8898c8 100644
--- a/docs/admin/Tokenizers.md
+++ b/docs/admin/Tokenizers.md
@@ -52,6 +52,12 @@ The ICU tokenizer uses the [ICU library](http://site.icu-project.org/) to
 normalize names and queries. It also offers configurable decomposition and
 abbreviation handling.
 
+To enable the tokenizer add the following line to your project configuration:
+
+```
+NOMINATIM_TOKENIZER=icu
+```
+
 ### How it works
 
 On import the tokenizer processes names in the following four stages:
@@ -171,7 +177,7 @@ It is also possible to restrict replacements to the beginning and end of a
 name:
 
 ``` yaml
-- ^south => n # matches only at the beginning of the name
+- ^south => s # matches only at the beginning of the name
 - road$ => rd # matches only at the end of the name
 ```
 
@@ -192,8 +198,8 @@ a shortcut notation for it:
 The simple arrow causes an additional variant to be added. Note that
 decomposition has an effect here on the source as well. So a rule
 
-```yaml
-- ~strasse => str
+``` yaml
+- "~strasse -> str"
 ```
 
 means that for a word like `hauptstrasse` four variants are created: