X-Git-Url: https://git.openstreetmap.org./nominatim.git/blobdiff_plain/c6fdcf9b0d0fca42d0d5f69f0fc469259e17ca24..189f74a40de0f0f7ff1346a66d9beffe026d9e23:/docs/develop/Tokenizers.md

diff --git a/docs/develop/Tokenizers.md b/docs/develop/Tokenizers.md
index 5282db1a..273e65e2 100644
--- a/docs/develop/Tokenizers.md
+++ b/docs/develop/Tokenizers.md
@@ -6,7 +6,7 @@ tokenizers that use different strategies for normalisation.
 This page describes how tokenizers are expected to work and the public API
 that needs to be implemented when creating a new tokenizer. For information
 on how to configure a specific tokenizer for a database see the
-[tokenizer chapter in the administration guide](../admin/Tokenizers.md).
+[tokenizer chapter in the Customization Guide](../customize/Tokenizers.md).
 
 ## Generic Architecture
 
@@ -93,7 +93,7 @@ for a custom tokenizer implementation.
 
 Nominatim expects two files for a tokenizer:
 
-* `nominiatim/tokenizer/_tokenizer.py` containing the Python part of the
+* `nominatim/tokenizer/_tokenizer.py` containing the Python part of the
   implementation
 * `lib-php/tokenizer/_tokenizer.php` with the PHP part of the
   implementation
@@ -105,7 +105,7 @@ functions. By convention, these should be placed in `lib-sql/tokenizer`.
 If the tokenizer has a default configuration file, this should be saved in
 the `settings/_tokenizer.`.
 
-### Configuration and Persistance
+### Configuration and Persistence
 
 Tokenizers may define custom settings for their configuration. All settings
 must be prefixed with `NOMINATIM_TOKENIZER_`. Settings may be transient or
@@ -245,11 +245,11 @@ Currently, tokenizers are encouraged to make sure that matching works against
 both the search token list and the match token list.
 
 ```sql
-FUNCTION token_normalized_postcode(postcode TEXT) RETURNS TEXT
+FUNCTION token_get_postcode(info JSONB) RETURNS TEXT
 ```
 
-Return the normalized version of the given postcode. This function must return
-the same value as the Python function `AbstractAnalyzer->normalize_postcode()`.
+Return the postcode for the object, if any exists. The postcode must be in
+the form that should also be presented to the end-user.
 
 ```sql
 FUNCTION token_strip_info(info JSONB) RETURNS JSONB