X-Git-Url: https://git.openstreetmap.org./nominatim.git/blobdiff_plain/0de83c4a5188878742ad081c3acdc0e5d98a7c00..c7d80a2cc8cacb7dba95f023c2f480d25f7bf6b1:/docs/develop/Tokenizers.md

diff --git a/docs/develop/Tokenizers.md b/docs/develop/Tokenizers.md
index 2b4da005..03988ce0 100644
--- a/docs/develop/Tokenizers.md
+++ b/docs/develop/Tokenizers.md
@@ -105,7 +105,7 @@ functions. By convention, these should be placed in `lib-sql/tokenizer`.
 If the tokenizer has a default configuration file, this should be saved in
 the `settings/_tokenizer.`.
 
-### Configuration and Persistance
+### Configuration and Persistence
 
 Tokenizers may define custom settings for their configuration. All settings
 must be prefixed with `NOMINATIM_TOKENIZER_`. Settings may be transient or
@@ -130,18 +130,18 @@ class as defined below.
 
 ### Python Tokenizer Class
 
-All tokenizers must inherit from `nominatim.tokenizer.base.AbstractTokenizer`
+All tokenizers must inherit from `nominatim_db.tokenizer.base.AbstractTokenizer`
 and implement the abstract functions defined there.
 
-::: nominatim.tokenizer.base.AbstractTokenizer
-    rendering:
-        heading_level: 4
+::: nominatim_db.tokenizer.base.AbstractTokenizer
+    options:
+        heading_level: 6
 
 ### Python Analyzer Class
 
-::: nominatim.tokenizer.base.AbstractAnalyzer
-    rendering:
-        heading_level: 4
+::: nominatim_db.tokenizer.base.AbstractAnalyzer
+    options:
+        heading_level: 6
 
 ### PL/pgSQL Functions
 
@@ -189,6 +189,28 @@ a house number token text. If a place has multiple house numbers they must
 be listed with a semicolon as delimiter. Must be NULL when the place has no
 house numbers.
 
+```sql
+FUNCTION token_is_street_address(info JSONB) RETURNS BOOLEAN
+```
+
+Return true if this is an object that should be parented against a street.
+Only relevant for objects with address rank 30.
+
+```sql
+FUNCTION token_has_addr_street(info JSONB) RETURNS BOOLEAN
+```
+
+Return true if there are street names to match against for finding the
+parent of the object.
+
+
+```sql
+FUNCTION token_has_addr_place(info JSONB) RETURNS BOOLEAN
+```
+
+Return true if there are place names to match against for finding the
+parent of the object.
+
 ```sql
 FUNCTION token_matches_street(info JSONB, street_tokens INTEGER[]) RETURNS BOOLEAN
 ```
@@ -245,11 +267,11 @@ Currently, tokenizers are encouraged to make sure that matching works against
 both the search token list and the match token list.
 
 ```sql
-FUNCTION token_normalized_postcode(postcode TEXT) RETURNS TEXT
+FUNCTION token_get_postcode(info JSONB) RETURNS TEXT
 ```
 
-Return the normalized version of the given postcode. This function must return
-the same value as the Python function `AbstractAnalyzer->normalize_postcode()`.
+Return the postcode for the object, if any exists. The postcode must be in
+the form that should also be presented to the end-user.
 
 ```sql
 FUNCTION token_strip_info(info JSONB) RETURNS JSONB