## Configuring the Import
-Which OSM objects are added to the database and which of the tags are used
-can be configured via the import style configuration file. This
-is a JSON file which contains a list of rules which are matched against every
-tag of every object and then assign the tag its specific role.
-
-The style to use is given by the `NOMINATIM_IMPORT_STYLE` configuration
-option. There are a number of default styles, which are explained in detail
-in the [Import section](../admin/Import.md#filtering-imported-data). These
-standard styles may be referenced by their name.
-
-You can also create your own custom syle. Put the style file into your
+In the very first step of a Nominatim import, OSM data is loaded into the
+database. Nominatim uses [osm2pgsql](https://osm2pgsql.org) for this task.
+It comes with a [flex style](https://osm2pgsql.org/doc/manual.html#the-flex-output)
+specifically tailored to filter and convert OSM data into Nominatim's
+internal data representation.
+
+There are a number of default configurations for the flex style which
+result in geocoding databases of different detail. The
+[Import section](../admin/Import.md#filtering-imported-data) explains
+these default configurations in detail.
+
+You can also create your own custom style. Put the style file into your
project directory and then set `NOMINATIM_IMPORT_STYLE` to the name of the file.
It is always recommended to start with one of the standard styles and customize
-those. You find the standard styles under the name `import-<stylename>.style`
+those. You find the standard styles under the name `import-<stylename>.lua`
in the standard Nominatim configuration path (usually `/etc/nominatim` or
`/usr/local/etc/nominatim`).
-The remainder of the page describes the format of the file.
-
-### Configuration Rules
+The remainder of the page describes how the flex style works and how to
+customize it.
-A single rule looks like this:
+### The `flex-base.lua` module
-```json
-{
- "keys" : ["key1", "key2", ...],
- "values" : {
- "value1" : "prop",
- "value2" : "prop1,prop2"
- }
-}
-```
+The core of Nominatim's flex import configuration is the `flex-base` module.
+It defines the table layout used by Nominatim and provides standard
+implementations for the import callbacks that make it easy to customize
+how OSM tags are used by Nominatim.
-A rule first defines a list of keys to apply the rule to. This is always a list
-of strings. The string may have four forms. An empty string matches against
-any key. A string that ends in an asterisk `*` is a prefix match and accordingly
-matches against any key that starts with the given string (minus the `*`). A
-suffix match can be defined similarly with a string that starts with a `*`. Any
-other string constitutes an exact match.
+Every custom style should include this module to make sure that the correct
+tables are created. Thus start your custom style as follows:
-The second part of the rules defines a list of values and the properties that
-apply to a successful match. Value strings may be either empty, which
-means that they match any value, or describe an exact match. Prefix
-or suffix matching of values is not possible.
+``` lua
+local flex = require('flex-base')
-For a rule to match, it has to find a valid combination of keys and values. The
-resulting property is that of the matched values.
+```
-The rules in a configuration file are processed sequentially and the first
-match for each tag wins.
+The following sections explain how the module can be customized.
+
+
+### Changing the recognized tags
+
+If you just want to change which OSM tags are recognized during import,
+then there are a number of convenience functions to set the tag lists used
+during the processing.
+
+!!! warning
+ There are no built-in defaults for the tag lists, so all the functions
+ need to be called from your style script to fully process the data.
+ Make sure you start from one of the default style and only modify
+ the data you are interested in. You can also derive your style from an
+ existing style by importing the appropriate module, e.g.
+ `local flex = require('import-street')`.
+
+Many of the following functions take _key match lists_. These lists can
+contain three kinds of strings to match against tag keys:
+A string that ends in an asterisk `*` is a prefix match and accordingly matches
+against any key that starts with the given string (minus the `*`).
+A suffix match can be defined similarly with a string that starts with a `*`.
+Any other string is matched exactly against tag keys.
+
+
+#### `set_main_tags()` - principal tags
+
+If a principal or main tag is found on an OSM object, then the object
+is included in Nominatim's search index. A single object may also have
+multiple main tags. In that case, the object will be included multiple
+times in the index, once for each main tag.
+
+The flex script distinguishes between four types of main tags:
+
+* __always__: a main tag that is used unconditionally
+* __named__: consider this main tag only, if the object has a proper name
+ (a reference is not enough, see below).
+* __named_with_key__: consider this main tag only, when the object has
+ a proper name with a domain prefix. For example, if the main tag is
+ `bridge=yes`, then it will only be added as an extra row, if there is
+ a tag `bridge:name[:XXX]` for the same object. If this property is set,
+ all other names that are not domain-specific are ignored.
+* __fallback__: use this main tag only, if there is no other main tag.
+ Fallback always implied `named`, i.e. fallbacks are only tried for
+ named objects.
+
+The `set_main_tags()` function takes exactly one table parameter which
+defines the keys and key/value combinations to include and the kind of
+main tag. Each lua table key defines an OSM tag key. The value may
+be a string defining the kind of main key as described above. Then the tag will
+be considered a main tag for any possible value. To further restrict
+which values are acceptable, give a table with the permitted values
+and their kind of main tag. If the table contains a simple value without
+key, then this is used as default for values that are not listed.
+
+!!! example
+ ``` lua
+ local flex = require('import-full')
+
+ flex.set_main_tags{
+ boundary = {administrative = 'named'},
+ highway = {'always', street_lamp = 'named'},
+ landuse = 'fallback'
+ }
+ ```
+
+ In this example an object with a `boundary` tag will only be included
+ when it has a value of `administrative`. Objects with `highway` tags are
+ always included. However when the value is `street_lamp` then the object
+ must have a name, too. With any other value, the object is included
+ independently of the name. Finally, if a `landuse` tag is present then
+ it will be used independely of the concrete value if neither boundary
+ nor highway tags were found and the object is named.
+
+
+#### `set_prefilters()` - ignoring tags
+
+Pre-filtering of tags allows to ignore them for any further processing.
+Thus pre-filtering takes precedence over any other tag processing. This is
+useful when some specific key/value combinations need to be excluded from
+processing. When tags are filtered, they may either be deleted completely
+or moved to `extratags`. Extra tags are saved with the object and returned
+to the user when requested, but are not used otherwise.
+
+`set_prefilters()` takes a table with four optional fields:
+
+* __delete_keys__ is a _key match list_ for tags that should be deleted
+* __delete_tags__ contains a table of tag keys pointing to a list of tag
+ values. Tags with matching key/value pairs are deleted.
+* __extra_keys__ is a _key match list_ for tags which should be saved into
+ extratags
+* __extra_tags__ contains a table of tag keys pointing to a list of tag
+ values. Tags with matching key/value pairs are moved to extratags.
+
+Key list may contain three kinds of strings:
+A string that ends in an asterisk `*` is a prefix match and accordingly matches
+against any key that starts with the given string (minus the `*`).
+A suffix match can be defined similarly with a string that starts with a `*`.
+Any other string is matched exactly against tag keys.
+
+!!! example
+ ``` lua
+ local flex = require('import-full')
+
+ flex.set_prefilters{
+ delete_keys = {'source', 'source:*'},
+ extra_tags = {amenity = {'yes', 'no'}}
+ }
+ flex.set_main_tags{
+ amenity = 'always'
+ }
+ ```
-A rule where key and value are the empty string is special. This defines the
-fallback when none of the rules match. The fallback is always used as a last
-resort when nothing else matches, no matter where the rule appears in the file.
-Defining multiple fallback rules is not allowed. What happens in this case,
-is undefined.
+ In this example any tags `source` and tags that begin with `source:` are
+ deleted before any other processing is done. Getting rid of frequent tags
+ this way can speed up the import.
-### Tag Properties
+ Tags with `amenity=yes` or `amenity=no` are moved to extratags. Later
+ all tags with an `amenity` key are made a main tag. This effectively means
+ that Nominatim will use all amenity tags except for those with value
+ yes and no.
-One or more of the following properties may be given for each tag:
+#### `set_name_tags()` - defining names
-* `main`
+The flex script distinguishes between two kinds of names:
- A principal tag. A new row will be added for the object with key and value
- as `class` and `type`.
+* __main__: the primary names make an object fully searchable.
+ Main tags of type _named_ will only cause the object to be included when
+ such a primary name is present. Primary names are usually those found
+ in the `name` tag and its variants.
+* __extra__: extra names are still added to the search index but they are
+ alone not sufficient to make an object named.
-* `with_name`
+`set_name_tags()` takes a table with two optional fields `main` and `extra`.
+They take _key match lists_ for main and extra names respectively.
- When the tag is a principal tag (`main` property set): only really add a new
- row, if there is any name tag found (a reference tag is not sufficient, see
- below).
+!!! example
+ ``` lua
+ local flex = require('flex-base')
-* `with_name_key`
+ flex.set_main_tags{highway = {traffic_light = 'named'}}
+ flex.set_name_tags{main = {'name', 'name:*'},
+ extra = {'ref'}
+ }
+ ```
- When the tag is a principal tag (`main` property set): only really add a new
- row, if there is also a name tag that matches the key of the principal tag.
- For example, if the main tag is `bridge=yes`, then it will only be added as
- an extra row, if there is a tag `bridge:name[:XXX]` for the same object.
- If this property is set, all other names that are not domain-specific are
- ignored.
+ This example creates a search index over traffic lights but will
+ only include those that have a common name and not those which just
+ have some reference ID from the city.
-* `fallback`
+#### `set_address_tags()` - defining address parts
- When the tag is a principal tag (`main` property set): only really add a new
- row, when no other principal tags for this object have been found. Only one
- fallback tag can win for an object.
+Address tags will be used to build up the address of an object.
-* `operator`
+`set_address_tags()` takes a table with arbitrary fields pointing to
+_key match lists_. To fields have a special meaning:
- When the tag is a principal tag (`main` property set): also include the
- `operator` tag in the list of names. This is a special construct for an
- out-dated tagging practise in OSM. Fuel stations and chain restaurants
- in particular used to have the name of the chain tagged as `operator`.
- These days the chain can be more commonly found in the `brand` tag but
- there is still enough old data around to warrant this special case.
+* __main__: defines
+the tags that make a full address object out of the OSM object. This
+is usually the housenumber or variants thereof. If a main address tag
+appears, then the object will always be included, if necessary with a
+fallback of `place=house`. If the key has a prefix of `addr:` or `is_in:`
+this will be stripped.
-* `name`
+* __extra__: defines all supplementary tags for addresses, tags like `addr:street`, `addr:city` etc. If the key has a prefix of `addr:` or `is_in:` this will be stripped.
- Add tag to the list of names.
+All other fields will be handled as summary fields. If a key matches the
+key match list, then its value will be added to the address tags with the
+name of the field as key. If multiple tags match, then an arbitrary one
+wins.
-* `ref`
+Country tags are handled slightly special. Only tags with a two-letter code
+are accepted, all other values are discarded.
- Add tag to the list of names as a reference. At the moment this only means
- that the object is not considered to be named for `with_name`.
+!!! example
+ ``` lua
+ local flex = require('import-full')
-* `address`
+ flex.set_address_tags{
+ main = {'addr:housenumber'},
+ extra = {'addr:*'},
+ postcode = {'postal_code', 'postcode', 'addr:postcode'},
+ country = {'country-code', 'ISO3166-1'}
+ }
+ ```
- Add tag to the list of address tags. If the tag starts with `addr:` or
- `is_in:`, then this prefix is cut off before adding it to the list.
+ In this example all tags which begin with `addr:` will be saved in
+ the address tag list. If one of the tags is `addr:housenumber`, the
+ object will fall back to be entered as a `place=house` in the database
+ unless there is another interested main tag to be found.
-* `postcode`
+ Tags with keys `country-code` and `ISO3166-1` are saved with their
+ value under `country` in the address tag list. The same thing happens
+ to postcodes, they will always be saved under the key `postcode` thus
+ normalizing the multitude of keys that are used in the OSM database.
- Add the value as a postcode to the address tags. If multiple tags are
- candidate for postcodes, one wins out and the others are dropped.
-* `country`
+#### `set_unused_handling()` - processing remaining tags
- Add the value as a country code to the address tags. The value must be a
- two letter country code, otherwise it is ignored. If there are multiple
- tags that match, then one wins out and the others are dropped.
+This function defines what to do with tags that remain after all tags
+have been classified using the functions above. There are two ways in
+which the function can be used:
-* `house`
+`set_unused_handling(delete_keys = ..., delete_tags = ...)` deletes all
+keys that match the descriptions in the parameters and moves all remaining
+tags into the extratags list.
+`set_unused_handling(extra_keys = ..., extra_tags = ...)` moves all tags
+matching the parameters into the extratags list and then deletes the remaining
+tags. For the format of the parameters see the description in `set_prefilters()`
+above.
- If no principle tags can be found for the object, still add the object with
- `class`=`place` and `type`=`house`. Use this for address nodes that have no
- other function.
+!!! example
+ ``` lua
+ local flex = require('import-full')
-* `interpolation`
+ flex.set_address_tags{
+ main = {'addr:housenumber'},
+ extra = {'addr:*', 'tiger:county'}
+ }
+ flex.set_unused_handling{delete_keys = {'tiger:*'}}
+ ```
- Add this object as an address interpolation (appears as `class`=`place` and
- `type`=`houses` in the database).
+ In this example all remaining tags except those beginning with `tiger:`
+ are moved to the extratags list. Note that it is not possible to
+ already delete the tiger tags with `set_prefilters()` because that
+ would remove tiger:county before the address tags are processed.
-* `extra`
+### Customizing osm2pgsql callbacks
- Add tag to the list of extra tags.
+osm2pgsql expects the flex style to implement three callbacks, one process
+function per OSM type. If you want to implement special handling for
+certain OSM types, you can override the default implementations provided
+by the flex-base module.
-* `skip`
+#### Changing the relation types to be handled
- Skip the tag completely. Useful when a custom default fallback is defined
- or to define exceptions to rules.
+The default scripts only allows relations of type `multipolygon`, `boundary`
+and `waterway`. To add other types relations, set `RELATION_TYPES` for
+the type to the kind of geometry that should be created. The following
+kinds of geometries can be used:
-A rule can define as many of these properties for one match as it likes. For
-example, if the property is `"main,extra"` then the tag will open a new row
-but also have the tag appear in the list of extra tags.
+* __relation_as_multipolygon__ creates a (Multi)Polygon from the ways in
+ the relation. If the ways do not form a valid area, then the object is
+ silently discarded.
+* __relation_as_multiline__ creates a (Multi)LineString from the ways in
+ the relations. Ways are combined as much as possible without any regards
+ to their order in the relation.
+
+!!! Example
+ ``` lua
+ local flex = require('import-full')
+
+ flex.RELATION_TYPES['site'] = flex.relation_as_multipolygon
+ ```
+
+ With this line relations of `type=site` will be included in the index
+ according to main tags found. This only works when the site relation
+ resolves to a valid area. Nodes in the site relation are not part of the
+ geometry.
+
+
+#### Adding additional logic to processing functions
+
+The default processing functions are also exported by the flex-base module
+as `process_node`, `process_way` and `process_relation`. These can be used
+to implement your own processing functions with some additional processing
+logic.
+
+!!! Example
+ ``` lua
+ local flex = require('import-full')
+
+ function osm2pgsql.process_relation(object)
+ if object.tags.boundary ~= 'administrative' or object.tags.admin_level ~= '2' then
+ flex.process_relation(object)
+ end
+ end
+ ```
+
+ This example discards all country-level boundaries and uses standard
+ handling for everything else. This can be useful if you want to use
+ your own custom country boundaries.
+
+
+### Customizing the main processing function
+
+The main processing function of the flex style can be found in the function
+`process_tags`. This function is called for all OSM object kinds and is
+responsible for filtering the tags and writing out the rows into Postgresql.
+
+!!! Example
+ ``` lua
+ local flex = require('import-full')
+
+ local original_process_tags = flex.process_tags
+
+ function flex.process_tags(o)
+ if o.object.tags.highway ~= nil and o.object.tags.access == 'no' then
+ return
+ end
+
+ original_process_tags(o)
+ end
+ ```
+
+ This example shows the most simple customization of the process_tags function.
+ It simply adds some additional processing before running the original code.
+ To do that, first save the original function and then overwrite process_tags
+ from the module. In this example all highways which are not accessible
+ by anyone will be ignored.
+
+
+#### The `Place` class
+
+The `process_tags` function receives a Lua object of `Place` type which comes
+with some handy functions to collect the data necessary for geocoding and
+writing it into the place table. Always use this object to fill the table.
+
+The Place class has some attributes which you may access read-only:
+
+* __object__ is the original OSM object data handed in by osm2pgsql
+* __admin_level__ is the content of the admin_level tag, parsed into an
+ integer and normalized to a value between 0 and 15
+* __has_name__ is a boolean indicating if the object has a full name
+* __names__ is a table with the collected list of name tags
+* __address__ is a table with the collected list of address tags
+* __extratags__ is a table with the collected list of additional tags to save
+
+There are a number of functions to fill these fields. All functions expect
+a table parameter with fields as indicated in the description.
+Many of these functions expect match functions which are described in detail
+further below.
+
+* __delete{match=...}__ removes all tags that match the match function given
+ in _match_.
+* __grab_extratags{match=...}__ moves all tags that match the match function
+ given in _match_ into extratags. Returns the number of tags moved.
+* __clean{delete=..., extra=...}__ deletes all tags that match _delete_ and
+ moves the ones that match _extra_ into extratags
+* __grab_address_parts{groups=...}__ moves matching tags into the address table.
+ _groups_ must be a group match function. Tags of the group `main` and
+ `extra` are added to the address table as is but with `addr:` and `is_in:`
+ prefixes removed from the tag key. All other groups are added with the
+ group name as key and the value from the tag. Multiple values of the same
+ group overwrite each other. The function returns the number of tags saved
+ from the main group.
+* __grab_main_parts{groups=...}__ moves matching tags into the name table.
+ _groups_ must be a group match function. If a tags of the group `main` is
+ present, the object will be marked as having a name. Tags of group `house`
+ produce a fallback to `place=house`. This fallback is return by the function
+ if present.
+
+There are two functions to write a row into the place table. Both functions
+expect the main tag (key and value) for the row and then use the collected
+information from the name, address, extratags etc. fields to complete the row.
+They also have a boolean parameter `save_extra_mains` which defines how any
+unprocessed tags are handled: when True, the tags will be saved as extratags,
+when False, they will be simply discarded.
+
+* __write_row(key, value, save_extra_mains)__ creates a new table row from
+ the current state of the Place object.
+* __write_place(key, value, mtype, save_extra_mains)__ creates a new row
+ conditionally. When value is nil, the function will attempt to look up the
+ value in the object tags. If value is still nil or mtype is nil, the row
+ is ignored. An mtype of `always` will then always write out the row,
+ a mtype of `named` only, when the object has a full name. When mtype
+ is `named_with_key`, the function checks for a domain name, i.e. a name
+ tag prefixed with the name of the main key. Only if at least one is found,
+ the row will be written. The names are replaced with the domain names found.
+
+#### Match functions
+
+The Place functions usually expect either a _match function_ or a
+_group match function_ to find the tags to apply their function to.
+
+The __match function__ is a Lua function which takes two parameters,
+key and value, and returns a boolean to indicate that a tag matches. The
+flex-base module has a convenience function `tag_match()` to create such a
+function. It takes a table with two optional fields: `keys` takes a key match
+list (see above), `tags` takes a table with keys that point to a list of
+possible values, thus defining key/value matches.
+
+The __group match function__ is a Lua function which also takes two parameters,
+key and value, and returns a string indicating to which group or type they
+belong to. The `tag_group()` can be used to create such a function. It expects
+a table where the group names are the keys and the values are a key match list.
+
+
+
+### Using the gazetteer output of osm2pgsql
+
+Nominatim still allows you to configure the gazetteer output to remain
+backwards compatible with older imports. It will be automatically used
+when the style file name ends in `.style`. For documentation of the
+old import style, please refer to the documentation of older releases
+of Nominatim. Do not use the gazetteer output for new imports. There is no
+guarantee that new versions of Nominatim are fully compatible with the
+gazetteer output.
### Changing the Style of Existing Databases