From: marc tobias
Date: Wed, 28 Nov 2018 17:57:17 +0000 (+0100)
Subject: document what country_osm_grid does
X-Git-Tag: v3.3.0~54^2
X-Git-Url: https://git.openstreetmap.org./nominatim.git/commitdiff_plain/8e19336f49e508ff9587be04ec8b5b10fa77daad?ds=sidebyside

document what country_osm_grid does
---

diff --git a/data-sources/country-grid/README.md b/data-sources/country-grid/README.md
new file mode 100644
index 00000000..c94929af
--- /dev/null
+++ b/data-sources/country-grid/README.md
@@ -0,0 +1,77 @@
+# Fallback Country Boundaries
+
+Each place is assigned a `country_code` and partition. Partitions derive from `country_code`.
+
+Nominatim imports two pre-generated files
+
+ * `data/country_name.sql` (country code, name, default language, partition)
+ * `data/country_osm_grid.sql` (country code, geometry)
+
+before creating places in the database. This helps with fast lookups and missing data (e.g. if the data the user wants to import doesn't contain any country places).
+
+The number of countries in the world can change (South Sudan created 2011, German reunification), and so can their boundaries. This document explains how the pre-generated files can be updated.
+
+
+
+## Country code
+
+Each place is assigned a two-letter `country_code` based on its location, e.g. `gb` for Great Britain, or `NULL` if no suitable country is found (usually the place is in open water then).
+
+In `sql/functions.sql: get_country_code(geometry)` the place's center is checked against
+
+ 1. country places already imported from the user's data file. Places are imported by rank low-to-high. The lowest rank, 2, is countries, so most places should be matched. Still, the data file might be incomplete.
+ 2. if unmatched: OSM grid boundaries
+ 3. if still unmatched: OSM grid boundaries, but allow a small distance
+
+
+
+## Partitions
+
+Each place is assigned a partition, which is a number 0..250. 0 is fallback/other.
+
+During place indexing (`sql/functions.sql: placex_insert()`) a place is assigned the partition based on its country code (`sql/functions.sql: get_partition(country_code)`). The function looks the partition up in the `country_name` table.
+
+Most countries have their own partition, some share a partition. Thus partition counts vary greatly.
+
+Several database tables are split by partition to allow queries to run against fewer indices and improve caching.
+
+ * `location_area_large_`
+ * `search_name_`
+ * `location_road_`
+
+
+
+
+
+## Data files
+
+### `data/country_name.sql`
+
+Export from existing database table plus manual changes. `country_default_language_code` is mostly taken from [https://wiki.openstreetmap.org/wiki/Nominatim/Country_Codes](https://wiki.openstreetmap.org/wiki/Nominatim/Country_Codes), see `utils/country_languages.php`.
+
+
+
+### `data/country_osm_grid.sql`
+
+`country_grid.sql` merges territories by country. It then uses `functions.sql: quad_split_geometry` to split each country into multiple [Quadtree](https://en.wikipedia.org/wiki/Quadtree) polygons for faster point-in-polygon lookups.
+
+To visualize one country as a GeoJSON feature collection, e.g. for loading into [geojson.io](http://geojson.io/):
+
+```
+-- http://www.postgresonline.com/journal/archives/267-Creating-GeoJSON-Feature-Collections-with-JSON-and-PostGIS-functions.html
+
+SELECT row_to_json(fc)
+FROM (
+  SELECT 'FeatureCollection' As type, array_to_json(array_agg(f)) As features
+  FROM (
+    SELECT 'Feature' As type,
+    ST_AsGeoJSON(lg.geometry)::json As geometry,
+    row_to_json((country_code, area)) As properties
+    FROM country_osm_grid As lg where country_code='mx'
+  ) As f
+) As fc;
+```
+
+`cat /tmp/query.sql | psql -At nominatim > /tmp/mexico.quad.geojson`
+
+![mexico](mexico.quad.png)
diff --git a/sql/country_grid.sql b/data-sources/country-grid/country_grid.sql
similarity index 100%
rename from sql/country_grid.sql
rename to data-sources/country-grid/country_grid.sql
diff --git a/data-sources/country-grid/mexico.quad.png b/data-sources/country-grid/mexico.quad.png
new file mode 100644
index 00000000..61c12802
Binary files /dev/null and b/data-sources/country-grid/mexico.quad.png differ
diff --git a/docs/CMakeLists.txt b/docs/CMakeLists.txt
index 68af5429..87bb3cd5 100644
--- a/docs/CMakeLists.txt
+++ b/docs/CMakeLists.txt
@@ -14,6 +14,8 @@ ADD_CUSTOM_TARGET(doc
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/index.md ${CMAKE_CURRENT_BINARY_DIR}/index.md
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/extra.css ${CMAKE_CURRENT_BINARY_DIR}/extra.css
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/us-tiger/README.md ${CMAKE_CURRENT_BINARY_DIR}/data-sources/US-Tiger.md
+   COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/country-grid/README.md ${CMAKE_CURRENT_BINARY_DIR}/data-sources/Country-Grid.md
+   COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/country-grid/mexico.grid.png ${CMAKE_CURRENT_BINARY_DIR}/data-sources/mexico.grid.png
    COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-7.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-7.md
    COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-16.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-16.md
    COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-18.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-18.md
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 1a690e7b..271fd207 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -23,6 +23,7 @@ pages:
     - 'External Data Sources':
         - 'Overview' : 'data-sources/overview.md'
         - 'US Census (Tiger)': 'data-sources/US-Tiger.md'
+        - 'Country Grid': 'data-sources/Country-Grid.md'
     - 'Appendix':
         - 'Installation on CentOS 7' : 'appendix/Install-on-Centos-7.md'
         - 'Installation on Ubuntu 16' : 'appendix/Install-on-Ubuntu-16.md'
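
Reading aid for the README added in this commit: the three-step lookup it describes for `get_country_code(geometry)` can be sketched roughly as follows. This is an illustration under stated assumptions, not the SQL in `sql/functions.sql`. The `Box` type, the `lookup_country_code` function, the sample rectangles and the `0.5` tolerance are all hypothetical, and axis-aligned rectangles stand in for real country and grid polygons. Picking the nearest grid cell in step 3 is also an assumption; the README only says that a small distance is allowed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a polygon: an axis-aligned rectangle."""
    country_code: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def distance_to(self, x: float, y: float) -> float:
        """Euclidean distance from the point to the box (0.0 if inside)."""
        dx = max(self.min_x - x, 0.0, x - self.max_x)
        dy = max(self.min_y - y, 0.0, y - self.max_y)
        return (dx * dx + dy * dy) ** 0.5


def lookup_country_code(
    point: Tuple[float, float],
    imported: List[Box],
    grid: List[Box],
    tolerance: float = 0.5,  # assumed value, not taken from the source
) -> Optional[str]:
    """Return a country code for the point, or None if nothing matches."""
    x, y = point

    # Step 1: country places already imported from the user's data file.
    for box in imported:
        if box.distance_to(x, y) == 0.0:
            return box.country_code

    # Step 2: fall back to the pre-generated grid boundaries.
    for box in grid:
        if box.distance_to(x, y) == 0.0:
            return box.country_code

    # Step 3: grid boundaries again, but allow a small distance.
    # Choosing the nearest candidate is an assumption of this sketch.
    near = [(box.distance_to(x, y), box.country_code) for box in grid]
    near = [item for item in near if item[0] <= tolerance]
    if near:
        return min(near)[1]

    # Nothing suitable found (e.g. the point is in open water).
    return None


# Made-up sample data: the import only contains "de", the grid also has "fr".
IMPORTED = [Box("de", 0.0, 0.0, 10.0, 10.0)]
GRID = [Box("de", 0.0, 0.0, 10.0, 10.0), Box("fr", -10.0, 0.0, 0.0, 10.0)]
```

With this sample data, a point inside the imported `de` box is resolved in step 1, a point inside the `fr` grid box in step 2, a point just outside the `fr` box in step 3, and a point far from every box yields `None`.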