From 48be8c33ba3da6934faccd87702337caf67f69d1 Mon Sep 17 00:00:00 2001
From: Sarah Hoffmann
Date: Wed, 27 Oct 2021 20:59:45 +0200
Subject: [PATCH] docs: add new maintenance section

currently used for postcode updates, word count updates and deleted relations.
---
 docs/admin/Maintenance.md | 51 +++++++++++++++++++++++++++++++++++++++
 docs/develop/Postcodes.md | 45 ----------------------------------
 docs/mkdocs.yml           |  2 +-
 3 files changed, 52 insertions(+), 46 deletions(-)
 create mode 100644 docs/admin/Maintenance.md
 delete mode 100644 docs/develop/Postcodes.md

diff --git a/docs/admin/Maintenance.md b/docs/admin/Maintenance.md
new file mode 100644
index 00000000..782b377c
--- /dev/null
+++ b/docs/admin/Maintenance.md
@@ -0,0 +1,51 @@
+This chapter describes the various operations the Nominatim database administrator
+may use to clean and maintain the database. None of these operations is mandatory
+but they may help improve the performance and accuracy of results.
+
+
+## Updating postcodes
+
+Command: `nominatim refresh --postcodes`
+
+Postcode centroids (aka 'calculated postcodes') are generated by looking at all
+postcodes of a country, grouping them and calculating the geometric centroid.
+There is currently no logic to deal with extreme outliers (typos or other
+mistakes in OSM data). There is also no check if a postcodes adheres to a
+country's format, e.g. if Swiss postcodes are 4 digits.
+
+When running regular updates, postcodes results can be improved by running
+this command on a regular basis. Note that only the postcode table and the
+postcode search terms are updated. The postcode that is assigned to each place
+is only updated when the place is updated.
+
+The command takes around 70min to run on the planet and needs ca. 40GB of
+temporary disk space.
+
+
+## Updating word counts
+
+Command: `nominatim refresh --word-counts`
+
+Nominatim keeps frequency statistics about all search terms it indexes. These
+statistics are currently used to optimise queries to the database. Thus better
+statistics mean better performance. Word counts are created once after import
+and are usually sufficient even when running regular updates. You might want
+to rerun the statistics computation when adding larger amounts of new data,
+for example, when adding an additional country via `nominatim add-data`.
+
+
+## Removing large deleted objects
+
+Nominatim refuses to delete very large areas because often these deletions are
+accidental and are reverted within hours. Instead the deletions are logged in
+the `import_polygon_delete` table and left to the administrator to clean up.
+
+There is currently no command to do that. You can use the following SQL
+query to force a deletion on all objects that have been deleted more than
+a certain timespan ago (here: 1 month):
+
+```sql
+SELECT place_force_delete(p.place_id) FROM import_polygon_delete d, placex p
+WHERE p.osm_type = d.osm_type and p.osm_id = d.osm_id
+      and age(p.indexed_date) > '1 month'::interval
+```
diff --git a/docs/develop/Postcodes.md b/docs/develop/Postcodes.md
deleted file mode 100644
index 343b8de3..00000000
--- a/docs/develop/Postcodes.md
+++ /dev/null
@@ -1,45 +0,0 @@
-# Postcodes in Nominatim
-
-The blog post
-[Nominatim and Postcodes](https://www.openstreetmap.org/user/lonvia/diary/43143)
-describes the handling implemented since Nominatim 3.1.
-
-Postcode centroids (aka 'calculated postcodes') are generated by looking at all
-postcodes of a country, grouping them and calculating the geometric centroid.
-There is currently no logic to deal with extreme outliers (typos or other
-mistakes in OSM data). There is also no check if a postcodes adheres to a
-country's format, e.g. if Swiss postcodes are 4 digits.
-
-
-## Regular updating calculated postcodes
-
-The script to rerun the calculation is
-`nominatim refresh --postcodes`
-and runs once per night on nominatim.openstreetmap.org.
-
-
-## Finding places that share a specific postcode
-
-In the Nominatim database run
-
-```sql
-SELECT address->'postcode' as pc,
-       osm_type, osm_id, class, type,
-       st_x(centroid) as lon, st_y(centroid) as lat
-FROM placex
-WHERE country_code='fr'
-  AND upper(trim (both ' ' from address->'postcode')) = '33210';
-```
-
-Alternatively on [Overpass](https://overpass-turbo.eu/) run the following query
-
-```
-[out:json][timeout:250];
-area["name"="France"]->.boundaryarea;
-(
-nwr(area.boundaryarea)["addr:postcode"="33210"];
-);
-out body;
->;
-out skel qt;
-```
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 4ac9e460..0f4ed655 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -21,6 +21,7 @@ pages:
         - 'Deploy' : 'admin/Deployment.md'
         - 'Nominatim UI' : 'admin/Setup-Nominatim-UI.md'
         - 'Advanced Installations' : 'admin/Advanced-Installations.md'
+        - 'Maintenance' : 'admin/Maintenance.md'
         - 'Migration from older Versions' : 'admin/Migration.md'
         - 'Troubleshooting' : 'admin/Faq.md'
     - 'Customization Guide':
@@ -37,7 +38,6 @@ pages:
         - 'Architecture Overview' : 'develop/overview.md'
         - 'OSM Data Import' : 'develop/Import.md'
         - 'Tokenizers' : 'develop/Tokenizers.md'
-        - 'Postcodes' : 'develop/Postcodes.md'
         - 'Testing' : 'develop/Testing.md'
         - 'External Data Sources': 'develop/data-sources.md'
     - 'Appendix':
-- 
2.39.5