## Downloading additional data
-### Wikipedia rankings
+### Wikipedia/Wikidata rankings
Wikipedia can be used as an optional auxiliary data source to help indicate
-the importance of osm features. Nominatim will work without this information
+the importance of OSM features. Nominatim will work without this information
but installing it will improve the quality of the results.
This data is available as a binary download:
cd $NOMINATIM_SOURCE_DIR/data
- wget https://www.nominatim.org/data/wikipedia_article.sql.bin
- wget https://www.nominatim.org/data/wikipedia_redirect.sql.bin
+ wget https://www.nominatim.org/data/wikimedia-importance.sql.gz
-Combined the 2 files are around 1.5GB and add around 30GB to the install
-size of nominatim. They also increase the install time by an hour or so.
+The file is about 400MB and adds around 4GB to the Nominatim database.
-*NOTE:* you'll need to download the Wikipedia rankings before performing
-the initial import of the data if you want the rankings applied to the
-loaded data.
+*NOTE:* if you forgot to download the Wikipedia rankings, you can also add
+them after the import by running `./utils/setup.php --import-wikipedia-articles`
+and then `./utils/update.php --recompute-importance`.
-### UK postcodes
+### Great Britain, USA postcodes
-Nominatim can use postcodes from an external source to improve searches that involve a UK postcode. This data can be optionally downloaded:
+Nominatim can use postcodes from an external source to improve searches that
+involve a GB or US postcode. This data can be optionally downloaded:
cd $NOMINATIM_SOURCE_DIR/data
wget https://www.nominatim.org/data/gb_postcode_data.sql.gz
+ wget https://www.nominatim.org/data/us_postcode_data.sql.gz
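The optional downloads above can be collected in one small script. This is only a sketch: the function names `list_optional_urls` and `fetch_optional_data` are made up for illustration, and it assumes that `NOMINATIM_SOURCE_DIR` is set and that `wget` is installed. Files already present in the data directory are skipped, so a partially downloaded file has to be removed by hand before rerunning.

```sh
#!/bin/sh
# Sketch only: helper names are illustrative, not part of Nominatim.
BASE_URL="https://www.nominatim.org/data"
OPTIONAL_FILES="wikimedia-importance.sql.gz gb_postcode_data.sql.gz us_postcode_data.sql.gz"

# Print one download URL per optional data file.
list_optional_urls() {
    for f in $OPTIONAL_FILES; do
        echo "$BASE_URL/$f"
    done
}

# Download each file into the data directory unless it already exists.
# The body runs in a subshell so the caller's working directory is unchanged.
fetch_optional_data() (
    cd "$NOMINATIM_SOURCE_DIR/data" || exit 1
    for url in $(list_optional_urls); do
        if [ ! -f "${url##*/}" ]; then
            wget "$url" || exit 1
        fi
    done
)
```

Source the script and call `fetch_optional_data` before starting the import; `list_optional_urls` on its own only prints the URLs and does not touch the network.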
## Choosing the Data to Import
Please be aware that some extracts are not cut exactly along the country
boundaries. As a result some parts of the boundary may be missing which means
-that cannot compute the areas for some administrative areas.
+that Nominatim cannot compute the areas for some administrative areas.
### Dropping Data Required for Dynamic Updates
If you only want to use the Nominatim database for reverse lookups or
if you plan to use the installation only for exports to a
-[photon](http://photon.komoot.de/) database, then you can set up a database
+[photon](https://photon.komoot.de/) database, then you can set up a database
without search indexes. Add `--reverse-only` to your setup command above.
This saves about 5% of disk space.
The style can be changed with the configuration `CONST_Import_Style`.
-To give you an idea of the impact of using the different style, the table
+To give you an idea of the impact of using the different styles, the table
below gives rough estimates of the final database size after import of a
2018 planet and after using the `--drop` option. It also shows the time
needed for the import on a machine with 32GB RAM, 4 CPUs and SSDs. Note that
full | 80h | 575 GB | 300 GB
You can also customize the styles further. For a description of the
-style format see [the developement section](../develop/Import.md).
+style format see [the development section](../develop/Import.md).
## Initial import of the data
**Important:** first try the import with a small extract, for example from
[Geofabrik](https://download.geofabrik.de).
-Download the data to import and load the data with the following command:
+Download the data to import and load the data with the following command
+from the build directory:
```sh
./utils/setup.php --osm-file <data file> --all [--osm2pgsql-cache 28000] 2>&1 | tee setup.log
2/3 of RAM available. If your machine starts swapping reduce the size.
Computing word frequency for search terms can improve the performance of
-forward geocoding in particular under high load as it helps Postgres' query
+forward geocoding in particular under high load as it helps PostgreSQL's query
planner to make the right decisions. To recompute word counts run:
```sh
TIGER data to your own Nominatim instance by following these steps. The
entire US adds about 10GB to your database.
- 1. Get preprocessed TIGER 2018 data and unpack it into the
+ 1. Get preprocessed TIGER 2019 data and unpack it into the
data directory in your Nominatim sources:
cd Nominatim/data
- wget https://nominatim.org/data/tiger2018-nominatim-preprocessed.tar.gz
- tar xf tiger2018-nominatim-preprocessed.tar.gz
+ wget https://nominatim.org/data/tiger2019-nominatim-preprocessed.tar.gz
+ tar xf tiger2019-nominatim-preprocessed.tar.gz
`data-source/us-tiger/README.md` explains how the data got preprocessed.
- 2. Import the data into your Nominatim database:
+ 2. Import the data into your Nominatim database:
./utils/setup.php --import-tiger-data
## Updates
-There are many different possibilities to update your Nominatim database.
+There are many different ways to update your Nominatim database.
The following section describes how to keep it up-to-date with Pyosmium.
For a list of other methods see the output of `./utils/update.php --help`.
#### Installing the newest version of Pyosmium
-It is recommended to install Pyosmium via pip. Run (as the same user who
-will later run the updates):
+It is recommended to install Pyosmium via pip. Make sure to use python3.
+Run (as the same user who will later run the updates):
```sh
-pip install --user osmium
+pip3 install --user osmium
```
-Nominatim needs a tool called `pyosmium-get-updates`, which comes with
+Nominatim needs a tool called `pyosmium-get-updates` which comes with
Pyosmium. You need to tell Nominatim where to find it. Add the
following line to your `settings/local.php`:
If you want a different update source you will need to add some settings
to `settings/local.php`. For example, to use the daily country extracts
-diffs for Ireland from geofabrik add the following:
+diffs for Ireland from Geofabrik, add the following:
// base URL of the replication service
@define('CONST_Replication_Url', 'https://download.geofabrik.de/europe/ireland-and-northern-ireland-updates');
It outputs the date from which updates will start. Check that this date is
what you expect.
-The --init-updates command needs to be rerun whenever the replication service
+The `--init-updates` command needs to be rerun whenever the replication service
is changed.
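A minimal sketch of rerunning the initialisation after a change of replication source follows. It assumes the command is invoked as `./utils/update.php --init-updates` from the build directory; the helpers `run` and `reinit_updates` and the `DRY_RUN` variable are hypothetical and only serve to preview the command before executing it.

```sh
#!/bin/sh
# Sketch only: run, reinit_updates and DRY_RUN are illustrative names.

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Re-initialise updates; call this after editing CONST_Replication_Url.
reinit_updates() {
    run ./utils/update.php --init-updates
}
```

Setting `DRY_RUN=1` before calling `reinit_updates` prints the command instead of running it, which makes it easy to confirm the working directory first.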
#### Updating Nominatim