are in process of consolidating the style. The following rules apply:
* Python code uses the official Python style
- * indention
+ * indentation
* SQL use 2 spaces
* all other file types use 4 spaces
* [BSD style](https://en.wikipedia.org/wiki/Indent_style#Allman_style) for braces
## Development
Vagrant maps the virtual machine's port 8089 to your host machine. Thus you can
-see Nominatim in action on [locahost:8089](http://localhost:8089/nominatim/).
+see Nominatim in action on [localhost:8089](http://localhost:8089/nominatim/).
You edit code on your host machine in any editor you like. There is no need to
restart any software: just refresh your browser window.
!!! note
The external module is only needed when using the legacy tokenizer.
- If you have choosen the ICU tokenizer, then you can ignore this section
+ If you have chosen the ICU tokenizer, then you can ignore this section
and follow the standard import documentation.
### Option 1: Compiling the library on the database server
### Installing the required packages
-Nginx has no built-in PHP interpreter. You need to use php-fpm as a deamon for
+Nginx has no built-in PHP interpreter. You need to use php-fpm as a daemon for
serving PHP cgi.
On Ubuntu/Debian install nginx and php-fpm with:
### Software
!!! Warning
- For larger installations you **must have** PostgreSQL 11+ and Postgis 3+
+ For larger installations you **must have** PostgreSQL 11+ and PostGIS 3+
otherwise import and queries will be slow to the point of being unusable.
- Query performance has marked improvements with PostgrSQL 13+ and Postgis 3.2+.
+ Query performance has marked improvements with PostgreSQL 13+ and PostGIS 3.2+.
For compiling:
fsync = off
full_page_writes = off
-Don't forget to reenable them after the initial import or you risk database
+Don't forget to re-enable them after the initial import or you risk database
corruption.
# If no endpoint is given, then use search.
RewriteRule ^(/|$) "search.php"
- # If format-html is explicity requested, forward to the UI.
+ # If format-html is explicitly requested, forward to the UI.
RewriteCond %{QUERY_STRING} "format=html"
RewriteRule ^([^/]+)(.php)? ui/$1.html [R,END]
a replication source with an update frequency that is an order of magnitude
lower. For example, if you want to update once a day, use an hourly updated
source. This makes sure that you don't miss an entire day of updates when
- the source is unexpectely late to publish its update.
+ the source is unexpectedly late to publish its update.
If you want to use the source with the same update frequency (e.g. a daily
updated source with daily updates), use the
removed and reimported while updating the database with fresh OSM data.
It is thus not useful to treat it as permanent for later use.
-The combination `osm_type`+`osm_id` is slighly better but remember in
+The combination `osm_type`+`osm_id` is slightly better but remember in
OpenStreetMap mappers can delete, split, recreate places (and those
get a new `osm_id`), there is no link between those old and new ids.
Places can also change their meaning without changing their `osm_id`,
* city_district, district, borough, suburb, subdivision
* hamlet, croft, isolated_dwelling
* neighbourhood, allotments, quarter
- * city_block, residental, farm, farmyard, industrial, commercial, retail
+ * city_block, residential, farm, farmyard, industrial, commercial, retail
* road
* house_number, house_name
* emergency, historic, military, natural, landuse, place, railway,
in the [Import section](../admin/Import.md#filtering-imported-data). These
standard styles may be referenced by their name.
-You can also create your own custom syle. Put the style file into your
+You can also create your own custom style. Put the style file into your
project directory and then set `NOMINATIM_IMPORT_STYLE` to the name of the file.
It is always recommended to start with one of the standard styles and customize
those. You find the standard styles under the name `import-<stylename>.style`
Each country is assigned a partition number in the country_name table (see
below) and the data is then split between a set of tables, one for each
partition. Note that Nominatim still manually manages partitioned tables.
-Native support for partitions in PostgreSQL only became useable with version 13.
+Native support for partitions in PostgreSQL only became usable with version 13.
It will be a little while before Nominatim drops support for older versions.
![address tables](address-tables.svg)
default languages and saves the assignment of countries to partitions.
* `country_osm_grid` provides a fallback for country geometries
-## Auxilary data tables
+## Auxiliary data tables
-Finally there are some table for auxillary data:
+Finally there are some tables for auxiliary data:
* `location_property_tiger` - saves housenumber from the Tiger import. Its
- layout is similar to that of `location_propoerty_osmline`.
+ layout is similar to that of `location_property_osmline`.
# Setting up Nominatim for Development
-This chapter gives an overview how to set up Nominatim for developement
+This chapter gives an overview how to set up Nominatim for development
and how to run tests.
!!! Important
If the tokenizer has a default configuration file, this should be saved in
the `settings/<NAME>_tokenizer.<SUFFIX>`.
-### Configuration and Persistance
+### Configuration and Persistence
Tokenizers may define custom settings for their configuration. All settings
must be prefixed with `NOMINATIM_TOKENIZER_`. Settings may be transient or
## US Census TIGER
-For the United States you can choose to import additonal street-level data.
+For the United States you can choose to import additional street-level data.
The data isn't mixed into OSM data but queried as fallback when no OSM
result can be found.
$this->bFallback = $oParams->getBool('fallback', $this->bFallback);
- // List of excluded Place IDs - used for more acurate pageing
+ // List of excluded Place IDs - used for more accurate paging
$sExcluded = $oParams->getStringList('exclude_place_ids');
if ($sExcluded) {
foreach ($sExcluded as $iExcludedPlaceID) {
}
/**
- * Get the orginal phrase of the string.
+ * Get the original phrase of the string.
*/
public function getPhrase()
{
// starts if the search is on POI or street level,
// searches for the nearest POI or street,
// if a street is found and a POI is searched for,
- // the nearest POI which the found street is a parent of is choosen.
+ // the nearest POI which the found street is a parent of is chosen.
$sSQL = 'select place_id,parent_place_id,rank_address,country_code,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM ';
// We can't reliably go from the closest street to an
// interpolation line because the closest interpolation
// may have a different street segments as a parent.
- // Therefore allow an interpolation line to take precendence
+ // Therefore allow an interpolation line to take precedence
// even when the street is closer.
$fDistance = $iRankAddress < 28 ? 0.001 : $aPlace['distance'];
}
* Add the given full-word token to the list of terms to search for in the
* name.
*
- * @param interger iId ID of term to add.
+ * @param integer iId ID of term to add.
* @param bool bRareName True if the term is infrequent enough to not
* require other constraints for efficient search.
*/
*
* @return mixed[] An array with two fields: IDs contains the list of
* matching place IDs and houseNumber the houseNumber
- * if appicable or -1 if not.
+ * if applicable or -1 if not.
*/
public function query(&$oDB, $iMinRank, $iMaxRank, $iLimit)
{
public function extendSearch($oSearch, $oPosition)
{
// Full words can only be a name if they appear at the beginning
- // of the phrase. In structured search the name must forcably in
+ // of the phrase. In structured search the name must forcibly be in
// the first phrase. In unstructured search it may be in a later
// phrase when the first phrase is a house number.
if ($oSearch->hasName()
showUsage($aSpec, $bExitOnError, 'Option \''.$aLine[0].'\' is missing');
}
if ($aCounts[$aLine[0]] > $aLine[3]) {
- showUsage($aSpec, $bExitOnError, 'Option \''.$aLine[0].'\' is pressent too many times');
+ showUsage($aSpec, $bExitOnError, 'Option \''.$aLine[0].'\' is present too many times');
}
if ($aLine[6] == 'bool' && !array_key_exists($aLine[0], $aResult)) {
$aResult[$aLine[0]] = false;
function loadSettings($sProjectDir)
{
@define('CONST_InstallDir', $sProjectDir);
- // Temporary hack to set the direcory via environment instead of
+ // Temporary hack to set the directory via environment instead of
// the installed scripts. Neither setting is part of the official
// set of settings.
defined('CONST_ConfigDir') or define('CONST_ConfigDir', $_SERVER['NOMINATIM_CONFIGDIR']);
$aLinkedLines = $oDB->getAll($sSQL);
}
-// All places this is an imediate parent of
+// All places this is an immediate parent of
$aHierarchyLines = false;
if ($bIncludeHierarchy) {
$sSQL = 'SELECT obj.place_id, osm_type, osm_id, class, type, housenumber,';
centroid GEOMETRY
);
--- feature intersects geoemtry
+-- feature intersects geometry
-- for areas and linestrings they must touch at least along a line
CREATE OR REPLACE FUNCTION is_relevant_geometry(de9im TEXT, geom_type TEXT)
RETURNS BOOLEAN
and rank_search = 30 AND ST_GeometryType(geometry) in ('ST_Polygon','ST_MultiPolygon')
LIMIT 1;
ELSE
- -- See if we can inherit addtional address tags from an interpolation.
+ -- See if we can inherit additional address tags from an interpolation.
-- These will become permanent.
FOR location IN
SELECT (address - 'interpolation'::text - 'housenumber'::text) as address
{% if debug %}RAISE WARNING 'Using full index mode for % %', NEW.osm_type, NEW.osm_id;{% endif %}
IF linked_place is not null THEN
-- Recompute the ranks here as the ones from the linked place might
- -- have been shifted to accomodate surrounding boundaries.
+ -- have been shifted to accommodate surrounding boundaries.
SELECT place_id, osm_id, class, type, extratags,
centroid, geometry,
(compute_place_rank(country_code, osm_type, class, type, admin_level,
THEN
-- Update the list of country names.
-- Only take the name from the largest area for the given country code
- -- in the hope that this is the authoritive one.
+ -- in the hope that this is the authoritative one.
-- Also replace any old names so that all mapping mistakes can
-- be fixed through regular OSM updates.
FOR location IN
NEW.postcode := get_nearest_postcode(NEW.country_code, NEW.geometry);
END IF;
- {% if debug %}RAISE WARNING 'place update % % finsihed.', NEW.osm_type, NEW.osm_id;{% endif %}
+ {% if debug %}RAISE WARNING 'place update % % finished.', NEW.osm_type, NEW.osm_id;{% endif %}
NEW.token_info := token_strip_info(NEW.token_info);
RETURN NEW;
#!/bin/sh
#
-# Plugin to monitor the types of requsts made to the API
+# Plugin to monitor the types of requests made to the API
#
# Can be configured through libpq environment variables, for example
# PGUSER, PGDATABASE, etc. See man page of psql for more information.
config: Optional[str] = None) -> Any:
""" Load additional configuration from a file. `filename` is the name
of the configuration file. The file is first searched in the
- project directory and then in the global settings dirctory.
+ project directory and then in the global settings directory.
If `config` is set, then the name of the configuration file can
be additionally given through a .env configuration option. When
""" Handler for the '!include' operator in YAML files.
When the filename is relative, then the file is first searched in the
- project directory and then in the global settings dirctory.
+ project directory and then in the global settings directory.
"""
fname = loader.construct_scalar(node)
def drop_table(self, name: str, if_exists: bool = True, cascade: bool = False) -> None:
""" Drop the table with the given name.
- Set `if_exists` to False if a non-existant table should raise
+ Set `if_exists` to False if a non-existent table should raise
an exception instead of just being ignored. If 'cascade' is set
to True then all dependent tables are deleted as well.
"""
def drop_table(self, name: str, if_exists: bool = True, cascade: bool = False) -> None:
""" Drop the table with the given name.
- Set `if_exists` to False if a non-existant table should raise
+ Set `if_exists` to False if a non-existent table should raise
an exception instead of just being ignored.
"""
with self.cursor() as cur:
from nominatim.db.connection import Connection
def set_property(conn: Connection, name: str, value: str) -> None:
- """ Add or replace the propery with the given name.
+ """ Add or replace the property with the given name.
"""
with conn.cursor() as cur:
cur.execute('SELECT value FROM nominatim_properties WHERE property = %s',
def index_postcodes(self) -> None:
- """Index the entries ofthe location_postcode table.
+ """Index the entries of the location_postcode table.
"""
LOG.warning("Starting indexing postcodes using %s threads", self.num_threads)
# asynchronously get the next batch
has_more = fetcher.fetch_next_batch(cur, runner)
- # And insert the curent batch
+ # And insert the current batch
for idx in range(0, len(places), batch):
part = places[idx:idx + batch]
LOG.debug("Processing places: %s", str(part))
""" Tracks and prints progress for the indexing process.
`name` is the name of the indexing step being tracked.
`total` sets up the total number of items that need processing.
- `log_interval` denotes the interval in seconds at which progres
+ `log_interval` denotes the interval in seconds at which progress
should be reported.
"""
# Copyright (C) 2022 by the Nominatim developer community.
# For a full list of authors see the git log.
"""
-Abstract class defintions for tokenizers. These base classes are here
+Abstract class definitions for tokenizers. These base classes are here
mainly for documentation purposes.
"""
from abc import ABC, abstractmethod
the search index.
Arguments:
- place: Place information retrived from the database.
+ place: Place information retrieved from the database.
Returns:
A JSON-serialisable structure that will be handed into
init_db: When set to False, then initialisation of database
tables should be skipped. This option is only required for
- migration purposes and can be savely ignored by custom
+ migration purposes and can be safely ignored by custom
tokenizers.
TODO: can we move the init_db parameter somewhere else?
existing database.
A tokenizer is something that is bound to the lifetime of a database. It
-can be choosen and configured before the intial import but then needs to
+can be chosen and configured before the initial import but then needs to
be used consistently when querying and updating the database.
This module provides the functions to create and configure a new tokenizer
-as well as instanciating the appropriate tokenizer for updating an existing
+as well as instantiating the appropriate tokenizer for updating an existing
database.
A tokenizer usually also includes PHP code for querying. The appropriate PHP
class ICUTokenizer(AbstractTokenizer):
- """ This tokenizer uses libICU to covert names and queries to ASCII.
+ """ This tokenizer uses libICU to convert names and queries to ASCII.
Otherwise it uses the same algorithms and data structures as the
normalization routines in Nominatim 3.
"""
def _remove_special_phrases(self, cursor: Cursor,
new_phrases: Set[Tuple[str, str, str, str]],
existing_phrases: Set[Tuple[str, str, str, str]]) -> int:
- """ Remove all phrases from the databse that are no longer in the
+ """ Remove all phrases from the database that are no longer in the
new phrase list.
"""
to_delete = existing_phrases - new_phrases
def _retrieve_full_tokens(self, name: str) -> List[int]:
""" Get the full name token for the given name, if it exists.
- The name is only retrived for the standard analyser.
+ The name is only retrieved for the standard analyser.
"""
assert self.conn is not None
norm_name = self._search_normalized(name)
def scan(self, postcode: str, country: Optional[str]) -> Optional[Tuple[str, str]]:
""" Check the postcode for correct formatting and return the
normalized version. Returns None if the postcode does not
- correspond to the oficial format of the given country.
+ correspond to the official format of the given country.
"""
match = self.matcher.match(country, postcode)
if match is None:
True when the item passes the filter.
If the parameter is empty, the filter lets all items pass. If the
- paramter is a string, it is interpreted as a single regular expression
+ parameter is a string, it is interpreted as a single regular expression
that must match the full kind string. If the parameter is a list then
any of the regular expressions in the list must match to pass.
"""
class _VariantMaker:
- """ Generater for all necessary ICUVariants from a single variant rule.
+ """ Generator for all necessary ICUVariants from a single variant rule.
All text in rules is normalized to make sure the variants match later.
"""
class MutationVariantGenerator:
""" Generates name variants by applying a regular expression to the name
and replacing it with one or more variants. When the regular expression
- matches more than once, each occurence is replaced with all replacement
+ matches more than once, each occurrence is replaced with all replacement
patterns.
"""
return CheckState.FATAL, dict(config=config)
-@_check(hint="""placex table has no data. Did the import finish sucessfully?""")
+@_check(hint="""placex table has no data. Did the import finish successfully?""")
def check_placex_size(conn: Connection, _: Configuration) -> CheckResult:
""" Checking for placex content
"""
tokenizer = tokenizer_factory.get_tokenizer_for_db(config)
except UsageError:
return CheckState.FAIL, dict(msg="""\
- Cannot load tokenizer. Did the import finish sucessfully?""")
+ Cannot load tokenizer. Did the import finish successfully?""")
result = tokenizer.check_database(config)
for version, func in _MIGRATION_FUNCTIONS:
if db_version <= version:
title = func.__doc__ or ''
- LOG.warning("Runnning: %s (%s)", title.split('\n', 1)[0],
+ LOG.warning("Running: %s (%s)", title.split('\n', 1)[0],
version_str(version))
kwargs = dict(conn=conn, config=config, paths=paths)
func(**kwargs)
def add_step_column_for_interpolation(conn: Connection, **_: Any) -> None:
""" Add a new column 'step' to the interpolations table.
- Also convers the data into the stricter format which requires that
+ Also converts the data into the stricter format which requires that
startnumbers comply with the odd/even requirements.
"""
if conn.table_has_column('location_property_osmline', 'step'):
def import_wikipedia_articles(dsn: str, data_path: Path, ignore_errors: bool = False) -> int:
""" Replaces the wikipedia importance tables with new data.
The import is run in a single transaction so that the new data
- is replace seemlessly.
+ is replaced seamlessly.
Returns 0 if all was well and 1 if the importance file could not
be found. Throws an exception if there was an error reading the file.
self.black_list, self.white_list = self._load_white_and_black_lists()
self.sanity_check_pattern = re.compile(r'^\w+$')
# This set will contain all existing phrases to be added.
- # It contains tuples with the following format: (lable, class, type, operator)
+ # It contains tuples with the following format: (label, class, type, operator)
self.word_phrases: Set[Tuple[str, str, str, str]] = set()
# This set will contain all existing place_classtype tables which doesn't match any
# special phrases class/type on the wiki.
"""
from typing import Any, Union, Mapping, TypeVar, Sequence, TYPE_CHECKING
-# Generics varaible names do not confirm to naming styles, ignore globally here.
+# Generics variable names do not conform to naming styles, ignore globally here.
# pylint: disable=invalid-name,abstract-method,multiple-statements
# pylint: disable=missing-class-docstring,useless-import-alias
POSTGRESQL_REQUIRED_VERSION = (9, 5)
POSTGIS_REQUIRED_VERSION = (2, 2)
-# Cmake sets a variabe @GIT_HASH@ by executing 'git --log'. It is not run
+# Cmake sets a variable @GIT_HASH@ by executing 'git --log'. It is not run
# on every execution of 'make'.
# cmake/tool-installed.tmpl is used to build the binary 'nominatim'. Inside
# there is a call to set the variable value below.
# Copyright (C) 2022 by the Nominatim developer community.
# For a full list of authors see the git log.
"""
-Tests for specialised conenction and cursor classes.
+Tests for specialised connection and cursor classes.
"""
import pytest
import psycopg2
sudo dnf install -y epel-release redhat-rpm-config
# EPEL contains Postgres 9.6 and 10, but not PostGIS. Postgres 9.4+/10/11/12
-# and PostGIS 2.4/2.5/3.0 are availble from postgresql.org. Enable these
+# and PostGIS 2.4/2.5/3.0 are available from postgresql.org. Enable these
# repositories and make sure, the binaries can be found:
sudo dnf -qy module disable postgresql