git.openstreetmap.org Git - nominatim.git/log
nominatim.git
2 years agoMerge pull request #2786 from lonvia/export-centroid-for-tokenizer
Sarah Hoffmann [Mon, 1 Aug 2022 09:38:24 +0000 (11:38 +0200)]
Merge pull request #2786 from lonvia/export-centroid-for-tokenizer

Export centroid to tokenizer

2 years agoexport centroid to tokenizer
Sarah Hoffmann [Sun, 31 Jul 2022 20:10:58 +0000 (22:10 +0200)]
export centroid to tokenizer

May come in handy when developing sanitizers for an area smaller
than country size.

2 years agoMerge pull request #2784 from lonvia/doscs-customizing-icu-tokenizer
Sarah Hoffmann [Sun, 31 Jul 2022 17:15:50 +0000 (19:15 +0200)]
Merge pull request #2784 from lonvia/doscs-customizing-icu-tokenizer

Document the public API of sanitizers and token analysis modules

2 years agofix various typos
Sarah Hoffmann [Sun, 31 Jul 2022 15:10:35 +0000 (17:10 +0200)]
fix various typos

2 years agoadd simple examples of sanitizers and token analysis
Sarah Hoffmann [Fri, 29 Jul 2022 15:15:25 +0000 (17:15 +0200)]
add simple examples of sanitizers and token analysis

2 years agooverhaul the token analysis interface
Sarah Hoffmann [Fri, 29 Jul 2022 13:14:11 +0000 (15:14 +0200)]
overhaul the token analysis interface

The functional split between the two functions is now that the
first one creates the ID that is used in the word table and
the second one creates the variants. There no longer is a
requirement that the ID is the normalized version. We might
later reintroduce the requirement that a normalized version be available
but it doesn't necessarily need to be through the ID.

The function that creates the ID now gets the full PlaceName. That way
it might take into account attributes that were set by the sanitizers.

Finally rename both functions to something more sane.
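
The split described in this commit message could be sketched roughly as follows. This is only an illustration: the simplified `PlaceName` class, the `ToyAnalyzer` class and the method names `get_canonical_id` and `compute_variants` are assumptions made for this example, not signatures confirmed by the log.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlaceName:
    """Simplified stand-in for a name plus attributes set by sanitizers."""
    name: str
    attr: Dict[str, str] = field(default_factory=dict)


class ToyAnalyzer:
    """Hypothetical token analysis showing the two-function split."""

    def get_canonical_id(self, place_name: PlaceName) -> str:
        # The ID receives the full PlaceName, so it may use sanitizer
        # attributes. It does not have to be the normalized name.
        base = place_name.name.strip().lower()
        lang = place_name.attr.get('lang')
        return f"{base}@{lang}" if lang else base

    def compute_variants(self, canonical_id: str) -> List[str]:
        # The variants are derived from the ID alone.
        base = canonical_id.split('@', 1)[0]
        variants = {base, base.replace('street', 'st')}
        return sorted(variants)
```

In this sketch the ID would be what ends up in the word table, while the list of variants is computed separately from it.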

2 years agomove PlaceName into the generic data module
Sarah Hoffmann [Fri, 29 Jul 2022 09:39:55 +0000 (11:39 +0200)]
move PlaceName into the generic data module

2 years agoharmonize spelling
Sarah Hoffmann [Fri, 29 Jul 2022 08:52:01 +0000 (10:52 +0200)]
harmonize spelling

Stick with the American spelling of Analyze.

2 years agoharmonize interface of token analysis module
Sarah Hoffmann [Fri, 29 Jul 2022 08:43:07 +0000 (10:43 +0200)]
harmonize interface of token analysis module

The configure() function now receives a Transliterator object instead
of the ICU rules. This harmonizes the parameters with the create
function.

2 years agoadd documentation for custom token analysis
Sarah Hoffmann [Fri, 29 Jul 2022 07:41:28 +0000 (09:41 +0200)]
add documentation for custom token analysis

2 years agoadd documentation for sanitizer interface
Sarah Hoffmann [Thu, 28 Jul 2022 20:00:29 +0000 (22:00 +0200)]
add documentation for sanitizer interface

Also switches mkdocstrings to 0.18 with the rather unfortunate
consequence that now mkdocstrings-python-legacy is needed as well.

2 years agoMerge pull request #2780 from lonvia/python-modules-in-project-directory
Sarah Hoffmann [Thu, 28 Jul 2022 19:58:04 +0000 (21:58 +0200)]
Merge pull request #2780 from lonvia/python-modules-in-project-directory

Support for external sanitizer and token analysis modules

2 years agoadd support for external token analysis modules
Sarah Hoffmann [Mon, 25 Jul 2022 14:27:22 +0000 (16:27 +0200)]
add support for external token analysis modules

2 years agoadd support for external sanitizer modules
Sarah Hoffmann [Mon, 25 Jul 2022 14:10:19 +0000 (16:10 +0200)]
add support for external sanitizer modules

2 years agoadd function for loading plugin modules
Sarah Hoffmann [Mon, 25 Jul 2022 13:17:20 +0000 (15:17 +0200)]
add function for loading plugin modules

Loads modules for configurable code like tokenizers, sanitizers, etc.
Supports internal modules, external libraries and code from the
project directory.
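
A loader covering the three sources named above might look like the following minimal sketch. The function name `load_plugin`, its parameters and the rule that names ending in `.py` refer to files in the project directory are assumptions for illustration, not the function added by this commit.

```python
import importlib
import importlib.util
from pathlib import Path
from types import ModuleType


def load_plugin(name: str, project_dir: Path, internal_package: str) -> ModuleType:
    """Hypothetical loader: project file, internal module or external library."""
    if name.endswith('.py'):
        # Assumption: names ending in .py are files in the project directory.
        path = project_dir / name
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot load plugin file {path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module

    try:
        # Other names are first looked up among the bundled modules.
        return importlib.import_module(f"{internal_package}.{name}")
    except ModuleNotFoundError:
        # Otherwise fall back to an installed external library.
        return importlib.import_module(name)
```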

2 years agoMerge pull request #2775 from lonvia/remove-centos-instructions
Sarah Hoffmann [Mon, 25 Jul 2022 08:29:32 +0000 (10:29 +0200)]
Merge pull request #2775 from lonvia/remove-centos-instructions

Remove vagrant scripts for CentOS

2 years agovagrant: remove proj dependency and only require php-cli
Sarah Hoffmann [Sun, 24 Jul 2022 08:24:18 +0000 (10:24 +0200)]
vagrant: remove proj dependency and only require php-cli

2 years agoremove CentOS installation instructions
Sarah Hoffmann [Sun, 24 Jul 2022 08:22:22 +0000 (10:22 +0200)]
remove CentOS installation instructions

Fixes #2601.

2 years agoMerge pull request #2774 from lonvia/parameter-arrays
Sarah Hoffmann [Sat, 23 Jul 2022 21:56:32 +0000 (23:56 +0200)]
Merge pull request #2774 from lonvia/parameter-arrays

Ignore URL parameters in array notation

2 years agoignore API parameters in array notation
Sarah Hoffmann [Sat, 23 Jul 2022 08:51:44 +0000 (10:51 +0200)]
ignore API parameters in array notation

PHP automatically parses parameters in array notation (foo[]) into
array types. Ignore these parameters as 'unknown'.

Fixes #2763.
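
The fix itself lives in the PHP frontend. The idea can be illustrated in Python with the sketch below, where the function name `scalar_params` and the bracket check are assumptions for this example only.

```python
from typing import Dict
from urllib.parse import parse_qsl


def scalar_params(query: str) -> Dict[str, str]:
    """Keep plain parameters and drop those written in array notation."""
    params: Dict[str, str] = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if '[' in key:
            # e.g. "foo[]=1" or "foo[bar]=1": handled like an unknown parameter
            continue
        params[key] = value
    return params
```

With this sketch, a query string such as `q=berlin&format[]=json&limit=5` keeps only `q` and `limit`.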

2 years agoMerge pull request #2772 from kianmeng/fix-typos
Sarah Hoffmann [Wed, 20 Jul 2022 15:13:30 +0000 (17:13 +0200)]
Merge pull request #2772 from kianmeng/fix-typos

docs: fix typos

2 years agodocs: fix typos
Kian-Meng Ang [Wed, 20 Jul 2022 14:05:25 +0000 (22:05 +0800)]
docs: fix typos

2 years agodocs: slightly increase recommended hardware requirements
Sarah Hoffmann [Wed, 20 Jul 2022 08:16:23 +0000 (10:16 +0200)]
docs: slightly increase recommended hardware requirements

2 years agoMerge pull request #2770 from lonvia/typed-python
Sarah Hoffmann [Tue, 19 Jul 2022 07:03:30 +0000 (09:03 +0200)]
Merge pull request #2770 from lonvia/typed-python

Type annotations for Python code

2 years agoCI: remove installation of pip on Ubuntu 20
Sarah Hoffmann [Mon, 18 Jul 2022 10:19:04 +0000 (12:19 +0200)]
CI: remove installation of pip on Ubuntu 20

2 years agoadd explicit cast for fetchone
Sarah Hoffmann [Mon, 18 Jul 2022 08:18:51 +0000 (10:18 +0200)]
add explicit cast for fetchone

2 years agoCI: use psutil type stubs
Sarah Hoffmann [Sun, 17 Jul 2022 22:16:33 +0000 (00:16 +0200)]
CI: use psutil type stubs

2 years agoremove typing_extensions requirement
Sarah Hoffmann [Sun, 17 Jul 2022 21:18:55 +0000 (23:18 +0200)]
remove typing_extensions requirement

The typing_extensions package is only necessary now when running mypy.
It won't be used at runtime anymore.
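
One common way to confine such an import to type checking is a `TYPE_CHECKING` guard. The sketch below assumes this pattern; the `StatusRow` type and the `describe` function are made up for illustration and are not taken from the Nominatim code.

```python
from typing import TYPE_CHECKING, Any, Dict

if TYPE_CHECKING:
    # Only evaluated by type checkers such as mypy, never at runtime.
    from typing_extensions import TypedDict

    class StatusRow(TypedDict):
        sequence_id: int
        indexed: bool
else:
    # Runtime fallback: a plain dictionary alias, no extra dependency.
    StatusRow = Dict[str, Any]


def describe(row: StatusRow) -> str:
    return f"sequence {row['sequence_id']}, indexed={row['indexed']}"
```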

2 years agoCI: make type checking strict
Sarah Hoffmann [Sun, 17 Jul 2022 21:20:21 +0000 (23:20 +0200)]
CI: make type checking strict

2 years agoadd type annotations for command line functions
Sarah Hoffmann [Sun, 17 Jul 2022 16:31:51 +0000 (18:31 +0200)]
add type annotations for command line functions

2 years agoadd type annotations for Tiger import function
Sarah Hoffmann [Sun, 17 Jul 2022 09:01:44 +0000 (11:01 +0200)]
add type annotations for Tiger import function

2 years agoadd type annotations to special phrase importer
Sarah Hoffmann [Sun, 17 Jul 2022 08:46:59 +0000 (10:46 +0200)]
add type annotations to special phrase importer

2 years agoadd type annotations to database check functions
Sarah Hoffmann [Sun, 17 Jul 2022 07:59:51 +0000 (09:59 +0200)]
add type annotations to database check functions

2 years agoadd type annotations for database import functions
Sarah Hoffmann [Sat, 16 Jul 2022 22:29:34 +0000 (00:29 +0200)]
add type annotations for database import functions

2 years agoadd type annotations for migrations
Sarah Hoffmann [Sat, 16 Jul 2022 21:44:36 +0000 (23:44 +0200)]
add type annotations for migrations

2 years agoadd type annotations to tool functions
Sarah Hoffmann [Sat, 16 Jul 2022 21:28:02 +0000 (23:28 +0200)]
add type annotations to tool functions

2 years agoadd type annotations for ICU tokenizer
Sarah Hoffmann [Sat, 16 Jul 2022 15:33:19 +0000 (17:33 +0200)]
add type annotations for ICU tokenizer

2 years agoadd type annotations for legacy tokenizer
Sarah Hoffmann [Fri, 15 Jul 2022 20:52:26 +0000 (22:52 +0200)]
add type annotations for legacy tokenizer

2 years agoadd type annotations to ICU tokenizer helper modules
Sarah Hoffmann [Wed, 13 Jul 2022 20:55:40 +0000 (22:55 +0200)]
add type annotations to ICU tokenizer helper modules

2 years agoadd typing extensions for Ubuntu22.04
Sarah Hoffmann [Wed, 13 Jul 2022 18:49:54 +0000 (20:49 +0200)]
add typing extensions for Ubuntu22.04

2 years agoadd type annotations for token analysis
Sarah Hoffmann [Wed, 13 Jul 2022 15:18:53 +0000 (17:18 +0200)]
add type annotations for token analysis

No annotations for ICU types yet.

2 years agoadd type hints for sanitizers
Sarah Hoffmann [Tue, 12 Jul 2022 21:15:19 +0000 (23:15 +0200)]
add type hints for sanitizers

2 years agoadd type annotations for indexer
Sarah Hoffmann [Tue, 12 Jul 2022 16:40:51 +0000 (18:40 +0200)]
add type annotations for indexer

2 years agoadd typing information for postcode formatter
Sarah Hoffmann [Fri, 8 Jul 2022 09:52:45 +0000 (11:52 +0200)]
add typing information for postcode formatter

2 years agoadd typing information for place_info and country_info
Sarah Hoffmann [Thu, 7 Jul 2022 15:31:20 +0000 (17:31 +0200)]
add typing information for place_info and country_info

2 years agoadd typing information for utils submodule
Sarah Hoffmann [Tue, 5 Jul 2022 15:28:02 +0000 (17:28 +0200)]
add typing information for utils submodule

2 years agotype annotations for non-blocking DB connection
Sarah Hoffmann [Tue, 5 Jul 2022 13:00:33 +0000 (15:00 +0200)]
type annotations for non-blocking DB connection

2 years agoadd type annotations for SQL preprocessor
Sarah Hoffmann [Tue, 5 Jul 2022 09:24:53 +0000 (11:24 +0200)]
add type annotations for SQL preprocessor

2 years agoadd type annotation to DB utils
Sarah Hoffmann [Tue, 5 Jul 2022 08:46:55 +0000 (10:46 +0200)]
add type annotation to DB utils

As a cursor is needed as a type, make this a public type.

2 years agoadd typing information to DB properties
Sarah Hoffmann [Tue, 5 Jul 2022 08:34:55 +0000 (10:34 +0200)]
add typing information to DB properties

2 years agoadd typing annotations for DB status module
Sarah Hoffmann [Mon, 4 Jul 2022 09:29:12 +0000 (11:29 +0200)]
add typing annotations for DB status module

Requires TypedDict, which is only available from Python 3.8. Therefore
require typing_extensions to make the functionality available for
earlier Python versions.
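
A version-dependent import is one way to express this requirement. The following is a minimal sketch under that assumption; the `StatusRow` fields are invented for the example.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    # Assumption: the typing_extensions backport is installed on older Pythons.
    from typing_extensions import TypedDict


class StatusRow(TypedDict):
    sequence_id: int
    indexed: bool


# At runtime a TypedDict value is an ordinary dict.
row: StatusRow = {'sequence_id': 42, 'indexed': False}
```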

2 years agoadapt use of Connection in bdd tests to name change
Sarah Hoffmann [Mon, 4 Jul 2022 06:46:07 +0000 (08:46 +0200)]
adapt use of Connection in bdd tests to name change

2 years agoadd type annotations to freeze functions
Sarah Hoffmann [Sun, 3 Jul 2022 17:04:05 +0000 (19:04 +0200)]
add type annotations to freeze functions

2 years agofix uses of config.get_path() to expect None
Sarah Hoffmann [Sun, 3 Jul 2022 16:36:33 +0000 (18:36 +0200)]
fix uses of config.get_path() to expect None

2 years agodefine type for environment dictionaries
Sarah Hoffmann [Sun, 3 Jul 2022 15:38:11 +0000 (17:38 +0200)]
define type for environment dictionaries

2 years agorestrict return type more
Sarah Hoffmann [Sun, 3 Jul 2022 15:21:46 +0000 (17:21 +0200)]
restrict return type more

2 years agoadd type annotations to exec_utils
Sarah Hoffmann [Sun, 3 Jul 2022 12:48:15 +0000 (14:48 +0200)]
add type annotations to exec_utils

2 years agoCI: install type info for psycopg2
Sarah Hoffmann [Sun, 3 Jul 2022 09:49:50 +0000 (11:49 +0200)]
CI: install type info for psycopg2

2 years agoavoid issues with Python < 3.9 and linting
Sarah Hoffmann [Sun, 3 Jul 2022 09:33:19 +0000 (11:33 +0200)]
avoid issues with Python < 3.9 and linting

2 years agomove complex typing annotations to extra file
Sarah Hoffmann [Sat, 2 Jul 2022 09:59:19 +0000 (11:59 +0200)]
move complex typing annotations to extra file

2 years agotype annotations for DB utils
Sarah Hoffmann [Sat, 2 Jul 2022 08:18:10 +0000 (10:18 +0200)]
type annotations for DB utils

2 years agotype annotations for DB connection
Sarah Hoffmann [Fri, 1 Jul 2022 11:55:24 +0000 (13:55 +0200)]
type annotations for DB connection

2 years agomypy: add psycopg2 typing info from typeshed
Sarah Hoffmann [Thu, 30 Jun 2022 13:57:44 +0000 (15:57 +0200)]
mypy: add psycopg2 typing info from typeshed

2 years agoadd type annotations to config module
Sarah Hoffmann [Thu, 30 Jun 2022 13:43:18 +0000 (15:43 +0200)]
add type annotations to config module

2 years agoadd type annotations for version.py
Sarah Hoffmann [Thu, 30 Jun 2022 12:36:19 +0000 (14:36 +0200)]
add type annotations for version.py

2 years agomypy: ignore dotenv library
Sarah Hoffmann [Thu, 30 Jun 2022 12:07:02 +0000 (14:07 +0200)]
mypy: ignore dotenv library

2 years agodocument use of mypy
Sarah Hoffmann [Thu, 30 Jun 2022 09:56:14 +0000 (11:56 +0200)]
document use of mypy

2 years agoCI: add mypy to tests
Sarah Hoffmann [Thu, 30 Jun 2022 09:52:45 +0000 (11:52 +0200)]
CI: add mypy to tests

2 years agomypy: minimal annotations to enable a clean run
Sarah Hoffmann [Thu, 30 Jun 2022 08:48:04 +0000 (10:48 +0200)]
mypy: minimal annotations to enable a clean run

2 years agoMerge pull request #2761 from lonvia/repair-index-analysis
Sarah Hoffmann [Mon, 18 Jul 2022 07:38:08 +0000 (09:38 +0200)]
Merge pull request #2761 from lonvia/repair-index-analysis

Repair `admin --analyse-indexing`

2 years agoMerge pull request #2764 from otbutz/patch-4
Sarah Hoffmann [Wed, 13 Jul 2022 13:51:47 +0000 (15:51 +0200)]
Merge pull request #2764 from otbutz/patch-4

Remove legacy Postgres options

2 years agoRemove legacy Postgres options
otbutz [Tue, 12 Jul 2022 07:49:10 +0000 (09:49 +0200)]
Remove legacy Postgres options

2 years agoMerge pull request #2691 from mtmail/ubuntu-22
Sarah Hoffmann [Mon, 11 Jul 2022 13:37:51 +0000 (15:37 +0200)]
Merge pull request #2691 from mtmail/ubuntu-22

Vagrant and CI tests for Ubuntu 22.04

2 years agoIn tests for PHP 8 disable Just-in-time, it conflicts with tools that determine coverage
marc tobias [Mon, 4 Jul 2022 21:52:36 +0000 (23:52 +0200)]
In tests for PHP 8 disable Just-in-time, it conflicts with tools that determine coverage

2 years agoVagrant and CI tests for Ubuntu 22.04
Marc Tobias [Mon, 2 May 2022 16:16:08 +0000 (18:16 +0200)]
Vagrant and CI tests for Ubuntu 22.04

2 years agodecode_json() always create arrays instead of objects
Sarah Hoffmann [Sat, 9 Jul 2022 07:10:21 +0000 (09:10 +0200)]
decode_json() always create arrays instead of objects

2 years agoconvert admin --analyse-indexing to new indexing method
Sarah Hoffmann [Thu, 7 Jul 2022 09:23:14 +0000 (11:23 +0200)]
convert admin --analyse-indexing to new indexing method

A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.

2 years agoMerge pull request #2760 from lonvia/reorganize-data-classes
Sarah Hoffmann [Thu, 7 Jul 2022 14:12:11 +0000 (16:12 +0200)]
Merge pull request #2760 from lonvia/reorganize-data-classes

Code cleanup: move some common code into the data submodule

2 years agoremove analyze() from PlaceInfo class
Sarah Hoffmann [Wed, 6 Jul 2022 09:33:07 +0000 (11:33 +0200)]
remove analyze() from PlaceInfo class

The function creates circular dependencies.

2 years agomove country_info into data submodule
Sarah Hoffmann [Wed, 6 Jul 2022 09:08:36 +0000 (11:08 +0200)]
move country_info into data submodule

2 years agomove PlaceInfo into data submodule
Sarah Hoffmann [Wed, 6 Jul 2022 08:54:47 +0000 (10:54 +0200)]
move PlaceInfo into data submodule

This data structure is shared between indexer and tokenizer.

2 years agotest: avoid column names with upper-case letters
Sarah Hoffmann [Tue, 5 Jul 2022 07:12:55 +0000 (09:12 +0200)]
test: avoid column names with upper-case letters

This may cause problems when the column names get quoted.

2 years agoCI: remove unneeded stuff to make space for DB
Sarah Hoffmann [Sun, 3 Jul 2022 12:52:16 +0000 (14:52 +0200)]
CI: remove unneeded stuff to make space for DB

2 years agoMerge pull request #2706 from mtmail/php-fixes-php7-vs-php8
Sarah Hoffmann [Sun, 3 Jul 2022 09:28:52 +0000 (11:28 +0200)]
Merge pull request #2706 from mtmail/php-fixes-php7-vs-php8

PHP 8 behaves slightly differently with in_array and usort

2 years agoPHP 8 behaves slightly differently with in_array and usort
Marc Tobias [Tue, 10 May 2022 16:30:49 +0000 (18:30 +0200)]
PHP 8 behaves slightly differently with in_array and usort

2 years agofix syntax error with tablespaces
Sarah Hoffmann [Thu, 30 Jun 2022 07:19:16 +0000 (09:19 +0200)]
fix syntax error with tablespaces

2 years agodocs: replace deprecated pages option
Sarah Hoffmann [Sat, 25 Jun 2022 19:29:00 +0000 (21:29 +0200)]
docs: replace deprecated pages option

Fixes #2661.

2 years agofix handling of zero importance
Sarah Hoffmann [Wed, 29 Jun 2022 15:54:30 +0000 (17:54 +0200)]
fix handling of zero importance

To avoid importance becoming zero and cancelling out other weights,
df008d99f549d850d07580b4592435388e44387c introduced a minimum value
for importance. That broke importances for interpolated addresses,
which are less than zero.

Instead of setting a minimum, set zero importances to a very small
value.

Fixes #2753.
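
The difference between the two approaches can be shown with a small sketch. The constant `MIN_IMPORTANCE` and both function names are hypothetical; the commit message only speaks of "a very small value".

```python
MIN_IMPORTANCE = 1e-7  # hypothetical value, not taken from the source


def clamped_importance(importance: float) -> float:
    """Earlier approach: a lower bound, which also overwrites negative values."""
    return max(importance, MIN_IMPORTANCE)


def effective_importance(importance: float) -> float:
    """Replace exact zeros only; all other values, including negatives, stay."""
    return MIN_IMPORTANCE if importance == 0 else importance
```

With a negative importance, as used for interpolated addresses, the first function discards the value while the second leaves it untouched.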

2 years agoMerge pull request #2757 from lonvia/filter-postcodes
Sarah Hoffmann [Fri, 24 Jun 2022 19:09:41 +0000 (21:09 +0200)]
Merge pull request #2757 from lonvia/filter-postcodes

Add filtering, normalisation and variants for postcodes

2 years agoignore 5+ postcodes in the US for now
Sarah Hoffmann [Thu, 23 Jun 2022 14:17:47 +0000 (16:17 +0200)]
ignore 5+ postcodes in the US for now

Hierarchical postcodes need a different treatment.

2 years agobdd: correctly skip postcode tests for legacy
Sarah Hoffmann [Wed, 22 Jun 2022 09:38:23 +0000 (11:38 +0200)]
bdd: correctly skip postcode tests for legacy

2 years agobdd: do not expect legacy word table to be without empty tokens
Sarah Hoffmann [Wed, 22 Jun 2022 08:47:08 +0000 (10:47 +0200)]
bdd: do not expect legacy word table to be without empty tokens

It can happen for bogus names and this will not get fixed anymore.

2 years agoadapt search algorithm to new postcode format in word
Sarah Hoffmann [Wed, 22 Jun 2022 07:54:47 +0000 (09:54 +0200)]
adapt search algorithm to new postcode format in word

2 years agohandle postcodes properly on word table updates
Sarah Hoffmann [Tue, 21 Jun 2022 20:05:35 +0000 (22:05 +0200)]
handle postcodes properly on word table updates

update_postcodes_from_db() needs to do the full postcode treatment
in order to derive the correct word table entries.

2 years agoadd documentation for postcode customization
Sarah Hoffmann [Mon, 20 Jun 2022 15:42:12 +0000 (17:42 +0200)]
add documentation for postcode customization

2 years agofix linting issue
Sarah Hoffmann [Fri, 17 Jun 2022 16:14:23 +0000 (18:14 +0200)]
fix linting issue

2 years agofix up BDD tests for postcode changes
Sarah Hoffmann [Fri, 17 Jun 2022 15:28:51 +0000 (17:28 +0200)]
fix up BDD tests for postcode changes

Includes smaller code fixes found by the tests.

2 years agoport legacy tokenizer to new postcode handling
Sarah Hoffmann [Wed, 8 Jun 2022 06:19:55 +0000 (08:19 +0200)]
port legacy tokenizer to new postcode handling

Also documents the changes to the SQL functions of the tokenizer.

2 years agofix postcode pattern for Mozambique
Sarah Hoffmann [Wed, 8 Jun 2022 05:42:35 +0000 (07:42 +0200)]
fix postcode pattern for Mozambique

Optional groups are not implemented yet.

2 years agoadd tests for discarding bad postcodes
Sarah Hoffmann [Wed, 8 Jun 2022 05:24:53 +0000 (07:24 +0200)]
add tests for discarding bad postcodes