git.openstreetmap.org Git - nominatim.git/log
Sarah Hoffmann [Thu, 20 Jan 2022 14:38:02 +0000 (15:38 +0100)]
add pytest config
We are using custom marks now which need to be registered to avoid
warnings.
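As an illustration of registering custom marks, here is a minimal sketch using pytest's `pytest_configure` hook in a `conftest.py`. The commit itself adds a config file, so this hook is an alternative route, and the mark name `slow` is hypothetical.

```python
# conftest.py (sketch): register custom marks so pytest does not warn about them.
# The mark name and description below are hypothetical examples.
CUSTOM_MARKS = [
    "slow: marks tests that take a long time to run",
]

def pytest_configure(config):
    """Hook called by pytest at startup; adds each mark to the 'markers' ini list."""
    for mark in CUSTOM_MARKS:
        config.addinivalue_line("markers", mark)
```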
Sarah Hoffmann [Thu, 20 Jan 2022 11:07:12 +0000 (12:07 +0100)]
clean_housenumbers: make kinds and delimiters configurable
Also adds unit tests for various options.
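A minimal sketch of splitting a housenumber value on configurable delimiters. The function name `split_housenumbers` and the default delimiter set are assumptions for illustration, not the sanitizer's actual code or defaults.

```python
import re

def split_housenumbers(value, delimiters=",;"):
    """Split a raw housenumber string on any of the given delimiter characters."""
    # Build a character class from the configured delimiters (hypothetical default).
    pattern = "[" + re.escape(delimiters) + "]"
    # Strip surrounding whitespace and drop empty parts.
    return [part.strip() for part in re.split(pattern, value) if part.strip()]
```

For example, `split_housenumbers("3; 5,7")` yields three separate housenumbers.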
Sarah Hoffmann [Fri, 7 Jan 2022 21:41:09 +0000 (22:41 +0100)]
factor out housenumber splitting into sanitizer
Sarah Hoffmann [Wed, 19 Jan 2022 16:09:36 +0000 (17:09 +0100)]
Merge pull request #2585 from lonvia/name-mutations
Introduce character mutations to token analysis
Sarah Hoffmann [Wed, 19 Jan 2022 14:28:01 +0000 (15:28 +0100)]
docs: add pointer to caddy deployment discussion
Sarah Hoffmann [Thu, 13 Jan 2022 08:30:31 +0000 (09:30 +0100)]
fix linting error
Sarah Hoffmann [Wed, 12 Jan 2022 18:41:16 +0000 (19:41 +0100)]
move parsing of mutation config to setup phase
Sarah Hoffmann [Wed, 12 Jan 2022 16:37:06 +0000 (17:37 +0100)]
add documentation for new mutation feature
Sarah Hoffmann [Wed, 12 Jan 2022 15:25:47 +0000 (16:25 +0100)]
introduce mutation variants to generic token analyser
Mutations are regular-expression-based replacements that are applied
after variants have been computed. They are meant to be used for
variations on character level.
Add spelling variations for German umlauts.
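The idea of regular-expression-based mutations applied after variant computation can be sketched as follows. The function names and the umlaut rule in the usage note are illustrative assumptions, not the analyser's actual implementation. Note that this sketch replaces all matches in a variant with the same replacement rather than producing every combination.

```python
import re

def apply_mutations(variants, mutations):
    """Lazily apply each (pattern, replacements) mutation to the stream of variants."""
    for pattern, replacements in mutations:
        variants = _mutate(variants, re.compile(pattern), replacements)
    yield from variants

def _mutate(variants, regex, replacements):
    """Yield one output per replacement for matching variants, others unchanged."""
    for variant in variants:
        if regex.search(variant):
            for repl in replacements:
                yield regex.sub(repl, variant)
        else:
            yield variant
```

With a hypothetical rule `("ä", ["ä", "ae"])`, the variant `bär` produces both `bär` and `baer`, while variants without the character pass through untouched.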
Sarah Hoffmann [Wed, 12 Jan 2022 08:53:32 +0000 (09:53 +0100)]
move variant configuration reading in separate file
Sarah Hoffmann [Tue, 11 Jan 2022 16:51:05 +0000 (17:51 +0100)]
refactor variant production to use generators
Sarah Hoffmann [Thu, 13 Jan 2022 13:54:35 +0000 (14:54 +0100)]
Merge pull request #2578 from lonvia/iso-3166-2
Make ISO3166-2 references searchable
Sarah Hoffmann [Thu, 13 Jan 2022 13:01:57 +0000 (14:01 +0100)]
Merge pull request #2579 from geofabrik/doc-update-typo
Fix typo in name of service. The rest of the docs call it nominatim-updateS
Amanda McCann [Thu, 13 Jan 2022 12:14:17 +0000 (13:14 +0100)]
Fix typo in name of service. The rest of the docs call it nominatim-updateS
Sarah Hoffmann [Thu, 13 Jan 2022 08:44:42 +0000 (09:44 +0100)]
make ISO3166-2 references searchable
Sarah Hoffmann [Tue, 11 Jan 2022 08:41:07 +0000 (09:41 +0100)]
Merge pull request #2571 from lonvia/ukrainian-apostrophe
Consider "modifier letter apostrophe" to be punctuation
Sarah Hoffmann [Mon, 10 Jan 2022 16:40:03 +0000 (17:40 +0100)]
consider "modifier letter apostrophe" to be punctuation
While technically a letter, the modifier letter apostrophe is often
replaced in writing with a normal apostrophe, which is a punctuation
mark. This makes sure that the modifier letter apostrophe yields the
same normalization results and thus is truly interchangeable.
Only has an effect after the next reimport.
Fixes #2569.
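A small sketch of why the two characters differ and how treating them alike makes names interchangeable. It does not reproduce Nominatim's normalization rules; the helper `normalize_apostrophes` is a hypothetical stand-in for them.

```python
import unicodedata

MODIFIER_APOSTROPHE = "\u02bc"  # MODIFIER LETTER APOSTROPHE, Unicode category Lm (letter)
PLAIN_APOSTROPHE = "'"          # APOSTROPHE, Unicode category Po (punctuation)

def apostrophe_categories():
    """Return the Unicode general categories of both apostrophe characters."""
    return (unicodedata.category(MODIFIER_APOSTROPHE),
            unicodedata.category(PLAIN_APOSTROPHE))

def normalize_apostrophes(text):
    """Hypothetical helper: map the modifier letter onto the plain apostrophe."""
    return text.replace(MODIFIER_APOSTROPHE, PLAIN_APOSTROPHE)
```

After this mapping, both spellings of a name go through the same punctuation handling and produce the same normalized form.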
Sarah Hoffmann [Mon, 10 Jan 2022 13:21:48 +0000 (14:21 +0100)]
Merge pull request #2570 from woodpeck/patch-3
Fix typos
Frederik Ramm [Mon, 10 Jan 2022 12:38:53 +0000 (13:38 +0100)]
Fix typos
Sarah Hoffmann [Thu, 6 Jan 2022 08:02:46 +0000 (09:02 +0100)]
Merge pull request #2565 from lonvia/swap-wordset-order
Swap order of query interpretation
Sarah Hoffmann [Wed, 5 Jan 2022 14:21:14 +0000 (15:21 +0100)]
swap order of query interpretation
A forward interpretation of the form 'street, city, country' is
much more frequent than the reverse form 'country, city, street'.
Thus swap the order of interpretations so that the forward order comes
first.
Sarah Hoffmann [Tue, 4 Jan 2022 22:10:37 +0000 (23:10 +0100)]
Merge pull request #2562 from lonvia/copyright-headers
Add consistent copyright headers
Sarah Hoffmann [Mon, 3 Jan 2022 15:23:58 +0000 (16:23 +0100)]
add consistent SPDX copyright headers
Sarah Hoffmann [Mon, 3 Jan 2022 14:13:57 +0000 (15:13 +0100)]
Merge pull request #2559 from lonvia/disable-jit-in-queries
Disable JIT and parallel workers on search frontend
Sarah Hoffmann [Wed, 22 Dec 2021 07:59:31 +0000 (08:59 +0100)]
disable JIT and parallel workers on search frontend
Bad query planning now also interferes with queries for search and
reverse.
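A sketch of the per-session settings such a change might issue. `jit` and `max_parallel_workers_per_gather` are real PostgreSQL parameters, but the exact statements used by the frontend are not shown in this log, so the function below is an assumption for illustration.

```python
def session_settings(server_version_num):
    """Return SET statements that disable parallel workers and JIT for a session.

    server_version_num follows PostgreSQL's numeric format, e.g. 110000 for 11.0.
    """
    statements = []
    # Parallel query (and this parameter) exists since PostgreSQL 9.6.
    if server_version_num >= 90600:
        statements.append("SET max_parallel_workers_per_gather TO 0")
    # JIT compilation was introduced in PostgreSQL 11.
    if server_version_num >= 110000:
        statements.append("SET jit TO off")
    return statements
```

The statements would be executed once after connecting, so that every search and reverse query in that session runs without JIT and without parallel plans.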
Sarah Hoffmann [Tue, 14 Dec 2021 14:52:34 +0000 (15:52 +0100)]
Merge pull request #2553 from lonvia/revert-street-matching-to-full-names
Revert street matching to full names
Sarah Hoffmann [Wed, 8 Dec 2021 20:58:43 +0000 (21:58 +0100)]
correctly match abbreviated addr:street
This only works when addr:street is abbreviated and the street
name isn't. It does not work the other way around.
Sarah Hoffmann [Tue, 7 Dec 2021 14:44:45 +0000 (15:44 +0100)]
Merge pull request #2542 from lonvia/update-phpunit
Update PHPUnit use to 9.5
Sarah Hoffmann [Tue, 7 Dec 2021 13:49:31 +0000 (14:49 +0100)]
restrict PHPUnit to 9.5 version
There are so many breaking changes with PHPUnit that it is
impossible to give any other guarantees.
Sarah Hoffmann [Tue, 7 Dec 2021 11:07:17 +0000 (12:07 +0100)]
enable PHPUnit 9 for coverage
A couple of functions have been renamed.
Sarah Hoffmann [Tue, 7 Dec 2021 10:34:21 +0000 (11:34 +0100)]
php unit: replace deprecated regex assert
The regEx assertion has been renamed in PHPUnit 9.5
and causes deprecation warnings.
Sarah Hoffmann [Tue, 7 Dec 2021 10:31:45 +0000 (11:31 +0100)]
php unit: don't enforce a name on the test database
Also gets rid of a PHPUnit deprecation warning.
Sarah Hoffmann [Tue, 7 Dec 2021 10:20:38 +0000 (11:20 +0100)]
php test: class must be called like the file
Sarah Hoffmann [Tue, 7 Dec 2021 10:13:30 +0000 (11:13 +0100)]
disable codecov
Not working.
Sarah Hoffmann [Tue, 7 Dec 2021 08:17:29 +0000 (09:17 +0100)]
Merge pull request #2540 from lonvia/remove-support-for-centos7
Remove installation instructions for CentOS 7
Sarah Hoffmann [Mon, 6 Dec 2021 15:05:27 +0000 (16:05 +0100)]
remove installation instructions for CentOS 7
This ends official support for CentOS 7.
Sarah Hoffmann [Mon, 6 Dec 2021 14:17:00 +0000 (15:17 +0100)]
remove some odd variants of addr:street from the styles
Some import has added names in partial tags which confuse the
street name matching.
Sarah Hoffmann [Mon, 6 Dec 2021 13:46:40 +0000 (14:46 +0100)]
skip most addr: tags with suffixes
Only one addr: tag can be processed currently, so make
sure it is the one without suffixes to not get odd data.
addr:street is the exception because it uses a different
matching mechanism.
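The filtering rule described above can be sketched as a simple tag filter. The function name is hypothetical, and the real change may well live in the import style configuration rather than in Python code.

```python
def select_address_tags(tags):
    """Keep addr:* tags without suffix, plus all addr:street variants."""
    selected = {}
    for key, value in tags.items():
        if not key.startswith("addr:"):
            continue
        parts = key.split(":")
        # "addr:city" has two parts; "addr:city:en" carries a suffix.
        # addr:street is exempt because it is matched differently.
        if len(parts) == 2 or parts[1] == "street":
            selected[key] = value
    return selected
```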
Sarah Hoffmann [Mon, 6 Dec 2021 13:26:08 +0000 (14:26 +0100)]
ICU: matching any street name will do again
Sarah Hoffmann [Mon, 6 Dec 2021 10:38:38 +0000 (11:38 +0100)]
revert to using full names for street name matching
Using partial names turned out to not work well because there are
often similarly named streets next to each other. It also
prevents us from being able to take into account all addr:street:*
tags.
This change gets all the full term tokens for the addr:street tags
from the DB. As they are used for matching only, we can assume that
the term must already be there or there will be no match. This
avoids creating unused full name tags.
Sarah Hoffmann [Fri, 3 Dec 2021 16:08:25 +0000 (17:08 +0100)]
Merge pull request #2539 from lonvia/clean-up-python-tests
Restructure and extend python unit tests
Sarah Hoffmann [Fri, 3 Dec 2021 11:01:53 +0000 (12:01 +0100)]
specify text type in test SQL
Older versions of PostgreSQL fail otherwise.
Sarah Hoffmann [Thu, 2 Dec 2021 22:45:48 +0000 (23:45 +0100)]
split cli tests by subcommand and extend coverage
Sarah Hoffmann [Thu, 2 Dec 2021 14:54:24 +0000 (15:54 +0100)]
remove unnecessary pass statements
Sarah Hoffmann [Thu, 2 Dec 2021 14:46:36 +0000 (15:46 +0100)]
more unit tests for tokenizers
Sarah Hoffmann [Wed, 1 Dec 2021 19:48:29 +0000 (20:48 +0100)]
extend API unit tests
Sarah Hoffmann [Wed, 1 Dec 2021 19:27:40 +0000 (20:27 +0100)]
add tests for migration
Sarah Hoffmann [Wed, 1 Dec 2021 13:58:54 +0000 (14:58 +0100)]
more testing for refresh functions
Sarah Hoffmann [Wed, 1 Dec 2021 13:23:51 +0000 (14:23 +0100)]
more tests for exec utilities
Sarah Hoffmann [Wed, 1 Dec 2021 10:54:58 +0000 (11:54 +0100)]
add more tests for database import
Sarah Hoffmann [Wed, 1 Dec 2021 10:22:46 +0000 (11:22 +0100)]
add tests for adding additional data
Also adds checks that parameters for osm2pgsql are set
as expected.
Sarah Hoffmann [Wed, 1 Dec 2021 09:24:11 +0000 (10:24 +0100)]
add tests for flatten_config_file and formats other than YAML
Sarah Hoffmann [Tue, 30 Nov 2021 17:01:46 +0000 (18:01 +0100)]
tests: add fixture for making test project directory
Sarah Hoffmann [Tue, 30 Nov 2021 13:07:39 +0000 (14:07 +0100)]
generalize fixtures for cli tests
Sarah Hoffmann [Tue, 30 Nov 2021 11:03:16 +0000 (12:03 +0100)]
python test: move single-use fixtures to subdirectories
Sarah Hoffmann [Tue, 30 Nov 2021 10:23:00 +0000 (11:23 +0100)]
remove unused test files
Sarah Hoffmann [Tue, 30 Nov 2021 10:10:47 +0000 (11:10 +0100)]
organise python tests in subdirectories
The directories follow the same structure as the modules in
nominatim/.
Sarah Hoffmann [Thu, 25 Nov 2021 07:41:25 +0000 (08:41 +0100)]
Merge pull request #2530 from lonvia/declassify-highway
Change default rank for highway objects to 30
Sarah Hoffmann [Wed, 24 Nov 2021 13:40:23 +0000 (14:40 +0100)]
change default rank for highway objects to 30
The highway key is being used more and more for non-ways these
days. This clashes with Nominatim's assumption that essentially
everything that has a highway tag can be used as the street part
of the address.
Change the default rank of highway objects to 30 to avoid this.
Only the known values for streets keep the rank 26 and are now
listed explicitly.
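A sketch of the resulting rank rule. The set of street values below is an illustrative subset chosen for this example, not the list from the commit.

```python
# Illustrative subset of highway values that denote actual streets (assumption).
STREET_VALUES = {
    "motorway", "trunk", "primary", "secondary", "tertiary",
    "unclassified", "residential", "living_street", "service", "pedestrian",
}

def default_highway_rank(highway_value):
    """Return rank 26 for known street types and rank 30 for everything else."""
    return 26 if highway_value in STREET_VALUES else 30
```

An object tagged `highway=bus_stop` thus gets rank 30 and is no longer considered as the street part of an address.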
Sarah Hoffmann [Wed, 24 Nov 2021 15:23:41 +0000 (16:23 +0100)]
Merge pull request #2529 from lonvia/sort-street-results-by-tiger-housenumber
Take tiger housenumber into account when ranking street results
Sarah Hoffmann [Wed, 24 Nov 2021 10:05:04 +0000 (11:05 +0100)]
add migration for inclusive housenumber Tiger index
Sarah Hoffmann [Tue, 23 Nov 2021 19:24:08 +0000 (20:24 +0100)]
add index for Tiger housenumber queries
Sarah Hoffmann [Tue, 23 Nov 2021 19:04:50 +0000 (20:04 +0100)]
take Tiger housenumbers into account when ranking street results
Queries with a housenumber need to rank streets higher when they
have the requested housenumber attached. We already do that for
ordinary housenumber objects and for interpolations. This
adds support for Tiger housenumbers as well.
Fixes #2501.
Sarah Hoffmann [Sun, 21 Nov 2021 09:53:20 +0000 (10:53 +0100)]
Merge pull request #2528 from lonvia/allow-french-extra-housenumbers
Don't penalize French 'bis' housenumbers
Sarah Hoffmann [Fri, 19 Nov 2021 20:14:53 +0000 (21:14 +0100)]
Merge pull request #2526 from lonvia/docs-moving-database
Add a section about moving the database to another machine
Sarah Hoffmann [Fri, 19 Nov 2021 20:12:17 +0000 (21:12 +0100)]
don't penalize French 'bis' housenumbers
House numbers of the form '9 bis' are usual in France. So
be a bit more lenient before adding penalties to house numbers
with letters in them.
Fixes #2527.
Sarah Hoffmann [Fri, 19 Nov 2021 15:16:30 +0000 (16:16 +0100)]
Merge pull request #2525 from lonvia/fix-replication-indexer
Fix instantiation of indexer for replication
Sarah Hoffmann [Fri, 19 Nov 2021 15:11:32 +0000 (16:11 +0100)]
add a section about moving the database to another machine
Sarah Hoffmann [Fri, 19 Nov 2021 13:47:00 +0000 (14:47 +0100)]
only instantiate indexer once for replication
Also makes sure that the indexer object exists everywhere it is needed.
See #2518.
Sarah Hoffmann [Thu, 11 Nov 2021 06:42:42 +0000 (07:42 +0100)]
Merge pull request #2517 from lonvia/transliteration-special-chars
ICU: avoid non-alphanumerical characters in transliteration
Sarah Hoffmann [Wed, 10 Nov 2021 16:15:34 +0000 (17:15 +0100)]
make sure housenumbers are properly quoted
Sarah Hoffmann [Wed, 10 Nov 2021 16:14:13 +0000 (17:14 +0100)]
avoid special characters in word tokens
Transliteration should only consist of ASCII letters
and numbers. Avoid any other characters.
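A minimal sketch of restricting a transliterated token to ASCII letters and digits. Whether offending characters are dropped or replaced is not stated here; dropping them, and keeping spaces as word separators, are assumptions of this example.

```python
import string

# Characters assumed acceptable in a word token (an assumption of this sketch).
ALLOWED = set(string.ascii_letters + string.digits + " ")

def clean_token(transliterated):
    """Drop every character that is not an ASCII letter, digit or space."""
    return "".join(ch for ch in transliterated if ch in ALLOWED)
```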
Sarah Hoffmann [Wed, 10 Nov 2021 12:27:09 +0000 (13:27 +0100)]
Merge pull request #2516 from lonvia/test-for-website-dir
Better error reporting when API script does not exist
Sarah Hoffmann [Wed, 10 Nov 2021 08:42:49 +0000 (09:42 +0100)]
better error reporting when API script does not exist
Check if the API script exists in the expected location before
running php-cli. This way we can add a useful hint about the
project directory.
Fixes #2513.
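The check can be sketched like this. The `website/<endpoint>.php` layout, the function name and the wording of the hint are assumptions for illustration.

```python
from pathlib import Path

def find_api_script(project_dir, endpoint):
    """Return the path of the API script, or fail with a hint about the project directory."""
    # Assumed layout: <project_dir>/website/<endpoint>.php
    script = Path(project_dir) / "website" / f"{endpoint}.php"
    if not script.is_file():
        raise FileNotFoundError(
            f"Cannot find API script '{script}'. "
            "Make sure to run the command from the project directory."
        )
    return script
```

Only when the function returns would the caller go on to invoke php-cli with the resulting path.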
Sarah Hoffmann [Sat, 6 Nov 2021 11:11:55 +0000 (12:11 +0100)]
Merge pull request #2511 from lonvia/fix-combination-error-needs-address
Fix boolean combination of NeedsAddress flag
Sarah Hoffmann [Fri, 5 Nov 2021 21:18:37 +0000 (22:18 +0100)]
fix combination of NeedsAddress flag
When dealing with multiple partial terms, only keep the
flag when all partial terms are so frequent as to need
an address.
Fixes #2510.
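The corrected combination amounts to a logical AND over all partial terms. The sketch below uses hypothetical names and a hypothetical frequency threshold.

```python
def needs_address(partial_term_counts, max_count):
    """Return True only if every partial term is more frequent than max_count."""
    # With no partial terms there is nothing that would require an address.
    if not partial_term_counts:
        return False
    return all(count > max_count for count in partial_term_counts)
```

A single rare partial term is enough to clear the flag, because that term can narrow down the search on its own.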
Sarah Hoffmann [Mon, 1 Nov 2021 11:14:53 +0000 (12:14 +0100)]
prepare release 4.0.0
Sarah Hoffmann [Tue, 2 Nov 2021 10:09:17 +0000 (11:09 +0100)]
fix typo
Sarah Hoffmann [Mon, 1 Nov 2021 15:12:23 +0000 (16:12 +0100)]
Merge pull request #2502 from lonvia/improve-development-documentation
Extend developer's documentation
Sarah Hoffmann [Mon, 1 Nov 2021 10:04:03 +0000 (11:04 +0100)]
docs: add overview over indexing
Sarah Hoffmann [Fri, 29 Oct 2021 10:03:22 +0000 (12:03 +0200)]
docs: section about database layout
Replaces the import description, which by now was basically
table layout only.
Sarah Hoffmann [Thu, 28 Oct 2021 13:28:47 +0000 (15:28 +0200)]
Merge pull request #2498 from lonvia/ordering-for-unlisted-place-results
Include unlisted places in ordering by housenumber
Sarah Hoffmann [Thu, 28 Oct 2021 09:33:34 +0000 (11:33 +0200)]
Merge pull request #2497 from lonvia/docs-maintenance
docs: add new maintenance section
Sarah Hoffmann [Thu, 28 Oct 2021 09:27:31 +0000 (11:27 +0200)]
include unlisted places in ordering by housenumber
When ordering results by the fact that they have a housenumber,
also take cases into account where the housenumber is on the
place itself. This may happen when the search includes the name
of the place and the housenumber or for addr:place addresses
where the place is unlisted.
Sarah Hoffmann [Wed, 27 Oct 2021 18:59:45 +0000 (20:59 +0200)]
docs: add new maintenance section
Currently used for postcode updates, word count updates and
deleted relations.
Sarah Hoffmann [Wed, 27 Oct 2021 12:40:42 +0000 (14:40 +0200)]
Merge pull request #2495 from lonvia/fix-normalization-in-php
ICU: use correct normalization during search
Sarah Hoffmann [Wed, 27 Oct 2021 08:07:19 +0000 (10:07 +0200)]
ICU: use normalization from config in PHP
The TERM_NORMALIZATION config option is no longer applicable.
That was already documented but not yet implemented.
Sarah Hoffmann [Tue, 26 Oct 2021 15:29:03 +0000 (17:29 +0200)]
bdd: add tests for non-latin scripts
Sarah Hoffmann [Tue, 26 Oct 2021 15:00:43 +0000 (17:00 +0200)]
Merge pull request #2493 from lonvia/handle-frequent-partials
Tune search queries with frequent partial words
Sarah Hoffmann [Tue, 26 Oct 2021 10:07:13 +0000 (12:07 +0200)]
adapt BDD tests to stricter partial search
Sarah Hoffmann [Tue, 26 Oct 2021 09:42:42 +0000 (11:42 +0200)]
do not count words when in reverse-only mode
Sarah Hoffmann [Tue, 26 Oct 2021 08:57:51 +0000 (10:57 +0200)]
further refactor setup to keep function small
Sarah Hoffmann [Tue, 26 Oct 2021 08:28:28 +0000 (10:28 +0200)]
searches for house numbers must have an address
Sarah Hoffmann [Tue, 26 Oct 2021 08:23:55 +0000 (10:23 +0200)]
disallow search for partials without address
Very frequent partial terms take too long to look up and
do not return any valuable results unless the search is
further narrowed down by an address.
Sarah Hoffmann [Tue, 26 Oct 2021 07:37:57 +0000 (09:37 +0200)]
make word count computation part of the import
Accurate word counts are now essential when using
the ICU tokenizer and don't hurt for the legacy one.
Adds about an hour import time.
Sarah Hoffmann [Tue, 26 Oct 2021 08:32:43 +0000 (10:32 +0200)]
actions: move ICU tests into their own run
Sarah Hoffmann [Mon, 25 Oct 2021 19:45:08 +0000 (21:45 +0200)]
Merge pull request #2486 from lonvia/fix-special-phrases
Fix parsing of operator in special phrases
Sarah Hoffmann [Mon, 25 Oct 2021 19:33:27 +0000 (21:33 +0200)]
ICU: add an index over word_ids
Needed for keyword lookup in the details response.
Sarah Hoffmann [Mon, 25 Oct 2021 17:51:20 +0000 (19:51 +0200)]
be case-insensitive about special phrase operator
Sarah Hoffmann [Mon, 25 Oct 2021 17:46:30 +0000 (19:46 +0200)]
fix parsing of operator in special phrases
Because of unstripped input, the operators wouldn't match.
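A sketch of the failure mode and its fix: an operator read with surrounding whitespace never equals the expected value until it is stripped. The set of valid operators and the function name are assumptions; the lower-casing mirrors the separate case-insensitivity commit listed just before this one.

```python
# Operators assumed to be meaningful for special phrases (assumption).
VALID_OPERATORS = {"in", "near"}

def parse_operator(raw):
    """Return the cleaned operator, or None if it is not a known one."""
    operator = raw.strip().lower()
    return operator if operator in VALID_OPERATORS else None
```

Without the `strip()` call, an input such as `" near "` would fall through to the unknown case.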