git.openstreetmap.org Git - nominatim.git/log
Sarah Hoffmann [Fri, 23 Apr 2021 13:49:38 +0000 (15:49 +0200)]
introduce external processing in indexer
Indexing is now split into three parts: first a preparation step
that collects the necessary information from the database and
returns it to Python. In a second step the data is transformed
within Python as necessary and then returned to the database
through the usual UPDATE which now not only sets the indexed_status
but also other fields. The third step comprises the address
computation which is still done inside the update trigger in
the database.
The second processing step doesn't do anything useful yet.
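The three-part split described above might look roughly like the following minimal sketch. It uses plain Python lists in place of the database, and every function name in it is hypothetical rather than taken from the Nominatim code base.

```python
# Illustrative sketch only: a list of dicts stands in for the placex table.

def prepare(rows):
    """Step 1: collect the information needed for rows that are not yet indexed."""
    return [{"place_id": r["place_id"], "name": r["name"]}
            for r in rows if r["indexed_status"] > 0]

def transform(place):
    """Step 2: process the data in Python (a placeholder that changes nothing)."""
    return dict(place)

def write_back(rows, processed):
    """Return the data through an UPDATE-like step that also sets indexed_status.

    Step 3, the address computation, would run in the database trigger
    fired by the real UPDATE and is not modelled here.
    """
    by_id = {r["place_id"]: r for r in rows}
    for place in processed:
        row = by_id[place["place_id"]]
        row["name"] = place["name"]
        row["indexed_status"] = 0

def run_indexing(rows):
    processed = [transform(p) for p in prepare(rows)]
    write_back(rows, processed)
    return len(processed)

table = [
    {"place_id": 1, "name": "Newport", "indexed_status": 1},
    {"place_id": 2, "name": "Bristol", "indexed_status": 0},
]
count = run_indexing(table)
```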
Sarah Hoffmann [Thu, 22 Apr 2021 20:47:34 +0000 (22:47 +0200)]
move word table and normalisation SQL into tokenizer
Creating and populating the word table is now the responsibility
of the tokenizer.
The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
Sarah Hoffmann [Wed, 21 Apr 2021 13:38:52 +0000 (15:38 +0200)]
add migration for configurable tokenizer
Adds a migration that initialises a legacy tokenizer for
an existing database. The migration is not active yet as
it will need completion when more functionality is added
to the legacy tokenizer.
Sarah Hoffmann [Wed, 21 Apr 2021 13:00:37 +0000 (15:00 +0200)]
move module installation to legacy tokenizer
Sarah Hoffmann [Wed, 21 Apr 2021 07:57:17 +0000 (09:57 +0200)]
introduce tokenizer modules
This adds the boilerplate for selecting configurable tokenizers.
A tokenizer can be chosen at import time and will then install
itself such that it is fixed for the given database import even
when the software itself is updated.
The legacy tokenizer implements Nominatim's traditional algorithms.
Sarah Hoffmann [Fri, 30 Apr 2021 09:19:35 +0000 (11:19 +0200)]
Merge pull request #2303 from lonvia/remove-aux-support
Remove support for AUX housenumber tables
Sarah Hoffmann [Fri, 30 Apr 2021 08:08:29 +0000 (10:08 +0200)]
remove support for AUX housenumber tables
These tables have never been actively maintained and the code is
completely untested. With the upcoming changes, it is unlikely
that the code remains usable.
This removes the aux tables and all code that references them.
Sarah Hoffmann [Tue, 27 Apr 2021 10:18:45 +0000 (12:18 +0200)]
Merge pull request #2299 from lonvia/update-actions
Fix database check for reverse-only
Sarah Hoffmann [Tue, 27 Apr 2021 09:57:05 +0000 (11:57 +0200)]
Merge pull request #2291 from AntoJvlt/special-phrases-statistics
Special phrases statistics
Sarah Hoffmann [Tue, 27 Apr 2021 08:14:26 +0000 (10:14 +0200)]
do not check for extra housenumber index for reverse-only
Also adds a database check for reverse only import to the CI.
Sarah Hoffmann [Mon, 26 Apr 2021 21:01:06 +0000 (23:01 +0200)]
add tests for different scripts
Sarah Hoffmann [Mon, 26 Apr 2021 09:21:44 +0000 (11:21 +0200)]
Merge pull request #2298 from lonvia/add-warming-to-ci
Add warming to CI import tests and fix more Python 3.5 compatibility issues
Sarah Hoffmann [Mon, 26 Apr 2021 08:16:05 +0000 (10:16 +0200)]
avoid Path in subprocess parameters
Not supported by Python 3.5.
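A minimal sketch of the workaround, assuming the fix is simply to convert `Path` objects to strings before they reach `subprocess`. The command being run is made up for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Path to the running interpreter, held as a Path object.
interpreter = Path(sys.executable)

# Convert to str explicitly so the call does not rely on
# path-like object support in subprocess.
result = subprocess.run([str(interpreter), "-c", "print('ok')"],
                        stdout=subprocess.PIPE, check=True)
output = result.stdout.decode().strip()
```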
Sarah Hoffmann [Mon, 26 Apr 2021 07:54:09 +0000 (09:54 +0200)]
add warming to CI import test
AntoJvlt [Sun, 25 Apr 2021 15:56:12 +0000 (17:56 +0200)]
Switching to log info and only send warning for invalid phrases
AntoJvlt [Thu, 22 Apr 2021 15:34:35 +0000 (17:34 +0200)]
Implemented statistics for the import of special phrases through the SpecialPhrasesImporterStatistics class
AntoJvlt [Wed, 21 Apr 2021 15:11:57 +0000 (17:11 +0200)]
reorganization of folder/file for the special phrases importer
Sarah Hoffmann [Sat, 24 Apr 2021 13:35:00 +0000 (15:35 +0200)]
Merge pull request #2297 from lonvia/update-deployment-docs
docs: update deployment to use project directory
Sarah Hoffmann [Sat, 24 Apr 2021 13:03:28 +0000 (15:03 +0200)]
Merge pull request #2296 from lonvia/disable-too-few-public-methods-check
pylint: disable too-few-public-methods check
Sarah Hoffmann [Sat, 24 Apr 2021 13:00:18 +0000 (15:00 +0200)]
docs: update deployment to use project directory
Fixes #2295.
Sarah Hoffmann [Sat, 24 Apr 2021 09:44:36 +0000 (11:44 +0200)]
fix pylint complaints
Sarah Hoffmann [Sat, 24 Apr 2021 09:39:44 +0000 (11:39 +0200)]
pylint: disable check too-few-public-methods
Sarah Hoffmann [Sat, 24 Apr 2021 07:20:28 +0000 (09:20 +0200)]
Merge pull request #2293 from darkshredder/update-manpage
Updated manual page
Sarah Hoffmann [Fri, 23 Apr 2021 21:33:15 +0000 (23:33 +0200)]
Merge pull request #2294 from lonvia/update-actions
CI: add import test against Python 3.5 and fix discovered issues
Sarah Hoffmann [Fri, 23 Apr 2021 13:45:54 +0000 (15:45 +0200)]
actions: add import on ubuntu 18.04
This uses the oldest possible dependencies wherever available.
Sarah Hoffmann [Fri, 23 Apr 2021 20:27:12 +0000 (22:27 +0200)]
indexes with includes are not available for postgresql < 11
Sarah Hoffmann [Fri, 23 Apr 2021 20:18:55 +0000 (22:18 +0200)]
use group() for regex matches
Needed for compatibility with Python 3.5.
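For illustration, a short sketch of the portable form. Subscripting a match object (`match[0]`) only works from Python 3.6 on, while `group()` works on older versions too. The pattern and input string are invented for the example.

```python
import re

match = re.match(r"(\d+)\.(\d+)", "3.5")

# group() works on all Python 3 versions; match[0] needs 3.6 or later.
full = match.group(0)
major = int(match.group(1))
minor = int(match.group(2))
```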
Sarah Hoffmann [Fri, 23 Apr 2021 19:57:05 +0000 (21:57 +0200)]
use pathlib version of open
Sarah Hoffmann [Fri, 23 Apr 2021 19:49:41 +0000 (21:49 +0200)]
subprocess needs string argument
Compatibility change for Python 3.5.
Sarah Hoffmann [Fri, 23 Apr 2021 19:42:24 +0000 (21:42 +0200)]
check for existence of custom .env before opening
Sarah Hoffmann [Fri, 23 Apr 2021 19:10:19 +0000 (21:10 +0200)]
use more generic ImportError to check for module
ModuleNotFoundError was only introduced in Python 3.6.
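A minimal sketch of the pattern. `ModuleNotFoundError` is a subclass of `ImportError`, so catching the parent class works on every Python 3 version. The helper name `module_available` is hypothetical.

```python
def module_available(name):
    """Return True if the named module can be imported."""
    try:
        __import__(name)
    except ImportError:
        # Also covers ModuleNotFoundError on Python 3.6 and later.
        return False
    return True
```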
Sarah Hoffmann [Fri, 23 Apr 2021 18:53:00 +0000 (20:53 +0200)]
replace usages of fromisoformat() with strptime()
fromisoformat was only introduced with Python 3.7 while we
still support Python 3.5.
Fixes #2292.
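A sketch of the replacement, assuming timestamps of the form `YYYY-MM-DDTHH:MM:SS`. The exact format string used in the code base is not stated in the commit message, and the helper name `parse_iso` is hypothetical.

```python
from datetime import datetime

def parse_iso(text):
    """Parse a timestamp without datetime.fromisoformat() (Python 3.7+)."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")

parsed = parse_iso("2021-04-23T18:53:00")
```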
Sarah Hoffmann [Fri, 23 Apr 2021 18:27:14 +0000 (20:27 +0200)]
remove argparse dependency for vagrant scripts
Users don't need to recreate the manpage.
Darkshredder [Fri, 23 Apr 2021 20:12:38 +0000 (01:42 +0530)]
Updated manual page
Sarah Hoffmann [Thu, 22 Apr 2021 15:31:00 +0000 (17:31 +0200)]
bdd tests: fix place-dependent ranking tests
The ranks of places may differ for some countries. Force the
place nodes in the test onto Null Island, which always uses the
default ranking.
Sarah Hoffmann [Thu, 22 Apr 2021 15:12:25 +0000 (17:12 +0200)]
Merge pull request #2288 from RhinoDevel/patch-1
Replace "nominatim-update" with "nominatim".
RhinoDevel [Thu, 22 Apr 2021 13:40:22 +0000 (15:40 +0200)]
Replace "nominatim-update" with "nominatim".
If I am not mistaken, the correct command to index imported data via the command line is "nominatim index".
Sarah Hoffmann [Wed, 21 Apr 2021 08:33:45 +0000 (10:33 +0200)]
indexer: reset query counter
Reset the counter for queries after the asynchronous connections
have been reopened.
Sarah Hoffmann [Tue, 20 Apr 2021 13:34:14 +0000 (15:34 +0200)]
Merge pull request #2285 from lonvia/split-indexer-code
Rework indexer code
Sarah Hoffmann [Tue, 20 Apr 2021 09:16:12 +0000 (11:16 +0200)]
factor out async connection handling into separate class
Also adds a test for reconnecting regularly while indexing.
Sarah Hoffmann [Mon, 19 Apr 2021 16:15:09 +0000 (18:15 +0200)]
indexer: make self.conn function-local
Also switches to our internal connect function which gives us
a cursor with a scalar() function.
Sarah Hoffmann [Mon, 19 Apr 2021 16:00:28 +0000 (18:00 +0200)]
make index() function private
Sarah Hoffmann [Mon, 19 Apr 2021 15:34:26 +0000 (17:34 +0200)]
move analyse function into indexing function
Sarah Hoffmann [Mon, 19 Apr 2021 15:20:31 +0000 (17:20 +0200)]
indexer: move runner into separate file
Sarah Hoffmann [Mon, 19 Apr 2021 16:28:04 +0000 (18:28 +0200)]
Merge pull request #2284 from lonvia/cleanup-word-frequency-computation
Rename and simplify function for word pre-computation
Sarah Hoffmann [Mon, 19 Apr 2021 14:54:22 +0000 (16:54 +0200)]
simplify token precomputation
Rename function to reflect that it is only used for precomputation.
The token IDs are not really needed, so don't bother to compute
the array of tokens.
Sarah Hoffmann [Mon, 19 Apr 2021 14:40:57 +0000 (16:40 +0200)]
remove unused word recomputation script
Has been replaced by a script recomputing counts from search_name.
Sarah Hoffmann [Mon, 19 Apr 2021 11:56:36 +0000 (13:56 +0200)]
Merge pull request #2283 from darkshredder/tiger-data-test-fix
Fix: tiger-data tarfile test
Darkshredder [Mon, 19 Apr 2021 10:23:01 +0000 (15:53 +0530)]
Fix: tiger-data tarfile test
Sarah Hoffmann [Mon, 19 Apr 2021 10:14:25 +0000 (12:14 +0200)]
Merge pull request #2282 from lonvia/add-paths-to-config
Include software paths in Python config object
Sarah Hoffmann [Mon, 19 Apr 2021 08:01:17 +0000 (10:01 +0200)]
simplify sql and website creation functions
Sarah Hoffmann [Mon, 19 Apr 2021 07:38:17 +0000 (09:38 +0200)]
simplify constructor for SQL preprocessor
Use sql path from config.
Sarah Hoffmann [Mon, 19 Apr 2021 07:23:37 +0000 (09:23 +0200)]
simplify interface for adding tiger data
Also simplifies tests using existing fixtures.
Sarah Hoffmann [Mon, 19 Apr 2021 07:06:42 +0000 (09:06 +0200)]
add library directories to config
Allows reducing the number of parameters in functions that take
the config anyway.
Sarah Hoffmann [Mon, 19 Apr 2021 06:42:59 +0000 (08:42 +0200)]
Merge pull request #2281 from changpingc/changping/fix-tiger-index
fix index on location_property_tiger (parent_place_id)
Channgping Chen [Mon, 19 Apr 2021 00:01:01 +0000 (00:01 +0000)]
fix index on location_property_tiger (parent_place_id)
Looks like
2af82975cd968ec09683ae5b16a9aa157a7f2176
accidentally renamed an index. Because of the added "if not
exists" clause, the index doesn't get created. This
significantly slows down reverse queries because they now
require full scans on location_property_tiger.
Without this fix, reverse queries can take 8s on a full
planet install on an r5.8xlarge instance in EC2.
Sarah Hoffmann [Sun, 18 Apr 2021 09:57:19 +0000 (11:57 +0200)]
Merge pull request #2280 from AntoJvlt/Fix-special-phrases-import-and-tests-cleaning
Fix regex and sanity check for the import of special phrases and tests cleaning.
AntoJvlt [Sat, 17 Apr 2021 17:45:24 +0000 (19:45 +0200)]
Only log a warning if a wrong input is detected on the wiki while importing special phrases
AntoJvlt [Sat, 17 Apr 2021 17:24:13 +0000 (19:24 +0200)]
Fix occurrence regex
AntoJvlt [Sat, 17 Apr 2021 17:23:33 +0000 (19:23 +0200)]
Cleaned tests and add database cleaning tests on test_import_from_wiki
Sarah Hoffmann [Sat, 17 Apr 2021 09:51:21 +0000 (11:51 +0200)]
Merge pull request #2279 from lonvia/add-index-for-continued-indexing
Add index for continued indexing
Sarah Hoffmann [Sat, 17 Apr 2021 09:10:36 +0000 (11:10 +0200)]
add tests for continuing import
Sarah Hoffmann [Sat, 17 Apr 2021 09:07:04 +0000 (11:07 +0200)]
add support index when continuing import at index phase
The placex table is scanned sequentially during indexing
on the initial import. That is okay because we know that all
rows need to be processed anyway. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
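The idea of an index restricted to unindexed rows can be sketched with a partial index. The example uses SQLite as a stand-in for PostgreSQL so that it is self-contained. The index name and the reduced table layout are assumptions, not the definitions used by Nominatim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the placex table.
conn.execute("CREATE TABLE placex ("
             "place_id INTEGER PRIMARY KEY, indexed_status INTEGER)")
conn.executemany("INSERT INTO placex VALUES (?, ?)",
                 [(1, 0), (2, 1), (3, 0), (4, 2)])

# Partial index: only rows that still need indexing are included,
# so finding pending rows does not require visiting indexed ones.
conn.execute("CREATE INDEX idx_placex_pending ON placex (place_id) "
             "WHERE indexed_status > 0")

pending = [row[0] for row in conn.execute(
    "SELECT place_id FROM placex WHERE indexed_status > 0 ORDER BY place_id")]
```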
Sarah Hoffmann [Sat, 17 Apr 2021 08:13:33 +0000 (10:13 +0200)]
Merge pull request #2278 from lonvia/remove-transistion-functions
Remove transition functions
Sarah Hoffmann [Fri, 16 Apr 2021 16:41:14 +0000 (18:41 +0200)]
remove transition functions from Python
Sarah Hoffmann [Fri, 16 Apr 2021 15:40:43 +0000 (17:40 +0200)]
Merge pull request #2277 from lonvia/update-osm2pgsql
Update osm2pgsql to current master
Sarah Hoffmann [Fri, 16 Apr 2021 15:28:51 +0000 (17:28 +0200)]
remove PHP code for transition functions
Sarah Hoffmann [Fri, 16 Apr 2021 15:09:40 +0000 (17:09 +0200)]
remove installation of PHP util scripts
Sarah Hoffmann [Fri, 16 Apr 2021 14:57:04 +0000 (16:57 +0200)]
Merge pull request #2276 from lonvia/port-country-code-creation-to-python
Port country code creation to python
Sarah Hoffmann [Fri, 16 Apr 2021 13:37:53 +0000 (15:37 +0200)]
add test for new postcode import function
Sarah Hoffmann [Fri, 16 Apr 2021 13:05:40 +0000 (15:05 +0200)]
port function to compute initial postcodes to Python
Sarah Hoffmann [Fri, 16 Apr 2021 13:04:10 +0000 (15:04 +0200)]
Merge pull request #2275 from lonvia/switch-to-absolute-imports
Use absolute imports in Python code
Sarah Hoffmann [Fri, 16 Apr 2021 12:20:09 +0000 (14:20 +0200)]
use absolute imports in Python code
Relative imports are no longer officially recommended.
Sarah Hoffmann [Thu, 15 Apr 2021 08:24:01 +0000 (10:24 +0200)]
update osm2pgsql to current master (fixes version output)
Sarah Hoffmann [Thu, 15 Apr 2021 08:13:25 +0000 (10:13 +0200)]
Merge pull request #2263 from AntoJvlt/special-phrases-autoupdate
Implemented auto update of special phrases while importing them
Sarah Hoffmann [Thu, 15 Apr 2021 08:12:53 +0000 (10:12 +0200)]
Merge pull request #2270 from lonvia/simplify-place-boundary-merge
Simplify matching between place and boundary names
Sarah Hoffmann [Wed, 14 Apr 2021 07:58:14 +0000 (09:58 +0200)]
adapt database check to new index layout
Sarah Hoffmann [Fri, 9 Apr 2021 19:24:35 +0000 (21:24 +0200)]
add migration for new placenode geometry index
Sarah Hoffmann [Fri, 9 Apr 2021 19:10:00 +0000 (21:10 +0200)]
simplify name matching between boundary and place node
Instead of normalising the names simply compare them in lower
case. This removes the dependency on the tokenizer for
linking boundaries and nodes. When looking up the linked places
by place type also allow that one name is simply contained in the
other. This catches the frequent case where one of the names has
an addendum (e.g. Newport vs. City of Newport).
Drops the special index for the name lookup and instead relies
on a slightly extended version of the geometry index used for
reverse lookup. Saves around 100MB on a planet.
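The matching rules described above can be illustrated in Python, although the commit itself changes database code. Both function names are hypothetical, and the sketch covers only the comparison logic, not the lookup by place type or geometry.

```python
def names_equal(a, b):
    """Strict match: compare the names in lower case."""
    return a.lower() == b.lower()

def names_contained(a, b):
    """Looser match: one lower-cased name is contained in the other."""
    a, b = a.lower(), b.lower()
    return a in b or b in a

strict = names_equal("Newport", "City of Newport")
loose = names_contained("Newport", "City of Newport")
```

The looser rule accepts the addendum case from the commit message, while the strict comparison rejects it.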
Sarah Hoffmann [Wed, 14 Apr 2021 15:50:02 +0000 (17:50 +0200)]
Merge pull request #2269 from lonvia/fix-actions
github actions: reintroduce postgresql repo
Sarah Hoffmann [Wed, 14 Apr 2021 14:19:49 +0000 (16:19 +0200)]
github actions: reintroduce postgresql repo
Sarah Hoffmann [Wed, 14 Apr 2021 08:56:12 +0000 (10:56 +0200)]
Merge pull request #2264 from darkshredder/tiger-data-tests
Fix: Error if last statement is wrong and improved tests in tiger data import
Darkshredder [Tue, 13 Apr 2021 09:36:02 +0000 (15:06 +0530)]
Fix: Removed error if endstatement is wrong and improved tests
AntoJvlt [Mon, 12 Apr 2021 12:10:30 +0000 (14:10 +0200)]
Tests added for the auto update of special phrases during import
AntoJvlt [Mon, 12 Apr 2021 09:55:18 +0000 (11:55 +0200)]
Implemented auto update of special phrases while importing them
Sarah Hoffmann [Sun, 11 Apr 2021 21:09:45 +0000 (23:09 +0200)]
Merge pull request #2260 from AntoJvlt/fix-load-languages-special-phrases
Fix default languages loading for special phrases import
AntoJvlt [Sun, 11 Apr 2021 20:26:31 +0000 (22:26 +0200)]
Fix default languages loading
Sarah Hoffmann [Sat, 10 Apr 2021 19:19:55 +0000 (21:19 +0200)]
Merge pull request #2258 from darkshredder/code-coverage
Disabled Code coverage status checks
Darkshredder [Sat, 10 Apr 2021 16:58:29 +0000 (22:28 +0530)]
CodeCov comment only when code coverage changes
Darkshredder [Sat, 10 Apr 2021 15:14:52 +0000 (20:44 +0530)]
Disabled Coverage status checks
Sarah Hoffmann [Sat, 10 Apr 2021 14:57:39 +0000 (16:57 +0200)]
add badge for codecov
Sarah Hoffmann [Sat, 10 Apr 2021 14:37:12 +0000 (16:37 +0200)]
Merge pull request #2252 from darkshredder/code-coverage
Added Code coverage support using Codecov
Sarah Hoffmann [Fri, 9 Apr 2021 15:48:28 +0000 (17:48 +0200)]
split LANGUAGES parameter before use
The user supplies the languages as a comma-separated list.
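A minimal sketch of such a split. Stripping whitespace and dropping empty entries are assumptions about the desired behaviour, and the helper name `split_languages` is hypothetical.

```python
def split_languages(value):
    """Turn a comma-separated language string into a list of codes."""
    return [lang.strip() for lang in value.split(",") if lang.strip()]

languages = split_languages("en, de,fr")
```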
Sarah Hoffmann [Thu, 8 Apr 2021 09:01:19 +0000 (11:01 +0200)]
add migration information for new configuration format
Sarah Hoffmann [Thu, 8 Apr 2021 08:54:16 +0000 (10:54 +0200)]
Merge pull request #2256 from lonvia/remove-reverseinplan-option
Remove ReverseInPlan option
Sarah Hoffmann [Thu, 8 Apr 2021 08:35:14 +0000 (10:35 +0200)]
remove special handling for reversed queries in getGroupedSearches
getGroupedSearches is guaranteed not to be called with reversed
structured queries, so there is no need to have special exclusion
code.
Sarah Hoffmann [Thu, 8 Apr 2021 08:19:27 +0000 (10:19 +0200)]
remove reverseInPlan option from Geocode
Disabling query reversal is no longer possible in the configuration,
so there is no need to keep this as an option. Reversal is
automatically disabled for structured search only.
Sarah Hoffmann [Tue, 6 Apr 2021 19:23:29 +0000 (21:23 +0200)]
prepare 3.7.0 release
Sarah Hoffmann [Tue, 6 Apr 2021 14:09:53 +0000 (16:09 +0200)]
docs: minor spelling corrections
Sarah Hoffmann [Tue, 6 Apr 2021 13:56:08 +0000 (15:56 +0200)]
docs: unpacking tiger data is no longer necessary