Sarah Hoffmann [Fri, 9 Apr 2021 19:10:00 +0000 (21:10 +0200)]
simplify name matching between boundary and place node
Instead of normalising the names, simply compare them in lower
case. This removes the dependency on the tokenizer for
linking boundaries and nodes. When looking up the linked places
by place type also allow that one name is simply contained in the
other. This catches the frequent case where one of the names has
an addendum (e.g. Newport vs. City of Newport).
Drops the special index for the name lookup and instead relies
on a slightly extended version of the geometry index used for
reverse lookup. Saves around 100MB on a planet.
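The comparison described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the commit; the function name names_match and the allow_containment flag are hypothetical.

```python
def names_match(boundary_name: str, place_name: str,
                allow_containment: bool = False) -> bool:
    """Compare two names in lower case (hypothetical helper).

    With allow_containment=True, one name being contained in the
    other also counts as a match, e.g. "Newport" vs. "City of Newport".
    """
    a = boundary_name.lower()
    b = place_name.lower()
    # Assumption: empty names never match. Without this guard the
    # containment test below would accept them, since "" is a
    # substring of every string.
    if not a or not b:
        return False
    if a == b:
        return True
    return allow_containment and (a in b or b in a)
```

Called with allow_containment=True, "Newport" and "City of Newport" are treated as the same name; with the default, only names that are equal in lower case match.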
Sarah Hoffmann [Thu, 8 Apr 2021 08:19:27 +0000 (10:19 +0200)]
remove reverseInPlan option from Geocode
Disabling query reversal is no longer possible in the configuration,
so there is no need to keep this as an option. Reversal is
automatically disabled for structured search only.
Sarah Hoffmann [Thu, 1 Apr 2021 12:29:34 +0000 (14:29 +0200)]
use non-key index to speed up housenumber search
On PostgreSQL versions 11+ add an index to speed up the lookup
of housenumbers for terms found in search_name. This is really
just a band-aid around the query planner's interpretation of the
query.
Sarah Hoffmann [Mon, 29 Mar 2021 10:06:51 +0000 (12:06 +0200)]
allow sorting by housenumbers for rare street names
Usually we don't narrow down search results by house number when
only a street name is given because there may be a lot of rows
to cross check when the street name is very frequent. However,
when it is known to be rare, the housenumber check may be done
anyway.
Sarah Hoffmann [Sun, 21 Mar 2021 15:47:22 +0000 (16:47 +0100)]
avoid division by zero in progress meter
On Windows systems the timer may not be accurate enough to measure
the time between init() and done(). Avoid computing statistics with
a diff time of 0 in such cases.
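A minimal sketch of such a guard, assuming the statistics are a simple items-per-second rate. The function name items_per_second is hypothetical and not taken from the progress meter itself.

```python
from typing import Optional


def items_per_second(done_items: int, elapsed_seconds: float) -> Optional[float]:
    """Return the processing rate, or None when no time was measured.

    On a coarse timer the start and end timestamps can be identical,
    so the elapsed time is 0 and the division has to be skipped.
    """
    if elapsed_seconds <= 0:
        return None
    return done_items / elapsed_seconds
```

The caller can then leave the rate out of the final log line when None is returned.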
AntoJvlt [Sat, 20 Mar 2021 17:55:08 +0000 (18:55 +0100)]
Ported functions for the import of special phrases from PHP to Python.
- the command is now --import-special-phrases
- the output is no longer an SQL file; the data is imported directly into the database.
- the short passage in the documentation (section data import) has been updated.
Sarah Hoffmann [Tue, 16 Mar 2021 21:13:33 +0000 (22:13 +0100)]
bdd: run all setup via nominatim Python library
Drops all calls to PHP utility functions. nominatim cli functions
are used where possible, to stay as close to the final code as
possible with the tests.
By removing the PHP calls, the test code now only uses osm2pgsql and
the database module from the build directory.
Sarah Hoffmann [Thu, 11 Mar 2021 19:34:21 +0000 (20:34 +0100)]
higher penalty for special searches
Adds a generally higher penalty for special search terms and an
additional one if the term is anywhere but the beginning or the
end. Also, housenumbers and special searches together are
considered less likely.
Sarah Hoffmann [Thu, 11 Mar 2021 16:14:46 +0000 (17:14 +0100)]
fix result splitting for last search group
When we are in the final iteration of the search groups, it is not
possible to further delay the results. Unconditionally use the
results with the best rank instead.
Sarah Hoffmann [Thu, 11 Mar 2021 14:03:36 +0000 (15:03 +0100)]
give preference to full words in address, too
Full word terms are already preferred for the name part. Adding
only one-word partials to the address makes it impossible to
give a similar preference for the address part. Each term adds
a rank penalty. The problem here is that we interpret the query
forwards and backwards. Having different penalty systems for
name and address means that the same term ends up with different
penalties and that often leads to interpretations in the wrong
direction getting in the way.
Sarah Hoffmann [Tue, 2 Mar 2021 20:26:13 +0000 (21:26 +0100)]
automatic migration from 3.6 release
Adds an 'admin --migrate' command that checks for the current
database version and runs any necessary migrations. Also
includes migrations going back to 3.6.
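The general idea of version-gated migrations might look like the sketch below. All names (MIGRATIONS, migration, migrate), the four-part version tuples and the two example steps are assumptions made for illustration and do not reproduce the code in this commit.

```python
# Hypothetical registry of (version, function) pairs.
MIGRATIONS = []


def migration(version):
    """Register a function as the migration step for `version`."""
    def decorator(func):
        MIGRATIONS.append((version, func))
        return func
    return decorator


@migration((3, 6, 0, 1))
def add_example_index(log):
    # Placeholder step; a real migration would change the database.
    log.append("add_example_index")


@migration((3, 7, 0, 0))
def add_example_column(log):
    log.append("add_example_column")


def migrate(db_version, log):
    """Run every registered step newer than db_version, oldest first.

    Returns the version the database is at afterwards.
    """
    for version, func in sorted(MIGRATIONS, key=lambda m: m[0]):
        if version > db_version:
            func(log)
            db_version = version
    return db_version
```

Starting from (3, 6, 0, 0), both example steps run in order. Starting from (3, 7, 0, 0), nothing runs, so the command can be called repeatedly without harm.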
Sarah Hoffmann [Thu, 4 Mar 2021 09:55:24 +0000 (10:55 +0100)]
port index creation to python
Also switches to jinja-based preprocessing, which allows the
SQL files to be simplified. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
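Why 'if not exists' makes the step safe to rerun can be shown with a small self-contained sketch. It uses an in-memory SQLite database from the Python standard library as a stand-in for PostgreSQL, and the table and index names are made up for the example.

```python
import sqlite3

# Hypothetical index definition; IF NOT EXISTS turns it into a no-op
# when the index is already present.
INDEX_SQL = [
    "CREATE INDEX IF NOT EXISTS idx_demo_name ON demo (name)",
]


def create_indexes(conn, statements):
    """Run all index statements; safe to call more than once."""
    for sql in statements:
        conn.execute(sql)
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
    create_indexes(conn, INDEX_SQL)
    create_indexes(conn, INDEX_SQL)  # rerun: no error, no duplicate index
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'").fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Without IF NOT EXISTS the second call would fail because the index already exists.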