Sarah Hoffmann [Wed, 25 Nov 2020 19:33:15 +0000 (20:33 +0100)]
filter postcodes by search rank when adding to address list
The post codes are the last part that does not fit the new
address ranking scheme. In particular, the search rank is still
relevant for choosing if a postcode should be included into
the address terms. Filter out irrelevant postcodes in
getNearFeatures() already, to avoid having to check for
geometry relation.
Sarah Hoffmann [Wed, 25 Nov 2020 10:44:25 +0000 (11:44 +0100)]
improve handling of multi-word partials in SearchDescription
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with low chance of matching.
This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.
Sarah Hoffmann [Thu, 19 Nov 2020 11:06:53 +0000 (12:06 +0100)]
Search housenumbers with unknown address parts by housenumber term
House numbers need special handling because they may appear after
the street term. That means we canot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.
This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.
The main disadvantage is that this once more increases the numbers
of possible search interpretation of which we have already too many.
Sarah Hoffmann [Tue, 17 Nov 2020 18:44:43 +0000 (19:44 +0100)]
POIs with unknown addr:place must add parent name to address
The previous behaviour was a left-over from a former version
where such POIs parented to the street. Now that they parent to
places, it should be included.
Sarah Hoffmann [Wed, 11 Nov 2020 10:52:14 +0000 (11:52 +0100)]
get additional addresses for rank 30 objects
get_addressdata() now also checks if the place itself has entries
in the place_addressline table and merges them into the results.
Also restrict checking for address tag places to cases where the
name cannot be found in the parent's address search terms. Looking
up all address tags is just too slow.
Sarah Hoffmann [Mon, 9 Nov 2020 11:03:37 +0000 (12:03 +0100)]
lookup places for address tags for rank < 30
While previously the content of addr:* tags was only added
to the list of address search keywords, we now really look up
the matching place. This has the advantage that we pull in all
potential translations from the place, just like all the other
address terms that are looked up by neighbourhood search.
If no place can be found for a given name, the content of the
addr:* tag is still added to the search keywords as before.
Sarah Hoffmann [Tue, 27 Oct 2020 20:29:35 +0000 (21:29 +0100)]
remove now unused settings related to website
There are two places where the website URL is still used:
for icons, replace the URL with a link to the icon repository
of the UI repo. The more URL now builds the link from the
server info.
Sarah Hoffmann [Fri, 23 Oct 2020 08:43:57 +0000 (10:43 +0200)]
minor fixes for geometry compuation during boundary ranking
Go back to using centroid when determining if one admin level
is within another. There are cases where boundaries are slightly
misaligned due to mapping errors (not using the same ways in the
relations).
Only declare boundaries the same when they have the same wikidata
tag _and_ have exactly the same geometry. This works around tagging
errors with the wikidata tag, which happen because of automated
edits to the wikidata tag.
Sarah Hoffmann [Tue, 20 Oct 2020 21:26:44 +0000 (23:26 +0200)]
detect and remove admin boundary duplicates
The Polish community maps admin boundaries that span multiple
levels by duplicating the boundary relations. Detect this situation
by looking out for matching wikidata tags. The higher ranked
duplicates are then thrown out from the address pool by setting
their address rank to 0.
Sarah Hoffmann [Thu, 22 Oct 2020 08:20:16 +0000 (10:20 +0200)]
adjust secondary order when no addressimportance available
In cases of countries and remote places without an address
it is possible that 'addressimportance' comes back empty.
Adjust the 'foundorder' to the places importance instead
in such cases.
Sarah Hoffmann [Tue, 20 Oct 2020 18:20:49 +0000 (20:20 +0200)]
reorganize ranks of high-level place types
Rank 25 is now available for places that should appear in addresses
but not when a street is present. Use this for som block-like
place types. Also document the particularity of rank 25.
subdevisions and allotments are now at the same level as landuse
which they are frequently used together with.
Sarah Hoffmann [Mon, 19 Oct 2020 08:39:01 +0000 (10:39 +0200)]
add explicit bbox contains check
Now that the containment check uses ST_Relate, we need to add
a separate bbox contains check to ensure that Postgis does the
efficient check first. Note that we still cannot get rid of the
overlap(&&) check because then Postgis will use the wrong indexes.
Sarah Hoffmann [Thu, 15 Oct 2020 15:30:52 +0000 (17:30 +0200)]
use computed centroid for location_area_large
The new address computation assumes that the centroid is inside
the area. Therefore we cannot use the centroid function. Use the
pre-computed centroid instead which has already been corrected to
be inside the area.
Sarah Hoffmann [Wed, 14 Oct 2020 09:33:47 +0000 (11:33 +0200)]
demote admin boundaries for place areas
Also demote the address rank of an admin boundary when there
is a place area of higher rank that completely contains the
area. This catches the case where city boundaries do not exactly
align with administrative units (see for example Moscow).
Sarah Hoffmann [Tue, 13 Oct 2020 20:10:07 +0000 (22:10 +0200)]
overhaul address computation
This is a complete rewrite of the selection of address parts to
be inserted into the place_addressline table.
The new algorithm selects for each rank:
* the boundary overlapping with the addressee and contained
in the already selected boundaries of lower rank, or failing that
* the place node closest to the addressee that is contained in
the already selected boundaries and in the influence radius
of already selected place nodes of lower rank
Place nodes that are not contained in already selected boundaries
of lower rank are completely thrown away. All other candidates are
added as non-address parts.