Sarah Hoffmann [Mon, 30 May 2022 12:32:36 +0000 (14:32 +0200)]
move quoting hack to wiki loader
The bad quotes around the type for special phrases
specifically occur in the Wiki pages, so they should be
removed by the loader and not in the generic SpecialPhrase
object.
Sarah Hoffmann [Fri, 27 May 2022 14:49:14 +0000 (16:49 +0200)]
allow search for partials consisting of 3 or more words
The search query builder currently rejects searches that consist of
partial names only when the partial terms are all very frequent, to
avoid queries that return too many results.
This change slightly relaxes the condition to allow the search when
there are 3 or more partial terms. With so many terms the number
of matches should be manageable.
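The relaxed condition can be sketched roughly as follows. This is only an
illustration in Python, not the actual query builder code: the function name,
the constants and the frequency cut-off are hypothetical, since the commit
message does not state them.

```python
from typing import Sequence

# Hypothetical cut-off above which a partial term counts as "very frequent".
# The real threshold is not given in the commit message.
FREQUENT_TERM_COUNT = 10_000

# Number of partial terms from which an all-frequent search is allowed again.
MIN_TERMS_FOR_FREQUENT_ONLY = 3


def allow_partial_only_search(term_counts: Sequence[int]) -> bool:
    """Decide whether a search made up of partial terms only may run.

    term_counts holds the number of matches of each partial term.
    """
    if not term_counts:
        return False

    all_frequent = all(count >= FREQUENT_TERM_COUNT for count in term_counts)
    if not all_frequent:
        # At least one rare term keeps the result set small.
        return True

    # All terms are frequent: only allow when there are enough of them.
    return len(term_counts) >= MIN_TERMS_FOR_FREQUENT_ONLY
```

With these assumed values, two frequent terms are still rejected while three
frequent terms are accepted.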
Sarah Hoffmann [Mon, 23 May 2022 08:11:28 +0000 (10:11 +0200)]
fix bug with keeping linking on updates
When moving the finding of linked places to the precomputation stage,
it was also moved before the statement where the linked_place_id was
removed from the linkee. The result was that the current linkee was
excluded when looking for a linked place on updates because it was
still linked to the boundary to be updated.
Fixed by allowing the update to either keep the existing linkage or
change to an unlinked place.
Sarah Hoffmann [Wed, 18 May 2022 08:19:05 +0000 (10:19 +0200)]
remove county nodes in Canada from addresses
Canada has complete coverage for administrative boundaries on
county level. Removing the county nodes from the addresses avoids errors
due to the widespread doubling of place nodes for city counties.
Sarah Hoffmann [Wed, 11 May 2022 13:03:02 +0000 (15:03 +0200)]
add offline import mode
In offline mode no attempts are made to download data from the internet.
At the moment that only concerns the computation of the database date,
which contacts the main API to get the date.
Sarah Hoffmann [Wed, 11 May 2022 09:54:25 +0000 (11:54 +0200)]
no longer allow fuzzy assignment of country
The fallback country boundaries already contain a sufficiently large
part of the water area, so there is no need to extend the country
assignment even more. Features outside countries should not show a
country in their address.
Sarah Hoffmann [Wed, 11 May 2022 08:25:00 +0000 (10:25 +0200)]
pylint: disable no-self-use check
This checker encourages bad behaviour (namely changing the static
status of a function during inheritance) and will be made optional
in upcoming versions of pylint.
Sarah Hoffmann [Mon, 2 May 2022 07:48:51 +0000 (09:48 +0200)]
accept any OSM type in street member of associatedStreet
This is needed for pedestrian areas mapped as multipolygons
and consequently as relations. The lookup in placex guarantees
that the referenced OSM object is indeed a street.
Sarah Hoffmann [Fri, 29 Apr 2022 10:11:39 +0000 (12:11 +0200)]
add check for wikipedia importance data
Adds a new check level WARNING because missing wikipedia importances
are not necessarily an error. If the database is run for reverse
requests only, then it is fine to go without them.
Sarah Hoffmann [Thu, 28 Apr 2022 19:38:00 +0000 (21:38 +0200)]
keep inherited address parts after indexing
The inherited housenumber is needed for display output. We can't
take the one from the housenumber field because it is already
normalized. Remove the inherited address only when reindexing.
Sarah Hoffmann [Thu, 28 Apr 2022 15:20:56 +0000 (17:20 +0200)]
ICU: better letter identification in normalization
The Letter class does not include non-spacing marks that can also
have a consonant or vowel meaning, especially in Indian languages.
Use the alnum property instead which includes them all. Also
include the vowel-canceling Virama, which is not a letter by itself
but changes the transliteration.
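The problem with the Letter class can be seen with Python's standard
unicodedata module. This is only an illustration of the Unicode general
categories involved; it does not use ICU and does not reproduce the
normalization rules themselves. The helper names are hypothetical.

```python
import unicodedata

KA = '\u0915'            # DEVANAGARI LETTER KA
VOWEL_SIGN_U = '\u0941'  # DEVANAGARI VOWEL SIGN U
VIRAMA = '\u094d'        # DEVANAGARI SIGN VIRAMA


def is_letter(char: str) -> bool:
    """True if the character is in one of the Letter categories (L*)."""
    return unicodedata.category(char).startswith('L')


def is_mark(char: str) -> bool:
    """True if the character is in one of the Mark categories (M*)."""
    return unicodedata.category(char).startswith('M')


for char in (KA, VOWEL_SIGN_U, VIRAMA):
    print(f"U+{ord(char):04X} {unicodedata.name(char)}: "
          f"category={unicodedata.category(char)} letter={is_letter(char)}")
```

The consonant is a letter, while the vowel sign and the Virama are
non-spacing marks, so a filter that keeps only letters drops both.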
Sarah Hoffmann [Wed, 27 Apr 2022 08:58:25 +0000 (10:58 +0200)]
geocodejson: add osm_key and osm_value fields
Return OSM main tag information in geocodejson. This is not part
of the official spec but can be useful to get more detailed information
about the object type. Brings the Nominatim output closer to what
Photon produces.
Sarah Hoffmann [Wed, 27 Apr 2022 08:53:12 +0000 (10:53 +0200)]
geocodejson: type should contain the general feature class
'type' so far contained the value of the OSM tag. That is rarely
helpful because it is not a restricted class of values. Change
this to contain the types as defined in the geocodejson spec,
which correspond to the address layer names.
Sarah Hoffmann [Fri, 22 Apr 2022 12:32:19 +0000 (14:32 +0200)]
further tweaking of address distance
For point features, keep using the distance to centroid.
For area features, add a tie breaker for the case where the
center point falls on the boundary.
Sarah Hoffmann [Thu, 21 Apr 2022 19:56:59 +0000 (21:56 +0200)]
change distance computation between place and address part
Instead of computing the distance to the centroid of the area,
compute the distance of the area to the centroid of the feature.
This means we give preference to the area that covers the centroid.
It's still a heuristic but one that is a bit less random.
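The difference between the two distance measures can be sketched with
axis-aligned rectangles standing in for address areas. This is a simplified
illustration in plain Python, not the database code: the rectangle
representation, the function names and the sample coordinates are all
assumptions made for the example.

```python
from math import hypot

# A rectangle is (xmin, ymin, xmax, ymax); a point is (x, y).


def distance_to_area_centroid(point, rect):
    """Old measure: distance from the feature to the centroid of the area."""
    xmin, ymin, xmax, ymax = rect
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    return hypot(point[0] - cx, point[1] - cy)


def distance_area_to_point(point, rect):
    """New measure: distance from the area to the feature's centroid.

    This is 0 when the area covers the point.
    """
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - point[0], 0, point[0] - xmax)
    dy = max(ymin - point[1], 0, point[1] - ymax)
    return hypot(dx, dy)


feature_centroid = (1, 1)
areas = {
    'big': (0, 0, 10, 10),   # covers the feature centroid
    'small': (2, 0, 4, 2),   # close by, but does not cover it
}

nearest_by_centroid = min(
    areas, key=lambda n: distance_to_area_centroid(feature_centroid, areas[n]))
nearest_by_area = min(
    areas, key=lambda n: distance_area_to_point(feature_centroid, areas[n]))
```

In this example the old measure picks the small neighbouring area because
its centroid is closer, while the new measure picks the big area that
actually covers the feature's centroid.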