Conversation
|
Preview is ready! 🚀 You can view it here: https://valhalla-app-tests.gis-ops.com/407 |
|
yeah, you're right, of course we can filter dups as well. it's generally a good idea for sure. especially your paris example is so frustrating! looking at the osm relations, they seem to be exactly the same (geometry). what's the difference then.. I didn't go through the tons of attributes to understand, but for me it looks a bit data-broken. did you already ask upstream in a nominatim discussion? I'd be interested to hear a pro's opinion why that's not somehow handled during/post data ingestion on nominatim side. |
|
asked here: osm-search/Nominatim#4058. |
nilsnolde
left a comment
There was a problem hiding this comment.
some small things, overall looks good!
| const normalizedTitle = result.display_name | ||
| .toLowerCase() | ||
| .normalize('NFD') | ||
| .replace(/[\u0300-\u036f]/g, ''); |
There was a problem hiding this comment.
can you leave some explanation what this represents?
There was a problem hiding this comment.
This actually normalizes the dedup comparison, toLowercase makes it case insensitive , normalize('NFD') for characters like é and regex operation removes those . Eg: Île-de-France and Ile-de-France are same
| VITE_NOMINATIM_URL=https://nominatim.openstreetmap.org | ||
| VITE_TILE_SERVER_URL="https://tile.openstreetmap.org/{z}/{x}/{y}.png" | ||
| VITE_CENTER_COORDS="52.51831,13.393707" | ||
|
|
There was a problem hiding this comment.
unrelated whitespace change, revert pls
src/utils/nominatim.ts
Outdated
| const roundedLat = parseFloat(result.lat).toFixed(4); | ||
| const roundedLon = parseFloat(result.lon).toFixed(4); | ||
| const dedupeKey = `${normalizedTitle}${roundedLat}${roundedLon}`; |
There was a problem hiding this comment.
that makes sense to me deduping
| if (seenKeys.has(dedupeKey)) { | ||
| continue; | ||
| } | ||
| seenKeys.add(dedupeKey); |
There was a problem hiding this comment.
but there has to be some sort of set in JS (i.e. unique element container) which can do this in one line?
There was a problem hiding this comment.
yeah but seenKeys also does the same function :)
There was a problem hiding this comment.
Haha ok then follow up question: why you check for "has"?
There was a problem hiding this comment.
if the key is already in the set, we skip it - if not, we add it. that's it
There was a problem hiding this comment.
that's the whole point of a set: it only allows unique elements, so a seenKeys.add(dedupeKey) is all it should need for such a structure.
There was a problem hiding this comment.
oops you are right, we dont want .has() here , we can use .add() to keep it simple
There was a problem hiding this comment.
processedResults.push({
title:
result.display_name.length > 0
? result.display_name
: lngLat?.toString() || '',
description: `https://www.openstreetmap.org/${result.osm_type}/${result.osm_id}`,
selected: false,
addresslnglat: [parseFloat(result.lon), parseFloat(result.lat)],
sourcelnglat:
lngLat === undefined
? [parseFloat(result.lon), parseFloat(result.lat)]
: lngLat,
displaylnglat:
lngLat !== undefined
? lngLat
: [parseFloat(result.lon), parseFloat(result.lat)],
key: index,
addressindex: index,
});but wait, the job of .has() is to get out of the loop if it is already present and never execute the processedResults.push() , hope that makes sense (lines 66 - 84)
will investigate and fix soon |
client-side dedup is now working correctly, however, nominatim still returns 4 entries from the API :
[
{
"place_id": 88703675,
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright",
"osm_type": "relation",
"osm_id": 71525,
"lat": "48.8534951",
"lon": "2.3483915",
"class": "boundary",
"type": "administrative",
"place_rank": 12,
"importance": 0.897098092136026,
"addresstype": "city",
"name": "Paris",
"display_name": "Paris, Ile-de-France, Metropolitan France, France",
"boundingbox": [
"48.8155755",
"48.9021560",
"2.2241220",
"2.4697602"
]
},
{
"place_id": 90003570,
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright",
"osm_type": "relation",
"osm_id": 7444,
"lat": "48.8588897",
"lon": "2.3200410",
"class": "boundary",
"type": "administrative",
"place_rank": 15,
"importance": 0.897098092136026,
"addresstype": "suburb",
"name": "Paris",
"display_name": "Paris, Ile-de-France, Metropolitan France, France",
"boundingbox": [
"48.8155755",
"48.9021560",
"2.2241220",
"2.4697602"
]
},
{
"place_id": 308463887,
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright",
"osm_type": "relation",
"osm_id": 115357,
"lat": "33.6617962",
"lon": "-95.5555130",
"class": "boundary",
"type": "administrative",
"place_rank": 16,
"importance": 0.5298648354283636,
"addresstype": "town",
"name": "Paris",
"display_name": "Paris, Lamar County, Texas, 75460, United States",
"boundingbox": [
"33.6206345",
"33.7383866",
"-95.6279396",
"-95.4354115"
]
},
{
"place_id": 89055753,
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright",
"osm_type": "relation",
"osm_id": 1641193,
"lat": "48.8588897",
"lon": "2.3200410",
"class": "boundary",
"type": "administrative",
"place_rank": 14,
"importance": 0.4594627538257798,
"addresstype": "city_district",
"name": "Paris",
"display_name": "Paris, Ile-de-France, Metropolitan France, France",
"boundingbox": [
"48.8155755",
"48.9021560",
"2.2241220",
"2.4697602"
]
}
]Our fix deduplicates using normalizedTitle + boundingbox as the key, falling back to normalizedTitle + lat/lon when boundingbox is missing and the root cause seems to be in nominatim's side osm-search/Nominatim#4058 |
| processedResults.push({ | ||
| title: | ||
| result.display_name.length > 0 | ||
| ? result.display_name | ||
| : lngLat?.toString() || '', | ||
| description: `https://www.openstreetmap.org/${result.osm_type}/${result.osm_id}`, | ||
| selected: false, | ||
| addresslnglat: [parseFloat(result.lon), parseFloat(result.lat)], | ||
| sourcelnglat: | ||
| lngLat === undefined | ||
| ? [parseFloat(result.lon), parseFloat(result.lat)] | ||
| : lngLat, | ||
| displaylnglat: | ||
| lngLat !== undefined | ||
| ? lngLat | ||
| : [parseFloat(result.lon), parseFloat(result.lat)], | ||
| key: index, | ||
| addressindex: index, | ||
| }); |
There was a problem hiding this comment.
This is gets executed if we use .add() in the the set instead of .has() followed by continue


Fixes #406

[ { "place_id": 88664949, "licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright", "osm_type": "relation", "osm_id": 71525, "lat": "48.8534951", "lon": "2.3483915", "class": "boundary", "type": "administrative", "place_rank": 12, "importance": 0.897098092136026, "addresstype": "city", "name": "Paris", "display_name": "Paris, Ile-de-France, Metropolitan France, France", "boundingbox": [ "48.8155755", "48.9021560", "2.2241220", "2.4697602" ] }, { "place_id": 88715228, "licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright", "osm_type": "relation", "osm_id": 7444, "lat": "48.8588897", "lon": "2.3200410", "class": "boundary", "type": "administrative", "place_rank": 15, "importance": 0.897098092136026, "addresstype": "suburb", "name": "Paris", "display_name": "Paris, Ile-de-France, Metropolitan France, France", "boundingbox": [ "48.8155755", "48.9021560", "2.2241220", "2.4697602" ] }, { "place_id": 310325807, "licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright", "osm_type": "relation", "osm_id": 115357, "lat": "33.6617962", "lon": "-95.5555130", "class": "boundary", "type": "administrative", "place_rank": 16, "importance": 0.5298648354283636, "addresstype": "town", "name": "Paris", "display_name": "Paris, Lamar County, Texas, 75460, United States", "boundingbox": [ "33.6206345", "33.7383866", "-95.6279396", "-95.4354115" ] }, { "place_id": 88625937, "licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright", "osm_type": "relation", "osm_id": 1641193, "lat": "48.8588897", "lon": "2.3200410", "class": "boundary", "type": "administrative", "place_rank": 14, "importance": 0.4594627538257798, "addresstype": "city_district", "name": "Paris", "display_name": "Paris, Ile-de-France, Metropolitan France, France", "boundingbox": [ "48.8155755", "48.9021560", "2.2241220", "2.4697602" ] } ]fixed and tested the duplicate entries in API response when selecting a waypoint