
ArangoDB geo-spatial queries do not work as expected

I have a toy dataset having 12 rows in csv format, as follows:

Cricketers.csv: every row in the dataset has "lng" and "lat" attributes.

I am trying to load this data into ArangoDB, index it spatially, and then fetch data using ArangoDB's geo-spatial queries. My steps to load the data into the database and index it are as follows:

arangoimport --file "cricketers.csv" --type csv --create-collection --create-collection-type document --translate "id=_key" --collection "players"

db.players.ensureIndex({type: 'geo', fields: ['lng', 'lat'], geoJson: false})

After this, I try to fetch some data by sending spatial queries to the db as follows:

db._query({'query': 'FOR node IN players FILTER GEO_CONTAINS(GEO_POLYGON([[[-70,-40],[-70,60],[180,60],[180,-40],[-70,-40]]]), [node.lng, node.lat]) RETURN node', "options" : {fullCount:true}}).getExtra();

The above example query should ideally fetch all the data points, because it specifies a GEO_POLYGON that spans all of them. However, the query does not return any of the data points. This is what the query returns (see fullCount):

{   "warnings" : [ ],   "stats" : {     "writesExecuted" : 0,     "writesIgnored" : ,     "scannedFull" : 12,     "scannedIndex" : 0,     "filtered" : 12,     "httpRequests" : 0,     "fullCount" : 0,      "executionTime" : 0.0015139159995669615,     "peakMemoryUsage" :     } }

If I perform the same query without the spatial functions, i.e., using simple filters like this:

db._query({'query': 'for node in players filter -40 <= node.lat <= 60 and -70 <= node.lng <= 180 return node', 'options': {fullCount: true}}).getExtra();

This is what I get, which is as expected:

{   "warnings" : [ ],   "stats" : {     "writesExecuted" : 0,     "writesIgnored" : 0,     "scannedFull" : 12,     "scannedIndex" : 0,     "filtered" : 0,     "httpRequests" : 0,     "fullCount" : 12,     "executionTime" : 0.0005607399998552864,     "peakMemoryUsage" : 0    } }

Please help me understand what I am doing wrong. Why are the spatial queries not working? I have tried changing the order in which I pass 'lat' and 'lng' to the spatial query and to the index definition, but to no effect.

The ArangoDB documentation only gives an example in JSON, and even that is not very helpful. It mentions analyzers, which I think I need to use, but it is not clear how to do so for CSV data.

Thank you!

There are two separate issues. The documentation for Non-GeoJSON Geo spatial indexes states:

The first field is always defined to be the latitude and the second is the longitude.

So you should create your index as: db.players.ensureIndex({type: 'geo', fields: ['lat', 'lng'], geoJson: false})
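The two coordinate orders in play can be put side by side in a small plain-JavaScript sketch. It does not talk to a database; the `row` values and the `toLngLat` helper are made up for illustration. The index definition lists latitude first, while the coordinate pair passed to GEO_CONTAINS in the question follows the GeoJSON order, longitude first.

```javascript
// Hypothetical document, shaped like a row imported from the CSV (values are made up).
const row = { _key: "1", lat: 19.07, lng: 72.87 };

// Non-GeoJSON geo index: the first field is the latitude, the second the longitude.
const indexDefinition = { type: "geo", fields: ["lat", "lng"], geoJson: false };

// GeoJSON-style coordinate pair, as passed to GEO_CONTAINS in the question:
// longitude first, then latitude.
function toLngLat(doc) {
  return [doc.lng, doc.lat];
}

console.log(indexDefinition.fields); // [ 'lat', 'lng' ]
console.log(toLngLat(row));          // [ 72.87, 19.07 ]
```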

The second, more important, issue is the polygon you use. The [Polygon documentation](https://www.arangodb.com/docs/stable/indexing-geo.html#polygon) mentions the following limitation:

A linear ring defines two regions on the sphere. ArangoDB will always interpret the region of smaller area to be the interior of the ring. This introduces a practical limitation that no polygon may have an outer ring enclosing more than half the Earth's surface

Unfortunately, this is true for your chosen polygon, which is why it does not cover the area you intended. Please note, though, that the polygon visualization in the web UI actually shows the polygon as you intended. This is a known issue and we are currently working to resolve it.
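As a rough check of the "more than half the Earth's surface" condition, the sketch below estimates what fraction of the sphere the intended region covers. It assumes the region is a simple latitude/longitude box (longitude -70 to 180, latitude -40 to 60), which is only an approximation of the polygon the database actually evaluates. The function name `latLngBoxFraction` is made up for this example.

```javascript
// Fraction of a sphere covered by a latitude/longitude box.
// On a unit sphere the box area is lngSpan (radians) * (sin(latMax) - sin(latMin)),
// and the whole sphere has area 4 * PI.
function latLngBoxFraction(lngMin, lngMax, latMin, latMax) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const lngSpan = toRad(lngMax - lngMin);
  const band = Math.sin(toRad(latMax)) - Math.sin(toRad(latMin));
  return (lngSpan * band) / (4 * Math.PI);
}

// Bounds taken from the polygon in the question.
const fraction = latLngBoxFraction(-70, 180, -40, 60);
console.log(fraction.toFixed(3)); // 0.524
console.log(fraction > 0.5);      // true
```

Under this approximation the box covers roughly 52% of the sphere, which is consistent with the limitation quoted above.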

"analyzers" are only relevant if you use ArangoSearch, which also supports geo spatial indexes since v3.8.
