ArangoDB: GRAPH_EDGES command very slow (more than 20 sec) on small collections

I am evaluating ArangoDB and I see that the GRAPH_EDGES and GRAPH_VERTICES commands are very slow, even on small collections (300 vertices).

I have 3 collections:

TactiveService( 300 Vertices) --> TusesCommand( 300 Edges) --> Tcommand (1 Vertex)

Using GRAPH_EDGES, this query takes 24 seconds:

FOR service IN TactiveService
   LET usesCommand = (
      return FIRST(GRAPH_EDGES("topvision", {}, { edgeExamples : [{_from: service._id}], edgeCollectionRestriction : "TusesCommand", includeData:true, maxDepth : 1 }))
   )
   LET command = DOCUMENT(usesCommand[0]._to)
RETURN { service : service, usesCommand: usesCommand[0], command:command} 

For the same result, this query takes 0.020 seconds:

FOR service IN TactiveService
   LET usesCommand = (
      FOR usesCommand IN TusesCommand
         FILTER usesCommand._from == service._id
         RETURN usesCommand
   )
   LET command = DOCUMENT(usesCommand[0]._to)
RETURN { service : service, usesCommand: usesCommand[0], command:command} 

GRAPH_EDGES is unusable for me inside a FOR statement (same problem with GRAPH_VERTICES).

Any ideas about the cause of this slowness are welcome.

We are well aware that GRAPH_EDGES is not well suited to be used like this in a query.

We therefore introduced AQL pattern matching traversals, which should perform significantly better.

You could formulate your query like this, replacing the GRAPH_EDGES with a traversal:

FOR service IN TactiveService
   LET usesCommand = (
      FOR v, e IN 1..1 OUTBOUND service "TusesCommand"
         FILTER e._from == service._id
         RETURN e
   )
   LET command = DOCUMENT(usesCommand[0]._to)
RETURN { service : service, usesCommand: usesCommand[0], command:command}

Please note that the specified filter is implicitly true: because we query for OUTBOUND edges starting from service, e._from will always be equal to service._id. Instead of specifying GRAPH "topvision" and then restricting the edge collections considered in the traversal, we use an anonymous graph query that only takes the edge collection TusesCommand into account, as you did.

So simplifying it a little more, the query could look like:

FOR service IN TactiveService
   LET usesCommand = (
      FOR v, e IN 1..1 OUTBOUND service "TusesCommand"
         RETURN {v: v, e: e}
   )
RETURN { service : service, usesCommand: usesCommand}

This may return more vertices than your query, but it fetches each of them only once; the result set may be bigger, but removing the DOCUMENT calls reduces the number of index lookups.

As you already noticed and formulated with your second query, if your actual problem works better with a classic join, ArangoDB gives you the freedom to work with your data that way.

Edit: Michael is right, the direction has to be OUTBOUND.

If for some reason you do not want to upgrade to 2.8 as @dothebart suggests, you can also fix the old query. Original:

FOR service IN TactiveService
   LET usesCommand = (
      return FIRST(GRAPH_EDGES("topvision", {}, { edgeExamples : [{_from: service._id}], edgeCollectionRestriction : "TusesCommand", includeData:true, maxDepth : 1 }))
   )
   LET command = DOCUMENT(usesCommand[0]._to)
RETURN { service : service, usesCommand: usesCommand[0], command:command} 

The slow part of the query is finding the starting point. GRAPH_EDGES uses its second parameter as the start example, and {} matches all start points. So it first computes all outbound edges for all vertices; this is expensive, because for every vertex in the start collection we collect all edges for every vertex in the start collection. Then it post-filters all found edges with the example you gave, which removes almost all of the edges again. If you replace the start example with the _id of the start vertex, it will only collect the edges for this specific vertex. Since you are also interested in edges of only one direction (OUTBOUND), you can pass that in the options as well, so that only edges with _from == service._id are fetched by GRAPH_EDGES in the first place:

FOR service IN TactiveService
   LET usesCommand = (
      RETURN FIRST(GRAPH_EDGES("topvision", service._id, { edgeCollectionRestriction : "TusesCommand", includeData:true, maxDepth : 1, direction: 'outbound' }))
   )
   LET command = DOCUMENT(usesCommand[0]._to)
RETURN { service : service, usesCommand: usesCommand[0], command:command}
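
To make the cost difference between the two start examples concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions: plain dictionaries stand in for the collections and the edge index, the function names are made up for illustration, and it does not reproduce how ArangoDB actually implements GRAPH_EDGES. It merely counts how many edges each variant has to collect before filtering.

```python
from collections import defaultdict

# Toy model only: these structures and function names are hypothetical
# stand-ins, not ArangoDB internals.

def build_graph(n):
    """n services, each with exactly one outbound edge to a single command."""
    services = [f"TactiveService/{i}" for i in range(n)]
    out_index = defaultdict(list)  # stand-in for the edge index on _from
    for s in services:
        out_index[s].append({"_from": s, "_to": "Tcommand/0"})
    return services, out_index

def edges_empty_start(services, out_index, service_id):
    """Start example {}: collect edges of every vertex, then post-filter."""
    collected = [e for start in services for e in out_index[start]]
    matching = [e for e in collected if e["_from"] == service_id]
    return matching, len(collected)

def edges_id_start(out_index, service_id):
    """Start example service._id: collect edges of that one vertex only."""
    collected = list(out_index[service_id])
    return collected, len(collected)

def total_edges_collected(n):
    """Run both variants once per service, as the outer FOR loop does."""
    services, out_index = build_graph(n)
    slow = fast = 0
    for s in services:
        m1, c1 = edges_empty_start(services, out_index, s)
        m2, c2 = edges_id_start(out_index, s)
        assert m1 == m2  # both variants yield the same edges
        slow += c1
        fast += c2
    return slow, fast

print(total_edges_collected(300))
```

With 300 services the first variant collects 300 edges per iteration, so 90,000 in total, while the second collects one edge per iteration, so 300 in total. Both return the same edges.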

However, I still expect @dothebart's version to be faster in 2.8, and I would also recommend switching to the newest version.
