
ArangoDB: How to get all neighbours at full depth matching a pattern with AQL?

We have ~300k documents and ~3m edges in our ArangoDB v3.2 database. I want to get all neighbours of a start document, and the neighbours' neighbours, for as long as the rate of the relation is > 0.5. This would give me the community that is related to the start doc with rate > 0.5.

Right now I am doing this with multiple requests, but the data and the project are going to grow and I need better performance, so I need an AQL query that gets all neighbours in one request.

I have tried a traversal with depth 1..10, which would be enough to reach all neighbours, but it is very slow, so I cannot rely on a fixed depth.

for vertex, edge, path in 1..10 any 'docs/10' doc_relations
    filter edge.rate > 0.5
return distinct edge

I need something like a while loop, but AQL has no construct like that.

I found the answer. I extended ArangoDB with user functions. You can write any function in JavaScript, and after registering it with ArangoDB you can call it from AQL. You can also run AQL queries from inside your JavaScript. I wrote a JavaScript function and registered it from arangosh.

To add a user function, write it and save it to a file with the extension ".js".

For example, say we saved the function to path/to/func/file.js.

First select the database to add the user function to:

db._useDatabase("dbName");

Then register it like this:

require("@arangodb/aql/functions").register("MYFUNCNAMESPACE::SEARCHRELATEDCLIPS",   require("path/to/func/file.js"), false);

Note: the third parameter was added for future versions; it has no effect in v3.2.

You can unregister it like this:

require("@arangodb/aql/functions").unregister("MYFUNCNAMESPACE::SEARCHRELATEDCLIPS");

The neighbour search function in path/to/func/file.js:

function searchRelatedDocs(docID) {
    var db = require("@arangodb").db;
    var groupDocs = [];            // every document visited, in visit order
    var unSearchedDocs = [docID];  // queue of documents still to expand
    var searchedDocs = {};         // id -> true once expanded, false once queued
    var stepCounter = 0;
    var searchedDocCounter = 0;
    var start = Date.now();
    var edgeSearchTime = 0;

    while (unSearchedDocs.length > 0) {
        searchedDocCounter++;
        var currentID = unSearchedDocs.shift();
        groupDocs.push(currentID);
        searchedDocs[currentID] = true;

        // Fetch all edges (inbound and outbound) of the current document.
        var startE = Date.now();
        var docEdges = db.doc_relations.edges(currentID);
        edgeSearchTime += Date.now() - startE;

        if (!(docEdges instanceof Array) || docEdges.length === 0) continue;

        for (var i = 0; i < docEdges.length; i++) {
            stepCounter++;
            var edge = docEdges[i];
            if (edge.rate > 0.5) {
                // Take the endpoint on the other side of the edge.
                var relatedDocID = (edge._to == currentID) ? edge._from : edge._to;
                if (searchedDocs[relatedDocID] === undefined) {
                    searchedDocs[relatedDocID] = false;
                    unSearchedDocs.push(relatedDocID);
                }
            }
        }
    }

    var end = Date.now();
    var result = {};
    result.time = end - start;
    result.searchedDocs = searchedDocCounter;
    result.searchedEdges = stepCounter;
    result.edgeSearchTime = edgeSearchTime;

    // Load the full documents for every id that was collected.
    var documents = db.docs.documents(groupDocs);
    if (documents && documents.documents instanceof Array)
        result.vertices = documents.documents;
    return result;
}

module.exports = searchRelatedDocs;

Don't forget to add module.exports = funcName; in file.js.
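Once registered, the function can be called from AQL like any built-in. A minimal sketch, assuming the function was registered under the name used above; the wrapper name searchCommunity is hypothetical, and db is passed in so the wrapper does not depend on how the handle was obtained:

```javascript
// Hypothetical helper: runs the registered user function for one start document.
// `db` is expected to be the arangosh database handle, e.g. require("@arangodb").db.
function searchCommunity(db, startID) {
  var cursor = db._query(
    "RETURN MYFUNCNAMESPACE::SEARCHRELATEDCLIPS(@start)",
    { start: startID }  // bind parameter instead of string concatenation
  );
  // The query returns exactly one value: the result object built by the function.
  return cursor.toArray()[0];
}

// In arangosh (assumption: the function has been registered in the selected database):
// var result = searchCommunity(require("@arangodb").db, "docs/10");
// result.vertices would then hold the documents of the community.
```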

It seems to me that you should be able to achieve what you want using AQL.

Fine-tuning the AQL query

It looks like you could fine-tune your original AQL query. Specifically, I would propose an AQL query along the following lines:

for vertex, edge, path in 1..100000 OUTBOUND 'docs/10' doc_relations
   OPTIONS {uniqueVertices: "global", bfs: true }
   FILTER path.edges[*].rate ALL > 0.5
   return vertex

That is:

  • if possible, use OUTBOUND rather than ANY ;
  • use the OPTIONS as shown since you don't care how many admissible paths there are to an admissible vertex;
  • your requirements seem to impose the "rate > 0.5" restriction on every edge along an admissible path to a "neighbor", hence ALL ;
  • since you want the vertices, simply return vertex ; there is no need for DISTINCT because of the aforementioned OPTIONS .
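To make the intent of these options concrete, here is a small in-memory sketch in plain JavaScript of roughly what such a traversal is expected to compute: a breadth-first search that follows only outbound edges with rate above the threshold and visits each vertex at most once. The edge data and the function name reachable are made up for illustration; this is not how ArangoDB implements traversals.

```javascript
// Hypothetical edge list; in ArangoDB these would be documents in doc_relations.
const edges = [
  { _from: "docs/10", _to: "docs/11", rate: 0.9 },
  { _from: "docs/11", _to: "docs/12", rate: 0.7 },
  { _from: "docs/12", _to: "docs/10", rate: 0.8 }, // cycle back to the start
  { _from: "docs/10", _to: "docs/13", rate: 0.2 }, // not followed: rate too low
  { _from: "docs/13", _to: "docs/14", rate: 0.9 }, // only reachable via the low-rate edge
];

// Breadth-first search over outbound edges whose rate exceeds minRate.
// The visited set plays the role of uniqueVertices: "global".
function reachable(start, edgeList, minRate) {
  const visited = new Set([start]);
  const queue = [start];
  const result = [];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const edge of edgeList) {
      if (edge._from !== current || !(edge.rate > minRate)) continue;
      if (visited.has(edge._to)) continue;
      visited.add(edge._to);
      result.push(edge._to);
      queue.push(edge._to);
    }
  }
  return result;
}

const community = reachable("docs/10", edges, 0.5);
console.log(community); // [ 'docs/11', 'docs/12' ]
```

Because every edge on the way to a vertex has to pass the rate check before the search continues, docs/14 is never reached even though its own incoming edge has a high rate, which is the behaviour the ALL filter asks for.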

Add a "skip-list index" on .rate
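A minimal sketch of creating such an index from arangosh, assuming the edge collection is doc_relations as in the question. The wrapper name ensureRateIndex is hypothetical; whether the index actually speeds up this traversal should be checked against the query's execution plan.

```javascript
// Hypothetical helper: creates a skip-list index on the rate attribute of the
// edge collection if it does not exist yet.
// `db` is expected to be the arangosh database handle, e.g. require("@arangodb").db.
function ensureRateIndex(db) {
  return db.doc_relations.ensureIndex({ type: "skiplist", fields: ["rate"] });
}

// In arangosh:
// ensureRateIndex(require("@arangodb").db);
```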

Caching the neighbours in the database

"I need an AQL query to get all neighbours with one request"

In terms of performance, there is obviously a time/space tradeoff here, and it might make sense to pre-compute the neighbours of all the vertices, or to compute-and-cache them on demand.
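The compute-and-cache-on-demand idea can be sketched as a small memoising wrapper. This is an illustration only: it keeps the cache in memory rather than in the database, all names (makeCachedSearch, slowSearch) are hypothetical, and a real version would also need to invalidate entries when edges or rates change.

```javascript
// Wraps any search function so that each start document is computed only once.
function makeCachedSearch(search) {
  const cache = new Map(); // docID -> previously computed result
  return function cachedSearch(docID) {
    if (!cache.has(docID)) {
      cache.set(docID, search(docID)); // compute on first request only
    }
    return cache.get(docID);
  };
}

// Stand-in for an expensive neighbour search; counts how often it really runs.
let calls = 0;
const slowSearch = (docID) => {
  calls++;
  return [docID + "/neighbour"];
};

const cachedSearch = makeCachedSearch(slowSearch);
cachedSearch("docs/10");
cachedSearch("docs/10"); // served from the cache
console.log(calls); // 1
```

The trade-off is the one named above: repeated requests become cheap, at the cost of the space for stored results and the work of keeping them up to date.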

