
How to handle a large dataset using Neo4j and Gremlin?

I have around 88 million nodes and 200 million edges. I am using a Neo4j DB and loading it through BatchGraph with Gremlin. Is it advisable to run Gremlin queries against a dataset of this size from the Gremlin REPL? I mean, can I avoid timeouts or heap-related issues?
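For reference, my loading code looks roughly like the sketch below (Gremlin 2.x / Blueprints syntax; the database path, buffer size, ids, labels, and property names are placeholders rather than my real values):

    import com.tinkerpop.blueprints.impls.neo4j.Neo4jGraph
    import com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph
    import com.tinkerpop.blueprints.util.wrappers.batch.VertexIDType

    // Wrap the Neo4j store in BatchGraph so mutations are buffered and
    // committed in chunks instead of one transaction per element.
    g  = new Neo4jGraph('/tmp/my-graph')
    bg = new BatchGraph(g, VertexIDType.NUMBER, 10000)  // commit every 10k mutations

    v1 = bg.addVertex(1L)
    v1.setProperty('name', 'alpha')
    v2 = bg.addVertex(2L)
    v2.setProperty('name', 'beta')
    bg.addEdge(null, v1, v2, 'connects')  // BatchGraph assigns edge ids itself

    // shutdown() flushes the remaining buffer and closes the wrapped graph
    bg.shutdown()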

Currently, using the Faunus API for a Hadoop MapReduce setup is out of scope for us.

Can I handle this with a plain Neo4j DB and Gremlin? Is there any alternative or solution?

I think Marko/Peter both gave good answers to this on the gremlin-users mailing list:

https://groups.google.com/forum/#!topic/gremlin-users/w3xM4YJTA2I

I'm not sure I'm saying much more than they said, but I'll repeat a bit of it in my own words. The answer largely depends on what you intend to do with your graph and on the structure of the graph itself. If your workload is mostly local traversals (i.e., you start at some vertex and traverse out from there) and you don't expect a lot of supernodes, then Gremlin and Neo4j should do just fine. Give the JVM a lot of memory, do a bit of Neo4j-specific tuning, and you should be quite pleased. If, on the other hand, your traversals are more global in nature (i.e., they start with g.V or g.E), so that you have to touch the entire graph to do your calculation, then you will be less pleased. It takes a long time to iterate over tens or hundreds of millions of things.
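To make the distinction concrete, here is a rough sketch in Gremlin 2.x syntax (the ids, labels, and property names are invented for illustration):

    // Local traversal: start from a single known vertex and walk outward.
    // Only that vertex's neighborhood is touched, so this stays fast
    // even on a graph with tens of millions of elements.
    g.v(12345).out('knows').out('knows').name

    // Global traversal: starts from g.V, so all ~88M vertices are iterated
    // before the filter applies (unless an index backs the lookup). This is
    // the slow, heap-hungry case on a graph of your size.
    g.V.has('age', T.gte, 30).count()

For the REPL itself, giving the JVM more heap helps with memory errors, e.g. by exporting JAVA_OPTIONS="-Xms2g -Xmx8g" before launching gremlin.sh (the exact variable and values depend on your setup and version, so treat them as an assumption).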

Ultimately, you have to understand the problem you are facing, your use cases, your graph's structure, and the strengths and weaknesses of the available graph databases in order to decide how to approach a graph of that size.
