I am trying to batch create nodes and relationships with py2neo, but the batch creation fails. The traceback is at the end of this post.
The code works with a smaller subset of nodes. It fails once the number of relationships becomes very large, and it is unclear at what limit this happens.
One cluster in the data set ends up with around 625,525 relationships among 700+ nodes, and the total number of relationships will exceed 1M. I am running Ubuntu 13.04 (x86_64) on an Apple MacBook Pro Retina with an SSD and 8 GB of memory.
https://github.com/alienone/OSINT/blob/master/MANDIANTAPT/spitball.py
Traceback (most recent call last):
  File "/home/alienone/Programming/Python/OSINT/MANDIANTAPT/spitball.py", line 63, in <module>
    main()
  File "/home/alienone/Programming/Python/OSINT/MANDIANTAPT/spitball.py", line 59, in main
    graph_db.create(*sorted_nodes)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/neo4j.py", line 420, in create
    return batch.submit()
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/neo4j.py", line 2123, in submit
    for response in self._submit()
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/neo4j.py", line 2092, in _submit
    for id_, request in enumerate(self.requests)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/rest.py", line 428, in _send
    return self._client().send(request)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/rest.py", line 365, in send
    return Response(request.graph_db, rs.status, request.uri, rs.getheader("Location", None), rs_body)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/rest.py", line 279, in __init__
    raise SystemError(body)
SystemError: None
Process finished with exit code 1
I had a similar issue. One way to deal with it is to call batch.submit() for chunks of your data rather than for the whole data set. This is slower, of course, but splitting one million nodes into chunks of 5000 is still faster than adding every node separately.
I use a small helper class to do this. Note that all my nodes are indexed: https://gist.github.com/anonymous/6293739
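
The linked helper class is not reproduced here. As a rough illustration of the chunking idea described above, here is a minimal sketch in plain Python. The names `chunked`, `submit_in_chunks` and `submit_fn` are hypothetical and do not come from py2neo or from the gist; `submit_fn` stands in for whatever call sends one batch to the server.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def submit_in_chunks(items, submit_fn, size=5000):
    """Call submit_fn once per chunk and return the number of items sent.

    submit_fn is a hypothetical callback that submits one batch, so each
    HTTP request carries at most `size` items instead of the whole data set.
    """
    total = 0
    for chunk in chunked(items, size):
        submit_fn(chunk)
        total += len(chunk)
    return total


# Assumed usage with the call shown in the traceback (not tested here):
#   submit_in_chunks(sorted_nodes, lambda chunk: graph_db.create(*chunk))
```

One caveat, stated as an assumption about how the batch is built: if relationships refer to nodes by their position within the same batch, those references will not survive being split across chunks, so nodes may need to be created first and looked up (for example through an index) before the relationships are submitted.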