
How can I get the number of nodes of a Neo4j graph database from Python?

I'm trying to get the number of nodes of a Neo4j graph database using Python, but I can't find any method or property to do that.

Does anybody know how I can get this information?

Other Python packages, like NetworkX, have a method to get this information.

>>> import networkx as nx
>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc.
>>> nx.add_path(G, [0, 1, 2])   # G.add_path() was removed in NetworkX 2.4
>>> len(G)
3

Update:

Since I first wrote this, the answer has changed. The database now keeps exact counts of total nodes, as well as counts by label. Unlike in most databases, these counters are not heuristics: they are kept transactionally in sync with the rest of the datastore.

This means you can get exact node counts in O(1) time from Neo4j. You access them by asking Cypher:

MATCH (n) RETURN count(*)
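
From Python, one way to send this query is through the official `neo4j` Bolt driver package. What follows is a minimal sketch, not the only way to do it: `count_nodes` and `main` are hypothetical helper names, and the URI `bolt://localhost:7687` and the credentials are placeholder assumptions. The optional `label` argument illustrates the per-label counts mentioned above.

```python
import re


def count_nodes(session, label=None):
    """Return the node count, optionally restricted to one label.

    `session` is any object with a `run(query)` method whose result
    offers `single()`, such as a session from the neo4j driver.
    """
    if label is None:
        query = "MATCH (n) RETURN count(*) AS node_count"
    else:
        # The label is interpolated into the query text, so only
        # plain identifiers are accepted.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", label):
            raise ValueError(f"unsupported label: {label!r}")
        query = f"MATCH (n:`{label}`) RETURN count(*) AS node_count"
    record = session.run(query).single()
    return record["node_count"]


def main():
    # Assumed connection details; adjust them for your installation.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))
    try:
        with driver.session() as session:
            print(count_nodes(session))
    finally:
        driver.close()
```

Calling `main()` requires a running server and the `neo4j` package; `count_nodes` itself only depends on the session object it is given.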

Original reply:

There are two ways to get the number of nodes in a Neo4j database. The first is to actually iterate through all the nodes and count them.

Alternative two is to use the "number of node ids in use" statistic provided by the db kernel, which is not guaranteed to be exact but will be at least the number of nodes in use. In a high-load db it will be higher, since it also includes ids of deleted nodes that have not been reclaimed yet.

Alt one is reasonably exact (depending on how many nodes are created or deleted while you iterate), but can be super slow. Alt two is potentially way off, but is an O(1) operation.
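
To make the trade-off concrete, here is a toy model in plain Python. It is only an illustration under simplifying assumptions, not Neo4j's actual storage code: `ToyNodeStore` and all of its methods are hypothetical names. It shows why an "ids in use" counter can exceed the real node count until deleted ids are reclaimed.

```python
class ToyNodeStore:
    """Toy id-allocating store; not Neo4j's actual internals."""

    def __init__(self):
        self._next_id = 0       # next never-used id
        self._live = set()      # ids of nodes that currently exist
        self._pending = []      # deleted ids not yet reclaimed
        self._free = []         # reclaimed ids available for reuse
        self._ids_in_use = 0    # counter maintained on every change

    def create(self):
        if self._free:
            node_id = self._free.pop()
        else:
            node_id = self._next_id
            self._next_id += 1
        self._live.add(node_id)
        self._ids_in_use += 1
        return node_id

    def delete(self, node_id):
        self._live.remove(node_id)
        # The id still counts as "in use" until it is reclaimed.
        self._pending.append(node_id)

    def reclaim(self):
        self._ids_in_use -= len(self._pending)
        self._free.extend(self._pending)
        self._pending.clear()

    def count_by_iteration(self):
        # Alt one: walk every node, O(n).
        return sum(1 for _ in self._live)

    def ids_in_use(self):
        # Alt two: read a counter, O(1), possibly too high.
        return self._ids_in_use


store = ToyNodeStore()
ids = [store.create() for _ in range(5)]
store.delete(ids[1])
store.delete(ids[3])
print(store.count_by_iteration(), store.ids_in_use())  # 3 5
store.reclaim()
print(store.count_by_iteration(), store.ids_in_use())  # 3 3
```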

You currently don't have much choice, because alt one is the only one that works. It isn't officially supported, so doing it today looks a bit dirty:

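# neo4j-embedded (the legacy embedded bindings), not the Bolt driver
from neo4j import GraphDatabase
db = GraphDatabase('..')
# Count by walking every node through the underlying Java API
node_count = sum(1 for _ in db.getAllNodes().iterator())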

I've added two issues for this: one to add support for accessing management info (e.g. to support the alt two method), and one to add support for these use cases:

node_count = sum(1 for _ in db.nodes)
node_count = len(db.nodes)
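
For illustration, an object that supports both of these forms needs little more than `__iter__` and `__len__`. The sketch below is hypothetical and is not the actual neo4j-embedded implementation: `NodeCollection`, `iter_nodes`, and `fast_count` are invented names. `len()` uses a cheap counter when one is supplied and otherwise falls back to iterating.

```python
class NodeCollection:
    """Hypothetical wrapper; not the actual neo4j-embedded code."""

    def __init__(self, iter_nodes, fast_count=None):
        self._iter_nodes = iter_nodes   # callable returning an iterable of nodes
        self._fast_count = fast_count   # optional callable returning an int

    def __iter__(self):
        # Enables: sum(1 for _ in db.nodes)
        return iter(self._iter_nodes())

    def __len__(self):
        # Enables: len(db.nodes)
        if self._fast_count is not None:
            return self._fast_count()
        return sum(1 for _ in self)


# Usage with a plain list standing in for the database's nodes.
nodes = NodeCollection(lambda: ["a", "b", "c"])
print(sum(1 for _ in nodes))  # 3
print(len(nodes))             # 3
```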

Follow these issues here:

https://github.com/neo4j/python-embedded/issues/7

https://github.com/neo4j/python-embedded/issues/6

Please let us know if you run into any other trouble with neo4j-embedded, and add a ticket to the GitHub issues if you discover any bugs or think of any other enhancements!

Alternatively (you might be able to execute this query from Python somehow), you can count the total number of nodes and return it by executing a Cypher query via the default Neo4j browser interface at http://localhost:7474/browser/. The precise command follows:

MATCH (n) RETURN count(*) + " nodes" AS total;
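
One possible way to run such a query from Python without a driver is Neo4j's transactional HTTP endpoint. The sketch below uses only the standard library and rests on several assumptions: the endpoint path differs between Neo4j versions, so the default URL is a guess for a local server; authentication is omitted; and `build_count_request`, `parse_count`, and `count_nodes_over_http` are hypothetical helper names. It returns the bare count rather than a string.

```python
import json
import urllib.request


def build_count_request(url):
    """Build a POST request carrying a single Cypher statement."""
    payload = {
        "statements": [
            {"statement": "MATCH (n) RETURN count(*) AS total"}
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def parse_count(body):
    """Extract the count from the JSON response body."""
    doc = json.loads(body)
    if doc.get("errors"):
        raise RuntimeError(doc["errors"])
    return doc["results"][0]["data"][0]["row"][0]


def count_nodes_over_http(url="http://localhost:7474/db/neo4j/tx/commit"):
    # Assumed URL; needs a running server and, usually, credentials.
    with urllib.request.urlopen(build_count_request(url)) as response:
        return parse_count(response.read())
```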

Hope this helps.

If you are willing to make a REST API query, this answer will give you the rough "number of node ids in use" value.

