
How to get different labels and their count in neo4j by cypher query?

I need to check which labels are present in a Neo4j graph database.

How can I get the distinct labels and their counts with a Cypher query?

I finally found a less complicated solution to the multiple-label problem:

MATCH (a) WITH DISTINCT LABELS(a) AS temp, COUNT(a) AS tempCnt
UNWIND temp AS label
RETURN label, SUM(tempCnt) AS cnt

With this Cypher query we can get the different labels present in Neo4j and their counts:

MATCH (n) RETURN DISTINCT LABELS(n), COUNT(n)

It's surprisingly complex to get a count per label, since nodes can have multiple labels and labels(n) returns a collection of strings representing those labels. On a graph consisting of three nodes and two labels, {:A}, {:B} and {:A:B}, labels(n) returns three distinct string collections. Instead of counting two nodes with :A and two nodes with :B, the result would be a count of one for each of the three label combinations. See console. To aggregate per label, not per label collection, you'd have to group by the values within the collection, which is cumbersome.
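To make the difference concrete, here is a small Python sketch that models the three-node example in memory. It does not talk to Neo4j, and the `nodes` list and both variable names are made up for illustration. Counting whole label lists mirrors grouping by labels(n), while flattening the lists first gives the per-label totals.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the three nodes {:A}, {:B} and {:A:B}.
nodes = [["A"], ["B"], ["A", "B"]]

# Grouping by the whole label collection: one group per combination.
per_combination = Counter(tuple(labels) for labels in nodes)

# Grouping by the individual values inside each collection: one group per label.
per_label = Counter(label for labels in nodes for label in labels)

print(dict(per_combination))  # {('A',): 1, ('B',): 1, ('A', 'B'): 1}
print(dict(per_label))        # {'A': 2, 'B': 2}
```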

I have an ugly way to do it, maybe someone can suggest a better one: first find out the maximum number of labels any node has.

MATCH (n)
// size() is used for lists; length() on a list is not accepted by newer Neo4j versions
RETURN max(size(labels(n)))

Then chain that many queries with UNION, counting nodes by the label at position i in the collection, where i starts at 0 and increments to max-1. If the nodes have at most 3 labels:

MATCH (n)
RETURN labels(n)[0] AS name, count(n) AS cnt
UNION MATCH (n)
RETURN labels(n)[1] AS name, count(n) AS cnt
UNION MATCH (n)
RETURN labels(n)[2] AS name, count(n) AS cnt

This counts the nodes per label at each position. Note that a label which occurs at different positions on different nodes lands in more than one branch, so its per-branch counts still need to be added together. The query also returns a count under a null name whenever the index is out of the collection bounds. For the first RETURN (the [0] index) this row counts the nodes that don't have a label. For the other branches, the null row similarly counts nodes with fewer labels than the index queried for, but this information is irrelevant, so it can be filtered out:

MATCH (n)
RETURN labels(n)[0] AS name, count(n) AS cnt
UNION MATCH (n)
WITH labels(n)[1] AS name, count(n) AS cnt
WHERE name IS NOT NULL
RETURN name, cnt
UNION MATCH (n)
WITH labels(n)[2] AS name, count(n) AS cnt
WHERE name IS NOT NULL
RETURN name, cnt
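As a rough model of the positional approach, the Python sketch below replays it on an in-memory list. The `nodes` data and the `count_at` helper are invented for illustration, and nothing here queries Neo4j. It treats an out-of-range index as `None`, the way labels(n)[i] yields null. It also shows that each branch holds only the counts for its own position, so a label such as B, which sits at index 0 on one node and index 1 on another, has to be summed across branches to get its total.

```python
from collections import Counter

# Hypothetical data: label lists for nodes {:A}, {:B}, {:A:B} and one unlabelled node.
nodes = [["A"], ["B"], ["A", "B"], []]
max_labels = max(len(labels) for labels in nodes)

def count_at(index, keep_null):
    """Count nodes by the label at `index`; None stands in for Cypher's null."""
    names = [labels[index] if index < len(labels) else None for labels in nodes]
    return Counter(name for name in names if keep_null or name is not None)

# One Counter per UNION branch; only the first branch keeps its null row.
branches = [count_at(i, keep_null=(i == 0)) for i in range(max_labels)]

# Adding the branches together gives the per-label totals.
totals = sum(branches, Counter())

print(branches)
print(dict(totals))
```

In this model the first branch reports B once and the second branch reports B once, and only the summed `totals` shows that two nodes carry B.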

I'm sure this could be done more gracefully, but that's as far as I got.

You can use the APOC library's meta procedures, as described here: https://stackoverflow.com/a/52489029. Run the code below; it works even when a node has more than one label.

CALL apoc.meta.stats() YIELD labels
RETURN labels
