How can I call this APOC procedure selectively? (only on a subset of nodes)

I have a Neo4j database with a number of nodes labelled com. These nodes have a key property, which uniquely groups them in the way I want. They also have a timestamp property, as well as a number of other integer properties.

Here's the issue I'm facing: I want to use the APOC graph grouping procedure to aggregate these nodes together, based on their key properties. However, I want to do so selectively, so that I only aggregate the nodes whose timestamp property falls within a provided time window.

I have tried to MATCH and filter the nodes with a WHERE clause based on their timestamp, but I am unable to pass specifically those nodes to the apoc.nodes.group procedure. Basically, I need to figure out how to CALL apoc.nodes.group only on a specific subset of nodes. I'd appreciate any help.

Here is the CALL I'm performing:

CALL apoc.nodes.group(['com'], ['key'], [{val1: 'sum', val2: 'sum', val3: 'sum', time_start: 'collect'}]) YIELD node

As I mentioned above, I tried performing a

MATCH (c:com) WHERE c.time_start >= datetime('2020-12-16T21:45:05Z')

...prior to the procedure and then chaining the queries, but it did not work.

The procedure still got called on ALL nodes with the com label, not just the ones I filtered.

The procedure itself does not let you pass such filters: it takes label names, not nodes. There are, however, two ways to work around this:

  1. build the virtual graph yourself with vNode and vRelationship
  2. set a temporary label on the nodes you selected and group on that label
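
For reference, option 1 might look roughly like the sketch below. It assumes the property names from the question (key, time_start, val1 to val3) and uses the apoc.create.vNode function to wrap each aggregated group in a virtual node. The property names on the virtual node (nodeCount, val1Sum, and so on) are arbitrary choices for illustration, and the sketch only covers nodes, not the virtual relationships between groups that apoc.nodes.group can also produce.

```cypher
// Sketch of option 1: filter first, aggregate per key in plain Cypher,
// then wrap each group in a virtual node (nothing is written to the graph).
MATCH (c:com)
WHERE c.time_start >= datetime('2020-12-16T21:45:05Z')
WITH c.key AS key,
     count(*) AS nodeCount,
     sum(c.val1) AS val1Sum,
     sum(c.val2) AS val2Sum,
     sum(c.val3) AS val3Sum,
     collect(c.time_start) AS timeStarts
// Property names on the virtual node are illustrative, not what apoc.nodes.group emits.
RETURN apoc.create.vNode(
  ['com'],
  {key: key, nodeCount: nodeCount, val1Sum: val1Sum, val2Sum: val2Sum,
   val3Sum: val3Sum, timeStarts: timeStarts}
) AS node
```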

I will focus on option 2.

Take the following graph as an example:

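// items was undefined in the original snippet; keys a to f match the output shown further down
WITH ['a', 'b', 'c', 'd', 'e', 'f'] AS items
UNWIND range(1, 200) AS i
CREATE (n:com)
SET n.timestamp = i,
    n.key = apoc.coll.randomItem(items)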

Let's say I have a hypothetical window of 30 to 70 (both bounds excluded). I can find only the nodes matching my window predicate:

WITH [30, 70] AS window
MATCH (n:com) 
WHERE n.timestamp > window[0] 
AND n.timestamp < window[1]
RETURN count(n)

╒══════════╕
│"count(n)"│
╞══════════╡
│39        │
└──────────┘
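
The predicate above excludes both bounds, which is why 39 nodes are counted rather than 41. If the window should include its endpoints, a variant along these lines might be used; it relies on Cypher's chained comparison syntax and assumes the same sample graph with timestamps 1 to 200.

```cypher
WITH [30, 70] AS window
MATCH (n:com)
// Chained comparison, equivalent to:
// window[0] <= n.timestamp AND n.timestamp <= window[1]
WHERE window[0] <= n.timestamp <= window[1]
RETURN count(n)
```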

Before jumping into the grouping query, I just want to show that you can set a label on the nodes matching the predicate and remove it again in the same query.

WITH [30, 70] AS window
MATCH (n:com) 
WHERE n.timestamp > window[0] 
AND n.timestamp < window[1]
SET n:temporary
WITH count(n) AS doSomething
MATCH (n:temporary)
REMOVE n:temporary
WITH count(*) AS break, doSomething
RETURN doSomething

The last WITH count(*) is needed so that the query does not return one row per temporary node: the aggregation collapses those rows into a single one.
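
The row-collapsing effect can be seen in isolation with a small query that touches no data. This is only an illustration of the aggregation behaviour, with break used as an arbitrary alias as in the query above.

```cypher
// UNWIND produces five rows; count(*) with no grouping key
// aggregates them into a single row.
UNWIND range(1, 5) AS i
WITH count(*) AS break
RETURN break
// returns one row, with break = 5
```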

Now, using this logic, we can:

  1. MATCH nodes using the window predicate
  2. Assign them a new temporary label
  3. Use apoc.nodes.group on the temporary label instead
  4. Remove the temporary label
  5. Return the grouped nodes

WITH [30, 70] AS window
MATCH (n:com) WHERE n.timestamp > window[0] AND n.timestamp < window[1]
SET n:temporary
WITH window, count(*) AS x
CALL apoc.nodes.group(['temporary'], ['key'], null, {})
YIELD node, relationship
WITH collect(node) AS elements
MATCH (n:temporary) REMOVE n:temporary
WITH count(*) AS break, elements
UNWIND elements AS element
RETURN element

╒════════════════════════╕
│"element"               │
╞════════════════════════╡
│{"count_*":6,"key":"f"} │
├────────────────────────┤
│{"count_*":6,"key":"e"} │
├────────────────────────┤
│{"count_*":12,"key":"d"}│
├────────────────────────┤
│{"count_*":1,"key":"c"} │
├────────────────────────┤
│{"count_*":5,"key":"b"} │
├────────────────────────┤
│{"count_*":9,"key":"a"} │
└────────────────────────┘
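
Applied to the question's own data, the same pattern might look like the sketch below. It assumes the time_start property and the aggregation map from the question's CALL, and reuses the temporary label name from the query above; the label name is arbitrary and should not clash with a label already used in the database.

```cypher
// 1. Select the nodes in the time window and tag them.
MATCH (c:com)
WHERE c.time_start >= datetime('2020-12-16T21:45:05Z')
SET c:temporary
// 2. Collapse to a single row so the procedure is called once.
WITH count(*) AS tagged
// 3. Group only the tagged nodes, using the question's aggregations.
CALL apoc.nodes.group(
  ['temporary'],
  ['key'],
  [{val1: 'sum', val2: 'sum', val3: 'sum', time_start: 'collect'}]
)
YIELD node
WITH collect(node) AS elements
// 4. Remove the temporary label again.
MATCH (t:temporary)
REMOVE t:temporary
// 5. Collapse the rows from step 4 and return the grouped nodes.
WITH count(*) AS break, elements
UNWIND elements AS element
RETURN element
```

As with the query above, nothing is returned when no node falls inside the window, because the second MATCH then produces no rows.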


