
Cypher: narrowing a search to nodes with the same property

My original question is here: cypher-how-get-relation-between-every-two-node-and-the-distance-from-start-node. More detail: there are 2 million companies, and each of them belongs to exactly one leading company, called a group, so every node has two properties: groupId and companyId. In addition, companies in different groups may have relationships with each other.

QUESTION: given a groupId and the leading company's id, return all relationships within the group, plus the shortest distance from the leading company to every node in the group.

The queries in the answers there have serious performance problems, especially the shortestPath one. So my question is: when using shortestPath, can we narrow the search scope so that it only visits nodes with the same property?

Or is there another way to solve the original question?

Sorry, since I am in mainland China I cannot reach console.neo4j.com (even with a VPN), so I am putting my sample here:

create (a :COMPANY {companyId:"a",groupId:"ag"}),
       (b:COMPANY  {companyId:"b",groupId:"ag"}),
       (c:COMPANY  {companyId:"c",groupId:"ag"}),
       (d:COMPANY {companyId:"d",groupId:"ag"}),
       (e:COMPANY  {companyId:"e",groupId:"eg"})
create (a)-[:INVESTMENT]->(b),
       (b)-[:INVESTMENT]->(c),
       (c)-[:INVESTMENT]->(d),
       (a)-[:INVESTMENT]->(c),
       (d)-[:INVESTMENT]->(b),
       (c)-[:INVESTMENT]->(e) 
return *

Here nodes a, b, c and d are in the same group and a is the leading company; e is in another group but has a relationship with c. I want the node-to-node relationships within group ag, for example: ab, ac, bc, cd, db, and the shortest distance from a to each group member, for example: dist.a=0, dist.b=1, dist.c=1, dist.d=2.

I don't think this can be solved with pure Cypher. You can try the APOC library: add a temporary property to the relationships in the group, then apply Dijkstra's algorithm using that property as the weight.

Input params:

{
  "groupId": "ag",
  "leadingCompany": "a"
}

Query:

// Search for a leading company
MATCH (lc:COMPANY {companyId: $leadingCompany, groupId: $groupId})
WITH lc, 
     apoc.create.uuid() as tmpProp // Temporary property name

// All relationships in the group are found. 
// And the value of the temporary property is set ..
MATCH (c1:COMPANY {groupId: $groupId})-[r:INVESTMENT]->(c2:COMPANY {groupId: $groupId})
CALL apoc.create.setRelProperty(r, tmpProp, 1) yield rel
WITH lc, tmpProp, 
     count(r) as tmp

// For each node in the group, need to find short paths to the leading company
MATCH (c:COMPANY {groupId: $groupId})
CALL apoc.algo.dijkstraWithDefaultWeight(lc, c, 'INVESTMENT', tmpProp, 2000000) yield path
WITH tmpProp, c, 
     min(length(path)) as distanceToLeading

// All paths in the group are found, and the temporary property is deleted
MATCH (c)-[r:INVESTMENT]->(:COMPANY {groupId: $groupId})
CALL apoc.create.removeRelProperties(r, [tmpProp]) yield rel
RETURN c as groupNode, distanceToLeading, 
       collect(r) as groupRelations

APOC Procedures can help out here: some of the path expander procedures can be used to find the shortest distance to each node in the group, and there's also a cover() procedure that finds all relationships between a group of nodes.

You'll want to make sure you have an index on groupId and on companyId for your company label first. Labels are case-sensitive, and the question's sample data uses :COMPANY, so use whichever spelling matches your data.
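As a minimal sketch, the two indexes could be created as follows. This assumes the :COMPANY label from the question's sample data; the index names are made up for illustration, and the exact syntax depends on your Neo4j version.

```cypher
// Neo4j 4.x and later (index names are illustrative, not from the original post):
CREATE INDEX company_group_id FOR (c:COMPANY) ON (c.groupId);
CREATE INDEX company_company_id FOR (c:COMPANY) ON (c.companyId);

// Neo4j 3.x equivalent:
// CREATE INDEX ON :COMPANY(groupId);
// CREATE INDEX ON :COMPANY(companyId);
```

With these in place, the lookups by groupId and companyId can use an index instead of scanning every company node.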

MATCH (c:COMPANY{groupId:$groupId})
WITH collect(c) as companies
WITH companies, [c in companies | id(c)] as companyIds, [c in companies 
 WHERE NOT (c)<-[:INVESTMENT]-(:COMPANY{groupId:$groupId})][0] as lead
// for the above, if you already know the lead companyId, just MATCH to the lead instead of this filter
CALL apoc.algo.cover(companyIds) YIELD rel
WITH companies, lead, collect(rel {start:startNode(rel).companyId, end:endNode(rel).companyId}) as relationships
UNWIND companies as company
MATCH path = shortestPath((lead)-[:INVESTMENT*]->(company))
WHERE all(node in nodes(path) WHERE node in companies)
RETURN relationships, collect(company {.companyId, distance:length(path)}) as distance

This query will get you the desired output:

 match p=((c:COMPANY{companyId:'a'})-[i:INVESTMENT*0..99]->(l:COMPANY)) 
    where l.groupId=c.groupId 
    with c,i,l,nodes(p) as path  order by c.companyId
    with c,l,collect(distinct l.companyId) as Companies,min(size(path))-1 as Dist
    match pp=shortestpath((cc:COMPANY{companyId:'a'})-[ii:INVESTMENT*0..99]->(ll:COMPANY)) 
    where ll.companyId in Companies
    with c,Companies,Dist,reduce(s='',x in nodes(pp)|s + x.companyId ) as CompanyPath     
return c.companyId,Companies,Dist,CompanyPath order by Dist

Note that it does not require advance knowledge of the groupId. If a lead company can be in two groups, you would need to include the groupId in the initial WHERE clause.
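To illustrate that last point, here is a simplified sketch of the first MATCH with the group pinned explicitly. It returns only the distances, not the full output of the query above, and it assumes a $groupId parameter like the one used in the earlier answers.

```cypher
// Sketch only: pin the lead company to one group via a $groupId parameter (assumed).
MATCH p = (c:COMPANY {companyId: 'a'})-[:INVESTMENT*0..99]->(l:COMPANY)
WHERE c.groupId = $groupId      // the extra condition for a lead in several groups
  AND l.groupId = c.groupId     // keep only end nodes in the same group
RETURN l.companyId AS companyId, min(length(p)) AS dist
ORDER BY dist
```

On the question's sample data with $groupId set to "ag", this should give distances 0, 1, 1 and 2 for a, b, c and d. Like the query above, it enumerates every path up to 99 hops and does not restrict intermediate nodes to the group, so it may be slow on a large graph.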
