Neo4J: How to find unique nodes from a collection of paths
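I am using neo4j to solve a real-time normalization problem. Say I have 3 places from 2 different sources. One source, feed 45, gives me 2 places that are actually duplicates of each other, and the other source, feed 55, gives me 1 correct identifier. For any place identifier (duplicate or not), I want to find the closest set of places, unique by feed identifier. My data looks like this: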
CREATE (a: Place {feedId:45, placeId: 123, name:"Empire State", address: "350 5th Ave", city: "New York", state: "NY", zip: "10118" })
CREATE (b: Place {feedId:45, placeId: 456, name:"Empire State Building", address: "350 5th Ave", city: "New York", state: "NY"})
CREATE (c: Place {feedId:55, placeId: 789, name:"Empire State", address: "350 5th Ave", city: "New York", state: "NY", zip: "10118"})
I have connected these nodes through Matching nodes so that I can normalize the data. For example:
MERGE (m1: Matching:NameAndCity { attr: "EmpireStateBuildingNewYork", cost: 5.0 })
MERGE (a)-[:MATCHES]-(m1)
MERGE (b)-[:MATCHES]-(m1)
MERGE (c)-[:MATCHES]-(m1)
MERGE (m2: Matching:CityAndZip { attr: "NewYork10118", cost: 7.0 })
MERGE (a)-[:MATCHES]-(m2)
MERGE (c)-[:MATCHES]-(m2)
When I want to find the closest matches from a starting place id, I can match all paths from the starting node and order them by cost, i.e.:
MATCH p=(a:Place {placeId:789, feedId:55})-[*..4]-(d:Place)
WHERE NONE (n IN nodes(p) WHERE size(filter(x IN nodes(p) WHERE n = x)) > 1)
WITH p,
     reduce(costAccum = 0, n IN filter(n IN nodes(p) WHERE has(n.cost)) | costAccum + n.cost) AS costAccum
ORDER BY costAccum
RETURN p, costAccum
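However, because there are multiple paths to the same place, the same nodes show up many times in the query result. Is it possible to collect the nodes and their costs and then return only a distinct subset (for example, give me the best result from feeds 45 and 55)?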
How can I return a distinct set of paths, ordered by cost and unique by feed identifier? Am I structuring this kind of problem wrong?
Please help!
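You can collect all the paths for each place d, then pick the best path from each collection (the paths are sorted before they are collected, so the first one is the cheapest):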
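MATCH p=(a:Place {placeId:789, feedId:55})-[*..4]-(d:Place)
// compute the cost of each path and sort first, so collect() sees the cheapest path first
WITH d, p,
     reduce(costAccum = 0, n IN filter(n IN nodes(p) WHERE has(n.cost)) | costAccum + n.cost) AS costAccum
ORDER BY costAccum
// one row per place d; min() keeps the aggregation from grouping on each path's cost
WITH d, collect(p) AS paths, min(costAccum) AS costAccum
RETURN head(paths) AS p, costAccum
ORDER BY costAccum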
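If the result has to be unique per feed rather than per place, the same sort-then-collect idea can be grouped on d.feedId instead. What follows is only a sketch under a few assumptions. It uses a list comprehension and IS NOT NULL in place of filter() and has(), which newer Neo4j releases no longer accept as far as I know. The WHERE d <> a filter, the names total, cost and candidates, and the returned columns are my own additions. It also relies on collect() keeping the order set by the preceding ORDER BY.

```cypher
MATCH p = (a:Place {placeId: 789, feedId: 55})-[*..4]-(d:Place)
// assumption: the starting place itself should not count as a match
WHERE d <> a
WITH d, p,
     // sum the cost of every node on the path that carries a cost property
     reduce(total = 0.0, n IN [x IN nodes(p) WHERE x.cost IS NOT NULL] | total + n.cost) AS cost
// sort before aggregating so the cheapest candidate is collected first
ORDER BY cost
// one row per feed; each candidate keeps its place, path and cost together
WITH d.feedId AS feedId, collect({place: d, path: p, cost: cost}) AS candidates
RETURN feedId,
       head(candidates).place AS place,
       head(candidates).path AS path,
       head(candidates).cost AS cost
ORDER BY cost
```

Assuming the sample data was loaded as intended, only feed 45 can come back for the starting place 789, because the start node is the only place from feed 55 and it is filtered out. Places 123 and 456 tie at cost 5.0 through the NameAndCity node, so which of the two is returned is not determined. Drop the WHERE d <> a line if the starting place should be allowed to represent its own feed.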