Gremlin traversal subtracting value from elements in list
Using the Graph of the Gods sample graph, with the following 'amount' property added:
rand = new Random()
g.withSack {rand.nextFloat()}.E().property('amount',sack())
The traversal below is based on https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-pearson/ and aims to compute the (Ai - mean(A)) and (Bi - mean(B)) terms:
g.V().match(
__.as('v1').outE().valueMap().select('amount').fold().as('e1'),
__.as('v1').V().as('v2'),
__.as('v2').outE().valueMap().select('amount').fold().as('e2'),
__.as('v1').outE().inV().dedup().fold().as('v1n'),
__.as('v2').outE().inV().dedup().fold().as('v2n')
).
where('v1',neq('v2').and(without('v1n'))).
where('v2',without('v1n')).
project('v1','v2','a1','a2','a1m','a2m').
by(select('v1')).
by(select('v2')).
by(select('e1')).
by(select('e2')).
by(select('e1').unfold().mean()).
by(select('e2').unfold().mean()).
where(select('a1').unfold().count().is(gt(0))).
where(select('a2').unfold().count().is(gt(0)))
Traversal output:
==>[a1:[v1:v[4096],v2:v[4248],a1:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a2:[0.7349615,0.80212617,0.6879539],a1m:0.5197273015975952,a2m:0.7416805227597555]]
==>[a1:[v1:v[4096],v2:v[4264],a1:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a2:[0.37226892,0.8902944,0.4158439,0.9709829],a1m:0.5197273015975952,a2m:0.6623475253582001]]
==>[a1:[v1:v[8192],v2:v[4096],a1:[0.32524675],a2:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a1m:0.32524675130844116,a2m:0.5197273015975952]]
==>[a1:[v1:v[8192],v2:v[4184],a1:[0.32524675],a2:[0.53761715,0.9604127,0.87463444,0.7719325],a1m:0.32524675130844116,a2m:0.786149188876152]]
==>[a1:[v1:v[8192],v2:v[4248],a1:[0.32524675],a2:[0.7349615,0.80212617,0.6879539],a1m:0.32524675130844116,a2m:0.7416805227597555]]
==>[a1:[v1:v[8192],v2:v[4264],a1:[0.32524675],a2:[0.37226892,0.8902944,0.4158439,0.9709829],a1m:0.32524675130844116,a2m:0.6623475253582001]]
==>[a1:[v1:v[4184],v2:v[4096],a1:[0.53761715,0.9604127,0.87463444,0.7719325],a2:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a1m:0.786149188876152,a2m:0.5197273015975952]]
==>[a1:[v1:v[4184],v2:v[8192],a1:[0.53761715,0.9604127,0.87463444,0.7719325],a2:[0.32524675],a1m:0.786149188876152,a2m:0.32524675130844116]]
==>[a1:[v1:v[4248],v2:v[4096],a1:[0.7349615,0.80212617,0.6879539],a2:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a1m:0.7416805227597555,a2m:0.5197273015975952]]
==>[a1:[v1:v[4248],v2:v[8192],a1:[0.7349615,0.80212617,0.6879539],a2:[0.32524675],a1m:0.7416805227597555,a2m:0.32524675130844116]]
==>[a1:[v1:v[4264],v2:v[4096],a1:[0.37226892,0.8902944,0.4158439,0.9709829],a2:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a1m:0.6623475253582001,a2m:0.5197273015975952]]
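How would one compute 'a1 - a1m' and 'a2 - a2m' from this point in the traversal? The problem here is subtracting a value from each element in a list and then returning the list of differences. Any help with an example would be great.
Since you already have all the values in a map, let's start from there.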
gremlin> __.inject(['a1': [0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],
......1> 'a2': [0.7349615,0.80212617,0.6879539],
......2> 'a1m': 0.5197273015975952,
......3> 'a2m': 0.7416805227597555])
==>[a1:[0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],a2:[0.7349615,0.80212617,0.6879539],a1m:0.5197273015975952,a2m:0.7416805227597555]
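Subtracting the mean (am) from each single value (ai) is as simple as unfolding a, doing the math (ai - am, or (am - ai) * (-1)) and folding the results back together. Because sack(minus) subtracts the current value from the sack, which holds the mean, it yields am - ai, hence the extra multiplication by -1: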
gremlin> __.inject(['a1': [0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],
......1> 'a2': [0.7349615,0.80212617,0.6879539],
......2> 'a1m': 0.5197273015975952,
......3> 'a2m': 0.7416805227597555]).
......4> sack(assign).
......5> by(select('a1m')).
......6> select('a1').unfold().
......7> sack(minus).
......8> sack(mult).
......9> by(constant(-1)).
.....10> sack().fold()
==>[-0.2139474315975952,0.3218897984024048,0.0050581984024048,0.0551158684024048,-0.1681164615975952]
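As a sanity check on the sack arithmetic, the same differences can be reproduced with plain Groovy collections, for example in the Gremlin Console. This is only a sketch for verifying the numbers, not part of the traversal; the variable names a1, a1m and diffs are made up for illustration.

```groovy
// Values copied from the injected map above.
// Decimal literals are BigDecimal in Groovy, so the subtraction is exact.
a1  = [0.30577987, 0.8416171, 0.5247855, 0.57484317, 0.35161084]
a1m = 0.5197273015975952

// Subtract the mean from every element (ai - am),
// mirroring the unfold / sack math / fold sequence of the traversal.
diffs = a1.collect { it - a1m }

// The list should agree with the traversal result shown above.
assert diffs == [-0.2139474315975952, 0.3218897984024048, 0.0050581984024048, 0.0551158684024048, -0.1681164615975952]
```

Doing the subtraction inside the traversal, as above, keeps the computation on the server side and avoids lambdas; the Groovy version is only convenient for checking results locally.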
So, for both values, it's just another projection:
gremlin> __.inject(['a1': [0.30577987,0.8416171,0.5247855,0.57484317,0.35161084],
......1> 'a2': [0.7349615,0.80212617,0.6879539],
......2> 'a1m': 0.5197273015975952,
......3> 'a2m': 0.7416805227597555]).
......4> project('a','b').
......5> by(sack(assign).
......6> by(select('a1m')).
......7> select('a1').unfold().
......8> sack(minus).
......9> sack(mult).
.....10> by(constant(-1)).
.....11> sack().fold()).
.....12> by(sack(assign).
.....13> by(select('a2m')).
.....14> select('a2').unfold().
.....15> sack(minus).
.....16> sack(mult).
.....17> by(constant(-1)).
.....18> sack().fold())
==>[a:[-0.2139474315975952,0.3218897984024048,0.0050581984024048,0.0551158684024048,-0.1681164615975952],b:[-0.0067190227597555,0.0604456472402445,-0.0537266227597555]]
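I guess there will be more steps needed to come up with the final value, and I'm sure the final query can be simplified a lot, but that's best handled in another thread.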
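For reference, this is roughly where the differences would end up. The standard Pearson similarity is written out below, assuming two lists A and B of equal length n (an assumption; the lists in the sample output above have different lengths, so they would first need to be aligned).

```latex
\[
\mathrm{similarity}(A, B)
  = \frac{\sum_{i=1}^{n} \bigl(A_i - \mathrm{mean}(A)\bigr)\bigl(B_i - \mathrm{mean}(B)\bigr)}
         {\sqrt{\sum_{i=1}^{n} \bigl(A_i - \mathrm{mean}(A)\bigr)^2 \,
                \sum_{i=1}^{n} \bigl(B_i - \mathrm{mean}(B)\bigr)^2}}
\]
```

In that notation, the lists a and b returned by the projection above hold the individual (Ai - mean(A)) and (Bi - mean(B)) terms.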