Concurrent Node Similarity Computation in ONgDB

Computing similarity between nodes

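All of the tests below were run on ONgDB.
In an ONgDB cluster, READ_REPLICA nodes can only run procedures declared as Mode.READ (every node in the cluster supports Mode.READ procedures). Procedures that write while they run must be executed on a node that accepts writes.
The concurrent (stream) procedures do not write. Persisting their results requires neo4j-stream or a separate plugin that performs the write; the results can also be stored in a different storage system.
The level of concurrency that can be sustained depends on the number of CPU cores on the server.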

I. Jaccard similarity - algo.similarity.jaccard - Mode.WRITE

Jaccard similarity is well suited to distributed, parallel computation over large datasets.

1. Create the test data

CREATE (a:Person {name:'Alice'})
CREATE (b:Person {name:'Bob'})
CREATE (c:Person {name:'Charlie'})
CREATE (d:Person {name:'Dana'})
CREATE (i1:Item {name:'p1'})
CREATE (i2:Item {name:'p2'})
CREATE (i3:Item {name:'p3'})
CREATE (a)-[:LIKES]->(i1),
 (a)-[:LIKES]->(i2),
 (a)-[:LIKES]->(i3),
 (b)-[:LIKES]->(i1),
 (b)-[:LIKES]->(i2),
 (c)-[:LIKES]->(i3)

2. Run the similarity computation

This procedure does not compute concurrently, but it can write its results.

MATCH (p:Person)-[:LIKES]->(i:Item)
WITH {item:id(p), categories: collect(distinct id(i))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard(data, {write:true,showComputations:true,similarityCutoff:0.1}) yield p25, p50, p75, p90, p95, p99, p999, p100, nodes, similarityPairs, computations RETURN *

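3. Similarity results

SIMILAR relationships are created between Person nodes and their score property is set, so the similarity score of each pair of people can be read directly from the graph.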
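The scores can be checked by hand. The following is a minimal Python sketch, not the library's implementation: it rebuilds the LIKES sets from the test data and applies the textbook Jaccard formula. The `likes` dictionary and the `jaccard` helper are names introduced here for illustration, and the cutoff filter at the end assumes that `similarityCutoff` simply drops pairs scoring below the threshold.

```python
from itertools import combinations

# LIKES sets taken from the CREATE statements above.
# Dana likes nothing, so the MATCH in the query never returns her.
likes = {
    "Alice": {"p1", "p2", "p3"},
    "Bob": {"p1", "p2"},
    "Charlie": {"p3"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Score every unordered pair of people.
scores = {
    (x, y): jaccard(likes[x], likes[y])
    for x, y in combinations(sorted(likes), 2)
}

# Assumption: a cutoff of 0.1 keeps only pairs whose score is at least 0.1.
kept = {pair: s for pair, s in scores.items() if s >= 0.1}

for pair, s in scores.items():
    print(pair, round(s, 3))
```

Alice and Bob share two of three distinct items, Alice and Charlie share one of three, and Bob and Charlie share none, so under the stated assumption only the first two pairs would survive the 0.1 cutoff.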

4. algo.similarity.jaccard.stream - Mode.READ

This procedure computes concurrently but does not write.

MATCH (p:Person)-[:LIKES]->(i:Item)
WITH {item:id(p), categories: collect(distinct id(i))} as userData
WITH collect(userData) as data
call algo.similarity.jaccard.stream(data,{topK:4,concurrency:4,similarityCutoff:-0.1}) yield item1, item2, count1, count2, intersection, similarity RETURN * ORDER BY item1,item2


II. Cosine similarity - algo.similarity.cosine - Mode.WRITE

1. Create the test data

CREATE (a:Person {name:'Alice'})
CREATE (b:Person {name:'Bob'})
CREATE (c:Person {name:'Charlie'})
CREATE (d:Person {name:'Dana'})
CREATE (i1:Item {name:'p1'})
CREATE (i2:Item {name:'p2'})
CREATE (i3:Item {name:'p3'})
CREATE (a)-[:LIKES {stars:1}]->(i1),
 (a)-[:LIKES {stars:2}]->(i2),
 (a)-[:LIKES {stars:5}]->(i3),
 (b)-[:LIKES {stars:1}]->(i1),
 (b)-[:LIKES {stars:3}]->(i2),
 (c)-[:LIKES {stars:4}]->(i3)

2. Run the similarity computation

MATCH (i:Item) WITH i ORDER BY id(i) MATCH (p:Person) OPTIONAL MATCH (p)-[r:LIKES]->(i)
WITH {item:id(p), weights: collect(coalesce(r.stars,0))} as userData
WITH collect(userData) as data
CALL algo.similarity.cosine(data, {write:true,showComputations:true,similarityCutoff:0.1}) yield p25, p50, p75, p90, p95, p99, p999, p100, nodes, similarityPairs, computations RETURN *
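To make the shape of the input concrete, here is a rough Python sketch of the weight vectors the query builds and of the plain cosine formula applied to them. It is a hand calculation under stated assumptions, not the procedure's code: the `weights` dictionary and the `cosine` helper are illustrative names, zeros are treated as ordinary values, and an all-zero vector (Dana) is given a score of 0.0 by convention here. The procedure may treat such cases differently depending on its configuration.

```python
import math

# Vectors in item order (p1, p2, p3). A missing rating becomes 0,
# mirroring coalesce(r.stars, 0) in the query above.
weights = {
    "Alice": [1, 2, 5],
    "Bob": [1, 3, 0],
    "Charlie": [0, 0, 4],
    "Dana": [0, 0, 0],
}


def cosine(u, v):
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Assumption for this sketch: an all-zero vector scores 0.0 instead of raising.
    return dot / norm if norm else 0.0


alice_bob = cosine(weights["Alice"], weights["Bob"])
alice_charlie = cosine(weights["Alice"], weights["Charlie"])
bob_charlie = cosine(weights["Bob"], weights["Charlie"])

print(round(alice_bob, 4), round(alice_charlie, 4), round(bob_charlie, 4))
```

Ordering the items by id before collecting is what keeps every person's vector aligned position by position; without it the dot product would mix ratings of different items.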

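3. algo.similarity.cosine.stream - Mode.READ

This procedure streams the cosine similarity of each pair and does not write.

MATCH (i:Item) WITH i ORDER BY id(i) MATCH (p:Person) OPTIONAL MATCH (p)-[r:LIKES]->(i)
WITH {item:id(p), weights: collect(coalesce(r.stars,0))} AS userData
WITH collect(userData) AS data
CALL algo.similarity.cosine.stream(data, {similarityCutoff:-1,degreeCutoff:0}) YIELD item1, item2, count1, count2, intersection, similarity RETURN item1, item2, count1, count2, intersection, similarity ORDER BY item1,item2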

III. Pearson similarity - algo.similarity.pearson - Mode.WRITE

1. Create the test data

CREATE (a:Person {name:'Alice'})
CREATE (b:Person {name:'Bob'})
CREATE (c:Person {name:'Charlie'})
CREATE (d:Person {name:'Dana'})
CREATE (i1:Item {name:'p1'})
CREATE (i2:Item {name:'p2'})
CREATE (i3:Item {name:'p3'})
CREATE (i4:Item {name:'p4'})
CREATE (a)-[:LIKES {stars:1}]->(i1),
 (a)-[:LIKES {stars:2}]->(i2),
 (a)-[:LIKES {stars:3}]->(i3),
 (a)-[:LIKES {stars:4}]->(i4),
 (b)-[:LIKES {stars:2}]->(i1),
 (b)-[:LIKES {stars:3}]->(i2),
 (b)-[:LIKES {stars:4}]->(i3),
 (b)-[:LIKES {stars:5}]->(i4),
 (c)-[:LIKES {stars:3}]->(i1),
 (c)-[:LIKES {stars:4}]->(i2),
 (c)-[:LIKES {stars:4}]->(i3),
 (c)-[:LIKES {stars:5}]->(i4),
 (d)-[:LIKES {stars:3}]->(i2),
 (d)-[:LIKES {stars:2}]->(i3),
 (d)-[:LIKES {stars:5}]->(i4)

2. Run the similarity computation

MATCH (i:Item) WITH i ORDER BY id(i) MATCH (p:Person) OPTIONAL MATCH (p)-[r:LIKES]->(i)
WITH {item:id(p), weights: collect(coalesce(r.stars,0))} as userData
WITH collect(userData) as data
CALL algo.similarity.pearson(data,{similarityCutoff:-0.1}) yield p25, p50, p75, p90, p95, p99, p999, p100, nodes, similarityPairs, computations RETURN *
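As a sanity check on what a Pearson score means for this data, the sketch below applies the standard Pearson correlation formula to the vectors the query produces. It is only a hand calculation under assumptions: the `ratings` dictionary and the `pearson` helper are illustrative names, Dana's missing rating for p1 is filled with 0 as in the query, and a vector with zero variance is scored 0.0 by convention. The procedure's handling of missing values may differ depending on its configuration.

```python
import math

# Vectors in item order (p1, p2, p3, p4); Dana has no rating for p1, so it is 0.
ratings = {
    "Alice": [1, 2, 3, 4],
    "Bob": [2, 3, 4, 5],
    "Charlie": [3, 4, 4, 5],
    "Dana": [0, 3, 2, 5],
}


def pearson(u, v):
    """Covariance of u and v divided by the product of their standard deviations."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    cov = sum(a * b for a, b in zip(du, dv))
    denom = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    # Assumption for this sketch: a constant vector scores 0.0 instead of raising.
    return cov / denom if denom else 0.0


alice_bob = pearson(ratings["Alice"], ratings["Bob"])
alice_charlie = pearson(ratings["Alice"], ratings["Charlie"])

print(round(alice_bob, 4), round(alice_charlie, 4))
```

Bob rates every item exactly one star higher than Alice, so the two move together perfectly and the correlation is 1 even though the raw ratings differ. This offset-invariance is what separates Pearson from cosine similarity.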

3. algo.similarity.pearson.stream - Mode.READ

// $missingValue is a query parameter and must be supplied before running (for example 0, as in the query above).
MATCH (i:Item) WITH i ORDER BY id(i) MATCH (p:Person) OPTIONAL MATCH (p)-[r:LIKES]->(i)
WITH p, i, r ORDER BY id(p), id(i) WITH {item:id(p), weights: collect(coalesce(r.stars,$missingValue))} as userData
WITH collect(userData) as data
call algo.similarity.pearson.stream(data,{similarityCutoff:-0.1}) yield item1, item2, count1, count2, intersection, similarity RETURN item1, item2, count1, count2, intersection, similarity ORDER BY item1,item2

IV. Euclidean distance - algo.similarity.euclidean - Mode.WRITE

1. Create the test data

CREATE (a:Person {name:'Alice'})
CREATE (b:Person {name:'Bob'})
CREATE (c:Person {name:'Charlie'})
CREATE (d:Person {name:'Dana'})
CREATE (i1:Item {name:'p1'})
CREATE (i2:Item {name:'p2'})
CREATE (i3:Item {name:'p3'})
CREATE (a)-[:LIKES {stars:1}]->(i1),
 (a)-[:LIKES {stars:2}]->(i2),
 (a)-[:LIKES {stars:5}]->(i3),
 (b)-[:LIKES {stars:1}]->(i1),
 (b)-[:LIKES {stars:3}]->(i2),
 (c)-[:LIKES {stars:4}]->(i3)

2. Run the similarity computation

MATCH (i:Item) WITH i ORDER BY id(i) MATCH (p:Person) OPTIONAL MATCH (p)-[r:LIKES]->(i)
WITH {item:id(p), weights: collect(coalesce(r.stars,0))} AS userData
WITH collect(userData) AS data
CALL algo.similarity.euclidean(data, {similarityCutoff:-0.1}) YIELD p25, p50, p75, p90, p95, p99, p999, p100, nodes, similarityPairs, computations RETURN *
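Unlike the other measures in this post, Euclidean distance gets smaller as two people become more alike. The sketch below computes it by hand for the vectors the query builds. It is an illustration under assumptions rather than the procedure's code: `weights` and `euclidean` are names introduced here, and missing ratings are counted as 0, as in the query.

```python
import math

# Vectors in item order (p1, p2, p3); a missing rating becomes 0.
weights = {
    "Alice": [1, 2, 5],
    "Bob": [1, 3, 0],
    "Charlie": [0, 0, 4],
}


def euclidean(u, v):
    """Square root of the summed squared differences, position by position."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


alice_bob = euclidean(weights["Alice"], weights["Bob"])
alice_charlie = euclidean(weights["Alice"], weights["Charlie"])
bob_charlie = euclidean(weights["Bob"], weights["Charlie"])

print(round(alice_bob, 3), round(alice_charlie, 3), round(bob_charlie, 3))
```

With zeros standing in for missing ratings, Alice ends up closer to Charlie than to Bob, because the 5-star gap on p3 dominates the Alice and Bob distance. The choice of fill value therefore affects the ranking.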

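3. algo.similarity.euclidean.stream - Mode.READ

// $missingValue is a query parameter and must be supplied before running (for example 0, as in the query above).
MATCH (i:Item) WITH i ORDER BY id(i) MATCH (p:Person) OPTIONAL MATCH (p)-[r:LIKES]->(i)
WITH {item:id(p), weights: collect(coalesce(r.stars,$missingValue))} AS userData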
WITH collect(userData) AS data
CALL algo.similarity.euclidean.stream(data,{similarityCutoff:-0.1}) YIELD item1, item2, count1, count2, intersection, similarity RETURN item1, item2, count1, count2, intersection, similarity ORDER BY item1,item2

V. Overlap similarity - algo.similarity.overlap - Mode.WRITE

1. Create the test data

CREATE (a:Person {name:'Alice'})
CREATE (b:Person {name:'Bob'})
CREATE (c:Person {name:'Charlie'})
CREATE (d:Person {name:'Dana'})
CREATE (i1:Item {name:'p1'})
CREATE (i2:Item {name:'p2'})
CREATE (i3:Item {name:'p3'})
CREATE (a)-[:LIKES]->(i1),
 (a)-[:LIKES]->(i2),
 (a)-[:LIKES]->(i3),
 (b)-[:LIKES]->(i1),
 (b)-[:LIKES]->(i2),
 (c)-[:LIKES]->(i3)

2. Run the similarity computation

MATCH (p:Person)-[:LIKES]->(i:Item)
WITH {item:id(p), categories: collect(distinct id(i))} as userData
WITH collect(userData) as data
CALL algo.similarity.overlap(data, {similarityCutoff:-0.1}) yield p25, p50, p75, p90, p95, p99, p999, p100, nodes, similarityPairs, computations RETURN p25, p50, p75, p90, p95, p99, p999, p100, nodes, similarityPairs, computations
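For comparison with the Jaccard section, the following sketch applies the textbook overlap coefficient to the same LIKES sets. It is a hand calculation under assumptions: `likes` and `overlap` are illustrative names, and the way the procedure orders or reports each pair is not modelled.

```python
from itertools import combinations

# LIKES sets taken from the CREATE statements above (Dana likes nothing).
likes = {
    "Alice": {"p1", "p2", "p3"},
    "Bob": {"p1", "p2"},
    "Charlie": {"p3"},
}


def overlap(a, b):
    """Size of the intersection divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


scores = {
    (x, y): overlap(likes[x], likes[y])
    for x, y in combinations(sorted(likes), 2)
}

for pair, s in scores.items():
    print(pair, s)
```

Because Bob's and Charlie's sets are both fully contained in Alice's, each scores 1.0 against her, whereas Jaccard gave 2/3 and 1/3. The overlap coefficient measures containment rather than overall resemblance.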

3. algo.similarity.overlap.stream - Mode.READ

MATCH (p:Person)-[:LIKES]->(i:Item)
WITH {item:id(p), categories: collect(distinct id(i))} as userData
WITH collect(userData) as data
call algo.similarity.overlap.stream(data,{similarityCutoff:-0.1}) yield item1, item2, count1, count2, intersection, similarity RETURN item1, item2, count1, count2, intersection, similarity ORDER BY item1,item2

Notes

Location of the similarity source code in the neo4j-graph-algorithms package:
algo/src/main/java/org/neo4j/graphalgo/similarity
