Graph Databases: Introduction to Neo4j and Getting Started with Cypher

1. Introduction to Neo4j
2. Single-Machine Installation and Basic Usage (Community Edition)
3. Cypher Query Language
3.1. Basic Syntax
3.2. Patterns in Practice
3.3. Getting the Results You Want
3.4. Compose Large Statements
3.5. Utilizing Data Structures
3.6. Labels, Constraints and Indexes


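1. Introduction to Neo4j

Neo4j is a high-performance NoSQL graph database that stores structured data as a network (a graph) rather than in tables. It can also be viewed as a high-performance graph engine with all the features of a mature database, such as transactions and indexes. Developers work with a flexible, object-oriented network structure instead of rigid, static tables, while still getting the benefits of a fully transactional, enterprise-grade database.

A graph contains two basic data types: Nodes and Relationships. Both can carry properties in key/value form. Nodes are connected to one another by Relationships, forming a network structure.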
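As a rough illustration of the property graph model just described, here is a minimal Python sketch. The `Node` and `Relationship` classes are hypothetical stand-ins written for this article; they are not part of any Neo4j API.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                                   # e.g. {"Person"}
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    start: Node                                   # relationships are directed: start -> end
    end: Node
    type: str                                     # e.g. "ACTED_IN"
    properties: dict = field(default_factory=dict)

keanu = Node({"Person"}, {"name": "Keanu Reeves"})
matrix = Node({"Movie"}, {"title": "The Matrix"})
acted_in = Relationship(keanu, matrix, "ACTED_IN", {"roles": ["Neo"]})

print(acted_in.start.properties["name"], "-[" + acted_in.type + "]->",
      acted_in.end.properties["title"])
```

Both nodes and relationships hold a property dictionary, which mirrors the key/value properties mentioned above.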

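Client interfaces currently include neo4j-shell, the REST API, and drivers (Java, .NET, JS, Python, Ruby, PHP, and others). At the time of writing, the latest version is 2.3.0, available in Community and Enterprise editions; the Enterprise edition supports HA (high availability). Neo4j's query language is Cypher, whose syntax resembles SQL. Data stored in Neo4j can be processed and analyzed with Spark's GraphX graph computation library.

Neo4j is attracting growing attention for advantages such as being embeddable, high-performance, and lightweight.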


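2. Single-Machine Installation and Basic Usage (Community Edition)

Download: http://neo4j.com/download/

Unpack the archive: tar -zxvf neo4j-community-2.3.0-M02-unix.tar.gz

Configure the IP address for external access in conf/neo4j-server.properties:
org.neo4j.server.webserver.address=0.0.0.0
Start the database service: neo4j/bin/neo4j start


Open http://localhost:7474/browser/ in a browser to work with the graph. The default username and password are neo4j/neo4j; you must change the password after logging in. [Figure: a graph-creation statement and its result view in the browser]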




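3. Basic Use of the Cypher Query Language
3.1. Basic Syntax
Node syntax:
Cypher uses a pair of parentheses to represent a node. Several forms are available:
() an anonymous node
(matrix) a node with an identifier (matrix)
(:Movie) a node with the label Movie, which declares the node's type. Neo4j indexes are built on labels: each index is defined by a label and a property
(matrix:Movie)
(matrix:Movie {title: "The Matrix"}) node properties (such as title) are a list of key/value pairs
(matrix:Movie {title: "The Matrix", released: 1997})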

Relationship syntax:
-- an undirected relationship
--> a directed relationship
[] square brackets add details such as an identifier, properties, and a type
-[role]->
-[:ACTED_IN]->
-[role:ACTED_IN]->
-[role:ACTED_IN {roles: ["Neo"]}]->

Pattern syntax:
Combining node and relationship syntax expresses a pattern.
(keanu:Person:Actor   {name: "Keanu Reeves"} )
-[role:ACTED_IN   {roles: ["Neo"] } ]->
(matrix:Movie    {title: "The Matrix"} )
Pattern identifiers:
Assign an identifier to a pattern to increase modularity and reduce repetition:
acted_in = (:Person)-[:ACTED_IN]->(:Movie)

3.2. Patterns in Practice
Start the shell with bin/neo4j-shell.
Create a node:
CREATE (:Movie { title:"The Matrix",released:1997 }) ;


To return the created data, assign an identifier to the node:
create (p:Person {name:"weiw",born:2000}) return p;


Create several nodes and relationships at once; separate the elements with commas or with additional CREATE clauses:
create (a:Person {name:"jiaj",born:2003})-[r:ACTED_IN {roles:["student"]}]->(m:School {name:"CDLG",address:"chengdu"})
create (d:Person {name:"weiw",born:2001})-[:DIRECTED]->(m)
return a,d,r,m;


Matching Patterns:
To connect new data to an existing structure, we first need to know how to find patterns that already exist in the graph.
match (m:School) return m;


match (p:Person {name:"weiw"}) return p;


match (p:Person {name:"jiaj"})-[r:ACTED_IN]->(m:School) return m.name,r.roles;


Attaching Structures:
Combine MATCH and CREATE to attach a new node to a node that was matched.
match (p:Person {name:"jiaj"})
create (m:School {name:"DEJY",address:"deyang"})
create (p)-[r:ACTED_IN {roles:["student"]}]->(m)
return p,r,m;


Completing Patterns:
MERGE looks up the pattern: if it is found, it is returned; if not, it is created. This avoids creating duplicate nodes.
merge (m:School {name:"SCDX"})
on create set m.address="chengdu"
return m;

MATCH (m:School { name:"CDLG" })
MATCH (p:Person { name:"jiaj" })
MERGE (p)-[r:ACTED_IN]->(m)
ON CREATE SET r.roles =['teacher']
RETURN p,r,m ;
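The find-or-create behavior of MERGE with ON CREATE SET can be sketched in Python. This is only a model under stated assumptions: the `nodes` list and the `merge_node` helper are hypothetical and stand in for the graph; they are not a Neo4j API.

```python
# Hypothetical in-memory stand-in for the graph: a list of node dicts.
nodes = []

def merge_node(label, key, on_create=None):
    """Return (node, created): reuse a node matching label and key properties, else create it."""
    for node in nodes:
        if node["label"] == label and all(node["props"].get(k) == v for k, v in key.items()):
            return node, False            # found: nothing is created or changed
    props = dict(key)
    props.update(on_create or {})         # ON CREATE SET applies only to newly created nodes
    node = {"label": label, "props": props}
    nodes.append(node)
    return node, True

first, created_first = merge_node("School", {"name": "SCDX"}, {"address": "chengdu"})
second, created_second = merge_node("School", {"name": "SCDX"}, {"address": "elsewhere"})
print(created_first, created_second, len(nodes), second["props"]["address"])
```

The second call finds the existing node, so no duplicate is created and the ON CREATE value is ignored.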



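In the previous example, if the direction of the relationship does not matter, you can leave off the arrowhead. MERGE then checks for the relationship in either direction and creates a new directed relationship if no match is found.
If only one node from a preceding clause is passed in, MERGE matches the pattern only within that node's direct neighborhood and creates it if it is not found: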
CREATE (y:Year { year:2014 })
MERGE (y)<-[:IN_YEAR]-(m10:Month { month:10 })
MERGE (y)<-[:IN_YEAR]-(m11:Month { month:11 })
RETURN y,m10,m11 ;


3.3. Getting the Results You Want
Data preparation, using the roles people play in movies as the example:
CREATE (matrix:Movie { title:"The Matrix",released:1997 })
CREATE (cloudAtlas:Movie { title:"Cloud Atlas",released:2012 })
CREATE (forrestGump:Movie { title:"Forrest Gump",released:1994 })
CREATE (keanu:Person { name:"Keanu Reeves", born:1964 })
CREATE (robert:Person { name:"Robert Zemeckis", born:1951 })
CREATE (tom:Person { name:"Tom Hanks", born:1956 })
CREATE (tom)-[:ACTED_IN { roles: ["Forrest"]}]->(forrestGump)
CREATE (tom)-[:ACTED_IN { roles: ['Zachry']}]->(cloudAtlas)
CREATE (robert)-[:DIRECTED]->(forrestGump)


Filtering Results:
Common boolean operators: AND, OR, XOR, and NOT
match (m:Movie) where m.title="The Matrix" return m;


MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name =~ "K.+" OR m.released > 2000 OR "Neo" IN r.roles 
RETURN p,r,m ;
With this data set, the last condition, on roles, matches nothing: no ACTED_IN relationship has the role "Neo".
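To see which rows pass the three-way OR filter, the predicate can be re-expressed in Python over the ACTED_IN rows from the data preparation step. This is a sketch, not how Neo4j evaluates the query; the `rows` list and `keep` function are illustrative names. Keanu Reeves has no ACTED_IN relationship in this data set, so the pattern yields only Tom Hanks' two rows.

```python
import re

# The two ACTED_IN rows created in the data preparation step.
rows = [
    {"name": "Tom Hanks", "title": "Forrest Gump", "released": 1994, "roles": ["Forrest"]},
    {"name": "Tom Hanks", "title": "Cloud Atlas", "released": 2012, "roles": ["Zachry"]},
]

def keep(row):
    # Cypher's =~ must match the whole string, like re.fullmatch.
    return (re.fullmatch(r"K.+", row["name"]) is not None
            or row["released"] > 2000
            or "Neo" in row["roles"])

matches = [row["title"] for row in rows if keep(row)]
print(matches)
```

Only the Cloud Atlas row survives, and only because of the `released > 2000` condition.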


MATCH (p:Person)-[:ACTED_IN]->(m)
WHERE NOT (p)-[:DIRECTED]->()
RETURN p,m ;

Returning Results:
Results can be numbers, strings, arrays such as [1,2,3], and maps such as {name:"Tom Hanks", born:1964, movies:["Forrest Gump", ...], count:13}.
Common expressions:
names[0], movies[1..-1], length(array), toInt("12"), substring("2014-07-01",0,4), coalesce(p.nickname,"n/a"), and DISTINCT

MATCH (p:Person) 
RETURN p, p.name AS name, upper(p.name), coalesce(p.nickname,"n/a") AS nickname, { name: p.name, label:head(labels(p))} AS person;
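For readers more familiar with Python, the common expressions listed above have close analogues. The sketch below uses made-up sample lists (the `movies` contents are an assumption for illustration), and `coalesce` is a hypothetical helper written here, not a built-in.

```python
names = ["Tom Hanks", "Keanu Reeves"]
movies = ["Forrest Gump", "Cloud Atlas", "The Matrix", "Apollo 13"]

def coalesce(*values):
    """Return the first argument that is not None (Cypher: first non-NULL value)."""
    return next((v for v in values if v is not None), None)

first_name = names[0]                 # names[0]
middle = movies[1:-1]                 # movies[1..-1]: from index 1 up to, not including, the last
size = len(movies)                    # length(array)
number = int("12")                    # toInt("12")
year = "2014-07-01"[0:0 + 4]          # substring("2014-07-01", 0, 4): start 0, length 4
nickname = coalesce(None, "n/a")      # coalesce(p.nickname, "n/a") when nickname is missing

print(first_name, middle, size, number, year, nickname)
```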


Aggregating Information:
Common aggregations: count, sum, avg, min, and max; use DISTINCT, as in count(DISTINCT role), to count unique values. NULL values are skipped automatically.
MATCH (:Person)
RETURN count(*) AS people


To find out how often an actor and director worked together, you’d run this statement:
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(director:Person)
RETURN actor,director,count(*) AS collaborations
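In a statement like this one, the non-aggregated return columns (actor and director) act as the grouping key for count(*). A minimal Python sketch of that implicit grouping, using hypothetical matched pairs rather than data from this article:

```python
from collections import Counter

# Hypothetical (actor, director) pairs, one per matched path.
matched_paths = [
    ("Tom Hanks", "Robert Zemeckis"),
    ("Tom Hanks", "Robert Zemeckis"),
    ("Keanu Reeves", "Lana Wachowski"),
]

# Non-aggregated columns form the grouping key; count(*) counts rows per key.
collaborations = Counter(matched_paths)
for (actor, director), count in collaborations.items():
    print(actor, director, count)
```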

Ordering and Pagination:
Sorting: ORDER BY person.age
Pagination: SKIP {offset} LIMIT {count}
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a,count(*) AS appearances
ORDER BY appearances DESC LIMIT 10;
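Sorting followed by SKIP and LIMIT behaves like sorting a list and then slicing it. The Python sketch below assumes some made-up aggregated rows; the `page` helper is a hypothetical name used only for illustration.

```python
# Hypothetical aggregated rows: (actor, appearances).
rows = [("Tom Hanks", 13), ("Keanu Reeves", 7), ("Meg Ryan", 5), ("Hugo Weaving", 9)]

def page(rows, offset, count):
    """Model of: ORDER BY appearances DESC SKIP offset LIMIT count."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return ordered[offset:offset + count]

first_page = page(rows, 0, 2)
second_page = page(rows, 2, 2)
print(first_page)
print(second_page)
```

Pagination is only stable when the ordering is deterministic, so in practice it helps to sort on a column that breaks ties.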

Collecting Aggregation:
collect() gathers all aggregated values into a real array or list.
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title AS movie, collect(a.name) AS cast, count(*) AS actors


3.4. Compose Large Statements
UNION:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN p,type(r) AS rel,m
UNION
MATCH (p:Person)-[r:DIRECTED]->(m:Movie)
RETURN p,type(r) AS rel,m
WITH: chains query parts together, passing the results of one part on to the next.
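A common use of WITH is to aggregate first and then filter on the aggregate. The Python sketch below models that two-stage flow with hypothetical rows; the Cypher clauses named in the comments are an illustrative query shape, not one taken from this article.

```python
from collections import defaultdict

# Hypothetical (person, movie title) rows from a MATCH on ACTED_IN.
acted_in = [
    ("Tom Hanks", "Forrest Gump"),
    ("Tom Hanks", "Cloud Atlas"),
    ("Keanu Reeves", "The Matrix"),
]

# Stage 1, like: WITH person, count(*) AS appearances, collect(m.title) AS movies
grouped = defaultdict(list)
for person, title in acted_in:
    grouped[person].append(title)
stage1 = [(person, len(titles), titles) for person, titles in grouped.items()]

# Stage 2, like: WHERE appearances > 1 RETURN person, appearances, movies
result = [row for row in stage1 if row[1] > 1]
print(result)
```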

3.5. Utilizing Data Structures
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title AS movie, collect(a.name)[0..5] AS five_of_cast

List predicates:
When using lists and arrays in comparisons you can use predicates like value IN list or any(x IN list WHERE x = value). There are list predicates to satisfy conditions for all, any, none and single elements.
MATCH path =(:Person)-->(:Movie)<--(:Person)
WHERE ALL (r IN rels(path) WHERE type(r)= 'ACTED_IN') AND ANY (n IN nodes(path) WHERE n.name = 'Clint Eastwood')
RETURN path
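The four list predicates map onto simple Python expressions. The sketch below evaluates them over one hypothetical path; the movie node has no name property, which is modeled as None. The data is made up for illustration.

```python
# Hypothetical path: relationship types and the name property of each node along it.
rel_types = ["ACTED_IN", "ACTED_IN"]
node_names = ["Clint Eastwood", None, "Gene Hackman"]   # the Movie node has no name

all_acted_in = all(t == "ACTED_IN" for t in rel_types)               # ALL(...)
any_clint = any(n == "Clint Eastwood" for n in node_names)           # ANY(...)
none_directed = not any(t == "DIRECTED" for t in rel_types)          # NONE(...)
single_clint = sum(n == "Clint Eastwood" for n in node_names) == 1   # SINGLE(...)

path_matches = all_acted_in and any_clint
print(path_matches, none_directed, single_clint)
```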

List processing:
Often you want to process lists to filter, aggregate (reduce), or transform (extract) their values.
WITH range(1,10) AS numbers
WITH extract(n IN numbers | n*n) AS squares
WITH filter(n IN squares WHERE n > 25) AS large_squares
RETURN reduce(a = 0, n IN large_squares | a + n) AS sum_large_squares;
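As a cross-check of what this pipeline computes, here is the same transform, filter, and reduce chain in Python. Cypher's range(1,10) includes both endpoints, so the Python equivalent is range(1, 11).

```python
from functools import reduce

numbers = range(1, 11)                              # range(1,10) is inclusive in Cypher
squares = [n * n for n in numbers]                  # extract(n IN numbers | n*n)
large_squares = [n for n in squares if n > 25]      # filter(n IN squares WHERE n > 25)
sum_large_squares = reduce(lambda a, n: a + n, large_squares, 0)   # reduce(a = 0, ...)

print(sum_large_squares)  # 36 + 49 + 64 + 81 + 100 = 330
```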


MATCH (m:Movie)<-[r:ACTED_IN]-(a:Person)
WITH m.title AS movie, collect({ name: a.name, roles: r.roles }) AS cast
RETURN movie, extract(c2 IN filter(c1 IN cast WHERE c1.name =~ "T.*")| c2.roles)

Unwind Lists:
Sometimes you have collected information into a list but want to use each element individually as a row. For instance, you might want to further match patterns in the graph.
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(colleague:Person)
WITH colleague, count(*) AS frequency, collect(DISTINCT m) AS movies
ORDER BY frequency DESC LIMIT 5
UNWIND movies AS m
MATCH (m)<-[:ACTED_IN]-(a)
RETURN m.title AS movie, collect(a.name) AS cast
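What UNWIND does to the row stream can be sketched in Python as a flattening step: each list element becomes its own row, and the other columns are repeated. The rows below are hypothetical and only illustrate the shape of the data.

```python
# Hypothetical rows after the WITH step: each colleague carries a list of movies.
rows = [
    {"colleague": "Meg Ryan", "movies": ["Sleepless in Seattle", "You've Got Mail"]},
    {"colleague": "Gary Sinise", "movies": ["Forrest Gump"]},
]

# UNWIND movies AS m: one output row per list element, other columns repeated.
unwound = [
    {"colleague": row["colleague"], "m": movie}
    for row in rows
    for movie in row["movies"]
]

for row in unwound:
    print(row["colleague"], "|", row["m"])
```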

3.6. Labels, Constraints and Indexes
Using constraints: enforce that title is unique.
Adding the unique constraint also adds an index on that property.
CREATE CONSTRAINT ON (movie:Movie) ASSERT movie.title IS UNIQUE
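Conceptually, a unique constraint combines two things: a lookup structure keyed by the property (the index) and a check that rejects duplicates. The Python sketch below models this idea only; `UniqueTitleIndex` is a hypothetical class and says nothing about how Neo4j implements constraints internally.

```python
class UniqueTitleIndex:
    """Hypothetical model: a unique constraint backed by a lookup index on title."""

    def __init__(self):
        self._by_title = {}                   # the index: title -> movie properties

    def create(self, title, **props):
        if title in self._by_title:           # the constraint: reject duplicate titles
            raise ValueError("Movie with title %r already exists" % title)
        self._by_title[title] = dict(props, title=title)
        return self._by_title[title]

    def find(self, title):
        return self._by_title.get(title)      # index lookup instead of a full scan

movies = UniqueTitleIndex()
movies.create("The Matrix", released=1997)
try:
    movies.create("The Matrix", released=1999)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected, movies.find("The Matrix"))
```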

Create an index:
CREATE INDEX ON :Actor(name)
CREATE (actor:Actor { name:"Tom Hanks" }),(movie:Movie { title:'Sleepless in Seattle' }), (actor)-[:ACTED_IN]->(movie);

Add a label:
MATCH (actor:Actor { name: "Tom Hanks" })  SET actor :American return actor; 
Remove a label:
MATCH (actor:Actor { name: "Tom Hanks" })  REMOVE actor:American;


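This article covers only entry-level use of Neo4j. Later articles will cover the full Cypher syntax, the use of Neo4j-JDBC, and HA installation in more detail.

Official documentation manual download: http://download.csdn.net/detail/wangweislk/8983743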


