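For an explanation of Elasticsearch concepts, see the official blog. For a first impression of how Elasticsearch works, the article 《終於有人把Elasticsearch原理講透了》 is a good starting point.
Reference for these notes:
Assuming Elasticsearch and Kibana are already installed, enter ip:5601 in a browser to open the Kibana interface. Everything below uses the Dev Tools feature in the Kibana menu.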
Creating a document
Run the following command:
POST twitter/_doc/1
{
"username":"Dannis",
"uid":1
}
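This request sends a POST to Elasticsearch that indexes a document with ID 1 into the twitter index. If the index does not exist yet, it is created automatically. Click the triangle icon to run the request.
The response is: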
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
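Let's go through the response. _index shows that the index is named twitter, _type is the document type _doc, and _id is the document ID 1. These are all values we specified in the request. _version is the document's version number, and result shows that the operation was a creation (created). _shards reports on the shard copies involved in the write: 2 in total, 1 successful, 0 failed. What does that mean?
Run GET _cat/shards/twitter, which lists the shard information for an index. The output is: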
twitter 0 p STARTED 1 3.8kb 172.29.0.2 09baeea2d96a
twitter 0 r UNASSIGNED
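The twitter index has two shard copies. p stands for primary, the primary shard, and STARTED means the shard is up and working normally. So what is r?
Let's look at the index settings next. Enter and run: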
GET twitter/_settings
The response is:
{
"twitter" : {
"settings" : {
"index" : {
"creation_date" : "1591930378288",
"number_of_shards" : "1",
"number_of_replicas" : "1",
"uuid" : "3vc_AZQcQbGM-a6FtTSX5g",
"version" : {
"created" : "7070199"
},
"provided_name" : "twitter"
}
}
}
}
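This shows the index details. creation_date is the time the index was created, as a timestamp in milliseconds since the epoch. number_of_shards is the number of primary shards, which defaults to 1. number_of_replicas is the number of replica copies per primary shard, which also defaults to 1.
Going back to the output of GET _cat/shards/twitter: r stands for replica, a copy of the primary shard, and UNASSIGNED means the replica has not been allocated to any node. Why is that? My local Elasticsearch runs in development mode with a single node, and Elasticsearch never places a replica on the same node as its primary, so the replica has nowhere to go. In a cluster with more nodes, the replica is allocated automatically to one of the other nodes. This also explains the _shards block in the write response: of the 2 shard copies, only the primary could take the write.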
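To make the _shards arithmetic concrete, here is a minimal Python sketch that splits the block into counts. The function name summarize_shards and the not_attempted label are made up for this illustration and are not part of any Elasticsearch API. It assumes only the three fields shown in the response above.

```python
def summarize_shards(shards: dict) -> dict:
    """Break a write response's `_shards` block into per-state counts."""
    total = shards["total"]
    successful = shards["successful"]
    failed = shards["failed"]
    # Copies that were neither written nor reported as failed, for example
    # a replica that is still UNASSIGNED on a single-node setup.
    not_attempted = total - successful - failed
    return {
        "successful": successful,
        "failed": failed,
        "not_attempted": not_attempted,
    }


# The `_shards` block from the response to POST twitter/_doc/1.
summary = summarize_shards({"total": 2, "successful": 1, "failed": 0})
print(summary)  # {'successful': 1, 'failed': 0, 'not_attempted': 1}
```

The one copy that was not attempted corresponds to the UNASSIGNED replica in the _cat/shards output.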
If no document ID is specified when creating a document, Elasticsearch generates one automatically:
POST twitter/_doc
{
"username":"test",
"age":20
}
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "EIcep3IBFDGGZnEIs9aT",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 4,
"_primary_term" : 1
}
The response includes the generated _id. Let's fetch the document with it:
GET twitter/_doc/EIcep3IBFDGGZnEIs9aT
The returned content is exactly the data we wrote:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "EIcep3IBFDGGZnEIs9aT",
"_version" : 1,
"_seq_no" : 4,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "test",
"age" : 20
}
}
We can also create a document through the _create endpoint:
POST twitter/_create/2
{
"username":"GB",
"uid":1,
"city":"Guangzhou",
"province":"Guangdong",
"country":"China"
}
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 6,
"_primary_term" : 1
}
The data was written successfully. If we run the same command again:
POST twitter/_create/2
{
"username":"GB",
"uid":1,
"city":"Guangzhou",
"province":"Guangdong",
"country":"China"
}
The response is:
{
"error" : {
"root_cause" : [
{
"type" : "version_conflict_engine_exception",
"reason" : "[2]: version conflict, document already exists (current version [1])",
"index_uuid" : "AiRNBupCRWWLFKqFSVW3mw",
"shard" : "0",
"index" : "twitter"
}
],
"type" : "version_conflict_engine_exception",
"reason" : "[2]: version conflict, document already exists (current version [1])",
"index_uuid" : "AiRNBupCRWWLFKqFSVW3mw",
"shard" : "0",
"index" : "twitter"
},
"status" : 409
}
The request fails with an error saying the document already exists. What happens if we run the following command instead?
POST twitter/_doc/1
{
"username":"Dannis",
"uid":1
}
The response is:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 7,
"_primary_term" : 1
}
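The request succeeds, but result is now updated and the version number has become 2. In other words, POST twitter/_create/2 succeeds only when no document with that ID exists, whereas POST twitter/_doc/1 creates the document if it does not exist and otherwise overwrites the existing document (reported as updated) and increments its version by 1.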
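The difference between the two endpoints can be sketched with a toy in-memory model. This is not an Elasticsearch client and not how Elasticsearch is implemented; the ToyIndex class and its methods are hypothetical names that only mimic the result, _version, and 409 behaviour observed in the responses above.

```python
class ToyIndex:
    """In-memory stand-in for one index; NOT an Elasticsearch client."""

    def __init__(self):
        self.docs = {}  # document ID -> (version, source)

    def index(self, doc_id, source):
        """Mimics POST <index>/_doc/<id>: create, or overwrite and bump the version."""
        if doc_id in self.docs:
            version = self.docs[doc_id][0] + 1
            result = "updated"
        else:
            version, result = 1, "created"
        self.docs[doc_id] = (version, source)
        return {"_id": doc_id, "_version": version, "result": result}

    def create(self, doc_id, source):
        """Mimics POST <index>/_create/<id>: refuse if the ID is already taken."""
        if doc_id in self.docs:
            return {"status": 409, "error": "version_conflict_engine_exception"}
        return self.index(doc_id, source)


twitter = ToyIndex()
first = twitter.index("1", {"username": "Dannis", "uid": 1})
second = twitter.index("1", {"username": "Dannis", "uid": 1})
created = twitter.create("2", {"username": "GB", "uid": 1})
conflict = twitter.create("2", {"username": "GB", "uid": 1})

print(first["result"], second["result"], second["_version"])  # created updated 2
print(created["result"], conflict["status"])                  # created 409
```

The model deliberately ignores shards, sequence numbers, and deletions; it only captures the create-or-overwrite versus create-only distinction.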
Retrieving a document
Enter and run:
GET twitter/_doc/1
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "Dannis",
"uid" : 1
}
}
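The document data we defined is returned under the _source field, and found indicates that the document was found. If we request an ID that does not exist (in the run shown here, no document with ID 2 existed):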
GET twitter/_doc/2
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "2",
"found" : false
}
This time found is false.
Updating a document
Enter and run:
POST twitter/_update/1
{
"doc": {
"username":"Dannis-update"
}
}
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
result is updated and _version has become 2. Let's check the document:
GET twitter/_doc/1
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"_seq_no" : 1,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "Dannis-update",
"uid" : 1
}
}
As expected, username now has the new value.
What happens if the update adds a field that does not exist yet? Enter and run:
POST twitter/_update/1
{
"doc": {
"age":25
}
}
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1
}
The document's version number is now 3.
Let's check again:
GET twitter/_doc/1
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"_seq_no" : 2,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "Dannis-update",
"uid" : 1,
"age" : 25
}
}
The returned _source now also contains the newly added field. So a partial update overwrites a field if it already exists and adds it if it does not.
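The merge behaviour of the two partial updates can be reproduced with a small Python sketch. The helper name apply_partial_update is hypothetical; it models the documented rule that the doc body is merged into the existing _source (nested objects merged recursively, other values replaced) and says nothing about how Elasticsearch performs the merge internally.

```python
def apply_partial_update(source: dict, doc: dict) -> dict:
    """Return a new _source with the fields of `doc` merged into `source`."""
    merged = dict(source)
    for field, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(field), dict):
            # Nested objects are merged field by field.
            merged[field] = apply_partial_update(merged[field], value)
        else:
            # Scalars and arrays simply replace the old value (or are added).
            merged[field] = value
    return merged


source = {"username": "Dannis", "uid": 1}
source = apply_partial_update(source, {"username": "Dannis-update"})
source = apply_partial_update(source, {"age": 25})
print(source)  # {'username': 'Dannis-update', 'uid': 1, 'age': 25}
```

The final dictionary matches the _source returned by GET twitter/_doc/1 after the two updates.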
Deleting a document
Enter and run:
DELETE twitter/_doc/1
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1
}
_version has become 4 and result is deleted. Every write to a document, whether it creates, updates, or deletes it, increments the version number by 1.
Let's check again:
GET twitter/_doc/1
The response:
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"found" : false
}
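found is false: the document with ID 1 can no longer be found, so the deletion succeeded.
Next, let's look at batch operations on documents.
Bulk-creating documents
To create several documents in one request, enter and run: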
POST _bulk
{"index":{"_index":"twitter","_id":3}}
{"username":"張三","uid":3,"age":30}
{"index":{"_index":"twitter","_id":4}}
{"username":"李四","uid":4,"age":25}
{"index":{"_index":"twitter","_id":5}}
{"username":"王五","uid":5,"age":18}
The response:
{
"took" : 44,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "3",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 9,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "4",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 10,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "5",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 11,
"_primary_term" : 1,
"status" : 201
}
}
]
}
No errors. What if we want to write to different indices in the same request? Run the following:
POST _bulk
{"index":{"_index":"twitter","_id":6}}
{"username":"Jack","uid":6,"age":22}
{"index":{"_index":"twitter_v1","_id":1}}
{"username":"test","age":20}
This writes one document each to twitter and twitter_v1. Both GET twitter/_doc/6 and GET twitter_v1/_doc/1 return the documents as expected.
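Outside Dev Tools, the _bulk body has to be assembled by hand as newline-delimited JSON: one action line, optionally followed by one payload line, with a newline after every line including the last. The sketch below builds such a body using only Python's standard json module. build_bulk_body and its (action, metadata, payload) tuple layout are assumptions made for this illustration, not part of any client library.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, payload) triples into a _bulk request body.

    `payload` is the document for an "index" action and None for actions
    that carry no second line.
    """
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if payload is not None:
            lines.append(json.dumps(payload, ensure_ascii=False))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_index": "twitter", "_id": 6}, {"username": "Jack", "uid": 6, "age": 22}),
    ("index", {"_index": "twitter_v1", "_id": 1}, {"username": "test", "age": 20}),
])
print(body, end="")
```

The resulting string is what an HTTP client would send as the request body of POST _bulk, with the Content-Type header set to application/x-ndjson.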
Bulk-retrieving documents
To fetch several documents at once, run:
GET _mget
{
"docs":[
{
"_index":"twitter",
"_id":1
},
{
"_index":"twitter",
"_id":2
}
]
}
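The response (this output comes from a run in which documents 1 and 2 held the data shown, not the values from the earlier sections):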
{
"docs" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"_seq_no" : 8,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "GB",
"uid" : 1,
"city" : "Guangzhou2",
"province" : "Guangdong",
"country" : "China"
}
},
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "2",
"_version" : 1,
"_seq_no" : 6,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "GB",
"uid" : 1,
"city" : "Guangzhou",
"province" : "Guangdong",
"country" : "China"
}
}
]
}
That fetched documents from a single index. Documents from several indices can be fetched in one request too, for example:
GET _mget
{
"docs":[
{
"_index":"twitter",
"_id":1
},
{
"_index":"twitter_v1",
"_id":1
}
]
}
The result is:
{
"docs" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"_seq_no" : 8,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "GB",
"uid" : 1,
"city" : "Guangzhou2",
"province" : "Guangdong",
"country" : "China"
}
},
{
"_index" : "twitter_v1",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "test",
"age" : 20
}
}
]
}
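When _mget is called from code, it helps to build the docs array from a list of targets and to separate hits from misses in the response. The following Python sketch does both on plain dictionaries. The function names are hypothetical, and the sample response is a trimmed, illustrative structure in the shape shown above rather than captured output.

```python
def build_mget_body(targets):
    """Build the _mget request body from (index, id) pairs."""
    return {"docs": [{"_index": index, "_id": doc_id} for index, doc_id in targets]}


def split_mget_response(response):
    """Separate found sources from missing documents, keyed by (index, id)."""
    found, missing = {}, []
    for doc in response["docs"]:
        key = (doc["_index"], doc["_id"])
        if doc.get("found"):
            found[key] = doc["_source"]
        else:
            missing.append(key)
    return found, missing


request_body = build_mget_body([("twitter", 1), ("twitter_v1", 1)])

# Trimmed-down response in the shape shown above (values are illustrative).
sample_response = {
    "docs": [
        {"_index": "twitter_v1", "_id": "1", "found": True,
         "_source": {"username": "test", "age": 20}},
        {"_index": "twitter", "_id": "6", "found": False},
    ]
}
found, missing = split_mget_response(sample_response)
print(found)    # {('twitter_v1', '1'): {'username': 'test', 'age': 20}}
print(missing)  # [('twitter', '6')]
```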
Bulk-updating documents
To update documents in bulk, run:
POST _bulk
{"update":{"_index":"twitter","_id":1}}
{"doc":{"username":"GB-update"}}
{"update":{"_index":"twitter_v1","_id":1}}
{"doc":{"age":25}}
The response:
{
"took" : 107,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 13,
"_primary_term" : 1,
"status" : 200
}
},
{
"update" : {
"_index" : "twitter_v1",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 200
}
}
]
}
Let's query again with the following command:
GET _mget
{
"docs":[
{
"_index":"twitter",
"_id":1
},
{
"_index":"twitter_v1",
"_id":1
}
]
}
The response is shown below. The data has changed to what we wanted.
{
"docs" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"_seq_no" : 13,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "GB-update",
"uid" : 1,
"city" : "Guangzhou2",
"province" : "Guangdong",
"country" : "China"
}
},
{
"_index" : "twitter_v1",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"_seq_no" : 1,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "test",
"age" : 25
}
}
]
}
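Bulk-deleting documents
The command for bulk deletion is similar to the one for bulk updates, except that a delete action has no payload line: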
POST _bulk
{"delete":{"_index":"twitter","_id":1}}
{"delete":{"_index":"twitter_v1","_id":1}}
The response:
{
"took" : 80,
"errors" : false,
"items" : [
{
"delete" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 5,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 14,
"_primary_term" : 1,
"status" : 200
}
},
{
"delete" : {
"_index" : "twitter_v1",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
}
]
}
When we query the two documents again with GET _mget, they can no longer be found:
{
"docs" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"found" : false
},
{
"_index" : "twitter_v1",
"_type" : "_doc",
"_id" : "1",
"found" : false
}
]
}
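Creating, updating, and deleting in one request
The examples above show that all bulk operations go through the _bulk endpoint and differ only in the action lines passed in. So can creation, update, and deletion be combined in a single request? Let's try: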
POST _bulk
{"index":{"_index":"twitter","_id":10}}
{"username":"小飛飛","age":12}
{"update":{"_index":"twitter","_id":3}}
{"doc":{"username":"張三-update"}}
{"delete":{"_index":"twitter","_id":6}}
The response:
{
"took" : 51,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "10",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 15,
"_primary_term" : 1,
"status" : 201
}
},
{
"update" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "3",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 16,
"_primary_term" : 1,
"status" : 200
}
},
{
"delete" : {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "6",
"_version" : 2,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 17,
"_primary_term" : 1,
"status" : 200
}
}
]
}
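A _bulk request reports success or failure per item: the top-level errors flag says whether any item failed, and each entry in items carries its own status. A minimal Python sketch for pulling out the failed items could look like this. failed_bulk_items is a hypothetical helper, and the sample response is invented for illustration (its second item imitates the 409 conflict seen earlier with _create).

```python
def failed_bulk_items(response):
    """Return (action, _id, status) for every bulk item that did not succeed."""
    if not response.get("errors"):
        return []  # the top-level flag says every item succeeded
    failures = []
    for item in response["items"]:
        # Each item is a one-key dict: {"index" | "create" | "update" | "delete": {...}}
        action, detail = next(iter(item.items()))
        if detail["status"] >= 400:
            failures.append((action, detail["_id"], detail["status"]))
    return failures


# Hypothetical response: the first item succeeded, the second hit a conflict.
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "10", "status": 201, "result": "created"}},
        {"create": {"_id": "2", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}
print(failed_bulk_items(sample))  # [('create', '2', 409)]
```

For the response shown above, errors is false, so the helper would return an empty list.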
Let's query them in bulk again:
GET _mget
{
"docs":[
{
"_index":"twitter",
"_id":10
},
{
"_index":"twitter",
"_id":3
},
{
"_index":"twitter",
"_id":6
}
]
}
The response:
{
"docs" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "10",
"_version" : 1,
"_seq_no" : 15,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "小飛飛",
"age" : 12
}
},
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "3",
"_version" : 2,
"_seq_no" : 16,
"_primary_term" : 1,
"found" : true,
"_source" : {
"username" : "張三-update",
"uid" : 3,
"age" : 30
}
},
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "6",
"found" : false
}
]
}
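All the data matches expectations.
One point worth noting: in the action line {"index":{"_index":"twitter","_id":10}} used to create documents through _bulk, the outer index is a verb. It means "index this document" (create it, or overwrite it if the ID already exists), not "create an index", whereas _index names the target index.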