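Syntax for creating parent-child documents
First, let's look at how to create parent-child documents. The syntax is clearly different from the "_parent" approach found in many online articles, which shows that later versions of ES changed the syntax.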
    PUT my_index
    {
      "mappings": {
        "properties": {
          "my_join_field": {
            "type": "join",
            "relations": {
              "question": "answer"
            }
          }
        }
      }
    }
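This creates an index named my_index in which my_join_field is the field used for the join. Its type is join, and its relations setting declares question as the parent and answer as the child.
To give one parent several child types, use an array instead: "question": ["answer", "comment"]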
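The mapping body above can also be assembled programmatically. The following is a minimal Python sketch that only builds the JSON body for `PUT my_index`; the helper name `build_join_mapping` is hypothetical and not part of any Elasticsearch client, and sending the request is left out.

```python
import json


def build_join_mapping(field_name, relations):
    """Return the body for `PUT <index>` that declares a join field.

    `relations` maps each parent name to one child name or to a list
    of child names. (Hypothetical helper, for illustration only.)
    """
    return {
        "mappings": {
            "properties": {
                field_name: {"type": "join", "relations": relations}
            }
        }
    }


# One parent (question) with two child types (answer, comment).
mapping = build_join_mapping("my_join_field", {"question": ["answer", "comment"]})
print(json.dumps(mapping, indent=2))
```

Passing a plain string instead of a list, as in `{"question": "answer"}`, yields the single-child mapping shown earlier.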
Inserting data
A parent document is indexed with the following syntax
    PUT my_index/_doc/1?refresh
    {
      "text": "This is a question",
      "my_join_field": {
        "name": "question"
      }
    }
For a parent document, the name key can also be omitted and the relation name given directly
    PUT my_index/_doc/1?refresh
    {
      "text": "This is a question",
      "my_join_field": "question"
    }
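Inserting a child document
A child document is indexed with the syntax below. Note that routing is set to the id of the parent document; when no routing is given, a document is routed by its own id.
Here name is answer, which marks this as a child document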
    PUT /my_index/_doc/3?routing=1
    {
      "text": "This is an answer",
      "my_join_field": {
        "name": "answer",
        "parent": "1"
      }
    }
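To make the link between routing and the parent id explicit, here is a small Python sketch that builds the request for a direct child of a top-level parent. The function `child_index_request` and its parameters are hypothetical; it only constructs the method, path, and body, and does not talk to a cluster.

```python
from urllib.parse import quote


def child_index_request(index, doc_id, parent_id, relation, source,
                        join_field="my_join_field"):
    """Build (method, path, body) for indexing a child document.

    Assumes the parent is a top-level document, so the routing value
    is simply the parent's id. (Hypothetical helper.)
    """
    body = dict(source)  # copy so the caller's dict is not modified
    body[join_field] = {"name": relation, "parent": str(parent_id)}
    path = f"/{index}/_doc/{quote(str(doc_id))}?routing={quote(str(parent_id))}"
    return "PUT", path, body


method, path, body = child_index_request(
    "my_index", 3, 1, "answer", {"text": "This is an answer"}
)
print(method, path)
print(body)
```

Deriving both the routing parameter and the parent value from the same argument keeps the two from drifting apart, which would otherwise place the child on a different shard than its parent.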
Querying child documents by parent_id
Pass the child relation name and the parent document's id to the parent_id query
    GET my_index/_search
    {
      "query": {
        "parent_id": {
          "type": "answer",
          "id": "1"
        }
      }
    }
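Performance and restrictions of parent-child documents
Parent-child documents are mainly suited to one-to-many relationships between entities; otherwise the usual advice in ES is to denormalize the data into documents.
Parent-child documents have the following restrictions: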
- Only one join field mapping is allowed per index.
- Parent and child documents must be indexed on the same shard. This means that the same routing value needs to be provided when getting, deleting, or updating a child document (by default a document is routed by its own id).
- An element can have multiple children but only one parent.
- It is possible to add a new relation to an existing join field.
- It is also possible to add a child to an existing element but only if the element is already a parent.
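The "multiple children but only one parent" rule can be checked on a relations definition before it is sent to the cluster. The sketch below is a client-side illustration only; `parent_of` is a hypothetical helper, and Elasticsearch performs its own validation of the mapping.

```python
def parent_of(relations):
    """Invert a `relations` dict into a child -> parent dict.

    Raises ValueError if any child name is listed under two parents,
    which the join field does not allow. (Hypothetical helper.)
    """
    parents = {}
    for parent, children in relations.items():
        if isinstance(children, str):
            children = [children]  # a single child may be given as a string
        for child in children:
            if child in parents:
                raise ValueError(
                    f"{child!r} has two parents: {parents[child]!r} and {parent!r}"
                )
            parents[child] = parent
    return parents


relations = {"question": ["answer", "comment"], "answer": "vote"}
print(parent_of(relations))
```

A definition such as `{"a": "c", "b": "c"}` would raise, since `c` would have two parents.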
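Summary
ES implements joins through parent-child documents, but an index can have only one join field; every relation, including one parent with several children, has to be declared in that single field.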
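The relation field
ES automatically creates an additional field that represents the relation, named <join field>#<parent name>, for example my_join_field#question.
It can be inspected like this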
    POST my_index/_search
    {
      "script_fields": {
        "parent": {
          "script": {
            "source": "doc['my_join_field#question']"
          }
        }
      }
    }
Part of the response:
    {
      "_index" : "my_index",
      "_type" : "_doc",
      "_id" : "8",
      "_score" : 1.0,
      "fields" : {
        "parent" : [
          "8"
        ]
      }
    },
    {
      "_index" : "my_index",
      "_type" : "_doc",
      "_id" : "4",
      "_score" : 1.0,
      "_routing" : "10",
      "fields" : {
        "parent" : [
          "10"
        ]
      }
    }
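A hit that has a _routing field is a child document, and its parent value is the id of its parent document. A hit without _routing is a parent document, and its parent value is its own id.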
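Global ordinals
Join queries on parent-child documents use a technique called global ordinals to speed up the join. By default they are built eagerly, which avoids a long delay on the first query or aggregation. However, they have to be rebuilt whenever a shard changes, and the more parent ids a shard stores, the longer the rebuild takes. The rebuild happens as part of refresh, so it can add significant time to each refresh.
If the join field is rarely used, eager loading can be turned off: "eager_global_ordinals": false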
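The setting goes on the join field itself, next to type and relations. As a minimal sketch, the mapping body could be written as a Python dict and serialized to JSON; the index and field names are the ones used in this article.

```python
import json

# Mapping body for `PUT my_index` with eager loading of global ordinals
# disabled on the join field.
mapping = {
    "mappings": {
        "properties": {
            "my_join_field": {
                "type": "join",
                "relations": {"question": "answer"},
                "eager_global_ordinals": False,
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

With eager loading off, global ordinals are built the first time a query or aggregation needs them, so the cost moves from refresh to that first search.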
Monitoring global ordinals
    # Per index ('#' is URL-encoded as %23 so that curl does not treat it as a fragment)
    curl -X GET "localhost:9200/_stats/fielddata?human&fields=my_join_field%23question&pretty"
    # Per index on each node
    curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?human&fields=my_join_field%23question&pretty"
Grandparent-grandchild structure with multiple children
Consider the following structure
       question
        /    \
       /      \
    comment  answer
               |
               |
              vote
Creating the index
    PUT my_index
    {
      "mappings": {
        "properties": {
          "my_join_field": {
            "type": "join",
            "relations": {
              "question": ["answer", "comment"],
              "answer": "vote"
            }
          }
        }
      }
    }
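Inserting a grandchild
Note that routing and parent have different values here: routing is the id of the grandparent document (the question, id 1), while parent is literally the direct parent (the answer, id 2)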
    PUT my_index/_doc/3?routing=1&refresh
    {
      "text": "This is a vote",
      "my_join_field": {
        "name": "vote",
        "parent": "2"
      }
    }
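Because every document in a lineage must be routed with the id of its top-level ancestor, the routing value for a grandchild can be found by walking up the parent chain. The Python sketch below assumes the application keeps its own `parent_lookup` dict from document id to parent id; both that dict and the function `routing_for` are hypothetical and not something Elasticsearch provides.

```python
def routing_for(parent_id, parent_lookup):
    """Return the id of the top-level ancestor, to be used as routing.

    `parent_lookup` maps a document id to its parent's id; top-level
    documents do not appear as keys. (Hypothetical helper.)
    """
    current = parent_id
    seen = set()
    while current in parent_lookup:
        if current in seen:
            raise ValueError("cycle in parent chain")
        seen.add(current)
        current = parent_lookup[current]
    return current


# Document 2 (an answer) is a child of document 1 (a question).
parent_lookup = {"2": "1"}

# A vote whose parent is document 2 must be routed with the question's id.
print(routing_for("2", parent_lookup))  # prints 1
```

For a direct child of a top-level document the function returns the parent id unchanged, which matches the simpler case shown earlier.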
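The has_child query
This query returns parent documents that have child documents matching a given query. It is an expensive query, so use it sparingly. Its standard form is shown below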
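    GET my_index/_search
    {
      "query": {
        "has_child": {
          "type": "child",
          "query": {
            "match_all": {}
          },
          "max_children": 10,
          "min_children": 2,
          "score_mode": "min"
        }
      }
    }
Here type is the name of the child relation ("child" is a placeholder; in the index above it would be answer). max_children is optional and is the maximum number of matching child documents a parent may have and still be returned; min_children is optional and is the minimum number of matching child documents a parent must have to be returned.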
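Since min_children and max_children are optional, a small builder can leave them out of the body when they are not set. This is a Python sketch with a hypothetical function name, `has_child_query`; it only produces the search body and assumes "none" as the default score_mode.

```python
def has_child_query(child_type, query=None, min_children=None,
                    max_children=None, score_mode="none"):
    """Build a search body containing a has_child query.

    Optional limits are included only when given. (Hypothetical helper.)
    """
    if score_mode not in {"none", "avg", "max", "min", "sum"}:
        raise ValueError(f"unsupported score_mode: {score_mode!r}")
    clause = {
        "type": child_type,
        "query": query if query is not None else {"match_all": {}},
        "score_mode": score_mode,
    }
    if min_children is not None:
        clause["min_children"] = min_children
    if max_children is not None:
        clause["max_children"] = max_children
    return {"query": {"has_child": clause}}


# Questions that have between 2 and 10 answers.
body = has_child_query("answer", min_children=2, max_children=10, score_mode="min")
print(body)
```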
Test code
Part of the test code is shown below
    DELETE my_index

    PUT /my_index?pretty
    {
      "mappings": {
        "properties": {
          "my_join_field": {
            "type": "join",
            "relations": {
              "question": "answer"
            }
          }
        }
      }
    }

    # Insert parents
    PUT /my_index/_doc/8?refresh&pretty
    {
      "text": "This is a question",
      "my_join_field": {
        "name": "question"
      }
    }

    PUT /my_index/_doc/10?refresh&pretty
    {
      "text": "This is a new question",
      "my_join_field": {
        "name": "question"
      }
    }

    PUT /my_index/_doc/12?refresh&pretty
    {
      "text": "This is a new question",
      "my_join_field": {
        "name": "question"
      }
    }

    # Insert children
    PUT /my_index/_doc/3?routing=8&refresh&pretty
    {
      "text": "This is an answer",
      "my_join_field": {
        "name": "answer",
        "parent": "8"
      }
    }

    PUT /my_index/_doc/4?routing=10&refresh&pretty
    {
      "text": "This is another answer",
      "my_join_field": {
        "name": "answer",
        "parent": "10"
      }
    }

    # Query child documents by parent_id
    GET my_index/_search
    {
      "query": {
        "parent_id": {
          "type": "answer",
          "id": "8"
        }
      }
    }

    # Query the relation field
    POST my_index/_search
    {
      "script_fields": {
        "parent": {
          "script": {
            "source": "doc['my_join_field#question']"
          }
        }
      }
    }