Elasticsearch 6 Beginner Tutorial: Query Syntax (Queries in Detail)

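Articles in this series

Elasticsearch 6 Beginner Tutorial: Elasticsearch Overview

Elasticsearch 6 Beginner Tutorial: Installing Elasticsearch

Elasticsearch 6 Beginner Tutorial: Inverted Index and Analyzers

Elasticsearch 6 Beginner Tutorial: CRUD with the Elasticsearch API

Elasticsearch 6 Beginner Tutorial: What Is a Mapping

Elasticsearch 6 Beginner Tutorial: Query Syntax (Queries in Detail)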

Preparing the data

PUT /lib3
{
    "settings":{ 
        "number_of_shards" : 3, 
        "number_of_replicas" : 0
    }, 
    "mappings":{ 
        "user":{ 
            "properties":{ 
                "name": {"type":"text"}, 
                "address": {"type":"text"}, 
                "age": {"type":"integer"}, 
                "interests": {"type":"text"}, 
                "birthday": {"type":"date"} 
            } 
        } 
    } 
}

GET /lib3/user/_search?q=name:lisi

GET /lib3/user/_search?q=name:zhaoliu&sort=age:desc

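Queries

Date and numeric fields are matched exactly, because their values are not analyzed; the same applies to keyword strings. text strings are analyzed into terms, so they support full-text (partial) matching.

Search all documents in the cluster:
GET /_search

Search all documents in the lib index:
GET /lib/_search

Search all documents in the lib and lib3 indices:
GET /lib,lib3/_search

Search all documents in indices whose names end in 3 or 4 (* is a wildcard):
GET /*3,*4/_search

Search all documents of type user in lib:
GET /lib/user/_search

Search all documents of types user and items in the lib and lib4 indices:
GET /lib,lib4/user,items/_search

Search all documents in every index of the cluster:
GET /_all/_search

Search all documents of types user and items in every index of the cluster:
GET /_all/user,items/_search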

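Reading the search response

took: time the search took, in milliseconds

timed_out: whether the search timed out

_shards: how many shards the search was sent to, and how many of them succeeded or failed

total: total number of matching documents

hits: the matching documents; when no size is given, only the first 10 are returned

max_score: the highest relevance score in this search. The better a document matches the query, the higher its _score and the earlier it appears in the results.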

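term and terms queries

A term query looks up the exact term in the inverted index; it does not run the search value through an analyzer. It is suited to keyword, numeric, and date fields. On an analyzed text field, the value has to equal one of the indexed tokens, which are lowercase with the default analyzer.

term: find documents whose field contains the given term
GET /lib3/user/_search
{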
    "query": { 
        "term": {"interests": "changge"} 
    } 
}
terms: find documents whose field contains any of several terms
GET /lib3/user/_search
{
    "query":{ 
        "terms":{ 
            "interests": ["hejiu","changge"] 
        } 
    } 
}

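from and size: controlling how many results come back

from: offset of the first document to return, counted from 0

size: number of documents to return

This is similar to LIMIT 0,10 in MySQL.
GET /lib3/user/_search
{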
    "from":0, 
    "size":10, 
    "query":{ 
        "terms":{ 
            "interests": ["hejiu","changge"] 
        } 
    } 
}
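
Applications usually think in page numbers rather than offsets. The following Python sketch shows one way to turn a 1-based page number into from and size; the helper name page_to_search_body is made up for this illustration and is not part of any Elasticsearch client.

```python
def page_to_search_body(page, page_size, query):
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
        "query": query,
    }


query = {"terms": {"interests": ["hejiu", "changge"]}}
first_page = page_to_search_body(1, 10, query)
third_page = page_to_search_body(3, 10, query)
print(first_page["from"], first_page["size"])
print(third_page["from"], third_page["size"])
```

The first page skips nothing, and each later page skips page_size more hits than the one before it.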

version: return each document's version number

GET /lib3/user/_search
{
    "version":true, 
    "query":{ 
        "terms":{ 
            "interests": ["hejiu","changge"] 
        } 
    } 
}

match query

A match query is analyzer-aware: it runs the search text through the field's analyzer and then searches for the resulting terms.
GET /lib3/user/_search
{
    "query":{ 
        "match":{ "name": "zhaoliu" }
     } 
}
GET /lib3/user/_search
{
    "query":{ 
        "match":{ "age": 20 } 
    } 
}
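
The difference between term and match on a text field comes down to whether the search value is analyzed. The Python sketch below imitates this with a very rough stand-in for the default analyzer (lowercase, split on anything that is not an ASCII letter or digit). The function names and the stored name "Zhao Liu" are assumptions made for the illustration; the real analyzer is more elaborate.

```python
import re


def simple_analyze(text):
    """Rough stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def term_matches(indexed_tokens, value):
    # term: the value is compared with the indexed tokens exactly as given
    return value in indexed_tokens


def match_matches(indexed_tokens, text):
    # match: the search text is analyzed first; any resulting token may match
    return any(tok in indexed_tokens for tok in simple_analyze(text))


indexed = simple_analyze("Zhao Liu")  # hypothetical stored value of a text field
print(indexed)
print(term_matches(indexed, "Zhao"))   # the capitalized value is not an indexed token
print(term_matches(indexed, "zhao"))
print(match_matches(indexed, "Zhao"))  # analyzed to "zhao" before the lookup
```

This is why a term query on a text field only finds something when the value is already written the way the analyzer stored it.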

match_all: match every document

GET /lib3/user/_search
{
    "query": { 
        "match_all": {} 
    } 
}

multi_match: search several fields at once

The query text is searched in both the interests and name fields.

GET /lib3/user/_search
{
    "query":{
        "multi_match": {
            "query": "lvyou",
            "fields": ["interests","name"]
        } 
    } 
}

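match_phrase: phrase query

Elasticsearch first analyzes the query string and builds a phrase query from the analyzed text. Every term of the phrase has to match, and the terms have to keep their relative positions:
GET /lib3/user/_search
{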
    "query":{
        "match_phrase":{
            "interests": "duanlian，shuoxiangsheng" 
        } 
    } 
}

_source: choosing which fields are returned

GET /lib3/user/_search
{
    "_source": ["address","name"], 
    "query": { 
        "match": { "interests": "changge" } 
    } 
}

Controlling which fields are loaded

includes: fields to include
excludes: fields to exclude

GET /lib3/user/_search
{
    "query": { 
        "match_all": {} 
    },
    "_source": {
        "includes": ["name","address"],
        "excludes": ["age","birthday"]
    }
}
Wildcards are supported in field names:
GET /lib3/user/_search
{
    "_source": { 
        "includes": "addr*", 
        "excludes": ["name","bir*"]
    },
    "query": {
        "match_all": {}
    }
}

sort: sorting

Use sort to order the results: desc for descending, asc for ascending.

GET /lib3/user/_search
{
    "query": { 
        "match_all": {} 
    }, 
    "sort": [ 
        { "age": { "order":"asc" } } 
    ]
}

GET /lib3/user/_search
{
    "query": { 
        "match_all": {} 
    }, 
    "sort": [ 
        { "age": { "order":"desc" }} 
    ]
}

match_phrase_prefix: prefix matching

GET /lib3/user/_search
{
    "query": { 
        "match_phrase_prefix": { 
            "name": { "query": "zhao" } 
        } 
    } 
}

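range: range query

A range query matches values that lie between two bounds.

Parameters: from, to, include_lower, include_upper, boost

from: lower bound of the range

include_lower: whether the lower bound itself is included; defaults to true

to: upper bound of the range

include_upper: whether the upper bound itself is included; defaults to true

boost: weight of the query

The same bounds can also be written with gt, gte, lt, and lte, which are covered under range filtering later in this article.
GET /lib3/user/_search
{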
    "query": { 
        "range": { 
            "birthday": { 
                "from": "1990-10-10", 
                "to": "2018-05-01" 
            } 
        } 
    } 
}
GET /lib3/user/_search
{
    "query": { 
        "range": { 
            "age": { 
                "from": 20, 
                "to": 25, 
                "include_lower": true, 
                "include_upper": false 
            } 
        } 
    } 
}
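
The effect of include_lower and include_upper is easy to check outside Elasticsearch. The Python sketch below mirrors the second request (20 to 25, upper bound excluded) on a made-up list of ages. The parameters are named lower and upper because from is a reserved word in Python; the function itself is only an illustration.

```python
def in_range(value, lower=None, upper=None, include_lower=True, include_upper=True):
    """Return True when value lies inside the range, honoring the inclusion flags."""
    if lower is not None:
        if value < lower or (value == lower and not include_lower):
            return False
    if upper is not None:
        if value > upper or (value == upper and not include_upper):
            return False
    return True


ages = [19, 20, 22, 25, 30]  # hypothetical ages
matched_ages = [a for a in ages if in_range(a, 20, 25, include_upper=False)]
print(matched_ages)
```

With include_upper set to false the age 25 drops out, while 20 stays because include_lower is true.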

wildcard query

The wildcards * and ? can be used in the search term.

* stands for zero or more characters

? stands for exactly one character
GET /lib3/user/_search
{
    "query": { 
        "wildcard": { "name": "zhao*" } 
    } 
}
GET /lib3/user/_search
{
    "query": { 
        "wildcard": { "name": "li?i" } 
    } 
}
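
Python's fnmatch module uses the same meaning for * and ?, so it can serve as a quick way to try out a pattern before sending it. This is only an approximation: fnmatch also understands [seq] character sets, and Elasticsearch applies the pattern to the individual terms in the index rather than to plain strings. The names in the list are made up.

```python
from fnmatch import fnmatchcase

names = ["zhaoliu", "zhaoming", "lisi", "lii", "wangwu"]  # hypothetical indexed terms

# * matches zero or more characters
star_matches = [n for n in names if fnmatchcase(n, "zhao*")]

# ? matches exactly one character, so "lii" (three letters) does not fit "li?i"
question_matches = [n for n in names if fnmatchcase(n, "li?i")]

print(star_matches)
print(question_matches)
```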

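fuzzy: fuzzy query (somewhat slower than exact queries)

value: the term to search for

boost: weight of the query; defaults to 1.0

fuzziness: maximum edit distance allowed between the search term and a matching term; defaults to AUTO. In Elasticsearch 6 this takes the place of the older min_similarity parameter.

prefix_length: number of leading characters that have to match exactly; defaults to 0

max_expansions: maximum number of terms the query may expand to; defaults to 50
GET /lib3/user/_search
{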
    "query": { 
        "fuzzy": { 
            "interests": "chagge" 
        } 
    } 
}
GET /lib3/user/_search
{
    "query": { 
        "fuzzy": { 
            "interests": { "value": "chagge" } 
        } 
    } 
}
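
Fuzzy matching is based on edit distance: the number of single-character changes needed to turn one term into another. The Python sketch below computes the plain Levenshtein distance to show why the misspelled chagge still finds changge. It is an illustration only; Elasticsearch by default also counts swapping two adjacent characters as a single edit, which this simple version counts as two.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep when equal
            ))
        prev = curr
    return prev[-1]


distance = edit_distance("chagge", "changge")
print(distance)
```

One inserted letter separates the two words, which is within the edit distance a fuzzy query allows for a term of this length.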

highlight: highlighting the matches in the results

GET /lib3/user/_search
{
    "query":{ 
        "match":{ "interests": "changge" } 
    }, 
    "highlight": { 
        "fields": { "interests": {} } 
    } 
}

Filter queries

A filter does not compute relevance scores, and its results can be cached, so a filter is faster than a scoring query.

Simple filter queries

GET /lib4/items/_search
{
    "query":{ 
        "bool":{ 
            "filter":[
                {"term":{"price": 40}}
            ]
        }
    }
}
Match a price of 25 or 40 (a set of values, not the range 25 to 40):
GET /lib4/items/_search
{
    "query":{
        "bool":{
            "filter":[
                {"terms":{"price": [25,40]}}
            ]
        }
    }
}
GET /lib4/items/_search
{
    "query":{
        "bool":{
            "filter":[
                {"term":{"itemID": "ID100123"}}
            ]
        }
    }
}
The same three filters written as post_filter:
GET /lib4/items/_search
{ "post_filter": { "term": { "price": 40 } } }
GET /lib4/items/_search
{ "post_filter": { "terms": { "price": [25,40] } } }
GET /lib4/items/_search
{ "post_filter": { "term": { "itemID": "ID100123" } } }

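By default the value ID100123 is mapped as a text field, and text fields are analyzed. The default analyzer lowercases it to id100123, so a term filter on "ID100123" finds nothing.

Check how the field was mapped:

GET /lib4/_mapping

If the item ID should not be analyzed, recreate the index with itemID mapped as keyword:

DELETE lib4

PUT /lib4
{ "mappings": { "items": { "properties": { "itemID": { "type": "keyword" } } } } }

With a keyword mapping, term values have to match the stored value exactly, including case.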

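bool: combining filters

bool combines several filter conditions.

Format:
{ "bool": { "must": [], "should": [], "must_not": [] } }

must: conditions that have to match (AND)
should: conditions that may or may not match; when there is no must clause, at least one of them has to match (OR)
must_not: conditions that must not match (NOT)

GET /lib4/items/_search
{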
    "post_filter": { 
        "bool": { 
            "should": [ 
                {"term": {"price":25}}, 
                {"term": {"itemID": "id100123"}}
            ],
            "must_not": { "term":{"price": 30}}                   
        }
     }
}

Nesting bool inside bool:

GET /lib4/items/_search
{
	"post_filter": { 
		"bool": { 
			"should": [ 
				{"term": {"itemID": "id100123"}}, 
				{"bool": { 
					"must": [ 
						{"term": {"itemID": "id100124"}}, 
						{"term": {"price": 40}} 
					] 
				}} 
			] 
		} 
	} 
}
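
The nested filter above reads as: itemID is id100123, OR (itemID is id100124 AND price is 40). The Python sketch below evaluates the same logic over a few made-up documents. The helpers term and bool_filter are assumptions written for this illustration; they only model matching, not scoring.

```python
def term(field, value):
    """Condition that is true when the document's field equals value."""
    return lambda doc: doc.get(field) == value


def bool_filter(must=(), should=(), must_not=()):
    """Combine conditions the way a bool filter does (matching only, no scores)."""
    def check(doc):
        if not all(cond(doc) for cond in must):
            return False
        if any(cond(doc) for cond in must_not):
            return False
        # Without must clauses, at least one should clause has to match.
        if should and not must:
            return any(cond(doc) for cond in should)
        return True
    return check


docs = [  # hypothetical documents
    {"itemID": "id100123", "price": 25},
    {"itemID": "id100124", "price": 40},
    {"itemID": "id100124", "price": 30},
    {"itemID": "id100125", "price": 40},
]

nested = bool_filter(should=[
    term("itemID", "id100123"),
    bool_filter(must=[term("itemID", "id100124"), term("price", 40)]),
])
matched_docs = [d for d in docs if nested(d)]
print(matched_docs)
```

Only the first two documents pass: the third has the right itemID but the wrong price, and the fourth has the right price but the wrong itemID.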

gt, lt, gte, lte: range filtering

gt: greater than (>)

lt: less than (<)

gte: greater than or equal to (>=)

lte: less than or equal to (<=)
GET /lib4/items/_search
{
    "post_filter": { 
        "range": { 
            "price": { "gt": 25, "lt": 50 } 
        } 
    }
}

exists: keeping only documents in which a field has a value

GET /lib4/items/_search
{
	"query": { 
		"bool": { 
			"filter": { "exists":{ "field":"price" } }
		} 
	} 
} 

GET /lib4/items/_search
{
	"query" : { 
		"constant_score" : { 
			"filter": { "exists" : { "field" : "price" } } 
		} 
	} 
}

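Filter cache

Elasticsearch provides a special cache, the filter cache, that stores the results of filters.
Cached filters need little memory, because they only record which documents match the filter,
and every later query that uses the same filter can reuse them, which greatly improves query performance.

Note: the rest of this section describes the per-filter cache settings of Elasticsearch releases before 2.0. In those releases not every filter was cached by default. The following were not:

   numeric_range
   script
   geo_bbox
   geo_distance
   geo_distance_range
   geo_polygon
   geo_shape
   and
   or
   not

exists, missing, range, term, and terms were cached by default.
To turn caching on for a filter, "_cache": true was added after the filter clause.

From 2.0 on, including Elasticsearch 6, the _cache option no longer exists and Elasticsearch decides on its own which filters to cache.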

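post_filter

post_filter comes up in the context of aggregations, where its role is described as filtering only the search hits, not the aggregation results.
For a search without aggregations, post_filter returns the same hits as an ordinary filter, but it runs after the query has executed,
so it brings none of the benefits a filter gives a query (skipped scoring, caching, and so on). In ordinary searches, do not use post_filter in place of filter.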

Aggregations: sum, min, max, avg, cardinality, terms

Setting size to 0 returns only the aggregation results, without any hits.

1) sum

GET /lib4/items/_search
{
    "size":0, 
    "aggs": { 
        "price_of_sum": { 
            "sum": { "field": "price" }
        } 
    } 
}

2) min

GET /lib4/items/_search
{
    "size": 0, 
    "aggs": { 
        "price_of_min": { 
            "min": { "field": "price" } 
        } 
    }
}

3) max

GET /lib4/items/_search
{
    "size": 0, 
    "aggs": { 
        "price_of_max": { 
            "max": { "field": "price" } 
        } 
    } 
}

4) avg

GET /lib4/items/_search
{
    "size":0, 
    "aggs": { 
        "price_of_avg": { 
            "avg": { "field": "price" } 
        } 
    } 
}

5) cardinality: the number of distinct values

GET /lib4/items/_search
{
    "size":0, 
    "aggs": { 
        "price_of_cardi": { 
            "cardinality": { "field": "price" } 
        } 
    } 
}

6) terms: grouping into buckets

GET /lib4/items/_search
{
    "size":0, 
    "aggs": { 
        "price_group_by": { 
            "terms": { "field": "price" } 
        } 
    } 
}

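Group the users who are interested in singing (changge) by age, with the buckets ordered by average age, highest first:

GET /lib3/user/_search
{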
    "query": { 
        "match": { "interests": "changge" } 
    }, 
    "size": 0, 
    "aggs":{ 
        "age_group_by":{ 
            "terms": { 
                "field": "age", 
                "order": { "avg_of_age": "desc" } 
            }, 
            "aggs": { 
                "avg_of_age": { 
                    "avg": { "field": "age" } 
                } 
            } 
        } 
    } 
}
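
To make clear what each aggregation computes, the Python sketch below works out the same six values over a small made-up list of prices. It only illustrates the arithmetic: in Elasticsearch the work is spread over shards, and cardinality is an approximate count that can deviate slightly on large data sets.

```python
from collections import Counter

prices = [25, 30, 40, 40, 50]  # hypothetical price values

metrics = {
    "sum": sum(prices),
    "min": min(prices),
    "max": max(prices),
    "avg": sum(prices) / len(prices),
    "cardinality": len(set(prices)),  # number of distinct values
}

# terms: one bucket per distinct value, largest bucket first, ties by key
buckets = sorted(Counter(prices).items(), key=lambda kv: (-kv[1], kv[0]))

print(metrics)
print(buckets)
```

Each (key, count) pair in buckets corresponds to one bucket of a terms aggregation with its doc_count.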

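Compound queries

A compound query combines several basic queries into a single query.
The bool query does this.
It accepts the following parameters:

must: the document has to match these clauses to be included (AND)

must_not: the document must not match these clauses to be included (NOT)

should: each matching clause raises the document's _score; a clause that does not match has no effect. should clauses are mainly used to adjust the relevance score of each document (OR)

filter: the document has to match, but the clause runs in non-scoring, filtering mode. These clauses contribute nothing to the score; they only include or exclude documents according to the filter criteria.

How relevance scores are combined: each sub-query computes the document's relevance score on its own. Once the scores are computed,
the bool query merges them and returns a single score for the whole boolean operation.

The query below finds documents whose title field matches how to make millions and that are not tagged spam.
Documents tagged starred or dated 2014 or later rank higher than the others, and documents that meet both conditions rank higher still: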
{ 
    "bool": { 
        "must": { 
            "match": { "title": "how to make millions" }
        }, 
        "must_not": { 
            "match": { "tag": "spam" }
        }, 
        "should": [ 
            { "match": { "tag": "starred" }}, 
            { "range": { "date": { "gte": "2014-01-01" }}} 
        ] 
    } 
}
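If there is no must clause, at least one should clause has to match. If there is at least one must clause, no should clause is required to match.
If the document's date should not influence the score, the previous example can be rewritten with a filter clause: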
{ 
    "bool": { 
        "must": { 
            "match": { "title": "how to make millions" }
        }, 
        "must_not": { 
            "match": { "tag": "spam" }
        }, 
        "should": [ 
            { "match": { "tag": "starred" }} 
        ], 
        "filter": { 
            "range": { "date": { "gte": "2014-01-01" }} 
        } 
    } 
}
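Moving the range query into the filter clause turns it into a non-scoring query, so it no longer affects the relevance ranking; the date condition is now required rather than optional. Because it is a non-scoring query,
all the optimizations available to filters can be used to speed it up.

A bool query can itself be used as a non-scoring query. Simply place it inside the filter clause and build the boolean logic within it: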
{ 
    "bool": { 
        "must": { 
            "match": { "title": "how to make millions" }
        }, 
        "must_not": {     
            "match": { "tag": "spam" }
        }, 
        "should": [ 
            { "match": { "tag": "starred" }} 
        ], 
        "filter": { 
            "bool": { 
                "must": [ { 
                    "range": {
                        "date": { "gte": "2014-01-01" }
                    } 
                }, 
                { 
                    "range": { 
                        "price": { "lte": 29.99 }
                    }
                } ], 
                "must_not": [ { 
                    "term": { "category": "ebooks" }
                } ] 
            }
        } 
    } 
}

constant_score query

It applies one constant score to every matching document. It is often used when only a filter needs to run and there is no other (scoring) query.
{
    "constant_score": { 
        "filter": { 
            "term": { "category": "ebooks" } 
        } 
    } 
}
The term query is placed inside constant_score, which turns it into a non-scoring filter. This form can be used in place of a bool query that has only a filter clause.

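The difference between Elasticsearch ports 9300 and 9200

Port 9300: the TCP port that Elasticsearch nodes use to communicate with each other.
Port 9200: the port that exposes the Elasticsearch RESTful interface for communication between nodes and outside clients.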
