Implementing CRUD with the ElasticSearch API

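2.5 Version control

2.6 Defining mappings

2.7 Basic queries (Query)

2.7.1 Preparing data

2.7.2 term and terms queries

2.7.3 Controlling the number of results returned

2.7.4 Returning the version number

2.7.5 match queries (analyzer-aware)

2.7.6 Controlling which fields are loaded

2.7.7 Sorting

2.7.8 Prefix matching queries

2.7.9 Range queries

2.7.10 wildcard queries

2.7.11 Fuzzy queries with fuzzy

2.7.12 Highlighting search results

2.8 Basic queries on Chinese text (Query)

2.8 Filter queries

2.8.1 Simple filter queries

2.8.2 bool filter queries

2.8.3 Range filter queries

2.8.4 Filtering for non-null values

2.8.5 Filter cache

2.8.6 Aggregation queries

2.8.7 Compound queries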

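Create an index:

PUT /lib/
{
"settings":{
"index":{
"number_of_shards": "5",
"number_of_replicas": "1"
}
}
}

Or create it with the default settings:

PUT lib

View the settings of an index:

GET /lib/_settings

View the settings of all indices:

GET _all/_settings

Add a document:

PUT /lib/user/1
{"first_name":"Fir"}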

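2.5 Version control
ElasticSearch uses optimistic locking to keep data consistent. When a user operates on a document, there is no need to lock and unlock it; the request only has to specify the version it expects to operate on. If the version numbers match, ElasticSearch lets the operation proceed; if they conflict, ElasticSearch reports the conflict and throws a VersionConflictEngineException.

Version numbers in ElasticSearch range from 1 to 2^63 - 1.

Internal versioning: uses _version.

External versioning: ElasticSearch handles external version numbers somewhat differently from internal ones. Instead of checking that _version equals the value given in the request, it checks that the current _version is less than the given value. If the request succeeds, the external version number is stored as the document's _version.

To keep _version in step with the externally maintained version, send the request with version_type=external.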
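
The two version checks described above can be sketched in a few lines of Python. This is only a toy in-memory model, not ElasticSearch code: the class names VersionedStore and VersionConflict are hypothetical, and VersionConflict merely stands in for VersionConflictEngineException.

```python
class VersionConflict(Exception):
    """Hypothetical stand-in for VersionConflictEngineException."""


class VersionedStore:
    """Toy in-memory store mapping doc_id -> (version, source)."""

    def __init__(self):
        self.docs = {}

    def index(self, doc_id, source, version=None, version_type="internal"):
        current = self.docs.get(doc_id, (0, None))[0]
        if version_type == "internal":
            # Internal: a supplied version must equal the stored one.
            if version is not None and version != current:
                raise VersionConflict(f"current={current}, given={version}")
            new_version = current + 1
        else:
            # External: the supplied version must be strictly greater
            # than the stored one, and it becomes the new _version.
            if version is None or version <= current:
                raise VersionConflict(f"current={current}, given={version}")
            new_version = version
        self.docs[doc_id] = (new_version, source)
        return new_version


store = VersionedStore()
print(store.index("1", {"first_name": "Fir"}))             # first write
print(store.index("1", {"first_name": "Sec"}, version=1))  # matches, accepted
print(store.index("1", {"first_name": "Ext"}, version=10,
                  version_type="external"))                # 10 > 2, accepted
```

A second internal write that still claims version 1 would raise VersionConflict, which is the conflict case the text describes.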

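2.6 Defining mappings

When creating an index, you can define each field's type and related attributes in advance, so that date fields are handled as dates, numeric fields as numbers, and string fields as strings. The supported data types are:

(1) Core datatypes

String: the string family, which consists of text and keyword

text is used to index long text. Before indexing, the text is analyzed into a sequence of terms, and those terms are indexed so that es can search on them. By default, text fields cannot be used for sorting or aggregation.

keyword fields are not analyzed and can be used for filtering, sorting, and aggregation. A keyword field can only be matched by its exact value.

Numeric: long, integer, short, byte, double, float

Date: date

Boolean: boolean

Binary: binary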
(2) Complex datatypes

Array datatype: arrays do not need a dedicated type for their elements. For example:

Array of strings: ["one","two"]

Array of integers: [1,2]

Nested array: [1,[2,3]] is equivalent to [1,2,3]

Array of objects: [{"name": "Mary", "age":12},{"name" : "john" , "age" : 10}]

Object datatype: object, for a single JSON object;

Nested datatype: nested, for arrays of JSON objects;
(3) Geo datatypes

Geo-point datatype: geo_point, for latitude/longitude coordinates;

Geo-shape datatype: geo_shape, for complex shapes such as polygons;
(4) Specialised datatypes

IPv4 datatype: ip, for IPv4 addresses;

Completion datatype: completion, provides auto-complete suggestions;

Token count datatype: token_count, counts the number of tokens in an analyzed field; the count keeps increasing and is not reduced by filter conditions.

mapper-murmur3 type: available through a plugin; murmur3 computes a hash of the field's values at index time;

Attachment datatype: with the mapper-attachments plugin, supports indexing attachments.

Supported attributes:

2.7 Basic queries (Query)
2.7.1 Preparing data
Create an index with a mapping:

PUT /lib3
{
"settings": {
"number_of_shards": 3
, "number_of_replicas": 0
},
"mappings": {
"user":{
"properties": {
"name":{"type": "text"},
"address":{"type": "text"},
"age":{"type": "integer"},
"interests":{"type": "text"},
"birthday":{"type": "date"}
}
}
}
}
Insert a few documents:

PUT /lib3/user/1
{
"name" : "zhaoliu",
"address" : "hei long jiang sheng tie ling shi",
"age" : 50,
"birthday" : "1970-12-12",
"interests" : "xi huan hejiu,duanlian,lvyou"
}

PUT /lib3/user/2
{
"name" : "zhaoming",
"address" : "bei jing hai dian qu ",
"age" : 20,
"birthday" : "1998-10-12",
"interests" : "xi huan hejiu,duanlian,lvyou"
}
PUT /lib3/user/3
{
"name" : "lisi",
"address" : "hei long jiang sheng tie ling shi",
"age" : 23,
"birthday" : "1970-12-12",
"interests" : "xi huan hejiu,duanlian,lvyou"
}
PUT /lib3/user/4
{
"name" : "wangwu",
"address" : "bei jing hai dian qu",
"age" : 26,
"birthday" : "1995-12-12",
"interests" : "xi huan hejiu,duanlian,lvyou"
}
PUT /lib3/user/5
{
"name" : "zhangsan",
"address" : "bei jing chao yang qu",
"age" : 29,
"birthday" : "1988-12-12",
"interests" : "xi huan hejiu,duanlian,lvyou"
}
View all documents:

GET /lib3/user/_search
Query by condition:

# "max_score": 0.6931472 is the relevance score of the best match for this search
GET /lib3/user/_search?q=name:lisi
Search result:

{
"took": 2,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.6931472, # relevance score of the best match for this search
"hits": [
{
"_index": "lib3",
"_type": "user",
"_id": "3",
"_score": 0.6931472,
"_source": {
"name": "lisi",
"address": "hei long jiang sheng tie ling shi",
"age": 23,
"birthday": "1970-12-12",
"interests": "xi huan hejiu,duanlian,lvyou"
}
}
]
}
}
GET lib3/user/_search?q=interests:hejiu&sort=age:desc
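2.7.2 term and terms queries
A term query looks up the exact term in the inverted index and knows nothing about analyzers. It suits keyword, numeric, and date fields.

term: find documents whose field contains a given term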

GET lib3/user/_search
{
"query" :{
"term": {
"name": "zhaoliu"
}
}
}
terms: find documents whose field contains any of several terms

GET lib3/user/_search
{
"query" :{
"terms": {
"interests": ["hejiu","lvyou"]
}
}
}
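
To illustrate why a term query only matches exact indexed terms, here is a small Python sketch of an inverted index. The analyze function is a rough, assumed stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and the second sample document is made up for the illustration.

```python
import re
from collections import defaultdict


def analyze(text):
    """Assumed stand-in for the standard analyzer: lowercase and split."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def build_index(docs, field):
    """Map each analyzed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in analyze(doc[field]):
            index[token].add(doc_id)
    return index


def term_query(index, term):
    # Exact lookup: the query term itself is NOT analyzed.
    return set(index.get(term, set()))


def terms_query(index, terms):
    # Union of the documents matching any of the given terms.
    result = set()
    for term in terms:
        result |= term_query(index, term)
    return result


docs = {
    1: {"interests": "xi huan hejiu,duanlian,lvyou"},
    2: {"interests": "Xi Huan changge"},  # hypothetical extra document
}
index = build_index(docs, "interests")
print(term_query(index, "hejiu"))                  # {1}
print(term_query(index, "Xi"))                     # set(): tokens are lowercase
print(terms_query(index, ["hejiu", "changge"]))    # {1, 2}
```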
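2.7.3 Controlling the number of results returned
from: the offset of the first document to return

size: the number of documents to return

Take the first 2 documents: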

GET lib3/user/_search
{
"from": 0,
"size": 2,
"query" :{
"terms": {
"interests": ["hejiu","lvyou"]
}
}
}
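
Conceptually, from and size select a slice of the ordered hit list. A minimal Python sketch, assuming the hits are already sorted (the function name paginate and the sample ids are illustrative only):

```python
def paginate(hits, from_=0, size=10):
    """Return the slice of hits that from/size would select."""
    return hits[from_:from_ + size]


hits = ["doc1", "doc2", "doc3", "doc4", "doc5"]
print(paginate(hits, from_=0, size=2))  # ['doc1', 'doc2']
print(paginate(hits, from_=2, size=2))  # ['doc3', 'doc4']
```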
2.7.4 Returning the version number
Set version to true to include each hit's version number:

GET lib3/user/_search
{
"version": true,
"query" :{
"terms": {
"interests": ["hejiu","lvyou"]
}
}
}
2.7.5 match queries (analyzer-aware)
A match query knows about the analyzer: it analyzes the query text with the field's analyzer and then runs the query.

GET lib3/user/_search
{
"query" :{
"match": {
"name": "zhaoliu wangwu"
}
}
}
GET lib3/user/_search
{
"query" :{
"match": {
"interests": "duanlian changge"
}
}
}
GET lib3/user/_search
{
"query" :{
"match": {
"age": "20"
}
}
}
match_all: return all documents

GET lib3/user/_search
{
"query" :{
"match_all": {}
}
}
multi_match: search several fields at once

GET lib3/user/_search
{
"query" :{
"multi_match": {
"query": "hejiu",
"fields": ["interests","name"]
}
}
}
match_phrase: phrase matching

GET lib3/user/_search
{
"query" :{
"match_phrase": {
"interests": "duanlian,lvyou"
}
}
}
Specify which fields to return:

GET lib3/user/_search
{
"_source": ["address","name"],
"query": {
"match": {
"interests": "duanlian"
}
}
}
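For match_phrase, the ElasticSearch engine first analyzes the query string and builds a phrase query from the analyzed text. A document matches only if it contains all of the phrase's terms, in the same relative positions.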
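
The phrase rule (all terms present, relative positions unchanged) can be sketched in Python as follows. This is a simplified illustration: analyze is an assumed stand-in for the real analyzer, and features such as slop are ignored.

```python
import re


def analyze(text):
    """Assumed stand-in for the standard analyzer: lowercase and split."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def phrase_match(field_text, phrase):
    """True if the analyzed phrase occurs as a contiguous run of tokens."""
    doc_tokens = analyze(field_text)
    phrase_tokens = analyze(phrase)
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


interests = "xi huan hejiu,duanlian,lvyou"
print(phrase_match(interests, "duanlian,lvyou"))  # True: adjacent, same order
print(phrase_match(interests, "lvyou duanlian"))  # False: order reversed
print(phrase_match(interests, "hejiu lvyou"))     # False: not adjacent
```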

2.7.6 Controlling which fields are loaded
includes: fields to include; excludes: fields to exclude

GET lib3/user/_search
{
"query": {
"match_all": {}
},
"_source": {
"includes": ["name","address"]
, "excludes": ["age","birthday"]
}
}
Wildcards can also be used in field names:

GET lib3/user/_search
{
"query": {
"match_all": {}
},
"_source": {
"includes": "addr*"
, "excludes": ["age","bir*"]
}
}
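
The effect of includes and excludes on a hit's _source can be approximated with Python's fnmatch. This sketch handles top-level fields only and assumes that an exclude pattern wins over an include pattern; filter_source is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=None, excludes=None):
    """Keep fields matching includes (all if empty) and not matching excludes."""
    if isinstance(includes, str):
        includes = [includes]
    if isinstance(excludes, str):
        excludes = [excludes]
    includes = includes or []
    excludes = excludes or []
    result = {}
    for field, value in source.items():
        included = not includes or any(fnmatchcase(field, p) for p in includes)
        excluded = any(fnmatchcase(field, p) for p in excludes)
        if included and not excluded:
            result[field] = value
    return result


doc = {
    "name": "lisi",
    "address": "hei long jiang sheng tie ling shi",
    "age": 23,
    "birthday": "1970-12-12",
}
print(filter_source(doc, includes="addr*", excludes=["age", "bir*"]))
print(filter_source(doc, excludes=["age", "bir*"]))
```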
2.7.7 Sorting
Use sort to order the results: desc is descending, asc is ascending.

GET lib3/user/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
2.7.8 Prefix matching queries
GET lib3/user/_search
{
"query": {
"match_phrase_prefix": {
"name": {
"query": "zhao"
}
}
}
}
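2.7.9 Range queries
range: performs a range query

Parameters: from, to, include_lower, include_upper, boost

include_lower: whether the lower bound is included; the default is true

include_upper: whether the upper bound is included; the default is true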

GET lib3/user/_search
{
"query": {
"range": {
"birthday": {
"from": "1990-10-10",
"to": "2018-05-01"
}
}
}
}
GET lib3/user/_search
{
"query": {
"range": {
"age": {
"from": 20,
"to": 25,
"include_lower":true,
"include_upper":false
}
}
}
}
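
The inclusive and exclusive bounds can be checked with a small Python sketch. The ages are those of the sample users inserted earlier; in_range is a hypothetical helper, not an ElasticSearch API.

```python
def in_range(value, from_, to, include_lower=True, include_upper=True):
    """Apply the from/to bounds with optional inclusive edges."""
    lower_ok = value >= from_ if include_lower else value > from_
    upper_ok = value <= to if include_upper else value < to
    return lower_ok and upper_ok


ages = {"zhaoliu": 50, "zhaoming": 20, "lisi": 23, "wangwu": 26, "zhangsan": 29}

# Same bounds as the second query above: 20 <= age < 25.
matched = sorted(
    name for name, age in ages.items()
    if in_range(age, 20, 25, include_lower=True, include_upper=False)
)
print(matched)  # ['lisi', 'zhaoming']
```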
2.7.10 wildcard queries
The wildcards * and ? can be used in the query term.

* stands for zero or more characters

? stands for exactly one character

GET lib3/user/_search
{
"query": {
"wildcard": {
"name": "zhao*"
}
}
}

GET lib3/user/_search
{
"query": {
"wildcard": {
"name": "li?i"
}
}
}
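
One way to picture the two wildcards is to translate the pattern into a regular expression that must match a whole indexed term. The following Python sketch does that; it is an illustration of the matching rule, not how ElasticSearch implements it.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * and ? into regex equivalents, escaping everything else."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(term, pattern):
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


names = ["zhaoliu", "zhaoming", "lisi", "wangwu", "zhangsan"]
print([n for n in names if wildcard_match(n, "zhao*")])  # ['zhaoliu', 'zhaoming']
print([n for n in names if wildcard_match(n, "li?i")])   # ['lisi']
```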
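2.7.11 Fuzzy queries with fuzzy
value: the term to search for

boost: the weight of the query; the default is 1.0

min_similarity: the minimum similarity required for a match; the default is 0.5. For strings the value lies between 0 and 1 inclusive; for numbers it may be greater than 1; for dates it takes values such as 1d or 1m, where 1d means one day. Newer ElasticSearch versions use the fuzziness parameter instead.

prefix_length: the length of the common prefix that candidate terms must share with the query term; the default is 0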

max_expansions: the maximum number of terms the query may expand to; the default is 50 in recent ElasticSearch versions

GET lib3/user/_search
{
"query": {
"fuzzy": {
"name": "zholiu"
}
}
}
GET lib3/user/_search
{
"query": {
"fuzzy": {
"interests": {
"value": "duanlin"
}
}
}
}
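
Fuzzy matching is based on edit distance between the query value and the indexed terms. The Python sketch below uses plain Levenshtein distance as a simplification; max_edits=2 is an assumed threshold for the illustration, and fuzzy_match is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_match(term, value, max_edits=2, prefix_length=0):
    """True if term shares the required prefix and is within max_edits of value."""
    if term[:prefix_length] != value[:prefix_length]:
        return False
    return edit_distance(term, value) <= max_edits


print(edit_distance("zholiu", "zhaoliu"))    # 1
print(edit_distance("duanlin", "duanlian"))  # 1
print(fuzzy_match("zhaoliu", "zholiu"))      # True
print(fuzzy_match("zhaoliu", "zholiu", prefix_length=3))  # False: "zha" != "zho"
```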
2.7.12 Highlighting search results
GET lib3/user/_search
{
"query": {
"match": {
"interests": "duanlian"
}
},
"highlight": {
"fields": {
"interests": {}
}
}
}
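2.8 Basic queries on Chinese text (Query)

The ik plugin ships two analyzers:

ik_max_word: splits the text at the finest granularity, producing as many terms as possible

ik_smart: splits the text at the coarsest granularity; text already assigned to one term is not reused by another term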

PUT /lib4
{
"settings": {
"number_of_shards": 3
, "number_of_replicas": 0
},
"mappings": {
"user":{
"properties": {
"name":{"type": "text","analyzer": "ik_max_word"},
"address":{"type": "text","analyzer": "ik_max_word"},
"age":{"type": "integer"},
"interests":{"type": "text","analyzer": "ik_max_word"},
"birthday":{"type": "date"}
}
}
}
}
Queries then work the same way as for English text.

2.8 Filter queries
A filter does not compute relevance and its results can be cached, so a filter is faster than a query.

Create some data:

POST /lib4/items/_bulk
{"index":{"_id":1}}
{"price":40,"itemID":"ID100123"}
{"index":{"_id":2}}
{"price":50,"itemID":"ID100124"}
{"index":{"_id":3}}
{"price":25,"itemID":"ID100125"}
{"index":{"_id":4}}
{"price":30,"itemID":"ID100126"}
{"index":{"_id":5}}
{"price":null,"itemID":"ID100127"}
2.8.1 Simple filter queries
GET /lib4/items/_search
{
"query": {
"bool": {
"filter": [
{"term":{"price":40}}
]
}
}
}

GET /lib4/items/_search
{
"query": {
"bool": {
"filter": [
{"terms":{"price":[25,40]}}
]
}
}
}

GET /lib4/items/_search
{
"query": {
"bool": {
"filter": [
{"term":{ "itemID": "id100123" }}
]
}
}
}
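The itemID value was analyzed as text, so it is indexed in lowercase, which is why the term filter above searches for id100123. Check how the field is mapped: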

GET /lib4/_mapping
If you do not want the item ID field to be analyzed, recreate the index with an explicit mapping that declares it as keyword:

PUT lib4
{
"mappings": {
"items": {
"properties": {
"itemID":{
"type": "keyword"
}
}
}
}
}
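2.8.2 bool filter queries
Filters can be combined with bool.

Format:

{"bool":{"must":[],"should":[],"must_not":[]}}

must: conditions that must be satisfied (and)

should: conditions that may or may not be satisfied (or)

must_not: conditions that must not be satisfied (not)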

GET /lib4/items/_search
{
"query": {
"bool": {
"should":[
{"term":{"price":25}},
{"term":{"itemID":"id100123"}}
],
"must_not": [
{"term": {
"price": "30"
}}
]
}
}
}
Nested bool:

GET /lib4/items/_search
{
"query": {
"bool": {
"should": [
{"term": {"itemID": "id100123"}},
{
"bool": {
"must": [
{"term": {
"itemID": "id100124"
}},
{
"term": {
"price": "40"
}
}
]
}
}
]
}
}
}
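
The and/or/not semantics of the two bool queries above can be traced with a small Python evaluator over the sample items. This is a sketch under assumptions: term and bool_query are hypothetical helpers, itemID values are stored in lowercase to mimic the analyzed text field, and a should list is treated as required only when there is no must clause.

```python
def term(field, value):
    """Build a predicate that checks a field for an exact value."""
    return lambda doc: doc.get(field) == value


def bool_query(must=(), should=(), must_not=()):
    """Combine predicates with and / or / not semantics."""
    def matches(doc):
        if not all(clause(doc) for clause in must):
            return False
        if any(clause(doc) for clause in must_not):
            return False
        # With no must clauses, at least one should clause has to match.
        if should and not must:
            return any(clause(doc) for clause in should)
        return True
    return matches


items = {
    1: {"price": 40, "itemID": "id100123"},
    2: {"price": 50, "itemID": "id100124"},
    3: {"price": 25, "itemID": "id100125"},
    4: {"price": 30, "itemID": "id100126"},
    5: {"price": None, "itemID": "id100127"},
}

first = bool_query(
    should=[term("price", 25), term("itemID", "id100123")],
    must_not=[term("price", 30)],
)
nested = bool_query(
    should=[
        term("itemID", "id100123"),
        bool_query(must=[term("itemID", "id100124"), term("price", 40)]),
    ],
)
print(sorted(i for i, doc in items.items() if first(doc)))   # [1, 3]
print(sorted(i for i, doc in items.items() if nested(doc)))  # [1]
```

Item 2 fails the nested must because its price is 50, not 40, so only item 1 is returned by the second query.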
2.8.3 Range filter queries
gt: >

lt: <

gte: >=

lte: <=

GET lib4/items/_search
{
"query": {
"bool": {
"filter": {
"range": {
"price": {
"gte": 20,
"lte": 50
}
}
}
}
}
}
2.8.4 Filtering for non-null values
GET lib4/items/_search
{
"query": {
"bool": {
"filter": {
"exists": {
"field": "price"
}
}
}
}
}
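2.8.5 Filter cache
ElasticSearch provides a special cache, the filter cache, which stores the results of filters. Cached filters do not need much memory (they only record which documents match the filter), and they can be reused by any later query that uses the same filter, which greatly improves query performance.

Note: ElasticSearch does not cache every filter by default. The following filters are not cached by default: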

2.8.6 Aggregation queries
# Aggregation queries

# Sum
GET lib4/items/_search
{
"size" : 0,
"aggs":{
"price_of_sum":{
"sum" : {
"field" : "price"
}
}
}
}

# Minimum
GET lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_min": {
"min": {
"field": "price"
}
}
}
}

# Maximum
GET lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_max": {
"max": {
"field": "price"
}
}
}
}

# Average
GET lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_avg": {
"avg": {
"field": "price"
}
}
}
}

# Number of distinct values
GET lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_cardi": {
"cardinality": {
"field": "price"
}
}
}
}

# Grouping
GET lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_group": {
"terms": {
"field": "price"
}
}
}
}

# Group the users interested in exercise (duanlian) by age, ordered by average age
GET lib3/user/_search
{
"query": {
"match": {
"interests": "duanlian"
}
}
, "size": 0,
"aggs": {
"age_of_group": {
"terms": {
"field": "age"
, "order": {
"age_of_avg": "desc"
}
}
, "aggs": {
"age_of_avg": {
"avg": {
"field": "age"
}
}
}
}
}
}
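
What the last request computes (one bucket per age, each with an average, sorted by that average in descending order) can be reproduced in plain Python. The function name terms_agg_with_avg is hypothetical, and the ages are those of the five sample users, all of whom list duanlian among their interests.

```python
from collections import defaultdict


def terms_agg_with_avg(docs, group_field, avg_field):
    """Bucket docs by group_field, average avg_field per bucket, sort desc."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[group_field]].append(doc[avg_field])
    result = [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in buckets.items()
    ]
    result.sort(key=lambda bucket: bucket["avg"], reverse=True)
    return result


users = [{"age": 50}, {"age": 20}, {"age": 23}, {"age": 26}, {"age": 29}]
for bucket in terms_agg_with_avg(users, "age", "age"):
    print(bucket)
```

Because the buckets are keyed on age itself, each bucket's average equals its key, so the output is simply the ages in descending order.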
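2.8.7 Compound queries
A compound query combines several basic queries into a single query.

1. Using the bool query

It accepts the following parameters:

must: the document must match these clauses to be included.

must_not: the document must not match these clauses to be included.

should: if any of these clauses match, the document's _score increases; otherwise they have no effect. They are mainly used to refine each document's relevance score.

filter: the document must match, but the clauses run in non-scoring filter mode. They contribute nothing to the score and only include or exclude documents according to the filter criteria.

How relevance scores are combined: each sub-query computes the document's relevance score independently. Once the scores are computed, the bool query merges them and returns a single score representing the whole boolean operation.

For example, a bool query can find documents whose title field matches how to make millions and that are not marked as spam. Documents that are marked as starred, or that date from 2014 onward, rank higher than the others, and documents that satisfy both rank higher still.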
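
The query described in the previous paragraph is not shown in the article, so the following Python snippet builds one possible request body for it. The field names tag and date and the cutoff "2014-01-01" are assumptions made for the illustration; only title, spam, and starred come from the description.

```python
import json

# Hypothetical request body for the "how to make millions" example.
query = {
    "query": {
        "bool": {
            # Must match the title.
            "must": {"match": {"title": "how to make millions"}},
            # Must not be marked as spam ("tag" is an assumed field name).
            "must_not": {"match": {"tag": "spam"}},
            # Optional clauses that only raise the score when they match.
            "should": [
                {"match": {"tag": "starred"}},
                {"range": {"date": {"gte": "2014-01-01"}}},  # assumed field and cutoff
            ],
        }
    }
}

body = json.dumps(query, indent=2)
print(body)
```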

GET lib4/items/_search
{
"query": {
"bool": {
"filter": {
"range": {
"price": {
"gte": 20,
"lte": 50
}
}
}
}
}
}
GET lib4/items/_search
{
"query": {
"bool": {
"filter": {
"exists": {
"field": "price"
}
}
}
}
}

GET lib3/user/_search
{
"query": {
"bool": {
"must": [
{
"match": {"interests": "duanlian"}}],
"must_not": [{"match": {"interests": "lvyou"}}]
, "should": [
{"match": {"address": "bei jing"}},
{ "range": {"birthday": {"gte": "1996-01-01"}}}

]

}
}
}

GET lib3/user/_search
{
"query": {
"bool": {
"must": [
{"match": {
"interests": "duanlian"
}}

]
, "must_not": [
{"match": {
"interests": "lvyou"
}}
]
, "should": [
{"match": {
"address": "beijing"
}}
]
, "filter": {
"range": {
"birthday": {
"gte": "1996-01-01"
}
}
}
}
}
}
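constant_score query (does not compute relevance scores)

It applies a constant score to every matching document. It is often used when you only need to run a filter and no other (scoring) query.

Here a term query is placed inside constant_score, which turns it into a non-scoring filter. This form can replace a bool query that contains only a filter clause.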

GET lib3/user/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"interests": "duanlian"
}
}
}
}
}

Original link: https://blog.csdn.net/qq_41851454/article/details/81353359
