Elasticsearch search: term vs. match, and the bool query

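1. ik_max_word

Splits the text at the finest granularity. For example, it splits “中華人民共和國人民大會堂” into 中華人民共和國, 中華人民, 中華, 華人, 人民共和國, 人民, 共和國, 大會堂, 大會, 會堂 and so on.

2. ik_smart

Splits the text at the coarsest granularity. For example, it splits “中華人民共和國人民大會堂” into 中華人民共和國 and 人民大會堂.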
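To make the difference between the two granularities concrete, here is a toy sketch in Python. It is not how the IK plugin is implemented: the tiny `DICTIONARY` and both function names are assumptions for illustration. The fine-grained mode is modelled as "emit every dictionary word found in the text", and the coarse mode as greedy forward maximum matching.

```python
# Hypothetical mini dictionary; the real IK plugin ships a far larger one.
DICTIONARY = {
    "中華人民共和國", "中華人民", "中華", "華人", "人民共和國", "人民",
    "共和國", "人民大會堂", "大會堂", "大會", "會堂",
}
MAX_LEN = max(len(word) for word in DICTIONARY)
TEXT = "中華人民共和國人民大會堂"


def finest_split(text):
    """Emit every dictionary word that occurs anywhere in the text."""
    tokens = []
    for start in range(len(text)):
        for end in range(min(len(text), start + MAX_LEN), start, -1):
            if text[start:end] in DICTIONARY:
                tokens.append(text[start:end])
    return tokens


def coarsest_split(text):
    """Greedy forward maximum matching: always take the longest word."""
    tokens = []
    pos = 0
    while pos < len(text):
        for end in range(min(len(text), pos + MAX_LEN), pos, -1):
            if text[pos:end] in DICTIONARY:
                tokens.append(text[pos:end])
                pos = end
                break
        else:
            # No dictionary word starts here: emit a single character.
            tokens.append(text[pos])
            pos += 1
    return tokens


print(finest_split(TEXT))
print(coarsest_split(TEXT))
```

With this toy dictionary the coarse mode yields the two long words, while the fine mode also yields the shorter overlapping words such as 華人 and 大會.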

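`term` and `match` summary

In real projects, term and match are the two most commonly used queries, yet the difference between them is easy to get wrong, so here is a summary.

term usage

Start with the definition: term means an exact match. The search text is not analyzed (tokenized) before the lookup.

An example makes this clearer. First, store some data: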

{
    "title": "love China",
    "content": "people very love China",
    "tags": ["China", "love"]
}
{
    "title": "love HuBei",
    "content": "people very love HuBei",
    "tags": ["HuBei", "love"]
}

Now run a term query:

{
  "query": {
    "term": {
      "title": "love"
    }
  }
}

The result is that both documents above are returned:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": ["HuBei","love"]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": ["China","love"]
        }
      }
    ]
  }
}

Every document whose title contains the keyword love is returned. But I only want to match love China exactly, so let's see whether the following query finds it:

{
  "query": {
    "term": {
      "title": "love China"
    }
  }
}

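Running it returns no data. Conceptually, term is an exact match against a single indexed term, and the analyzed title field contains no single term love China. So how do I match several terms? Use terms: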

{
  "query": {
    "terms": {
      "title": ["love", "China"]
    }
  }
}

The query result is:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": ["HuBei","love"]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": ["China","love"]
        }
      }
    ]
  }
}

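Everything is returned. Why? Because the values inside the terms [ ] array are ORed together: a document matches as soon as it contains any one of them. To require both terms at the same time, use must in a bool query, as follows: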

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "title": "love"
          }
        },
        {
          "term": {
            "title": "china"
          }
        }
      ]
    }
  }
}

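Note that the query above uses lowercase china. If we search with the capitalized China instead, nothing is found. Why? The title field was analyzed (tokenized) when it was stored, in this case by the default analyzer. Let's look at how that analysis works.

Analyzer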

GET test/_analyze
{
  "text" : "love China"
}

The result is:

{
  "tokens": [
    {
      "token": "love",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "china",
      "start_offset": 5,
      "end_offset": 10,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

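The analysis produces two tokens, love and china. A term query only matches those tokens exactly as they are, with no transformation at all, so a query for China fails. A later section covers analyzers in detail.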
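The behaviour of term, terms and bool must seen so far can be reproduced with a toy inverted index in Python. This is only a sketch under simplifying assumptions: `analyze` is a rough stand-in for the default analyzer (split on non-alphanumeric characters, then lowercase), and `DOCS`, `build_index`, `term`, `terms` and `must` are hypothetical names, not Elasticsearch APIs.

```python
import re
from collections import defaultdict

# Titles of the two sample documents, keyed by _id.
DOCS = {"7": "love China", "8": "love HuBei"}


def analyze(text):
    """Rough stand-in for the default analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs):
    """Map each indexed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, title in docs.items():
        for token in analyze(title):
            index[token].add(doc_id)
    return index


INDEX = build_index(DOCS)


def term(value):
    """Exact lookup: the value is NOT analyzed before the lookup."""
    return set(INDEX.get(value, set()))


def terms(values):
    """OR over several exact lookups."""
    result = set()
    for value in values:
        result |= term(value)
    return result


def must(*clauses):
    """AND over the document sets returned by the clauses."""
    return set.intersection(*clauses)


print(term("love"))                       # both documents
print(term("love China"))                 # empty: no such single token
print(terms(["love", "China"]))           # both, thanks to "love"
print(must(term("love"), term("china")))  # only document 7
print(must(term("love"), term("China")))  # empty: index holds "china"
```

The index only ever contains lowercased tokens, which is why the unanalyzed values "love China" and "China" find nothing.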

match usage

First, match against love China.

GET test/doc/_search
{
  "query": {
    "match": {
      "title": "love China"
    }
  }
}

The result is:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": [
            "China",
            "love"
          ]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": [
            "HuBei",
            "love"
          ]
        }
      }
    ]
  }
}

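Both documents are returned. Why? Because match first analyzes the search text and only then does the matching. The title terms of the two documents are love, china and hubei. Our search text love China is analyzed into love and china, and the two are ORed together, so a document matches as soon as it contains either term. What if we want love and China to match together? Use match_phrase.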
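The OR behaviour of match can be sketched in a few lines of Python. This is a toy model, not Elasticsearch's implementation: `analyze` approximates the default analyzer, and the score is simply the number of query tokens found in the title, an assumption that stands in for the real relevance scoring.

```python
import re

# Titles of the two sample documents, keyed by _id.
DOCS = {"7": "love China", "8": "love HuBei"}


def analyze(text):
    """Rough stand-in for the default analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def match(query_text, docs=DOCS):
    """Analyze the query text, then OR the resulting tokens together."""
    query_tokens = set(analyze(query_text))
    hits = []
    for doc_id, title in docs.items():
        overlap = query_tokens & set(analyze(title))
        if overlap:
            # Toy score: how many query tokens the title contains.
            hits.append((doc_id, len(overlap)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


print(match("love China"))
```

Document 7 contains both tokens and document 8 only one, so both match but document 7 ranks first, which mirrors the ordering in the response above.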

match_phrase usage

match_phrase is a phrase search: every analyzed term must appear in the document, in the same order and in adjacent positions.

GET test/doc/_search
{
  "query": {
    "match_phrase": {
      "title": "love china"
    }
  }
}

The result is:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": [
            "China",
            "love"
          ]
        }
      }
    ]
  }
}

This time the result meets our requirement: only one record is returned.
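A minimal Python sketch of the phrase rule, assuming the same rough `analyze` stand-in for the default analyzer. The function name `match_phrase` is hypothetical here, and the sketch ignores options such as slop.

```python
import re


def analyze(text):
    """Rough stand-in for the default analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def match_phrase(doc_text, query_text):
    """True if the query tokens occur in the document in order and adjacent."""
    doc_tokens = analyze(doc_text)
    query_tokens = analyze(query_text)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


print(match_phrase("love China", "love china"))              # True
print(match_phrase("love HuBei", "love china"))              # False
print(match_phrase("people very love China", "people love")) # False
```

The last call fails even though both tokens are present, because very sits between them.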

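Using the bool query

The bool query corresponds to Lucene's BooleanQuery. It is made up of one or more clauses, each with a specific type.

must

Returned documents must satisfy the must clause, and the clause contributes to the score.

filter

Returned documents must satisfy the filter clause. Unlike must, it does not contribute to the score.

should

Returned documents may satisfy the should clause. In a bool query that has no must or filter but has one or more should clauses, a document is returned as soon as it satisfies one of them. The `minimum_should_match` parameter sets the minimum number of should clauses that have to match.

must_not

Returned documents must not satisfy the conditions defined in must_not.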

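If the bool query is used in a filter context and has should clauses, at least one should clause must match.

The bool query also supports the disable_coord option, which disables the coordination factor: a score factor based on the fraction of all query terms that a document contains. Note that the coordination factor was dropped in Elasticsearch 6.0, where this option no longer has any effect.

The bool query takes a more-matches-is-better approach, so the scores of all matching must and should clauses are added together to give the final score of each document.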

{
    "bool" : {
        "must" : {
            "term" : { "user" : "kimchy" }
        },
        "filter": {
            "term" : { "tag" : "tech" }
        },
        "must_not" : {
            "range" : {
                "age" : { "gte" : 10, "lte" : 20 }
            }
        },
        "should" : [
            {
                "term" : { "tag" : "wow" }
            },
            {
                "term" : { "tag" : "elasticsearch" }
            }
        ],
        "minimum_should_match" : 1,
        "boost" : 1.0
    }
}
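The way the four clause types combine in the query above can be sketched as a small Python function. Everything here is an illustrative assumption rather than Elasticsearch code: clauses are plain predicates, every matching must or should clause adds a flat 1.0 to the score, and `minimum_should_match` defaults to 1 only when there is no must or filter clause.

```python
def term(field, value):
    """Predicate: the field equals the value, or contains it if it is a list."""
    def check(doc):
        stored = doc.get(field)
        return value in stored if isinstance(stored, list) else stored == value
    return check


def in_range(field, gte, lte):
    """Predicate: gte <= doc[field] <= lte."""
    return lambda doc: field in doc and gte <= doc[field] <= lte


def bool_query(doc, must=(), filter_=(), should=(), must_not=(),
               minimum_should_match=None):
    """Return a toy score for the document, or None if it does not match."""
    if minimum_should_match is None:
        minimum_should_match = 1 if should and not (must or filter_) else 0
    if not all(clause(doc) for clause in must):
        return None
    if not all(clause(doc) for clause in filter_):
        return None
    if any(clause(doc) for clause in must_not):
        return None
    should_hits = sum(1 for clause in should if clause(doc))
    if should_hits < minimum_should_match:
        return None
    # filter and must_not never add to the score.
    return float(len(must) + should_hits)


QUERY = dict(
    must=[term("user", "kimchy")],
    filter_=[term("tag", "tech")],
    must_not=[in_range("age", 10, 20)],
    should=[term("tag", "wow"), term("tag", "elasticsearch")],
    minimum_should_match=1,
)

doc = {"user": "kimchy", "tag": ["tech", "wow"], "age": 30}
print(bool_query(doc, **QUERY))
```

A document with more matching should clauses scores higher, while one that falls in the must_not age range is rejected no matter how many other clauses it satisfies.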

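Scoring with bool.filter

Queries placed under the filter clause have no effect on scoring: their score is returned as 0. The score is only affected by the scoring query that has been specified.

For example, all three of the following queries return every document whose status field is active.

The first query specifies no scoring query, so every document gets a score of 0: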

GET _search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "status": "active"
        }
      }
    }
  }
}

The following bool query contains a match_all query, so every document gets a score of 1:

GET _search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "status": "active"
        }
      }
    }
  }
}

A constant_score query behaves the same way as the second query above and also gives every matching document a score of 1:

GET _search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "status": "active"
        }
      }
    }
  }
}
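The three scoring outcomes can be traced with a tiny evaluator over the same query shapes, written as Python dicts. This is a sketch under stated assumptions: `evaluate` is a hypothetical helper that understands only match_all, term, constant_score and the must and filter clauses of bool, and a matching term in a scoring context is given a flat 1.0 instead of a real relevance score.

```python
def as_list(clauses):
    """Clauses may be written as a single object or as a list."""
    return clauses if isinstance(clauses, list) else [clauses]


def evaluate(query, doc):
    """Return a toy score for the document, or None if it does not match."""
    kind, body = next(iter(query.items()))
    if kind == "match_all":
        return 1.0
    if kind == "term":
        field, value = next(iter(body.items()))
        return 1.0 if doc.get(field) == value else None
    if kind == "constant_score":
        matched = evaluate(body["filter"], doc) is not None
        return body.get("boost", 1.0) if matched else None
    if kind == "bool":
        score = 0.0
        for clause in as_list(body.get("must", [])):
            clause_score = evaluate(clause, doc)
            if clause_score is None:
                return None
            score += clause_score
        for clause in as_list(body.get("filter", [])):
            # Filters decide matching only; their score is discarded.
            if evaluate(clause, doc) is None:
                return None
        return score
    raise ValueError(f"unsupported query type: {kind}")


STATUS_FILTER = {"term": {"status": "active"}}
FILTER_ONLY = {"bool": {"filter": STATUS_FILTER}}
WITH_MATCH_ALL = {"bool": {"must": {"match_all": {}}, "filter": STATUS_FILTER}}
CONSTANT = {"constant_score": {"filter": STATUS_FILTER}}

active = {"status": "active"}
for query in (FILTER_ONLY, WITH_MATCH_ALL, CONSTANT):
    print(evaluate(query, active))
```

For an active document the three queries give 0.0, 1.0 and 1.0 respectively, and a document with any other status matches none of them.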

Using named queries to label clauses

To find out which clause inside a bool query actually matched, give the clauses names with named queries:

{
    "bool" : {
        "should" : [
            {"match" : { "name.first" : {"query" : "shay", "_name" : "first"} }},
            {"match" : { "name.last" : {"query" : "banon", "_name" : "last"} }}
        ],
        "filter" : {
            "terms" : {
                "name.last" : ["banon", "kimchy"],
                "_name" : "test"
            }
        }
    }
}
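Each hit in the response then carries a matched_queries list with the names of the clauses that matched it. The idea can be sketched in Python as follows. This is an illustration only: `matched_queries` here is a hypothetical helper, the clauses are plain predicates on a flattened document, and the order of names in a real response should not be relied on.

```python
def matched_queries(doc, should, filter_):
    """Return the names of the clauses that matched, or None if no match."""
    # Every filter clause is mandatory.
    if not all(predicate(doc) for _, predicate in filter_):
        return None
    names = [name for name, predicate in should if predicate(doc)]
    names += [name for name, _ in filter_]
    return names


# Named clauses mirroring the query above: (name, predicate) pairs.
SHOULD = [
    ("first", lambda doc: doc.get("name.first") == "shay"),
    ("last", lambda doc: doc.get("name.last") == "banon"),
]
FILTER = [
    ("test", lambda doc: doc.get("name.last") in ("banon", "kimchy")),
]

print(matched_queries({"name.first": "shay", "name.last": "banon"}, SHOULD, FILTER))
print(matched_queries({"name.first": "john", "name.last": "kimchy"}, SHOULD, FILTER))
```

The first document matches all three named clauses, while the second matches only the filter named test.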
