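This article is based on Elasticsearch 7.x.
When a full-text search runs, the input search text is first analyzed into terms, and those terms are then looked up in the inverted index. By default, a document is returned as a hit as long as it matches any one of the terms.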
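To make the idea concrete, here is a minimal sketch of a toy inverted index in plain Java. It is only an illustration, not how Elasticsearch or Lucene is implemented: the class name `InvertedIndexSketch` and its lowercase-and-split analyzer are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A toy inverted index; not how Elasticsearch/Lucene works internally.
final class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new LinkedHashMap<>();

    // Hypothetical analyzer: lowercase, then split on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Add every term of the text to the postings list of that term.
    void index(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // OR semantics: a document is a hit if it contains at least one query term.
    Set<Integer> search(String queryText) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyze(queryText)) {
            hits.addAll(postings.getOrDefault(term, Set.of()));
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.index(1, "Quick brown rabbits");
        index.index(2, "Keeping pets healthy");
        index.index(3, "My dog barks");
        System.out.println(index.search("brown dog")); // prints [1, 3]
    }
}
```

The query "brown dog" is split into two terms, and documents 1 and 3 each contain one of them, so both are returned even though neither contains the whole phrase.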
Before reading this post, first get familiar with "Elasticsearch full-text search: basic syntax API".
REST API
Add sample search data
POST /blogs/_bulk
{"index": {}}
{"post_date": "2020-01-01", "title": "Quick brown rabbits", "content": "Brown rabbits are commonly seen.", "author_id": 11401}
{"index": {}}
{"post_date": "2020-01-02", "title": "Keeping pets healthy", "content": "My quick brown fox eats rabbits on a regular basis.", "author_id": 11402}
{"index": {}}
{"post_date": "2020-01-03", "title": "My dog barks", "content": "I see a lot of barking dogs on the road.", "author_id": 11403}
bool
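The examples in the basic matching API post each match a single piece of search text, that is, single-condition search. Now let's look at bool multi-condition search, a compound search built from several search clauses.
(1) bool syntax
- must
The clause must match, and it contributes to the score.
- must_not
The clause must not match. It runs in filter context, so it does not contribute to the score.
- should
The clause may match, and it contributes to the score when it does.
- filter
The clause must match, but it does not contribute to the score.
must and should clauses take part in relevance scoring, while filter and must_not clauses do not, so they are cheaper to execute and their results can be cached. Combining these four kinds of clauses into one compound statement is what a bool search is.
The match, match_phrase, dis_max, multi_match, and term queries covered in the basic matching API post are the basic search syntax; a bool search uses them as its clauses.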
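The clause rules above can be modelled with a few lines of plain Java. This is only a sketch under stated assumptions: `BoolQuerySketch` and `Blog` are hypothetical names, every matching must or should clause is simply worth 1.0 (Elasticsearch really computes BM25 scores), and the default for minimum_should_match follows the documented rule of 1 when there are should clauses but no must or filter clauses, otherwise 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.Predicate;

// Toy model of bool clause semantics; real scores come from BM25, not from counting clauses.
final class BoolQuerySketch {
    // Hypothetical in-memory stand-in for a document of the blogs index.
    static final class Blog {
        final String postDate;
        final String title;
        final int authorId;

        Blog(String postDate, String title, int authorId) {
            this.postDate = postDate;
            this.title = title;
            this.authorId = authorId;
        }
    }

    final List<Predicate<Blog>> must = new ArrayList<>();
    final List<Predicate<Blog>> mustNot = new ArrayList<>();
    final List<Predicate<Blog>> should = new ArrayList<>();
    final List<Predicate<Blog>> filter = new ArrayList<>();
    Integer minimumShouldMatch; // null means "use the default"

    // Returns the score of a hit, or an empty value if the document does not match.
    OptionalDouble score(Blog doc) {
        // filter and must_not only decide membership; they never change the score.
        for (Predicate<Blog> clause : filter) {
            if (!clause.test(doc)) return OptionalDouble.empty();
        }
        for (Predicate<Blog> clause : mustNot) {
            if (clause.test(doc)) return OptionalDouble.empty();
        }
        double score = 0.0;
        for (Predicate<Blog> clause : must) {
            if (!clause.test(doc)) return OptionalDouble.empty();
            score += 1.0; // assumption: every matching scoring clause is worth 1.0
        }
        int matchedShould = 0;
        for (Predicate<Blog> clause : should) {
            if (clause.test(doc)) {
                matchedShould++;
                score += 1.0;
            }
        }
        // Default: 1 if there are should clauses but no must/filter clauses, otherwise 0.
        int required = minimumShouldMatch != null ? minimumShouldMatch
                : (!should.isEmpty() && must.isEmpty() && filter.isEmpty() ? 1 : 0);
        return matchedShould >= required ? OptionalDouble.of(score) : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        BoolQuerySketch query = new BoolQuerySketch();
        query.must.add(doc -> doc.authorId == 11403);
        query.mustNot.add(doc -> doc.postDate.compareTo("2020-01-02") <= 0);
        query.should.add(doc -> doc.title.equals("My dog barks"));
        query.minimumShouldMatch = 1;

        // Matches: one must clause plus one should clause, so the toy score is 2.0.
        System.out.println(query.score(new Blog("2020-01-03", "My dog barks", 11403)));
        // Does not match: the must clause on authorId fails.
        System.out.println(query.score(new Blog("2020-01-02", "Keeping pets healthy", 11402)));
    }
}
```

Note how the must_not and filter loops can only reject a document, while the must and should loops are the only places where the score grows.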
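(2) Examples
a. Basic usage
With the default dynamic mapping, only the title.keyword clause can satisfy minimum_should_match in this request: a term query on a keyword field compares the whole field value, and "barking dogs" is only part of the content sentence.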
GET /blogs/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "author_id": {
              "value": "11403"
            }
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "post_date": {
              "lte": "2020-01-02"
            }
          }
        }
      ],
      "should": [
        {
          "term": {
            "title.keyword": {
              "value": "My dog barks"
            }
          }
        },
        {
          "term": {
            "content.keyword": {
              "value": "barking dogs"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
b. Nested bool search
GET /blogs/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "author_id": {
              "value": "11403"
            }
          }
        }
      ],
      "should": [
        {
          "bool": {
            "must_not": [
              {
                "term": {
                  "post_date": {
                    "value": "2020-01-02"
                  }
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
c. Sorting and pagination
GET /blogs/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "post_date": {
              "gte": "2020-01-01",
              "lte": "2020-01-03"
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "author_id": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 2
}
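The effect of sort, from, and size in the request above can be pictured as "sort everything that matched, skip `from` hits, keep `size` hits". A minimal sketch in plain Java, assuming the hypothetical helper `SortAndPageSketch.page` and using only the author ids of the three sample documents:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrates the from/size window over a sorted result list; not Elasticsearch code.
final class SortAndPageSketch {
    // Sort author ids in descending order, then return the window [from, from + size).
    static List<Integer> page(List<Integer> authorIds, int from, int size) {
        return authorIds.stream()
                .sorted(Comparator.reverseOrder())
                .skip(from)
                .limit(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> authorIds = Arrays.asList(11401, 11402, 11403);
        System.out.println(page(authorIds, 0, 2)); // prints [11403, 11402]
        System.out.println(page(authorIds, 2, 2)); // prints [11401]
    }
}
```

With from 0 and size 2 the first page holds the two largest author ids; moving from to 2 yields the second page with the remaining document.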
filter
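The match, match_phrase, dis_max, multi_match, and term queries covered in the basic matching API post are the basic search syntax, and a filter is built from those same queries. A filter does not compute a relevance score and its results can be cached effectively, so it is more efficient.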
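Why a filter caches well can be shown with a small sketch: since a filter only answers yes or no per document, its result is just a set of document ids that can be stored and reused. The code below is a simplified illustration in plain Java, not the actual Elasticsearch cache; `FilterCacheSketch`, the string cache key, and the evaluation counter are all assumptions made for this example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy filter cache: one bit per document, reused whenever the same filter key comes back.
final class FilterCacheSketch {
    private final List<String> postDates;                       // doc id = position in the list
    private final Map<String, BitSet> cache = new HashMap<>();  // filter key -> matching doc ids
    private int evaluations = 0;                                 // how often a filter really ran

    FilterCacheSketch(List<String> postDates) {
        this.postDates = postDates;
    }

    // Runs the predicate over all documents only on the first call for a given key.
    BitSet filter(String key, Predicate<String> predicate) {
        return cache.computeIfAbsent(key, k -> {
            evaluations++;
            BitSet bits = new BitSet(postDates.size());
            for (int i = 0; i < postDates.size(); i++) {
                if (predicate.test(postDates.get(i))) {
                    bits.set(i);
                }
            }
            return bits;
        });
    }

    int evaluations() {
        return evaluations;
    }

    public static void main(String[] args) {
        FilterCacheSketch index = new FilterCacheSketch(
                List.of("2020-01-01", "2020-01-02", "2020-01-03"));
        Predicate<String> isJan3 = date -> date.equals("2020-01-03");
        System.out.println(index.filter("post_date:2020-01-03", isJan3)); // prints {2}
        System.out.println(index.filter("post_date:2020-01-03", isJan3)); // prints {2}, served from the cache
        System.out.println(index.evaluations());                          // prints 1
    }
}
```

A scored query cannot be reused this easily, because its result also carries a score for each hit rather than a plain yes or no.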
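(1) Syntax
- constant_score
- bool
(2) Examples
a. constant_score syntax
constant_score wraps a filter and gives every matching document the same fixed score. That score equals the boost parameter, which defaults to 1.0.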
GET /blogs/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "post_date": {
            "gte": "2020-01-01",
            "lte": "2020-01-03"
          }
        }
      }
    }
  },
  "sort": [
    {
      "author_id": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 2
}
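What constant_score does to the scores can be sketched in plain Java: every document accepted by the filter receives the same value. The class `ConstantScoreSketch` and its map of document id to post date are hypothetical, and the date comparison relies on ISO dates sorting correctly as strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Toy model of constant_score: the filter selects documents, the boost becomes their score.
final class ConstantScoreSketch {
    static Map<String, Double> constantScore(Map<String, String> postDateById,
                                             Predicate<String> filter,
                                             double boost) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : postDateById.entrySet()) {
            if (filter.test(entry.getValue())) {
                scores.put(entry.getKey(), boost); // same score for every hit
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, String> postDateById = new LinkedHashMap<>();
        postDateById.put("1", "2020-01-01");
        postDateById.put("2", "2020-01-02");
        postDateById.put("3", "2020-01-03");

        // Same range as the request above: 2020-01-01 <= post_date <= 2020-01-03.
        Predicate<String> range = date ->
                date.compareTo("2020-01-01") >= 0 && date.compareTo("2020-01-03") <= 0;
        System.out.println(constantScore(postDateById, range, 1.0)); // prints {1=1.0, 2=1.0, 3=1.0}
    }
}
```

Because all hits tie on score, the explicit sort on author_id in the request is what decides their order.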
b. bool syntax
GET /blogs/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "post_date": "2020-01-03"
        }
      },
      "should": [
        {
          "term": {
            "title.keyword": {
              "value": "My dog barks"
            }
          }
        },
        {
          "term": {
            "content.keyword": {
              "value": "barking dogs"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Java API
Next we use the Elasticsearch Java client (RestHighLevelClient) and convert the examples above into Java client code.
(1) main method
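// Imports used by main; the methods below need further org.elasticsearch imports.
import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public static void main(String[] args) throws IOException {
    // try-with-resources closes the client even if one of the calls throws
    try (RestHighLevelClient client = new RestHighLevelClient(
            RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
        bulkIndex(client);
        baseApi(client);
        boolNestApi(client);
        boolPaginationApi(client);
        constantScoreApi(client);
        boolFilterApi(client);
    }
}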
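Be careful when indexing and searching back to back: a newly indexed document only becomes searchable after the next refresh, which by default runs about once per second. This is why Elasticsearch is called near real-time. If the searches run immediately after the bulk request they may find nothing, so either run the two steps separately or set a refresh policy such as WriteRequest.RefreshPolicy.WAIT_UNTIL on the bulk request.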
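The near real-time behaviour can be pictured with a small simulation in plain Java. It is a deliberately simplified model, not Elasticsearch code: `NearRealTimeSketch` is a hypothetical class in which indexed documents sit in a buffer and only become visible to searches once `refresh()` is called, standing in for the periodic refresh.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of near real-time search: documents are searchable only after a refresh.
final class NearRealTimeSketch {
    private final List<String> buffer = new ArrayList<>();     // indexed, not yet searchable
    private final List<String> searchable = new ArrayList<>(); // visible to searches

    void index(String doc) {
        buffer.add(doc);
    }

    // Stands in for the periodic refresh (about once per second by default).
    void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
    }

    // Returns a copy of everything that is currently searchable.
    List<String> search() {
        return new ArrayList<>(searchable);
    }

    public static void main(String[] args) {
        NearRealTimeSketch index = new NearRealTimeSketch();
        index.index("My dog barks");
        System.out.println(index.search()); // prints [] because no refresh has happened yet
        index.refresh();
        System.out.println(index.search()); // prints [My dog barks]
    }
}
```

The first search corresponds to querying right after the bulk request, and the second to querying after the refresh interval has passed.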
(2) Add search data
private static void bulkIndex(RestHighLevelClient client) throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    // Explicit ids make repeated runs overwrite the same three documents.
    // source(XContentType, Object...) takes alternating field names and values.
    bulkRequest.add(new IndexRequest("blogs").id("1")
            .source(XContentType.JSON, "post_date", "2020-01-01", "title", "Quick brown rabbits",
                    "content", "Brown rabbits are commonly seen.", "author_id", 11401));
    bulkRequest.add(new IndexRequest("blogs").id("2")
            .source(XContentType.JSON, "post_date", "2020-01-02", "title", "Keeping pets healthy",
                    "content", "My quick brown fox eats rabbits on a regular basis.", "author_id", 11402));
    bulkRequest.add(new IndexRequest("blogs").id("3")
            .source(XContentType.JSON, "post_date", "2020-01-03", "title", "My dog barks",
                    "content", "I see a lot of barking dogs on the road.", "author_id", 11403));
    // Wait until a refresh has made the documents searchable, since main() searches right after indexing.
    // Requires org.elasticsearch.action.support.WriteRequest.
    bulkRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    client.bulk(bulkRequest, RequestOptions.DEFAULT);
}
(3) Basic bool search usage
private static void baseApi(RestHighLevelClient client) throws IOException {
    SearchRequest searchRequest = new SearchRequest("blogs");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
    // must: the author id has to match
    boolQueryBuilder.must(new TermQueryBuilder("author_id", "11403"));
    // must_not: exclude posts dated 2020-01-02 or earlier
    boolQueryBuilder.mustNot(new RangeQueryBuilder("post_date").lte("2020-01-02"));
    // should: at least one of the two clauses has to match (minimum_should_match = 1)
    boolQueryBuilder.should(new TermQueryBuilder("title.keyword", "My dog barks"));
    boolQueryBuilder.should(new TermQueryBuilder("content.keyword", "barking dogs"));
    boolQueryBuilder.minimumShouldMatch(1);
    searchSourceBuilder.query(boolQueryBuilder);
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    // Print the _source of every hit
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}
(4) Nested bool search
private static void boolNestApi(RestHighLevelClient client) throws IOException {
    SearchRequest searchRequest = new SearchRequest("blogs");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
    boolQueryBuilder.must(new TermQueryBuilder("author_id", "11403"));
    // Inner bool query: matches documents whose post_date is not 2020-01-02
    BoolQueryBuilder boolQueryBuilder2 = new BoolQueryBuilder();
    boolQueryBuilder2.mustNot(new TermQueryBuilder("post_date", "2020-01-02"));
    // The inner bool query is used as a should clause of the outer one
    boolQueryBuilder.should(boolQueryBuilder2);
    boolQueryBuilder.minimumShouldMatch(1);
    searchSourceBuilder.query(boolQueryBuilder);
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}
(5) Sorting and pagination
private static void boolPaginationApi(RestHighLevelClient client) throws IOException {
    SearchRequest searchRequest = new SearchRequest("blogs");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
    boolQueryBuilder.must(new RangeQueryBuilder("post_date").gte("2020-01-01").lte("2020-01-03"));
    searchSourceBuilder.query(boolQueryBuilder);
    // Sort by author_id in descending order and return the first page of 2 hits
    searchSourceBuilder.sort("author_id", SortOrder.DESC);
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(2);
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}
(6) constant_score
private static void constantScoreApi(RestHighLevelClient client) throws IOException {
    SearchRequest searchRequest = new SearchRequest("blogs");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    RangeQueryBuilder rangeQueryBuilder = new RangeQueryBuilder("post_date").gte("2020-01-01").lte("2020-01-03");
    // Wrap the range filter so that every hit gets the same constant score
    ConstantScoreQueryBuilder constantScoreQueryBuilder = new ConstantScoreQueryBuilder(rangeQueryBuilder);
    // Set it as the query (not as a post_filter) to mirror the REST request above
    searchSourceBuilder.query(constantScoreQueryBuilder);
    searchSourceBuilder.sort("author_id", SortOrder.DESC);
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(2);
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}
(7) filter
private static void boolFilterApi(RestHighLevelClient client) throws IOException {
    SearchRequest searchRequest = new SearchRequest("blogs");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
    // filter: same term filter on post_date as in the REST request; it does not affect the score
    boolQueryBuilder.filter(new TermQueryBuilder("post_date", "2020-01-03"));
    // should: single-value term queries, matching the REST request
    boolQueryBuilder.should(new TermQueryBuilder("title.keyword", "My dog barks"));
    boolQueryBuilder.should(new TermQueryBuilder("content.keyword", "barking dogs"));
    boolQueryBuilder.minimumShouldMatch(1);
    searchSourceBuilder.query(boolQueryBuilder);
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}