Elasticsearch Index Structure and Configuration Tuning

Original

Garry1115

2020-06-17 07:23

Some simple tuning tips for Elasticsearch index structure and configuration.

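1. Boost specific fields at search time so that less relevant fields carry less weight

For example, if we consider the title more important when searching, we can raise the match weight of the title: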

boolQuery.must(
        QueryBuilders.matchQuery(HouseIndexKey.TITLE, rentSearch.getKeywords())
                .boost(2.0f)
);
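For readers who call the REST API directly, the same boost can be written in the query DSL. The sketch below builds that request body as a string in plain Java. The class and method names are made up for illustration and are not part of the Elasticsearch client, and the escaping is deliberately minimal.

```java
public class BoostedMatchBody {

    // Escapes backslashes and double quotes so the value can sit inside a JSON string.
    // Illustrative only: control characters are not handled.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a bool query whose single must clause is a match query on `field`
    // with the given boost.
    static String build(String field, String keywords, float boost) {
        return "{\"query\":{\"bool\":{\"must\":[{\"match\":{\""
                + escape(field) + "\":{\"query\":\"" + escape(keywords)
                + "\",\"boost\":" + boost + "}}}]}}}";
    }

    public static void main(String[] args) {
        // "title" and the keywords are example values.
        System.out.println(build("title", "near subway", 2.0f));
    }
}
```

A boost above 1 raises the weight of that clause relative to the other clauses in the same bool query, and a boost between 0 and 1 lowers it.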

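2. Elasticsearch is generally used only for retrieval and is not well suited to storing the original records, so after a search we only need the id (for example houseId) rather than the whole result set

Returning every field of every hit hurts performance badly once the data volume grows, so fetching only the id is enough. We can do this with

setFetchSource(HouseIndexKey.HOUSE_ID, null), which returns only houseId.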

SearchRequestBuilder requestBuilder = this.esClient.prepareSearch(INDEX_NAME)
        .setTypes(INDEX_TYPE)
        .setQuery(boolQuery)
        .addSort(
                HouseSort.getSortKey(rentSearch.getOrderBy()),
                SortOrder.fromString(rentSearch.getOrderDirection())
        )
        .setFrom(rentSearch.getStart())
        .setSize(rentSearch.getSize())
        .setFetchSource(HouseIndexKey.HOUSE_ID, null);
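Once only ids come back, the full records have to be loaded from the primary store, which usually returns rows in its own order. The following is a minimal sketch of putting them back into the order Elasticsearch ranked them. The House class and the method name are hypothetical stand-ins, not code from the original project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HydrateByIds {

    // Hypothetical stand-in for a row loaded from the primary database.
    static final class House {
        final long id;
        final String title;
        House(long id, String title) { this.id = id; this.title = title; }
    }

    // Returns the loaded rows in the order of the ranked ids.
    // Ids with no matching row (for example, deleted after indexing) are skipped.
    static List<House> inSearchOrder(List<Long> rankedIds, List<House> rows) {
        Map<Long, House> byId = new HashMap<>();
        for (House h : rows) {
            byId.put(h.id, h);
        }
        List<House> result = new ArrayList<>();
        for (Long id : rankedIds) {
            House h = byId.get(id);
            if (h != null) {
                result.add(h);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> rankedIds = List.of(42L, 7L, 19L);   // ids taken from the search hits
        List<House> rows = List.of(new House(7L, "B"), new House(42L, "A")); // database order
        for (House h : inSearchOrder(rankedIds, rows)) {
            System.out.println(h.id + " " + h.title);
        }
    }
}
```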

3. Index structure tuning

Index read/write optimization: use niofs as the index store type

"index.store.type": "niofs",

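Use strict mode for dynamic mapping, so that once the index structure is stable, a document containing an unmapped field is rejected instead of silently changing the mapping. While the structure is still changing you can set it to true (new fields are added to the mapping automatically) or false (new fields are ignored and not indexed).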

"dynamic": "strict",

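Disable the _all field, so that all string fields are not concatenated into one field for full-text search, which hurts search performance (the _all field is deprecated in 6.x and can no longer be enabled for indices created in 6.0 or later).

In older versions an index has the _all field by default, and the content of every field is copied into it. This makes querying convenient but increases indexing time and index size.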

"_all": {
  "enabled": false
},

Set the default query field

"index.query.default_field": "title"

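Set the delayed allocation timeout for a node that leaves the cluster (5m), so that a short network problem does not immediately make the cluster reallocate that node's shards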

"index.unassigned.node_left.delayed_timeout": "5m"

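Note: the shard and replica settings depend on the size of the cluster. The index settings below use 0 replicas and 5 shards because I tested on a single node; if you run a multi-node cluster, remember to change these two parameters.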

{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 5,
    "index.store.type": "niofs",
    "index.query.default_field": "title",
    "index.unassigned.node_left.delayed_timeout": "5m"
  },
  "mappings": {
    "house": {
      "dynamic": "strict",
      "_all": {
        "enabled": false
      },
      "properties": {
        "houseId": {
          "type": "long"
        },
        "title": {
          "type": "text",
          "index": true,
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        },
        "price": {
          "type": "integer"
        },
        "area": {
          "type": "integer"
        },
        "createTime": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "lastUpdateTime": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "cityEnName": {
          "type": "keyword"
        },
        "regionEnName": {
          "type": "keyword"
        },
        "direction": {
          "type": "integer"
        },
        "distanceToSubway": {
          "type": "integer"
        },
        "subwayLineName": {
          "type": "keyword"
        },
        "subwayStationName": {
          "type": "keyword"
        },
        "tags": {
          "type": "text"
        },
        "street": {
          "type": "keyword"
        },
        "district": {
          "type": "keyword"
        },
        "description": {
          "type": "text",
          "index": true,
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        },
        "layoutDesc": {
          "type": "text",
          "index": true,
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        },
        "traffic": {
          "type": "text",
          "index": true,
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        },
        "roundService": {
          "type": "text",
          "index": true,
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        },
        "rentWay": {
          "type": "integer"
        },
        "suggest": {
          "type": "completion"
        },
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}
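As a rough aid for choosing the shard and replica numbers on a real cluster, the sketch below does the basic arithmetic. It is a simplified model with made-up class and method names: it only encodes that each primary has `replicas` extra copies and that two copies of the same shard are never placed on the same node. Disk watermarks, allocation filters, and node roles are ignored.

```java
public class ShardMath {

    // Total shard copies the cluster must host: every primary plus its replicas.
    static int totalShardCopies(int primaries, int replicas) {
        return primaries * (1 + replicas);
    }

    // Copies of the same shard never share a node, so all copies can be
    // assigned only when there are at least replicas + 1 data nodes.
    static boolean allCopiesAssignable(int dataNodes, int replicas) {
        return dataNodes >= replicas + 1;
    }

    public static void main(String[] args) {
        // Single-node test setup from the article: 5 shards, 0 replicas.
        System.out.println(totalShardCopies(5, 0));
        System.out.println(allCopiesAssignable(1, 0));
        // One replica on a single node: the replica copies would stay unassigned.
        System.out.println(allCopiesAssignable(1, 1));
    }
}
```

This is why 0 replicas suits a single-node test: with any replicas configured there, the replica copies cannot be assigned.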

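4. Configuration tuning

Disallow deleting indices by wildcard (deleting an index cannot be undone, so delete with care)

Send PUT http://10.0.2.19:9200/_cluster/settings

with the request body: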

{
    "transient": {
        "action.destructive_requires_name": true
    }
}

To check the setting, send GET http://10.0.2.19:9200/_cluster/settings

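Increase the refresh interval

Adjust the refresh interval. The point is to refresh less often and so reduce the potential disk-write overhead. The default refresh interval is 1s, which leads to low write throughput in write-heavy scenarios. Raising the interval moderately improves write throughput. The cost is that newly written data may take up to 30s (with the value below) to become searchable, so new data is visible less promptly.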

index.refresh_interval: 30s
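The refresh interval is an index-level setting, and since 5.0 such settings are applied per index (at creation time or through the update settings API) rather than in elasticsearch.yml. The sketch below builds, but does not send, the corresponding PUT request with the JDK's java.net.http client (Java 11 or later). The index name house_index and the class and method names are assumptions for illustration; the host is the one used earlier in this article.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RefreshIntervalUpdate {

    // JSON body for the update settings API.
    static String settingsBody(String interval) {
        return "{\"index\":{\"refresh_interval\":\"" + interval + "\"}}";
    }

    // Builds (but does not send) PUT {baseUrl}/{index}/_settings.
    static HttpRequest build(String baseUrl, String index, String interval) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(interval)))
                .build();
    }

    public static void main(String[] args) {
        // "house_index" is a hypothetical index name.
        HttpRequest request = build("http://10.0.2.19:9200", "house_index", "30s");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody("30s"));
        // To send it against a reachable cluster:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```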

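Cluster discovery timeout tuning (these are Zen discovery settings, which apply to versions before 7.0)

# Interval between liveness pings among nodes

discovery.zen.fd.ping_interval: 10s

# How long to wait for a ping response

discovery.zen.fd.ping_timeout: 120s

# Number of ping retries before a node is considered failed

discovery.zen.fd.ping_retries: 5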

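In addition, when the cluster has enough machines, you can configure the master node not to store data (small clusters usually let both master and non-master nodes act as data nodes). Decide according to your own workload.

Master node:

# Master node configuration
# Node name
node.name: master
# Whether the node is master-eligible
node.master: true
# Whether the node stores data (set to false to make it a non-data node, depending on your situation)
node.data: true

Data node:

# Data node configuration
node.name: slave1
node.master: false
node.data: true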

Disable HTTP on data nodes (turning off HTTP on data nodes and leaving only the TCP transport open reduces the request load on them)

http.enabled: false
