Elasticsearch cluster, IK analyzer, and synonyms

wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.3/elasticsearch-2.3.3.tar.gz



Cluster installation:

Three nodes: master, slave1, slave2



vi elasticsearch.yml


cluster.name: my-application

node.name: node-3   # must be unique for each node

network.host: 192.168.137.117

http.port: 9200

discovery.zen.ping.unicast.hosts: ["master","slave1", "slave2"]
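The three nodes differ only in node.name and network.host, so the per-node files can be generated from one template. The following is a minimal Python sketch of that idea. The names node-1 and node-2, the mapping of node names to hosts, and the use of hostnames for network.host are assumptions made for illustration; the notes above only show node-3 at 192.168.137.117.

```python
# Sketch: render one elasticsearch.yml per node from a shared template.
# Node names and host values passed in below are illustrative assumptions.
CLUSTER_NAME = "my-application"
UNICAST_HOSTS = ["master", "slave1", "slave2"]

TEMPLATE = """cluster.name: {cluster}
node.name: {name}
network.host: {host}
http.port: 9200
discovery.zen.ping.unicast.hosts: [{peers}]
"""


def render_config(name, host):
    """Return the elasticsearch.yml text for a single node."""
    peers = ", ".join('"{}"'.format(h) for h in UNICAST_HOSTS)
    return TEMPLATE.format(cluster=CLUSTER_NAME, name=name, host=host, peers=peers)


def render_all(nodes):
    """nodes maps node.name -> network.host; dict keys keep the names unique."""
    return {name: render_config(name, host) for name, host in nodes.items()}


if __name__ == "__main__":
    configs = render_all({"node-1": "master", "node-2": "slave1", "node-3": "slave2"})
    for name, text in configs.items():
        print("# ---", name)
        print(text)
```

Keying the input by node name makes a duplicate node.name impossible by construction, which is the one property the configuration requires.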





Install plugins

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install analysis-icu

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install mobz/elasticsearch-head

marvel:

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install license

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install marvel-agent



Run on every node:

elasticsearch -d



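Stop a node (plain kill sends SIGTERM so Elasticsearch can shut down cleanly; use kill -9 only if the process does not exit)

kill $(ps -ef | grep '[e]lasticsearch' | awk '{print $2}')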


Start a node

/home/qun/soft/elasticsearch-2.3.3/bin/elasticsearch -d



Access the cluster:

http://master:9200/_plugin/head/


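A node is a single running Elasticsearch instance, and a cluster consists of one or more nodes that share the same cluster.name. The nodes work together to share data and load. When a node joins or leaves, the cluster detects the change and rebalances its data.

As users, we can talk to any node in the cluster, including the master node. Every node knows which node holds each document and can forward a request to the node that has it. The node we contacted collects the responses from the other nodes and returns the combined result to the client. Elasticsearch handles all of this transparently.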


Get the cluster health

http://master:9200/_cluster/health/


{
"cluster_name": "my-application",
"status": "green",
"timed_out": false,
"number_of_nodes": 3,
"number_of_data_nodes": 3,
"active_primary_shards": 22,
"active_shards": 44,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
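In the response above, "green" means every primary and replica shard is allocated, "yellow" means all primaries are allocated but some replicas are not, and "red" means at least one primary is missing. As a minimal sketch, a script could check a health response like this one. The function name health_problems and the expected node count are assumptions for illustration, and the sample is a trimmed copy of the response above.

```python
import json

# Trimmed copy of the cluster health response shown above.
SAMPLE_HEALTH = """
{
  "cluster_name": "my-application",
  "status": "green",
  "number_of_nodes": 3,
  "active_primary_shards": 22,
  "active_shards": 44,
  "unassigned_shards": 0
}
"""


def health_problems(health, expected_nodes):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    if health["status"] != "green":
        problems.append("status is " + health["status"])
    if health["number_of_nodes"] != expected_nodes:
        problems.append(
            "expected {} nodes, found {}".format(expected_nodes, health["number_of_nodes"])
        )
    if health["unassigned_shards"] > 0:
        problems.append("{} unassigned shards".format(health["unassigned_shards"]))
    return problems


if __name__ == "__main__":
    print(health_problems(json.loads(SAMPLE_HEALTH), expected_nodes=3))
```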




Change the number of replica shards

PUT /blogs/_settings

{

   "number_of_replicas" : 2

}
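The effect of number_of_replicas on the shard count is simple arithmetic: every primary gets that many copies, and copies of the same shard are never placed on the same node. The sketch below works through the numbers. The 5 primaries assumed for the blogs index are the Elasticsearch 2.x default, not a value stated in these notes.

```python
def total_shards(primaries, replicas):
    """Primaries plus all of their replica copies."""
    return primaries * (1 + replicas)


def max_assignable_replicas(data_nodes):
    """A primary and its replicas must all sit on different nodes."""
    return max(data_nodes - 1, 0)


if __name__ == "__main__":
    # Health output above: 22 primaries and 44 active shards, i.e. 1 replica each.
    print(total_shards(22, 1))
    # Assumed blogs index with 5 primaries after setting number_of_replicas to 2.
    print(total_shards(5, 2))
    # With 3 data nodes, 2 replicas per primary is the most that can be assigned.
    print(max_assignable_replicas(3))
```

On this three-node cluster a value of 2 is therefore the largest setting that still leaves the cluster green; a higher value would leave replicas unassigned and turn the status yellow.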




Delete indices

curl -XDELETE 'http://master:9200/.marvel-es-1-2016.05.29' 

curl -XDELETE 'http://master:9200/.marvel-es-data-1' 
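The Marvel index deleted above carries its date in the name (.marvel-es-1-2016.05.29), so old ones can be selected by parsing that suffix. The following sketch picks the daily indices older than a retention window and builds the URLs to send DELETE requests to. The function names, the retention period, and the list of index names are assumptions for illustration; no request is actually sent.

```python
from datetime import date, datetime

# Prefix taken from the index name used in the curl command above.
PREFIX = ".marvel-es-1-"


def expired_marvel_indices(names, today, keep_days):
    """Return the daily Marvel indices older than keep_days, sorted by name."""
    expired = []
    for name in names:
        if not name.startswith(PREFIX):
            continue  # not a daily Marvel index
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if (today - day).days > keep_days:
            expired.append(name)
    return sorted(expired)


def delete_urls(names, base="http://master:9200"):
    """URLs that would each receive an HTTP DELETE."""
    return ["{}/{}".format(base, name) for name in names]


if __name__ == "__main__":
    indices = [".marvel-es-1-2016.05.29", ".marvel-es-1-2016.06.05",
               ".marvel-es-data-1", "blogs"]
    old = expired_marvel_indices(indices, today=date(2016, 6, 6), keep_days=3)
    for url in delete_urls(old):
        print("DELETE", url)
```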






Install the IK analyzer (https://github.com/medcl/elasticsearch-analysis-ik)

cd /home/qun/soft/elasticsearch-2.3.3

wget https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip

unzip master.zip

cd elasticsearch-analysis-ik-master

mvn package

mkdir -p /home/qun/soft/elasticsearch-2.3.3/plugins/ik

cp /home/qun/soft/elasticsearch-2.3.3/elasticsearch-analysis-ik-master/target/releases/elasticsearch-analysis-ik-1.9.3.zip /home/qun/soft/elasticsearch-2.3.3/plugins/ik

cd /home/qun/soft/elasticsearch-2.3.3/plugins/ik

unzip elasticsearch-analysis-ik-1.9.3.zip


Test tokenization (compare the standard analyzer with IK on the same text)

GET /twitter/_analyze?analyzer=standard&pretty=true&text=我是中國人

GET /twitter/_analyze?analyzer=ik&pretty=true&text=我是中國人
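When these requests are sent from a script or from curl rather than a browser plugin, the Chinese text in the query string has to be percent-encoded as UTF-8. A minimal sketch of building such a URL with the Python standard library follows. The function name analyze_url and the base address are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def analyze_url(index, analyzer, text, base="http://master:9200"):
    """Build an _analyze URL with a percent-encoded query string."""
    query = urlencode({"analyzer": analyzer, "pretty": "true", "text": text})
    return "{}/{}/_analyze?{}".format(base, index, query)


if __name__ == "__main__":
    for analyzer in ("standard", "ik"):
        url = analyze_url("twitter", analyzer, "我是中國人")
        print(url)
        # Decoding the query again recovers the original text.
        print(parse_qs(urlparse(url).query)["text"][0])
```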



Add user-defined dictionaries:

elasticsearch-2.3.3/plugins/ik/config/IKAnalyzer.cfg.xml

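Example: add sougou.dic. Entries are separated by semicolons, paths are relative, and the ES cluster must be restarted for the change to take effect.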

<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionaries here -->
<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>
 <!-- Configure your own extension stopword dictionaries here -->
<entry key="ext_stopwords">custom/ext_stopword.dic</entry>
<!-- Configure a remote extension dictionary here -->
<!-- <entry key="remote_ext_dict">words_location</entry> -->
<!-- Configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
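Because a typo in this file only shows up after a cluster restart, it can help to list the configured dictionaries first and confirm that each file exists. The following sketch reads the ext_dict and ext_stopwords entries with the Python standard library. The function name dictionary_paths is an assumption for illustration, and the embedded sample is a shortened copy of the file above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of IKAnalyzer.cfg.xml; bytes so the encoding declaration is honoured.
SAMPLE_CFG = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>
<entry key="ext_stopwords">custom/ext_stopword.dic</entry>
</properties>
"""


def dictionary_paths(xml_bytes, key):
    """Return the semicolon-separated paths stored under the given entry key."""
    root = ET.fromstring(xml_bytes)
    for entry in root.findall("entry"):
        if entry.get("key") == key:
            return [p.strip() for p in (entry.text or "").split(";") if p.strip()]
    return []  # key absent or commented out


if __name__ == "__main__":
    for key in ("ext_dict", "ext_stopwords", "remote_ext_dict"):
        print(key, dictionary_paths(SAMPLE_CFG, key))
```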





Configure synonyms

Edit elasticsearch-2.3.3/config/elasticsearch.yml and append the following at the end of the file

index:
  analysis:
    analyzer:
      ik_syno:
          type: custom
          tokenizer: ik_max_word
          filter: [my_synonym_filter]
      ik_syno_smart:
          type: custom
          tokenizer: ik_smart
          filter: [my_synonym_filter]
    filter:
      my_synonym_filter:
          type: synonym
          synonyms_path: analysis/synonym.txt
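The same analyzers can also be declared per index, in the settings sent when the index is created, instead of node-wide in elasticsearch.yml. The sketch below builds that request body as a Python dictionary mirroring the YAML above. The function name is an assumption for illustration, and the body is only printed, not sent.

```python
import json


def synonym_index_settings(synonyms_path="analysis/synonym.txt"):
    """Index creation body equivalent to the YAML analysis block above."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "ik_syno": {
                        "type": "custom",
                        "tokenizer": "ik_max_word",
                        "filter": ["my_synonym_filter"],
                    },
                    "ik_syno_smart": {
                        "type": "custom",
                        "tokenizer": "ik_smart",
                        "filter": ["my_synonym_filter"],
                    },
                },
                "filter": {
                    "my_synonym_filter": {
                        "type": "synonym",
                        "synonyms_path": synonyms_path,
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(synonym_index_settings(), indent=2))
```

The printed JSON could serve as the body of a PUT request that creates an index; either way the synonym file still has to exist under the config directory on every node.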



Add the synonym dictionary:

mkdir -p elasticsearch-2.3.3/config/analysis

vi elasticsearch-2.3.3/config/analysis/synonym.txt

ipod, i-pod, i pod

foozball , foosball

universe , cosmos

西紅柿, 番茄

馬鈴薯, 土豆
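Each line of the file above is a comma-separated group of terms that are treated as equivalent, and the separator must be the ASCII comma. To make the format concrete, the sketch below parses such a file into groups and looks up the synonyms of a term. It is a simplified reader written for illustration, not the parser Elasticsearch uses, and it skips explicit "=>" mappings.

```python
SAMPLE_SYNONYMS = """
ipod, i-pod, i pod
foozball , foosball
universe , cosmos
西紅柿, 番茄
馬鈴薯, 土豆
"""


def parse_synonyms(text):
    """Return one list of equivalent terms per non-empty, non-comment line."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if "=>" in line:
            continue  # explicit mappings are not handled in this sketch
        terms = [t.strip() for t in line.split(",") if t.strip()]
        if len(terms) > 1:
            groups.append(terms)
    return groups


def expand(term, groups):
    """Return the whole group containing term, or the term alone."""
    for group in groups:
        if term in group:
            return group
    return [term]


if __name__ == "__main__":
    groups = parse_synonyms(SAMPLE_SYNONYMS)
    print(groups)
    print(expand("番茄", groups))
```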



Test the synonyms:

GET   /iktest/_analyze?analyzer=ik_syno_smart&pretty=true&text=馬鈴薯西紅柿

Result:

{
"tokens": [
{
"token": "馬鈴薯",
"start_offset": 0,
"end_offset": 3,
"type": "CN_WORD",
"position": 0
}
,
{
"token": "土豆",
"start_offset": 0,
"end_offset": 3,
"type": "SYNONYM",
"position": 0
}
,
{
"token": "西紅柿",
"start_offset": 3,
"end_offset": 6,
"type": "CN_WORD",
"position": 1
}
,
{
"token": "番茄",
"start_offset": 3,
"end_offset": 6,
"type": "SYNONYM",
"position": 1
}
]
}
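In the result above, each synonym is emitted at the same position and with the same offsets as the word it stands for, which is what lets a search for either term match the same place in the text. The following sketch groups the tokens of such a response by position to make that visible. The function name is an assumption for illustration, and the sample is a reduced copy of the response above.

```python
import json

# Reduced copy of the _analyze response shown above.
SAMPLE_RESPONSE = """
{"tokens": [
  {"token": "馬鈴薯", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0},
  {"token": "土豆", "start_offset": 0, "end_offset": 3, "type": "SYNONYM", "position": 0},
  {"token": "西紅柿", "start_offset": 3, "end_offset": 6, "type": "CN_WORD", "position": 1},
  {"token": "番茄", "start_offset": 3, "end_offset": 6, "type": "SYNONYM", "position": 1}
]}
"""


def group_by_position(response):
    """Return the tokens at each position, ordered by position."""
    grouped = {}
    for tok in response["tokens"]:
        grouped.setdefault(tok["position"], []).append(tok["token"])
    return [grouped[pos] for pos in sorted(grouped)]


if __name__ == "__main__":
    for terms in group_by_position(json.loads(SAMPLE_RESPONSE)):
        print(" | ".join(terms))
```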

