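Setting up a ZooKeeper cluster on Linux

I. Introduction to ZooKeeper clusters

Key characteristic of a ZooKeeper cluster: replication

Using an odd number of servers is strongly recommended, with each server running on its own machine. A fault-tolerant setup needs at least 3 servers.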

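Notes:

1. Always deploy a ZooKeeper cluster with 2xF+1 servers (an odd number), so that it can tolerate F failed servers (crashed or otherwise unavailable). For example, with three servers, at most one can go down while the cluster keeps working normally.

2. When designing a ZooKeeper cluster, maximize reliability and fault tolerance by placing the servers in different data centers where possible, or at least on different switches within the same data center, so that a single switch failure cannot make the whole cluster unavailable.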
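The 2xF+1 rule in note 1 is plain integer arithmetic, so it can be checked with a small bash sketch. The function names quorum and tolerated_failures are made up for this illustration and are not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# quorum N: smallest number of servers that forms a majority of N
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many of N servers may fail while a majority survives
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5 6; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

An even-sized cluster tolerates no more failures than the odd-sized cluster one server smaller, which is why odd sizes are preferred.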

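II. Cluster configuration

Settings that need attention

1. initLimit:

The maximum number of ticks (heartbeat intervals) that a follower may take to connect to the leader and complete its initial synchronization.

If the cluster holds a lot of data and synchronization takes a long time, increase this value accordingly.

Note that ZooKeeper defines these time limits as multiples of tickTime: with initLimit=2, the maximum time allowed is tickTime*2.

2. syncLimit:

The maximum number of ticks that may pass between a follower sending a request to the leader and receiving the response.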
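As a rough illustration of how these limits turn into wall-clock time, the bash sketch below multiplies them out. The values are the ones used in the zoo.cfg later in this guide; limit_ms is a hypothetical helper name.

```shell
#!/usr/bin/env bash
tick_time=2000   # tickTime, in milliseconds
init_limit=10    # initLimit, in ticks
sync_limit=5     # syncLimit, in ticks

# limit_ms TICK_MS TICKS: convert a limit given in ticks to milliseconds
limit_ms() {
  echo $(( $1 * $2 ))
}

echo "initLimit allows $(limit_ms "$tick_time" "$init_limit") ms"
echo "syncLimit allows $(limit_ms "$tick_time" "$sync_limit") ms"
```

With these values a follower gets 20 seconds for its initial synchronization and 10 seconds per request/response round trip.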

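Cluster node entries

server.id=host:port:port

id: each machine gets its server id from a file named myid in its own dataDir. The id is a number between 1 and 255, typically 1, 2, 3 and so on.

The two ports: followers use the first to connect to the leader; the second is used for leader election.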
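To make the entry format concrete, here is a minimal bash sketch that splits one server line into its parts. parse_server_entry and the field names peer_port and election_port are labels chosen for this example, not ZooKeeper terminology.

```shell
#!/usr/bin/env bash
# parse_server_entry LINE: split "server.ID=HOST:PORT:PORT" into named fields
parse_server_entry() {
  local line=$1 key value id host peer_port election_port
  key=${line%%=*}        # text before the first "="
  value=${line#*=}       # text after the first "="
  key=${key// /}         # drop spaces around the key
  value=${value// /}     # drop spaces around the value
  id=${key#server.}
  IFS=: read -r host peer_port election_port <<< "$value"
  echo "id=$id host=$host peer_port=$peer_port election_port=$election_port"
}

parse_server_entry "server.1= 10.123.14.32:9612:9613"
```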

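Notes: understanding leader election

Leader election is one of the most important and most complex parts of ZooKeeper startup. So what is leader election, why does ZooKeeper need it, and how does it proceed?

Leader election is easy to picture: it works like a presidential election, where everyone casts one vote and whoever wins the majority is elected. A ZooKeeper cluster is the same: every node votes, and a node that receives votes from more than half of the nodes becomes the leader.

A simple example illustrates the whole process.

Suppose a ZooKeeper cluster consists of five servers with ids 1 to 5. All of them are freshly started with no historical data, so they are equal in terms of stored data. Suppose the servers start in order, and watch what happens.

1) Server 1 starts. It is the only server running, so the messages it sends get no response and it stays in the LOOKING state.

2) Server 2 starts and exchanges election results with server 1. Neither has historical data, so server 2, with the larger id, wins the exchange. However, fewer than a majority of the servers (3 in this example) have voted for it, so servers 1 and 2 both remain in the LOOKING state.

3) Server 3 starts. By the same reasoning it comes out on top among servers 1, 2 and 3, and this time three servers have voted for it, so it becomes the leader of this election.

4) Server 4 starts. In theory it would rank highest among servers 1, 2, 3 and 4, but a majority has already elected server 3, so it can only join as a follower.

5) Server 5 starts and, like server 4, becomes a follower.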
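The walkthrough above can be replayed with a toy bash model. This is only a sketch of the simplified story told here (no history, servers start in id order, every running server backs the highest running id); the real election also compares epochs and transaction ids before falling back to the server id. simulate_election is a hypothetical name.

```shell
#!/usr/bin/env bash
# simulate_election TOTAL: start servers 1..TOTAL in order and print each outcome
simulate_election() {
  local total=$1 leader=0 running id
  local quorum=$(( total / 2 + 1 ))
  for (( id = 1; id <= total; id++ )); do
    running=$id   # servers 1..id are now up
    if (( leader > 0 )); then
      echo "server $id starts: leader $leader already elected, FOLLOWING"
    elif (( running >= quorum )); then
      # every running server votes for the highest id, which is the newcomer
      leader=$id
      echo "server $id starts: $running votes >= quorum $quorum, LEADING"
    else
      echo "server $id starts: only $running of $quorum required votes, LOOKING"
    fi
  done
}

simulate_election 5
```

For five servers the model elects server 3, matching steps 1) to 5).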

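III. Setting up the ZooKeeper cluster

1> Prepare the machines and ports

Note on firewalls: either turn the firewall off or allow the required ports through it. Allowing the ports is the preferred option.

Allow the ports ZooKeeper uses (9611, 9612 and 9613 in this guide's zoo.cfg) through the firewall. With iptables the configuration is in /etc/sysconfig/iptables; CentOS 7 uses firewalld by default.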
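For firewalld, opening the ports might look like the sketch below. It only prints the commands so they can be reviewed first; print_firewall_commands is a hypothetical helper, and the public zone and the port numbers are assumptions taken from the examples elsewhere in this guide.

```shell
#!/usr/bin/env bash
# print_firewall_commands PORT...: print the firewall-cmd calls that open each TCP port
print_firewall_commands() {
  local port
  for port in "$@"; do
    echo "firewall-cmd --permanent --zone=public --add-port=${port}/tcp"
  done
  # permanent rules only take effect after a reload
  echo "firewall-cmd --reload"
}

print_firewall_commands 9611 9612 9613
```

Once the printed commands look right, they can be run as root on every node.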

2> Install the JDK (steps omitted here).

3> Upload the ZooKeeper archive to the server (this guide uses zookeeper-3.4.10.tar.gz).

Download URL:

https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz

4> Extract the archive into /opt/zookeeper/ (the location depends on your company's conventions; our components all go under this directory by default)

tar -zxvf zookeeper-3.4.10.tar.gz -C /opt/zookeeper/

5> In zookeeper-3.4.10/conf, rename zoo_sample.cfg to zoo.cfg

mv zoo_sample.cfg zoo.cfg

6> Edit /opt/zookeeper/zookeeper-3.4.10/conf/zoo.cfg

vim /opt/zookeeper/zookeeper-3.4.10/conf/zoo.cfg

zoo.cfg is configured as follows (this example is the configuration of our external-network environment):

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes. (data directory; must be an absolute path)
dataDir=/data02/zookeeper/data
# transaction log directory
dataLogDir=/data01/zookeeper/log
# the port at which the clients will connect
# (client port of this instance)
clientPort=9611
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1= 10.123.14.32:9612:9613
server.2= 10.123.14.33:9612:9613
server.3= 10.123.14.34:9612:9613

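Parameter notes:

10.123.14.xx is the IP address of each ZooKeeper instance
9612 is the port the ZooKeeper instances use to communicate with each other (followers connect to the leader on it)
9613 is the port the ZooKeeper instances use for leader election

Note: the zookeeper directory under /data01 must be created manually (mkdir zookeeper).

The zookeeper/data directory under /data02 must also be created manually.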

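7> Copy the /opt/zookeeper/zookeeper-3.4.10 directory to the other two machines (either repeat the steps above on each machine or use the commands below, once per target machine)

scp -r /opt/zookeeper/zookeeper-3.4.10 [email protected]:/opt/zookeeper/
scp -r /opt/zookeeper/zookeeper-3.4.10 [email protected]:/opt/zookeeper/

Note: the /opt/zookeeper directory must already exist on the target machines (run mkdir zookeeper under /opt), otherwise the copy fails because the path cannot be found.

8> On each instance, create a file named myid in the directory that dataDir points to in zoo.cfg (here /data02/zookeeper/data). The file contains that instance's id: 1 for the first instance, 2 for the second, and so on (the X in server.X= 10.123.14.32:9612:9613).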
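Step 8 can be scripted. The bash sketch below creates the directory if needed and writes the id; write_myid is a hypothetical helper, and the example runs against a scratch directory so that it is safe to try anywhere.

```shell
#!/usr/bin/env bash
# write_myid DATA_DIR ID: create DATA_DIR if needed and write ID into DATA_DIR/myid
write_myid() {
  local data_dir=$1 id=$2
  case $id in
    ''|*[!0-9]*) echo "id must be numeric: $id" >&2; return 1 ;;
  esac
  mkdir -p "$data_dir" || return 1
  echo "$id" > "$data_dir/myid"
}

# Example against a scratch directory; on a real node pass the dataDir from zoo.cfg
scratch=$(mktemp -d)
write_myid "$scratch/zookeeper/data" 1
cat "$scratch/zookeeper/data/myid"
```

On the three nodes of this guide the calls would be write_myid /data02/zookeeper/data 1, 2 and 3 respectively.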

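Notes: dataDir and dataLogDir

The ZooKeeper documentation recommends setting dataLogDir so that the transaction log is stored separately. If only dataDir is set, ZooKeeper writes both its snapshots and its transaction logs under dataDir. What is the transaction log for? It is effectively ZooKeeper's database: whenever ZooKeeper needs to reload or recover its state, it reads it back from these files.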

9> Set the environment variables: vim /etc/profile (the JDK variables can be left alone if they are already set)

export ZOOKEEPER_HOME=/opt/zookeeper/zookeeper-3.4.10
export PATH=$ZOOKEEPER_HOME/bin:$PATH

Apply the changes:

source /etc/profile

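10> Add a zookeeper user and set its password (this depends on your company's policy; avoid using root where possible to reduce risk)

useradd zookeeper
passwd  zookeeper

To delete a user: userdel <username>

11> Change the owner of the zookeeper directory under /opt to the zookeeper user so that this user has permission on the files

Run the following under /opt (the data and transaction log directories need the same change):

chown -R  zookeeper:zookeeper zookeeper/

12> Switch from the root user to the zookeeper user

su zookeeper

13> Start the cluster: as the zookeeper user, start each ZooKeeper instance in the cluster (starting as a non-root user reduces risk)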

bin/zkServer.sh start

14> Check the ZooKeeper status

bin/zkServer.sh status
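When checking several nodes it helps to reduce the status output to just the role. A minimal sketch, assuming the output contains a line of the form "Mode: follower" or "Mode: leader"; parse_mode is a hypothetical helper.

```shell
#!/usr/bin/env bash
# parse_mode: read status output on stdin and print the value after "Mode:"
parse_mode() {
  sed -n 's/^Mode: *//p'
}

# Sample input standing in for real status output
printf 'Mode: follower\n' | parse_mode
```

On a node this would be used as bin/zkServer.sh status 2>/dev/null | parse_mode; a healthy three-node cluster should report one leader and two followers.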

The setup is now complete.

IV. Testing the ZooKeeper cluster

Connect with the client:

./bin/zkCli.sh -server 10.123.14.33:9611

Some basic commands

[zk: 10.123.14.33:9611(CONNECTED) 2] create /zk_test hello
Created /zk_test
[zk: 10.123.14.33:9611(CONNECTED) 3] get /zk_test
hello
cZxid = 0x100000003
ctime = Wed Sep 19 18:04:00 CST 2018
mZxid = 0x100000003
mtime = Wed Sep 19 18:04:00 CST 2018
pZxid = 0x100000003
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
[zk: 10.123.14.33:9611(CONNECTED) 4] set /zk_test fuck
cZxid = 0x100000007
ctime = Wed Sep 19 18:09:32 CST 2018
mZxid = 0x100000008
mtime = Wed Sep 19 18:09:40 CST 2018
pZxid = 0x100000007
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 0
[zk: 10.123.14.33:9611(CONNECTED) 5] get /zk_test
fuck
cZxid = 0x100000007
ctime = Wed Sep 19 18:09:32 CST 2018
mZxid = 0x100000008
mtime = Wed Sep 19 18:09:40 CST 2018
pZxid = 0x100000007
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 0
[zk: 10.123.14.33:9611(CONNECTED) 6] delete /zk_test
[zk: 10.123.14.33:9611(CONNECTED) 7] ls /
[zookeeper]
[zk: 10.123.14.33:9611(CONNECTED) 8]

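V. Testing ZooKeeper high availability

Setup: a test environment with 3 servers. Once all three are fully started, the status shows two followers and one leader.
High-availability test:

1. Kill one of the three ZooKeeper processes with kill -9. The cluster keeps running normally.

2. Kill two of the three ZooKeeper processes with kill -9. The whole cluster becomes unavailable.

Conclusion: with three servers, the cluster keeps running if one goes down, but not if two go down.

The official documentation explains this as follows: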

Cross Machine Requirements
For the ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines. Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures. Note that a deployment of six machines can only handle two failures since three machines is not a majority. For this reason, ZooKeeper deployments are usually made up of an odd number of machines.
To achieve the highest probability of tolerating a failure you should try to make machine failures independent. For example, if most of the machines share the same switch, failure of that switch could cause a correlated failure and bring down the service. The same holds true of shared power circuits, cooling systems, etc.

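In summary: to tolerate F failed machines, deploy 2xF+1 machines, and keep machine failures independent by not putting the machines behind the same switch, power circuit or cooling system.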

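VI. Troubleshooting installation and startup errors

1> Connection errors (check that the service started properly and that the ports are open in the firewall)

Opening ports:

Check whether port 80 is open: firewall-cmd --query-port=80/tcp
Permanently open port 80: firewall-cmd --permanent --zone=public --add-port=80/tcp
Remove port 80: firewall-cmd --permanent --zone=public --remove-port=80/tcp
Reload so that permanent changes take effect: firewall-cmd --reload
--zone # the zone the rule applies to
--add-port=80/tcp # port to add, in the form port/protocol
--permanent # makes the rule persistent; without it the rule is lost after a restart

Check the firewall status
systemctl status firewalld.service
Start | stop | restart the firewall
systemctl [start|stop|restart] firewalld.service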

2>Error contacting service. It is probably not running.

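The answers found online vary; they are collected here:

1. A directory named in zoo.cfg has not been created. Create it.

2. The myid file must be in the directory that dataDir points to in zoo.cfg.

The content of myid must match the number you used in place of ? in server.?=localhost:2888:3888.

3. Stop the firewall with service iptables stop

and confirm with service iptables status (on CentOS 7 with firewalld, use the systemctl commands from 1> instead).

4. Open zkServer.sh and find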

status)

STAT=`echo stat | nc localhost $(grep clientPort "$ZOOCFG" | sed -e 's/.*=//') 2> /dev/null| grep Mode`

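Add -q 1 between nc and localhost (the digit 1, not the letter l).

If -q 1 is already there, remove it.

5. The client port (2181 by default, 9611 in this guide) is already in use. This is the one that cost me a very long time.

zkServer.sh stop # stop ZooKeeper first

netstat -an | grep 9611 # check whether the port is in use; if it is,

clientPort = 9611 # set clientPort in zoo.cfg to any port that is free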
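Causes 1 and 2 can be checked mechanically. The bash sketch below reads dataDir from a zoo.cfg, verifies that the directory and its myid file exist, and confirms that the id has a matching server entry. check_zk_config is a hypothetical helper, and the example builds a throwaway configuration in a scratch directory instead of touching a real node.

```shell
#!/usr/bin/env bash
# check_zk_config CFG: sanity-check dataDir and myid against the server.N entries in CFG
check_zk_config() {
  local cfg=$1 data_dir id
  # value after "dataDir=", with stray spaces and tabs removed
  data_dir=$(sed -n 's/^dataDir *= *//p' "$cfg" | tr -d ' \t')
  if [ ! -d "$data_dir" ]; then
    echo "dataDir does not exist: $data_dir"; return 1
  fi
  if [ ! -f "$data_dir/myid" ]; then
    echo "missing $data_dir/myid"; return 1
  fi
  id=$(tr -d ' \t\n' < "$data_dir/myid")
  if ! grep -q "^server\.$id *=" "$cfg"; then
    echo "myid $id has no matching server.$id entry"; return 1
  fi
  echo "ok: myid $id matches server.$id"
}

# Throwaway example configuration
scratch=$(mktemp -d)
mkdir -p "$scratch/data"
echo 2 > "$scratch/data/myid"
printf 'dataDir=%s\nserver.1=10.123.14.32:9612:9613\nserver.2=10.123.14.33:9612:9613\n' \
  "$scratch/data" > "$scratch/zoo.cfg"
check_zk_config "$scratch/zoo.cfg"
```

On a real node the call would be check_zk_config /opt/zookeeper/zookeeper-3.4.10/conf/zoo.cfg.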
