Hadoop + ZooKeeper High-Availability Configuration

**HDFS High Availability (HA) Implementation Approaches**

One approach is to keep a copy of the metadata maintained by the NameNode (NN) on NFS. When the NN fails, another NN can recover by reading the metadata backup from the NFS directory. Since this requires manual intervention, it is not a true HA solution.

Another approach is to prepare a standby NN node that backs up the NN by periodically downloading its metadata and log files; when the NN fails, recovery can be performed from this node. Because metadata and logs are not synchronized between the primary and standby in real time, some data will be lost.

Neither of the two approaches above is ideal, so the community provides a better one: a shared-edits scheme based on QJM (Quorum Journal Manager). The basic principle of QJM is that the active NN writes its edits both locally and to 2N+1 (an odd number of) JournalNodes, and a data operation is acknowledged as successful only after the corresponding edit has been written to a majority of the JournalNodes. This log is called the editlog, while the metadata itself lives in the fsimage file; the standby NN periodically reads the editlog from the JournalNodes and applies it locally. On top of this manual-failover foundation, an automatic failover mechanism based on ZooKeeper was developed: ZKFC (ZKFailoverController). The active and standby nodes each run a ZKFC process that monitors the health of its NN and sends periodic heartbeats; when the active node fails, the standby automatically switches over to become the active node. That is the scheme used in this post, as shown in the diagram below:

(figure: HDFS HA architecture with active/standby NameNodes, JournalNodes, and ZKFC)


**ZooKeeper Installation**
Install Hadoop by following the previous post in this series. Delete the data under /tmp on all nodes; once the configuration below is complete, simply re-initialize.
————————————————————————————————————
ZooKeeper requires a JDK. If it refuses to start even though the configuration is correct, the usual cause is a missing JDK: install one and restart.

Install it under the hadoop directory.
Three ZooKeeper nodes are required.
##############################################
Edit the cluster member list:
vim  conf/zoo.cfg 
————————————————————————————————————————————————————————————
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=172.25.15.7:2888:3888
server.2=172.25.15.8:2888:3888
server.3=172.25.15.9:2888:3888
————————————————————————————————————————————————————————————
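Only the tail of zoo.cfg is shown above. For reference, a minimal complete zoo.cfg would look like the sketch below; the tickTime/initLimit/syncLimit/clientPort values are the stock defaults from the shipped zoo_sample.cfg, and dataDir matches the directory created in the next step:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
server.1=172.25.15.7:2888:3888
server.2=172.25.15.8:2888:3888
server.3=172.25.15.9:2888:3888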
Create the data directory, and create the server-id file inside it:
mkdir   /tmp/zookeeper

echo 3 > /tmp/zookeeper/myid
The id must match this host's server.N entry in zoo.cfg.
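For reference, each node writes its own id into the dataDir, matching its server.N line above (a sketch; run each command on the host it names):
# on 172.25.15.7 (server.1)
echo 1 > /tmp/zookeeper/myid
# on 172.25.15.8 (server.2)
echo 2 > /tmp/zookeeper/myid
# on 172.25.15.9 (server.3)
echo 3 > /tmp/zookeeper/myid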
##############################################################

Start the service
Start it on all three hosts:
[root@server2 zookeeper-3.4.9]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Check the status: here the middle host is the leader and the others are followers.
————————————————————————————————————————————————————————————
[root@server3 zookeeper-3.4.9]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader

————————————————————————————————————————————————————————————
[root@server2 zookeeper-3.4.9]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
################################################################
Connect to ZooKeeper with the CLI and take a look:
[root@server3 zookeeper-3.4.9]# bin/zkCli.sh -server 127.0.0.1:2181
Connecting to 127.0.0.1:2181
————————————————————————————————————————————————————————————————
Inspect the data:
[zk: 127.0.0.1:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 1] get  /zookeeper/quota

cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
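As a further sanity check, a throwaway znode can be created and read back from the same CLI session (the /mytest name is arbitrary; get output abridged):
[zk: 127.0.0.1:2181(CONNECTED) 2] create /mytest hello
Created /mytest
[zk: 127.0.0.1:2181(CONNECTED) 3] get /mytest
hello
...
[zk: 127.0.0.1:2181(CONNECTED) 4] delete /mytest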


Configuring Hadoop


Configure on server1:
[root@server1 hadoop]# vim  etc/hadoop/core-site.xml 

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://masters</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>172.25.15.7:2181,172.25.15.8:2181,172.25.15.9:2181</value>
</property>
</configuration>
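With HA enabled, clients address HDFS through the logical nameservice URI (hdfs://masters, matching the nameservice defined in hdfs-site.xml below) instead of a single NameNode host. Once the cluster is up (startup steps below), a quick sanity check might be:
$ bin/hdfs dfs -ls hdfs://masters/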
————————————————————————————————————————————————————————————————————————
[root@server1 hadoop]# vim  etc/hadoop/hdfs-site.xml 

<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.nameservices</name>
<value>masters</value>
</property>

<property>
<name>dfs.ha.namenodes.masters</name>
<value>h1,h2</value>
</property>


<property>
<name>dfs.namenode.rpc-address.masters.h1</name>
<value>172.25.15.6:9000</value>
</property>


<property>
<name>dfs.namenode.http-address.masters.h1</name>
<value>172.25.15.6:50070</value>
</property>

<property>
<name>dfs.namenode.rpc-address.masters.h2</name>
<value>172.25.15.10:9000</value>
</property>

<property>
<name>dfs.namenode.http-address.masters.h2</name>
<value>172.25.15.10:50070</value>
</property>


<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://172.25.15.7:8485;172.25.15.8:8485;172.25.15.9:8485/masters</value>
</property>

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/tmp/journaldata</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>


<property>
<name>dfs.client.failover.proxy.provider.masters</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>


<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>


<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>

Starting the services

Start the HDFS cluster (start things in the order below):
1) Start the ZooKeeper cluster on each of the three DNs
$ bin/zkServer.sh start
[hadoop@server2 ~]$ jps
1222 QuorumPeerMain
1594 Jps
2) Start the journalnode on each of the three DNs (the first time HDFS is started, the journalnodes must be started first)
$ sbin/hadoop-daemon.sh start journalnode
[hadoop@server2 ~]$ jps
1493 JournalNode
1222 QuorumPeerMain
1594 Jps
3) Format the HDFS cluster
$ bin/hdfs namenode -format
The Namenode data is stored under /tmp by default and needs to be copied over to h2:
$ scp -r /tmp/hadoop-hadoop 172.25.15.10:/tmp
4) Format ZooKeeper (run on h1 only)
$ bin/hdfs zkfc -formatZK
(mind the capitalization)
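If the format succeeded, ZooKeeper now holds a znode for the nameservice; this can be verified from zkCli (masters is the dfs.nameservices id configured above):
[zk: 127.0.0.1:2181(CONNECTED) 0] ls /hadoop-ha
[masters]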
5) Start the HDFS cluster (run on h1 only)
$ sbin/start-dfs.sh

Check the processes on each node

Master node:
[root@server1 hadoop]# jps
3043 DFSZKFailoverController
2755 NameNode
3144 Jps
——————————————————————————
Data node:
[root@server2 hadoop]# jps
2289 JournalNode
2355 Jps
2198 DataNode
1705 QuorumPeerMain

View in the browser (NameNode web UI on port 50070, as configured above):
(screenshots: the two NameNode web UIs, one active and one standby)

OK: at any given moment only one host is active.
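The same can be confirmed from the command line; h1 and h2 are the namenode ids from hdfs-site.xml (output sketched for the state at this point, where server5/h2 holds the active role):
[root@server1 hadoop]# bin/hdfs haadmin -getServiceState h1
standby
[root@server1 hadoop]# bin/hdfs haadmin -getServiceState h2
active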

Testing high availability: stop the namenode on server5 and check that failover works.

[root@server5 mnt]# jps
1808 Jps
1705 DFSZKFailoverController
1613 NameNode
[root@server5 mnt]# kill -9  1613

OK: server1 automatically takes over the service.

Then just restart the namenode service on server5:

[root@server5 hadoop]# sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-server5.out



YARN high availability:

Edit the configuration files
——————————————————————————————————————————————————————————————————
[root@server5 hadoop]# cat  etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
——————————————————————————————————————————————————————————————————————————

[root@server5 hadoop]# cat  etc/hadoop/yarn-site.xml 
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>


<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_CLUSTER</value>
</property>


<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>


<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>172.25.15.6</value>
</property>


<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>172.25.15.10</value>
</property>



<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<property>
<name>yarn.resourcemanager.zk-address</name>
<value>172.25.15.7:2181,172.25.15.8:2181,172.25.15.9:2181</value>
</property>

</configuration>
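Once YARN is running, the ZKRMStateStore keeps its recovery data in ZooKeeper under /rmstore (the default value of yarn.resourcemanager.zk-state-store.parent-path), which can be spot-checked from zkCli (output sketched):
[zk: 127.0.0.1:2181(CONNECTED) 0] ls /rmstore
[ZKRMStateRoot]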


Starting the services

Start the YARN service
$ sbin/start-yarn.sh
[hadoop@server1 hadoop]$ jps
6559 Jps
2163 NameNode
1739 DFSZKFailoverController
5127 ResourceManager
The ResourceManager on RM2 must be started manually:
$ sbin/yarn-daemon.sh start resourcemanager
[hadoop@server5 hadoop]$ jps
1191 NameNode
3298 Jps
1293 DFSZKFailoverController
2757 ResourceManager

It is best to run the RM and the NN on separate machines to better guarantee performance, but in a test environment they can be colocated.
Open the browser to check (ResourceManager web UI, default port 8088):
(screenshots: the two ResourceManager web UIs)

As before, only one server is active at a time.
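The RM state can also be queried directly; rm1 and rm2 are the ids from yarn-site.xml (output sketched for the state before the failover test below):
[root@server1 hadoop]# bin/yarn rmadmin -getServiceState rm1
active
[root@server1 hadoop]# bin/yarn rmadmin -getServiceState rm2
standby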

Testing failover:

[root@server1 hadoop]# jps
8752 NameNode
9394 ResourceManager
9050 DFSZKFailoverController
9466 Jps
[root@server1 hadoop]# kill -9 9394

server5 takes over successfully.

Simply restart the service on server1:

[root@server1 hadoop]# sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-root-resourcemanager-server1.out

Using HBase


HBase distributed deployment
1) HBase configuration
$ tar zxf hbase-1.2.4-bin.tar.gz
$ vim hbase-env.sh
export JAVA_HOME=/home/hadoop/java
# point at the JDK
export HBASE_MANAGES_ZK=false
# defaults to true, which makes HBase start ZooKeeper itself; set it to false when maintaining your own ZooKeeper cluster
export HADOOP_HOME=/home/hadoop/hadoop
# point at the Hadoop directory, otherwise HBase cannot pick up the HDFS cluster configuration

——————————————————————————————————————————————————————————————————————————
vim hbase-site.xml
<configuration>
<!-- The shared directory where region servers persist HBase data. The HDFS address
given here must match the IP/hostname and port of fs.defaultFS in core-site.xml. -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://masters/hbase</value>
</property>
<!-- Enable HBase distributed mode -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- Comma-separated list of ZooKeeper cluster addresses. The default is localhost,
which only works for pseudo-distributed mode; it must be changed for a fully
distributed deployment. -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>172.25.15.7,172.25.15.8,172.25.15.9</value>
</property>
<!-- Keep 2 copies of the data; hdfs defaults to 3. This step is optional; the default of 3 is fine. -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- Specify the hbase master -->
<property>
<name>hbase.master</name>
<value>h1</value>
</property>
</configuration>

$ cat regionservers    (the ZooKeeper/DataNode IPs)
172.25.15.8
172.25.15.9
172.25.15.7
————————————————————————————————————————————————————————————————
Start HBase
Run on the master node:
$ bin/start-hbase.sh
[hadoop@server1 hbase]$ jps
6559 Jps
2163 NameNode
1739 DFSZKFailoverController
5127 ResourceManager
1963 HMaster
Run on the standby node:
[hadoop@server5 hbase]$ bin/hbase-daemon.sh start master
[hadoop@server5 hbase]$ jps
1191 NameNode
3298 Jps
1293 DFSZKFailoverController
2757 ResourceManager
1620 HMaster
Then check in the browser.

The HBase Master listens on port 16000 by default and serves a web UI on port 16010 of the Master host; RegionServers bind port 16020 by default, with an informational UI on port 16030.
(screenshots: HBase Master web UI)

To test failover, just kill the HMaster process.

Testing with data

[root@server1 hbase-1.2.4]# bin/hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hbase-1.2.4/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.4, r67592f3d062743907f8c5ae00dbbe1ae4f69e5af, Tue Oct 25 18:10:20 CDT 2016

hbase(main):001:0> 
——————————————————————————————————————————————————————————————————————
hbase(main):001:0> create 'test', 'cf'

——————————————————————————————————————————————————————————————————————
hbase(main):005:0> list  "test"
TABLE                                                                                                                                                 
test                                                                                                                                                  
1 row(s) in 0.0060 seconds

=> ["test"]

——————————————————————————————————————————————————————————————————
hbase(main):006:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.3610 seconds

hbase(main):007:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0260 seconds

hbase(main):008:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0170 seconds
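Reading the rows back with a scan should show the three cells just inserted (output sketched; timestamps will differ):
hbase(main):009:0> scan 'test'
ROW                   COLUMN+CELL
 row1                 column=cf:a, timestamp=..., value=value1
 row2                 column=cf:b, timestamp=..., value=value2
 row3                 column=cf:c, timestamp=..., value=value3
3 row(s) in 0.0350 seconds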
————————————————————————————————————————————————————————————————————
查看數據是否添加
[root@server5 hadoop]# bin/hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - root supergroup          0 2018-08-28 12:12 /hbase
drwxr-xr-x   - root supergroup          0 2018-08-28 11:48 /user

Added successfully.

Stop the HBase service on server1 to test failover.
server5 takes over successfully, OK.
Restart HBase on server1 and it comes back as the standby node.
