Setting Up a High-Availability Hadoop Cluster on 4 Virtual Machines

Contents

I. Cluster Installation

1. Software Version Selection

2. Machine Configuration

    1) Allocation of the 4 machines

    2) Editing hosts

    3) Passwordless SSH login

3. Software Installation

      1) Installing the JDK

      2) Installing ZooKeeper

      3) Installing Hadoop

      4) Summary

II. Starting the Cluster

1. Starting ZooKeeper

2. Starting Hadoop

1) Starting the journalnode processes and initializing the namenodes

2) Starting the file system

3) Starting the YARN cluster

4) Starting the MapReduce job history server

III. Verifying the Cluster

IV. One-Click Start/Stop Scripts


I. Cluster Installation

1. Software Version Selection

Choosing compatible versions of Hadoop, ZooKeeper, the JDK, and the operating system up front saves a great deal of trouble, and makes it easier to add components such as Hive, Storm, or Spark later. In particular, do not run 32-bit Hadoop on a 64-bit system (or the other way around) and then hunt for a matching libhadoop.so.1.0.0 to swap in; that path leads to a series of obscure, deeply hidden problems. The software and system versions I used are:

OS: 64-bit Red Hat 6.5  Link: https://pan.baidu.com/s/12Mr6RHzaYac4F-xaAzm59Q  extraction code: op9z
Software: JDK 1.7, Hadoop 2.6, ZooKeeper 3.4  Link: https://pan.baidu.com/s/13phkiYe2zlw9NhaEvbDRzQ  extraction code: t59h

2. Machine Configuration

    1) Allocation of the 4 machines

Hostname  redhat01         redhat02         redhat03         redhat04
IP        192.168.202.121  192.168.202.122  192.168.202.123  192.168.202.124
Roles     zookeeper        zookeeper        zookeeper        zookeeper (observer)
          datanode         datanode         datanode         datanode
          namenode         namenode         resourcemanager  resourcemanager

    2) Editing hosts

Add the following entries to /etc/hosts on all 4 machines:

vim /etc/hosts

192.168.202.121 redhat01
192.168.202.122 redhat02
192.168.202.123 redhat03
192.168.202.124 redhat04
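
An optional quick sanity check that the names now resolve (repeat for the other hostnames):

#from any node
ping -c 1 redhat02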

    3) Passwordless SSH login

(It is recommended to create the same ordinary user on all 4 machines and perform all of the following steps as that user.) On each of the 4 machines, run:

ssh-keygen -t rsa

Press Enter through every prompt, then run on redhat01:

touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

Paste the contents of ~/.ssh/id_rsa.pub from all 4 machines into the authorized_keys file created above, then distribute authorized_keys to the other 3 machines:

scp ~/.ssh/authorized_keys redhat02:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys redhat03:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys redhat04:~/.ssh/authorized_keys
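
If you prefer not to paste the keys by hand, a short loop on redhat01 can collect all four public keys (a sketch; since trust is not yet established, each ssh will prompt for that user's password):

for host in redhat01 redhat02 redhat03 redhat04; do
    #append each node's public key to the shared authorized_keys file
    ssh ${host} cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
done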

After that, ssh from every machine to all 4 machines (including itself). The first connection to each host requires typing yes manually; later connections will not. For example, on redhat01:

ssh redhat01
#type yes manually

ssh redhat02
#type yes manually

ssh redhat03
#type yes manually

ssh redhat04
#type yes manually

3. Software Installation

      1) Installing the JDK

Unpack the JDK, then configure the environment variables.

Be sure to put them in .bashrc; that way you will not have to edit the JAVA_HOME entry in hadoop-env.sh. Otherwise Hadoop will complain that JAVA_HOME does not exist, because the Hadoop daemons do not read ~/.bash_profile but do read .bashrc. To learn more, look up the difference between bashrc and profile.

vim ~/.bashrc

#append at the end
export JAVA_HOME=/home/hadoop/bigdata/jdk1.7.0_80
export PATH=.:$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
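
To confirm the variables took effect (java -version should report the JDK 1.7 installed above):

source ~/.bashrc
echo $JAVA_HOME
java -version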

      2) Installing ZooKeeper

Unpack ZooKeeper and configure the environment:

vim ~/.bash_profile

##  append at the end
export ZOOKEEPER_HOME=/home/hadoop/bigdata/zookeeper-3.4.11
PATH=$PATH:$HOME/bin:$ZOOKEEPER_HOME/bin

export PATH

Go into the ZooKeeper installation directory; the conf directory holds ZooKeeper's configuration files:

cp zoo_sample.cfg zoo.cfg

Copy the following configuration into the zoo.cfg file:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/home/hadoop/bigdata/zookeeper-3.4.11/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

## added by user
# the lines below are added manually
# server.id=hostname:heartbeat port:election port
# note: each node is assigned an id here, written into this config file
# an id is any number from 1 to 255, unique per node; remember which id belongs to which node
# "observer" marks that machine as an observer
server.1=redhat01:2888:3888
server.2=redhat02:2888:3888
server.3=redhat03:2888:3888
server.4=redhat04:2888:3888:observer

The three ZooKeeper roles:

           leader: accepts and handles all read and write requests; every write in the cluster is processed by the leader

           follower: accepts all read and write requests, handles reads itself, and forwards writes to the leader

           observer: identical to a follower except that it can neither vote nor be elected

Note the line dataDir=/home/hadoop/bigdata/zookeeper-3.4.11/data: change this directory to your own path and create it by hand. Inside it, also create a myid file whose content is different on each machine and must match the number after server. at the end of zoo.cfg:

#on redhat01
echo 1 >/home/hadoop/bigdata/zookeeper-3.4.11/data/myid

#on redhat02
echo 2 >/home/hadoop/bigdata/zookeeper-3.4.11/data/myid

#on redhat03
echo 3 >/home/hadoop/bigdata/zookeeper-3.4.11/data/myid

#on redhat04
echo 4 >/home/hadoop/bigdata/zookeeper-3.4.11/data/myid
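
Since passwordless SSH is already in place, the directory creation and the four echo commands can also be done from redhat01 in a single loop (a sketch, assuming the same paths on every host):

id=1
for host in redhat01 redhat02 redhat03 redhat04; do
    #create the dataDir and write this node's id into myid
    ssh ${host} "mkdir -p /home/hadoop/bigdata/zookeeper-3.4.11/data && echo ${id} > /home/hadoop/bigdata/zookeeper-3.4.11/data/myid"
    id=$((id+1))
done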

      3) Installing Hadoop

Unpack Hadoop and configure the environment:

vim ~/.bash_profile


##  append at the end
export ZOOKEEPER_HOME=/home/hadoop/bigdata/zookeeper-3.4.11
export HADOOP_HOME=/home/hadoop/bigdata/hadoop-2.6.0
PATH=$PATH:$HOME/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export PATH

Edit the configuration files under etc/hadoop in the Hadoop directory.

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- set the hdfs nameservice to ns (the name is up to you) -->
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns/</value>
    </property>
    
    <!-- hadoop working directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/bigdata/hadoop-2.6.0/data/hadoopdata</value>
    </property>

    <!-- zookeeper quorum addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>redhat01:2181,redhat02:2181,redhat03:2181,redhat04:2181</value>
    </property>
    <!-- ipc settings to avoid a ConnectException when connecting to the journalnode service;
         without the parameters below, starting the cluster with start-dfs.sh may fail
         with connection-refused errors
    -->
    <property>
        <name>ipc.client.connect.max.retries</name>
        <value>50</value>
        <description>Indicates the number of retries a client will make to establish a server connection.</description>
    </property>
    <property>
        <name>ipc.client.connect.retry.interval</name>
        <value>10000</value>
        <description>Indicates the number of milliseconds a client will wait for before retrying to establish a server connection.</description>
    </property>
</configuration>

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- replication factor: do not exceed the number of datanodes -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>

    <!-- set the hdfs nameservice to ns; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>

    <!-- ns has two NameNodes, nn01 and nn02 -->
    <property>
        <name>dfs.ha.namenodes.ns</name>
        <value>nn01,nn02</value>
    </property>

    <!-- RPC address of nn01 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn01</name>
        <value>redhat01:9000</value>
    </property>
    <!-- HTTP address of nn01 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn01</name>
        <value>redhat01:50070</value>
    </property>
    <!-- RPC address of nn02 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn02</name>
        <value>redhat02:9000</value>
    </property>
    <!-- HTTP address of nn02 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn02</name>
        <value>redhat02:50070</value>
    </property>

    <!-- where the NameNode edits metadata is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://redhat01:8485;redhat02:8485;redhat03:8485/ns</value>
    </property>

    <!-- where the JournalNodes store data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/bigdata/hadoop-2.6.0/data/journaldata</value>
    </property>

    <!-- enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <!-- failover proxy provider class -->
    <!-- this value is long; be careful not to let it wrap across lines -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- fencing methods; separate multiple methods with newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>

    <!-- the sshfence mechanism requires passwordless ssh; change this to your own key path -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    
    <!-- sshfence connection timeout (30s) -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>

mapred-site.xml

Note: this file does not exist out of the box; copy mapred-site.xml.template and rename the copy to mapred-site.xml.
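
For example, from the etc/hadoop directory:

cp mapred-site.xml.template mapred-site.xml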

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- run MapReduce on the yarn framework -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <!-- address and port of the MapReduce job history server -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>redhat01:10020</value>
    </property>

    <!-- web address of the MapReduce job history server -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>redhat01:19888</value>
    </property>
</configuration>

yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->

    <!-- enable RM high availability -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- RM cluster id (the name is up to you) -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>jyarn</value>
    </property>

    <!-- ids of the RMs (the names are up to you) -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <!-- hostnames of the two RMs -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>redhat03</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>redhat04</value>
    </property>

    <!-- zk quorum addresses -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>redhat01:2181,redhat02:2181,redhat03:2181,redhat04:2181</value>
    </property>

    <!-- auxiliary service required to run MapReduce jobs -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- enable log aggregation for the YARN cluster -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <!-- maximum retention time for aggregated YARN logs -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <!-- 1 day -->
        <value>86400</value>
    </property>

    <!-- enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

    <!-- store the resourcemanager state in the zookeeper cluster -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>


</configuration>

Finally, edit the slaves configuration file to list the hosts that run datanodes:

vim slaves

redhat01
redhat02
redhat03
redhat04

      4) Summary

All of this must be configured on all 4 machines, which is a bit tedious, but please be careful; mistakes here are easy to make and hard to track down. You can also finish the configuration on one machine and copy it straight to the others:

scp -r /home/hadoop/bigdata/hadoop-2.6.0 redhat02:/home/hadoop/bigdata/hadoop-2.6.0

With passwordless login already set up, copying with scp is very convenient.
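
A small loop covers the remaining hosts in one go (a sketch, assuming the same directory layout on every machine):

for host in redhat02 redhat03 redhat04; do
    scp -r /home/hadoop/bigdata/hadoop-2.6.0 ${host}:/home/hadoop/bigdata/
done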

Once all the preparation is done, refresh the environment variables:

#run on every machine
source ~/.bash_profile

II. Starting the Cluster

1. Starting ZooKeeper

On each of the four servers, run:

zkServer.sh start

After startup you can check each node's state and confirm that redhat04 is indeed running in observer mode.
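
For example, with zkServer.sh status, which prints the node's current mode:

#run on each node; redhat04 should report Mode: observer
zkServer.sh status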

Running jps will also show the ZooKeeper process (QuorumPeerMain).

2. Starting Hadoop

1) Starting the journalnode processes and initializing the namenodes

On the three journalnode hosts (redhat01, redhat02, redhat03), start the journalnode process:

hadoop-daemon.sh start journalnode

Initialize the file system on redhat01:

hadoop namenode -format

Start the namenode process on redhat01:

hadoop-daemon.sh start namenode

Sync the namenode metadata on redhat02:

hadoop namenode -bootstrapStandby

Format zkfc on redhat01 or redhat02:

hdfs zkfc -formatZK
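
As an optional check, the format step creates a znode for the nameservice under /hadoop-ha (ZooKeeper's default parent znode for Hadoop HA); zkCli.sh ships with ZooKeeper:

zkCli.sh -server redhat01:2181
#then, inside the zk shell:
ls /hadoop-ha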

2) Starting the file system

First stop all the processes, then start everything together with start-dfs.sh:

#on redhat01, redhat02 and redhat03, stop the journalnode process
hadoop-daemon.sh stop journalnode

#on redhat01, stop the namenode process
hadoop-daemon.sh stop namenode

#run jps on each machine; everything should now be stopped except the Jps process itself and the ZooKeeper process
jps

Run start-dfs.sh on redhat01.

Access the web UI at http://ip:50070, where ip is the static IP of redhat01 or redhat02.

3) Starting the YARN cluster

Run start-yarn.sh on redhat03.

On redhat04, start the resourcemanager process:

yarn-daemon.sh start resourcemanager

View the YARN web UI at http://ip:8088, where ip is the static IP of redhat03 or redhat04.

Since the resourcemanager on redhat03 is active and the one on redhat04 is standby, opening the web UI via redhat04's IP redirects to redhat03. If your local machine has no hosts entry for redhat03, the browser cannot find the host; this does not mean the cluster is broken.

4) Starting the MapReduce job history server

On redhat01, run:

mr-jobhistory-daemon.sh start historyserver

Access the job-history web UI at http://ip:19888, where ip is redhat01's static IP.

III. Verifying the Cluster

The processes on each of the four machines:

[hadoop@redhat01 ~]$ jps
2443 JobHistoryServer
2286 NodeManager
2071 DFSZKFailoverController
1950 JournalNode
1787 DataNode
2516 Jps
1363 QuorumPeerMain
1688 NameNode


[hadoop@redhat02 ~]$ jps
1667 DFSZKFailoverController
1249 QuorumPeerMain
1873 NodeManager
1417 NameNode
1559 JournalNode
2058 Jps
1484 DataNode


[hadoop@redhat03 ~]$ jps
1494 JournalNode
1624 ResourceManager
2067 Jps
1245 QuorumPeerMain
1725 NodeManager
1420 DataNode


[hadoop@redhat04 ~]$ jps
1701 Jps
1476 NodeManager
1355 DataNode
1247 QuorumPeerMain
1622 ResourceManager

Check the cluster status:

[hadoop@redhat01 ~]$ hdfs dfsadmin -report
Configured Capacity: 50189762560 (46.74 GB)
Present Capacity: 21985267712 (20.48 GB)
DFS Remaining: 21984055296 (20.47 GB)
DFS Used: 1212416 (1.16 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (4):

Name: 192.168.202.124:50010 (redhat04)
Hostname: redhat04
Decommission Status : Normal
Configured Capacity: 12547440640 (11.69 GB)
DFS Used: 544768 (532 KB)
......
......(omitted)

[hadoop@redhat01 ~]$ 
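
Besides hdfs dfsadmin -report, the HA state can be queried directly; nn01/nn02 and rm1/rm2 are the ids configured earlier in hdfs-site.xml and yarn-site.xml:

hdfs haadmin -getServiceState nn01
hdfs haadmin -getServiceState nn02
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2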

Run the wordcount example that ships with Hadoop:

#create the input directory
hadoop fs -mkdir -p /user/hadoop/input

#upload the LICENSE.txt file from the hadoop installation directory
hadoop fs -put /home/hadoop/bigdata/hadoop-2.6.0/LICENSE.txt /user/hadoop/input/

#run the wordcount example that ships under share/hadoop/mapreduce/ in the hadoop installation directory
hadoop jar /home/hadoop/bigdata/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hadoop/input /user/hadoop/output

When the job succeeds, the results appear under /user/hadoop/output on the cluster.
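
To view them (part-r-00000 is the conventional name of the reducer output file):

hadoop fs -ls /user/hadoop/output
hadoop fs -cat /user/hadoop/output/part-r-00000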

IV. One-Click Start/Stop Scripts

Because starting everything by hand each time is tedious, I wrote a pair of one-click start and stop scripts for Hadoop. They are meant for subsequent startups; the first-time initialization should still follow the steps above:

#!/bin/bash

zp_home_dir='/home/hadoop/bigdata/zookeeper-3.4.11/bin'
hd_home_dir='/home/hadoop/bigdata/hadoop-2.6.0/sbin'
nodeArr=(redhat01 redhat02 redhat03 redhat04)
echo '---- start zookeeper ----'
for node in ${nodeArr[@]}; do
    echo $node '-> zookeeper started'
    ssh ${node} "${zp_home_dir}/zkServer.sh start"
done

sleep 3s
echo '---- start hdfs ----'
start-dfs.sh

sleep 3s
echo '----redhat03 start yarn ----'
ssh redhat03 "${hd_home_dir}/start-yarn.sh"

sleep 3s
echo '----redhat04 start resourcemanager ----'
ssh redhat04 "${hd_home_dir}/yarn-daemon.sh start resourcemanager"

echo '----redhat01 start mapreduce jobhistory tracker ----'
mr-jobhistory-daemon.sh start historyserver 

The stop script:

#!/bin/bash

zp_home_dir='/home/hadoop/bigdata/zookeeper-3.4.11/bin'
hd_home_dir='/home/hadoop/bigdata/hadoop-2.6.0/sbin'
nodeArr=(redhat01 redhat02 redhat03 redhat04)

echo '----redhat01 stop mapreduce jobhistory tracker ----'
mr-jobhistory-daemon.sh stop historyserver 

sleep 3s
echo '----redhat04 stop resourcemanager ----'
ssh redhat04 "${hd_home_dir}/yarn-daemon.sh stop resourcemanager"

sleep 3s
echo '----redhat03 stop yarn ----'
ssh redhat03 "${hd_home_dir}/stop-yarn.sh"

sleep 3s
echo '---- stop hdfs ----'
stop-dfs.sh

echo '---- stop zookeeper ----'
for node in ${nodeArr[@]}; do
    echo $node '-> zookeeper stopping'
    ssh ${node} "${zp_home_dir}/zkServer.sh stop"
done

 
