1. Lab environment:
Node1: 192.168.1.17 (RHEL 5.8, 32-bit, web server)
Node2: 192.168.1.18 (RHEL 5.8, 32-bit, web server)
NFS: 192.168.1.19 (RHEL 5.8, 32-bit, NFS server)
VIP: 192.168.1.20 (webip)
2. Preparation
<1> Configure the host names
Node names are resolved through /etc/hosts, and each node name must match the output of uname -n on that node.
Node1:
# hostname node1.ikki.com
# vim /etc/sysconfig/network
HOSTNAME=node1.ikki.com
Node2:
# hostname node2.ikki.com
# vim /etc/sysconfig/network
HOSTNAME=node2.ikki.com
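Because the running host name and the persistent setting are changed separately, it is easy for the two to drift apart. The following is a minimal sketch of a consistency check; the function name hostname_matches is hypothetical, and it assumes the RHEL 5 style HOSTNAME= line in /etc/sysconfig/network used above.

```shell
#!/bin/bash
# Compare a node name with the HOSTNAME= entry of a sysconfig-style file.
# $1 = expected name (normally "$(uname -n)"), $2 = file to read.
# hostname_matches is a hypothetical helper, not part of any package.
hostname_matches() {
    local expected="$1" file="$2" configured
    # Take the last HOSTNAME= line and strip the key and any double quotes.
    configured=$(sed -n 's/^HOSTNAME=//p' "$file" | tail -n 1 | tr -d '"')
    [ "$configured" = "$expected" ]
}
```

On a node it could be called as: hostname_matches "$(uname -n)" /etc/sysconfig/network && echo "host name is consistent"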
<2> Set up key-based SSH between the nodes (the short names node1 and node2 must already resolve, see <3>)
Node1:
# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
Node2:
# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
<3> Make the nodes reachable from each other by host name
Node1 & Node2:
# vim /etc/hosts
192.168.1.17 node1.ikki.com node1
192.168.1.18 node2.ikki.com node2
<4> Keep the clocks of the nodes synchronized
Node1 & Node2:
# crontab -e
*/5 * * * * /sbin/ntpdate 202.120.2.101 &> /dev/null
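To confirm that the cron job is actually keeping the clocks close, the two nodes' epoch times can be compared. This is only a sketch; clock_skew is a hypothetical helper name, and the comparison ignores the short delay added by the SSH round trip.

```shell
#!/bin/bash
# Print the absolute difference, in seconds, between two epoch timestamps.
# clock_skew is a hypothetical helper name.
clock_skew() {
    local a="$1" b="$2" diff
    diff=$((a - b))
    # Strip a leading minus sign to get the absolute value.
    echo "${diff#-}"
}
```

From Node1 it could be used as: clock_skew "$(date +%s)" "$(ssh node2 date +%s)"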
3. Install corosync and pacemaker (on every node)
<1> Required dependency packages:
libibverbs, librdmacm, lm_sensors, libtool-ltdl, openhpi-libs, openib, perl-TimeDate, libnes
<2> Download the packages into a dedicated local directory (for example /root/cluster):
# cd /root/cluster
# ls
cluster-glue-1.0.6-1.6.el5.i386.rpm
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm
corosynclib-1.2.7-1.1.el5.i386.rpm
heartbeat-3.0.3-2.3.el5.i386.rpm
heartbeat-libs-3.0.3-2.3.el5.i386.rpm
libesmtp-1.0.4-5.el5.i386.rpm
pacemaker-1.1.5-1.1.el5.i386.rpm
pacemaker-libs-1.1.5-1.1.el5.i386.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm
<3> Install the local packages together with their dependencies:
# cd /root/cluster
# yum -y --nogpgcheck localinstall *.rpm
4. Configure corosync
Node1:
# cd /etc/corosync
# cp corosync.conf.example corosync.conf
# vim corosync.conf
Add the following sections:
service {
    ver: 0
    name: pacemaker
    # use_mgmtd: yes
}
aisexec {
    user: root
    group: root
}
In the same file, change the following settings:
bindnetaddr: 192.168.1.0 # network address of the network the NIC is on
secauth: on # enable authentication
to_syslog: no # disable syslog logging (a separate logfile is used instead)
threads: 2 # number of threads
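Note that bindnetaddr takes the network address rather than the node's own IP. A minimal sketch of deriving it from an address and a netmask follows; the function name network_addr is hypothetical, and the 255.255.255.0 netmask in the usage line is an assumption that matches the /24 network used in this setup.

```shell
#!/bin/bash
# Print the network address for a dotted-quad IP and netmask by
# AND-ing each octet. network_addr is a hypothetical helper name.
network_addr() {
    local ip="$1" mask="$2" i1 i2 i3 i4 m1 m2 m3 m4
    IFS=. read -r i1 i2 i3 i4 <<< "$ip"
    IFS=. read -r m1 m2 m3 m4 <<< "$mask"
    echo "$((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
}
```

For Node1, network_addr 192.168.1.17 255.255.255.0 prints 192.168.1.0, which is the value used for bindnetaddr above.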
Generate the authentication key file used for communication between the nodes:
# corosync-keygen
Copy corosync.conf and authkey to Node2:
# scp -p corosync.conf authkey node2:/etc/corosync/
Create the directory for the corosync log files on both nodes:
# mkdir /var/log/cluster
# ssh node2 'mkdir /var/log/cluster'
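With secauth enabled, both nodes need byte-identical copies of authkey. One way to confirm the copy, sketched here under the assumption that md5sum is available on both nodes, is to compare checksums; file_md5 is a hypothetical helper name.

```shell
#!/bin/bash
# Print only the MD5 checksum of a file (without the file name).
# file_md5 is a hypothetical helper name.
file_md5() {
    md5sum "$1" | cut -d' ' -f1
}
```

From Node1 the comparison could look like: [ "$(file_md5 /etc/corosync/authkey)" = "$(ssh node2 md5sum /etc/corosync/authkey | cut -d' ' -f1)" ] && echo "authkey matches"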
5. Start the service and check it
Node1:
# /etc/init.d/corosync start
Check that the corosync engine started properly:
# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Sep 16 18:59:29 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Sep 16 18:59:29 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Sep 16 19:28:26 corosync [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:170.
Sep 16 19:54:14 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Sep 16 19:54:14 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications were sent:
# grep TOTEM /var/log/cluster/corosync.log
Sep 16 18:59:29 corosync [TOTEM ] Initializing transport (UDP/IP).
Sep 16 18:59:29 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Sep 16 18:59:29 corosync [TOTEM ] The network interface [192.168.1.17] is now up.
Sep 16 18:59:29 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred during startup:
# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Check that pacemaker started properly:
# grep pcmk_startup /var/log/cluster/corosync.log
Sep 16 18:59:29 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Sep 16 18:59:29 corosync [pcmk ] Logging: Initialized pcmk_startup
Sep 16 18:59:29 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Sep 16 18:59:29 corosync [pcmk ] info: pcmk_startup: Service: 9
Sep 16 18:59:29 corosync [pcmk ] info: pcmk_startup: Local hostname: node1.ikki.com
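The four log checks above can be bundled into one function so they are easy to repeat after every restart. This is a rough sketch rather than an official tool: check_corosync_log is a hypothetical name, and the patterns are simply the ones grepped for above.

```shell
#!/bin/bash
# Run the startup checks from this section against a corosync log file.
# Returns 0 if all expected messages are present and no unexpected
# ERROR lines exist. check_corosync_log is a hypothetical helper name.
check_corosync_log() {
    local log="$1" status=0
    grep -q "Corosync Cluster Engine" "$log" \
        || { echo "engine start message missing"; status=1; }
    grep -q "TOTEM" "$log" \
        || { echo "no TOTEM membership messages"; status=1; }
    grep -q "pcmk_startup" "$log" \
        || { echo "no pacemaker startup messages"; status=1; }
    # ERROR lines from unpack_resources are ignored, as in the manual check.
    if grep "ERROR:" "$log" | grep -qv "unpack_resources"; then
        echo "errors found in log"
        status=1
    fi
    return "$status"
}
```

It could be run as: check_corosync_log /var/log/cluster/corosync.log && echo "startup looks clean"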
If all of the checks above look normal, start corosync on Node2 (start it remotely from Node1; do not start it directly on Node2):
# ssh node2 -- /etc/init.d/corosync start
Check the startup state of the cluster nodes:
# crm status
============
Last updated: Tue Sep 17 23:39:11 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
Look at the processes started by corosync:
# ps auxf
root 13200 0.6 0.7 86880 3952 ? Ssl 12:29 4:06 corosync
root 13208 0.0 0.4 11724 2104 ? S 12:29 0:00 \_ /usr/lib/heartbeat/stonithd
101 13209 0.0 0.7 12872 3820 ? S 12:29 0:01 \_ /usr/lib/heartbeat/cib
root 13210 0.0 0.4 6572 2156 ? S 12:29 0:00 \_ /usr/lib/heartbeat/lrmd
101 13211 0.0 0.3 12060 2040 ? S 12:29 0:00 \_ /usr/lib/heartbeat/attrd
101 13212 0.0 0.5 8836 2900 ? S 12:29 0:00 \_ /usr/lib/heartbeat/pengine
101 13213 0.0 0.6 12280 3112 ? S 12:29 0:02 \_ /usr/lib/heartbeat/crmd
6. Configure the cluster to disable STONITH
Pacemaker enables STONITH by default, but this lab environment has no STONITH device, so STONITH has to be disabled:
# crm configure property stonith-enabled=false
View the current configuration:
# crm configure show
node node1.ikki.com
node node2.ikki.com
property $id="cib-bootstrap-options" \
    dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false"
7. Add an IP address resource (webip) to the cluster:
Node1:
# crm configure primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.20
Check that the resource has started:
# crm status
============
Last updated: Tue Sep 17 23:48:10 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
webip (ocf::heartbeat:IPaddr): Started node1.ikki.com
Check that webip is active on the interface:
# ifconfig
eth0:0 Link encap:Ethernet HWaddr 08:00:27:F1:60:13
    inet addr:192.168.1.20 Bcast:192.168.1.255 Mask:255.255.255.0
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
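For scripted checks, the presence of the VIP can be tested by filtering the interface listing instead of reading it by eye. The sketch below is an illustration only; has_ip is a hypothetical helper, and it accepts both the "inet addr:" form printed by ifconfig and the "inet a.b.c.d/nn" form printed by ip addr.

```shell
#!/bin/bash
# Read interface output on stdin and succeed if the given IPv4 address
# is configured. has_ip is a hypothetical helper name.
has_ip() {
    local ip="$1" pattern
    # Escape the dots and require a delimiter after the address so that
    # 192.168.1.20 does not match 192.168.1.200.
    pattern="inet (addr:)?${ip//./\\.}([ /]|\$)"
    grep -Eq "$pattern"
}
```

On the node that should hold the address: ifconfig | has_ip 192.168.1.20 && echo "webip is active here"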
8. Configure the cluster to ignore quorum
Node2:
Stop the corosync service on Node1:
# ssh node1 -- /etc/init.d/corosync stop
Check the state of the cluster:
# crm status
============
Last updated: Tue Sep 17 23:49:41 2013
Stack: openais
Current DC: node2.ikki.com - partition WITHOUT quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node2.ikki.com ]
OFFLINE: [ node1.ikki.com ]
In a two-node cluster, quorum cannot work as intended: as soon as Node1 goes offline, the remaining partition has lost quorum, so the webip resource cannot move to Node2. The cluster therefore has to be told to ignore the loss of quorum:
# crm configure property no-quorum-policy=ignore
Check the state of the cluster again:
# crm status
============
Last updated: Tue Sep 17 23:51:27 2013
Stack: openais
Current DC: node2.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
webip (ocf::heartbeat:IPaddr): Started node2.ikki.com
Start the corosync service on Node1:
# ssh node1 -- /etc/init.d/corosync start
Set a default stickiness value for resources:
# crm configure rsc_defaults resource-stickiness=100
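9. Configure an active/passive highly available web cluster
<1> Install the httpd service on every node and provide a test page (the cluster starts and stops httpd itself, so the service should not also be enabled at boot)
<2> Add the web service resource (httpd) to the cluster:
# crm configure primitive httpd lsb:httpd
Check that the resource has started:
# crm status
============
Last updated: Tue Sep 17 23:54:36 2013
Stack: openais
Current DC: node2.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
webip (ocf::heartbeat:IPaddr): Started node1.ikki.com
httpd (lsb:httpd): Started node2.ikki.com
<3> Configure a colocation constraint (without one, the two resources can end up on different nodes, as in the status output above):
# crm configure colocation httpd_with_webip INFINITY: httpd webip
<4> Configure an order constraint (start order: webip, then httpd):
# crm configure order httpd_after_ip mandatory: webip httpd
<5> Configure a location constraint (the # is escaped so that the shell does not treat the rest of the line as a comment):
# crm configure location prefer-node1 httpd rule 200: \#uname eq node1.ikki.com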
10. Set up the NFS server
NFS:
# mkdir -p /web/htdocs
# vim /etc/exports
/web/htdocs 192.168.1.0/24(ro)
# exportfs -rav
11. Add the NFS-backed webstore resource to the cluster and configure its constraints
Node1:
<1> Add the webstore resource
# crm configure primitive webstore ocf:heartbeat:Filesystem params device=192.168.1.19:/web/htdocs directory=/var/www/html fstype=nfs op start timeout=60 op stop timeout=60
<2> Set the colocation constraint
# crm configure colocation httpd_with_webstore inf: httpd webstore
<3> Set the order constraint
# crm configure order webstore_before_httpd mandatory: webstore httpd
<4> Adjust the order constraints (using the interactive crm shell)
# crm configure
crm(live)configure# edit
(in the editor, delete the previously defined constraint: order httpd_after_ip inf: webip httpd)
crm(live)configure# order webstore_after_ip inf: webip webstore
crm(live)configure# verify
crm(live)configure# commit
12. Overview of the cluster configuration and resource status
<1> View the cluster configuration
# crm configure show
node node1.ikki.com \
    attributes standby="off"
node node2.ikki.com
primitive httpd lsb:httpd \
    meta target-role="Started"
primitive webip ocf:heartbeat:IPaddr \
    params ip="192.168.1.20" \
    meta target-role="Started"
primitive webstore ocf:heartbeat:Filesystem \
    params device="192.168.1.19:/web/htdocs" directory="/var/www/html" fstype="nfs" \
    op start interval="0" timeout="60" \
    op stop interval="0" timeout="60" \
    meta target-role="Started"
location perfer_node1 httpd \
    rule $id="perfer_node1-rule" 200: #uname eq node1.ikki.com
colocation httpd_with_webip inf: httpd webip
colocation httpd_with_webstore inf: httpd webstore
order webstore_after_ip inf: webip webstore
order webstore_before_httpd inf: webstore httpd
property $id="cib-bootstrap-options" \
    dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore" \
    last-lrm-refresh="1379355508"
<2> View the resource status
# crm status
============
Last updated: Tue Sep 17 23:58:35 2013
Stack: openais
Current DC: node2.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
3 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
webip (ocf::heartbeat:IPaddr): Started node1.ikki.com
httpd (lsb:httpd): Started node1.ikki.com
webstore (ocf::heartbeat:Filesystem): Started node1.ikki.com
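Since the colocation constraints are meant to keep all three resources on one node, it can help to extract the hosting node of each resource from the status output and compare them. The following is a sketch under the assumption that the status lines look like the ones shown above; resource_node is a hypothetical helper name.

```shell
#!/bin/bash
# Read "crm status" output on stdin and print the node on which the
# named resource is reported as Started. Prints nothing if the resource
# is not started. resource_node is a hypothetical helper name.
resource_node() {
    local rsc="$1"
    awk -v rsc="$rsc" '$1 == rsc && $3 == "Started" { print $4 }'
}
```

For example, crm status | resource_node httpd should print the same node as crm status | resource_node webip. One way to exercise a failover, assuming the crm node subcommands of this crm shell version, is to run crm node standby node1.ikki.com, check that all three resources report node2.ikki.com, and then bring the node back with crm node online node1.ikki.com.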