Heartbeat+DRBD+MySQL High-Availability Architecture: Design and Implementation Details

Database architecture evolution at an Internet company, from the early stage to the later stage


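Introduction to Heartbeat


Official site: http://linux-ha.org/wiki/Main_Page

   Heartbeat quickly moves resources (the VIP address and application services) from a failed server to a healthy one, which then continues to provide service. It is similar to keepalived: it implements failover, but on its own it cannot health-check back-end servers.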


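Introduction to DRBD

Official site: http://www.drbd.org/

   DRBD (Distributed Replicated Block Device) is a software-based, shared-nothing storage replication solution that mirrors the contents of block devices between servers. It mirrors data between two servers over the network at the block-device level, either synchronously (a write completes only after both servers have written it) or asynchronously (a write completes as soon as the local server has written it), which makes it the network equivalent of RAID 1. Because it works on block devices (disks, LVM logical volumes) below the file system, replication is generally faster than copying files with cp.

   The official MySQL manual documents DRBD as one of its recommended high-availability solutions.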


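Introduction to MySQL

Official site: http://www.mysql.com/

   MySQL is a lightweight open-source relational database management system, widely used by small and medium-sized websites on the Internet. It is small, fast, and has a low total cost of ownership. Above all it is open source, so many small and medium-sized websites choose it as their database to keep costs down.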


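Heartbeat vs. keepalived: use cases and differences

Many readers ask why one would use heartbeat, which has not been updated for a long time, instead of keepalived. The use cases and differences are as follows.

1. For web, db, and load-balancing (lvs, haproxy, nginx) services, both heartbeat and keepalived work.

2. lvs is best paired with keepalived, because keepalived was originally written for lvs. (heartbeat has no built-in health check for real servers (RS); it can get one through ldirectord.)

3. MySQL dual-master with multiple slaves and NFS/MFS storage all need data synchronization. heartbeat suits these workloads better because it ships with a drbd resource script.

Summary: for high availability of applications without data synchronization, choose keepalived; for applications with data synchronization, choose heartbeat.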

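1. Heartbeat+DRBD+MySQL installation and deployment

(1) Architecture topology

Architecture notes:

One master with multiple slaves is the most common architecture; lvs can load-balance reads across the slaves.

This design removes the master as a single point of failure: when the master goes down, the standby node takes over automatically and all slaves automatically replicate from the new master, giving the MySQL master a hot standby.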

(2) System environment

System environment

Item             Value
OS               CentOS release 5.8
Architecture     x86
Kernel version   2.6.18

Software environment

Package          Version
heartbeat        heartbeat-2.1.3-3
drbd             drbd83-8.3.13-2
mysql            5.5.27

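(3) Deployment environment

Role      IP
VIP       192.168.4.1 (service address on the internal network)
master1   eth0: (none; the database has no public address)
          eth1: 192.168.4.2/16 (internal network)
          eth2: 172.16.4.2/16 (heartbeat link)
          eth3: 172.168.4.2/16 (DRBD gigabit data link)
master2   eth0: (none; the database has no public address)
          eth1: 192.168.4.3/16 (internal network)
          eth2: 172.16.4.3/16 (heartbeat link)
          eth3: 172.168.4.3/16 (DRBD gigabit data link)
slave1    eth1: 192.168.4.4/16 (internal network)

Note: the slave replicates from the master through the master's VIP.

Requirements:

1. When master1 goes down, master2 automatically takes over the VIP and all slaves.

2. The takeover by master2 must not break replication on the slaves.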


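(4) Data partitions on the master servers

Disk       Capacity   Partition   Mount point   Purpose
/dev/sdb   1G         /dev/sdb1   /data/        data
                      /dev/sdb2   (none)        DRBD meta data (replication state)

Notes

1. Never format the meta data partition or create a file system on it (sdb2 stores DRBD's replication state).

2. Do not mount the partitions after creating them.

3. In production, the DRBD meta data partition is usually 1-2G; give the data partition as much space as the workload needs.

4. In production, the disks on the two nodes should be the same size.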


2. Heartbeat installation and deployment

(1) Configure heartbeat routes between the servers

Primary node

[root@master1 ~]# route add -host 172.16.4.3 dev eth2     <== heartbeat route to the peer
[root@master1 ~]# route add -host 172.168.4.3 dev eth3    <== DRBD data route to the peer

Standby node

[root@master2 ~]# route add -host 172.16.4.2 dev eth2
[root@master2 ~]# route add -host 172.168.4.2 dev eth3

(2) Install heartbeat

[root@master1 ~]# yum install heartbeat -y
[root@master1 ~]# yum install heartbeat -y
Note: the heartbeat install command has to be run twice.

(3) Configure heartbeat

The configuration files (ha.cf, authkeys, haresources) are identical on the primary and standby nodes.

1) ha.cf

[root@master1 ~]# vim /etc/ha.d/ha.cf
#log configure
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local1
#options configure
keepalive 2
deadtime 30
warntime 10
initdead 120
#bcast  eth2
mcast eth2 225.0.0.7 694 1 0
#node configure
auto_failback on
node    master1     <== hostname of the primary node
node    master2     <== hostname of the standby node
crm no
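
The timing directives above depend on one another: keepalive is the interval between heartbeats, warntime is how long heartbeat waits before logging a late-heartbeat warning, deadtime is how long it waits before declaring the peer dead, and initdead is the dead time used right after startup. The following is a minimal sketch of a sanity check for those values. The function name check_ha_timing is hypothetical (it is not part of heartbeat), and it assumes the values are whole seconds as in the file above, so values with an "ms" suffix would be reported as non-integer.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with heartbeat): sanity-check ha.cf timing values.
# Assumes plain integer seconds, as in the ha.cf shown above.
# Usage: check_ha_timing /etc/ha.d/ha.cf
check_ha_timing() {
    cf=$1
    # Print the value of the first uncommented occurrence of a directive.
    get() { awk -v k="$1" '$1 == k { print $2; exit }' "$cf"; }
    keepalive=$(get keepalive)
    warntime=$(get warntime)
    deadtime=$(get deadtime)
    initdead=$(get initdead)
    for v in "$keepalive" "$warntime" "$deadtime" "$initdead"; do
        case $v in
            ''|*[!0-9]*) echo "missing or non-integer timing value"; return 2 ;;
        esac
    done
    # Expected ordering: keepalive < warntime < deadtime, and initdead >= 2 * deadtime.
    if [ "$keepalive" -lt "$warntime" ] && [ "$warntime" -lt "$deadtime" ] \
        && [ "$initdead" -ge $((deadtime * 2)) ]; then
        echo "ok"
    else
        echo "suspicious: expected keepalive < warntime < deadtime and initdead >= 2*deadtime"
        return 1
    fi
}
```

With the values used in this article (2, 10, 30, 120) the check passes.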

2) Configure authkeys

[root@master1 ~]# vim /etc/ha.d/authkeys
auth 1
1 sha1 47e9336850f1db6fa58bc470bc9b7810eb397f04
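
The article does not say how the sha1 key above was produced. One possible way, sketched here under that assumption, is to hash some random bytes and write the result in the authkeys format. The function name make_authkeys and the output path argument are hypothetical. heartbeat expects the authkeys file to be readable by root only (mode 600), so the sketch sets that as well.

```shell
#!/bin/sh
# Sketch: write an authkeys file containing a random sha1 key.
# The output path is a parameter so the result can be inspected before it is
# copied to /etc/ha.d/authkeys on both nodes.
make_authkeys() {
    out=$1
    # Hash 512 random bytes to obtain a 40-hex-character key.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{ print $1 }')
    # Create the file with a restrictive umask inside a subshell,
    # so the caller's umask is left unchanged.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$out" )
    chmod 600 "$out"
}
```

For example, make_authkeys /tmp/authkeys writes a two-line file in the same shape as the one above; the same file must then be used on master1 and master2.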

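3) Configure haresources

[root@master1 ~]# vim /etc/ha.d/haresources
master1 IPaddr::192.168.4.1/16/eth1
#master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld
Notes:
The full resource line stays commented out while only the VIP is being tested; the failover logs in section 5 show it enabled once DRBD and MySQL are in place.
drbddisk::data      <== starts the drbd resource "data"; equivalent to running /etc/ha.d/resource.d/drbddisk data stop/start
Filesystem::/dev/drbd1::/data::ext3     <== mounts the drbd device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 stop/start, i.e. mount /dev/drbd1 /data
mysqld              <== runs the mysql service script; equivalent to /etc/init.d/mysqld stop/start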
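
The order of the resources on a haresources line matters: as the logs in section 5 show, heartbeat starts them from left to right and stops them from right to left. The sketch below only illustrates that ordering for a given line; resource_order is a hypothetical helper and does not call heartbeat itself.

```shell
#!/bin/sh
# Hypothetical helper: print the start and stop order implied by one haresources line.
resource_order() {
    line=$1
    # Split the line into words; the first word is the preferred node name.
    set -- $line
    shift
    start=$*
    stop=
    # Build the stop order by prepending each resource in turn.
    for r in "$@"; do
        stop="$r${stop:+ $stop}"
    done
    echo "start: $start"
    echo "stop: $stop"
}

resource_order "master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld"
```

For the full line from this article, the stop order comes out as mysqld, Filesystem, drbddisk, IPaddr, which matches the release sequence on the standby node.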

(4) Start heartbeat

[root@master1 ~]# /etc/init.d/heartbeat start
[root@master1 ~]# chkconfig heartbeat off
Note: this disables start at boot; after a server reboot, heartbeat has to be started manually.

(5) Test heartbeat

Normal state

[root@master1 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.2/16 brd 192.168.255.255 scope global eth1
inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0
[root@master2 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1
Note: master1 holds the VIP address; master2 does not.

State after simulating a primary node failure

[root@master1 ~]# /etc/init.d/heartbeat stop
[root@master2 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1
inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0
Note: after master1 goes down, the VIP address moves to master2, which becomes the primary node.

State after the primary node recovers

[root@master1 ~]# /etc/init.d/heartbeat start
[root@master1 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.2/16 brd 192.168.255.255 scope global eth1
inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0
Note: master1 takes the VIP resource back (auto_failback on).
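
The checks above are done by eye. As a rough sketch of how the same check could be scripted, the function below reads `ip addr` output on standard input and succeeds only if the given address is configured. The name has_vip is hypothetical; the comparison is exact, so 192.168.4.1 does not match 192.168.4.10.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given address appears as an inet entry on stdin.
# Usage: ip addr show eth1 | has_vip 192.168.4.1
has_vip() {
    vip=$1
    awk -v vip="$vip" '
        # inet lines look like: inet 192.168.4.1/16 brd ... scope global secondary eth1:0
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

A usage example would be `ip addr show eth1 | has_vip 192.168.4.1 && echo "this node holds the VIP"`.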


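3. DRBD installation and deployment

(1) Add a new disk

[root@master1 ~]# fdisk /dev/sdb
Note: split the sdb disk into two partitions, sdb1 and sdb2.
[root@master1 ~]# partprobe
[root@master1 ~]# mkfs.ext3 /dev/sdb1
Note: sdb2 is the meta data partition and must not be formatted.
[root@master1 ~]# tune2fs -c -1 /dev/sdb1
Note: sets the maximum mount count to -1, which disables the mount-count-based file system check.

(2) Install DRBD

[root@master1 ~]# yum install kmod-drbd83 drbd83 -y
[root@master1 ~]# modprobe drbd
Warning: do not add echo 'modprobe drbd' >> /etc/rc.local to load the drbd module at boot. If the drbd service is also enabled at boot, the service starts before the module is loaded and fails to start.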

(3) Configure DRBD

The configuration file is identical on the primary and standby nodes.

[root@master1 ~]# cat /etc/drbd.conf
global {
    # minor-count 64;
    # dialog-refresh 5; # 5 seconds
    # disable-ip-verification;
    usage-count no;
}
common {
    protocol C;
    disk {
        on-io-error   detach;
        #size 454G;
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        # timeout       60;    #  6 seconds  (unit = 0.1 seconds)
        # connect-int   10;    # 10 seconds  (unit = 1 second)
        # ping-int      10;    # 10 seconds  (unit = 1 second)
        # ping-timeout   5;    # 500 ms (unit = 0.1 seconds)
        max-buffers     8000;
        unplug-watermark   1024;
        max-epoch-size  8000;
        # ko-count 4;
        # allow-two-primaries;
        cram-hmac-alg "sha1";
        shared-secret "hdhwXes23sYEhart8t";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
        # data-integrity-alg "md5";
        # no-tcp-cork;
    }
    syncer {
        rate 120M;
        al-extents 517;
    }
}
resource data {
    on master1 {
        device     /dev/drbd1;
        disk       /dev/sdb1;
        address    192.168.4.2:7788;
        meta-disk  /dev/sdb2[0];
    }
    on master2 {
        device     /dev/drbd1;
        disk       /dev/sdb1;
        address    192.168.4.3:7788;
        meta-disk  /dev/sdb2[0];
    }
}

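(4) Initialize the meta data partition

[root@master1 ~]# drbdadm create-md data
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.


(5) Start the initial device synchronization (overwrites the standby node so that both sides hold the same data)

[root@master1 ~]# drbdadm -- --overwrite-data-of-peer primary data
Note: this command only succeeds once the resource has been brought up with drbdadm up (step (6)).

(6) Start drbd

[root@master1 ~]# drbdadm up all
[root@master1 ~]# chkconfig drbd off

(7) Mount the drbd device on the data directory

[root@master1 ~]# drbdadm primary all
[root@master1 ~]# mount /dev/drbd1 /data
Note: /data is the data directory of the database.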

(8) Test DRBD

Normal state

[root@master1 ~]# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:497984 nr:0 dw:1 dr:498116 al:1 bm:31 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@master2 ~]# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:497984 dw:497984 dr:0 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Note: master1 is the primary node and master2 is the standby node.


Simulate a master1 failure

[root@master1 ~]# umount /dev/drbd1
[root@master1 ~]# drbdadm down all
[root@master2 ~]# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36
1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:497985 dw:497985 dr:0 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@master2 ~]# drbdadm primary all
[root@master2 ~]# mount /dev/drbd1 /data
[root@master2 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              19G  5.1G   13G  29% /
/dev/sda1             190M   18M  163M  10% /boot
tmpfs                  60M     0   60M   0% /dev/shm
/dev/drbd1            471M   11M  437M   3% /data
Note: after master1 goes down, master2 can be promoted to primary and can mount the drbd device to keep using the data.
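
The three fields to watch in /proc/drbd are cs (connection state), ro (local/peer roles) and ds (local/peer disk states). As a small sketch, the function below extracts them for one DRBD minor from text in the /proc/drbd format shown above. The name drbd_state is hypothetical, and the parsing assumes the DRBD 8.3 layout from this article.

```shell
#!/bin/sh
# Hypothetical helper: print "cs ro ds" for one DRBD minor, reading
# /proc/drbd-style text on stdin. Fails if the minor is not listed.
# Usage: drbd_state 1 < /proc/drbd
drbd_state() {
    minor=$1
    awk -v m="$minor:" '
        $1 == m {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, ro, ds
            found = 1
        }
        END { exit (found ? 0 : 1) }'
}
```

On a healthy primary this prints `Connected Primary/Secondary UpToDate/UpToDate`; after the peer is taken down it prints `WFConnection Secondary/Unknown UpToDate/DUnknown`, as in the transcript above.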

4. MySQL installation and deployment

Note: install MySQL on all three database servers. On master2, stop after make install. Do not enable mysqld at boot.

(1) Work around the perl build problem

echo 'export LC_ALL=C' >> /etc/profile
source /etc/profile

(2) Install CMake

cd /home/xu/tools
wget http://www.cmake.org/files/v2.8/cmake-2.8.4.tar.gz
tar zxf cmake-2.8.4.tar.gz
cd cmake-2.8.4
./configure
make && make install

(3) Create the user

groupadd mysql
useradd -g mysql mysql

(4) Compile and install MySQL

wget http://mysql.ntu.edu.tw/Downloads/MySQL-5.5/mysql-5.5.27.tar.gz
tar zxf mysql-5.5.27.tar.gz
cd mysql-5.5.27
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
-DMYSQL_UNIX_ADDR=/tmp/mysql.sock \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DWITH_EXTRA_CHARSETS=complex \
-DWITH_READLINE=1 \
-DENABLED_LOCAL_INFILE=1
make -j 4
make install

(5) Set the MySQL environment variable

[root@master1 ~]# echo 'PATH=$PATH:/usr/local/mysql/bin' >>/etc/profile
[root@master1 ~]# source /etc/profile

(6) Initialize the database

[root@master1 ~]# mount /dev/drbd1 /data
Note: the database stores its data on the drbd device.
[root@master1 ~]# cd /usr/local/mysql/
[root@master1 ~]# ./scripts/mysql_install_db --datadir=/data/ --user=mysql

(7) Start the database

[root@master1 ~]# vim /etc/init.d/mysqld
datadir=/data
Note: edit the MySQL init script so that the data directory is /data.
[root@master1 ~]# /etc/init.d/mysqld start
[root@master1 ~]# chkconfig mysqld off

(8) Test the database

[root@master1 ~]# mysql -uroot -e "show databases;"
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+

5. Failover test

(1) Normal state of the architecture

Normal state of the primary node master1

[root@master1 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.2/16 brd 192.168.255.255 scope global eth1
inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0
[root@master1 ~]# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:39558 nr:12 dw:39570 dr:151 al:16 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@master1 ~]# mysql -uroot -e "create database coral;"
[root@master1 ~]# mysql -uroot -e "show databases like 'coral';"
+------------------+
| Database (coral) |
+------------------+
| coral            |
+------------------+
Note: master1 is the primary node; it holds the VIP address and is the drbd primary.

Normal state of the standby node master2

[root@master2 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1
[root@master2 ~]# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:48 dw:48 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Note: the standby node master2 has no VIP address and is the drbd secondary.

(2) Simulate a master1 failure

[root@master1 ~]# /etc/init.d/heartbeat stop    <== simulate a master1 failure
[root@master2 ~]# tailf /var/log/ha-log         <== watch the takeover log on the standby node
heartbeat[13209]: 2013/01/23_04:09:36 info: Received shutdown notice from 'master1'.
heartbeat[13209]: 2013/01/23_04:09:36 info: Resources being acquired from master1.
heartbeat[15293]: 2013/01/23_04:09:36 info: acquire local HA resources (standby).
heartbeat[15294]: 2013/01/23_04:09:37 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys master2] to acquire.
heartbeat[15293]: 2013/01/23_04:09:37 info: local HA resource acquisition completed (standby).
heartbeat[13209]: 2013/01/23_04:09:37 info: Standby resource acquisition done [foreign].
harc[15319]:    2013/01/23_04:09:37 info: Running /etc/ha.d/rc.d/status status
mach_down[15335]:       2013/01/23_04:09:37 info: Taking over resource group IPaddr::192.168.4.1/16/eth1
ResourceManager[15361]: 2013/01/23_04:09:37 info: Acquiring resource group: master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld
IPaddr[15388]:  2013/01/23_04:09:37 INFO:  Resource is stopped
ResourceManager[15361]: 2013/01/23_04:09:37 info: Running /etc/ha.d/resource.d/IPaddr 192.168.4.1/16/eth1 start
IPaddr[15486]:  2013/01/23_04:09:38 INFO: Using calculated netmask for 192.168.4.1: 255.255.0.0
IPaddr[15486]:  2013/01/23_04:09:38 INFO: eval ifconfig eth1:0 192.168.4.1 netmask 255.255.0.0 broadcast 192.168.255.255
IPaddr[15457]:  2013/01/23_04:09:38 INFO:  Success
ResourceManager[15361]: 2013/01/23_04:09:38 info: Running /etc/ha.d/resource.d/drbddisk data start
Filesystem[15636]:      2013/01/23_04:09:39 INFO:  Resource is stopped
ResourceManager[15361]: 2013/01/23_04:09:39 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start
Filesystem[15717]:      2013/01/23_04:09:39 INFO: Running start for /dev/drbd1 on /data
Filesystem[15706]:      2013/01/23_04:09:39 INFO:  Success
ResourceManager[15361]: 2013/01/23_04:09:40 info: Running /etc/init.d/mysqld start
mach_down[15335]:       2013/01/23_04:09:44 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[15335]:       2013/01/23_04:09:44 info: mach_down takeover complete for node master1.
heartbeat[13209]: 2013/01/23_04:09:44 info: mach_down takeover complete.
heartbeat[13209]: 2013/01/23_04:10:09 WARN: node master1: is dead
heartbeat[13209]: 2013/01/23_04:10:09 info: Dead node master1 gave up resources.
heartbeat[13209]: 2013/01/23_04:10:09 info: Link master1:eth2 dead.
Note: when the standby node can no longer detect the primary node's heartbeat, it takes over the resources automatically: it brings up the VIP address, starts the drbd resource, mounts the drbd device, and starts mysqld. The data is still there after the takeover. The running services can be checked as follows:
[root@master2 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1
inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0
[root@master2 ~]# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:3 nr:95 dw:98 dr:10 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@master2 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              19G  4.7G   14G  26% /
/dev/sda1             190M   18M  163M  10% /boot
tmpfs                  60M     0   60M   0% /dev/shm
/dev/drbd1            471M   40M  408M   9% /data
[root@master2 ~]# mysql -uroot -e "show databases like 'coral';"
+------------------+
| Database (coral) |
+------------------+
| coral            |
+------------------+

(3) Simulate master1 recovering from the failure

master1 acquires the resources in this order: bring up the VIP, start the drbd resource, mount the drbd device, start mysqld. The log is as follows:

[root@master1 ~]# /etc/init.d/heartbeat start
[root@master1 ~]# tailf /var/log/ha-log
heartbeat[27970]: 2013/01/09_17:34:14 info: Version 2 support: no
heartbeat[27970]: 2013/01/09_17:34:14 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[27970]: 2013/01/09_17:34:14 info: **************************
heartbeat[27970]: 2013/01/09_17:34:14 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[27971]: 2013/01/09_17:34:14 info: heartbeat: version 2.1.3
heartbeat[27971]: 2013/01/09_17:34:14 info: Heartbeat generation: 1351554533
heartbeat[27971]: 2013/01/09_17:34:14 info: glib: UDP multicast heartbeat started for group 225.0.0.7 port 694 interface eth2 (ttl=1 loop=0)
heartbeat[27971]: 2013/01/09_17:34:14 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[27971]: 2013/01/09_17:34:14 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[27971]: 2013/01/09_17:34:14 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[27971]: 2013/01/09_17:34:14 info: Local status now set to: 'up'
heartbeat[27971]: 2013/01/09_17:34:16 info: Link master2:eth2 up.
heartbeat[27971]: 2013/01/09_17:34:16 info: Status update for node master2: status active
harc[27978]:    2013/01/09_17:34:16 info: Running /etc/ha.d/rc.d/status status
heartbeat[27971]: 2013/01/09_17:34:17 info: Comm_now_up(): updating status to active
heartbeat[27971]: 2013/01/09_17:34:17 info: Local status now set to: 'active'
heartbeat[27971]: 2013/01/09_17:34:17 info: remote resource transition completed.
heartbeat[27971]: 2013/01/09_17:34:17 info: remote resource transition completed.
heartbeat[27971]: 2013/01/09_17:34:17 info: Local Resource acquisition completed. (none)
heartbeat[27971]: 2013/01/09_17:34:18 info: master2 wants to go standby [foreign]
heartbeat[27971]: 2013/01/09_17:34:20 info: standby: acquire [foreign] resources from master2
heartbeat[27997]: 2013/01/09_17:34:20 info: acquire local HA resources (standby).
ResourceManager[28010]: 2013/01/09_17:34:20 info: Acquiring resource group: master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld
IPaddr[28037]:  2013/01/09_17:34:21 INFO:  Resource is stopped
ResourceManager[28010]: 2013/01/09_17:34:21 info: Running /etc/ha.d/resource.d/IPaddr 192.168.4.1/16/eth1 start
IPaddr[28135]:  2013/01/09_17:34:21 INFO: Using calculated netmask for 192.168.4.1: 255.255.0.0
IPaddr[28135]:  2013/01/09_17:34:21 INFO: eval ifconfig eth1:0 192.168.4.1 netmask 255.255.0.0 broadcast 192.168.255.255
IPaddr[28106]:  2013/01/09_17:34:21 INFO:  Success
ResourceManager[28010]: 2013/01/09_17:34:21 info: Running /etc/ha.d/resource.d/drbddisk data start
Filesystem[28286]:      2013/01/09_17:34:21 INFO:  Resource is stopped
ResourceManager[28010]: 2013/01/09_17:34:21 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start
Filesystem[28367]:      2013/01/09_17:34:21 INFO: Running start for /dev/drbd1 on /data
Filesystem[28356]:      2013/01/09_17:34:21 INFO:  Success
ResourceManager[28010]: 2013/01/09_17:34:22 info: Running /etc/init.d/mysqld start
heartbeat[27997]: 2013/01/09_17:34:25 info: local HA resource acquisition completed (standby).
heartbeat[27971]: 2013/01/09_17:34:25 info: Standby resource acquisition done [foreign].
heartbeat[27971]: 2013/01/09_17:34:25 info: Initial resource acquisition complete (auto_failback)
heartbeat[27971]: 2013/01/09_17:34:25 info: remote resource transition completed.

The standby node releases the resources in this order: stop mysqld, unmount the drbd1 device, demote drbd to secondary, take down the VIP address. The log is as follows:

[root@master2 ~]# tailf /var/log/ha-log
heartbeat[13209]: 2013/01/23_04:26:53 info: Heartbeat restart on node master1
heartbeat[13209]: 2013/01/23_04:26:53 info: Link master1:eth2 up.
heartbeat[13209]: 2013/01/23_04:26:53 info: Status update for node master1: status init
heartbeat[13209]: 2013/01/23_04:26:53 info: Status update for node master1: status up
harc[16151]:    2013/01/23_04:26:53 info: Running /etc/ha.d/rc.d/status status
harc[16167]:    2013/01/23_04:26:53 info: Running /etc/ha.d/rc.d/status status
heartbeat[13209]: 2013/01/23_04:26:53 info: all clients are now paused
heartbeat[13209]: 2013/01/23_04:26:55 info: Status update for node master1: status active
harc[16183]:    2013/01/23_04:26:55 info: Running /etc/ha.d/rc.d/status status
heartbeat[13209]: 2013/01/23_04:26:55 info: all clients are now resumed
heartbeat[13209]: 2013/01/23_04:26:55 info: remote resource transition completed.
heartbeat[13209]: 2013/01/23_04:26:55 info: master2 wants to go standby [foreign]
heartbeat[13209]: 2013/01/23_04:26:55 info: standby: master1 can take our foreign resources
heartbeat[16199]: 2013/01/23_04:26:55 info: give up foreign HA resources (standby).
ResourceManager[16212]: 2013/01/23_04:26:55 info: Releasing resource group: master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld
ResourceManager[16212]: 2013/01/23_04:26:55 info: Running /etc/init.d/mysqld stop
ResourceManager[16212]: 2013/01/23_04:26:57 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 stop
Filesystem[16297]:      2013/01/23_04:26:57 INFO: Running stop for /dev/drbd1 on /data
Filesystem[16297]:      2013/01/23_04:26:57 INFO: Trying to unmount /data
Filesystem[16297]:      2013/01/23_04:26:57 INFO: unmounted /data successfully
Filesystem[16286]:      2013/01/23_04:26:57 INFO:  Success
ResourceManager[16212]: 2013/01/23_04:26:57 info: Running /etc/ha.d/resource.d/drbddisk data stop
ResourceManager[16212]: 2013/01/23_04:26:57 info: Running /etc/ha.d/resource.d/IPaddr 192.168.4.1/16/eth1 stop
IPaddr[16445]:  2013/01/23_04:26:58 INFO: ifconfig eth1:0 down
IPaddr[16416]:  2013/01/23_04:26:58 INFO:  Success
heartbeat[16199]: 2013/01/23_04:26:58 info: foreign HA resource release completed (standby).
heartbeat[13209]: 2013/01/23_04:26:58 info: Local standby process completed [foreign].
heartbeat[13209]: 2013/01/23_04:27:02 WARN: 1 lost packet(s) for [master1] [15:17]
heartbeat[13209]: 2013/01/23_04:27:02 info: remote resource transition completed.
heartbeat[13209]: 2013/01/23_04:27:02 info: No pkts missing from master1!
heartbeat[13209]: 2013/01/23_04:27:02 info: Other node completed standby takeover of foreign resources.


6. Replicating the slave through the VIP

(1) Master configuration

1) Set server-id and enable the binlog

[root@master1 ~]# vim /etc/my.cnf
log-bin=/usr/local/mysql/mysql-bin
server-id = 3
[root@master1 ~]# /etc/init.d/mysqld restart
Note: only master1 is restarted. master2 needs no restart, because MySQL is not running on the standby node; there it is only ever started by heartbeat.


2) Create and authorize the replication account rep

[root@master1 ~]# mysql -uroot -p
mysql> GRANT REPLICATION SLAVE ON *.* TO 'rep'@'192.168.4.%' IDENTIFIED BY 'rep';

(2) Slave configuration

1) Set server-id and leave the binlog disabled

[root@slave1 ~]# vim /etc/my.cnf
#log-bin=mysql-bin
server-id = 4
[root@slave1 ~]# /etc/init.d/mysqld restart
Note: a slave does not need the binlog unless it feeds cascaded replication or is used for incremental MySQL backups.

2) Configure the replication parameters

[root@slave1 ~]# mysql -uroot
mysql> CHANGE MASTER TO
MASTER_HOST='192.168.4.1',
MASTER_PORT=3306,
MASTER_USER='rep',
MASTER_PASSWORD='rep',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=0;
mysql> START SLAVE;

3) Check that replication is running

[root@slave1 ~]# mysql -uroot
mysql> show slave status\G
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
...
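
Replication is healthy only when both the I/O thread and the SQL thread report Yes. The following is a minimal sketch of that check as a script. The name slave_ok is hypothetical; it reads the output of show slave status\G on standard input, so it can be tested without a database.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both replication threads report "Yes".
# Usage: mysql -uroot -e 'show slave status\G' | slave_ok
slave_ok() {
    awk -F': *' '
        # Field names are right-aligned in \G output, so strip leading blanks.
        { gsub(/^[ \t]+/, "", $1) }
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit ((io == "Yes" && sql == "Yes") ? 0 : 1) }'
}
```

A state such as `Slave_IO_Running: Connecting`, which the slave reports while the VIP is moving between masters, makes the check fail.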

(3) Check whether an HA failover affects slave replication

1) Normal replication state

[root@master1 ~]# mysql -uroot
mysql> create database coral1;
Query OK, 1 row affected (0.02 sec)
[root@slave1 ~]# mysql -uroot -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@slave1 ~]# mysql -uroot -e "show databases like 'coral%';"
+-------------------+
| Database (coral%) |
+-------------------+
| coral1            |
+-------------------+

2) Simulate a failure of the HA primary node

[root@master1 ~]# /etc/init.d/heartbeat stop
Note: this simulates a failure of the primary node.
[root@master2 ~]# ip addr|grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1
inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0
[root@master2 ~]# mysql -uroot
mysql> create database coral2;
Query OK, 1 row affected (0.08 sec)
Note: the VIP address has moved to master2.
[root@slave1 ~]# mysql -uroot -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@slave1 ~]# mysql -uroot -e "show databases like 'coral%'"
+-------------------+
| Database (coral%) |
+-------------------+
| coral1            |
| coral2            |
+-------------------+
Warning: during a switch between the HA primary and standby nodes, it takes a while before the slave can reconnect, roughly up to 60 seconds.
Note: replication is working normally at this point.
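
Because the slave may need up to about a minute to reconnect after a switch, a test script should poll rather than check once. The sketch below is one way to do that; wait_until is a hypothetical helper that retries any command at a fixed interval until it succeeds or a timeout in seconds is reached.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the timeout expires.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS command [args...]
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        # Give up once the accumulated sleep time reaches the timeout.
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

For example, `wait_until 90 5 sh -c 'mysql -uroot -e "show slave status\G" | grep -q "Slave_IO_Running: Yes"'` on slave1 would wait up to 90 seconds for the I/O thread to come back; the 90 and 5 are illustrative values, not figures from this article.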

3) Simulate recovery of the HA primary node

[root@master1 ~]# /etc/init.d/heartbeat start
[root@master1 ~]# mysql -uroot
mysql> create database coral3;
[root@slave1 ~]# mysql -uroot -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@slave1 ~]# mysql -uroot -e "show databases like 'coral%'"
+-------------------+
| Database (coral%) |
+-------------------+
| coral1            |
| coral2            |
| coral3            |
+-------------------+
Note: recovery of the HA primary node does not affect replication either.

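7. Split-brain in high-availability setups and how to handle it

(1) Causes of split-brain

1. The heartbeat link between the HA servers fails, so the nodes cannot check each other's heartbeat.

2. A firewall enabled on the HA servers blocks heartbeat traffic.

3. NIC addresses or related settings on the HA servers are misconfigured, so heartbeats cannot be sent.

4. Other misconfiguration, such as mismatched heartbeat methods, conflicting heartbeat broadcasts, or software bugs.

(2) Ways to prevent split-brain

1. Add redundant heartbeat links.

2. When split-brain is detected, forcibly shut one node down (fencing: power off the primary node remotely through a power-control circuit).

3. Monitor for split-brain and raise alerts.

4. After an alert, have the standby node wait a fairly long time before taking over, so that operations staff have enough time to intervene manually.

5. Use a disk lock: the serving node locks the disk so that, when split-brain occurs, the peer cannot seize the "shared disk resource".

Problems with disk locks:

A disk lock can deadlock: if the node holding the shared disk never releases the lock, the other node can never obtain the disk. If the serving node suddenly hangs or crashes, it cannot run the unlock command, so the standby node cannot take over resources and services. Some HA designs therefore use a "smart" lock: the serving node engages the disk lock only when it finds that all heartbeat links are down, and leaves the disk unlocked otherwise.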
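
For the monitoring item in the list above, one signal that is already available in this setup is the DRBD connection state. The sketch below assumes that, with the after-sb-*pri disconnect policies configured in drbd.conf, a DRBD split brain leaves the resource in the StandAlone state, and treats any other non-Connected state (such as WFConnection) as a warning. The function name drbd_alert_level and the OK/WARN/CRIT/UNKNOWN labels are hypothetical choices, not part of DRBD or heartbeat.

```shell
#!/bin/sh
# Hypothetical monitoring check: classify the DRBD connection state found in
# /proc/drbd-style text on stdin. Prints OK, WARN, CRIT or UNKNOWN.
# Usage: drbd_alert_level < /proc/drbd
drbd_alert_level() {
    awk '
        / cs:/ {
            for (i = 1; i <= NF; i++) if ($i ~ /^cs:/) cs = substr($i, 4)
            if (cs == "StandAlone")      level = "CRIT"   # assumed split-brain indicator
            else if (cs != "Connected") { if (level != "CRIT") level = "WARN" }
            else if (level == "")        level = "OK"
        }
        END { print (level == "" ? "UNKNOWN" : level) }'
}
```

A cron job or monitoring agent could run this on both masters and alert on anything other than OK; checking that the VIP is not held by both nodes at once would be a useful complementary test.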

