HeartBeat+NFS配置
接上一篇的DRBD的系统环境及安装配置继续….
一、Hearbeat配置
1、安装heartbeat
1 2 |
# yum install epel-release -y # yum --enablerepo=epel install heartbeat -y |
2、设置heartbeat配置文件
(node1)
编辑ha.cf,添加下面配置:
1 2 3 4 5 6 7 8 |
# vi /etc/ha.d/ha.cf logfile /var/log/ha-log logfacility local0 keepalive 2 deadtime 5 ucast eth0 192.168.0.192 # 指定对方网卡及IP auto_failback off node drbd1.corp.com drbd2.corp.com |
(node2)
编辑ha.cf,添加下面配置:
1 2 3 4 5 6 7 8 |
# vi /etc/ha.d/ha.cf logfile /var/log/ha-log logfacility local0 keepalive 2 deadtime 5 ucast eth0 192.168.0.191 auto_failback off node drbd1.corp.com drbd2.corp.com |
3、编辑双机互联验证文件authkeys,添加以下内容:(node1,node2)
1 2 3 |
# vi /etc/ha.d/authkeys auth 1 1 crc |
给验证文件600权限
1 |
# chmod 600 /etc/ha.d/authkeys |
4、编辑集群资源文件:(node1,node2)
1 2 |
# vi /etc/ha.d/haresources drbd1.corp.com IPaddr::192.168.0.190/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/store::ext4 killnfsd |
注:该文件内IPaddr,Filesystem等脚本存放路径在/etc/ha.d/resource.d/下,也可在该目录下存放服务启动脚本(例如:mysql,www),将相同脚本名称添加到/etc/ha.d/haresources内容中,从而跟随heartbeat启动而启动该脚本。
IPaddr::192.168.0.190/24/eth0:用IPaddr脚本配置对外服务的浮动虚拟IP
drbddisk::r0:用drbddisk脚本实现DRBD主从节点资源组的挂载和卸载
Filesystem::/dev/drbd0::/store::ext4:用Filesystem脚本实现磁盘挂载和卸载
5、编辑脚本文件killnfsd,用来重启NFS服务:(node1,node2)
1 2 |
# vi /etc/ha.d/resource.d/killnfsd killall -9 nfsd; /etc/init.d/nfs restart;exit 0 |
赋予755执行权限:
1 |
# chmod 755 /etc/ha.d/resource.d/killnfsd |
二、创建DRBD脚本文件drbddisk:(node1,node2)
编辑drbddisk,添加下面的脚本内容
1 |
# vi /etc/ha.d/resource.d/drbddisk |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 |
#!/bin/bash # # This script is inteded to be used as resource script by heartbeat # # Copright 2003-2008 LINBIT Information Technologies # Philipp Reisner, Lars Ellenberg # ### DEFAULTFILE="/etc/default/drbd" DRBDADM="/sbin/drbdadm" if [ -f $DEFAULTFILE ]; then . $DEFAULTFILE fi if [ "$#" -eq 2 ]; then RES="$1" CMD="$2" else RES="all" CMD="$1" fi ## EXIT CODES # since this is a "legacy heartbeat R1 resource agent" script, # exit codes actually do not matter that much as long as we conform to # http://wiki.linux-ha.org/HeartbeatResourceAgent # but it does not hurt to conform to lsb init-script exit codes, # where we can. # http://refspecs.linux-foundation.org/LSB_3.1.0/ #LSB-Core-generic/LSB-Core-generic/iniscrptact.html #### drbd_set_role_from_proc_drbd() { local out if ! test -e /proc/drbd; then ROLE="Unconfigured" return fi dev=$( $DRBDADM sh-dev $RES ) minor=${dev#/dev/drbd} if [[ $minor = *[!0-9]* ]] ; then # sh-minor is only supported since drbd 8.3.1 minor=$( $DRBDADM sh-minor $RES ) fi if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then ROLE=Unknown return fi if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then set -- $out ROLE=${5%/**} : ${ROLE:=Unconfigured} # if it does not show up else ROLE=Unknown fi } case "$CMD" in start) # try several times, in case heartbeat deadtime # was smaller than drbd ping time try=6 while true; do $DRBDADM primary $RES && break let "--try" || exit 1 # LSB generic error sleep 1 done ;; stop) # heartbeat (haresources mode) will retry failed stop # for a number of times in addition to this internal retry. try=3 while true; do $DRBDADM secondary $RES && break # We used to lie here, and pretend success for anything != 11, # to avoid the reboot on failed stop recovery for "simple # config errors" and such. But that is incorrect. # Don't lie to your cluster manager. # And don't do config errors... let --try || exit 1 # LSB generic error sleep 1 done ;; status) if [ "$RES" = "all" ]; then echo "A resource name is required for status inquiries." exit 10 fi ST=$( $DRBDADM role $RES ) ROLE=${ST%/**} case $ROLE in Primary|Secondary|Unconfigured) # expected ;; *) # unexpected. whatever... # If we are unsure about the state of a resource, we need to # report it as possibly running, so heartbeat can, after failed # stop, do a recovery by reboot. # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is # suddenly readonly. So we retry by parsing /proc/drbd. drbd_set_role_from_proc_drbd esac case $ROLE in Primary) echo "running (Primary)" exit 0 # LSB status "service is OK" ;; Secondary|Unconfigured) echo "stopped ($ROLE)" exit 3 # LSB status "service is not running" ;; *) # NOTE the "running" in below message. # this is a "heartbeat" resource script, # the exit code is _ignored_. echo "cannot determine status, may be running ($ROLE)" exit 4 # LSB status "service status is unknown" ;; esac ;; *) echo "Usage: drbddisk [resource] {start|stop|status}" exit 1 ;; esac exit 0 |
赋予755执行权限:
1 |
# chmod 755 /etc/ha.d/resource.d/drbddisk |
三、启动HeartBeat服务
在两个节点上启动HeartBeat服务,先启动node1:(node1,node2)
1 2 |
# service heartbeat start # chkconfig heartbeat on |
现在从其他机器能够ping通虚IP 192.168.0.190,表示配置成功
四、配置NFS:(node1,node2)
编辑exports配置文件,添加以下配置:
1 2 |
# vi /etc/exports /store *(rw,no_root_squash) |
重启NFS服务:
1 2 3 4 |
# service rpcbind restart # service nfs restart # chkconfig rpcbind on # chkconfig nfs off |
注:这里设置NFS开机不要自动运行,因为/etc/ha.d/resource.d/killnfsd 该脚本会控制NFS的启动。
五、测试高可用
1、正常热备切换
在客户端挂载NFS共享目录
1 |
# mount -t nfs 192.168.0.190:/store /tmp |
模拟将主节点node1 的heartbeat服务停止,则备节点node2会立即无缝接管;测试客户端挂载的NFS共享读写正常。
此时备机node2上的DRBD状态:
1 2 3 4 5 6 |
# service drbd status drbd driver loaded OK; device status: version: 8.4.3 (api:1/proto:86-101) GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:41 m:res cs ro ds p mounted fstype 0:r0 Connected Primary/Secondary UpToDate/UpToDate C /store ext4 |
2、异常宕机切换
强制关机,直接关闭node1电源
node2节点也会立即无缝接管,测试客户端挂载的NFS共享读写正常。
此时node2上的DRBD状态:
1 2 3 4 5 6 |
# service drbd status drbd driver loaded OK; device status: version: 8.4.3 (api:1/proto:86-101) GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:41 m:res cs ro ds p mounted fstype 0:r0 Connected Primary/Unknown UpToDate/DUnknown C /store ext4 |