Friday, January 22, 2010

Heartbeat2+DRBD

SkyHi @ Friday, January 22, 2010
Prerequisites
- Set up a minimal CentOS 5 install
- Make sure both nodes can resolve each other's names correctly (either through DNS or /etc/hosts)
- yum update (as usual …)
- yum install heartbeat drbd kmod-drbd (available in the extras repository)
Current situation
* node1 192.168.0.11/24 , source disc /dev/sdb that will be replicated
* node2 192.168.0.12/24 , target disc /dev/sdb
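Since both nodes must resolve each other's names, one quick check is to confirm that the hosts file maps each name to the address listed above. This is only a minimal sketch, assuming the names are kept in /etc/hosts rather than DNS; hosts_has is a hypothetical helper, not part of any package.

```shell
# Hypothetical helper: succeed if FILE (hosts format) maps IP to NAME.
# Usage: hosts_has FILE IP NAME
hosts_has() {
    awk -v ip="$2" -v name="$3" '
        $1 ~ /^#/ { next }                     # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit !found }
    ' "$1"
}

# Example, to be run on each node:
# hosts_has /etc/hosts 192.168.0.11 node1 && hosts_has /etc/hosts 192.168.0.12 node2 && echo "names OK"
```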
DRBD Configuration
vi /etc/drbd.conf
global { usage-count no; }
resource repdata {
protocol C;
startup { wfc-timeout 0; degr-wfc-timeout 120; }
disk { on-io-error detach; } # or panic, …
net { cram-hmac-alg "sha1"; shared-secret "Cent0Sru!3z"; } # don't forget to choose your own secret for auth!
syncer { rate 10M; }
on node1 {
device /dev/drbd0;
disk /dev/sdb;
address 192.168.0.11:7788;
meta-disk internal;
}
on node2 {
device /dev/drbd0;
disk /dev/sdb;
address 192.168.0.12:7788;
meta-disk internal;
}
}

scp /etc/drbd.conf root@node2:/etc/
- Initialize the meta-data area on disk before starting drbd (on both nodes!)
[root@node1 etc]# drbdadm create-md repdata
[root@node2 etc]# drbdadm create-md repdata
- Start drbd on both nodes:
[root@node1 etc]# service drbd start
[root@node2 etc]# service drbd start
- Promote node1 to primary, which starts the initial full sync to node2, and watch the progress:
[root@node1 etc]# drbdadm -- --overwrite-data-of-peer primary repdata
[root@node1 etc]# watch -n 1 cat /proc/drbd
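Watching /proc/drbd works for interactive use; a script can read the same file. The sketch below assumes the DRBD 8.x status layout, where each device line starts with its minor number and carries a cs: (connection state) and a ds: (disk states) field. drbd_in_sync is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if DRBD minor MINOR is connected and both
# disks are UpToDate. Reads STATUS_FILE (default /proc/drbd).
# Usage: drbd_in_sync MINOR [STATUS_FILE]
drbd_in_sync() {
    awk -v dev="$1:" '
        $1 == dev {
            for (i = 2; i <= NF; i++) {
                if ($i == "cs:Connected")         connected = 1
                if ($i == "ds:UpToDate/UpToDate") uptodate  = 1
            }
        }
        END { exit !(connected && uptodate) }
    ' "${2:-/proc/drbd}"
}

# Example: block until the initial sync of /dev/drbd0 has finished.
# until drbd_in_sync 0; do sleep 5; done
```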
- we can now format /dev/drbd0 and mount it on node1 : mkfs.ext3 /dev/drbd0 ; mkdir /repdata ; mount /dev/drbd0 /repdata
- create some fake data on node 1 :
[root@node1 etc]# for i in {1..5};do dd if=/dev/zero of=/repdata/file$i bs=1M count=100;done
- now switch manually to the second node :
[root@node1 /]# umount /repdata ; drbdadm secondary repdata
[root@node2 /]# mkdir /repdata ; drbdadm primary repdata ; mount /dev/drbd0 /repdata
[root@node2 /]# ls /repdata/
file1 file2 file3 file4 file5 lost+found
Great, the data was replicated. Now let's delete one file and add another:
[root@node2 /]# rm /repdata/file2 ; dd if=/dev/zero of=/repdata/file6 bs=100M count=2
- Now switch back to the first node :
[root@node2 /]# umount /repdata/ ; drbdadm secondary repdata
[root@node1 /]# drbdadm primary repdata ; mount /dev/drbd0 /repdata
[root@node1 /]# ls /repdata/
file1 file3 file4 file5 file6 lost+found
OK, DRBD is working. Let's make sure it is always started at boot: chkconfig drbd on
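The manual switchover above always follows the same order: unmount and demote on the current primary, then promote and mount on the other node. As a sketch, the hypothetical switchover_plan function below only prints that sequence for review and executes nothing. The default resource, device and mount point are the ones used in this post.

```shell
# Hypothetical helper: print (do not run) the switchover steps in order.
# Usage: switchover_plan FROM_NODE TO_NODE [RESOURCE] [DEVICE] [MOUNTPOINT]
switchover_plan() {
    from="$1"; to="$2"
    res="${3:-repdata}"; dev="${4:-/dev/drbd0}"; mnt="${5:-/repdata}"
    # The filesystem must be unmounted before DRBD will allow the demotion.
    echo "[$from] umount $mnt"
    echo "[$from] drbdadm secondary $res"
    # Only a primary can mount the device, so promote before mounting.
    echo "[$to] drbdadm primary $res"
    echo "[$to] mount $dev $mnt"
}

switchover_plan node1 node2
```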
Heartbeat V2 Configuration
vi /etc/ha.d/ha.cf
keepalive 1
deadtime 30
warntime 10
initdead 120
bcast eth0
node node1
node node2
crm yes
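The timing values in ha.cf relate to each other: keepalive should be shorter than warntime, warntime shorter than deadtime, and a commonly cited rule of thumb is an initdead of at least twice deadtime. The sketch below checks those relations, assuming the values are plain seconds (no ms suffix); check_ha_timing is a hypothetical helper, and the thresholds are a rule of thumb rather than hard limits.

```shell
# Hypothetical helper: succeed if keepalive < warntime < deadtime
# and initdead >= 2 * deadtime in the given ha.cf.
# Usage: check_ha_timing HA_CF
check_ha_timing() {
    awk '
        $1 == "keepalive" { k = $2 + 0 }
        $1 == "warntime"  { w = $2 + 0 }
        $1 == "deadtime"  { d = $2 + 0 }
        $1 == "initdead"  { i = $2 + 0 }
        END {
            ok = (k < w) && (w < d) && (i >= 2 * d)
            exit !ok
        }
    ' "$1"
}

# Example: check_ha_timing /etc/ha.d/ha.cf && echo "timing looks consistent"
```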

vi /etc/ha.d/authkeys (with permissions 600 !!!) :
auth 1
1 sha1 MySecret
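Rather than typing a literal secret such as MySecret, the key can be generated. This is a minimal sketch, assuming /dev/urandom and the standard head, od and tr utilities are available. write_authkeys is a hypothetical helper, and the file it writes uses the same two-line format shown above.

```shell
# Hypothetical helper: write an authkeys file with a random sha1 key,
# readable and writable by its owner only (mode 600).
# Usage: write_authkeys FILE
write_authkeys() {
    # 20 random bytes rendered as 40 hex characters
    key=$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')
    (
        umask 077                      # a newly created file is never world-readable
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$1"
    )
    chmod 600 "$1"                     # also fixes the mode of an existing file
}

# Example (as root): write_authkeys /etc/ha.d/authkeys
```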

Start the heartbeat service on node1 :
[root@node1 ha.d]# service heartbeat start
Starting High-Availability services: [OK]
Check the cluster status :
[root@node1 ha.d]# crm_mon
Now copy ha.cf and authkeys to node2 and start heartbeat there:
[root@node1 ha.d]# scp /etc/ha.d/ha.cf /etc/ha.d/authkeys root@node2:/etc/ha.d/
[root@node2 ha.d]# service heartbeat start
Verify cluster with crm_mon :
=====
Last updated: Wed Sep 12 16:20:39 2007
Current DC: node1.centos.org (6cb712e4-4e4f-49bf-8200-4f15d6bd7385)
2 Nodes configured.
0 Resources configured.
=====
Node: node1 (6cb712e4-4e4f-49bf-8200-4f15d6bd7385): online
Node: node2 (f6112aae-8e2b-403f-ae93-e5fd4ac4d27e): online
vi /var/lib/heartbeat/crm/cib.xml
 
Firewall considerations
You will need to make sure that the nodes can reach each other on these ports:
DRBD: 7788 (TCP)
HEARTBEAT: 694 (UDP)
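As a sketch of what opening these ports could look like with iptables: DRBD replicates over TCP, and Heartbeat's bcast medium uses UDP. The hypothetical peer_rules function below only prints candidate rules for the peer's address and changes nothing. The INPUT chain is an assumption; a stock CentOS 5 firewall may use a differently named chain.

```shell
# Hypothetical helper: print (do not apply) iptables rules that accept
# DRBD and Heartbeat traffic from the peer node.
# Usage: peer_rules PEER_IP
peer_rules() {
    peer="$1"
    echo "iptables -I INPUT -p tcp -s $peer --dport 7788 -j ACCEPT"   # DRBD replication
    echo "iptables -I INPUT -p udp -s $peer --dport 694 -j ACCEPT"    # Heartbeat
}

# On node1 the peer is node2:
peer_rules 192.168.0.12
```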
Other
Generate cib.xml from haresources
/usr/local/lib/heartbeat/haresources2cib.py --stdout -c /etc/ha.d/ha.cf /etc/ha.d/haresources
1) List all resources
crm_resource -L
2) Show which node a resource is running on
crm_resource -W -r DRBD_data
4) Start/stop a resource
crm_resource -r DRBD_data -p target_role -v started
crm_resource -r DRBD_data -p target_role -v stopped
5) Show a resource's definition in cib.xml
crm_resource -x -r DRBD_data
6) Move a resource from its current node to another node
crm_resource -M -r DRBD_data
7) Move a resource to a specific node
crm_resource -M -r DRBD_data -H node2
8) Allow a resource to return to its normal node
crm_resource -U -r DRBD_data
9) Delete a resource from the CRM
crm_resource -D -r DRBD_data -t primitive
10) Delete a resource group from the CRM
crm_resource -D -r My-DRBD-group -t group
11) Stop the CRM from managing a resource
crm_resource -p is_managed -r DRBD_data -t primitive -v off
12) Let the CRM manage the resource again
crm_resource -p is_managed -r DRBD_data -t primitive -v on
13) Clean up (restart) a resource on a node
crm_resource -C -H node2 -r DRBD_data
14) Check all nodes for resources not under CRM control
crm_resource -P
15) Check a specific node for resources not under CRM control
crm_resource -P -H node2

REFERENCES
http://yemaosheng.com/?p=333