One of the huge benefits of going with Open Source is the ability to get Enterprise-grade solutions for decidedly non-Enterprise costs. We have a fairly typical Web site set up with two load balancers, two Apache/Tomcat servers and two Postgres database boxes. We wanted to ensure that in the event of a failure of any of those machines, we would automatically recover and continue providing services.
Database
We have two machines, both with Postgres 8.1 installed (the latest version provided as part of CentOS 5.2). While apparently 8.3 can work in active/active mode, we decided to stick with 8.1 to reduce dependency hell with everything else on the machines and work with DRBD. Setup is incredibly simple – on both nodes we created an /etc/drbd.conf file which had:

global { usage-count no; }
common { protocol C; }
resource r0 {
  device    /dev/drbd1;
  disk      /dev/LVMGroup/LVMVolume;
  meta-disk internal;
  on <node1-hostname> { address <node1-ip>:7789; }
  on <node2-hostname> { address <node2-ip>:7789; }
}

(with the hostnames and addresses of your two nodes in the placeholders)
We then ran:

# drbdadm create-md r0                               <-- on both nodes
# service drbd start                                 <-- on both nodes
# drbdadm -- --overwrite-data-of-peer primary r0     <-- on the primary node

This started DRBD and allowed the primary node to sync to the secondary. For more details about this (and the heartbeat configuration below), have a look at this excellent CentOS HOWTO.
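The initial sync can take a while on a big volume; you can watch its progress from either node with something like:

# cat /proc/drbd          <-- shows the connection state and sync percentage
# service drbd status     <-- the same information via the init script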
Then we needed to configure heartbeat to manage the automatic failover for us. Create /etc/ha.d/ha.cf on both nodes to contain:

keepalive 2
deadtime 30
warntime 10
initdead 120
bcast eth0
node <node1-hostname>
node <node2-hostname>
crm yes

The /etc/ha.d/authkeys on both nodes should contain:
auth 1
1 sha1 <shared-secret>
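One thing that commonly trips people up: heartbeat will refuse to start if authkeys is readable by anyone other than root, so lock it down (and, if you want heartbeat back after a reboot, enable it) on both nodes:

# chmod 600 /etc/ha.d/authkeys    <-- on both nodes
# chkconfig heartbeat on          <-- on both nodes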
This will then result in a working heartbeat. Start the heartbeat service on both nodes, wait a few minutes, and the command crm_mon will show you a running cluster:

[root@<node1> ha.d]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Tue Dec 16 14:29:03 2008
Current DC: <node1> (33b76ea8-7368-442f-aef3-26916c567166)
2 Nodes configured.
0 Resources configured.
============
Node: <node1> (33b76ea8-7368-442f-aef3-26916c567166): online
Node: <node2> (bbccba14-0f40-4b1c-bc5d-8c03d9435a37): online

Then run hb_gui to configure the resources for heartbeat. The file that this GUI configures is in /var/lib/heartbeat/crm and is defined via XML. While I would prefer to configure it manually, I haven't worked out how to do that yet and the hb_gui tool is very easy to use.
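If you do want to poke at the XML directly, the CRM command-line tools that ship with heartbeat 2 can at least dump the live configuration; I haven't driven the whole setup that way myself, so treat this as a pointer rather than a recipe:

# cibadmin -Q    <-- dump the running CIB (the XML that hb_gui edits) to stdout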
Using hb_gui, you can create a resource group for the clustered services (click on Resources and Plus and select Group). Then within that group you need to configure four resources: a virtual IP address that can be used to communicate with the primary node, a drbddisk resource, a filesystem resource for the DRBD filesystem, and a resource for postgres. To take each in turn (a rough sketch of the resulting XML follows the list):
- IP Address – click on Plus and select Native, change the resource name to ip_<name>, select the group it should belong to, then select IPaddr from the list and click on Add Parameter. Then enter ip and a virtual IP address for the cluster. Add another parameter nic and select the interface for this to be configured against (i.e. eth0). Then click on OK.
- drbddisk resource – Same procedure, but this time select drbddisk instead of IPaddr and select Add Parameter. Then enter 1 and the name of the drbd resource created (r0 in our case).
- filesystem – Same again, but select Filesystem and add the following parameters:
- device, /dev/drbd1 (in this example)
- directory, /var/lib/pgsql (for postgres)
- type, ext3 (or the filesystem you have created on it)
- postgres – Lastly add a postgres resource with no parameters.
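For orientation, the group that hb_gui writes into the CIB ends up looking roughly like the sketch below. This is reconstructed from memory of the heartbeat 2 XML schema rather than copied from our cluster; the ids, the virtual IP and the choice of pgsql as the postgres agent are assumptions, so treat it purely as an illustration of the structure:

<group id="grp_postgres">
  <!-- virtual IP for the cluster (the address here is made up) -->
  <primitive id="ip_cluster" class="ocf" provider="heartbeat" type="IPaddr">
    <instance_attributes id="ip_cluster_ia">
      <attributes>
        <nvpair id="ip_cluster_ip" name="ip" value="10.0.0.100"/>
        <nvpair id="ip_cluster_nic" name="nic" value="eth0"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <!-- drbddisk: makes r0 primary on whichever node holds the group -->
  <primitive id="drbddisk_r0" class="heartbeat" type="drbddisk">
    <instance_attributes id="drbddisk_r0_ia">
      <attributes>
        <nvpair id="drbddisk_r0_1" name="1" value="r0"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <!-- filesystem: mounts the DRBD device over the postgres data directory;
       the filesystem-type parameter added in hb_gui is omitted here -->
  <primitive id="fs_pgsql" class="ocf" provider="heartbeat" type="Filesystem">
    <instance_attributes id="fs_pgsql_ia">
      <attributes>
        <nvpair id="fs_pgsql_device" name="device" value="/dev/drbd1"/>
        <nvpair id="fs_pgsql_directory" name="directory" value="/var/lib/pgsql"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <!-- postgres: assumed to be the ocf pgsql agent, no parameters -->
  <primitive id="postgres" class="ocf" provider="heartbeat" type="pgsql"/>
</group>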
Web
Creating the clustering for the Web was similarly easy. We kept the two web machines as they were, with Apache and Tomcat running on both, and instead clustered the load balancers, initially in active/passive (until we can work out the active/active settings), in much the same way. The key difference was that for these machines we ran the load balancing software (HAProxy) on both all the time and the cluster just looked after the IP address. That way nothing was slowed down waiting for services to start if the primary load balancer failed.
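For completeness, the HAProxy configuration that both load balancers run all the time is nothing special; a minimal sketch (the port, server names and backend IPs below are made up for illustration) looks something like this, with the listener bound to all addresses so the same file works whether or not the box currently holds the cluster's virtual IP:

global
    daemon
    maxconn 4096

defaults
    mode http
    balance roundrobin
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

# bind to every address, so holding the virtual IP or not makes no difference
listen webfarm 0.0.0.0:80
    server web1 10.0.0.11:8080 check    # Apache/Tomcat box 1 (illustrative IP)
    server web2 10.0.0.12:8080 check    # Apache/Tomcat box 2 (illustrative IP)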
Reference:
http://osssmb.wordpress.com/2008/12/16/high-availability-with-drbd-and-heartbeat-crm/