RabbitMQ and HAProxy: a timeout issue

If you’re trying to set up a highly available RabbitMQ cluster using HAProxy, you may encounter disconnection issues on your clients.

This problem is due to HAProxy’s timeout client setting (clitimeout is deprecated), which defines the default client timeout. If a connection stays idle for longer than timeout client (in milliseconds), HAProxy drops it.
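
For illustration, this is roughly what the setting looks like in a defaults section (the 50000 ms value is just an example, not a recommendation):

defaults
        mode    tcp
        # Drop client connections that stay idle for more than 50 seconds
        timeout client 50000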

RabbitMQ clients use persistent connections to a broker, which never time out. See the problem here? If your RabbitMQ client is inactive for a period of time, HAProxy will automatically close the connection.

So how do we solve the problem? It turns out HAProxy has a clitcpka option, which enables the sending of TCP keepalive packets on the client side.

Let’s use it!
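
Concretely, that just means adding the option to the AMQP section of the HAProxy configuration, along these lines:

listen amqp_front :5672
        mode   tcp
        # Ask the kernel to send TCP keepalives to idle clients
        option clitcpka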

But that doesn’t solve the problem: the disconnection issues are still there. Damn.

After reading a discussion about RabbitMQ and HAProxy on the RabbitMQ mailing list, where Tim Watson pointed out that:

[…]the exact behaviour of tcp keep-alive is determined by the underlying OS/Kernel configuration[…]

On Ubuntu 14.04, the tcp man page shows that the default value of the tcp_keepalive_time parameter is 2 hours. This parameter defines how long a connection needs to be idle before TCP starts sending keep-alive packets.

You can also verify it by using the following command:

$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
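
One way around this, which I have not taken here, would be to lower tcp_keepalive_time below HAProxy’s timeout client, something like this (the 600-second value is arbitrary):

# Example value only: make the kernel send keepalives after 10 minutes idle
$ echo "net.ipv4.tcp_keepalive_time = 600" | sudo tee -a /etc/sysctl.conf
$ sudo sysctl -p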

OK! Let’s raise the timeout client value in our HAProxy configuration for AMQP; 3 hours should be enough. And that’s it! No more disconnection issues 🙂

Here is a sample HAProxy configuration:

global
        log 127.0.0.1   local1
        maxconn 4096
        #chroot /usr/share/haproxy
        user haproxy
        group haproxy
        daemon
        #debug
        #quiet

defaults
        log     global
        mode    tcp
        option  tcplog
        retries 3
        option redispatch
        maxconn 2000
        timeout connect 5000
        timeout client 50000
        timeout server 50000

listen  stats :1936
        mode http
        stats enable
        stats hide-version
        stats realm Haproxy\ Statistics
        stats uri /

listen amqp_front :5672
        mode            tcp
        balance         roundrobin
        timeout client  3h
        timeout server  3h
        option          clitcpka
        server          amqp-1 rabbitmq1.domain:5672  check inter 5s rise 2 fall 3
        server          amqp-2 rabbitmq2.domain:5672  check inter 5s rise 2 fall 3

Enjoy your highly available RabbitMQ cluster!

I think there may be another solution to this problem using the heartbeat feature of RabbitMQ; see more about that here: https://www.rabbitmq.com/reliability.html
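
As a rough sketch (I haven’t tested this setup), the requested heartbeat interval can be set on the server side in rabbitmq.config, using the classic Erlang-term format; the 60-second value here is only an example:

%% /etc/rabbitmq/rabbitmq.config
[
  {rabbit, [
    %% Heartbeat interval (in seconds) negotiated with connecting clients
    {heartbeat, 60}
  ]}
].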

High-availability with HAProxy and keepalived on Ubuntu 12.04

‘Lo there!

Here is a little post on how you can easily set up a highly available HAProxy service on Ubuntu 12.04! I’ve been using HAProxy more and more lately, adding more backends and connections to it. Then I thought: what if it goes down? How can I ensure high availability for that service?

Enter keepalived, which lets you set up a second HAProxy node to create an active/passive cluster. If the main HAProxy node goes down, the second one takes over.

In the following examples, I assume the following:

  • Master node address: 10.10.1.1
  • Slave node address: 10.10.1.2
  • Highly available HAProxy virtual address: 10.10.1.3

Install HAProxy

You’ll need to install it on both nodes:

$ sudo apt-get install haproxy

Now, edit the file /etc/default/haproxy and set the property ENABLED to 1.
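
For example, you can do that in one line (assuming the package default ENABLED=0 line is present):

$ sudo sed -i 's/^ENABLED=0/ENABLED=1/' /etc/default/haproxy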

Start the service, and you’re done 🙂

$ sudo service haproxy start
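
If you have already customised /etc/haproxy/haproxy.cfg, you can also check that the configuration parses correctly before (re)starting:

$ haproxy -c -f /etc/haproxy/haproxy.cfg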

Install keepalived

Prerequisite

You’ll need to update your sysctl configuration to allow binding to non-local addresses:

$ echo "net.ipv4.ip_nonlocal_bind = 1" | sudo tee -a /etc/sysctl.conf
$ sudo sysctl -p

Setup

Install the package:

$ sudo apt-get install keepalived

Create a configuration file /etc/keepalived/keepalived.conf for the master node:

/etc/keepalived/keepalived.conf

global_defs {
 # Keepalived process identifier
 lvs_id haproxy_KA
}

# Script used to check if HAProxy is running
vrrp_script check_haproxy {
 script "killall -0 haproxy"
 interval 2
 weight 2
}

# Virtual interface
vrrp_instance VIP_01 {
 state MASTER
 interface eth0
 virtual_router_id 7
 priority 101

 virtual_ipaddress {
 10.10.1.3
 }

 track_script {
 check_haproxy
 }
}

Do the same for the slave node, with a few changes (note that keepalived calls the passive state BACKUP):

/etc/keepalived/keepalived.conf

global_defs {
 # Keepalived process identifier
 lvs_id haproxy_KA_passive
}

# Script used to check if HAProxy is running
vrrp_script check_haproxy {
 script "killall -0 haproxy"
 interval 2
 weight 2
}

# Virtual interface
vrrp_instance VIP_01 {
 state BACKUP
 interface eth0
 virtual_router_id 7
 priority 100

 virtual_ipaddress {
 10.10.1.3
 }

 track_script {
 check_haproxy
 }
}

WARNING: Be sure to assign a virtual_router_id that is unique on the 10.10.1.0 subnet for this keepalived configuration.
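
If you’re not sure which router IDs are already in use, one way to check (assuming VRRP traffic is visible on eth0) is to capture the VRRP advertisements on the subnet and look at the vrid they carry:

$ sudo tcpdump -i eth0 ip proto 112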

Last step: start the keepalived service on the master node first, then on the slave.

$ sudo service keepalived start

You can check that the virtual IP address is created with the following command on the master node:

$ ip a | grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
 inet 10.10.1.1/25 brd 10.10.1.127 scope global eth0
 inet 10.10.1.3/32 scope global eth0

If you stop the HAProxy service on the master node or shut down the node, the virtual IP will be transferred to the passive node; you can use the previous command to verify that the VIP has moved.
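
For instance, a quick failover test could look like this (commands are illustrative):

# On the master node: stop HAProxy so the check_haproxy script starts failing
$ sudo service haproxy stop

# On the slave node: the VIP should now show up on eth0
$ ip a | grep eth0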