When you have users depending on Windows Terminal Services for their main desktop, it’s a good idea to have more than one Terminal Server. RDP, however, is not an easy protocol to load balance; sessions are long-lived and need to be persistent to a particular server, and users may connect from different source addresses during one session.
The current development version of HAProxy has made an important step forward in making this possible. Thanks to work by Exceliance, it now supports RDP Cookies, offering a solution to the persistence problem.
We have been testing the latest development release of HAProxy, 1.4-dev4, on a loadbalancer.org Enterprise R16 device. The real servers were two Windows Server 2008 machines, with identical test users set up on both.
We settled upon the following HAProxy configuration (RDP Cookies):
defaults
    clitimeout 1h
    srvtimeout 1h

listen VIP1 192.168.0.10:3389
    mode tcp
    tcp-request inspect-delay 5s
    tcp-request content accept if RDP_COOKIE
    persist rdp-cookie
    balance rdp-cookie
    option tcpka
    option tcplog
    server Win2k8-1 192.168.0.11:3389 weight 1 check inter 2000 rise 2 fall 3
    server Win2k8-2 192.168.0.12:3389 weight 1 check inter 2000 rise 2 fall 3
    option redispatch
Note that this is only a fragment of the haproxy.cfg file, showing the relevant options.
The load balancer’s Virtual IP is set to 192.168.0.10, listening on port 3389 for RDP. The two real servers are on 192.168.0.11 and 192.168.0.12, in the same subnet as the Virtual IP.
The two new configuration directives are persist rdp-cookie and balance rdp-cookie. These instruct HAProxy to inspect the incoming RDP connection for a cookie; if one is found, it is used to persistently direct the connection to the correct real server. The two tcp-request lines help to ensure that HAProxy sees the cookie on the initial request.
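To make the cookie inspection a little more concrete, here is a minimal Python sketch of pulling the mstshash cookie out of the first bytes a client sends. It is only an illustration, not HAProxy's parser: the function name rdp_cookie, the simplified header bytes and the sample user name are all assumptions made for the example.

```python
import re
from typing import Optional

# The cookie travels as a text line in the RDP connection request,
# e.g. b"Cookie: mstshash=alice\r\n". This pattern is a simplification.
_COOKIE_RE = re.compile(rb"Cookie: (\w+)=([^\r\n]*)\r\n")


def rdp_cookie(payload: bytes, name: str = "mstshash") -> Optional[str]:
    """Return the value of the named RDP cookie, or None if it is absent."""
    for match in _COOKIE_RE.finditer(payload):
        if match.group(1).decode("ascii") == name:
            return match.group(2).decode("ascii", errors="replace")
    return None


# Hypothetical request: a TPKT header and an X.224 connection request header
# (length fields left as zero for brevity), followed by the cookie line.
header = bytes([0x03, 0x00, 0x00, 0x00, 0x00, 0xE0, 0x00, 0x00, 0x00, 0x00, 0x00])
request = header + b"Cookie: mstshash=alice\r\n"

print(rdp_cookie(request))  # alice
print(rdp_cookie(header))   # None
```

A request without a cookie returns None, which is the case where the load balancer has nothing to persist on and must fall back to its normal balancing decision.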
The only other tweak needed is to increase the clitimeout and srvtimeout values to one hour. In testing, this was found to be necessary to keep idle RDP sessions established.
Testing involved making multiple connections with different usernames, from varying IP addresses, using both Windows XP Professional and Linux clients. Sessions were disconnected and reconnected, and real servers removed from the cluster and re-inserted.
We found that, once a user had established a session with a particular real server, that user consistently reconnected to the correct server if it was available. When we removed and re-inserted servers, existing sessions were unaffected. After a simulated server failure, users could start a session on the remaining server.
When a failed server was brought back on-line, users that had been connected to that server would reconnect to it again – even if they had started a new session on the other server in the meantime. This may not be what you want, and requires further testing.
With client and server time-outs set to one hour, we were able to leave idle sessions running for 16 hours without problems.
For more information on the new configuration options, see the development version of HAProxy’s Configuration Manual.
NB: For some daft reason Microsoft restricted the login cookie in RDP to 9 characters! As the domain is usually listed first (mydomain/myusername), the first 9 characters may always be the same, and RDP cookie session persistence won't work. Two workarounds are to either reduce the length of your domain name (ouch!) or use the myusername@mydomain format when you log in.
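The effect of the 9-character limit is easy to see with a few lines of Python. This is just an illustration of the truncation described above; the helper name login_cookie and the sample logins are made up for the example.

```python
COOKIE_LIMIT = 9  # the cookie length limit described above


def login_cookie(login: str, limit: int = COOKIE_LIMIT) -> str:
    """Return the part of the login that survives in the cookie."""
    return login[:limit]


# Domain first: every user ends up with the same cookie.
domain_first = ["mydomain/alice", "mydomain/bob"]
print({login_cookie(u) for u in domain_first})  # {'mydomain/'}

# User first: the cookies stay distinct.
user_first = ["alice@mydomain", "bob@mydomain"]
print(sorted(login_cookie(u) for u in user_first))  # ['alice@myd', 'bob@mydom']
```

With the domain first, both users collapse to the same cookie, so the load balancer cannot tell them apart.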
So what about Microsoft Connection Broker (Session Directory, or whatever they call it)?
A simple one-line change in your HAProxy configuration (RDP Connection Broker):
Change balance rdp-cookie to balance leastconn, i.e.:
defaults
    clitimeout 1h
    srvtimeout 1h

listen VIP1 192.168.0.10:3389
    mode tcp
    tcp-request inspect-delay 5s
    tcp-request content accept if RDP_COOKIE
    persist rdp-cookie
    balance leastconn
    option tcpka
    option tcplog
    server Win2k8-1 192.168.0.11:3389 weight 1 check inter 2000 rise 2 fall 3
    server Win2k8-2 192.168.0.12:3389 weight 1 check inter 2000 rise 2 fall 3
    option redispatch
Note that this is only a fragment of the haproxy.cfg file, showing the relevant options.
It's about time we updated this post for the juicy new features in the HAProxy development release 1.5-dev7.
There were a couple of problems with the hash method used with RDP cookie load balancing (as described above):
- Lots of people would like to use least connection load balancing with WTS/RDP clusters (this is not possible with a hash-based persistence method).
- When you add or remove servers, the hash table gets reconfigured, i.e. users hit the wrong server.
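The second problem can be shown with a toy example. The sketch below assumes a plain "hash modulo number of servers" scheme, which is a simplification and not necessarily the exact hash HAProxy uses; the server names ts1 to ts3 and the user names are made up.

```python
import zlib
from typing import List


def pick_by_hash(servers: List[str], key_hash: int) -> str:
    """Map a hash value onto the server list with a simple modulo."""
    return servers[key_hash % len(servers)]


def pick(servers: List[str], cookie: str) -> str:
    """Hash an RDP cookie (CRC32 here, purely for illustration) and pick a server."""
    return pick_by_hash(servers, zlib.crc32(cookie.encode("utf-8")))


before = ["ts1", "ts2", "ts3"]
after = ["ts1", "ts2"]  # ts3 has been removed

# A key whose hash is 4 was on ts2, yet moves to ts1 once ts3 is removed,
# even though ts2 never went away.
print(pick_by_hash(before, 4), "->", pick_by_hash(after, 4))  # ts2 -> ts1

users = [f"user{i}" for i in range(20)]
moved = [u for u in users if pick(before, u) != pick(after, u)]
print(f"{len(moved)} of {len(users)} users changed server")
```

Users who were on the removed server have to move, but with this kind of mapping users on the surviving servers can be shuffled as well, which is what breaks their existing sessions.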
So Loadbalancer.org took the decision to sponsor the development of stick-table based RDP persistence (we sponsored the original source IP stick table work as well). When we looked at it in more detail, we decided that what we needed was:
- Flexible stick tables that could be used for multiple future requirements i.e. SSL Session ID persistence.
- RDP stick table support in order to enable least connection based scheduling.
- Some way of restoring stick tables on session restart (and also replication to other HAProxy instances).
- Ensuring that TCP connections are properly closed on server failure (especially important on long connections).
- Ensuring that the stick table is cleared out on server failure.
- And finally, making sure that the fallback server can be made non-sticky! (It is really irritating if you get stuck on the "sorry, site down" page.)
To cut a long story short, let's just dive in with a full configuration file and explain it as we go:
# HAProxy configuration file generated by LB Cloud appliance
global
    #uid 99
    #gid 99
    daemon
    stats socket /var/run/haproxy.stat mode 600 level admin
    log 127.0.0.1 local4
    maxconn 40000
    ulimit-n 81001
    pidfile /var/run/haproxy.pid

defaults
    log global
    mode http
    timeout connect 4000
    timeout client 42000
    timeout server 43000
    balance roundrobin

peers localpeer
    peer loadbalancer localhost:8888

listen stats :7777
    stats enable
    stats uri /
    stats hide-version
    option httpclose

frontend F1
    bind *:3389
    maxconn 40000
    default_backend B1
    mode tcp
    option tcplog

backend B1
    mode tcp
    option tcpka
    balance leastconn
    tcp-request inspect-delay 5s
    tcp-request content accept if RDP_COOKIE
    persist rdp-cookie
    stick-table type string size 204800 expire 120m
    stick on rdp_cookie(mstshash)
    server R1 www.loadbalancer.org:3389 weight 1 check port 3389 inter 2000 rise 2 fall 3 on-marked-down shutdown-sessions
    server R2 www.clusterscale.com:3389 weight 1 check port 3389 inter 2000 rise 2 fall 3 on-marked-down shutdown-sessions
    server backup us.loadbalancer.org backup non-stick
    option redispatch
    option abortonclose
An important new section is the peers section:
peers localpeer
    peer loadbalancer localhost:8888

In this configuration we are synchronising all of the stick table information with localhost:8888 (it could be with another HAProxy instance for session table high availability).
When HAProxy restarts, it runs existing sessions on the old process until they expire; only new sessions run on the new HAProxy instance. This can get quite confusing, as the stats socket or page will only show the new sessions, not the old ones.
You will need to change your HAProxy start up scripts:
start() {
    /usr/local/sbin/$BASENAME -L loadbalancer -c -q -f /etc/$BASENAME/$BASENAME.cfg
    if [ $? -ne 0 ]; then
        echo "Errors found in configuration file."
        return 1
    fi

    echo -n "Starting $BASENAME: "
    daemon /usr/local/sbin/$BASENAME -D -f /etc/$BASENAME/$BASENAME.cfg -p /var/run/$BASENAME.pid -L loadbalancer
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$BASENAME
    return $RETVAL
}

reload() {
    /usr/local/sbin/$BASENAME -L loadbalancer -c -q -f /etc/$BASENAME/$BASENAME.cfg
    if [ $? -ne 0 ]; then
        echo "Errors found in configuration file."
        return 1
    fi
    /usr/local/sbin/$BASENAME -D -L loadbalancer -f /etc/$BASENAME/$BASENAME.cfg -p /var/run/$BASENAME.pid -sf $(cat /var/run/$BASENAME.pid)
}
The important thing is that the peer name "loadbalancer" must be present in both the start-up scripts (the -L option) and the peers section of the haproxy.cfg file.
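If you want to guard against the two getting out of step, a small pre-flight check can help. The following Python sketch is a hypothetical helper, not part of HAProxy: it does a very rough parse of the configuration text and reports whether the name passed with -L appears as a peer. The function names and the list of section keywords are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Keywords assumed to start a new section in the configuration file.
SECTION_KEYWORDS = {"global", "defaults", "listen", "frontend", "backend", "peers"}


def peer_names(cfg_text: str) -> Dict[str, List[str]]:
    """Map each peers section name to the peer names declared inside it."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in cfg_text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, then tokenise
        if not words:
            continue
        if words[0] in SECTION_KEYWORDS:
            current = None
            if words[0] == "peers" and len(words) > 1:
                current = words[1]
                sections[current] = []
        elif current is not None and words[0] == "peer" and len(words) >= 3:
            sections[current].append(words[1])
    return sections


def local_peer_defined(cfg_text: str, local_name: str) -> bool:
    """True if the name given to -L is declared as a peer somewhere."""
    return any(local_name in names for names in peer_names(cfg_text).values())


cfg = """
peers localpeer
    peer loadbalancer localhost:8888

listen stats :7777
    stats enable
"""

print(peer_names(cfg))                          # {'localpeer': ['loadbalancer']}
print(local_peer_defined(cfg, "loadbalancer"))  # True
print(local_peer_defined(cfg, "lbmaster"))      # False
```

A check like this could be run from the init script before the real start or reload, alongside the existing -c configuration test.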
Now we have the new section to make the stick table use RDP cookies and the least connection scheduler:
balance leastconn
tcp-request inspect-delay 5s
tcp-request content accept if RDP_COOKIE
persist rdp-cookie
stick-table type string size 204800 expire 120m
stick on rdp_cookie(mstshash)
And the new clean and quick session kill options + making the backup server not go in the stick table:
server R2 www.clusterscale.com:3389 weight 1 check port 3389 inter 2000 rise 2 fall 3 on-marked-down shutdown-sessions
server backup us.loadbalancer.org backup non-stick

I probably haven't explained all that very well, but these tweaks ensure that servers that fail health checks immediately break their long-held TCP connections. Feel free to ask questions.
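For anyone who finds code easier to follow than configuration, here is a toy Python model of how these options are meant to fit together: a table keyed on the RDP cookie, least-connection selection for new users, a backup server that is never recorded in the table, and entries dropped when a server is marked down. It is a sketch of the intended behaviour as described in this post, not HAProxy's implementation; the class and method names are invented, and client disconnects and entry expiry are not modelled.

```python
from typing import Dict, List


class Backend:
    """Toy model of leastconn balancing with a stick table."""

    def __init__(self, servers: List[str], backup: str) -> None:
        self.conns: Dict[str, int] = {s: 0 for s in servers}  # active connections
        self.up: Dict[str, bool] = {s: True for s in servers}
        self.backup = backup
        self.table: Dict[str, str] = {}  # RDP cookie -> server

    def connect(self, cookie: str) -> str:
        server = self.table.get(cookie)
        if server is None or not self.up[server]:
            live = [s for s in self.conns if self.up[s]]
            if not live:
                return self.backup  # non-stick: never stored in the table
            server = min(live, key=lambda s: self.conns[s])  # leastconn
            self.table[cookie] = server
        self.conns[server] += 1
        return server

    def mark_down(self, server: str) -> None:
        self.up[server] = False
        self.conns[server] = 0  # models on-marked-down shutdown-sessions
        # Drop table entries that still point at the failed server.
        self.table = {c: s for c, s in self.table.items() if s != server}

    def mark_up(self, server: str) -> None:
        self.up[server] = True


b = Backend(["R1", "R2"], backup="backup")
print(b.connect("alice"))  # R1
print(b.connect("bob"))    # R2
print(b.connect("alice"))  # R1 (sticky)
b.mark_down("R1")
print(b.connect("alice"))  # R2
b.mark_up("R1")
print(b.connect("alice"))  # R2 (stays where the new session is)
b.mark_down("R1")
b.mark_down("R2")
print(b.connect("carol"))  # backup
print("carol" in b.table)  # False
```

In this model, a user who was moved after a failure stays on the new server when the failed one comes back, and a user who lands on the backup is free to reach a real server on the next attempt.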
Someone asked for a complete configuration file, so here goes:
# HAProxy configuration file generated by loadbalancer.org appliance
global
    daemon
    stats socket /var/run/haproxy.stat mode 600 level admin
    pidfile /var/run/haproxy.pid
    maxconn 40000
    ulimit-n 81000
    tune.maxrewrite 1024

defaults
    mode http
    balance roundrobin
    timeout connect 4000
    timeout client 42000
    timeout server 43000

peers loadbalancer_replication
    peer lbmaster localhost:7778
    peer lbslave localhost:7778

listen RDP_Test
    bind 192.168.67.30:3389
    mode tcp
    balance leastconn
    server backup 127.0.0.1:9081 backup non-stick
    option tcpka
    tcp-request inspect-delay 5s
    tcp-request content accept if RDP_COOKIE
    stick-table type string size 10240k expire 12h peers loadbalancer_replication
    stick on rdp_cookie(mstshash) upper
    timeout client 12h
    timeout server 12h
    option redispatch
    option abortonclose
    maxconn 40000
    server 2008_R2 192.168.64.50:3389 weight 1 check inter 2000 rise 2 fall 3 minconn 0 maxconn 0 on-marked-down shutdown-sessions

listen stats :7777
    stats enable
    stats uri /
    option httpclose
    stats auth loadbalancer:loadbalancer

Note some small changes in the timeouts & stick table section:
stick-table type string size 10240k expire 12h peers loadbalancer_replication
stick on rdp_cookie(mstshash) upper    <-- Nice little tweak to ensure cookies don't have a case-sensitive match!
timeout client 12h
timeout server 12h    <-- Massive timeout, but it works well in office environments where people go for long lunch breaks and forget to log off...
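The reason the upper tweak matters can be shown in a couple of lines of Python: a table keyed on the raw cookie treats "Alice" and "ALICE" as different users, while a table keyed on the upper-cased cookie does not. The helper name stick_key is made up for this illustration.

```python
from typing import Dict


def stick_key(cookie: str) -> str:
    """Normalise the cookie before it is used as a table key."""
    return cookie.upper()


table: Dict[str, str] = {}
table[stick_key("Alice")] = "2008_R2"

print(table.get(stick_key("ALICE")))  # 2008_R2
print(table.get("alice"))             # None (raw key, no normalisation)
```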