I work in a school, so I have to run a web content filter to ensure users are kept safe while online. Running the filter on a single Linux box was once satisfactory, but the network grew over time and put the filter under considerable load, causing lengthy delays and frequently making it unusable. It got to a stage where I needed to implement a more scalable solution. I decided that running multiple identically configured servers in a load balanced cluster was the best path to take. The next few posts will describe in detail how to set up a clustered web filter/proxy.
In order to make the project successful, I needed to achieve the following goals:
1) Scalable - add or remove servers without causing disruption
2) Highly available - protect all components against hardware or OS failure
3) Configurable - easily block/allow sites based on domain, URL or regex
4) Consistent - any configuration changes made on one server must be replicated to all others
5) Auditable - logs must be analysed to show the most frequently visited and most frequently blocked sites and, if possible, hold users accountable for the websites they visit
After a few weeks of research and testing, I had a system set up that accomplished all of my objectives and far outperformed the old one. I made use of the following packages:
- Corosync (Cluster Engine) - Keeps track of High Availability cluster members
- Pacemaker (Cluster Resource Manager) - moves cluster resources (IPs, processes etc) to relevant High Availability members
- IPVS - kernel module for routing cluster access requests to a backend server
- ldirectord (Load Balancing Cluster Manager) - monitors cluster nodes ("realservers") and updates the kernel's IPVS table as nodes go online/offline
- ipvsadm (IP Virtual Server Administration) - configure IPVS and inspect its current configuration/status
- csync2 (Synchronisation daemon) - ensure all hosts have up to date configuration
- Squid (WWW Proxy/Cache) - the service we will be load balancing
- SquidGuard (URL Filter) - invoked by Squid as an external URL rewriter/redirector
- SGAdmin (Web Admin Interface) - my own custom-written interface; requires Apache and PHP running on the director
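Everything except SGAdmin is available from the standard Ubuntu repositories. As a rough sketch of how the hosts can be provisioned (package names are from memory and may differ slightly on your release):

```
# On both directors: load balancing, HA, config sync and the admin interface
sudo apt-get install corosync pacemaker ldirectord ipvsadm csync2 apache2 php5

# On each cluster node: the proxy/filter service itself, plus config sync
sudo apt-get install squid squidguard csync2
```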
Initial Infrastructure
This is a logical diagram of a basic Network Load Balancing (NLB) cluster protected by a High Availability (HA) cluster of load balancers (directors). The core software (IPVS) is developed by the Linux Virtual Server Project. I advise you to look over their documentation for a more detailed explanation of clustering concepts, but beware that a considerable part of it is out of date, so I will briefly explain the basics here.
Server A is the load balancer or "director". It handles all incoming connections from clients (F) and forwards them to a cluster node (C, D, E) based on a scheduling algorithm you specify (round robin, least connections, etc.). Server B is an identically configured failover director that lies dormant until a failure is detected on Server A.
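To make the director's role concrete, here is a hedged sketch of defining a virtual service by hand with ipvsadm (later we will let ldirectord keep the IPVS table up to date for us). The address is a placeholder and 3128 is simply Squid's default proxy port:

```
# Add a virtual TCP service on the shared cluster IP using the
# round robin scheduler; "-s lc" would use least connections instead.
sudo ipvsadm -A -t 192.168.0.110:3128 -s rr

# Inspect the current IPVS table (numeric output).
sudo ipvsadm -L -n
```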
There are three possible NLB cluster configurations:
- NAT: Involves isolating the cluster nodes in a separate common network shared with the directors. The directors have two network interfaces each - one for the LAN and one for the NAT network. All traffic to and from the cluster nodes passes through the active director, which performs IP address translation on each packet. Individual nodes are not directly reachable from outside unless explicit NAT rules are configured on the director.
- DR (Direct Routing): A much more feasible option if you have a routed IP environment. The cluster nodes don't have to sit in their own network, but it is a good idea to keep them in a separate broadcast domain - similar to NAT, except the network is routable independently of the director. Incoming connections are forwarded (NOT translated) by the active director, and the receiving node replies directly to the originator of the connection. The IP packet itself is left untouched; only the frame's source and destination MAC addresses change so that it reaches the chosen node.
- Tun (IPIP): Similar to DR, except the backend servers are on a network the director does not belong to, so packets are encapsulated in IPIP packets and routed to the destination network. This was beyond what I needed for a local cluster, so I discarded the option.
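These three configurations correspond directly to ipvsadm's three forwarding methods when a real server is added to a virtual service. A minimal sketch, reusing the example service above (addresses are placeholders from my plan below):

```
# NAT: packets to and from the node are translated by the director (-m)
sudo ipvsadm -a -t 192.168.0.110:3128 -r 10.2.0.1:3128 -m

# DR: packets are forwarded unmodified at the IP layer (-g, "gatewaying")
sudo ipvsadm -a -t 192.168.0.110:3128 -r 10.2.0.1:3128 -g

# Tun: packets are encapsulated in IPIP and routed to the node (-i)
sudo ipvsadm -a -t 192.168.0.110:3128 -r 10.2.0.1:3128 -i
```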
Cluster Type Choice
I chose a DR setup because it does not carry the additional overhead that NAT does (a NAT director translates incoming and outgoing packets, whereas a DR director simply forwards them). The overhead is actually fairly minimal, but I knew there could be thousands of connections per second to the cluster, and since I already had the routed infrastructure in place, the choice was a no-brainer. In addition, each node required Internet access to fulfil its role as a web proxy, so giving them direct access to my network router was preferable. I recommend you strive to implement Direct Routing.
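One consequence of choosing DR is worth flagging now (it will be covered properly in a later post): because the director forwards packets still addressed to the shared cluster IP, every node must also hold that IP on a non-ARPing interface and must not answer ARP requests for it, otherwise the nodes would compete with the director for the address. A rough sketch of the standard LVS-DR recipe on each node (the IP comes from my plan below; run-time sysctls like these do not persist across reboots):

```
# Don't answer ARP for addresses held on loopback, and always
# announce the primary interface address when ARPing out.
sudo sysctl -w net.ipv4.conf.all.arp_ignore=1
sudo sysctl -w net.ipv4.conf.all.arp_announce=2

# Bind the shared cluster LAN IP to loopback so the node accepts
# forwarded packets addressed to it.
sudo ip addr add 192.168.0.110/32 dev lo label lo:0
```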
Before continuing, you need to have the following infrastructure already in place:
- a routed IP network (LAN VLAN and CACHE VLAN, although it can easily be done on one network)
- a multicast/IGMP-aware switch (or a separate physical switch in a broadcast domain separate from your internal LAN) - see the Corosync note after this list
- 2x Ubuntu Server 10.04 servers with two physical NICs (the directors; one NIC is enough if running everything on one network)
- 3x Ubuntu Server 10.04 servers with one physical NIC (cluster nodes)
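The multicast requirement comes from Corosync, which the two directors will use to exchange heartbeats. As a rough preview of the relevant fragment of /etc/corosync/corosync.conf (I am assuming the heartbeat traffic runs over the CACHE network; the multicast address and port shown are just the common defaults, not anything specific to my setup):

```
# Fragment of /etc/corosync/corosync.conf on both directors (sketch)
totem {
    version: 2
    interface {
        ringnumber: 0
        bindnetaddr: 10.2.0.0      # network carrying the heartbeat traffic
        mcastaddr: 226.94.1.1      # multicast group used by the cluster
        mcastport: 5405
    }
}
```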
IP Addressing
Plan the IP addresses you will use. In my setup I used two networks - LAN and CACHE. The directors had an interface on each network, and all cluster nodes had a single interface on the CACHE network only.
You need to allocate the following:
- *Cluster LAN IP (the shared IP that the director and all cluster nodes need to be aware of; LAN clients will use this IP as their proxy)
- Director LAN fixed IP (the IP the director uses on the LAN network for normal communication)
- *Cluster CACHE IP (the shared IP assigned to the director on the CACHE network for internal traffic only)
- Director CACHE fixed IP (the IP the director uses on the shared CACHE network for normal communication)
- Node fixed IPs (one IP for each cluster node)
* For now, the first director will hold two IPs on each network: its fixed IP and the shared cluster IP. When we introduce high availability later, the shared cluster IPs will be assigned only to the currently active director by the Cluster Resource Manager (CRM); there is a brief preview of this after the address lists below. Fixed IPs stay with their hosts so they always have a means of communication.
LAN Network (192.168.0.0/24, gateway 192.168.0.254)
- Cluster LAN IP: 192.168.0.110
- Director1 LAN IP: 192.168.0.111
- Director2 LAN IP: 192.168.0.112
CACHE Network (10.2.0.0/24, gateway 10.2.0.254)
- Cluster CACHE IP: 10.2.0.10
- Director1 CACHE IP: 10.2.0.11
- Director2 CACHE IP: 10.2.0.12
- Node1 CACHE IP: 10.2.0.1
- Node2 CACHE IP: 10.2.0.2
- Node3 CACHE IP: 10.2.0.3
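To show where the starred "shared" addresses eventually end up, here is a hedged preview of how the two cluster IPs can be defined as Pacemaker resources once the HA layer is in place. The resource names and interface assignments (eth0 on the LAN, eth1 on CACHE) are my own placeholders; the full configuration comes in a later post:

```
# Shared cluster IP on the LAN network (assumed to live on eth0)
sudo crm configure primitive ip_lan ocf:heartbeat:IPaddr2 \
    params ip="192.168.0.110" cidr_netmask="24" nic="eth0" \
    op monitor interval="30s"

# Shared cluster IP on the CACHE network (assumed to live on eth1)
sudo crm configure primitive ip_cache ocf:heartbeat:IPaddr2 \
    params ip="10.2.0.10" cidr_netmask="24" nic="eth1" \
    op monitor interval="30s"
```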
I recommend printing the network diagram below and jotting down your allocated IPs on it during the initial setup. All hosts have their default gateway set to that of the network they belong to, except for the directors, whose gateway is set to the LAN network gateway. The physical gateway device can be a router with multiple interfaces, or a layer 3 switch with multiple routed VLANs. My setup used the latter, as seen in the diagram below.
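As an illustration of that gateway rule, a cluster node's /etc/network/interfaces might look like the sketch below (interface name assumed to be eth0, addresses taken from the plan above); a director would instead point its default gateway at 192.168.0.254 on its LAN interface:

```
# /etc/network/interfaces on Node1 (sketch)
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 10.2.0.1
    netmask 255.255.255.0
    gateway 10.2.0.254     # CACHE network gateway
```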
The next post will give instructions on getting started with the initial configuration of director and cluster nodes.