VPS.NET, CentOS and NginX Load Balanced Cloud Cluster
This week I have been experimenting with the cloud computing provider, VPS.NET.
The application I am trying to scale is a custom-built PHP/MySQL web logging application, so unlike many web apps it has more database writes than reads. This is one of the challenges of scaling it, as a central database would be a single point of failure and a bottleneck.
About VPS.NET
The cheapest virtual machines at VPS.NET cost £15 a month for 400MHz of CPU, 256MB RAM, 10GB disk space and 250GB/month of bandwidth.
VPS.NET have an interesting approach to sizing virtual machines. Instead of upgrading (and paying for) individual virtual machines, you purchase 'nodes' of resources, which you can allocate to one or more virtual machines. This gives a lot of flexibility, as you can move nodes between virtual machines within minutes without having to change your billing amount.
Effectively this means that the virtual machines themselves are free, but the resources to run them are what you must pay for.
Scaling the Web App
I use a traditional load-balancing layout: one load balancer (with one passive failover) in front of multiple application VPSs.
The load balancer in question is NginX, a very high performance web server and reverse proxy.
Each application VPS was allocated an internal IP address from VPS.NET so that internal traffic would not count towards the bandwidth quota. Apache, PHP and MySQL were then installed onto each application VPS. The database writes are scaled by having each application VPS write to its own local database; the logging data is merged together in a central reporting suite later, as sketched below.
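One way to do that merge, for illustration only, is to pull each VPS's local log database over the internal network into a central reporting database. The hostnames, credentials, database and table names below are hypothetical:

# Hypothetical merge script, run from the reporting server.
# app1..app4 are the internal addresses of the application VPSs.
# Assumes the log_entries table already exists in weblog_reports
# and that its primary keys are unique across servers.
for host in app1 app2 app3 app4; do
    mysqldump -h $host -u report -pSECRET \
        --no-create-info --insert-ignore weblog log_entries \
        | mysql -u report -pSECRET weblog_reports
done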
CentOS Configuration
CentOS is a free community rebuild of the popular RedHat Enterprise Linux distribution.
The only tuning I made to the basic CentOS image provided by VPS.NET was to turn off the firewall. Obviously this is not very secure, and I intend to re-enable it later, but during the testing phase the sheer number of TCP connections was filling up the iptables connection tracking table and causing packets to be dropped.
service iptables stop
chkconfig iptables off
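If you would rather keep iptables running, an alternative (which I have not tested on this set-up) is to raise the connection tracking limit instead of disabling the firewall altogether. On CentOS 5 that looks something like:

# Enlarge the conntrack table so heavy TCP traffic does not
# overflow it (the default is sized from available RAM).
# Newer kernels expose this as net.netfilter.nf_conntrack_max.
sysctl -w net.ipv4.netfilter.ip_conntrack_max=131072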
NginX Load Balancer Configuration
I used the latest stable NginX release, with the third-party 'fair' upstream load-balancing module compiled in.
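The fair module is not bundled with NginX, so it has to be compiled in from source. Roughly, assuming the module has been unpacked next to the NginX source tree (the version number here is only illustrative):

# Build NginX with the third-party fair upstream module.
tar xzf nginx-0.7.59.tar.gz
cd nginx-0.7.59
./configure --add-module=../nginx-upstream-fair
make
make install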
This is the NginX configuration:
upstream backend {
    fair;
    server ip1:80 max_fails=20 fail_timeout=10s;
    server ip2:80 max_fails=20 fail_timeout=10s;
    server ip3:80 max_fails=20 fail_timeout=10s;
    server ip4:80 max_fails=20 fail_timeout=10s;
}

server {
    listen      80 default;
    server_name default;

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    proxy_connect_timeout 5;
    proxy_send_timeout    10;
    proxy_read_timeout    15;

    location / {
        proxy_pass http://backend;
        proxy_cache off;
    }
}
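After changing the configuration it is worth checking the syntax before reloading; with a default source install the paths would be:

/usr/local/nginx/sbin/nginx -t
kill -HUP `cat /usr/local/nginx/logs/nginx.pid`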
Application VPS Configuration
Each application VPS runs Apache and MySQL using 1 node of VPS.NET power.
Admittedly this is probably not how I would run it in production, but I wanted to see just how much I could get out of 256MB of RAM!
Apache was tuned so that each instance would not serve more concurrent requests than it could realistically handle; this way the VPS would not get overloaded and slow to a crawl.
Apache's prefork settings:
StartServers        10
MinSpareServers     10
MaxSpareServers     20
ServerLimit         256
MaxClients          30
MaxRequestsPerChild 4000
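The low MaxClients value is deliberate: a common rule of thumb is to divide the memory you can spare for Apache by the resident size of one Apache+PHP worker, and with only 256MB shared with MySQL there is not room for many workers. One rough way to inspect worker sizes (bearing in mind that RSS over-counts memory shared between workers):

# List Apache workers with their resident set size (RSS, in KB).
ps -ylC httpd --sort=rss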
I also installed the APC PHP accelerator to get better PHP performance.
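On CentOS, APC can be built via PECL (it is also packaged in third-party repositories as php-pecl-apc). A sketch of the PECL route:

# Install the build tools and PEAR/PECL, then build APC.
yum install gcc php-devel php-pear
pecl install apc
# Load the extension into PHP.
echo "extension=apc.so" > /etc/php.d/apc.ini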
Siege Test Results
I used an HTTP benchmarking tool called Siege to repeatedly request the web application over the Internet (not on the same network as the VPSs).
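The figures below correspond to 500 concurrent users making 200 requests each (100,000 transactions in total), so the invocation would have looked something like this (the URL is a placeholder):

# 500 concurrent users x 200 repetitions each = 100,000 requests.
siege -c 500 -r 200 http://example.com/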
Here are the results:
** SIEGE 2.69
** Preparing 500 concurrent users for battle.
The server is now under siege..  done.

Transactions:              100000 hits
Availability:              100.00 %
Elapsed time:               99.71 secs
Data transferred:           80.20 MB
Response time:               0.42 secs
Transaction rate:         1002.91 trans/sec
Throughput:                  0.80 MB/sec (6.4Mbits/sec)
Concurrency:               422.61
Successful transactions:   100000
Failed transactions:            0
Longest transaction:        21.01
Shortest transaction:        0.00
Conclusion
I was able to achieve 950-1000 transactions a second with this set-up, although the VPS cluster was being pushed to its limits. During the benchmark the NginX load balancer logged many connection errors to the application VPSs (as their MaxClients was set low); however, the load average on each application VPS did not exceed 4, and none of them started swapping to disk.
Because NginX was configured with short connect and read timeouts, it could re-send a failed request to another application VPS without the client noticing.
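This retry behaviour comes from NginX's proxy_next_upstream directive, which by default re-sends a request to the next upstream on an error or timeout; it can also be set explicitly, for example:

# Retry on the next backend when this one errors, times out,
# or returns a 500/503 response.
proxy_next_upstream error timeout http_500 http_503;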
The beauty of this system is that adding further application VPSs is simple with VPS.NET, so the cluster can be scaled with no downtime.