Tuesday, June 8, 2010

How to tune Apache to survive a Slashdotting

SkyHi @ Tuesday, June 08, 2010
Like many techno-geeks, I host my LAMP website on a cheap ($150) computer over my home broadband connection. I have also wondered what would happen if my site were linked on Slashdot or Digg. Specifically, would my setup be able to survive the “Slashdot Effect”? A 100 MHz Pentium can easily saturate a T1's worth of bandwidth, and my upload speed is capped (supposedly) at 384 kbps, so the server should easily be able to handle that. My bandwidth will be saturated before the server is incapacitated; at least, that's the idea.
The machine I use for my web server is a $150 PC that I bought from Fry's one day (I always buy their $150 PCs when they're in stock). Here are the relevant specs on my little server:
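To put a rough number on the bandwidth ceiling described above, here is a minimal sketch that converts an upload cap into pages served per second. The function name pages_per_second and the 50 KB page size are illustrative assumptions, not measurements from my site.

```shell
# pages_per_second UPLOAD_KBPS PAGE_KB
# Estimates how many pages per second an uplink can deliver.
# PAGE_KB is an assumed average page weight (HTML + CSS + images).
pages_per_second() {
    awk -v kbps="$1" -v page_kb="$2" 'BEGIN {
        kbytes_per_sec = kbps / 8          # kilobits/s -> kilobytes/s
        printf "%.2f\n", kbytes_per_sec / page_kb
    }'
}

# 384 kbps uplink, hypothetical 50 KB page
pages_per_second 384 50    # prints 0.96
```

At roughly one page per second, the uplink fills up long before the CPU has any real work to do, which is why keeping pages small matters so much on a connection like this.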
CPU: AMD Athlon 2600+
RAM: 512MB
Hard Drive: 40GB 7200RPM
Software: Debian Linux, MySQL, Apache, PHP, WordPress
There is additional software installed on this machine because it is also used as a desktop computer. However, none of that software is important for the purposes of this article.
The RAM has been upgraded since I purchased the machine from Fry's because it originally came with 128MB, which is a little low for my tastes. The only other upgrade was a new CPU fan, and that was out of personal preference: the default fan was just too loud.
Below are some directives in my httpd.conf and some general recommendations that I think are vital to helping you survive a good Slashdotting on low-budget hardware.
  • MaxKeepAliveRequests 0
    The KeepAlive directive in httpd.conf allows persistent connections to the web server, so that a new connection does not have to be initiated for each request. Setting MaxKeepAliveRequests to 0 allows an unlimited number of requests per connection, which makes sense if you think about it. Why allow persistent connections but then terminate them after a fixed number of requests?
  • KeepAliveTimeout 15
    Because persistent connections are allowed, it is important that they are not kept open indefinitely. This directive will close the connection after 15 seconds of inactivity.
  • MinSpareServers 15
    This is the minimum number of spare (idle) servers you want running at any given time. This way, if multiple simultaneous requests are received, there will already be child processes running to handle them. Setting this number too high wastes system resources, and setting it too low makes new visitors wait while Apache forks more children.
  • MaxSpareServers 65
    The counterpart to MinSpareServers: the maximum number of spare (idle) child processes kept running at any given time. Idle children beyond this number are killed off.
  • StartServers 15
    This is the number of servers Apache will start initially. As traffic grows, Apache keeps at least 15 spare servers running (MinSpareServers) and kills off idle ones beyond the MaxSpareServers limit.
  • MaxClients 500
    This is the maximum number of clients that can be connected to the server at any given time. Setting this number too low will result in users being locked out of the server under normal traffic, and setting it too high will leave your server so overloaded that all the requests time out anyway. I think 500 is about right for most people's needs. Note that Apache 2's prefork MPM caps this at 256 unless ServerLimit is raised as well.
  • MaxRequestsPerChild 100000
    Sets the maximum number of requests each child process will handle before it is replaced. This is mostly there to limit the damage from memory leaks and other mishaps, but it is important nonetheless. Setting it too low will cause a large portion of child processes to exit for no real reason, thus slowing down the site. It could be set to 0 (unlimited), but that would remove any protection against real issues like memory leaks.
  • HostnameLookups off
    This prevents a DNS lookup for every visitor to the site. It is off by default, but if it's on in your httpd.conf I would recommend turning it off.
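The MaxClients value above can be sanity-checked against the RAM in the box. The following is only a rough sketch: the helper names avg_rss_mb and max_clients_for_ram are made up for illustration, and the 128 MB reserved for the OS and MySQL and the 20 MB per Apache child are assumed figures that you should replace with your own measurements.

```shell
# avg_rss_mb: reads resident set sizes in KB (one per line) on stdin
# and prints the average in MB.
# Live usage (the process may be named httpd instead of apache2):
#   ps -C apache2 -o rss= | avg_rss_mb
avg_rss_mb() {
    awk '{ sum += $1 } END { if (NR > 0) printf "%.1f\n", sum / NR / 1024 }'
}

# max_clients_for_ram RAM_MB RESERVED_MB CHILD_MB
# Rough upper bound on MaxClients that fits in RAM without swapping.
max_clients_for_ram() {
    awk -v ram="$1" -v reserved="$2" -v child="$3" 'BEGIN {
        printf "%d\n", int((ram - reserved) / child)
    }'
}

# 512 MB box, assumed 128 MB reserved, assumed 20 MB per child
max_clients_for_ram 512 128 20    # prints 19
```

If the estimate comes out far below the MaxClients you configured, the machine would likely start swapping before it ever reaches that limit, so it is worth measuring the real per-child size under load.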

I minimize graphics on my site, and use CSS instead (where I can). This is pretty easy with WordPress, depending on which theme you use. I stay away from themes with a lot of images and I tend not to put any in my posts either. They're just too much of a drain on bandwidth, especially if you have a lot of traffic. On top of all that, I don't really like seeing graphics when I go to other sites. Most of the time they just get in the way of the information.

As far as static pages go, that isn't much of an option for me. Everything in WordPress is pulled dynamically from the database unless certain plug-ins are installed. Since my upload speed is the main bottleneck in my setup, static pages wouldn't make much of a difference. However, if you have a faster upload speed, then keeping a cache of static pages would speed things up for you.

Another thing that will help with bandwidth if you submit a link to one of the larger sites (Digg, Slashdot, etc.) is to use CoralCDN. CoralCDN is essentially a caching/proxy service that will reduce the drain on your bandwidth. All you have to do to use it is append “.nyud.net:8090” (without the quotes) to the hostname of any link you submit. All requests for that link will then be automatically routed through CoralCDN.
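The CoralCDN rewrite described above can be sketched as a small shell helper. The function name coralize is hypothetical, and the sketch assumes a plain http:// URL with no explicit port; anything else is passed through unchanged.

```shell
# coralize URL
# Appends the CoralCDN suffix to the hostname part of an http:// URL.
# Assumes no explicit port in the original URL.
coralize() {
    printf '%s\n' "$1" | sed -E 's#^(http://[^/:]+)#\1.nyud.net:8090#'
}

coralize "http://example.com/post/1"
# prints http://example.com.nyud.net:8090/post/1
```

Note that the suffix goes on the hostname, before the path, not on the end of the whole link.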

Those are just a few things you can do to help avoid having your server killed by Slashdot or Digg. The Apache configuration changes are important, but so is having a simple site that is low on graphics and other bandwidth-intensive content. I'm sure there are many other things that can be done, and I don't claim to be an expert in this field (hence the general recommendations). So far, these measures have helped my site stay up under some heavy traffic, but I have yet to be Slashdotted (thankfully?). If anyone else has recommendations on additional precautions I can take, I'm more than happy to hear them. If this gets onto Digg, Slashdot, etc., then it will be a good test of the things I've mentioned. We'll have to wait and see.

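echo '##### Apache Load Check #####'
# watch re-runs the quoted command every 2 seconds and clears the screen
# itself, so no explicit "clear" is needed. Press CTRL+C to exit.
# The whole command is single-quoted so the inner double quotes survive.
watch '
echo "CTRL+C TO EXIT" ;
echo vmstat ;
vmstat ;
echo Load ;
w ;
echo "Apache Processes" ;
# The [h]/[a] brackets keep grep from counting itself;
# the process is named httpd or apache/apache2 depending on the distro.
ps -elf | grep -E "[h]ttpd|[a]pache" | wc -l ;
echo "Active Apache Connections" ;
netstat -nalp | grep ":80 " | grep ESTABLISHED | wc -l ;
echo "Apache Connections" ;
netstat -nalp | grep ":80 " | wc -l ;
echo "SYN Connections" ;
netstat -nalp | grep SYN | wc -l ;
echo IPCS ;
ipcs | grep 0x0 | wc -l'
echo '##### End Apache Load Check #####'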

If you see a lot of connections in a SYN state, then you may still need to increase MaxSpareServers:
 # netstat -nalp | grep ':80 ' | grep SYN
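For a fuller picture than a single grep, the same netstat output can be summarized by TCP state. This is a minimal sketch: count_port80_states is a made-up helper name, and it assumes the usual Linux netstat column layout (local address in column 4, state in column 6).

```shell
# count_port80_states: reads "netstat -nat" style output on stdin and
# prints one line per TCP state for sockets whose local port is 80.
# Live usage: netstat -nat | count_port80_states
count_port80_states() {
    awk '$1 ~ /^tcp/ && $4 ~ /:80$/ { count[$6]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Canned sample input so the sketch can be tried without a live server.
printf '%s\n' \
    'tcp 0 0 10.0.0.1:80 1.2.3.4:5000 ESTABLISHED' \
    'tcp 0 0 10.0.0.1:80 1.2.3.5:5001 SYN_RECV' \
    'tcp 0 0 10.0.0.1:80 1.2.3.6:5002 SYN_RECV' \
    'tcp 0 0 10.0.0.1:22 1.2.3.7:5003 ESTABLISHED' | count_port80_states
```

A SYN_RECV count that keeps growing relative to ESTABLISHED is the pattern to watch for.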