Thursday, December 22, 2011

Drupal 7: HipHop for PHP vs APC – benchmark

SkyHi @ Thursday, December 22, 2011
Drupal is one of the two most popular content management systems (CMS) written in PHP. It is used as a back-end system for at least 1.5% of all websites worldwide. It is also one of the slowest systems of this kind on the Internet.
There have been many suggestions for improving Drupal performance. Some recommend the APC module, others data caching, or even compiling the entire system with HipHop for PHP. While the first two solutions have been implemented successfully, until now no one had managed to complete the build process.
After many battles with the compiler and the Drupal code, I present the results of the first successful translation of Drupal 7 to C++.

Introduction

All tests were conducted on a modified version of Drupal. These changes were necessary in order to ensure compatibility with HipHop translator.
You can download the modified source code from this link.
The system was installed in the minimal version and then launched in three different ways:
  • as a standard PHP script
  • as a PHP script with APC opcode caching enabled
  • in the form of a compiled program
Due to hardware limitations, the MySQL server is located on the same machine as the web server. More advanced tests using multiple servers will be performed in the near future.

Testing platform

Processor: Intel(R) Core(TM)2 Duo CPU E7600 @ 3.06GHz
Memory: 2.5GB RAM
System: Fedora 12 (64bit)
Kernel: 2.6.32.26-175.fc12.x86_64 #1 SMP
The server was used exclusively for testing purposes – it was running only the services associated with the benchmark.

Test Configuration

Apache version: 2.2.15
MySQL Server Version: 5.1.47
PHP Version: 5.3.3
HipHop for PHP Version: 806ee06
Drupal Version: 7.0
Drupal was compiled with this command:
cd ~drupal
date && ~/hiphop/hiphop-php/src/hphp/hphp  --keep-tempdir=1\
 --log=3\
 --input-list=files.full.list\
 --include-path="." --force=1\
 --cluster-count=240\
 -v "AllDynamic=true"\
 -v "AllVolatile=true"\
 -o /tmp/drupal\
 --parse-on-demand=0\
 --sync-dir=/tmp/sync
And launched with this command:
cd ~drupal
/tmp/drupal/program -m server -p 80\
 -v "Server.SourceRoot=`pwd`"\
 -v "Server.DefaultDocument=index.php"\
 -c $HPHP_HOME/bin/mime.hdf\
 -v "Log.File=/tmp/errors"\
 -v "ErrorHandling.AssertActive=true"\
 -v "ErrorHandling.AssertWarning=true"\
 -v "ErrorHandling.WarningFrequency=10000"\
 -v "ErrorHandling.NoticeFrequency=10000" &

CPU usage

The following test examines the performance of Drupal by simulating the concurrent activity of many visitors on the Drupal home page.
I found that on a dual-core server four was the optimal number of concurrent users, so I used the ab program as the benchmark tool and launched it with the following command:
ab -n 300 -c 4 http://achilles.webtutor/
The first result shows the CPU usage of a regular PHP script during the execution of the 300 HTTP requests started by 4 concurrent users:

sy = system CPU usage (gradient color), us = user CPU usage (solid color)
As you can see, Drupal is indeed a very demanding system. The test in this case took almost 20 seconds, which gives a disappointing result of 15 requests per second.
Let’s see what we get after enabling the APC module. Results are as follows:

Drupal performance improved dramatically. The test was completed in 6 seconds, 3 times faster than with traditional PHP! Due to the size of Drupal code, however, this result is not shocking. While the APC module will not speed up the script itself, its opcode cache eliminates the delay caused by having to parse PHP code on every HTTP request.
Interestingly, Drupal is not able to use the full computing power of the test server. The exact cause is unknown; however, it may be caused by some kind of internal locking in the APC module.
Since we know how much the opcode cache improves performance by skipping the PHP parser, it's time to test how much we can accelerate the PHP code itself. We check this by translating the Drupal source code to C++ and compiling the application:
The compiled Drupal application is five times faster than a regular script, and roughly 1.6 times faster than a script launched from the opcode cache!
Let’s compare the results. The first is the detailed comparison of CPU usage:
And the overall CPU usage:
Here are the results taken directly from the ab tool:
Environment      Execution 
   Type        time [300 req]
-----------------------------
Regular PHP      19.873 sec
PHP + APC         6.396 sec
HipHop for PHP    3.896 sec
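The throughput and speedup figures quoted in this section follow directly from these three timings. Here is a minimal sketch of the arithmetic in Python; the function and variable names are my own for illustration and are not part of ab's output.

```python
# Total wall-clock time reported by ab for 300 requests, in seconds.
REQUESTS = 300
elapsed = {
    "Regular PHP": 19.873,
    "PHP + APC": 6.396,
    "HipHop for PHP": 3.896,
}


def throughput(seconds, requests=REQUESTS):
    """Requests per second for a run that took `seconds` in total."""
    return requests / seconds


def speedup(baseline_seconds, seconds):
    """How many times faster a run is than the baseline run."""
    return baseline_seconds / seconds


baseline = elapsed["Regular PHP"]
for name, seconds in elapsed.items():
    print(f"{name:<15} {throughput(seconds):6.2f} req/s  "
          f"{speedup(baseline, seconds):4.2f}x vs regular PHP")

apc_to_hiphop = speedup(elapsed["PHP + APC"], elapsed["HipHop for PHP"])
print(f"HipHop vs APC: {apc_to_hiphop:.2f}x")
```

Dividing the timings this way gives roughly 15, 47 and 77 requests per second, and shows that the compiled build is about 5.1 times faster than regular PHP and about 1.6 times faster than PHP with APC.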

Concurrency benchmark

In this scenario I measured a number of requests performed per second. The summary results are as follows:
Type of environment        Requests per    Time per       Req/sec
                           second [#/sec]  request [ms]   ratio [%]
-------------------------------------------------------------------
Regular PHP                    15.10         66.242         100%
PHP + APC (opcode cache)       46.90         21.321         310%
HipHop for PHP                 77.01         12.985         510%
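The columns of this table are related by simple arithmetic. The sketch below assumes that the time-per-request column is ab's mean across all concurrent requests, which is what 1000 / (requests per second) suggests. The per-user latency is then that value multiplied by the concurrency level of 4. The helper names are hypothetical.

```python
CONCURRENCY = 4  # the -c value passed to ab in this benchmark


def ms_per_request(requests_per_second):
    """Mean time per request across all concurrent requests, in ms."""
    return 1000.0 / requests_per_second


def ms_per_user_request(requests_per_second, concurrency=CONCURRENCY):
    """Approximate mean latency seen by one simulated user, in ms."""
    return concurrency * ms_per_request(requests_per_second)


def ratio_percent(requests_per_second, baseline):
    """Throughput relative to the baseline, as a percentage."""
    return 100.0 * requests_per_second / baseline


rates = {"Regular PHP": 15.10, "PHP + APC": 46.90, "HipHop for PHP": 77.01}
for name, rps in rates.items():
    print(f"{name:<15} {ms_per_request(rps):7.3f} ms  "
          f"{ms_per_user_request(rps):7.1f} ms/user  "
          f"{ratio_percent(rps, rates['Regular PHP']):5.0f}%")
```

The small differences from the published time-per-request values come from the requests-per-second column being rounded to two decimal places.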

Other concurrency levels

Out of curiosity, I decided to investigate how Drupal's performance is affected by the number of concurrent users. To do so, I tested the system with seven different workloads simulating 1, 2, 4, 8, 16, 32 and 64 concurrent users.
Here are the results:
Tabular version (requests per second):
Users    PHP     APC   HipHop
-----------------------------
  1      8.68   28.11   40.23
  2     13.34   38.25   56.12
  4     15.28   46.82   74.12
  8     14.76   49.96   72.12
 16     14.04   49.09   74.12
 32     12.35   45.00   77.67
 64      5.22   39.17   73.02
Please note: this test is heavily CPU-bound and should ideally be run on a server with more cores.
As you can see, with APC and the HipHop for PHP translator Drupal scales quite well up to 8 simultaneous users on a dual-core system. Unfortunately, the same cannot be said of the regular PHP interpreter, which is much slower in every tested scenario.
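To make the scaling behaviour easier to read off the table, the sketch below finds the concurrency level at which each environment peaks and how much of that peak throughput survives at 64 users. The data is copied from the table above; the helper names are my own.

```python
users = [1, 2, 4, 8, 16, 32, 64]

# Requests per second at each concurrency level, from the table above.
results = {
    "PHP":    [8.68, 13.34, 15.28, 14.76, 14.04, 12.35, 5.22],
    "APC":    [28.11, 38.25, 46.82, 49.96, 49.09, 45.00, 39.17],
    "HipHop": [40.23, 56.12, 74.12, 72.12, 74.12, 77.67, 73.02],
}


def peak(series):
    """Return (users, req/s) at the highest measured throughput."""
    best = max(range(len(series)), key=lambda i: series[i])
    return users[best], series[best]


def retained_at_max_load(series):
    """Fraction of peak throughput still delivered at 64 users."""
    return series[-1] / max(series)


for name, series in results.items():
    at_users, rate = peak(series)
    print(f"{name:<7} peaks at {at_users:>2} users ({rate:.2f} req/s), "
          f"keeps {retained_at_max_load(series):.0%} of that at 64 users")
```

On these numbers regular PHP peaks at 4 users and keeps only about a third of its peak throughput at 64 users, while the compiled build keeps about 94% of its peak.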

Different optimization levels in GCC compiler

Without any optimization option, the compiler’s goal is to reduce the cost of compilation and to make debugging produce the expected results.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
Let’s see the results of different optimization levels switched on during the Drupal compilation:
optimization    req/sec
    level
--------------------------
   default         74.12
     -O2           87.11
     -O3           90.04
The difference of about 13 req/sec between the default build and -O2 optimization is quite big: it almost equals the 15 req/sec achieved by an interpreted PHP script!
What's more, with -O3 optimization Drupal is almost 6 times faster than in a pure PHP environment.
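The relative gains from GCC's -O2 and -O3 levels can be worked out from the table as follows. This is only a sketch of the arithmetic; the 15.10 req/sec baseline is the interpreted-PHP figure from the concurrency benchmark, and the names are hypothetical.

```python
BASELINE_PHP = 15.10  # req/s for interpreted PHP at 4 concurrent users

# Requests per second for each GCC optimization level, from the table above.
builds = {"default": 74.12, "-O2": 87.11, "-O3": 90.04}


def gain_percent(new, old):
    """Relative throughput improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


for level, rate in builds.items():
    print(f"{level:<8} {rate:6.2f} req/s  "
          f"{gain_percent(rate, builds['default']):+5.1f}% vs default  "
          f"{rate / BASELINE_PHP:4.2f}x vs interpreted PHP")
```

Most of the improvement comes from -O2 (about 17.5% over the default build); -O3 adds only a few more percentage points on top.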

Summary

As I mentioned at the outset, Drupal is not the fastest system on the Internet. However, after several changes in the code to add compatibility with HipHop for PHP, it becomes a very effective tool in the hands of any webmaster.
