Getting the most from your Drupal site means getting the most from your server by optimizing the various layers of the LAMP stack. This includes the filesystem, database, web server, PHP, RAM and CPU. Tuning the LAMP stack is a major subject requiring a lot of study and practice to become proficient. It’s something you will probably never completely master. Try Googling “lamp performance tune” for a few articles to whet your appetite. For now, we’ll cover a few of the major considerations for Drupal, although most of this advice would apply to any PHP web app running on Linux.
Opcode caches cache the compiled form of a PHP script in shared memory to avoid the overhead of parsing and compiling the code every time the script runs. This saves RAM and reduces script execution time.
Quite a bit of benchmarking has been done in the Drupal and PHP communities between APC, eAccelerator and XCache. eAccelerator may have the edge in raw performance, but it appears that APC is the preferred opcode cache in the Drupal community because it is well maintained and less buggy.
There are a number of choices to be made when tuning your MySQL database server. The MySQLTuner script can be helpful for identifying outstanding issues you may be unaware of. It can be run on a functioning production server to see how your database is performing in the wild. It’s possible to take a best guess at config options on your dev machine but you aren’t going to know how things are going to shape up until real users start hitting the DB.
A default install of Drupal 6 installs the DB tables as MyISAM. This will change in Drupal 7 with the default set to InnoDB. A Drupal 6 installation may well have some InnoDB tables, as modules may create new tables in the InnoDB engine. Your installation may therefore be a mix of the two engines.
In many places on the web you will read statements such as “All high performance Drupal sites run InnoDB”. This is not necessarily so, as there are some cases where MyISAM may still be preferred, although with recent changes to Drupal core the pendulum has swung to InnoDB as a sensible default.
A list of the main differences between the engines is as follows:
InnoDB is transactional (better integrity), MyISAM isn’t
InnoDB is more reliable (better recovery), MyISAM can be repaired
InnoDB has row level locking (better concurrency), MyISAM locks tables
InnoDB uses clustered indexes (faster access to data), MyISAM indexes just the keys
InnoDB has a bigger memory footprint
In general, you would consider sticking with MyISAM if:
Memory footprint is an issue. If you have very big indexes which might only just fit into the key buffer then MyISAM could offer faster lookups.
Most activity is read only.
InnoDB tables definitely should be used for all of the Drupal cache tables since this is where most contention is likely to occur.
Finally, it must be noted that Drupal was written based on the MyISAM engine and as such many queries were not optimized for InnoDB. SELECT COUNT(*) is particularly slow in InnoDB because it must scan the rows to calculate the count, whereas MyISAM keeps an exact row count for each table. Many of these shortcomings have been removed in the PressFlow distribution, and the fixes have since made their way back into core.
All sites: InnoDB for less contention on cache
Most sites: InnoDB for everything else
Big unchanging sites: MyISAM for faster reads and less RAM
If you are running MyISAM tables then the key buffer (key_buffer_size) is a very important variable to set. The key buffer stores table indexes in memory, allowing for fast lookups and joins. For large node, node_revisions and url_alias tables it is a must to have enough room to fit their indexes into memory, otherwise your site will be very slow on the most basic of operations: looking up nodes, titles and paths.
One rule of thumb is to set this buffer to somewhere between 25% and 50% of the memory on the server. To determine the best value up front, sum the size of all the .MYI files.
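As a rough illustration of summing the .MYI files, the following Python sketch walks a MySQL data directory and adds up the sizes of the index files. The function names, the 20% headroom factor and the example path /var/lib/mysql are assumptions made for illustration and are not part of MySQL or Drupal, so adjust them to your own layout.

```python
from pathlib import Path


def total_myi_bytes(data_dir):
    """Return the combined size in bytes of all MyISAM index (.MYI) files under data_dir."""
    return sum(
        p.stat().st_size
        for p in Path(data_dir).rglob("*.MYI")
        if p.is_file()
    )


def suggest_key_buffer_mb(data_dir, headroom=1.2):
    """Suggest a key buffer size in MB: total index size plus headroom for growth.

    The headroom factor is an arbitrary assumption, not a MySQL recommendation.
    """
    total = total_myi_bytes(data_dir)
    return int(total * headroom / (1024 * 1024)) + 1


if __name__ == "__main__":
    # /var/lib/mysql is a common default data directory, assumed here;
    # reading it usually requires elevated privileges.
    print(suggest_key_buffer_mb("/var/lib/mysql"), "MB")
```

The suggested figure is only a starting point. It should still be checked against the 25% to 50% rule of thumb and against what else is competing for RAM on the server.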
MySQL has a query cache which stores results up to a certain size in memory. The cache is very handy for quickly returning commonly accessed data when all other forms of caching (reverse proxies, page cache, Drupal caches) have not been invoked. Queries which may take some time return almost instantly.
During the development and testing of a site the query cache can catch developers out, since a query may appear to be performing quite well the second and subsequent times through. To really test a query you need to fire up the mysql client (or phpMyAdmin) and add the SQL_NO_CACHE option to the query to see the real time it takes. Don’t be fooled!
Cached results for a table are invalidated whenever that table is changed, so the query cache cannot be relied upon if tables are changing frequently. The cache shines when there are big tables which don’t change that often. Unless your site has such characteristics it is best to limit the cache so that it fits small unchanging tables, with some room left over for the most popular queries. Examination of cache hit rates will show you if it needs to be extended or reduced.
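One common way to examine the hit rate is to compare the Qcache_hits and Com_select counters reported by SHOW GLOBAL STATUS. Com_select does not count SELECTs that were served from the query cache, so the two counters together approximate the total number of SELECTs. The sketch below shows the arithmetic; the function name is hypothetical and the counter values are assumed to be copied in by hand.

```python
def query_cache_hit_rate(qcache_hits, com_select):
    """Return the fraction of SELECTs answered from the query cache.

    qcache_hits: value of the Qcache_hits status counter.
    com_select:  value of the Com_select status counter (executed SELECTs,
                 which excludes those answered from the cache).
    """
    total = qcache_hits + com_select
    if total == 0:
        # No SELECT traffic recorded yet, so there is nothing to measure.
        return 0.0
    return qcache_hits / total


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    print(query_cache_hit_rate(300, 100))  # prints 0.75
```

A persistently low rate on a busy site suggests the cache is being invalidated faster than it can be reused, in which case giving it more memory is unlikely to help.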
If you are running InnoDB tables then it is essential to optimize the InnoDB Buffer Pool Size (innodb_buffer_pool_size), increasing the memory to reduce query time. InnoDB is more memory intensive and so the pool will be larger than the key buffer used for MyISAM tables. MySQL documentation suggests that the size can be upped to 80% of physical memory on a dedicated database server. Any more could lead to swap issues.
A warm database will perform much better than a recently started one because its caches and buffers will be primed with keys and data. It therefore makes sense to warm up a DB every time the database is restarted. The best way to do this is to load in the indexes of commonly used tables. This guide recommends loading in node, node_revisions and url_alias. The taxonomy tables could be good candidates as well.
LOAD INDEX INTO CACHE node;
LOAD INDEX INTO CACHE node_revisions;
LOAD INDEX INTO CACHE url_alias;
LOAD INDEX INTO CACHE term_data;
LOAD INDEX INTO CACHE term_node;
This SQL code can then be put in a script and run when MySQL restarts. It is possible to configure the init_file variable in my.cnf to tell mysql where to find the startup SQL.
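The warm-up file can also be generated rather than maintained by hand, which is handy if your tables use a Drupal table prefix. The sketch below builds one LOAD INDEX INTO CACHE statement per table on a single line each. The function name, the prefix parameter and the output path mentioned afterwards are hypothetical. Note that LOAD INDEX INTO CACHE preloads indexes into the MyISAM key cache, so it applies to MyISAM tables.

```python
def build_warmup_sql(tables, prefix=""):
    """Return one LOAD INDEX INTO CACHE statement per table, one per line.

    prefix is an optional table prefix (assumed here to mirror Drupal's
    db prefix setting); table names are used exactly as given.
    """
    return "".join(
        "LOAD INDEX INTO CACHE {}{};\n".format(prefix, table)
        for table in tables
    )


if __name__ == "__main__":
    tables = ["node", "node_revisions", "url_alias", "term_data", "term_node"]
    print(build_warmup_sql(tables), end="")
```

The output can be saved to a file (for example a hypothetical /etc/mysql/warmup.sql) and referenced from the init_file setting.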
Indexes on columns can dramatically speed up queries if the columns are used for filtering, sorting or joining. Generally, Drupal has most of the indexes you need covered. However, there are some areas where standard tables can benefit from an additional index. It is recommended that you profile your queries to see where things are slow before adding indexes in a scattergun approach, because every extra index adds overhead to writes and can harm performance if it is not actually used. You can use MySQL’s slow query log, with logging of queries that use no index enabled, to identify areas for improvement.
Apache + MPM Prefork + mod_php is the default web server configuration in the LAMP stack. This combination does consume large amounts of RAM which can be a problem for handling many requests. It can also be quite heavy and slow for serving static content. Many administrators have looked to replace it with other combinations including multithreaded processes (MPM Worker) and external PHP (mod_fcgid) as well as swapping it out completely for another server such as Nginx. This guide has adopted the position that Apache problems can be ameliorated somewhat by removing unneeded modules, running fcgid to connect with PHP and using MPM Worker to enable multithreading per process. However, in some cases this won’t be enough and Nginx is a must.
APACHE VS NGINX
Other Drupal users have replaced Apache with faster, more lightweight (RAM and CPU) web servers such as Nginx and Lighttpd. Nginx is generally preferred over Lighttpd because of memory leaks in the latter. It is currently possible to run Nginx without losing any functionality in Drupal. Boost, a module based on .htaccess rules, now supports Nginx so it is feasible to run Nginx as the main web server. If you are constrained by CPU or have high loads then this certainly is an option worth considering.
Setting up Nginx is not trivial but it is reasonably straightforward if you are comfortable with compiling and patching. There are some good tutorials on the Web for users who want to do this.
Low resources, High Traffic, Many logged in: Possible to get more for less with Nginx.
“Nginx seems to compete pretty well with Apache and there doesn’t seem like there is a good reason not to use it especially in CPU usage constrained situations (ie. huge traffic, slow machines and etc).”
“The following guide will walk you through setting up possibly the fastest way to serve PHP known to man…In this article, we’ll be installing nginx http server, PHP with the PHP-FPM patches, as well as APC.”
It is possible to turn off unneeded modules in Apache to reduce memory footprint. The modules you require depend very much on your setup.
The traditional way of controlling modules in Apache has been through the LoadModule directive in httpd.conf. Ubuntu and Debian do it differently, with the /etc/apache2/mods-available directory and the a2enmod command. To list the modules available to enable, run a2enmod without arguments, then reload Apache after making any changes:
$ sudo a2enmod
$ sudo /etc/init.d/apache2 force-reload
To see which modules are currently enabled, run $ sudo a2dismod without arguments; it lists them before prompting for one to disable.
The use of MPM Worker allows for the handling of more requests due to multithreading in each process. It has a smaller memory footprint than Prefork and is faster. According to the docs, Apache must be compiled with the --with-mpm argument in order to install Worker, as “prefork” is the default on Unix systems.
The use of mod_php with Apache is the most common setup for calling PHP. mod_php works by embedding PHP into every Apache process. This has the disadvantage of a large memory footprint for each Apache process. FastCGI and mod_fcgid overcome this problem and reduce resource utilization, though without any gain in performance.
The MaxClients parameter controls how many simultaneous clients Apache is able to serve. If it is set too high, RAM will be chewed up and the machine will go into swap. If it is set too low, your site will be unnecessarily limited in the number of clients it can serve. The setting of this value should be determined after consideration of (i) how much spare RAM is available on the server and (ii) how much RAM each Apache process consumes. Obviously you will want to maximize available RAM through frugal allocation of RAM to MySQL, JVM, etc. and minimize the size of each Apache process through the techniques described above.
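The two considerations above reduce to a simple division: the RAM left over for Apache divided by the RAM used per Apache process. The sketch below shows that arithmetic. The function name and the example figures are assumptions for illustration only, so measure your own per-process size (for example with top or ps) before settling on a value.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, apache_process_mb):
    """Estimate MaxClients from spare RAM and per-process RAM.

    total_ram_mb:      physical RAM on the server.
    reserved_mb:       RAM set aside for MySQL, the OS and anything else.
    apache_process_mb: typical resident size of one Apache process.
    """
    if apache_process_mb <= 0:
        raise ValueError("apache_process_mb must be positive")
    spare_mb = total_ram_mb - reserved_mb
    if spare_mb <= 0:
        # Nothing left for Apache: the other allocations need trimming first.
        return 0
    return int(spare_mb // apache_process_mb)


if __name__ == "__main__":
    # Made-up figures: 4 GB server, 1.5 GB reserved, 40 MB per Apache process.
    print(estimate_max_clients(4096, 1536, 40))  # prints 64
```

Rounding down is deliberate here, since overshooting pushes the machine into swap while undershooting only costs a little capacity.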
If you are running Apache then it is possible to use either .htaccess or the Apache conf file to specify directives such as rewrite rules. If you use .htaccess then Apache must look for .htaccess files in the directory hierarchy for every request. This can take some time even if no rules are found. You may consider putting the rules in httpd.conf/apache2.conf if you are looking to eke out the most performance from your site.
.htaccess can slow down the site; avoid it if performance is crucial.
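To get a feel for the cost, the sketch below lists the .htaccess locations that would be checked for a single requested file, one per directory from the filesystem root down to the file's own directory. It assumes overrides are allowed at every level, which is the worst case. The function name and the example path are hypothetical.

```python
from pathlib import PurePosixPath


def htaccess_candidates(file_path):
    """List the .htaccess paths checked for file_path, from the root down.

    Assumes overrides are permitted in every directory on the path.
    """
    directory = PurePosixPath(file_path).parent
    # parents runs from the nearest directory up to the root, so reverse it
    # to walk top-down, then finish with the file's own directory.
    dirs = list(directory.parents)[::-1] + [directory]
    return [str(d / ".htaccess") for d in dirs]


if __name__ == "__main__":
    for path in htaccess_candidates("/www/htdocs/example/index.html"):
        print(path)
```

For the example path this gives four lookups for one request, and deeper directory trees cost proportionally more, which is why moving the rules into the main conf file and disabling overrides can be worthwhile.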