Monday, February 21, 2011

Understanding logrotate on CentOS

SkyHi @ Monday, February 21, 2011

Understanding logrotate on CentOS - part 1

It's no fun when log files grow out of control. In this two-part series, learn how to use logrotate to keep those logs in check.

What is logrotate?

It may surprise you to learn that logrotate is a program used to rotate logs. It's true! The system usually runs logrotate once a day, and when it runs it checks rules that can be customized on a per-directory or per-log basis.
"Log rotation" refers to the practice of archiving an application's current log, starting a fresh log, and deleting older logs. And while we're explaining things, a "log" is a file where an application stores information that might be useful to an administrator or developer - what it's been doing, what errors it's run into, that sort of thing. So logs are good, you just usually don't want to keep a ton of them around. That's where logrotate comes in.

The importance of log rotation

Logs are wonderful things when you want to track usage or troubleshoot an application. Unfortunately the more information that gets logged, the more disk space the log uses. Over time it can really add up.
A log left unrotated can grow to a pretty unwieldy size. Running out of disk space because of a giant log is a problem of course, but a huge log file can also slow down the process of resizing or backing up your virtual server. Another practical consideration is that it's hard to look for a particular event if you have a million log entries to skim through. So on the whole it's a good idea to keep log files down to a manageable size, and to prune them when they get too old to be of much use.
Fortunately logrotate makes log rotation easy.

How it works

The system runs logrotate on a schedule, usually daily. In fact, you'll find the script that runs logrotate daily at:
/etc/cron.daily/logrotate
If you want logrotate to run more often (for hourly log rotation, for example) you'll need to look into using cron to run logrotate through a script in /etc/cron.hourly.
When logrotate runs it reads its configuration files to determine where to find the log files it needs to rotate, and to check on details like how often the files should be rotated and how many archived logs to keep.

logrotate.conf

The main logrotate configuration file is located at:
/etc/logrotate.conf
If you look inside that file you'll see the default parameters logrotate uses when it rotates logs. The file is nicely commented, so skim it to see how things are set up. We'll talk about several of the specific commands in that file shortly.
Note that one line reads:
include /etc/logrotate.d
That's where we'll find most of the application-specific configuration files.
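If the comments get in the way, it can help to print only the lines that are actually in effect. The following is a small sketch: the helper name active_directives is made up for illustration, and the sample file only imitates the kind of defaults described here, so check your own /etc/logrotate.conf for the real values.

```shell
#!/bin/sh
# Print only the active (non-comment, non-blank) lines of a config file.
# "active_directives" is a hypothetical helper name, not part of logrotate.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Build a throwaway sample so the sketch runs without touching /etc.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

include /etc/logrotate.d
EOF

defaults=$(active_directives "$sample")
printf '%s\n' "$defaults"
rm -f "$sample"
```

On a real server you would run the helper against /etc/logrotate.conf instead of the sample file.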

logrotate.d

Take a look inside the directory where you'll store application-specific log settings:
ls /etc/logrotate.d
Depending on how much you've installed on your server there may be no files in this directory, or there may be several. In general, applications that are installed through CentOS's package manager (yum) will also create a config file in /etc/logrotate.d.
Most likely you will at least see a config file for syslog, which logrotate will read when it goes to rotate the system logs. If you look inside you'll see an entry for various system logs along with some commands similar to what you saw in logrotate.conf.

Inside an application file

As an example, let's take a look at the contents of a logrotate config file that might be put in place when you install apache:
/var/log/httpd/*log {
    missingok
    notifempty
    sharedscripts
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
}
We'll look at what most of the specific directives in this file mean in a bit, but the short version is that when logrotate runs it will check for any files in /var/log/httpd that end in "log" and rotate them, so long as they aren't empty. If it checks the httpd directory and doesn't find any logfiles it won't throw an error. Then it will run the command in the "postrotate/endscript" block (in this case, a command that tells apache to reload, which makes it reopen its log files), but only once, after it's processed all the specified logs.

What you don't see in that file are some settings you saw back in logrotate.conf. This is because the commands in logrotate.conf act as defaults for log rotation. You can specify different settings for any application where you want to override the defaults. For example, if you run a busy web server, you may want to include a "daily" command in apache's config block so apache's logs will rotate daily instead of the default weekly rotation.

That might be more clear if we talk about what some of the more commonly-used commands actually do in a logrotate config file. So let's do that next.

Configuration commands

You can get a full list of commands used in logrotate configuration files by checking the man page:
man logrotate
We'll go over more commonly-used commands here.
Remember, the config files for applications in /etc/logrotate.d inherit their defaults from the main /etc/logrotate.conf file.

Log files

A log file and its rotation behavior are defined by listing the log file (or files) followed by curly brackets. Most application configuration files will contain just one of these blocks, but it's possible to put more than one in a file, or to add log file blocks to the main logrotate.conf file.

You can list more than one log file for a block either by using a wildcard in the name or by separating log files in the list with spaces. For example, to specify all files in the directory /var/foo that end in ".log", as well as the file "/var/bar/log.txt", you would set up the block like so:
/var/foo/*.log /var/bar/log.txt {
        blah blah blah
        blah blah blah redux
}
Just not with as many blahs.
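
logrotate expands these patterns much like the shell expands wildcards, so a quick way to preview what a pattern will pick up is to try it in a shell loop first. Here is a small sketch that uses a scratch directory and made-up file names in place of /var/foo and /var/bar:

```shell
#!/bin/sh
# Preview which files a logrotate-style pattern would match,
# using a scratch directory with made-up file names.
work=$(mktemp -d)
mkdir "$work/foo" "$work/bar"
touch "$work/foo/access.log" "$work/foo/error.log" "$work/foo/notes.txt"
touch "$work/bar/log.txt"

matched=""
for f in "$work"/foo/*.log "$work"/bar/log.txt; do
    # Skip a pattern that matched nothing (the shell leaves it unexpanded).
    [ -e "$f" ] || continue
    matched="$matched ${f#"$work"/}"
done
echo "would rotate:$matched"
rm -rf "$work"
```

The notes.txt file is left out because it doesn't match either entry in the list.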

Rotate count

The "rotate" command determines how many archived logs will be kept around before logrotate starts deleting the older ones. For example:
rotate 4
That command tells logrotate to keep 4 archived logs at a time. If there are already four archived logs when the log is rotated again, the oldest one (the one with ".4" at the end, usually) will be deleted to make room for the new archive.
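
The numbering is easier to follow with a hands-on example. The loop below is only a plain-shell imitation of the numbered naming scheme (it is not how logrotate itself is implemented, and the file name app.log is invented). It rotates a scratch log six times while keeping four archives.

```shell
#!/bin/sh
# Imitate "rotate 4" with numbered archives in a scratch directory.
keep=4
work=$(mktemp -d)
log="$work/app.log"

rotate_once() {
    # Drop the oldest archive, then shift the rest up by one.
    rm -f "$log.$keep"
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        [ -e "$log.$prev" ] && mv "$log.$prev" "$log.$i"
        i=$prev
    done
    mv "$log" "$log.1"   # archive the current log
    : > "$log"           # start a fresh, empty log
}

run=1
while [ "$run" -le 6 ]; do
    echo "written during run $run" > "$log"
    rotate_once
    run=$((run + 1))
done

ls "$work"                   # app.log plus archives .1 through .4
newest=$(cat "$log.1")       # contents from the most recent run
oldest=$(cat "$log.$keep")   # oldest contents still kept
echo "newest archive: $newest"
echo "oldest archive: $oldest"
# Remove "$work" when you are done looking around.
```

After six rotations only the archives from runs 3 through 6 remain; the first two have been deleted to make room.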

Rotation interval

You can specify a command that will tell logrotate how often to rotate a particular log. The possible commands include:
daily
weekly
monthly
yearly
If a rotation interval is not specified the log will be rotated whenever logrotate runs (unless another condition like "size" has been set).

If you want to use a time interval other than the keywords listed here you'll have to get clever with cron and a separate config file. For example, if you wanted to rotate a particular log file hourly, you could create a file in "/etc/cron.hourly" (you may need to create that directory too) that would contain a line like:
/usr/sbin/logrotate /etc/logrotate.hourly.conf
 
Then put the configuration for that hourly run of logrotate (the log file location, whether or not to compress old files, and so on) into "/etc/logrotate.hourly.conf".
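
As a rough sketch, the cron script could be created like this. So that it can be tried without root access, the example writes the script into a scratch directory; the script name logrotate-hourly is an arbitrary choice, and on a real server you would put the file in /etc/cron.hourly instead.

```shell
#!/bin/sh
# Write an hourly wrapper script for logrotate into a target directory.
# Set target to /etc/cron.hourly (as root) to install it for real.
target=$(mktemp -d)
script="$target/logrotate-hourly"   # arbitrary name

cat > "$script" <<'EOF'
#!/bin/sh
# Run logrotate against the hourly-only config file.
/usr/sbin/logrotate /etc/logrotate.hourly.conf
EOF

# cron only runs files in cron.hourly that are executable.
chmod 755 "$script"
ls -l "$script"
```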

Size

You can specify a file size that logrotate will check when determining whether or not to perform a rotation by using the "size" command. The format of the command tells logrotate what units you're using to specify the size:
size 100k
size 100M
size 100G
 
The first example would rotate the log if it gets larger than 100 kilobytes, the second if it's larger than 100 megabytes, and the third if it's over 100 gigabytes. I don't recommend using a limit of 100G, mind you, the example just got a little out of hand there.
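
To make the units concrete, here is a small sketch that converts those settings to bytes. It assumes logrotate's convention that k, M and G are multiples of 1024; size_to_bytes is a made-up helper for illustration, not something logrotate provides.

```shell
#!/bin/sh
# Convert a logrotate-style size (e.g. 100k, 100M, 100G) to bytes.
# Assumes k/M/G are powers of 1024; a bare number is taken as bytes.
size_to_bytes() {
    num=${1%[kMG]}
    case $1 in
        *k) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *)  echo "$num" ;;
    esac
}

for s in 100k 100M 100G; do
    echo "size $s = $(size_to_bytes "$s") bytes"
done
```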

Be careful when combining "size" with a rotation interval. In logrotate both settings define the same thing, the condition that triggers a rotation, so they don't add up: whichever one appears last in the config wins. If "size" comes after "weekly", for example, the log is rotated only once it has grown past that size, no matter how many weeks have gone by, and a quiet log may not rotate for a long time. If you want a log rotated on its usual interval, or sooner if it gets too big, check whether your version's man page lists the "maxsize" command, which was added in later releases of logrotate for that purpose.

Compression

If you want archived logfiles to be compressed (in gzip format) you can include the following command, usually in /etc/logrotate.conf:
compress
This is normally a good idea, since log files are usually all text, and text compresses very well. You might, however, have some archived logs you don't want compressed, but still want compression to be on by default. In those cases you can include the following command in an application-specific config:
nocompress
 
One more command of note in regard to compression is:
delaycompress
 
This command can be useful if you want the archived logs to be compressed, but not right away. With "delaycompress" active an archived log won't be compressed until the next time the log is rotated. This can be important when you have a program that might still write to its old logfile for a time after a fresh one is rotated in. Note that "delaycompress" only works if you also have "compress" in your config.

An example of a good time to use delaycompress would be when logrotate is told to restart apache with the "graceful" or "reload" directive. Since old apache processes would not be killed until their connections are finished, they could potentially try to log more items to the old file for some time after the restart. Delaying the compression ensures that you won't lose those extra log entries when the logs are rotated.

Postrotate

The "postrotate" script is run by logrotate each time it rotates a log specified in a config block. You'll usually want to use this to restart an application after the log rotation so the app can switch to a new log.
postrotate
    /usr/sbin/apachectl restart > /dev/null
endscript
 
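That "> /dev/null" bit at the end is handled by the shell that runs the script, and it redirects the command's normal output to, well, nowhere. Otherwise that output would be sent off to the console or the log or email or whatever, and in this case, you don't really care about the output if everything restarted okay. Error messages are a separate stream and will still get through unless you also add "2>&1" (or "2>/dev/null", as in the httpd example earlier).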
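
The difference between discarding normal output and discarding everything is easy to check in a shell. In this sketch, noisy is a made-up stand-in for a command such as apachectl; it just prints one line to each output stream.

```shell
#!/bin/sh
# "noisy" is a made-up stand-in that writes to both output streams.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# Capture whatever still gets through after each style of redirection.
only_stdout_discarded=$( { noisy > /dev/null; } 2>&1 )
both_discarded=$( { noisy > /dev/null 2>&1; } 2>&1 )

echo "with '> /dev/null':      [$only_stdout_discarded]"
echo "with '> /dev/null 2>&1': [$both_discarded]"
```

The first line still shows the error message, while the second shows nothing between the brackets.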

The "postrotate" command tells logrotate that the script to run will start on the next line, and the "endscript" command says that the script is done.

Sharedscripts

Normally logrotate will run the "postrotate" script every time it rotates a log. This is true for multiple logs using the same config block. So for example, a web server config block that refers to both the access log and the error log will, if it rotates both, run the "postrotate" script twice (once for each file rotated). So if both files are rotated, the web server will be restarted twice.

To keep logrotate from running that script for every log, you can include the command:
sharedscripts
That tells logrotate to wait until it's checked all the logs for that config block before running the postrotate script. If one or both of the logs get rotated, the postrotate script still only gets run once. If none of the logs get rotated, the postrotate script won't run at all.

Summary

You've seen an overview of what logrotate does and what kind of configuration options are available to you. You should be all set to go poking around in the existing configs and adapt them to your needs. But let's not stop there! In the next article we'll look at putting an example config together (to rotate the logs for custom virtual hosts), and also cover some handy troubleshooting approaches.


Understanding logrotate on CentOS - part 2

In this second part of the logrotate series we look at how to set up rotation for virtual host logs, as well as some troubleshooting techniques.

Applying knowledge

In the previous article we talked about what logrotate does and how you can configure it. In this article we'll apply this new knowledge to putting together a log rotation solution for a custom virtual host or two (or three, or four, etc.). We'll also look at some options for testing and troubleshooting logrotate.

Tying it all together: virtual host logs

To show how you can use logrotate for your own applications, let's look at an example that will come in handy for a lot of people: rotating logs for your custom virtual hosts. We'll use apache for this example, but it can be tweaked pretty easily for other web servers like nginx or lighttpd, usually just by changing the postrotate script.

First we'll want to create a file to hold the configuration that will tell logrotate what to do with the virtual host's log files. We won't edit the main config file or the web server's config file, since there's always a possibility that a future package upgrade might want to overwrite the config. Instead we'll make our own. Let's call it:
/etc/logrotate.d/virtualhosts
 
This example tosses all the virtual hosts into one file, but if you have one that's busier than others you may want to create separate config files to handle the needs of your different domains. We'll also specify several items that are probably already set in your main config, just so we cover all the bases.

The files

We'll say that we have two virtual domains, domain1.com and domain2.com, and that the log files for each are in /home/demo/public_html/(domain name)/log. The first thing we'll do in our config file is tell logrotate where to find the log files, then start the config block for them:
/home/demo/public_html/domain1.com/log/*log /home/demo/public_html/domain2.com/log/*log {
If you have more log directories or files to add, just insert them into that list.

Rotate

Next we'll want to make sure logrotate only keeps as many old logs as we want:
rotate 14
We'll use 14 in this example to keep two weeks' worth of logs, but you can of course adjust that number to something suitable to your requirements.

Interval

Now we'll tell logrotate to rotate these logs daily (again, change it if you prefer a longer interval):
daily

Size (optional)

If you prefer a weekly rotation you might want a size limit as well, to be on the safe side. Keep in mind that "size" replaces a time interval rather than working alongside it (the setting that appears last wins), so a block with "weekly" followed by the line below rotates only when a log passes 50 megabytes:
size 50M
If what you want is a weekly rotation plus an early rotation when a log gets too large (from unexpectedly heavy traffic, for example), check whether your version of logrotate supports "maxsize", which takes the same kind of value and is meant for that case.

Compression

We'll specify whether or not we want these logs to be compressed when they're archived. For this example we'll use delaycompress to account for the graceful restart of apache, which means we also need to turn compression on:
compress
delaycompress

Sharedscripts

You might have several virtual hosts, and that would mean several logs to rotate. To make sure the web server only gets restarted after all the rotations are done we add:
sharedscripts

Postrotate

We'll specify a postrotate script that will restart the web server:
postrotate
        /usr/sbin/apachectl graceful > /dev/null
endscript
And finally, we close the config block with a curly bracket:
}

The whole shebang

Once we bring it all together our config file will look like this:
/home/demo/public_html/domain1.com/log/*log /home/demo/public_html/domain2.com/log/*log {
        rotate 14
        daily
        compress
        delaycompress
        sharedscripts
        postrotate
                /usr/sbin/apachectl graceful > /dev/null
        endscript
}
You'll want to test that, of course, either by making sure you're watching things when the nightly cron jobs are run, or by running logrotate right now:
/usr/sbin/logrotate /etc/logrotate.conf
 
If you don't get any errors back you should be okay. But if you want to be absolutely certain you can run through some of the tests we would use when we suspect something isn't working right.

Testing logrotate

If you suspect logrotate is having some trouble, or you just want to make sure a new config you've put in place will work, there are some useful flags you can pass to logrotate when you run it from the command line:

Verbose

The verbose flag, "-v", tells logrotate to say what it's doing while it's doing it. It's very useful when trying to find out why logrotate doesn't rotate a log when you want it to.

Debug

The debug flag, "-d", tells logrotate to go through the motions of rotating logs but not actually rotate them. It can be handy if you want to test a new config file but don't want any actual log rotation run when you do (if you're working on a production server, for example).

The debug flag is good for checking that the config file is formatted properly and that logrotate can find the log files it would rotate. However, since it doesn't actually run the rotations it doesn't test some parts of the process like the postrotate scripts.
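
One way to try the debug flag without going anywhere near real logs is to build a tiny sandbox. This is a sketch under a few assumptions: the paths and the app.log name are invented, logrotate is looked up in your PATH, and the "-s" flag is used so that the state file also lives in the scratch directory.

```shell
#!/bin/sh
# Dry-run a logrotate config in a scratch directory (all paths are throwaway).
work=$(mktemp -d)
mkdir "$work/logs"
echo "sample entry" > "$work/logs/app.log"

# The heredoc is unquoted on purpose so $work expands to the scratch path.
cat > "$work/test.conf" <<EOF
$work/logs/*.log {
    daily
    rotate 7
    missingok
    notifempty
}
EOF

if command -v logrotate >/dev/null 2>&1; then
    # -d: debug (dry run), -s: keep state in the scratch dir, not /var/lib
    logrotate -d -s "$work/state" "$work/test.conf"
else
    echo "logrotate not found in PATH; skipping the dry run"
fi

# A dry run must leave the log exactly where it was.
ls "$work/logs"
```

Since nothing is actually rotated in debug mode, the listing at the end should still show only app.log.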

Force

The force flag, "-f", forces logrotate to rotate all logs when it runs, whether or not they would normally need to be rotated at that time. If you want to thoroughly test logrotate's configs this is the flag to use. Just remember that logrotate will be rotating logs and deleting old ones according to the configuration you've set up, so don't accidentally rotate out a recent log you needed to keep.

The force flag can be useful if you're convinced that logrotate should be rotating a log, but it isn't. Forcing the issue will help you tell if the problem is that logrotate doesn't think the log needed rotating (if you run with the force flag and the log is rotated), or if the problem is that logrotate isn't able to affect the log file (if you run it and nothing happens to the log).

Note that if logrotate is set to add a date to the name of an archived log, not even using the force flag will get logrotate to make a new archive in the same day (since the name it would use for the archive is already taken). In that circumstance you may need to rename the most recent archive (for each log file in a given config block) before you can force a log rotation.

Combining flags

The testing flags can be used together quite effectively. To have logrotate tell you what it would do if you made it rotate everything, but not actually rotate anything, you can combine all three:
/usr/sbin/logrotate -vdf /etc/logrotate.conf
 
You'll get treated to a long list of things logrotate would do, including which log files it would rotate and what it would do during that process.

If you then want to test all the rotate configs in their entirety — including the scripts run after rotations — you can run logrotate without the debug flag:
/usr/sbin/logrotate -vf /etc/logrotate.conf
 
All the logs will be rotated, and skimming the output should help you catch any obvious problems. You'll also want to make sure that all your services are still running okay (that there was nothing wrong with the postrotate scripts), and that all the logs actually did get rotated.

How logrotate remembers

If you find that a log isn't rotating even though it's old enough that it should, a simple way to fix the problem is to manually run logrotate with the "-f" flag. But if you're the sort who wants to know why something's gone wrong, there's one more file you can check before forcing a rotation:
/var/lib/logrotate.status
 
That file is where logrotate stores information about when it last rotated each log file. If you look inside you'll see something like:

logrotate state -- version 2
"/var/log/acpid.log" 2010-6-18
"/var/log/iptables.log" 2010-6-18
"/var/log/uucp.log" 2010-6-29
...
 
It's a straightforward format - the log file location is on the left, and the date when it was last rotated is on the right. Sometimes it can happen that the dates on your server get a little wonky (if you were tinkering with an NTP service or the like), and the date when a log was last rotated winds up being a future date. If that's happened you'll see it here.
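
For a quick lookup, a line of awk is enough. The sketch below works on a scratch copy that reuses the sample entries shown above; last_rotated is a hypothetical helper name, it assumes log paths without spaces, and on a real server you would point it at /var/lib/logrotate.status.

```shell
#!/bin/sh
# Print the last-rotation date recorded for one log in a status file.
# usage: last_rotated STATUS_FILE LOG_PATH   (helper name is made up)
last_rotated() {
    # Entries are quoted in the status file, so add the quotes before comparing.
    awk -v target="\"$2\"" '$1 == target { print $2 }' "$1"
}

status=$(mktemp)
cat > "$status" <<'EOF'
logrotate state -- version 2
"/var/log/acpid.log" 2010-6-18
"/var/log/iptables.log" 2010-6-18
"/var/log/uucp.log" 2010-6-29
EOF

last_rotated "$status" /var/log/uucp.log
```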

If you want to check logrotate out with a particular log file but don't want to force everything to rotate, you can delete the log's entry from the logrotate status file. Then when you run logrotate normally it should create a new entry for the log with today's date (even though it may not actually rotate the log - it uses that first run as a baseline if it's just interval-based).
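
Removing an entry can be done with any text editor, or with grep as in this sketch. It operates on a scratch file holding the sample entries from above, so nothing real is modified; on a live system you would work on /var/lib/logrotate.status as root, ideally after making a backup like the one shown here.

```shell
#!/bin/sh
# Drop one log's entry from a (scratch) logrotate status file.
status=$(mktemp)
cat > "$status" <<'EOF'
logrotate state -- version 2
"/var/log/acpid.log" 2010-6-18
"/var/log/iptables.log" 2010-6-18
"/var/log/uucp.log" 2010-6-29
EOF

# The quotes are part of the entry as it appears in the file.
entry='"/var/log/iptables.log"'

cp "$status" "$status.bak"                     # keep a backup first
grep -vF "$entry " "$status.bak" > "$status"   # copy everything but that entry
cat "$status"
```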

Summary

For something that runs quietly in the background and only really performs one type of task, logrotate does quite a bit. You hopefully understand logrotate better than you wanted to. At the least you should be able to set up new logrotate config files for your own purposes, either creating them from scratch or copying existing configs and modifying them appropriately. And most importantly, you can keep your logs from getting out of control.

REFERENCES
http://articles.slicehost.com/2010/6/30/understanding-logrotate-on-centos-part-1
http://articles.slicehost.com/2010/6/30/understanding-logrotate-on-centos-part-2