
Troubleshooting Memory Usage

SkyHi @ Tuesday, December 22, 2009

Processes dying unexpectedly?  Want to know if you need more memory?

Check your /var/log/messages.  If you see (on a 2.4.23 or later kernel):

<code>Dec 11 10:21:43 www kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Dec 11 10:21:44 www kernel: __alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
</code>

Or (on a pre-2.4.23 kernel):

<code>Dec 7 23:49:03 www kernel: Out of Memory: Killed process 31088 (java).
Dec 7 23:49:03 www kernel: Out of Memory: Killed process 31103 (java).
</code>

Or on a Xen-based VPS console:

<code>swapper: page allocation failure. order:0, mode:0x20
 [<c01303a4>] __alloc_pages+0x327/0x3e3
</code>

Then your programs need more memory than they can get.
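A quick way to scan your logs for these messages (assuming your syslog writes kernel messages to /var/log/messages; the path varies by distro):

<code>grep -iE "out of memory|allocation fail" /var/log/messages
</code>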

Interpreting Free

To see how much memory you are currently using, run free -m.  It will provide output like:

<code>             total       used       free     shared    buffers     cached
Mem:            90         85          4          0          3         34
-/+ buffers/cache:         46         43
Swap:            9          0          9
</code>

The top row 'used' value (85) will almost always nearly match the top row total value (90), since Linux likes to use any spare memory to cache disk blocks (the cached value, 34).

The key figure to look at is the used value in the -/+ buffers/cache row (46).  This is how much memory your applications are currently using.  For best performance, this number should be less than your total memory (90).  To prevent out of memory errors, it needs to be less than the sum of your total memory (90) and swap space (9).

If you wish to quickly see how much memory is free, look at the free value in the -/+ buffers/cache row (43).  This is the total memory (90) minus the memory actually used by applications (46).  (90 - 46 is 44 rather than 43 only because of rounding in the megabyte figures.)
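To pull that free-for-applications figure out directly, a one-liner like this works against the older free output shown above (newer procps versions replace the -/+ buffers/cache row with an 'available' column):

<code># print the memory (in MB) still free for applications
free -m | awk '/buffers\/cache/ {print $4}'
</code>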

Interpreting ps

If you want to see where all your memory is going, run ps aux.  That will show the percentage of memory each process is using.  You can use it to identify the top memory users (usually Apache, MySQL and Java processes).

For example in this output snippet:

<code>USER PID %CPU %MEM VSZ     RSS   TTY   STAT   START TIME COMMAND
root 854 0.5  39.2 239372  36208 pts/0 S     22:50 0:05 /usr/local/jdk/bi
n/java -Xms16m -Xmx64m -Djava.awt.headless=true -Djetty.home=/opt/jetty -cp /opt
/jetty/ext/ant.jar:/opt/jetty/ext/jasper-compiler.jar:/opt/jetty/ext/jasper-runt
ime.jar:/opt/jetty/ext/jcert.jar:/opt/jetty/ext/jmxri.jar:/opt/jetty/ext/jmxtool
</code>

We can see that java is using up 39.2% of the available memory.
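To list the biggest memory users directly, you can ask ps to sort by memory usage (the --sort option is available in the procps version of ps found on most Linux distros):

<code># show the ten processes using the most memory
ps aux --sort=-%mem | head -n 11
</code>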

Interpreting vmstat

vmstat helps you to see, among other things, if your server is swapping.  Take a look at the following run of vmstat doing a one second refresh for two iterations.

<code># vmstat 1 2
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  39132   2416    804  15668   4   3     9     6  104    13   0   0 100
 0  0  0  39132   2416    804  15668   0   0     0     0   53     8   0   0 100
 0  0  0  39132   2416    804  15668   0   0     0     0   54     6   0   0 100
</code>

The first row shows your server averages.  The si (swap in) and so (swap out) columns show if you have been swapping (i.e. needing to dip into 'virtual' memory) in order to run your server's applications.  The si/so numbers should be 0 (or close to it).  Numbers in the hundreds or thousands indicate your server is swapping heavily.  This consumes a lot of CPU and other server resources and you would get a very (!) significant benefit from adding more memory to your server.

Some other columns of interest: the r (runnable), b (blocked) and w (waiting) columns help you see your server load.  Waiting processes are swapped out.  Blocked processes are typically waiting on I/O.  The runnable column is the number of processes trying to do something.  These numbers combine to form the 'load' value on your server.  Typically you want the load value to be one or less per CPU in your server.

The bi (blocks in) and bo (blocks out) columns show disk I/O (including swapping memory to/from disk) on your server.

The us (user), sy (system) and id (idle) columns show how your server's CPU time is being spent.  The higher the idle value, the better.

Resolving: High Java Memory Usage

Java processes can often consume more memory than any other application running on a server.

Java processes can be passed a -Xmx option.  This controls the maximum Java memory heap size.  It is important to set a limit on the heap size, otherwise the heap will keep increasing until you get out of memory errors on your VPS (resulting in the Java process - or even some other, random, process - dying).

Usually the setting can be found in your /usr/local/jboss/bin/run.conf or /usr/local/tomcat/bin/setenv.sh config files.  And your RimuHosting default install should have a reasonable value in there already.

If you are running a custom Java application, check there is a -XmxNNm (where NN is a number of megabytes) option on the Java command line.
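For example, a custom application launched like this (the jar path is only a placeholder) would have its heap capped at 64 MB:

<code>java -Xms16m -Xmx64m -jar /path/to/yourapp.jar
</code>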

The optimal -Xmx value will depend on what you are running and on how much memory is available on your server.

From experience we have found that Tomcat often runs well with an -Xmx between 48m and 64m.  JBoss will need a -Xmx of at least 96m to 128m.  You can set the value higher.  However, you should ensure that there is memory available on your server.

To determine how much memory you can spare for Java, try this: stop your Java process; run free -m; subtract the 'used' value from the "-/+ cache" row from the total memory allocated to your server and then subtract another 'just in case' margin of about 10% of your total server memory.  The number you come up with is a rough indicator of the largest -Xmx setting you can use on your server.
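As a rough sketch of that calculation, with made-up numbers for an example 160 MB server (the figures are placeholders, not measured values):

<code># example figures only - substitute your own free -m readings
TOTAL=160                 # total memory allocated to the server (MB)
USED=60                   # "-/+ buffers/cache" used value with Java stopped (MB)
MARGIN=$((TOTAL / 10))    # 'just in case' margin of about 10%
echo "Largest -Xmx is roughly $((TOTAL - USED - MARGIN))m"   # prints 84m
</code>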

Resolving: High SpamAssassin Memory Usage

Are you running a SpamAssassin 'daemon'?  It can create multiple (typically 5) child processes, and each of those children can use a very large amount of memory.

SpamAssassin works very well with just one thread.  So you can reduce the 'children' setting and reclaim some memory on your server for other apps to run with.

<code>for location in /etc/default/spamassassin /etc/sysconfig/spamassassin; do
  if [ ! -e $location ]; then continue; fi
  replace "SPAMDOPTIONS=\"-d -c -m5 -H" "SPAMDOPTIONS=\"-d -c -m1 -H" -- /etc/init.d/spamassassin
  replace "\-m 10 " "-m 1 " -- $location
  replace "\-m 5 " "-m 1 " -- $location
  replace "\-m5 " "-m1 " -- $location
  replace "max-children 5 " "max-children 1 " -- $location
done
</code>

Another thing to check with SpamAssassin is that any /etc/procmailrc entry only runs one SpamAssassin check at a time.  Otherwise, if you receive a batch of incoming email, the messages will all be processed in parallel.  This could cause your server's CPU usage to spike, slowing down your other apps, and it may cause your server to run out of memory.

To make procmailrc run only one email at a time through SpamAssassin, use a lockfile on your recipe line, e.g. change the top line of:

<code>:0fw:
# The following line tells Procmail to send messages to SpamAssassin only if they are less than 256000 bytes. Most spam falls well below this size and a larger size could seriously affect performance.
* < 256000
| /usr/bin/spamc
</code>

To:

<code>:0fw:/etc/mail/spamc.lock
# The following line tells Procmail to send messages to SpamAssassin only if they are less than 256000 bytes. Most spam falls well below this size and a larger size could seriously affect performance.
* < 256000
| /usr/bin/spamc
</code>

Resolving: High Apache Memory Usage

Apache can be a big memory user.  Apache runs a number of 'servers' and shares incoming requests among them.  The memory used by each server grows, especially when the web page being returned by that server includes PHP or Perl that needs to load in new libraries.  It is common for each server process to use as much as 10% of a server's memory.

To reduce the number of servers, you can edit your httpd.conf file.  There are three settings to tweak: StartServers, MinSpareServers, and MaxSpareServers.  Each can be reduced to a value of 1 or 2 and your server will still respond promptly, even on quite busy sites.  Some distros have multiple versions of these settings depending on which process model Apache is using.  In this case, the 'prefork' values are the ones that would need to change.
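For example, on a small VPS the prefork section of httpd.conf could be trimmed to something like this (the values are only a starting point; the following paragraph shows one way to estimate a MaxClients value):

<code><IfModule prefork.c>
StartServers         1
MinSpareServers      1
MaxSpareServers      2
MaxClients          10
MaxRequestsPerChild 500
</IfModule>
</code>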

To get a rough idea of how to set the MaxClients directive, it is best to find out how much memory the largest Apache process is using.  Then stop Apache, check the free memory, and divide that amount by the size of the Apache process found earlier.  The result will be a rough guideline that can be used to further tune (up/down) the MaxClients directive.  The following script can be used to get a general idea of how to set MaxClients for a particular server:

<code>#!/bin/bash
echo "This is intended as a guideline only!"
if [ -e /etc/debian_version ]; then
    APACHE="apache2"
elif [ -e /etc/redhat-release ]; then
    APACHE="httpd"
fi
RSS=`ps -aylC $APACHE |grep "$APACHE" |awk '{print $8}' |sort -n |tail -n 1`
RSS=`expr $RSS / 1024`
echo "Stopping $APACHE to calculate free memory"
/etc/init.d/$APACHE stop &> /dev/null
MEM=`free -m |head -n 2 |tail -n 1 |awk '{free=($4); print free}'`
echo "Starting $APACHE again"
/etc/init.d/$APACHE start &> /dev/null
echo "MaxClients should be around" `expr $MEM / $RSS`
</code>

Note: httpd.conf should be tuned correctly on our newer WBEL3 and FC2 distros.  Apache is not installed by default on our Debian distros (since some people opt for Apache 2 and others prefer Apache 1.3).  So this change should only be necessary if you have a Debian distro.

Resolving: High MySQL Memory Usage

Our rpm based distros (e.g. RH9 and WBEL3) have MySQL preinstalled but not running.  Our pre-install uses a memory efficient /etc/my.cnf file.  If you install MySQL on a Debian server, edit the key_buffer_size setting in /etc/mysql/my.cnf.  A small value like 2M often works well.  For an ultra-tiny setup, add or change the following entries in the [mysqld] section:

<code># if you are not using the InnoDB storage engine, skip it to save some memory
#skip-innodb
innodb_buffer_pool_size = 16k
key_buffer_size = 16k
myisam_sort_buffer_size = 16k
query_cache_size = 1M
</code>
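After editing my.cnf, restart MySQL so the new settings take effect (the init script name varies by distro, e.g. mysql on Debian or mysqld on Red Hat style systems):

<code>/etc/init.d/mysql restart
</code>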

Troubleshooting Irregular Out Of Memory Errors

Sometimes a server's regular memory usage is fine, but it intermittently runs out of memory, and when that happens you may lose track of what caused it.

In this case you can set up a script (see below) that regularly logs your server's memory usage.  If there is a problem, you can check the logs to see what was running at the time.

<code># create a memmon.sh script that logs the current date, memory usage and running processes
cat << 'EOF' > /root/memmon.sh
#!/bin/bash
date
uptime
free -m
vmstat 1 5
ps auxf --width=200
if which iptables > /dev/null 2>&1; then
    iptables -L | diff iptables_default - | awk '{print "IPTABLES: " $0}'
    iptables -L > iptables_default
else
    echo "IPTABLES MISSING"
fi
dmesg | diff -u dmesg_default - | grep '^+' | awk '{print "DMESG:" $0}'
dmesg > dmesg_default
EOF

chmod +x /root/memmon.sh

# create a cronjob that runs every few minutes to log the memory usage
echo '0-59/10 * * * * root /root/memmon.sh >> /root/memmon.txt' > /etc/cron.d/memmon
/etc/init.d/cron* restart

# create a logrotate entry so the log file does not get too large
echo '/root/memmon.txt {}' > /etc/logrotate.d/memmon
</code>
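When the problem next occurs, look back through the log for the entries recorded around that time, e.g.:

<code>less /root/memmon.txt
</code>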

Just Add Memory

A simple solution to resolving most out of memory problems is to add more memory.  If you'd like to increase the memory on your VPS, just send us a support ticket and let us know how much memory you need (per the pricing here).



Reference: http://rimuhosting.com/howto/memory.jsp