Thursday, August 20, 2009

Linux Find Large Files and sort du -h output by size

SkyHi @ Thursday, August 20, 2009
Q. How do I find out all large files in a directory?

A. No single command lists all large files, but the find command combined with shell pipes makes it easy.

Linux List All Large Files

To find all files over 50,000KB (roughly 50MB) in size and display their names along with their sizes, use the following syntax:

Syntax for RedHat / CentOS / Fedora Linux

find {/path/to/directory/} -type f -size +{size-in-kb}k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }' 

To find big files (over 50MB) in the current directory, enter:
$ find . -type f -size +50000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }' 

Search in my /var/log directory:
# find /var/log -type f -size +100000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'

Syntax for Debian / Ubuntu Linux

find {/path/to/directory} -type f -size +{file-size-in-kb}k -exec ls -lh {} \; | awk '{ print $8 ": " $5 }' 

Search in current directory:
$ find . -type f -size +10000k -exec ls -lh {} \; | awk '{ print $8 ": " $5 }'

Sample output:
./.kde/share/apps/akregator/Archive/http___blogs.msdn.com_MainFeed.aspx?Type=AllBlogs.mk4: 91M
./out/out.tar.gz: 828M
./.cache/tracker/file-meta.db: 101M
./ubuntu-8.04-desktop-i386.iso: 700M
./vivek/out/mp3/Eric: 230M
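
The awk column numbers above depend on the date format that ls prints, and they break on file names containing spaces. A minimal sketch of an alternative, assuming GNU find (the -printf action is a GNU extension) and using a hypothetical helper name, list_large_files, that is not part of the original tip:

```shell
# Hypothetical helper (not from the original post): list regular files larger
# than a threshold, largest first, as "<size-in-bytes><TAB><path>".
# Assumes GNU find, because -printf is a GNU extension.
list_large_files() {
    local dir="${1:-.}"          # directory to search (default: current directory)
    local min_kb="${2:-50000}"   # threshold in KB, passed to -size +Nk
    find "$dir" -type f -size +"${min_kb}k" -printf '%s\t%p\n' | sort -rn
}

# Example: files over 50,000KB under the current directory
# list_large_files . 50000
```

Because find prints the size itself, no ls or awk step is needed and the output can be sorted numerically.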
 
The commands above list files that are larger than 10,000 kilobytes. To list all files in your home directory tree smaller than 500 bytes, use the c (bytes) suffix; the b suffix means 512-byte blocks, not bytes:
$ find "$HOME" -size -500c
OR
$ find ~ -size -500c
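
The size suffixes are easy to mix up. In GNU find, c counts bytes, k counts kilobytes, and b (or no suffix at all) counts 512-byte blocks, so -size -500b matches anything under 500 blocks (roughly 250KB). A small sketch of a byte-based search, with a hypothetical helper name chosen for illustration:

```shell
# Hypothetical helper: list regular files strictly smaller than N bytes.
# The c suffix makes find count in bytes; b (or no suffix) would count
# 512-byte blocks instead.
find_smaller_than_bytes() {
    local dir="${1:-.}"         # directory to search
    local max_bytes="${2:-500}" # upper bound in bytes (exclusive)
    find "$dir" -type f -size -"${max_bytes}c"
}

# Example: files under 500 bytes in the home directory tree
# find_smaller_than_bytes "$HOME" 500
```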

To list all files on the system whose size is exactly 20 512-byte blocks, type:
# find / -size 20

ls command: finding the largest files in a directory

You can also use the ls command; the -S option sorts by size, largest first:
$ ls -lS
$ ls -lS | less
$ ls -lS | head -n 10

ls command: finding the smallest files in a directory

Use the ls command with -r to reverse the sort, so that the smallest files come first:
$ ls -lSr
$ ls -lSr | less
$ ls -lSr | head -n 10

You can also use the du command, as pointed out by georges in the comments.
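
As a rough illustration of the du approach, the sketch below sorts human-readable du output by size. It assumes GNU coreutils, whose sort understands the K, M and G suffixes through the -h option; the helper name largest_entries is hypothetical.

```shell
# Hypothetical helper: show the N largest entries (files and directories)
# under a directory with human-readable sizes, largest first.
# Assumes GNU coreutils: sort -h compares sizes such as 4.0K, 12M and 1.5G.
largest_entries() {
    local dir="${1:-.}"     # directory to inspect
    local count="${2:-10}"  # number of lines to show
    du -ah "$dir" | sort -rh | head -n "$count"
}

# Example: the 10 largest entries under /var/log
# largest_entries /var/log 10
```

Note that du -a reports directories as well as files, so the directory itself normally appears first with the total.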


Perl hack: To display large files

Jonathan has contributed the following Perl code. It prints a row of stars for each folder, sorted from smallest to largest, and the length of the row reflects that folder's disk usage:

# du -k | sort -n | perl -ne 'if ( /^(\d+)\s+(.*$)/){$l=log($1+.1);$m=int($l/log(1024)); printf  ("%6.1f\t%s\t%25s  %s\n",($1/(2**(10*$m))),(("K","M","G","T","P")[$m]),"*"x (1.5*$l),$2);}'

Sample output:
996.0 K ********** ./3dgolddenhouse.com/html/gallery/wood_b_slides
1.0 M ********** ./3dgolddenhouse.com/html/gallery/draperies_slides
1.1 M ********** ./3dgolddenhouse.com/html/gallery/roller_shades_slide
1.1 M ********** ./3dgolddenhouse.com/html/gallery/wood_shutters_slides
1.2 M ********** ./stellmnao.com/html/cchis
1.2 M ********** ./stellmnao.com/html
1.2 M ********** ./stellmnao.com
1.3 M ********** ./3dgolddenhouse.com/html/gallery/cellular_slides
1.3 M ********** ./3dgolddenhouse.com/html/gallery/faux_wood_slides
1.3 M ********** ./heretogohome.net/html/bo/app/webroot
1.5 M ********** ./3dgolddenhouse.com/html/gallery/sheer_shades_slides
1.6 M *********** ./3dgolddenhouse.com/html/gallery/vinyl_shutters_slides
1.7 M *********** ./heretogohome.net/html/bo/cake/libs
1.9 M *********** ./heretogohome.net/html/bo/app
2.8 M *********** ./heretogohome.net/html/bo/cake
3.1 M ************ ./3dgolddenhouse.com/html/images
3.1 M ************ ./3dgolddenhouse.com/html/gallery/aluminum_slides
3.2 M ************ ./3dgolddenhouse.com/html/gallery/roman_shades_slides
5.3 M ************ ./heretogohome.net/html/bo
5.3 M ************ ./heretogohome.net/html
5.3 M ************ ./heretogohome.net
18.0 M ************** ./3dgolddenhouse.com/html/gallery
21.3 M ************** ./3dgolddenhouse.com/html
21.3 M ************** ./3dgolddenhouse.com
27.8 M *************** .

Linux tip: du --max-depth=1
du --max-depth=1 | sort -n | awk 'BEGIN {OFMT = "%.0f"} {print $1/1024,"MB", $2}' > diskusage.txt
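
The one-liner above relies on du's default 1KB units and on awk splitting at whitespace, so a directory name containing a space gets cut short. A hedged variant, assuming GNU du (for --max-depth) and using a hypothetical helper name, dir_usage_mb:

```shell
# Hypothetical helper: per-directory usage one level deep, in whole megabytes,
# smallest first. Assumes GNU du for --max-depth.
dir_usage_mb() {
    local dir="${1:-.}"   # directory to summarize
    # -k forces 1KB units so that dividing by 1024 yields megabytes;
    # splitting on the tab keeps paths that contain spaces intact.
    du -k --max-depth=1 "$dir" | sort -n |
        awk -F'\t' '{ printf "%.0f MB\t%s\n", $1 / 1024, $2 }'
}

# Example: write the report for the current directory to a file
# dir_usage_mb . > diskusage.txt
```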




Reference: http://www.cyberciti.biz/faq/find-large-files-linux/
http://serverfault.com/questions/62411/how-can-i-sort-du-h-output-by-size