Thursday, August 19, 2010

Recovering deleted files using grep

SkyHi @ Thursday, August 19, 2010
I was working on a server this morning and accidentally deleted an important configuration file. Like many Linux users, I lamented the absence of an “undelete” command. The file was no longer open in any process, wasn’t present in the backups, and would be painful to recreate.

Fortunately, not all hope was lost. When a file is deleted from a hard drive, the blocks are freed, but not actually cleared. The data remains on disk, but it cannot be directly accessed and is in danger of being overwritten. Recovery is a matter of search and rescue.
Since the file I was hoping to recover was a text file, and I knew a fair amount about it (such as its approximate size and some text that was definitely in it), finding it turned out to be a fairly simple task using grep:

grep -a -B 25 -A 100 'some string in the file' /dev/sda1 > results.txt



Here’s what the command does:
grep searches through a file and prints every line that matches a pattern. Here, the pattern is a string known to be in the deleted file; the more specific the string, the better. The file being searched (/dev/sda1) is the hard drive partition on which the deleted file used to reside. The “-a” flag tells grep to treat the partition, which is really binary data, as text.
Since the goal is to recover the entire file rather than just the lines that are already known, context control is used. The flags “-B 25 -A 100” tell grep to print 25 lines before each match and 100 lines after it. Err on the generous side with these numbers to ensure the entire file is included (when in doubt, guess bigger). Excess data is easy to trim out of the results, but if you end up with a truncated or incomplete file, you have to run the whole search again.
Finally, “> results.txt” instructs the shell to store the output of grep in a file called results.txt. If you can, write that file to a different partition than the one being searched, since new data written to the searched partition may overwrite the very blocks you are trying to recover.
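
To see how the flags behave without touching a real disk, the same search can be tried on a scratch file. This is only a sketch: the image file, the example.conf contents, and the search string are all made up for illustration, and the context sizes are shrunk to match the tiny file.

```shell
# Sandbox demonstration: works on a throwaway file, never on a real disk.
workdir=$(mktemp -d)
image="$workdir/fake_partition.img"   # hypothetical stand-in for /dev/sda1

# Binary junk, then a small "deleted" config file, then more binary junk.
{
  head -c 4096 /dev/urandom
  printf '\n# example.conf\nlisten_port = 8080\nlog_level = debug\nmax_clients = 50\n'
  head -c 4096 /dev/urandom
} > "$image"

# -a treats the binary image as text; -B/-A pull in the surrounding lines.
# LC_ALL=C keeps grep from tripping over bytes that are not valid in the locale.
LC_ALL=C grep -a -B 1 -A 2 'listen_port = 8080' "$image" > "$workdir/results.txt"

cat "$workdir/results.txt"
```

On a real partition the file boundaries are not known in advance, which is why the original command uses much larger context values.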

Once the command is done, results.txt will probably contain lots of gibberish, but if you’re lucky, the contents of the deleted file will be intact and recoverable.
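
One possible way to cut down the gibberish is to drop every byte that is not printable ASCII before reading through the results. The sketch below runs on a made-up sample rather than real recovery output, and it assumes the lost file was plain ASCII; a file containing other encodings would be damaged by this filter.

```shell
# Build a small sample that mixes readable lines with binary debris.
cleandir=$(mktemp -d)
printf 'listen_port = 8080\n\001\002\377\376junk\000\nlog_level = debug\n' \
  > "$cleandir/results.txt"

# Keep only tab, newline, carriage return, and printable ASCII (octal ranges).
LC_ALL=C tr -cd '\11\12\15\40-\176' \
  < "$cleandir/results.txt" > "$cleandir/cleaned.txt"

cat "$cleandir/cleaned.txt"
```

Debris that happens to be printable (the word "junk" in this sample) survives the filter, so the final trimming still has to be done by hand.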
To keep this from happening in the first place, many people alias the rm command to a script that moves files to a temporary location, such as a trash directory, instead of deleting them outright.
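
A minimal sketch of such a script, written as a shell function, might look like the following. The function name trash and the ~/.trash location are assumptions chosen for illustration, not part of any standard tool.

```shell
# Hypothetical trash helper; the directory name is an assumption.
TRASH_DIR="${TRASH_DIR:-$HOME/.trash}"

trash() {
  mkdir -p "$TRASH_DIR" || return 1
  for target in "$@"; do
    # Prefix a timestamp so files with the same name are less likely to collide.
    stamp=$(date +%Y%m%d-%H%M%S)
    mv -- "$target" "$TRASH_DIR/$stamp-$(basename -- "$target")" || return 1
  done
}

# To opt in, a line like this could go in ~/.bashrc:
# alias rm='trash'
```

This sketch does not understand rm options such as -r or -f, and two files with the same name trashed within the same second would overwrite each other, so a real replacement would need more care.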


REFERENCES
http://spin.atomicobject.com/2010/08/18/undelete